
A Computational Model of Culture-Specific Conversational Behavior

Lecture Notes in Computer Science, 2007
Dušan Jan¹, David Herrera², Bilyana Martinovski¹, David Novick², and David Traum¹
¹ Institute for Creative Technologies, Los Angeles, CA
² The University of Texas at El Paso, El Paso, TX

Abstract. This paper presents a model for simulating cultural differences in the conversational behavior of virtual agents. The model provides parameters for differences in proxemics, gaze and overlap in turn taking. We present a review of literature on these factors and show results of a study where native speakers of North American English, Mexican Spanish and Arabic were asked to rate the realism of the simulations generated based on different cultural parameters with respect to their culture.

Keywords: Conversational agents, proxemics, gaze, turn taking, cultural model.

1 Introduction

Virtual agents are often embedded in life-like simulations and training environments. The goal is to immerse the user in the virtual environment and to provide an experience similar to what they could have in the real world. Many factors influence how believable the virtual experience is. It is influenced by the visual appearance of the world and its soundscape, but the virtual agents also have to behave and interact with each other in a manner that fits the environment. If the user is on a virtual mission in a small town in the Middle East, he will expect to see Arab people on the streets. The experience will be much different from a similar setting in the suburbs of a western city. There will be many differences in how people behave in each environment. For instance, when people interact with each other there will be differences in how close to each other they stand and how they orient themselves. Their gaze behavior will be different and there could even be differences in turn taking and overlap.
It is important then for the virtual agents to behave in a culturally appropriate manner depending on where the virtual experience is situated. There has been increasing interest in culturally adaptive agents, e.g. [1]. In most cases the agents are built to target a particular culture. When applying this kind of agent to a new virtual setting its behavior has to be modified. To minimize the amount of work required for this modification, the agent architecture can be designed in a modular fashion so that only the mappings of functional elements to their culture-specific surface behaviors have to be changed, as is the case for REA [2]. On the other hand, the agents can be made to adapt to different cultural norms. At one extreme they could be able to observe the environment and decide how to behave. A more feasible alternative, however, is for the agent’s designer to provide this information, as is proposed for GRETA [3]. The goal is for the agent architecture to provide a set of parameters that can be modified in order to generate the behavior that is culturally appropriate for the agent in a given scenario.

C. Pelachaud et al. (Eds.): IVA 2007, LNAI 4722, pp. 45–56, 2007. © Springer-Verlag Berlin Heidelberg 2007

In our work toward augmenting our agents with culture-specific behavior we have first focused on a small subset of cultural parameters. We have created a model that can express cultural differences in agents’ proxemics, gaze and overlap in turn taking. This cultural model is an extension of the personality model we used previously for the agents in face-to-face conversation simulation [4]. We assigned the values for these cultural parameters based on reviewed literature to model Anglo American, Spanish-speaking Mexican and Arab agents.
To test whether the differences are salient to the user in a virtual world, we performed an experiment in which we showed multi-party conversations based on different cultural models to subjects from different cultural backgrounds and asked them to rate the realism of the overall animation, the agents’ proxemics, gaze behavior and pauses in turn taking with respect to their culture.

In this paper we first provide an overview of the conversation simulation in section 2. In section 3 we continue with a review of the literature on cultural variations in proxemics, gaze and turn taking for our three target cultures, and in section 4 we show how this influenced the design of the cultural parameters. Section 5 presents the results of the experiment and section 6 concludes with our observations and plans for future work in this area.

2 Conversation Simulation

The simulation of face-to-face interaction we are using is an extension of prior work in group conversation simulation using autonomous agents. Padilha and Carletta [5] presented a simulation of agents engaged in a group conversation, in which the group members take turns speaking and listening to others. The discussion was only simulated on the level of turn taking; there was no actual speech content being generated in the simulation. Previous work on turn taking was used to form a probabilistic algorithm in which agents can perform basic behaviors such as speaking and listening, beginning, continuing or concluding a speaking turn, giving positive and negative feedback, head nods, gestures, posture shifts, and gaze. There was no visual representation of the simulation; the output of the algorithm was a log of the different choices made by the agents.

Behaviors were generated using a stochastic algorithm that compares randomly generated numbers against parameters that can take on values between 0 and 1.
These parameters determine the likelihood of wanting to talk and how likely the agents are to produce explicit positive and negative feedback and turn-claiming signals. They also determine how often the agents will interrupt others and the duration of speech segments. The parameters of this algorithm are used to define the personality of the agents. There are great variations present between individuals, even within a single culture, although cultural biases are also possible and would require further investigation.

This work was extended in [6], which tied the simulation to the bodies of background characters in a virtual world [7] and also incorporated reactions to external events as part of the simulation. This kind of simulation allowed middle-level-of-detail conversations, in which the characters are close enough to be visible, but are not the main characters in the setting. Their main role is not to interact with the user, but rather to maintain the illusion of a realistic environment where the user is situated.

Further improvements to the simulation were made by using new bodies in the Unreal Tournament game engine and adding support for dynamic creation of conversation groups [4]. This allowed dynamic creation, splitting, joining, entry and exit of sub-conversations, rather than forcing the entire group to remain in one fixed conversation. Other extensions to the simulation added support for movement of the agents through a movement and positioning component that allows agents to monitor “forces” that make it more desirable to move to one place or another, iteratively select new destinations and move while remaining engaged in conversations [8].

3 Aspects of Culture-Specific Behavior

Our current focus of investigation is on cultural variation of non-verbal behaviors in conversation.
We were interested in factors that play an important role in face-to-face conversation and are mainly controlled and learned on a rather subconscious level. As a first step we examined differences in proxemics, gaze and pauses between turns. While other factors such as gestures also have significant cultural variation and are salient for the outward appearance of conversation, we restricted our investigation to some of the factors where we did not have to alter the art used in the simulation.

Our goal was to create a cultural model and provide examples for Anglo American, Spanish-speaking Mexican and Arab cultures. In order to accomplish this we reviewed relevant literature and we report on our findings in the rest of this section.

3.1 Proxemics

Proxemics relates to the spatial distance between persons interacting with each other, and their orientation toward each other. Hall writes that individuals generally divide their personal space into four distinct zones [9]. The intimate zone is used for embracing or whispering, the personal zone is used for conversation among good friends, the social zone is used for conversation among acquaintances and the public zone for public speaking. While proxemics are culturally defined, there are also variations based on sex, social status, environmental constraints and type of interaction.

Baxter observed interpersonal spacing of Anglo-, Black-, and Mexican-Americans in several natural settings [10]. He classified subjects by ethnic group, age, sex and indoor/outdoor setting. Results for Anglo- and Mexican-American adults are listed in table 1.

Table 1. Mean Interpersonal Distance in Feet [10]

Ethnic Group   Sex Combination   Indoor Adults   Outdoor Adults
Anglo          M-M               2.72            2.72
Anglo          M-F               2.33            2.59
Anglo          F-F               2.45            2.46
Mexican        M-M               2.14            1.97
Mexican        M-F               1.65            1.83
Mexican        F-F               2.00            1.67

Graves and Watson [11] observed 32 male Arab and American college students in pairs (6 possible combinations) for 5 minutes after 2-minute warm-ups. They found that Arabs and Americans differed significantly in proxemics, the Arabs interacting with each other closer and more directly than Americans. They also report that differences between subjects from different Arab regions were smaller than for different American regions. While the study confirms that Arabs interact much closer to each other, we cannot use their measurements as all their subjects were seated.

In a similar experiment Watson studied 110 male foreign students between spring of ’66 and ’67 at the University of Colorado. He found that Latin Americans exhibit less closeness than Arabs, but still interact much closer than Anglo Americans [12].

Shuter’s investigation of proxemic behavior in Latin America gives us some useful data about proxemics in Spanish cultures. He was particularly interested in changes between different geographical regions. In his study he compared proxemics of pairs involved in conversation in a natural setting [13]. He concluded that interactants stand farther apart and the frequency of tactile contact diminishes as one goes from Central to South America. Table 2 lists the distances recorded in his study.

Table 2. Mean Interpersonal Distance in Feet [13]

Sex Combination   Costa Rica   Panama   Colombia
M-M               1.32         1.59     1.56
M-F               1.34         1.49     1.53
F-F               1.22         1.29     1.40

McCroskey et al. performed an interesting study investigating whether real-life proxemic behavior translates into expected interpersonal distances when using a projection technique [14].
Their main goal was to get more data on proxemics as it relates to differences in subjects that are hard to measure in naturalistic observations. They asked subjects to place a dot on a diagram of a room where they would prefer to talk with a person of interest. The results for the projection technique were in agreement with the findings of real observations. A similar finding on the translation of proxemic behavior to a virtual setting is reported by Nakanishi in an analysis of proxemics in a virtual conferencing system [15].

3.2 Gaze

Most data on gaze is available for dyadic conversations. Kendon writes that gaze in dyadic conversation serves to provide visual feedback, to regulate the flow of conversation, to communicate emotions and relationships and to improve concentration by restriction of visual input [16]. Argyle and Cook provide a range of useful data on gaze measurements in different situations [17]. While most of it is dyadic, there is some data available for triads. They compare gaze behavior between triads and dyads as reported in studies by Exline [18] and Argyle and Ingham [19]. Table 3 shows how the amount of gaze differs between the two situations (although the tasks and physical conditions in the two studies were different, so group size may not be the only variable). In dyadic conversation people look nearly twice as much when listening as while speaking.

Table 3. Amount of gaze (%) in triads and dyads [17]

Group    Sex Combination   Average amount of gaze by individuals   Looking while listening   Looking while talking   Mutual Gaze
Triads   MMM               23.2                                    29.8                      25.6                    3.0
Triads   FFF               37.3                                    42.4                      36.9                    7.5
Dyads    MM                56.1                                    73.8                      31.1                    23.4
Dyads    FF                65.7                                    77.9                      47.9                    37.9

A study by Weisbrod looked at gaze behavior in a 7-member discussion group [20]. He found that people looked at each other 70% of the time while speaking and 47% while listening.
Among other things, he concluded that to look at someone while he is speaking serves as a signal to be included in the discussion, and to receive a look back from the speaker signals the inclusion of the other. Kendon [16] attributes this reversal of the pattern, as compared to the dyadic situation, to the fact that in a multiparty situation the speaker must make it clear to whom he is speaking.

There is some data available on cultural differences in gaze behavior. In a review, Matsumoto [21] reports that people from Arab cultures gaze much longer and more directly than do Americans. In general, contact cultures engage in more gazing and have more direct orientation when interacting with others.

3.3 Turn Taking and Overlap

According to Sacks, Schegloff and Jefferson [22], most of the time only one person speaks in a conversation; occurrences of more than one speaker at a time are common, but brief; and transitions from one turn to the next usually occur with no gap and no overlap, or with slight gap or overlap. The low amount of overlap is possible because participants are able to anticipate the transition-relevance place, a completion point at which it would be possible to change speakers.

However, in actual conversations this is not always the case. Berry [23] makes a comparison between Spanish and American turn-taking styles and finds that the amount of overlap in Spanish conversation is much higher than predicted by Sacks et al. One of the reasons for this behavior is the presence of collaborative sequences. These are genuinely collaborative in nature and include completing another speaker’s sentence, repeating or rewording what a previous speaker has just said, and contributing to a topic as if one has the turn even though one does not. Also, when simultaneous speech does occur, continued speaking during overlap is much more common in Spanish conversation.
3.4 Overall Evaluation of Literature

The literature provides enough information to create a general framework for a simple computational model. However, in the process of assigning specific values to the cultural parameters we found that a lot of the needed information is missing.

Most of the data on proxemics only has information on the mean distance between observed subjects. Information about values for different interaction zones is rare, and for North American culture most sources agree with the values reported by Hall. Data for Mexican and Arab culture is much scarcer. While we did find some information in Spanish literature on interaction distances for different zones, it was not clear whether they were reporting values specific to Spanish cultures or just in general.

Literature on cultural differences of gaze and overlap in turn taking is rare and generally lacks quantitative data. The only culture-specific information we found on gaze indicated that gaze is more direct and longer in contact cultures. While data on overlap in turn taking suggested that Spanish cultures allow for more overlap than English, we did not find any comparative analysis for Arab culture.

4 Computational Model

Before we could apply cultural variations to the simulation we had to make some changes in the computational model so that it was better able to express the cultural differences. The ability to provide proxemic parameters to the simulation is provided by the movement and positioning algorithm [8]. It takes 3 inputs: maximum distance for the intimate zone, maximum distance for the personal zone and maximum distance for the social zone. Each agent maintains a belief about its relationship with other agents and chooses an appropriate interactional distance based on this information. If at some point during the conversation an agent gets out of his preferred zone for interaction he will adapt by repositioning himself.
He will do this while balancing requirements for proxemics with other factors that influence positioning, such as audibility of the speaker, background noise and occlusion of other participants in the conversation.

To express the differences in gaze behavior and turn taking overlap we had to make some changes to the simulation. Previously the conversation simulation employed a uniform gazing behavior. In order to differentiate the gazing behavior of agents that are speaking and listening we designed a probabilistic scheme where agents transition between different gaze states. We identified 5 different states:

1) agent is speaking and the agent they are gazing at is looking at the speaker,
2) agent is speaking and the agent they are gazing at is not looking at the speaker,
3) agent is speaking and is averting gaze or looking away,
4) agent is listening and speaker is gazing at him,
5) agent is listening and speaker is not gazing at him.

In each of the states the agent has a number of possible choices. For example, in state 1 he can choose to keep gazing at the current agent, to gaze at another agent or to gaze away. Each outcome has a weight associated with it. If the weights for these 3 outcomes are 6, 2 and 2 respectively, then the speaker will choose to keep gazing at the current target 60% of the time, in 20% of cases he will pick a new agent to gaze at and in the remaining 20% he will look away. The decision on whether to transition between states is made about every 0.5 seconds of the simulation. In addition to these weights we introduced another modifier, based on our informal observations, that makes it more likely for agents to look at agents that are currently gazing at them.

Last, the overlap between turns was redesigned to follow a Gaussian distribution [24], with mean and variation as parameters that can be culturally defined.
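Returning to the gaze scheme, the weighted choice among outcomes can be illustrated with a short sketch. This is only a minimal Python illustration under stated assumptions, not the simulation's actual code: the function names `outcome_probabilities` and `choose_gaze` and the outcome labels are hypothetical, and the weights 6, 2 and 2 are those of the state 1 example in the text. The modifier for agents that are gazing back is left out because the text does not say exactly how it is applied.

```python
import random

def outcome_probabilities(weights):
    """Normalize raw outcome weights into probabilities that sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {outcome: w / total for outcome, w in weights.items()}

def choose_gaze(weights, rng=random):
    """Pick one gaze outcome at random, in proportion to its weight."""
    outcomes = list(weights)
    return rng.choices(outcomes, weights=[weights[o] for o in outcomes], k=1)[0]

# Weights for state 1 (speaking, and the gaze target is looking back),
# taken from the example in the text. The labels are hypothetical names.
state1_weights = {"keep_target": 6.0, "new_agent": 2.0, "look_away": 2.0}

probs = outcome_probabilities(state1_weights)
print(probs)

# One decision is drawn about every 0.5 s of simulated time; here we
# simply draw ten decisions in a row. The seed only makes the demo repeatable.
rng = random.Random(0)
decisions = [choose_gaze(state1_weights, rng) for _ in range(10)]
print(decisions)
```

With these weights the normalized probabilities are 0.6, 0.2 and 0.2, matching the 60%, 20% and 20% split given in the text.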
Whenever the agent decides to take a turn at a pre-TRP signal (a cue by which the agents are able to predict when the next transition relevance place will occur), it picks a random value from the Gaussian overlap distribution and uses this value to schedule when it will start speaking.

The parameters defining the cultural variation are represented in XML format, with sections for proxemics, gaze, and silence and overlap. The following is an example XML culture description for the Anglo American model. The distances for proxemics are expressed in meters. The gaze section starts with GazingAtMeFactor, which specifies the modifier making it more likely to gaze at someone who is currently looking at the agent. Following are the distributions for the gaze behavior for each of the 5 gaze states (Speaker/Attending, Speaker/NonAttending, Speaker/Away, Addressee, Listener). The choices for which the weights can be specified are: Speaker (the agent that is speaking), Addressee (the agent that the speaker is gazing at), Random (a random conversation participant) and Away (averting gaze or looking away). The last section, Silence, includes the aforementioned parameters of the Gaussian distribution for overlap between turns.
<Culture>
  <Proxemics>
    <IntimateZone>0.45</IntimateZone>
    <PersonalZone>1.2</PersonalZone>
    <SocialZone>2.7</SocialZone>
  </Proxemics>
  <Gaze>
    <GazingAtMeFactor>1.5</GazingAtMeFactor>
    <Speaker>
      <Attending>
        <Addressee>6.0</Addressee>
        <Random>2.0</Random>
        <Away>2.0</Away>
      </Attending>
      <NonAttending>
        <Addressee>1.0</Addressee>
        <Random>8.0</Random>
        <Away>1.0</Away>
      </NonAttending>
      <Away>
        <Random>9.0</Random>
        <Away>1.0</Away>
      </Away>
    </Speaker>
    <Addressee>
      <Speaker>8.0</Speaker>
      <Random>1.0</Random>
      <Away>1.0</Away>
    </Addressee>
    <Listener>
      <Speaker>6.0</Speaker>
      <Addressee>2.0</Addressee>
      <Random>1.0</Random>
      <Away>1.0</Away>
    </Listener>
  </Gaze>
  <Silence>
    <StartOffset>0.0</StartOffset>
    <StartVariation>0.5</StartVariation>
  </Silence>
</Culture>

We tried to back most of the values for cultural parameters with data from the available literature, but in many cases we had to resort to approximations based on available qualitative descriptions. For proxemics of North American culture we used the values reported by Hall, as shown in the example XML above. To overcome the lack of data for zone distances of Arab and Mexican culture we took the differences in mean distances from reported studies and used them to modify the distances for all zones. For the Mexican Spanish model we used the values of 0.45 m for the intimate, 1.0 m for the personal and 2.0 m for the social zone, and 0.45 m, 0.7 m and 1.5 m for the respective zones in the Arab model. To model the more direct gaze of contact cultures we increased the weight corresponding to gaze at the speaker in the Mexican and Arab models. We decided not to introduce any differences in overlap of turn taking because we did not find any data for Arab culture.

In the end the cultural models are richer with respect to proxemics due to the lack of exact studies on cultural aspects of gaze and overlap in turn taking.
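To make the role of these parameters concrete, the following is a minimal Python sketch of how such a culture description might be read and used. It is an illustration under stated assumptions rather than the simulation's implementation: the helper names `load_culture`, `classify_zone` and `sample_turn_start` are hypothetical, StartVariation is treated as a standard deviation, negative sampled values are read as overlap with the previous speaker, and any distance beyond the social zone limit is labeled public, following Hall's four zones. The zone limits for the Mexican and Arab models are the ones stated in the text.

```python
import random
import xml.etree.ElementTree as ET

# Trimmed copy of the Anglo American description (gaze section omitted).
ANGLO_XML = """
<Culture>
  <Proxemics>
    <IntimateZone>0.45</IntimateZone>
    <PersonalZone>1.2</PersonalZone>
    <SocialZone>2.7</SocialZone>
  </Proxemics>
  <Silence>
    <StartOffset>0.0</StartOffset>
    <StartVariation>0.5</StartVariation>
  </Silence>
</Culture>
"""

def load_culture(xml_text):
    """Read zone limits (meters) and turn-start parameters from the XML."""
    root = ET.fromstring(xml_text)
    zones = tuple(float(root.findtext(f"Proxemics/{name}"))
                  for name in ("IntimateZone", "PersonalZone", "SocialZone"))
    offset = float(root.findtext("Silence/StartOffset"))
    variation = float(root.findtext("Silence/StartVariation"))
    return zones, offset, variation

def classify_zone(distance, zones):
    """Return the first zone whose maximum distance is not exceeded."""
    for name, limit in zip(("intimate", "personal", "social"), zones):
        if distance <= limit:
            return name
    return "public"  # assumption: everything past the social limit

def sample_turn_start(offset, variation, rng=random):
    """Draw a turn-start offset from a Gaussian distribution.

    Assumptions: variation is a standard deviation, and a negative
    value means starting before the previous speaker has finished.
    """
    return rng.gauss(offset, variation)

anglo_zones, offset, variation = load_culture(ANGLO_XML)
# Zone limits for the other two models, as given in the text.
mexican_zones = (0.45, 1.0, 2.0)
arab_zones = (0.45, 0.7, 1.5)

for label, zones in (("Anglo", anglo_zones),
                     ("Mexican", mexican_zones),
                     ("Arab", arab_zones)):
    print(label, classify_zone(1.1, zones))

print(sample_turn_start(offset, variation, random.Random(1)))
```

In this sketch a distance of 1.1 m falls in the personal zone for the Anglo American limits but in the social zone for the Mexican and Arab limits, which is the kind of difference the proxemics parameters are meant to express.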
While it is unfortunate that we do not have more data, this gives us an opportunity to verify whether the virtual reality simulation reflects cultural phenomena that are recognized by the subjects. If the subjects are asked to evaluate the parameters with respect to their culture even for the parameters that we do not vary between cultures, then we should expect more cultural differences with respect to proxemics than with respect to gaze and overlap in turn taking.

5 Evaluation

To test whether the proposed extension of the conversational simulation can saliently represent the cultural differences we conducted a cross-cultural study of the perceptions of non-verbal behaviors in a virtual world. Native speakers of American English, Mexican Spanish, and Arabic observed six two-minute silent animations representing multi-party conversation created by running the simulation with different parameters. We also identified the age and sex of the participants and where they attended high school. In the study we had 18 native English speakers, 22 to 70 years old, who all attended high school in the US. The 12 Arab subjects ranged from 21 to 48 years old and most attended high school in the Middle East (Lebanon, Qatar, Syria, Kuwait, Palestine, Morocco, Egypt). All except one of the 10 Mexican subjects attended high school in Mexico; they ranged from 19 to 38 years old.

While all of the animations had Afghani characters in a Central Asian setting, the parameters of the characters’ non-verbal behaviors were set to values based on the literature for Americans, Mexicans, and Arabs. The animations differed mainly with respect to proxemics. While the Mexican and Arab models had more direct gaze at the speaker, that aspect was not always easily observable given the location of the camera. Two different animations for each culture were presented to each observer, and the order of presentations was balanced across observer groups.
The observers were asked to rate the realism with respect to their culture of the overall animation, the characters’ proxemics, the characters’ gaze behaviors, and the characters’ pauses in turn-taking. They were also asked to describe what differences they noticed between the movies and what elements they thought weren’t appropriate for their culture.

Fig. 1. These are two examples taken from the animations used in the evaluation. Left picture is from the North American model and right picture from the Arab model.

The results contained both expected and unexpected elements. Arab subjects judged the Arab proxemics to be more realistic than both American and Mexican proxemics (p < 0.01). Arab subjects also judged the Arab animation more realistic overall than the American animation (p < 0.01). Arab subjects did not judge American proxemics to differ from Mexican proxemics. And judgments of Arab subjects about gaze and pause did not show significant differences across cultures, which was expected because these parameters did not significantly differ across the animations.

The judgments of the Mexican and American subjects did not show differences between any of the cultures with respect to proxemics or overall realism. In the aggregate the subjects saw significant differences between some of the individual animations, even if they did not see significant differences between the sets of animations representing the different cultural parameters. For example, the aggregated subjects judged the proxemics of animation “Arab 1” to differ significantly from those of both “American 1” and “Mexican 2” (p < 0.001). There was suggestive evidence (p < 0.5) that American subjects distinguished the proxemics of “Arab 1” from “American 1”, but Mexican subjects apparently did not perceive these differences (p > 0.59).
There is suggestive evidence that Mexican subjects distinguished “Arab 1” from “Mexican 2” (p < 0.13), but Mexican subjects did not distinguish the pairs of Arab and Mexican animations.

The significant differences in perceptions of the individual animations suggest that the animations differed from each other along dimensions other than proxemics, gaze and inter-turn pause length. Possible factors include gesture, coincidental coordination of gesture among the characters, and limitations of the virtual world, which may have affected the representations of the different cultural models in different ways. This is also confirmed by qualitative responses from the subjects, who were asked to note any factors they thought did not fit their culture. Some Arab subjects noted that there wasn’t enough tactile contact between the characters. One thought that characters conversing in dyads in one particular movie was not culturally appropriate. Some were also distracted by the clothes the characters were wearing.

6 Conclusion

In this paper we have presented an extension of a conversation simulation that can express cultural differences in conversation. We presented the data used to create the model and an example XML representation of the cultural parameters. The results of the evaluation have shown that subjects were able to distinguish between simulations generated with different parameters in regard to culture-appropriateness, which suggests that the simulations do reflect culturally specific behaviors which are observable by viewers of the same or other cultures.

To further the study of cross-cultural differences in conversation we could pursue research in several directions. More studies are needed on culture-specific data on duration of gaze before transition, on turn taking, pauses and overlap.
We could explore other factors that change across cultures, including gestures and other non-verbal behaviors, or investigate cultural differences in goal-oriented conversations as opposed to the free-form conversations in the current simulation. We could also expand the analysis to include more cultures than the ones we examined. In order to achieve many of these goals it would be helpful to have an audiovisual corpus for analyzing non-verbal behaviors, particularly in multiparty interaction. Another way to achieve the same goal could be to let subjects of different cultures experiment with the parameters of the simulation and set the parameters themselves.

Acknowledgments. The project described here has been sponsored by the U.S. Army Research, Development, and Engineering Command (RDECOM). Statements and opinions expressed do not necessarily reflect the position or the policy of the United States Government, and no official endorsement should be inferred.