The role of text in televideo cybersex

Rodney H Jones

2008, Text & Talk

Text & Talk 28–4 (2008), pp. 453–473. DOI 10.1515/TEXT.2008.022. 1860–7330/08/0028–0453, Online 1860–7349. © Walter de Gruyter.

Abstract

Televideo cybersex provides a unique example of the ways meanings are instantiated, identities constructed, and relationships negotiated across different semiotic modes. This article explores the role of verbal messages in these multimodal exchanges, examining the specific interactional functions they perform. Text, it is argued, plays a rather unique role in this particular kind of interaction. Unlike ordinary face-to-face conversation, in which the body (posture, gestures, gaze) usually plays more of an ancillary role, in televideo cybersex the bodily performance is primary, with verbal messages functioning to contextualize physical actions. Text is used to help increase the sense of ‘presence’ participants feel, to regulate the rhythm of the unfolding interaction, to help manage the orderly exchange of information, and to create narrative frames within which bodily displays can be interpreted and made coherent.

Keywords: computer-mediated communication; cybersex; gay men; multimodality; text.

1. Introduction

This paper focuses on the ways verbal messages are combined with bodily displays in televideo cybersex between gay men, what is referred to by my participants as ‘cam sex’ or ‘cam fun’. Although what will be described here might seem rather removed from the experience of many, looking at this kind of interaction, I will argue, can bring us closer to understanding how, in everyday life, the body functions as both an acting subject and an objective ‘display’, and how the subjective and objective qualities of the body change when it is combined with other modes such as spoken and written language. As more and more human interaction is mediated through technologies and these technologies become increasingly multimodal, it becomes increasingly important to investigate how users employ different modes to ‘embody’ themselves and the effect these modes have on their social interaction.

Televideo cybersex is a kind of interactive ‘reality porn’ (Barcan 2002) in which users, who are often strangers, conduct erotic performances for each other using webcams. Participants meet each other in a variety of ways: in chat rooms, forums, or through Web sites. Interaction may take place in the context of these Web sites with one or several other participants, or users may contact each other privately using software that supports videoconferencing such as MSN Messenger. It chiefly consists of users displaying different parts of their bodies to each other, usually while masturbating. The enjoyment for participants is both voyeuristic and exhibitionistic as they trade positions as subjects gazing upon the bodies of others, and objects offered up to the other’s gaze (Waskul 2002, 2003). These encounters are most often casual and anonymous; information about participants’ actual identities is seldom exchanged, and participants seldom display their faces to each other.

There are no reliable statistics on the extent of this behavior, but anecdotal evidence and the sheer number of Web sites, IM groups, and video-conferencing directories devoted to it suggest that it is widespread. In fact, executives from Microsoft admit that the earliest adopters and most frequent users of their video-conferencing applications have been cybersex enthusiasts, though they are careful to add that the company in no way condones the use of their software for such ‘risqué’ activities (Lewis 1998).
Although the ‘meat’ of televideo cybersex is in the ‘conversation of gestures’ (Waskul 2002: 204) users engage in with their webcams, there is a considerable amount of verbal communication as well, consisting occasionally of voice conversations through microphones connected to users’ computers or, more commonly, of intermittent text messages. My purpose here is to explore the function of these text messages in relation to the images users transmit via their webcams, and to understand how text and image interact to create meaning, relationships, and interactional coherence.

The data come from an ethnographic study of how gay men in Hong Kong use computers.1 The study took a participatory approach in which gay men were recruited as participant researchers, and the data include screen movies of participants’ on-line practices, in-depth interviews, and participant diaries. Participants who engaged in televideo cybersex and were willing to share their interactions were asked to help recruit chat partners from their MSN Messenger buddy lists to join the study and to give consent for their interactions to be analyzed. All in all, 17 examples of televideo interaction were collected involving 24 different individuals. Of these participants, 18 were Hong Kong Chinese and six were Caucasians from the United States and the United Kingdom. Their ages ranged from 18 to 48 with an average age of 27.6.

My aim here is not to make generalizations about the conduct of televideo cybersex across a range of cultures, genders, and sexual interests; indeed, such a limited data set would not allow this. My purpose, rather, is to explore the possible affordances and constraints associated with text and other modes in this kind of interaction through the close examination of the practices of a specific community of users.
The analytical framework I use comes chiefly from interactional sociolinguistics, with its focus on the ‘discourse strategies’ (Gumperz 1982) and negotiative processes people use to manage interactions and social relationships. I also draw on Halliday’s functional-semantic view of dialogue (Eggins and Slade 1997; Halliday 1994), which sees interaction as an ‘exchange of commodities’ realized through various conversational ‘moves’. Insights from other analysts working from a more ethnomethodological perspective like Goodwin (1994, 2002), who focuses on the ways the sequential deployment of verbal messages and nonverbal displays operates to structure interaction in recognizable patterns, also figure in my approach. Finally, I draw on concepts from scholars in multimodal discourse analysis, particularly those working in systemic functional traditions (Kress and Van Leeuwen 1996, 2001; Lemke 1987, 1998; Norris 2004; Stöckl 2004).

2. The role of verbal messages in multimodal communication

Televideo cybersex is a unique form of multimodal communication both in terms of its social goals and the particular kinds of modal configurations users deploy to reach them. While it in some ways resembles other kinds of social interaction, in other ways it differs considerably. Like text-based computer chat or videoconferencing, or everyday face-to-face communication for that matter, it involves the regular, alternating exchange of verbal messages, but the function of these messages is very different. While verbal messages are usually the primary focus of these other forms of interaction, in televideo cybersex, the visual messages interactants construct with their bodily displays are primary, the ongoing verbal conversation taking on a more secondary or ancillary role. In particular kinds of face-to-face encounters in which bodily displays take on a similarly primary role, such as face-to-face sexual interaction and the kinds of bodily displays one finds in strip shows, peep shows, and the masturbatory displays gay men engage in in public sex venues such as saunas and public toilets, the exchange of verbal messages is not always evident and sometimes prohibited. In televideo cybersex, however, the regular exchange of verbal messages is practically obligatory.

Research into the role verbal messages play in other forms of multimodal interaction has resulted in a number of observations applicable to this study. The first is the notion that different modes function differently in texts and interactions partly because these modes themselves embody certain ‘affordances’ and ‘constraints’ regarding the kinds of communication for which they can be employed. According to Kress and Van Leeuwen (1996, 2001), different modes work to structure and constrain meaning-making practices and construct participants’ orientations toward reality and each other. Among the chief differences between verbal and visual modes, they say, is that verbal texts work within the logic of time, orienting readers (or hearers) toward causality, and images operate within the logic of space, orienting viewers toward spatial analytic perspectives.

One useful way of approaching the different meaning-making potential of different modes is through the lens of Halliday’s three metafunctions of language: the ideational, in which language is used to depict a state of affairs, the interpersonal, in which language is used to construct the relationship between sender and receiver, and the textual, in which language contributes to the organization and structure of messages. Stöckl (2004) suggests that while all modes are theoretically capable of performing any of these metafunctions, in different communicative events they may be distributed across modes in an unequal way depending on how they can be realized most efficiently.
One central point of such work is that modes do not function independently, that verbal messages work together with other modes in texts and interactions in an integrated way in which the meanings they create together are more than just the sum of the meanings they create separately (Kress and Van Leeuwen 1996). For Barthes (1977), for example, one of the key ways that words interact with images is by ‘anchoring’ or fixing certain meanings in them, defining their terms of reference or point of view from which they are to be interpreted.

Whereas in printed texts, words are more often seen as functioning to ‘anchor’ or contextualize images, in face-to-face interaction, gestures, posture, gaze, and other nonverbal communication are more often seen as resources with which speakers contextualize words, and so such cues are considered by Gumperz (1982) as part of a broader class of signals he refers to as contextualization cues. Such cues affect interaction in a variety of ways. One way involves the ideational or referential plane of the interaction: nonverbal communication acting to add to, refer to, depict, or modulate the meaning of verbal utterances (Kendon 2004). Another way has more to do with organizational aspects of communication: gesture, gaze, and bodily orientation acting as resources with which people display information about their joint participation in the interaction and about the temporal and sequential organization they expect it to take (Goodwin 2002; Kendon 2004).

Particularly relevant to the kind of interaction I am addressing is the research into multimodality in computer-mediated environments, which has mainly focused on the degree to which the modes available either contribute to or inhibit the sense of ‘social presence’ (Short et al. 1976) users experience.
Early studies with this focus used face-to-face conversation as the standard for optimum social presence and judged ‘low bandwidth’ media environments, with their ‘reduced social cues’, as intrinsically deficient (see, for example, Daft and Lengel 1984; Sproull and Kiesler 1986). Computer systems relying chiefly on text were seen as adequate for most task-related communication but insufficient for conveying social, emotional, and contextual aspects of messages. Later work in this area, however, has found that ‘low bandwidth’ can sometimes enhance feelings of intimacy and ‘presence’ (McLellan 1996; Walther and Parks 2002), a phenomenon Walther (1996) refers to as hyperpersonal communication. This observation has also been borne out in work on mediated sexual or romantic relationships (Mantovani 2001; Stone 1996). In her work on phone sex, for example, Stone (1996: 94) notes that the ‘narrow bandwidth’ of the telephone arises as a ‘powerful asset’ in such encounters because ‘the interpretive faculties of one participant or another are powerfully . . . engaged, (so) . . . extremely complex fantasies can be generated from a small set of cues’. From the perspective of such scholars, one important lesson computer-mediated communication has to teach us about multimodality is that more modalities do not necessarily mean more meaning, and reduced cues can often provide space for highly nuanced meaning making and highly intimate relationships.

In another area of communication which is particularly relevant to this study, that of face-to-face sexual communication in physical sexual encounters or other types of sexualized bodily displays, very little research has been done on the role of verbal messages.
While linguistic messages and paralinguistic cues clearly play an important role in much sexual interaction, with 58% of respondents in a Kinsey Institute study reporting verbalization during sex (Reinish 1991), and numerous pop psychologists encouraging such verbalization as a way to increase sexual satisfaction (see, for example, Stanton 2006), talk during sex is by no means an obligatory feature of this kind of interaction, and for many it is considered unusual, embarrassing, or ‘kinky’ (Foore 2004; Stanton 2006). Popular and scholarly treatments of sexual communication tend to frame talk during sex either as ‘an exchange of information’, as in most considerations focusing on the negotiation of sexual satisfaction or safe sex (Quina et al. 2000; Molitor et al. 1999), or as a way to increase the eroticism of the experience (Stanton 2006). Neither of these framings, however, leaves room for a consideration of the more interactional, regulatory, and social functions of talk in this context.

In other examples of face-to-face eroticized displays, verbal messages take on an even more marginal role. In what is perhaps the type of sexual interaction most similar to that under consideration, the masturbatory displays gay men engage in in toilets and other public sex venues (Clatts 1999; Humphreys 1970; Jones et al. 2000), any exchange of verbal messages is often strictly prohibited, as such audible exchanges might attract unwanted attention from passersby or authorities.

3. Text in televideo cybersex

In his 2003 book on cybersex, Waskul makes an important distinction between text-based cybersex, in which the body is constructed solely through participants’ verbal descriptions of themselves and of sexual acts, and televideo cybersex, in which participants display video images of their bodies to each other in real time.
In text-based cybersex, he says, the body is ‘semiotically enselfed’ in words, whereas in televideo cybersex, the self is ‘embodied in moving images’. Despite the qualitative difference in the interaction brought about by the addition of the visual mode, however, most televideo cybersex is still heavily dependent on verbal communication, and in particular, written text. In fact, the mutual display of bodies participants engage in and the way this unfolds is crucially dependent on the ‘textual selves’ they construct to go along with their visible bodies.

In my data, during their televideo encounters participants typed or received typed messages on average once every 12.6 seconds. Many of these messages were minimal, consisting of single words like ‘wow’ or ‘nice’, while others were more elaborate. In general, though, most were relatively short, averaging 2.8 words per turn. Despite their brevity, they were seen by my participants as playing an extremely important role in the interaction. ‘If the guy doesn’t type anything’, said one participant, ‘I just log off. So boring!’

If such verbal exchanges are such an important part of the experience, why then, we might ask, do users not avail themselves of opportunities for voice-based communication? MSN Messenger, for example, offers the option to supplement videoconferencing with audio chat, and participants also have the option of combining ‘cam sex’ with phone sex. None of the encounters I collected, however, used these options. The main reasons mentioned were technical: participants noted things like the poor sound quality of computer-mediated voice communication and the inconvenience involved in holding a telephone, operating a computer, and masturbating at the same time. Another reason mentioned, however, was that they did not want to hear their partners’ voices or to be burdened with the necessity of having to constantly produce ‘noise’ for their partner to hear.
For some the addition of the dimension of voice made the interaction ‘too personal’ or ‘kill(ed) the fantasy’. Text-based communication allowed them to focus more fully on the visual performance and left more room for them to create an idealized version of their partner.

This preference for written over spoken communication makes the relationship between the visual and the verbal message in this kind of interaction very different from that in face-to-face interaction (including face-to-face sexual interaction). The first difference has to do with the sense of ‘dislocation’ created when the verbal message in interaction is disassociated from the bodily act of talking. Computer-mediated communication in general has a dislocating effect, with the actual body proximally dislocated from the virtual self. In videoconferencing accompanied by textual messaging, this sense of dislocation is further exacerbated by the lack of entrainment of the visual and the verbal messages (Raudaskoski 1999). This is particularly true in televideo cybersex, in which the action of masturbating often has to be visibly interrupted to accommodate the act of typing. The second major difference is the fact that, as mentioned above, visual and verbal modes in televideo cybersex take on functions different from those they have in face-to-face communication, the visual performance being primary, and verbal messages taking on more contextualizing and regulatory functions.

The functional bifurcation of modes in televideo cybersex is to a large degree a result of the fact that, whereas participants in this type of interaction are quite forthcoming with displays of body parts that, in most face-to-face interaction, are normally kept hidden from view, they are more reluctant to display the one part of the body that is regularly displayed in casual conversation: the face.
The most obvious reason is that, unlike other regions of the body, the face contains information that can clearly be linked to users’ ‘real life’ identities. For some of the users I talked to, however, anonymity was not just a matter of safety, but also part of the overall eroticism of the experience. As one participant said:

prolonged (facial) exposure is a little bit unusual, sort of face-to-face talk, this really diminishes the sexual tension, as if you identify the person as a friend that is . . . it’s being anonymous that gives you the kick, right?

That is not to say that participants never show their faces, but when they do this display is often brief and usually preceded by careful negotiation (see below), and it often signals a change in framing and footing (Goffman 1974, 1981), a shift from cybersex to some other activity such as casual chatting, for example. This relative absence of the face as a communicative tool, I will argue, is an important factor in understanding the role of written text in these interactions.

4. Functions of written text

On the basis of my interviews and an analysis of my corpus of interactions, I have isolated four primary functions of written text in televideo cybersex, functions that are usually taken up by paralinguistic and nonverbal communication in face-to-face encounters, even of the more intimate kind. In televideo cybersex, text is used (i) to convey a sense of presence and orientation toward one’s interlocutor, (ii) to aid in the timing of the interaction, (iii) to help users regulate the moment-by-moment exchange of conversational commodities, and (iv) to create narrative and interactive frames within which bodily actions can be interpreted.

4.1. Presence

For any interaction to proceed successfully, interactants must continually communicate and monitor mutual attention toward each other in order to maintain a sense of co-presence, which is central to what Goffman (1981) calls a ‘state of talk’.
In face-to-face communication, this is usually achieved through bodily orientation and gaze (Kendon 2004). Since participants in televideo cybersex usually do not show their faces, the gaze is typically not available, and participants’ physical bodies are usually oriented toward themselves (i.e., in masturbation). Therefore, the function of maintaining this sense of co-presence is transferred almost entirely to the mode of text. According to my participants, if one partner fails to type a message for an extended period of time, his partner is likely to feel ignored, no matter what kind of bodily display he is being offered. ‘If the guy doesn’t type, how do I know he’s there?’ This is further confirmed by the fact that participants typically ‘prompt’ one another with utterances like ‘hey’ and ‘???’ and ‘u there?’, or with questions designed to elicit a response such as ‘like it?’ after an extended period in which a textual offering is not reciprocated, in the same way one might respond to an extended silence in a telephone conversation.

All this suggests that, even when two participants are aligned in the mutual and reciprocal display of their naked bodies, in the absence of facial displays, this is not sufficient to give them the feeling that these displays are communicative acts. It is the exchange of textual messages that provides an overall framework for a sustained sense of mutual availability. In face-to-face conversation, according to Goffman (1959), the body is the anchor for communication, the peg upon which verbal messages are hung. In much televideo cybersex, the text becomes the peg upon which the body is hung.

Beyond this most basic sense of mutual monitoring, a sense of presence also involves the feeling of richness in the interaction, which some argue is increased by the number of modes available to participants (Daft and Lengel 1984).
Whereas in face-to-face interaction, gestures and other bodily displays serve to ‘animate’ verbal messages, making communication a ‘richer experience’ (Kendon 2004: 175), in televideo cybersex, it is the text that serves to ‘animate’ the images, making what otherwise might seem a cinematic display into something more resembling interaction.

4.2. Timing

Another important function of text is in regulating timing and creating a sense of conversational synchrony. In face-to-face conversation, gestures, posture, and paralinguistic cues play a key role in regulating the sequential and temporal organization of interaction and giving participants the feeling that they are ‘in synch’ with one another (Condon 1986; Goodwin 2002; Gumperz 1982). In text-based computer-mediated communication, conversational synchrony is just as important as it is in face-to-face conversation. In an earlier examination of text-based cybersex (Jones 2005b), I noted how users who were successful in engaging partners the longest tended to be those who were able to establish a regular rhythm of sending and receiving messages with their partners, and that this rhythmicity was part of the pleasure participants associated with the activity.

In the case of text-based cybersex, such rhythmic coordination is accomplished entirely through the mode of text, dependent upon things like typing speed and the length of pauses between messages. In televideo cybersex there is the added dimension of the body through which information about timing can be sent. Participants can, for example, regulate their verbal contributions based on the actions of the other, delaying their own message while the other is visibly typing. They can also coordinate the pace of their bodily actions based on visual observation of the other.

Figure 1. Time between turns in a sample conversation
Even with these visual cues, however, the speed and frequency of verbal messages still play an important role in the maintenance of conversational synchrony as well as in signaling different phases in the interaction. To illustrate this I have plotted the length of pauses between turns in one of the interactions from my data (see Figure 1). As can be seen, in the initial stages the time between turns is very short as users exchange greetings and initial information, and then it lengthens slightly after they ‘get down to business’. What is most striking here is the regularity of pause length that is maintained throughout this middle phase, with typed messages being issued at fairly constant intervals of between 5 and 10 seconds, mimicking the kind of rhythmic regularity one associates with actual sex and helping the participants construct what Prior and Shipka (2003: 230–231) call ‘embodied chronotropes’, which add further to the ‘tone and feel’ one associates with physical presence.

Goodwin (2002) points out that in multimodal interaction participants orient to multiple orders of temporality simultaneously, with different modes used to create different forms of temporal and sequential organization. In the case of televideo cybersex, the rhythmic back and forth of text provides a temporal context for the more rapid sexualized rhythms of bodily movements involved in the masturbatory display. At the same time, rhythm on these two time scales exists in a complementary relationship in which increased pace on one scale results in a slowing down of action on another time scale; the increased bodily rhythms associated with nearing climax, for example, are associated with a slowing of the pace of verbal messages.
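The pause lengths plotted in Figure 1 can be derived from any timestamped log of typed turns. What follows is a minimal sketch of that arithmetic, not the procedure used in the study: the `turns` list, its timestamps and messages, and the function name `pauses_between_turns` are all hypothetical and invented purely for illustration.

```python
from statistics import mean

# Hypothetical turn log: (seconds since the start of the session, typed message).
# These values are invented for illustration; they are not the study's data.
turns = [
    (0.0, "hi"),
    (2.0, "hey"),
    (5.0, "how r u"),
    (12.0, "nice"),
    (19.0, "wow"),
    (27.0, "ok"),
]


def pauses_between_turns(log):
    """Return the gap in seconds between each pair of consecutive turns."""
    times = [timestamp for timestamp, _ in log]
    return [later - earlier for earlier, later in zip(times, times[1:])]


pauses = pauses_between_turns(turns)
print(pauses)  # [2.0, 3.0, 7.0, 7.0, 8.0]
print(mean(pauses))  # average pause across the whole toy conversation
```

In this toy log the first two gaps are short and the later ones settle between 5 and 10 seconds, which is the kind of shift from a rapid opening phase to a steadier middle phase that the figure is described as showing.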
In this regard, text is important not just in creating and maintaining the ongoing rhythm of the interaction but also in signaling to participants shifts from one phase of the interaction to another, a function that is particularly evident in the early and later stages of these encounters. These shifts are accomplished both explicitly, through the words people type (initial shifts from ‘cyberchat’ to ‘cybersex’, for example, being signaled by utterances like ‘wanna show?’ or ‘u hard down there?’, and shifts to the closing phase of the encounter being signaled by utterances like ‘r u ready to cum?’), and rhythmically, through a slowing down or speeding up of the exchange of turns. As mentioned above, when participants move from their initial negotiations to more erotic interaction, the speed of verbal exchanges typically shifts from a rapid exchange of information to a slower but steady trading of comments to accompany their visual displays. Another such shift in rhythm comes near the end of these encounters when participants are preparing to ejaculate, at which point another dramatic lengthening of time between turns is typical, perhaps because this moment might require more prolonged attention to one’s own physical body.

In terms of timing, the moments leading up to climax are the most critical as, in the best of situations, participants prefer ejaculation to be performed simultaneously. In order to facilitate this, users typically issue a series of pre-offers, like ‘lets cum man’ and ‘ready?’. The purpose of such moves is both to monitor the progress of one’s partner and to enforce reciprocity. With these messages, users attempt to open slots for the other to issue the offer of ejaculation, as in the following example:

(1) A: Cum?
    B: OK
    A: Ready?
    B: u?
    A: yeahhh
    B: cumming?
    A: u first
    B: your cum gets mine.

Ejaculations, once performed, also demand a reaction in the form of a coda (like ‘wow’ or ‘nice’) in order to close the transaction.

4.3. Regulating the exchange of conversational commodities

The above example dramatically illustrates how, in these encounters, displays are treated as ‘exchanges’ in which an offer of a certain type by one participant is seen as requiring (or ‘earning’) a reciprocal offer from his partner (‘your cum gets mine’). While cybersexual encounters always involve some degree of cooperation as partners work together to create a mutually satisfying fantasy, they are also to some degree competitive, as partners negotiate these exchanges. Like all interaction, what underlies cybersex is what Goffman (1959) calls an ‘information game’, a contest in which interactants vie to maintain control over their respective ‘information preserves’ while gathering as much relevant information as possible about their interlocutor’s. What is perhaps unique about cybersex is the ‘value’ of the information at stake, information which in this case includes not just verbal information but also bodily displays of a most intimate nature.

The most obvious risk involved in this game is that the information one reveals might somehow give away one’s ‘true identity’, but this is not the only risk. There is also the risk that the information offered may not accord with the desires of one’s partner, resulting in rejection, or that one’s offer of information may not be properly or ‘fairly’ reciprocated. From the very beginning of such encounters, users tend to measure out their contributions carefully, trading information about their age, appearance, and sexual proclivities in an incremental fashion, each offer opening a slot for a reciprocal offer, and each exchange determining whether or not the interaction can progress to the next stage (Jones 2005a).
This ‘code of reciprocity’ that governs initial textual exchanges becomes even more important when the visual mode is added, primarily because the more multimodal the message becomes, the less control interactants have over their ‘information preserves’: while textual descriptions allow users to restrict information only to that which is voluntarily given, video also involves information that is involuntarily ‘given off’. Faced with this increased risk, users take steps to control the information they offer and ensure that their offers result in reciprocal offers from their partners. They do this by carefully positioning their cameras to reveal or conceal various bodily regions and by using text to negotiate with their partners the positioning of their cameras.

As in earlier text-based exchanges, displays are offered in an incremental, reciprocal fashion. Participants offer themselves ‘in pieces’: one piece of you for one piece of me. The default regions are the torso, the torso and penis, the penis alone, and the buttocks. The exposure of a particular region has to do not just with how the body is made erotic but how the body is made meaningful in the ongoing negotiation of information. Text is used here as a means of enforcing reciprocity, as in the following examples.

(2) A: B: (3) A: B: A: B: Show dick? Wow nice Show yours mmm You hard down there. LIKE A ROCK lemme c u hard?

Whereas purely textual interactions tend to be dominated by two-part initiation–response sequences, when the visual mode is added, three-part exchanges of the kind Sinclair and Coulthard (1975) observed in classroom interactions are the rule. Verbal initiations are answered by visual responses, which are then followed up by verbal feedback or reactions. These reactions are just as important as the displays that precede them; in fact, they act as a kind of ‘payment’ for the visual display.

(4) A: Can you show me your cock?
    B: (displays penis)
    A: nice!
As noted above, the region that is most rarely displayed is the face. Because of the higher risk involved in such displays, the code of reciprocity is even more strongly enforced: few users would agree to show their face in the absence of an agreement for a reciprocal display from their partners, and in such cases, the sequential reciprocity observed above becomes simultaneous. Faces must be shown ‘together’, users concurrently moving their webcams gradually upward while at the same time checking that the other user is doing the same. The mode of text is crucial in managing these simultaneous exchanges, as can be seen in this example:

(5) A: B: A: B: A: B: A: B: wanna show face? (displaying torso) together (displaying torso) ok ready? (moving camera slowly upwards) (moves camera upward to reveal face) ok thanks (quickly moves camera downward to display torso) (moves camera downward to display torso) ur cute really? yeah

Often in such cases the camera only lingers in this region briefly, and, as in this exchange, textual messages like ‘ok thanks’ are used to mark the end of the exposure.

4.4. Framing

Interactions in televideo cybersex, however, are more than just mutually negotiated bodily displays. These displays are framed within coherent erotic narratives that are collaboratively constructed by participants turn by turn (Goodwin 2002), and text messages play an important part in this co-construction of discursive coherence. A key phase in the narrative framing of cybersex is the beginning when participants are just getting to know each other. This phase is usually characterized by a series of questions and answers that often need to be successfully negotiated even before participants begin a videoconference, questions usually centering on appearance (age, height, and weight) and sexual preferences (e.g., passive, active).
Such constructions are constrained by cultural conventions regarding the kinds of descriptors that are deemed relevant and desirable (Jones 2005c; Stone 1996). Thus, from the beginning, the narratives which are to unfold are to some extent governed by pre-existing scripts that include expectations about roles and relationships (a participant who describes himself as a ‘bottom’, for example, will be expected to conduct himself in a particular way once participants’ cameras are turned on). When participants finally turn on their cameras, the ‘verbal bodies’ they have constructed are ‘resemiotized’ (Iedema 2001, 2003) into images (Jones 2005a), but this resemiotization does not totally replace the verbal with the visual. Instead the verbal body is superimposed onto the visual image, continually informing the way it is interpreted. On the one hand, then, the initial verbal descriptions one offers are constrained by the future visual display (one must be able to ‘pull off’ the textual self one has created). On the other hand, the visual display is constrained by the ongoing verbal narration that accompanies it.

Once participants have switched on their cameras, text helps to contextualize these displays within an ongoing sexual narrative in which participants claim and impute various roles. In these narratives, verbal contributions give meaning to visual displays, indicating, for example, what function a particular bodily organ or region is meant to play at a particular moment in the fantasy. These ‘stories’, however, do not simply make use of a single narrative frame, but rather normally exploit multiple inter-nested and overlapping frames, as in the example below.

(6) A: nice dick (seated, displaying torso)
    B: Thanks (seated, displaying penis)
    A: wanna fuck me with it? (seated, displaying torso, leaning forward)
    B: sure (seated, displaying and stroking penis)
    A: o nice (standing, displaying buttocks)
    B: take my cock boy (moving penis toward camera for close-up)
    A: fuck me (standing, leaning over, displaying and touching buttocks)
    B: show me that ass (standing, stroking penis, close-up)
       more light please (seated, displaying penis and torso)

In this short excerpt, participants construct with their utterances at least three different interactive frames: one in which they comment upon and direct each other’s actual displays in the present moment (‘nice dick’, ‘show me that ass’), one in which a hypothetical fantasy is played out (‘take my cock boy’, ‘fuck me’), and a broader ‘regulatory frame’ in which technical aspects of the channel and message quality are negotiated (‘more light please’) (see Figure 2).

[ regulatory frame [ present moment [ fantasy ] present moment ] regulatory frame ]

Figure 2. Interactive frames in televideo cybersex

Meanwhile, the shifts of frame accomplished through text are reinforced with bodily movements: the shift from the present moment to the fantasy frame (A: ‘wanna fuck me with it?’, B: ‘sure’), for example, is accompanied by one participant moving forward and the other beginning to stroke his penis. Verbal frames overlap with visual frames as bodily movements (for example, one partner displaying his buttocks and the other moving his penis closer to the camera) imitate the movements of the sexual acts or highlight the body parts mentioned in the verbal track. In such cases, the visual messages ‘act out’ the storyline participants co-construct with their verbal exchanges, and verbal messages act as captions or as a soundtrack for the visual narrative that is being performed.
Because textual messages are to some degree dislocated from the body, however, users can participate in different frames in different modes in ways that might be considered rather unusual in real-life sex, as in the following example in which the ‘sexual act’ is momentarily interrupted by an exchange of small talk.

(7) A: Yeah man, fuck me! (seated, displaying and stroking penis, legs raised)
    B: yea (seated, displaying and stroking penis)
    A: where u from? (seated, displaying and stroking penis, legs raised)
    B: Nottingham (seated, displaying and stroking penis)
       Robin Hood country (seated, displaying and stroking penis)
    A: kewl (seated, displaying and stroking penis, legs raised)
    B: give me ur ass boy (seated, displaying and stroking penis)

What is striking about this example, in contrast to Example (6), is that verbal and visual messages are not entrained, the shift from eroticized interaction to small talk in the verbal track having no effect at all on the ongoing masturbatory displays in the visual track.

As they move across these multiple frames, participants themselves take on not only particular roles in the erotic narrative that is being written, but also ‘discourse roles’ (Sarangi and Slembrouck 1996) which position them at various times in relation to these frames and to their partners. In televideo cybersex, participants typically perform three different kinds of discourse roles: the role of performer, presenting their visual bodies and verbal selves for consumption; the role of director, controlling the performance of the other; and the role of spectator, enjoying and reacting to what one sees. These roles can be fairly consistently mapped onto the semantic functions (Halliday 1994) of the moves participants make.
The performer issues offers either of a visual variety, revealing parts of their bodies, or of a verbal variety in the form of invitations such as ‘want to see it?’, information such as ‘I love asian guys’, and descriptions of the actions one is taking in the fantasy frame or acting out on camera, such as ‘I’m fucking that ass man’. Directing, on the other hand, is almost exclusively performed through text, chiefly because the range of meanings one can express with the body in this regard is more limited. The director asks questions like ‘do u have any toys?’, makes requests like ‘can u do a close up on your cock and balls?’, and issues directives such as ‘move your cam closer please’. It is the role of the spectator, however, in which text is the most important. In this role, participants issue responses or reactions to the displays of the other like ‘wow’, ‘nice body man’, ‘o my god’, and ‘HUGE MONSTER COCK!’. As noted above, these reactions are a crucial element in the interaction, functioning as a kind of ‘payment’ for visual displays.

Figure 3 shows the relative distribution of semantic moves achieved through text in my corpus. Offers are the least common moves taken with text, presumably because most offers in this type of interaction are visual. By far the most common moves taken with text are responding moves: reactions, answers to questions, and indications of compliance or refusal.

Figure 3. Distribution of semantic moves

The most likely reason for this is the absence of the face as a communicative tool. In face-to-face interaction, one watches and reacts with one’s gaze and facial expressions. In face-to-face sexual encounters, especially mutual masturbatory displays, the face plays a similar role, signaling responses or reactions to the other’s performance. In televideo cybersex, on the other hand, where the face is usually not available, one watches with one’s words.
Of course, there is a kind of ambiguity and polyfunctionality to these visual and verbal moves. A reaction, for example, can also be regarded as a kind of performance, and, because of the code of reciprocity, each visual display is also an implicit demand that the other party produce a similar display. The combination of visual and verbal modes also allows users to take up different positions in different modes, making a visual offer of a particular body part, for example, while verbally issuing a demand that one’s partner do the same.

5. Conclusion

Although televideo cybersex is a rather unique form of interaction, analyzing the ways communicative functions are distributed among different modes in these encounters can inform our understanding of multimodality in general and of the new possibilities for multimodal contact offered by new communication technologies. While any mode has the potential to fulfill any communicative function, different functions are realized differently in different genres. In face-to-face conversation, verbal messages are usually the chief carriers of ideational meaning, while the body and face, along with paralinguistic cues, are more associated with communicating attitude and working to regulate the structure and flow of the interaction. In televideo cybersex, the opposite is true: most of the ideational meaning is delivered through images, and the words serve more of a contextualizing and regulating function. One performs with one’s body. One watches with one’s words. Jointly employing both modes allows users to simultaneously present themselves as objects and to exert agency as subjects. One of the chief reasons for this phenomenon is the absence of the face as a communicative tool. One might say that in televideo cybersex, while the image is the body, the text is the face.
Like the face in face-to-face interaction, text functions as an emblem of selfhood and agency, giving the experience the feeling of being truly interactive. Thus, to use Waskul’s (2003) terminology, while participants are ‘embodied’ in the visual images they broadcast, it is through the words that they type that they are ‘enselfed’.

Another important observation to come from this analysis is how different modes serve not just to elaborate one another, but also to regulate one another. One of the chief roles of text in these interactions is to allow participants to manage more precisely their measured and incremental visual displays. Televideo cybersex is just as much about what one does not show as what one does. Visual messages come as carved up body parts rather than complete persons, and verbal messages are stripped of the paralinguistic cues of audible talk, and it is this ‘semiotic minimalism’ of both visual and verbal modes that helps to make these encounters so exciting for users by leaving space for them to weave complex fantasies from a limited set of cues.

In their search for something which one of my informants described as ‘more real than pornography and less real than reality’, participants in televideo cybersex deploy the verbal mode of communication in different ways and for different purposes than it is deployed in other kinds of interaction, such as text-based computer chat and most face-to-face conversation and face-to-face sexual interaction. At the same time, there are also similarities. In nearly all forms of interaction there is an element of bodily ‘display’ (Goffman 1959), and even users of text-based chat often engage in textual descriptions of their bodies (Jones 2005a, 2005c). Furthermore, there are numerous contexts of face-to-face interaction, especially in the workplace, in which verbal messages do function in similar ways to contextualize bodily actions and help regulate pace and rhythm.
Examples can be seen in Nevile’s (2004) descriptions of interactions between pilots in commercial airliners, and in Filliettaz’s (2005) observations of multimodal interactions in a factory. Indeed, much more work needs to be done to understand bodily displays in general and the ways they are managed in various configurations of time and space using discourse. Finally, this work also invites a closer consideration of the use of verbal messages in face-to-face erotic encounters within an approach that focuses not just on the informational function of language but also on its discourse functions, the strategic ways participants use it to negotiate frames, actions, and identities within the sexual act.

Note

1. ‘An ethnographic study of computer mediated communication among gay men in Hong Kong’, City University of Hong Kong Small Scale Research Grant #9030988 (http://personal.cityu.edu.hk/~en-cyber/home.htm). An earlier version of this paper was presented at the Third International Conference on Multimodality, 25–27 May 2006, Pavia, Italy.

References

Barcan, R. (2002). In the raw: ‘Home-made’ porn and reality genres. Journal of Mundane Behavior 3 (1 February). URL: <http://www.mundanebehavior.org/issues/v3n1/barcan.htm> [accessed on 3 October 2006].
Barthes, R. (1977). Image–Music–Text. London: Fontana.
Clatts, M. C. (1999). Ethnographic observations of men who have sex with men in public. In Public Sex/Gay Space, W. Leap (ed.), 141–156. New York: Columbia University Press.
Condon, W. S. (1986). Communication: Rhythm and structure. In Rhythm in Psychological, Linguistic and Musical Processes, J. R. Evans and M. Clynes (eds.), 55–78. Springfield, IL: Charles C. Thomas.
Daft, R. L. and Lengel, R. H. (1984). Information richness: A new approach to managerial behavior and organizational design. In Research in Organizational Behavior, L. L. Cummings and B. M. Staw (eds.), 191–233. Homewood, IL: JAI Press.
Eggins, S. and Slade, D. (1997). Analyzing Casual Conversation. London: Cassell.
Filliettaz, L. (2005). Time, rhythm and multiactivity: Contextualizing teamwork. A paper presented at the 9th International Pragmatics Conference, 10–15 July, Riva del Garda, Italy.
Foore, K. A. (2004). Through the looking glass: Constructing sexual identity. Unpublished M.A. thesis, University of Alaska, Fairbanks.
Goffman, E. (1959). The Presentation of Self in Everyday Life. New York: Anchor Doubleday.
Goffman, E. (1974). Frame Analysis. Cambridge: Harvard University Press.
Goffman, E. (1981). Forms of Talk. Philadelphia: University of Pennsylvania Press.
Goodwin, C. (1994). Professional vision. American Anthropologist 96 (3): 606–633.
Goodwin, C. (2002). Time and action. Current Anthropology 43 (Suppl.): 19–35.
Gumperz, J. (1982). Discourse Strategies. Cambridge: Cambridge University Press.
Halliday, M. A. K. (1994). An Introduction to Functional Grammar, 2nd ed. London: Edward Arnold.
Humphreys, L. (1970). Tearoom Trade: Impersonal Sex in Public Places. Chicago: Aldine.
Iedema, R. (2001). Resemiotization. Semiotica 137 (1–4): 23–39.
Iedema, R. (2003). Multimodality, resemiotization: Extending the analysis of discourse as multi-semiotic practice. Visual Communication 2 (1): 29–57.
Jones, R. (2005a). ‘You show me yours, I’ll show you mine’: The negotiation of shifts from textual to visual modes in computer mediated interaction among gay men. Visual Communication 4 (1): 69–92.
Jones, R. (2005b). Rhythm and timing in computer mediated communication. A paper presented at the 9th International Pragmatics Conference, 10–15 July, Riva del Garda, Italy.
Jones, R. (2005c). Sexual risk and the Internet. A paper presented at the Language and Global Communication Conference, 7–9 July, Cardiff, Wales.
Jones, R., Yu, K. K., and Candlin, C. N. (2000). A preliminary study of HIV vulnerability and risk behavior among MSM in Hong Kong. Report to the Council for the AIDS Trust Fund, Hong Kong. URL: <http://personal.cityu.edu.hk/~enrodney/Research/MSM/MSMindex.html> [accessed on 25 September 2006].
Kendon, A. (2004). Gesture: Visible Action as Utterance. Cambridge: Cambridge University Press.
Kress, G. and Van Leeuwen, T. (1996). Reading Images: The Grammar of Visual Design. London: Routledge.
Kress, G. and Van Leeuwen, T. (2001). Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Edward Arnold.
Lemke, J. L. (1987). Strategic deployment of speech and action: A sociosemiotic analysis. In Semiotics: 1983, Proceedings of the Semiotic Society of America ‘Snowbird’ Conference, J. Evans and J. Deely (eds.), 67–79. New York: University Press of America.
Lemke, J. L. (1998). Multiplying meaning: Visual and verbal semiotics in scientific text. In Reading Science: Critical and Functional Perspectives on Discourses of Science, J. R. Martin and R. Veel (eds.), 87–113. London: Routledge.
Lewis, P. H. (1998). Videoconferencing’s killer app may be sex. The New York Times 16 July: sec. G, p. 7, col. 1.
Mantovani, F. (2001). Cyber-attraction: The emergence of computer-mediated communication in the development of interpersonal relationships. In Say Not to Say: New Perspectives on Miscommunication, L. Anolli, R. Ciceri, and G. Riva (eds.), 229–246. Amsterdam: IOS Press.
McLellan, H. (1996). Virtual realities. In Handbook of Research for Educational Communications and Technology, D. H. Jonassen (ed.), 457–487. New York: Macmillan Library Reference.
Molitor, F., Facer, M., and Ruiz, J. D. (1999). Sex communication and unsafe sexual behavior among young men who have sex with men in California. Archives of Sexual Behavior 28 (4): 335–344.
Nevile, M. (2004). Beyond the Black Box: Talk-in-Interaction in the Airline Cockpit. Aldershot: Ashgate.
Norris, S. (2004). Analyzing Multimodal Interaction: A Methodological Framework. London: Routledge.
Prior, P. and Shipka, J. (2003). Chronotopic lamination: Tracing the contours of literate activities. In Writing Selves/Writing Societies: Research from Activity Perspectives, C. Bazerman and D. Russell (eds.), 182–238. Fort Collins, CO: The WAC Clearinghouse and Mind, Culture, and Activity.
Quina, K., Harlow, L., Morokoff, P. J., Burkenholder, G., and Deiter, P. J. (2000). Sexual communication in relationships: When words speak louder than actions. Sex Roles: A Journal of Research April. URL: <http://findarticles.com/p/articles/mi_m2294/is_2000_April/ai_65576710> [accessed on 25 September 2007].
Raudaskoski, P. (1999). The use of communicative resources in language technology environments. Unpublished doctoral dissertation, University of Oulu, Oulu.
Reinish, J. M. (1991). The Kinsey Institute New Report on Sex. New York: St. Martin’s Griffin.
Sarangi, S. and Slembrouck, S. (1996). Language, Bureaucracy and Social Control. London: Longman.
Short, J., Williams, E., and Christie, B. (1976). The Social Psychology of Telecommunications. New York: John Wiley & Sons.
Sinclair, J. M. and Coulthard, R. M. (1975). Towards an Analysis of Discourse. Oxford: Oxford University Press.
Sproull, L. and Kiesler, S. (1986). Reducing social context cues: Electronic mail in organizational communication. Management Science 32 (11): 1492–1512.
Stanton, L. (2006). Talking Dirty: Learning to Speak the Language of Lust. San Francisco: Chronicle Books.
Stöckl, H. (2004). In between modes: Language and image in printed media. In Perspectives on Multimodality, E. Ventola, C. Charles, and M. Kaltenbacher (eds.), 9–30. Amsterdam: John Benjamins.
Stone, A. R. (1996). The War of Desire and Technology at the Close of the Mechanical Age. Cambridge, MA: MIT Press.
Walther, J. B. (1996). Computer-mediated communication: Impersonal, interpersonal and hyperpersonal interaction. Communication Research 23: 3–43.
Walther, J. B. and Parks, M. R. (2002). Cues filtered out, cues filtered in: Computer-mediated communication and relationships. In Handbook of Interpersonal Communication, M. L. Knapp and J. A. Daly (eds.), 529–563. Thousand Oaks, CA: Sage.
Waskul, D. D. (2002). The naked self: Being a body in televideo cybersex. Symbolic Interaction 25 (2): 199–227.
Waskul, D. D. (2003). Self-Games and Body-Play: Personhood in On-line Chat and Cybersex. New York: P. Lang.

Rodney H. Jones is Associate Professor in the Department of English and Communication at City University of Hong Kong. He has published widely in the areas of language and sexuality, computer-mediated communication, and mediated discourse analysis. Address for correspondence: Department of English and Communication, City University of Hong Kong, Tat Chee Ave., Kowloon Tong, Hong Kong <enrodney@netvigator.com>.
