Journal of Aging Studies 25 (2011) 62–72
Journal homepage: www.elsevier.com/locate/jaging

Examining the quality of measures of change in cognition and affect for older adults: Two case studies

Khaled Barkaoui a,⁎, Merrill Swain b, Sharon Lapkin b

a Faculty of Education, York University, 235 Winters College, 4700 Keele St, Toronto, ON, Canada M3J 1P3
b OISE, University of Toronto, 252 Bloor St. W., Toronto, Ontario, Canada M5S 1V6

Article info

Article history: Received 24 April 2009. Received in revised form 25 September 2009. Accepted 27 October 2009.

Abstract

Adopting a case study approach, we examined the quality of three measures of change in cognition and affect for older adults. The measures were used in a pre/post-test design to examine the effects of engaging older adults in languaging on their cognitive functioning and affect. Each of two researchers engaged each participant in the production of cognitively rich speech through sustained interactions over 10–12 sessions. Results from the three measures were compared to each other and to transcripts of participants' interactions and the researchers' experiences with the participants. The different sources of information supported and contradicted each other in terms of changes observed in the participants' affect and cognitive functioning. We critique the three measures in terms of their adequacy for assessing change and argue that a qualitative, process-oriented approach to assessment that allows it to be integrated with the intervention is better at detecting and understanding change in cognition and affect in older adults.

© 2010 Elsevier Inc. All rights reserved.

Introduction

Measuring change is a challenging task.
Singer and Willett (2003), for example, noted that some authors in the 1960s and 1970s insisted that “researchers should not even attempt to measure change because it could not be done well” (p. 3). Singer and Willett quoted Cronbach and Furby (1970) who, in a paper entitled “How should we measure change- or should we?,” advised researchers interested in the study of change to “frame their questions in other ways” (cited in Singer & Willett, 2003, p. 3).

⁎ Corresponding author. E-mail addresses: kbarkaoui@edu.yorku.ca (K. Barkaoui), merrill.swain@utoronto.ca (M. Swain), sharon.lapkin@utoronto.ca (S. Lapkin). doi:10.1016/j.jaging.2010.08.004

Assessment of change is key in the evaluation of the impact of interventions. The conventional approach to measuring intervention impact is an experimental design with random assignment of participants to intervention and comparison groups and a standardized measure administered to both groups before and after the intervention. The difference between the two measures is then computed to obtain an index of change and, thus, intervention impact. Researching and measuring change over time, thus, rests on three main components: study design (e.g., when to measure performance), measurement procedures (e.g., tests), and representing the measurement results. Singer and Willett (2003) discuss how to statistically represent change over time, while Saldana (2003) discusses the analysis of change within the context of qualitative longitudinal research. The focus in this paper is on study design and measurement procedures. Specifically, we discuss the use of pre-/post-test designs to assess change and the quality of standardized measures of change.
We argue that the difficulty in assessing change noted above may be partially due to the limitations of standardized measures because of both their characteristics and time of administration, and that a qualitative approach to assessment that is organically integrated with the intervention offers a better alternative to detect, map and explain change over time.

Two lines of research highlight the importance of examining intervention processes as well as outcomes: process-oriented evaluation and microgenetic studies of change. In the program evaluation literature, several authors (e.g., Chen, 2007; Greene, 1998; Patton, 2002; Slayton & Llosa, 2005) have emphasized the value of observing intervention processes. Greene (1998), for instance, argued that the close examination of program processes, using qualitative methods, allows “a greater program understanding and more explanatory power, specifically about why and how certain outcomes were attained or not” (p. 141). Chen (2007) also argued that product-oriented approaches that adopt pre-/post-test designs are limited and that in order “to provide a complete picture, evaluation methods need to be expanded in focus from measuring final results […] to aspects of qualitative and process-oriented assessment” (p. 26). Chen demonstrated the value of examining intervention processes in a study of the effectiveness of strategy training for second-language learners. The study employed measures of program outcomes as well as working journals kept by the students throughout the program and post-training unstructured interviews. This approach, Chen argued, allowed a better understanding of the nature and process of change and impact. Similarly, Slayton and Llosa (2005) emphasized the importance of examining program processes.
Specifically, they integrated quantitative measures of students' outcomes with narrative-based classroom observations in a three-year evaluation of a reading program. Slayton and Llosa argued that examination of program processes provided a better and more thorough understanding of the program context; added significantly to their ability to interpret test results; and generated findings that were meaningful and useful to stakeholders. For example, classroom observations indicated considerable variation in teacher pedagogy and implementation of the program as well as in the level of student engagement during the program, which might have affected students' scores on outcome measures. Consequently, Slayton and Llosa argued that collecting data about outcomes is insufficient for determining and understanding intervention effectiveness and that the observation of intervention implementation and processes is essential to explain why and how program outcomes are or are not attained. Furthermore, if outcome measures indicate that a program is effective, process information can “confirm that it is actually the program that is responsible for the effect” (p. 2544).

The microgenetic research literature also highlights the limitations of outcome-focused research and the importance of observing processes during periods of change (e.g., Calais, 2008; Granott & Parziale, 2002; Kuhn, 1995; Lavelli, Pantoja, Hsu, Messinger & Fogel, 2005; Lee & Karmiloff-Smith, 2002; Siegler, 2002). Kuhn (1995), for example, argued that traditional outcome-focused research designs (e.g., pre-/post-test) fail to directly observe change while it is occurring (cf. Calais, 2008; Lavelli et al., 2005). By contrast, because they involve conducting detailed observations before, during and after periods of change in a specific domain, microgenetic¹ designs allow the researcher to directly observe both short- and long-term changes (Calais, 2008; Kuhn, 1995). As Lavelli et al. (2005) defined them, microgenetic designs are “focused on the microgenesis of development, that is, on the moment-by-moment change observed within a short period of time for an elevated number of [observation] sessions” (p. 42). Microgenetic designs are based on two main premises: (a) “only by focusing on the microgenetic details of [individuals'] behavior in particular contexts is it possible to gain the type of fine-grained information that is necessary to understand change processes” and (b) “observing and understanding changes at the micro-level of real time is fundamental to understanding changes at the macro-level of developmental time” (Lavelli et al., 2005; p. 42). Granott and Parziale (2002) provided several examples of the use of microgenetic designs to assess change and development. For example, Siegler (2002) illustrated how the microgenetic method can be used to examine and understand how instructional approaches, specifically encouraging learners to generate self-explanations of other people's reasoning when solving math problems, exercise their effects.

¹ Note that microgenetic and microdevelopment are sometimes used interchangeably in the literature (e.g., Granott & Parziale, 2002).

The present study

This study adopts a case study approach to examine the quality of three measures of change in cognition and affect for older adults. In two studies (Deters, Swain & Lapkin, submitted for publication; Lapkin, Swain & Psyllakis, in press), although assessment tools were used to help describe participants' cognition and affect, the results of these assessments were not used because of the findings of this current study. In this article, we use what we know about the participants and the researchers who interacted with them to evaluate the accuracy and appropriateness of the measurement tools themselves in assessing change in cognition and affect in the research participants.
Context and research questions

This study is part of a project that examined the role of “languaging” (Swain, 2006) in delaying memory loss and cognitive deterioration in older adults (Swain & Lapkin, 2008). Swain (2006) defined languaging as “the process of making meaning and shaping knowledge and experience through language” (p. 89). Each researcher met individually with one older adult at least 10 times, usually for an hour or more. During these meetings, the participant engaged in cognitively demanding tasks and in the production of cognitively rich speech through sustained interaction with the researcher. As one means of assessing cognitive and/or affective changes experienced by the participants as a result of these interactions, we used a pre-/post-test design with three outcome measures of cognition, affect and social functioning (Deters et al., submitted for publication; Lapkin et al., in press).

Lapkin et al. (in press), based on various sources of evidence (e.g., researcher's interactions with participant, interviews with participant's spouse and a personal care attendant), found that languaging activities provided opportunities for the participant (Mike, see below) to demonstrate expertise in several areas which enhanced his self-esteem. Deters et al. (submitted for publication) conducted detailed analyses of transcripts of interactions between a researcher and a participant (Agnes, see below) in terms of specific linguistic features (e.g., language production, discourse coherence, discourse builders; cf. Dijkstra, Bourgeois, Allen, & Burgio, 2004) as well as metalinguistic ability, self-concept and engagement. They found that languaging positively affected the participant's cognitive and social functioning. For example, as will be described below in more detail, the participant was able to recall more details about her past life in later languaging sessions than in earlier ones.
As noted above, in this article we evaluate the quality of the three measures and their results by comparing them with the researchers' assessments of change in their participants' cognition, affect and social functioning. Cognitive functioning includes such domains as memory, orientation, language, attention, focus, and judgment; affect includes mood, self-esteem, and sense of control, while social functioning refers to social networking and activities. Specifically, we address the following research questions:

1. What are the main patterns in the results of the three measures for each participant?
2. How do the results of the measures compare to each other and to the researchers' assessments of change in their participants' cognition, affect and social functioning?
3. How successful do the researchers think the measures were in assessing change in the participants given what the researchers know about both the participants and the measures?

Method

Participants

This study focuses on the cases of two dyads, each including a researcher (Researcher 1 and Researcher 2) and a resident of a long-term care facility (LTCF) (Agnes and Mike).² Researcher 1 was paired with Agnes, a 94-year-old resident at the LTCF. Agnes was selected as a participant as she fulfilled the research study's two main criteria: she was socially isolated and considered by the facility staff not to have dementia. Agnes grew up in Saskatchewan during the Depression and had to leave home and work for her room and board in her early teens. She eventually moved to Toronto to work. Given her advanced age, Agnes had outlived most of her family members, including her only son. Agnes had some health problems and her eyesight and hearing were declining, which at times made communication difficult. The facility staff members had noticed that Agnes was becoming increasingly isolated and had lost interest in facility activities. Researcher 1 and Agnes met 10 times (over 7 weeks) for a total of approximately 10 hours.
Agnes died within a year after data collection occurred.

² We use pseudonyms throughout the paper to refer to the participants and the facility where the study was conducted. The study proposal went through a rigorous ethical review process by our university ethical review board.

Researcher 1 was a research assistant on the project, and was also an experienced teacher. Her family immigrated to Canada when she was a young child, and had settled in Toronto. Prior to pursuing doctoral studies, Researcher 1 had taught English as a second language for over 15 years in a variety of contexts in Canada and overseas. Despite their different backgrounds, Researcher 1 and Agnes developed a good relationship and were able to find areas of common interest. During their languaging sessions, they read and discussed newspaper articles and advice columns, listened to music, and looked at personal photos together. They also discussed Agnes' life history and experiences.

Researcher 2 was paired with Mike, a 71-year-old who had suffered a stroke about 10 years before data collection for the project began. Mike was an early resident in the LTCF. He had been there since its opening 2 years earlier. He had been a community activist and had lobbied against the building of the very LTCF that he resided in. He had worked in Canada's north helping to establish First Nations' artists cooperatives, had been a vocational counselor, and had worked in a community legal clinic. Researcher 2 met with Mike 12 times (over 6 weeks) for a total of approximately 12 hours. Researcher 2 was a mature researcher, working in a setting with older adults for the first time in a long career as an applied linguist. She shared Mike's interest in classical music and the history of the Second World War. Together they had discussions about these topics and did activities such as crossword puzzles and poetry writing. They became friends, and Researcher 2 visited Mike socially until his death in 2009.
Measurement tools

We used three measures in this study, which are listed in Table 1: the Mini-Mental State Examination (MMSE), the Multifactorial Memory Questionnaire (MMQ), and the Geriatric Evaluation by Relatives Rating Instrument (GERRI).

Table 1
Measures used in the study.

| Measure | Approach | Domain | Original length | Administered length |
| Mini-Mental State Examination (MMSE) | Objective | Cognition (orientation, recall, registration, attention, language) | 11 items | 11 items |
| Multifactorial Memory Questionnaire (MMQ) | Self-report | Memory: contentment and ability | 3 sections | 2 sections |
| Geriatric Evaluation by Relatives Rating Instrument (GERRI) | Other-report | Cognition, mood and social functioning | 3 sections; 49 items | 3 sections; 21 items |

The MMSE is an objective, 11-question measure of mental status in older adults (Folstein, Folstein & McHugh, 1975). It takes 5–10 minutes to administer and tests five areas of cognitive functioning: orientation to time and place, attention and calculation, immediate and delayed recall, and various language functions such as the ability to follow verbal commands. The maximum score is 30; a score of 26 or lower is seen to be indicative of current cognitive impairment. Since its publication in 1975, the MMSE has been validated and extensively used in both clinical practice and research (Foreman & Grabowski, 1992; Foreman, Fletcher, Mion & Simon, 1996). The MMSE has been shown to be effective in (a) separating patients with and without cognitive impairment and (b) measuring cognitive change in an individual over time and in response to treatment (Folstein et al., 1975; Foreman & Grabowski, 1992; Foreman et al., 1996).

The MMQ is a self-report measure of separate dimensions of memory (Troyer & Rich, 2002). It includes three scales: contentment (i.e., affect regarding one's memory), ability (i.e., self-appraisal of one's memory capabilities), and strategy (i.e., reported frequency of memory strategy use). Only the first two scales, contentment and ability, were used in the current study. Both scales include items measured on a Likert-type scale. Troyer and Rich (2002) evaluated the psychometric properties of the MMQ among a group of 115 older adults and found that it has “excellent content validity, factorial validity, test–retest and intra-test reliability, convergent and discriminant construct validity, and independence from demographic variables,” making the MMQ a useful tool for both clinical and research purposes (p. 19). Fort, Adoul, Holl, Kaddour and Gana (2004) reported similar findings concerning the measurement qualities of a French version of the MMQ. In addition, Troyer (2001), in a study to evaluate the impact of a memory intervention program for community-dwelling older adults, reported that the MMQ was able to detect significant improvement in the participants' metamemory (i.e., self-rated memory ability and satisfaction with memory ability).

Finally, the GERRI consists of 49 short-sentence items, grouped into three subscales that assess cognitive functioning (20 items), social functioning (18 items), and mood (11 items) in the elderly (Schwartz, 1983, 1988). Items are rated on a 6-point frequency scale that ranges from “almost all of the time” to “almost never”; a category “does not apply” is also included. An average score (from 1 to 5) based on all applicable items is computed, and the higher the score, the greater the impairment (Schwartz, 1983, 1988). The scale was designed to be completed every 2 weeks, with the items to be rated on the basis of behaviour observed in the previous two-week period. The GERRI is easy to administer and is completed by a person who is close to the patient such as a relative or a friend, thus providing a relative or significant other's point of view.
Inter-rater and internal consistency reliabilities have been shown to be high for the GERRI total scale score; the cognitive functioning scale tends to obtain the highest reliability indices, followed by the social functioning and mood scales (Schwartz, 1983). The GERRI has also been shown to differentiate among patients with different levels of cognitive impairment (Schwartz, 1983) and to have weak to good correlations with other measures of mood and cognitive and social functioning (Rozenbilds, Goldney & Gilchrist, 1986). Not all the GERRI items were used in this project; the version we used consisted of 21 items distributed as follows: eleven related to cognitive functioning, six items to social functioning, and four to mood. The individuals who completed the pre- and post-test GERRI for each participant were instructed to rate each statement with reference to the last 2 weeks. Finally, scores on all items were reversed so that higher scores on any item and each scale indicate better functioning, to be consistent with the other two measures used in the study (i.e., the MMSE and MMQ). The three measures were selected for their suitability for the project, technical qualities, and practicality. For example, they are short, easy to complete, have high reliability and validity, and are widely used in clinical and research contexts. In addition, we felt the content of these measures was relevant to the focus of the main study and that they would be sensitive to change in the participants' affect, cognition and social functioning. Furthermore, we included three measures because we wanted to obtain multiple perspectives on changes in the participants' affect and cognition, if any. Finally, as noted above, both the MMSE and MMQ have been used to examine change and/or intervention effectiveness in previous studies. Each of the three measures was administered to each participant at least 1 week before the first languaging session and, again, after the last session. 
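The two GERRI scoring conventions described above (averaging only the applicable items, and reversing item scores so that higher values indicate better functioning) can be illustrated with a minimal sketch. It assumes, purely for illustration, that ratings are coded as integers from 1 to 5 and that “does not apply” is coded as None; these codes, the function names, and the example ratings are hypothetical and are not taken from the instrument's manual or from the study data.

```python
from statistics import mean
from typing import Optional, Sequence


def reverse_item(score: int, low: int = 1, high: int = 5) -> int:
    """Flip a rating so that a higher value means better functioning.

    The 1-5 coding is an assumption made for this sketch.
    """
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside {low}-{high}")
    return low + high - score


def mean_applicable(ratings: Sequence[Optional[int]]) -> Optional[float]:
    """Average the ratings, skipping items coded None ("does not apply")."""
    applicable = [r for r in ratings if r is not None]
    return mean(applicable) if applicable else None


# Hypothetical ratings for four items; the third item does not apply.
raw_ratings = [1, 2, None, 5]
reversed_ratings = [None if r is None else reverse_item(r) for r in raw_ratings]

raw_average = mean_applicable(raw_ratings)
reversed_average = mean_applicable(reversed_ratings)
```

Reversal leaves the number of applicable items unchanged, so a scale that originally read "higher is worse" reads "higher is better" afterwards, which is what allows it to be compared with the MMSE and MMQ.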
Apart from the MMQ, which is completed by the participant her/himself, the other two measures (MMSE and GERRI) were administered or completed by staff members at the LTCF. We intended to have the same staff member administer the MMSE and complete the GERRI before and after the intervention. However, for practical reasons, this plan was not followed. For example, while the GERRI and MMSE were completed by the same staff member for Agnes, each was completed by two different staff members for Mike for the pre-test and post-test. The implications of this departure from the original plan will be discussed below.

Procedure

To evaluate the quality of the three measures in assessing changes in participants' cognition and affect, we compared the results of the three measures to each other and to the researchers' assessments of their participants' affective and cognitive functioning. We interviewed each researcher at the end of the study about her impressions of her participant's affective and cognitive functioning at the beginning, during, and at the end of the study; the changes that each observed in her participant; and evidence for and explanations of these changes. We also asked each researcher (a) to compare and explain similarities and differences between her assessments and the pre- and post-test results and (b) to assess the appropriateness and accuracy of each measure in assessing changes in her participant, given the researcher's knowledge of the participant.

Two semi-structured interview protocols were developed, a general and a specific interview. The general interview was administered to both researchers and included general questions about the researcher's impressions of their participant, expectations about changes in the participant, changes that occurred and did not occur, and evidence for and explanations of such changes.
The researchers were requested to answer the interview questions based on their own experiences with the participants and the transcripts of the meetings, without consulting the test results. The general interview questions covered six areas as follows:

1. Overall impression: researcher's overall impression of the participant when they first met in terms of cognitive functioning, affect, social functioning, and other relevant domains (e.g., identity, agency).
2. Expectations: the changes that the researcher expected to happen as a result of their interaction with the participant and the direction and reasons for these expectations.
3. Change: the changes that the researcher thought happened in the participant during and/or after the intervention.
4. Evidence: evidence, from the transcripts and/or the researcher's own observations, to support claims about the changes in 3 above.
5. Explanation: the researcher's speculations as to why (or why not) expected changes happened.

A specific interview was then developed for each case based on a comparison of the researcher's responses to the first (general) interview and their participant's pre-test and post-test results. For each domain (i.e., cognition, affect, social functioning, and other), the researcher was provided with test- and item-level pre-test and post-test results from the three measures (i.e., MMSE, MMQ, and GERRI) for her participant and, where applicable, the researcher's own assessment of the domain (from the first interview). The main focus of the second (specific) interview was the changes (or no change) in these domains and evidence for those changes (or no change) in terms of both test results and the researcher's own observations of the participant.
The specific interview included four types of questions relating to each participant: (a) comparisons of the pre-test and post-test total scores for the MMSE, MMQ, and GERRI; (b) comparisons of individual items that showed improvement or decline on each of the three measures; (c) comparisons of test results and the researcher's observations of the participant; and (d) the researcher's assessment of each measure in terms of its appropriateness and accuracy in assessing their participant, given the researcher's knowledge of the participant. At the item level, only differences of two or more points between the pre- and post-test were considered as meaningful changes for the three measures. The researchers were asked to carefully review the items in each measure as well as their participants' pre- and post-test responses to each item, which were shown to the researchers, before answering the questions in the second interview.

Findings

Table 2 summarizes the pre- and post-test results for each of the three measures as well as the researchers' assessments of changes in cognition, affect and social functioning for Agnes and Mike. It shows that Agnes had higher scores on the MMSE and MMQ-Ability in the post-test, but lower scores on all GERRI scales and on the MMQ-Contentment (perception of own memory) scale. Mike, on the other hand, showed no change in terms of the MMSE scores, but obtained lower scores on the MMQ scales as well as the GERRI-Mood scale, and higher scores on the GERRI cognitive and social functioning scales in the post-test.

In the following subsections we present the findings for each case separately. We discuss the results concerning each domain separately, starting with the researcher's assessment and then contrast it with the pre- and post-test results. The Summary and discussion section synthesizes the main findings from both cases.

Table 2
Pre- and post-test results and researcher assessment for Agnes and Mike.

| Participant | Domain | Measure | Pretest | Posttest | Test results comparison | Researcher assessment |
| Agnes | Cognition | MMSE (max. 30) | 15 | 19 | Improved | Improved |
| Agnes | Cognition | MMQ-Contentment (max. 72) | 33 | 21 | Declined | Improved |
| Agnes | Cognition | MMQ-Ability (max. 64) | 21 | 34 | Improved | Improved |
| Agnes | Cognition | GERRI-Cognition (max. 44) | 24 | 14 | Declined | Improved |
| Agnes | Affect | GERRI-Mood (max. 16) | 7 | 4 | Declined | Improved |
| Agnes | Social functioning | GERRI-Social Functioning (max. 24) | 6 | 6 | No change | Improved |
| Mike | Cognition | MMSE (max. 30) | 26 | 27 | No change | No change |
| Mike | Cognition | MMQ-Contentment (max. 72) | 63 | 50 | Declined | No change |
| Mike | Cognition | MMQ-Ability (max. 64) | 64 | 59 | Declined | No change |
| Mike | Cognition | GERRI-Cognition (max. 44) | 22 | 36 | Improved | No change |
| Mike | Affect | GERRI-Mood (max. 16) | 11 | 7 | Declined | Improved |
| Mike | Social functioning | GERRI-Social Functioning (max. 24) | 13 | 22 | Improved | Improved |

Agnes

Cognition

In this article cognition is defined as involving memory and language comprehension and production. Researcher 1 noted that, at the beginning of the study, Agnes had difficulty accessing long-term memory. In particular, details about her life history proved to be challenging. Agnes also frequently mentioned that she had difficulty remembering things from so long ago. In addition, Agnes had problems with her hearing, so sometimes she misunderstood what was said. Researcher 1 expected that with regular contact, Agnes' memory ability would improve over time, as she thought that Agnes had not been asked about her life history before. In addition, Researcher 1 expected that once Agnes started to think about details, more details would be remembered. In terms of change, Agnes did relay a few details about her past during the second last session that she had not mentioned in previous sessions. Researcher 1, however, was not sure whether the changes in Agnes' cognitive functioning she observed were due to their interactions or to how Agnes was feeling physically at the time (Deters et al., submitted for publication).
Researcher 1 did not notice any changes in Agnes' language abilities in terms of conversation ability (e.g., making small talk, asking questions and expressing opinions to keep the conversation going, changing topics, giving advice) and social skills (e.g., asking how Researcher 1 is, offering her tea). However, Researcher 1 did notice a development in Agnes' metalinguistic skills. For example, in the second last session, Agnes was able to recall and spell the names of several individuals from her past, and even made some puns. Overall, Researcher 1 noted an improvement in Agnes' memory ability.

Table 2 above reports the pre- and post-test results as well as Researcher 1's assessment of changes in cognitive functioning for Agnes. First, comparing the three measures, note that while the MMSE and the second section of the MMQ indicate an improvement over time, the GERRI and the first section of the MMQ (perception of own memory) indicate a decline. The differences across the three measures may be due to the different perspectives (self-report, other, and objective test) as well as the type and focus of items in each measure.

Although the MMSE results support Researcher 1's assessment of cognitive improvement, Researcher 1 did not think that it is a good measure of cognition, since its items are decontextualized, which is problematic for someone who is not used to such tasks as spelling a word backwards. In fact, only one item, orientation to time, showed improvement (3 points) across testing times. Researcher 1 noted that this might be because her regular appointments with Agnes increased the latter's awareness of time, as she often asked Researcher 1 what day it was, and when the next visit was.

Agnes also obtained a higher score in terms of self-assessment of her memory capabilities (section 2 of the MMQ) in the post-test.
These results are consistent with Researcher 1's interpretation of the second last session, where she found instances of improved memory ability. Agnes showed improvement on several items and Researcher 1 felt that these improvements, such as recall of names and details from a newspaper article, were consistent with her assessment. Researcher 1, however, noted that some items, such as remembering phone numbers, were not relevant for Agnes. On the other hand, Researcher 1 was not able to comment on some items that showed a decline, such as ‘misplace something you use daily’ and ‘retell a story or joke to the same person several times,’ because she did not have direct experience with these items with Agnes. Agnes obtained lower scores in the post-test on both the GERRI-Cognitive Functioning scale and the first section of the MMQ-Contentment. The GERRI results indicate a decline in the cognitive functioning of Agnes as perceived by others, while the MMQ results indicate a decline in Agnes' memory as perceived by her. Both results contradict Researcher 1's impression of improvement in Agnes' cognitive functioning. Researcher 1 was surprised by the GERRI results and raised questions as to who completed the pre- and post-GERRI,3 how much contact that person had had with Agnes, and what the basis was for this person's assessment. For example, Agnes was given lower scores in the post-test on the items “grasps point of newspaper articles, news broadcasts, etc.” and “forgets what he/she is looking for.” Researcher 1 asked whether the person who completed the GERRI discussed newspaper articles or news broadcasts with Agnes to be able to answer the first item accurately. Concerning the MMQ results, Agnes obtained lower scores on several items in the post-test such as “I am generally pleased with my memory ability,” “my memory is really going downhill lately,” and “I am embarrassed about my memory ability.”4 This suggests a decline in Agnes' memory. 
Researcher 1, however, argued that these results do not contradict her assessment. She explained that her attempts to engage Agnes in extended conversation, which included life-history questions, may have contributed to Agnes' increased awareness of memory difficulties. In other words, Researcher 1's questions to Agnes about her life history heightened Agnes' awareness of her own memory abilities.

3 The same person completed the pre- and post-test GERRI for Agnes.
4 Note that the last two items, “my memory is really going downhill lately” and “I am embarrassed about my memory ability,” were reverse scored.

Affect

Agnes' mood depended on how she was feeling physically. When she was in pain, Agnes was not in a good mood, and did not want to have a visitor. As for self-esteem, Agnes saw herself as an old woman. She also got upset at herself when she could not remember things. Researcher 1 expected that Agnes' mood might improve over time and that it might make her feel better that someone was visiting her, showing interest in her, and chatting with her for a longer period of time. In terms of change in affect, Researcher 1 observed that during the second last session, Agnes mentioned losing memory ability with age (“age does make you forget though”) and did pause several times when trying to remember something, but she did not get upset about it as she had during the earlier sessions. During this session, Agnes also mentioned her hearing problems, specifically ringing in her ears, but she said ‘bell’ as she could not remember the word ‘ringing.’ Instead of getting upset about her hearing/health problems, however, the word ‘bell’ triggered the lyrics of a song, which she started singing. As Table 2 shows, Agnes obtained a lower score in the post-test on the GERRI-Mood scale. This indicates a decline in Agnes' mood in the post-test and contradicts Researcher 1's impression that Agnes experienced an improvement in affect towards the end of the study.
Researcher 1 noted that the GERRI-Mood results might have depended on who completed the GERRI. Researcher 1 was able to observe Agnes' interactions with other individuals over the course of the languaging sessions, and noted that Agnes tended to be much more cheerful with individuals that she liked, but not with others. Agnes' mood also varied according to how she was feeling physically, which the GERRI does not take into account.

Social functioning

Researcher 1 noted that she had the feeling that Agnes was not very interested in participating in activities at the LTCF. In addition, on days when she was not feeling well, Agnes did not show much interest in the languaging activities. However, in terms of social skills, when she was feeling well physically, Agnes often asked Researcher 1 how she was and if she wanted to have a cup of tea (when a nurse brought tea for her). In addition to expecting change in Agnes' mood, Researcher 1 expected that her meetings with Agnes would improve the latter's social functioning. Analyses of the transcripts indicated that, compared with earlier sessions, Agnes showed much more engagement in conversations and got involved with the languaging task (a newspaper column about cross-religious marriages) in the second last session. Agnes expressed strong opinions about the topic and, for the first time, showed an interest in an activity at the LTCF, telling Researcher 1 about the song book that the LTCF had and suggesting that Researcher 1 take a look at this book (see Deters et al., submitted for publication, for details). Researcher 1 was not sure whether to attribute this change to the meetings or to how Agnes was feeling physically at the time of the second last session, since in previous sessions, when she was not feeling well, Agnes expressed little interest in activities. Researcher 1 did not notice any changes in Agnes' social skills, however.
The pre-test and post-test scores on the GERRI-Social Functioning scale for Agnes are both 6 out of 24. These results are low but indicate no change in Agnes' social functioning as perceived by others, which is not consistent with Researcher 1's assessment that Agnes experienced an improvement in her social functioning (i.e., engagement in social interactions). Researcher 1, however, noted that Agnes' social functioning depended on how she was feeling physically, and also with whom she was relating, aspects that the GERRI does not take into account. Agnes obtained a score of 0 in the pre-test and a score of 4 for the post-test (i.e., improved) on the item, “does not pursue everyday activities.” Researcher 1 noted that this change is real and consistent with her analyses since according to her comparison across three sessions, she did notice increased social engagement.

Other domains

Researcher 1 noted that Agnes expressed surprise that someone was interested in talking to her and hearing about her life and that she frequently spoke about her health problems, and said a number of times that she was practically blind, although she was able to read a text in large print. Researcher 1 expected to see changes in how Agnes saw herself (i.e., identity) as a result of someone visiting her regularly and showing interest in her. Researcher 1 noted that the transcripts of the languaging sessions with Agnes show evidence of such change. None of the measures detected such a change in Agnes' self-concept.

Mike

Cognition

In terms of first impression, Researcher 2 reported that she was struck by how intelligent Mike was although she noted some incongruities. For example, Mike said he had been in the facility for 10 years when the facility was constructed only 2 years earlier. He also said he had had his stroke in the late 1960s when in fact it happened in 1996.
As they moved through the sessions, Researcher 2 realized that there were gaps in Mike's cognitive functioning, perhaps related to his failing eyesight. Researcher 2 did not expect to be able to effect changes in those areas where Mike might have had some physical shortcomings, such as failing eyesight leading to lack of appreciation of editorial cartoons and some inability to discuss recent political events. But the opportunity to language seems to have affected Mike's mood and self-esteem positively. The cognitively complex exchanges between Mike and Researcher 2 served to help Mike reestablish himself as an actively engaged and knowledgeable individual (see Lapkin et al., in press).

As Table 2 shows, the pre- and post-test scores for Mike on the MMSE are 26 and 27 out of 30, respectively. These results indicate no cognitive impairment (i.e., he was at or above the cut-off score of 26) and no change in terms of this test. These results are consistent with Researcher 2's assessment of no change for Mike and reflect her impression of Mike as being a competent person. Researcher 2 reported that Mike's MMSE results are interesting, particularly his ability to remember the exact date. She noted that some MMSE items require some dexterity, such as folding a paper and writing a sentence, where Mike's physical state might impede his performance. In both the pre- and post-test, Mike wrote “this test stinks.” Researcher 2 noted that the test may have been too simple for him and that he probably regarded it as an insult to his intelligence. Mike obtained a higher score on the post-test GERRI cognitive functioning scale (36, compared to 22 on the pre-test, out of 44), suggesting a large improvement in his cognitive functioning. These results are difficult to interpret, however, because the pre- and post-test GERRI were completed by two different individuals. The GERRI results are not consistent with the MMSE results or Researcher 2's assessment of no cognitive change for Mike.
Researcher 2 noted that the inconsistencies across the two measures (MMSE and GERRI) might be due to the fact that the GERRI cognitive items are quite different in nature, for the most part, from MMSE items, and this might explain the gain in score on the GERRI, whose items may be more responsive to the type of intervention implemented. For example, it is possible that the intervention had a positive impact on such GERRI items as “remembers points in conversation after interruption” or “grasps point of newspaper articles, news broadcasts, etc.” In addition to these two items, Mike obtained higher scores on other GERRI items such as “remembers familiar phone numbers” and “remembers names of close friends.” Researcher 2 was surprised by these results because Mike's wife indicated that he really did not remember numbers or even use the phone much.

For the first section of the MMQ, how I feel about my memory, Mike had a pre-test score of 63 and a post-test score of 50 (out of 72). These results indicate a decline in Mike's self-assessment of his memory. This is not consistent with Researcher 2's assessment of no change in Mike's memory. Researcher 2 noted that there were only a few really dramatic changes at the item level (i.e., of 2 points or greater). She found it interesting that Mike was “Concerned about [his] memory” at the post-test, and suggested that this might be because his doctor had expressed concern and Mike was reflecting that. Researcher 2 was surprised that Mike reported getting more “upset when [he has] trouble remembering something,” but she could not relate Mike's feeling to the intervention. Note, however, that Mike was “generally pleased with [his] memory ability” in the post-test. Mike also obtained a lower score on the second section of the MMQ, which measures self-appraisal of one's memory capabilities, indicating a decline in his assessment of his memory. Researcher 2 perceived this as “not a very dramatic change” (5 points out of 64).
In addition, the main change concerned one item, “Forget a birthday or anniversary that you used to know well.” Researcher 2 noted that although she did not have opportunities to observe this, she did not feel that this change was real. She pointed out that Mike may just have felt more negative on the day of the post-test, or perhaps he had encountered a recent example of his not remembering something.

Affect

Researcher 2 reported that Mike was tentative about their meetings at first, until he got to know her better, when he began to welcome the meetings. He seemed content with the quality of the facility, the meals, and the help he got there. Once she realized how intelligent Mike was, Researcher 2 started to wonder if the intervention would affect his cognitive functioning. Researcher 2 noted that Mike became more attached to her and that his wife reported him taking pride in the nature of their conversations. Evidence for this observed change can be seen in the transcripts where Researcher 2 compliments Mike on all the knowledge he brings to their discussions. The only measure of affect included in this study was the GERRI-Mood scale. The pre- and post-test scores for Mike were 11 and 7 out of 16, respectively. These results indicate a decline in Mike's mood and contradict Researcher 2's impression that Mike experienced an improvement in affect. Most of the changes at the item level, however, were of one point, except for the item, “Mood changes from day to day, happy one day, sad the other,” which showed a decline of two points in the post-test. Researcher 2 noted that the decline registered by the GERRI may be measurement error caused by the pre- and post-test being completed by two different individuals.
Researcher 2 observed that Mike always seemed welcoming to her after her initial visits and fairly even-tempered, but she had heard that he could be ‘difficult.’ Researcher 2, as a result, felt that the qualitative data (i.e., transcripts) provide a more reliable indicator of changes in Mike's mood than the few items in the GERRI-Mood scale.

Social functioning

Researcher 2 observed that Mike was certainly open to making friends and was glad to have her as a new friend. He also seemed to interact with his neighbors appropriately. Mike, however, had relatively little to say to his wife, perhaps because of so much familiarity that one no longer ‘needs’ to talk and because his wife was preoccupied with her own mother's deteriorating health. Researcher 2 did not know much about Mike's social functioning at the beginning of the sessions. Apparently, however, he was not participating actively in activities open to him and had ceased attending the Residents' Council meetings. Researcher 2 had no expectations concerning changes in Mike's social functioning, but she noted that he became increasingly engaged in the life of the facility during and after the intervention. For example, he resumed attendance at the Residents' Council at the LTCF. In addition, Mike's wife, Anna, believed that he was more outgoing, talkative, and even more affectionate with her after Researcher 2's visits. Despite visiting Mike on a daily basis, Anna felt she did not always have the energy to engage Mike in stimulating conversation. She stated, “You know, you see married couples sitting in a restaurant and not a word is exchanged. That's Mike and I now.” However, Anna noticed a positive difference in Mike after Researcher 2's visits. The pre-test and post-test scores on the GERRI-Social Functioning scale for Mike are 13 and 22 out of 24, respectively.
These results indicate a large improvement in Mike's social functioning as perceived by others, which is consistent with Researcher 2's assessment. Researcher 2, however, was surprised by these results, particularly that Mike obtained higher scores on the post-test for the following two items: “Handles incoming calls” and “Continues to work on some favorite hobby.” Researcher 2 was surprised because she did not think that Mike used the phone much at all, although he probably did answer it. Researcher 2 had difficulty interpreting the results on the GERRI-Social Functioning scale, but she felt the qualitative data were more reliable than the GERRI data, particularly because the pre-tests and post-tests were completed by two different individuals.

Other domains

Researcher 2 noted that several episodes show Mike asserting his agency, that his wife found his behavior changed for the better, and that he developed more self-esteem as the sessions progressed. Mike tended to assert his agency in early meetings, but then he became more accommodating to Researcher 2's needs as an interlocutor. In addition, providing Mike with the opportunity to language in the visits was associated with enhancing his self-esteem. Mike was given an opportunity to share his abundant knowledge of topics which were meaningful and interesting to him and, in so doing, his self-image improved. Because his needs and desires were acknowledged and respected, his sense of personal control during the sessions also improved. As noted above, through the opportunity to language, Mike's mood and self-esteem improved. The cognitively complex exchanges between Mike and Researcher 2 served to help Mike reestablish himself as an actively engaged and knowledgeable individual. This was evident in Mike's engagement and enthusiasm and confirmed by his wife. None of the three measures included in the study captured these important changes, however.
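The pre- and post-test scores reported above for Mike can be lined up as simple difference scores. The following is only an illustrative sketch, not part of the study's analysis: the dictionary keys, the `change_summary` helper, and the expression of each difference as a percentage of the scale maximum are assumptions introduced here, and higher scores are read as "better" only because that is how the text interprets them. The second section of the MMQ is omitted because only its difference (5 points out of 64) is reported.

```python
# Pre/post scores for Mike as reported in the text: (pre, post, scale maximum).
scores = {
    "MMSE": (26, 27, 30),
    "GERRI cognitive functioning": (22, 36, 44),
    "MMQ section 1": (63, 50, 72),
    "GERRI mood": (11, 7, 16),
    "GERRI social functioning": (13, 22, 24),
}


def change_summary(pre, post, maximum):
    """Return the raw difference score and that difference as a
    percentage of the scale maximum (hypothetical helper)."""
    diff = post - pre
    return diff, round(100 * diff / maximum, 1)


summary = {name: change_summary(*vals) for name, vals in scores.items()}

for name, (diff, pct) in summary.items():
    # A positive sign means a higher post-test score.
    print(f"{name}: {diff:+d} points ({pct:+.1f}% of scale maximum)")
```

Such a tabulation makes the disagreement in direction and size across the measures easy to see, but it cannot indicate which of them, if any, reflects a real change; that is the question taken up in the discussion.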
Summary and discussion

The findings reported above reveal several differences and contradictions among the three measures as well as between the measures and the researchers' assessments of the changes in the affective, cognitive and social functioning of the participants. These differences are due to (a) differences in the perspectives that each source of information represents [e.g., self vs. other], (b) differences in focus [e.g., one domain at a time vs. all domains simultaneously], and (c) timing and length of observation, which affect the quality of the results [e.g., at beginning and end vs. throughout the intervention]. These differences and contradictions raise questions about the validity of the three measures in detecting and estimating change over time in cognition, affect and social functioning in older adults. For example, Researcher 1 raised questions about the validity of the inferences made about Agnes based on the measures in this study. In particular, Researcher 1 asked who completed the tests, what kind of, and how much, contact the person had with Agnes, and how Agnes was feeling at the time the measures were administered or completed. Researcher 1 also believed that the qualitative analysis of her extended interactions with Agnes, such as comparison of discourse, memory ability and self-concept across three sessions, provided information about Agnes in terms of change in cognition and affect that was not captured by the standardized measures used in this study. Similarly, Researcher 2 felt that the three measures did not really reflect what the study was trying to achieve with the intervention, despite their psychometric qualities.

There are several limitations and questions concerning the three measures. Perhaps the main limitation is that none of them takes into account the context (i.e., physical, social and individual) within which the assessment took place.
For example, none of the measures could detect how Agnes was feeling (physically as well as emotionally) when the measures were administered to her. Nor do these measures collect information about recent histories of the participant, information that is essential for a valid interpretation of test results. For instance, the MMQ did not reflect the fact that Mike had had feedback from his doctor about his memory that might have affected his response to some of the MMQ items in the post-test. Second, none of the measures examines the relationships and interactions between emotion, cognition, and context, such as how contextual factors enhance or hinder cognition and affect. The researchers' extended engagement with the participants allows the collection of crucial information not only about how the individual performs an activity, but also how that performance compares to previous performances and relates to the specific context (physical, social and individual) and point in time in which the activity is undertaken, information that is essential for identifying and understanding the causes, magnitude, direction and consequences of change.

The three measures and the researcher's judgments differ at several levels, differences that have important implications for using either approach to assess change over time in cognition and affect in older adults appropriately and accurately. Table 3 summarizes some of the main differences. For example, as described in rows 1 to 5, the three measures included closed items, were administered at two points in time (before and after the intervention), focused on specific aspects of cognition and affect in isolation, were outcome-oriented, and tended to consider the individual out of context. The researcher's judgment, by contrast, is based on extended observation of the participant as a whole in their ecosystem, i.e., their specific physical, historical and social context (rows 2 and 3). As such, it allows the detection of moment-by-moment as well as long-term changes both in specific subdomains and higher domains, both in processes and outcomes. In addition, given its open-ended nature, this approach allows the detection not only of changes in the domains of interest, i.e., cognition, affect and social functioning, but also changes in other domains such as identity.

Table 3
Differences between tests and researcher judgment.

Row | Dimension | Tests | Researcher judgment
1 | Format | Closed items | Open-ended questions/discussion
2 | Time | Before and after intervention | Extended: before, during and after intervention
3 | Focus | Individual (in isolation) out of context; outcome-oriented | Individual in his/her ecosystem; outcome and process
4 | Specificity | Specific: one domain or aspect at a time | Both general and specific
5 | Approach | Discrete (e.g., cognition in isolation) | Wholistic (e.g., cognition and affect in context)
6 | Interaction | No interaction, distant, detached | Interactive, caring, close, involved
7 | Motivation and relevance | Low or not motivating, not engaging; less relevant | Motivating, engaging, interesting; more relevant to participant and current topic
8 | Roles | Fixed roles/identities | Allows different roles/identities
9 | Relationships and control | Hierarchical relationship; tester has control over questions, topics, pace, etc. | Participant in conversation, collaborative, equal; participant encouraged to be involved in decisions about topics and content of sessions

Table 3 shows also that the two approaches differ in terms of the roles, relationships, and interactions between the participants in the assessment activity (rows 6 to 9).
With standardized measures, such as the ones used in this study, there is little or no meaningful interaction between the participants who have fixed roles (e.g., tester–testee, staff– patient; rows 6 and 8). In addition, these measures tend to be less engaging or relevant and to place control in the hands of the tester over the content and pace of the questions asked (rows 7 and 9). These characteristics are likely to diminish the ability of these measures to detect change. In the current project, we tried to involve the participants as much as possible in deciding on the topics and questions discussed during the languaging sessions in order to engage and motivate them to language (row 9). In addition, the interactions between the researchers and the participants tended to be close and involved (row 6). These characteristics are likely to allow the topics and questions discussed to be relevant and engaging for the participant, and to allow the participant to adopt different roles and identities and to disclose more information about their experiences and performance (rows 7 and 8). We believe that these characteristics make the researcher judgment better at (a) detecting small changes over short and extended periods of time, (b) detecting changes at more than one level and domain, (c) allowing the mapping of such changes more accurately at shorter intervals, and, most importantly, (d) providing explanations for such changes (as well as no change) within a broader context (including the physical and social environment as well as individual history). The findings of this study also point to some issues with each of the measures. First, Mike and Agnes obtained lower scores on the post-test MMQ, a self-assessment tool, which suggests a decline in their memory ability. 
We would like to argue that the lower scores on the post-test may reflect, not a decline in memory, but an improvement in the participants' ability to assess their own memory ability as a result of their interactions with the researchers. In other words, the intervention seems to have improved the participants' ability to assess their own memory accurately by making them more realistic about their memory ability. The MMQ seems to show this change as a negative one (decline in scores), but our observations highlight it as a positive change. Second, the quality of information from the GERRI should depend on the quality of the tool itself (i.e., questions, topics, etc.). However, we think the quality depends more on the informant who completes it (e.g., who they are, what their knowledge of and relation to the participant is) and context (when, where, etc.). For example, the item about ability to understand news articles requires that the informant engage in such a discussion with the participant, which is not likely to happen with health professionals who have little time to interact extensively with the participant. As a result, we believe that judgment based on extended engagement and meaningful interactions with the participants, similar to those undertaken in the current study, provides a more accurate approach to detecting and assessing change in cognition, affect and social functioning in older adults. Third, though an objective and practical measure that is widely used, the MMSE suffers from several limitations compared to extended interactions with the participant. In particular, the participant may resist (i.e., refuse to answer or answer differently than they feel) some MMSE items if they perceive them to be irrelevant and/or to insult their intelligence (e.g., Mike).
This is less likely to happen in a caring and close interaction between the researcher and the participant, as was the case in the current study. Of course, the main advantage of the three measures is that they are practical (i.e., take less time) and are easy to administer compared to longer interactions with participants. However, we believe that practicality should not take precedence over the ability to detect changes over time and concerns of validity and fairness of assessment results.

Limitations and further research

As with any research, there were limitations to the present study. First, while the intervention (languaging) rests on Vygotsky's (1978) sociocultural theory of mind, the assessment tools we used are rooted in cognitive psychology. However, we were not able to identify any measures that are theoretically compatible with our intervention given its novelty. Second, the study was not originally designed to assess the quality of the measures. As a result, the participants were not specifically asked about their perceptions of and reactions to the measures. Nor did any of the participants talk about the tests or their experiences with the tests during the interactions with the researchers. Third, the changes we made in the GERRI and MMQ to fit the purposes of this study (e.g., deleting some items and sections) might have affected their performance. Fourth, as noted above, there were several problems related to test administration. Finally, the current study raises questions as to how to consolidate results from different data collection methods and sources, particularly when the results of these methods do not converge. Triangulation is often recommended as a powerful validation strategy in research, but our findings suggest that this strategy raises many questions when data sources differ in terms of their theoretical foundations, scope, and accuracy. There are two main implications for future research.
First, it is important in any study evaluating the effectiveness of an intervention to collect information on the process of the intervention as well as its outcomes (cf. Chen, 2007; Slayton & Llosa, 2005). A pre-test/post-test design is limited in that it answers only the yes/no question of whether a change happened; it does not capture the process, nature, and causes of change (or no change) (cf. Calais, 2008; Greene, 1998; Lavelli et al., 2005; Saldana, 2003; Slayton & Llosa, 2005; Wenger, 1999). Second, the intervention and the evaluation of its effectiveness should ideally be grounded in the same theoretical framework and based on the same assumptions about cognition and change. This could facilitate the integration of the intervention and the assessment of its processes and outcomes. This is the case, for example, in dynamic assessment (e.g., Lantolf & Poehner, 2004). Such an approach has the advantage of not only detecting small-scale changes throughout the intervention, but also allowing for the immediate adjustment and adaptation of the intervention in response to such changes.

Acknowledgments

This research was made possible through a grant (no. 41004-2099) from the Social Sciences and Humanities Research Council of Canada to Merrill Swain and Sharon Lapkin. We thank Ping Deters, Iryna Lenchuk, Kyoko Motobayashi and Paula Psyllakis for their feedback on earlier drafts of this paper. We are grateful to Agnes and Mike and to the senior staff of the long-term care facility for supporting our research endeavors.

References

Calais, G. J. (2008). Microgenetic analysis of learning: Measuring change as it occurs. National Forum of Applied Educational Research Journal, 21(3), 1–7.
Chen, Y. (2007). Learning to learn: The impact of strategy training. ELT Journal, 61, 20–29.
Cronbach, L. J., & Furby, L. (1970). How we should measure “change”: Or should we? Psychological Bulletin, 74, 68–80.
Deters, P., Swain, M., & Lapkin, S. (submitted for publication). The relationship between languaging and cognitive functioning: The case of an older adult in a long term care facility.
Dijkstra, K., Bourgeois, M. S., Allen, R. S., & Burgio, L. D. (2004). Conversational coherence: Discourse analysis of older adults with and without dementia. Journal of Neurolinguistics, 17, 263–283.
Folstein, M. F., Folstein, S., & McHugh, P. R. (1975). Mini-mental state: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189–198.
Foreman, M. D., & Grabowski, R. (1992). Diagnostic dilemma: Cognitive impairment in the elderly. Journal of Gerontological Nursing, 18, 5–12.
Foreman, M. D., Fletcher, K., Mion, L. C., & Simon, L. (1996). Assessing cognitive functioning. Geriatric Nursing, 17, 228–233.
Fort, I., Adoul, L., Holl, D., Kaddour, J., & Gana, K. (2004). Psychometric properties of the French version of the Multifactorial Memory Questionnaire for adults and the elderly. Canadian Journal on Aging, 23, 347–357.
Granott, N., & Parziale, J. (Eds.). (2002). Microdevelopment: Transition processes in development and learning. Cambridge: Cambridge University Press.
Greene, J. F. (1998). Qualitative, interpretive evaluation. In A. J. Reynolds, & H. J. Walberg (Eds.), Evaluation research for educational productivity (pp. 135–154). Greenwich, CT: JAI Press.
Kuhn, D. (1995). Microgenetic study of change: What has it told us? Psychological Science, 6, 133–139.
Lantolf, J. P., & Poehner, M. E. (2004). Dynamic assessment of L2 development: Bringing the past into future. Journal of Applied Linguistics, 1, 49–72.
Lapkin, S., Swain, M., & Psyllakis, P. (in press). The role of languaging in creating zones of proximal development (ZPDs): A long term care resident interacts with a researcher. Canadian Journal on Aging, 10.
Lavelli, M., Pantoja, A. P. F., Hsu, H., Messinger, D., & Fogel, A. (2005). Using microgenetic designs to study change processes. In D. M. Teti (Ed.), Handbook of research methods in developmental science (pp. 40–65). Malden, MA: Blackwell Publishing.
Lee, K., & Karmiloff-Smith, A. (2002). Macro- and microdevelopmental research: Assumptions, research strategies, constraints, and utilities. In N. Granott, & J. Parziale (Eds.), Microdevelopment: Transition processes in development and learning (pp. 243–265). Cambridge: Cambridge University Press.
Patton, M. Q. (2002). Qualitative research and evaluation methods, 3rd ed. Thousand Oaks, CA: Sage.
Rozenbilds, U., Goldney, R. D., & Gilchrist, P. N. (1986). Assessment by relatives of elderly patients with psychiatric illness. Psychological Reports, 58, 795–801.
Saldana, J. (2003). Longitudinal qualitative research: Analyzing change through time. Blue Ridge Summit, PA: AltaMira Press.
Schwartz, G. E. (1983). Development and validation of the Geriatric Evaluation by Relatives Rating Instrument (GERRI). Psychological Reports, 53, 478–488.
Schwartz, G. (1988). Geriatric evaluation by relative's rating instrument (GERRI). Psychopharmacology Bulletin, 24, 713–716.
Siegler, R. S. (2002). Microgenetic studies of self-explanation. In N. Granott, & J. Parziale (Eds.), Microdevelopment: Transition processes in development and learning (pp. 31–58). Cambridge: Cambridge University Press.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. Oxford: Oxford University Press.
Slayton, J., & Llosa, L. (2005). The use of qualitative methods in large-scale evaluation: Improving the quality of the evaluation and the meaningfulness of the findings. Teachers College Record, 107, 2543–2565.
Swain, M. (2006). Languaging, agency and collaboration in advanced language proficiency. In H. Byrnes (Ed.), Advanced language learning: The contribution of Halliday and Vygotsky (pp. 95–108). London: Continuum.
Swain, M., & Lapkin, S. (2008). Evidence of cognitive change: Languaging with an older adult. Paper presented at the annual conference of the American Association of Applied Linguistics, Washington, D.C.
Troyer, A. K., & Rich, J. B. (2002). Psychometric properties of a new metamemory questionnaire for older adults. Journal of Gerontology, 57B, 19–27.
Troyer, A. K. (2001). Improving memory knowledge, satisfaction, and functioning via an education and intervention program for older adults. Aging, Neuropsychology and Cognition, 8, 256–268.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Wenger, G. C. (1999). Advantages gained by combining qualitative and quantitative data in a longitudinal study. Journal of Aging Studies, 13, 369–376.