HAL (Le Centre pour la Communication Scientifique Directe), Jun 13, 2008
The use of corpora in language teaching and learning (data-driven learning, or DDL) has generated considerable interest over the last two decades. However, a survey of this research shows a surprising lack of empirical studies regarding the key question of ...
HAL (Le Centre pour la Communication Scientifique Directe), 2012
The tremendous variety of language corpora, tools and ways to access them have opened up entire new spheres of application in language teaching and learning. The book deals with a variety of uses of language corpora in foreign or second language teaching and learning, and is divided into four parts. Sections 1 and 2 look at corpora as input, first exploring general issues of how they can inform language teaching, and second describing how this can be put into practice and evaluating concrete uses of corpora with learners. Sections 3 and 4 look at learner corpora as output: this includes comparison with native-speaker corpora to identify ‘errors’ or areas of difficulty, but also shows what learners can and do know at different levels of proficiency, and what this tells us about the learning process.
Corpora, in a broad sense, have had a role to play in language teaching and learning for many decades. Of note are Thorndike and Lorge’s Teacher’s Word Book of 30,000 Words (1944), West’s General Service List (1953), or Gougenheim (e.g., 1958) and colleagues’ work on the Français Fondamental, but these mostly involved indirect applications, especially identifying frequent items (forms, meanings and uses) for inclusion in syllabuses and language programmes. Such work continues in lexicography, largely thanks to the pioneering Cobuild work led by the late John Sinclair (1987) (almost all major dictionaries, grammar books and manuals today are corpus-based to some extent, for major world languages at least), not to mention the proliferation of frequency lists (e.g., the series of Routledge Frequency Dictionaries) and various academic research projects, from Coxhead’s Academic Word List (2000) to Martinez and Schmitt’s (2012) Phrasal Expressions List, as corpora have much to tell us no...
The question initially put to the round table was: ‘should authentic documents be adapted for learning purposes?’ The speakers addressed various themes related to this question, covering a range of discussion points that also surface in the articles in this issue of Mélanges CRAPEL. The text below presents excerpts from the round table and does not claim to be a full transcription of the event. It was reviewed by the speakers named above before appearing in the present volume.
Entre présence et distance : Enseigner et apprendre les langues à l’université à l’ère numérique; In I. Cros, N. Kübler, G. Miras & A. Burrows (eds), 2024
Corpus linguistics serves many educational purposes, notably by contributing to better language descriptions which, in turn, can inform pedagogical tools. For highly specialised questions or needs, one can turn to equally specific corpora, sometimes custom-built by those concerned. In language learning, the use of corpora is relatively well known as data-driven learning (DDL; in French, apprentissage sur corpus, or ASC), with immediate benefits that require no substantial training in corpus linguistics. In fact, the tools and techniques of corpus linguistics can be used in almost any field that requires the analysis of quantities of text. The approach adopted here uses language not as an end in itself but as a means of exploring content. This chapter aims to describe a university course in which students choose a topic according to their own interests, build a corpus to answer their questions on that topic, and write a research report on their work. While this is therefore not DDL as such, the exercise rests on the same principles of authenticity, autonomy, constructivism, discovery learning, and exposure to large quantities of text. The projects are described in relation to the students’ areas of interest; the chapter then looks more closely at their personal feedback to gain insight into how they appropriate the approach and how students new to the field feel about it. The study is motivated by a practical scenario, comparing a linear reading with what corpus tools contribute to the analysis of these reactions.
Bloomsbury handbook of language learning and technology; R. Hampel & U. Stickler (eds), Bloomsbury, 2024
Data-driven learning (DDL) can be broadly defined as the use of corpus tools and techniques for learning or using a foreign or second language (L2). This chapter begins with the origins of DDL, situating its spread and adoption in the practice and use of language data mediated by technology. It then looks at the current scope of DDL, the role of technology and DDL uses in instructed language learning settings. The final section considers some under-explored areas for future research, with younger learners, mobile assisted language learning (MALL) and the use of DDL with languages other than English. It closes with practical tips for language teachers who wish to explore DDL with their students.
Corpora for language learning: Bridging the research-practice divide; P. Crosthwaite (ed), 2024
Data-driven learning (DDL) involves using the tools and techniques of corpus linguistics for teaching and learning second or foreign languages. Since its first appearance over 30 years ago, hundreds of papers have sought to evaluate different aspects of its use. In this chapter, Peter Crosthwaite talks to Alex Boulton about his syntheses of research in this area, and how researchers can make DDL more accessible.
Du recueil de données à l’analyse des corpus en didactique des langues; V. Privas-Bréauté (ed), 2024
This is a pre-publication version; please refer to the final paper where possible. Boulton, A. (2024). Postface : Corpus et didactique en France. In V. Privas-Bréauté (Dir.), Du recueil de données à l'analyse des corpus en didactique des langues (pp. 173-178). Presses Universitaires de Rennes.
Beyond the concordance: Corpora in language education; P. Pérez-Paredes & G. Mark (eds), 2021
Data-driven learning (DDL) typically involves language learners consulting corpus data, either directly or via prepared materials, to answer questions about language. The approach has been mooted since the beginning of the modern era of corpus linguistics, and has come to be associated with work by Tim Johns who coined the term in print in 1990. Since then, hundreds of studies have attempted to evaluate some aspect of DDL, giving rise to several reviews and syntheses. This paper introduces DDL and discusses the syntheses to date, before analysing a rigorous collection of 351 studies published up to and including 2018. While previous syntheses have evaluated the field, the objective here is to provide an overview of how researchers see DDL across the board, to identify more clearly what DDL actually looks like today, how it has evolved from its early beginnings in the 1980s, and to suggest avenues for future research in underexplored areas.
This paper analyses the methodologies in 148 empirical data-driven learning studies for L2 English in prestige journals to examine best practice. Manual coding and corpus analysis of key words and n-grams from the past five years (2018-22) explore the field as a whole and how methodologies have evolved, suggesting improvements and future avenues for research.
This study traces the evolution of Computer Assisted Language Learning (CALL) by investigating published research articles (RAs) in four major CALL journals: ReCALL, Computer Assisted Language Learning (CALL), Language Learning & Technology (LL&T), and CALICO Journal. All 2,397 RAs published over four decades (1983-2019) were included in the pool of data and the Google Scholar citation metric was adopted to assess the impact of the papers. By selecting the top 15% of widely-cited papers from each individual year, we minimized the time bias between years, enabling a balanced narration of the history of CALL through a representative dataset of 426 high-impact RAs. To identify the evolution of research trends, the contexts, methodologies, theoretical underpinnings and research foci of all 426 RAs were investigated using NVivo 12 and AntConc. The analysis of the data yielded encouraging results such as the upward trend in the number of publications and the international reach of CALL in the last two decades, the physical or virtual presence of language learners with diverse language profiles, and the growing tendency to triangulate methodology for increased complexity. However, long-standing issues such as the heavy reliance on traditional research contexts, poor reporting practices of basic demographic information, the large number of atheoretical papers and the concentration on a limited number of research foci continue to pose challenges in CALL research. Based on the findings, the paper suggests solutions for the controversies and addresses key issues for future research in CALL.
Papers by Alex Boulton