  • I am Professor of English Language at Lancaster University. My research interests include corpus linguistics, languag...
‘By exploring the entire gamut of the representation of masculinity in both old and new media and across a wide range of disciplines, Baker and Balirano get readers really thinking about what it means to be a man in today’s liquid society. Guaranteed to raise awareness about the diverse ways of being and performing masculinity, the book provides a novel contribution to an exciting new field opening up new avenues for other researchers.’
—Delia Chiaro, Professor of English Linguistics and Translation, University of Bologna, Italy, and President of the International Society of Humor Studies

‘Exploring the interface of queer studies with the fields of linguistics, anthropology, semiotics, critical discourse analysis, literary and film studies, the articles in this collection draw a multifaceted picture of the discursive construction and representation of queer masculinities in a range of text genres and contexts. They engage in fascinating analyses of various aspects of queer masculinities, including issues such as consumer culture, representation in TV series, films, literature and art, intersectionality with trans and racial identities, homophobic discourse and subordination through hegemonic masculinity.’
—Heiko Motschenbacher, Western Norway University of Applied Sciences, Bergen
TITLES IN THE SERIES INCLUDE Peter Trudgill A Glossary of Sociolinguistics 0 7486 1623 3 Jean Aitchison A Glossary of Language and Mind 0 7486 1824 4 Laurie Bauer A Glossary of Morphology 0 7486 1853 8 Alan Davies A Glossary of Applied Linguistics 0 7486 1854 ...
Is the British press prejudiced against Muslims? In what ways can prejudice be explicit or subtle? This book uses a detailed analysis of over 140 million words of newspaper articles on Muslims and Islam, combining corpus linguistics and discourse analysis methods to produce an objective picture of media attitudes. The authors analyse representations around frequently cited topics such as Muslim women who wear the veil and 'hate preachers'. The analysis is self-reflexive and multidisciplinary, incorporating research on journalistic practices, readership patterns and attitude surveys to answer questions which include: what do journalists mean when they use phrases like 'devout Muslim' and how did the 9/11 and 7/7 attacks affect press reporting? This is a stimulating and unique book for those working in fields of discourse analysis and corpus linguistics, while clear explanations of linguistic terminology make it valuable to those in the fields of politics, media studies, journalism and Islamic studies.
This title offers a set of definitions of key terms in discourse analysis, a core area of all linguistics and language studies courses. Unlike many other areas of linguistics, discourse analysis is a complex field to define, comprising a number of related but different theoretical and methodological frameworks. Discourse can mean many different things to different people. Students often find these multiple meanings confusing, and this book attempts to spell out and reconcile the different approaches, to give a holistic picture of discourse analysis as a branch of several disciplines. As well as comprising a glossary of key terms, the book provides examples, a section on key thinkers and their ideas, and key texts for further reading.
Sociolinguistics and Corpus Linguistics is the first book to focus on the ways that corpus linguistics approaches can be used to aid sociolinguistic research. Corpus linguistics and sociolinguistics have a great deal in common in terms of their basic approaches to language enquiry, particularly in terms of providing representative samples from a population and analysing quantitative information in order to study variation or differences between populations. The book covers a range of different topics within sociolinguistics: analysing demographic variation, comparing language use across different cultures and examining language change over time, studying transcripts of spoken interactions and identifying attitudes or discourses. The book references many key and recent studies in the field as well as featuring original analyses of a number of corpora including the British National Corpus, the corpus of Spoken English Dialects and the Brown family of corpora. In addition, a new corpus of written British English from around 2006 was collected for the purposes of writing the book. Techniques of analysis like concordancing, keywords and collocations are discussed, along with corpus annotation and statistical procedures such as chi-squared tests and clustering. The book takes a critical approach to using corpora in sociolinguistics, attempting to outline the limitations of the approach as well as its advantages.

Chapter 1 Introduction
Chapter 2 Sociolinguistic variation
Chapter 3 Diachronic variation
Chapter 4 Synchronic variation
Chapter 5 Corpora and interpersonal communication
Chapter 6 Uncovering discourses
Chapter 7 Conclusion
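
As an illustration of the chi-squared tests mentioned in the description above, the following is a minimal sketch of how a word's frequency in two corpora might be compared. It is not taken from the book: the function name chi_squared_2x2 and all counts are invented for illustration, and the 2x2 layout (the word versus all other words, in corpus A versus corpus B) is only one common way of setting up such a test.

```python
def chi_squared_2x2(freq_a, size_a, freq_b, size_b):
    """Pearson chi-squared statistic for one word's frequency in two corpora.

    The 2x2 table is (word, all other words) x (corpus A, corpus B).
    Assumes the word occurs at least once overall and that neither
    corpus consists only of that word (otherwise an expected value is 0).
    """
    observed = [
        [freq_a, size_a - freq_a],
        [freq_b, size_b - freq_b],
    ]
    total = size_a + size_b
    row_totals = [size_a, size_b]
    col_totals = [freq_a + freq_b, total - (freq_a + freq_b)]

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Invented counts: a word occurs 120 times in one 1,000,000-word corpus
# and 60 times in another corpus of the same size.
stat = chi_squared_2x2(120, 1_000_000, 60, 1_000_000)

# With 1 degree of freedom, 3.84 is the critical value for p < 0.05.
print(round(stat, 2), stat > 3.84)
```

A statistic above the critical value only indicates that the difference is unlikely to be due to chance; it says nothing about why the difference exists, which is where the qualitative side of the analysis comes in.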
Sexed Texts is aimed at undergraduate students and beginning post-graduate students, presenting a coherent overview of a wide range of theoretical and analytical perspectives in the diverse and rapidly evolving field of language, gender and sexuality.

The book aims to show how people use language to construct themselves (and others) as male, female, gay, heterosexual etc. while prioritising some identities as normal or preferable, some as deviant or subordinate and others as simply non-existent. The book uses a range of real-life, everyday language texts which reference gender and sexuality, including newspaper and magazine articles, religious texts, children's fiction, nursery rhymes, romantic fiction, pornography, ordinary conversations, chat room data and advertisements as well as relying on interview, focus group and corpus data.

The book considers questions such as "is there such a thing as a gay voice?", "do women have to 'talk like men' to succeed at work?", "why are bisexuals one in a million in language use?", "how have advertisers co-opted feminism?", "when is it OK to be a bachelor?", "has 'political correctness' had an impact on the way we refer to women?" and "what, exactly, is a dogger?"

Written in a clear way, Sexed Texts uses a combination of classic studies and new analyses in order to trace the development of the field, from early research which aimed to outline ways that men and women used language differently to each other, to studies which focussed on deconstructing the ways that language helps to create gendered and sexed discourses (or ways of understanding the world). The book critically considers feminist, queer and post-structuralist theories in order to show how identities are fluid, unstable and often linked to power hierarchies. However, it is argued that all of us hold multiple identities and experience moments of powerfulness and powerlessness, which must be constantly negotiated via language in ways that can be subtle or contradictory. The book therefore considers some of the most recent theoretical perspectives in the field and should be of value to any student or teacher of language, gender and sexuality.
Using Corpora in Discourse Analysis examines approaches to carrying out discourse analysis (DA) using techniques that are grounded in corpus linguistics. In the past much research on critical discourse analysis has focussed on analyses of single texts or small collections of texts. However, researchers working in CDA are beginning to acknowledge the potential of using corpora either to supplement their findings or as a valid methodology in itself. A corpus-based approach helps to provide quantitative evidence of the existence of discourses by enabling researchers to identify repetitive linguistic patterns of language use and to uncover hidden meanings in lexical items e.g. by examining collocations. Corpus linguistics also allows researchers to uncover linguistic evidence for prevailing/majority and resistant/minority discourses as a large corpus is likely to show a range of ideological positions - something which an analysis of a single text may be less likely to reveal.

Using Corpora in Discourse Analysis does not assume prior knowledge of corpus linguistics. The book examines and evaluates a variety of corpus-based methodologies including collocations, keyness, concordances and dispersion plots, using a range of examples from different types of corpora. It also considers issues of building and annotating corpora as well as the validity of approaching CDA from a combination of qualitative and quantitative perspectives. The book is illustrated with a number of real-life examples of corpus-based CDA from a range of sources and covering a variety of subjects, including:

• Holiday brochures
• Parliamentary debates about banning foxhunting
• Newspaper reports about refugees
• Representations of the words bachelor and spinster in general corpora

Chapter 1 Introduction
Chapter 2 Corpus Building
Chapter 3 Frequency and Dispersion
Chapter 4 Concordances
Chapter 5 Collocates
Chapter 6 Keyness
Chapter 7 Beyond Collocation
Chapter 8 Conclusion
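
To make two of the techniques named above more concrete, here is a minimal sketch of a concordance (keyword-in-context) listing and a raw collocate count. It is not code from the book: the helper names tokenize, concordance and collocates, the window size and the sample sentences are all invented for illustration, and a real study would normally rank collocates with an association statistic rather than raw counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, node, span=4):
    """Return (left context, node, right context) for each hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, span=4):
    """Count the words found within `span` tokens either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Invented sample text, for illustration only.
sample = (
    "The bachelor was described as eligible. "
    "An eligible bachelor arrived. "
    "The spinster was described as lonely."
)
tokens = tokenize(sample)

for left, node, right in concordance(tokens, "bachelor", span=2):
    print(f"{left:>15} [{node}] {right}")

print(collocates(tokens, "bachelor", span=2).most_common(3))
```

Reading the concordance lines is the qualitative step; the collocate counts point to which lines are worth reading first.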
Although sexual and romantic same-sex relationships between humans have existed for millennia, the ways that such relationships and the people who engage in them have been celebrated, normalised, accepted, ignored, problematised or persecuted have been subject to considerable variation over time and across different societies. Particularly over the last fifty years there has been an inordinate amount of controversy and negotiation concerning the ways that gay men have been talked and written about. Public Discourses of Gay Men explores the variety of ways that gay men are constructed in public settings in order to make sense of the current set of discourses or 'ways of seeing the world' that surround this group.

Taking a corpus-based analysis approach to examine millions of words of data from a range of contemporary sources, the book investigates how conflicting discourses have clashed together, resulting in a definition of homosexuality that is often ambivalent, confusing or contradictory.

The corpus-based approach allows for the identification of repeated patterns of language, showing the cumulative effect this has on discourse in everyday life. The following techniques are used to demonstrate these patterns:

• Collocational analyses - what sort of words tend to regularly appear next to or near words like "gay" and "homosexual" and how does this relate to different contexts?
• Discourse prosodies - how are gay people regularly constructed in language use? What are the most common patterns - which patterns are less frequent or resistant?
• Keywords and frequencies - what words, semantic concepts or grammatical categories tend to occur more frequently than expected by chance alone in public texts about gay men? What can this tell us about the ways that discourses of gay identity are currently constructed?
• Dispersion - how are terms like "gay" dispersed throughout particular texts and how do dispersion patterns relate to discourses of homosexuality?
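
The idea of a word occurring "more frequently than expected by chance alone" can be sketched with a keyness statistic. The example below uses the two-term log-likelihood formula that is widely used for keyword analysis in corpus linguistics; this is an assumption on my part, since the description above does not say which statistic the book relies on. The function name and all counts are invented for illustration.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness of a word in a target vs a reference corpus.

    Expected frequencies assume the word is spread across the two
    corpora in proportion to their sizes.
    """
    total = size_target + size_ref
    combined = freq_target + freq_ref
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total

    ll = 0.0
    # A zero frequency contributes nothing (the limit of x * log(x) is 0).
    if freq_target > 0:
        ll += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


# Invented counts: 120 hits in a 1,000,000-word target corpus against
# 60 hits in a 1,000,000-word reference corpus.
score = log_likelihood(120, 1_000_000, 60, 1_000_000)

# The score is conventionally compared with chi-squared critical values
# for 1 degree of freedom (3.84 for p < 0.05, 6.63 for p < 0.01).
print(round(score, 2), score > 6.63)
```

A high score flags a word as a candidate keyword; whether it points to a discourse still has to be established by reading the word in context.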

From conceptualisations of homosexuality as 'unnatural behaviour' in the House of Lords to discourses of shame and outrageousness in tabloid newspapers, it is still the case that homophobia underpins contemporary understandings of homosexuality. However, homophobia is only part of the story - personal adverts and erotic stories show us how desire is constructed for gay men as intensely masculine and ostensibly heterosexual. Additionally, sitcoms like Will & Grace reveal a definition of homosexuality that is weighted in aspirational class-consciousness and camp humour. The full range of discourses is demonstrated in the final analysis chapter which examines safe sex documentation.

Chapter 1: What Can I do with a Naked Corpus?
Chapter 2: Unnatural Acts: the House of Lords debates on gay male law reform
Chapter 3: Flamboyant, predatory, self-confessed homosexual: discourse prosodies in the British tabloid press
Chapter 4: "True Man" and "McFairyland": gay identities in an American sitcom
Chapter 5: "No effeminates please": discourses of gay men's personal adverts
Chapter 6: As big as a beercan: a comparative keyword analysis of lesbian and gay male erotic narratives
Chapter 7: Making safer sex sexy: border crossing, informalisation and gay identity in sexual health documentation
Chapter 8: Conclusion
This book is about a little-known part of gay history and maritime history. Long before cities like Manchester and London had "gay villages", British gay men formed their own gay village at sea, taking advantage of the relaxed holiday atmosphere of luxurious cruise ships, where they worked as waiters and stewards, sometimes even outnumbering the straight men in the catering departments of ships that were household names and the pride of the British fleet.

In the largely homophobic atmosphere of the 1950s, most gay men had to be closeted, and ships were the only public places where they could not only be safely out but also camp. It was not unheard of for straight crewmembers to protect their queer colleagues. “He may be queer, but he’s our queer,” one sailor once said. Hello Sailor! uniquely shows what it was like to be queer at sea at a time when land meant straightness.
Polari has been the secret language of gay men and women through the twentieth century. But more than a language, Polari is an attitude. From the prisons and music halls of Edwardian England to Kenneth Williams, American GIs in London and the Sisters of Perpetual Indulgence, Polari has been used to laugh, bitch, gossip and cruise. As with all slang, Polari's users coined an ever-changing vocabulary. Derived from words used by criminals, circus artists, beggars and prostitutes, it also employed Italian, Yiddish, French, rhyming slang and backslang. Polari speakers camped up a storm, from West End chorus boys and office workers to East End sea-queens.

Since gay liberation, lesbian and gay slang has become less a language of concealment than a language of specialization, though the tradition of camp remains. A carefully researched and entertaining read, The Dictionary of Polari and Gay Slang presents a lexicon of Polari and a more general dictionary of lesbian and gay slang. If you don't yet know what vada the bona cartes on the ommee ajax, parkering ninty, a Mexican nightmare or a nellyectomy mean, then this is the book for you.
Polari is a secret form of language mainly used by homosexual men in London and other cities during the twentieth century. Derived in part from the slang lexicons of numerous stigmatised and itinerant groups, Polari was also a means of socialising, acting out camp performances and reconstructing a shared gay identity and worldview among its speakers. This book examines the ways in which Polari was used in order to construct 'gay identities', linking its evolution to the changing status of gay men and lesbians in the UK over the past fifty years.

Chapter 1 What is Polari?
Chapter 2 Historical Origins
Chapter 3 Polari as a Language System
Chapter 4 Uses and Abuses
Chapter 5 Julian and Sandy
Chapter 6 Decline
Chapter 7 Revival
Chapter 8 Conclusion
Appendix: Polari dictionary
The tool GraphColl (Brezina et al. 2015) allows collocational networks to be identified within corpora, enabling corpus analysis to go beyond two-way collocation. With the creation of this tool, more complex forms of collocation emerge, encompassing three or more words. This paper aims to illustrate the types of relationships that can appear when more than two words are considered, using graph theory to account for the different types of collocational 'shapes' that can be formed within GraphColl networks. Using the reference corpus, the BE06, examples of different types of graphs were elicited and then analysed in order to form an understanding of the sorts of relationships between words that occur in particular shapes. For example, it was found that for the graph C4, two of the non-collocating words were likely to be related grammatically or semantically, either being forms of the same lemma, coming from the same grammatical or semantic class or being synonyms or antonyms of one another. The analysis indicates the need for concepts from graph theory to be introduced into corpus analysis of collocation as well as showing the potential for a more sophisticated understanding of the company that words keep.
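
The C4 shape described above (a cycle of four words in which each word collocates with its two neighbours but not with the word opposite it) can be located in a collocation network with a few lines of code. The sketch below is not GraphColl and does not use its output format: the function names build_graph and induced_c4 and the example links are invented, and the network is assumed to be given simply as a list of undirected collocate pairs.

```python
from itertools import combinations


def build_graph(edges):
    """Turn a list of undirected collocate pairs into an adjacency dict."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def induced_c4(graph):
    """Find every chordless 4-cycle in the network.

    In a C4 a-b-c-d-a, a and c are not linked, b and d are not linked,
    and both a and c are linked to both b and d. Each cycle is returned
    as its two pairs of non-collocating (opposite) words.
    """
    found = set()
    for a, c in combinations(sorted(graph), 2):
        if c in graph[a]:
            continue  # a and c collocate, so they cannot be opposite corners
        shared = sorted(graph[a] & graph[c])
        for b, d in combinations(shared, 2):
            if d not in graph[b]:
                found.add(frozenset([frozenset([a, c]), frozenset([b, d])]))
    return found


# Invented collocation links, for illustration only.
links = [("hot", "water"), ("cold", "water"),
         ("hot", "weather"), ("cold", "weather")]

for cycle in induced_c4(build_graph(links)):
    print([sorted(pair) for pair in cycle])
```

In this invented network the two opposite pairs are hot/cold (antonyms) and water/weather, which is the kind of relationship between non-collocating words that the abstract reports for C4.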
This study highlights how the auto-complete search algorithm offered by the search tool Google can produce suggested terms which could be viewed as racist, sexist or homophobic. Google was interrogated by entering different combinations of question words and identity terms such as ‘why are blacks…’ in order to elicit auto-completed questions. Two thousand, six hundred and ninety questions were elicited and then categorised according to the qualities they referenced. Certain identity groups were found to attract particular stereotypes or qualities. For example, Muslims and Jewish people were linked to questions about aspects of their appearance or behaviour, while white people were linked to questions about their sexual attitudes. Gay and black identities appeared to attract higher numbers of questions that were negatively stereotyping. The article concludes by questioning the extent to which such algorithms inadvertently help to perpetuate negative stereotypes.
This paper uses methods from corpus linguistics and critical discourse analysis in order to examine patterns of representation around the word Muslim in a 143 million word corpus of British newspaper articles published between 1998 and 2009. Employing the analysis tool Sketch Engine, an analysis of noun collocates of Muslim found that the following categories (in order of frequency) were referenced: ethnic/national identity, characterising/differentiating attributes, conflict, culture, religion, and group/organisations. The ‘conflict’ category was found to be particularly lexically rich, containing many word types. It was also implicitly indexed in the other categories. Following this, an analysis of the two most frequent collocate pairs: Muslim world and Muslim community showed that they were used to collectivise Muslims, both emphasising their sameness to each other and their difference to ‘The West’. Muslims were also represented as easily offended, alienated, and in conflict with non-Muslims. The analysis also considered legitimation strategies which enabled editors to print more controversial representations, and concluded with a discussion of researcher bias and an extended notion of audience via online social networks.
A corpus of abstracts from the Lavender Languages and Linguistics Conference was subjected to a diachronic keywords analysis in order to identify concepts which had either stayed in constant focus or became more or less popular over time. Patterns of change in the abstracts corpus were compared against the Corpus of Contemporary American English (COCA) in order to identify the extent that linguistic practices around language and sexuality were reflected in wider society. The analysis found that conference presenters had gradually begun to frame their analyses around queer theory and were using fewer sexual identity labels which were separating, collectivising and hierarchical in favour of more equalising and differentiating terminology. A number of differences between conference-goers' language use and the language of general American English were identified and the paper ends with a critical discussion of the method used and the potential consequences of some of the findings.
This paper considers the proposal that corpus linguistics approaches can improve the objectivity of critical discourse analysis research, resulting in a more robust and valid set of findings. Taking a recent project which examined the representation of Islam and Muslims in the British press, corpus-driven procedures identified that Muslims tended to be linked to the concept of extreme belief much more than moderate or strong belief. There were differences across newspapers, with 1 in 8 references to Muslims in The People linking them to extreme belief, compared with 1 in 35 in The Guardian. Such patterns of quantification, however, still require researchers to carry out their own critical interpretations with regard to what counts as acceptable frequencies.
This paper explores the viability of automated semantic tagging as a tool of cultural analysis comparing American and British English using the Brown family of corpora. Pairs of corpora representing written language production from circa 1961, 1991 and 2006 were contrasted by comparing key semantic tags. This method was then evaluated in relation to three earlier studies which attempted to uncover cultural differences via assigning keywords to ad hoc categories. After outlining the differences found, we conclude that computerised semantic tagging can offer a wider reaching and more scientific comparison of language patterns. However, we suggest that this method is most appropriate as a starting point for a more in-depth cultural analysis, rather than as a final or certain indication of cultural change.
The frequencies of words in four equal-sized reference corpora of written British English from 1931, 1961, 1991, and 2006 were compared to investigate patterns of vocabulary change and stability over time. The study addresses central methodological questions surrounding diachronic change across multiple corpora, considering a number of methods to distinguish variance over time. Having identified an appropriate measure of variance, the study categorizes words as showing large increases, showing large decreases, or remaining stable. After grouping words into grammatical categories, several hypotheses about language change (and stability) are advanced. Concordance and collocational analyses explore these hypotheses and consider context of usage. The study reports on a number of trends relating to language (specifically British English) and cultural change, including a tendency for language to become less verbose and a move toward more informal and personal ways of writing.
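
The abstract above does not name the measure of variance that was chosen, so the following sketch should be read as a stand-in rather than the study's method. It uses the coefficient of variation (standard deviation divided by the mean) to sort words into stable, increasing and decreasing groups across four sampling points. The function names, the 0.25 threshold and the frequencies are all invented for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(freqs):
    """Standard deviation of the frequencies relative to their mean."""
    avg = mean(freqs)
    return pstdev(freqs) / avg if avg else 0.0


def classify(freqs, threshold=0.25):
    """Label a word's frequency profile across equal-sized corpora.

    `freqs` is ordered chronologically. The threshold is a hypothetical
    cut-off, not a value taken from the study.
    """
    if coefficient_of_variation(freqs) < threshold:
        return "stable"
    return "increase" if freqs[-1] > freqs[0] else "decrease"


# Invented frequencies at four sampling points (earliest first).
profiles = {
    "word_a": [100, 102, 98, 100],
    "word_b": [10, 20, 40, 80],
    "word_c": [80, 40, 20, 10],
}

for word, freqs in profiles.items():
    print(word, round(coefficient_of_variation(freqs), 3), classify(freqs))
```

Because the corpora are described as equal-sized, raw frequencies can be compared directly; with corpora of different sizes the counts would first need to be normalised, for example to frequencies per million words.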
This article discusses the extent to which critical discourse analysts can make effective use of the methods normally employed in corpus linguistics. Our research is based on the analysis of a 140-million-word corpus of British press articles about refugees, asylum seekers, immigrants and migrants (RASIM). We explain how collocation and concordance analysis made it possible to identify common categories of representation of RASIM, and how it can direct analysts towards representative texts in order to carry out a qualitative analysis. The article proposes a framework for adopting corpus linguistics approaches in critical discourse analysis.
This paper focuses upon two issues. Firstly, the question of identifying diachronic trends, and more importantly significant outliers, in corpora which permit an investigation of a feature at many sampling points over time. Secondly, we consider how best to combine more qualitatively oriented approaches to corpus data with the type of trends that can be observed in a corpus using quantitative techniques. The work uses a recently completed ESRC-funded project on the representation of Islam in the UK press as a case study, in order to demonstrate the potential of the approach taken to establishing significant peaks in diachronic frequency development, and the fruitful interface that may be created between qualitative and quantitative techniques.
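
The abstract does not describe how significant peaks were established, so the sketch below substitutes a simple z-score rule purely to illustrate the general idea of flagging outliers among many sampling points. The function name frequency_peaks, the 1.5 threshold, the period labels and the counts are all invented.

```python
from statistics import mean, stdev


def frequency_peaks(series, z_threshold=1.5):
    """Return the sampling points whose frequency is unusually high.

    `series` maps a period label to a frequency. A point is flagged when
    it lies more than `z_threshold` sample standard deviations above the
    mean of the whole series.
    """
    values = list(series.values())
    avg, sd = mean(values), stdev(values)
    if sd == 0:
        return []  # a flat series has no peaks
    return [label for label, value in series.items()
            if (value - avg) / sd > z_threshold]


# Invented frequencies of a search term at six sampling points.
counts = {"t1": 100, "t2": 110, "t3": 400, "t4": 120, "t5": 105, "t6": 95}

print(frequency_peaks(counts))
```

A flagged period is a pointer rather than a finding: the texts from that period still need to be read to see what the peak reflects, which is the interface between quantitative and qualitative work that the paper discusses.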
In order to investigate frequency and context of usage of gender marked language, four equal sized and equivalently sampled corpora of British English in a range of written genres (press, fiction, general prose, learned writing), from 1931, 1961, 1991 and 2006 were compared. Terms that were investigated included male and female pronouns, man, woman, boy and girl, gender-related profession and role nouns such as chairman, spokesperson and policewoman, and terms of address such as Mr and Ms. Some reductions in frequencies of male terms were found over time, particularly in terms of decreases of male pronouns and Mr. However, equal frequencies did not necessarily equate with equal representation. A qualitative analysis of man and woman found that while there had been some reductions in gender stereotypes, others were being maintained (such as a lack of adjectives like successful or powerful being applied to words like woman). Additionally, the term girl was still more likely than the term boy to refer to adults, and it was often used in a disparaging or sexual way. The article concludes with a discussion of the sort of linguistic strategies that appear to have been successful in terms of equalising gender representation.
This paper describes the analysis of an 87 million word corpus of British newspaper articles which refer to the subject of Islam. In order to examine representations of Islam and Muslims, the corpus was subjected to a comparative analysis, by analysing the lexis that was used most significantly in the tabloid articles, when compared to the broadsheets, and vice versa. Concordances were then analysed in order to investigate the data in a more qualitative way. It was found that the tabloids tended to focus more on British interests, writing about Muslims in a highly emotional style, in connection with terrorist attacks and religious extremism, focussing on a small number of high-profile Muslim “villains”. On the other hand, the broadsheets had a more restrained reporting stance, writing about Muslims in a wider range of contexts, although their focus on world news resulted in them covering more stories about Muslims engaged in wars. The paper raises issues regarding the meaning of bias, and the process by which readers internalise lexical associations and the extent to which such associations impact on attitudes.
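As a rough illustration of the kind of comparison described above, the sketch below ranks words that are relatively more frequent in one sub-corpus than in another. The abstract does not name the statistic used; log-likelihood is assumed here because it is a common keyness measure in corpus linguistics, and the function names (`log_likelihood`, `keywords`) are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Log-likelihood (G2) for one word observed in two corpora.

    ASSUMPTION: the source does not say which keyness statistic was used;
    this is the two-cell form often applied in keyword analysis.
    """
    combined = freq_a + freq_b
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    score = 0.0
    if freq_a > 0:
        score += freq_a * math.log(freq_a / expected_a)
    if freq_b > 0:
        score += freq_b * math.log(freq_b / expected_b)
    return 2 * score


def keywords(tokens_a, tokens_b, top_n=10):
    """Rank words that are relatively more frequent in corpus A than in B."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    total_a, total_b = len(tokens_a), len(tokens_b)
    scored = []
    for word, freq_a in counts_a.items():
        freq_b = counts_b.get(word, 0)
        # Keep only positive keywords, i.e. those over-represented in A.
        if freq_a / total_a > freq_b / total_b:
            score = log_likelihood(freq_a, total_a, freq_b, total_b)
            scored.append((word, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `keywords(tabloid_tokens, broadsheet_tokens)` and then swapping the arguments would give the two directions of comparison ("and vice versa"); both token lists are placeholders for data the reader supplies.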
This paper describes the BE06 Corpus, a one million word reference corpus of general written British English that was designed to be comparable to the Brown family of corpora. After providing a description of the Brown sampling frame, and giving the rationale for building a new corpus, the process of building the BE06 is elaborated upon, with reference to collecting previously published texts from internet sources, defining "British" authors and enabling accessibility of the corpus. Three studies of lexical frequency using BE06 and comparable corpora (LOB, FLOB and BLOB) are then carried out. These involve a comparison of the 20 most frequent lexical items, an examination of pronoun usage, and an investigation of keywords derived from comparing the 1991 FLOB corpus with the BE06. The paper ends with a critical evaluation of the worth of using the same sampling frame for linguistic studies of diachronic variation.
[Download full paper here: http://eng.sagepub.com/content/36/1/5.full.pdf]

This paper examines the discursive construction of refugees and asylum seekers (and to a lesser extent immigrants and migrants) in a 140-million-word corpus of UK press articles published between 1996 and 2005. Taking a corpus-based approach, the data were analyzed not only as a whole, but also with regard to synchronic variation, by carrying out concordance analyses of keywords which occurred within tabloid and broadsheet newspapers, and diachronic change, albeit mainly approached from an unusual angle, by investigating consistent collocates and frequencies of specific terms over time. The analyses point to a number of (mainly negative) categories of representation, the existence and development of nonsensical terms (e.g., illegal refugee), and media confusion and conflation of definitions of the four terms under examination. The paper concludes by critically discussing the extent to which a corpus-based methodological stance can inform critical discourse analysis.
This article discusses the extent to which methods normally associated with corpus linguistics can be effectively used by critical discourse analysts. Our research is based on the analysis of a 140-million-word corpus of British news articles about refugees, asylum seekers, immigrants and migrants (collectively RASIM). We discuss how processes such as collocation and concordance analysis were able to identify common categories of representation of RASIM as well as directing analysts to representative texts in order to carry out qualitative analysis. The article suggests a framework for adopting corpus approaches in critical discourse analysis.
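For readers unfamiliar with the two processes named above, the following is a minimal sketch of a keyword-in-context (KWIC) concordance and a simple window-based collocate count. It is not the project's tooling: the function names, the window size and the whitespace tokenisation are all assumptions made for illustration.

```python
from collections import Counter


def concordance(tokens, node, window=4):
    """Return (left context, node, right context) for each hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, window=4):
    """Count words occurring within `window` tokens either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            span = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(word.lower() for word in span)
    return counts


# Toy sentence invented for illustration; it is not taken from the corpus.
sample = "a flood of refugees arrived and more refugees fled".split()
for left, node, right in concordance(sample, "refugees", window=2):
    print(f"{left:>12} [{node}] {right}")
```

Raw co-occurrence counts like these are only a starting point; studies of this kind usually rank collocates with an association measure and then read the concordance lines qualitatively.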
In this paper we consider how corpora may be of use in the teaching of grammar at the pre-tertiary level. Corpora are becoming well established in teaching in universities. Corpora also have a role to play in secondary education, in that they can help decide how ...
Two approaches to teaching grammar were compared with respect to accuracy of participant response over time. Of seventeen first year English Language undergraduates who participated in the seven week experiment, nine were taught grammar via the traditional classroom‐based “human teacher” method, while the remainder used CyberTutor, a corpus based computer aided linguistic learning program. This program allowed subjects to annotate sentences whilst providing instant feedback and help facilities. The computer aided group out‐performed their human‐taught counterparts in terms of accuracy and number of words analysed. At the end of the experiment, mean accuracy was 89.34% for the computer aided group, whereas it was only 13.64% for the human‐taught group. The overall finding was that in terms of teaching the parts‐of‐speech at least, corpus‐based CALL programs may be more effective than traditional classroom interaction.
This paper uses corpus linguistics and qualitative, manual analysis to compare corpora of English and French Islamist extremist texts. Drawing on 679,743 words in English and 191,344 words in French, we use AntConc (Version 3.4.4) (Anthony 2017) to examine the extent to which extremist messages in each language draw upon similar and distinct themes and linguistic strategies. Given that the data exist in different languages, a direct comparison of the datasets is not possible. Therefore, this paper discusses our innovative methodology: a cross-linguistic corpus-assisted comparison of thematic categories.
We compiled English and French keyword lists using the BE06 (Baker 2009) as an English reference corpus and 1,462,398 words of La Presse newspaper articles as a French reference corpus. The 500 top-ranked English and French keywords were examined using collocate, cluster and concordance analysis. Next, each keyword list was examined for themes that emerged from the use of words in context. Themes that emerged in one language were compared against the themes that emerged in the other language. Then, the English and French keyword lists were compared in order to establish which keywords were common to both corpora and which were unique to the English and French data. A further step involved mapping the equivalence (or non-equivalence) of keywords across languages (i.e. the cross-linguistic similarities and differences) onto the thematic categorisations previously established. This required revisiting the themes established in the previous step and also, in some cases, revisiting the qualitative analysis and revising some of the categories. The ultimate outcome of this procedure was a matrix that cross-listed the emergent thematic categories against the keywords that were either common to both languages or unique to one set of language data. As a final step, we assessed the similarities and differences across languages by calculating the sum of relative frequencies of all words in each category and the differences between these relative frequencies in English and French. Where relative frequencies were both high and similar in English and French, we consider these to be shared themes. Where relative frequencies were markedly higher in one language category than the other (using the %diff calculation, Gabrielatos and Marchi 2012), we consider these themes more salient to that language.
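The final step of the procedure can be sketched roughly as follows. The corpus sizes are those given in the abstract; the category word counts, the per-million normalisation base and all function names are hypothetical and serve only to show the arithmetic. The %DIFF formula is the percentage difference between two normalised frequencies, here applied to summed category frequencies in the two languages.

```python
ENGLISH_SIZE = 679_743   # corpus size stated in the abstract
FRENCH_SIZE = 191_344    # corpus size stated in the abstract


def relative_frequency(count, corpus_size, per=1_000_000):
    """Normalise a raw count to a frequency per `per` words (assumed base)."""
    return count * per / corpus_size


def category_relative_frequency(word_counts, corpus_size):
    """Sum the relative frequencies of all keywords in one thematic category."""
    return sum(relative_frequency(c, corpus_size) for c in word_counts.values())


def percent_diff(rel_freq_a, rel_freq_b):
    """%DIFF: how much higher (or lower) frequency A is relative to B."""
    if rel_freq_b == 0:
        # Design choice for this sketch: refuse to divide by zero rather
        # than substitute a small constant.
        raise ValueError("comparison frequency must be non-zero")
    return (rel_freq_a - rel_freq_b) * 100 / rel_freq_b


# HYPOTHETICAL raw counts for one thematic category in each language.
english_counts = {"word_a": 400, "word_b": 250}
french_counts = {"mot_a": 60, "mot_b": 45}

english_rf = category_relative_frequency(english_counts, ENGLISH_SIZE)
french_rf = category_relative_frequency(french_counts, FRENCH_SIZE)
print(round(percent_diff(english_rf, french_rf), 1))
```

A value near zero would suggest a theme shared across the two languages, while a large positive or negative value would suggest greater salience in one of them; where to draw that line is left to the analyst.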
Our findings revealed numerous similarities as well as differences. Both corpora focus on religion and rewards (i.e. for faith) and strongly rely on othering strategies. However, the English texts appear to be more concerned with world events pertaining to Islam and the French texts focus on issues specific to France. Also, while the English texts draw on code-switching to Arabic as a form of legitimation, the French texts use a formal register and draw heavily on quotation from scripture in order to discuss permissions, rights, obligations and laws. Finally, the English texts refer to and justify violence to a greater extent than the French texts.
We argue that a comparison of thematic categories across languages is a useful way to identify how similar meanings can be expressed differently in different languages. However, we argue that the subtle nuances inherent to meaning-making require the manual creation of categories rather than automatic categorisation or the use of pre-existing schemes. Although the lack of similar and comparable corpus resources in both languages raises some limitations, we call for more work to be undertaken on French language corpus creation and analysis and also for more experimental studies across languages. Finally, we contend that our approach provides a new and innovative way to undertake cross-linguistic corpus-assisted discourse studies.
References
Anthony, L. (2017). AntConc (Version 3.4.4) [Computer Software]. Tokyo, Japan: Waseda University. Available from http://www.laurenceanthony.net/
Baker, P. (2009). The BE06 Corpus of British English and recent language change. International Journal of Corpus Linguistics, 14(3), 312-337.
Gabrielatos, C. and Marchi, A. (2012). Keyness: Appropriate metrics and practical issues. Paper given at CADS 2012, University of Bologna, 14 September 2012.
This paper presents results of research, from a corpus-assisted critical discourse studies perspective, into the representation of Romanian immigrants in online news articles and in readers’ comments on those articles. The research focuses on a specialised corpus constructed from the articles and comments of pages containing the term ‘romanian’ published by the Daily Express online between 24th July 2006 and 23rd June 2016 (the date of the Brexit vote) and presents a method combining keyword and word cluster analyses to examine the relationships between online articles and their comments at the level of the text as well as discourse practice.
This title acts as a one-volume resource, providing an introduction to every aspect of corpus linguistics as it is being used at the moment. Corpus linguistics uses large electronic databases of language to examine hypotheses about language use. These can be tested scientifically with computerised analytical tools, without the researcher's preconceptions influencing their conclusions. For this reason, corpus linguistics is a popular and expanding area of study. "Contemporary Corpus Linguistics" presents a comprehensive survey of the ways in which corpus linguistics is being used by researchers. Written by internationally renowned linguists, this volume of seventeen introductory chapters aims to provide a snapshot of the field. The contributors present accessible, yet detailed, analyses of recent methods and theory in corpus linguistics, ways of analysing corpora, and recent applications in translation, stylistics, discourse analysis and language teaching. The book represents the best of current practice in corpus linguistics, and as a one volume reference will be invaluable to students and researchers looking for an overview of the field.
A corpus-based analysis of discourses of refugees and asylum seekers was carried out on data taken from a range of British newspapers and texts from the Office of the United Nations High Commissioner for Refugees website, both published in 2003. Concordances of the terms refugee(s) and asylum seeker(s) were examined and grouped along patterns which revealed linguistic traces of discourses. Discourses which framed refugees as packages, invaders, pests or water were found in newspaper texts, although there were also cases of negative discourses found in the UNHCR texts, revealing how difficult it is to disregard dominant discourses. Lexical choice was found to be an essential aspect of maintaining discourses of asylum seekers — collocational analyses of terms like failed vs. rejected revealed the underlying attitudes of the writers towards the subject.
The EMILLE Project (Enabling Minority Language Engineering) was established to construct a 67 million word corpus of South Asian languages. In addition, the project has had to address a number of issues related to establishing a language engineering (LE) environment for ...
State of the EMILLE corpora and outline the motives behind the various refinements that have been made to EMILLE's goals. 2.1 Monolingual written corpora The first major challenge facing any corpus builder is the identification of suitable sources of corpus data. Design criteria ...
This paper first discusses standards for developing Asian language corpora so as to facilitate international data exchange. Following this, we present two corpora of Asian languages developed at Lancaster University: the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we will demonstrate how to explore these corpora using Xara and other corpus tools.
❑ Both topoi can be supported and reinforced by the use of 'quantity' or 'group' collocations, embodying 'water' or 'war/crime' metaphors (e.g. flood/river/tide/wave of refugees; army/hordes/gangs of refugees), which give rise to negative semantic/discourse prosodies related to their inordinate number, and, therefore, threat.
This paper reports work on an ongoing project on the representation of refugees and asylum seekers in the UK press. In recent years, the issue of refugees and asylum seekers entering the UK has attracted intense media and political discussion. As the representation of these groups in the press can influence the way in which readers perceive them, the discourses surrounding these, and related, groups have been the focus of linguistic studies (e.g. Greenslade, 2005; ter Wal, 2002). Although the project combines approaches within ...
Using corpus linguistics and qualitative, manual discourse analysis, this paper compares English and French extremist texts to determine how messages in different languages draw upon similar and distinct discursive themes and linguistic strategies. Findings show that both corpora focus on religion and rewards (i.e. for faith) and strongly rely on othering strategies. However, the English texts are concerned with world events whereas the French texts focus on issues specific to France. Also, while the English texts use Arabic code-switching as a form of legitimation, the French texts use a formal register and quotation from scripture in discussions of permissions, rights, obligations and laws. Finally, the English texts refer to and justify violence to a greater extent than the French texts. This paper contributes to the field of terrorism studies and the field of corpus linguistics by presenting a new approach to corpus-driven studies of discourse across more than one language.