Paul Baker
Lancaster University, Linguistics and English Language, Faculty Member
I am Professor of English Language at Lancaster University. My research interests include corpus linguistics, language and identities and (critical) discourse analysis. Books include: Using Corpora to Analyse Gender (2014), Discourse Analysis and Media Attitudes (2013), Corpus Linguistics and Sociolinguistics (2010), Sexed Texts: Language, Gender and Sexuality (2008), Using Corpora in Discourse Analysis (2006), Public Discourses of Gay Men (2005) and Polari: The Lost Language of Gay Men (2002). I am the commissioning editor for the journal Corpora.
Is the British press prejudiced against Muslims? In what ways can prejudice be explicit or subtle? This book uses a detailed analysis of over 140 million words of newspaper articles on Muslims and Islam, combining corpus linguistics and discourse analysis methods to produce an objective picture of media attitudes. The authors analyse representations around frequently cited topics such as Muslim women who wear the veil and 'hate preachers'. The analysis is self-reflexive and multidisciplinary, incorporating research on journalistic practices, readership patterns and attitude surveys to answer questions which include: what do journalists mean when they use phrases like 'devout Muslim' and how did the 9/11 and 7/7 attacks affect press reporting? This is a stimulating and unique book for those working in fields of discourse analysis and corpus linguistics, while clear explanations of linguistic terminology make it valuable to those in the fields of politics, media studies, journalism and Islamic studies.
This title offers a set of definitions of key terms in discourse analysis, a core area of all linguistics and language studies courses. Unlike many other areas of linguistics, discourse analysis is a complex field to define, comprising a number of related but different theoretical and methodological frameworks. Discourse can mean many different things to different people. Students often find these multiple meanings confusing, and this book attempts to spell out and reconcile the different approaches, to give a holistic picture of discourse analysis as a branch of several disciplines. As well as comprising a glossary of key terms, the book provides examples, a section on key thinkers and their ideas, and key texts for further reading.
Sociolinguistics and Corpus Linguistics is the first book to focus on the ways that corpus linguistics approaches can be used to aid sociolinguistic research. Corpus linguistics and sociolinguistics have a great deal in common in terms of their basic approaches to language enquiry, particularly in providing representative samples from a population and analysing quantitative information in order to study variation or differences between populations. The book covers a range of topics within sociolinguistics: analysing demographic variation, comparing language use across different cultures, examining language change over time, studying transcripts of spoken interactions and identifying attitudes or discourses. The book references many key and recent studies in the field as well as featuring original analyses of a number of corpora including the British National Corpus, the corpus of Spoken English Dialects and the Brown family of corpora. In addition, a new corpus of written British English from around 2006 was collected for the purposes of writing the book. Techniques of analysis like concordancing, keywords and collocations are discussed, along with corpus annotation and statistical procedures such as chi-squared tests and clustering. The book takes a critical approach to using corpora in sociolinguistics, attempting to outline the limitations of the approach as well as its advantages.
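As a rough illustration of the kind of chi-squared test mentioned above, the following sketch compares how often a word occurs in two corpora. The function name, the 2x2 layout and the example counts are assumptions made for illustration; they are not taken from the book.

```python
def chi_squared_2x2(freq_a, size_a, freq_b, size_b):
    """Pearson chi-squared statistic (no continuity correction) for one
    word's frequency in two corpora, arranged as a 2x2 table:
    rows = corpus A / corpus B, columns = the word / all other words."""
    observed = [
        [freq_a, size_a - freq_a],
        [freq_b, size_b - freq_b],
    ]
    total = size_a + size_b
    row_totals = [size_a, size_b]
    col_totals = [freq_a + freq_b, total - freq_a - freq_b]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the word were equally likely in both corpora.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Hypothetical counts: 30 hits in 100 tokens versus 10 hits in 100 tokens.
print(chi_squared_2x2(30, 100, 10, 100))
```

With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05. The sketch assumes the word occurs at least once across the two corpora and does not account for every token; otherwise an expected count would be zero.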
Chapter 1 Introduction
Chapter 2 Sociolinguistic variation
Chapter 3 Diachronic variation
Chapter 4 Synchronic variation
Chapter 5 Corpora and interpersonal communication
Chapter 6 Uncovering discourses
Chapter 7 Conclusion
Sexed Texts is aimed at undergraduate students and beginning post-graduate students, presenting a coherent overview of a wide range of theoretical and analytical perspectives in the diverse and rapidly evolving field of language, gender and sexuality.
The book aims to show how people use language to construct themselves (and others) as male, female, gay, heterosexual etc. while prioritising some identities as normal or preferable, some as deviant or subordinate and others as simply non-existent. The book uses a range of real-life, everyday language texts which reference gender and sexuality, including newspaper and magazine articles, religious texts, children's fiction, nursery rhymes, romantic fiction, pornography, ordinary conversations, chat room data and advertisements as well as relying on interview, focus group and corpus data.
The book considers questions such as "is there such a thing as a gay voice?", "do women have to 'talk like men' to succeed at work?", "why are bisexuals one in a million in language use?", "how have advertisers co-opted feminism?", "when is it OK to be a bachelor?", "has 'political correctness' had an impact on the way we refer to women?" and "what, exactly, is a dogger?"
Written in a clear way, Sexed Texts uses a combination of classic studies and new analyses in order to trace the development of the field, from early research which aimed to outline ways that men and women used language differently to each other, to studies which focussed on deconstructing the ways that language helps to create gendered and sexed discourses (or ways of understanding the world). The book critically considers feminist, queer and post-structuralist theories in order to show how identities are fluid, unstable and often linked to power hierarchies. However, it is argued that all of us hold multiple identities and experience moments of powerfulness and powerlessness, which must be constantly negotiated via language in ways that can be subtle or contradictory. The book therefore considers some of the most recent theoretical perspectives in the field and should be of value to any student or teacher of language, gender and sexuality.
Using Corpora in Discourse Analysis examines approaches to carrying out discourse analysis (DA) using techniques that are grounded in corpus linguistics. In the past much research on critical discourse analysis has focussed on analyses of single texts or small collections of texts. However, researchers working in CDA are beginning to acknowledge the potential of using corpora either to supplement their findings or as a valid methodology in itself. A corpus-based approach helps to provide quantitative evidence of the existence of discourses by enabling researchers to identify repetitive linguistic patterns of language use and to uncover hidden meanings in lexical items e.g. by examining collocations. Corpus linguistics also allows researchers to uncover linguistic evidence for prevailing/majority and resistant/minority discourses as a large corpus is likely to show a range of ideological positions - something which an analysis of a single text may be less likely to reveal.
Using Corpora in Discourse Analysis does not assume prior knowledge of corpus linguistics. The book examines and evaluates a variety of corpus-based methodologies including collocations, keyness, concordances and dispersion plots, using a range of examples from different types of corpora. It also considers issues of building and annotating corpora as well as the validity of approaching CDA from a combination of qualitative and quantitative perspectives. The book is illustrated with a number of real-life examples of corpus-based CDA from a range of sources, covering a variety of subjects including:
•Holiday brochures
•Parliamentary debates about banning foxhunting
•Newspaper reports about refugees
•Representations of the words bachelor and spinster in general corpora
Chapter 1 Introduction
Chapter 2 Corpus Building
Chapter 3 Frequency and Dispersion
Chapter 4 Concordances
Chapter 5 Collocates
Chapter 6 Keyness
Chapter 7 Beyond Collocation
Chapter 8 Conclusion
Although sexual and romantic same-sex relationships between humans have existed for millennia, the ways that such relationships and the people who engage in them have been celebrated, normalised, accepted, ignored, problematised or persecuted have been subject to considerable variation over time and across different societies. Particularly over the last fifty years there has been an inordinate amount of controversy and negotiation concerning the ways that gay men have been talked and written about. Public Discourses of Gay Men explores the variety of ways that gay men are constructed in public settings in order to make sense of the current set of discourses or 'ways of seeing the world' that surround this group.
Taking a corpus-based analysis approach to examine millions of words of data from a range of contemporary sources, the book investigates how conflicting discourses have clashed together, resulting in a definition of homosexuality that is often ambivalent, confusing or contradictory.
The corpus-based approach allows for the identification of repeated patterns of language, showing the cumulative effect these have on discourse in everyday life. The following techniques are used to demonstrate these patterns:
•Collocational analyses - what sort of words tend to regularly appear next to or near words like "gay" and "homosexual" and how does this relate to different contexts?
•Discourse prosodies - how are gay people regularly constructed in language use? What are the most common patterns - which patterns are less frequent or resistant?
•Keywords and frequencies - what words, semantic concepts or grammatical categories tend to occur more frequently than expected by chance alone in public texts about gay men? What can this tell us about the ways that discourses of gay identity are currently constructed?
•Dispersion - how are terms like "gay" dispersed throughout particular texts and how do dispersion patterns relate to discourses of homosexuality?
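To make the first technique in the list concrete, here is a minimal sketch of a collocational analysis: counting the words that appear within a fixed window either side of a node word. The function name, the window size and the toy sentence are assumptions for illustration only, and a real study would normally rank the resulting collocates with an association measure rather than raw counts.

```python
from collections import Counter


def collocates(tokens, node, span=3):
    """Count the words found within `span` tokens either side of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left = tokens[max(0, i - span):i]     # up to `span` tokens before
        right = tokens[i + 1:i + 1 + span]    # up to `span` tokens after
        counts.update(left + right)
    return counts


# Toy input, not corpus data.
tokens = "the cat sat on the mat near the cat flap".split()
print(collocates(tokens, "cat", span=1).most_common())
```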
From conceptualisations of homosexuality as 'unnatural behaviour' in the House of Lords to discourses of shame and outrageousness in tabloid newspapers, it is still the case that homophobia underpins contemporary understandings of homosexuality. However, homophobia is only part of the story - personal adverts and erotic stories show us how desire is constructed for gay men as intensely masculine and ostensibly heterosexual. Additionally, sitcoms like Will & Grace reveal a definition of homosexuality that is weighted in aspirational class-consciousness and camp humour. The full range of discourses is demonstrated in the final analysis chapter which examines safe sex documentation.
Chapter 1: What Can I do with a Naked Corpus?
Chapter 2: Unnatural Acts: the House of Lords debates on gay male law reform.
Chapter 3: Flamboyant, predatory, self-confessed homosexual: discourse prosodies in the British tabloid press.
Chapter 4: "True Man" and "McFairyland": gay identities in an American sitcom.
Chapter 5: "No effeminates please": discourses of gay men's personal adverts.
Chapter 6: As big as a beercan: a comparative keyword analysis of lesbian and gay male erotic narratives.
Chapter 7: Making safer sex sexy: border crossing, informalisation and gay identity in sexual health documentation.
Chapter 8: Conclusion.
This book is about a little-known part of gay history and maritime history. Long before cities like Manchester and London had "gay villages", British gay men formed their own gay village at sea, taking advantage of the relaxed holiday atmosphere of luxurious cruise ships, where they worked as waiters and stewards, sometimes even outnumbering the straight men in the catering departments of ships that were household names and the pride of the British fleet.
In the largely homophobic atmosphere of the 1950s, most gay men had to be closeted, and ships were the only public places where they could not only be safely out but also camp. It was not unheard of for straight crewmembers to protect their queer colleagues. “He may be queer, but he’s our queer,” one sailor once said. Hello Sailor! uniquely shows what it was like to be queer at sea at a time when land meant straightness.
Polari has been the secret language of gay men and women through the twentieth century. But more than a language, Polari is an attitude. From the prisons and music halls of Edwardian England to Kenneth Williams, American GIs in London and the Sisters of Perpetual Indulgence, Polari has been used to laugh, bitch, gossip and cruise. As with all slang, its users coined an ever-changing vocabulary. Derived from words used by criminals, circus artists, beggars and prostitutes, it also employed Italian, Yiddish, French, rhyming slang and backslang. Polari speakers camped up a storm, from West End chorus boys and office workers to East End sea-queens.
Since gay liberation, lesbian and gay slang has become less a language of concealment than a language of specialization, though the tradition of camp remains. A carefully researched and entertaining read, The Dictionary of Polari and Gay Slang presents a lexicon of Polari and a more general dictionary of lesbian and gay slang. If you don't yet know what vada the bona cartes on the ommee ajax, parkering ninty, a Mexican nightmare or a nellyectomy mean, then this is the book for you.
Polari is a secret form of language mainly used by homosexual men in London and other cities during the twentieth century. Derived in part from the slang lexicons of numerous stigmatised and itinerant groups, Polari was also a means of socialising, acting out camp performances and reconstructing a shared gay identity and worldview among its speakers. This book examines the ways in which Polari was used in order to construct 'gay identities', linking its evolution to the changing status of gay men and lesbians in the UK over the past fifty years.
Chapter 1 What is Polari?
Chapter 2 Historical Origins
Chapter 3 Polari as a Language System
Chapter 4 Uses and Abuses
Chapter 5 Julian and Sandy
Chapter 6 Decline
Chapter 7 Revival
Chapter 8 Conclusion
Appendix: Polari dictionary
The tool GraphColl (Brezina et al. 2015) allows collocational networks to be identified within corpora, enabling corpus analysis to go beyond two-way collocation. With the creation of this tool, more complex forms of collocation emerge, encompassing three or more words. This paper aims to illustrate the types of relationships that can appear when more than two words are considered, using graph theory to account for the different types of collocational 'shapes' that can be formed within GraphColl networks. Using the BE06 reference corpus, examples of different types of graphs were elicited and then analysed in order to form an understanding of the sorts of relationships between words that occur in particular shapes. For example, it was found that for the graph C4, two of the non-collocating words were likely to be related grammatically or semantically, either being forms of the same lemma, coming from the same grammatical or semantic class or being synonyms or antonyms of one another. The analysis indicates the need for concepts from graph theory to be introduced into corpus analysis of collocation as well as showing the potential for a more sophisticated understanding of the company that words keep.
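The C4 shape can be pictured as four words joined in a ring, where each word collocates with its two neighbours but not with the word on the opposite corner. The sketch below searches an undirected collocation network for such chordless four-cycles and returns the two non-collocating pairs for each. It does not use GraphColl or reproduce the paper's procedure; the function name, the edge-list input and the placeholder node labels are all assumptions for illustration.

```python
from itertools import combinations


def induced_c4(edges):
    """Find chordless 4-cycles (C4) in an undirected graph.

    Each result is a frozenset holding the two opposite-corner pairs,
    i.e. the pairs of words that do NOT collocate with each other.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    found = set()
    for a, c in combinations(sorted(adj), 2):
        if c in adj[a]:
            continue  # opposite corners must not be linked
        shared = sorted(adj[a] & adj[c])  # words linked to both a and c
        for b, d in combinations(shared, 2):
            if d not in adj[b]:  # the other diagonal must also be absent
                found.add(frozenset([frozenset([a, c]), frozenset([b, d])]))
    return found


# Placeholder network: w1-w2-w3-w4-w1 forms a ring with no diagonals.
ring = [("w1", "w2"), ("w2", "w3"), ("w3", "w4"), ("w4", "w1")]
for shape in induced_c4(ring):
    print([sorted(pair) for pair in shape])
```

Adding a diagonal such as ("w1", "w3") to the ring removes the match, since the shape is then no longer a plain C4.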
This study highlights how the auto-complete search algorithm offered by the search tool Google can produce suggested terms which could be viewed as racist, sexist or homophobic. Google was interrogated by entering different combinations of question words and identity terms such as ‘why are blacks…’ in order to elicit auto-completed questions. Two thousand, six hundred and ninety questions were elicited and then categorised according to the qualities they referenced. Certain identity groups were found to attract particular stereotypes or qualities. For example, Muslims and Jewish people were linked to questions about aspects of their appearance or behaviour, while white people were linked to questions about their sexual attitudes. Gay and black identities appeared to attract higher numbers of questions that were negatively stereotyping. The article concludes by questioning the extent to which such algorithms inadvertently help to perpetuate negative stereotypes.
A corpus of abstracts from the Lavender Languages and Linguistics Conference was subjected to a diachronic keywords analysis in order to identify concepts which had either stayed in constant focus or become more or less popular over time. Patterns of change in the abstracts corpus were compared against the Corpus of Contemporary American English (COCA) in order to identify the extent to which linguistic practices around language and sexuality were reflected in wider society. The analysis found that conference presenters had gradually begun to frame their analyses around queer theory and were using fewer sexual identity labels which were separating, collectivising and hierarchical in favour of more equalising and differentiating terminology. A number of differences between conference-goers' language use and the language of general American English were identified and the paper ends with a critical discussion of the method used and the potential consequences of some of the findings.
This paper considers the proposal that corpus linguistics approaches can improve the objectivity of critical discourse analysis research, resulting in a more robust and valid set of findings. Taking a recent project which examined the representation of Islam and Muslims in the British press, corpus-driven procedures identified that Muslims tended to be linked to the concept of extreme belief much more than moderate or strong belief. There were differences across newspapers, with 1 in 8 references to Muslims in The People linking them to extreme belief, while this figure was 1 in 35 for The Guardian. Such patterns of quantification, however, still require researchers to carry out their own critical interpretations with regard to what counts as acceptable frequencies.
This paper explores the viability of automated semantic tagging as a tool of cultural analysis comparing American and British English using the Brown family of corpora. Pairs of corpora representing written language production from circa 1961, 1991 and 2006 were contrasted by comparing key semantic tags. This method was then evaluated in relation to three earlier studies which attempted to uncover cultural differences via assigning keywords to ad hoc categories. After outlining the differences found, we conclude that computerised semantic tagging can offer a wider reaching and more scientific comparison of language patterns. However, we suggest that this method is most appropriate as a starting point for a more in-depth cultural analysis, rather than as a final or certain indication of cultural change.
The frequencies of words in four equal-sized reference corpora of written British English from 1931, 1961, 1991, and 2006 were compared to investigate patterns of vocabulary change and stability over time. The study addresses central methodological questions surrounding diachronic change across multiple corpora, considering a number of methods to distinguish variance over time. Having identified an appropriate measure of variance, the study categorizes words as showing large increases, showing large decreases, or remaining stable. After grouping words into grammatical categories, several hypotheses about language change (and stability) are advanced. Concordance and collocational analyses explore these hypotheses and consider context of usage. The study reports on a number of trends relating to language (specifically British English) and cultural change, including a tendency for language to become less verbose and a move toward more informal and personal ways of writing.
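The abstract does not name the measure of variance that was chosen. Purely as an assumption for illustration, the sketch below uses the coefficient of variation (standard deviation as a percentage of the mean) to score how much a word's frequency moves across four equal-sized corpora. The function name and the example frequencies are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(freqs):
    """Population standard deviation as a percentage of the mean."""
    avg = mean(freqs)
    if avg == 0:
        return 0.0  # a word absent from every corpus shows no variation
    return 100 * pstdev(freqs) / avg


# Hypothetical frequencies of one word in four equal-sized corpora.
stable_word = [100, 100, 100, 100]
changing_word = [10, 20, 30, 40]
print(coefficient_of_variation(stable_word))
print(round(coefficient_of_variation(changing_word), 2))
```

A score like this only indicates how much a frequency varies. Deciding whether a word is increasing or decreasing still requires looking at the order of the values over time.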
In order to investigate frequency and context of usage of gender marked language, four equal sized and equivalently sampled corpora of British English in a range of written genres (press, fiction, general prose, learned writing), from 1931, 1961, 1991 and 2006 were compared. Terms that were investigated included male and female pronouns, man, woman, boy and girl, gender-related profession and role nouns such as chairman, spokesperson and policewoman, and terms of address such as Mr and Ms. Some reductions in frequencies of male terms were found over time, particularly in terms of decreases of male pronouns and Mr. However, equal frequencies did not necessarily equate with equal representation. A qualitative analysis of man and woman found that while there had been some reductions in gender stereotypes, others were being maintained (such as a lack of adjectives like successful or powerful being applied to words like woman). Additionally, the term girl was still more likely than the term boy to refer to adults, and it was often used in a disparaging or sexual way. The article concludes with a discussion of the sort of linguistic strategies that appear to have been successful in terms of equalising gender representation.
This paper describes the analysis of an 87 million word corpus of British newspaper articles which refer to the subject of Islam. In order to examine representations of Islam and Muslims, the corpus was subjected to a comparative analysis, by analysing the lexis that was used most significantly in the tabloid articles, when compared to the broadsheets, and vice versa. Concordances were then analysed in order to investigate the data in a more qualitative way. It was found that the tabloids tended to focus more on British interests, writing about Muslims in a highly emotional style, in connection with terrorist attacks and religious extremism, focussing on a small number of high-profile Muslim “villains”. On the other hand, the broadsheets had a more restrained reporting stance, writing about Muslims in a wider range of contexts, although their focus on world news resulted in them covering more stories about Muslims engaged in wars. The paper raises issues regarding the meaning of bias, and the process by which readers internalise lexical associations and the extent to which such associations impact on attitudes.
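The abstract does not say which statistic was used to find the lexis used "most significantly" in one set of newspapers. A common choice for this kind of keyword comparison is the log-likelihood score, sketched below on that assumption. The function name and the example counts are hypothetical.

```python
import math


def log_likelihood(freq_a, freq_b, size_a, size_b):
    """Log-likelihood keyness of a word across two corpora.

    freq_a, freq_b: occurrences of the word in corpus A and corpus B.
    size_a, size_b: total token counts of corpus A and corpus B.
    """
    total = freq_a + freq_b
    # Expected frequencies if the word were spread evenly by corpus size.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    score = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_a > 0:
        score += freq_a * math.log(freq_a / expected_a)
    if freq_b > 0:
        score += freq_b * math.log(freq_b / expected_b)
    return 2 * score


# Invented counts: a word seen 10 times in corpus A and never in corpus B,
# with both corpora containing 100 tokens.
example_score = log_likelihood(10, 0, 100, 100)
print(round(example_score, 2))
```

The score reflects only the size of the difference, so a separate comparison of relative frequencies is needed to tell whether a word is key in the tabloids or in the broadsheets.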
This paper describes the BE06 Corpus, a one million word reference corpus of general written British English that was designed to be comparable to the Brown family of corpora. After providing a description of the Brown sampling frame, and giving the rationale for building a new corpus, the process of building the BE06 is elaborated upon, with reference to collecting previously published texts from internet sources, defining "British" authors and enabling accessibility of the corpus. Three studies of lexical frequency using BE06 and comparable corpora (LOB, FLOB and BLOB) are then carried out. These involve a comparison of the 20 most frequent lexical items, an examination of pronoun usage, and an investigation of keywords derived from comparing the 1991 FLOB corpus with the BE06. The paper ends with a critical evaluation of the worth of using the same sampling frame for linguistic studies of diachronic variation.
This title acts as a one-volume resource, providing an introduction to every aspect of corpus linguistics as it is being used at the moment. Corpus linguistics uses large electronic databases of language to examine hypotheses about language use. These can be tested scientifically with computerised analytical tools, without the researcher's preconceptions influencing their conclusions. For this reason, corpus linguistics is a popular and expanding area of study. "Contemporary Corpus Linguistics" presents a comprehensive survey of the ways in which corpus linguistics is being used by researchers. Written by internationally renowned linguists, this volume of seventeen introductory chapters aims to provide a snapshot of the field. The contributors present accessible, yet detailed, analyses of recent methods and theory in corpus linguistics, ways of analysing corpora, and recent applications in translation, stylistics, discourse analysis and language teaching. The book represents the best of current practice in corpus linguistics, and as a one-volume reference will be invaluable to students and researchers looking for an overview of the field.
A corpus-based analysis of discourses of refugees and asylum seekers was carried out on data taken from a range of British newspapers and texts from the Office of the United Nations High Commissioner for Refugees website, both published in 2003. Concordances of the terms refugee(s) and asylum seeker(s) were examined and grouped along patterns which revealed linguistic traces of discourses. Discourses which framed refugees as packages, invaders, pests or water were found in newspaper texts, although there were also cases of negative discourses found in the UNHCR texts, revealing how difficult it is to disregard dominant discourses. Lexical choice was found to be an essential aspect of maintaining discourses of asylum seekers — collocational analyses of terms like failed vs. rejected revealed the underlying attitudes of the writers towards the subject.
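A concordance of the kind described here lists every occurrence of a search term with a few words of context on either side. The following is a minimal keyword-in-context sketch, not the tool used in the study. The function name, the window size and the sample sentence are all invented for illustration.

```python
def kwic(tokens, node_terms, window=4):
    """Return (left context, node, right context) for each match in tokens."""
    wanted = {term.lower() for term in node_terms}
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() in wanted:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Invented sample text, split naively on whitespace.
sample = "a flood of refugees arrived while other refugees waited".split()
hits = kwic(sample, {"refugee", "refugees"}, window=2)

for left, node, right in hits:
    print(f"{left:>20} | {node} | {right}")
```

Sorting the resulting lines by the word immediately to the left or right of the node is one simple way to make recurring patterns, such as quantity expressions before refugees, easier to spot by eye.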
This paper examines the discursive construction of refugees and asylum seekers (and to a lesser extent immigrants and migrants) in a 140-million-word corpus of UK press articles published between 1996 and 2005. Taking a corpus-based approach, the data were analyzed not only as a whole, but also with regard to synchronic variation, by carrying out concordance analyses of keywords which occurred within tabloid and broadsheet newspapers, and diachronic change, albeit mainly approached from an unusual angle, by investigating consistent collocates and frequencies of specific terms over time. The analyses point to a number of (mainly negative) categories of representation, the existence and development of nonsensical terms (e.g., illegal refugee), and media confusion and conflation of definitions of the four terms under examination. The paper concludes by critically discussing the extent to which a corpus-based methodological stance can inform critical discourse analysis.
This article discusses the extent to which methods normally associated with corpus linguistics can be effectively used by critical discourse analysts. Our research is based on the analysis of a 140-million-word corpus of British news articles about refugees, asylum seekers, immigrants and migrants (collectively RASIM). We discuss how processes such as collocation and concordance analysis were able to identify common categories of representation of RASIM as well as directing analysts to representative texts in order to carry out qualitative analysis. The article suggests a framework for adopting corpus approaches in critical discourse analysis.
Research Interests: Discourse Analysis, Sociology, Applied Linguistics, Corpus Linguistics, Critical Discourse Analysis, Linguistics, Discourse and Society, Refugee, Corpus linguistic, Language Culture and Communication, Qualitative Analysis, Asylum seeker, Representation Politics, and Psychology and Cognitive Sciences
The EMILLE Project (Enabling Minority Language Engineering) was established to construct a 67 million word corpus of South Asian languages. In addition, the project has had to address a number of issues related to establishing a language engineering (LE) environment for ...
State of the EMILLE corpora and outline the motives behind the various refinements that have been made to EMILLE's goals. 2.1 Monolingual written corpora The first major challenge facing any corpus builder is the identification of suitable sources of corpus data. Design criteria ...
This paper first discusses standards for developing Asian language corpora so as to facilitate international data exchange. Following this, we present two corpora of Asian languages developed at Lancaster University: the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we will demonstrate how to explore these corpora using Xara and other corpus tools.
Both topoi can be supported and reinforced by the use of 'quantity' or 'group' collocations, embodying 'water' or 'war/crime' metaphors (e.g. flood/river/tide/wave of refugees; army/hordes/gangs of refugees), which give rise to negative semantic/discourse prosodies related to their inordinate number, and, therefore, threat.
This paper reports work on an ongoing project on the representation of refugees and asylum seekers in the UK press. In recent years, the issue of refugees and asylum seekers entering the UK has attracted intense media and political discussion. As the representation of these groups in the press can influence the way in which readers perceive them, the discourses surrounding these, and related, groups have been the focus of linguistic studies (e.g. Greenslade, 2005; ter Wal, 2002). Although the project combines approaches within ...