research-article
Not all arguments are processed equally: a distributional model of argument complexity
Abstract

This work addresses some questions about language processing: what does it mean that natural language sentences are semantically complex? What semantic features can determine different degrees of difficulty for human comprehenders? Our goal is to ...

research-article
Sense representations for Portuguese: experiments with sense embeddings and deep neural language models
Abstract

Sense representations have gone beyond word representations like Word2Vec, GloVe and FastText and achieved innovative performance on a wide range of natural language processing tasks. Although very useful in many applications, the traditional ...

research-article
Corpora compilation for prosody-informed speech processing
Abstract

Research on speech technologies necessitates spoken data, which is usually obtained through read recorded speech, and specifically adapted to the research needs. When the aim is to deal with the prosody involved in speech, the available data must ...

research-article
Low resource language specific pre-processing and features for sentiment analysis task
Abstract

Sentiment analysis is a classification task where polarity of textual data is identified, i.e. to analyze whether a sentence or document expresses a negative, positive or neutral sentiment. Manipuri is a less privileged, highly agglutinative and ...

research-article
Roman Urdu toxic comment classification
Abstract

With the increasing popularity of user-generated content on social media, the number of toxic texts is also on the rise. Such texts cause adverse effects on users and society at large, therefore, the identification of toxic comments is a growing ...

research-article
TuLeD (Tupían lexical database): introducing a database of a South American language family
Abstract

The last two decades witnessed a rapid growth of publicly accessible online language resources. This has allowed for valuable data on lesser known languages to become available. Such resources provide linguists with opportunities for advancing ...

research-article
Annotating affective dimensions in user-generated content: Comparing the reliability of best–worst scaling, pairwise comparison and rating scales for annotating valence, arousal and dominance
Abstract

In an era where user-generated content becomes ever more prevalent, reliable methods to judge emotional properties of these kinds of complex texts are needed, for example for developing corpora in machine learning contexts. In this study, we focus ...

brief-report
LanguageCrawl: a generic tool for building language models upon Common Crawl
Abstract

The exponential growth of the internet community has resulted in the production of a vast amount of unstructured data, including web pages, blogs and social media. Such a volume consisting of hundreds of billions of words is unlikely to be ...

brief-report
A multimodal corpus of simulated consultations between a patient and multiple healthcare professionals
Abstract

Language resources for studying doctor–patient interaction are rare, primarily due to the ethical issues related to recording real medical consultations. Rarer still are resources that involve more than one healthcare professional in consultation ...

brief-report
LexO: an open-source system for managing OntoLex-Lemon resources
Abstract

The adoption of Semantic Web technologies and the Linked Data paradigm has been driven by the need to ensure the construction of resources that are at the same time interoperable, shareable and reusable by the scientific community. OntoLex-Lemon, ...

brief-report
A corpus of Schlieren photography of speech production: potential methodology to study aerodynamics of labial, nasal and vocalic processes
Abstract

This report presents a corpus of articulations recorded with Schlieren photography, a recording technique to visualize aeroflow dynamics for two purposes. First, as a means to investigate aerodynamic processes during speech production without any ...
