More Than Syntaxes: Investigating Semantics to Zero-shot Cross-lingual Relation Extraction and Event Argument Role Labelling
Syntactic dependency structures are commonly utilized as language-agnostic features to solve the word order difference issues in zero-shot cross-lingual relation and event extraction tasks. However, while sentences in multiple forms can be employed to ...
A Research on University Students’ Behavioral Intention to Use New-generation Information Technology in Intelligent Foreign Language Learning
A better understanding of how advancement in science and technology affects students’ learning behavior in an academic setting can help all educators in higher education. With the advancement of science and technology, the new-generation information ...
Knowledge-based Data Processing for Multilingual Natural Language Analysis
Natural Language Processing (NLP) aids the empowerment of intelligent machines by enhancing human language understanding for linguistic-based human-computer communication. Recent developments in processing power, as well as the availability of large ...
Automatically Temporal Labeled Data Generation Using Positional Lexicon Expansion for Focus Time Estimation of News Articles
Many facts change over time, which is a fundamental aspect of our physical environment. In the case of pandemic articles, the user is not interested in the creation date of the document but in the facts and the cause of the last pandemic. Fake news can be ...
Multilingual Neural Machine Translation for Indic to Indic Languages
The method of translation from one language to another without human intervention is known as Machine Translation (MT). Multilingual neural machine translation (MNMT) is a technique for MT that builds a single model for multiple languages. It is preferred ...
A Novel Pretrained General-purpose Vision Language Model for the Vietnamese Language
Lying at the intersection of computer vision and natural language processing, vision language models are capable of processing images and text at once. These models are helpful in various tasks: text generation from images and vice versa, image-text ...
Crossing Linguistic Barriers: Authorship Attribution in Sinhala Texts
Authorship attribution involves determining the original author of an anonymous text from a pool of potential authors. The author attribution task has applications in several domains, such as plagiarism detection, digital text forensics, and information ...
Fast Recurrent Neural Network with Bi-LSTM for Handwritten Tamil Text Segmentation in NLP
Tamil text segmentation is a long-standing test in language comprehension that entails separating a record into adjacent pieces based on its semantic design. Each segment is important in its own way. The segments are organised according to the purpose of ...
Multization: Multi-Modal Summarization Enhanced by Multi-Contextually Relevant and Irrelevant Attention Alignment
This article focuses on the task of Multi-Modal Summarization with Multi-Modal Output for product descriptions from the Chinese e-commerce platform JD.COM, which contain both source text and source images. In the context learning of multi-modal (text and image) input, there ...
Part-of-speech Tagging for Low-resource Languages: Activation Function for Deep Learning Network to Work with Minimal Training Data
Numerous natural language processing (NLP) applications exist today, especially for the most commonly spoken languages such as English, Chinese, and Spanish. Popular traditional methods such as rule-based methods, Naive Bayes classifiers, Hidden Markov ...
Performance of Binarization Algorithms on Tamizhi Inscription Images: An Analysis
Binarization of Tamizhi (Tamil-Brahmi) inscription images is highly challenging, as they are captured from very old stone inscriptions that date from around the 3rd century BCE in India. The difficulty is due to the degradation of these inscriptions by ...
Knowledge-Enriched Prompt for Low-Resource Named Entity Recognition
Named Entity Recognition (NER) in low-resource settings aims to identify and categorize entities in a sentence with limited labeled data. Although prompt-based methods have succeeded in low-resource perspectives, challenges persist in effectively ...
Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts
Scholars in the humanities heavily rely on ancient manuscripts to study history, religion, and socio-political structures of the past. Significant efforts have been devoted to digitizing these precious manuscripts using OCR technology. However, most ...
Supervised Contrast Learning Text Classification Model Based on Data Quality Augmentation
Token-level data augmentation generates text samples by modifying the words of the sentences. However, data that are not easily classified can negatively affect the model. In particular, not considering the role of keywords when performing random ...