
Showing 1–19 of 19 results for author: Shamsfard, M

Searching in archive cs. Search in all archives.
  1. arXiv:2406.00867  [pdf]

    cs.CL cs.AI

    Formality Style Transfer in Persian

    Authors: Parastoo Falakaflaki, Mehrnoush Shamsfard

    Abstract: This study explores the formality style transfer in Persian, particularly relevant in the face of the increasing prevalence of informal language on digital platforms, which poses challenges for existing Natural Language Processing (NLP) tools. The aim is to transform informal text into formal while retaining the original meaning, addressing both lexical and syntactic differences. We introduce a no…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 20 pages, 4 figures, 8 tables

  2. arXiv:2402.14838  [pdf, other]

    cs.CL cs.AI cs.LG

    RFBES at SemEval-2024 Task 8: Investigating Syntactic and Semantic Features for Distinguishing AI-Generated and Human-Written Texts

    Authors: Mohammad Heydari Rad, Farhan Farsi, Shayan Bali, Romina Etezadi, Mehrnoush Shamsfard

    Abstract: Nowadays, the usage of Large Language Models (LLMs) has increased, and LLMs have been used to generate texts in different languages and for different tasks. Additionally, due to the participation of remarkable companies such as Google and OpenAI, LLMs are now more accessible, and people can easily use them. However, an important issue is how we can detect AI-generated texts from human-written ones…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: Mohammad Heydari Rad, Farhan Farsi, and Shayan Bali have made equal contributions to this work

  3. arXiv:2402.06617  [pdf, other]

    cs.CL

    FaBERT: Pre-training BERT on Persian Blogs

    Authors: Mostafa Masumi, Seyed Soroush Majd, Mehrnoush Shamsfard, Hamid Beigy

    Abstract: We introduce FaBERT, a Persian BERT-base model pre-trained on the HmBlogs corpus, encompassing both informal and formal Persian texts. FaBERT is designed to excel in traditional Natural Language Understanding (NLU) tasks, addressing the intricacies of diverse sentence structures and linguistic styles prevalent in the Persian language. In our comprehensive evaluation of FaBERT on 12 datasets in var…

    Submitted 9 February, 2024; originally announced February 2024.

  4. arXiv:2308.05336  [pdf]

    cs.CL

    Developing an Informal-Formal Persian Corpus

    Authors: Vahide Tajalli, Fateme Kalantari, Mehrnoush Shamsfard

    Abstract: Informal language is a style of spoken or written language frequently used in casual conversations, social media, weblogs, emails and text messages. In informal writing, the language faces some lexical and/or syntactic changes varying among different languages. Persian is one of the languages with many differences between its formal and informal styles of writing, thus developing informal language…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: 16 pages, 1 Figure and 3 tables

  5. Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning

    Authors: Mozhgan Pourkeshavarz, Shahabedin Nabavi, Mohsen Ebrahimi Moghaddam, Mehrnoush Shamsfard

    Abstract: Recently, the attention-enriched encoder-decoder framework has aroused great interest in image captioning due to its overwhelming progress. Many visual attention models directly leverage meaningful regions to generate image descriptions. However, seeking a direct transition from visual space to text is not enough to generate fine-grained captions. This paper exploits a feature-compounding approach…

    Submitted 8 February, 2023; originally announced February 2023.

    Journal ref: Multimedia Tools and Applications, Volume 83, pages 12209-12233, 2024

  6. arXiv:2203.15323  [pdf, other]

    cs.CL

    Improving Persian Relation Extraction Models by Data Augmentation

    Authors: Moein Salimi Sartakhti, Romina Etezadi, Mehrnoush Shamsfard

    Abstract: Relation extraction, the task of predicting the semantic relation type between entities in a sentence or document, is an important task in natural language processing. Although there is much research and many datasets for English, Persian suffers from a lack of sufficient research and comprehensive datasets. The only available Persian dataset for this task is PERLEX, which is a Persian expert-translated v…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: 5 pages, 6 images

    ACM Class: I.2.7

    Journal ref: Proceedings of The Second International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2021) co-located with ICNLSP 2021, pages 32-37, Trento, Italy. Association for Computational Linguistics

  7. arXiv:2111.02362  [pdf]

    cs.CL

    HmBlogs: A big general Persian corpus

    Authors: Hamzeh Motahari Khansari, Mehrnoush Shamsfard

    Abstract: This paper introduces the hmBlogs corpus for Persian, as a low resource language. This corpus has been prepared based on a collection of nearly 20 million blog posts over a period of about 15 years from a space of Persian blogs and includes more than 6.8 billion tokens. It can be claimed that this corpus is currently the largest Persian corpus that has been prepared independently for the Persian l…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: 22 pages

  8. arXiv:2110.05133  [pdf, other]

    cs.CL cs.AI

    Offensive Language Detection with BERT-based models, By Customizing Attention Probabilities

    Authors: Peyman Alavi, Pouria Nikvand, Mehrnoush Shamsfard

    Abstract: This paper describes a novel study on using 'Attention Mask' input in transformers and using this approach for detecting offensive content in both English and Persian languages. The paper's principal focus is to suggest a methodology to enhance the performance of the BERT-based models on the 'Offensive Language Detection' task. Therefore, we customize attention probabilities by changing the 'Atten…

    Submitted 11 October, 2021; originally announced October 2021.

  9. arXiv:2107.02040  [pdf]

    cs.CL cs.AI

    A Knowledge-based Approach for Answering Complex Questions in Persian

    Authors: Romina Etezadi, Mehrnoush Shamsfard

    Abstract: Research on open-domain question answering (QA) has a long tradition. A challenge in this domain is answering complex questions (CQA) that require complex inference methods and large amounts of knowledge. In low resource languages, such as Persian, there are not many datasets for open-domain complex questions and also the language processing toolkits are not very accurate. In this paper, we propos…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: 9 pages, 5 figures

    MSC Class: I.2.7; I.2.4

  10. arXiv:2107.01987  [pdf]

    cs.CL

    Contradiction Detection in Persian Text

    Authors: Zeinab Rahimi, Mehrnoush ShamsFard

    Abstract: Detection of semantically contradictory sentences is one of the most challenging and fundamental issues for NLP applications such as recognition of textual entailments. Contradiction in this study includes different types of semantic confrontation, such as conflict and antonymy. Due to lack of sufficient data to apply precise machine learning and specifically deep learning methods to Persian and other…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: 24 pages, 9 tables and 5 figures

  11. arXiv:2107.01540  [pdf, other]

    cs.CL

    Persian-WSD-Corpus: A Sense Annotated Corpus for Persian All-words Word Sense Disambiguation

    Authors: Hossein Rouhizadeh, Mehrnoush Shamsfard, Vahideh Tajalli, Masoud Rouhziadeh

    Abstract: Word Sense Disambiguation (WSD) is a long-standing task in Natural Language Processing (NLP) that aims to automatically identify the most relevant meaning of the words in a given context. Developing standard WSD test collections is an important prerequisite for developing and evaluating different WSD systems in the language of interest. Although many WSD test collections have been…

    Submitted 4 July, 2021; originally announced July 2021.

  12. arXiv:2106.15674  [pdf, other]

    cs.CL cs.AI cs.LG

    SAT Based Analogy Evaluation Framework for Persian Word Embeddings

    Authors: Seyyed Ehsan Mahmoudi, Mehrnoush Shamsfard

    Abstract: In recent years there has been a special interest in word embeddings as a new approach to convert words to vectors. It has been a focal point to understand how much of the semantics of the words has been transferred into embedding vectors. This is important as the embedding is going to be used as the basis for downstream NLP applications and it will be costly to evaluate the application end-to…

    Submitted 29 June, 2021; originally announced June 2021.

  13. PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph

    Authors: Romina Etezadi, Mehrnoush Shamsfard

    Abstract: Question answering systems may find the answers to users' questions from either unstructured texts or structured data such as knowledge graphs. Answering questions using supervised learning approaches, including deep learning models, needs large training datasets. In recent years, some datasets have been presented for the task of question answering over knowledge graphs, which is the focus of this pa…

    Submitted 27 June, 2021; originally announced June 2021.

    Comments: 5 pages, 4 figures

  14. arXiv:2106.14165  [pdf]

    cs.CL

    Persian Causality Corpus (PerCause) and the Causality Detection Benchmark

    Authors: Zeinab Rahimi, Mehrnoush ShamsFard

    Abstract: Recognizing causal elements and causal relations in text is one of the challenging issues in natural language processing, specifically in low resource languages such as Persian. In this research we prepare a human-annotated causality corpus for the Persian language which consists of 4446 sentences and 5128 causal relations and three labels of cause, effect and causal mark -- if possible -- are spe…

    Submitted 27 June, 2021; originally announced June 2021.

    Comments: 20 pages, 6 figures and 10 tables

    ACM Class: I.2.7

  15. arXiv:2003.08875  [pdf, other]

    cs.CL cs.LG

    Beheshti-NER: Persian Named Entity Recognition Using BERT

    Authors: Ehsan Taher, Seyed Abbas Hoseini, Mehrnoush Shamsfard

    Abstract: Named entity recognition is a natural language processing task to recognize and extract spans of text associated with named entities and classify them in semantic categories. Google BERT is a deep bidirectional language model, pre-trained on large corpora, that can be fine-tuned to solve many NLP tasks such as question answering, named entity recognition, part of speech tagging, etc. In this p…

    Submitted 19 March, 2020; originally announced March 2020.

  16. arXiv:1909.03792  [pdf]

    q-fin.ST cs.CL cs.LG

    Tehran Stock Exchange Prediction Using Sentiment Analysis of Online Textual Opinions

    Authors: Arezoo Hatefi Ghahfarrokhi, Mehrnoush Shamsfard

    Abstract: In this paper, we investigate the impact of the social media data in predicting the Tehran Stock Exchange (TSE) variables for the first time. We consider the closing price and daily return of three different stocks for this investigation. We collected our social media data from Sahamyab.com/stocktwits for about three months. To extract information from online comments, we propose a hybrid sentimen…

    Submitted 27 September, 2019; v1 submitted 30 August, 2019; originally announced September 2019.

    Comments: Intelligent Systems in Accounting, Finance and Management (2019)

  17. arXiv:1904.00766  [pdf]

    cs.CV

    A Weighted Multi-Criteria Decision Making Approach for Image Captioning

    Authors: Hassan Maleki Galandouz, Mohsen Ebrahimi Moghaddam, Mehrnoush Shamsfard

    Abstract: Image captioning aims at automatically generating descriptions of an image in natural language. This is a challenging problem in the field of artificial intelligence that has recently received significant attention in computer vision and natural language processing. Among the existing approaches, visual retrieval based methods have been proven to be highly effective. These approaches search fo…

    Submitted 17 March, 2019; originally announced April 2019.

    Comments: 12 pages

  18. arXiv:1805.02290  [pdf]

    cs.AI

    The State of the Art in Developing Fuzzy Ontologies: A Survey

    Authors: Zahra Riahi Samani, Mehrnoush Shamsfard

    Abstract: Conceptual formalism supported by typical ontologies may not be sufficient to represent uncertainty information caused by the lack of clear-cut boundaries between concepts of a domain. Fuzzy ontologies are proposed to offer a way to deal with this uncertainty. This paper describes the state of the art in developing fuzzy ontologies. The survey is produced by studying about 35 works on…

    Submitted 6 May, 2018; originally announced May 2018.

    MSC Class: 97R40; 68T37; 03B52; 68T30

  19. arXiv:1408.0325  [pdf, other]

    cs.SI cs.IR cs.LG

    Matrix Factorization with Explicit Trust and Distrust Relationships

    Authors: Rana Forsati, Mehrdad Mahdavi, Mehrnoush Shamsfard, Mohamed Sarwat

    Abstract: With the advent of online social networks, recommender systems have become crucial for the success of many online applications/services due to their significant role in tailoring these applications to user-specific needs or preferences. Despite their increasing popularity, in general recommender systems suffer from the data sparsity and the cold-start problems. To alleviate these issues, in recen…

    Submitted 1 August, 2014; originally announced August 2014.

    Comments: ACM Transactions on Information Systems