Showing 1–9 of 9 results for author: José, M M

Searching in archive cs.
  1. arXiv:2407.04858  [pdf, other]

    cs.CL

    Question Answering with Texts and Tables through Deep Reinforcement Learning

    Authors: Marcos M. José, Flávio N. Cação, Maria F. Ribeiro, Rafael M. Cheang, Paulo Pirozelli, Fabio G. Cozman

    Abstract: This paper proposes a novel architecture to generate multi-hop answers to open domain questions that require information from texts and tables, using the Open Table-and-Text Question Answering dataset for validation and training. One of the most common ways to generate answers in this setting is to retrieve information sequentially, where a selected piece of data helps searching for the next piece…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2402.19204  [pdf, other]

    cs.CL

    PeLLE: Encoder-based language models for Brazilian Portuguese based on open data

    Authors: Guilherme Lamartine de Mello, Marcelo Finger, Felipe Serras, Miguel de Mello Carpi, Marcos Menon Jose, Pedro Henrique Domingues, Paulo Cavalim

    Abstract: In this paper we present PeLLE, a family of large language models based on the RoBERTa architecture, for Brazilian Portuguese, trained on curated, open data from the Carolina corpus. Aiming at reproducible results, we describe details of the pretraining of the models. We also evaluate PeLLE models against a set of existing multilingual and PT-BR refined pretrained Transformer-based LLM encoders, c…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 15 pages

    ACM Class: I.2.7

  3. arXiv:2312.11720  [pdf, other]

    cs.CL cs.AI

    Assessing Logical Reasoning Capabilities of Encoder-Only Transformer Models

    Authors: Paulo Pirozelli, Marcos M. José, Paulo de Tarso P. Filho, Anarosa A. F. Brandão, Fabio G. Cozman

    Abstract: Logical reasoning is central to complex human activities, such as thinking, debating, and planning; it is also a central component of many AI systems as well. In this paper, we investigate the extent to which encoder-only transformer language models (LMs) can reason according to logical rules. We ask whether those LMs can deduce theorems in propositional calculus and first-order logic; if their re…

    Submitted 1 July, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  4. arXiv:2309.10945  [pdf, other]

    cs.CL cs.AI

    Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change

    Authors: Paulo Pirozelli, Marcos M. José, Igor Silveira, Flávio Nakasato, Sarajane M. Peres, Anarosa A. F. Brandão, Anna H. R. Costa, Fabio G. Cozman

    Abstract: Pirá is a reading comprehension dataset focused on the ocean, the Brazilian coast, and climate change, built from a collection of scientific abstracts and reports on these topics. This dataset represents a versatile language resource, particularly useful for testing the ability of current machine learning models to acquire expert scientific knowledge. Despite its potential, a detailed set of basel…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted at Data Intelligence. Online ISSN 2641-435X

  5. arXiv:2210.04107  [pdf, other]

    cs.CL cs.AI

    Comparing Computational Architectures for Automated Journalism

    Authors: Yan V. Sym, João Gabriel M. Campos, Marcos M. José, Fabio G. Cozman

    Abstract: The majority of NLG systems have been designed following either a template-based or a pipeline-based architecture. Recent neural models for data-to-text generation have been proposed with an end-to-end deep learning flavor, which handles non-linguistic input in natural language without explicit intermediary representations. This study compares the most often employed methods for generating Brazili…

    Submitted 8 October, 2022; originally announced October 2022.

    Comments: Accepted at the 19th National Meeting on Artificial and Computational Intelligence (ENIAC 2022)

  6. arXiv:2209.07928  [pdf, other]

    cs.AI cs.CL eess.SY

    The BLue Amazon Brain (BLAB): A Modular Architecture of Services about the Brazilian Maritime Territory

    Authors: Paulo Pirozelli, Ais B. R. Castro, Ana Luiza C. de Oliveira, André S. Oliveira, Flávio N. Cação, Igor C. Silveira, João G. M. Campos, Laura C. Motheo, Leticia F. Figueiredo, Lucas F. A. O. Pellicer, Marcelo A. José, Marcos M. José, Pedro de M. Ligabue, Ricardo S. Grava, Rodrigo M. Tavares, Vinícius B. Matos, Yan V. Sym, Anna H. R. Costa, Anarosa A. F. Brandão, Denis D. Mauá, Fabio G. Cozman, Sarajane M. Peres

    Abstract: We describe the first steps in the development of an artificial agent focused on the Brazilian maritime territory, a large region within the South Atlantic also known as the Blue Amazon. The "BLue Amazon Brain" (BLAB) integrates a number of services aimed at disseminating information about this region and its importance, functioning as a tool for environmental awareness. The main service provided…

    Submitted 6 September, 2022; originally announced September 2022.

    Journal ref: AI: Modeling Oceans and Climate Change (IJCAI-ECAI), 2022

  7. Integrating question answering and text-to-SQL in Portuguese

    Authors: Marcos Menon José, Marcelo Archanjo José, Denis Deratani Mauá, Fábio Gagliardi Cozman

    Abstract: Deep learning transformers have drastically improved systems that automatically answer questions in natural language. However, different questions demand different answering techniques; here we propose, build and validate an architecture that integrates different modules to answer two distinct kinds of queries. Our architecture takes a free-form natural language text and classifies it to send it e…

    Submitted 21 September, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: Published at International Conference on the Computational Processing of Portuguese (PROPOR 2022)

    Journal ref: Computational Processing of the Portuguese Language 2022

  8. Pirá: A Bilingual Portuguese-English Dataset for Question-Answering about the Ocean

    Authors: André F. A. Paschoal, Paulo Pirozelli, Valdinei Freire, Karina V. Delgado, Sarajane M. Peres, Marcos M. José, Flávio Nakasato, André S. Oliveira, Anarosa A. F. Brandão, Anna H. R. Costa, Fabio G. Cozman

    Abstract: Current research in natural language processing is highly dependent on carefully produced corpora. Most existing resources focus on English; some resources focus on languages such as Chinese and French; few resources deal with more than one language. This paper presents the Pirá dataset, a large set of questions and answers about the ocean and the Brazilian coast both in Portuguese and English. Pi…

    Submitted 4 February, 2022; originally announced February 2022.

    Comments: https://github.com/C4AI/Pira

    Journal ref: CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021

  9. DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment

    Authors: Flávio Nakasato Cação, Marcos Menon José, André Seidel Oliveira, Stefano Spindola, Anna Helena Reali Costa, Fábio Gagliardi Cozman

    Abstract: The challenge of climate change and biome conservation is one of the most pressing issues of our time - particularly in Brazil, where key environmental reserves are located. Given the availability of large textual databases on ecological themes, it is natural to resort to question answering (QA) systems to increase social awareness and understanding about these topics. In this work, we introduce m…

    Submitted 19 October, 2021; originally announced October 2021.

    Comments: Accepted at BRACIS 2021