-
FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking
Authors:
Zhuoer Wang,
Leonardo F. R. Ribeiro,
Alexandros Papangelis,
Rohan Mukherjee,
Tzu-Yen Wang,
Xinyan Zhao,
Arijit Biswas,
James Caverlee,
Angeliki Metallinou
Abstract:
API call generation is the cornerstone of large language models' tool-using ability that provides access to the larger world. However, existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. To address these limitations, we propose an output-side optimization approach called FANTASE. Two of the unique contributions of FANTASE are its State-Tracked Constrained Decoding (SCD) and Reranking components. SCD dynamically incorporates appropriate API constraints in the form of Token Search Trie for efficient and guaranteed generation faithfulness with respect to the API documentation. The Reranking component efficiently brings in the supervised signal by leveraging a lightweight model as the discriminator to rerank the beam-searched candidate generations of the large language model. We demonstrate the superior performance of FANTASE in API call generation accuracy, inference efficiency, and context efficiency with DSTC8 and API Bank datasets.
Submitted 18 July, 2024;
originally announced July 2024.
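The idea of constraining decoding with a token trie, as described in the FANTASE abstract above, can be illustrated with a toy sketch. This is not the authors' implementation: the `TokenTrie` class, the `constrained_greedy_decode` function, the API names, and the scoring table are all hypothetical, and greedy selection stands in for the beam search and reranking mentioned in the abstract.

```python
from typing import Callable, Dict, List, Sequence


class TokenTrie:
    """Prefix tree over token sequences; each root-to-end path is one valid output."""

    def __init__(self) -> None:
        self.children: Dict[str, "TokenTrie"] = {}
        self.is_end = False

    def insert(self, tokens: Sequence[str]) -> None:
        node = self
        for tok in tokens:
            node = node.children.setdefault(tok, TokenTrie())
        node.is_end = True

    def allowed_next(self, prefix: Sequence[str]) -> List[str]:
        """Return the tokens that may legally follow `prefix` (empty if none)."""
        node = self
        for tok in prefix:
            nxt = node.children.get(tok)
            if nxt is None:
                return []
            node = nxt
        return sorted(node.children)


def constrained_greedy_decode(
    trie: TokenTrie,
    score: Callable[[List[str], str], float],
    max_steps: int = 16,
) -> List[str]:
    """At each step, pick the best-scoring token among those the trie allows."""
    out: List[str] = []
    for _ in range(max_steps):
        allowed = trie.allowed_next(out)
        if not allowed:  # reached a leaf: nothing more can be generated
            break
        out.append(max(allowed, key=lambda t: score(out, t)))
    return out


# Hypothetical API documentation, pre-tokenized for simplicity.
trie = TokenTrie()
trie.insert(["GetWeather", "(", "city", ")"])
trie.insert(["GetWeather", "(", "date", ")"])
trie.insert(["FindEvents", "(", "city", ")"])

# Stand-in for model scores: the "model" prefers an API that does not exist.
prefs = {"BookHotel": 3.0, "GetWeather": 2.0, "date": 1.5, "city": 1.0}
result = constrained_greedy_decode(trie, lambda prefix, tok: prefs.get(tok, 0.0))
print(result)
```

Because candidates are drawn only from the trie, the undocumented `BookHotel` token can never be emitted, however highly the scoring function rates it.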
-
Measuring Retrieval Complexity in Question Answering Systems
Authors:
Matteo Gabburo,
Nicolaas Paul Jedema,
Siddhant Garg,
Leonardo F. R. Ribeiro,
Alessandro Moschitti
Abstract:
In this paper, we investigate which questions are challenging for retrieval-based Question Answering (QA). We (i) propose retrieval complexity (RC), a novel metric conditioned on the completeness of retrieved documents, which measures the difficulty of answering questions, and (ii) propose an unsupervised pipeline to measure RC given an arbitrary retrieval system. Our proposed pipeline measures RC more accurately than alternative estimators, including LLMs, on six challenging QA benchmarks. Further investigation reveals that RC scores strongly correlate with both QA performance and expert judgment across five of the six studied benchmarks, indicating that RC is an effective measure of question difficulty. Subsequent categorization of high-RC questions shows that they span a broad set of question shapes, including multi-hop, compositional, and temporal QA, indicating that RC scores can categorize a new subset of complex questions. Our system can also have a major impact on retrieval-based systems by helping to identify more challenging questions on existing datasets.
Submitted 5 June, 2024;
originally announced June 2024.
-
Artificial Intelligence Approaches for Predictive Maintenance in the Steel Industry: A Survey
Authors:
Jakub Jakubowski,
Natalia Wojak-Strzelecka,
Rita P. Ribeiro,
Sepideh Pashami,
Szymon Bobek,
Joao Gama,
Grzegorz J Nalepa
Abstract:
Predictive Maintenance (PdM) has emerged as one of the pillars of Industry 4.0 and has become crucial for enhancing operational efficiency, making it possible to minimize downtime, extend equipment lifespan, and prevent failures. A wide range of PdM tasks can be performed using Artificial Intelligence (AI) methods, which often use data generated from industrial sensors. The steel industry, which is an important branch of the global economy, is one of the potential beneficiaries of this trend, given its large environmental footprint, the globalized nature of the market, and the demanding working conditions. This survey synthesizes the current state of knowledge in the field of AI-based PdM within the steel industry and is addressed to researchers and practitioners. We identified 219 articles related to this topic and formulated five research questions, allowing us to gain a global perspective on current trends and the main research gaps. We examined equipment and facilities subjected to PdM, determined common PdM approaches, and identified trends in the AI methods used to develop these solutions. We explored the characteristics of the data used in the surveyed articles and assessed the practical implications of the research presented there. Most of the research focuses on the blast furnace or hot rolling, using data from industrial sensors. Current trends show increasing interest in the domain, especially in the use of deep learning. The main challenges include implementing the proposed methods in a production environment, incorporating them into maintenance plans, and enhancing the accessibility and reproducibility of the research.
Submitted 21 May, 2024;
originally announced May 2024.
-
Aequitas Flow: Streamlining Fair ML Experimentation
Authors:
Sérgio Jesus,
Pedro Saleiro,
Inês Oliveira e Silva,
Beatriz M. Jorge,
Rita P. Ribeiro,
João Gama,
Pedro Bizarro,
Rayid Ghani
Abstract:
Aequitas Flow is an open-source framework for end-to-end Fair Machine Learning (ML) experimentation in Python. This package fills the existing integration gaps in other Fair ML packages of complete and accessible experimentation. It provides a pipeline for fairness-aware model training, hyperparameter optimization, and evaluation, enabling rapid and simple experiments and result analysis. Aimed at ML practitioners and researchers, the framework offers implementations of methods, datasets, metrics, and standard interfaces for these components to improve extensibility. By facilitating the development of fair ML practices, Aequitas Flow seeks to enhance the adoption of these concepts in AI technologies.
Submitted 9 May, 2024;
originally announced May 2024.
-
A Multilevel Strategy to Improve People Tracking in a Real-World Scenario
Authors:
Cristiano B. de Oliveira,
Joao C. Neves,
Rafael O. Ribeiro,
David Menotti
Abstract:
The Palácio do Planalto, office of the President of Brazil, was invaded by protesters on January 8, 2023. Surveillance videos taken from inside the building were subsequently released by the Brazilian Supreme Court for public scrutiny. We used segments of such footage to create the UFPR-Planalto801 dataset for people tracking and re-identification in a real-world scenario. This dataset consists of more than 500,000 images. This paper presents a tracking approach targeting this dataset. The method proposed in this paper relies on the use of known state-of-the-art trackers combined in a multilevel hierarchy to correct the ID association over the trajectories. We evaluated our method using IDF1, MOTA, MOTP and HOTA metrics. The results show improvements for every tracker used in the experiments, with IDF1 score increasing by a margin up to 9.5%.
Submitted 29 April, 2024;
originally announced April 2024.
-
A Neuro-Symbolic Explainer for Rare Events: A Case Study on Predictive Maintenance
Authors:
João Gama,
Rita P. Ribeiro,
Saulo Mastelini,
Narjes Davarid,
Bruno Veloso
Abstract:
Predictive Maintenance applications are increasingly complex, with interactions between many components. Black box models are popular approaches based on deep learning techniques due to their predictive accuracy. This paper proposes a neural-symbolic architecture that uses an online rule-learning algorithm to explain when the black box model predicts failures. The proposed system solves two problems in parallel: anomaly detection and explanation of the anomaly. For the first problem, we use an unsupervised state-of-the-art autoencoder. For the second problem, we train a rule learning system that learns a mapping from the input features to the autoencoder reconstruction error. Both systems run online and in parallel. The autoencoder signals an alarm for the examples with a reconstruction error that exceeds a threshold. The causes of the alarm are hard for humans to understand because they result from a nonlinear combination of sensor data. The rule triggered by that example describes the relationship between the input features and the autoencoder reconstruction error. The rule explains the failure signal by indicating which sensors contribute to the alarm and allowing the identification of the component involved in the failure. The system can present global explanations for the black box model and local explanations for why the black box model predicts a failure. We evaluate the proposed system in a real-world case study of Metro do Porto and provide explanations that illustrate its benefits.
Submitted 21 April, 2024;
originally announced April 2024.
-
Super-Resolution Analysis for Landfill Waste Classification
Authors:
Matias Molina,
Rita P. Ribeiro,
Bruno Veloso,
João Gama
Abstract:
Illegal landfills are a critical issue due to their environmental, economic, and public health impacts. This study leverages aerial imagery for environmental crime monitoring. While advances in artificial intelligence and computer vision hold promise, the challenge lies in training models with high-resolution literature datasets and adapting them to open-access low-resolution images. Considering the substantial quality differences and limited annotation, this research explores the adaptability of models across these domains. Motivated by the necessity for a comprehensive evaluation of waste detection algorithms, it advocates cross-domain classification and super-resolution enhancement to analyze the impact of different image resolutions on waste classification as an evaluation to combat the proliferation of illegal landfills. We observed performance improvements by enhancing image quality but noted an influence on model sensitivity, necessitating careful threshold fine-tuning.
Submitted 2 April, 2024;
originally announced April 2024.
-
On the Role of Summary Content Units in Text Summarization Evaluation
Authors:
Marcel Nawrath,
Agnieszka Nowak,
Tristan Ratz,
Danilo C. Walenta,
Juri Opitz,
Leonardo F. R. Ribeiro,
João Sedoc,
Daniel Deutsch,
Simon Mille,
Yixin Liu,
Lining Zhang,
Sebastian Gehrmann,
Saad Mahamood,
Miruna Clinciu,
Khyathi Chandu,
Yufang Hou
Abstract:
At the heart of the Pyramid evaluation method for text summarization lie human written summary content units (SCUs). These SCUs are concise sentences that decompose a summary into small facts. Such SCUs can be used to judge the quality of a candidate summary, possibly partially automated via natural language inference (NLI) systems. Interestingly, with the aim to fully automate the Pyramid evaluation, Zhang and Bansal (2021) show that SCUs can be approximated by automatically generated semantic role triplets (STUs). However, several questions currently lack answers, in particular: i) Are there other ways of approximating SCUs that can offer advantages? ii) Under which conditions are SCUs (or their approximations) offering the most value? In this work, we examine two novel strategies to approximate SCUs: generating SCU approximations from AMR meaning representations (SMUs) and from large language models (SGUs), respectively. We find that while STUs and SMUs are competitive, the best approximation quality is achieved by SGUs. We also show through a simple sentence-decomposition baseline (SSUs) that SCUs (and their approximations) offer the most value when ranking short summaries, but may not help as much when ranking systems or longer summaries.
Submitted 2 April, 2024;
originally announced April 2024.
-
Logic-based Explanations for Linear Support Vector Classifiers with Reject Option
Authors:
Francisco Mateus Rocha Filho,
Thiago Alves Rocha,
Reginaldo Pereira Fernandes Ribeiro,
Ajalmar Rêgo da Rocha Neto
Abstract:
Support Vector Classifier (SVC) is a well-known Machine Learning (ML) model for linear classification problems. It can be used in conjunction with a reject option strategy to reject instances that are hard to correctly classify and delegate them to a specialist. This further increases the confidence of the model. Given this, obtaining an explanation of the cause of rejection is important so that the obtained results are not trusted blindly. While most of the related work has developed means to give such explanations for machine learning models, to the best of our knowledge none have done so when a reject option is present. We propose a logic-based approach with formal guarantees on the correctness and minimality of explanations for linear SVCs with reject option. We evaluate our approach by comparing it to Anchors, which is a heuristic algorithm for generating explanations. The results show that our proposed method gives shorter explanations with reduced time cost.
Submitted 24 March, 2024;
originally announced March 2024.
-
Redex -> Coq: towards a theory of decidability of Redex's reduction semantics
Authors:
Mallku Soldevila,
Rodrigo Ribeiro,
Beta Ziliani
Abstract:
We propose the first steps in the development of a tool to automate the translation of Redex models into a (hopefully) semantically equivalent model in Coq, and to provide tactics to help in the certification of fundamental properties of such models. The work is heavily based on a model of Redex's semantics developed by Klein et al. By means of a simple generalization of the matching problem in Redex, we obtain an algorithm suitable for mechanization in Coq, for which we prove soundness properties and correspondence with the original solution proposed by Klein et al. In the process, we also adapt some parts of our mechanization to better prepare it for the future inclusion of Redex features absent in the present model, like its Kleene-star operator. Finally, we discuss future avenues of development that are enabled by this work.
Submitted 5 February, 2024;
originally announced February 2024.
-
Rarity of the infinite chains in the tree of numerical semigroups
Authors:
Maria Bras-Amorós,
Mariana Rosas Ribeiro
Abstract:
We prove that, for each fixed genus, the portion of semigroups of that
genus belonging to infinite chains in the semigroup tree approaches 0 as
the genus grows to infinity. This means that most numerical semigroups
have a finite number of descendants in the semigroup tree. This problem
has been open since 2009.
Submitted 31 January, 2024;
originally announced January 2024.
-
Structural results for the Tree Builder Random Walk
Authors:
Janos Engländer,
Giulio Iacobelli,
Gábor Pete,
Rodrigo Ribeiro
Abstract:
We study the Tree Builder Random Walk: a randomly growing tree, built by a walker as she is walking around the tree. Namely, at each time $n$, she adds a leaf to her current vertex with probability $p_n=n^{-\gamma}$, $\gamma\in (2/3,1]$, then moves to a uniform random neighbor on the possibly modified tree. We show that the tree process at its growth times, after a random finite number of steps, can be coupled to be identical to the Barabási-Albert preferential attachment tree model. Thus, our TBRW-model is a local dynamics giving rise to the BA-model. The coupling also implies that many properties known for the BA-model, such as diameter and degree distribution, can be directly transferred to our TBRW-model, extending previous results.
Submitted 30 November, 2023;
originally announced November 2023.
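The dynamics described in the Tree Builder Random Walk abstract above can be simulated in a few lines. The sketch below is only illustrative: the function name `simulate_tbrw`, the adjacency-list representation, and the initial condition (a single root vertex, with time starting at n = 1 so that the first step always adds a leaf) are assumptions not stated in the abstract.

```python
import random
from typing import Dict, List, Optional


def simulate_tbrw(
    steps: int, gamma: float, seed: Optional[int] = None
) -> Dict[int, List[int]]:
    """Simulate the walk for `steps` time units and return the tree as adjacency lists."""
    rng = random.Random(seed)
    adj: Dict[int, List[int]] = {0: []}  # assumed start: a lone root vertex
    pos = 0  # current position of the walker
    for n in range(1, steps + 1):
        # With probability p_n = n^(-gamma), attach a new leaf to the current vertex.
        if rng.random() < n ** (-gamma):
            leaf = len(adj)
            adj[leaf] = [pos]
            adj[pos].append(leaf)
        # Then move to a uniformly random neighbor on the (possibly modified) tree.
        pos = rng.choice(adj[pos])
    return adj


tree = simulate_tbrw(steps=1000, gamma=0.8, seed=1)
degrees = sorted((len(nbrs) for nbrs in tree.values()), reverse=True)
print(len(tree), "vertices; largest degrees:", degrees[:5])
```

Since p_1 = 1, the walker always has a neighbor to move to after the first step, which is why the single-vertex start is safe in this sketch.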
-
Generating Summaries with Controllable Readability Levels
Authors:
Leonardo F. R. Ribeiro,
Mohit Bansal,
Markus Dreyer
Abstract:
Readability refers to how easily a reader can understand a written text. Several factors affect the readability level, such as the complexity of the text, its subject matter, and the reader's background knowledge. Generating summaries based on different readability levels is critical for enabling knowledge consumption by diverse audiences. However, current text generation approaches lack refined control, resulting in texts that are not customized to readers' proficiency levels. In this work, we bridge this gap and study techniques to generate summaries at specified readability levels. Unlike previous methods that focus on a specific readability level (e.g., lay summarization), we generate summaries with fine-grained control over their readability. We develop three text generation techniques for controlling readability: (1) instruction-based readability control, (2) reinforcement learning to minimize the gap between requested and observed readability and (3) a decoding approach that uses lookahead to estimate the readability of upcoming decoding steps. We show that our generation methods significantly improve readability control on news summarization (CNN/DM dataset), as measured by various readability metrics and human judgement, establishing strong baselines for controllable readability in summarization.
Submitted 16 October, 2023;
originally announced October 2023.
-
Socially reactive navigation models for mobile robots in dynamic environments
Authors:
Ricarte Ribeiro,
Plinio Moreno
Abstract:
The objective of this work is to expand upon previous works, considering socially acceptable behaviours within robot navigation and interaction, and allow a robot to closely approach static and dynamic individuals or groups. The space models developed in this dissertation are adaptive, that is, capable of changing over time to accommodate the changing circumstances often present in a social environment. The adaptation of the space model's parameters has the end goal of enabling a close interaction between humans and robots and is thus capable of taking into account not only the arrangement of the groups, but also the basic characteristics of the robot itself. This work also further develops a preexisting approach pose estimation algorithm in order to better guarantee the safety and comfort of the humans involved in the interaction, by taking into account basic human sensibilities. The algorithms are integrated into ROS's navigation system through the use of the costmap2d and move_base packages. The space model adaptation is tested via comparative evaluation against previous algorithms through the use of datasets. The entire navigation system is then evaluated through both simulations (static and dynamic) and real life situations (static). These experiments demonstrate that the developed space model and approach pose estimation algorithms are capable of enabling a robot to closely approach individual humans and groups, while maintaining considerations for their comfort and sensibilities.
Submitted 15 October, 2023;
originally announced October 2023.
-
Fuzzy Fingerprinting Transformer Language-Models for Emotion Recognition in Conversations
Authors:
Patrícia Pereira,
Rui Ribeiro,
Helena Moniz,
Luisa Coheur,
Joao Paulo Carvalho
Abstract:
Fuzzy Fingerprints have been successfully used as an interpretable text classification technique, but, like most other techniques, have been largely surpassed in performance by Large Pre-trained Language Models, such as BERT or RoBERTa. These models deliver state-of-the-art results in several Natural Language Processing tasks, namely Emotion Recognition in Conversations (ERC), but suffer from the lack of interpretability and explainability. In this paper, we propose to combine the two approaches to perform ERC, as a means to obtain simpler and more interpretable Large Language Models-based classifiers. We propose to feed the utterances and their previous conversational turns to a pre-trained RoBERTa, obtaining contextual embedding utterance representations, that are then supplied to an adapted Fuzzy Fingerprint classification module. We validate our approach on the widely used DailyDialog ERC benchmark dataset, in which we obtain state-of-the-art level results using a much lighter model.
Submitted 8 September, 2023;
originally announced September 2023.
-
Enhancing Network Slicing Architectures with Machine Learning, Security, Sustainability and Experimental Networks Integration
Authors:
Joberto S. B. Martins,
Tereza C. Carvalho,
Rodrigo Moreira,
Cristiano Both,
Adnei Donatti,
João H. Corrêa,
José A. Suruagy,
Sand L. Corrêa,
Antonio J. G. Abelem,
Moisés R. N. Ribeiro,
Jose-Marcos Nogueira,
Luiz C. S. Magalhães,
Juliano Wickboldt,
Tiago Ferreto,
Ricardo Mello,
Rafael Pasquini,
Marcos Schwarz,
Leobino N. Sampaio,
Daniel F. Macedo,
José F. de Rezende,
Kleber V. Cardoso,
Flávio O. Silva
Abstract:
Network Slicing (NS) is an essential technique extensively used in 5G networks computing strategies, mobile edge computing, mobile cloud computing, and verticals like the Internet of Vehicles and industrial IoT, among others. NS is foreseen as one of the leading enablers for 6G futuristic and highly demanding applications since it allows the optimization and customization of scarce and disputed resources among dynamic, demanding clients with highly distinct application requirements. Various standardization organizations, like 3GPP's proposal for new generation networks and state-of-the-art 5G/6G research projects, are proposing new NS architectures. However, new NS architectures have to deal with an extensive range of requirements that inherently result in having NS architecture proposals typically fulfilling the needs of specific sets of domains with commonalities. The Slicing Future Internet Infrastructures (SFI2) architecture proposal explores the gap resulting from the diversity of NS architectures target domains by proposing a new NS reference architecture with a defined focus on integrating experimental networks and enhancing the NS architecture with Machine Learning (ML) native optimizations, energy-efficient slicing, and slicing-tailored security functionalities. The SFI2 architectural main contribution includes the utilization of the slice-as-a-service paradigm for end-to-end orchestration of resources across multi-domains and multi-technology experimental networks. In addition, the SFI2 reference architecture instantiations will enhance the multi-domain and multi-technology integrated experimental network deployment with native ML optimization, energy-efficient aware slicing, and slicing-tailored security functionalities for the practical domain.
Submitted 18 July, 2023;
originally announced July 2023.
-
Reconstructing Spatiotemporal Data with C-VAEs
Authors:
Tiago F. R. Ribeiro,
Fernando Silva,
Rogério Luís de C. Costa
Abstract:
The continuous representation of spatiotemporal data commonly relies on using abstract data types, such as moving regions, to represent entities whose shape and position continuously change over time. Creating this representation from discrete snapshots of real-world entities requires using interpolation methods to compute in-between data representations and estimate the position and shape of the object of interest at arbitrary temporal points. Existing region interpolation methods often fail to generate smooth and realistic representations of a region's evolution. However, recent advancements in deep learning techniques have revealed the potential of deep models trained on discrete observations to capture spatiotemporal dependencies through implicit feature learning.
In this work, we explore the capabilities of Conditional Variational Autoencoder (C-VAE) models to generate smooth and realistic representations of the spatiotemporal evolution of moving regions. We evaluate our proposed approach on a sparsely annotated dataset on the burnt area of a forest fire. We apply compression operations to sample from the dataset and use the C-VAE model and other commonly used interpolation algorithms to generate in-between region representations. To evaluate the performance of the methods, we compare their interpolation results with manually annotated data and regions generated by a U-Net model. We also assess the quality of generated data considering temporal consistency metrics.
The proposed C-VAE-based approach demonstrates competitive results in geometric similarity metrics. It also exhibits superior temporal consistency, suggesting that C-VAE models may be a viable alternative to modelling the spatiotemporal evolution of 2D moving regions.
Submitted 28 August, 2023; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Mobility Strategy of Multi-Limbed Climbing Robots for Asteroid Exploration
Authors:
Warley F. R. Ribeiro,
Kentaro Uno,
Masazumi Imai,
Koki Murase,
Barış Can Yalçın,
Matteo El Hariry,
Miguel A. Olivares-Mendez,
Kazuya Yoshida
Abstract:
Mobility on asteroids by multi-limbed climbing robots is expected to achieve our exploration goals in such challenging environments. We propose a mobility strategy to improve the locomotion safety of climbing robots in such harsh environments, which feature extremely low gravity and highly uneven terrain. Our method plans the gait by decoupling the base and limbs' movements and adjusting the main body pose to avoid ground collisions. The proposed approach includes a motion planning step that reduces the reactions generated by the robot's movement by optimizing the swinging trajectory and distributing the momentum. Lower motion reactions decrease the pulling forces on the grippers, avoiding slippage and flotation of the robot. Dynamic simulations and experiments demonstrate that the proposed method could improve the robot's mobility on the surface of asteroids.
Submitted 22 June, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Explainable Predictive Maintenance
Authors:
Sepideh Pashami,
Slawomir Nowaczyk,
Yuantao Fan,
Jakub Jakubowski,
Nuno Paiva,
Narjes Davari,
Szymon Bobek,
Samaneh Jamshidi,
Hamid Sarmadi,
Abdallah Alabdallah,
Rita P. Ribeiro,
Bruno Veloso,
Moamar Sayed-Mouchaweh,
Lala Rajaoarisoa,
Grzegorz J. Nalepa,
João Gama
Abstract:
Explainable Artificial Intelligence (XAI) fills the role of a critical interface fostering interactions between sophisticated intelligent systems and diverse individuals, including data scientists, domain experts, end-users, and more. It aids in deciphering the intricate internal mechanisms of "black box" Machine Learning (ML), rendering the reasons behind their decisions more understandable. However, current research in XAI primarily focuses on two aspects: ways to facilitate user trust, or to debug and refine the ML model. The majority of it falls short of recognising the diverse types of explanations needed in broader contexts, as different users and varied application areas necessitate solutions tailored to their specific needs.
One such domain is Predictive Maintenance (PdM), an exploding area of research under the Industry 4.0 & 5.0 umbrella. This position paper highlights the gap between existing XAI methodologies and the specific requirements for explanations within industrial applications, particularly the Predictive Maintenance field. Despite explainability's crucial role, this subject remains a relatively under-explored area, making this paper a pioneering attempt to bring relevant challenges to the research community's attention. We provide an overview of predictive maintenance tasks and accentuate the need and varying purposes for corresponding explanations. We then list and describe XAI techniques commonly employed in the literature, discussing their suitability for PdM tasks. Finally, to make the ideas and claims more concrete, we demonstrate XAI applied in four specific industrial use cases: commercial vehicles, metro trains, steel plants, and wind farms, spotlighting areas requiring further research.
Submitted 8 June, 2023;
originally announced June 2023.
-
Learning to Reason over Scene Graphs: A Case Study of Finetuning GPT-2 into a Robot Language Model for Grounded Task Planning
Authors:
Georgia Chalvatzaki,
Ali Younes,
Daljeet Nandha,
An Le,
Leonardo F. R. Ribeiro,
Iryna Gurevych
Abstract:
Long-horizon task planning is essential for the development of intelligent assistive and service robots. In this work, we investigate the applicability of a smaller class of large language models (LLMs), specifically GPT-2, in robotic task planning by learning to decompose tasks into subgoal specifications for a planner to execute sequentially. Our method grounds the input of the LLM on the domain that is represented as a scene graph, enabling it to translate human requests into executable robot plans, thereby learning to reason over long-horizon tasks, as encountered in the ALFRED benchmark. We compare our approach with classical planning and baseline methods to examine the applicability and generalizability of LLM-based planners. Our findings suggest that the knowledge stored in an LLM can be effectively grounded to perform long-horizon task planning, demonstrating the promising potential for the future application of neuro-symbolic planning methods in robotics.
Submitted 12 May, 2023;
originally announced May 2023.
-
Embedding Aggregation for Forensic Facial Comparison
Authors:
Rafael Oliveira Ribeiro,
João C. R. Neves,
Arnout C. C. Ruifrok,
Flavio de Barros Vidal
Abstract:
In forensic facial comparison, questioned-source images are usually captured in uncontrolled environments, with non-uniform lighting, and from non-cooperative subjects. The poor quality of such material usually compromises their value as evidence in legal matters. On the other hand, in forensic casework, multiple images of the person of interest are usually available. In this paper, we propose to aggregate deep neural network embeddings from various images of the same person to improve performance in facial verification. We observe significant performance improvements, especially for very low-quality images. Further improvements are obtained by aggregating embeddings of more images and by applying quality-weighted aggregation. We demonstrate the benefits of this approach in forensic evaluation settings with the development and validation of score-based likelihood ratio systems and report improvements in Cllr of up to 95% (from 0.249 to 0.012) for CCTV images and of up to 96% (from 0.083 to 0.003) for social media images.
Submitted 29 April, 2023;
originally announced May 2023.
-
PGTask: Introducing the Task of Profile Generation from Dialogues
Authors:
Rui Ribeiro,
Joao P. Carvalho,
Luísa Coheur
Abstract:
Recent approaches have attempted to personalize dialogue systems by incorporating profile information into models. However, this knowledge is scarce and difficult to obtain, which makes the extraction/generation of profile information from dialogues a fundamental asset. To overcome this limitation, we introduce the Profile Generation Task (PGTask). We contribute a new dataset for this problem, comprising profile sentences aligned with related utterances, extracted from a corpus of dialogues. Furthermore, using state-of-the-art methods, we provide a benchmark for profile generation on this novel dataset. Our experiments reveal the challenges of profile generation, and we hope that this introduces a new research direction.
Submitted 26 August, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Forecasting Large Realized Covariance Matrices: The Benefits of Factor Models and Shrinkage
Authors:
Rafael Alves,
Diego S. de Brito,
Marcelo C. Medeiros,
Ruy M. Ribeiro
Abstract:
We propose a model to forecast large realized covariance matrices of returns, applying it to the constituents of the S&P 500 daily. To address the curse of dimensionality, we decompose the return covariance matrix using standard firm-level factors (e.g., size, value, and profitability) and use sectoral restrictions in the residual covariance matrix. This restricted model is then estimated using vector heterogeneous autoregressive (VHAR) models with the least absolute shrinkage and selection operator (LASSO). Our methodology improves forecasting precision relative to standard benchmarks and leads to better estimates of minimum variance portfolios.
Submitted 22 March, 2023;
originally announced March 2023.
-
RAMP: Reaction-Aware Motion Planning of Multi-Legged Robots for Locomotion in Microgravity
Authors:
Warley F. R. Ribeiro,
Kentaro Uno,
Masazumi Imai,
Koki Murase,
Kazuya Yoshida
Abstract:
Robotic mobility in microgravity is necessary to expand human utilization and exploration of outer space. Bio-inspired multi-legged robots are a possible solution for safe and precise locomotion. However, a dynamic motion of a robot in microgravity can lead to failures due to gripper detachment caused by excessive motion reactions. We propose a novel Reaction-Aware Motion Planning (RAMP) to improve locomotion safety in microgravity, decreasing the risk of losing contact with the terrain surface by reducing the robot's momentum change. RAMP minimizes the swing momentum with a Low-Reaction Swing Trajectory (LRST) while distributing this momentum to the whole body, ensuring zero velocity for the supporting grippers and minimizing motion reactions. We verify the proposed approach with dynamic simulations indicating the capability of RAMP to generate a safe motion without detachment of the supporting grippers, resulting in the robot reaching its specified location. We further validate RAMP in experiments with an air-floating system, demonstrating a significant reduction in reaction forces and improved mobility in microgravity.
Submitted 19 January, 2023;
originally announced January 2023.
-
Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation
Authors:
Sérgio Jesus,
José Pombal,
Duarte Alves,
André Cruz,
Pedro Saleiro,
Rita P. Ribeiro,
João Gama,
Pedro Bizarro
Abstract:
Evaluating new techniques on realistic datasets plays a crucial role in the development of ML research and its broader adoption by practitioners. In recent years, there has been a significant increase in publicly available unstructured data resources for computer vision and NLP tasks. However, tabular data -- which is prevalent in many high-stakes domains -- has been lagging behind. To bridge this gap, we present Bank Account Fraud (BAF), the first publicly available privacy-preserving, large-scale, realistic suite of tabular datasets. The suite was generated by applying state-of-the-art tabular data generation techniques on an anonymized, real-world bank account opening fraud detection dataset. This setting carries a set of challenges that are commonplace in real-world applications, including temporal dynamics and significant class imbalance. Additionally, to allow practitioners to stress test both performance and fairness of ML methods, each dataset variant of BAF contains specific types of data bias. With this resource, we aim to provide the research community with a more realistic, complete, and robust test bed to evaluate novel and existing methods.
Submitted 28 November, 2022; v1 submitted 23 November, 2022;
originally announced November 2022.
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Authors:
BigScience Workshop,
Teven Le Scao,
Angela Fan,
Christopher Akiki,
Ellie Pavlick,
Suzana Ilić,
Daniel Hesslow,
Roman Castagné,
Alexandra Sasha Luccioni,
François Yvon,
Matthias Gallé,
Jonathan Tow,
Alexander M. Rush,
Stella Biderman,
Albert Webson,
Pawan Sasanka Ammanamanchi,
Thomas Wang,
Benoît Sagot,
Niklas Muennighoff,
Albert Villanova del Moral,
Olatunji Ruwase,
Rachel Bawden,
Stas Bekman,
Angelina McMillan-Major,
et al. (369 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Submitted 27 June, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking
Authors:
Tim Baumgärtner,
Leonardo F. R. Ribeiro,
Nils Reimers,
Iryna Gurevych
Abstract:
Pairing a lexical retriever with a neural re-ranking model has set state-of-the-art performance on large-scale information retrieval datasets. This pipeline covers scenarios like question answering or navigational queries; however, for information-seeking scenarios, users often provide information on whether a document is relevant to their query in the form of clicks or explicit feedback. Therefore, in this work, we explore how relevance feedback can be directly integrated into neural re-ranking models by adopting few-shot and parameter-efficient learning techniques. Specifically, we introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant. Further, we explore Cross-Encoder models that we pre-train using meta-learning and subsequently fine-tune for each query, training only on the feedback documents. To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario. Extensive experiments demonstrate that integrating relevance feedback directly in neural re-ranking models improves their performance, and fusing lexical ranking with our best performing neural re-ranker outperforms all other methods by 5.2 nDCG@20.
Submitted 19 October, 2022;
originally announced October 2022.
-
SUMBot: Summarizing Context in Open-Domain Dialogue Systems
Authors:
Rui Ribeiro,
Luísa Coheur
Abstract:
In this paper, we investigate the problem of including relevant information as context in open-domain dialogue systems. Most models struggle to identify and incorporate important knowledge from dialogues and simply use the entire turns as context, which increases the size of the input fed to the model with unnecessary information. Additionally, due to the input size limitation of a few hundred tokens of large pre-trained models, regions of the history are not included and informative parts from the dialogue may be omitted. To overcome this problem, we introduce a simple method that substitutes part of the context with a summary instead of the whole history, which increases the ability of models to keep track of all the previous relevant information. We show that the inclusion of a summary may improve the answer generation task and discuss some examples to further understand the system's weaknesses.
Submitted 12 October, 2022;
originally announced October 2022.
-
Face Super-Resolution Using Stochastic Differential Equations
Authors:
Marcelo dos Santos,
Rayson Laroca,
Rafael O. Ribeiro,
João Neves,
Hugo Proença,
David Menotti
Abstract:
Diffusion models have proven effective for various applications such as image, audio, and graph generation. Other important applications are image super-resolution and the solution of inverse problems. More recently, some works have used stochastic differential equations (SDEs) to generalize diffusion models to continuous time. In this work, we introduce SDEs to generate super-resolution face images. To the best of our knowledge, this is the first time SDEs have been used for such an application. The proposed method provides better peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and consistency than existing super-resolution methods based on diffusion models. In particular, we also assess the potential application of this method for the face recognition task. A generic facial feature extractor is used to compare the super-resolution images with the ground truth, and superior results were obtained compared with other methods. Our code is publicly available at https://github.com/marcelowds/sr-sde
Submitted 24 September, 2022;
originally announced September 2022.
-
UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA
Authors:
Rachneet Sachdeva,
Haritz Puerto,
Tim Baumgärtner,
Sewin Tariverdian,
Hao Zhang,
Kexin Wang,
Hossain Shaikh Saadi,
Leonardo F. R. Ribeiro,
Iryna Gurevych
Abstract:
Question Answering (QA) systems are increasingly deployed in applications where they support real-world decisions. However, state-of-the-art models rely on deep neural networks, which are difficult to interpret by humans. Inherently interpretable models or post hoc explainability methods can help users to comprehend how a model arrives at its prediction and, if successful, increase their trust in the system. Furthermore, researchers can leverage these insights to develop new methods that are more accurate and less biased. In this paper, we introduce SQuARE v2, the new version of SQuARE, to provide an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations. While saliency maps are useful to inspect the importance of each input token for the model's prediction, graph-based explanations from external Knowledge Graphs enable the users to verify the reasoning behind the model prediction. In addition, we provide multiple adversarial attacks to compare the robustness of QA models. With these explainability methods and adversarial attacks, we aim to ease the research on trustworthy QA models. SQuARE is available on https://square.ukp-lab.de.
Submitted 20 October, 2022; v1 submitted 19 August, 2022;
originally announced August 2022.
-
A Benchmark dataset for predictive maintenance
Authors:
Bruno Veloso,
João Gama,
Rita P. Ribeiro,
Pedro M. Pereira
Abstract:
The paper describes the MetroPT data set, an outcome of an eXplainable Predictive Maintenance (XPM) project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 with the aim of evaluating machine learning methods for online anomaly detection and failure prediction. By capturing several analogue sensor signals (pressure, temperature, current consumption), digital signals (control signals, discrete signals), and GPS information (latitude, longitude, and speed), we provide a dataset that can be easily used to evaluate online machine learning methods. This dataset contains some interesting characteristics and can be a good benchmark for predictive maintenance models.
Submitted 18 July, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.
-
Learning Rhetorical Structure Theory-based descriptions of observed behaviour
Authors:
Luis Botelho,
Luis Nunes,
Ricardo Ribeiro,
Rui J. Lopes
Abstract:
In a previous paper, we proposed a set of concepts, axiom schemata and algorithms that can be used by agents to learn to describe their behaviour, goals, capabilities, and environment. The current paper proposes a new set of concepts, axiom schemata and algorithms that allow the agent to learn new descriptions of an observed behaviour (e.g., perplexing actions), of its actor (e.g., undesired propositions or actions), and of its environment (e.g., incompatible propositions). Each learned description (e.g., a certain action prevents another action from being performed in the future) is represented by a relationship between entities (either propositions or actions) and is learned by the agent, just by observation, using domain-independent axiom schemata and/or learning algorithms. The relations used by agents to represent the descriptions they learn were inspired by the Theory of Rhetorical Structure (RST). The main contribution of the paper is the relation family Although, inspired by the RST relation Concession. The accurate definition of the relations of the family Although involves a set of deontic concepts whose definition and corresponding algorithms are presented. The relations of the family Although, once extracted from the agent's observations, express surprise at the observed behaviour and, in certain circumstances, present a justification for it.
The paper shows results of the presented proposals in a demonstration scenario, using implemented software.
Submitted 24 June, 2022;
originally announced June 2022.
-
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
Authors:
Sebastian Gehrmann,
Abhik Bhattacharjee,
Abinaya Mahendiran,
Alex Wang,
Alexandros Papangelis,
Aman Madaan,
Angelina McMillan-Major,
Anna Shvets,
Ashish Upadhyay,
Bingsheng Yao,
Bryan Wilie,
Chandra Bhagavatula,
Chaobin You,
Craig Thomson,
Cristina Garbacea,
Dakuo Wang,
Daniel Deutsch,
Deyi Xiong,
Di Jin,
Dimitra Gkatzia,
Dragomir Radev,
Elizabeth Clark,
Esin Durmus,
Faisal Ladhak,
Filip Ginter,
et al. (52 additional authors not shown)
Abstract:
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables the comparison on equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
Submitted 24 June, 2022; v1 submitted 22 June, 2022;
originally announced June 2022.
-
Model Optimization in Imbalanced Regression
Authors:
Aníbal Silva,
Rita P. Ribeiro,
Nuno Moniz
Abstract:
Imbalanced domain learning aims to produce accurate models in predicting instances that, though underrepresented, are of utmost importance for the domain. Research in this field has been mainly focused on classification tasks. Comparatively, the number of studies carried out in the context of regression tasks is negligible. One of the main reasons for this is the lack of loss functions capable of focusing on minimizing the errors of extreme (rare) values. Recently, an evaluation metric was introduced: Squared Error Relevance Area (SERA). This metric places greater emphasis on the errors committed at extreme values while also accounting for the performance in the overall target variable domain, thus preventing severe bias. However, its effectiveness as an optimization metric is unknown. In this paper, our goal is to study the impacts of using SERA as an optimization criterion in imbalanced regression tasks. Using gradient boosting algorithms as proof of concept, we perform an experimental study with 36 data sets of different domains and sizes. Results show that models that used SERA as an objective function are practically better than the models produced by their respective standard boosting algorithms at the prediction of extreme values. This confirms that SERA can be embedded as a loss function into optimization-based learning algorithms for imbalanced regression scenarios.
Submitted 15 August, 2022; v1 submitted 20 June, 2022;
originally announced June 2022.
-
FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations
Authors:
Leonardo F. R. Ribeiro,
Mengwen Liu,
Iryna Gurevych,
Markus Dreyer,
Mohit Bansal
Abstract:
Despite recent improvements in abstractive summarization, most current approaches generate summaries that are not factually consistent with the source document, severely restricting their trust and usage in real-world applications. Recent works have shown promising improvements in factuality error identification using text or dependency arc entailments; however, they do not consider the entire semantic graph simultaneously. To this end, we propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MR), which are more suitable for factuality evaluation. MRs describe core semantic concepts and their relations, aggregating the main content in both document and summary in a canonical form, and reducing data sparsity. FactGraph encodes such graphs using a graph encoder augmented with structure-aware adapters to capture interactions among the concepts based on the graph connectivity, along with text representations using an adapter-based text encoder. Experiments on different benchmarks for evaluating factuality show that FactGraph outperforms previous approaches by up to 15%. Furthermore, FactGraph improves performance on identifying content verifiability errors and better captures subsentence-level factual inconsistencies.
Submitted 19 July, 2022; v1 submitted 13 April, 2022;
originally announced April 2022.
-
UKP-SQUARE: An Online Platform for Question Answering Research
Authors:
Tim Baumgärtner,
Kexin Wang,
Rachneet Sachdeva,
Max Eichler,
Gregor Geigle,
Clifton Poth,
Hannah Sterz,
Haritz Puerto,
Leonardo F. R. Ribeiro,
Jonas Pfeiffer,
Nils Reimers,
Gözde Gül Şahin,
Iryna Gurevych
Abstract:
Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that are of different formats (e.g., extractive, abstractive), require different model architectures (e.g., generative, discriminative), and setups (e.g., with or without retrieval). Despite having a large number of powerful, specialized QA pipelines (which we refer to as Skills) that consider a single domain, model or setup, there exists no framework where users can easily explore and compare such pipelines and can extend them according to their needs. To address this issue, we present UKP-SQUARE, an extensible online QA platform for researchers which allows users to query and analyze a large collection of modern Skills via a user-friendly web interface and integrated behavioural tests. In addition, QA researchers can develop, manage, and share their custom Skills using our microservices that support a wide range of models (Transformers, Adapters, ONNX), datastores and retrieval techniques (e.g., sparse and dense). UKP-SQUARE is available on https://square.ukp-lab.de.
Submitted 28 March, 2022; v1 submitted 25 March, 2022;
originally announced March 2022.
-
Towards Learning Through Open-Domain Dialog
Authors:
Eugénio Ribeiro,
Ricardo Ribeiro,
David Martins de Matos
Abstract:
The development of artificial agents able to learn through dialog without domain restrictions has the potential to allow machines to learn how to perform tasks in a similar manner to humans and change how we relate to them. However, research in this area is practically nonexistent. In this paper, we identify the modifications required for a dialog system to be able to learn from the dialog and propose generic approaches that can be used to implement those modifications. More specifically, we discuss how knowledge can be extracted from the dialog, used to update the agent's semantic network, and grounded in action and observation. This way, we hope to raise awareness for this subject, so that it can become a focus of research in the future.
Submitted 7 February, 2022;
originally announced February 2022.
-
Question rewriting? Assessing its importance for conversational question answering
Authors:
Gonçalo Raposo,
Rui Ribeiro,
Bruno Martins,
Luísa Coheur
Abstract:
In conversational question answering, systems must correctly interpret the interconnected interactions and generate knowledgeable answers, which may require the retrieval of relevant information from a background repository. Recent approaches to this problem leverage neural language models, although different alternatives can be considered in terms of modules for (a) representing user questions in context, (b) retrieving the relevant background information, and (c) generating the answer. This work presents a conversational question answering system designed specifically for the Search-Oriented Conversational AI (SCAI) shared task, and reports on a detailed analysis of its question rewriting module. In particular, we considered different variations of the question rewriting module to evaluate the influence on the subsequent components, and performed a careful analysis of the results obtained with the best system configuration. Our system achieved the best performance in the shared task and our analysis emphasizes the importance of the conversation context representation for the overall system performance.
Submitted 14 April, 2022; v1 submitted 22 January, 2022;
originally announced January 2022.
-
Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation
Authors:
Leonardo F. R. Ribeiro,
Jonas Pfeiffer,
Yue Zhang,
Iryna Gurevych
Abstract:
Recent work on multilingual AMR-to-text generation has exclusively focused on data augmentation strategies that utilize silver AMR. However, this assumes a high quality of generated AMRs, potentially limiting the transferability to the target task. In this paper, we investigate different techniques for automatically generating AMR annotations, where we aim to study which source of information yields better multilingual results. Our models trained on gold AMR with silver (machine translated) sentences outperform approaches which leverage generated silver AMR. We find that combining both complementary sources of information further improves multilingual AMR-to-text generation. Our models surpass the previous state of the art for German, Italian, Spanish, and Chinese by a large margin.
Submitted 8 September, 2021;
originally announced September 2021.
-
Structural Adapters in Pretrained Language Models for AMR-to-text Generation
Authors:
Leonardo F. R. Ribeiro,
Yue Zhang,
Iryna Gurevych
Abstract:
Pretrained language models (PLM) have recently advanced graph-to-text generation, where the input graph is linearized into a sequence and fed into the PLM to obtain its representation. However, efficiently encoding the graph structure in PLMs is challenging because such models were pretrained on natural language, and modeling structured data may lead to catastrophic forgetting of distributional knowledge. In this paper, we propose StructAdapt, an adapter method to encode graph structure into PLMs. Contrary to prior work, StructAdapt effectively models interactions among the nodes based on the graph connectivity, only training graph structure-aware adapter parameters. In this way, we incorporate task-specific knowledge while maintaining the topological structure of the graph. We empirically show the benefits of explicitly encoding graph structure into PLMs using StructAdapt, outperforming the state of the art on two AMR-to-text datasets, training only 5.1% of the PLM parameters.
Submitted 8 September, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Domain Adaptation in Dialogue Systems using Transfer and Meta-Learning
Authors:
Rui Ribeiro,
Alberto Abad,
José Lopes
Abstract:
Current generative-based dialogue systems are data-hungry and fail to adapt to new unseen domains when only a small amount of target data is available. Additionally, in real-world applications, most domains are underrepresented, so there is a need to create a system capable of generalizing to these domains using minimal data. In this paper, we propose a method that adapts to unseen domains by combining both transfer and meta-learning (DATML). DATML improves the previous state-of-the-art dialogue model, DiKTNet, by introducing a different learning technique: meta-learning. We use Reptile, a first-order optimization-based meta-learning algorithm as our improved training method. We evaluated our model on the MultiWOZ dataset and outperformed DiKTNet in both BLEU and Entity F1 scores when the same amount of data is available.
Submitted 22 February, 2021;
originally announced February 2021.
-
Online Body Schema Adaptation through Cost-Sensitive Active Learning
Authors:
Gonçalo Cunha,
Pedro Vicente,
Alexandre Bernardino,
Ricardo Ribeiro,
Plínio Moreno
Abstract:
Humanoid robots have complex bodies and kinematic chains with several Degrees-of-Freedom (DoF) which are difficult to model. Learning the parameters of a kinematic model can be achieved by observing the position of the robot links during prospective motions and minimising the prediction errors. This work proposes a movement-efficient approach for online estimation of the body schema of a humanoid robot arm in the form of Denavit-Hartenberg (DH) parameters. A cost-sensitive active learning approach based on the A-Optimality criterion is used to select optimal joint configurations. The chosen joint configurations simultaneously minimise the error in the estimation of the body schema and minimise the movement between samples. This reduces energy consumption, along with mechanical fatigue and wear, while not compromising the learning accuracy. The work was implemented in a simulation environment, using the 7DoF arm of the iCub robot simulator. The hand pose is measured with a single camera via markers placed in the palm and back of the robot's hand. A non-parametric occlusion model is proposed to avoid choosing joint configurations where the markers are not visible, thus preventing worthless attempts. The results show cost-sensitive active learning has similar accuracy to the standard active learning approach, while reducing the executed movement by about half.
Submitted 10 February, 2022; v1 submitted 26 January, 2021;
originally announced January 2021.
-
Profiling Software Developers with Process Mining and N-Gram Language Models
Authors:
João Caldeira,
Fernando Brito e Abreu,
Jorge Cardoso,
Ricardo Ribeiro,
Claudia Werner
Abstract:
Context: Profiling developers is challenging since many factors, such as their skills, experience, development environment and behaviors, may influence a detailed analysis and the delivery of coherent interpretations.
Objective: We aim to profile software developers by mining their software development process. To do so, we performed a controlled experiment where, in the realm of a Python programming contest, a group of developers had the same well-defined set of requirements specifications and a well-defined sprint schedule. Events were collected from the PyCharm IDE, and from the Mooshak automatic jury where subjects checked in their code.
Method: We used n-gram language models and text mining to characterize developers' profiles, and process mining algorithms to discover their overall workflows and extract the corresponding metrics for further evaluation.
Results: Findings show that we can clearly characterize with a coherent rationale most developers, and distinguish the top performers from the ones with more challenging behaviors. This approach may lead ultimately to the creation of a catalog of software development process smells.
Conclusions: The profile of a developer provides a software project manager a clue for the selection of appropriate tasks he/she should be assigned. With the increasing usage of low and no-code platforms, where code is automatically generated from an upper abstraction layer, mining developers' actions in the development platforms is a promising approach not only to detect behaviors early but also to assess project complexity and model effort.
Submitted 17 January, 2021;
originally announced January 2021.
-
Business-Driven Technical Debt Prioritization: An Industrial Case Study
Authors:
Rodrigo Rebouças de Almeida,
Rafael do Nascimento Ribeiro,
Christoph Treude,
Uirá Kulesza
Abstract:
Incorporating the business perspective into prioritizing technical debt is essential to contribute to decision making in industry. In this paper, we evolve and evaluate a business-driven approach for technical debt prioritization. The approach was evaluated during a five-month industrial case study with business and technical stakeholders' active participation. The results show that the approach contributed to aligning business criteria between the business and technical stakeholders. We also observed a downward trend in the amount of technical debt that affects high-value business assets. Moreover, we identified eight business factors that affect the decision making related to the prioritization of technical debt. The study results suggest that the proposed business-driven technical debt prioritization approach can help teams to focus their efforts on paying off the business' most relevant debt.
Submitted 21 March, 2021; v1 submitted 19 October, 2020;
originally announced October 2020.
-
Estimating action plans for smart poultry houses
Authors:
Darlan Felipe Klotz,
Richardson Ribeiro,
Fabrício Enembreck,
Gustavo Denardin,
Marco Barbosa,
Dalcimar Casanova,
Marcelo Teixeira
Abstract:
In poultry farming, the systematic choice, update, and implementation of periodic (t) action plans define the feed conversion rate (FCR[t]), which is an acceptable measure for successful production. Appropriate action plans provide tailored resources for broilers, allowing them to grow within the so-called thermal comfort zone, without waste or lack of resources. Although the implementation of an action plan is automatic, its configuration depends on the knowledge of the specialist, which tends to be inefficient and error-prone and results in a different FCR[t] for each poultry house. In this article, we claim that the specialist's perception can be reproduced, to some extent, by computational intelligence. By combining deep learning and genetic algorithm techniques, we show how action plans can adapt their performance over time, based on previously successful plans. We also implement a distributed network infrastructure that allows our method to be replicated over distributed poultry houses, for their smart, interconnected, and adaptive control. A supervision system is provided as an interface for users. Experiments conducted over real data show that our method improves on the performance of the most productive specialist by 5%, staying very close to the optimal FCR[t].
Submitted 17 August, 2020;
originally announced August 2020.
-
Investigating Pretrained Language Models for Graph-to-Text Generation
Authors:
Leonardo F. R. Ribeiro,
Martin Schmitt,
Hinrich Schütze,
Iryna Gurevych
Abstract:
Graph-to-text generation aims to generate fluent texts from graph-based data. In this paper, we investigate two recently proposed pretrained language models (PLMs) and analyze the impact of different task-adaptive pretraining strategies for PLMs in graph-to-text generation. We present a study across three graph domains: meaning representations, Wikipedia knowledge graphs (KGs) and scientific KGs. We show that the PLMs BART and T5 achieve new state-of-the-art results and that task-adaptive pretraining strategies improve their performance even further. In particular, we report new state-of-the-art BLEU scores of 49.72 on LDC2017T10, 59.70 on WebNLG, and 25.66 on AGENDA datasets - a relative improvement of 31.8%, 4.5%, and 42.4%, respectively. In an extensive analysis, we identify possible reasons for the PLMs' success on graph-to-text tasks. We find evidence that their knowledge about true facts helps them perform well even when the input graph representation is reduced to a simple bag of node and edge labels.
Submitted 27 September, 2021; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Concept and the implementation of a tool to convert industry 4.0 environments modeled as FSM to an OpenAI Gym wrapper
Authors:
Kallil M. C. Zielinski,
Marcelo Teixeira,
Richardson Ribeiro,
Dalcimar Casanova
Abstract:
Industry 4.0 systems have a high demand for optimization in their tasks, whether to minimize cost, maximize production, or even synchronize their actuators to finish or speed up the manufacture of a product. Those challenges make industrial environments a suitable scenario to apply all modern reinforcement learning (RL) concepts. The main difficulty, however, is the lack of such industrial environments. To address this, this work presents the concept and the implementation of a tool that allows us to convert any dynamic system modeled as an FSM to the open-source Gym wrapper. After that, it is possible to employ any RL method to optimize any desired task. In the first tests of the proposed tool, we show traditional Q-learning and Deep Q-learning methods running over two simple environments.
Submitted 29 June, 2020;
originally announced June 2020.
-
Modeling Graph Structure via Relative Position for Text Generation from Knowledge Graphs
Authors:
Martin Schmitt,
Leonardo F. R. Ribeiro,
Philipp Dufter,
Iryna Gurevych,
Hinrich Schütze
Abstract:
We present Graformer, a novel Transformer-based encoder-decoder architecture for graph-to-text generation. With our novel graph self-attention, the encoding of a node relies on all nodes in the input graph - not only direct neighbors - facilitating the detection of global patterns. We represent the relation between two nodes as the length of the shortest path between them. Graformer learns to weight these node-node relations differently for different attention heads, thus virtually learning differently connected views of the input graph. We evaluate Graformer on two popular graph-to-text generation benchmarks, AGENDA and WebNLG, where it achieves strong performance while using many fewer parameters than other approaches.
Submitted 27 April, 2021; v1 submitted 16 June, 2020;
originally announced June 2020.
-
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Authors:
Anne Lauscher,
Olga Majewska,
Leonardo F. R. Ribeiro,
Iryna Gurevych,
Nikolai Rozanov,
Goran Glavaš
Abstract:
Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work focused on injecting (structured) knowledge from external resources into these models. While on the one hand, joint pretraining (i.e., training from scratch, adding objectives based on external knowledge to the primary LM objective) may be prohibitively computationally expensive, post-hoc fine-tuning on external knowledge, on the other hand, may lead to the catastrophic forgetting of distributional knowledge. In this work, we investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus, respectively, using adapter training. While overall results on the GLUE benchmark paint an inconclusive picture, a deeper analysis reveals that our adapter-based models substantially outperform BERT (up to 15-20 performance points) on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and OMCS. All code and experiments are open sourced under: https://github.com/wluper/retrograph .
Submitted 11 October, 2020; v1 submitted 24 May, 2020;
originally announced May 2020.
-
Automatic Recognition of the General-Purpose Communicative Functions defined by the ISO 24617-2 Standard for Dialog Act Annotation
Authors:
Eugénio Ribeiro,
Ricardo Ribeiro,
David Martins de Matos
Abstract:
ISO 24617-2, the standard for dialog act annotation, defines a hierarchically organized set of general-purpose communicative functions. The automatic recognition of these functions, although practically unexplored, is relevant for a dialog system, since they provide cues regarding the intention behind the segments and how they should be interpreted. We explore the recognition of general-purpose communicative functions in the DialogBank, which is a reference set of dialogs annotated according to this standard. To do so, we propose adaptations of existing approaches to flat dialog act recognition that allow them to deal with the hierarchical classification problem. More specifically, we propose the use of a hierarchical network with cascading outputs and maximum a posteriori path estimation to predict the communicative function at each level of the hierarchy, preserve the dependencies between the functions in the path, and decide at which level to stop. Furthermore, since the number of dialogs in the DialogBank is small, we rely on transfer learning processes to reduce overfitting and improve performance. The results of our experiments show that the hierarchical approach outperforms a flat one and that each of its components plays an important role in the recognition of general-purpose communicative functions.
Submitted 16 January, 2021; v1 submitted 7 March, 2020;
originally announced March 2020.