Search | arXiv e-print repository

Statistical Validation of Column Matching in the Database Schema Evolution of the Brazilian Public School Census

Authors: Muriki G. Yamanaka, Diogo H. de Almeida, Paulo R. Lisboa de Almeida, Simone Dominico, Leticia M. Peres, Marcos S. Sunye, Eduardo C. de Almeida

Abstract: Publicly available datasets are subject to new versions, with each new version potentially reflecting changes to the data. These changes may involve adding or removing attributes, changing data types, and modifying values or their semantics. Integrating these datasets into a database poses a significant challenge: how to keep track of the evolving database schema while incorporating different vers… ▽ More Publicly available datasets are subject to new versions, with each new version potentially reflecting changes to the data. These changes may involve adding or removing attributes, changing data types, and modifying values or their semantics. Integrating these datasets into a database poses a significant challenge: how to keep track of the evolving database schema while incorporating different versions of the data sources? This paper presents a statistical methodology to validate the integration of 12 years of open-access datasets from Brazil's School Census, with a new version of the datasets released annually by the Brazilian Ministry of Education (MEC). We employ various statistical tests to find matching attributes between datasets from a specific year and their potential equivalents in datasets from later years. The results show that by using the Kolmogorov-Smirnov test we can successfully match columns from different dataset versions in about 90% of cases. △ Less

Submitted 13 July, 2024; originally announced July 2024.

Comments: Accepted for presentation at the Simposio Brasileiro de Bancos de Dados (SBBD) 2024

arXiv:2406.16781 [pdf, other]

A Carrying Capacity Calculator for Pedestrians Using OpenStreetMap Data: Application to Urban Tourism and Public Spaces

Authors: Duarte Sampaio de Almeida, Rodrigo Simões, Fernando Brito e Abreu, Adriano Lopes, Inês Boavida-Portugal

Abstract: Determining the carrying capacity of urban tourism destinations and public spaces is essential for sustainable management. This paper presents an online tool that calculates pedestrian carrying capacities for user-defined areas based on OpenStreetMap (OSM) data. The tool considers physical, real, and effective carrying capacities by incorporating parameters such as area per pedestrian, rotation fa… ▽ More Determining the carrying capacity of urban tourism destinations and public spaces is essential for sustainable management. This paper presents an online tool that calculates pedestrian carrying capacities for user-defined areas based on OpenStreetMap (OSM) data. The tool considers physical, real, and effective carrying capacities by incorporating parameters such as area per pedestrian, rotation factor, corrective factors, and management capacity. The carrying capacity calculator aids in balancing environmental, economic, social, and experiential factors to prevent overcrowding and preserve the quality of life for residents and visitors. This tool is particularly useful for tourism destination management, urban planning, and event management, ensuring positive visitor experiences and sustainable infrastructure development. We detail the implementation of the calculator, its underlying algorithm, and its application to the Santa Maria Maior parish in Lisbon, highlighting its effectiveness in managing urban tourism and public spaces. △ Less

Submitted 24 June, 2024; originally announced June 2024.

ACM Class: J.2; K.4

arXiv:2406.12623 [pdf, other]

Learned Image Compression for HE-stained Histopathological Images via Stain Deconvolution

Authors: Maximilian Fischer, Peter Neher, Tassilo Wald, Silvia Dias Almeida, Shuhan Xiao, Peter Schüffler, Rickmer Braren, Michael Götz, Alexander Muckenhuber, Jens Kleesiek, Marco Nolden, Klaus Maier-Hein

Abstract: Processing histopathological Whole Slide Images (WSI) leads to massive storage requirements for clinics worldwide. Even after lossy image compression during image acquisition, additional lossy compression is frequently possible without substantially affecting the performance of deep learning-based (DL) downstream tasks. In this paper, we show that the commonly used JPEG algorithm is not best suite… ▽ More Processing histopathological Whole Slide Images (WSI) leads to massive storage requirements for clinics worldwide. Even after lossy image compression during image acquisition, additional lossy compression is frequently possible without substantially affecting the performance of deep learning-based (DL) downstream tasks. In this paper, we show that the commonly used JPEG algorithm is not best suited for further compression and we propose Stain Quantized Latent Compression (SQLC ), a novel DL based histopathology data compression approach. SQLC compresses staining and RGB channels before passing it through a compression autoencoder (CAE ) in order to obtain quantized latent representations for maximizing the compression. We show that our approach yields superior performance in a classification downstream task, compared to traditional approaches like JPEG, while image quality metrics like the Multi-Scale Structural Similarity Index (MS-SSIM) is largely preserved. Our method is online available. △ Less

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2405.12195 [pdf, other]

Developers' Perceptions on the Impact of ChatGPT in Software Development: A Survey

Authors: Thiago S. Vaillant, Felipe Deveza de Almeida, Paulo Anselmo M. S. Neto, Cuiyun Gao, Jan Bosch, Eduardo Santana de Almeida

Abstract: As Large Language Models (LLMs), including ChatGPT and analogous systems, continue to advance, their robust natural language processing capabilities and diverse applications have garnered considerable attention. Nonetheless, despite the increasing acknowledgment of the convergence of Artificial Intelligence (AI) and Software Engineering (SE), there is a lack of studies involving the impact of this… ▽ More As Large Language Models (LLMs), including ChatGPT and analogous systems, continue to advance, their robust natural language processing capabilities and diverse applications have garnered considerable attention. Nonetheless, despite the increasing acknowledgment of the convergence of Artificial Intelligence (AI) and Software Engineering (SE), there is a lack of studies involving the impact of this convergence on the practices and perceptions of software developers. Understanding how software developers perceive and engage with AI tools, such as ChatGPT, is essential for elucidating the impact and potential challenges of incorporating AI-driven tools in the software development process. In this paper, we conducted a survey with 207 software developers to understand the impact of ChatGPT on software quality, productivity, and job satisfaction. Furthermore, the study delves into developers' expectations regarding future adaptations of ChatGPT, concerns about potential job displacement, and perspectives on regulatory interventions. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: 31 pages, 9 figures

ACM Class: D.2.0

arXiv:2404.17535 [pdf, other]

Using Neural Implicit Flow To Represent Latent Dynamics Of Canonical Systems

Authors: Imran Nasim, Joaõ Lucas de Sousa Almeida

Abstract: The recently introduced class of architectures known as Neural Operators has emerged as highly versatile tools applicable to a wide range of tasks in the field of Scientific Machine Learning (SciML), including data representation and forecasting. In this study, we investigate the capabilities of Neural Implicit Flow (NIF), a recently developed mesh-agnostic neural operator, for representing the la… ▽ More The recently introduced class of architectures known as Neural Operators has emerged as highly versatile tools applicable to a wide range of tasks in the field of Scientific Machine Learning (SciML), including data representation and forecasting. In this study, we investigate the capabilities of Neural Implicit Flow (NIF), a recently developed mesh-agnostic neural operator, for representing the latent dynamics of canonical systems such as the Kuramoto-Sivashinsky (KS), forced Korteweg-de Vries (fKdV), and Sine-Gordon (SG) equations, as well as for extracting dynamically relevant information from them. Finally we assess the applicability of NIF as a dimensionality reduction algorithm and conduct a comparative analysis with another widely recognized family of neural operators, known as Deep Operator Networks (DeepONets). △ Less

Submitted 26 April, 2024; originally announced April 2024.

Comments: Accepted into the International conference on Scientific Computation and Machine Learning 2024 (SCML 2024)

arXiv:2401.16307 [pdf, other]

doi 10.1145/3613904.3642662

Momentary Stressor Logging and Reflective Visualizations: Implications for Stress Management with Wearables

Authors: Sameer Neupane, Mithun Saha, Nasir Ali, Timothy Hnat, Shahin Alan Samiei, Anandatirtha Nandugudi, David M. Almeida, Santosh Kumar

Abstract: Commercial wearables from Fitbit, Garmin, and Whoop have recently introduced real-time notifications based on detecting changes in physiological responses indicating potential stress. In this paper, we investigate how these new capabilities can be leveraged to improve stress management. We developed a smartwatch app, a smartphone app, and a cloud service, and conducted a 100-day field study with 1… ▽ More Commercial wearables from Fitbit, Garmin, and Whoop have recently introduced real-time notifications based on detecting changes in physiological responses indicating potential stress. In this paper, we investigate how these new capabilities can be leveraged to improve stress management. We developed a smartwatch app, a smartphone app, and a cloud service, and conducted a 100-day field study with 122 participants who received prompts triggered by physiological responses several times a day. They were asked whether they were stressed, and if so, to log the most likely stressor. Each week, participants received new visualizations of their data to self-reflect on patterns and trends. Participants reported better awareness of their stressors, and self-initiating fourteen kinds of behavioral changes to reduce stress in their daily lives. Repeated self-reports over 14 weeks showed reductions in both stress intensity (in 26,521 momentary ratings) and stress frequency (in 1,057 weekly surveys). △ Less

Submitted 29 January, 2024; originally announced January 2024.

Comments: In CHI '24 Proceedings of the CHI Conference on Human Factors in Computing Systems Honolulu, HI, USA

arXiv:2308.05759 [pdf, ps, other]

A machine-learning sleep-wake classification model using a reduced number of features derived from photoplethysmography and activity signals

Authors: Douglas A. Almeida, Felipe M. Dias, Marcelo A. F. Toledo, Diego A. C. Cardenas, Filipe A. C. Oliveira, Estela Ribeiro, Jose E. Krieger, Marco A. Gutierrez

Abstract: Sleep is a crucial aspect of our overall health and well-being. It plays a vital role in regulating our mental and physical health, impacting our mood, memory, and cognitive function to our physical resilience and immune system. The classification of sleep stages is a mandatory step to assess sleep quality, providing the metrics to estimate the quality of sleep and how well our body is functioning… ▽ More Sleep is a crucial aspect of our overall health and well-being. It plays a vital role in regulating our mental and physical health, impacting our mood, memory, and cognitive function to our physical resilience and immune system. The classification of sleep stages is a mandatory step to assess sleep quality, providing the metrics to estimate the quality of sleep and how well our body is functioning during this essential period of rest. Photoplethysmography (PPG) has been demonstrated to be an effective signal for sleep stage inference, meaning it can be used on its own or in a combination with others signals to determine sleep stage. This information is valuable in identifying potential sleep issues and developing strategies to improve sleep quality and overall health. In this work, we present a machine learning sleep-wake classification model based on the eXtreme Gradient Boosting (XGBoost) algorithm and features extracted from PPG signal and activity counts. The performance of our method was comparable to current state-of-the-art methods with a Sensitivity of 91.15 $\pm$ 1.16%, Specificity of 53.66 $\pm$ 1.12%, F1-score of 83.88 $\pm$ 0.56%, and Kappa of 48.0 $\pm$ 0.86%. Our method offers a significant improvement over other approaches as it uses a reduced number of features, making it suitable for implementation in wearable devices that have limited computational power. △ Less

Submitted 7 August, 2023; originally announced August 2023.

Comments: 8 pages, 3 figures

arXiv:2308.01930 [pdf, other]

Machine Learning-Based Diabetes Detection Using Photoplethysmography Signal Features

Authors: Filipe A. C. Oliveira, Felipe M. Dias, Marcelo A. F. Toledo, Diego A. C. Cardenas, Douglas A. Almeida, Estela Ribeiro, Jose E. Krieger, Marco A. Gutierrez

Abstract: Diabetes is a prevalent chronic condition that compromises the health of millions of people worldwide. Minimally invasive methods are needed to prevent and control diabetes but most devices for measuring glucose levels are invasive and not amenable for continuous monitoring. Here, we present an alternative method to overcome these shortcomings based on non-invasive optical photoplethysmography (PP… ▽ More Diabetes is a prevalent chronic condition that compromises the health of millions of people worldwide. Minimally invasive methods are needed to prevent and control diabetes but most devices for measuring glucose levels are invasive and not amenable for continuous monitoring. Here, we present an alternative method to overcome these shortcomings based on non-invasive optical photoplethysmography (PPG) for detecting diabetes. We classify non-Diabetic and Diabetic patients using the PPG signal and metadata for training Logistic Regression (LR) and eXtreme Gradient Boosting (XGBoost) algorithms. We used PPG signals from a publicly available dataset. To prevent overfitting, we divided the data into five folds for cross-validation. By ensuring that patients in the training set are not in the testing set, the model's performance can be evaluated on unseen subjects' data, providing a more accurate assessment of its generalization. Our model achieved an F1-Score and AUC of $58.8\pm20.0\%$ and $79.2\pm15.0\%$ for LR and $51.7\pm16.5\%$ and $73.6\pm17.0\%$ for XGBoost, respectively. Feature analysis suggested that PPG morphological features contains diabetes-related information alongside metadata. Our findings are within the same range reported in the literature, indicating that machine learning methods are promising for developing remote, non-invasive, and continuous measurement devices for detecting and preventing diabetes. △ Less

Submitted 2 August, 2023; originally announced August 2023.

Comments: 11 pages, 6 figures

arXiv:2307.08766 [pdf, other]

Quality Assessment of Photoplethysmography Signals For Cardiovascular Biomarkers Monitoring Using Wearable Devices

Authors: Felipe M. Dias, Marcelo A. F. Toledo, Diego A. C. Cardenas, Douglas A. Almeida, Filipe A. C. Oliveira, Estela Ribeiro, Jose E. Krieger, Marco A. Gutierrez

Abstract: Photoplethysmography (PPG) is a non-invasive technology that measures changes in blood volume in the microvascular bed of tissue. It is commonly used in medical devices such as pulse oximeters and wrist worn heart rate monitors to monitor cardiovascular hemodynamics. PPG allows for the assessment of parameters (e.g., heart rate, pulse waveform, and peripheral perfusion) that can indicate condition… ▽ More Photoplethysmography (PPG) is a non-invasive technology that measures changes in blood volume in the microvascular bed of tissue. It is commonly used in medical devices such as pulse oximeters and wrist worn heart rate monitors to monitor cardiovascular hemodynamics. PPG allows for the assessment of parameters (e.g., heart rate, pulse waveform, and peripheral perfusion) that can indicate conditions such as vasoconstriction or vasodilation, and provides information about microvascular blood flow, making it a valuable tool for monitoring cardiovascular health. However, PPG is subject to a number of sources of variations that can impact its accuracy and reliability, especially when using a wearable device for continuous monitoring, such as motion artifacts, skin pigmentation, and vasomotion. In this study, we extracted 27 statistical features from the PPG signal for training machine-learning models based on gradient boosting (XGBoost and CatBoost) and Random Forest (RF) algorithms to assess quality of PPG signals that were labeled as good or poor quality. We used the PPG time series from a publicly available dataset and evaluated the algorithm s performance using Sensitivity (Se), Positive Predicted Value (PPV), and F1-score (F1) metrics. Our model achieved Se, PPV, and F1-score of 94.4, 95.6, and 95.0 for XGBoost, 94.7, 95.9, and 95.3 for CatBoost, and 93.7, 91.3 and 92.5 for RF, respectively. Our findings are comparable to state-of-the-art reported in the literature but using a much simpler model, indicating that ML models are promising for developing remote, non-invasive, and continuous measurement devices. △ Less

Submitted 17 July, 2023; originally announced July 2023.

Comments: 9 pages

arXiv:2307.07254 [pdf, other]

cOOpD: Reformulating COPD classification on chest CT scans as anomaly detection using contrastive representations

Authors: Silvia D. Almeida, Carsten T. Lüth, Tobias Norajitra, Tassilo Wald, Marco Nolden, Paul F. Jaeger, Claus P. Heussel, Jürgen Biederer, Oliver Weinheimer, Klaus Maier-Hein

Abstract: Classification of heterogeneous diseases is challenging due to their complexity, variability of symptoms and imaging findings. Chronic Obstructive Pulmonary Disease (COPD) is a prime example, being underdiagnosed despite being the third leading cause of death. Its sparse, diffuse and heterogeneous appearance on computed tomography challenges supervised binary classification. We reformulate COPD bi… ▽ More Classification of heterogeneous diseases is challenging due to their complexity, variability of symptoms and imaging findings. Chronic Obstructive Pulmonary Disease (COPD) is a prime example, being underdiagnosed despite being the third leading cause of death. Its sparse, diffuse and heterogeneous appearance on computed tomography challenges supervised binary classification. We reformulate COPD binary classification as an anomaly detection task, proposing cOOpD: heterogeneous pathological regions are detected as Out-of-Distribution (OOD) from normal homogeneous lung regions. To this end, we learn representations of unlabeled lung regions employing a self-supervised contrastive pretext model, potentially capturing specific characteristics of diseased and healthy unlabeled regions. A generative model then learns the distribution of healthy representations and identifies abnormalities (stemming from COPD) as deviations. Patient-level scores are obtained by aggregating region OOD scores. We show that cOOpD achieves the best performance on two public datasets, with an increase of 8.2% and 7.7% in terms of AUROC compared to the previous supervised state-of-the-art. Additionally, cOOpD yields well-interpretable spatial anomaly maps and patient-level scores which we show to be of additional value in identifying individuals in the early stage of progression. Experiments in artificially designed real-world prevalence settings further support that anomaly detection is a powerful way of tackling COPD classification. △ Less

Submitted 14 July, 2023; originally announced July 2023.

arXiv:2306.13116 [pdf, other]

A Machine Learning Pressure Emulator for Hydrogen Embrittlement

Authors: Minh Triet Chau, João Lucas de Sousa Almeida, Elie Alhajjar, Alberto Costa Nogueira Junior

Abstract: A recent alternative for hydrogen transportation as a mixture with natural gas is blending it into natural gas pipelines. However, hydrogen embrittlement of material is a major concern for scientists and gas installation designers to avoid process failures. In this paper, we propose a physics-informed machine learning model to predict the gas pressure on the pipes' inner wall. Despite its high-fid… ▽ More A recent alternative for hydrogen transportation as a mixture with natural gas is blending it into natural gas pipelines. However, hydrogen embrittlement of material is a major concern for scientists and gas installation designers to avoid process failures. In this paper, we propose a physics-informed machine learning model to predict the gas pressure on the pipes' inner wall. Despite its high-fidelity results, the current PDE-based simulators are time- and computationally-demanding. Using simulation data, we train an ML model to predict the pressure on the pipelines' inner walls, which is a first step for pipeline system surveillance. We found that the physics-based method outperformed the purely data-driven method and satisfy the physical constraints of the gas flow system. △ Less

Submitted 22 June, 2023; originally announced June 2023.

arXiv:2303.08774 [pdf, other]

GPT-4 Technical Report

Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko , et al. (256 additional authors not shown)

Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo… ▽ More We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4. △ Less

Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 100 pages; updated authors list; fixed author names and added citation

arXiv:2211.15330 [pdf, other]

UAS in the Airspace: A Review on Integration, Simulation, Optimization, and Open Challenges

Authors: Euclides Carlos Pinto Neto, Derick Moreira Baum, Jorge Rady de Almeida Jr., Joao Batista Camargo Jr., Paulo Sergio Cugnasca

Abstract: Air transportation is essential for society, and it is increasing gradually due to its importance. To improve the airspace operation, new technologies are under development, such as Unmanned Aircraft Systems (UAS). In fact, in the past few years, there has been a growth in UAS numbers in segregated airspace. However, there is an interest in integrating these aircraft into the National Airspace Sys… ▽ More Air transportation is essential for society, and it is increasing gradually due to its importance. To improve the airspace operation, new technologies are under development, such as Unmanned Aircraft Systems (UAS). In fact, in the past few years, there has been a growth in UAS numbers in segregated airspace. However, there is an interest in integrating these aircraft into the National Airspace System (NAS). The UAS is vital to different industries due to its advantages brought to the airspace (e.g., efficiency). Conversely, the relationship between UAS and Air Traffic Control (ATC) needs to be well-defined due to the impacts on ATC capacity these aircraft may present. Throughout the years, this impact may be lower than it is nowadays because the current lack of familiarity in this relationship contributes to higher workload levels. Thereupon, the primary goal of this research is to present a comprehensive review of the advancements in the integration of UAS in the National Airspace System (NAS) from different perspectives. We consider the challenges regarding simulation, final approach, and optimization of problems related to the interoperability of such systems in the airspace. Finally, we identify several open challenges in the field based on the existing state-of-the-art proposals. △ Less

Submitted 24 November, 2022; originally announced November 2022.

arXiv:2206.01604 [pdf, other]

Non-Intrusive Reduced Models based on Operator Inference for Chaotic Systems

Authors: João Lucas de Sousa Almeida, Arthur Cancellieri Pires, Klaus Feine Vaz Cid, Alberto Costa Nogueira Junior

Abstract: This work explores the physics-driven machine learning technique Operator Inference (OpInf) for predicting the state of chaotic dynamical systems. OpInf provides a non-intrusive approach to infer approximations of polynomial operators in reduced space without having access to the full order operators appearing in discretized models. Datasets for the physics systems are generated using conventional… ▽ More This work explores the physics-driven machine learning technique Operator Inference (OpInf) for predicting the state of chaotic dynamical systems. OpInf provides a non-intrusive approach to infer approximations of polynomial operators in reduced space without having access to the full order operators appearing in discretized models. Datasets for the physics systems are generated using conventional numerical solvers and then projected to a low-dimensional space via Principal Component Analysis (PCA). In latent space, a least-squares problem is set to fit a quadratic polynomial operator, which is subsequently employed in a time-integration scheme in order to produce extrapolations in the same space. Once solved, the inverse PCA operation is applied to reconstruct the extrapolations in the original space. The quality of the OpInf predictions is assessed via the Normalized Root Mean Squared Error (NRMSE) metric from which the Valid Prediction Time (VPT) is computed. Numerical experiments considering the chaotic systems Lorenz 96 and the Kuramoto-Sivashinsky equation show promising forecasting capabilities of the OpInf reduced order models with VPT ranges that outperform state-of-the-art machine learning methods such as backpropagation and reservoir computing recurrent neural networks [1], as well as Markov neural operators [2]. △ Less

Submitted 21 September, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: 16 pages, 37 figures, accepted for publication in the IEEE-TAI-PIML

arXiv:2203.02155 [pdf, other]

Training language models to follow instructions with human feedback

Authors: Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe

Abstract: Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning wi… ▽ More Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent. △ Less

Submitted 4 March, 2022; originally announced March 2022.

arXiv:2107.08492 [pdf, ps, other]

SENSORIMOTOR GRAPH: Action-Conditioned Graph Neural Network for Learning Robotic Soft Hand Dynamics

Authors: João Damião Almeida, Paul Schydlo, Atabak Dehban, José Santos-Victor

Abstract: Soft robotics is a thriving branch of robotics which takes inspiration from nature and uses affordable flexible materials to design adaptable non-rigid robots. However, their flexible behavior makes these robots hard to model, which is essential for a precise actuation and for optimal control. For system modelling, learning-based approaches have demonstrated good results, yet they fail to consider… ▽ More Soft robotics is a thriving branch of robotics which takes inspiration from nature and uses affordable flexible materials to design adaptable non-rigid robots. However, their flexible behavior makes these robots hard to model, which is essential for a precise actuation and for optimal control. For system modelling, learning-based approaches have demonstrated good results, yet they fail to consider the physical structure underlying the system as an inductive prior. In this work, we take inspiration from sensorimotor learning, and apply a Graph Neural Network to the problem of modelling a non-rigid kinematic chain (i.e. a robotic soft hand) taking advantage of two key properties: 1) the system is compositional, that is, it is composed of simple interacting parts connected by edges, 2) it is order invariant, i.e. only the structure of the system is relevant for predicting future trajectories. We denote our model as the 'Sensorimotor Graph' since it learns the system connectivity from observation and uses it for dynamics prediction. We validate our model in different scenarios and show that it outperforms the non-structured baselines in dynamics prediction while being more robust to configurational variations, tracking errors or node failures. △ Less

Submitted 18 July, 2021; originally announced July 2021.

Comments: Accepted at 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

arXiv:2106.00958 [pdf, other]

A Generalizable Approach to Learning Optimizers

Authors: Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba

Abstract: A core issue with learning to optimize neural networks has been the lack of generalization to real world problems. To address this, we describe a system designed from a generalization-first perspective, learning to update optimizer hyperparameters instead of model parameters directly using novel features, actions, and a reward function. This system outperforms Adam at all neural network tasks incl… ▽ More A core issue with learning to optimize neural networks has been the lack of generalization to real world problems. To address this, we describe a system designed from a generalization-first perspective, learning to update optimizer hyperparameters instead of model parameters directly using novel features, actions, and a reward function. This system outperforms Adam at all neural network tasks including on modalities not seen during training. We achieve 2x speedups on ImageNet, and a 2.5x speedup on a language modeling task using over 5 orders of magnitude more compute than the training tasks. △ Less

Submitted 7 June, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

arXiv:2103.10242 [pdf, other]

A Methodology for Approaching the Integration of Complex Robotics Systems Illustrated through a Bi-manual Manipulation Case-Study

Authors: Pavlos Triantafyllou, Rafael Afonso Rodrigues, Sirapoab Chaikunsaeng, Diogo Almeida, Graham Deacon, Jelizaveta Konstantinova, Giuseppe Cotugno

Abstract: The multidisciplinarity of robotics creates a need for robust integration methodologies that can facilitate the adoption of state-of-the-art research components in an industrial application. Unfortunately, there are no clear, community accepted guidelines or standards that define the integration of such components in a single robotic system. In this paper, we propose a methodology that assesses th… ▽ More The multidisciplinarity of robotics creates a need for robust integration methodologies that can facilitate the adoption of state-of-the-art research components in an industrial application. Unfortunately, there are no clear, community accepted guidelines or standards that define the integration of such components in a single robotic system. In this paper, we propose a methodology that assesses the software components of a candidate system on the basis of the effort required to integrate them and the impact their integration will have on a target system. We demonstrate how this methodology can be applied using an industrial tool packing system as an example. The system integrates a wide range of both in-house and third-party research outputs and software components. We prove the effectiveness of our approach by evaluating system performance with an experimental benchmark that assesses the robustness, reliability and operational speed of the system for the given packing task. We also demonstrate how our methodology can be used to predict the amount of integration time required for a component. The proposed integration methodology can be applied to any robotic system to facilitate its transition from the research to an industrial environment. △ Less

Submitted 18 March, 2021; originally announced March 2021.

arXiv:2103.07953 [pdf, other]

doi 10.1109/ACCESS.2021.3137633

A new interpretable unsupervised anomaly detection method based on residual explanation

Authors: David F. N. Oliveira, Lucio F. Vismari, Alexandre M. Nascimento, Jorge R. de Almeida Jr, Paulo S. Cugnasca, Joao B. Camargo Jr, Leandro Almeida, Rafael Gripp, Marcelo Neves

Abstract: Despite the superior performance in modeling complex patterns to address challenging problems, the black-box nature of Deep Learning (DL) methods impose limitations to their application in real-world critical domains. The lack of a smooth manner for enabling human reasoning about the black-box decisions hinder any preventive action to unexpected events, in which may lead to catastrophic consequenc… ▽ More Despite the superior performance in modeling complex patterns to address challenging problems, the black-box nature of Deep Learning (DL) methods impose limitations to their application in real-world critical domains. The lack of a smooth manner for enabling human reasoning about the black-box decisions hinder any preventive action to unexpected events, in which may lead to catastrophic consequences. To tackle the unclearness from black-box models, interpretability became a fundamental requirement in DL-based systems, leveraging trust and knowledge by providing ways to understand the model's behavior. Although a current hot topic, further advances are still needed to overcome the existing limitations of the current interpretability methods in unsupervised DL-based models for Anomaly Detection (AD). Autoencoders (AE) are the core of unsupervised DL-based for AD applications, achieving best-in-class performance. However, due to their hybrid aspect to obtain the results (by requiring additional calculations out of network), only agnostic interpretable methods can be applied to AE-based AD. These agnostic methods are computationally expensive to process a large number of parameters. In this paper we present the RXP (Residual eXPlainer), a new interpretability method to deal with the limitations for AE-based AD in large-scale systems. It stands out for its implementation simplicity, low computational cost and deterministic behavior, in which explanations are obtained through the deviation analysis of reconstructed input features. In an experiment using data from a real heavy-haul railway line, the proposed method achieved superior performance compared to SHAP, demonstrating its potential to support decision making in large scale critical systems. △ Less

Submitted 14 March, 2021; originally announced March 2021.

Comments: 8 pages

ACM Class: I.2.6; I.2.1

arXiv:1905.01261 [pdf, other]

A Lyapunov-Based Approach to Exploit Asymmetries in Robotic Dual-Arm Task Resolution

Authors: Diogo Almeida, Yiannis Karayiannidis

Abstract: Dual-arm manipulation tasks can be prescribed to a robotic system in terms of desired absolute and relative motion of the robot's end-effectors. These can represent, e.g., jointly carrying a rigid object or performing an assembly task. When both types of motion are to be executed concurrently, the symmetric distribution of the relative motion between arms prevents task conflicts. Conversely, an as… ▽ More Dual-arm manipulation tasks can be prescribed to a robotic system in terms of desired absolute and relative motion of the robot's end-effectors. These can represent, e.g., jointly carrying a rigid object or performing an assembly task. When both types of motion are to be executed concurrently, the symmetric distribution of the relative motion between arms prevents task conflicts. Conversely, an asymmetric solution to the relative motion task will result in conflicts with the absolute task. In this work, we address the problem of designing a control law for the absolute motion task together with updating the distribution of the relative task among arms. Through a set of numerical results, we contrast our approach with the classical symmetric distribution of the relative motion task to illustrate the advantages of our method. △ Less

Submitted 20 August, 2019; v1 submitted 3 May, 2019; originally announced May 2019.

Comments: Accepted for publication at CDC 2019

arXiv:1905.01248 [pdf, other]

Asymmetric Dual-Arm Task Execution using an Extended Relative Jacobian

Authors: Diogo Almeida, Yiannis Karayiannidis

Abstract: Coordinated dual-arm manipulation tasks can be broadly characterized as possessing absolute and relative motion components. Relative motion tasks, in particular, are inherently redundant in the way they can be distributed between end-effectors. In this work, we analyse cooperative manipulation in terms of the asymmetric resolution of relative motion tasks. We discuss how existing approaches enable… ▽ More Coordinated dual-arm manipulation tasks can be broadly characterized as possessing absolute and relative motion components. Relative motion tasks, in particular, are inherently redundant in the way they can be distributed between end-effectors. In this work, we analyse cooperative manipulation in terms of the asymmetric resolution of relative motion tasks. We discuss how existing approaches enable the asymmetric execution of a relative motion task, and show how an asymmetric relative motion space can be defined. We leverage this result to propose an extended relative Jacobian to model the cooperative system, which allows a user to set a concrete degree of asymmetry in the task execution. This is achieved without the need for prescribing an absolute motion target. Instead, the absolute motion remains available as a functional redundancy to the system. We illustrate the properties of our proposed Jacobian through numerical simulations of a novel differential Inverse Kinematics algorithm. △ Less

Submitted 20 August, 2019; v1 submitted 3 May, 2019; originally announced May 2019.

Comments: Accepted for presentation at ISRR19. 16 Pages

arXiv:1904.02697 [pdf]

A Systematic Literature Review about the impact of Artificial Intelligence on Autonomous Vehicle Safety

Authors: A. M. Nascimento, L. F. Vismari, C. B. S. T. Molina, P. S. Cugnasca, J. B. Camargo Jr., J. R. de Almeida Jr., R. Inam, E. Fersman, M. V. Marquezini, A. Y. Hata

Abstract: Autonomous Vehicles (AV) are expected to bring considerable benefits to society, such as traffic optimization and accidents reduction. They rely heavily on advances in many Artificial Intelligence (AI) approaches and techniques. However, while some researchers in this field believe AI is the core element to enhance safety, others believe AI imposes new challenges to assure the safety of these new… ▽ More Autonomous Vehicles (AV) are expected to bring considerable benefits to society, such as traffic optimization and accidents reduction. They rely heavily on advances in many Artificial Intelligence (AI) approaches and techniques. However, while some researchers in this field believe AI is the core element to enhance safety, others believe AI imposes new challenges to assure the safety of these new AI-based systems and applications. In this non-convergent context, this paper presents a systematic literature review to paint a clear picture of the state of the art of the literature in AI on AV safety. Based on an initial sample of 4870 retrieved papers, 59 studies were selected as the result of the selection criteria detailed in the paper. The shortlisted studies were then mapped into six categories to answer the proposed research questions. An AV system model was proposed and applied to orient the discussions about the SLR findings. As a main result, we have reinforced our preliminary observation about the necessity of considering a serious safety agenda for the future studies on AI-based AV systems. △ Less

Submitted 4 April, 2019; originally announced April 2019.

Comments: 32 pages, 5 figures, 9 Tables

arXiv:1901.03660 [pdf]

doi 10.22161/ijaers.5.12.40

Weightless Neural Network with Transfer Learning to Detect Distress in Asphalt

Authors: Suayder Milhomem, Tiago da Silva Almeida, Warley Gramacho da Silva, Edeilson Milhomem da Silva, Rafael Lima de Carvalho

Abstract: The present paper shows a solution to the problem of automatic distress detection, more precisely the detection of holes in paved roads. To do so, the proposed solution uses a weightless neural network known as Wisard to decide whether an image of a road has any kind of cracks. In addition, the proposed architecture also shows how the use of transfer learning was able to improve the overall accura… ▽ More The present paper shows a solution to the problem of automatic distress detection, more precisely the detection of holes in paved roads. To do so, the proposed solution uses a weightless neural network known as Wisard to decide whether an image of a road has any kind of cracks. In addition, the proposed architecture also shows how the use of transfer learning was able to improve the overall accuracy of the decision system. As a verification step of the research, an experiment was carried out using images from the streets at the Federal University of Tocantins, Brazil. The architecture of the developed solution presents a result of 85.71% accuracy in the dataset, proving to be superior to approaches of the state-of-the-art. △ Less

Submitted 3 January, 2019; originally announced January 2019.

Comments: 6 pages, 5 figures, published on IJAERS

Journal ref: International Journal of Advanced Engineering Research and Science, Page No: 294-299, vol.5,no. 12, date:2018

arXiv:1612.04981 [pdf, other]

doi 10.4204/EPTCS.233.4

Reducing Nondeterministic Tree Automata by Adding Transitions

Authors: Ricardo Manuel de Oliveira Almeida

Abstract: We introduce saturation of nondeterministic tree automata, a technique that consists of adding new transitions to an automaton while preserving its language. We implemented our algorithm on minotaut - a module of the tree automata library libvata that reduces the size of automata by merging states and removing superfluous transitions - and we show how saturation can make subsequent merge and trans… ▽ More We introduce saturation of nondeterministic tree automata, a technique that consists of adding new transitions to an automaton while preserving its language. We implemented our algorithm on minotaut - a module of the tree automata library libvata that reduces the size of automata by merging states and removing superfluous transitions - and we show how saturation can make subsequent merge and transition-removal operations more effective. Thus we obtain a Ptime algorithm that reduces the size of tree automata even more than before. Additionally, we explore how minotaut alone can play an important role when performing hard operations like complementation, allowing to both obtain smaller complement automata and lower computation times. We then show how saturation can extend this contribution even further. We tested our algorithms on a large collection of automata from applications of libvata in shape analysis, and on different classes of randomly generated automata. △ Less

Submitted 15 December, 2016; originally announced December 2016.

Comments: In Proceedings MEMICS 2016, arXiv:1612.04037

Journal ref: EPTCS 233, 2016, pp. 33-51

arXiv:1611.00230 [pdf, other]

doi 10.1109/ICRA.2019.8794128

Towards Blended Reactive Planning and Acting using Behavior Trees

Authors: Michele Colledanchise, Diogo Almeida, Petter Ögren

Abstract: In this paper, we show how a planning algorithm can be used to automatically create and update a Behavior Tree (BT), controlling a robot in a dynamic environment. The planning part of the algorithm is based on the idea of back chaining. Starting from a goal condition we iteratively select actions to achieve that goal, and if those actions have unmet preconditions, they are extended with actions to… ▽ More In this paper, we show how a planning algorithm can be used to automatically create and update a Behavior Tree (BT), controlling a robot in a dynamic environment. The planning part of the algorithm is based on the idea of back chaining. Starting from a goal condition we iteratively select actions to achieve that goal, and if those actions have unmet preconditions, they are extended with actions to achieve them in the same way. The fact that BTs are inherently modular and reactive makes the proposed solution blend acting and planning in a way that enables the robot to efficiently react to external disturbances. If an external agent undoes an action the robot reexecutes it without re-planning, and if an external agent helps the robot, it skips the corresponding actions, again without replanning. We illustrate our approach in two different robotics scenarios. △ Less

Submitted 14 September, 2018; v1 submitted 1 November, 2016; originally announced November 2016.

Journal ref: 2019 International Conference on Robotics and Automation (ICRA)

arXiv:1605.07156 [pdf, other]

Genetic Architect: Discovering Genomic Structure with Learned Neural Architectures

Authors: Laura Deming, Sasha Targ, Nate Sauder, Diogo Almeida, Chun Jimmie Ye

Abstract: Each human genome is a 3 billion base pair set of encoding instructions. Decoding the genome using deep learning fundamentally differs from most tasks, as we do not know the full structure of the data and therefore cannot design architectures to suit it. As such, architectures that fit the structure of genomics should be learned not prescribed. Here, we develop a novel search algorithm, applicable… ▽ More Each human genome is a 3 billion base pair set of encoding instructions. Decoding the genome using deep learning fundamentally differs from most tasks, as we do not know the full structure of the data and therefore cannot design architectures to suit it. As such, architectures that fit the structure of genomics should be learned not prescribed. Here, we develop a novel search algorithm, applicable across domains, that discovers an optimal architecture which simultaneously learns general genomic patterns and identifies the most important sequence motifs in predicting functional genomic outcomes. The architectures we find using this algorithm succeed at using only RNA expression data to predict gene regulatory structure, learn human-interpretable visualizations of key sequence motifs, and surpass state-of-the-art results on benchmark genomics challenges. △ Less

Submitted 23 May, 2016; originally announced May 2016.

Comments: 10 pages, 4 figures

arXiv:1604.06558 [pdf, other]

Folding Assembly by Means of Dual-Arm Robotic Manipulation

Authors: Diogo Almeida, Yiannis Karayiannidis

Abstract: In this paper, we consider folding assembly as an assembly primitive suitable for dual-arm robotic assembly, that can be integrated in a higher level assembly strategy. The system composed by two pieces in contact is modelled as an articulated object, connected by a prismatic-revolute joint. Different grasping scenarios were considered in order to model the system, and a simple controller based on… ▽ More In this paper, we consider folding assembly as an assembly primitive suitable for dual-arm robotic assembly, that can be integrated in a higher level assembly strategy. The system composed by two pieces in contact is modelled as an articulated object, connected by a prismatic-revolute joint. Different grasping scenarios were considered in order to model the system, and a simple controller based on feedback linearisation is proposed, using force torque measurements to compute the contact point kinematics. The folding assembly controller has been experimentally tested with two sample parts, in order to showcase folding assembly as a viable assembly primitive. △ Less

Submitted 22 April, 2016; originally announced April 2016.

Comments: 7 pages, accepted for ICRA 2016

arXiv:1603.08029 [pdf, other]

Resnet in Resnet: Generalizing Residual Architectures

Authors: Sasha Targ, Diogo Almeida, Kevin Lyman

Abstract: Residual networks (ResNets) have recently achieved state-of-the-art on challenging computer vision tasks. We introduce Resnet in Resnet (RiR): a deep dual-stream architecture that generalizes ResNets and standard CNNs and is easily implemented with no computational overhead. RiR consistently improves performance over ResNets, outperforms architectures with similar amounts of augmentation on CIFAR-… ▽ More Residual networks (ResNets) have recently achieved state-of-the-art on challenging computer vision tasks. We introduce Resnet in Resnet (RiR): a deep dual-stream architecture that generalizes ResNets and standard CNNs and is easily implemented with no computational overhead. RiR consistently improves performance over ResNets, outperforms architectures with similar amounts of augmentation on CIFAR-10, and establishes a new state-of-the-art on CIFAR-100. △ Less

Submitted 25 March, 2016; originally announced March 2016.

arXiv:1603.05201 [pdf, other]

Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units

Authors: Wenling Shang, Kihyuk Sohn, Diogo Almeida, Honglak Lee

Abstract: Recently, convolutional neural networks (CNNs) have been used as a powerful tool to solve many problems of machine learning and computer vision. In this paper, we aim to provide insight on the property of convolutional neural networks, as well as a generic method to improve the performance of many CNN architectures. Specifically, we first examine existing CNN models and observe an intriguing prope… ▽ More Recently, convolutional neural networks (CNNs) have been used as a powerful tool to solve many problems of machine learning and computer vision. In this paper, we aim to provide insight on the property of convolutional neural networks, as well as a generic method to improve the performance of many CNN architectures. Specifically, we first examine existing CNN models and observe an intriguing property that the filters in the lower layers form pairs (i.e., filters with opposite phase). Inspired by our observation, we propose a novel, simple yet effective activation scheme called concatenated ReLU (CRelu) and theoretically analyze its reconstruction property in CNNs. We integrate CRelu into several state-of-the-art CNN architectures and demonstrate improvement in their recognition performance on CIFAR-10/100 and ImageNet datasets with fewer trainable parameters. Our results suggest that better understanding of the properties of CNNs can lead to significant performance improvement with a simple modification. △ Less

Submitted 19 July, 2016; v1 submitted 16 March, 2016; originally announced March 2016.

Comments: ICML 2016

arXiv:1511.06827 [pdf, other]

GradNets: Dynamic Interpolation Between Neural Architectures

Authors: Diogo Almeida, Nate Sauder

Abstract: In machine learning, there is a fundamental trade-off between ease of optimization and expressive power. Neural Networks, in particular, have enormous expressive power and yet are notoriously challenging to train. The nature of that optimization challenge changes over the course of learning. Traditionally in deep learning, one makes a static trade-off between the needs of early and late optimizati… ▽ More In machine learning, there is a fundamental trade-off between ease of optimization and expressive power. Neural Networks, in particular, have enormous expressive power and yet are notoriously challenging to train. The nature of that optimization challenge changes over the course of learning. Traditionally in deep learning, one makes a static trade-off between the needs of early and late optimization. In this paper, we investigate a novel framework, GradNets, for dynamically adapting architectures during training to get the benefits of both. For example, we can gradually transition from linear to non-linear networks, deterministic to stochastic computation, shallow to deep architectures, or even simple downsampling to fully differentiable attention mechanisms. Benefits include increased accuracy, easier convergence with more complex architectures, solutions to test-time execution of batch normalization, and the ability to train networks of up to 200 layers. △ Less

Submitted 20 November, 2015; originally announced November 2015.

Showing 1–30 of 30 results for author: Almeida, D