Search | arXiv e-print repository

Design and Feasibility of a Community Motorcycle Ambulance System in the Philippines

Authors: Aaron Rodriguez, Aidan Chen, Ryan Rodriguez

Abstract: This study investigates the potential for motorcycle ambulance (motorlance) deployment in Metro Manila and Iloilo City to improve emergency medical care in high-traffic, underserved regions of the Philippines. VSee, a humanitarian technology company, has organized numerous free clinics in the Philippines and identified a critical need for improved emergency services. Motorlances offer a fast, affo… ▽ More This study investigates the potential for motorcycle ambulance (motorlance) deployment in Metro Manila and Iloilo City to improve emergency medical care in high-traffic, underserved regions of the Philippines. VSee, a humanitarian technology company, has organized numerous free clinics in the Philippines and identified a critical need for improved emergency services. Motorlances offer a fast, affordable alternative to traditional ambulances, particularly in congested urban settings and remote rural locations. Pilot programs in Malawi, Thailand, and Iran have demonstrated significant improvements in response times and cost-efficiency with motorlance systems. This study presents a framework for motorlance operation and identifies three potential pilot locations: Mandaluyong, Smokey Mountain, and Iloilo City. Site visits, driver interviews, and user surveys indicate public trust in the motorlance concept and positive reception to potential motorlance deployment. Cost analysis verifies the financial feasibility of motorlance systems. Future work will focus on implementing a physical pilot in Mandaluyong, with the aim of expanding service to similar regions contingent on the Mandaluyong pilot's success. △ Less

Submitted 16 October, 2024; originally announced October 2024.

Comments: 7 pages, 8 figures

arXiv:2410.02038 [pdf, other]

Realizable Continuous-Space Shields for Safe Reinforcement Learning

Authors: Kyungmin Kim, Davide Corsi, Andoni Rodriguez, JB Lanier, Benjami Parellada, Pierre Baldi, Cesar Sanchez, Roy Fox

Abstract: While Deep Reinforcement Learning (DRL) has achieved remarkable success across various domains, it remains vulnerable to occasional catastrophic failures without additional safeguards. One effective solution to prevent these failures is to use a shield that validates and adjusts the agent's actions to ensure compliance with a provided set of safety specifications. For real-life robot domains, it i… ▽ More While Deep Reinforcement Learning (DRL) has achieved remarkable success across various domains, it remains vulnerable to occasional catastrophic failures without additional safeguards. One effective solution to prevent these failures is to use a shield that validates and adjusts the agent's actions to ensure compliance with a provided set of safety specifications. For real-life robot domains, it is desirable to be able to define such safety specifications over continuous state and action spaces to accurately account for system dynamics and calculate new safe actions that minimally alter the agent's output. In this paper, we propose the first shielding approach to automatically guarantee the realizability of safety requirements for continuous state and action spaces. Realizability is an essential property that confirms the shield will always be able to generate a safe action for any state in the environment. We formally prove that realizability can also be verified with a stateful shield, enabling the incorporation of non-Markovian safety requirements. Finally, we demonstrate the effectiveness of our approach in ensuring safety without compromising policy accuracy by applying it to a navigation problem and a multi-agent particle environment. △ Less

Submitted 2 October, 2024; originally announced October 2024.

Comments: Kim, Corsi, and Rodriguez contributed equally

arXiv:2409.19756 [pdf, other]

Advances in Privacy Preserving Federated Learning to Realize a Truly Learning Healthcare System

Authors: Ravi Madduri, Zilinghan Li, Tarak Nandi, Kibaek Kim, Minseok Ryu, Alex Rodriguez

Abstract: The concept of a learning healthcare system (LHS) envisions a self-improving network where multimodal data from patient care are continuously analyzed to enhance future healthcare outcomes. However, realizing this vision faces significant challenges in data sharing and privacy protection. Privacy-Preserving Federated Learning (PPFL) is a transformative and promising approach that has the potential… ▽ More The concept of a learning healthcare system (LHS) envisions a self-improving network where multimodal data from patient care are continuously analyzed to enhance future healthcare outcomes. However, realizing this vision faces significant challenges in data sharing and privacy protection. Privacy-Preserving Federated Learning (PPFL) is a transformative and promising approach that has the potential to address these challenges by enabling collaborative learning from decentralized data while safeguarding patient privacy. This paper proposes a vision for integrating PPFL into the healthcare ecosystem to achieve a truly LHS as defined by the Institute of Medicine (IOM) Roundtable. △ Less

Submitted 29 September, 2024; originally announced September 2024.

arXiv:2409.07626 [pdf, other]

Generalization Error Bound for Quantum Machine Learning in NISQ Era -- A Survey

Authors: Bikram Khanal, Pablo Rivas, Arun Sanjel, Korn Sooksatra, Ernesto Quevedo, Alejandro Rodriguez

Abstract: Despite the mounting anticipation for the quantum revolution, the success of Quantum Machine Learning (QML) in the Noisy Intermediate-Scale Quantum (NISQ) era hinges on a largely unexplored factor: the generalization error bound, a cornerstone of robust and reliable machine learning models. Current QML research, while exploring novel algorithms and applications extensively, is predominantly situat… ▽ More Despite the mounting anticipation for the quantum revolution, the success of Quantum Machine Learning (QML) in the Noisy Intermediate-Scale Quantum (NISQ) era hinges on a largely unexplored factor: the generalization error bound, a cornerstone of robust and reliable machine learning models. Current QML research, while exploring novel algorithms and applications extensively, is predominantly situated in the context of noise-free, ideal quantum computers. However, Quantum Circuit (QC) operations in NISQ-era devices are susceptible to various noise sources and errors. In this article, we conduct a Systematic Mapping Study (SMS) to explore the state-of-the-art generalization bound for supervised QML in NISQ-era and analyze the latest practices in the field. Our study systematically summarizes the existing computational platforms with quantum hardware, datasets, optimization techniques, and the common properties of the bounds found in the literature. We further present the performance accuracy of various approaches in classical benchmark datasets like the MNIST and IRIS datasets. The SMS also highlights the limitations and challenges in QML in the NISQ era and discusses future research directions to advance the field. Using a detailed Boolean operators query in five reliable indexers, we collected 544 papers and filtered them to a small set of 37 relevant articles. This filtration was done following the best practice of SMS with well-defined research questions and inclusion and exclusion criteria. △ Less

Submitted 11 September, 2024; originally announced September 2024.

arXiv:2408.17098 [pdf, other]

UTrack: Multi-Object Tracking with Uncertain Detections

Authors: Edgardo Solano-Carrillo, Felix Sattler, Antje Alex, Alexander Klein, Bruno Pereira Costa, Angel Bueno Rodriguez, Jannis Stoppe

Abstract: The tracking-by-detection paradigm is the mainstream in multi-object tracking, associating tracks to the predictions of an object detector. Although exhibiting uncertainty through a confidence score, these predictions do not capture the entire variability of the inference process. For safety and security critical applications like autonomous driving, surveillance, etc., knowing this predictive unc… ▽ More The tracking-by-detection paradigm is the mainstream in multi-object tracking, associating tracks to the predictions of an object detector. Although exhibiting uncertainty through a confidence score, these predictions do not capture the entire variability of the inference process. For safety and security critical applications like autonomous driving, surveillance, etc., knowing this predictive uncertainty is essential though. Therefore, we introduce, for the first time, a fast way to obtain the empirical predictive distribution during object detection and incorporate that knowledge in multi-object tracking. Our mechanism can easily be integrated into state-of-the-art trackers, enabling them to fully exploit the uncertainty in the detections. Additionally, novel association methods are introduced that leverage the proposed mechanism. We demonstrate the effectiveness of our contribution on a variety of benchmarks, such as MOT17, MOT20, DanceTrack, and KITTI. △ Less

Submitted 30 August, 2024; originally announced August 2024.

Comments: Accepted for the ECCV 2024 Workshop on Uncertainty Quantification for Computer Vision

arXiv:2408.12855 [pdf, other]

Efficient Training Approaches for Performance Anomaly Detection Models in Edge Computing Environments

Authors: Duneesha Fernando, Maria A. Rodriguez, Patricia Arroba, Leila Ismail, Rajkumar Buyya

Abstract: Microservice architectures are increasingly used to modularize IoT applications and deploy them in distributed and heterogeneous edge computing environments. Over time, these microservice-based IoT applications are susceptible to performance anomalies caused by resource hogging (e.g., CPU or memory), resource contention, etc., which can negatively impact their Quality of Service and violate their… ▽ More Microservice architectures are increasingly used to modularize IoT applications and deploy them in distributed and heterogeneous edge computing environments. Over time, these microservice-based IoT applications are susceptible to performance anomalies caused by resource hogging (e.g., CPU or memory), resource contention, etc., which can negatively impact their Quality of Service and violate their Service Level Agreements. Existing research on performance anomaly detection for edge computing environments focuses on model training approaches that either achieve high accuracy at the expense of a time-consuming and resource-intensive training process or prioritize training efficiency at the cost of lower accuracy. To address this gap, while considering the resource constraints and the large number of devices in modern edge platforms, we propose two clustering-based model training approaches : (1) intra-cluster parameter transfer learning-based model training (ICPTL) and (2) cluster-level model training (CM). These approaches aim to find a trade-off between the training efficiency of anomaly detection models and their accuracy. We compared the models trained under ICPTL and CM to models trained for specific devices (most accurate, least efficient) and a single general model trained for all devices (least accurate, most efficient). Our findings show that the model accuracy of ICPTL is comparable to that of the model per device approach while requiring only 40% of the training time. In addition, CM further improves training efficiency by requiring 23% less training time and reducing the number of trained models by approximately 66% compared to ICPTL, yet achieving a higher accuracy than a single general model. △ Less

Submitted 23 August, 2024; originally announced August 2024.

arXiv:2408.10442 [pdf, other]

Feasibility of assessing cognitive impairment via distributed camera network and privacy-preserving edge computing

Authors: Chaitra Hegde, Yashar Kiarashi, Allan I Levey, Amy D Rodriguez, Hyeokhyen Kwon, Gari D Clifford

Abstract: INTRODUCTION: Mild cognitive impairment (MCI) is characterized by a decline in cognitive functions beyond typical age and education-related expectations. Since, MCI has been linked to reduced social interactions and increased aimless movements, we aimed to automate the capture of these behaviors to enhance longitudinal monitoring. METHODS: Using a privacy-preserving distributed camera network, w… ▽ More INTRODUCTION: Mild cognitive impairment (MCI) is characterized by a decline in cognitive functions beyond typical age and education-related expectations. Since, MCI has been linked to reduced social interactions and increased aimless movements, we aimed to automate the capture of these behaviors to enhance longitudinal monitoring. METHODS: Using a privacy-preserving distributed camera network, we collected movement and social interaction data from groups of individuals with MCI undergoing therapy within a 1700$m^2$ space. We developed movement and social interaction features, which were then used to train a series of machine learning algorithms to distinguish between higher and lower cognitive functioning MCI groups. RESULTS: A Wilcoxon rank-sum test revealed statistically significant differences between high and low-functioning cohorts in features such as linear path length, walking speed, change in direction while walking, entropy of velocity and direction change, and number of group formations in the indoor space. Despite lacking individual identifiers to associate with specific levels of MCI, a machine learning approach using the most significant features provided a 71% accuracy. DISCUSSION: We provide evidence to show that a privacy-preserving low-cost camera network using edge computing framework has the potential to distinguish between different levels of cognitive impairment from the movements and social interactions captured during group activities. △ Less

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.10405 [pdf, other]

ROOT: Requirements Organization and Optimization Tool

Authors: Katherine R. Dearstyne, Alberto D. Rodriguez, Jane Cleland-Huang

Abstract: Software engineering practices such as constructing requirements and establishing traceability help ensure systems are safe, reliable, and maintainable. However, they can be resource-intensive and are frequently underutilized. To alleviate the burden of these essential processes, we developed the Requirements Organization and Optimization Tool (ROOT). ROOT centralizes project information and offer… ▽ More Software engineering practices such as constructing requirements and establishing traceability help ensure systems are safe, reliable, and maintainable. However, they can be resource-intensive and are frequently underutilized. To alleviate the burden of these essential processes, we developed the Requirements Organization and Optimization Tool (ROOT). ROOT centralizes project information and offers project visualizations and AI-based tools designed to streamline engineering processes. With ROOT's assistance, engineers benefit from improved oversight and early error detection, leading to the successful development of software systems. Link to screen cast: https://youtu.be/3rtMYRnsu24 △ Less

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.06693 [pdf, other]

DC3DO: Diffusion Classifier for 3D Objects

Authors: Nursena Koprucu, Meher Shashwat Nigam, Shicheng Xu, Biruk Abere, Gabriele Dominici, Andrew Rodriguez, Sharvaree Vadgama, Berfin Inal, Alberto Tono

Abstract: Inspired by Geoffrey Hinton emphasis on generative modeling, To recognize shapes, first learn to generate them, we explore the use of 3D diffusion models for object classification. Leveraging the density estimates from these models, our approach, the Diffusion Classifier for 3D Objects (DC3DO), enables zero-shot classification of 3D shapes without additional training. On average, our method achiev… ▽ More Inspired by Geoffrey Hinton emphasis on generative modeling, To recognize shapes, first learn to generate them, we explore the use of 3D diffusion models for object classification. Leveraging the density estimates from these models, our approach, the Diffusion Classifier for 3D Objects (DC3DO), enables zero-shot classification of 3D shapes without additional training. On average, our method achieves a 12.5 percent improvement compared to its multiview counterparts, demonstrating superior multimodal reasoning over discriminative approaches. DC3DO employs a class-conditional diffusion model trained on ShapeNet, and we run inferences on point clouds of chairs and cars. This work highlights the potential of generative models in 3D object classification. △ Less

Submitted 13 August, 2024; originally announced August 2024.

arXiv:2408.05829 [pdf, other]

Supporting Software Maintenance with Dynamically Generated Document Hierarchies

Authors: Katherine R. Dearstyne, Alberto D. Rodriguez, Jane Cleland-Huang

Abstract: Software documentation supports a broad set of software maintenance tasks; however, creating and maintaining high-quality, multi-level software documentation can be incredibly time-consuming and therefore many code bases suffer from a lack of adequate documentation. We address this problem through presenting HGEN, a fully automated pipeline that leverages LLMs to transform source code through a se… ▽ More Software documentation supports a broad set of software maintenance tasks; however, creating and maintaining high-quality, multi-level software documentation can be incredibly time-consuming and therefore many code bases suffer from a lack of adequate documentation. We address this problem through presenting HGEN, a fully automated pipeline that leverages LLMs to transform source code through a series of six stages into a well-organized hierarchy of formatted documents. We evaluate HGEN both quantitatively and qualitatively. First, we use it to generate documentation for three diverse projects, and engage key developers in comparing the quality of the generated documentation against their own previously produced manually-crafted documentation. We then pilot HGEN in nine different industrial projects using diverse datasets provided by each project. We collect feedback from project stakeholders, and analyze it using an inductive approach to identify recurring themes. Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of the core concepts than the baseline approach. Stakeholder feedback highlights HGEN's commercial impact potential as a tool for accelerating code comprehension and maintenance tasks. Results and associated supplemental materials can be found at https://zenodo.org/records/11403244 △ Less

Submitted 11 August, 2024; originally announced August 2024.

arXiv:2407.21783 [pdf, other]

The Llama 3 Herd of Models

Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang , et al. (510 additional authors not shown)

Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical… ▽ More Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development. △ Less

Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

arXiv:2407.09348 [pdf, other]

Predictable and Performant Reactive Synthesis Modulo Theories via Functional Synthesis

Authors: Andoni Rodríguez, Felipe Gorostiaga, César Sánchez

Abstract: Reactive synthesis is the process of generating correct controllers from temporal logic specifications. Classical LTL reactive synthesis handles (propositional) LTL as a specification language. Boolean abstractions allow reducing LTLt specifications (i.e., LTL with propositions replaced by literals from a theory calT), into equi-realizable LTL specifications. In this paper we extend these results… ▽ More Reactive synthesis is the process of generating correct controllers from temporal logic specifications. Classical LTL reactive synthesis handles (propositional) LTL as a specification language. Boolean abstractions allow reducing LTLt specifications (i.e., LTL with propositions replaced by literals from a theory calT), into equi-realizable LTL specifications. In this paper we extend these results into a full static synthesis procedure. The synthesized system receives from the environment valuations of variables from a rich theory calT and outputs valuations of system variables from calT. We use the abstraction method to synthesize a reactive Boolean controller from the LTL specification, and we combine it with functional synthesis to obtain a static controller for the original LTLt specification. We also show that our method allows responses in the sense that the controller can optimize its outputs in order to e.g., always provide the smallest safe values. This is the first full static synthesis method for LTLt, which is a deterministic program (hence predictable and efficient). △ Less

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.08855 [pdf, other]

BraTS-PEDs: Results of the Multi-Consortium International Pediatric Brain Tumor Segmentation Challenge 2023

Authors: Anahita Fathi Kazerooni, Nastaran Khalili, Xinyang Liu, Debanjan Haldar, Zhifan Jiang, Anna Zapaishchykova, Julija Pavaine, Lubdha M. Shah, Blaise V. Jones, Nakul Sheth, Sanjay P. Prabhu, Aaron S. McAllister, Wenxin Tu, Khanak K. Nandolia, Andres F. Rodriguez, Ibraheem Salman Shaikh, Mariana Sanchez Montano, Hollie Anne Lai, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Hannah Anderson, Syed Muhammed Anwar, Alejandro Aristizabal, Sina Bagheri , et al. (55 additional authors not shown)

Abstract: Pediatric central nervous system tumors are the leading cause of cancer-related deaths in children. The five-year survival rate for high-grade glioma in children is less than 20%. The development of new treatments is dependent upon multi-institutional collaborative clinical trials requiring reproducible and accurate centralized response assessment. We present the results of the BraTS-PEDs 2023 cha… ▽ More Pediatric central nervous system tumors are the leading cause of cancer-related deaths in children. The five-year survival rate for high-grade glioma in children is less than 20%. The development of new treatments is dependent upon multi-institutional collaborative clinical trials requiring reproducible and accurate centralized response assessment. We present the results of the BraTS-PEDs 2023 challenge, the first Brain Tumor Segmentation (BraTS) challenge focused on pediatric brain tumors. This challenge utilized data acquired from multiple international consortia dedicated to pediatric neuro-oncology and clinical trials. BraTS-PEDs 2023 aimed to evaluate volumetric segmentation algorithms for pediatric brain gliomas from magnetic resonance imaging using standardized quantitative performance evaluation metrics employed across the BraTS 2023 challenges. The top-performing AI approaches for pediatric tumor analysis included ensembles of nnU-Net and Swin UNETR, Auto3DSeg, or nnU-Net with a self-supervised framework. The BraTSPEDs 2023 challenge fostered collaboration between clinicians (neuro-oncologists, neuroradiologists) and AI/imaging scientists, promoting faster data sharing and the development of automated volumetric analysis techniques. These advancements could significantly benefit clinical trials and improve the care of children with brain tumors. △ Less

Submitted 16 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08094 [pdf, other]

Density Estimation via Binless Multidimensional Integration

Authors: Matteo Carli, Alex Rodriguez, Alessandro Laio, Aldo Glielmo

Abstract: We introduce the Binless Multidimensional Thermodynamic Integration (BMTI) method for nonparametric, robust, and data-efficient density estimation. BMTI estimates the logarithm of the density by initially computing log-density differences between neighbouring data points. Subsequently, such differences are integrated, weighted by their associated uncertainties, using a maximum-likelihood formulati… ▽ More We introduce the Binless Multidimensional Thermodynamic Integration (BMTI) method for nonparametric, robust, and data-efficient density estimation. BMTI estimates the logarithm of the density by initially computing log-density differences between neighbouring data points. Subsequently, such differences are integrated, weighted by their associated uncertainties, using a maximum-likelihood formulation. This procedure can be seen as an extension to a multidimensional setting of the thermodynamic integration, a technique developed in statistical physics. The method leverages the manifold hypothesis, estimating quantities within the intrinsic data manifold without defining an explicit coordinate map. It does not rely on any binning or space partitioning, but rather on the construction of a neighbourhood graph based on an adaptive bandwidth selection procedure. BMTI mitigates the limitations commonly associated with traditional nonparametric density estimators, effectively reconstructing smooth profiles even in high-dimensional embedding spaces. The method is tested on a variety of complex synthetic high-dimensional datasets, where it is shown to outperform traditional estimators, and is benchmarked on realistic datasets from the chemical physics literature. △ Less

Submitted 14 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.02641 [pdf, other]

Learning Graph Structures and Uncertainty for Accurate and Calibrated Time-series Forecasting

Authors: Harshavardhan Kamarthi, Lingkai Kong, Alexander Rodriguez, Chao Zhang, B Aditya Prakash

Abstract: Multi-variate time series forecasting is an important problem with a wide range of applications. Recent works model the relations between time-series as graphs and have shown that propagating information over the relation graph can improve time series forecasting. However, in many cases, relational information is not available or is noisy and reliable. Moreover, most works ignore the underlying un… ▽ More Multi-variate time series forecasting is an important problem with a wide range of applications. Recent works model the relations between time-series as graphs and have shown that propagating information over the relation graph can improve time series forecasting. However, in many cases, relational information is not available or is noisy and reliable. Moreover, most works ignore the underlying uncertainty of time-series both for structure learning and deriving the forecasts resulting in the structure not capturing the uncertainty resulting in forecast distributions with poor uncertainty estimates. We tackle this challenge and introduce STOIC, that leverages stochastic correlations between time-series to learn underlying structure between time-series and to provide well-calibrated and accurate forecasts. Over a wide-range of benchmark datasets STOIC provides around 16% more accurate and 14% better-calibrated forecasts. STOIC also shows better adaptation to noise in data during inference and captures important and useful relational information in various benchmarks. △ Less

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.15812 [pdf, other]

Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations

Authors: Lorenzo Basile, Santiago Acevedo, Luca Bortolussi, Fabio Anselmi, Alex Rodriguez

Abstract: To gain insight into the mechanisms behind machine learning methods, it is crucial to establish connections among the features describing data points. However, these correlations often exhibit a high-dimensional and strongly nonlinear nature, which makes them challenging to detect using standard methods. This paper exploits the entanglement between intrinsic dimensionality and correlation to propo… ▽ More To gain insight into the mechanisms behind machine learning methods, it is crucial to establish connections among the features describing data points. However, these correlations often exhibit a high-dimensional and strongly nonlinear nature, which makes them challenging to detect using standard methods. This paper exploits the entanglement between intrinsic dimensionality and correlation to propose a metric that quantifies the (potentially nonlinear) correlation between high-dimensional manifolds. We first validate our method on synthetic data in controlled environments, showcasing its advantages and drawbacks compared to existing techniques. Subsequently, we extend our analysis to large-scale applications in neural network representations. Specifically, we focus on latent representations of multimodal data, uncovering clear correlations between paired visual and textual embeddings, whereas existing methods struggle significantly in detecting similarity. Our results indicate the presence of highly nonlinear correlation patterns between latent manifolds. △ Less

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2406.14349 [pdf, other]

Can you trust your explanations? A robustness test for feature attribution methods

Authors: Ilaria Vascotto, Alex Rodriguez, Alessandro Bonaita, Luca Bortolussi

Abstract: The increase of legislative concerns towards the usage of Artificial Intelligence (AI) has recently led to a series of regulations striving for a more transparent, trustworthy and accountable AI. Along with these proposals, the field of Explainable AI (XAI) has seen a rapid growth but the usage of its techniques has at times led to unexpected results. The robustness of the approaches is, in fact,… ▽ More The increase of legislative concerns towards the usage of Artificial Intelligence (AI) has recently led to a series of regulations striving for a more transparent, trustworthy and accountable AI. Along with these proposals, the field of Explainable AI (XAI) has seen a rapid growth but the usage of its techniques has at times led to unexpected results. The robustness of the approaches is, in fact, a key property often overlooked: it is necessary to evaluate the stability of an explanation (to random and adversarial perturbations) to ensure that the results are trustable. To this end, we propose a test to evaluate the robustness to non-adversarial perturbations and an ensemble approach to analyse more in depth the robustness of XAI methods applied to neural networks and tabular datasets. We will show how leveraging manifold hypothesis and ensemble approaches can be beneficial to an in-depth analysis of the robustness. △ Less

Submitted 20 June, 2024; originally announced June 2024.

Comments: 8 pages, 3 figures

arXiv:2406.09442 [pdf]

doi 10.1021/acsnano.4c06527

An insertable glucose sensor using a compact and cost-effective phosphorescence lifetime imager and machine learning

Authors: Artem Goncharov, Zoltan Gorocs, Ridhi Pradhan, Brian Ko, Ajmal Ajmal, Andres Rodriguez, David Baum, Marcell Veszpremi, Xilin Yang, Maxime Pindrys, Tianle Zheng, Oliver Wang, Jessica C. Ramella-Roman, Michael J. McShane, Aydogan Ozcan

Abstract: Optical continuous glucose monitoring (CGM) systems are emerging for personalized glucose management owing to their lower cost and prolonged durability compared to conventional electrochemical CGMs. Here, we report a computational CGM system, which integrates a biocompatible phosphorescence-based insertable biosensor and a custom-designed phosphorescence lifetime imager (PLI). This compact and cos… ▽ More Optical continuous glucose monitoring (CGM) systems are emerging for personalized glucose management owing to their lower cost and prolonged durability compared to conventional electrochemical CGMs. Here, we report a computational CGM system, which integrates a biocompatible phosphorescence-based insertable biosensor and a custom-designed phosphorescence lifetime imager (PLI). This compact and cost-effective PLI is designed to capture phosphorescence lifetime images of an insertable sensor through the skin, where the lifetime of the emitted phosphorescence signal is modulated by the local concentration of glucose. Because this phosphorescence signal has a very long lifetime compared to tissue autofluorescence or excitation leakage processes, it completely bypasses these noise sources by measuring the sensor emission over several tens of microseconds after the excitation light is turned off. The lifetime images acquired through the skin are processed by neural network-based models for misalignment-tolerant inference of glucose levels, accurately revealing normal, low (hypoglycemia) and high (hyperglycemia) concentration ranges. Using a 1-mm thick skin phantom mimicking the optical properties of human skin, we performed in vitro testing of the PLI using glucose-spiked samples, yielding 88.8% inference accuracy, also showing resilience to random and unknown misalignments within a lateral distance of ~4.7 mm with respect to the position of the insertable sensor underneath the skin phantom. Furthermore, the PLI accurately identified larger lateral misalignments beyond 5 mm, prompting user intervention for re-alignment. The misalignment-resilient glucose concentration inference capability of this compact and cost-effective phosphorescence lifetime imager makes it an appealing wearable diagnostics tool for real-time tracking of glucose and other biomarkers. △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: 24 Pages, 4 Figures

Journal ref: ACS Nano (2024)

arXiv:2406.06507 [pdf, other]

Verification-Guided Shielding for Deep Reinforcement Learning

Authors: Davide Corsi, Guy Amir, Andoni Rodriguez, Cesar Sanchez, Guy Katz, Roy Fox

Abstract: In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach to solving real-world tasks. However, despite their successes, DRL-based policies suffer from poor reliability, which limits their deployment in safety-critical domains. Various methods have been put forth to address this issue by providing formal safety guarantees. Two main approaches include shielding and ver… ▽ More In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach to solving real-world tasks. However, despite their successes, DRL-based policies suffer from poor reliability, which limits their deployment in safety-critical domains. Various methods have been put forth to address this issue by providing formal safety guarantees. Two main approaches include shielding and verification. While shielding ensures the safe behavior of the policy by employing an external online component (i.e., a ``shield'') that overrides potentially dangerous actions, this approach has a significant computational cost as the shield must be invoked at runtime to validate every decision. On the other hand, verification is an offline process that can identify policies that are unsafe, prior to their deployment, yet, without providing alternative actions when such a policy is deemed unsafe. In this work, we present verification-guided shielding -- a novel approach that bridges the DRL reliability gap by integrating these two methods. Our approach combines both formal and probabilistic verification tools to partition the input domain into safe and unsafe regions. In addition, we employ clustering and symbolic representation procedures that compress the unsafe regions into a compact representation. This, in turn, allows to temporarily activate the shield solely in (potentially) unsafe regions, in an efficient manner. Our novel approach allows to significantly reduce runtime overhead while still preserving formal safety guarantees. We extensively evaluate our approach on two benchmarks from the robotic navigation domain, as well as provide an in-depth analysis of its scalability and completeness. △ Less

Submitted 20 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Comments: Accepted in the 1st Reinforcement Learning Conference (RLC), 2024. [Corsi and Amir contributed equally]

arXiv:2406.04184 [pdf, other]

Shield Synthesis for LTL Modulo Theories

Authors: Andoni Rodriguez, Guy Amir, Davide Corsi, Cesar Sanchez, Guy Katz

Abstract: In recent years, Machine Learning (ML) models have achieved remarkable success in various domains. However, these models also tend to demonstrate unsafe behaviors, precluding their deployment in safety-critical systems. To cope with this issue, ample research focuses on developing methods that guarantee the safe behaviour of a given ML model. A prominent example is shielding which incorporates an… ▽ More In recent years, Machine Learning (ML) models have achieved remarkable success in various domains. However, these models also tend to demonstrate unsafe behaviors, precluding their deployment in safety-critical systems. To cope with this issue, ample research focuses on developing methods that guarantee the safe behaviour of a given ML model. A prominent example is shielding which incorporates an external component (a "shield") that blocks unwanted behavior. Despite significant progress, shielding suffers from a main setback: it is currently geared towards properties encoded solely in propositional logics (e.g., LTL) and is unsuitable for richer logics. This, in turn, limits the widespread applicability of shielding in many real-world systems. In this work, we address this gap, and extend shielding to LTL modulo theories, by building upon recent advances in reactive synthesis modulo theories. This allowed us to develop a novel approach for generating shields conforming to complex safety specifications in these more expressive, logics. We evaluated our shields and demonstrate their ability to handle rich data with temporal dynamics. To the best of our knowledge, this is the first approach for synthesizing shields for such expressivity. △ Less

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2404.12503 [pdf, other]

STRELA: STReaming ELAstic CGRA Accelerator for Embedded Systems

Authors: Daniel Vazquez, Jose Miranda, Alfonso Rodriguez, Andres Otero, Pascuale Davide Schiavone, David Atienza

Abstract: Reconfigurable computing offers a good balance between flexibility and energy efficiency. When combined with software-programmable devices such as CPUs, it is possible to obtain higher performance by spatially distributing the parallelizable sections of an application throughout the reconfigurable device while the CPU is in charge of control-intensive sections. This work introduces an elastic Coar… ▽ More Reconfigurable computing offers a good balance between flexibility and energy efficiency. When combined with software-programmable devices such as CPUs, it is possible to obtain higher performance by spatially distributing the parallelizable sections of an application throughout the reconfigurable device while the CPU is in charge of control-intensive sections. This work introduces an elastic Coarse-Grained Reconfigurable Architecture (CGRA) integrated into an energy-efficient RISC-V-based SoC designed for the embedded domain. The microarchitecture of CGRA supports conditionals and irregular loops, making it adaptable to domain-specific applications. Additionally, we propose specific mapping strategies that enable the efficient utilization of the CGRA for both simple applications, where the fabric is only reconfigured once (one-shot kernel), and more complex ones, where it is necessary to reconfigure the CGRA multiple times to complete them (multi-shot kernels). Large kernels also benefit from the independent memory nodes incorporated to streamline data accesses. Due to the integration of CGRA as an accelerator of the RISC-V processor enables a versatile and efficient framework, providing adaptability, processing capacity, and overall performance across various applications. The design has been implemented in TSMC 65 nm, achieving a maximum frequency of 250 MHz. It achieves a peak performance of 1.22 GOPs computing one-shot kernels and 1.17 GOPs computing multi-shot kernels. The best energy efficiency is 72.68 MOPs/mW for one-shot kernels and 115.96 MOPs/mW for multi-shot kernels. The design integrates power and clock-gating techniques to tailor the architecture to the embedded domain while maintaining performance. The best speed-ups are 17.63x and 18.61x for one-shot and multi-shot kernels. The best energy savings in the SoC are 9.05x and 11.10x for one-shot and multi-shot kernels. △ Less

Submitted 18 April, 2024; originally announced April 2024.

Comments: 14 pages, 11 figures

arXiv:2403.13171 [pdf, other]

LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images

Authors: Jing Zhang, Irving Fang, Juexiao Zhang, Hao Wu, Akshat Kaushik, Alice Rodriguez, Hanwen Zhao, Zhuo Zheng, Radu Iovita, Chen Feng

Abstract: Lithic Use-Wear Analysis (LUWA) using microscopic images is an underexplored vision-for-science research area. It seeks to distinguish the worked material, which is critical for understanding archaeological artifacts, material interactions, tool functionalities, and dental records. However, this challenging task goes beyond the well-studied image classification problem for common objects. It is af… ▽ More Lithic Use-Wear Analysis (LUWA) using microscopic images is an underexplored vision-for-science research area. It seeks to distinguish the worked material, which is critical for understanding archaeological artifacts, material interactions, tool functionalities, and dental records. However, this challenging task goes beyond the well-studied image classification problem for common objects. It is affected by many confounders owing to the complex wear mechanism and microscopic imaging, which makes it difficult even for human experts to identify the worked material successfully. In this paper, we investigate the following three questions on this unique vision task for the first time:(i) How well can state-of-the-art pre-trained models (like DINOv2) generalize to the rarely seen domain? (ii) How can few-shot learning be exploited for scarce microscopic images? (iii) How do the ambiguous magnification and sensing modality influence the classification accuracy? To study these, we collaborated with archaeologists and built the first open-source and the largest LUWA dataset containing 23,130 microscopic images with different magnifications and sensing modalities. Extensive experiments show that existing pre-trained models notably outperform human experts but still leave a large gap for improvements. Most importantly, the LUWA dataset provides an underexplored opportunity for vision and learning communities and complements existing image classification problems on common objects. △ Less

Submitted 27 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: CVPR

arXiv:2403.03150 [pdf, other]

Deep-Learned Compression for Radio-Frequency Signal Classification

Authors: Armani Rodriguez, Yagna Kaasaragadda, Silvija Kokalj-Filipovic

Abstract: Next-generation cellular concepts rely on the processing of large quantities of radio-frequency (RF) samples. This includes Radio Access Networks (RAN) connecting the cellular front-end based on software defined radios (SDRs) and a framework for the AI processing of spectrum-related data. The RF data collected by the dense RAN radio units and spectrum sensors may need to be jointly processed for i… ▽ More Next-generation cellular concepts rely on the processing of large quantities of radio-frequency (RF) samples. This includes Radio Access Networks (RAN) connecting the cellular front-end based on software defined radios (SDRs) and a framework for the AI processing of spectrum-related data. The RF data collected by the dense RAN radio units and spectrum sensors may need to be jointly processed for intelligent decision making. Moving large amounts of data to AI agents may result in significant bandwidth and latency costs. We propose a deep learned compression (DLC) model, HQARF, based on learned vector quantization (VQ), to compress the complex-valued samples of RF signals comprised of 6 modulation classes. We are assessing the effects of HQARF on the performance of an AI model trained to infer the modulation class of the RF signal. Compression of narrow-band RF samples for the training and off-the-site inference will allow for an efficient use of the bandwidth and storage for non-real-time analytics, and for a decreased delay in real-time applications. While exploring the effectiveness of the HQARF signal reconstructions in modulation classification tasks, we highlight the DLC optimization space and some open problems related to the training of the VQ embedded in HQARF. △ Less

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2403.02997 [pdf, other]

Cover Edge-Based Novel Triangle Counting

Authors: David A. Bader, Fuhuan Li, Zhihui Du, Palina Pauliuchenka, Oliver Alvarado Rodriguez, Anant Gupta, Sai Sri Vastav Minnal, Valmik Nahata, Anya Ganeshan, Ahmet Gundogdu, Jason Lew

Abstract: Listing and counting triangles in graphs is a key algorithmic kernel for network analyses, including community detection, clustering coefficients, k-trusses, and triangle centrality. In this paper, we propose the novel concept of a cover-edge set that can be used to find triangles more efficiently. Leveraging the breadth-first search (BFS) method, we can quickly generate a compact cover-edge set.… ▽ More Listing and counting triangles in graphs is a key algorithmic kernel for network analyses, including community detection, clustering coefficients, k-trusses, and triangle centrality. In this paper, we propose the novel concept of a cover-edge set that can be used to find triangles more efficiently. Leveraging the breadth-first search (BFS) method, we can quickly generate a compact cover-edge set. Novel sequential and parallel triangle counting algorithms that employ cover-edge sets are presented. The novel sequential algorithm performs competitively with the fastest previous approaches on both real and synthetic graphs, such as those from the Graph500 Benchmark and the MIT/Amazon/IEEE Graph Challenge. We implement 22 sequential algorithms for performance evaluation and comparison. At the same time, we employ OpenMP to parallelize 11 sequential algorithms, presenting an in-depth analysis of their parallel performance. Furthermore, we develop a distributed parallel algorithm that can asymptotically reduce communication on massive graphs. In our estimate from massive-scale Graph500 graphs, our distributed parallel algorithm can reduce the communication on a scale~36 graph by 1156x and on a scale~42 graph by 2368x. Comprehensive experiments are conducted on the recently launched Intel Xeon 8480+ processor and shed light on how graph attributes, such as topology, diameter, and degree distribution, can affect the performance of these algorithms. △ Less

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2403.00049 [pdf, other]

TEXterity -- Tactile Extrinsic deXterity: Simultaneous Tactile Estimation and Control for Extrinsic Dexterity

Authors: Sangwoon Kim, Antonia Bronars, Parag Patre, Alberto Rodriguez

Abstract: We introduce a novel approach that combines tactile estimation and control for in-hand object manipulation. By integrating measurements from robot kinematics and an image-based tactile sensor, our framework estimates and tracks object pose while simultaneously generating motion plans in a receding horizon fashion to control the pose of a grasped object. This approach consists of a discrete pose es… ▽ More We introduce a novel approach that combines tactile estimation and control for in-hand object manipulation. By integrating measurements from robot kinematics and an image-based tactile sensor, our framework estimates and tracks object pose while simultaneously generating motion plans in a receding horizon fashion to control the pose of a grasped object. This approach consists of a discrete pose estimator that tracks the most likely sequence of object poses in a coarsely discretized grid, and a continuous pose estimator-controller to refine the pose estimate and accurately manipulate the pose of the grasped object. Our method is tested on diverse objects and configurations, achieving desired manipulation objectives and outperforming single-shot methods in estimation accuracy. The proposed approach holds potential for tasks requiring precise manipulation and limited intrinsic in-hand dexterity under visual occlusion, laying the foundation for closed-loop behavior in applications such as regrasping, insertion, and tool use. Please see https://sites.google.com/view/texterity for videos of real-world demonstrations. △ Less

Submitted 4 March, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

Comments: project website: https://sites.google.com/view/texterity. arXiv admin note: substantial text overlap with arXiv:2401.10230

arXiv:2402.10222 [pdf, other]

Autonomous Vehicle Patrolling Through Deep Reinforcement Learning: Learning to Communicate and Cooperate

Authors: Chenhao Tong, Maria A. Rodriguez, Richard O. Sinnott

Abstract: Autonomous vehicles are suited for continuous area patrolling problems. Finding an optimal patrolling strategy can be challenging due to unknown environmental factors, such as wind or landscape; or autonomous vehicles' constraints, such as limited battery life or hardware failures. Importantly, patrolling large areas often requires multiple agents to collectively coordinate their actions. However,… ▽ More Autonomous vehicles are suited for continuous area patrolling problems. Finding an optimal patrolling strategy can be challenging due to unknown environmental factors, such as wind or landscape; or autonomous vehicles' constraints, such as limited battery life or hardware failures. Importantly, patrolling large areas often requires multiple agents to collectively coordinate their actions. However, an optimal coordination strategy is often non-trivial to be manually defined due to the complex nature of patrolling environments. In this paper, we consider a patrolling problem with environmental factors, agent limitations, and three typical cooperation problems -- collision avoidance, congestion avoidance, and patrolling target negotiation. We propose a multi-agent reinforcement learning solution based on a reinforced inter-agent learning (RIAL) method. With this approach, agents are trained to develop their own communication protocol to cooperate during patrolling where faults can and do occur. The solution is validated through simulation experiments and is compared with several state-of-the-art patrolling solutions from different perspectives, including the overall patrol performance, the collision avoidance performance, the efficiency of battery recharging strategies, and the overall fault tolerance. △ Less

Submitted 28 January, 2024; originally announced February 2024.

arXiv:2402.09498 [pdf]

doi 10.1177/20552076221111289

Detection of the most influential variables for preventing postpartum urinary incontinence using machine learning techniques

Authors: José Alberto Benítez-Andrades, María Teresa García-Ordás, María Álvarez-González, Raquel Leirós-Rodríguez, Ana F López Rodríguez

Abstract: Background: Postpartum urinary incontinence (PUI) is a common issue among postnatal women. Previous studies identified potential related variables, but lacked analysis on certain intrinsic and extrinsic patient variables during pregnancy. Objective: The study aims to evaluate the most influential variables in PUI using machine learning, focusing on intrinsic, extrinsic, and combined variable gro… ▽ More Background: Postpartum urinary incontinence (PUI) is a common issue among postnatal women. Previous studies identified potential related variables, but lacked analysis on certain intrinsic and extrinsic patient variables during pregnancy. Objective: The study aims to evaluate the most influential variables in PUI using machine learning, focusing on intrinsic, extrinsic, and combined variable groups. Methods: Data from 93 pregnant women were analyzed using machine learning and oversampling techniques. Four key variables were predicted: occurrence, frequency, intensity of urinary incontinence, and stress urinary incontinence. Results: Models using extrinsic variables were most accurate, with 70% accuracy for urinary incontinence, 77% for frequency, 71% for intensity, and 93% for stress urinary incontinence. Conclusions: The study highlights extrinsic variables as significant predictors of PUI issues. This suggests that PUI prevention might be achievable through healthy habits during pregnancy, although further research is needed for confirmation. △ Less

Submitted 14 February, 2024; originally announced February 2024.

Journal ref: Digital Health, Volume 8, 2022, 20552076221111289

arXiv:2402.04211 [pdf, other]

Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification

Authors: Mert Ketenci, Iñigo Urteaga, Victor Alfonso Rodriguez, Noémie Elhadad, Adler Perotte

Abstract: Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes. Despite their widespread adoption and unique ability to satisfy essential explainability axioms, computational challenges persist in their estimation when ($i$) evaluating a model over all possible subset of input feature combinations, ($ii$) estimating model marginals, and… ▽ More Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes. Despite their widespread adoption and unique ability to satisfy essential explainability axioms, computational challenges persist in their estimation when ($i$) evaluating a model over all possible subset of input feature combinations, ($ii$) estimating model marginals, and ($iii$) addressing variability in explanations. We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass. Recognizing the deterministic treatment of Shapley values as a limitation, we explore incorporating a probabilistic framework to capture the inherent uncertainty in explanations. Unlike alternatives, our technique does not rely directly on the observed data space to estimate marginals; instead, it uses adaptable baseline values derived from a latent, feature-specific embedding space, generated by a novel masked neural network architecture. Evaluations on simulated and real datasets underscore our technique's robust predictive and explanatory performance. △ Less

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2401.12324 [pdf]

doi 10.1016/j.neucom.2021.10.085

Streamlining Advanced Taxi Assignment Strategies based on Legal Analysis

Authors: Holger Billhardt, José-Antonio Santos, Alberto Fernández, Mar Moreno, Sascha Ossowski, José A. Rodríguez

Abstract: In recent years many novel applications have appeared that promote the provision of services and activities in a collaborative manner. The key idea behind such systems is to take advantage of idle or underused capacities of existing resources, in order to provide improved services that assist people in their daily tasks, with additional functionality, enhanced efficiency, and/or reduced cost. Part… ▽ More In recent years many novel applications have appeared that promote the provision of services and activities in a collaborative manner. The key idea behind such systems is to take advantage of idle or underused capacities of existing resources, in order to provide improved services that assist people in their daily tasks, with additional functionality, enhanced efficiency, and/or reduced cost. Particularly in the domain of urban transportation, many researchers have put forward novel ideas, which are then implemented and evaluated through prototypes that usually draw upon AI methods and tools. However, such proposals also bring up multiple non-technical issues that need to be identified and addressed adequately if such systems are ever meant to be applied to the real world. While, in practice, legal and ethical aspects related to such AI-based systems are seldomly considered in the beginning of the research and development process, we argue that they not only restrict design decisions, but can also help guiding them. In this manuscript, we set out from a prototype of a taxi coordination service that mediates between individual (and autonomous) taxis and potential customers. After representing key aspects of its operation in a semi-structured manner, we analyse its viability from the viewpoint of current legal restrictions and constraints, so as to identify additional non-functional requirements as well as options to address them. Then, we go one step ahead, and actually modify the existing prototype to incorporate the previously identified recommendations. Performing experiments with this improved system helps us identify the most adequate option among several legally admissible alternatives. △ Less

Submitted 22 January, 2024; originally announced January 2024.

ACM Class: I.2.1

Journal ref: Neurocomputing, Volume 438 (2022)

arXiv:2401.10230 [pdf, other]

TEXterity: Tactile Extrinsic deXterity

Authors: Antonia Bronars, Sangwoon Kim, Parag Patre, Alberto Rodriguez

Abstract: We introduce a novel approach that combines tactile estimation and control for in-hand object manipulation. By integrating measurements from robot kinematics and an image-based tactile sensor, our framework estimates and tracks object pose while simultaneously generating motion plans to control the pose of a grasped object. This approach consists of a discrete pose estimator that uses the Viterbi… ▽ More We introduce a novel approach that combines tactile estimation and control for in-hand object manipulation. By integrating measurements from robot kinematics and an image-based tactile sensor, our framework estimates and tracks object pose while simultaneously generating motion plans to control the pose of a grasped object. This approach consists of a discrete pose estimator that uses the Viterbi decoding algorithm to find the most likely sequence of object poses in a coarsely discretized grid, and a continuous pose estimator-controller to refine the pose estimate and accurately manipulate the pose of the grasped object. Our method is tested on diverse objects and configurations, achieving desired manipulation objectives and outperforming single-shot methods in estimation accuracy. The proposed approach holds potential for tasks requiring precise manipulation in scenarios where visual perception is limited, laying the foundation for closed-loop behavior applications such as assembly and tool use. Please see supplementary videos for real-world demonstration at https://sites.google.com/view/texterity. △ Less

Submitted 22 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

Comments: 8 pages, 5 figures, submitted to ICRA 2024

arXiv:2401.09910 [pdf, other]

doi 10.1109/HPCC-DSS-SmartCity-DependSys60770.2023.00112

Deep Back-Filling: a Split Window Technique for Deep Online Cluster Job Scheduling

Authors: Lingfei Wang, Aaron Harwood, Maria A. Rodriguez

Abstract: Job scheduling is a critical component of workload management systems that can significantly influence system performance, e.g., in HPC clusters. The scheduling objectives are often mixed, such as maximizing resource utilization and minimizing job waiting time. An increasing number of researchers are moving from heuristic-based approaches to Deep Reinforcement Learning approaches in order to optim… ▽ More Job scheduling is a critical component of workload management systems that can significantly influence system performance, e.g., in HPC clusters. The scheduling objectives are often mixed, such as maximizing resource utilization and minimizing job waiting time. An increasing number of researchers are moving from heuristic-based approaches to Deep Reinforcement Learning approaches in order to optimize scheduling objectives. However, the job scheduler's state space is partially observable to a DRL-based agent because the job queue is practically unbounded. The agent's observation of the state space is constant in size since the input size of the neural networks is predefined. All existing solutions to this problem intuitively allow the agent to observe a fixed window size of jobs at the head of the job queue. In our research, we have seen that such an approach can lead to "window staleness" where the window becomes full of jobs that can not be scheduled until the cluster has completed sufficient work. In this paper, we propose a novel general technique that we call \emph{split window}, which allows the agent to observe both the head \emph{and tail} of the queue. With this technique, the agent can observe all arriving jobs at least once, which completely eliminates the window staleness problem. By leveraging the split window, the agent can significantly reduce the average job waiting time and average queue length, alternatively allowing the use of much smaller windows and, therefore, faster training times. We show a range of simulation results using HPC job scheduling trace data that supports the effectiveness of our technique. △ Less

Submitted 18 January, 2024; originally announced January 2024.

Comments: This paper has been accepted for presentation at HPCC, 2022

arXiv:2312.11556 [pdf, other]

StarVector: Generating Scalable Vector Graphics Code from Images

Authors: Juan A. Rodriguez, Shubham Agarwal, Issam H. Laradji, Pau Rodriguez, David Vazquez, Christopher Pal, Marco Pedersoli

Abstract: Scalable Vector Graphics (SVGs) have become integral in modern image rendering applications due to their infinite scalability in resolution, versatile usability, and editing capabilities. SVGs are particularly popular in the fields of web development and graphic design. Existing approaches for SVG modeling using deep learning often struggle with generating complex SVGs and are restricted to simple… ▽ More Scalable Vector Graphics (SVGs) have become integral in modern image rendering applications due to their infinite scalability in resolution, versatile usability, and editing capabilities. SVGs are particularly popular in the fields of web development and graphic design. Existing approaches for SVG modeling using deep learning often struggle with generating complex SVGs and are restricted to simpler ones that require extensive processing and simplification. This paper introduces StarVector, a multimodal SVG generation model that effectively integrates Code Generation Large Language Models (CodeLLMs) and vision models. Our approach utilizes a CLIP image encoder to extract visual representations from pixel-based images, which are then transformed into visual tokens via an adapter module. These visual tokens are pre-pended to the SVG token embeddings, and the sequence is modeled by the StarCoder model using next-token prediction, effectively learning to align the visual and code tokens. This enables StarVector to generate unrestricted SVGs that accurately represent pixel images. To evaluate StarVector's performance, we present SVG-Bench, a comprehensive benchmark for evaluating SVG methods across multiple datasets and relevant metrics. Within this benchmark, we introduce novel datasets including SVG-Stack, a large-scale dataset of real-world SVG examples, and use it to pre-train StarVector as a large foundation model for SVGs. Our results demonstrate significant enhancements in visual quality and complexity handling over current methods, marking a notable advancement in SVG generation technology. Code and models: https://github.com/joanrod/star-vector △ Less

Submitted 17 December, 2023; originally announced December 2023.

arXiv:2312.11283 [pdf, other]

The 2010 Census Confidentiality Protections Failed, Here's How and Why

Authors: John M. Abowd, Tamara Adams, Robert Ashmead, David Darais, Sourya Dey, Simson L. Garfinkel, Nathan Goldschlag, Daniel Kifer, Philip Leclerc, Ethan Lew, Scott Moore, Rolando A. Rodríguez, Ramy N. Tadros, Lars Vilhuber

Abstract: Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can veri… ▽ More Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act. △ Less

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.07789 [pdf]

doi 10.3390/app11136169

Impact of Computer-Based Assessments on the Science's Ranks of Secondary Students

Authors: Eduardo A. Soto Rodríguez, Ana Fernández Vilas, Rebeca P. Díaz Redondo

Abstract: This study reports the impact of examining either with digital or paper-based tests in science subjects taught across the second-ary level. With our method, we compare the percentile ranking scores of two cohorts earned in computer- and paper-based teacher-made assessments to find signals of a testing mode effect. It was found that overall, at cohort and gender levels, pupils were rank-ordered equ… ▽ More This study reports the impact of examining either with digital or paper-based tests in science subjects taught across the second-ary level. With our method, we compare the percentile ranking scores of two cohorts earned in computer- and paper-based teacher-made assessments to find signals of a testing mode effect. It was found that overall, at cohort and gender levels, pupils were rank-ordered equivalently in both testing modes. Furthermore, females and top-achieving pupils were the two subgroups where the differences between modes were smaller. The practical implications of these findings are discussed from the lens of a case study and the doubt about whether regular schools could afford to deliver high-stakes computer-based tests. △ Less

Submitted 12 December, 2023; originally announced December 2023.

Journal ref: Appl. Sci. 2021

arXiv:2311.14762 [pdf, other]

The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024

Authors: Benjamin Kiefer, Lojze Žust, Matej Kristan, Janez Perš, Matija Teršek, Arnold Wiliem, Martin Messmer, Cheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Heng-Cheng Kuo, Jie Mei, Jenq-Neng Hwang, Daniel Stadler, Lars Sommer, Kaer Huang, Aiguo Zheng, Weitu Chong, Kanokphan Lertniphonphan, Jun Xie, Feng Chen, Jian Li, Zhepeng Wang, Luca Zedda, Andrea Loddo , et al. (24 additional authors not shown)

Abstract: The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV). Three challenges categories are considered: (i) UAV-based Maritime Object Tracking with Re-identification, (ii) USV-based Maritime Obstacle Segmentation and Detection, (iii) USV-based Maritime Boat Tracking. The USV-based Maritime Obst… ▽ More The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV). Three challenges categories are considered: (i) UAV-based Maritime Object Tracking with Re-identification, (ii) USV-based Maritime Obstacle Segmentation and Detection, (iii) USV-based Maritime Boat Tracking. The USV-based Maritime Obstacle Segmentation and Detection features three sub-challenges, including a new embedded challenge addressing efficicent inference on real-world embedded devices. This report offers a comprehensive overview of the findings from the challenges. We provide both statistical and qualitative analyses, evaluating trends from over 195 submissions. All datasets, evaluation code, and the leaderboard are available to the public at https://macvi.org/workshop/macvi24. △ Less

Submitted 23 November, 2023; originally announced November 2023.

Comments: Part of 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 IEEE Xplore submission as part of WACV 2024

arXiv:2311.07592 [pdf, other]

Hallucination-minimized Data-to-answer Framework for Financial Decision-makers

Authors: Sohini Roychowdhury, Andres Alvarez, Brian Moore, Marko Krema, Maria Paz Gelpi, Federico Martin Rodriguez, Angel Rodriguez, Jose Ramon Cabrejas, Pablo Martinez Serrano, Punit Agrawal, Arijit Mukherjee

Abstract: Large Language Models (LLMs) have been applied to build several automation and personalized question-answering prototypes so far. However, scaling such prototypes to robust products with minimized hallucinations or fake responses still remains an open challenge, especially in niche data-table heavy domains such as financial decision making. In this work, we present a novel Langchain-based framewor… ▽ More Large Language Models (LLMs) have been applied to build several automation and personalized question-answering prototypes so far. However, scaling such prototypes to robust products with minimized hallucinations or fake responses still remains an open challenge, especially in niche data-table heavy domains such as financial decision making. In this work, we present a novel Langchain-based framework that transforms data tables into hierarchical textual data chunks to enable a wide variety of actionable question answering. First, the user-queries are classified by intention followed by automated retrieval of the most relevant data chunks to generate customized LLM prompts per query. Next, the custom prompts and their responses undergo multi-metric scoring to assess for hallucinations and response confidence. The proposed system is optimized with user-query intention classification, advanced prompting, data scaling capabilities and it achieves over 90% confidence scores for a variety of user-queries responses ranging from {What, Where, Why, How, predict, trend, anomalies, exceptions} that are crucial for financial decision making applications. The proposed data to answers framework can be extended to other analytical domains such as sales and payroll to ensure optimal hallucination control guardrails. △ Less

Submitted 9 November, 2023; originally announced November 2023.

Comments: 11 pages, 5 figures, 4 tables

arXiv:2311.02811 [pdf, other]

Contour Algorithm for Connectivity

Authors: Zhihui Du, Oliver Alvarado Rodriguez, Fuhuan Li, Mohammad Dindoost, David A. Bader

Abstract: Finding connected components in a graph is a fundamental problem in graph analysis. In this work, we present a novel minimum-mapping based Contour algorithm to efficiently solve the connectivity problem. We prove that the Contour algorithm with two or higher order operators can identify all connected components of an undirected graph within $\mathcal{O}(\log d_{max})$ iterations, with each iterati… ▽ More Finding connected components in a graph is a fundamental problem in graph analysis. In this work, we present a novel minimum-mapping based Contour algorithm to efficiently solve the connectivity problem. We prove that the Contour algorithm with two or higher order operators can identify all connected components of an undirected graph within $\mathcal{O}(\log d_{max})$ iterations, with each iteration involving $\mathcal{O}(m)$ work, where $d_{max}$ represents the largest diameter among all components in the given graph, and $m$ is the total number of edges in the graph. Importantly, each iteration is highly parallelizable, making use of the efficient minimum-mapping operator applied to all edges. To further enhance its practical performance, we optimize the Contour algorithm through asynchronous updates, early convergence checking, eliminating atomic operations, and choosing more efficient mapping operators. Our implementation of the Contour algorithm has been integrated into the open-source framework Arachne. Arachne extends Arkouda for large-scale interactive graph analytics, providing a Python API powered by the high-productivity parallel language Chapel. Experimental results on both real-world and synthetic graphs demonstrate the superior performance of our proposed Contour algorithm compared to state-of-the-art large-scale parallel algorithm FastSV and the fastest shared memory algorithm ConnectIt. On average, Contour achieves a speedup of 7.3x and 1.4x compared to FastSV and ConnectIt, respectively. All code for the Contour algorithm and the Arachne framework is publicly available on GitHub ( https://github.com/Bears-R-Us/arkouda-njit ), ensuring transparency and reproducibility of our work. △ Less

Submitted 6 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

Comments: 30th IEEE International Conference on High Performance Computing, Data, and Analytics, Goa, India, December 2023

arXiv:2311.00974 [pdf, other]

CloudSim Express: A Novel Framework for Rapid Low Code Simulation of Cloud Computing Environments

Authors: Tharindu B. Hewage, Shashikant Ilager, Maria A. Rodriguez, Rajkumar Buyya

Abstract: Cloud computing environment simulators enable cost-effective experimentation of novel infrastructure designs and management approaches by avoiding significant costs incurred from repetitive deployments in real Cloud platforms. However, widely used Cloud environment simulators compromise on usability due to complexities in design and configuration, along with the added overhead of programming langu… ▽ More Cloud computing environment simulators enable cost-effective experimentation of novel infrastructure designs and management approaches by avoiding significant costs incurred from repetitive deployments in real Cloud platforms. However, widely used Cloud environment simulators compromise on usability due to complexities in design and configuration, along with the added overhead of programming language expertise. Existing approaches attempting to reduce this overhead, such as script-based simulators and Graphical User Interface (GUI) based simulators, often compromise on the extensibility of the simulator. Simulator extensibility allows for customization at a fine-grained level, thus reducing it significantly affects flexibility in creating simulations. To address these challenges, we propose an architectural framework to enable human-readable script-based simulations in existing Cloud environment simulators while minimizing the impact on simulator extensibility. We implement the proposed framework for the widely used Cloud environment simulator, the CloudSim toolkit, and compare it against state-of-the-art baselines using a practical use case. The resulting framework, called CloudSim Express, achieves extensible simulations while surpassing baselines with over a 71.43% reduction in code complexity and an 89.42% reduction in lines of code. △ Less

Submitted 10 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.18425 [pdf, other]

Parallel-Jaw Gripper and Grasp Co-Optimization for Sets of Planar Objects

Authors: Rebecca H. Jiang, Neel Doshi, Ravi Gondhalekar, Alberto Rodriguez

Abstract: We propose a framework for optimizing a planar parallel-jaw gripper for use with multiple objects. While optimizing general-purpose grippers and contact locations for grasps are both well studied, co-optimizing grasps and the gripper geometry to execute them receives less attention. As such, our framework synthesizes grippers optimized to stably grasp sets of polygonal objects. Given a fixed numbe… ▽ More We propose a framework for optimizing a planar parallel-jaw gripper for use with multiple objects. While optimizing general-purpose grippers and contact locations for grasps are both well studied, co-optimizing grasps and the gripper geometry to execute them receives less attention. As such, our framework synthesizes grippers optimized to stably grasp sets of polygonal objects. Given a fixed number of contacts and their assignments to object faces and gripper jaws, our framework optimizes contact locations along these faces, gripper pose for each grasp, and gripper shape. Our key insights are to pose shape and contact constraints in frames fixed to the gripper jaws, and to leverage the linearity of constraints in our grasp stability and gripper shape models via an augmented Lagrangian formulation. Together, these enable a tractable nonlinear program implementation. We apply our method to several examples. The first illustrative problem shows the discovery of a geometrically simple solution where possible. In another, space is constrained, forcing multiple objects to be contacted by the same features as each other. Finally a toolset-grasping example shows that our framework applies to complex, real-world objects. We provide a physical experiment of the toolset grasps. △ Less

Submitted 27 October, 2023; originally announced October 2023.

Comments: 2023 IEEE IROS conference

arXiv:2310.17292 [pdf, other]

Boolean Abstractions for Realizability Modulo Theories (Extended version)

Authors: Andoni Rodriguez, Cesar Sanchez

Abstract: In this paper, we address the problem of the (reactive) realizability of specifications of theories richer than Booleans, including arithmetic theories. Our approach transforms theory specifications into purely Boolean specifications by (1) substituting theory literals by Boolean variables, and (2) computing an additional Boolean requirement that captures the dependencies between the new variables… ▽ More In this paper, we address the problem of the (reactive) realizability of specifications of theories richer than Booleans, including arithmetic theories. Our approach transforms theory specifications into purely Boolean specifications by (1) substituting theory literals by Boolean variables, and (2) computing an additional Boolean requirement that captures the dependencies between the new variables imposed by the literals. The resulting specification can be passed to existing Boolean off-the-shelf realizability tools, and is realizable if and only if the original specification is realizable. The first contribution is a brute-force version of our method, which requires a number of SMT queries that is doubly exponential in the number of input literals. Then, we present a faster method that exploits a nested encoding of the search for the extra requirement and uses SAT solving for faster traversing the search space and uses SMT queries internally. Another contribution is a prototype in Z3-Python. Finally, we report an empirical evaluation using specifications inspired in real industrial cases. To the best of our knowledge, this is the first method that succeeds in non-Boolean LTL realizability. △ Less

Submitted 26 October, 2023; originally announced October 2023.

arXiv:2310.11569

When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting

Authors: Harshavardhan Kamarthi, Lingkai Kong, Alexander Rodríguez, Chao Zhang, B. Aditya Prakash

Abstract: Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting, where the goal is to model and forecast multivariate time-series that have underlying hierarchical relations. Most methods focus on point predictions and do not provide well-calibrated probabilistic forecasts distributions. Recent state-of-art probabilistic forecasting methods also impose hierarc… ▽ More Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting, where the goal is to model and forecast multivariate time-series that have underlying hierarchical relations. Most methods focus on point predictions and do not provide well-calibrated probabilistic forecasts distributions. Recent state-of-art probabilistic forecasting methods also impose hierarchical relations on point predictions and samples of distribution which does not account for coherency of forecast distributions. Previous works also silently assume that datasets are always consistent with given hierarchical relations and do not adapt to real-world datasets that show deviation from this assumption. We close both these gap and propose PROFHiT, which is a fully probabilistic hierarchical forecasting model that jointly models forecast distribution of entire hierarchy. PROFHiT uses a flexible probabilistic Bayesian approach and introduces a novel Distributional Coherency regularization to learn from hierarchical relations for entire forecast distribution that enables robust and calibrated forecasts as well as adapt to datasets of varying hierarchical consistency. On evaluating PROFHiT over wide range of datasets, we observed 41-88% better performance in accuracy and significantly better calibration. Due to modeling the coherency over full distribution, we observed that PROFHiT can robustly provide reliable forecasts even if up to 10% of input time-series data is missing where other methods' performance severely degrade by over 70%. △ Less

Submitted 19 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: I have submitted this paper as a revision of arXiv:2206.07940. Apologies for the confusion

arXiv:2310.10537 [pdf, other]

Microscaling Data Formats for Deep Learning

Authors: Bita Darvish Rouhani, Ritchie Zhao, Ankit More, Mathew Hall, Alireza Khodamoradi, Summer Deng, Dhruv Choudhary, Marius Cornea, Eric Dellinger, Kristof Denolf, Stosic Dusan, Venmugil Elango, Maximilian Golub, Alexander Heinecke, Phil James-Roxby, Dharmesh Jani, Gaurav Kolhe, Martin Langhammer, Ada Li, Levi Melnick, Maral Mesmakhosroshahi, Andres Rodriguez, Michael Schulte, Rasoul Shafipour, Lei Shao , et al. (8 additional authors not shown)

Abstract: Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements. MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical result… ▽ More Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements. MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical results on over two dozen benchmarks demonstrate practicality of MX data formats as a drop-in replacement for baseline FP32 for AI inference and training with low user friction. We also show the first instance of training generative language models at sub-8-bit weights, activations, and gradients with minimal accuracy loss and no modifications to the training recipe. △ Less

Submitted 19 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.09398 [pdf, other]

doi 10.1073/pnas.2220558120

An In-Depth Examination of Requirements for Disclosure Risk Assessment

Authors: Ron S. Jarmin, John M. Abowd, Robert Ashmead, Ryan Cumings-Menon, Nathan Goldschlag, Michael B. Hawes, Sallie Ann Keller, Daniel Kifer, Philip Leclerc, Jerome P. Reiter, Rolando A. Rodríguez, Ian Schmutte, Victoria A. Velkoff, Pavel Zhuravlev

Abstract: The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be bas… ▽ More The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near-term, the counterfactual approach appears best-suited for privacy-utility analysis. △ Less

Submitted 13 October, 2023; originally announced October 2023.

Comments: 47 pages, 1 table

Journal ref: PNAS, October 13, 2023, Vol. 120, No. 43

arXiv:2310.09003 [pdf, other]

μ-DDRL: A QoS-Aware Distributed Deep Reinforcement Learning Technique for Service Offloading in Fog computing Environments

Authors: Mohammad Goudarzi, Maria A. Rodriguez, Majid Sarvi, Rajkumar Buyya

Abstract: Fog and Edge computing extend cloud services to the proximity of end users, allowing many Internet of Things (IoT) use cases, particularly latency-critical applications. Smart devices, such as traffic and surveillance cameras, often do not have sufficient resources to process computation-intensive and latency-critical services. Hence, the constituent parts of services can be offloaded to nearby Ed… ▽ More Fog and Edge computing extend cloud services to the proximity of end users, allowing many Internet of Things (IoT) use cases, particularly latency-critical applications. Smart devices, such as traffic and surveillance cameras, often do not have sufficient resources to process computation-intensive and latency-critical services. Hence, the constituent parts of services can be offloaded to nearby Edge/Fog resources for processing and storage. However, making offloading decisions for complex services in highly stochastic and dynamic environments is an important, yet difficult task. Recently, Deep Reinforcement Learning (DRL) has been used in many complex service offloading problems; however, existing techniques are most suitable for centralized environments, and their convergence to the best-suitable solutions is slow. In addition, constituent parts of services often have predefined data dependencies and quality of service constraints, which further intensify the complexity of service offloading. To solve these issues, we propose a distributed DRL technique following the actor-critic architecture based on Asynchronous Proximal Policy Optimization (APPO) to achieve efficient and diverse distributed experience trajectory generation. Also, we employ PPO clipping and V-trace techniques for off-policy correction for faster convergence to the most suitable service offloading solutions. The results obtained demonstrate that our technique converges quickly, offers high scalability and adaptability, and outperforms its counterparts by improving the execution time of heterogeneous services. △ Less

Submitted 13 October, 2023; originally announced October 2023.

arXiv:2310.07904 [pdf, other]

From Realizability Modulo Theories to Synthesis Modulo Theories Part 1: Dynamic approach

Authors: Andoni Rodríguez, Cesar Sanchez

Abstract: Reactive synthesis is the process of using temporal logic specifications in LTL to generate correct controllers, but its use has been restricted to Boolean specifications. Recently, a Boolean abstraction technique allows to translate LTL T specifications that contain literals in theories into equi-realizable LTL specifications. However, no synthesis procedure exists yet. In synthesis modulo theori… ▽ More Reactive synthesis is the process of using temporal logic specifications in LTL to generate correct controllers, but its use has been restricted to Boolean specifications. Recently, a Boolean abstraction technique allows to translate LTL T specifications that contain literals in theories into equi-realizable LTL specifications. However, no synthesis procedure exists yet. In synthesis modulo theories, the system to synthesize receives valuations of environment variables in a first-order theory T and outputs valuations of system variables from T . In this paper, we address how to syntheize a full controller using a combination of the static Boolean controller obtained from the Booleanized LTL specification together with dynamic queries to a solver that produces models of a satisfiable existential formulae from T . This is the first method that realizes reactive synthesis modulo theories. △ Less

Submitted 11 October, 2023; originally announced October 2023.

arXiv:2310.06077 [pdf, other]

Performative Time-Series Forecasting

Authors: Zhiyuan Zhao, Alexander Rodriguez, B. Aditya Prakash

Abstract: Time-series forecasting is a critical challenge in various domains and has witnessed substantial progress in recent years. Many real-life scenarios, such as public health, economics, and social applications, involve feedback loops where predictions can influence the predicted outcome, subsequently altering the target variable's distribution. This phenomenon, known as performativity, introduces the… ▽ More Time-series forecasting is a critical challenge in various domains and has witnessed substantial progress in recent years. Many real-life scenarios, such as public health, economics, and social applications, involve feedback loops where predictions can influence the predicted outcome, subsequently altering the target variable's distribution. This phenomenon, known as performativity, introduces the potential for 'self-negating' or 'self-fulfilling' predictions. Despite extensive studies in classification problems across domains, performativity remains largely unexplored in the context of time-series forecasting from a machine-learning perspective. In this paper, we formalize performative time-series forecasting (PeTS), addressing the challenge of accurate predictions when performativity-induced distribution shifts are possible. We propose a novel approach, Feature Performative-Shifting (FPS), which leverages the concept of delayed response to anticipate distribution shifts and subsequently predicts targets accordingly. We provide theoretical insights suggesting that FPS can potentially lead to reduced generalization error. We conduct comprehensive experiments using multiple time-series models on COVID-19 and traffic forecasting tasks. The results demonstrate that FPS consistently outperforms conventional time-series forecasting methods, highlighting its efficacy in handling performativity-induced challenges. △ Less

Submitted 9 October, 2023; originally announced October 2023.

Comments: 12 pages (7 main text, 2 reference, 3 appendix), 3 figures, 4 tables

arXiv:2310.00798 [pdf, other]

Object manipulation through contact configuration regulation: multiple and intermittent contacts

Authors: Orion Taylor, Neel Doshi, Alberto Rodriguez

Abstract: In this work, we build on our method for manipulating unknown objects via contact configuration regulation: the estimation and control of the location, geometry, and mode of all contacts between the robot, object, and environment. We further develop our estimator and controller to enable manipulation through more complex contact interactions, including intermittent contact between the robot/object… ▽ More In this work, we build on our method for manipulating unknown objects via contact configuration regulation: the estimation and control of the location, geometry, and mode of all contacts between the robot, object, and environment. We further develop our estimator and controller to enable manipulation through more complex contact interactions, including intermittent contact between the robot/object, and multiple contacts between the object/environment. In addition, we support a larger set of contact geometries at each interface. This is accomplished through a factor graph based estimation framework that reasons about the complementary kinematic and wrench constraints of contact to predict the current contact configuration. We are aided by the incorporation of a limited amount of visual feedback; which when combined with the available F/T sensing and robot proprioception, allows us to differentiate contact modes that were previously indistinguishable. We implement this revamped framework on our manipulation platform, and demonstrate that it allows the robot to perform a wider set of manipulation tasks. This includes, using a wall as a support to re-orient an object, or regulating the contact geometry between the object and the ground. Finally, we conduct ablation studies to understand the contributions from visual and tactile feedback in our manipulation framework. Our code can be found at: https://github.com/mcubelab/pbal. △ Less

Submitted 1 October, 2023; originally announced October 2023.

arXiv:2309.09943 [pdf, other]

Property Graphs in Arachne

Authors: Oliver Alvarado Rodriguez, Fernando Vera Buschmann, Zhihui Du, David A. Bader

Abstract: Analyzing large-scale graphs poses challenges due to their increasing size and the demand for interactive and user-friendly analytics tools. These graphs arise from various domains, including cybersecurity, social sciences, health sciences, and network sciences, where networks can represent interactions between humans, neurons in the brain, or malicious flows in a network. Exploring these large gr… ▽ More Analyzing large-scale graphs poses challenges due to their increasing size and the demand for interactive and user-friendly analytics tools. These graphs arise from various domains, including cybersecurity, social sciences, health sciences, and network sciences, where networks can represent interactions between humans, neurons in the brain, or malicious flows in a network. Exploring these large graphs is crucial for revealing hidden structures and metrics that are not easily computable without parallel computing. Currently, Python users can leverage the open-source Arkouda framework to efficiently execute Pandas and NumPy-related tasks on thousands of cores. To address large-scale graph analysis, Arachne, an extension to Arkouda, enables easy transformation of Arkouda dataframes into graphs. This paper proposes and evaluates three distributable data structures for property graphs, implemented in Chapel, that are integrated into Arachne. Enriching Arachne with support for property graphs will empower data scientists to extend their analysis to new problem domains. Property graphs present additional complexities, requiring efficient storage for extra information on vertices and edges, such as labels, relationships, and properties. △ Less

Submitted 18 September, 2023; originally announced September 2023.

Comments: The 27th Annual IEEE High Performance Extreme Computing Conference (HPEC), Virtual, September 25-29, 2023

arXiv:2309.09637 [pdf, other]

Designing a Hybrid Neural System to Learn Real-world Crack Segmentation from Fractal-based Simulation

Authors: Achref Jaziri, Martin Mundt, Andres Fernandez Rodriguez, Visvanathan Ramesh

Abstract: Identification of cracks is essential to assess the structural integrity of concrete infrastructure. However, robust crack segmentation remains a challenging task for computer vision systems due to the diverse appearance of concrete surfaces, variable lighting and weather conditions, and the overlapping of different defects. In particular recent data-driven methods struggle with the limited availa… ▽ More Identification of cracks is essential to assess the structural integrity of concrete infrastructure. However, robust crack segmentation remains a challenging task for computer vision systems due to the diverse appearance of concrete surfaces, variable lighting and weather conditions, and the overlapping of different defects. In particular recent data-driven methods struggle with the limited availability of data, the fine-grained and time-consuming nature of crack annotation, and face subsequent difficulty in generalizing to out-of-distribution samples. In this work, we move past these challenges in a two-fold way. We introduce a high-fidelity crack graphics simulator based on fractals and a corresponding fully-annotated crack dataset. We then complement the latter with a system that learns generalizable representations from simulation, by leveraging both a pointwise mutual information estimate along with adaptive instance normalization as inductive biases. Finally, we empirically highlight how different design choices are symbiotic in bridging the simulation to real gap, and ultimately demonstrate that our introduced system can effectively handle real-world crack segmentation. △ Less

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2309.07195 [pdf, other]

Diffusion models for audio semantic communication

Authors: Eleonora Grassucci, Christian Marinoni, Andrea Rodriguez, Danilo Comminiello

Abstract: Directly sending audio signals from a transmitter to a receiver across a noisy channel may absorb consistent bandwidth and be prone to errors when trying to recover the transmitted bits. On the contrary, the recent semantic communication approach proposes to send the semantics and then regenerate semantically consistent content at the receiver without exactly recovering the bitstream. In this pape… ▽ More Directly sending audio signals from a transmitter to a receiver across a noisy channel may absorb consistent bandwidth and be prone to errors when trying to recover the transmitted bits. On the contrary, the recent semantic communication approach proposes to send the semantics and then regenerate semantically consistent content at the receiver without exactly recovering the bitstream. In this paper, we propose a generative audio semantic communication framework that faces the communication problem as an inverse problem, therefore being robust to different corruptions. Our method transmits lower-dimensional representations of the audio signal and of the associated semantics to the receiver, which generates the corresponding signal with a particular focus on its meaning (i.e., the semantics) thanks to the conditional diffusion model at its core. During the generation process, the diffusion model restores the received information from multiple degradations at the same time including corruption noise and missing parts caused by the transmission over the noisy channel. We show that our framework outperforms competitors in a real-world scenario and with different channel conditions. Visit the project page to listen to samples and access the code: https://ispamm.github.io/diffusion-audio-semantic-communication/. △ Less

Submitted 13 September, 2023; originally announced September 2023.

Comments: Submitted to IEEE ICASSP 2024

Showing 1–50 of 259 results for author: Rodriguez, A