Search | arXiv e-print repository

Teaching Transformers Causal Reasoning through Axiomatic Training

Authors: Aniket Vashishtha, Abhinav Kumar, Abbavaram Gowtham Reddy, Vineeth N Balasubramanian, Amit Sharma

Abstract: For text-based AI systems to interact in the real world, causal reasoning is an essential skill. Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. Specifically, we consider an axiomatic training setup where an agent learns from multiple demonstrations of a causal axiom (or rule), rather than incorporating the axiom as an… ▽ More For text-based AI systems to interact in the real world, causal reasoning is an essential skill. Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. Specifically, we consider an axiomatic training setup where an agent learns from multiple demonstrations of a causal axiom (or rule), rather than incorporating the axiom as an inductive bias or inferring it from data values. A key question is whether the agent would learn to generalize from the axiom demonstrations to new scenarios. For example, if a transformer model is trained on demonstrations of the causal transitivity axiom over small graphs, would it generalize to applying the transitivity axiom over large graphs? Our results, based on a novel axiomatic training scheme, indicate that such generalization is possible. We consider the task of inferring whether a variable causes another variable, given a causal graph structure. We find that a 67 million parameter transformer model, when trained on linear causal chains (along with some noisy variations) can generalize well to new kinds of graphs, including longer causal chains, causal chains with reversed order, and graphs with branching; even when it is not explicitly trained for such settings. Our model performs at par (or even better) than many larger language models such as GPT-4, Gemini Pro, and Phi-3. Overall, our axiomatic training framework provides a new paradigm of learning causal reasoning from passive data that can be used to learn arbitrary axioms, as long as sufficient demonstrations can be generated. △ Less

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2405.18547 [pdf, ps, other]

User Perception of CAPTCHAs: A Comparative Study between University and Internet Users

Authors: Arun Reddy, Yuan Cheng

Abstract: CAPTCHAs are commonly used to distinguish between human and bot users on the web. However, despite having various types of CAPTCHAs, there are still concerns about their security and usability. To address these concerns, we surveyed over 250 participants from a university campus and Amazon Mechanical Turk. Our goal was to gather user perceptions regarding the security and usability of current CAPT… ▽ More CAPTCHAs are commonly used to distinguish between human and bot users on the web. However, despite having various types of CAPTCHAs, there are still concerns about their security and usability. To address these concerns, we surveyed over 250 participants from a university campus and Amazon Mechanical Turk. Our goal was to gather user perceptions regarding the security and usability of current CAPTCHA implementations. After analyzing the data using statistical and thematic methods, we found that users struggle to navigate current CAPTCHA challenges due to increasing difficulty levels. As a result, they experience frustration, which negatively impacts their user experience. Additionally, participants expressed concerns about the reliability and security of these systems. Our findings can offer valuable insights for creating more secure and user-friendly CAPTCHA technologies. △ Less

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.07921 [pdf, other]

Can Better Text Semantics in Prompt Tuning Improve VLM Generalization?

Authors: Hari Chandana Kuchibhotla, Sai Srinivas Kancheti, Abbavaram Gowtham Reddy, Vineeth N Balasubramanian

Abstract: Going beyond mere fine-tuning of vision-language models (VLMs), learnable prompt tuning has emerged as a promising, resource-efficient alternative. Despite their potential, effectively learning prompts faces the following challenges: (i) training in a low-shot scenario results in overfitting, limiting adaptability, and yielding weaker performance on newer classes or datasets; (ii) prompt-tuning's… ▽ More Going beyond mere fine-tuning of vision-language models (VLMs), learnable prompt tuning has emerged as a promising, resource-efficient alternative. Despite their potential, effectively learning prompts faces the following challenges: (i) training in a low-shot scenario results in overfitting, limiting adaptability, and yielding weaker performance on newer classes or datasets; (ii) prompt-tuning's efficacy heavily relies on the label space, with decreased performance in large class spaces, signaling potential gaps in bridging image and class concepts. In this work, we investigate whether better text semantics can help address these concerns. In particular, we introduce a prompt-tuning method that leverages class descriptions obtained from Large Language Models (LLMs). These class descriptions are used to bridge image and text modalities. Our approach constructs part-level description-guided image and text features, which are subsequently aligned to learn more generalizable prompts. Our comprehensive experiments conducted across 11 benchmark datasets show that our method outperforms established methods, demonstrating substantial improvements. △ Less

Submitted 20 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

arXiv:2404.15760 [pdf, other]

Debiasing Machine Unlearning with Counterfactual Examples

Authors: Ziheng Chen, Jia Wang, Jun Zhuang, Abbavaram Gowtham Reddy, Fabrizio Silvestri, Jin Huang, Kaushiki Nag, Kun Kuang, Xin Ning, Gabriele Tolomei

Abstract: The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine-learning techniques. These techniques facilitate the deletion of previously acquired knowledge without requiring extensive model retraining. However, they often overlook a critical issue: unlearning processes bias. This bias emerges from two main sources: (1… ▽ More The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine-learning techniques. These techniques facilitate the deletion of previously acquired knowledge without requiring extensive model retraining. However, they often overlook a critical issue: unlearning processes bias. This bias emerges from two main sources: (1) data-level bias, characterized by uneven data removal, and (2) algorithm-level bias, which leads to the contamination of the remaining dataset, thereby degrading model accuracy. In this work, we analyze the causal factors behind the unlearning process and mitigate biases at both data and algorithmic levels. Typically, we introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset. Besides, we guide the forgetting procedure by leveraging counterfactual examples, as they maintain semantic data consistency without hurting performance on the remaining dataset. Experimental results demonstrate that our method outperforms existing machine unlearning baselines on evaluation metrics. △ Less

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2403.07540 [pdf, other]

WannaLaugh: A Configurable Ransomware Emulator -- Learning to Mimic Malicious Storage Traces

Authors: Dionysios Diamantopoulos, Roman Pletka, Slavisa Sarafijanovic, A. L. Narasimha Reddy, Haris Pozidis

Abstract: Ransomware, a fearsome and rapidly evolving cybersecurity threat, continues to inflict severe consequences on individuals and organizations worldwide. Traditional detection methods, reliant on static signatures and application behavioral patterns, are challenged by the dynamic nature of these threats. This paper introduces three primary contributions to address this challenge. First, we introduce… ▽ More Ransomware, a fearsome and rapidly evolving cybersecurity threat, continues to inflict severe consequences on individuals and organizations worldwide. Traditional detection methods, reliant on static signatures and application behavioral patterns, are challenged by the dynamic nature of these threats. This paper introduces three primary contributions to address this challenge. First, we introduce a ransomware emulator. This tool is designed to safely mimic ransomware attacks without causing actual harm or spreading malware, making it a unique solution for studying ransomware behavior. Second, we demonstrate how we use this emulator to create storage I/O traces. These traces are then utilized to train machine-learning models. Our results show that these models are effective in detecting ransomware, highlighting the practical application of our emulator in developing responsible cybersecurity tools. Third, we show how our emulator can be used to mimic the I/O behavior of existing ransomware thereby enabling safe trace collection. Both the emulator and its application represent significant steps forward in ransomware detection in the era of machine-learning-driven cybersecurity. △ Less

Submitted 12 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2312.02914 [pdf, other]

Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training

Authors: Arun Reddy, William Paul, Corban Rivera, Ketul Shah, Celso M. de Melo, Rama Chellappa

Abstract: In this work, we tackle the problem of unsupervised domain adaptation (UDA) for video action recognition. Our approach, which we call UNITE, uses an image teacher model to adapt a video student model to the target domain. UNITE first employs self-supervised pre-training to promote discriminative feature learning on target domain videos using a teacher-guided masked distillation objective. We then… ▽ More In this work, we tackle the problem of unsupervised domain adaptation (UDA) for video action recognition. Our approach, which we call UNITE, uses an image teacher model to adapt a video student model to the target domain. UNITE first employs self-supervised pre-training to promote discriminative feature learning on target domain videos using a teacher-guided masked distillation objective. We then perform self-training on masked target data, using the video student model and image teacher model together to generate improved pseudolabels for unlabeled target videos. Our self-training process successfully leverages the strengths of both models to achieve strong transfer performance across domains. We evaluate our approach on multiple video domain adaptation benchmarks and observe significant improvements upon previously reported results. △ Less

Submitted 20 April, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: Accepted at CVPR 2024. 13 pages, 4 figures. Approved for public release: distribution unlimited

arXiv:2310.15117 [pdf, other]

Causal Inference Using LLM-Guided Discovery

Authors: Aniket Vashishtha, Abbavaram Gowtham Reddy, Abhinav Kumar, Saketh Bachu, Vineeth N Balasubramanian, Amit Sharma

Abstract: At the core of causal inference lies the challenge of determining reliable causal graphs solely based on observational data. Since the well-known backdoor criterion depends on the graph, any errors in the graph can propagate downstream to effect inference. In this work, we initially show that complete graph information is not necessary for causal effect inference; the topological order over graph… ▽ More At the core of causal inference lies the challenge of determining reliable causal graphs solely based on observational data. Since the well-known backdoor criterion depends on the graph, any errors in the graph can propagate downstream to effect inference. In this work, we initially show that complete graph information is not necessary for causal effect inference; the topological order over graph variables (causal order) alone suffices. Further, given a node pair, causal order is easier to elicit from domain experts compared to graph edges since determining the existence of an edge can depend extensively on other variables. Interestingly, we find that the same principle holds for Large Language Models (LLMs) such as GPT-3.5-turbo and GPT-4, motivating an automated method to obtain causal order (and hence causal effect) with LLMs acting as virtual domain experts. To this end, we employ different prompting strategies and contextual cues to propose a robust technique of obtaining causal order from LLMs. Acknowledging LLMs' limitations, we also study possible techniques to integrate LLMs with established causal discovery algorithms, including constraint-based and score-based methods, to enhance their performance. Extensive experiments demonstrate that our approach significantly improves causal ordering accuracy as compared to discovery algorithms, highlighting the potential of LLMs to enhance causal inference across diverse fields. △ Less

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2309.01030 [pdf, other]

Online Adaptive Mahalanobis Distance Estimation

Authors: Lianke Qin, Aravind Reddy, Zhao Song

Abstract: Mahalanobis metrics are widely used in machine learning in conjunction with methods like $k$-nearest neighbors, $k$-means clustering, and $k$-medians clustering. Despite their importance, there has not been any prior work on applying sketching techniques to speed up algorithms for Mahalanobis metrics. In this paper, we initiate the study of dimension reduction for Mahalanobis metrics. In particula… ▽ More Mahalanobis metrics are widely used in machine learning in conjunction with methods like $k$-nearest neighbors, $k$-means clustering, and $k$-medians clustering. Despite their importance, there has not been any prior work on applying sketching techniques to speed up algorithms for Mahalanobis metrics. In this paper, we initiate the study of dimension reduction for Mahalanobis metrics. In particular, we provide efficient data structures for solving the Approximate Distance Estimation (ADE) problem for Mahalanobis distances. We first provide a randomized Monte Carlo data structure. Then, we show how we can adapt it to provide our main data structure which can handle sequences of \textit{adaptive} queries and also online updates to both the Mahalanobis metric matrix and the data points, making it amenable to be used in conjunction with prior algorithms for online learning of Mahalanobis metrics. △ Less

Submitted 29 December, 2023; v1 submitted 2 September, 2023; originally announced September 2023.

Comments: IEEE BigData 2023

arXiv:2305.18183 [pdf, other]

On Counterfactual Data Augmentation Under Confounding

Authors: Abbavaram Gowtham Reddy, Saketh Bachu, Saloni Dash, Charchit Sharma, Amit Sharma, Vineeth N Balasubramanian

Abstract: Counterfactual data augmentation has recently emerged as a method to mitigate confounding biases in the training data. These biases, such as spurious correlations, arise due to various observed and unobserved confounding variables in the data generation process. In this paper, we formally analyze how confounding biases impact downstream classifiers and present a causal viewpoint to the solutions b… ▽ More Counterfactual data augmentation has recently emerged as a method to mitigate confounding biases in the training data. These biases, such as spurious correlations, arise due to various observed and unobserved confounding variables in the data generation process. In this paper, we formally analyze how confounding biases impact downstream classifiers and present a causal viewpoint to the solutions based on counterfactual data augmentation. We explore how removing confounding biases serves as a means to learn invariant features, ultimately aiding in generalization beyond the observed data distribution. Additionally, we present a straightforward yet powerful algorithm for generating counterfactual images, which effectively mitigates the influence of confounding effects on downstream classifiers. Through experiments on MNIST variants and the CelebA datasets, we demonstrate how our simple augmentation method helps existing state-of-the-art methods achieve good results. △ Less

Submitted 21 November, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

arXiv:2305.17300 [pdf, other]

Exploiting Large Neuroimaging Datasets to Create Connectome-Constrained Approaches for more Robust, Efficient, and Adaptable Artificial Intelligence

Authors: Erik C. Johnson, Brian S. Robinson, Gautam K. Vallabha, Justin Joyce, Jordan K. Matelsky, Raphael Norman-Tenazas, Isaac Western, Marisel Villafañe-Delgado, Martha Cervantes, Michael S. Robinette, Arun V. Reddy, Lindsey Kitchell, Patricia K. Rivlin, Elizabeth P. Reilly, Nathan Drenkow, Matthew J. Roos, I-Jeng Wang, Brock A. Wester, William R. Gray-Roncal, Joan A. Hoffmann

Abstract: Despite the progress in deep learning networks, efficient learning at the edge (enabling adaptable, low-complexity machine learning solutions) remains a critical need for defense and commercial applications. We envision a pipeline to utilize large neuroimaging datasets, including maps of the brain which capture neuron and synapse connectivity, to improve machine learning approaches. We have pursue… ▽ More Despite the progress in deep learning networks, efficient learning at the edge (enabling adaptable, low-complexity machine learning solutions) remains a critical need for defense and commercial applications. We envision a pipeline to utilize large neuroimaging datasets, including maps of the brain which capture neuron and synapse connectivity, to improve machine learning approaches. We have pursued different approaches within this pipeline structure. First, as a demonstration of data-driven discovery, the team has developed a technique for discovery of repeated subcircuits, or motifs. These were incorporated into a neural architecture search approach to evolve network architectures. Second, we have conducted analysis of the heading direction circuit in the fruit fly, which performs fusion of visual and angular velocity features, to explore augmenting existing computational models with new insight. Our team discovered a novel pattern of connectivity, implemented a new model, and demonstrated sensor fusion on a robotic platform. Third, the team analyzed circuitry for memory formation in the fruit fly connectome, enabling the design of a novel generative replay approach. Finally, the team has begun analysis of connectivity in mammalian cortex to explore potential improvements to transformer networks. These constraints increased network robustness on the most challenging examples in the CIFAR-10-C computer vision robustness benchmark task, while reducing learnable attention parameters by over an order of magnitude. Taken together, these results demonstrate multiple potential approaches to utilize insight from neural systems for developing robust and efficient machine learning techniques. △ Less

Submitted 26 May, 2023; originally announced May 2023.

Comments: 11 pages, 4 figures

arXiv:2304.06934 [pdf]

Classification of social media Toxic comments using Machine learning models

Authors: K. Poojitha, A. Sai Charish, M. Arun Kuamr Reddy, S. Ayyasamy

Abstract: The abstract outlines the problem of toxic comments on social media platforms, where individuals use disrespectful, abusive, and unreasonable language that can drive users away from discussions. This behavior is referred to as anti-social behavior, which occurs during online debates, comments, and fights. The comments containing explicit language can be classified into various categories, such as… ▽ More The abstract outlines the problem of toxic comments on social media platforms, where individuals use disrespectful, abusive, and unreasonable language that can drive users away from discussions. This behavior is referred to as anti-social behavior, which occurs during online debates, comments, and fights. The comments containing explicit language can be classified into various categories, such as toxic, severe toxic, obscene, threat, insult, and identity hate. This behavior leads to online harassment and cyberbullying, which forces individuals to stop expressing their opinions and ideas. To protect users from offensive language, companies have started flagging comments and blocking users. The abstract proposes to create a classifier using an Lstm-cnn model that can differentiate between toxic and non-toxic comments with high accuracy. The classifier can help organizations examine the toxicity of the comment section better. △ Less

Submitted 14 April, 2023; originally announced April 2023.

arXiv:2303.13850 [pdf, other]

Towards Learning and Explaining Indirect Causal Effects in Neural Networks

Authors: Abbavaram Gowtham Reddy, Saketh Bachu, Harsharaj Pathak, Benin L Godfrey, Vineeth N. Balasubramanian, Varshaneya V, Satya Narayanan Kar

Abstract: Recently, there has been a growing interest in learning and explaining causal effects within Neural Network (NN) models. By virtue of NN architectures, previous approaches consider only direct and total causal effects assuming independence among input variables. We view an NN as a structural causal model (SCM) and extend our focus to include indirect causal effects by introducing feedforward conne… ▽ More Recently, there has been a growing interest in learning and explaining causal effects within Neural Network (NN) models. By virtue of NN architectures, previous approaches consider only direct and total causal effects assuming independence among input variables. We view an NN as a structural causal model (SCM) and extend our focus to include indirect causal effects by introducing feedforward connections among input neurons. We propose an ante-hoc method that captures and maintains direct, indirect, and total causal effects during NN model training. We also propose an algorithm for quantifying learned causal effects in an NN model and efficient approximation strategies for quantifying causal effects in high-dimensional data. Extensive experiments conducted on synthetic and real-world datasets demonstrate that the causal effects learned by our ante-hoc method better approximate the ground truth effects compared to existing methods. △ Less

Submitted 8 January, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: AAAI 2024

arXiv:2303.10280 [pdf, other]

Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances

Authors: Arun V. Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil D. Katyal, Dinesh Manocha, Celso M. de Melo, Rama Chellappa

Abstract: Human action recognition is a challenging problem, particularly when there is high variability in factors such as subject appearance, backgrounds and viewpoint. While deep neural networks (DNNs) have been shown to perform well on action recognition tasks, they typically require large amounts of high-quality labeled data to achieve robust performance across a variety of conditions. Synthetic data h… ▽ More Human action recognition is a challenging problem, particularly when there is high variability in factors such as subject appearance, backgrounds and viewpoint. While deep neural networks (DNNs) have been shown to perform well on action recognition tasks, they typically require large amounts of high-quality labeled data to achieve robust performance across a variety of conditions. Synthetic data has shown promise as a way to avoid the substantial costs and potential ethical concerns associated with collecting and labeling enormous amounts of data in the real-world. However, synthetic data may differ from real data in important ways. This phenomenon, known as \textit{domain shift}, can limit the utility of synthetic data in robotics applications. To mitigate the effects of domain shift, substantial effort is being dedicated to the development of domain adaptation (DA) techniques. Yet, much remains to be understood about how best to develop these techniques. In this paper, we introduce a new dataset called Robot Control Gestures (RoCoG-v2). The dataset is composed of both real and synthetic videos from seven gesture classes, and is intended to support the study of synthetic-to-real domain shift for video-based action recognition. Our work expands upon existing datasets by focusing the action classes on gestures for human-robot teaming, as well as by enabling investigation of domain shift in both ground and aerial views. We present baseline results using state-of-the-art action recognition and domain adaptation algorithms and offer initial insight on tackling the synthetic-to-real and ground-to-air domain shifts. △ Less

Submitted 17 March, 2023; originally announced March 2023.

Comments: ICRA 2023. The first two authors contributed equally. Dataset available at: https://github.com/reddyav1/RoCoG-v2

arXiv:2303.08162 [pdf, other]

Artificial intelligence for artificial materials: moiré atom

Authors: Di Luo, Aidan P. Reddy, Trithep Devakul, Liang Fu

Abstract: Moiré engineering in atomically thin van der Waals heterostructures creates artificial quantum materials with designer properties. We solve the many-body problem of interacting electrons confined to a moiré superlattice potential minimum (the moiré atom) using a 2D fermionic neural network. We show that strong Coulomb interactions in combination with the anisotropic moiré potential lead to strikin… ▽ More Moiré engineering in atomically thin van der Waals heterostructures creates artificial quantum materials with designer properties. We solve the many-body problem of interacting electrons confined to a moiré superlattice potential minimum (the moiré atom) using a 2D fermionic neural network. We show that strong Coulomb interactions in combination with the anisotropic moiré potential lead to striking ``Wigner molecule" charge density distributions observable with scanning tunneling microscopy. △ Less

Submitted 26 March, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Report number: MIT-CTP/5534

arXiv:2212.11408 [pdf, ps, other]

Adaptive and Dynamic Multi-Resolution Hashing for Pairwise Summations

Authors: Lianke Qin, Aravind Reddy, Zhao Song, Zhaozhuo Xu, Danyang Zhuo

Abstract: In this paper, we propose Adam-Hash: an adaptive and dynamic multi-resolution hashing data-structure for fast pairwise summation estimation. Given a data-set $X \subset \mathbb{R}^d$, a binary function $f:\mathbb{R}^d\times \mathbb{R}^d\to \mathbb{R}$, and a point $y \in \mathbb{R}^d$, the Pairwise Summation Estimate $\mathrm{PSE}_X(y) := \frac{1}{|X|} \sum_{x \in X} f(x,y)$. For any given data-se… ▽ More In this paper, we propose Adam-Hash: an adaptive and dynamic multi-resolution hashing data-structure for fast pairwise summation estimation. Given a data-set $X \subset \mathbb{R}^d$, a binary function $f:\mathbb{R}^d\times \mathbb{R}^d\to \mathbb{R}$, and a point $y \in \mathbb{R}^d$, the Pairwise Summation Estimate $\mathrm{PSE}_X(y) := \frac{1}{|X|} \sum_{x \in X} f(x,y)$. For any given data-set $X$, we need to design a data-structure such that given any query point $y \in \mathbb{R}^d$, the data-structure approximately estimates $\mathrm{PSE}_X(y)$ in time that is sub-linear in $|X|$. Prior works on this problem have focused exclusively on the case where the data-set is static, and the queries are independent. In this paper, we design a hashing-based PSE data-structure which works for the more practical \textit{dynamic} setting in which insertions, deletions, and replacements of points are allowed. Moreover, our proposed Adam-Hash is also robust to adaptive PSE queries, where an adversary can choose query $q_j \in \mathbb{R}^d$ depending on the output from previous queries $q_1, q_2, \dots, q_{j-1}$. △ Less

Submitted 21 December, 2022; originally announced December 2022.

Comments: BigData 2022

arXiv:2211.15494 [pdf, other]

Automated Routing of Droplets for DNA Storage on a Digital Microfluidics Platform

Authors: Ajay Manicka, Andrew Stephan, Sriram Chari, Gemma Mendonsa, Peyton Okubo, John Stolzberg-Schray, Anil Reddy, Marc Riedel

Abstract: Technologies for sequencing (reading) and synthesizing (writing) DNA have progressed on a Moore's law-like trajectory over the last three decades. This has motivated the idea of using DNA for data storage. Theoretically, DNA-based storage systems could out-compete all existing forms of archival storage. However, a large gap exists between what is theoretically possible in terms of read and write s… ▽ More Technologies for sequencing (reading) and synthesizing (writing) DNA have progressed on a Moore's law-like trajectory over the last three decades. This has motivated the idea of using DNA for data storage. Theoretically, DNA-based storage systems could out-compete all existing forms of archival storage. However, a large gap exists between what is theoretically possible in terms of read and write speeds and what has been practically demonstrated with DNA. This paper introduces a novel approach to DNA storage, with automated assembly on a digital microfluidic biochip. This technology offers unprecedented parallelism in DNA assembly using a dual library of "symbols" and "linkers". An algorithmic solution is discussed for the problem of managing droplet traffic on the device, with prioritized three-dimensional "A*" routing. An overview is given of the software that was developed for routing a large number of droplets in parallel on the device, minimizing congestion and maximizing throughput. △ Less

Submitted 5 July, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: 15 pages, 23 figures. Submitted to the journal "Lab on a Chip" for publication by the Royal Society of Chemistry

arXiv:2211.04370 [pdf, other]

NESTER: An Adaptive Neurosymbolic Method for Causal Effect Estimation

Authors: Abbavaram Gowtham Reddy, Vineeth N Balasubramanian

Abstract: Causal effect estimation from observational data is a central problem in causal inference. Methods based on potential outcomes framework solve this problem by exploiting inductive biases and heuristics from causal inference. Each of these methods addresses a specific aspect of causal effect estimation, such as controlling propensity score, enforcing randomization, etc., by designing neural network… ▽ More Causal effect estimation from observational data is a central problem in causal inference. Methods based on potential outcomes framework solve this problem by exploiting inductive biases and heuristics from causal inference. Each of these methods addresses a specific aspect of causal effect estimation, such as controlling propensity score, enforcing randomization, etc., by designing neural network (NN) architectures and regularizers. In this paper, we propose an adaptive method called Neurosymbolic Causal Effect Estimator (NESTER), a generalized method for causal effect estimation. NESTER integrates the ideas used in existing methods based on multi-head NNs for causal effect estimation into one framework. We design a Domain Specific Language (DSL) tailored for causal effect estimation based on causal inductive biases used in literature. We conduct a theoretical analysis to investigate NESTER's efficacy in estimating causal effects. Our comprehensive empirical results show that NESTER performs better than state-of-the-art methods on benchmark datasets. △ Less

Submitted 8 January, 2024; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: AAAI 2024

arXiv:2210.12368 [pdf, other]

Counterfactual Generation Under Confounding

Authors: Abbavaram Gowtham Reddy, Saloni Dash, Amit Sharma, Vineeth N Balasubramanian

Abstract: A machine learning model, under the influence of observed or unobserved confounders in the training data, can learn spurious correlations and fail to generalize when deployed. For image classifiers, augmenting a training dataset using counterfactual examples has been empirically shown to break spurious correlations. However, the counterfactual generation task itself becomes more difficult as the l… ▽ More A machine learning model, under the influence of observed or unobserved confounders in the training data, can learn spurious correlations and fail to generalize when deployed. For image classifiers, augmenting a training dataset using counterfactual examples has been empirically shown to break spurious correlations. However, the counterfactual generation task itself becomes more difficult as the level of confounding increases. Existing methods for counterfactual generation under confounding consider a fixed set of interventions (e.g., texture, rotation) and are not flexible enough to capture diverse data-generating processes. Given a causal generative process, we formally characterize the adverse effects of confounding on any downstream tasks and show that the correlation between generative factors (attributes) can be used to quantitatively measure confounding between generative factors. To minimize such correlation, we propose a counterfactual generation method that learns to modify the value of any attribute in an image and generate new images given a set of observed attributes, even when the dataset is highly confounded. These counterfactual images are then used to regularize the downstream classifier such that the learned representations are the same across various generative factors conditioned on the class label. Our method is computationally efficient, simple to implement, and works well for any number of generative factors and confounding variables. Our experimental results on both synthetic (MNIST variants) and real-world (CelebA) datasets show the usefulness of our approach. △ Less

Submitted 10 December, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

arXiv:2210.11722 [pdf, other]

Adaptive re-calibration of channel-wise features for Adversarial Audio Classification

Authors: Vardhan Dongre, Abhinav Thimma Reddy, Nikhitha Reddeddy

Abstract: DeepFake Audio, unlike DeepFake images and videos, has been relatively less explored from detection perspective, and the solutions which exist for the synthetic speech classification either use complex networks or dont generalize to different varieties of synthetic speech obtained using different generative and optimization-based methods. Through this work, we propose a channel-wise recalibration… ▽ More DeepFake Audio, unlike DeepFake images and videos, has been relatively less explored from detection perspective, and the solutions which exist for the synthetic speech classification either use complex networks or dont generalize to different varieties of synthetic speech obtained using different generative and optimization-based methods. Through this work, we propose a channel-wise recalibration of features using attention feature fusion for synthetic speech detection and compare its performance against different detection methods including End2End models and Resnet-based models on synthetic speech generated using Text to Speech and Vocoder systems like WaveNet, WaveRNN, Tactotron, and WaveGlow. We also experiment with Squeeze Excitation (SE) blocks in our Resnet models and found that the combination was able to get better performance. In addition to the analysis, we also demonstrate that the combination of Linear frequency cepstral coefficients (LFCC) and Mel Frequency cepstral coefficients (MFCC) using the attentional feature fusion technique creates better input features representations which can help even simpler models generalize well on synthetic speech classification tasks. Our models (Resnet based using feature fusion) trained on Fake or Real (FoR) dataset and were able to achieve 95% test accuracy with the FoR data, and an average of 90% accuracy with samples we generated using different generative models after adapting this framework. △ Less

Submitted 21 October, 2022; originally announced October 2022.

Comments: 7 pages, 8 figures, 4 tables

arXiv:2210.03961 [pdf, ps, other]

Dynamic Tensor Product Regression

Authors: Aravind Reddy, Zhao Song, Lichen Zhang

Abstract: In this work, we initiate the study of \emph{Dynamic Tensor Product Regression}. One has matrices $A_1\in \mathbb{R}^{n_1\times d_1},\ldots,A_q\in \mathbb{R}^{n_q\times d_q}$ and a label vector $b\in \mathbb{R}^{n_1\ldots n_q}$, and the goal is to solve the regression problem with the design matrix $A$ being the tensor product of the matrices $A_1, A_2, \dots, A_q$ i.e.… ▽ More In this work, we initiate the study of \emph{Dynamic Tensor Product Regression}. One has matrices $A_1\in \mathbb{R}^{n_1\times d_1},\ldots,A_q\in \mathbb{R}^{n_q\times d_q}$ and a label vector $b\in \mathbb{R}^{n_1\ldots n_q}$, and the goal is to solve the regression problem with the design matrix $A$ being the tensor product of the matrices $A_1, A_2, \dots, A_q$ i.e. $\min_{x\in \mathbb{R}^{d_1\ldots d_q}}~\|(A_1\otimes \ldots\otimes A_q)x-b\|_2$. At each time step, one matrix $A_i$ receives a sparse change, and the goal is to maintain a sketch of the tensor product $A_1\otimes\ldots \otimes A_q$ so that the regression solution can be updated quickly. Recomputing the solution from scratch for each round is very slow and so it is important to develop algorithms which can quickly update the solution with the new design matrix. Our main result is a dynamic tree data structure where any update to a single matrix can be propagated quickly throughout the tree. We show that our data structure can be used to solve dynamic versions of not only Tensor Product Regression, but also Tensor Product Spline regression (which is a generalization of ridge regression) and for maintaining Low Rank Approximations for the tensor product. △ Less

Submitted 9 January, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022

arXiv:2209.13482 [pdf, other]

doi 10.1029/2022JA031183

Predicting Swarm Equatorial Plasma Bubbles via Machine Learning and Shapley Values

Authors: S. A. Reddy, C. Forsyth, A. Aruliah, A. Smith, J. Bortnik, E. Aa, D. O. Kataria, G. Lewis

Abstract: In this study we present AI Prediction of Equatorial Plasma Bubbles (APE), a machine learning model that can accurately predict the Ionospheric Bubble Index (IBI) on the Swarm spacecraft. IBI is a correlation ($R^2$) between perturbations in plasma density and the magnetic field, whose source can be Equatorial Plasma Bubbles (EPBs). EPBs have been studied for a number of years, but their day-to-da… ▽ More In this study we present AI Prediction of Equatorial Plasma Bubbles (APE), a machine learning model that can accurately predict the Ionospheric Bubble Index (IBI) on the Swarm spacecraft. IBI is a correlation ($R^2$) between perturbations in plasma density and the magnetic field, whose source can be Equatorial Plasma Bubbles (EPBs). EPBs have been studied for a number of years, but their day-to-day variability has made predicting them a considerable challenge. We build an ensemble machine learning model to predict IBI. We use data from 2014-22 at a resolution of 1sec, and transform it from a time-series into a 6-dimensional space with a corresponding EPB $R^2$ (0-1) acting as the label. APE performs well across all metrics, exhibiting a skill, association and root mean squared error score of 0.96, 0.98 and 0.08 respectively. The model performs best post-sunset, in the American/Atlantic sector, around the equinoxes, and when solar activity is high. This is promising because EPBs are most likely to occur during these periods. Shapley values reveal that F10.7 is the most important feature in driving the predictions, whereas latitude is the least. The analysis also examines the relationship between the features, which reveals new insights into EPB climatology. Finally, the selection of the features means that APE could be expanded to forecasting EPBs following additional investigations into their onset. △ Less

Submitted 30 September, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

Comments: 13 Pages, 9 Figures

Journal ref: Journal of Geophysical Research: Space Physics (2023): e2022JA031183

arXiv:2209.06358 [pdf, other]

Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset

Authors: Michael Chinen, Jan Skoglund, Chandan K A Reddy, Alessandro Ragano, Andrew Hines

Abstract: Non-reference speech quality models are important for a growing number of applications. The VoiceMOS 2022 challenge provided a dataset of synthetic voice conversion and text-to-speech samples with subjective labels. This study looks at the amount of variance that can be explained in subjective ratings of speech quality from metadata and the distribution imbalances of the dataset. Speech quality mo… ▽ More Non-reference speech quality models are important for a growing number of applications. The VoiceMOS 2022 challenge provided a dataset of synthetic voice conversion and text-to-speech samples with subjective labels. This study looks at the amount of variance that can be explained in subjective ratings of speech quality from metadata and the distribution imbalances of the dataset. Speech quality models were constructed using wav2vec 2.0 with additional metadata features that included rater groups and system identifiers and obtained competitive metrics including a Spearman rank correlation coefficient (SRCC) of 0.934 and MSE of 0.088 at the system-level, and 0.877 and 0.198 at the utterance-level. Using data and metadata that the test restricted or blinded further improved the metrics. A metadata analysis showed that the system-level metrics do not represent the model's system-level prediction as a result of the wide variation in the number of utterances used for each system on the validation and test datasets. We conclude that, in general, conditions should have enough utterances in the test set to bound the sample mean error, and be relatively balanced in utterance count between systems, otherwise the utterance-level metrics may be more reliable and interpretable. △ Less

Submitted 13 September, 2022; originally announced September 2022.

Comments: Preprint; accepted for Interspeech 2022

arXiv:2208.10737 [pdf, other]

Semi-Automatic Labeling and Semantic Segmentation of Gram-Stained Microscopic Images from DIBaS Dataset

Authors: Chethan Reddy G. P., Pullagurla Abhijith Reddy, Vidyashree R. Kanabur, Deepu Vijayasenan, Sumam S. David, Sreejith Govindan

Abstract: In this paper, a semi-automatic annotation of bacteria genera and species from DIBaS dataset is implemented using clustering and thresholding algorithms. A Deep learning model is trained to achieve the semantic segmentation and classification of the bacteria species. Classification accuracy of 95% is achieved. Deep learning models find tremendous applications in biomedical image processing. Automa… ▽ More In this paper, a semi-automatic annotation of bacteria genera and species from DIBaS dataset is implemented using clustering and thresholding algorithms. A Deep learning model is trained to achieve the semantic segmentation and classification of the bacteria species. Classification accuracy of 95% is achieved. Deep learning models find tremendous applications in biomedical image processing. Automatic segmentation of bacteria from gram-stained microscopic images is essential to diagnose respiratory and urinary tract infections, detect cancers, etc. Deep learning will aid the biologists to get reliable results in less time. Additionally, a lot of human intervention can be reduced. This work can be helpful to detect bacteria from urinary smear images, sputum smear images, etc to diagnose urinary tract infections, tuberculosis, pneumonia, etc. △ Less

Submitted 23 August, 2022; originally announced August 2022.

arXiv:2206.05750 [pdf, other]

Matching options to tasks using Option-Indexed Hierarchical Reinforcement Learning

Authors: Kushal Chauhan, Soumya Chatterjee, Akash Reddy, Balaraman Ravindran, Pradeep Shenoy

Abstract: The options framework in Hierarchical Reinforcement Learning breaks down overall goals into a combination of options or simpler tasks and associated policies, allowing for abstraction in the action space. Ideally, these options can be reused across different higher-level goals; indeed, such reuse is necessary to realize the vision of a continual learning agent that can effectively leverage its pri… ▽ More The options framework in Hierarchical Reinforcement Learning breaks down overall goals into a combination of options or simpler tasks and associated policies, allowing for abstraction in the action space. Ideally, these options can be reused across different higher-level goals; indeed, such reuse is necessary to realize the vision of a continual learning agent that can effectively leverage its prior experience. Previous approaches have only proposed limited forms of transfer of prelearned options to new task settings. We propose a novel option indexing approach to hierarchical learning (OI-HRL), where we learn an affinity function between options and the items present in the environment. This allows us to effectively reuse a large library of pretrained options, in zero-shot generalization at test time, by restricting goal-directed learning to only those options relevant to the task at hand. We develop a meta-training loop that learns the representations of options and environments over a series of HRL problems, by incorporating feedback about the relevance of retrieved options to the higher-level goal. We evaluate OI-HRL in two simulated settings - the CraftWorld and AI2THOR environments - and show that we achieve performance competitive with oracular baselines, and substantial gains over a baseline that has the entire option pool available for learning the hierarchical policy. △ Less

Submitted 12 June, 2022; originally announced June 2022.

Comments: 10 pages, 4 figures

arXiv:2205.09973 [pdf, other]

An In-Pipe Inspection Robot With Sensorless Underactuated Magnets and Omnidirectional Tracks: Design and Implementation

Authors: Gaurav Saha, K. M. Santosh, Anugu Reddy, Ravi Kant, Arjun Kumar

Abstract: This paper presents the plan of an in-pipe climbing robot that works utilizing an astute transmission part to investigate complex relationship of lines. Standard wheeled/proceeded in-pipe climbing robots are inclined to slip and take while investigating in pipe turns. The instrument helps in accomplishing the main inevitable result of getting out slip and drag in the robot tracks during advancemen… ▽ More This paper presents the plan of an in-pipe climbing robot that works utilizing an astute transmission part to investigate complex relationship of lines. Standard wheeled/proceeded in-pipe climbing robots are inclined to slip and take while investigating in pipe turns. The instrument helps in accomplishing the main inevitable result of getting out slip and drag in the robot tracks during advancement. The proposed transmission appreciates the practical furthest reaches of the standard two-yield transmission, which is developed the basic time for a transmission with three results. The instrument conclusively changes the track speeds of the robot considering the powers applied on each track inside the line relationship, by getting out the essential for any remarkable control. The amusement of the robot crossing in the line network in various orientation and in pipe-turns without slip shows the proposed game plan's adequacy. △ Less

Submitted 20 May, 2022; originally announced May 2022.

Comments: 6 pages, 2 figures. arXiv admin note: text overlap with arXiv:2201.07865

arXiv:2202.04073 [pdf]

The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.5M Screening and Diagnostic Mammograms

Authors: Jiwoong J. Jeong, Brianna L. Vey, Ananth Reddy, Thomas Kim, Thiago Santos, Ramon Correa, Raman Dutt, Marina Mosunjac, Gabriela Oprea-Ilies, Geoffrey Smith, Minjae Woo, Christopher R. McAdams, Mary S. Newell, Imon Banerjee, Judy Gichoya, Hari Trivedi

Abstract: Developing and validating artificial intelligence models in medical imaging requires datasets that are large, granular, and diverse. To date, the majority of publicly available breast imaging datasets lack in one or more of these areas. Models trained on these data may therefore underperform on patient populations or pathologies that have not previously been encountered. The EMory BrEast imaging D… ▽ More Developing and validating artificial intelligence models in medical imaging requires datasets that are large, granular, and diverse. To date, the majority of publicly available breast imaging datasets lack in one or more of these areas. Models trained on these data may therefore underperform on patient populations or pathologies that have not previously been encountered. The EMory BrEast imaging Dataset (EMBED) addresses these gaps by providing 3650,000 2D and DBT screening and diagnostic mammograms for 116,000 women divided equally between White and African American patients. The dataset also contains 40,000 annotated lesions linked to structured imaging descriptors and 61 ground truth pathologic outcomes grouped into six severity classes. Our goal is to share this dataset with research partners to aid in development and validation of breast AI models that will serve all patients fairly and help decrease bias in medical AI. △ Less

Submitted 8 February, 2022; originally announced February 2022.

arXiv:2112.07555 [pdf, other]

Classification of histopathology images using ConvNets to detect Lupus Nephritis

Authors: Akash Gupta, Anirudh Reddy, CV Jawahar, PK Vinod

Abstract: Systemic lupus erythematosus (SLE) is an autoimmune disease in which the immune system of the patient starts attacking healthy tissues of the body. Lupus Nephritis (LN) refers to the inflammation of kidney tissues resulting in renal failure due to these attacks. The International Society of Nephrology/Renal Pathology Society (ISN/RPS) has released a classification system based on various patterns… ▽ More Systemic lupus erythematosus (SLE) is an autoimmune disease in which the immune system of the patient starts attacking healthy tissues of the body. Lupus Nephritis (LN) refers to the inflammation of kidney tissues resulting in renal failure due to these attacks. The International Society of Nephrology/Renal Pathology Society (ISN/RPS) has released a classification system based on various patterns observed during renal injury in SLE. Traditional methods require meticulous pathological assessment of the renal biopsy and are time-consuming. Recently, computational techniques have helped to alleviate this issue by using virtual microscopy or Whole Slide Imaging (WSI). With the use of deep learning and modern computer vision techniques, we propose a pipeline that is able to automate the process of 1) detection of various glomeruli patterns present in these whole slide images and 2) classification of each image using the extracted glomeruli features. △ Less

Submitted 14 December, 2021; originally announced December 2021.

Comments: Accepted in the 2021 Medical Imaging meets NeurIPS Workshop

arXiv:2112.05746 [pdf, other]

On Causally Disentangled Representations

Authors: Abbavaram Gowtham Reddy, Benin Godfrey L, Vineeth N Balasubramanian

Abstract: Representation learners that disentangle factors of variation have already proven to be important in addressing various real world concerns such as fairness and interpretability. Initially consisting of unsupervised models with independence assumptions, more recently, weak supervision and correlated features have been explored, but without a causal view of the generative process. In contrast, we w… ▽ More Representation learners that disentangle factors of variation have already proven to be important in addressing various real world concerns such as fairness and interpretability. Initially consisting of unsupervised models with independence assumptions, more recently, weak supervision and correlated features have been explored, but without a causal view of the generative process. In contrast, we work under the regime of a causal generative process where generative factors are either independent or can be potentially confounded by a set of observed or unobserved confounders. We present an analysis of disentangled representations through the notion of disentangled causal process. We motivate the need for new metrics and datasets to study causal disentanglement and propose two evaluation metrics and a dataset. We show that our metrics capture the desiderata of disentangled causal process. Finally, we perform an empirical study on state of the art disentangled representation learners using our metrics and dataset to evaluate them from causal perspective. △ Less

Submitted 10 December, 2021; originally announced December 2021.

Comments: https://causal-disentanglement.github.io/IITH-CANDLE/ ; Accepted at the 36th AAAI Conference on Artificial Intelligence (AAAI 2022)

arXiv:2111.14674 [pdf, ps, other]

Online MAP Inference and Learning for Nonsymmetric Determinantal Point Processes

Authors: Aravind Reddy, Ryan A. Rossi, Zhao Song, Anup Rao, Tung Mai, Nedim Lipka, Gang Wu, Eunyee Koh, Nesreen Ahmed

Abstract: In this paper, we introduce the online and streaming MAP inference and learning problems for Non-symmetric Determinantal Point Processes (NDPPs) where data points arrive in an arbitrary order and the algorithms are constrained to use a single-pass over the data as well as sub-linear memory. The online setting has an additional requirement of maintaining a valid solution at any point in time. For s… ▽ More In this paper, we introduce the online and streaming MAP inference and learning problems for Non-symmetric Determinantal Point Processes (NDPPs) where data points arrive in an arbitrary order and the algorithms are constrained to use a single-pass over the data as well as sub-linear memory. The online setting has an additional requirement of maintaining a valid solution at any point in time. For solving these new problems, we propose algorithms with theoretical guarantees, evaluate them on several real-world datasets, and show that they give comparable performance to state-of-the-art offline algorithms that store the entire data in memory and take multiple passes over it. △ Less

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2111.12490 [pdf, other]

Matching Learned Causal Effects of Neural Networks with Domain Priors

Authors: Sai Srinivas Kancheti, Abbavaram Gowtham Reddy, Vineeth N Balasubramanian, Amit Sharma

Abstract: A trained neural network can be interpreted as a structural causal model (SCM) that provides the effect of changing input variables on the model's output. However, if training data contains both causal and correlational relationships, a model that optimizes prediction accuracy may not necessarily learn the true causal relationships between input and output variables. On the other hand, expert user… ▽ More A trained neural network can be interpreted as a structural causal model (SCM) that provides the effect of changing input variables on the model's output. However, if training data contains both causal and correlational relationships, a model that optimizes prediction accuracy may not necessarily learn the true causal relationships between input and output variables. On the other hand, expert users often have prior knowledge of the causal relationship between certain input variables and output from domain knowledge. Therefore, we propose a regularization method that aligns the learned causal effects of a neural network with domain priors, including both direct and total causal effects. We show that this approach can generalize to different kinds of domain priors, including monotonicity of causal effect of an input variable on output or zero causal effect of a variable on output for purposes of fairness. Our experiments on twelve benchmark datasets show its utility in regularizing a neural network model to maintain desired causal effects, without compromising on accuracy. Importantly, we also show that a model thus trained is robust and gets improved accuracy on noisy inputs. △ Less

Submitted 29 June, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

Comments: Accepted at International Conference on Machine Learning (ICML'22)

arXiv:2110.04331 [pdf, ps, other]

MusicNet: Compact Convolutional Neural Network for Real-time Background Music Detection

Authors: Chandan K. A. Reddy, Vishak Gopa, Harishchandra Dubey, Sergiy Matusevych, Ross Cutler, Robert Aichner

Abstract: With the recent growth of remote work, online meetings often encounter challenging audio contexts such as background noise, music, and echo. Accurate real-time detection of music events can help to improve the user experience. In this paper, we present MusicNet, a compact neural model for detecting background music in the real-time communications pipeline. In video meetings, music frequently co-oc… ▽ More With the recent growth of remote work, online meetings often encounter challenging audio contexts such as background noise, music, and echo. Accurate real-time detection of music events can help to improve the user experience. In this paper, we present MusicNet, a compact neural model for detecting background music in the real-time communications pipeline. In video meetings, music frequently co-occurs with speech and background noises, making the accurate classification quite challenging. We propose a compact convolutional neural network core preceded by an in-model featurization layer. MusicNet takes 9 seconds of raw audio as input and does not require any model-specific featurization in the product stack. We train our model on the balanced subset of the Audio Set~\cite{gemmeke2017audio} data and validate it on 1000 crowd-sourced real test clips. Finally, we compare MusicNet performance with 20 state-of-the-art models. MusicNet has a true positive rate (TPR) of 81.3% at a 0.1% false positive rate (FPR), which is significantly better than state-of-the-art models included in our study. MusicNet is also 10x smaller and has 4x faster inference than the best performing models we benchmarked. △ Less

Submitted 15 April, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

arXiv:2110.01763 [pdf, other]

DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors

Authors: Chandan K A Reddy, Vishak Gopal, Ross Cutler

Abstract: Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. We have recently developed a non-intrusive speech quality metric called Deep Noise Suppression Mean Opinion Score (DNSMOS) using the scores from ITU-T Rec. P.808 subjective evaluation. The P.808 scores reflect the overall q… ▽ More Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. We have recently developed a non-intrusive speech quality metric called Deep Noise Suppression Mean Opinion Score (DNSMOS) using the scores from ITU-T Rec. P.808 subjective evaluation. The P.808 scores reflect the overall quality of the audio clip. ITU-T Rec. P.835 subjective evaluation framework gives the standalone quality scores of speech and background noise in addition to the overall quality. In this work, we train an objective metric based on P.835 human ratings that outputs 3 scores: i) speech quality (SIG), ii) background noise quality (BAK), and iii) the overall quality (OVRL) of the audio. The developed metric is highly correlated with human ratings, with a Pearson's Correlation Coefficient (PCC)=0.94 for SIG and PCC=0.98 for BAK and OVRL. This is the first non-intrusive P.835 predictor we are aware of. DNSMOS P.835 is made publicly available as an Azure service. △ Less

Submitted 4 February, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2010.15258

arXiv:2107.07184 [pdf, other]

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Authors: Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Abstract: Exploration in reinforcement learning is a challenging problem: in the worst case, the agent must search for high-reward states that could be hidden anywhere in the state space. Can we define a more tractable class of RL problems, where the agent is provided with examples of successful outcomes? In this problem setting, the reward function can be obtained automatically by training a classifier to… ▽ More Exploration in reinforcement learning is a challenging problem: in the worst case, the agent must search for high-reward states that could be hidden anywhere in the state space. Can we define a more tractable class of RL problems, where the agent is provided with examples of successful outcomes? In this problem setting, the reward function can be obtained automatically by training a classifier to categorize states as successful or not. If trained properly, such a classifier can provide a well-shaped objective landscape that both promotes progress toward good states and provides a calibrated exploration bonus. In this work, we show that an uncertainty aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and provided directed guidance towards positive outcomes. We propose a novel mechanism for obtaining these calibrated, uncertainty-aware classifiers based on an amortized technique for computing the normalized maximum likelihood (NML) distribution. To make this tractable, we propose a novel method for computing the NML distribution by using meta-learning. We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions, while also providing more effective guidance towards the goal. We demonstrate that our algorithm solves a number of challenging navigation and robotic manipulation tasks which prove difficult or impossible for prior methods. △ Less

Submitted 18 July, 2021; v1 submitted 15 July, 2021; originally announced July 2021.

Comments: Accepted to ICML 2021. First two authors contributed equally

arXiv:2107.06173 [pdf, other]

doi 10.1109/TSP.2020.2971936

Orthogonal and Non-Orthogonal Signal Representations Using New Transformation Matrices Having NPM Structure

Authors: Shaik Basheeruddin Shah, Vijay Kumar Chakka, Arikatla Satyanarayana Reddy

Abstract: In this paper, we introduce two types of real-valued sums known as Complex Conjugate Pair Sums (CCPSs) denoted as CCPS$^{(1)}$ and CCPS$^{(2)}$, and discuss a few of their properties. Using each type of CCPSs and their circular shifts, we construct two non-orthogonal Nested Periodic Matrices (NPMs). As NPMs are non-singular, this introduces two non-orthogonal transforms known as Complex Conjugate… ▽ More In this paper, we introduce two types of real-valued sums known as Complex Conjugate Pair Sums (CCPSs) denoted as CCPS$^{(1)}$ and CCPS$^{(2)}$, and discuss a few of their properties. Using each type of CCPSs and their circular shifts, we construct two non-orthogonal Nested Periodic Matrices (NPMs). As NPMs are non-singular, this introduces two non-orthogonal transforms known as Complex Conjugate Periodic Transforms (CCPTs) denoted as CCPT$^{(1)}$ and CCPT$^{(2)}$. We propose another NPM, which uses both types of CCPSs such that its columns are mutually orthogonal, this transform is known as Orthogonal CCPT (OCCPT). After a brief study of a few OCCPT properties like periodicity, circular shift, etc., we present two different interpretations of it. Further, we propose a Decimation-In-Time (DIT) based fast computation algorithm for OCCPT (termed as FOCCPT), whenever the length of the signal is equal to $2^v,\ v{\in} \mathbb{N}$. The proposed sums and transforms are inspired by Ramanujan sums and Ramanujan Period Transform (RPT). Finally, we show that the period (both divisor and non-divisor) and frequency information of a signal can be estimated using the proposed transforms with a significant reduction in the computational complexity over Discrete Fourier Transform (DFT). △ Less

Submitted 20 June, 2021; originally announced July 2021.

Comments: 13 pages, 5 figures,

Journal ref: IEEE Transactions on Signal Processing, 06 February 2020

arXiv:2107.02437 [pdf, ps, other]

Turbo Coded Single User Massive MIMO

Authors: K. Vasudevan, A. Phani Kumar Reddy, Gyanesh Kumar Pathak, Mahmoud Albreem

Abstract: This work deals with turbo coded single user massive multiple input multiple output (SU-MMIMO) systems, with and without precoding. SU-MMIMO has a much higher spectral efficiency compared to multi-user massive MIMO (MU-MMIMO) since independent signals are transmitted from each of the antenna elements (spatial multiplexing). MU-MMIMO that uses beamforming has a much lower spectral efficiency, since… ▽ More This work deals with turbo coded single user massive multiple input multiple output (SU-MMIMO) systems, with and without precoding. SU-MMIMO has a much higher spectral efficiency compared to multi-user massive MIMO (MU-MMIMO) since independent signals are transmitted from each of the antenna elements (spatial multiplexing). MU-MMIMO that uses beamforming has a much lower spectral efficiency, since the same signal (with a delay) is transmitted from each of the antenna elements. In this work, expressions for the upper bound on the average signal-to-noise ratio (SNR) per bit and spectral efficiency are derived for SU-MMIMO with and without precoding. We propose a performance index $f(N_t)$, which is a function of the number of transmit antennas $N_t$. Here $f(N_t)$ is the sum of the upper bound on the average SNR per bit and the spectral efficiency. We demonstrate that when the total number of antennas ($N_{\mathrm{tot}}$) in the transmitter and receiver is fixed, there exists a minimum value of $f(N_t)$, which has to be avoided. Computer simulations show that the bit-error-rate (BER) is nearly insensitive to a wide range of the number of transmit antennas and re-transmissions, when $N_{\mathrm{tot}}$ is large and kept constant. Thus, the spectral efficiency can be made as large as possible, for a given BER and $N_{\mathrm{tot}}$. △ Less

Submitted 6 July, 2021; originally announced July 2021.

Comments: 12 pages, 8 figures, journal. arXiv admin note: substantial text overlap with arXiv:2007.15959

arXiv:2106.15917 [pdf, other]

Explaining Caste-based Digital Divide in India

Authors: R Vaidehi, A Bheemeshwar Reddy, Sudatta Banerjee

Abstract: With the increasing importance of information and communication technologies in access to basic services like education and health, the question of the digital divide based on caste assumes importance in India where large socioeconomic disparities persist between different caste groups. Studies on caste-based digital inequality are still scanty in India. Using nationally representative survey data… ▽ More With the increasing importance of information and communication technologies in access to basic services like education and health, the question of the digital divide based on caste assumes importance in India where large socioeconomic disparities persist between different caste groups. Studies on caste-based digital inequality are still scanty in India. Using nationally representative survey data, this paper analyzes the first-level digital divide (ownership of computer and access to the internet) and the second-level digital divide (individual's skill to use computer and the internet) between the disadvantaged caste group and the others. Further, this paper identifies the caste group-based differences in socioeconomic factors that contribute to the digital divide between these groups using a non-linear decomposition method. The results show that there exists a large first-level and second-level digital divide between the disadvantaged caste groups and others in India. The non-linear decomposition results indicate that the caste-based digital divide in India is rooted in historical socioeconomic deprivation of disadvantaged caste groups. More than half of the caste-based digital gap is attributable to differences in educational attainment and income between the disadvantaged caste groups and others. The findings of this study highlight the urgent need for addressing educational and income inequality between the different caste groups in India in order to bridge the digital divide. △ Less

Submitted 30 June, 2021; originally announced June 2021.

arXiv:2106.05842 [pdf, other]

doi 10.1145/3461702.3462467

Causality in Neural Networks -- An Extended Abstract

Authors: Abbavaram Gowtham Reddy

Abstract: Causal reasoning is the main learning and explanation tool used by humans. AI systems should possess causal reasoning capabilities to be deployed in the real world with trust and reliability. Introducing the ideas of causality to machine learning helps in providing better learning and explainable models. Explainability, causal disentanglement are some important aspects of any machine learning mode… ▽ More Causal reasoning is the main learning and explanation tool used by humans. AI systems should possess causal reasoning capabilities to be deployed in the real world with trust and reliability. Introducing the ideas of causality to machine learning helps in providing better learning and explainable models. Explainability, causal disentanglement are some important aspects of any machine learning model. Causal explanations are required to believe in a model's decision and causal disentanglement learning is important for transfer learning applications. We exploit the ideas of causality to be used in deep learning models to achieve better and causally explainable models that are useful in fairness, disentangled representation, etc. △ Less

Submitted 3 June, 2021; originally announced June 2021.

arXiv:2105.07855 [pdf]

doi 10.14445/22315381/IJETT-V69I5P217

An Extensive Analytical Approach on Human Resources using Random Forest Algorithm

Authors: Swarajya lakshmi v papineni, A. Mallikarjuna Reddy, Sudeepti yarlagadda, Snigdha Yarlagadda, Haritha Akkinen

Abstract: The current job survey shows that most software employees are planning to change their job role due to high pay for recent jobs such as data scientists, business analysts and artificial intelligence fields. The survey also indicated that work life imbalances, low pay, uneven shifts and many other factors also make employees think about changing their work life. In this paper, for an efficient orga… ▽ More The current job survey shows that most software employees are planning to change their job role due to high pay for recent jobs such as data scientists, business analysts and artificial intelligence fields. The survey also indicated that work life imbalances, low pay, uneven shifts and many other factors also make employees think about changing their work life. In this paper, for an efficient organisation of the company in terms of human resources, the proposed system designed a model with the help of a random forest algorithm by considering different employee parameters. This helps the HR department retain the employee by identifying gaps and helping the organisation to run smoothly with a good employee retention ratio. This combination of HR and data science can help the productivity, collaboration and well-being of employees of the organisation. It also helps to develop strategies that have an impact on the performance of employees in terms of external and social factors. △ Less

Submitted 7 May, 2021; originally announced May 2021.

arXiv:2103.00034 [pdf, other]

Beyond Perturbation Stability: LP Recovery Guarantees for MAP Inference on Noisy Stable Instances

Authors: Hunter Lang, Aravind Reddy, David Sontag, Aravindan Vijayaraghavan

Abstract: Several works have shown that perturbation stable instances of the MAP inference problem in Potts models can be solved exactly using a natural linear programming (LP) relaxation. However, most of these works give few (or no) guarantees for the LP solutions on instances that do not satisfy the relatively strict perturbation stability definitions. In this work, we go beyond these stability results b… ▽ More Several works have shown that perturbation stable instances of the MAP inference problem in Potts models can be solved exactly using a natural linear programming (LP) relaxation. However, most of these works give few (or no) guarantees for the LP solutions on instances that do not satisfy the relatively strict perturbation stability definitions. In this work, we go beyond these stability results by showing that the LP approximately recovers the MAP solution of a stable instance even after the instance is corrupted by noise. This "noisy stable" model realistically fits with practical MAP inference problems: we design an algorithm for finding "close" stable instances, and show that several real-world instances from computer vision have nearby instances that are perturbation stable. These results suggest a new theoretical explanation for the excellent performance of this LP relaxation in practice. △ Less

Submitted 26 February, 2021; originally announced March 2021.

Comments: 25 pages, 2 figures, 2 tables. To appear in AISTATS 2021

arXiv:2102.02637 [pdf]

Big Data Analytics Applying the Fusion Approach of Multicriteria Decision Making with Deep Learning Algorithms

Authors: Swarajya Lakshmi V Papineni, Snigdha Yarlagadda, Harita Akkineni, A. Mallikarjuna Reddy

Abstract: Data is evolving with the rapid progress of population and communication for various types of devices such as networks, cloud computing, Internet of Things (IoT), actuators, and sensors. The increment of data and communication content goes with the equivalence of velocity, speed, size, and value to provide the useful and meaningful knowledge that helps to solve the future challenging tasks and lat… ▽ More Data is evolving with the rapid progress of population and communication for various types of devices such as networks, cloud computing, Internet of Things (IoT), actuators, and sensors. The increment of data and communication content goes with the equivalence of velocity, speed, size, and value to provide the useful and meaningful knowledge that helps to solve the future challenging tasks and latest issues. Besides, multicriteria based decision making is one of the key issues to solve for various issues related to the alternative effects in big data analysis. It tends to find a solution based on the latest machine learning techniques that include algorithms like decision making and deep learning mechanism based on multicriteria in providing insights to big data. On the other hand, the derivations are made for it to go with the approximations to increase the duality of runtime and improve the entire system's potentiality and efficacy. In essence, several fields, including business, agriculture, information technology, and computer science, use deep learning and multicriteria-based decision-making problems. This paper aims to provide various applications that involve the concepts of deep learning techniques and exploiting the multicriteria approaches for issues that are facing in big data analytics by proposing new studies with the fusion approaches of data-driven techniques. △ Less

Submitted 2 February, 2021; originally announced February 2021.

arXiv:2101.09725 [pdf]

doi 10.1109/ACCESS.2021.3063031

Audio-Visual Biometric Recognition and Presentation Attack Detection: A Comprehensive Survey

Authors: Hareesh Mandalapu, P N Aravinda Reddy, Raghavendra Ramachandra, K Sreenivasa Rao, Pabitra Mitra, S R Mahadeva Prasanna, Christoph Busch

Abstract: Biometric recognition is a trending technology that uses unique characteristics data to identify or verify/authenticate security applications. Amidst the classically used biometrics, voice and face attributes are the most propitious for prevalent applications in day-to-day life because they are easy to obtain through restrained and user-friendly procedures. The pervasiveness of low-cost audio and… ▽ More Biometric recognition is a trending technology that uses unique characteristics data to identify or verify/authenticate security applications. Amidst the classically used biometrics, voice and face attributes are the most propitious for prevalent applications in day-to-day life because they are easy to obtain through restrained and user-friendly procedures. The pervasiveness of low-cost audio and face capture sensors in smartphones, laptops, and tablets has made the advantage of voice and face biometrics more exceptional when compared to other biometrics. For many years, acoustic information alone has been a great success in automatic speaker verification applications. Meantime, the last decade or two has also witnessed a remarkable ascent in face recognition technologies. Nonetheless, in adverse unconstrained environments, neither of these techniques achieves optimal performance. Since audio-visual information carries correlated and complementary information, integrating them into one recognition system can increase the system's performance. The vulnerability of biometrics towards presentation attacks and audio-visual data usage for the detection of such attacks is also a hot topic of research. This paper made a comprehensive survey on existing state-of-the-art audio-visual recognition techniques, publicly available databases for benchmarking, and Presentation Attack Detection (PAD) algorithms. Further, a detailed discussion on challenges and open problems is presented in this field of biometrics. △ Less

Submitted 12 March, 2021; v1 submitted 24 January, 2021; originally announced January 2021.

Journal ref: in IEEE Access, vol. 9, pp. 37431-37455, 2021

arXiv:2101.09249 [pdf, other]

Towards efficient models for real-time deep noise suppression

Authors: Sebastian Braun, Hannes Gamper, Chandan K. A. Reddy, Ivan Tashev

Abstract: With recent research advancements, deep learning models are becoming attractive and powerful choices for speech enhancement in real-time applications. While state-of-the-art models can achieve outstanding results in terms of speech quality and background noise reduction, the main challenge is to obtain compact enough models, which are resource efficient during inference time. An important but ofte… ▽ More With recent research advancements, deep learning models are becoming attractive and powerful choices for speech enhancement in real-time applications. While state-of-the-art models can achieve outstanding results in terms of speech quality and background noise reduction, the main challenge is to obtain compact enough models, which are resource efficient during inference time. An important but often neglected aspect for data-driven methods is that results can be only convincing when tested on real-world data and evaluated with useful metrics. In this work, we investigate reasonably small recurrent and convolutional-recurrent network architectures for speech enhancement, trained on a large dataset considering also reverberation. We show interesting tradeoffs between computational complexity and the achievable speech quality, measured on real recordings using a highly accurate MOS estimator. It is shown that the achievable speech quality is a function of network complexity, and show which models have better tradeoffs. △ Less

Submitted 19 May, 2021; v1 submitted 22 January, 2021; originally announced January 2021.

arXiv:2101.06665 [pdf, other]

Brightening the Optical Flow through Posit Arithmetic

Authors: Vinay Saxena, Ankitha Reddy, Jonathan Neudorfer, John Gustafson, Sangeeth Nambiar, Rainer Leupers, Farhad Merchant

Abstract: As new technologies are invented, their commercial viability needs to be carefully examined along with their technical merits and demerits. The posit data format, proposed as a drop-in replacement for IEEE 754 float format, is one such invention that requires extensive theoretical and experimental study to identify products that can benefit from the advantages of posits for specific market segment… ▽ More As new technologies are invented, their commercial viability needs to be carefully examined along with their technical merits and demerits. The posit data format, proposed as a drop-in replacement for IEEE 754 float format, is one such invention that requires extensive theoretical and experimental study to identify products that can benefit from the advantages of posits for specific market segments. In this paper, we present an extensive empirical study of posit-based arithmetic vis-à-vis IEEE 754 compliant arithmetic for the optical flow estimation method called Lucas-Kanade (LuKa). First, we use SoftPosit and SoftFloat format emulators to perform an empirical error analysis of the LuKa method. Our study shows that the average error in LuKa with SoftPosit is an order of magnitude lower than LuKa with SoftFloat. We then present the integration of the hardware implementation of a posit adder and multiplier in a RISC-V open-source platform. We make several recommendations, along with the analysis of LuKa in the RISC-V context, for future generation platforms incorporating posit arithmetic units. △ Less

Submitted 17 January, 2021; originally announced January 2021.

Comments: To appear in ISQED 2021

arXiv:2101.01902 [pdf, other]

Interspeech 2021 Deep Noise Suppression Challenge

Authors: Chandan K A Reddy, Harishchandra Dubey, Kazuhito Koishida, Arun Nair, Vishak Gopal, Ross Cutler, Sebastian Braun, Hannes Gamper, Robert Aichner, Sriram Srinivasan

Abstract: The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH and ICASSP 2020. We open-sourced training and test datasets for the wideband scenario. We also open-sourced a subjective evaluation framework based on ITU-T standard P.808, wh… ▽ More The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH and ICASSP 2020. We open-sourced training and test datasets for the wideband scenario. We also open-sourced a subjective evaluation framework based on ITU-T standard P.808, which was also used to evaluate participants of the challenge. Many researchers from academia and industry made significant contributions to push the field forward, yet even the best noise suppressor was far from achieving superior speech quality in challenging scenarios. In this version of the challenge organized at INTERSPEECH 2021, we are expanding both our training and test datasets to accommodate full band scenarios. The two tracks in this challenge will focus on real-time denoising for (i) wide band, and(ii) full band scenarios. We are also making available a reliable non-intrusive objective speech quality metric called DNSMOS for the participants to use during their development phase. △ Less

Submitted 4 April, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2009.06122

arXiv:2011.04567 [pdf, other]

doi 10.1109/FPL53798.2021.00039

FPGA-based Hyrbid Memory Emulation System

Authors: Fei Wen, Mian Qin, Paul V. Gratz, A. L. Narasimha Reddy

Abstract: Hybrid memory systems, comprised of emerging non-volatile memory (NVM) and DRAM, have been proposed to address the growing memory demand of applications. Emerging NVM technologies, such as phase-change memories (PCM), memristor, and 3D XPoint, have higher capacity density, minimal static power consumption and lower cost per GB. However, NVM has longer access latency and limited write endurance as… ▽ More Hybrid memory systems, comprised of emerging non-volatile memory (NVM) and DRAM, have been proposed to address the growing memory demand of applications. Emerging NVM technologies, such as phase-change memories (PCM), memristor, and 3D XPoint, have higher capacity density, minimal static power consumption and lower cost per GB. However, NVM has longer access latency and limited write endurance as opposed to DRAM. The different characteristics of two memory classes point towards the design of hybrid memory systems containing multiple classes of main memory. In the iterative and incremental development of new architectures, the timeliness of simulation completion is critical to project progression. Hence, a highly efficient simulation method is needed to evaluate the performance of different hybrid memory system designs. Design exploration for hybrid memory systems is challenging, because it requires emulation of the full system stack, including the OS, memory controller, and interconnect. Moreover, benchmark applications for memory performance test typically have much larger working sets, thus taking even longer simulation warm-up period. In this paper, we propose a FPGA-based hybrid memory system emulation platform. We target at the mobile computing system, which is sensitive to energy consumption and is likely to adopt NVM for its power efficiency. Here, because the focus of our platform is on the design of the hybrid memory system, we leverage the on-board hard IP ARM processors to both improve simulation performance while improving accuracy of the results. Thus, users can implement their data placement/migration policies with the FPGA logic elements and evaluate new designs quickly and effectively. Results show that our emulation platform provides a speedup of 9280x in simulation time compared to the software counterpart Gem5. △ Less

Submitted 9 November, 2020; originally announced November 2020.

arXiv:2010.15258 [pdf, other]

DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors

Authors: Chandan K A Reddy, Vishak Gopal, Ross Cutler

Abstract: Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. The conventional and widely used metrics require a reference clean speech signal, which is unavailable in real recordings. The no-reference approaches correlate poorly with human ratings and are not widely adopted in the re… ▽ More Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. The conventional and widely used metrics require a reference clean speech signal, which is unavailable in real recordings. The no-reference approaches correlate poorly with human ratings and are not widely adopted in the research community. One of the biggest use cases of these perceptual objective metrics is to evaluate noise suppression algorithms. This paper introduces a multi-stage self-teaching based perceptual objective metric that is designed to evaluate noise suppressors. The proposed method generalizes well in challenging test conditions with a high correlation to human ratings. △ Less

Submitted 10 February, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

Comments: Submitted to ICASSP 2020

arXiv:2010.14487 [pdf, ps, other]

Improved Guarantees for k-means++ and k-means++ Parallel

Authors: Konstantin Makarychev, Aravind Reddy, Liren Shan

Abstract: In this paper, we study k-means++ and k-means++ parallel, the two most popular algorithms for the classic k-means clustering problem. We provide novel analyses and show improved approximation and bi-criteria approximation guarantees for k-means++ and k-means++ parallel. Our results give a better theoretical justification for why these algorithms perform extremely well in practice. We also propose… ▽ More In this paper, we study k-means++ and k-means++ parallel, the two most popular algorithms for the classic k-means clustering problem. We provide novel analyses and show improved approximation and bi-criteria approximation guarantees for k-means++ and k-means++ parallel. Our results give a better theoretical justification for why these algorithms perform extremely well in practice. We also propose a new variant of k-means++ parallel algorithm (Exponential Race k-means++) that has the same approximation guarantees as k-means++. △ Less

Submitted 27 October, 2020; originally announced October 2020.

Journal ref: NeurIPS 2020

arXiv:2010.12953 [pdf, other]

Road Accident Proneness Indicator Based On Time, Weather And Location Specificity Using Graph Neural Networks

Authors: Srikanth Chandar, Anish Reddy, Muvazima Mansoor, Suresh Jamadagni

Abstract: In this paper, we present a novel approach to identify the Spatio-temporal and environmental features that influence the safety of a road and predict its accident proneness based on these features. A total of 14 features were compiled based on Time, Weather, and Location (TWL) specificity along a road. To determine the influence each of the 14 features carries, a sensitivity study was performed us… ▽ More In this paper, we present a novel approach to identify the Spatio-temporal and environmental features that influence the safety of a road and predict its accident proneness based on these features. A total of 14 features were compiled based on Time, Weather, and Location (TWL) specificity along a road. To determine the influence each of the 14 features carries, a sensitivity study was performed using Principal Component Analysis. Using the locations of accident warnings, a Safety Index was developed to quantify how accident-prone a particular road is. We implement a novel approach to predict the Safety Index of a road-based on its TWL specificity by using a Graph Neural Network (GNN) architecture. The proposed architecture is uniquely suited for this application due to its ability to capture the complexities of the inherent nonlinear interlinking in a vast feature space. We employed a GNN to emulate the TWL feature vectors as individual nodes which were interlinked vis-à-vis edges of a graph. This model was verified to perform better than Logistic Regression, simple Feed-Forward Neural Networks, and even Long Short Term Memory (LSTM) Neural Networks. We validated our approach on a data set containing the alert locations along the routes of inter-state buses. The results achieved through this GNN architecture, using a TWL input feature space proved to be more feasible than the other predictive models, having reached a peak accuracy of 65%. △ Less

Submitted 24 October, 2020; originally announced October 2020.

Comments: 7 pages, 10 figures, Submitted to and accepted for presentation in the 19TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, Miami, Florida

arXiv:2009.12533 [pdf, other]

doi 10.1109/5GWF49715.2020.9221093

Latency Analysis for IMT-2020 Radio Interface Technology Evaluation

Authors: A. Phani Kumar Reddy, Navin Kumar, Sri Sai Apoorva Tirumalasetty, Srinivasan S, Vinosh Babu James J

Abstract: The International Telecommunication Union (ITU) is currently deliberating on the finalization of candidate radio interface technologies (RITs) for IMT-2020 (International Mobile Telecommunications) suitability. The candidate technologies are currently being evaluated and after a couple of ITU-Radiocommunication sector (ITU-R) working party (WP) meetings, they will become official. Although, produc… ▽ More The International Telecommunication Union (ITU) is currently deliberating on the finalization of candidate radio interface technologies (RITs) for IMT-2020 (International Mobile Telecommunications) suitability. The candidate technologies are currently being evaluated and after a couple of ITU-Radiocommunication sector (ITU-R) working party (WP) meetings, they will become official. Although, products based on the candidate technology from 3GPP (5G new radio (NR)) is already commercial in several operator networks, the ITU is yet to officially declare it as IMT-2020 qualified. Along with evaluation of the 3GPP 5G NR specifications, our group has evaluated many other proponent technologies. 3GPP entire specifications were examined and evaluated through simulation using Matlab and using own developed simulator which is based on the Go-language. The simulator can evaluate complete 5G NR performance using the IMT-2020 evaluation framework. In this work, we are presenting latency parameters which has shown some minor differences from the 3GPP report. Especially, for time division duplexing (TDD) mode of operation, the differences are observed. It might be possible that the differences are due to assumptions made outside the scope of the evaluation. However, we considered the worst case parameter. Although, the report is submitted to ITU but it is also important for the research community to understand why the differences and what were the assumptions in scenario for which differences are observed. △ Less

Submitted 26 September, 2020; originally announced September 2020.

Comments: Accepted in 2020 IEEE 3rd 5G World Forum (WF-5G) Conference

arXiv:2007.15959 [pdf, ps, other]

Turbo Coded Single User Massive MIMO with Precoding

Authors: K. Vasudevan, Gyanesh Kumar Pathak, A. Phani Kumar Reddy

Abstract: Precoding is a method of compensating the channel at the transmitter. This work presents a novel method of data detection in turbo coded single user massive multiple input multiple output (MIMO) systems using precoding. We show via computer simulations that, when precoding is used, re-transmitting the data does not result in significant reduction in bit-error-rate (BER), thus increasing the spectr… ▽ More Precoding is a method of compensating the channel at the transmitter. This work presents a novel method of data detection in turbo coded single user massive multiple input multiple output (MIMO) systems using precoding. We show via computer simulations that, when precoding is used, re-transmitting the data does not result in significant reduction in bit-error-rate (BER), thus increasing the spectral efficiency, compared to the case without precoding. Moreover, increasing the number of transmit and receive antennas results in improved BER. △ Less

Submitted 31 July, 2020; originally announced July 2020.

Comments: 6 pages, 4 figures, conference

Showing 1–50 of 76 results for author: Reddy, A