Search | arXiv e-print repository

Unextendible entanglement of quantum channels

Abstract: Quantum communication relies on the existence of high quality quantum channels to exchange information. In practice, however, all communication links are affected by noise from the environment. Here we investigate the ability of quantum channels to perform quantum communication tasks by restricting the participants to use only local operations and one-way classical communication (one-way LOCC) alo… ▽ More Quantum communication relies on the existence of high quality quantum channels to exchange information. In practice, however, all communication links are affected by noise from the environment. Here we investigate the ability of quantum channels to perform quantum communication tasks by restricting the participants to use only local operations and one-way classical communication (one-way LOCC) along with the available quantum channel. In particular, a channel can be used to distill a highly entangled state between two parties, which further enables quantum or private communication. In this work, we invoke the framework of superchannels to study the distillation of a resourceful quantum state, such as a maximally entangled state or a private state, using multiple instances of a point-to-point quantum channel. We use the idea of $k$-extendibility to obtain a semidefinite relaxation of the set of one-way LOCC superchannels and define a class of entanglement measures for quantum channels that decrease monotonically under such superchannels; therefore these measures, dubbed collectively the ``unextendible entanglement of a channel'', yield upper bounds on several communication-theoretic quantities of interest in the regimes of resource distillation and zero error. We then generalize the formalism of $k$-extendibility to bipartite superchannels, thus obtaining functions that are monotone under two-extendible superchannels. This allows us to analyze probabilistic distillation of ebits or secret key bits from a bipartite state when using a resourceful quantum channel. Moreover, we propose semidefinite programs to evaluate several of these quantities, providing a computationally feasible method of comparison between quantum channels for resource distillation. △ Less

Submitted 22 July, 2024; originally announced July 2024.

Comments: 32+13 pages, 8 figures

arXiv:2407.09531 [pdf]

UAV Networks Surveillance Implementing an Effective Load-Aware Multipath Routing Protocol (ELAMRP)

Authors: Raja Vavekanand, Kira Sam, Vijay Singh

Abstract: In this work uses innovative multi-channel load-sensing techniques to deploy unmanned aerial vehicles (UAVs) for surveillance. The research aims to improve the quality of data transmission methods and improve the efficiency and reliability of surveillance systems by exploiting the mobility and adaptability of UAVs does the proposed protocol intelligently distribute network traffic across multiple… ▽ More In this work uses innovative multi-channel load-sensing techniques to deploy unmanned aerial vehicles (UAVs) for surveillance. The research aims to improve the quality of data transmission methods and improve the efficiency and reliability of surveillance systems by exploiting the mobility and adaptability of UAVs does the proposed protocol intelligently distribute network traffic across multiple channels, considering the load of each channel, While addressing challenges such as load balancing, this study investigates the effectiveness of the protocol by simulations or practical tests on The expected results have improved UAV-based surveillance systems, more flexible and efficient networks for applications such as security, emergency response and the environment alignment of monitoring -Offering infrastructures, which contribute to efficient and reliable monitoring solutions. △ Less

Submitted 25 June, 2024; originally announced July 2024.

Comments: 06 pages, 07 figures

arXiv:2407.09481 [pdf]

ChatGPT and Vaccine Hesitancy: A Comparison of English, Spanish, and French Responses Using a Validated Scale

Authors: Saubhagya Joshi, Eunbin Ha, Yonaira Rivera, Vivek K. Singh

Abstract: ChatGPT is a popular information system (over 1 billion visits in August 2023) that can generate natural language responses to user queries. It is important to study the quality and equity of its responses on health-related topics, such as vaccination, as they may influence public health decision-making. We use the Vaccine Hesitancy Scale (VHS) proposed by Shapiro et al.1 to measure the hesitancy… ▽ More ChatGPT is a popular information system (over 1 billion visits in August 2023) that can generate natural language responses to user queries. It is important to study the quality and equity of its responses on health-related topics, such as vaccination, as they may influence public health decision-making. We use the Vaccine Hesitancy Scale (VHS) proposed by Shapiro et al.1 to measure the hesitancy of ChatGPT responses in English, Spanish, and French. We find that: (a) ChatGPT responses indicate less hesitancy than those reported for human respondents in past literature; (b) ChatGPT responses vary significantly across languages, with English responses being the most hesitant on average and Spanish being the least; (c) ChatGPT responses are largely consistent across different model parameters but show some variations across the scale factors (vaccine competency, risk). Results have implications for researchers interested in evaluating and improving the quality and equity of health-related web information. △ Less

Submitted 6 May, 2024; originally announced July 2024.

Comments: 11 pages. Appeared in the Proceedings of the AMIA Informatics Summit, 2024

arXiv:2407.04589 [pdf, other]

Remembering Everything Makes You Vulnerable: A Limelight on Machine Unlearning for Personalized Healthcare Sector

Authors: Ahan Chatterjee, Sai Anirudh Aryasomayajula, Rajat Chaudhari, Subhajit Paul, Vishwa Mohan Singh

Abstract: As the prevalence of data-driven technologies in healthcare continues to rise, concerns regarding data privacy and security become increasingly paramount. This thesis aims to address the vulnerability of personalized healthcare models, particularly in the context of ECG monitoring, to adversarial attacks that compromise patient privacy. We propose an approach termed "Machine Unlearning" to mitigat… ▽ More As the prevalence of data-driven technologies in healthcare continues to rise, concerns regarding data privacy and security become increasingly paramount. This thesis aims to address the vulnerability of personalized healthcare models, particularly in the context of ECG monitoring, to adversarial attacks that compromise patient privacy. We propose an approach termed "Machine Unlearning" to mitigate the impact of exposed data points on machine learning models, thereby enhancing model robustness against adversarial attacks while preserving individual privacy. Specifically, we investigate the efficacy of Machine Unlearning in the context of personalized ECG monitoring, utilizing a dataset of clinical ECG recordings. Our methodology involves training a deep neural classifier on ECG data and fine-tuning the model for individual patients. We demonstrate the susceptibility of fine-tuned models to adversarial attacks, such as the Fast Gradient Sign Method (FGSM), which can exploit additional data points in personalized models. To address this vulnerability, we propose a Machine Unlearning algorithm that selectively removes sensitive data points from fine-tuned models, effectively enhancing model resilience against adversarial manipulation. Experimental results demonstrate the effectiveness of our approach in mitigating the impact of adversarial attacks while maintaining the pre-trained model accuracy. △ Less

Submitted 5 July, 2024; originally announced July 2024.

Comments: 15 Pages, Exploring unlearning techniques on ECG Classifier

arXiv:2407.03852 [pdf, other]

Low-latency machine learning FPGA accelerator for multi-qubit state discrimination

Authors: Pradeep Kumar Gautam, Shantharam Kalipatnapu, Shankaranarayanan H, Ujjawal Singhal, Benjamin Lienhard, Vibhor Singh, Chetan Singh Thakur

Abstract: Measuring a qubit is a fundamental yet error prone operation in quantum computing. These errors can stem from various sources such as crosstalk, spontaneous state-transitions, and excitation caused by the readout pulse. In this work, we utilize an integrated approach to deploy neural networks (NN) on to field programmable gate arrays (FPGA). We demonstrate that it is practical to design and implem… ▽ More Measuring a qubit is a fundamental yet error prone operation in quantum computing. These errors can stem from various sources such as crosstalk, spontaneous state-transitions, and excitation caused by the readout pulse. In this work, we utilize an integrated approach to deploy neural networks (NN) on to field programmable gate arrays (FPGA). We demonstrate that it is practical to design and implement a fully connected neural network accelerator for frequency-multiplexed readout balancing computational complexity with low latency requirements without significant loss in accuracy. The neural network is implemented by quantization of weights, activation functions, and inputs. The hardware accelerator performs frequency-multiplexed readout of 5 superconducting qubits in less than 50 ns on RFSoC ZCU111 FPGA which is first of its kind in the literature. These modules can be implemented and integrated in existing Quantum control and readout platforms using a RFSoC ZCU111 ready for experimental deployment. △ Less

Submitted 4 July, 2024; originally announced July 2024.

Comments: 10 pages, 6 figures

arXiv:2406.17377 [pdf, other]

A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs

Authors: Vaibhav Singh, Amrith Krishna, Karthika NJ, Ganesh Ramakrishnan

Abstract: Low-resource languages, by its very definition, tend to be under represented in the pre-training corpora of Large Language Models. In this work, we investigate three low-resource cross-lingual approaches that enable an LLM adapt to tasks in previously unseen languages. Llama-2 is an LLM where Indic languages, among many other language families, contribute to less than $0.005\%$ of the total $2$ tr… ▽ More Low-resource languages, by its very definition, tend to be under represented in the pre-training corpora of Large Language Models. In this work, we investigate three low-resource cross-lingual approaches that enable an LLM adapt to tasks in previously unseen languages. Llama-2 is an LLM where Indic languages, among many other language families, contribute to less than $0.005\%$ of the total $2$ trillion token pre-training corpora. In this work, we experiment with the English-dominated Llama-2 for cross-lingual transfer to three Indic languages, Bengali, Hindi, and Tamil as target languages. We study three approaches for cross-lingual transfer, under ICL and fine-tuning. One, we find that adding additional supervisory signals via a dominant language in the LLM, leads to improvements, both under in-context learning and fine-tuning. Two, adapting the target languages to word reordering may be beneficial under ICL, but its impact diminishes with fine tuning. Finally, continued pre-training in one low-resource language can improve model performance for other related low-resource languages. △ Less

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.13653 [pdf, other]

Controlling Forgetting with Test-Time Data in Continual Learning

Authors: Vaibhav Singh, Rahaf Aljundi, Eugene Belilovsky

Abstract: Foundational vision-language models have shown impressive performance on various downstream tasks. Yet, there is still a pressing need to update these models later as new tasks or domains become available. Ongoing Continual Learning (CL) research provides techniques to overcome catastrophic forgetting of previous information when new knowledge is acquired. To date, CL techniques focus only on the… ▽ More Foundational vision-language models have shown impressive performance on various downstream tasks. Yet, there is still a pressing need to update these models later as new tasks or domains become available. Ongoing Continual Learning (CL) research provides techniques to overcome catastrophic forgetting of previous information when new knowledge is acquired. To date, CL techniques focus only on the supervised training sessions. This results in significant forgetting yielding inferior performance to even the prior model zero shot performance. In this work, we argue that test-time data hold great information that can be leveraged in a self supervised manner to refresh the model's memory of previous learned tasks and hence greatly reduce forgetting at no extra labelling cost. We study how unsupervised data can be employed online to improve models' performance on prior tasks upon encountering representative samples. We propose a simple yet effective student-teacher model with gradient based sparse parameters updates and show significant performance improvements and reduction in forgetting, which could alleviate the role of an offline episodic memory/experience replay buffer. △ Less

Submitted 19 June, 2024; originally announced June 2024.

Comments: 9 pages, 2 figures

arXiv:2406.07887 [pdf, other]

An Empirical Study of Mamba-based Language Models

Authors: Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro

Abstract: Selective state-space models (SSMs) like Mamba overcome some of the shortcomings of Transformers, such as quadratic computational complexity with sequence length and large inference-time memory requirements from the key-value cache. Moreover, recent studies have shown that SSMs can match or exceed the language modeling capabilities of Transformers, making them an attractive alternative. In a contr… ▽ More Selective state-space models (SSMs) like Mamba overcome some of the shortcomings of Transformers, such as quadratic computational complexity with sequence length and large inference-time memory requirements from the key-value cache. Moreover, recent studies have shown that SSMs can match or exceed the language modeling capabilities of Transformers, making them an attractive alternative. In a controlled setting (e.g., same data), however, studies so far have only presented small scale experiments comparing SSMs to Transformers. To understand the strengths and weaknesses of these architectures at larger scales, we present a direct comparison between 8B-parameter Mamba, Mamba-2, and Transformer models trained on the same datasets of up to 3.5T tokens. We also compare these models to a hybrid architecture consisting of 43% Mamba-2, 7% attention, and 50% MLP layers (Mamba-2-Hybrid). Using a diverse set of tasks, we answer the question of whether Mamba models can match Transformers at larger training budgets. Our results show that while pure SSMs match or exceed Transformers on many tasks, they lag behind Transformers on tasks which require strong copying or in-context learning abilities (e.g., 5-shot MMLU, Phonebook) or long-context reasoning. In contrast, we find that the 8B Mamba-2-Hybrid exceeds the 8B Transformer on all 12 standard tasks we evaluated (+2.65 points on average) and is predicted to be up to 8x faster when generating tokens at inference time. To validate long-context capabilities, we provide additional experiments evaluating variants of the Mamba-2-Hybrid and Transformer extended to support 16K, 32K, and 128K sequences. On an additional 23 long-context tasks, the hybrid model continues to closely match or exceed the Transformer on average. To enable further study, we release the checkpoints as well as the code used to train our models as part of NVIDIA's Megatron-LM project. △ Less

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.07521 [pdf, other]

Faster Spectral Density Estimation and Sparsification in the Nuclear Norm

Authors: Yujia Jin, Ishani Karmarkar, Christopher Musco, Aaron Sidford, Apoorv Vikram Singh

Abstract: We consider the problem of estimating the spectral density of the normalized adjacency matrix of an $n$-node undirected graph. We provide a randomized algorithm that, with $O(nε^{-2})$ queries to a degree and neighbor oracle and in $O(nε^{-3})$ time, estimates the spectrum up to $ε$ accuracy in the Wasserstein-1 metric. This improves on previous state-of-the-art methods, including an $O(nε^{-7})$… ▽ More We consider the problem of estimating the spectral density of the normalized adjacency matrix of an $n$-node undirected graph. We provide a randomized algorithm that, with $O(nε^{-2})$ queries to a degree and neighbor oracle and in $O(nε^{-3})$ time, estimates the spectrum up to $ε$ accuracy in the Wasserstein-1 metric. This improves on previous state-of-the-art methods, including an $O(nε^{-7})$ time algorithm from [Braverman et al., STOC 2022] and, for sufficiently small $ε$, a $2^{O(ε^{-1})}$ time method from [Cohen-Steiner et al., KDD 2018]. To achieve this result, we introduce a new notion of graph sparsification, which we call nuclear sparsification. We provide an $O(nε^{-2})$-query and $O(nε^{-2})$-time algorithm for computing $O(nε^{-2})$-sparse nuclear sparsifiers. We show that this bound is optimal in both its sparsity and query complexity, and we separate our results from the related notion of additive spectral sparsification. Of independent interest, we show that our sparsification method also yields the first deterministic algorithm for spectral density estimation that scales linearly with $n$ (sublinear in the representation size of the graph). △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2024

arXiv:2405.12087 [pdf, other]

Channel Balance Interpolation in the Lightning Network via Machine Learning

Authors: Vincent, Emanuele Rossi, Vikash Singh

Abstract: The Bitcoin Lightning Network is a Layer 2 payment protocol that addresses Bitcoin's scalability by facilitating quick and cost effective transactions through payment channels. This research explores the feasibility of using machine learning models to interpolate channel balances within the network, which can be used for optimizing the network's pathfinding algorithms. While there has been much ex… ▽ More The Bitcoin Lightning Network is a Layer 2 payment protocol that addresses Bitcoin's scalability by facilitating quick and cost effective transactions through payment channels. This research explores the feasibility of using machine learning models to interpolate channel balances within the network, which can be used for optimizing the network's pathfinding algorithms. While there has been much exploration in balance probing and multipath payment protocols, predicting channel balances using solely node and channel features remains an uncharted area. This paper evaluates the performance of several machine learning models against two heuristic baselines and investigates the predictive capabilities of various features. Our model performs favorably in experimental evaluation, outperforming by 10% against an equal split baseline where both edges are assigned half of the channel capacity. △ Less

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.10206 [pdf, other]

A Participatory Budgeting based Truthful Budget-Limited Incentive Mechanism for Time-Constrained Tasks in Crowdsensing Systems

Authors: Chattu Bhargavi, Vikash Kumar Singh

Abstract: Crowdsensing, also known as participatory sensing, is a method of data collection that involves gathering information from a large number of common people (or individuals), often using mobile devices or other personal technologies. This paper considers the set-up with multiple task requesters and several task executors in a strategic setting. Each task requester has multiple heterogeneous tasks an… ▽ More Crowdsensing, also known as participatory sensing, is a method of data collection that involves gathering information from a large number of common people (or individuals), often using mobile devices or other personal technologies. This paper considers the set-up with multiple task requesters and several task executors in a strategic setting. Each task requester has multiple heterogeneous tasks and an estimated budget for the tasks. In our proposed model, the Government has a publicly known fund (or budget) and is limited. Due to limited funds, it may not be possible for the platform to offer the funds to all the available task requesters. For that purpose, in the first tier, the voting by the city dwellers over the task requesters is carried out to decide on the subset of task requesters receiving the Government fund. In the second tier, each task of the task requesters has start and finish times. Based on that, firstly, the tasks are distributed to distinct slots. In each slot, we have multiple task executors for executing the floated tasks. Each task executor reports a cost (private) for completing the floated task(s). Given the above-discussed set-up, the objectives of the second tier are: (1) to schedule each task of the task requesters in the available slots in a non-conflicting manner and (2) to select a set of executors for the available tasks in such a way that the total incentive given to the task executors should be at most the budget for the tasks. For the discussed scenario, a truthful incentive based mechanism is designed that also takes care of budget criteria. Theoretical analysis is done, and it shows that the proposed mechanism is computationally efficient, truthful, budget-feasible, and individually rational. The simulation is carried out, and the efficacy of the designed mechanism is compared with the state-of-the-art mechanisms. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: 17 pages, 25 figures

arXiv:2405.01409 [pdf, other]

Goal-conditioned reinforcement learning for ultrasound navigation guidance

Authors: Abdoul Aziz Amadou, Vivek Singh, Florin C. Ghesu, Young-Ho Kim, Laura Stanciulescu, Harshitha P. Sai, Puneet Sharma, Alistair Young, Ronak Rajani, Kawal Rhode

Abstract: Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance me… ▽ More Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance method based on contrastive learning as goal-conditioned reinforcement learning (GCRL). We augment the previous framework using a novel contrastive patient batching method (CPB) and a data-augmented contrastive loss, both of which we demonstrate are essential to ensure generalization to anatomical variations across patients. The proposed framework enables navigation to both standard diagnostic as well as intricate interventional views with a single model. Our method was developed with a large dataset of 789 patients and obtained an average error of 6.56 mm in position and 9.36 degrees in angle on a testing dataset of 140 patients, which is competitive or superior to models trained on individual views. Furthermore, we quantitatively validate our method's ability to navigate to interventional views such as the Left Atrial Appendage (LAA) view used in LAA closure. Our approach holds promise in providing valuable guidance during transesophageal ultrasound examinations, contributing to the advancement of skill acquisition for cardiac ultrasound practitioners. △ Less

Submitted 22 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: 11 pages, 3 figures

ACM Class: I.4.0; I.5.0

arXiv:2404.12306 [pdf]

Switchable Single/Dual Edge Registers for Pipeline Architecture

Authors: Suyash Vardhan Singh, Rakeshkumar Mahto

Abstract: The demand for low power processing is increasing due to mobile and portable devices. In a processor unit, an adder is an important building block since it is used in Floating Point Units (FPU) and Arithmetic Logic Units (ALU). Also, pipeline techniques are used extensively to improve the throughput of the processing unit. To implement a pipeline requires adding a register at each sub-stage that r… ▽ More The demand for low power processing is increasing due to mobile and portable devices. In a processor unit, an adder is an important building block since it is used in Floating Point Units (FPU) and Arithmetic Logic Units (ALU). Also, pipeline techniques are used extensively to improve the throughput of the processing unit. To implement a pipeline requires adding a register at each sub-stage that result in increasing the latency. Moreover, designing a low power pipeline adder with low latency has drawn a lot of attention. In a pipelined architecture that uses Dual Edge Triggered (DET) based registers can help in reducing the latency since they can capture input data at both clock edges. However, for high input activity, a DET flip-flop consumes more power than a Single-Edge Triggered (SET) flip-flop. Moreover, it is required to replace each Flip-Flop (FF) in the processor with Dual Edge Triggered (DET) Flip-Flop which will be a considerable area and power overhead. Therefore, it is desirable to have a switchable DET to SET depending on input activity or load condition to reduce the dynamic power consumption. In this paper, we are proposing a new shift register which imitates DET FF based shift register without the need of special DET FF. The proposed shift register improved the latency in a 4-bit pipelined adder by two-fold. Additionally, the power delay product was reduced by 44.16 %. △ Less

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.07926 [pdf, ps, other]

Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation

Authors: Jinkyung Park, Pamela Wisniewski, Vivek Singh

Abstract: In this position paper, we discuss the potential for leveraging LLMs as interactive research tools to facilitate collaboration between human coders and AI to effectively annotate online risk data at scale. Collaborative human-AI labeling is a promising approach to annotating large-scale and complex data for various tasks. Yet, tools and methods to support effective human-AI collaboration for data… ▽ More In this position paper, we discuss the potential for leveraging LLMs as interactive research tools to facilitate collaboration between human coders and AI to effectively annotate online risk data at scale. Collaborative human-AI labeling is a promising approach to annotating large-scale and complex data for various tasks. Yet, tools and methods to support effective human-AI collaboration for data annotation are under-studied. This gap is pertinent because co-labeling tasks need to support a two-way interactive discussion that can add nuance and context, particularly in the context of online risk, which is highly subjective and contextualized. Therefore, we provide some of the early benefits and challenges of using LLMs-based tools for risk annotation and suggest future directions for the HCI research community to leverage LLMs as research tools to facilitate human-AI collaboration in contextualized online data annotation. Our research interests align very well with the purposes of the LLMs as Research Tools workshop to identify ongoing applications and challenges of using LLMs to work with data in HCI research. We anticipate learning valuable insights from organizers and participants into how LLMs can help reshape the HCI community's methods for working with data. △ Less

Submitted 11 April, 2024; originally announced April 2024.

Comments: This paper has been peer-reviewed and presented at the "CHI 2024 Workshop on LLMs as Research Tools: Applications and Evaluations in HCI Data Work, May 12, 2024, Honolulu, HI, USA."

arXiv:2404.03023 [pdf, ps, other]

Toward Safe Evolution of Artificial Intelligence (AI) based Conversational Agents to Support Adolescent Mental and Sexual Health Knowledge Discovery

Authors: Jinkyung Park, Vivek Singh, Pamela Wisniewski

Abstract: Following the recent release of various Artificial Intelligence (AI) based Conversation Agents (CAs), adolescents are increasingly using CAs for interactive knowledge discovery on sensitive topics, including mental and sexual health topics. Exploring such sensitive topics through online search has been an essential part of adolescent development, and CAs can support their knowledge discovery on su… ▽ More Following the recent release of various Artificial Intelligence (AI) based Conversation Agents (CAs), adolescents are increasingly using CAs for interactive knowledge discovery on sensitive topics, including mental and sexual health topics. Exploring such sensitive topics through online search has been an essential part of adolescent development, and CAs can support their knowledge discovery on such topics through human-like dialogues. Yet, unintended risks have been documented with adolescents' interactions with AI-based CAs, such as being exposed to inappropriate content, false information, and/or being given advice that is detrimental to their mental and physical well-being (e.g., to self-harm). In this position paper, we discuss the current landscape and opportunities for CAs to support adolescents' mental and sexual health knowledge discovery. We also discuss some of the challenges related to ensuring the safety of adolescents when interacting with CAs regarding sexual and mental health topics. We call for a discourse on how to set guardrails for the safe evolution of AI-based CAs for adolescents. △ Less

Submitted 3 April, 2024; originally announced April 2024.

Comments: This paper has been peer-reviewed and presented at the "CHI 2024 Workshop on Child-centred AI Design, May 11, 2024, Honolulu, HI, USA."

arXiv:2404.02181 [pdf, other]

Leveraging Machine Learning for Early Autism Detection via INDT-ASD Indian Database

Authors: Trapti Shrivastava, Harshal Chaudhari, Vrijendra Singh

Abstract: Machine learning (ML) has advanced quickly, particularly throughout the area of health care. The diagnosis of neurodevelopment problems using ML is a very important area of healthcare. Autism spectrum disorder (ASD) is one of the developmental disorders that is growing the fastest globally. The clinical screening tests used to identify autistic symptoms are expensive and time-consuming. But now th… ▽ More Machine learning (ML) has advanced quickly, particularly throughout the area of health care. The diagnosis of neurodevelopment problems using ML is a very important area of healthcare. Autism spectrum disorder (ASD) is one of the developmental disorders that is growing the fastest globally. The clinical screening tests used to identify autistic symptoms are expensive and time-consuming. But now that ML has been advanced, it's feasible to identify autism early on. Previously, many different techniques have been used in investigations. Still, none of them have produced the anticipated outcomes when it comes to the capacity to predict autistic features utilizing a clinically validated Indian ASD database. Therefore, this study aimed to develop a simple, quick, and inexpensive technique for identifying ASD by using ML. Various machine learning classifiers, including Adaboost (AB), Gradient Boost (GB), Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Gaussian Naive Bayes (GNB), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), were used to develop the autism prediction model. The proposed method was tested with records from the AIIMS Modified INDT-ASD (AMI) database, which were collected through an application developed by AIIMS in Delhi, India. Feature engineering has been applied to make the proposed solution easier than already available solutions. Using the proposed model, we succeeded in predicting ASD using a minimized set of 20 questions rather than the 28 questions presented in AMI with promising accuracy. In a comparative evaluation, SVM emerged as the superior model among others, with 100 $\pm$ 0.05\% accuracy, higher recall by 5.34\%, and improved accuracy by 2.22\%-6.67\% over RF. We have also introduced a web-based solution supporting both Hindi and English. △ Less

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.01392 [pdf, other]

No-go theorem for probabilistic one-way secret-key distillation

Authors: Vishal Singh, Mark M. Wilde

Abstract: The probabilistic one-way distillable secret key is equal to the largest expected rate at which perfect secret key bits can be probabilistically distilled from a bipartite state by means of local operations and one-way classical communication. Here we define the set of super two-extendible states and prove that an arbitrary state in this set cannot be used for probabilistic one-way secret-key dist… ▽ More The probabilistic one-way distillable secret key is equal to the largest expected rate at which perfect secret key bits can be probabilistically distilled from a bipartite state by means of local operations and one-way classical communication. Here we define the set of super two-extendible states and prove that an arbitrary state in this set cannot be used for probabilistic one-way secret-key distillation. This broad class of states includes both erased states and all full-rank states. Comparing the probabilistic one-way distillable secret key with the more commonly studied approximate one-way distillable secret key, our results demonstrate an extreme gap between them for many states of interest, with the approximate one-way distillable secret key being much larger. Our findings naturally extend to probabilistic one-way entanglement distillation, with similar conclusions. △ Less

Submitted 1 April, 2024; originally announced April 2024.

Comments: 4+8 pages

arXiv:2403.12483 [pdf]

doi 10.1007/s00521-023-09087-7

A Hybrid Transformer-Sequencer approach for Age and Gender classification from in-wild facial images

Authors: Aakash Singh, Vivek Kumar Singh

Abstract: The advancements in computer vision and image processing techniques have led to emergence of new application in the domain of visual surveillance, targeted advertisement, content-based searching, and human-computer interaction etc. Out of the various techniques in computer vision, face analysis, in particular, has gained much attention. Several previous studies have tried to explore different appl… ▽ More The advancements in computer vision and image processing techniques have led to emergence of new application in the domain of visual surveillance, targeted advertisement, content-based searching, and human-computer interaction etc. Out of the various techniques in computer vision, face analysis, in particular, has gained much attention. Several previous studies have tried to explore different applications of facial feature processing for a variety of tasks, including age and gender classification. However, despite several previous studies having explored the problem, the age and gender classification of in-wild human faces is still far from the achieving the desired levels of accuracy required for real-world applications. This paper, therefore, attempts to bridge this gap by proposing a hybrid model that combines self-attention and BiLSTM approaches for age and gender classification problems. The proposed models performance is compared with several state-of-the-art model proposed so far. An improvement of approximately 10percent and 6percent over the state-of-the-art implementations for age and gender classification, respectively, are noted for the proposed model. The proposed model is thus found to achieve superior performance and is found to provide a more generalized learning. The model can, therefore, be applied as a core classification component in various image processing and computer vision problems. △ Less

Submitted 20 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: 22 pages

Journal ref: Neural Computing and Applications. 2024 Jan;36(3):1149-65

arXiv:2403.11418 [pdf, other]

Variational Sampling of Temporal Trajectories

Authors: Jurijs Nazarovs, Zhichun Huang, Xingjian Zhen, Sourav Pal, Rudrasis Chakraborty, Vikas Singh

Abstract: A deterministic temporal process can be determined by its trajectory, an element in the product space of (a) initial condition $z_0 \in \mathcal{Z}$ and (b) transition function $f: (\mathcal{Z}, \mathcal{T}) \to \mathcal{Z}$ often influenced by the control of the underlying dynamical system. Existing methods often model the transition function as a differential equation or as a recurrent neural ne… ▽ More A deterministic temporal process can be determined by its trajectory, an element in the product space of (a) initial condition $z_0 \in \mathcal{Z}$ and (b) transition function $f: (\mathcal{Z}, \mathcal{T}) \to \mathcal{Z}$ often influenced by the control of the underlying dynamical system. Existing methods often model the transition function as a differential equation or as a recurrent neural network. Despite their effectiveness in predicting future measurements, few results have successfully established a method for sampling and statistical inference of trajectories using neural networks, partially due to constraints in the parameterization. In this work, we introduce a mechanism to learn the distribution of trajectories by parameterizing the transition function $f$ explicitly as an element in a function space. Our framework allows efficient synthesis of novel trajectories, while also directly providing a convenient tool for inference, i.e., uncertainty estimation, likelihood evaluations and out of distribution detection for abnormal trajectories. These capabilities can have implications for various downstream tasks, e.g., simulation and evaluation for reinforcement learning. △ Less

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.10259 [pdf]

Comprehensive Study Of Predictive Maintenance In Industries Using Classification Models And LSTM Model

Authors: Saket Maheshwari, Sambhav Tiwari, Shyam Rai, Satyam Vinayak Daman Pratap Singh

Abstract: In today's technology-driven era, the imperative for predictive maintenance and advanced diagnostics extends beyond aviation to encompass the identification of damages, failures, and operational defects in rotating and moving machines. Implementing such services not only curtails maintenance costs but also extends machine lifespan, ensuring heightened operational efficiency. Moreover, it serves as… ▽ More In today's technology-driven era, the imperative for predictive maintenance and advanced diagnostics extends beyond aviation to encompass the identification of damages, failures, and operational defects in rotating and moving machines. Implementing such services not only curtails maintenance costs but also extends machine lifespan, ensuring heightened operational efficiency. Moreover, it serves as a preventive measure against potential accidents or catastrophic events. The advent of Artificial Intelligence (AI) has revolutionized maintenance across industries, enabling more accurate and efficient prediction and analysis of machine failures, thereby conserving time and resources. Our proposed study aims to delve into various machine learning classification techniques, including Support Vector Machine (SVM), Random Forest, Logistic Regression, and Convolutional Neural Network LSTM-Based, for predicting and analyzing machine performance. SVM classifies data into different categories based on their positions in a multidimensional space, while Random Forest employs ensemble learning to create multiple decision trees for classification. Logistic Regression predicts the probability of binary outcomes using input data. The primary objective of the study is to assess these algorithms' performance in predicting and analyzing machine performance, considering factors such as accuracy, precision, recall, and F1 score. The findings will aid maintenance experts in selecting the most suitable machine learning algorithm for effective prediction and analysis of machine performance. △ Less

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.07339 [pdf, other]

IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers

Authors: Zhanpeng Zeng, Karthikeyan Sankaralingam, Vikas Singh

Abstract: GEneral Matrix Multiply (GEMM) is a central operation in deep learning and corresponds to the largest chunk of the compute footprint. Therefore, improving its efficiency is an active topic of ongoing research. A popular strategy is the use of low bit-width integers to approximate the original entries in a matrix. This allows efficiency gains, but often requires sophisticated techniques to control… ▽ More GEneral Matrix Multiply (GEMM) is a central operation in deep learning and corresponds to the largest chunk of the compute footprint. Therefore, improving its efficiency is an active topic of ongoing research. A popular strategy is the use of low bit-width integers to approximate the original entries in a matrix. This allows efficiency gains, but often requires sophisticated techniques to control the rounding error incurred. In this work, we first verify/check that when the low bit-width restriction is removed, for a variety of Transformer-based models, whether integers are sufficient for all GEMMs need -- for {\em both} training and inference stages, and can achieve parity with floating point counterparts. No sophisticated techniques are needed. We find that while a large majority of entries in matrices (encountered in such models) can be easily represented by {\em low} bit-width integers, the existence of a few heavy hitter entries make it difficult to achieve efficiency gains via the exclusive use of low bit-width GEMMs alone. To address this issue, we develop a simple algorithm, Integer Matrix Unpacking (IM-Unpack), to {\em unpack} a matrix with large integer entries into a larger matrix whose entries all lie within the representable range of arbitrarily low bit-width integers. This allows {\em equivalence} with the original GEMM, i.e., the exact result can be obtained using purely low bit-width integer GEMMs. This comes at the cost of additional operations -- we show that for many popular models, this overhead is quite small. △ Less

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.07221 [pdf, other]

LookupFFN: Making Transformers Compute-lite for CPU inference

Authors: Zhanpeng Zeng, Michael Davies, Pranav Pulijala, Karthikeyan Sankaralingam, Vikas Singh

Abstract: While GPU clusters are the de facto choice for training large deep neural network (DNN) models today, several reasons including ease of workflow, security and cost have led to efforts investigating whether CPUs may be viable for inference in routine use in many sectors of the industry. But the imbalance between the compute capabilities of GPUs and CPUs is huge. Motivated by these considerations, w… ▽ More While GPU clusters are the de facto choice for training large deep neural network (DNN) models today, several reasons including ease of workflow, security and cost have led to efforts investigating whether CPUs may be viable for inference in routine use in many sectors of the industry. But the imbalance between the compute capabilities of GPUs and CPUs is huge. Motivated by these considerations, we study a module which is a workhorse within modern DNN architectures, GEMM based Feed Forward Networks (FFNs), and assess the extent to which it can be made compute- (or FLOP-) lite. Specifically, we propose an alternative formulation (we call it LookupFFN) to GEMM based FFNs inspired by the recent studies of using Locality Sensitive Hashing (LSH) to approximate FFNs. Our formulation recasts most essential operations as a memory look-up, leveraging the trade-off between the two resources on any platform: compute and memory (since CPUs offer it in abundance). For RoBERTa language model pretraining, our formulation achieves similar performance compared to GEMM based FFNs, while dramatically reducing the required FLOP. Our development is complemented with a detailed hardware profiling of strategies that will maximize efficiency -- not just on contemporary hardware but on products that will be offered in the near/medium term future. Code is avaiable at \url{https://github.com/mlpen/LookupFFN}. △ Less

Submitted 11 March, 2024; originally announced March 2024.

Comments: ICML 2023

arXiv:2403.06082 [pdf, other]

FrameQuant: Flexible Low-Bit Quantization for Transformers

Authors: Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang, Vikas Singh

Abstract: Transformers are the backbone of powerful foundation models for many Vision and Natural Language Processing tasks. But their compute and memory/storage footprint is large, and so, serving such models is expensive often requiring high-end hardware. To mitigate this difficulty, Post-Training Quantization seeks to modify a pre-trained model and quantize it to eight bits or lower, significantly boosti… ▽ More Transformers are the backbone of powerful foundation models for many Vision and Natural Language Processing tasks. But their compute and memory/storage footprint is large, and so, serving such models is expensive often requiring high-end hardware. To mitigate this difficulty, Post-Training Quantization seeks to modify a pre-trained model and quantize it to eight bits or lower, significantly boosting compute/memory/latency efficiency. Such models have been successfully quantized to four bits with some performance loss. In this work, we outline a simple scheme to quantize Transformer-based models to just two bits (plus some overhead) with only a small drop in accuracy. Key to our formulation is a concept borrowed from Harmonic analysis called Fusion Frames. Our main finding is that the quantization must take place not in the original weight space, but instead in the Fusion Frame representations. If quantization is interpreted as the addition of noise, our casting of the problem allows invoking an extensive body of known consistent recovery and noise robustness guarantees. Further, if desired, de-noising filters are known in closed form. We show empirically, via a variety of experiments, that (almost) two-bit quantization for Transformer models promises sizable efficiency gains. △ Less

Submitted 9 March, 2024; originally announced March 2024.

Comments: 25 pages, 15 figures

arXiv:2403.02598

Pooling Image Datasets With Multiple Covariate Shift and Imbalance

Authors: Sotirios Panagiotis Chytas, Vishnu Suresh Lokhande, Peiran Li, Vikas Singh

Abstract: Small sample sizes are common in many disciplines, which necessitates pooling roughly similar datasets across multiple institutions to study weak but relevant associations between images and disease outcomes. Such data often manifest shift/imbalance in covariates (i.e., secondary non-imaging data). Controlling for such nuisance variables is common within standard statistical analysis, but the idea… ▽ More Small sample sizes are common in many disciplines, which necessitates pooling roughly similar datasets across multiple institutions to study weak but relevant associations between images and disease outcomes. Such data often manifest shift/imbalance in covariates (i.e., secondary non-imaging data). Controlling for such nuisance variables is common within standard statistical analysis, but the ideas do not directly apply to overparameterized models. Consequently, recent work has shown how strategies from invariant representation learning provides a meaningful starting point, but the current repertoire of methods is limited to accounting for shifts/imbalances in just a couple of covariates at a time. In this paper, we show how viewing this problem from the perspective of Category theory provides a simple and effective solution that completely avoids elaborate multi-stage training pipelines that would otherwise be needed. We show the effectiveness of this approach via extensive experiments on real datasets. Further, we discuss how this style of formulation offers a unified perspective on at least 5+ distinct problem settings, from self-supervised learning to matching problems in 3D reconstruction. △ Less

Submitted 14 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: We need to do some fixes of references to make them more precise. This paper will be corrected and uploaded again by another group member

arXiv:2402.19250 [pdf, other]

doi 10.1016/j.neucom.2024.128222

Feature boosting with efficient attention for scene parsing

Authors: Vivek Singh, Shailza Sharma, Fabio Cuzzolin

Abstract: The complexity of scene parsing grows with the number of object and scene classes, which is higher in unrestricted open scenes. The biggest challenge is to model the spatial relation between scene elements while succeeding in identifying objects at smaller scales. This paper presents a novel feature-boosting network that gathers spatial context from multiple levels of feature extraction and comput… ▽ More The complexity of scene parsing grows with the number of object and scene classes, which is higher in unrestricted open scenes. The biggest challenge is to model the spatial relation between scene elements while succeeding in identifying objects at smaller scales. This paper presents a novel feature-boosting network that gathers spatial context from multiple levels of feature extraction and computes the attention weights for each level of representation to generate the final class labels. A novel `channel attention module' is designed to compute the attention weights, ensuring that features from the relevant extraction stages are boosted while the others are attenuated. The model also learns spatial context information at low resolution to preserve the abstract spatial relationships among scene elements and reduce computation cost. Spatial attention is subsequently concatenated into a final feature set before applying feature boosting. Low-resolution spatial attention features are trained using an auxiliary task that helps learning a coarse global scene structure. The proposed model outperforms all state-of-the-art models on both the ADE20K and the Cityscapes datasets. △ Less

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.15214 [pdf, other]

ChildAugment: Data Augmentation Methods for Zero-Resource Children's Speaker Verification

Authors: Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen

Abstract: The accuracy of modern automatic speaker verification (ASV) systems, when trained exclusively on adult data, drops substantially when applied to children's speech. The scarcity of children's speech corpora hinders fine-tuning ASV systems for children's speech. Hence, there is a timely need to explore more effective ways of reusing adults' speech data. One promising approach is to align vocal-tract… ▽ More The accuracy of modern automatic speaker verification (ASV) systems, when trained exclusively on adult data, drops substantially when applied to children's speech. The scarcity of children's speech corpora hinders fine-tuning ASV systems for children's speech. Hence, there is a timely need to explore more effective ways of reusing adults' speech data. One promising approach is to align vocal-tract parameters between adults and children through children-specific data augmentation, referred here to as ChildAugment. Specifically, we modify the formant frequencies and formant bandwidths of adult speech to emulate children's speech. The modified spectra are used to train ECAPA-TDNN (emphasized channel attention, propagation, and aggregation in time-delay neural network) recognizer for children. We compare ChildAugment against various state-of-the-art data augmentation techniques for children's ASV. We also extensively compare different scoring methods, including cosine scoring, PLDA (probabilistic linear discriminant analysis), and NPLDA (neural PLDA). We also propose a low-complexity weighted cosine score for extremely low-resource children ASV. Our findings on the CSLU kids corpus indicate that ChildAugment holds promise as a simple, acoustics-motivated approach, for improving state-of-the-art deep learning based ASV for children. We achieve up to 12.45% (boys) and 11.96% (girls) relative improvement over the baseline. △ Less

Submitted 23 February, 2024; originally announced February 2024.

Comments: The following article has been accepted by The Journal of the Acoustical Society of America (JASA). After it is published, it will be found at https://pubs.aip.org/asa/jasa

arXiv:2402.06463 [pdf, other]

Cardiac ultrasound simulation for autonomous ultrasound navigation

Authors: Abdoul Aziz Amadou, Laura Peralta, Paul Dryburgh, Paul Klein, Kaloian Petkov, Richard James Housden, Vivek Singh, Rui Liao, Young-Ho Kim, Florin Christian Ghesu, Tommaso Mansi, Ronak Rajani, Alistair Young, Kawal Rhode

Abstract: Ultrasound is well-established as an imaging modality for diagnostic and interventional purposes. However, the image quality varies with operator skills as acquiring and interpreting ultrasound images requires extensive training due to the imaging artefacts, the range of acquisition parameters and the variability of patient anatomies. Automating the image acquisition task could improve acquisition… ▽ More Ultrasound is well-established as an imaging modality for diagnostic and interventional purposes. However, the image quality varies with operator skills as acquiring and interpreting ultrasound images requires extensive training due to the imaging artefacts, the range of acquisition parameters and the variability of patient anatomies. Automating the image acquisition task could improve acquisition reproducibility and quality but training such an algorithm requires large amounts of navigation data, not saved in routine examinations. Thus, we propose a method to generate large amounts of ultrasound images from other modalities and from arbitrary positions, such that this pipeline can later be used by learning algorithms for navigation. We present a novel simulation pipeline which uses segmentations from other modalities, an optimized volumetric data representation and GPU-accelerated Monte Carlo path tracing to generate view-dependent and patient-specific ultrasound images. We extensively validate the correctness of our pipeline with a phantom experiment, where structures' sizes, contrast and speckle noise properties are assessed. Furthermore, we demonstrate its usability to train neural networks for navigation in an echocardiography view classification experiment by generating synthetic images from more than 1000 patients. Networks pre-trained with our simulations achieve significantly superior performance in settings where large real datasets are not available, especially for under-represented classes. The proposed approach allows for fast and accurate patient-specific ultrasound image generation, and its usability for training networks for navigation-related tasks is demonstrated. △ Less

Submitted 9 February, 2024; originally announced February 2024.

Comments: 24 pages, 10 figures, 5 tables

ACM Class: I.6.0; I.5.4; J.3

arXiv:2401.15906 [pdf, other]

Mean Estimation with User-Level Privacy for Spatio-Temporal IoT Datasets

Authors: V. Arvind Rameshwar, Anshoo Tandon, Prajjwal Gupta, Aditya Vikram Singh, Novoneel Chakraborty, Abhay Sharma

Abstract: This paper considers the problem of the private release of sample means of speed values from traffic datasets. Our key contribution is the development of user-level differentially private algorithms that incorporate carefully chosen parameter values to ensure low estimation errors on real-world datasets, while ensuring privacy. We test our algorithms on ITMS (Intelligent Traffic Management System)… ▽ More This paper considers the problem of the private release of sample means of speed values from traffic datasets. Our key contribution is the development of user-level differentially private algorithms that incorporate carefully chosen parameter values to ensure low estimation errors on real-world datasets, while ensuring privacy. We test our algorithms on ITMS (Intelligent Traffic Management System) data from an Indian city, where the speeds of different buses are drawn in a potentially non-i.i.d. manner from an unknown distribution, and where the number of speed samples contributed by different buses is potentially different. We then apply our algorithms to large synthetic datasets, generated based on the ITMS data. Here, we provide theoretical justification for the observed performance trends, and also provide recommendations for the choices of algorithm subroutines that result in low estimation errors. Finally, we characterize the best performance of pseudo-user creation-based algorithms on worst-case datasets via a minimax approach; this then gives rise to a novel procedure for the creation of pseudo-users, which optimizes the worst-case total estimation error. The algorithms discussed in the paper are readily applicable to general spatio-temporal IoT datasets for releasing a differentially private mean of a desired value. △ Less

Submitted 25 April, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 14 pages, 5 figures, submitted to the ACM for possible publication

arXiv:2401.05392 [pdf, other]

AT-2FF: Adaptive Type-2 Fuzzy Filter for De-noising Images Corrupted with Salt-and-Pepper

Authors: Vikas Singh

Abstract: Noise is inevitably common in digital images, leading to visual image deterioration. Therefore, a suitable filtering method is required to lessen the noise while preserving the image features (edges, corners, etc.). This paper presents the efficient type-2 fuzzy weighted mean filter with an adaptive threshold to remove the SAP noise. The present filter has two primary steps: The first stage catego… ▽ More Noise is inevitably common in digital images, leading to visual image deterioration. Therefore, a suitable filtering method is required to lessen the noise while preserving the image features (edges, corners, etc.). This paper presents the efficient type-2 fuzzy weighted mean filter with an adaptive threshold to remove the SAP noise. The present filter has two primary steps: The first stage categorizes images as lightly, medium, and heavily corrupted based on an adaptive threshold by comparing the M-ALD of processed pixels with the upper and lower MF of the type-2 fuzzy identifier. The second stage eliminates corrupted pixels by computing the appropriate weight using GMF with the mean and variance of the uncorrupted pixels in the filter window. Simulation results vividly show that the obtained denoised images preserve image features, i.e., edges, corners, and other sharp structures, compared with different filtering methods. △ Less

Submitted 19 December, 2023; originally announced January 2024.

arXiv:2401.05377 [pdf]

The impact of generative artificial intelligence on socioeconomic inequalities and policy making

Authors: Valerio Capraro, Austin Lentsch, Daron Acemoglu, Selin Akgun, Aisel Akhmedova, Ennio Bilancini, Jean-François Bonnefon, Pablo Brañas-Garza, Luigi Butera, Karen M. Douglas, Jim A. C. Everett, Gerd Gigerenzer, Christine Greenhow, Daniel A. Hashimoto, Julianne Holt-Lunstad, Jolanda Jetten, Simon Johnson, Chiara Longoni, Pete Lunn, Simone Natale, Iyad Rahwan, Neil Selwyn, Vivek Singh, Siddharth Suri, Jennifer Sutcliffe , et al. (6 additional authors not shown)

Abstract: Generative artificial intelligence has the potential to both exacerbate and ameliorate existing socioeconomic inequalities. In this article, we provide a state-of-the-art interdisciplinary overview of the potential impacts of generative AI on (mis)information and three information-intensive domains: work, education, and healthcare. Our goal is to highlight how generative AI could worsen existing i… ▽ More Generative artificial intelligence has the potential to both exacerbate and ameliorate existing socioeconomic inequalities. In this article, we provide a state-of-the-art interdisciplinary overview of the potential impacts of generative AI on (mis)information and three information-intensive domains: work, education, and healthcare. Our goal is to highlight how generative AI could worsen existing inequalities while illuminating how AI may help mitigate pervasive social problems. In the information domain, generative AI can democratize content creation and access, but may dramatically expand the production and proliferation of misinformation. In the workplace, it can boost productivity and create new jobs, but the benefits will likely be distributed unevenly. In education, it offers personalized learning, but may widen the digital divide. In healthcare, it might improve diagnostics and accessibility, but could deepen pre-existing inequalities. In each section we cover a specific topic, evaluate existing research, identify critical gaps, and recommend research directions, including explicit trade-offs that complicate the derivation of a priori hypotheses. We conclude with a section highlighting the role of policymaking to maximize generative AI's potential to reduce inequalities while mitigating its harmful effects. We discuss strengths and weaknesses of existing policy frameworks in the European Union, the United States, and the United Kingdom, observing that each fails to fully confront the socioeconomic challenges we have identified. We propose several concrete policies that could promote shared prosperity through the advancement of generative AI. This article emphasizes the need for interdisciplinary collaborations to understand and address the complex challenges of generative AI. △ Less

Submitted 6 May, 2024; v1 submitted 16 December, 2023; originally announced January 2024.

Comments: PNAS Nexus, in press

arXiv:2312.11593 [pdf, other]

Towards Establishing Dense Correspondence on Multiview Coronary Angiography: From Point-to-Point to Curve-to-Curve Query Matching

Authors: Yifan Wu, Rohit Jena, Mehmet Gulsun, Vivek Singh, Puneet Sharma, James C. Gee

Abstract: Coronary angiography is the gold standard imaging technique for studying and diagnosing coronary artery disease. However, the resulting 2D X-ray projections lose 3D information and exhibit visual ambiguities. In this work, we aim to establish dense correspondence in multi-view angiography, serving as a fundamental basis for various clinical applications and downstream tasks. To overcome the challe… ▽ More Coronary angiography is the gold standard imaging technique for studying and diagnosing coronary artery disease. However, the resulting 2D X-ray projections lose 3D information and exhibit visual ambiguities. In this work, we aim to establish dense correspondence in multi-view angiography, serving as a fundamental basis for various clinical applications and downstream tasks. To overcome the challenge of unavailable annotated data, we designed a data simulation pipeline using 3D Coronary Computed Tomography Angiography (CCTA). We formulated the problem of dense correspondence estimation as a query matching task over all points of interest in the given views. We established point-to-point query matching and advanced it to curve-to-curve correspondence, significantly reducing errors by minimizing ambiguity and improving topological awareness. The method was evaluated on a set of 1260 image pairs from different views across 8 clinically relevant angulation groups, demonstrating compelling results and indicating the feasibility of establishing dense correspondence in multi-view angiography. △ Less

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2311.12336 [pdf]

Classification of Instagram fake users using supervised machine learning algorithms

Authors: Vertika Singh, Naman Tolasaria, Patel Meet Alpeshkumar, Shreyash Bartwal

Abstract: In the contemporary era, online social networks have become integral to social life, revolutionizing the way individuals manage their social connections. While enhancing accessibility and immediacy, these networks have concurrently given rise to challenges, notably the proliferation of fraudulent profiles and online impersonation. This paper proposes an application designed to detect and neutraliz… ▽ More In the contemporary era, online social networks have become integral to social life, revolutionizing the way individuals manage their social connections. While enhancing accessibility and immediacy, these networks have concurrently given rise to challenges, notably the proliferation of fraudulent profiles and online impersonation. This paper proposes an application designed to detect and neutralize such dishonest entities, with a focus on safeguarding companies from potential fraud. The user-centric design of the application ensures accessibility for investigative agencies, particularly the criminal branch, facilitating navigation of complex social media landscapes and integration with existing investigative procedures △ Less

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.08689 [pdf, other]

Low Complexity High Speed Deep Neural Network Augmented Wireless Channel Estimation

Authors: Syed Asrar ul haq, Varun Singh, Bhanu Teja Tanaji, Sumit Darak

Abstract: The channel estimation (CE) in wireless receivers is one of the most critical and computationally complex signal processing operations. Recently, various works have shown that the deep learning (DL) based CE outperforms conventional minimum mean square error (MMSE) based CE, and it is hardware-friendly. However, DL-based CE has higher complexity and latency than popularly used least square (LS) ba… ▽ More The channel estimation (CE) in wireless receivers is one of the most critical and computationally complex signal processing operations. Recently, various works have shown that the deep learning (DL) based CE outperforms conventional minimum mean square error (MMSE) based CE, and it is hardware-friendly. However, DL-based CE has higher complexity and latency than popularly used least square (LS) based CE. In this work, we propose a novel low complexity high-speed Deep Neural Network-Augmented Least Square (LC-LSDNN) algorithm for IEEE 802.11p wireless physical layer and efficiently implement it on Zynq system on chip (ZSoC). The novelty of the LC-LSDNN is to use different DNNs for real and imaginary values of received complex symbols. This helps reduce the size of DL by 59% and optimize the critical path, allowing it to operate at 60% higher clock frequency. We also explore three different architectures for MMSE-based CE. We show that LC-LSDNN significantly outperforms MMSE and state-of-the-art DL-based CE for a wide range of signal-to-noise ratios (SNR) and different wireless channels. Also, it is computationally efficient, with around 50% lower resources than existing DL-based CE. △ Less

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2310.03841 [pdf, other]

ALBERTA: ALgorithm-Based Error Resilience in Transformer Architectures

Authors: Haoxuan Liu, Vasu Singh, Michał Filipiuk, Siva Kumar Sastry Hari

Abstract: Vision Transformers are being increasingly deployed in safety-critical applications that demand high reliability. It is crucial to ensure the correctness of their execution in spite of potential errors such as transient hardware errors. We propose a novel algorithm-based resilience framework called ALBERTA that allows us to perform end-to-end resilience analysis and protection of transformer-based… ▽ More Vision Transformers are being increasingly deployed in safety-critical applications that demand high reliability. It is crucial to ensure the correctness of their execution in spite of potential errors such as transient hardware errors. We propose a novel algorithm-based resilience framework called ALBERTA that allows us to perform end-to-end resilience analysis and protection of transformer-based architectures. First, our work develops an efficient process of computing and ranking the resilience of transformers layers. We find that due to the large size of transformer models, applying traditional network redundancy to a subset of the most vulnerable layers provides high error coverage albeit with impractically high overhead. We address this shortcoming by providing a software-directed, checksum-based error detection technique aimed at protecting the most vulnerable general matrix multiply (GEMM) layers in the transformer models that use either floating-point or integer arithmetic. Results show that our approach achieves over 99% coverage for errors that result in a mismatch with less than 0.2% and 0.01% computation and memory overheads, respectively. Lastly, we present the applicability of our framework in various modern GPU architectures under different numerical precisions. We introduce an efficient self-correction mechanism for resolving erroneous detection with an average of less than 2% overhead per error. △ Less

Submitted 5 February, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2309.15750 [pdf, other]

Automated CT Lung Cancer Screening Workflow using 3D Camera

Authors: Brian Teixeira, Vivek Singh, Birgi Tamersoy, Andreas Prokein, Ankur Kapoor

Abstract: Despite recent developments in CT planning that enabled automation in patient positioning, time-consuming scout scans are still needed to compute dose profile and ensure the patient is properly positioned. In this paper, we present a novel method which eliminates the need for scout scans in CT lung cancer screening by estimating patient scan range, isocenter, and Water Equivalent Diameter (WED) fr… ▽ More Despite recent developments in CT planning that enabled automation in patient positioning, time-consuming scout scans are still needed to compute dose profile and ensure the patient is properly positioned. In this paper, we present a novel method which eliminates the need for scout scans in CT lung cancer screening by estimating patient scan range, isocenter, and Water Equivalent Diameter (WED) from 3D camera images. We achieve this task by training an implicit generative model on over 60,000 CT scans and introduce a novel approach for updating the prediction using real-time scan data. We demonstrate the effectiveness of our method on a testing set of 110 pairs of depth data and CT scan, resulting in an average error of 5mm in estimating the isocenter, 13mm in determining the scan range, 10mm and 16mm in estimating the AP and lateral WED respectively. The relative WED error of our method is 4%, which is well within the International Electrotechnical Commission (IEC) acceptance criteria of 10%. △ Less

Submitted 27 September, 2023; originally announced September 2023.

Comments: Accepted at MICCAI 2023

arXiv:2309.14328 [pdf, other]

doi 10.2312/envirvis.20231100

pyParaOcean: A System for Visual Analysis of Ocean Data

Authors: Toshit Jain, Varun Singh, Vijay Kumar Boda, Upkar Singh, Ingrid Hotz, P. N. Vinayachandran, Vijay Natarajan

Abstract: Visual analysis is well adopted within the field of oceanography for the analysis of model simulations, detection of different phenomena and events, and tracking of dynamic processes. With increasing data sizes and the availability of multivariate dynamic data, there is a growing need for scalable and extensible tools for visualization and interactive exploration. We describe pyParaOcean, a visual… ▽ More Visual analysis is well adopted within the field of oceanography for the analysis of model simulations, detection of different phenomena and events, and tracking of dynamic processes. With increasing data sizes and the availability of multivariate dynamic data, there is a growing need for scalable and extensible tools for visualization and interactive exploration. We describe pyParaOcean, a visualization system that supports several tasks routinely used in the visual analysis of ocean data. The system is available as a plugin to Paraview and is hence able to leverage its distributed computing capabilities and its rich set of generic analysis and visualization functionalities. pyParaOcean provides modules to support different visual analysis tasks specific to ocean data, such as eddy identification and salinity movement tracking. These modules are available as Paraview filters and this seamless integration results in a system that is easy to install and use. A case study on the Bay of Bengal illustrates the utility of the system for the study of ocean phenomena and processes. △ Less

Submitted 25 September, 2023; originally announced September 2023.

Comments: 8 pages, EnvirVis2023

ACM Class: F.7; I.3.6

Journal ref: envirvis2023

arXiv:2308.04589 [pdf, other]

Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction

Authors: Izzeddin Teeti, Rongali Sai Bhargav, Vivek Singh, Andrew Bradley, Biplab Banerjee, Fabio Cuzzolin

Abstract: The emerging field of action prediction plays a vital role in various computer vision applications such as autonomous driving, activity analysis and human-computer interaction. Despite significant advancements, accurately predicting future actions remains a challenging problem due to high dimensionality, complex dynamics and uncertainties inherent in video data. Traditional supervised approaches r… ▽ More The emerging field of action prediction plays a vital role in various computer vision applications such as autonomous driving, activity analysis and human-computer interaction. Despite significant advancements, accurately predicting future actions remains a challenging problem due to high dimensionality, complex dynamics and uncertainties inherent in video data. Traditional supervised approaches require large amounts of labelled data, which is expensive and time-consuming to obtain. This paper introduces a novel self-supervised video strategy for enhancing action prediction inspired by DINO (self-distillation with no labels). The Temporal-DINO approach employs two models; a 'student' processing past frames; and a 'teacher' processing both past and future frames, enabling a broader temporal context. During training, the teacher guides the student to learn future context by only observing past frames. The strategy is evaluated on ROAD dataset for the action prediction downstream task using 3D-ResNet, Transformer, and LSTM architectures. The experimental results showcase significant improvements in prediction performance across these architectures, with our method achieving an average enhancement of 9.9% Precision Points (PP), highlighting its effectiveness in enhancing the backbones' capabilities of capturing long-term dependencies. Furthermore, our approach demonstrates efficiency regarding the pretraining dataset size and the number of epochs required. This method overcomes limitations present in other approaches, including considering various backbone architectures, addressing multiple prediction horizons, reducing reliance on hand-crafted augmentations, and streamlining the pretraining process into a single stage. These findings highlight the potential of our approach in diverse video-based tasks such as activity recognition, motion planning, and scene understanding. △ Less

Submitted 20 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

arXiv:2307.07868 [pdf]

Contrasting the efficiency of stock price prediction models using various types of LSTM models aided with sentiment analysis

Authors: Varun Sangwan, Vishesh Kumar Singh, Bibin Christopher V

Abstract: Our research aims to find the best model that uses companies projections and sector performances and how the given company fares accordingly to correctly predict equity share prices for both short and long term goals. Our research aims to find the best model that uses companies projections and sector performances and how the given company fares accordingly to correctly predict equity share prices for both short and long term goals. △ Less

Submitted 15 July, 2023; originally announced July 2023.

Comments: 8 Pages

arXiv:2307.00474 [pdf, other]

Moments, Random Walks, and Limits for Spectrum Approximation

Authors: Yujia Jin, Christopher Musco, Aaron Sidford, Apoorv Vikram Singh

Abstract: We study lower bounds for the problem of approximating a one dimensional distribution given (noisy) measurements of its moments. We show that there are distributions on $[-1,1]$ that cannot be approximated to accuracy $ε$ in Wasserstein-1 distance even if we know \emph{all} of their moments to multiplicative accuracy $(1\pm2^{-Ω(1/ε)})$; this result matches an upper bound of Kong and Valiant [Anna… ▽ More We study lower bounds for the problem of approximating a one dimensional distribution given (noisy) measurements of its moments. We show that there are distributions on $[-1,1]$ that cannot be approximated to accuracy $ε$ in Wasserstein-1 distance even if we know \emph{all} of their moments to multiplicative accuracy $(1\pm2^{-Ω(1/ε)})$; this result matches an upper bound of Kong and Valiant [Annals of Statistics, 2017]. To obtain our result, we provide a hard instance involving distributions induced by the eigenvalue spectra of carefully constructed graph adjacency matrices. Efficiently approximating such spectra in Wasserstein-1 distance is a well-studied algorithmic problem, and a recent result of Cohen-Steiner et al. [KDD 2018] gives a method based on accurately approximating spectral moments using $2^{O(1/ε)}$ random walks initiated at uniformly random nodes in the graph. As a strengthening of our main result, we show that improving the dependence on $1/ε$ in this result would require a new algorithmic approach. Specifically, no algorithm can compute an $ε$-accurate approximation to the spectrum of a normalized graph adjacency matrix with constant probability, even when given the transcript of $2^{Ω(1/ε)}$ random walks of length $2^{Ω(1/ε)}$ started at random nodes. △ Less

Submitted 2 July, 2023; originally announced July 2023.

arXiv:2306.16799 [pdf]

LeanAI: A method for AEC practitioners to effectively plan AI implementations

Authors: Ashwin Agrawal, Vishal Singh, Martin Fischer

Abstract: Recent developments in Artificial Intelligence (AI) provide unprecedented automation opportunities in the Architecture, Engineering, and Construction (AEC) industry. However, despite the enthusiasm regarding the use of AI, 85% of current big data projects fail. One of the main reasons for AI project failures in the AEC industry is the disconnect between those who plan or decide to use AI and those… ▽ More Recent developments in Artificial Intelligence (AI) provide unprecedented automation opportunities in the Architecture, Engineering, and Construction (AEC) industry. However, despite the enthusiasm regarding the use of AI, 85% of current big data projects fail. One of the main reasons for AI project failures in the AEC industry is the disconnect between those who plan or decide to use AI and those who implement it. AEC practitioners often lack a clear understanding of the capabilities and limitations of AI, leading to a failure to distinguish between what AI should solve, what it can solve, and what it will solve, treating these categories as if they are interchangeable. This lack of understanding results in the disconnect between AI planning and implementation because the planning is based on a vision of what AI should solve without considering if it can or will solve it. To address this challenge, this work introduces the LeanAI method. The method has been developed using data from several ongoing longitudinal studies analyzing AI implementations in the AEC industry, which involved 50+ hours of interview data. The LeanAI method delineates what AI should solve, what it can solve, and what it will solve, forcing practitioners to clearly articulate these components early in the planning process itself by involving the relevant stakeholders. By utilizing the method, practitioners can effectively plan AI implementations, thus increasing the likelihood of success and ultimately speeding up the adoption of AI. A case example illustrates the usefulness of the method. △ Less

Submitted 29 June, 2023; originally announced June 2023.

Comments: 40th International Symposium on Automation and Robotics in Construction (ISARC 2023)

arXiv:2306.07501 [pdf, other]

Speaker Verification Across Ages: Investigating Deep Speaker Embedding Sensitivity to Age Mismatch in Enrollment and Test Speech

Authors: Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen

Abstract: In this paper, we study the impact of the ageing on modern deep speaker embedding based automatic speaker verification (ASV) systems. We have selected two different datasets to examine ageing on the state-of-the-art ECAPA-TDNN system. The first dataset, used for addressing short-term ageing (up to 10 years time difference between enrollment and test) under uncontrolled conditions, is VoxCeleb. The… ▽ More In this paper, we study the impact of the ageing on modern deep speaker embedding based automatic speaker verification (ASV) systems. We have selected two different datasets to examine ageing on the state-of-the-art ECAPA-TDNN system. The first dataset, used for addressing short-term ageing (up to 10 years time difference between enrollment and test) under uncontrolled conditions, is VoxCeleb. The second dataset, used for addressing long-term ageing effect (up to 40 years difference) of Finnish speakers under a more controlled setup, is Longitudinal Corpus of Finnish Spoken in Helsinki (LCFSH). Our study provides new insights into the impact of speaker ageing on modern ASV systems. Specifically, we establish a quantitative measure between ageing and ASV scores. Further, our research indicates that ageing affects female English speakers to a greater degree than male English speakers, while in the case of Finnish, it has a greater impact on male speakers than female speakers. △ Less

Submitted 12 June, 2023; originally announced June 2023.

Journal ref: Interspeech 2023

arXiv:2305.04241 [pdf, other]

Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens

Authors: Zhanpeng Zeng, Cole Hawkins, Mingyi Hong, Aston Zhang, Nikolaos Pappas, Vikas Singh, Shuai Zheng

Abstract: Transformers are central in modern natural language processing and computer vision applications. Despite recent works devoted to reducing the quadratic cost of such models (as a function of the sequence length), dealing with ultra long sequences (e.g., with more than 16K tokens) remains challenging. Applications such as answering questions based on a book or summarizing a scientific article are in… ▽ More Transformers are central in modern natural language processing and computer vision applications. Despite recent works devoted to reducing the quadratic cost of such models (as a function of the sequence length), dealing with ultra long sequences (e.g., with more than 16K tokens) remains challenging. Applications such as answering questions based on a book or summarizing a scientific article are inefficient or infeasible. Here, we propose to significantly improve the efficiency of Transformers for ultra long sequences, by compressing the sequence into a much smaller representation at each layer. Specifically, by exploiting the fact that in many tasks, only a small subset of special tokens (we call VIP-tokens) are most relevant to the final prediction, we propose a VIP-token centric compression (VCC) scheme which selectively compresses the sequence based on their impact on approximating the representation of the VIP-tokens. Compared with competitive baselines, our algorithm is not only efficient (achieving more than $3\times$ efficiency gain compared to baselines on 4K and 16K lengths), but also offers competitive/better performance on a large number of tasks. Further, we show that our algorithm scales to 128K tokens (or more) while consistently offering accuracy improvement. △ Less

Submitted 27 May, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

Comments: 10 pages main text, 12 pages appendix, preprint

arXiv:2305.03546 [pdf, other]

Breast Cancer Immunohistochemical Image Generation: a Benchmark Dataset and Challenge Review

Authors: Chuang Zhu, Shengjie Liu, Zekuan Yu, Feng Xu, Arpit Aggarwal, Germán Corredor, Anant Madabhushi, Qixun Qu, Hongwei Fan, Fangda Li, Yueheng Li, Xianchao Guan, Yongbing Zhang, Vivek Kumar Singh, Farhan Akram, Md. Mostafa Kamal Sarker, Zhongyue Shi, Mulan Jin

Abstract: For invasive breast cancer, immunohistochemical (IHC) techniques are often used to detect the expression level of human epidermal growth factor receptor-2 (HER2) in breast tissue to formulate a precise treatment plan. From the perspective of saving manpower, material and time costs, directly generating IHC-stained images from Hematoxylin and Eosin (H&E) stained images is a valuable research direct… ▽ More For invasive breast cancer, immunohistochemical (IHC) techniques are often used to detect the expression level of human epidermal growth factor receptor-2 (HER2) in breast tissue to formulate a precise treatment plan. From the perspective of saving manpower, material and time costs, directly generating IHC-stained images from Hematoxylin and Eosin (H&E) stained images is a valuable research direction. Therefore, we held the breast cancer immunohistochemical image generation challenge, aiming to explore novel ideas of deep learning technology in pathological image generation and promote research in this field. The challenge provided registered H&E and IHC-stained image pairs, and participants were required to use these images to train a model that can directly generate IHC-stained images from corresponding H&E-stained images. We selected and reviewed the five highest-ranking methods based on their PSNR and SSIM metrics, while also providing overviews of the corresponding pipelines and implementations. In this paper, we further analyze the current limitations in the field of breast cancer immunohistochemical image generation and forecast the future development of this field. We hope that the released dataset and the challenge will inspire more scholars to jointly study higher-quality IHC-stained image generation. △ Less

Submitted 22 September, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

Comments: 12 pages, 12 figures, 2tables

arXiv:2304.01241 [pdf]

doi 10.1007/978-3-031-28183-9_15

Detection of Homophobia & Transphobia in Dravidian Languages: Exploring Deep Learning Methods

Authors: Deepawali Sharma, Vedika Gupta, Vivek Kumar Singh

Abstract: The increase in abusive content on online social media platforms is impacting the social life of online users. Use of offensive and hate speech has been making so-cial media toxic. Homophobia and transphobia constitute offensive comments against LGBT+ community. It becomes imperative to detect and handle these comments, to timely flag or issue a warning to users indulging in such behaviour. Howeve… ▽ More The increase in abusive content on online social media platforms is impacting the social life of online users. Use of offensive and hate speech has been making so-cial media toxic. Homophobia and transphobia constitute offensive comments against LGBT+ community. It becomes imperative to detect and handle these comments, to timely flag or issue a warning to users indulging in such behaviour. However, automated detection of such content is a challenging task, more so in Dravidian languages which are identified as low resource languages. Motivated by this, the paper attempts to explore applicability of different deep learning mod-els for classification of the social media comments in Malayalam and Tamil lan-guages as homophobic, transphobic and non-anti-LGBT+content. The popularly used deep learning models- Convolutional Neural Network (CNN), Long Short Term Memory (LSTM) using GloVe embedding and transformer-based learning models (Multilingual BERT and IndicBERT) are applied to the classification problem. Results obtained show that IndicBERT outperforms the other imple-mented models, with obtained weighted average F1-score of 0.86 and 0.77 for Malayalam and Tamil, respectively. Therefore, the present work confirms higher performance of IndicBERT on the given task in selected Dravidian languages. △ Less

Submitted 3 April, 2023; originally announced April 2023.

Journal ref: Advanced Network Technologies and Intelligent Computing. ANTIC 2022. Communications in Computer and Information Science, vol 1798. Springer, Cham

arXiv:2303.16871 [pdf]

doi 10.1061/JHYEFF.HEENG-6064

Full-Range Approximation for the Theis Well Function Using Ramanujan's Series and Bounds for the Exponential Integral

Authors: Manotosh Kumbhakar, Vijay P. Singh

Abstract: The solution of the governing equation representing the drawdown in a horizontal confined aquifer, where groundwater flow is unsteady, is provided in terms of the exponential integral, which is famously known as the Well function. For the computation of this function in practical applications, it is important to develop not only accurate but also a simple approximation that requires evaluation of… ▽ More The solution of the governing equation representing the drawdown in a horizontal confined aquifer, where groundwater flow is unsteady, is provided in terms of the exponential integral, which is famously known as the Well function. For the computation of this function in practical applications, it is important to develop not only accurate but also a simple approximation that requires evaluation of the fewest possible terms. To that end, introducing Ramanujan's series expression, this work proposes a full-range approximation to the exponential integral using Ramanujan's series for the small argument (u \leq 1) and an approximation based on the bound of the integral for the other range (u \in (1,100]). The evaluation of the proposed approximation results in the most accurate formulae compared to the existing studies, which possess the maximum percentage error of 0.05\%. Further, the proposed formula is much simpler to apply as it contains just the product of exponential and logarithm functions. To further check the efficiency of the proposed approximation, we consider a practical example for evaluating the discrete pumping kernel, which shows the superiority of this approximation over the others. Finally, the authors hope that the proposed efficient approximation can be useful for groundwater and hydrogeological applications. △ Less

Submitted 3 March, 2023; originally announced March 2023.

Comments: 10 pages

MSC Class: 00A05

arXiv:2303.15852 [pdf]

doi 10.1007/978-3-031-28183-9_18

Exploring Deep Learning Methods for Classification of SAR Images: Towards NextGen Convolutions via Transformers

Authors: Aakash Singh, Vivek Kumar Singh

Abstract: Images generated by high-resolution SAR have vast areas of application as they can work better in adverse light and weather conditions. One such area of application is in the military systems. This study is an attempt to explore the suitability of current state-of-the-art models introduced in the domain of computer vision for SAR target classification (MSTAR). Since the application of any solution… ▽ More Images generated by high-resolution SAR have vast areas of application as they can work better in adverse light and weather conditions. One such area of application is in the military systems. This study is an attempt to explore the suitability of current state-of-the-art models introduced in the domain of computer vision for SAR target classification (MSTAR). Since the application of any solution produced for military systems would be strategic and real-time, accuracy is often not the only criterion to measure its performance. Other important parameters like prediction time and input resiliency are equally important. The paper deals with these issues in the context of SAR images. Experimental results show that deep learning models can be suitably applied in the domain of SAR image classification with the desired performance levels. △ Less

Submitted 28 March, 2023; originally announced March 2023.

Comments: 6 pages, 9 figures

Journal ref: In Advanced Network Technologies and Intelligent Computing Second International Conference, ANTIC 2022, Varanasi, India, December 22 24, 2022, Proceedings, Part II pp. 249 260. Cham Springer Nature Switzerland

arXiv:2303.05062 [pdf, other]

Influence before Hiring: A Two-tired Incentive Compatible Mechanism for IoT-based Crowdsourcing in Strategic Setting

Authors: Chattu Bhargavi, Vikash Kumar Singh

Abstract: In crowdsourcing, a group of common people is asked to execute the tasks and in return will receive some incentives. In this article, one of the crowdsourcing scenarios with multiple heterogeneous tasks and multiple IoT devices (as task executors) is studied as a two-tiered process. In the first tier of the proposed model, it is assumed that a substantial number of IoT devices are not aware of the… ▽ More In crowdsourcing, a group of common people is asked to execute the tasks and in return will receive some incentives. In this article, one of the crowdsourcing scenarios with multiple heterogeneous tasks and multiple IoT devices (as task executors) is studied as a two-tiered process. In the first tier of the proposed model, it is assumed that a substantial number of IoT devices are not aware of the hiring process and are made aware by utilizing their social connections. Each of the IoT devices reports a cost (private value) that it will charge in return for its services. The participating IoT devices are rational and strategic. The goal of the first tier is to select the subset of IoT devices as initial notifiers so as to maximize the number of IoT devices notified with the constraint that the total payment made to the notifiers is within the budget. For this purpose, an incentive compatible mechanism is proposed. In the second tier, a set of quality IoT devices is determined by utilizing the idea of single-peaked preferences. The next objective of the second tier is to hire quality IoT devices for the floated tasks. For this purpose, each quality IoT device reports private valuation along with its favorite bundle of tasks. In the second tier, it is assumed that the valuation of the IoT devices satisfies gross substitute criteria and is private. For the second tier, the truthful mechanisms are designed independently for determining the quality IoT devices and for hiring them and deciding their payment respectively. Theoretical analysis shows that the proposed mechanisms are computationally efficient, truthful, correct, budget feasible, and individually rational. The simulation is done to measure the efficacy of the proposed mechanisms with the benchmark mechanisms based on truthfulness, budget feasibility, and running time. △ Less

Submitted 3 April, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

Comments: 40 pages, 21 figures, Changed Figure 1, Simulation results are added, Conclusion and future works stretched a bit

arXiv:2303.04584 [pdf, other]

Estimating a scalar log-concave random variable, using a silence set based probabilistic sampling

Authors: Maben Rabi, Junfeng Wu, Vyoma Singh, Karl Henrik Johansson

Abstract: We study the probabilistic sampling of a random variable, in which the variable is sampled only if it falls outside a given set, which is called the silence set. This helps us to understand optimal event-based sampling for the special case of IID random processes, and also to understand the design of a sub-optimal scheme for other cases. We consider the design of this probabilistic sampling for a… ▽ More We study the probabilistic sampling of a random variable, in which the variable is sampled only if it falls outside a given set, which is called the silence set. This helps us to understand optimal event-based sampling for the special case of IID random processes, and also to understand the design of a sub-optimal scheme for other cases. We consider the design of this probabilistic sampling for a scalar, log-concave random variable, to minimize either the mean square estimation error, or the mean absolute estimation error. We show that the optimal silence interval: (i) is essentially unique, and (ii) is the limit of an iterative procedure of centering. Further we show through numerical experiments that super-level intervals seem to be remarkably near-optimal for mean square estimation. Finally we use the Gauss inequality for scalar unimodal densities, to show that probabilistic sampling gives a mean square distortion that is less than a third of the distortion incurred by periodic sampling, if the average sampling rate is between 0.3 and 0.9 samples per tick. △ Less

Submitted 16 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

Comments: Accepted for publication in the 2023 American Control Conference

arXiv:2303.00866 [pdf, other]

A prototype hybrid prediction market for estimating replicability of published work

Authors: Tatiana Chakravorti, Robert Fraleigh, Timothy Fritton, Michael McLaughlin, Vaibhav Singh, Christopher Griffin, Anthony Kwasnica, David Pennock, C. Lee Giles, Sarah Rajtmajer

Abstract: We present a prototype hybrid prediction market and demonstrate the avenue it represents for meaningful human-AI collaboration. We build on prior work proposing artificial prediction markets as a novel machine-learning algorithm. In an artificial prediction market, trained AI agents buy and sell outcomes of future events. Classification decisions can be framed as outcomes of future events, and acc… ▽ More We present a prototype hybrid prediction market and demonstrate the avenue it represents for meaningful human-AI collaboration. We build on prior work proposing artificial prediction markets as a novel machine-learning algorithm. In an artificial prediction market, trained AI agents buy and sell outcomes of future events. Classification decisions can be framed as outcomes of future events, and accordingly, the price of an asset corresponding to a given classification outcome can be taken as a proxy for the confidence of the system in that decision. By embedding human participants in these markets alongside bot traders, we can bring together insights from both. In this paper, we detail pilot studies with prototype hybrid markets for the prediction of replication study outcomes. We highlight challenges and opportunities, share insights from semi-structured interviews with hybrid market participants, and outline a vision for ongoing and future work. △ Less

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2301.03040 [pdf]

Digital Twin: Where do humans fit in?

Authors: Ashwin Agrawal, Robert Thiel, Pooja Jain, Vishal Singh, Martin Fischer

Abstract: Digital Twin (DT) technology is far from being comprehensive and mature, resulting in their piecemeal implementation in practice where some functions are automated by DTs, and others are still performed by humans. This piecemeal implementation of DTs often leaves practitioners wondering what roles (or functions) to allocate to DTs in a work system, and how might it impact humans. A lack of knowled… ▽ More Digital Twin (DT) technology is far from being comprehensive and mature, resulting in their piecemeal implementation in practice where some functions are automated by DTs, and others are still performed by humans. This piecemeal implementation of DTs often leaves practitioners wondering what roles (or functions) to allocate to DTs in a work system, and how might it impact humans. A lack of knowledge about the roles that humans and DTs play in a work system can result in significant costs, misallocation of resources, unrealistic expectations from DTs, and strategic misalignments. To alleviate this challenge, this paper answers the research question: When humans work with DTs, what types of roles can a DT play, and to what extent can those roles be automated? Specifically, we propose a two-dimensional conceptual framework, Levels of Digital Twin (LoDT). The framework is an integration of the types of roles a DT can play, broadly categorized under (1) Observer, (2) Analyst, (3) Decision Maker, and (4) Action Executor, and the extent of automation for each of these roles, divided into five different levels ranging from completely manual to fully automated. A particular DT can play any number of roles at varying levels. The framework can help practitioners systematically plan DT deployments, clearly communicate goals and deliverables, and lay out a strategic vision. A case study illustrates the usefulness of the framework. △ Less

Submitted 8 January, 2023; originally announced January 2023.

Comments: Accepted for publication in Automation in Construction

Showing 1–50 of 246 results for author: Singh, V