Search | arXiv e-print repository

arXiv:2407.19985 [pdf, other]

Mixture of Nested Experts: Adaptive Processing of Visual Tokens

Authors: Gagan Jain, Nidhi Hegde, Aditya Kusupati, Arsha Nagrani, Shyamal Buch, Prateek Jain, Anurag Arnab, Sujoy Paul

Abstract: The visual medium (images and videos) naturally contains a large amount of information redundancy, thereby providing a great opportunity for leveraging efficiency in processing. While Vision Transformer (ViT) based models scale effectively to large data regimes, they fail to capitalize on this inherent redundancy, leading to higher computational costs. Mixture of Experts (MoE) networks demonstrate scalability while maintaining the same inference-time costs, but they come with a larger parameter footprint. We present Mixture of Nested Experts (MoNE), which utilizes a nested structure for experts, wherein individual experts fall on an increasing compute-accuracy curve. Given a compute budget, MoNE learns to dynamically choose tokens in a priority order, and thus redundant tokens are processed through cheaper nested experts. Using this framework, we achieve equivalent performance as the baseline models, while reducing inference time compute by over two-fold. We validate our approach on standard image and video datasets - ImageNet-21K, Kinetics400, and Something-Something-v2. We further highlight MoNE's adaptability by showcasing its ability to maintain strong performance across different inference-time compute budgets on videos, using only a single trained model.

Submitted 30 July, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

arXiv:2405.01010 [pdf, other]

Efficient and Adaptive Posterior Sampling Algorithms for Bandits

Authors: Bingshan Hu, Zhiming Huang, Tianyue H. Zhang, Mathias Lécuyer, Nidhi Hegde

Abstract: We study Thompson Sampling-based algorithms for stochastic bandits with bounded rewards. As the existing problem-dependent regret bound for Thompson Sampling with Gaussian priors [Agrawal and Goyal, 2017] is vacuous when $T \le 288 e^{64}$, we derive a more practical bound that tightens the coefficient of the leading term from $288 e^{64}$ to $1270$. Additionally, motivated by large-scale real-world applications that require scalability, adaptive computational resource allocation, and a balance in utility and computation, we propose two parameterized Thompson Sampling-based algorithms: Thompson Sampling with Model Aggregation (TS-MA-$α$) and Thompson Sampling with Timestamp Duelling (TS-TD-$α$), where $α\in [0,1]$ controls the trade-off between utility and computation. Both algorithms achieve an $O \left(K\ln^{α+1}(T)/Δ\right)$ regret bound, where $K$ is the number of arms, $T$ is the finite learning horizon, and $Δ$ denotes the single round performance loss when pulling a sub-optimal arm.

Submitted 2 May, 2024; originally announced May 2024.
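
For orientation, the baseline that this abstract starts from, Thompson Sampling with Gaussian priors, can be sketched as below. This is a minimal illustration of the standard algorithm as commonly stated (posterior N(S_i/(n_i+1), 1/(n_i+1)) for arm i), not the TS-MA-α or TS-TD-α algorithms proposed in the paper. The function name, the Bernoulli reward simulation, and the arm means are assumptions made for the example.

```python
import numpy as np

def gaussian_thompson_sampling(means, horizon, seed=0):
    """Simulate Thompson Sampling with Gaussian priors on Bernoulli arms.

    `means` are hypothetical true arm means, used only to simulate rewards.
    Returns the number of pulls per arm.
    """
    rng = np.random.default_rng(seed)
    k = len(means)
    pulls = np.zeros(k, dtype=int)
    reward_sums = np.zeros(k)
    for _ in range(horizon):
        # Posterior for arm i: N(reward_sum_i / (n_i + 1), 1 / (n_i + 1)).
        mu_hat = reward_sums / (pulls + 1)
        samples = rng.normal(mu_hat, 1.0 / np.sqrt(pulls + 1))
        # Play the arm whose posterior sample is largest.
        arm = int(np.argmax(samples))
        # Bounded reward in {0, 1}, drawn from the simulated environment.
        reward = float(rng.random() < means[arm])
        pulls[arm] += 1
        reward_sums[arm] += reward
    return pulls
```

Calling `gaussian_thompson_sampling([0.1, 0.9], horizon=2000)` returns the per-arm pull counts; with a gap this large, the better arm should receive the bulk of the pulls.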

arXiv:2401.05422 [pdf, ps, other]

Machine Learning (ML)-assisted Beam Management in millimeter (mm)Wave Distributed Multiple Input Multiple Output (D-MIMO) systems

Authors: Karthik R M, Dhiraj Nagaraja Hegde, Muris Sarajlic, Abhishek Sarkar

Abstract: Beam management (BM) protocols are critical for establishing and maintaining connectivity between network radio nodes and User Equipments (UEs). In Distributed Multiple Input Multiple Output systems (D-MIMO), a number of access points (APs), coordinated by a central processing unit (CPU), serve a number of UEs. At mmWave frequencies, the problem of finding the best AP and beam to serve the UEs is challenging due to a large number of beams that need to be sounded with Downlink (DL) reference signals. The objective of this paper is to investigate whether the best AP/beam can be reliably inferred from sounding only a small subset of beams and leveraging AI/ML for inference of best beam/AP. We use Random Forest (RF), MissForest (MF) and conditional Generative Adversarial Networks (c-GAN) for demonstrating the performance benefits of inference.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2309.14389 [pdf, other]

Analyzing the Efficacy of an LLM-Only Approach for Image-based Document Question Answering

Authors: Nidhi Hegde, Sujoy Paul, Gagan Madan, Gaurav Aggarwal

Abstract: Recent document question answering models consist of two key components: the vision encoder, which captures layout and visual elements in images, and a Large Language Model (LLM) that helps contextualize questions to the image and supplements them with external world knowledge to generate accurate answers. However, the relative contributions of the vision encoder and the language model in these tasks remain unclear. This is especially interesting given the effectiveness of instruction-tuned LLMs, which exhibit remarkable adaptability to new tasks. To this end, we explore the following aspects in this work: (1) the efficacy of an LLM-only approach on document question answering tasks, (2) strategies for serializing textual information within document images and feeding it directly to an instruction-tuned LLM, thus bypassing the need for an explicit vision encoder, and (3) a thorough quantitative analysis of the feasibility of such an approach. Our comprehensive analysis encompasses six diverse benchmark datasets, utilizing LLMs of varying scales. Our findings reveal that a strategy exclusively reliant on the LLM yields results that are on par with or closely approach state-of-the-art performance across a range of datasets. We posit that this evaluation framework will serve as a guiding resource for selecting appropriate datasets for future research endeavors that emphasize the fundamental importance of layout and image content information.

Submitted 25 September, 2023; originally announced September 2023.

arXiv:2306.17173 [pdf, other]

Photon: A Cross Platform P2P Data Transfer Application

Authors: Abhilash Shreedhar Hegde, Amruta Narayana Hegde, Adeep Krishna Keelar, Ananya Mathur

Abstract: Modern computing requires efficient and dependable data transport. Current solutions like Bluetooth, SMS (Short Message Service), and Email have their restrictions on efficiency, file size, compatibility, and cost. In order to facilitate direct communication and resource sharing amongst linked devices, this research study offers a cross-platform peer-to-peer (P2P) data transmission solution that takes advantage of P2P networks' features. The system enables cost-effective and high-performance data transport by using the compute, storage, and network resources of the participating devices. Simple file sharing, adaptability, dependability, and high performance are some of the important benefits. The examination of the suggested solution is presented in this paper and includes discussion of the P2P architecture, data transfer mechanisms, performance assessment, implementation issues, security concerns, and the potential difficulties that need to be addressed. The research intends to validate the efficacy and potential of the suggested cross-platform P2P data transfer solution, delivering better efficiency and dependability for users across various platforms, through practical investigations and comparisons with existing approaches.

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.06823 [pdf, other]

Weakly supervised information extraction from inscrutable handwritten document images

Authors: Sujoy Paul, Gagan Madan, Akankshya Mishra, Narayan Hegde, Pradeep Kumar, Gaurav Aggarwal

Abstract: State-of-the-art information extraction methods are limited by OCR errors. They work well for printed text in form-like documents, but unstructured, handwritten documents still remain a challenge. Adapting existing models to domain-specific training data is quite expensive because of two factors: 1) limited availability of the domain-specific documents (such as handwritten prescriptions, lab notes, etc.), and 2) annotations become even more challenging as one needs domain-specific knowledge to decode inscrutable handwritten document images. In this work, we focus on the complex problem of extracting medicine names from handwritten prescriptions using only weakly labeled data. The data consists of images along with the list of medicine names in them, but not their location in the image. We solve the problem by first identifying the regions of interest, i.e., medicine lines from just weak labels and then injecting a domain-specific medicine language model learned using only synthetically generated data. Compared to off-the-shelf state-of-the-art methods, our approach performs >2.5x better in medicine names extraction from prescriptions.

Submitted 11 June, 2023; originally announced June 2023.

Comments: Accepted at ICDAR 2023

arXiv:2304.11983 [pdf, other]

Protecting Locks Against Unbalanced Unlock()

Authors: Vivek Shahare, Milind Chabbi, Nikhil Hegde

Abstract: The lock is a building-block synchronization primitive that enables mutually exclusive access to shared data in shared-memory parallel programs. Mutual exclusion is typically achieved by guarding the code that accesses the shared data with a pair of lock() and unlock() operations. Concurrency bugs arise when this ordering of operations is violated. In this paper, we study a particular pattern of misuse where an unlock() is issued without first issuing a lock(), which can happen in code with complex control flow. This misuse is surprisingly common in several important open-source repositories we study. We systematically study what happens due to this misuse in several popular locking algorithms. We study how misuse can be detected and how the locking protocols can be fixed to avoid the unwanted consequences of misuse. Most locks require simple changes to detect and prevent this misuse. We evaluate the performance traits of modified implementations, which show mild performance penalties in most scalable locks.

Submitted 24 April, 2023; originally announced April 2023.

Comments: Paper Accepted to the 35th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 23)
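
The misuse described in this abstract, an unlock() issued without a matching lock(), can be illustrated with a small owner-checking wrapper. This is a generic sketch of the detection idea only, not the modifications the paper makes to specific locking algorithms; the class names `OwnerCheckedLock` and `UnbalancedUnlockError` are hypothetical.

```python
import threading

class UnbalancedUnlockError(RuntimeError):
    """Raised when unlock() is called by a thread that does not hold the lock."""

class OwnerCheckedLock:
    """Mutex wrapper that records its holder so unbalanced unlocks are caught."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None  # thread id of the current holder, or None

    def lock(self):
        self._lock.acquire()
        # Only the thread that just acquired the lock records itself as owner.
        self._owner = threading.get_ident()

    def unlock(self):
        # Reject the misuse instead of silently releasing another thread's lock.
        if self._owner != threading.get_ident():
            raise UnbalancedUnlockError("unlock() without a matching lock()")
        self._owner = None
        self._lock.release()
```

A bare `threading.Lock` already raises if it is released while unlocked, but it lets any thread release a lock held by another; recording the owner closes that gap at the cost of one extra store per acquire and one comparison per release.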

arXiv:2302.02453 [pdf, ps, other]

FineDeb: A Debiasing Framework for Language Models

Authors: Akash Saravanan, Dhruv Mullick, Habibur Rahman, Nidhi Hegde

Abstract: As language models are increasingly included in human-facing machine learning tools, bias against demographic subgroups has gained attention. We propose FineDeb, a two-phase debiasing framework for language models that starts with contextual debiasing of embeddings learned by pretrained language models. The model is then fine-tuned on a language modeling objective. Our results show that FineDeb offers stronger debiasing in comparison to other methods which often result in models as biased as the original language model. Our framework is generalizable for demographics with multiple classes, and we demonstrate its effectiveness through extensive experiments and comparisons with state-of-the-art techniques. We release our code and data on GitHub.

Submitted 5 February, 2023; originally announced February 2023.

Comments: Poster presentation at AAAI 2023: The Workshop on Artificial Intelligence for Social Good 2023 (https://amulyayadav.github.io/AI4SG2023/)

arXiv:2211.13508 [pdf, other]

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

Authors: Benjamin Kiefer, Matej Kristan, Janez Perš, Lojze Žust, Fabio Poiesi, Fabio Augusto de Alcantara Andrade, Alexandre Bernardino, Matthew Dawkins, Jenni Raitoharju, Yitong Quan, Adem Atmaca, Timon Höfer, Qiming Zhang, Yufei Xu, Jing Zhang, Dacheng Tao, Lars Sommer, Raphael Spraul, Hangyue Zhao, Hongpu Zhang, Yanyun Zhao, Jan Lukas Augustin, Eui-ik Jeon, Impyeong Lee, Luca Zedda , et al. (48 additional authors not shown)

Abstract: The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.

Submitted 28 November, 2022; v1 submitted 24 November, 2022; originally announced November 2022.

Comments: MaCVi 2023 was part of WACV 2023. This report (38 pages) discusses the competition as part of MaCVi

arXiv:2207.05777 [pdf, other]

Long Term Fairness for Minority Groups via Performative Distributionally Robust Optimization

Authors: Liam Peet-Pare, Nidhi Hegde, Alona Fyshe

Abstract: Fairness researchers in machine learning (ML) have coalesced around several fairness criteria which provide formal definitions of what it means for an ML model to be fair. However, these criteria have some serious limitations. We identify four key shortcomings of these formal fairness criteria, and aim to help address them by extending performative prediction to include a distributionally robust objective.

Submitted 12 July, 2022; originally announced July 2022.

Comments: From a submission to Responsible Decision Making in Dynamics Environments Workshop at ICML 2022

arXiv:2206.08653 [pdf, other]

All Mistakes Are Not Equal: Comprehensive Hierarchy Aware Multi-label Predictions (CHAMP)

Authors: Ashwin Vaswani, Gaurav Aggarwal, Praneeth Netrapalli, Narayan G Hegde

Abstract: This paper considers the problem of Hierarchical Multi-Label Classification (HMC), where (i) several labels can be present for each example, and (ii) labels are related via a domain-specific hierarchy tree. Guided by the intuition that all mistakes are not equal, we present Comprehensive Hierarchy Aware Multi-label Predictions (CHAMP), a framework that penalizes a misprediction depending on its severity as per the hierarchy tree. While there have been works that apply such an idea to single-label classification, to the best of our knowledge, there are limited such works for multilabel classification focusing on the severity of mistakes. The key reason is that there is no clear way of quantifying the severity of a misprediction a priori in the multilabel setting. In this work, we propose a simple but effective metric to quantify the severity of a mistake in HMC, naturally leading to CHAMP. Extensive experiments on six public HMC datasets across modalities (image, audio, and text) demonstrate that incorporating hierarchical information leads to substantial gains as CHAMP improves both AUPRC (2.6% median percentage improvement) and hierarchical metrics (2.85% median percentage improvement), over stand-alone hierarchical or multilabel classification methods. Compared to standard multilabel baselines, CHAMP provides improved AUPRC in both robustness (8.87% mean percentage improvement) and less data regimes. Further, our method provides a framework to enhance existing multilabel classification algorithms with better mistakes (18.1% mean percentage increment).

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2203.14482 [pdf, other]

doi 10.1109/ISBI52829.2022.9761493

Leveraging Clinically Relevant Biometric Constraints To Supervise A Deep Learning Model For The Accurate Caliper Placement To Obtain Sonographic Measurements Of The Fetal Brain

Authors: Hari Shankar, Adithya Narayan, Shefali Jain, Divya Singh, Pooja Vyas, Nivedita Hegde, Purbayan Kar, Abhi Lad, Jens Thang, Jagruthi Atada, Duy Nguyen, PS Roopa, Akhila Vasudeva, Prathima Radhakrishnan, Sripad Krishna Devalla

Abstract: Multiple studies have demonstrated that obtaining standardized fetal brain biometry from mid-trimester ultrasonography (USG) examination is key for the reliable assessment of fetal neurodevelopment and the screening of central nervous system (CNS) anomalies. Obtaining these measurements is highly subjective, expertise-driven, and requires years of training experience, limiting quality prenatal care for all pregnant mothers. In this study, we propose a deep learning (DL) approach to compute 3 key fetal brain biometry from the 2D USG images of the transcerebellar plane (TC) through the accurate and automated caliper placement (2 per biometry) by modeling it as a landmark detection problem. We leveraged clinically relevant biometric constraints (relationship between caliper points) and domain-relevant data augmentation to improve the accuracy of a U-Net DL model (trained/tested on: 596 images, 473 subjects/143 images, 143 subjects). We performed multiple experiments demonstrating the effect of the DL backbone, data augmentation, generalizability and benchmarked against a recent state-of-the-art approach through extensive clinical validation (DL vs. 7 experienced clinicians). For all cases, the mean errors in the placement of the individual caliper points and the computed biometry were comparable to error rates among clinicians. The clinical translation of the proposed framework can assist novice users from low-resource settings in the reliable and standardized assessment of fetal brain sonograms.

Submitted 31 July, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

Comments: Accepted for presentation at 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI)

arXiv:2203.11992 [pdf, other]

Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum

Authors: Kirby Banman, Liam Peet-Pare, Nidhi Hegde, Alona Fyshe, Martha White

Abstract: Most convergence guarantees for stochastic gradient descent with momentum (SGDm) rely on iid sampling. Yet, SGDm is often used outside this regime, in settings with temporally correlated input samples such as continual learning and reinforcement learning. Existing work has shown that SGDm with a decaying step-size can converge under Markovian temporal correlation. In this work, we show that SGDm under covariate shift with a fixed step-size can be unstable and diverge. In particular, we show SGDm under covariate shift is a parametric oscillator, and so can suffer from a phenomenon known as resonance. We approximate the learning system as a time varying system of ordinary differential equations, and leverage existing theory to characterize the system's divergence/convergence as resonant/nonresonant modes. The theoretical result is limited to the linear setting with periodic covariate shift, so we empirically supplement this result to show that resonance phenomena persist even under non-periodic covariate shift, nonlinear dynamics with neural networks, and optimizers other than SGDm.

Submitted 22 March, 2022; originally announced March 2022.

Comments: In International Conference on Learning Representations. 2021

arXiv:2202.13553 [pdf, other]

Towards A Device-Independent Deep Learning Approach for the Automated Segmentation of Sonographic Fetal Brain Structures: A Multi-Center and Multi-Device Validation

Authors: Abhi Lad, Adithya Narayan, Hari Shankar, Shefali Jain, Pooja Punjani Vyas, Divya Singh, Nivedita Hegde, Jagruthi Atada, Jens Thang, Saw Shier Nee, Arunkumar Govindarajan, Roopa PS, Muralidhar V Pai, Akhila Vasudeva, Prathima Radhakrishnan, Sripad Krishna Devalla

Abstract: Quality assessment of prenatal ultrasonography is essential for the screening of fetal central nervous system (CNS) anomalies. The interpretation of fetal brain structures is highly subjective, expertise-driven, and requires years of training experience, limiting quality prenatal care for all pregnant mothers. With recent advancement in Artificial Intelligence (AI), specifically deep learning (DL), assistance in precise anatomy identification through semantic segmentation essential for the reliable assessment of growth and neurodevelopment, and detection of structural abnormalities have been proposed. However, existing works only identify certain structures (e.g., cavum septum pellucidum, lateral ventricles, cerebellum) from either of the axial views (transventricular, transcerebellar), limiting the scope for a thorough anatomical assessment as per practice guidelines necessary for the screening of CNS anomalies. Further, existing works do not analyze the generalizability of these DL algorithms across images from multiple ultrasound devices and centers, thus limiting their real-world clinical impact. In this study, we propose a DL based segmentation framework for the automated segmentation of 10 key fetal brain structures from 2 axial planes from fetal brain USG images (2D). We developed a custom U-Net variant that uses inceptionv4 block as a feature extractor and leverages custom domain-specific data augmentation. Quantitatively, the mean (10 structures; test sets 1/2/3/4) Dice-coefficients were: 0.827, 0.802, 0.731, 0.783. Irrespective of the USG device/center, the DL segmentations were qualitatively comparable to their manual segmentations. The proposed DL system offered a promising and generalizable performance (multi-centers, multi-device) and also presents evidence in support of device-induced variation in image quality (a challenge to generalizability) by using UMAP analysis.

Submitted 28 February, 2022; originally announced February 2022.

Comments: SPIE Medical Imaging 2022: Computer Aided Diagnosis (12033-75), 11 pages, 7 figures

arXiv:2106.14815 [pdf, other]

Feature Importance Guided Attack: A Model Agnostic Adversarial Attack

Authors: Gilad Gressel, Niranjan Hegde, Archana Sreekumar, Rishikumar Radhakrishnan, Kalyani Harikumar, Anjali S., Krishnashree Achuthan

Abstract: Research in adversarial learning has primarily focused on homogeneous unstructured datasets, which often map into the problem space naturally. Inverting a feature space attack on heterogeneous datasets into the problem space is much more challenging, particularly the task of finding the perturbation to perform. This work presents a formal search strategy: the `Feature Importance Guided Attack' (FIGA), which finds perturbations in the feature space of heterogeneous tabular datasets to produce evasion attacks. We first demonstrate FIGA in the feature space and then in the problem space. FIGA assumes no prior knowledge of the defending model's learning algorithm and does not require any gradient information. FIGA assumes knowledge of the feature representation and the mean feature values of the defending model's dataset. FIGA leverages feature importance rankings by perturbing the most important features of the input in the direction of the target class. While FIGA is conceptually similar to other work which uses feature selection processes (e.g., mimicry attacks), we formalize an attack algorithm with three tunable parameters and investigate the strength of FIGA on tabular datasets. We demonstrate the effectiveness of FIGA by evading phishing detection models trained on four different tabular phishing datasets and one financial dataset with an average success rate of 94%. We extend FIGA to the phishing problem space by limiting the possible perturbations to be valid and feasible in the phishing domain. We generate valid adversarial phishing sites that are visually identical to their unperturbed counterpart and use them to attack six tabular ML models achieving a 13.05% average success rate.

Submitted 13 January, 2023; v1 submitted 28 June, 2021; originally announced June 2021.
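
The core step named in this abstract, perturbing the most important features in the direction of the target class, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the published FIGA algorithm: the abstract does not name its three tunable parameters, so `n_features` and `epsilon` are hypothetical, and the direction is assumed to be the sign of the gap between the target-class mean and the input.

```python
import numpy as np

def figa_like_perturb(x, importances, target_mean, n_features=2, epsilon=0.1):
    """Nudge the most important features of `x` toward the target-class mean.

    `importances` is a per-feature importance score (higher is more important);
    `target_mean` holds the mean feature values of the target class.
    `n_features` and `epsilon` are hypothetical tuning knobs for this sketch.
    """
    x = np.asarray(x, dtype=float)
    importances = np.asarray(importances, dtype=float)
    target_mean = np.asarray(target_mean, dtype=float)
    # Indices of the n most important features, most important first.
    top = np.argsort(importances)[::-1][:n_features]
    perturbed = x.copy()
    # Step each selected feature by epsilon toward the target-class mean.
    perturbed[top] += epsilon * np.sign(target_mean[top] - x[top])
    return perturbed
```

No gradient of the defending model is used here: the only inputs are an importance ranking and class-level feature means, which matches the knowledge the abstract says the attack assumes.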

arXiv:2102.07929 [pdf, other]

Near-Optimal Algorithms for Differentially Private Online Learning in a Stochastic Environment

Authors: Bingshan Hu, Zhiming Huang, Nishant A. Mehta, Nidhi Hegde

Abstract: In this paper, we study differentially private online learning problems in a stochastic environment under both bandit and full information feedback. For differentially private stochastic bandits, we propose both UCB and Thompson Sampling-based algorithms that are anytime and achieve the optimal $O \left(\sum_{j: Δ_j>0} \frac{\ln(T)}{\min \left\{Δ_j, ε\right\}} \right)$ instance-dependent regret bound, where $T$ is the finite learning horizon, $Δ_j$ denotes the suboptimality gap between the optimal arm and a suboptimal arm $j$, and $ε$ is the required privacy parameter. For the differentially private full information setting with stochastic rewards, we show an $Ω\left(\frac{\ln(K)}{\min \left\{Δ_{\min}, ε\right\}} \right)$ instance-dependent regret lower bound and an $Ω\left(\sqrt{T\ln(K)} + \frac{\ln(K)}ε\right)$ minimax lower bound, where $K$ is the total number of actions and $Δ_{\min}$ denotes the minimum suboptimality gap among all the suboptimal actions. For the same differentially private full information setting, we also present an $ε$-differentially private algorithm whose instance-dependent regret and worst-case regret match our respective lower bounds up to an extra $\log(T)$ factor.

Submitted 30 May, 2024; v1 submitted 15 February, 2021; originally announced February 2021.

Comments: 40 pages. New in v3: (i) Removed Hybrid-UCB (although its analysis is correct to our knowledge); (ii) Added Lazy-DP-TS from UAI 2022 paper of Hu and Hegde (2022)

arXiv:2011.08965 [pdf]

doi 10.1038/s41746-021-00427-2

Interpretable Survival Prediction for Colorectal Cancer using Deep Learning

Authors: Ellery Wulczyn, David F. Steiner, Melissa Moran, Markus Plass, Robert Reihs, Fraser Tan, Isabelle Flament-Auvigne, Trissia Brown, Peter Regitnig, Po-Hsuan Cameron Chen, Narayan Hegde, Apaar Sadhwani, Robert MacDonald, Benny Ayalew, Greg S. Corrado, Lily H. Peng, Daniel Tse, Heimo Müller, Zhaoyang Xu, Yun Liu, Martin C. Stumpe, Kurt Zatloukal, Craig H. Mermel

Abstract: Deriving interpretable prognostic features from deep-learning-based prognostic histopathology models remains a challenge. In this study, we developed a deep learning system (DLS) for predicting disease specific survival for stage II and III colorectal cancer using 3,652 cases (27,300 slides). When evaluated on two validation datasets containing 1,239 cases (9,340 slides) and 738 cases (7,140 slides) respectively, the DLS achieved a 5-year disease-specific survival AUC of 0.70 (95%CI 0.66-0.73) and 0.69 (95%CI 0.64-0.72), and added significant predictive value to a set of 9 clinicopathologic features. To interpret the DLS, we explored the ability of different human-interpretable features to explain the variance in DLS scores. We observed that clinicopathologic features such as T-category, N-category, and grade explained a small fraction of the variance in DLS scores (R² = 18% in both validation sets). Next, we generated human-interpretable histologic features by clustering embeddings from a deep-learning based image-similarity model and showed that they explain the majority of the variance (R² of 73% to 80%). Furthermore, the clustering-derived feature most strongly associated with high DLS scores was also highly prognostic in isolation. With a distinct visual appearance (poorly differentiated tumor cell clusters adjacent to adipose tissue), this feature was identified by annotators with 87.0-95.5% accuracy. Our approach can be used to explain predictions from a prognostic deep learning model and uncover potentially-novel prognostic features that can be reliably identified by people for future validation studies.

Submitted 17 November, 2020; originally announced November 2020.

Journal ref: Nature Partner Journal Digital Medicine (2021)

arXiv:2002.02513 [pdf, other]

Multi Type Mean Field Reinforcement Learning

Authors: Sriram Ganapathi Subramanian, Pascal Poupart, Matthew E. Taylor, Nidhi Hegde

Abstract: Mean field theory provides an effective way of scaling multiagent reinforcement learning algorithms to environments with many agents that can be abstracted by a virtual mean agent. In this paper, we extend mean field multiagent algorithms to multiple types. The types enable the relaxation of a core assumption in mean field reinforcement learning, which is that all agents in the environment are playing almost similar strategies and have the same goal. We conduct experiments on three different testbeds for the field of many-agent reinforcement learning, based on the standard MAgent framework. We consider two different kinds of mean field environments: a) Games where agents belong to predefined types that are known a priori and b) Games where the type of each agent is unknown and therefore must be learned based on observations. We introduce new algorithms for each type of game and demonstrate their superior performance over state-of-the-art algorithms that assume that all agents belong to the same type and other baseline algorithms in the MAgent framework.

Submitted 21 June, 2022; v1 submitted 6 February, 2020; originally announced February 2020.

Comments: The paper appears in the proceedings of International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2020. Revised version has some typos corrected

arXiv:1902.02960 [pdf]

Human-Centered Tools for Coping with Imperfect Algorithms during Medical Decision-Making

Authors: Carrie J. Cai, Emily Reif, Narayan Hegde, Jason Hipp, Been Kim, Daniel Smilkov, Martin Wattenberg, Fernanda Viegas, Greg S. Corrado, Martin C. Stumpe, Michael Terry

Abstract: Machine learning (ML) is increasingly being used in image retrieval systems for medical decision making. One application of ML is to retrieve visually similar medical images from past patients (e.g. tissue from biopsies) to reference when making a medical decision with a new patient. However, no algorithm can perfectly capture an expert's ideal notion of similarity for every case: an image that is algorithmically determined to be similar may not be medically relevant to a doctor's specific diagnostic needs. In this paper, we identified the needs of pathologists when searching for similar images retrieved using a deep learning algorithm, and developed tools that empower users to cope with the search algorithm on-the-fly, communicating what types of similarity are most important at different moments in time. In two evaluations with pathologists, we found that these refinement tools increased the diagnostic utility of images found and increased user trust in the algorithm. The tools were preferred over a traditional interface, without a loss in diagnostic accuracy. We also observed that users adopted new strategies when using refinement tools, re-purposing them to test and understand the underlying algorithm and to disambiguate ML errors from their own errors. Taken together, these findings inform future human-ML collaborative systems for expert decision-making.

Submitted 8 February, 2019; originally announced February 2019.

arXiv:1901.11112 [pdf]

doi 10.1038/s41746-019-0131-z

Similar Image Search for Histopathology: SMILY

Authors: Narayan Hegde, Jason D. Hipp, Yun Liu, Michael E. Buck, Emily Reif, Daniel Smilkov, Michael Terry, Carrie J. Cai, Mahul B. Amin, Craig H. Mermel, Phil Q. Nelson, Lily H. Peng, Greg S. Corrado, Martin C. Stumpe

Abstract: The increasing availability of large institutional and public histopathology image datasets is enabling the searching of these datasets for diagnosis, research, and education. Though these datasets typically have associated metadata such as diagnosis or clinical notes, even carefully curated datasets rarely contain annotations of the location of regions of interest on each image. Because pathology images are extremely large (up to 100,000 pixels in each dimension), further laborious visual search of each image may be needed to find the feature of interest. In this paper, we introduce a deep learning based reverse image search tool for histopathology images: Similar Medical Images Like Yours (SMILY). We assessed SMILY's ability to retrieve search results in two ways: using pathologist-provided annotations, and via prospective studies where pathologists evaluated the quality of SMILY search results. As a negative control in the second evaluation, pathologists were blinded to whether search results were retrieved by SMILY or randomly. In both types of assessments, SMILY was able to retrieve search results with similar histologic features, organ site, and prostate cancer Gleason grade compared with the original query. SMILY may be a useful general-purpose tool in the pathologist's arsenal, to improve the efficiency of searching large archives of histopathology images, without the need to develop and implement specific tools for each application.

Submitted 5 February, 2019; v1 submitted 30 January, 2019; originally announced January 2019.

Comments: 23 Pages with 6 figures and 3 tables. The file also has 6 pages of supplemental material. Improved figure resolution, edited metadata

Journal ref: Nature Partner Journal Digital Medicine (2019)

arXiv:1901.10634 [pdf, other]

Privacy-preserving Q-Learning with Functional Noise in Continuous State Spaces

Authors: Baoxiang Wang, Nidhi Hegde

Abstract: We consider differentially private algorithms for reinforcement learning in continuous spaces, such that neighboring reward functions are indistinguishable. This protects the reward information from being exploited by methods such as inverse reinforcement learning. Existing studies that guarantee differential privacy are not extendable to infinite state spaces, as the noise level to ensure privacy will scale accordingly to infinity. Our aim is to protect the value function approximator, without regard to the number of states queried to the function. It is achieved by adding functional noise to the value function iteratively in the training. We show rigorous privacy guarantees by a series of analyses on the kernel of the noise space, the probabilistic bound of such noise samples, and the composition over the iterations. We gain insight into the utility analysis by proving the algorithm's approximate optimality when the state space is discrete. Experiments corroborate our theoretical findings and show improvement over existing approaches.

Submitted 11 November, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

Comments: Advances in Neural Information Processing Systems (NeurIPS) 2019

arXiv:1801.02889 [pdf, ps, other]

Optimal Content Replication and Request Matching in Large Caching Systems

Authors: Arpan Mukhopadhyay, Nidhi Hegde, Marc Lelarge

Abstract: We consider models of content delivery networks in which the servers are constrained by two main resources: memory and bandwidth. In such systems, the throughput crucially depends on how contents are replicated across servers and how the requests of specific contents are matched to servers storing those contents. In this paper, we first formulate the problem of computing the optimal replication policy which, if combined with the optimal matching policy, maximizes the throughput of the caching system in the stationary regime. It is shown that computing the optimal replication policy for a given system is an NP-hard problem. A greedy replication scheme is proposed and it is shown that the scheme provides a constant factor approximation guarantee. We then propose a simple randomized matching scheme which avoids the problem of interruption in service of the ongoing requests due to re-assignment or repacking of the existing requests in the optimal matching policy. The dynamics of the caching system is analyzed under the combination of proposed replication and matching schemes. We study a limiting regime, where the number of servers and the arrival rates of the contents are scaled proportionally, and show that the proposed policies achieve asymptotic optimality. Extensive simulation results are presented to evaluate the performance of different policies and study the behavior of the caching system under different service time distributions of the requests.

Submitted 9 January, 2018; originally announced January 2018.

Comments: INFOCOM 2018

arXiv:1212.0952 [pdf, ps, other]

doi 10.1016/j.tcs.2015.02.018

Self-Organizing Flows in Social Networks

Authors: Nidhi Hegde, Laurent Massoulié, Laurent Viennot

Abstract: Social networks offer users new means of accessing information, essentially relying on "social filtering", i.e. propagation and filtering of information by social contacts. The sheer amount of data flowing in these networks, combined with the limited budget of attention of each user, makes it difficult to ensure that social filtering brings relevant content to the interested users. Our motivation in this paper is to measure to what extent self-organization of the social network results in efficient social filtering. To this end we introduce flow games, a simple abstraction that models network formation under selfish user dynamics, featuring user-specific interests and budget of attention. In the context of homogeneous user interests, we show that selfish dynamics converge to a stable network structure (namely a pure Nash equilibrium) with close-to-optimal information dissemination. We show in contrast, for the more realistic case of heterogeneous interests, that convergence, if it occurs, may lead to information dissemination that can be arbitrarily inefficient, as captured by an unbounded "price of anarchy". Nevertheless the situation differs when users' interests exhibit a particular structure, captured by a metric space with low doubling dimension. In that case, natural autonomous dynamics converge to a stable configuration. Moreover, users obtain all the information of interest to them in the corresponding dissemination, provided their budget of attention is logarithmic in the size of their interest set.

Submitted 28 February, 2015; v1 submitted 5 December, 2012; originally announced December 2012.

Journal ref: Theoretical Computer Science, Elsevier, 2015, pp.16

arXiv:1207.3269 [pdf, ps, other]

The Price of Privacy in Untrusted Recommendation Engines

Authors: Siddhartha Banerjee, Nidhi Hegde, Laurent Massoulié

Abstract: Recent increase in online privacy concerns prompts the following question: can a recommender system be accurate if users do not entrust it with their private data? To answer this, we study the problem of learning item-clusters under local differential privacy, a powerful, formal notion of data privacy. We develop bounds on the sample-complexity of learning item-clusters from privatized user inputs. Significantly, our results identify a sample-complexity separation between learning in an information-rich and an information-scarce regime, thereby highlighting the interaction between privacy and the amount of information (ratings) available to each user. In the information-rich regime, where each user rates at least a constant fraction of items, a spectral clustering approach is shown to achieve a sample-complexity lower bound derived from a simple information-theoretic argument based on Fano's inequality. However, the information-scarce regime, where each user rates only a vanishing fraction of items, is found to require a fundamentally different approach both for lower bounds and algorithms. To this end, we develop new techniques for bounding mutual information under a notion of channel-mismatch, and also propose a new algorithm, MaxSense, and show that it achieves optimal sample-complexity in this setting. The techniques we develop for bounding mutual information may be of broader interest. To illustrate this, we show their applicability to (i) learning based on 1-bit sketches, and (ii) adaptive learning, where queries can be adapted based on answers to past queries.

Submitted 27 October, 2014; v1 submitted 13 July, 2012; originally announced July 2012.

Comments: Preliminary version presented at the 50th Allerton Conference, 2012

arXiv:1203.1891 [pdf, ps, other]

Optimal control of end-user energy storage

Authors: Peter M. van de Ven, Nidhi Hegde, Laurent Massoulie, Theodoros Salonidis

Abstract: An increasing number of retail energy markets show price fluctuations, providing users with the opportunity to buy energy at lower than average prices. We propose to temporarily store this inexpensive energy in a battery, and use it to satisfy demand when energy prices are high, thus allowing users to exploit the price variations without having to shift their demand to the low-price periods. We study the battery control policy that yields the best performance, i.e., minimizes the total discounted costs. The optimal policy is shown to have a threshold structure, and we derive these thresholds in a few special cases. The cost savings obtained from energy storage are demonstrated through extensive numerical experiments, and we offer various directions for future research.

Submitted 5 December, 2012; v1 submitted 8 March, 2012; originally announced March 2012.

arXiv:0909.1713 [pdf, ps, other]

Size Does Matter (in P2P Live Streaming)

Authors: Nidhi Hegde, Fabien Mathieu, Diego Perino

Abstract: Optimal dissemination schemes have previously been studied for peer-to-peer live streaming applications. Live streaming being a delay-sensitive application, fine-tuning of dissemination parameters is crucial. In this report, we investigate optimal sizing of chunks, the units of data exchange, and probe sets, the number of peers a given node probes before transmitting chunks. Chunk size can have significant impact on diffusion rate (chunk miss ratio), diffusion delay, and overhead. The size of the probe set can also affect these metrics, primarily through the choices available for chunk dissemination. We perform extensive simulations on the so-called random-peer, latest-useful dissemination scheme. Our results show that size does matter, with the optimal size being not too small in both cases.

Submitted 9 September, 2009; originally announced September 2009.

Report number: RR-7032

Showing 1–26 of 26 results for author: Hegde, N