Showing 1–50 of 141 results for author: Dheeraj

Searching in archive cs.
  1. arXiv:2408.05843  [pdf, other]

    cs.LG cs.IR stat.ML

    Online Matrix Completion: A Collaborative Approach with Hott Items

    Authors: Dheeraj Baby, Soumyabrata Pal

    Abstract: We investigate the low rank matrix completion problem in an online setting with ${M}$ users, ${N}$ items, ${T}$ rounds, and an unknown rank-$r$ reward matrix ${R}\in \mathbb{R}^{{M}\times {N}}$. This problem has been well-studied in the literature and has several applications in practice. In each round, we recommend ${S}$ carefully chosen distinct items to every user and observe noisy rewards. In… ▽ More

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: Appeared at the Forty-first International Conference on Machine Learning, 2024

  2. arXiv:2408.05686  [pdf, other]

    cs.LG cs.MA

    The Bandit Whisperer: Communication Learning for Restless Bandits

    Authors: Yunfan Zhao, Tonghan Wang, Dheeraj Nagaraj, Aparna Taneja, Milind Tambe

    Abstract: Applying Reinforcement Learning (RL) to Restless Multi-Arm Bandits (RMABs) offers a promising avenue for addressing allocation problems with resource constraints and temporal dynamics. However, classic RMAB models largely overlook the challenges of (systematic) data errors - a common occurrence in real-world scenarios due to factors like varying data collection protocols and intentional noise for… ▽ More

    Submitted 10 August, 2024; originally announced August 2024.

  3. arXiv:2407.06325  [pdf, other]

    cs.LG cs.DC math.OC

    CONGO: Compressive Online Gradient Optimization with Application to Microservices Management

    Authors: Jeremy Carleton, Prathik Vijaykumar, Divyanshu Saxena, Dheeraj Narasimha, Srinivas Shakkottai, Aditya Akella

    Abstract: We address the challenge of online convex optimization where the objective function's gradient exhibits sparsity, indicating that only a small number of dimensions possess non-zero gradients. Our aim is to leverage this sparsity to obtain useful estimates of the objective function's gradient even when the only information available is a limited number of function samples. Our motivation stems from… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 28 pages, 7 figures

  4. arXiv:2407.05778  [pdf, other]

    cs.CL cs.AI

    When is the consistent prediction likely to be a correct prediction?

    Authors: Alex Nguyen, Dheeraj Mekala, Chengyu Dong, Jingbo Shang

    Abstract: Self-consistency (Wang et al., 2023) suggests that the most consistent answer obtained through large language models (LLMs) is more likely to be correct. In this paper, we challenge this argument and propose a nuanced correction. Our observations indicate that consistent answers derived through more computation i.e. longer reasoning texts, rather than simply the most consistent answer across all o… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2407.03471  [pdf, other]

    cs.CV

    Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

    Authors: Benno Krojer, Dheeraj Vattikonda, Luis Lara, Varun Jampani, Eva Portelance, Christopher Pal, Siva Reddy

    Abstract: An image editing model should be able to perform diverse edits, ranging from object replacement, changing attributes or style, to performing actions or movement, which require many forms of reasoning. Current general instruction-guided editing models have significant shortcomings with action and reasoning-centric edits. Object, attribute or stylistic changes can be learned from visually static dat… ▽ More

    Submitted 9 August, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: Submitted to NeurIPS (Dataset & Benchmarks)

  6. arXiv:2407.00121  [pdf, other]

    cs.LG cs.AI cs.CL

    Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

    Authors: Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara, Shajith Ikbal, Sachin Joshi, Hima Karanam, Vineet Kumar, Asim Munawar, Sumit Neelam, Dinesh Raghu, Udit Sharma, Adriana Meza Soria, Dheeraj Sreedhar, Praveen Venkateswaran, Merve Unuvar, David Cox, Salim Roukos, Luis Lastras, et al. (1 additional author not shown)

    Abstract: Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (AP… ▽ More

    Submitted 27 June, 2024; originally announced July 2024.

  7. arXiv:2406.17591  [pdf, other]

    cs.CV

    DocParseNet: Advanced Semantic Segmentation and OCR Embeddings for Efficient Scanned Document Annotation

    Authors: Ahmad Mohammadshirazi, Ali Nosrati Firoozsalari, Mengxi Zhou, Dheeraj Kulshrestha, Rajiv Ramnath

    Abstract: Automating the annotation of scanned documents is challenging, requiring a balance between computational efficiency and accuracy. DocParseNet addresses this by combining deep learning and multi-modal learning to process both text and visual data. This model goes beyond traditional OCR and semantic segmentation, capturing the interplay between text and images to preserve contextual nuances in compl… ▽ More

    Submitted 21 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  8. POPCat: Propagation of particles for complex annotation tasks

    Authors: Adam Srebrnjak Yang, Dheeraj Khanna, John S. Zelek

    Abstract: Novel dataset creation for all multi-object tracking, crowd-counting, and industrial-based videos is arduous and time-consuming when faced with a unique class that densely populates a video sequence. We propose a time efficient method called POPCat that exploits the multi-target and temporal features of video data to produce a semi-supervised pipeline for segmentation or box-based video annotation… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures, Accepted in "Conference on Robots and Vision 2024"

  9. arXiv:2406.08848  [pdf, other]

    cs.CL cs.AI

    An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants

    Authors: G P Shrivatsa Bhargav, Sumit Neelam, Udit Sharma, Shajith Ikbal, Dheeraj Sreedhar, Hima Karanam, Sachindra Joshi, Pankaj Dhoolia, Dinesh Garg, Kyle Croutwater, Haode Qi, Eric Wayne, J William Murdock

    Abstract: We present an approach to build Large Language Model (LLM) based slot-filling system to perform Dialogue State Tracking in conversational assistants serving across a wide variety of industry-grade applications. Key requirements of this system include: 1) usage of smaller-sized models to meet low latency requirements and to enable convenient and cost-effective cloud and customer premise deployments… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  10. arXiv:2405.17068  [pdf, other]

    cs.LG math.NA stat.ML

    The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models

    Authors: Saravanan Kandasamy, Dheeraj Nagaraj

    Abstract: Langevin Dynamics is a Stochastic Differential Equation (SDE) central to sampling and generative modeling and is implemented via time discretization. Langevin Monte Carlo (LMC), based on the Euler-Maruyama discretization, is the simplest and most studied algorithm. LMC can suffer from slow convergence - requiring a large number of steps of small step-size to obtain good quality samples. This becom… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: "One often meets his destiny on the road he takes to avoid it" - Master Oogway. My destiny seems to be to write triangle inequalities for the rest of my life

  11. arXiv:2405.17035  [pdf, other]

    cs.LG

    Glauber Generative Model: Discrete Diffusion Models via Binary Classification

    Authors: Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam

    Abstract: We introduce the Glauber Generative Model (GGM), a new class of discrete diffusion models, to obtain new samples from a distribution given samples from a discrete space. GGM deploys a discrete Markov chain called the heat bath dynamics (or the Glauber dynamics) to denoise a sequence of noisy tokens to a sample from a joint distribution of discrete tokens. Our novel conceptual framework provides an… ▽ More

    Submitted 27 August, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  12. arXiv:2405.07698  [pdf, other]

    cs.CV

    oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving

    Authors: Abdul Hannan Khan, Syed Tahseen Raza Rizvi, Dheeraj Varma Chittari Macharavtu, Andreas Dengel

    Abstract: Autonomous driving systems require a quick and robust perception of the nearby environment to carry out their routines effectively. With the aim to avoid collisions and drive safely, autonomous driving systems rely heavily on object detection. However, 2D object detections alone are insufficient; more information, such as relative velocity and distance, is required for safer planning. Monocular 3D… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  13. arXiv:2404.09127  [pdf, other]

    cs.CL

    Confidence Calibration and Rationalization for LLMs via Multi-Agent Deliberation

    Authors: Ruixin Yang, Dheeraj Rajagopal, Shirley Anugrah Hayati, Bin Hu, Dongyeop Kang

    Abstract: Uncertainty estimation is a significant issue for current large language models (LLMs) that are generally poorly calibrated and over-confident, especially with reinforcement learning from human feedback (RLHF). Unlike humans, whose decisions and confidences not only stem from intrinsic beliefs but can also be adjusted through daily observations, existing calibration methods for LLMs focus on estim… ▽ More

    Submitted 10 May, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

    Comments: Accepted at ICLR 2024 Workshop on Reliable and Responsible Foundation Models

  14. arXiv:2404.00439  [pdf, other]

    cs.CL

    DOCMASTER: A Unified Platform for Annotation, Training, & Inference in Document Question-Answering

    Authors: Alex Nguyen, Zilong Wang, Jingbo Shang, Dheeraj Mekala

    Abstract: The application of natural language processing models to PDF documents is pivotal for various business applications yet the challenge of training models for this purpose persists in businesses due to specific hurdles. These include the complexity of working with PDF formats that necessitate parsing text and layout information for curating training data and the lack of privacy-preserving annotation… ▽ More

    Submitted 30 March, 2024; originally announced April 2024.

  15. arXiv:2402.14807  [pdf, other]

    cs.MA cs.AI cs.LG

    A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

    Authors: Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe

    Abstract: Restless multi-armed bandits (RMAB) have demonstrated success in optimizing resource allocation for large beneficiary populations in public health settings. Unfortunately, RMAB models lack flexibility to adapt to evolving public health policy priorities. Concurrently, Large Language Models (LLMs) have emerged as adept automated planners across domains of robotic control and navigation. In this pap… ▽ More

    Submitted 26 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  16. arXiv:2402.14158  [pdf, other]

    cs.CL

    TOOLVERIFIER: Generalization to New Tools via Self-Verification

    Authors: Dheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria Lomeli, Jingbo Shang, Jane Dwivedi-Yu

    Abstract: Teaching language models to use tools is an important milestone towards building general assistants, but remains an open problem. While there has been significant progress on learning to use specific tools via fine-tuning, language models still struggle with learning how to robustly use new tools from only a few demonstrations. In this work we introduce a self-verification method which distinguish… ▽ More

    Submitted 13 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  17. arXiv:2402.11728  [pdf, other]

    cs.CL cs.LG q-fin.CP

    Numerical Claim Detection in Finance: A New Financial Dataset, Weak-Supervision Model, and Market Analysis

    Authors: Agam Shah, Arnav Hiray, Pratvi Shah, Arkaprabha Banerjee, Anushka Singh, Dheeraj Eidnani, Bhaskar Chaudhury, Sudheer Chava

    Abstract: In this paper, we investigate the influence of claims in analyst reports and earnings calls on financial market returns, considering them as significant quarterly events for publicly traded companies. To facilitate a comprehensive analysis, we construct a new financial dataset for the claim detection task in the financial domain. We benchmark various language models on this dataset and propose a n… ▽ More

    Submitted 18 February, 2024; originally announced February 2024.

  18. arXiv:2402.11711  [pdf, other]

    cs.CL

    MORL-Prompt: An Empirical Analysis of Multi-Objective Reinforcement Learning for Discrete Prompt Optimization

    Authors: Yasaman Jafari, Dheeraj Mekala, Rose Yu, Taylor Berg-Kirkpatrick

    Abstract: RL-based techniques can be used to search for prompts that when fed into a target language model maximize a set of user-specified reward functions. However, in many target applications, the natural reward functions are in tension with one another -- for example, content preservation vs. style matching in style transfer tasks. Current techniques focus on maximizing the average of reward functions,… ▽ More

    Submitted 18 February, 2024; originally announced February 2024.

  19. arXiv:2402.10430  [pdf, other]

    cs.CL

    Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models

    Authors: Dheeraj Mekala, Alex Nguyen, Jingbo Shang

    Abstract: Instruction-tuning language models has become a crucial step in aligning them for general use. Typically, this process involves extensive training on large datasets, incurring high training costs. In this paper, we introduce a novel training data selection based on the learning percentage of the samples. We assert that current language models possess the capability to autonomously select high-qual… ▽ More

    Submitted 15 February, 2024; originally announced February 2024.

  20. arXiv:2402.04400  [pdf, other]

    cs.LG cs.AI cs.CY

    CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines

    Authors: Chao Pang, Xinzhuo Jiang, Nishanth Parameshwar Pavinkurve, Krishna S. Kalluri, Elise L. Minto, Jason Patterson, Linying Zhang, George Hripcsak, Gamze Gürsoy, Noémie Elhadad, Karthik Natarajan

    Abstract: Synthetic Electronic Health Records (EHR) have emerged as a pivotal tool in advancing healthcare applications and machine learning models, particularly for researchers without direct access to healthcare data. Although existing methods, like rule-based approaches and generative adversarial networks (GANs), generate synthetic data that resembles real-world EHR data, these methods often use a tabula… ▽ More

    Submitted 5 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  21. arXiv:2402.03545  [pdf, other]

    cs.LG

    Online Feature Updates Improve Online (Generalized) Label Shift Adaptation

    Authors: Ruihan Wu, Siddhartha Datta, Yi Su, Dheeraj Baby, Yu-Xiang Wang, Kilian Q. Weinberger

    Abstract: This paper addresses the prevalent issue of label shift in an online setting with missing labels, where data distributions change over time and obtaining timely labels is challenging. While existing methods primarily focus on adjusting or updating the final layer of a pre-trained classifier, we explore the untapped potential of enhancing feature representations using unlabeled data at test-time. O… ▽ More

    Submitted 5 February, 2024; originally announced February 2024.

  22. arXiv:2401.03340  [pdf, other]

    cs.CV

    Classifying cow stall numbers using YOLO

    Authors: Dheeraj Vajjarapu

    Abstract: This paper introduces the CowStallNumbers dataset, a collection of images extracted from videos focusing on cow teats, designed to advance the field of cow stall number detection. The dataset comprises 1042 training images and 261 test images, featuring stall numbers ranging from 0 to 60. To enhance the dataset, we performed fine-tuning on a YOLO model and applied data augmentation techniques, inc… ▽ More

    Submitted 23 November, 2023; originally announced January 2024.

  23. arXiv:2311.09799  [pdf, other]

    cs.CL

    How Far Can We Extract Diverse Perspectives from Large Language Models?

    Authors: Shirley Anugrah Hayati, Minhwa Lee, Dheeraj Rajagopal, Dongyeop Kang

    Abstract: Collecting diverse human opinions is costly and challenging. This leads to a recent trend in collaborative efforts between humans and Large Language Models (LLMs) for generating diverse data, offering potential scalable and efficient solutions. However, the extent of LLMs' capability to generate diverse perspectives on subjective topics remains an unexplored question. In this study, we investigate… ▽ More

    Submitted 18 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  24. ezBIDS: Guided standardization of neuroimaging data interoperable with major data archives and platforms

    Authors: Daniel Levitas, Soichi Hayashi, Sophia Vinci-Booher, Anibal Heinsfeld, Dheeraj Bhatia, Nicholas Lee, Anthony Galassi, Guiomar Niso, Franco Pestilli

    Abstract: Data standardization has become one of the leading methods neuroimaging researchers rely on for data sharing and reproducibility. Data standardization promotes a common framework through which researchers can utilize others' data. Yet, as of today, formatting datasets that adhere to community best practices requires technical expertise involving coding and considerable knowledge of file formats an… ▽ More

    Submitted 1 November, 2023; originally announced November 2023.

  25. arXiv:2311.03319  [pdf, other]

    cs.CL cs.AI

    DAIL: Data Augmentation for In-Context Learning via Self-Paraphrase

    Authors: Dawei Li, Yaxuan Li, Dheeraj Mekala, Shuyao Li, Yulin Wang, Xueqi Wang, William Hogan, Jingbo Shang

    Abstract: In-Context Learning (ICL) combined with pre-trained large language models has achieved promising results on various NLP tasks. However, ICL requires high-quality annotated demonstrations which might not be available in real-world scenarios. To overcome this limitation, we propose \textbf{D}ata \textbf{A}ugmentation for \textbf{I}n-Context \textbf{L}earning (\textbf{DAIL}). DAIL leverages the intui… ▽ More

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Course project for DSC 253 (Advanced Data-Driven Text Mining) at UCSD

  26. arXiv:2310.16132  [pdf, other]

    cs.SE

    Diversity in Software Engineering Conferences and Journals

    Authors: Aditya Shankar Narayanan, Dheeraj Vagavolu, Nancy A Day, Meiyappan Nagappan

    Abstract: Diversity with respect to ethnicity and gender has been studied in open-source and industrial settings for software development. Publication avenues such as academic conferences and journals contribute to the growing technology industry. However, there have been very few diversity-related studies conducted in the context of academia. In this paper, we study the ethnic, gender, and geographical div… ▽ More

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 13 pages, 10 figures, 4 tables

  27. arXiv:2310.14526  [pdf, other]

    cs.LG cs.AI

    Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization

    Authors: Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe

    Abstract: Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective. Prior RMAB research suffers from several limitations, e.g., it fails to adequately address continuous states, and requires retraining from scratch when… ▽ More

    Submitted 29 January, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

  28. arXiv:2310.12963  [pdf, other]

    cs.CL cs.AI

    AutoMix: Automatically Mixing Language Models

    Authors: Pranjal Aggarwal, Aman Madaan, Ankit Anand, Srividya Pranavi Potharaju, Swaroop Mishra, Pei Zhou, Aditya Gupta, Dheeraj Rajagopal, Karthik Kappaganthu, Yiming Yang, Shyam Upadhyay, Manaal Faruqui, Mausam

    Abstract: Large language models (LLMs) are now available from cloud API providers in various sizes and configurations. While this diversity offers a broad spectrum of choices, effectively leveraging the options to optimize computational cost and performance remains challenging. In this work, we present Automix, an approach that strategically routes queries to larger LMs, based on the approximate correctness… ▽ More

    Submitted 28 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: The first two authors contributed equally. Work started and partly done during Aman's internship at Google. This version adds results on additional models and datasets

  29. PyDCM: Custom Data Center Models with Reinforcement Learning for Sustainability

    Authors: Avisek Naug, Antonio Guillen, Ricardo Luna Gutiérrez, Vineet Gundecha, Dejan Markovikj, Lekhapriya Dheeraj Kashyap, Lorenz Krause, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Soumyendu Sarkar

    Abstract: The increasing global emphasis on sustainability and reducing carbon emissions is pushing governments and corporations to rethink their approach to data center design and operation. Given their high energy consumption and exponentially large computational workloads, data centers are prime candidates for optimizing power consumption, especially in areas such as cooling and IT energy usage. A signif… ▽ More

    Submitted 26 March, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: The 10th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (BuildSys '23), November 15-16, 2023, Istanbul, Turkey

    Journal ref: 2023 BuildSys '23: Proceedings of the 10th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation

  30. arXiv:2310.01515  [pdf, other]

    quant-ph cs.LG

    Tensor Ring Optimized Quantum-Enhanced Tensor Neural Networks

    Authors: Debanjan Konar, Dheeraj Peddireddy, Vaneet Aggarwal, Bijaya K. Panigrahi

    Abstract: Quantum machine learning researchers often rely on incorporating Tensor Networks (TN) into Deep Neural Networks (DNN) and variational optimization. However, the standard optimization techniques used for training the contracted trainable weights of each model layer suffer from the correlations and entanglement structure between the model parameters on classical implementations. To address this issu… ▽ More

    Submitted 2 October, 2023; originally announced October 2023.

  31. arXiv:2309.09206  [pdf, other]

    cs.RO cs.CV cs.LG

    Differentiable SLAM Helps Deep Learning-based LiDAR Perception Tasks

    Authors: Prashant Kumar, Dheeraj Vattikonda, Vedang Bhupesh Shenvi Nadkarni, Erqun Dong, Sabyasachi Sahoo

    Abstract: We investigate a new paradigm that uses differentiable SLAM architectures in a self-supervised manner to train end-to-end deep learning models in various LiDAR based applications. To the best of our knowledge there does not exist any work that leverages SLAM as a training signal for deep learning based models. We explore new ways to improve the efficiency, robustness, and adaptability of LiDAR sys… ▽ More

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: 15 pages, 6 tables, 3 figures. Accepted at BMVC 2023

  32. arXiv:2308.16041  [pdf, other]

    cs.CV

    From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications

    Authors: Shreyank N Gowda, Dheeraj Pandey, Shashank Narayana Gowda

    Abstract: Recent advancements in deep learning and computer vision have led to a surge of interest in generating realistic talking heads. This paper presents a comprehensive survey of state-of-the-art methods for talking head generation. We systematically categorize them into four main approaches: image-driven, audio-driven, video-driven and others (including neural radiance fields (NeRF), and 3D-based met… ▽ More

    Submitted 30 August, 2023; originally announced August 2023.

  33. arXiv:2307.03884  [pdf, other]

    quant-ph cs.LG

    Noisy Tensor Ring approximation for computing gradients of Variational Quantum Eigensolver for Combinatorial Optimization

    Authors: Dheeraj Peddireddy, Utkarsh Priyam, Vaneet Aggarwal

    Abstract: Variational Quantum algorithms, especially Quantum Approximate Optimization and Variational Quantum Eigensolver (VQE) have established their potential to provide computational advantage in the realm of combinatorial optimization. However, these algorithms suffer from classically intractable gradients limiting the scalability. This work addresses the scalability challenge for VQE by proposing a cla… ▽ More

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: 12 pages, 13 figures, preprint

  34. arXiv:2306.14288  [pdf, other]

    stat.ML cs.LG math.ST

    Near Optimal Heteroscedastic Regression with Symbiotic Learning

    Authors: Dheeraj Baby, Aniket Das, Dheeraj Nagaraj, Praneeth Netrapalli

    Abstract: We consider the problem of heteroscedastic linear regression, where, given $n$ samples $(\mathbf{x}_i, y_i)$ from $y_i = \langle \mathbf{w}^{*}, \mathbf{x}_i \rangle + \epsilon_i \cdot \langle \mathbf{f}^{*}, \mathbf{x}_i \rangle$ with $\mathbf{x}_i \sim N(0,\mathbf{I})$, $\epsilon_i \sim N(0,1)$, we aim to estimate $\mathbf{w}^{*}$. Beyond classical applications of such models in statistics, econometrics, time… ▽ More

    Submitted 1 July, 2023; v1 submitted 25 June, 2023; originally announced June 2023.

    Comments: To appear in Conference on Learning Theory 2023 (COLT 2023)

  35. arXiv:2306.09222  [pdf, other]

    cs.LG cs.AI

    Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization

    Authors: Ramnath Kumar, Kushal Majmundar, Dheeraj Nagaraj, Arun Sai Suggala

    Abstract: We present Re-weighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample importance weighting. Our method is grounded in the principles of distributionally robust optimization (DRO) with Kullback-Leibler divergence. RGD is simple to implement, computationally efficient, and compatible with widely used optimizers such… ▽ More

    Submitted 26 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  36. arXiv:2306.07305  [pdf, other]

    cs.LG cs.AI q-fin.CP

    Making forecasting self-learning and adaptive -- Pilot forecasting rack

    Authors: Shaun D'Souza, Dheeraj Shah, Amareshwar Allati, Parikshit Soni

    Abstract: Retail sales and price projections are typically based on time series forecasting. For some product categories, the accuracy of demand forecasts achieved is low, negatively impacting inventory, transport, and replenishment planning. This paper presents our findings based on a proactive pilot exercise to explore ways to help retailers to improve forecast accuracy for such product categories. We e… ▽ More

    Submitted 11 June, 2023; originally announced June 2023.

  37. arXiv:2306.02183  [pdf]

    cs.DC q-bio.NC q-bio.QM

    brainlife.io: A decentralized and open source cloud platform to support neuroscience research

    Authors: Soichi Hayashi, Bradley A. Caron, Anibal Sólon Heinsfeld, Sophia Vinci-Booher, Brent McPherson, Daniel N. Bullock, Giulia Bertò, Guiomar Niso, Sandra Hanekamp, Daniel Levitas, Kimberly Ray, Anne MacKenzie, Lindsey Kitchell, Josiah K. Leong, Filipi Nascimento-Silva, Serge Koudoro, Hanna Willis, Jasleen K. Jolly, Derek Pisner, Taylor R. Zuidema, Jan W. Kurzawski, Kyriaki Mikellidou, Aurore Bussalb, Christopher Rorden, Conner Victory, et al. (39 additional authors not shown)

    Abstract: Neuroscience research has expanded dramatically over the past 30 years by advancing standardization and tool development to support rigor and transparency. Consequently, the complexity of the data pipeline has also increased, hindering access to FAIR (Findable, Accessible, Interoperable, and Reusable) data analysis to portions of the worldwide research community. brainlife.io was developed to red… ▽ More

    Submitted 11 August, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

  38. arXiv:2305.19570  [pdf, other]

    stat.ML cs.LG

    Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms

    Authors: Dheeraj Baby, Saurabh Garg, Tzu-Ching Yen, Sivaraman Balakrishnan, Zachary Chase Lipton, Yu-Xiang Wang

    Abstract: This paper focuses on supervised and unsupervised online label shift, where the class marginals $Q(y)$ vary but the class-conditionals $Q(x|y)$ remain invariant. In the unsupervised setting, our goal is to adapt a learner, trained on some offline labeled data, to changing label distributions given unlabeled online data. In the supervised setting, we must both learn a classifier and adapt to the… ▽ More

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: First three authors contributed equally

  39. arXiv:2305.17558  [pdf, other]

    stat.ML cs.LG math.ST

    Provably Fast Finite Particle Variants of SVGD via Virtual Particle Stochastic Approximation

    Authors: Aniket Das, Dheeraj Nagaraj

    Abstract: Stein Variational Gradient Descent (SVGD) is a popular variational inference algorithm which simulates an interacting particle system to approximately sample from a target distribution, with impressive empirical performance across various domains. Theoretically, its population (i.e., infinite-particle) limit dynamics is well studied but the behavior of SVGD in the finite-particle regime is much les… ▽ More

    Submitted 5 October, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: To appear as a Spotlight Paper in The 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  40. arXiv:2305.14696  [pdf, other]

    cs.CL

    SELFOOD: Self-Supervised Out-Of-Distribution Detection via Learning to Rank

    Authors: Dheeraj Mekala, Adithya Samavedhi, Chengyu Dong, Jingbo Shang

    Abstract: Deep neural classifiers trained with cross-entropy loss (CE loss) often suffer from poor calibration, necessitating the task of out-of-distribution (OOD) detection. Traditional supervised OOD detection methods require expensive manual annotation of in-distribution and OOD samples. To address the annotation bottleneck, we introduce SELFOOD, a self-supervised OOD detection method that requires only… ▽ More

    Submitted 24 May, 2023; originally announced May 2023.

  41. arXiv:2305.12749  [pdf, other]

    cs.CL

    A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches

    Authors: Zihan Wang, Tianle Wang, Dheeraj Mekala, Jingbo Shang

    Abstract: Extremely Weakly Supervised Text Classification (XWS-TC) refers to text classification based on minimal high-level human guidance, such as a few label-indicative seed words or classification instructions. There are two mainstream approaches for XWS-TC, however, never being rigorously compared: (1) training classifiers based on pseudo-labels generated by (softly) matching seed words (SEED) and (2) p…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Findings

  42. Hardware-Impaired Rician-Faded Cell-Free Massive MIMO Systems With Channel Aging

    Authors: Venkatesh Tentu, Dheeraj N Amudala, Anish Chattopadhyay, Rohit Budhiraja

    Abstract: We study the impact of channel aging on the uplink of a cell-free (CF) massive multiple-input multiple-output (mMIMO) system by considering i) spatially-correlated Rician-faded channels; ii) hardware impairments at the access points and user equipments (UEs); and iii) two-layer large-scale fading decoding (LSFD). We first derive a closed-form spectral efficiency (SE) expression for this system, an…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: This work has been submitted to the IEEE Transactions on Communications for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible, 32 pages, 14 figures

  43. LSFD for Rician-Faded Cell-Free mMIMO Systems With Channel Aging and Hardware Impairments

    Authors: Anish Chattopadhyay, Venkatesh Tentu, Dheeraj Naidu Amudala, Rohit Budhiraja

    Abstract: We study the impact of channel aging on the uplink of a cell-free massive multiple-input multiple-output system with hardware impairments. We consider a dynamic analog-to-digital converter architecture at the access points (APs), and low-resolution digital-to-analog converters at the user equipments (UEs). We derive a closed-form spectral efficiency expression by considering i) practical spatially…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: This paper is accepted for presentation in 2023 IEEE International Conference on Communications (ICC): Wireless Communications Symposium (IEEE ICC'23 - WC Symposium), 6 pages and 4 figures

    Journal ref: ICC 2023 - IEEE International Conference on Communications, 28 May 2023 - 01 June 2023

  44. arXiv:2212.10815  [pdf, other]

    cs.CL

    ZEROTOP: Zero-Shot Task-Oriented Semantic Parsing using Large Language Models

    Authors: Dheeraj Mekala, Jason Wolfe, Subhro Roy

    Abstract: We explore the use of large language models (LLMs) for zero-shot semantic parsing. Semantic parsing involves mapping natural language utterances to task-specific meaning representations. Language models are generally trained on the publicly available text and code and cannot be expected to directly generalize to domain-specific parsing tasks in a zero-shot setting. In this work, we propose ZEROTOP…

    Submitted 21 December, 2022; originally announced December 2022.

  45. arXiv:2211.16276  [pdf, ps, other]

    cs.IT eess.SP

    Hardware-Aware Pilot Decontamination Precoding for Multi-cell mMIMO Systems With Rician Fading

    Authors: Harshit Kesarwani, Dheeraj Naidu Amudala, Venkatesh Tentu, Rohit Budhiraja

    Abstract: We consider a hardware-impaired multi-cell Rician faded massive multi-input multi-output (mMIMO) system with two-layer pilot decontamination precoding, also known as large-scale fading precoding (LSFP). Each BS is equipped with a flexible dynamic analog-to-digital converter (ADC)/digital-to-analog converter (DAC) architecture and the user equipments (UEs) have low-resolution ADCs. Further, both BS…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: This paper is accepted for presentation in 2022 IEEE Global Communications Conference: Wireless Communications (Globecom 2022 WC), 7 pages and 4 figures

  46. arXiv:2211.10527  [pdf, other]

    cs.NI eess.SP

    PMNet: Robust Pathloss Map Prediction via Supervised Learning

    Authors: Ju-Hyung Lee, Omer Gokalp Serbetci, Dheeraj Panneer Selvam, Andreas F. Molisch

    Abstract: Pathloss prediction is an essential component of wireless network planning. While ray tracing based methods have been successfully used for many years, they require significant computational effort that may become prohibitive with the increased network densification and/or use of higher frequencies in 5G/B5G (beyond 5G) systems. In this paper, we propose and evaluate a data-driven and model-free p…

    Submitted 16 May, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

  47. arXiv:2211.00112  [pdf, other]

    cs.MA cs.AI cs.LG math.OC

    Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits

    Authors: Abheek Ghosh, Dheeraj Nagaraj, Manish Jain, Milind Tambe

    Abstract: We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare. Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality unde…

    Submitted 28 February, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: 21 pages; AAMAS'23 version with appendix

  48. arXiv:2211.00083  [pdf, other]

    cs.CL cs.AI cs.LG

    WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain

    Authors: Raj Sanjay Shah, Kunal Chawla, Dheeraj Eidnani, Agam Shah, Wendi Du, Sudheer Chava, Natraj Raman, Charese Smiley, Jiaao Chen, Diyi Yang

    Abstract: Pre-trained language models have shown impressive performance on a variety of tasks and domains. Previous research on financial language models usually employs a generic training scheme to train standard model architectures, without completely leveraging the richness of the financial data. We propose a novel domain specific Financial LANGuage model (FLANG) which uses financial keywords and phrases…

    Submitted 31 October, 2022; originally announced November 2022.

  49. arXiv:2210.14380  [pdf, other]

    cs.CL

    Progressive Sentiment Analysis for Code-Switched Text Data

    Authors: Sudhanshu Ranjan, Dheeraj Mekala, Jingbo Shang

    Abstract: Multilingual transformer language models have recently attracted much attention from researchers and are used in cross-lingual transfer learning for many NLP tasks such as text classification and named entity recognition. However, similar methods for transfer learning from monolingual text to code-switched text have not been extensively explored mainly due to the following challenges: (1) Code-swi…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: To appear in Findings of EMNLP 2022

  50. arXiv:2210.07469  [pdf, other]

    cs.CL

    StyLEx: Explaining Style Using Human Lexical Annotations

    Authors: Shirley Anugrah Hayati, Kyumin Park, Dheeraj Rajagopal, Lyle Ungar, Dongyeop Kang

    Abstract: Large pre-trained language models have achieved impressive results on various style classification tasks, but they often learn spurious domain-specific words to make predictions (Hayati et al., 2021). While human explanation highlights stylistic tokens as important features for this task, we observe that model explanations often do not align with them. To tackle this issue, we introduce StyLEx, a…

    Submitted 14 April, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: EACL 2023