
Showing 1–50 of 171 results for author: Chowdhury, R

Searching in archive cs. Search in all archives.
  1. arXiv:2407.18676  [pdf, other]

    cs.LG

    Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

    Authors: Seongho Son, William Bankes, Sayak Ray Chowdhury, Brooks Paige, Ilija Bogunovic

    Abstract: Reinforcement learning from human feedback (RLHF) aligns Large Language Models (LLMs) with human preferences. However, these preferences can often change over time due to external factors (e.g. environment change and societal influence). Consequently, what was wrong then might be right now. Current preference optimization algorithms do not account for temporal preference drift in their modeling, w…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 30 pages, 9 figures

  2. arXiv:2407.18108  [pdf, other]

    cs.LG cs.CY cs.SI physics.soc-ph

    Graph Neural Ordinary Differential Equations for Coarse-Grained Socioeconomic Dynamics

    Authors: James Koch, Pranab Roy Chowdhury, Heng Wan, Parin Bhaduri, Jim Yoon, Vivek Srikrishnan, W. Brent Daniel

    Abstract: We present a data-driven machine-learning approach for modeling space-time socioeconomic dynamics. Through coarse-graining fine-scale observations, our modeling framework simplifies these complex systems to a set of tractable mechanistic relationships -- in the form of ordinary differential equations -- while preserving critical system behaviors. This approach allows for expedited 'what if' studie…

    Submitted 25 July, 2024; originally announced July 2024.

  3. arXiv:2407.14885  [pdf, other]

    cs.CL cs.CV

    Falcon2-11B Technical Report

    Authors: Quentin Malartic, Nilabhra Roy Chowdhury, Ruxandra Cojocaru, Mugariya Farooq, Giulia Campesan, Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Ankit Singh, Maksim Velikanov, Basma El Amel Boussaha, Mohammed Al-Yafeai, Hamza Alobeidli, Leen Al Qadi, Mohamed El Amine Seddik, Kirill Fedyanin, Reda Alami, Hakim Hacid

    Abstract: We introduce Falcon2-11B, a foundation model trained on over five trillion tokens, and its multimodal counterpart, Falcon2-11B-vlm, which is a vision-to-text model. We report our findings during the training of the Falcon2-11B which follows a multi-stage approach where the early stages are distinguished by their context length and a final stage where we use a curated, high-quality dataset. Additio…

    Submitted 20 July, 2024; originally announced July 2024.

  4. arXiv:2407.11383  [pdf, other]

    cs.CV cs.AI

    TM-PATHVQA:90000+ Textless Multilingual Questions for Medical Visual Question Answering

    Authors: Tonmoy Rajkhowa, Amartya Roy Chowdhury, Sankalp Nagaonkar, Achyut Mani Tripathi

    Abstract: In healthcare and medical diagnostics, Visual Question Answering (VQA) may emerge as a pivotal tool in scenarios where analysis of intricate medical images becomes critical for accurate diagnoses. Current text-based VQA systems limit their utility in scenarios where hands-free interaction and accessibility are crucial while performing tasks. A speech-based VQA system may provide a better means of inte…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: conference (Accepted at Interspeech 2024)

  5. arXiv:2406.17740  [pdf, other]

    cs.LG cs.AI cs.CV

    Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning

    Authors: Arijit Sehanobish, Avinava Dubey, Krzysztof Choromanski, Somnath Basu Roy Chowdhury, Deepali Jain, Vikas Sindhwani, Snigdha Chaturvedi

    Abstract: Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts. Parameter-efficient fine-tuning (PEFT) approaches have emerged as a viable alternative by allowing us to fine-tune models by updating only a small number of parameters. I…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Work in progress

  6. arXiv:2406.16257  [pdf, other]

    cs.LG

    Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

    Authors: Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish, Avinava Dubey, Snigdha Chaturvedi

    Abstract: Machine unlearning is the process of efficiently removing the influence of a training data instance from a trained machine learning model without retraining it from scratch. A popular subclass of unlearning approaches is exact machine unlearning, which focuses on techniques that explicitly guarantee the removal of the influence of a data instance from a model. Exact unlearning approaches use a mac…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Work in Progress

  7. arXiv:2406.15881  [pdf, other]

    cs.LG cs.AI

    Fast Tree-Field Integrators: From Low Displacement Rank to Topological Transformers

    Authors: Krzysztof Choromanski, Arijit Sehanobish, Somnath Basu Roy Chowdhury, Han Lin, Avinava Dubey, Tamas Sarlos, Snigdha Chaturvedi

    Abstract: We present a new class of fast polylog-linear algorithms based on the theory of structured matrices (in particular low displacement rank) for integrating tensor fields defined on weighted trees. Several applications of the resulting fast tree-field integrators (FTFIs) are presented, including (a) approximation of graph metrics with tree metrics, (b) graph classification, (c) modeling on meshes, an…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Preprint. Comments welcome

  8. arXiv:2406.11107  [pdf, other]

    cs.CL

    Exploring Safety-Utility Trade-Offs in Personalized Language Models

    Authors: Anvesh Rao Vijjini, Somnath Basu Roy Chowdhury, Snigdha Chaturvedi

    Abstract: As large language models (LLMs) become increasingly integrated into daily applications, it is essential to ensure they operate fairly across diverse user demographics. In this work, we show that LLMs suffer from personalization bias, where their performance is impacted when they are personalized to a user's identity. We quantify personalization bias by evaluating the performance of LLMs along two…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Work in Progress

  9. arXiv:2405.16646  [pdf, other]

    cs.LG

    A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts

    Authors: Mohammed Nowaz Rabbani Chowdhury, Meng Wang, Kaoutar El Maghraoui, Naigang Wang, Pin-Yu Chen, Christopher Carothers

    Abstract: The sparsely gated mixture of experts (MoE) architecture sends different inputs to different subnetworks, i.e., experts, through trainable routers. MoE reduces the training computation significantly for large models, but its deployment can be still memory or computation expensive for some downstream tasks. Model pruning is a popular approach to reduce inference computation, but its application in…

    Submitted 30 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Journal ref: The 41st International Conference on Machine Learning, ICML 2024

  10. arXiv:2405.02665  [pdf, ps, other]

    cs.CR

    Metric Differential Privacy at the User-Level

    Authors: Jacob Imola, Amrita Roy Chowdhury, Kamalika Chaudhuri

    Abstract: Metric differential privacy (DP) provides heterogeneous privacy guarantees based on a distance between the pair of inputs. It is a widely popular notion of privacy since it captures the natural privacy semantics for many applications (such as, for location data) and results in better utility than standard DP. However, prior work in metric DP has primarily focused on the item-level setting…

    Submitted 4 May, 2024; originally announced May 2024.

  11. arXiv:2405.00441  [pdf, other]

    cs.CR

    Modeling Linear and Non-linear Layers: An MILP Approach Towards Finding Differential and Impossible Differential Propagations

    Authors: Debranjan Pal, Vishal Pankaj Chandratreya, Abhijit Das, Dipanwita Roy Chowdhury

    Abstract: Symmetric key cryptography stands as a fundamental cornerstone in ensuring security within contemporary electronic communication frameworks. The cryptanalysis of classical symmetric key ciphers involves traditional methods and techniques aimed at breaking or analyzing these cryptographic systems. In the evaluation of new ciphers, the resistance against linear and differential cryptanalysis is comm…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 42 pages, 2 figures, 21 tables, 7 algorithms

  12. arXiv:2404.09147  [pdf]

    cs.HC

    Evaluating the efficacy of haptic feedback, 360° treadmill-integrated Virtual Reality framework and longitudinal training on decision-making performance in a complex search-and-shoot simulation

    Authors: Akash K Rao, Arnav Bhavsar, Shubhajit Roy Chowdhury, Sushil Chandra, Ramsingh Negi, Prakash Duraisamy, Varun Dutt

    Abstract: Virtual Reality (VR) has made significant strides, offering users a multitude of ways to interact with virtual environments. Each sensory modality in VR provides distinct inputs and interactions, enhancing the user's immersion and presence. However, the potential of additional sensory modalities, such as haptic feedback and 360° locomotion, to improve decision-making performance has not been thoro…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 13 pages, 6 figures, 1 Table

  13. arXiv:2404.05159  [pdf]

    cs.CL cs.CR cs.LG

    Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods

    Authors: Roopkatha Dey, Aivy Debnath, Sayak Kumar Dutta, Kaustav Ghosh, Arijit Mitra, Arghya Roy Chowdhury, Jaydip Sen

    Abstract: In various real-world applications such as machine translation, sentiment analysis, and question answering, a pivotal role is played by NLP models, facilitating efficient communication and decision-making processes in domains ranging from healthcare to finance. However, a significant challenge is posed to the robustness of these natural language processing models by text adversarial attacks. These…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: This report pertains to the Capstone Project done by Group 2 of the Fall batch of 2023 students at Praxis Tech School, Kolkata, India. The reports consists of 28 pages and it includes 10 tables. This is the preprint which will be submitted to IEEE CONIT 2024 for review

  14. arXiv:2404.03606  [pdf, other]

    cs.SD cs.AI cs.IR eess.AS

    Analyzing Musical Characteristics of National Anthems in Relation to Global Indices

    Authors: S M Rakib Hasan, Aakar Dhakal, Ms. Ayesha Siddiqua, Mohammad Mominur Rahman, Md Maidul Islam, Mohammed Arfat Raihan Chowdhury, S M Masfequier Rahman Swapno, SM Nuruzzaman Nobel

    Abstract: Music plays a huge part in shaping peoples' psychology and behavioral patterns. This paper investigates the connection between national anthems and different global indices with computational music analysis and statistical correlation analysis. We analyze national anthem musical data to determine whether certain musical characteristics are associated with peace, happiness, suicide rate, crime rate…

    Submitted 4 April, 2024; originally announced April 2024.

  15. arXiv:2404.03570  [pdf, other]

    cs.RO

    Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity

    Authors: Jake Varley, Sumeet Singh, Deepali Jain, Krzysztof Choromanski, Andy Zeng, Somnath Basu Roy Chowdhury, Avinava Dubey, Vikas Sindhwani

    Abstract: We present an embodied AI system which receives open-ended natural language instructions from a human, and controls two arms to collaboratively accomplish potentially long-horizon tasks over a large workspace. Our system is modular: it deploys state of the art Large Language Models for task planning, Vision-Language models for semantic perception, and Point Cloud transformers for grasping. With sem…

    Submitted 4 April, 2024; originally announced April 2024.

  16. arXiv:2403.07918  [pdf, other]

    cs.CY cs.AI cs.LG

    On the Societal Impact of Open Foundation Models

    Authors: Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan

    Abstract: Foundation models are powerful technologies: how they are released publicly directly shapes their societal impact. In this position paper, we focus on open foundation models, defined here as those with broadly available model weights (e.g. Llama 2, Stable Diffusion XL). We identify five distinctive properties (e.g. greater customizability, poor monitoring) of open foundation models that lead to bo…

    Submitted 27 February, 2024; originally announced March 2024.

  17. arXiv:2403.00409  [pdf, other]

    cs.LG cs.CL

    Provably Robust DPO: Aligning Language Models with Noisy Feedback

    Authors: Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan

    Abstract: Learning from preference-based feedback has recently gained traction as a promising approach to align language models with human interests. While these aligned generative models have demonstrated impressive capabilities across various tasks, their dependence on high-quality human preference data poses a bottleneck in practical applications. Specifically, noisy (incorrect and ambiguous) preference…

    Submitted 11 April, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  18. arXiv:2402.18128  [pdf, other]

    cs.CV cs.LG

    Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization

    Authors: Han Guo, Ramtin Hosseini, Ruiyi Zhang, Sai Ashish Somayajula, Ranak Roy Chowdhury, Rajesh K. Gupta, Pengtao Xie

    Abstract: Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. It operates by randomly masking image patches and reconstructing these masked patches using the unmasked ones. A key limitation of MAE lies in its disregard for the varying informativeness of different patches, as it uniformly selects patches to mask. To overcome this, some approaches pr…

    Submitted 28 February, 2024; originally announced February 2024.

  19. arXiv:2402.16173  [pdf]

    cs.NI cs.AI

    Communication Traffic Characteristics Reveal an IoT Devices Identity

    Authors: Rajarshi Roy Chowdhury, Debashish Roy, Pg Emeroylariffion Abas

    Abstract: Internet of Things (IoT) is one of the technological advancements of the twenty-first century which can improve living standards. However, it also imposes new types of security challenges, including device authentication, traffic types classification, and malicious traffic identification, in the network domain. Traditionally, internet protocol (IP) and media access control (MAC) addresses are util…

    Submitted 25 February, 2024; originally announced February 2024.

    Comments: 16 pages

    ACM Class: F.2.2; I.2.7

  20. arXiv:2402.12572  [pdf, other]

    cs.LG cs.AI cs.CR

    FairProof : Confidential and Certifiable Fairness for Neural Networks

    Authors: Chhavi Yadav, Amrita Roy Chowdhury, Dan Boneh, Kamalika Chaudhuri

    Abstract: Machine learning models are increasingly used in societal applications, yet legal and privacy concerns demand that they very often be kept confidential. Consequently, there is a growing distrust about the fairness properties of these models in the minds of consumers, who are often at the receiving end of model predictions. To this end, we propose FairProof -- a system that uses Zero-Knowledge Proofs (…

    Submitted 15 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  21. arXiv:2402.10500  [pdf, other]

    cs.LG cs.AI cs.CL

    Active Preference Optimization for Sample Efficient RLHF

    Authors: Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury

    Abstract: Reinforcement Learning from Human Feedback (RLHF) is pivotal in aligning Large Language Models (LLMs) with human preferences. Although aligned generative models have shown remarkable abilities in various tasks, their reliance on high-quality human preference data creates a costly bottleneck in the practical application of RLHF. One primary reason is that current methods rely on uniformly picking p…

    Submitted 5 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: New experimental results added. Some reorganization

  22. arXiv:2402.01801  [pdf, other]

    cs.LG cs.AI cs.CL

    Large Language Models for Time Series: A Survey

    Authors: Xiyuan Zhang, Ranak Roy Chowdhury, Rajesh K. Gupta, Jingbo Shang

    Abstract: Large Language Models (LLMs) have seen significant use in domains such as natural language processing and computer vision. Going beyond text, image and graphics, LLMs present a significant potential for analysis of time series data, benefiting domains such as climate, IoT, healthcare, traffic, audio and finance. This survey paper provides an in-depth exploration and a detailed taxonomy of the vari…

    Submitted 6 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: GitHub repository: https://github.com/xiyuanzh/awesome-llm-time-series

  23. arXiv:2402.00976  [pdf, ps, other]

    cs.LG cs.AI cs.NE

    Investigating Recurrent Transformers with Dynamic Halt

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: In this paper, we study the inductive biases of two major approaches to augmenting Transformers with a recurrent mechanism - (1) the approach of incorporating a depth-wise recurrence similar to Universal Transformers; and (2) the approach of incorporating a chunk-wise temporal recurrence like Temporal Latent Bottleneck. Furthermore, we propose and investigate novel ways to extend and combine the a…

    Submitted 31 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  24. arXiv:2402.00090  [pdf]

    q-bio.NC cs.HC

    Classification of attention performance post-longitudinal tDCS via functional connectivity and machine learning methods

    Authors: Akash K Rao, Vishnu K Menon, Arnav Bhavsar, Shubhajit Roy Chowdhury, Ramsingh Negi, Varun Dutt

    Abstract: Attention is the brain's mechanism for selectively processing specific stimuli while filtering out irrelevant information. Characterizing changes in attention following long-term interventions (such as transcranial direct current stimulation (tDCS)) has seldom been emphasized in the literature. To classify attention performance post-tDCS, this study uses functional connectivity and machine learnin…

    Submitted 31 January, 2024; originally announced February 2024.

    Comments: 6 pages, to be presented in the IEEE 9th International Conference for Convergence in Technology (I2CT),Pune, April 2024. arXiv admin note: substantial text overlap with arXiv:2401.17700

  25. arXiv:2401.17711  [pdf]

    cs.HC cs.AI

    Prediction of multitasking performance post-longitudinal tDCS via EEG-based functional connectivity and machine learning methods

    Authors: Akash K Rao, Shashank Uttrani, Vishnu K Menon, Darshil Shah, Arnav Bhavsar, Shubhajit Roy Chowdhury, Varun Dutt

    Abstract: Predicting and understanding the changes in cognitive performance, especially after a longitudinal intervention, is a fundamental goal in neuroscience. Longitudinal brain stimulation-based interventions like transcranial direct current stimulation (tDCS) induce short-term changes in the resting membrane potential and influence cognitive processes. However, very little research has been conducted o…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 16 pages, presented at the 30th International Conference on Neural Information Processing (ICONIP2023), Changsha, China, November 2023

  26. arXiv:2401.08047  [pdf, other]

    cs.CL cs.LG

    Incremental Extractive Opinion Summarization Using Cover Trees

    Authors: Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Manzil Zaheer, Andrew McCallum, Amr Ahmed, Snigdha Chaturvedi

    Abstract: Extractive opinion summarization involves automatically producing a summary of text about an entity (e.g., a product's reviews) by extracting representative sentences that capture prevalent opinions in the review set. Typically, in online marketplaces user reviews accumulate over time, and opinion summaries need to be updated periodically to provide customers with up-to-date information. In this w…

    Submitted 12 April, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted at TMLR

  27. arXiv:2312.01007  [pdf, other]

    cs.IR cs.AI

    A Hypergraph-Based Approach to Recommend Online Resources in a Library

    Authors: Debashish Roy, Rajarshi Roy Chowdhury

    Abstract: When users in a digital library read or browse online resources, it generates an immense amount of data. If the underlying system can recommend items, such as books and journals, to the users, it will help them to find the related items. This research analyzes a digital library's usage data to recommend items to its users, and it uses different clustering algorithms to design the recommender syste…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 12 Pages, 2 figures, and 1 table

  28. arXiv:2312.00194  [pdf]

    cs.LG cs.CL

    Robust Concept Erasure via Kernelized Rate-Distortion Maximization

    Authors: Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi

    Abstract: Distributed representations provide a vector space that captures meaningful relationships between data instances. The distributed nature of these representations, however, entangles together multiple attributes or concepts of data instances (e.g., the topic or sentiment of a text, characteristics of the author (age, gender, etc), etc). Recent work has proposed the task of concept erasure, in which…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  29. arXiv:2311.14711  [pdf, other]

    cs.CY cs.AI

    Towards Publicly Accountable Frontier LLMs: Building an External Scrutiny Ecosystem under the ASPIRE Framework

    Authors: Markus Anderljung, Everett Thornton Smith, Joe O'Brien, Lisa Soder, Benjamin Bucknall, Emma Bluemke, Jonas Schuett, Robert Trager, Lacey Strahm, Rumman Chowdhury

    Abstract: With the increasing integration of frontier large language models (LLMs) into society and the economy, decisions related to their training, deployment, and use have far-reaching implications. These decisions should not be left solely in the hands of frontier LLM developers. LLM users, civil society and policymakers need trustworthy sources of information to steer such decisions for the better. Inv…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted to Workshop on Socially Responsible Language Modelling Research (SoLaR) at the 2023 Conference on Neural Information Processing Systems (NeurIPS 2023)

    ACM Class: I.2.0

  30. arXiv:2311.10025  [pdf, other]

    cs.LG cs.AI cs.DC cs.NE

    A Novel Neural Network-Based Federated Learning System for Imbalanced and Non-IID Data

    Authors: Mahfuzur Rahman Chowdhury, Muhammad Ibrahim

    Abstract: With the growth of machine learning techniques, privacy of data of users has become a major concern. Most of the machine learning algorithms rely heavily on large amount of data which may be collected from various sources. Collecting these data yet maintaining privacy policies has become one of the most challenging tasks for the researchers. To combat this issue, researchers have introduced federa…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 48 pages

  31. arXiv:2311.06968  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    Physics-Informed Data Denoising for Real-Life Sensing Systems

    Authors: Xiyuan Zhang, Xiaohan Fu, Diyan Teng, Chengyu Dong, Keerthivasan Vijayakumar, Jiayun Zhang, Ranak Roy Chowdhury, Junsheng Han, Dezhi Hong, Rashmi Kulkarni, Jingbo Shang, Rajesh Gupta

    Abstract: Sensors measuring real-life physical processes are ubiquitous in today's interconnected world. These sensors inherently bear noise that often adversely affects performance and reliability of the systems they support. Classic filtering-based approaches introduce strong assumptions on the time or frequency characteristics of sensory measurements, while learning-based denoising approaches typically r…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: SenSys 2023

  32. arXiv:2311.04449  [pdf, other]

    cs.LG cs.CL

    Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Binary Balanced Tree RvNNs (BBT-RvNNs) enforce sequence composition according to a preset balanced binary tree structure. Thus, their non-linear recursion depth is just $\log_2 n$ ($n$ being the sequence length). Such logarithmic scaling makes BBT-RvNNs efficient and scalable on long sequence tasks such as Long Range Arena (LRA). However, such computational efficiency comes at a cost because BBT-R…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted at NeurIPS 2023

  33. arXiv:2310.20158  [pdf, other]

    cs.CL

    GAR-meets-RAG Paradigm for Zero-Shot Information Retrieval

    Authors: Daman Arora, Anush Kini, Sayak Ray Chowdhury, Nagarajan Natarajan, Gaurav Sinha, Amit Sharma

    Abstract: Given a query and a document corpus, the information retrieval (IR) task is to output a ranked list of relevant documents. Combining large language models (LLMs) with embedding-based retrieval models, recent work shows promising results on the zero-shot retrieval problem, i.e., no access to labeled data from the target domain. Two such popular paradigms are generation-augmented retrieval or GAR (g…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: preprint

  34. arXiv:2310.19733  [pdf, other]

    cs.LG cs.CR

    Differentially Private Reward Estimation with Preference Feedback

    Authors: Sayak Ray Chowdhury, Xingyu Zhou, Nagarajan Natarajan

    Abstract: Learning from preference-based feedback has recently gained considerable traction as a promising approach to align generative models with human interests. Instead of relying on numerical rewards, the generative models are trained using reinforcement learning with human feedback (RLHF). These approaches first solicit feedback from human labelers typically in the form of pairwise comparisons between…

    Submitted 30 October, 2023; originally announced October 2023.

  35. arXiv:2310.11401  [pdf, other]

    cs.LG

    Enhancing Group Fairness in Online Settings Using Oblique Decision Forests

    Authors: Somnath Basu Roy Chowdhury, Nicholas Monath, Ahmad Beirami, Rahul Kidambi, Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi

    Abstract: Fairness, especially group fairness, is an important consideration in the context of machine learning systems. The most commonly adopted group fairness-enhancing techniques are in-processing methods that rely on a mixture of a fairness objective (e.g., demographic parity) and a task-specific objective (e.g., cross-entropy) during the training process. However, when data arrives in an online fashio…

    Submitted 27 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 (Spotlight)

  36. Identifying and Mitigating the Security Risks of Generative AI

    Authors: Clark Barrett, Brad Boyd, Elie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang

    Abstract: Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well…

    Submitted 28 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Journal ref: Foundations and Trends in Privacy and Security 6 (2023) 1-52

  37. arXiv:2308.05179  [pdf]

    cs.CV

    JutePestDetect: An Intelligent Approach for Jute Pest Identification Using Fine-Tuned Transfer Learning

    Authors: Md. Simul Hasan Talukder, Mohammad Raziuddin Chowdhury, Md Sakib Ullah Sourav, Abdullah Al Rakin, Shabbir Ahmed Shuvo, Rejwan Bin Sulaiman, Musarrat Saberin Nipun, Muntarin Islam, Mst Rumpa Islam, Md Aminul Islam, Zubaer Haque

    Abstract: In certain Asian countries, Jute is one of the primary sources of income and Gross Domestic Product (GDP) for the agricultural sector. Like many other crops, Jute is prone to pest infestations, and its identification is typically made visually in countries like Bangladesh, India, Myanmar, and China. In addition, this method is time-consuming, challenging, and somewhat imprecise, which poses a subs…

    Submitted 28 May, 2023; originally announced August 2023.

    Comments: 29 Pages, 7 Tables, 7 Figures, 5 Appendix

  38. arXiv:2307.13859  [pdf, other]

    cs.CR

    Random (Un)rounding : Vulnerabilities in Discrete Attribute Disclosure in the 2021 Canadian Census

    Authors: Christopher West, Ivy Vecna, Raiyan Chowdhury

    Abstract: The 2021 Canadian census is notable for using a unique form of privacy, random rounding, which independently and probabilistically rounds discrete numerical attribute values. In this work, we explore how hierarchical summative correlation between discrete variables allows for both probabilistic and exact solutions to attribute values in the 2021 Canadian Census disclosure. We demonstrate that, in…

    Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Small formatting revision

  39. arXiv:2307.10779  [pdf, other]

    cs.LG

    Efficient Beam Tree Recursion

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Beam Tree Recursive Neural Network (BT-RvNN) was recently proposed as a simple extension of Gumbel Tree RvNN and it was shown to achieve state-of-the-art length generalization performance in ListOps while maintaining comparable performance on other tasks. However, although not the worst in its kind, BT-RvNN can be still exorbitantly expensive in memory usage. In this paper, we identify the main bo…

    Submitted 7 November, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted in NeurIPS 2023

  40. arXiv:2307.04699  [pdf, other]

    cs.CY

    International Institutions for Advanced AI

    Authors: Lewis Ho, Joslyn Barnhart, Robert Trager, Yoshua Bengio, Miles Brundage, Allison Carnegie, Rumman Chowdhury, Allan Dafoe, Gillian Hadfield, Margaret Levi, Duncan Snidal

    Abstract: International institutions may have an important role to play in ensuring advanced AI systems benefit humanity. International collaborations can unlock AI's ability to further sustainable development, and coordination of regulatory efforts can reduce obstacles to innovation and the spread of benefits. Conversely, the potential dangerous capabilities of powerful and general-purpose AI systems creat…

    Submitted 11 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: 19 pages, 2 figures, fixed rendering issues

    ACM Class: K.4.1

  41. arXiv:2306.06206  [pdf]

    cs.CV cs.LG

    PotatoPestNet: A CTInceptionV3-RS-Based Neural Network for Accurate Identification of Potato Pests

    Authors: Md. Simul Hasan Talukder, Rejwan Bin Sulaiman, Mohammad Raziuddin Chowdhury, Musarrat Saberin Nipun, Taminul Islam

    Abstract: Potatoes are the third-largest food crop globally, but their production frequently encounters difficulties because of aggressive pest infestations. The aim of this study is to investigate the various types and characteristics of these pests and propose an efficient PotatoPestNet AI-based automatic potato pest identification system. To accomplish this, we curated a reliable dataset consisting of ei…

    Submitted 15 July, 2023; v1 submitted 27 May, 2023; originally announced June 2023.

  42. arXiv:2306.04352  [pdf, other]

    cs.CR

    Security Analysis of WG-7 Lightweight Stream Cipher against Cube Attack

    Authors: Bijoy Das, Abhijit Das, Dipanwita Roy Chowdhury

    Abstract: Welch--Gong (WG) is a hardware-oriented LFSR-based stream cipher. WG-7 is a version of the eStream submission Welch--Gong, used for RFID encryption and authentication purposes. It offers 80-bit cryptographic security. In modern days, almost all ciphers achieve the security by exploiting the nonlinear feedback structure. In this paper, we investigate the security of the nonlinear feedback-based ini…

    Submitted 7 June, 2023; originally announced June 2023.

  43. arXiv:2306.04073  [pdf, other]

    cs.LG

    Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks

    Authors: Mohammed Nowaz Rabbani Chowdhury, Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen

    Abstract: In deep learning, mixture-of-experts (MoE) activates one or few experts (sub-networks) on a per-sample or per-token basis, resulting in significant computation reduction. The recently proposed \underline{p}atch-level routing in \underline{MoE} (pMoE) divides each input into $n$ patches (or tokens) and sends $l$ patches ($l\ll n$) to each expert through prioritized routing. pMoE has demonstrated gr…

    Submitted 6 June, 2023; originally announced June 2023.

    Journal ref: The 40th International Conference on Machine Learning (ICML), 2023

  44. arXiv:2306.02642  [pdf, other]

    cs.CR

    Efficient Algorithms for Modeling SBoxes Using MILP

    Authors: Debranjan Pal, Vishal Pankaj Chandratreya, Dipanwita Roy Chowdhury

    Abstract: Mixed Integer Linear Programming (MILP) is a well-known approach for the cryptanalysis of a symmetric cipher. A number of MILP-based security analyses have been reported for non-linear (SBoxes) and linear layers. Researchers proposed word- and bit-wise SBox modeling techniques using a set of inequalities which helps in searching differential trails for a cipher. In this paper, we propose two new t…

    Submitted 5 June, 2023; originally announced June 2023.

  45. arXiv:2306.01121  [pdf, other]

    cs.LG cs.AI

    Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards

    Authors: Yulian Wu, Xingyu Zhou, Sayak Ray Chowdhury, Di Wang

    Abstract: In this paper, we study the problem of (finite horizon tabular) Markov decision processes (MDPs) with heavy-tailed rewards under the constraint of differential privacy (DP). Compared with the previous studies for private reinforcement learning that typically assume rewards are sampled from some bounded or sub-Gaussian distributions to ensure DP, we consider the setting where reward distributions h…

    Submitted 5 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  46. arXiv:2305.20019  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Monotonic Location Attention for Length Generalization

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: We explore different ways to utilize position-based cross-attention in seq2seq networks to enable length generalization in algorithmic tasks. We show that a simple approach of interpolating the original and reversed encoded representations combined with relative attention allows near-perfect length generalization for both forward and reverse lookup tasks or copy tasks that had been generally hard…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted in ICML 2023

  47. arXiv:2305.19999  [pdf, other]

    cs.LG cs.AI cs.CL

    Beam Tree Recursive Cells

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: We propose Beam Tree Recursive Cell (BT-Cell) - a backpropagation-friendly framework to extend Recursive Neural Networks (RvNNs) with beam search for latent structure induction. We further extend this framework by proposing a relaxation of the hard top-k operators in beam search for better propagation of gradient signals. We evaluate our proposed models in different out-of-distribution splits in b…

    Submitted 20 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted in ICML 2023

  48. arXiv:2305.17968  [pdf, other]

    cs.CL

    Data Augmentation for Low-Resource Keyphrase Generation

    Authors: Krishna Garg, Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Keyphrase generation is the task of summarizing the contents of any given article into a few salient phrases (or keyphrases). Existing works for the task mostly rely on large-scale annotated datasets, which are not easy to acquire. Very few works address the problem of keyphrase generation in low-resource settings, but they still rely on a lot of additional unlabeled data for pretraining and on au…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 9 pages, 8 tables, To appear at the Findings of the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, Canada

  49. arXiv:2305.14553  [pdf]

    cs.CR cs.AI cs.CY

    Adversarial Machine Learning and Cybersecurity: Risks, Challenges, and Legal Implications

    Authors: Micah Musser, Andrew Lohn, James X. Dempsey, Jonathan Spring, Ram Shankar Siva Kumar, Brenda Leong, Christina Liaghati, Cindy Martinez, Crystal D. Grant, Daniel Rohrer, Heather Frase, Jonathan Elliott, John Bansemer, Mikel Rodriguez, Mitt Regan, Rumman Chowdhury, Stefan Hermanek

    Abstract: In July 2022, the Center for Security and Emerging Technology (CSET) at Georgetown University and the Program on Geopolitics, Technology, and Governance at the Stanford Cyber Policy Center convened a workshop of experts to examine the relationship between vulnerabilities in artificial intelligence systems and more traditional types of software vulnerabilities. Topics discussed included the extent…

    Submitted 23 May, 2023; originally announced May 2023.

  50. arXiv:2305.10389  [pdf, other]

    cs.DS cs.CG

    Cache-Oblivious Parallel Convex Hull in the Binary Forking Model

    Authors: Reilly Browne, Rezaul Chowdhury, Shih-Yu Tsai, Yimin Zhu

    Abstract: We present two cache-oblivious sorting-based convex hull algorithms in the Binary Forking Model. The first is an algorithm for a presorted set of points which achieves $O(n)$ work, $O(\log n)$ span, and $O(n/B)$ serial cache complexity, where $B$ is the cache line size. These are all optimal worst-case bounds for cache-oblivious algorithms in the Binary Forking Model. The second adapts Cole and Ra…

    Submitted 17 July, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: 15 pages 3 figures