
Showing 1–50 of 404 results for author: Mukherjee, S

Searching in archive cs. Search in all archives.
  1. arXiv:2408.12551  [pdf, other]

    cs.FL

    Greybox Learning of Languages Recognizable by Event-Recording Automata

    Authors: Anirban Majumdar, Sayan Mukherjee, Jean-François Raskin

    Abstract: In this paper, we revisit the active learning of timed languages recognizable by event-recording automata. Our framework employs a method known as greybox learning, which enables the learning of event-recording automata with a minimal number of control states. This approach avoids learning the region automaton associated with the language, contrasting with existing methods. We have implemented our…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Shorter version of this article has been accepted at ATVA 2024

  2. arXiv:2408.08577  [pdf, other]

    cond-mat.soft cs.CE physics.bio-ph physics.chem-ph

    Mechanistic Modeling of Lipid Nanoparticle Formation for the Delivery of Nucleic Acid Therapeutics

    Authors: Pavan K. Inguva, Saikat Mukherjee, Pierre J. Walker, Mona A. Kanso, Jie Wang, Yanchen Wu, Vico Tenberg, Srimanta Santra, Shalini Singh, Shin Hyuk Kim, Bernhardt L. Trout, Martin Z. Bazant, Allan S. Myerson, Richard D. Braatz

    Abstract: Nucleic acids such as mRNA have emerged as a promising therapeutic modality with the capability of addressing a wide range of diseases. Lipid nanoparticles (LNPs) as a delivery platform for nucleic acids were used in the COVID-19 vaccines and have received much attention. While modern manufacturing processes which involve rapidly mixing an organic stream containing the lipids with an aqueous strea…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 67 pages, 10 figures

  3. arXiv:2408.07860  [pdf]

    eess.IV cs.CV

    A Novel Generative Artificial Intelligence Method for Interference Study on Multiplex Brightfield Immunohistochemistry Images

    Authors: Satarupa Mukherjee, Jim Martin, Yao Nie

    Abstract: Multiplex brightfield imaging offers the advantage of simultaneously analyzing multiple biomarkers on a single slide, as opposed to single biomarker labeling on multiple consecutive slides. To accurately analyze multiple biomarkers localized at the same cellular compartment, two representative biomarker sets were selected as assay models - cMET-PDL1-EGFR and CD8-LAG3-PDL1, where all three biomarke…

    Submitted 14 August, 2024; originally announced August 2024.

  4. arXiv:2408.06996  [pdf, other]

    cs.LG math.ST

    Blessing of Dimensionality for Approximating Sobolev Classes on Manifolds

    Authors: Hong Ye Tan, Subhadip Mukherjee, Junqi Tang, Carola-Bibiane Schönlieb

    Abstract: The manifold hypothesis says that natural high-dimensional data is actually supported on or around a low-dimensional manifold. Recent success of statistical and learning-based methods empirically supports this hypothesis, due to outperforming classical statistical intuition in very high dimensions. A natural step for analysis is thus to assume the manifold hypothesis and derive bounds that are ind…

    Submitted 13 August, 2024; originally announced August 2024.

    MSC Class: 41A25; 41A46; 53Z50;

  5. arXiv:2408.03599  [pdf, other]

    cs.LG cs.AI cs.NE math.NA

    Activations Through Extensions: A Framework To Boost Performance Of Neural Networks

    Authors: Chandramouli Kamanchi, Sumanta Mukherjee, Kameshwaran Sampath, Pankaj Dayama, Arindam Jati, Vijay Ekambaram, Dzung Phan

    Abstract: Activation functions are non-linearities in neural networks that allow them to learn complex mapping between inputs and outputs. Typical choices for activation functions are ReLU, Tanh, Sigmoid etc., where the choice generally depends on the application domain. In this work, we propose a framework/strategy that unifies several works on activation functions and theoretically explains the performanc…

    Submitted 15 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

  6. arXiv:2408.01868  [pdf, other]

    stat.ML cs.LG math.PR math.ST

    Meta-Posterior Consistency for the Bayesian Inference of Metastable System

    Authors: Zachary P Adams, Sayan Mukherjee

    Abstract: The vast majority of the literature on learning dynamical systems or stochastic processes from time series has focused on stable or ergodic systems, for both Bayesian and frequentist inference procedures. However, most real-world systems are only metastable, that is, the dynamics appear to be stable on some time scale, but are in fact unstable over longer time scales. Consistency of inference for…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: 32 pages, 3 figures

    MSC Class: 62F15; 60J70

  7. arXiv:2407.18200  [pdf, ps, other]

    cs.DC cs.LG eess.SP

    Sparse Incremental Aggregation in Multi-Hop Federated Learning

    Authors: Sourav Mukherjee, Nasrin Razmi, Armin Dekorsy, Petar Popovski, Bho Matthiesen

    Abstract: This paper investigates federated learning (FL) in a multi-hop communication setup, such as in constellations with inter-satellite links. In this setup, part of the FL clients are responsible for forwarding other clients' results to the parameter server. Instead of using conventional routing, the communication efficiency can be improved significantly by using in-network model aggregation at each i…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: This paper is accepted for the 25th IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) conference

  8. arXiv:2407.16737  [pdf, other]

    cs.CL

    A Survey of Text Style Transfer: Applications and Ethical Implications

    Authors: Sourabrata Mukherjee, Mateusz Lango, Zdenek Kasner, Ondrej Dušek

    Abstract: Text style transfer (TST) is an important task in controllable text generation, which aims to control selected attributes of language use, such as politeness, formality, or sentiment, without altering the style-independent content of the text. The field has received considerable research attention in recent years and has already been covered in several reviews, but the focus has mostly been on the…

    Submitted 23 July, 2024; originally announced July 2024.

  9. arXiv:2407.14822  [pdf, other]

    cs.CL

    Text Style Transfer: An Introductory Overview

    Authors: Sourabrata Mukherjee, Ondrej Dušek

    Abstract: Text Style Transfer (TST) is a pivotal task in natural language generation to manipulate text style attributes while preserving style-independent content. The attributes targeted in TST can vary widely, including politeness, authorship, mitigation of offensive language, modification of feelings, and adjustment of text formality. TST has become a widely researched topic with substantial advancement…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: Accepted at 4EU+ International Workshop on Recent Advancements in Artificial Intelligence

  10. arXiv:2407.06015  [pdf, other]

    stat.ML cs.LG stat.AP

    Simulation-based Benchmarking for Causal Structure Learning in Gene Perturbation Experiments

    Authors: Luka Kovačević, Izzy Newsham, Sach Mukherjee, John Whittaker

    Abstract: Causal structure learning (CSL) refers to the task of learning causal relationships from data. Advances in CSL now allow learning of causal graphs in diverse application domains, which has the potential to facilitate data-driven causal decision-making. Real-world CSL performance depends on a number of context-specific factors, including context-specific data distributions and non-linear…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 16 pages, 8 figures, 4 tables

  11. arXiv:2406.16962  [pdf, other]

    cs.LG cs.AI cs.CL

    MetaGreen: Meta-Learning Inspired Transformer Selection for Green Semantic Communication

    Authors: Shubhabrata Mukherjee, Cory Beard, Sejun Song

    Abstract: Semantic Communication can transform the way we transmit information, prioritizing meaningful and effective content over individual symbols or bits. This evolution promises significant benefits, including reduced latency, lower bandwidth usage, and higher throughput compared to traditional communication. However, the development of Semantic Communication faces a crucial challenge: the need for uni…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.07592

  12. arXiv:2406.15247  [pdf, other]

    math.ST cs.IT math.PR

    On Naive Mean-Field Approximation for high-dimensional canonical GLMs

    Authors: Sumit Mukherjee, Jiaze Qiu, Subhabrata Sen

    Abstract: We study the validity of the Naive Mean Field (NMF) approximation for canonical GLMs with product priors. This setting is challenging due to the non-conjugacy of the likelihood and the prior. Using the theory of non-linear large deviations (Austin 2019, Chatterjee, Dembo 2016, Eldan 2018), we derive sufficient conditions for the tightness of the NMF approximation to the log-normalizing constant of…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 33 pages, 2 figures

    MSC Class: Primary: 62F15; Secondary: 94A17; 65K10

  13. arXiv:2406.15074  [pdf, other]

    cs.HC

    Balancing The Perception of Cheating Detection, Privacy and Fairness: A Mixed-Methods Study of Visual Data Obfuscation in Remote Proctoring

    Authors: Suvadeep Mukherjee, Verena Distler, Gabriele Lenzini, Pedro Cardoso-Leite

    Abstract: Remote proctoring technology, a cheating-preventive measure, often raises privacy and fairness concerns that may affect test-takers' experiences and the validity of test results. Our study explores how selectively obfuscating information in video recordings can protect test-takers' privacy while ensuring effective and fair cheating detection. Interviews with experts (N=9) identified four key video…

    Submitted 21 June, 2024; originally announced June 2024.

  14. arXiv:2406.11661  [pdf, other]

    cs.CL

    Cultural Conditioning or Placebo? On the Effectiveness of Socio-Demographic Prompting

    Authors: Sagnik Mukherjee, Muhammad Farid Adilazuarda, Sunayana Sitaram, Kalika Bali, Alham Fikri Aji, Monojit Choudhury

    Abstract: Socio-demographic prompting is a commonly employed approach to study cultural biases in LLMs as well as for aligning models to certain cultures. In this paper, we systematically probe four LLMs (Llama 3, Mistral v0.2, GPT-3.5 Turbo and GPT-4) with prompts that are conditioned on culturally sensitive and non-sensitive cues, on datasets that are supposed to be culturally sensitive (EtiCor and CALI)…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  15. arXiv:2406.10030  [pdf, other]

    cs.LG stat.ML

    Off-Policy Evaluation from Logged Human Feedback

    Authors: Aniruddha Bhargava, Lalit Jain, Branislav Kveton, Ge Liu, Subhojyoti Mukherjee

    Abstract: Learning from human feedback has been central to recent advances in artificial intelligence and machine learning. Since the collection of human feedback is costly, a natural question to ask is if the new feedback always needs to be collected. Or could we evaluate a new model with the human feedback on responses of another model? This motivates us to study off-policy evaluation from logged human feedb…

    Submitted 14 June, 2024; originally announced June 2024.

  16. arXiv:2406.07100  [pdf, other]

    cs.LG cs.AI math.AT

    D-GRIL: End-to-End Topological Learning with 2-parameter Persistence

    Authors: Soham Mukherjee, Shreyas N. Samaga, Cheng Xin, Steve Oudot, Tamal K. Dey

    Abstract: End-to-end topological learning using 1-parameter persistence is well-known. We show that the framework can be enhanced using 2-parameter persistence by adopting a recently introduced 2-parameter persistence based vectorization technique called GRIL. We establish a theoretical foundation of differentiating GRIL producing D-GRIL. We show that D-GRIL can be used to learn a bifiltration function on s…

    Submitted 27 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  17. arXiv:2406.05885  [pdf]

    cs.CL

    Are Large Language Models Actually Good at Text Style Transfer?

    Authors: Sourabrata Mukherjee, Atul Kr. Ojha, Ondřej Dušek

    Abstract: We analyze the performance of large language models (LLMs) on Text Style Transfer (TST), specifically focusing on sentiment transfer and text detoxification across three languages: English, Hindi, and Bengali. Text Style Transfer involves modifying the linguistic style of a text while preserving its core content. We evaluate the capabilities of pre-trained LLMs using zero-shot and few-shot prompti…

    Submitted 27 August, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  18. arXiv:2406.05064  [pdf, other]

    cs.LG

    Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

    Authors: Subhojyoti Mukherjee, Josiah P. Hanna, Qiaomin Xie, Robert Nowak

    Abstract: In this paper, we study the multi-task structured bandit problem where the goal is to learn a near-optimal algorithm that minimizes cumulative regret. The tasks share a common structure and the algorithm exploits the shared structure to minimize the cumulative regret for an unseen but related test task. We use a transformer as a decision-making algorithm to learn this shared structure so as to general…

    Submitted 7 June, 2024; originally announced June 2024.

  19. arXiv:2406.04487  [pdf, other]

    cs.LG stat.ML

    A multi-core periphery perspective: Ranking via relative centrality

    Authors: Chandra Sekhar Mukherjee, Jiapeng Zhang

    Abstract: Community and core-periphery are two widely studied graph structures, with their coexistence observed in real-world graphs (Rombach, Porter, Fowler & Mucha [SIAM J. App. Math. 2014, SIAM Review 2017]). However, the nature of this coexistence is not well understood and has been pointed out as an open problem (Yanchenko & Sengupta [Statistics Surveys, 2023]). Especially, the impact of inferring th…

    Submitted 6 June, 2024; originally announced June 2024.

  20. arXiv:2406.03437  [pdf, other]

    cs.LG

    Transfer Learning for Latent Variable Network Models

    Authors: Akhil Jalan, Arya Mazumdar, Soumendu Sundar Mukherjee, Purnamrita Sarkar

    Abstract: We study transfer learning for estimation in latent variable network models. In our setting, the conditional edge probability matrices given the latent variables are represented by $P$ for the source and $Q$ for the target. We wish to estimate $Q$ given two kinds of data: (1) edge data from a subgraph induced by an $o(1)$ fraction of the nodes of $Q$, and (2) edge data from all of $P$. If the sour…

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  21. arXiv:2406.02732  [pdf, other]

    cs.LG cs.DS

    GEFL: Extended Filtration Learning for Graph Classification

    Authors: Simon Zhang, Soham Mukherjee, Tamal K. Dey

    Abstract: Extended persistence is a technique from topological data analysis to obtain global multiscale topological information from a graph. This includes information about connected components and cycles that are captured by the so-called persistence barcodes. We introduce extended persistence into a supervised learning framework for graph classification. Global topological information, in the form of a…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 26 pages, 13 figures, Learning on Graphs Conference (LoG 2022)

  22. arXiv:2406.02165  [pdf, other]

    cs.LG

    SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP

    Authors: Subhojyoti Mukherjee, Josiah P. Hanna, Robert Nowak

    Abstract: In this paper, we study safe data collection for the purpose of policy evaluation in tabular Markov decision processes (MDPs). In policy evaluation, we are given a target policy and asked to estimate the expected cumulative reward it will obtain. Policy evaluation requires data and we are interested in the question of what behavior policy should collect the data for the most accu…

    Submitted 4 June, 2024; originally announced June 2024.

  23. arXiv:2405.20805  [pdf]

    cs.CL

    Multilingual Text Style Transfer: Datasets & Models for Indian Languages

    Authors: Sourabrata Mukherjee, Atul Kr. Ojha, Akanksha Bansal, Deepak Alok, John P. McCrae, Ondřej Dušek

    Abstract: Text style transfer (TST) involves altering the linguistic style of a text while preserving its core content. This paper focuses on sentiment transfer, a popular TST subtask, across a spectrum of Indian languages: Hindi, Magahi, Malayalam, Marathi, Punjabi, Odia, Telugu, and Urdu, expanding upon previous work on English-Bangla sentiment transfer (Mukherjee et al., 2023). We introduce dedicated dat…

    Submitted 27 August, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  24. arXiv:2405.20483  [pdf, other]

    cs.CR

    Hiding Your Awful Online Choices Made More Efficient and Secure: A New Privacy-Aware Recommender System

    Authors: Shibam Mukherjee, Roman Walch, Fredrik Meisingseth, Elisabeth Lex, Christian Rechberger

    Abstract: Recommender systems are an integral part of online platforms that recommend new content to users with similar interests. However, they demand a considerable amount of user activity data, which, if not adequately protected, constitutes a critical threat to user privacy. Privacy-aware recommender systems enable protection of such sensitive user data while still maintaining a similar re…

    Submitted 30 May, 2024; originally announced May 2024.

  25. arXiv:2405.04386  [pdf]

    cs.AI cs.LG

    Pragmatist Intelligence: Where the Principle of Usefulness Can Take ANNs

    Authors: Antonio Bikić, Sayan Mukherjee

    Abstract: Artificial neural networks (ANNs) perform extraordinarily on numerous tasks including classification or prediction, e.g., speech processing and image classification. These new functions are based on a computational model that is enabled to select freely all necessary internal model parameters as long as it eventually delivers the functionality it is supposed to exhibit. Here, we review the connect…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 16 pages

  26. arXiv:2404.14618  [pdf, other]

    cs.LG cs.AI cs.CL

    Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

    Authors: Dujian Ding, Ankur Mallick, Chi Wang, Robert Sim, Subhabrata Mukherjee, Victor Ruhle, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah

    Abstract: Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices, tend to lag behind in terms of response quality. Therefore in this work we propose a hybrid inference approach which combines their respective strengths to save cost and maintain quality. Our ap…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted to ICLR 2024 (main conference)

  27. arXiv:2404.13895  [pdf, other]

    cs.LG

    Optimal Design for Human Feedback

    Authors: Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton

    Abstract: Learning of preference models from human feedback has been central to recent advances in artificial intelligence. Motivated by the cost of obtaining high-quality human annotations, we study the problem of data collection for learning preference models. The key idea in our work is to generalize the optimal design, a method for computing information gathering policies, to ranked lists. To show the g…

    Submitted 30 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  28. arXiv:2404.08846  [pdf, other]

    cs.LG cs.CL

    Experimental Design for Active Transductive Inference in Large Language Models

    Authors: Subhojyoti Mukherjee, Anusha Lalitha, Aniket Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton

    Abstract: One emergent ability of large language models (LLMs) is that query-specific examples can be included in the prompt at inference time. In this work, we use active learning for adaptive prompt design and call it Active In-context Prompt Design (AIPD). We design the LLM prompt by adaptively choosing few-shot examples from a training set to optimize performance on a test set. The training examples are…

    Submitted 30 May, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  29. arXiv:2404.08831  [pdf, other]

    eess.IV cs.CV cs.LG

    Structured Model Pruning for Efficient Inference in Computational Pathology

    Authors: Mohammed Adnan, Qinle Ba, Nazim Shaikh, Shivam Kalra, Satarupa Mukherjee, Auranuch Lorsakul

    Abstract: Recent years have seen significant efforts to adopt Artificial Intelligence (AI) in healthcare for various use cases, from computer-aided diagnosis to ICU triage. However, the size of AI models has been rapidly growing due to scaling laws and the success of foundational models, which poses an increasing challenge to leverage advanced models in practical applications. It is thus imperative to devel…

    Submitted 12 April, 2024; originally announced April 2024.

  30. arXiv:2404.05445  [pdf, other]

    stat.ME cs.LG stat.CO

    Unsupervised Training of Convex Regularizers using Maximum Likelihood Estimation

    Authors: Hong Ye Tan, Ziruo Cai, Marcelo Pereyra, Subhadip Mukherjee, Junqi Tang, Carola-Bibiane Schönlieb

    Abstract: Imaging is a standard example of an inverse problem, where the task of reconstructing a ground truth from a noisy measurement is ill-posed. Recent state-of-the-art approaches for imaging use deep learning, spearheaded by unrolled and end-to-end models and trained on various image datasets. However, many such methods require the availability of ground truth data, which may be unavailable or expensi…

    Submitted 29 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    MSC Class: 62C12; 62F15; 65C40; 65J22

  31. arXiv:2404.05205  [pdf, other]

    cs.CV

    A secure and private ensemble matcher using multi-vault obfuscated templates

    Authors: Babak Poorebrahim Gilkalaye, Shubhabrata Mukherjee, Reza Derakhshani

    Abstract: Generative AI has revolutionized modern machine learning by providing unprecedented realism, diversity, and efficiency in data generation. This technology holds immense potential for biometrics, including for securing sensitive and personally identifiable information. Given the irrevocability of biometric samples and mounting privacy concerns, biometric template security and secure matching are am…

    Submitted 12 August, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted in IJCB 2024 Special Session, Generative AI for Futuristic Biometrics

  32. arXiv:2404.04521  [pdf, other]

    cs.SE cs.PL

    Automated Computer Program Evaluation and Projects -- Our Experiences

    Authors: Bama Srinivasan, Mala Nehru, Ranjani Parthasarathi, Saswati Mukherjee, Jeena A Thankachan

    Abstract: This paper provides a few approaches to automating computer programming and project submission tasks that we have been following for the last six years and have found to be successful. The approaches include using CodeRunner with Learning Management System (LMS) integration for programming practice and evaluation, and Git (GitHub) for project submissions and automatic code evaluation. In this pap…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 14 pages, 15 figures

    ACM Class: D.3

    Journal ref: https://www.sxcejournal.com/spe-apr-2023/17.pdf

  33. arXiv:2403.19792  [pdf, other]

    cs.LG cs.AI cs.CR cs.DC

    MAPL: Model Agnostic Peer-to-peer Learning

    Authors: Sayak Mukherjee, Andrea Simonetto, Hadi Jamali-Rad

    Abstract: Effective collaboration among heterogeneous clients in a decentralized setting is a rather unexplored avenue in the literature. To structurally address this, we introduce Model Agnostic Peer-to-peer Learning (coined as MAPL), a novel approach to simultaneously learn heterogeneous personalized models as well as a collaboration graph through peer-to-peer communication among neighboring clients. MAPL…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Our code is available and can be accessed here: https://github.com/SayakMukherjee/MAPL

  34. arXiv:2403.15412  [pdf, other]

    cs.CY cs.AI cs.CL

    Towards Measuring and Modeling "Culture" in LLMs: A Survey

    Authors: Muhammad Farid Adilazuarda, Sagnik Mukherjee, Pradhyumna Lavania, Siddhant Singh, Alham Fikri Aji, Jacki O'Neill, Ashutosh Modi, Monojit Choudhury

    Abstract: We present a survey of more than 90 recent papers that aim to study cultural representation and inclusion in large language models (LLMs). We observe that none of the studies explicitly define "culture", which is a complex, multifaceted concept; instead, they probe the models on some specially designed datasets which represent certain aspects of "culture". We call these aspects the proxies of cultu…

    Submitted 4 September, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  35. arXiv:2403.13313  [pdf, other]

    cs.AI cs.CL

    Polaris: A Safety-focused LLM Constellation Architecture for Healthcare

    Authors: Subhabrata Mukherjee, Paul Gamble, Markel Sanz Ausin, Neel Kant, Kriti Aggarwal, Neha Manjunath, Debajyoti Datta, Zhengliang Liu, Jiayuan Ding, Sophia Busacca, Cezanne Bianco, Swapnil Sharma, Rae Lasko, Michelle Voisard, Sanchay Harneja, Darya Filippova, Gerry Meixiong, Kevin Cha, Amir Youssefi, Meyhaa Buvanesh, Howard Weingram, Sebastian Bierman-Lytle, Harpreet Singh Mangat, Kim Parikh, Saad Godil, et al. (1 additional author not shown)

    Abstract: We develop Polaris, the first safety-focused LLM constellation for real-time patient-AI healthcare conversations. Unlike prior LLM works in healthcare focusing on tasks like question answering, our work specifically focuses on long multi-turn voice conversations. Our one-trillion parameter constellation system is composed of several multibillion parameter LLMs as co-operative agents: a stateful pr…

    Submitted 20 March, 2024; originally announced March 2024.

  36. arXiv:2403.12161  [pdf]

    cs.CE cs.CY q-fin.GN

    Effect of Leaders Voice on Financial Market: An Empirical Deep Learning Expedition on NASDAQ, NSE, and Beyond

    Authors: Arijit Das, Tanmoy Nandi, Prasanta Saha, Suman Das, Saronyo Mukherjee, Sudip Kumar Naskar, Diganta Saha

    Abstract: Financial markets, such as the prices of stocks, shares, gold, oil, and mutual funds, are affected by the news and posts on social media. In this work, deep learning based models are proposed to predict the trend of the financial market based on NLP analysis of the Twitter handles of leaders of different fields. There are many models available to predict the financial market based on only the historical data of the fin…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 20 pages original research

  37. arXiv:2403.02765  [pdf, other]

    cs.LG q-bio.BM

    G4-Attention: Deep Learning Model with Attention for predicting DNA G-Quadruplexes

    Authors: Shrimon Mukherjee, Pulakesh Pramanik, Partha Basuchowdhuri, Santanu Bhattacharya

    Abstract: G-Quadruplexes are the four-stranded non-canonical nucleic acid secondary structures, formed by the stacking arrangement of the guanine tetramers. They are involved in a wide range of biological roles because of their exceptionally unique and distinct structural characteristics. After the completion of the human genome sequencing project, a lot of bioinformatic algorithms were introduced to predic…

    Submitted 5 March, 2024; originally announced March 2024.

  38. arXiv:2402.17595  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Implicit Regularization via Spectral Neural Networks and Non-linear Matrix Sensing

    Authors: Hong T. M. Chu, Subhro Ghosh, Chi Thanh Lam, Soumendu Sundar Mukherjee

    Abstract: The phenomenon of implicit regularization has attracted interest in recent years as a fundamental aspect of the remarkable generalizing ability of neural networks. In a nutshell, it entails that gradient descent dynamics in many neural nets, even without any explicit regularizer in the loss function, converges to the solution of a regularized learning problem. However, known results attempting to…

    Submitted 27 February, 2024; originally announced February 2024.

  39. arXiv:2402.12352  [pdf, other]

    cs.CL cs.IR

    Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge

    Authors: Julien Delile, Srayanta Mukherjee, Anton Van Pamel, Leonid Zhukov

    Abstract: Large language models (LLMs) are transforming the way information is retrieved with vast amounts of knowledge being summarized and presented via natural language conversations. Yet, LLMs are prone to highlight the most frequently seen pieces of information from the training set and to neglect the rare ones. In the field of biomedical research, latest discoveries are key to academic and industrial…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 11 pages, 4 figures

  40. arXiv:2402.07894  [pdf, other]

    cs.CV

    MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO

    Authors: Shubhabrata Mukherjee, Cory Beard, Zhu Li

    Abstract: Low-light conditions and occluded scenarios impede object detection in real-world Internet of Things (IoT) applications like autonomous vehicles and security systems. While advanced machine learning models strive for accuracy, their computational demands clash with the limitations of resource-constrained devices, hampering real-time performance. In our current research, we tackle this challenge, b…

    Submitted 23 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: This paper has been accepted for publication at the IEEE International Conference on Image Processing (ICIP) 2024

  41. arXiv:2402.07770  [pdf, other]

    cs.IR cs.CL stat.AP

    Quantitative knowledge retrieval from large language models

    Authors: David Selby, Kai Spriestersbach, Yuichiro Iwashita, Dennis Bappert, Archana Warrier, Sumantrak Mukherjee, Muhammad Nabeel Asim, Koichi Kise, Sebastian Vollmer

    Abstract: Large language models (LLMs) have been extensively studied for their abilities to generate convincing natural language sequences; however, their utility for quantitative information retrieval is less well understood. In this paper we explore the feasibility of LLMs as a mechanism for quantitative knowledge retrieval to aid data analysis tasks such as elicitation of prior distributions for Bayesian…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 13 pages plus supplementary materials

  42. arXiv:2402.07767  [pdf]

    cs.CL

    Text Detoxification as Style Transfer in English and Hindi

    Authors: Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

    Abstract: This paper focuses on text detoxification, i.e., automatically converting toxic text into non-toxic text. This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved. We present three approaches: knowledge transfer from a similar task, multi-task learning approach, combin…

    Submitted 9 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted and presented at the 20th International Conference on Natural Language Processing (ICON-2023) during December 14-17, 2023

  43. arXiv:2402.06053  [pdf, other]

    cs.HC cs.AI cs.CY

    Randomness Is All You Need: Semantic Traversal of Problem-Solution Spaces with Large Language Models

    Authors: Thomas Sandholm, Sayandev Mukherjee, Bernardo A. Huberman

    Abstract: We present a novel approach to exploring innovation problem and solution domains using LLM fine-tuning with a custom idea database. By semantically traversing the bi-directional problem and solution tree at different temperature levels we achieve high diversity in solution edit distance while still remaining close to the original problem statement semantically. In addition to finding a variety of…

    Submitted 8 February, 2024; originally announced February 2024.

  44. arXiv:2402.02441  [pdf, other]

    cs.LG cs.AI cs.MS stat.CO

    TopoX: A Suite of Python Packages for Machine Learning on Topological Domains

    Authors: Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Ruben Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou, et al. (18 additional authors not shown)

    Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order…

    Submitted 17 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  45. arXiv:2402.01052  [pdf, other]

    math.OC cs.CV cs.LG stat.ML

    Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation

    Authors: Zakhar Shumaylov, Jeremy Budd, Subhadip Mukherjee, Carola-Bibiane Schönlieb

    Abstract: Variational regularisation is the primary method for solving inverse problems, and recently there has been considerable work leveraging deeply learned regularisation for enhanced performance. However, few results exist addressing the convergence of such regularisation, particularly within the context of critical points as opposed to global minimisers. In this paper, we present a generalised formul…

    Submitted 15 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 26 pages, 4 figures; https://openreview.net/forum?id=E8FpcUyPuS

  46. arXiv:2401.08948  [pdf, other]

    cs.RO

    PINSAT: Parallelized Interleaving of Graph Search and Trajectory Optimization for Kinodynamic Motion Planning

    Authors: Ramkumar Natarajan, Shohin Mukherjee, Howie Choset, Maxim Likhachev

    Abstract: Trajectory optimization is a widely used technique in robot motion planning for letting the dynamics and constraints on the system shape and synthesize complex behaviors. Several previous works have shown its benefits in high-dimensional continuous state spaces and under differential constraints. However, long time horizons and planning around obstacles in non-convex spaces pose challenges in guar…

    Submitted 16 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Under review

  47. arXiv:2401.08588  [pdf]

    cs.CV

    Improved Pothole Detection Using YOLOv7 and ESRGAN

    Authors: Nirmal Kumar Rout, Gyanateet Dutta, Varun Sinha, Arghadeep Dey, Subhrangshu Mukherjee, Gopal Gupta

    Abstract: Potholes are common road hazards that is causing damage to vehicles and posing a safety risk to drivers. The introduction of Convolutional Neural Networks (CNNs) is widely used in the industry for object detection based on Deep Learning methods and has achieved significant progress in hardware improvement and software implementations. In this paper, a unique better algorithm is proposed to warrant…

    Submitted 10 November, 2023; originally announced January 2024.

  48. arXiv:2401.08513  [pdf, other]

    cs.LG cs.CR

    X Hacking: The Threat of Misguided AutoML

    Authors: Rahul Sharma, Sergey Redyuk, Sumantrak Mukherjee, Andrea Sipka, Sebastian Vollmer, David Selby

    Abstract: Explainable AI (XAI) and interpretable machine learning methods help to build trust in model predictions and derived insights, yet also present a perverse incentive for analysts to manipulate XAI metrics to support pre-specified conclusions. This paper introduces the concept of X-hacking, a form of p-hacking applied to XAI metrics such as Shap values. We show how an automated machine learning pipe…

    Submitted 12 February, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 13 pages, 8 figures, plus supplementary materials

  49. arXiv:2401.06713  [pdf, other]

    cs.DC

    Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing

    Authors: S M Ferdous, Reece Neff, Bo Peng, Salman Shuvo, Marco Minutoli, Sayak Mukherjee, Karol Kowalski, Michela Becchi, Mahantesh Halappanavar

    Abstract: A coloring of a graph is an assignment of colors to vertices such that no two neighboring vertices have the same color. The need for memory-efficient coloring algorithms is motivated by their application in computing clique partitions of graphs arising in quantum computations where the objective is to map a large set of Pauli strings into a compact set of unitaries. We present Picasso, a randomize…

    Submitted 12 February, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by IPDPS 2024

  50. arXiv:2401.03955  [pdf, other]

    cs.LG cs.AI

    Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series

    Authors: Vijay Ekambaram, Arindam Jati, Pankaj Dayama, Sumanta Mukherjee, Nam H. Nguyen, Wesley M. Gifford, Chandra Reddy, Jayant Kalagnanam

    Abstract: Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot f…

    Submitted 5 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.