Search | arXiv e-print repository

doi 10.1093/mnras/stae131

The impact of the FREDDA dedispersion algorithm on $H_0$ estimations with FRBs

Authors: Jordan Hoffmann, Clancy W. James, Hao Qiu, Marcin Glowacki, Keith W. Bannister, Vivek Gupta, Jason X. Prochaska, Apurba Bera, Adam T. Deller, Kelly Gourdji, Lachlan Marnoch, Stuart D. Ryder, Danica R. Scott, Ryan M. Shannon, Nicolas Tejos

Abstract: Fast radio bursts (FRBs) are transient radio signals of extragalactic origins that are subjected to propagation effects such as dispersion and scattering. It follows then that these signals hold information regarding the medium they have traversed and are hence useful as cosmological probes of the Universe. Recently, FRBs were used to make an independent measure of the Hubble Constant $H_0$, promising to resolve the Hubble tension given a sufficient number of detected FRBs. Such cosmological studies are dependent on FRB population statistics, cosmological parameters and detection biases, and thus it is important to accurately characterise each of these. In this work, we empirically characterise the sensitivity of the Fast Real-time Engine for Dedispersing Amplitudes (FREDDA) which is the current detection system for the Australian Square Kilometer Array Pathfinder (ASKAP). We coherently redisperse high-time resolution data of 13 ASKAP-detected FRBs and inject them into FREDDA to determine the recovered signal-to-noise ratios as a function of dispersion measure (DM). We find that for 11 of the 13 FRBs, these results are consistent with injecting idealised pulses. Approximating this sensitivity function with theoretical predictions results in a systematic error of 0.3$\,$km$\,$s$^{-1}\,$Mpc$^{-1}$ on $H_0$ when it is the only free parameter. Allowing additional parameters to vary could increase this systematic by up to $\sim1\,$km$\,$s$^{-1}\,$Mpc$^{-1}$.
We estimate that this systematic will not be relevant until $\sim$400 localised FRBs have been detected, but will likely be significant in resolving the Hubble tension.

Submitted 12 August, 2024; originally announced August 2024.

Comments: 8 pages, 6 figures, Published in MNRAS

arXiv:2408.00582

First Measurement of the Total Inelastic Cross-Section of Positively-Charged Kaons on Argon at Energies Between 5.0 and 7.5 GeV

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, C. Andreopoulos, M. Andreotti, et al. (1341 additional authors not shown)

Abstract: ProtoDUNE Single-Phase (ProtoDUNE-SP) is a 770-ton liquid argon time projection chamber that operated in a hadron test beam at the CERN Neutrino Platform in 2018. We present a measurement of the total inelastic cross section of charged kaons on argon as a function of kaon energy using 6 and 7 GeV/$c$ beam momentum settings. The flux-weighted average of the extracted inelastic cross section at each beam momentum setting was measured to be 380$\pm$26 mbarns for the 6 GeV/$c$ setting and 379$\pm$35 mbarns for the 7 GeV/$c$ setting.

Submitted 1 August, 2024; originally announced August 2024.

Report number: CERN-EP-2024-211, FERMILAB-PUB-24-0216-V

arXiv:2407.21783

The Llama 3 Herd of Models

Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (510 additional authors not shown)

Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.

Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

arXiv:2407.16030

Enhancing Temporal Understanding in LLMs for Semi-structured Tables

Authors: Irwin Deng, Kushagra Dixit, Vivek Gupta, Dan Roth

Abstract: Temporal reasoning over tabular data presents substantial challenges for large language models (LLMs), as evidenced by recent research. In this study, we conduct a comprehensive analysis of temporal datasets to pinpoint the specific limitations of LLMs. Our investigation leads to enhancements in TempTabQA, a dataset specifically designed for tabular temporal question answering. We provide critical insights for improving LLM performance in temporal reasoning tasks with tabular data. Furthermore, we introduce a novel approach, C.L.E.A.R, to strengthen LLM capabilities in this domain. Our findings demonstrate that our method significantly improves evidence-based reasoning across various models. Additionally, our experimental results reveal that indirect supervision with auxiliary data substantially boosts model performance in these tasks. This work contributes to a deeper understanding of LLMs' temporal reasoning abilities over tabular data and promotes advancements in their application across diverse fields.

Submitted 22 July, 2024; originally announced July 2024.

Comments: Total Pages 18, Total Tables 6, Total figures 7

arXiv:2407.15452

doi 10.1145/3627673.3680021

GraphScale: A Framework to Enable Machine Learning over Billion-node Graphs

Authors: Vipul Gupta, Xin Chen, Ruoyun Huang, Fanlong Meng, Jianjun Chen, Yujun Yan

Abstract: Graph Neural Networks (GNNs) have emerged as powerful tools for supervised machine learning over graph-structured data, while sampling-based node representation learning is widely utilized in unsupervised learning. However, scalability remains a major challenge in both supervised and unsupervised learning for large graphs (e.g., those with over 1 billion nodes). The scalability bottleneck largely stems from the mini-batch sampling phase in GNNs and the random walk sampling phase in unsupervised methods. These processes often require storing features or embeddings in memory. In the context of distributed training, they require frequent, inefficient random access to data stored across different workers. Such repeated inter-worker communication for each mini-batch leads to high communication overhead and computational inefficiency. We propose GraphScale, a unified framework for both supervised and unsupervised learning to store and process large graph data distributedly. The key insight in our design is the separation of workers who store data and those who perform the training. This separation allows us to decouple computing and storage in graph training, thus effectively building a pipeline where data fetching and data computation can overlap asynchronously. Our experiments show that GraphScale outperforms state-of-the-art methods for distributed training of both GNNs and node embeddings.
We evaluate GraphScale both on public and proprietary graph datasets and observe a reduction of at least 40% in end-to-end training times compared to popular distributed frameworks, without any loss in performance. While most existing methods don't support billion-node graphs for training node embeddings, GraphScale is currently deployed in production at TikTok enabling efficient learning over such large graphs.

Submitted 22 July, 2024; originally announced July 2024.

Comments: Published in the Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024), 8 Pages, 12 Figures

Journal ref: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024), October 21-25, 2024, Boise, ID, USA

arXiv:2407.14933

Consent in Crisis: The Rapid Decline of the AI Data Commons

Authors: Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, Jianguo Zhang, et al. (24 additional authors not shown)

Abstract: General-purpose artificial intelligence (AI) systems are built on massive swathes of public web data, assembled into corpora such as C4, RefinedWeb, and Dolma. To our knowledge, we conduct the first, large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora. Our audit of 14,000 web domains provides an expansive view of crawlable web data and how codified data use preferences are changing over time. We observe a proliferation of AI-specific clauses to limit use, acute differences in restrictions on AI developers, as well as general inconsistencies between websites' expressed intentions in their Terms of Service and their robots.txt. We diagnose these as symptoms of ineffective web protocols, not designed to cope with the widespread re-purposing of the internet for AI. Our longitudinal analyses show that in a single year (2023-2024) there has been a rapid crescendo of data restrictions from web sources, rendering ~5%+ of all tokens in C4, or 28%+ of the most actively maintained, critical sources in C4, fully restricted from use. For Terms of Service crawling restrictions, a full 45% of C4 is now restricted. If respected or enforced, these restrictions are rapidly biasing the diversity, freshness, and scaling laws for general-purpose AI systems. We hope to illustrate the emerging crises in data consent, for both developers and creators. The foreclosure of much of the open web will impact not only commercial AI, but also non-commercial AI and academic research.

Submitted 24 July, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

Comments: 41 pages (13 main), 5 figures, 9 tables

arXiv:2407.11229

Unraveling the Truth: Do LLMs really Understand Charts? A Deep Dive into Consistency and Robustness

Authors: Srija Mukhopadhyay, Adnan Qidwai, Aparna Garimella, Pritika Ramu, Vivek Gupta, Dan Roth

Abstract: Chart question answering (CQA) is a crucial area of Visual Language Understanding. However, the robustness and consistency of current Visual Language Models (VLMs) in this field remain under-explored. This paper evaluates state-of-the-art VLMs on comprehensive datasets, developed specifically for this study, encompassing diverse question categories and chart formats. We investigate two key aspects: 1) the models' ability to handle varying levels of chart and question complexity, and 2) their robustness across different visual representations of the same underlying data. Our analysis reveals significant performance variations based on question and chart types, highlighting both strengths and weaknesses of current models. Additionally, we identify areas for improvement and propose future research directions to build more robust and reliable CQA systems. This study sheds light on the limitations of current models and paves the way for future advancements in the field.

Submitted 15 July, 2024; originally announced July 2024.

Comments: 22 pages, 7 Tables, 3 Figures, 25 examples

arXiv:2407.11014

Geode: A Zero-shot Geospatial Question-Answering Agent with Explicit Reasoning and Precise Spatio-Temporal Retrieval

Authors: Devashish Vikas Gupta, Azeez Syed Ali Ishaqui, Divya Kiran Kadiyala

Abstract: Large language models (LLMs) have shown promising results in learning and contextualizing information from different forms of data. Recent advancements in foundational models, particularly those employing self-attention mechanisms, have significantly enhanced our ability to comprehend the semantics of diverse data types. One such area that could highly benefit from multi-modality is in understanding geospatial data, which inherently has multiple modalities. However, current Natural Language Processing (NLP) mechanisms struggle to effectively address geospatial queries. Existing pre-trained LLMs are inadequately equipped to meet the unique demands of geospatial data, lacking the ability to retrieve precise spatio-temporal data in real-time, thus leading to significantly reduced accuracy in answering complex geospatial queries. To address these limitations, we introduce Geode--a pioneering system designed to tackle zero-shot geospatial question-answering tasks with high precision using spatio-temporal data retrieval. Our approach represents a significant improvement in addressing the limitations of current LLM models, demonstrating remarkable improvement in geospatial question-answering abilities compared to existing state-of-the-art pre-trained models.

Submitted 26 June, 2024; originally announced July 2024.

arXiv:2407.10380

NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models

Authors: Pranshu Pandya, Agney S Talwarr, Vatsal Gupta, Tushar Kataria, Vivek Gupta, Dan Roth

Abstract: Cognitive textual and visual reasoning tasks, such as puzzles, series, and analogies, demand the ability to quickly reason, decipher, and evaluate patterns both textually and spatially. While LLMs and VLMs, through extensive training on large amounts of human-curated data, have attained a high level of pseudo-human intelligence in some common sense reasoning tasks, they still struggle with more complex reasoning tasks that require cognitive understanding. In this work, we introduce a new dataset, NTSEBench, designed to evaluate the cognitive multi-modal reasoning and problem-solving skills of large models. The dataset comprises 2,728 multiple-choice questions with a total of 4,642 images across 26 categories sampled from the NTSE examination conducted nationwide in India, featuring both visual and textual general aptitude questions that do not rely on rote learning. We establish baselines on the dataset using state-of-the-art LLMs and VLMs. To facilitate a comparison between open source and proprietary models, we propose four distinct modeling strategies to handle different modalities (text and images) in the dataset instances.

Submitted 14 July, 2024; originally announced July 2024.

Comments: 15 pages, 2 figures, 5 tables

arXiv:2407.10339

Supernova Pointing Capabilities of DUNE

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, et al. (1340 additional authors not shown)

Abstract: The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.

Submitted 14 July, 2024; originally announced July 2024.

Comments: 25 pages, 16 figures

Report number: FERMILAB-PUB-24-0319-LBNF

arXiv:2407.10088

Predictability of weakly turbulent systems from spatially sparse observations using data assimilation and machine learning

Authors: Vikrant Gupta, Yuanqing Chen, Minping Wan

Abstract: We apply two data assimilation (DA) methods, a smoother and a filter, and a model-free machine learning (ML) shallow network to forecast two weakly turbulent systems. We analyse the effect of the spatial sparsity of observations on accuracy of the predictions obtained from these data-driven methods. Based on the results, we divide the spatial sparsity levels in three zones. First is the good-predictions zone in which both DA and ML methods work. We find that in the good-predictions zone the observations remain dense enough to accurately capture the fractal manifold of the system's dynamics, which is measured using the correlation dimension. The accuracy of the DA methods in this zone remains almost as good as for full-resolution observations. Second is the reasonable-predictions zone in which the DA methods still work but at reduced prediction accuracy. Third is the bad-predictions zone in which even the DA methods fail. We find that the sparsity level up to which the DA methods work is almost the same up to which chaos synchronisation of these systems can be achieved. The main implications of these results are that they (i) firmly establish the spatial resolution up to which the data-driven methods can be utilised, (ii) provide measures to determine if adding more sensors will improve the predictions, and (iii) quantify the advantage (in terms of the required measurement resolution) of using the governing equations within data-driven methods. We also discuss the applicability of these results to fully developed turbulence.

Submitted 14 July, 2024; originally announced July 2024.

arXiv:2407.08221

GAURA: Generalizable Approach for Unified Restoration and Rendering of Arbitrary Views

Authors: Vinayak Gupta, Rongali Simhachala Venkata Girish, Mukund Varma T, Ayush Tewari, Kaushik Mitra

Abstract: Neural rendering methods can achieve near-photorealistic image synthesis of scenes from posed input images. However, when the images are imperfect, e.g., captured in very low-light conditions, state-of-the-art methods fail to reconstruct high-quality 3D scenes. Recent approaches have tried to address this limitation by modeling various degradation processes in the image formation model; however, this limits them to specific image degradations. In this paper, we propose a generalizable neural rendering method that can perform high-fidelity novel view synthesis under several degradations. Our method, GAURA, is learning-based and does not require any test-time scene-specific optimization. It is trained on a synthetic dataset that includes several degradation types. GAURA outperforms state-of-the-art methods on several benchmarks for low-light enhancement, dehazing, deraining, and on-par for motion deblurring. Further, our model can be efficiently fine-tuned to any new incoming degradation using minimal data. We thus demonstrate adaptation results on two unseen degradations, desnowing and removing defocus blur. Code and video results are available at vinayak-vg.github.io/GAURA.

Submitted 11 July, 2024; originally announced July 2024.

Comments: European Conference on Computer Vision (ECCV) 2024

arXiv:2407.05952

H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables

Authors: Nikhil Abhyankar, Vivek Gupta, Dan Roth, Chandan K. Reddy

Abstract: Tabular reasoning involves interpreting unstructured queries against structured tables, requiring a synthesis of textual understanding and symbolic reasoning. Existing methods rely on either of the approaches and are constrained by their respective limitations. Textual reasoning excels in semantic interpretation unlike symbolic reasoning (SQL logic), but falls short in mathematical reasoning where SQL excels. In this paper, we introduce a novel algorithm H-STAR, comprising table extraction and adaptive reasoning, integrating both symbolic and semantic (text-based) approaches. To enhance evidence extraction, H-STAR employs a multi-view approach, incorporating step-by-step row and column retrieval. It also adapts reasoning strategies based on question types, utilizing symbolic reasoning for quantitative and logical tasks, and semantic reasoning for direct lookup and complex lexical queries. Our extensive experiments demonstrate that H-STAR significantly outperforms state-of-the-art methods across three tabular question-answering (QA) and fact-verification datasets, underscoring its effectiveness and efficiency.

Submitted 29 June, 2024; originally announced July 2024.

Comments: 13 pages, 14 tables, 9 figures

arXiv:2406.19470

Changing Answer Order Can Decrease MMLU Accuracy

Authors: Vipul Gupta, David Pantoja, Candace Ross, Adina Williams, Megan Ung

Abstract: As large language models (LLMs) have grown in prevalence, particular benchmarks have become essential for the evaluation of these models and for understanding model capabilities. Most commonly, we use test accuracy averaged across multiple subtasks in order to rank models on leaderboards, to determine which model is best for our purposes. In this paper, we investigate the robustness of the accuracy measurement on a widely used multiple choice question answering dataset, MMLU. When shuffling the answer label contents, we find that all explored models decrease in accuracy on MMLU, but not every model is equally sensitive. These findings suggest a possible adjustment to the standard practice of leaderboard testing, where we additionally consider the percentage of examples each model answers correctly by random chance.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Short paper, 9 pages

arXiv:2406.19237

FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts

Authors: Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth

Abstract: Existing benchmarks for visual question answering lack in visual grounding and complexity, particularly in evaluating spatial reasoning skills. We introduce FlowVQA, a novel benchmark aimed at assessing the capabilities of visual question-answering multimodal language models in reasoning with flowcharts as visual contexts. FlowVQA comprises 2,272 carefully generated and human-verified flowchart images from three distinct content sources, along with 22,413 diverse question-answer pairs, to test a spectrum of reasoning tasks, including information localization, decision-making, and logical progression. We conduct a thorough baseline evaluation on a suite of both open-source and proprietary multimodal language models using various strategies, followed by an analysis of directional bias. The results underscore the benchmark's potential as a vital tool for advancing the field of multimodal modeling, providing a focused and challenging environment for enhancing model performance in visual and logical reasoning tasks.

Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: Accepted in ACL 2024 (Findings), 21 pages, 7 figures, 9 Tables

arXiv:2406.18128

Stokes' paradox in rarefied gases: A perspective through the method of fundamental solutions

Authors: Himanshi, Anirudh Singh Rana, Vinay Kumar Gupta

Abstract: In the realm of fluid dynamics, a curious and counterintuitive phenomenon is Stokes' paradox. While Stokes equations -- used for modeling slow and steady flows -- lead to a meaningful solution to the problem of slow and steady flow past a sphere, they fail to yield a non-trivial solution to the problem of slow and steady flow past an infinitely long cylinder (a two-dimensional problem essentially); this is referred to as Stokes' paradox. We revisit this paradox in the context of rarefied gas flows by means of the method of fundamental solutions (MFS). To this end, we adopt an extended hydrodynamic model, referred to as the CCR model, consisting of the balance equations for the mass, momentum and energy and closed with the coupled constitutive relations. We determine an analytic solution of the CCR model for the problem and compare it with the MFS-based numerical solution. Apart from addressing flow past a circular cylinder, we aim to showcase the capability of the MFS to predict the flow past other objects in two dimensions for which the analytic solutions do not exist. For that, we investigate the problem of rarefied gas flow past an infinitely long semicircular cylinder.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 28 Pages, 16 figures

MSC Class: 76P05; 35E05

arXiv:2406.16964 [pdf, other]

Are Language Models Actually Useful for Time Series Forecasting?

Authors: Mingtian Tan, Mike A. Merrill, Vinayak Gupta, Tim Althoff, Thomas Hartvigsen

Abstract: Large language models (LLMs) are being applied to time series tasks, particularly time series forecasting. However, are language models actually useful for time series? After a series of ablation studies on three recent and popular LLM-based time series forecasting methods, we find that removing the LLM component or replacing it with a basic attention layer does not degrade the forecasting results -- in most cases the results even improved. We also find that despite their significant computational cost, pretrained LLMs do no better than models trained from scratch, do not represent the sequential dependencies in time series, and do not assist in few-shot settings. Additionally, we explore time series encoders and reveal that patching and attention structures perform similarly to state-of-the-art LLM-based forecasters.

Submitted 21 June, 2024; originally announced June 2024.

Comments: 25 pages, 8 figures and 20 tables

arXiv:2406.16253 [pdf, other]

LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo , et al. (15 additional authors not shown)

Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as they have to spend more time reading, writing, and reviewing papers. This raises the question: how can LLMs potentially assist researchers in alleviating their heavy workload? This study focuses on the topic of LLMs assist NLP Researchers, particularly examining the effectiveness of LLM in assisting paper (meta-)reviewing and its recognizability. To address this, we constructed the ReviewCritique dataset, which includes two types of information: (i) NLP papers (initial submissions rather than camera-ready) with both human-written and LLM-generated reviews, and (ii) each review comes with "deficiency" labels and corresponding explanations for individual segments, annotated by experts. Using ReviewCritique, this study explores two threads of research questions: (i) "LLMs as Reviewers", how do reviews generated by LLMs compare with those written by humans in terms of quality and distinguishability? (ii) "LLMs as Metareviewers", how effectively can LLMs identify potential issues, such as Deficient or unprofessional review segments, within individual paper reviews? To our knowledge, this is the first work to provide such a comprehensive analysis.

Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.10889 [pdf, other]

VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time?

Authors: Darshana Saravanan, Darshan Singh, Varun Gupta, Zeeshan Khan, Vineet Gandhi, Makarand Tapaswi

Abstract: Compositionality is a fundamental aspect of vision-language understanding and is especially required for videos since they contain multiple entities (e.g. persons, actions, and scenes) interacting dynamically over time. Existing benchmarks focus primarily on perception capabilities. However, they do not study binding, the ability of a model to associate entities through appropriate relationships. To this end, we propose VELOCITI, a new benchmark building on complex movie clips and dense semantic role label annotations to test perception and binding in video language models (contrastive and Video-LLMs). Our perception-based tests require discriminating video-caption pairs that share similar entities, and the binding tests require models to associate the correct entity to a given situation while ignoring the different yet plausible entities that also appear in the same video. While current state-of-the-art models perform moderately well on perception tests, accuracy is near random when both entities are present in the same video, indicating that they fail at binding tests. Even the powerful Gemini 1.5 Flash has a substantial gap (16-28%) with respect to human accuracy in such binding tests.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 26 pages, 17 figures, 3 tables

arXiv:2406.10085 [pdf, other]

Enhancing Question Answering on Charts Through Effective Pre-training Tasks

Authors: Ashim Gupta, Vivek Gupta, Shuo Zhang, Yujie He, Ning Zhang, Shalin Shah

Abstract: To completely understand a document, the use of textual information is not enough. Understanding visual cues, such as layouts and charts, is also required. While the current state-of-the-art approaches for document understanding (both OCR-based and OCR-free) work well, a thorough analysis of their capabilities and limitations has not yet been performed. Therefore, in this work, we address the limitation of current VisualQA models when applied to charts and plots. To investigate shortcomings of the state-of-the-art models, we conduct a comprehensive behavioral analysis, using ChartQA as a case study. Our findings indicate that existing models particularly underperform in answering questions related to the chart's structural and visual context, as well as numerical information. To address these issues, we propose three simple pre-training tasks that enforce the existing model in terms of both structural-visual knowledge, as well as its understanding of numerical questions. We evaluate our pre-trained model (called MatCha-v2) on three chart datasets - both extractive and abstractive question datasets - and observe that it achieves an average improvement of 1.7% over the baseline model.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.00968 [pdf, other]

Evaluating MEDIRL: A Replication and Ablation Study of Maximum Entropy Deep Inverse Reinforcement Learning for Human Social Navigation

Authors: Vinay Gupta, Nihal Gunukula

Abstract: In this study, we enhance the Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) framework, targeting its application in human robot interaction (HRI) for modeling pedestrian behavior in crowded environments. Our work is grounded in the pioneering research by Fahad, Chen, and Guo, and aims to elevate MEDIRL's efficacy in real world HRI settings. We replicated the original MEDIRL model and conducted detailed ablation studies, focusing on key model components like learning rates, state dimensions, and network layers. Our findings reveal the effectiveness of a two dimensional state representation over three dimensional approach, significantly improving model accuracy for pedestrian behavior prediction in HRI scenarios. These results not only demonstrate MEDIRL's enhanced performance but also offer valuable insights for future HRI system development, emphasizing the importance of model customization to specific environmental contexts. Our research contributes to advancing the field of socially intelligent navigation systems, promoting more intuitive and safer human robot interactions.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 14 pages, 13 figures

arXiv:2405.19772 [pdf, other]

New Exponential operators connected with a^2+x^2: a generalization of Post-Widder and Ismail May

Authors: Vijay Gupta, Anjali

Abstract: The present study offers a general exponential operator connected with a^2+x^2; for positive real "a". We estimate the asymptotic formula for simultaneous and ordinary approximation of the constructed operator. In the last section, we graphically interpret the created operator's convergence to two periodic functions "x sin(x)" and "-x/2*cos(pi*x)". We also consider the limiting case a tends to 0; which provides Post-Widder operator. In addition, we analyze each particular case of the defined operator and determine the optimal value of "a", that would yield the greatest approximation; this facilitates us to contrast the well-known operators existing in the literature, especially the Post-Widder operator and the operator due to Ismail-May.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 12 pages, 14 figures

MSC Class: 41A25 and 41A35

arXiv:2405.16752 [pdf, other]

Model Ensembling for Constrained Optimization

Authors: Ira Globus-Harris, Varun Gupta, Michael Kearns, Aaron Roth

Abstract: There is a long history in machine learning of model ensembling, beginning with boosting and bagging and continuing to the present day. Much of this history has focused on combining models for classification and regression, but recently there is interest in more complex settings such as ensembling policies in reinforcement learning. Strong connections have also emerged between ensembling and multicalibration techniques. In this work, we further investigate these themes by considering a setting in which we wish to ensemble models for multidimensional output predictions that are in turn used for downstream optimization. More precisely, we imagine we are given a number of models mapping a state space to multidimensional real-valued predictions. These predictions form the coefficients of a linear objective that we would like to optimize under specified constraints. The fundamental question we address is how to improve and combine such models in a way that outperforms the best of them in the downstream optimization problem. We apply multicalibration techniques that lead to two provably efficient and convergent algorithms. The first of these (the white box approach) requires being given models that map states to output predictions, while the second (the black box approach) requires only policies (mappings from states to solutions to the optimization problem). For both, we provide convergence and utility guarantees. We conclude by investigating the performance and behavior of the two algorithms in a controlled experimental setting.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.15046 [pdf, other]

On the minimum spectral radius of connected graphs of given order and size

Authors: Sebastian M. Cioabă, Vishal Gupta, Celso Marques

Abstract: In this paper, we study a question of Hong from 1993 related to the minimum spectral radii of the adjacency matrices of connected graphs of given order and size. Hong asked if it is true that among all connected graphs of given number of vertices $n$ and number of edges $e$, the graphs having minimum spectral radius (the minimizer graphs) must be almost regular, meaning that the difference between their maximum degree and their minimum degree is at most one. In this paper, we answer Hong's question positively for various values of $n$ and $e$ and in several cases, we determined the graphs with minimum spectral radius.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 19 pages, 6 figures

MSC Class: 05C50; 15A18

arXiv:2405.14900 [pdf, other]

doi 10.1016/j.media.2024.103206

Fair Evaluation of Federated Learning Algorithms for Automated Breast Density Classification: The Results of the 2022 ACR-NCI-NVIDIA Federated Learning Challenge

Authors: Kendall Schmidt, Benjamin Bearce, Ken Chang, Laura Coombs, Keyvan Farahani, Marawan Elbatele, Kaouther Mouhebe, Robert Marti, Ruipeng Zhang, Yao Zhang, Yanfeng Wang, Yaojun Hu, Haochao Ying, Yuyang Xu, Conrad Testagrose, Mutlu Demirer, Vikash Gupta, Ünal Akünal, Markus Bujotzek, Klaus H. Maier-Hein, Yi Qin, Xiaomeng Li, Jayashree Kalpathy-Cramer, Holger R. Roth

Abstract: The correct interpretation of breast density is important in the assessment of breast cancer risk. AI has been shown capable of accurately predicting breast density, however, due to the differences in imaging characteristics across mammography systems, models built using data from one system do not generalize well to other systems. Though federated learning (FL) has emerged as a way to improve the generalizability of AI without the need to share data, the best way to preserve features from all training data during FL is an active area of research. To explore FL methodology, the breast density classification FL challenge was hosted in partnership with the American College of Radiology, Harvard Medical School's Mass General Brigham, University of Colorado, NVIDIA, and the National Institutes of Health National Cancer Institute. Challenge participants were able to submit docker containers capable of implementing FL on three simulated medical facilities, each containing a unique large mammography dataset. The breast density FL challenge ran from June 15 to September 5, 2022, attracting seven finalists from around the world. The winning FL submission reached a linear kappa score of 0.653 on the challenge test data and 0.413 on an external testing dataset, scoring comparably to a model trained on the same data in a central location.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 16 pages, 9 figures

Journal ref: Medical Image Analysis Volume 95, July 2024, 103206

arXiv:2405.12403 [pdf, other]

Searching for gravitational wave optical counterparts with the Zwicky Transient Facility: summary of O4a

Authors: Tomás Ahumada, Shreya Anand, Michael W. Coughlin, Vaidehi Gupta, Mansi M. Kasliwal, Viraj R. Karambelkar, Robert D. Stein, Gaurav Waratkar, Vishwajeet Swain, Theophile Jegou du Laz, Akash Anumarlapudi, Igor Andreoni, Mattia Bulla, Gokul P. Srinivasaragavan, Andrew Toivonen, Avery Wold, Eric C. Bellm, S. Bradley Cenko, David L. Kaplan, Jesper Sollerman, Varun Bhalerao, Daniel Perley, Anirudh Salgundi, Aswin Suresh, K-Ryan Hinds , et al. (27 additional authors not shown)

Abstract: During the first half of the fourth observing run (O4a) of the International Gravitational Wave Network (IGWN), the Zwicky Transient Facility (ZTF) conducted a systematic search for kilonova (KN) counterparts to binary neutron star (BNS) and neutron star-black hole (NSBH) merger candidates. Here, we present a comprehensive study of the five high-significance (FAR < 1 per year) BNS and NSBH candidates in O4a. Our follow-up campaigns relied on both target-of-opportunity observations (ToO) and re-weighting of the nominal survey schedule to maximize coverage. We describe the toolkit we have been developing, Fritz, an instance of SkyPortal, instrumental in coordinating and managing our telescope scheduling, candidate vetting, and follow-up observations through a user-friendly interface. ZTF covered a total of 2841 deg$^2$ within the skymaps of the high-significance GW events, reaching a median depth of g~20.2 mag. We circulated 15 candidates, but found no viable KN counterpart to any of the GW events. Based on the ZTF non-detections of the high-significance events in O4a, we used a Bayesian approach, nimbus, to quantify the posterior probability of KN model parameters that are consistent with our non-detections. Our analysis favors KNe with initial absolute magnitude fainter than -16 mag. The joint posterior probability of a GW170817-like KN associated with all our O4a follow-ups was 64%. Additionally, we use a survey simulation software, simsurvey, to determine that our combined filtered efficiency to detect a GW170817-like KN is 36%, when considering the 5 confirmed astrophysical events in O3 (1 BNS and 4 NSBH), along with our O4a follow-ups. Following Kasliwal et al. (2020), we derived joint constraints on the underlying KN luminosity function based on our O3 and O4a follow-ups, determining that no more than 76% of KNe fading at 1 mag/day can peak at a magnitude brighter than -17.5 mag.

Submitted 20 May, 2024; originally announced May 2024.

Comments: submitted

arXiv:2405.07439 [pdf, other]

A Fast Radio Burst monitor with a Compact All-Sky Phased Array (CASPA)

Authors: R. Luo, R. D. Ekers, G. Hobbs, A. Dunning, C. W. James, M. E. Lower, V. Gupta, A. Zic, M. Sokolowski, C. Phillips, A. T. Deller, L. Staveley-Smith

Abstract: Fast Radio Bursts (FRBs) are short-duration radio transients that occur at random times in host galaxies distributed all over the sky. Large field of view instruments can play a critical role in the blind search for rare FRBs. We present a concept for an all-sky FRB monitor using a compact all-sky phased array (CASPA), which can efficiently achieve an extremely large field of view of $\sim10^4$ square degrees. Such a system would allow us to conduct a continuous, blind FRB search covering the entire southern sky. Using the measured FRB luminosity function, we investigate the detection rate for this all-sky phased array and compare the result to a number of other proposed large field-of-view instruments. We predict a rate of a few FRB detections per week and determine the dispersion measure and redshift distributions of these detectable FRBs. This instrument is optimal for detecting FRBs in the nearby Universe and for extending the high-end of the FRB luminosity function through finding ultraluminous events. Additionally, this instrument can be used to shadow the new gravitational-wave observing runs, detect high energy events triggered from Galactic magnetars and search for other bright, but currently unknown transient signals.

Submitted 12 May, 2024; originally announced May 2024.

Comments: Submitted to PASA, comments welcome

arXiv:2405.00908 [pdf]

Transformer-Based Self-Supervised Learning for Histopathological Classification of Ischemic Stroke Clot Origin

Authors: K. Yeh, M. S. Jabal, V. Gupta, D. F. Kallmes, W. Brinjikji, B. S. Erdal

Abstract: Background and Purpose: Identifying the thromboembolism source in ischemic stroke is crucial for treatment and secondary prevention yet is often undetermined. This study describes a self-supervised deep learning approach in digital pathology of emboli for classifying ischemic stroke clot origin from histopathological images. Methods: The dataset included whole slide images (WSI) from the STRIP AI Kaggle challenge, consisting of retrieved clots from ischemic stroke patients following mechanical thrombectomy. Transformer-based deep learning models were developed using transfer learning and self-supervised pretraining for classifying WSI. Customizations included an attention pooling layer, weighted loss function, and threshold optimization. Various model architectures were tested and compared, and model performances were primarily evaluated using weighted logarithmic loss. Results: The model achieved a logloss score of 0.662 in cross-validation and 0.659 on the test set. Different model backbones were compared, with the swin_large_patch4_window12_384 showing higher performance. Thresholding techniques for clot origin classification were employed to balance false positives and negatives. Conclusion: The study demonstrates the extent of efficacy of transformer-based deep learning models in identifying ischemic stroke clot origins from histopathological images and emphasizes the need for refined modeling techniques specifically adapted to thrombi WSI. Further research is needed to improve model performance and interpretability and to validate its effectiveness. Future enhancement could include integrating larger patient cohorts, advanced preprocessing strategies, and exploring ensemble multimodal methods for enhanced diagnostic accuracy.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.15546 [pdf, ps, other]

Modular Forms in Combinatorial Optimization

Authors: Varsha Gupta

Abstract: Combinatorial optimization problems, such as the Asymmetric Traveling Salesman Problem (ATSP), find applications across various domains including logistics, genome sequencing, and robotics. Despite their extensive applications, there have not been significant advancements in deriving optimal solutions for these problems. The lack of theoretical understanding owing to the complex structure of these problems has hindered the development of sophisticated algorithms. This paper proposes an unconventional approach by translating the ATSP into the complex domain, revealing an intrinsic modular nature of the problem. Furthermore, we have exploited modularity conditions to gain deeper insights into both unconstrained and constrained optimal solutions. The theoretical framework laid out in this paper can lead to important results at the intersection of combinatorial optimization and number theory.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.15540 [pdf, ps, other]

A Unified Framework for Total Variation Regularized Optimization in Fluid Dynamics and Related Physical Systems

Authors: Varsha Gupta

Abstract: An optimization framework is presented for minimizing the energy functional developed around a generalized equation governing physical systems such as fluid dynamics, particle transport, phase transition, and other related systems. The convexity of the energy functional is investigated to derive the necessary conditions for a smooth and global optimum solution. Furthermore, the Total Variation (TV) regularization term is introduced to gain insights into the solution space and convergence analysis of convection-dominated problems. We demonstrate the practical application of our method by applying it to some selected examples such as the Boltzmann, Navier-Stokes, and Maxwell equations.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.11757 [pdf, other]

Language Models Still Struggle to Zero-shot Reason about Time Series

Authors: Mike A. Merrill, Mingtian Tan, Vinayak Gupta, Tom Hartvigsen, Tim Althoff

Abstract: Time series are critical for decision-making in fields like finance and healthcare. Their importance has driven a recent influx of works passing time series into language models, leading to non-trivial forecasting on some datasets. But it remains unknown whether non-trivial forecasting implies that language models can reason about time series. To address this gap, we generate a first-of-its-kind evaluation framework for time series reasoning, including formal tasks and a corresponding dataset of multi-scale time series paired with text captions across ten domains. Using these data, we probe whether language models achieve three forms of reasoning: (1) Etiological Reasoning - given an input time series, can the language model identify the scenario that most likely created it? (2) Question Answering - can a language model answer factual questions about time series? (3) Context-Aided Forecasting - does highly relevant textual context improve a language model's time series forecasts? We find that otherwise highly-capable language models demonstrate surprisingly limited time series reasoning: they score marginally above random on etiological and question answering tasks (up to 30 percentage points worse than humans) and show modest success in using context to improve forecasting. These weaknesses showcase that time series reasoning is an impactful, yet deeply underdeveloped direction for language model research. We also make our datasets and code public to support further research in this direction at https://github.com/behavioral-data/TSandLanguage

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.11691 [pdf, ps, other]

doi 10.1109/INCET51464.2021.9456342

Improvement in Semantic Address Matching using Natural Language Processing

Authors: Vansh Gupta, Mohit Gupta, Jai Garg, Nitesh Garg

Abstract: Address matching is an important task for many businesses, especially delivery and take-out companies, as it helps them retrieve a particular address from their data warehouse. Existing solutions use string similarity and edit distance algorithms to find similar addresses in the address database, but these algorithms do not work effectively with redundant, unstructured, or incomplete address data. This paper discusses a semantic address matching technique, by which we can find a particular address in a list of possible addresses. We have also reviewed existing practices and their shortcomings. Semantic address matching is essentially an NLP task in the field of deep learning. Through this technique we can overcome the drawbacks of existing methods, such as redundant or abbreviated data problems. The solution uses OCR on invoices to extract the address and create the data pool of addresses. This data is then fed to the BM-25 algorithm for scoring the best matching entries. Then, to obtain the best result, the output is passed through BERT to select the best possible result from the similar queries. Our investigation exhibits that our methodology enormously improves both accuracy and review of cutting-edge technology existing techniques.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 5 pages, 7 tables, 2021 2nd International Conference for Emerging Technology (INCET)

Journal ref: 2021 2nd International Conference for Emerging Technology (INCET), Belagavi, India, 2021, pp. 1-5

arXiv:2404.11661 [pdf, other]

doi 10.1109/GlobConET53749.2022.9872449

Designing an Intelligent Parcel Management System using IoT & Machine Learning

Authors: Mohit Gupta, Nitesh Garg, Jai Garg, Vansh Gupta, Devraj Gautam

Abstract: Parcel delivery is a critical activity in railways. More importantly, each parcel must be thoroughly checked and sorted according to its destination address. We require an efficient and robust IoT system capable of doing all of these tasks with great precision and minimal human interaction. This paper discusses a fully-fledged solution we created using IoT and machine learning to assist trains in performing this operation efficiently. In this study, we cover the product, which consists mostly of two phases. Scanning is the first step, followed by sorting. During the scanning process, the parcel is passed through three scanners that look for explosives, drugs, and any dangerous materials in the parcel, and it is discarded if any of the tests fail. When the scanning step is over, the parcel moves on to the sorting phase, where we use QR codes to retrieve the details of the parcels and sort them properly. The simulation of the system is done using the Blender software. Our research shows that our procedure significantly improves accuracy as well as the assessment of cutting-edge technology and existing techniques.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 6 pages, 6 figures, 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET)

Journal ref: 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET), Arad, Romania, 2022, pp. 751-756

arXiv:2404.07461 [pdf, other]

"Confidently Nonsensical?'': A Critical Survey on the Perspectives and Challenges of 'Hallucinations' in NLP

Authors: Pranav Narayanan Venkit, Tatiana Chakravorti, Vipul Gupta, Heidi Biggs, Mukund Srinath, Koustava Goswami, Sarah Rajtmajer, Shomir Wilson

Abstract: We investigate how hallucination in large language models (LLM) is characterized in peer-reviewed literature using a critical examination of 103 publications across NLP research. Through a comprehensive review of sociological and technological literature, we identify a lack of agreement with the term `hallucination.' Additionally, we conduct a survey with 171 practitioners from the field of NLP and AI to capture varying perspectives on hallucination. Our analysis underscores the necessity for explicit definitions and frameworks outlining hallucination within NLP, highlighting potential challenges, and our survey inputs provide a thematic understanding of the influence and ramifications of hallucination in society.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06959 [pdf, ps, other]

Regular inclusions of simple unital $C^*$-algebras

Authors: Keshab Chandra Bakshi, Ved Prakash Gupta

Abstract: We prove that an inclusion $\mathcal{B} \subset \mathcal{A}$ of simple unital $C^*$-algebras with a finite-index conditional expectation is regular if and only if there exists a finite group $G$ that admits a cocycle action $(α,σ)$ on the intermediate $C^*$-subalgebra $\mathcal{C}$ generated by $\mathcal{B}$ and its centralizer $\mathcal{C}_\mathcal{A}(\mathcal{B})$ such that $\mathcal{B}$ is outerly $α$-invariant and $(\mathcal{B} \subset \mathcal{A}) \cong ( \mathcal{B} \subset \mathcal{C}\rtimes^r_{α, σ} G)$. Prior to this characterization, we prove the existence of two-sided and unitary quasi-bases for the minimal conditional expectation of any such inclusion, and also show that such an inclusion has integer Watatani index and depth at most $2$.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 16 pages

arXiv:2404.06751 [pdf, other]

Leveraging open-source models for legal language modeling and analysis: a case study on the Indian constitution

Authors: Vikhyath Gupta, Srinivasa Rao P

Abstract: In recent years, the use of open-source models has gained immense popularity in various fields, including legal language modelling and analysis. These models have proven to be highly effective in tasks such as summarizing legal documents, extracting key information, and even predicting case outcomes. This has revolutionized the legal industry, enabling lawyers, researchers, and policymakers to quickly access and analyse vast amounts of legal text, saving time and resources. This paper presents a novel approach to legal language modeling (LLM) and analysis using open-source models from Hugging Face. We leverage Hugging Face embeddings via LangChain and Sentence Transformers to develop an LLM tailored for legal texts. We then demonstrate the application of this model by extracting insights from the official Constitution of India. Our methodology involves preprocessing the data, splitting it into chunks, using ChromaDB and LangChainVectorStores, and employing the Google/Flan-T5-XXL model for analysis. The trained model is tested on the Indian Constitution, which is available in PDF format. Our findings suggest that our approach holds promise for efficient legal language processing and analysis.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 10 pages, 3 figures

arXiv:2403.05967 [pdf, ps, other]

On various notions of distance between subalgebras of operator algebras

Authors: Ved Prakash Gupta, Sumit Kumar

Abstract: Given any irreducible inclusion $\mB \subset \mA$ of unital $C^*$-algebras with a finite-index conditional expectation $E: \mA \to \mB$, we show that the set of $E$-compatible intermediate $C^*$-subalgebras is finite, thereby generalizing a finiteness result of Ino and Watatani (from \cite{IW}). A finiteness result for a certain collection of intermediate $C^*$-subalgebras of a non-irreducible inclusion of simple unital $C^*$-algebras is also obtained, which provides a $C^*$-version of a finiteness result of Khoshkam and Mashood (from \cite{KM}). Apart from these finiteness results, comparisons between various notions of distance between subalgebras of operator algebras by Kadison-Kastler, Christensen and Mashood-Taylor are made. Further, these comparisons are used satisfactorily to provide some concrete calculations of distance between operator algebras associated to two distinct subgroups of a given discrete group.

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.04007 [pdf, other]

Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems

Authors: Wesley A. Suttle, Vipul K. Sharma, Krishna C. Kosaraju, S. Sivaranjani, Ji Liu, Vijay Gupta, Brian M. Sadler

Abstract: We develop provably safe and convergent reinforcement learning (RL) algorithms for control of nonlinear dynamical systems, bridging the gap between the hard safety guarantees of control theory and the convergence guarantees of RL theory. Recent advances at the intersection of control and RL follow a two-stage, safety filter approach to enforcing hard safety constraints: model-free RL is used to learn a potentially unsafe controller, whose actions are projected onto safe sets prescribed, for example, by a control barrier function. Though safe, such approaches lose any convergence guarantees enjoyed by the underlying RL methods. In this paper, we develop a single-stage, sampling-based approach to hard constraint satisfaction that learns RL controllers enjoying classical convergence guarantees while satisfying hard safety constraints throughout training and deployment. We validate the efficacy of our approach in simulation, including safe control of a quadcopter in a challenging obstacle avoidance problem, and demonstrate that it outperforms existing benchmarks.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 20 pages, 7 figures

arXiv:2403.03212 [pdf, other]

Performance of a modular ton-scale pixel-readout liquid argon time projection chamber

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade , et al. (1340 additional authors not shown)

Abstract: The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmic ray events collected in the spring of 2021. We use this sample to demonstrate the imaging performance of the charge and light readout systems as well as the signal correlations between the two. We also report argon purity and detector uniformity measurements, and provide comparisons to detector simulations.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 47 pages, 41 figures

Report number: FERMILAB-PUB-24-0073-LBNF

arXiv:2403.01151 [pdf, ps, other]

A Ricci flow on graphs from effective resistance

Authors: Aleyah Dawkins, Vishal Gupta, Mark Kempton, William Linz, Jeremy Quail, Harry Richman, Zachary Stier

Abstract: In this paper, we introduce a new notion of curvature on the edges of a graph that is defined in terms of effective resistances. We call this the Ricci--Foster curvature. We study the Ricci flow resulting from this curvature. We prove the existence of solutions to Ricci flow on short time intervals, and prove that Ricci flow preserves graphs with nonnegative (resp. positive) curvature.

Submitted 2 March, 2024; originally announced March 2024.

Comments: 14 pages, 8 figures, comments welcome!

MSC Class: 05C10; 53E20; 05C22; 53A70; 53C21; 94C15

arXiv:2403.01037 [pdf, other]

Node resistance curvature in Cartesian graph products

Authors: Aleyah Dawkins, Vishal Gupta, Mark Kempton, William Linz, Jeremy Quail, Harry Richman, Zachary Stier

Abstract: Devriendt and Lambiotte recently introduced the \emph{node resistance curvature}, a notion of graph curvature based on the effective resistance matrix. In this paper, we begin the study of the behavior of the node resistance curvature under the operation of the Cartesian graph product. We study the natural question of global positivity of node resistance curvature of the Cartesian product of positively-curved graphs, and prove that, whenever $m,n\ge3$, the node resistance curvature of the interior vertices of a $m\times n$ grid is always nonpositive, while it is always nonnegative on the boundary of such grids. For completeness, we also prove a number of results on node resistance curvature in $2\times n$ grids and exhibit a counterexample to a generalization. We also give generic bounds and suggest several further questions for future study.

Submitted 1 March, 2024; originally announced March 2024.

MSC Class: 05C99; 05C81

arXiv:2402.17108 [pdf, ps, other]

Repeated Contracting with Multiple Non-Myopic Agents: Policy Regret and Limited Liability

Authors: Natalie Collina, Varun Gupta, Aaron Roth

Abstract: We study a repeated contracting setting in which a Principal adaptively chooses amongst $k$ Agents at each of $T$ rounds. The Agents are non-myopic, and so a mechanism for the Principal induces a $T$-round extensive form game amongst the Agents. We give several results aimed at understanding an under-explored aspect of contract theory -- the game induced when choosing an Agent to contract with. First, we show that this game admits a pure-strategy \emph{non-responsive} equilibrium amongst the Agents -- informally an equilibrium in which the Agent's actions depend on the history of realized states of nature, but not on the history of each other's actions, and so avoids the complexities of collusion and threats. Next, we show that if the Principal selects Agents using a \emph{monotone} bandit algorithm, then for any concave contract, in any such equilibrium, the Principal obtains no regret to contracting with the best Agent in hindsight -- not just given their realized actions, but also to the counterfactual world in which they had offered a guaranteed $T$-round contract to the best Agent in hindsight, which would have induced a different sequence of actions. Finally, we show that if the Principal selects Agents using a monotone bandit algorithm which guarantees no swap-regret, then the Principal can additionally offer only limited liability contracts (in which the Agent never needs to pay the Principal) while getting no-regret to the counterfactual world in which she offered a linear contract to the best Agent in hindsight -- despite the fact that linear contracts are not limited liability. We instantiate this theorem by demonstrating the existence of a monotone no swap-regret bandit algorithm, which to our knowledge has not previously appeared in the literature.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.11755 [pdf, other]

SPML: A DSL for Defending Language Models Against Prompt Attacks

Authors: Reshabh K Sharma, Vinayak Gupta, Dan Grossman

Abstract: Large language models (LLMs) have profoundly transformed natural language applications, with a growing reliance on instruction-based definitions for designing chatbots. However, post-deployment the chatbot definitions are fixed and are vulnerable to attacks by malicious users, emphasizing the need to prevent unethical applications and financial losses. Existing studies explore user prompts' impact on LLM-based chatbots, yet practical methods to contain attacks on application-specific chatbots remain unexplored. This paper presents System Prompt Meta Language (SPML), a domain-specific language for refining prompts and monitoring the inputs to the LLM-based chatbots. SPML actively checks attack prompts, ensuring user inputs align with chatbot definitions to prevent malicious execution on the LLM backbone, optimizing costs. It also streamlines chatbot definition crafting with programming language capabilities, overcoming natural language design challenges. Additionally, we introduce a groundbreaking benchmark with 1.8k system prompts and 20k user inputs, offering the inaugural language and benchmark for chatbot definition evaluation. Experiments across datasets demonstrate SPML's proficiency in understanding attacker prompts, surpassing models like GPT-4, GPT-3.5, and LLAMA. Our data and codes are publicly available at: https://prompt-compiler.github.io/SPML/.

Submitted 18 February, 2024; originally announced February 2024.

arXiv:2402.11194 [pdf, other]

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering

Authors: Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth

Abstract: Large Language Models (LLMs) excel in natural language understanding, but their capability for complex mathematical reasoning with an amalgamation of structured tables and unstructured text is uncertain. This study explores LLMs' mathematical reasoning on four financial tabular question-answering datasets: TATQA, FinQA, ConvFinQA, and Multihiertt. Through extensive experiments with various models and prompting techniques, we assess how LLMs adapt to complex tables and mathematical tasks. We focus on sensitivity to table complexity and performance variations with an increasing number of arithmetic reasoning steps. The results provide insights into LLMs' capabilities and limitations in handling complex mathematical scenarios for semi-structured tables. Ultimately, we introduce a novel prompting technique tailored to semi-structured documents, matching or outperforming other baselines in performance while providing a nuanced understanding of LLMs' abilities for such a task.

Submitted 29 February, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

Comments: 25 pages, 17 figures

arXiv:2402.09658 [pdf]

Towards Precision Cardiovascular Analysis in Zebrafish: The ZACAF Paradigm

Authors: Amir Mohammad Naderi, Jennifer G. Casey, Mao-Hsiang Huang, Rachelle Victorio, David Y. Chiang, Calum MacRae, Hung Cao, Vandana A. Gupta

Abstract: Quantifying cardiovascular parameters like ejection fraction in zebrafish as a host of biological investigations has been extensively studied. Since current manual monitoring techniques are time-consuming and fallible, several image processing frameworks have been proposed to automate the process. Most of these works rely on supervised deep-learning architectures. However, supervised methods tend to be overfitted on their training dataset. This means that applying the same framework to new data with different imaging setups and mutant types can severely decrease performance. We have developed a Zebrafish Automatic Cardiovascular Assessment Framework (ZACAF) to quantify the cardiac function in zebrafish. In this work, we further applied data augmentation, Transfer Learning (TL), and Test Time Augmentation (TTA) to ZACAF to improve the performance of cardiovascular function quantification in zebrafish. This strategy can be integrated with the available frameworks to aid other researchers. We demonstrate that using TL, even with a constrained dataset, the model can be refined to accommodate a novel microscope setup, encompassing diverse mutant types and accommodating various video recording protocols. Additionally, as users engage in successive rounds of TL, the model is anticipated to undergo substantial enhancements in both generalizability and accuracy. Finally, we applied this approach to assess the cardiovascular function in nrap mutant zebrafish, a model of cardiomyopathy.

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.08747 [pdf, other]

Rationality of Learning Algorithms in Repeated Normal-Form Games

Authors: Shivam Bajaj, Pranoy Das, Yevgeniy Vorobeychik, Vijay Gupta

Abstract: Many learning algorithms are known to converge to an equilibrium for specific classes of games if the same learning algorithm is adopted by all agents. However, when the agents are self-interested, a natural question is whether agents have a strong incentive to adopt an alternative learning algorithm that yields them greater individual utility. We capture such incentives as an algorithm's rationality ratio, which is the ratio of the highest payoff an agent can obtain by deviating from a learning algorithm to its payoff from following it. We define a learning algorithm to be $c$-rational if its rationality ratio is at most $c$ irrespective of the game. We first establish that popular learning algorithms such as fictitious play and regret matching are not $c$-rational for any constant $c\geq 1$. We then propose and analyze two algorithms that are provably $1$-rational under mild assumptions, and have the same properties as (a generalized version of) fictitious play and regret matching, respectively, if all agents follow them. Finally, we show that if an assumption of perfect monitoring is not satisfied, there are games for which $c$-rational algorithms do not exist, and illustrate our results with numerical case studies.

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.04632 [pdf, other]

GSN: Generalisable Segmentation in Neural Radiance Field

Authors: Vinayak Gupta, Rahul Goel, Sirikonda Dhawal, P. J. Narayanan

Abstract: Traditional Radiance Field (RF) representations capture details of a specific scene and must be trained afresh on each scene. Semantic feature fields have been added to RFs to facilitate several segmentation tasks. Generalised RF representations learn the principles of view interpolation. A generalised RF can render new views of an unknown and untrained scene, given a few views. We present a way to distil feature fields into the generalised GNT representation. Our GSN representation generates new views of unseen scenes on the fly along with consistent, per-pixel semantic features. This enables multi-view segmentation of arbitrary new scenes. We show different semantic features being distilled into generalised RFs. Our multi-view segmentation results are on par with methods that use traditional RFs. GSN closes the gap between standard and generalisable RF methods significantly. Project Page: https://vinayak-vg.github.io/GSN/

Submitted 7 February, 2024; originally announced February 2024.

Comments: Accepted at the Main Technical Track of AAAI 2024

arXiv:2402.04146 [pdf, other]

Interpretable Multi-Source Data Fusion Through Latent Variable Gaussian Process

Authors: Sandipp Krishnan Ravi, Yigitcan Comlek, Wei Chen, Arjun Pathak, Vipul Gupta, Rajnikant Umretiya, Andrew Hoffman, Ghanshyam Pilania, Piyush Pandita, Sayan Ghosh, Nathaniel Mckeever, Liping Wang

Abstract: With the advent of artificial intelligence (AI) and machine learning (ML), various domains of the science and engineering communities have leveraged data-driven surrogates to model complex systems from numerous sources of information (data). The proliferation has led to a significant reduction in the cost and time involved in the development of superior systems designed to perform specific functionalities. A high proportion of such surrogates are built by extensively fusing multiple sources of data, be it published papers, patents, open repositories, or other resources. However, not much attention has been paid to the differences in quality and comprehensiveness of the known and unknown underlying physical parameters of the information sources that could have downstream implications during system optimization. Towards resolving this issue, a multi-source data fusion framework based on Latent Variable Gaussian Process (LVGP) is proposed. The individual data sources are tagged as a characteristic categorical variable that is mapped into a physically interpretable latent space, allowing the development of source-aware data fusion modeling. Additionally, a dissimilarity metric based on the latent variables of LVGP is introduced to study and understand the differences in the sources of data. The proposed approach is demonstrated on and analyzed through two mathematical (representative parabola problem, 2D Ackley function) and two materials science (design of FeCrAl and SmCoFe alloys) case studies. From the case studies, it is observed that compared to using single-source and source-unaware ML models, the proposed multi-source data fusion framework can provide better predictions for sparse-data problems, interpretability regarding the sources, and enhanced modeling capabilities by taking advantage of the correlations and relationships among different sources.

Submitted 15 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: 27 pages, 10 figures, 3 supplementary figures, 2 supplementary tables

arXiv:2402.03256 [pdf, ps, other]

Decision-Focused Learning with Directional Gradients

Authors: Michael Huang, Vishal Gupta

Abstract: We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. The key idea is to connect the expected downstream decision loss with the directional derivative of a particular plug-in objective, and then approximate this derivative using zeroth order gradient techniques. Unlike the original decision loss which is typically piecewise constant and discontinuous, our new PG losses can be optimized using off-the-shelf gradient-based methods. Most importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows. Hence, optimizing our surrogate loss yields a best-in-class policy asymptotically, even in misspecified settings. This is the first such result in misspecified settings, and we provide numerical evidence confirming our PG losses substantively outperform existing proposals when the underlying model is misspecified.

Submitted 23 July, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.02589 [pdf, other]

Prospective Prediction of Body Mass Index Trajectories using Multi-task Gaussian Processes

Authors: Arthur Leroy, Varsha Gupta, Mya Thway Tint, Delicia Ooi Shu Qin, Keith M. Godfrey, Fabian Yap, Leck Ngee, Yung Seng Lee, Johan G. Eriksson, Navin Michael, Mauricio A. Alvarez, Dennis Wang

Abstract: Clinicians often investigate the body mass index (BMI) trajectories of children to assess their growth with respect to their peers, as well as to anticipate future growth and disease risk. While retrospective modelling of BMI trajectories has been an active area of research, prospective prediction of continuous BMI trajectories from historical growth data has not been well investigated. Using weight and height measurements from birth to age 10 years from a longitudinal mother-offspring cohort, we leveraged a multi-task Gaussian processes model, called MagmaClust, to derive probabilistic predictions for BMI trajectories over various forecasting periods. Experiments were conducted to evaluate the accuracy, sensitivity to missing values, and number of clusters. The results were compared with cubic B-spline regression and a parametric Jenss-Bayley mixed effects model. A downstream tool computing individual overweight probabilities was also proposed and evaluated. In all experiments, MagmaClust outperformed conventional models in prediction accuracy while correctly calibrating uncertainty regardless of the missing data amount (up to 90\% missing) or the forecasting period (from 2 to 8 years in the future). Moreover, the overweight probabilities computed from MagmaClust's uncertainty quantification exhibited high specificity ($0.94$ to $0.96$) and accuracy ($0.86$ to $0.94$) in predicting the 10-year overweight status even from age 2 years. MagmaClust provides a probabilistic non-parametric framework to prospectively predict BMI trajectories, which is robust to missing values and outperforms conventional BMI trajectory modelling approaches. It also clusters individuals to identify typical BMI patterns (early peak, adiposity rebounds) during childhood. Overall, we demonstrated its potential to anticipate BMI evolution throughout childhood, allowing clinicians to implement prevention strategies.

Submitted 4 February, 2024; originally announced February 2024.

Comments: 17 pages, 9 figures, 5 tables

Showing 1–50 of 471 results for author: Gupta, V