-
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?
Authors:
Nemika Tyagi,
Mihir Parmar,
Mohith Kulkarni,
Aswin RRV,
Nisarg Patel,
Mutsumi Nakamura,
Arindam Mitra,
Chitta Baral
Abstract:
Solving grid puzzles involves a significant amount of logical reasoning. Hence, it is a good domain to evaluate the reasoning capability of a model which can then guide us to improve the reasoning ability of models. However, most existing works evaluate only the final predicted answer of a puzzle, without delving into an in-depth analysis of the LLMs' reasoning chains (such as where they falter) or providing any finer metrics to evaluate them. Since LLMs may rely on simple heuristics or artifacts to predict the final answer, it is crucial to evaluate the generated reasoning chain beyond overall correctness measures, for accurately evaluating the reasoning abilities of LLMs. To this end, we first develop GridPuzzle, an evaluation dataset comprising 274 grid-based puzzles with different complexities. Second, we propose a new error taxonomy derived from manual analysis of reasoning chains from LLMs including GPT-4, Claude-3, Gemini, Mistral, and Llama-2. Then, we develop an LLM-based framework for large-scale subjective evaluation (i.e., identifying errors) and an objective metric, PuzzleEval, to evaluate the correctness of reasoning chains. Evaluating reasoning chains from LLMs leads to several interesting findings. We further show that existing prompting methods used for enhancing models' reasoning abilities do not improve performance on GridPuzzle. This highlights the importance of understanding fine-grained errors and presents a challenge for future research to enhance LLMs' puzzle-solving abilities by developing methods that address these errors. Data and source code are available at https://github.com/Mihir3009/GridPuzzle.
Submitted 20 July, 2024;
originally announced July 2024.
-
SALSA: Swift Adaptive Lightweight Self-Attention for Enhanced LiDAR Place Recognition
Authors:
Raktim Gautam Goswami,
Naman Patel,
Prashanth Krishnamurthy,
Farshad Khorrami
Abstract:
Large-scale LiDAR mappings and localization leverage place recognition techniques to mitigate odometry drifts, ensuring accurate mapping. These techniques utilize scene representations from LiDAR point clouds to identify previously visited sites within a database. Local descriptors, assigned to each point within a point cloud, are aggregated to form a scene representation for the point cloud. These descriptors are also used to re-rank the retrieved point clouds based on geometric fitness scores. We propose SALSA, a novel, lightweight, and efficient framework for LiDAR place recognition. It consists of a Sphereformer backbone that uses radial window attention to enable information aggregation for sparse distant points, an adaptive self-attention layer to pool local descriptors into tokens, and a multi-layer-perceptron Mixer layer for aggregating the tokens to generate a scene descriptor. The proposed framework outperforms existing methods on various LiDAR place recognition datasets in terms of both retrieval and metric localization while operating in real-time.
Submitted 30 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Lifestyle-Informed Personalized Blood Biomarker Prediction via Novel Representation Learning
Authors:
A. Ali Heydari,
Naghmeh Rezaei,
Javier L. Prieto,
Shwetak N. Patel,
Ahmed A. Metwally
Abstract:
Blood biomarkers are an essential tool for healthcare providers to diagnose, monitor, and treat a wide range of medical conditions. Current reference values and recommended ranges often rely on population-level statistics, which may not adequately account for the influence of inter-individual variability driven by factors such as lifestyle and genetics. In this work, we introduce a novel framework for predicting future blood biomarker values and define personalized references through learned representations from lifestyle data (physical activity and sleep) and blood biomarkers. Our proposed method learns a similarity-based embedding space that captures the complex relationship between biomarkers and lifestyle factors. Using the UK Biobank (257K participants), our results show that our deep-learned embeddings outperform traditional and current state-of-the-art representation learning techniques in predicting clinical diagnosis. Using a subset of UK Biobank of 6440 participants who have follow-up visits, we validate that the inclusion of these embeddings and lifestyle factors directly in blood biomarker models improves the prediction of future lab values from a single lab visit. This personalized modeling approach provides a foundation for developing more accurate risk stratification tools and tailoring preventative care strategies. In clinical settings, this translates to the potential for earlier disease detection, more timely interventions, and ultimately, a shift towards personalized healthcare.
Submitted 9 July, 2024;
originally announced July 2024.
-
Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models
Authors:
Nisarg Patel,
Mohith Kulkarni,
Mihir Parmar,
Aashna Budhiraja,
Mutsumi Nakamura,
Neeraj Varshney,
Chitta Baral
Abstract:
As Large Language Models (LLMs) continue to exhibit remarkable performance in natural language understanding tasks, there is a crucial need to measure their ability for human-like multi-step logical reasoning. Existing logical reasoning evaluation benchmarks often focus primarily on simplistic single-step or multi-step reasoning with a limited set of inference rules. Furthermore, the lack of datasets for evaluating non-monotonic reasoning represents a crucial gap since it aligns more closely with human-like reasoning. To address these limitations, we propose Multi-LogiEval, a comprehensive evaluation dataset encompassing multi-step logical reasoning with various inference rules and depths. Multi-LogiEval covers three logic types--propositional, first-order, and non-monotonic--consisting of more than 30 inference rules and more than 60 of their combinations with various depths. Leveraging this dataset, we conduct evaluations on a range of LLMs including GPT-4, ChatGPT, Gemini-Pro, Yi, Orca, and Mistral, employing a zero-shot chain-of-thought. Experimental results show that there is a significant drop in the performance of LLMs as the reasoning steps/depth increases (average accuracy of ~68% at depth-1 to ~43% at depth-5). We further conduct a thorough investigation of reasoning chains generated by LLMs which reveals several important findings. We believe that Multi-LogiEval facilitates future research for evaluating and enhancing the logical reasoning ability of LLMs. Data is available at https://github.com/Mihir3009/Multi-LogiEval.
Submitted 24 June, 2024;
originally announced June 2024.
-
Empowering Tuberculosis Screening with Explainable Self-Supervised Deep Neural Networks
Authors:
Neel Patel,
Alexander Wong,
Ashkan Ebadi
Abstract:
Tuberculosis persists as a global health crisis, especially in resource-limited populations and remote regions, with more than 10 million individuals newly infected annually. It stands as a stark symbol of inequity in public health. Tuberculosis impacts roughly a quarter of the global populace, with the majority of cases concentrated in eight countries, accounting for two-thirds of all tuberculosis infections. Although a severe ailment, tuberculosis is both curable and manageable. However, early detection and screening of at-risk populations are imperative. Chest x-ray stands as the predominant imaging technique utilized in tuberculosis screening efforts. However, x-ray screening necessitates skilled radiologists, a resource often scarce, particularly in remote regions with limited resources. Consequently, there is a pressing need for artificial intelligence (AI)-powered systems to support clinicians and healthcare providers in swift screening. However, training a reliable AI model necessitates large-scale high-quality data, which can be difficult and costly to acquire. Inspired by these challenges, in this work, we introduce an explainable self-supervised self-train learning network tailored for tuberculosis case screening. The network achieves an outstanding overall accuracy of 98.14% and demonstrates high recall and precision rates of 95.72% and 99.44%, respectively, in identifying tuberculosis cases, effectively capturing clinically significant features.
Submitted 19 June, 2024;
originally announced June 2024.
-
Demonstration of neutron identification in neutrino interactions in the MicroBooNE liquid argon time projection chamber
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
A. Barnard,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
J. Bateman,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book
, et al. (165 additional authors not shown)
Abstract:
A significant challenge in measurements of neutrino oscillations is reconstructing the incoming neutrino energies. While modern fully-active tracking calorimeters such as liquid argon time projection chambers in principle allow the measurement of all final state particles above some detection threshold, undetected neutrons remain a considerable source of missing energy with little to no data constraining their production rates and kinematics. We present the first demonstration of tagging neutrino-induced neutrons in liquid argon time projection chambers using secondary protons emitted from neutron-argon interactions in the MicroBooNE detector. We describe the method developed to identify neutrino-induced neutrons and demonstrate its performance using neutrons produced in muon-neutrino charged current interactions. The method is validated using a small subset of MicroBooNE's total dataset. The selection yields a sample with $60\%$ of selected tracks corresponding to neutron-induced secondary protons.
Submitted 15 June, 2024;
originally announced June 2024.
-
Improving neutrino energy estimation of charged-current interaction events with recurrent neural networks in MicroBooNE
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
A. Barnard,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
J. Bateman,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book
, et al. (164 additional authors not shown)
Abstract:
We present a deep learning-based method for estimating the neutrino energy of charged-current neutrino-argon interactions. We employ a recurrent neural network (RNN) architecture for neutrino energy estimation in the MicroBooNE experiment, utilizing liquid argon time projection chamber (LArTPC) detector technology. Traditional energy estimation approaches in LArTPCs, which largely rely on reconstructing and summing visible energies, often experience sizable biases and resolution smearing because of the complex nature of neutrino interactions and the detector response. The estimation of neutrino energy can be improved after considering the kinematics information of reconstructed final-state particles. Utilizing kinematic information of reconstructed particles, the deep learning-based approach shows improved resolution and reduced bias for the muon neutrino Monte Carlo simulation sample compared to the traditional approach. In order to address the common concern about the effectiveness of this method on experimental data, the RNN-based energy estimator is further examined and validated with dedicated data-simulation consistency tests using MicroBooNE data. We also assess its potential impact on a neutrino oscillation study after accounting for all statistical and systematic uncertainties and show that it enhances physics sensitivity. This method has good potential to improve the performance of other physics analyses.
Submitted 14 June, 2024;
originally announced June 2024.
-
EnterpriseEM: Fine-tuned Embeddings for Enterprise Semantic Search
Authors:
Kamalkumar Rathinasamy,
Jayarama Nettar,
Amit Kumar,
Vishal Manchanda,
Arun Vijayakumar,
Ayush Kataria,
Venkateshprasanna Manjunath,
Chidambaram GS,
Jaskirat Singh Sodhi,
Shoeb Shaikh,
Wasim Akhtar Khan,
Prashant Singh,
Tanishq Dattatray Ige,
Vipin Tiwari,
Rajab Ali Mondal,
Harshini K,
S Reka,
Chetana Amancharla,
Faiz ur Rahman,
Harikrishnan P A,
Indraneel Saha,
Bhavya Tiwary,
Navin Shankar Patel,
Pradeep T S,
Balaji A J
, et al. (2 additional authors not shown)
Abstract:
Enterprises grapple with the significant challenge of managing proprietary unstructured data, hindering efficient information retrieval. This has led to the emergence of AI-driven information retrieval solutions, designed to adeptly extract relevant insights to address employee inquiries. These solutions often leverage pre-trained embedding models and generative models as foundational components. While pre-trained embeddings may exhibit proximity or disparity based on their original training objectives, they might not fully align with the unique characteristics of enterprise-specific data, leading to suboptimal alignment with the retrieval goals of enterprise environments. In this paper, we propose a methodology to fine-tune pre-trained embedding models specifically for enterprise environments. By adapting the embeddings to better suit the retrieval tasks prevalent in enterprises, we aim to enhance the performance of information retrieval solutions. We discuss the process of fine-tuning, its effect on retrieval accuracy, and the potential benefits for enterprise information management. Our findings demonstrate the efficacy of fine-tuned embedding models in improving the precision and relevance of search results in enterprise settings.
Submitted 18 May, 2024;
originally announced June 2024.
-
Improved Emotional Alignment of AI and Humans: Human Ratings of Emotions Expressed by Stable Diffusion v1, DALL-E 2, and DALL-E 3
Authors:
James Derek Lomas,
Willem van der Maden,
Sohhom Bandyopadhyay,
Giovanni Lion,
Nirmal Patel,
Gyanesh Jain,
Yanna Litowsky,
Haian Xue,
Pieter Desmet
Abstract:
Generative AI systems are increasingly capable of expressing emotions via text and imagery. Effective emotional expression will likely play a major role in the efficacy of AI systems -- particularly those designed to support human mental health and wellbeing. This motivates our present research to better understand the alignment of AI expressed emotions with the human perception of emotions. When AI tries to express a particular emotion, how might we assess whether they are successful? To answer this question, we designed a survey to measure the alignment between emotions expressed by generative AI and human perceptions. Three generative image models (DALL-E 2, DALL-E 3 and Stable Diffusion v1) were used to generate 240 examples of images, each of which was based on a prompt designed to express five positive and five negative emotions across both humans and robots. 24 participants recruited from the Prolific website rated the alignment of AI-generated emotional expressions with a text prompt used to generate the emotion (i.e., "A robot expressing the emotion amusement"). The results of our evaluation suggest that generative AI models are indeed capable of producing emotional expressions that are well-aligned with a range of human emotions; however, we show that the alignment significantly depends upon the AI model used and the emotion itself. We analyze variations in the performance of these systems to identify gaps for future improvement. We conclude with a discussion of the implications for future AI systems designed to support mental health and wellbeing.
Submitted 28 May, 2024;
originally announced May 2024.
-
Elucidating nanostructural organisation and photonic properties of butterfly wing scales using hyperspectral microscopy
Authors:
Anna-Lee Jessop,
Primoz Pirih,
Limin Wang,
Nipam Patel,
Peta Clode,
Gerd Schroeder-Turk,
Bodo Wilts
Abstract:
Biophotonic nanostructures in butterfly wing scales remain fascinating examples of biological functional materials, with intriguing open questions in regards to formation and evolutionary function. One particularly interesting butterfly species, Erora opisena (Lycaenidae: Theclinae), develops wing scales that contain three-dimensional photonic crystals that closely resemble a single gyroid geometry. Unlike most other gyroid forming butterflies, E. opisena develops discrete gyroid crystallites with a pronounced size gradient hinting at a developmental sequence frozen in time. Here, we use a hyperspectral (wavelength-resolved) microscopy technique to investigate the ultrastructural organisation of these gyroid crystallites in dry, adult wing scales. We show that reflectance corresponds to crystallite size, where larger crystallites reflect green wavelengths more intensely; this relationship could be used to infer size from the optical signal. We further successfully resolve the red-shifted reflectance signal from wing scales immersed in refractive index oils with varying refractive index, including values similar to water or cytosol. Such photonic crystals with lower refractive index contrast may be similar to the hypothesized nanostructural forms in the developing butterfly scales. The ability to resolve these fainter signals hints at the potential of this facile light microscopy method for in vivo analysis of nanostructure formation in developing butterflies.
Submitted 28 May, 2024;
originally announced May 2024.
-
CLIPScope: Enhancing Zero-Shot OOD Detection with Bayesian Scoring
Authors:
Hao Fu,
Naman Patel,
Prashanth Krishnamurthy,
Farshad Khorrami
Abstract:
Detection of out-of-distribution (OOD) samples is crucial for safe real-world deployment of machine learning models. Recent advances in vision language foundation models have made them capable of detecting OOD samples without requiring in-distribution (ID) images. However, these zero-shot methods often underperform as they do not adequately consider ID class likelihoods in their detection confidence scoring. Hence, we introduce CLIPScope, a zero-shot OOD detection approach that normalizes the confidence score of a sample by class likelihoods, akin to a Bayesian posterior update. Furthermore, CLIPScope incorporates a novel strategy to mine OOD classes from a large lexical database. It selects class labels that are farthest and nearest to ID classes in terms of CLIP embedding distance to maximize coverage of OOD samples. We conduct extensive ablation studies and empirical evaluations, demonstrating state of the art performance of CLIPScope across various OOD detection benchmarks.
Submitted 23 May, 2024;
originally announced May 2024.
-
Verifying Lock-free Search Structure Templates
Authors:
Nisarg Patel,
Dennis Shasha,
Thomas Wies
Abstract:
We present and verify template algorithms for lock-free concurrent search structures that cover a broad range of existing implementations based on lists and skiplists. Our linearizability proofs are fully mechanized in the concurrent separation logic Iris. The proofs are modular and cover the broader design space of the underlying algorithms by parameterizing the verification over aspects such as the low-level representation of nodes and the style of data structure maintenance. As a further technical contribution, we present a mechanization of a recently proposed method for reasoning about future-dependent linearization points using hindsight arguments. The mechanization builds on Iris' support for prophecy reasoning and user-defined ghost resources. We demonstrate that the method can help to reduce the proof effort compared to direct prophecy-based proofs.
Submitted 21 May, 2024;
originally announced May 2024.
-
The KALEIDOSCOPE survey: A new strong and weak gravitational lensing view of the massive galaxy cluster MACS J1423.8+2404
Authors:
Nency R. Patel,
Mathilde Jauzac,
Anna Niemiec,
David Lagattuta,
Guillaume Mahler,
Benjamin Beauchesne,
Alastair Edge,
Harald Ebeling,
Marceau Limousin
Abstract:
We present a combined strong and weak gravitational-lensing analysis of the massive galaxy cluster MACS J1423.8+2404 ($z=0.545$, MACS J1423 hereafter), one of the most dynamically relaxed and massive cool-core clusters discovered in the MAssive Cluster Survey at $z>0.5$. We combine high-resolution imaging from the Hubble Space Telescope (HST) in the F606W, F814W, and F160W pass-bands with spectroscopic observations taken as part of the KALEIDOSCOPE survey with the Multi-Unit Spectroscopic Explorer mounted on the Very Large Telescope. Our strong lensing analysis of the mass distribution in the cluster core is constrained by four multiple-image systems (17 individual images) within redshift range $1.779<z<2.840$. Our weak-lensing analysis of the cluster outskirts, confined to the HST field of view, is based on a background galaxy catalogue with a density of 57 gal.arcmin$^{-2}$. We measure a projected mass of M($\textrm{R}<200$ kpc) = (1.6 $\pm$ 0.05) $\times$ 10$^{14}$ M$_{\rm\odot}$ from our strong-lensing model, and a projected mass of M($\textrm{R}<640$ kpc) = (6.6 $\pm$ 0.6) $\times$ 10$^{14}$ M$_{\rm\odot}$ when combining with our weak-lensing constraints. Our analysis of the cluster mass distribution yields no evidence of substructures, confirming the dynamically relaxed state of MACS J1423. Our work sets the stage for future analysis of MACS J1423 in the upcoming Canadian Near Infrared Imager and Slitless Spectrograph Unbiased Cluster Survey for the James Webb Space Telescope.
Submitted 7 May, 2024;
originally announced May 2024.
-
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Authors:
Mihir Parmar,
Nisarg Patel,
Neeraj Varshney,
Mutsumi Nakamura,
Man Luo,
Santosh Mashetty,
Arindam Mitra,
Chitta Baral
Abstract:
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks. But, can they really "reason" over the natural language? This question has been receiving significant research attention and many reasoning skills such as commonsense, numerical, and qualitative have been studied. However, the crucial skill pertaining to 'logical reasoning' has remained underexplored. Existing work investigating this reasoning ability of LLMs has focused only on a couple of inference rules (such as modus ponens and modus tollens) of propositional and first-order logic. Addressing the above limitation, we comprehensively evaluate the logical reasoning ability of LLMs on 25 different reasoning patterns spanning over propositional, first-order, and non-monotonic logics. To enable systematic evaluation, we introduce LogicBench, a natural language question-answering dataset focusing on the use of a single inference rule. We conduct detailed analysis with a range of LLMs such as GPT-4, ChatGPT, Gemini, Llama-2, and Mistral using chain-of-thought prompting. Experimental results show that existing LLMs do not fare well on LogicBench; especially, they struggle with instances involving complex reasoning and negations. Furthermore, they sometimes overlook contextual information necessary for reasoning to arrive at the correct conclusion. We believe that our work and findings facilitate future research for evaluating and enhancing the logical reasoning ability of LLMs. Data and code are available at https://github.com/Mihir3009/LogicBench.
Submitted 6 June, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
First double-differential cross section measurement of neutral-current $π^0$ production in neutrino-argon scattering in the MicroBooNE detector
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
A. Barnard,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
J. Bateman,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book
, et al. (166 additional authors not shown)
Abstract:
We report the first double-differential cross section measurement of neutral-current neutral pion (NC$π^0$) production in neutrino-argon scattering, as well as single-differential measurements of the same channel in terms of final states with and without protons. The kinematic variables of interest for these measurements are the $π^0$ momentum and the $π^0$ scattering angle with respect to the neutrino beam. A total of 4971 candidate NC$π^0$ events fully-contained within the MicroBooNE detector are selected using data collected at a mean neutrino energy of $\sim 0.8$ GeV from $6.4\times10^{20}$ protons on target from the Booster Neutrino Beam at the Fermi National Accelerator Laboratory. After extensive data-driven model validation to ensure unbiased unfolding, the Wiener-SVD method is used to extract nominal flux-averaged cross sections. The results are compared to predictions from commonly used neutrino event generators, which tend to overpredict the measured NC$π^0$ cross section, especially in the 0.2-0.5 GeV/c $π^0$ momentum range, at forward scattering angles, and when at least one proton is present in the final state. These measurements show sensitivity to a variety of features that complicate the description of NC$π^0$ production including the form factors describing the elementary neutrino interaction and the final state interactions of the outgoing particles in the residual argon nucleus. This data will help improve the modeling of NC$π^0$ production, which represents a major background in measurements of charge-parity violation in the neutrino sector and in searches for new physics beyond the Standard Model.
Submitted 16 April, 2024;
originally announced April 2024.
-
Measurement of the differential cross section for neutral pion production in charged-current muon neutrino interactions on argon with the MicroBooNE detector
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
M. B. Brunetti,
L. Camilleri
, et al. (163 additional authors not shown)
Abstract:
We present a measurement of neutral pion production in charged-current interactions using data recorded with the MicroBooNE detector exposed to Fermilab's Booster Neutrino Beam. The signal comprises one muon, one neutral pion, any number of nucleons, and no charged pions. Studying neutral pion production in the MicroBooNE detector provides an opportunity to better understand neutrino-argon interactions, and is crucial for future accelerator-based neutrino oscillation experiments. Using a dataset corresponding to $6.86 \times 10^{20}$ protons on target, we present single-differential cross sections in muon and neutral pion momenta, scattering angles with respect to the beam for the outgoing muon and neutral pion, as well as the opening angle between the muon and neutral pion. Data-extracted cross sections are compared to generator predictions. We report good agreement between the data and the models for scattering angles, except for an overprediction by generators at muon forward angles. Similarly, the agreement between data and the models as a function of momentum is good, except for an underprediction by generators in the medium momentum ranges, $200-400$ MeV for muons and $100-200$ MeV for pions.
Submitted 6 May, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Measurement of double-differential cross sections for mesonless charged-current muon neutrino interactions on argon with final-state protons using the MicroBooNE detector
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
M. B. Brunetti,
L. Camilleri
, et al. (163 additional authors not shown)
Abstract:
Charged-current neutrino interactions with final states containing zero mesons and at least one proton are of high interest for current and future accelerator-based neutrino oscillation experiments. Using the Booster Neutrino Beam and the MicroBooNE detector at Fermi National Accelerator Laboratory, we have obtained the first double-differential cross section measurements of this channel for muon neutrino scattering on an argon target with a proton momentum threshold of 0.25 GeV/c. We also report a flux-averaged total cross section of $σ= (11.8 \pm 1.2) \times 10^{-38}$ cm$^2$ / Ar and several single-differential measurements which extend and improve upon previous results. Statistical and systematic uncertainties are quantified with a full treatment of correlations across 359 kinematic bins, including correlations between distributions describing different observables. The resulting data set provides the most detailed information obtained to date for testing models of mesonless neutrino-argon scattering.
Submitted 16 April, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Enhancing Financial Data Visualization for Investment Decision-Making
Authors:
Nisarg Patel,
Harmit Shah,
Kishan Mewada
Abstract:
Navigating the intricate landscape of financial markets requires adept forecasting of stock price movements. This paper delves into the potential of Long Short-Term Memory (LSTM) networks for predicting stock dynamics, with a focus on discerning nuanced rise and fall patterns. Leveraging a dataset from the New York Stock Exchange (NYSE), the study incorporates multiple features to enhance LSTM's capacity in capturing complex patterns. Visualization of key attributes, such as opening, closing, low, and high prices, aids in unraveling subtle distinctions crucial for comprehensive market understanding. The meticulously crafted LSTM input structure, inspired by established guidelines, incorporates both price and volume attributes over a 25-day time step, enabling the model to capture temporal intricacies. A comprehensive methodology, including hyperparameter tuning with Grid Search, Early Stopping, and Callback mechanisms, leads to a remarkable 53% improvement in predictive accuracy. The study concludes with insights into model robustness, contributions to financial forecasting literature, and a roadmap for real-time stock market prediction. The amalgamation of LSTM networks, strategic hyperparameter tuning, and informed feature selection presents a potent framework for advancing the accuracy of stock price predictions, contributing substantively to financial time series forecasting discourse.
Submitted 9 December, 2023;
originally announced March 2024.
-
First simultaneous measurement of differential muon-neutrino charged-current cross sections on argon for final states with and without protons using MicroBooNE data
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
M. B. Brunetti,
L. Camilleri
, et al. (163 additional authors not shown)
Abstract:
We report the first double-differential neutrino-argon cross section measurement made simultaneously for final states with and without protons for the inclusive muon neutrino charged-current interaction channel. The proton kinematics of this channel are further explored with a differential cross section measurement as a function of the leading proton's kinetic energy that extends across the detection threshold. These measurements utilize data collected using the MicroBooNE detector from 6.4$\times10^{20}$ protons on target from the Fermilab Booster Neutrino Beam with a mean neutrino energy of $\sim$0.8 GeV. Extensive data-driven model validation utilizing the conditional constraint formalism is employed. This motivates enlarging the uncertainties with an empirical reweighting approach to minimize the possibility of extracting biased cross section results. The extracted nominal flux-averaged cross sections are compared to widely used event generator predictions revealing severe mismodeling of final states without protons for muon neutrino charged-current interactions, possibly from insufficient treatment of final state interactions. These measurements provide a wealth of new information useful for improving event generators which will enhance the sensitivity of precision measurements in neutrino experiments.
Submitted 27 July, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
Inclusive cross section measurements in final states with and without protons for charged-current $ν_μ$-Ar scattering in MicroBooNE
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
M. B. Brunetti,
L. Camilleri
, et al. (164 additional authors not shown)
Abstract:
A detailed understanding of inclusive muon neutrino charged-current interactions on argon is crucial to the study of neutrino oscillations in current and future experiments using liquid argon time projection chambers. To that end, we report a comprehensive set of differential cross section measurements for this channel that simultaneously probe the leptonic and hadronic systems by dividing the channel into final states with and without protons. Measurements of the proton kinematics and proton multiplicity of the final state are also presented. For these measurements, we utilize data collected with the MicroBooNE detector from 6.4$\times10^{20}$ protons on target from the Fermilab Booster Neutrino Beam at a mean neutrino energy of approximately 0.8 GeV. We present in detail the cross section extraction procedure, including the unfolding, and model validation that uses data to model comparisons and the conditional constraint formalism to detect mismodeling that may introduce biases to extracted cross sections that are larger than their uncertainties. The validation exposes insufficiencies in the overall model, motivating the inclusion of an additional data-driven reweighting systematic to ensure the accuracy of the unfolding. The extracted results are compared to a number of event generators and their performance is discussed with a focus on the regions of phase-space that indicate the greatest need for modeling improvements.
Submitted 27 July, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
Question answering systems for health professionals at the point of care -- a systematic review
Authors:
Gregory Kell,
Angus Roberts,
Serge Umansky,
Linglong Qian,
Davide Ferrari,
Frank Soboczenski,
Byron Wallace,
Nikhil Patel,
Iain J Marshall
Abstract:
Objective: Question answering (QA) systems have the potential to improve the quality of clinical care by providing health professionals with the latest and most relevant evidence. However, QA systems have not been widely adopted. This systematic review aims to characterize current medical QA systems, assess their suitability for healthcare, and identify areas of improvement.
Materials and methods: We searched PubMed, IEEE Xplore, ACM Digital Library, ACL Anthology and forward and backward citations on 7th February 2023. We included peer-reviewed journal and conference papers describing the design and evaluation of biomedical QA systems. Two reviewers screened titles, abstracts, and full-text articles. We conducted a narrative synthesis and risk of bias assessment for each study. We assessed the utility of biomedical QA systems.
Results: We included 79 studies and identified themes, including question realism, answer reliability, answer utility, clinical specialism, systems, usability, and evaluation methods. Clinicians' questions used to train and evaluate QA systems were restricted to certain sources, types and complexity levels. No system communicated confidence levels in the answers or sources. Many studies suffered from high risks of bias and applicability concerns. Only 8 studies completely satisfied any criterion for clinical utility, and only 7 reported user evaluations. Most systems were built with limited input from clinicians.
Discussion: While machine learning methods have led to increased accuracy, most studies imperfectly reflected real-world healthcare information needs. Key research priorities include developing more realistic healthcare QA datasets and considering the reliability of answer sources, rather than merely focusing on accuracy.
Submitted 24 January, 2024;
originally announced February 2024.
-
VISION-MAE: A Foundation Model for Medical Image Segmentation and Classification
Authors:
Zelong Liu,
Andrew Tieu,
Nikhil Patel,
Alexander Zhou,
George Soultanidis,
Zahi A. Fayad,
Timothy Deyer,
Xueyan Mei
Abstract:
Artificial Intelligence (AI) has the potential to revolutionize diagnosis and segmentation in medical imaging. However, development and clinical implementation face multiple challenges including limited data availability, lack of generalizability, and the necessity to incorporate multi-modal data effectively. A foundation model, which is a large-scale pre-trained AI model, offers a versatile base that can be adapted to a variety of specific tasks and contexts. Here, we present a novel foundation model, VISION-MAE, specifically designed for medical imaging. Specifically, VISION-MAE is trained on a dataset of 2.5 million unlabeled images from various modalities (CT, MR, PET, X-rays, and ultrasound), using self-supervised learning techniques. It is then adapted to classification and segmentation tasks using explicit labels. VISION-MAE has high label efficiency, outperforming several benchmark models in both in-domain and out-of-domain applications, and achieves high performance even with reduced availability of labeled data. This model represents a significant advancement in medical imaging AI, offering a generalizable and robust solution for improving segmentation and classification tasks while reducing the data annotation workload.
Submitted 1 February, 2024;
originally announced February 2024.
-
MRAnnotator: A Multi-Anatomy Deep Learning Model for MRI Segmentation
Authors:
Alexander Zhou,
Zelong Liu,
Andrew Tieu,
Nikhil Patel,
Sean Sun,
Anthony Yang,
Peter Choi,
Valentin Fauveau,
George Soultanidis,
Mingqian Huang,
Amish Doshi,
Zahi A. Fayad,
Timothy Deyer,
Xueyan Mei
Abstract:
Purpose: To develop a deep learning model for multi-anatomy and many-class segmentation of diverse anatomic structures on MRI imaging.
Materials and Methods: In this retrospective study, two datasets were curated and annotated for model development and evaluation. An internal dataset of 1022 MRI sequences from various clinical sites within a health system and an external dataset of 264 MRI sequences from an independent imaging center were collected. In both datasets, 49 anatomic structures were annotated as the ground truth. The internal dataset was divided into training, validation, and test sets and used to train and evaluate an nnU-Net model. The external dataset was used to evaluate nnU-Net model generalizability and performance in all classes on independent imaging data. Dice scores were calculated to evaluate model segmentation performance.
Results: The model achieved an average Dice score of 0.801 on the internal test set, and an average score of 0.814 on the complete external dataset across 49 classes.
Conclusion: The developed model achieves robust and generalizable segmentation of 49 anatomic structures on MRI imaging. A future direction is focused on the incorporation of additional anatomic regions and structures into the datasets and model.
Submitted 1 February, 2024;
originally announced February 2024.
-
Ordered magnetic fields around the 3C 84 central black hole
Authors:
G. F. Paraschos,
J. -Y. Kim,
M. Wielgus,
J. Röder,
T. P. Krichbaum,
E. Ros,
I. Agudo,
I. Myserlis,
M. Moscibrodzka,
E. Traianou,
J. A. Zensus,
L. Blackburn,
C. -K. Chan,
S. Issaoun,
M. Janssen,
M. D. Johnson,
V. L. Fish,
K. Akiyama,
A. Alberdi,
W. Alef,
J. C. Algaba,
R. Anantua,
K. Asada,
R. Azulay,
U. Bach
, et al. (258 additional authors not shown)
Abstract:
3C84 is a nearby radio source with a complex total intensity structure, showing linear polarisation and spectral patterns. A detailed investigation of the central engine region necessitates the use of VLBI above the hitherto available maximum frequency of 86GHz. Using ultrahigh resolution VLBI observations at the highest available frequency of 228GHz, we aim to directly detect compact structures and understand the physical conditions in the compact region of 3C84. We used EHT 228GHz observations and, given the limited (u,v)-coverage, applied geometric model fitting to the data. We also employed quasi-simultaneously observed, multi-frequency VLBI data for the source in order to carry out a comprehensive analysis of the core structure. We report the detection of a highly ordered, strong magnetic field around the central SMBH of 3C84. The brightness temperature analysis suggests that the system is in equipartition. We determined a turnover frequency of $ν_m=(113\pm4)$GHz, a corresponding synchrotron self-absorbed magnetic field of $B_{SSA}=(2.9\pm1.6)$G, and an equipartition magnetic field of $B_{eq}=(5.2\pm0.6)$G. Three components are resolved with the highest fractional polarisation detected for this object ($m_\textrm{net}=(17.0\pm3.9)$%). The positions of the components are compatible with those seen in low-frequency VLBI observations since 2017-2018. We report a steeply negative slope of the spectrum at 228GHz. We used these findings to test models of jet formation, propagation, and Faraday rotation in 3C84. The findings of our investigation into different flow geometries and black hole spins support an advection-dominated accretion flow in a magnetically arrested state around a rapidly rotating supermassive black hole as a model of the jet-launching system in the core of 3C84. However, systematic uncertainties due to the limited (u,v)-coverage cannot be ignored.
Submitted 1 February, 2024;
originally announced February 2024.
-
UMBRELLA: A One-stop Shop Bridging the Gap from Lab to Real-World IoT Experimentation
Authors:
Ioannis Mavromatis,
Yichao Jin,
Aleksandar Stanoev,
Anthony Portelli,
Ingram Weeks,
Ben Holden,
Eliot Glasspole,
Tim Farnham,
Aftab Khan,
Usman Raza,
Adnan Aijaz,
Thomas Bierton,
Ichiro Seto,
Nita Patel,
Mahesh Sooriyabandara
Abstract:
UMBRELLA is an open, large-scale IoT ecosystem deployed across South Gloucestershire, UK. It is intended to accelerate innovation across multiple technology domains. UMBRELLA is built to bridge the gap between existing specialised testbeds and address holistically real-world technological challenges in a System-of-Systems (SoS) fashion. UMBRELLA provides open access to real-world devices and infrastructure, enabling researchers and the industry to evaluate solutions for Smart Cities, Robotics, Wireless Communications, Edge Intelligence, and more. Key features include over 200 multi-sensor nodes installed on public infrastructure, a robotics arena with 20 mobile robots, a 5G network-in-a-box solution, and a unified backend platform for management, control and secure user access. The heterogeneity of hardware components, including diverse sensors, communication interfaces, and GPU-enabled edge devices, coupled with tools like digital twins, allows for comprehensive experimentation and benchmarking of innovative solutions not viable in lab environments. This paper provides a comprehensive overview of UMBRELLA's multi-domain architecture and capabilities, making it an ideal playground for Internet of Things (IoT) and Industrial IoT (IIoT) innovation. It discusses the challenges in designing, developing and operating UMBRELLA as an open, sustainable testbed and shares lessons learned to guide similar future initiatives. With its unique openness, heterogeneity, realism and tools, UMBRELLA aims to continue accelerating cutting-edge technology research, development and translation into real-world progress.
Submitted 2 February, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Mathematical Tri-State Model for Bee Shimmering Propagation Dynamics
Authors:
Navin Patel,
Henri Huijberts,
Kaspar Althoefer,
Ketao Zhang
Abstract:
Bees undergo a self-organised process known as shimmering, where they form emergent patterns when they interact with each other on the nest surface as a defence mechanism in response to predator attacks. Many experimental studies have empirically investigated how the transfer of information to neighbouring bees propagates in various shimmering processes by measuring shimmering wave strength. However, there is no analytical modelling of the collective defence mechanism in nature. Here we introduce the first analytical tri-state Inactive-Active-Relapse (IAR) model to formulate the intrinsic process of bee shimmering. The major shimmering behaviour is shown to emerge under theoretical conditions, which is demonstrated numerically and visually by simulating 1,000,000 bee agents; the number of agents is scalable. Furthermore, we elaborate on these mathematical results to construct a wave strength function to demonstrate the accuracy of shimmering dynamics. The constructed wave strength function can be adapted to peak between 50 and 150 ms, which supports the experimental studies. Our results provide a foundation for further theoretical understanding of bee shimmering wave dynamics and could serve as inspiration for modelling other self-organised phenomena across scientific applications.
Submitted 25 January, 2024;
originally announced January 2024.
-
Calculating Quasi-Normal Modes of Schwarzschild Black Holes with Physics Informed Neural Networks
Authors:
Nirmal Patel,
Aycin Aykutalp,
Pablo Laguna
Abstract:
Machine learning, particularly neural networks, has rapidly permeated most activities and work where data has a story to tell. Recently, deep learning has started to be used for solving differential equations with input from physics, also known as Physics Informed Neural Networks (PINNs). We present a study showing the efficacy of PINNs for solving the Zerilli and the Regge-Wheeler equations in the time domain to calculate the quasi-normal oscillation modes of a Schwarzschild black hole. We compare the extracted modes with those obtained with finite difference methods. Although the PINN results are competitive, with a few percent differences in the quasi-normal modes estimates relative to those computed with finite difference methods, the real power of PINNs will emerge when applied to large dimensionality problems.
Submitted 2 January, 2024;
originally announced January 2024.
-
A Cascaded Neural Network System For Rating Student Performance In Surgical Knot Tying Simulation
Authors:
Yunzhe Xue,
Olanrewaju Eletta,
Justin W. Ady,
Nell M. Patel,
Advaith Bongu,
Usman Roshan
Abstract:
As part of their training all medical students and residents have to pass basic surgical tasks such as knot tying, needle-passing, and suturing. Their assessment is typically performed in the operating room by surgical faculty where mistakes and failure by the student increase the operation time and cost. This evaluation is quantitative and has a low margin of error. Simulation has emerged as a cost effective option but it lacks assessment or requires additional expensive hardware for evaluation. Apps that provide training videos on surgical knot tying are available to students but none have evaluation. We propose a cascaded neural network architecture that evaluates a student's performance just from a video of themselves simulating a surgical knot tying task. Our model converts video frame images into feature vectors with a pre-trained deep convolutional network and then models the sequence of frames with a temporal network. We obtained videos of medical students and residents from the Robert Wood Johnson Hospital performing knot tying on a standardized simulation kit. We manually annotated each video and proceeded to do a five-fold cross-validation study on them. Our model achieves a median precision, recall, and F1-score of 0.71, 0.66, and 0.65 respectively in determining the level of knot related tasks of tying and pushing the knot. Our mean precision score averaged across different probability thresholds is 0.8. Our F1-score and mean precision score are 8% and 30% higher, respectively, than those of a recently published study for the same problem. We expect the accuracy of our model to further increase as we add more training videos to the model thus making it a practical solution that students can use to evaluate themselves.
Submitted 9 December, 2023;
originally announced December 2023.
-
On Nonlinear Stability of Muskat Bubbles
Authors:
Francisco Gancedo,
Eduardo García-Juárez,
Neel Patel,
Robert M. Strain
Abstract:
In this paper we consider gravity-capillarity Muskat bubbles in 2D. We obtain a new approach to improve our result in [25]. Due to a new bubble-adapted formulation, the improvement is twofold. We significantly condense the proof and we now obtain the global well-posedness result for Muskat bubbles in critical regularity.
Submitted 21 December, 2023;
originally announced December 2023.
-
First search for dark-trident processes using the MicroBooNE detector
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
M. B. Brunetti,
L. Camilleri
, et al. (163 additional authors not shown)
Abstract:
We present a first search for dark-trident scattering in a neutrino beam using a data set corresponding to $7.2 \times 10^{20}$ protons on target taken with the MicroBooNE detector at Fermilab. Proton interactions in the neutrino target at the Main Injector produce $π^0$ and $η$ mesons, which could decay into dark-matter (DM) particles mediated via a dark photon $A^\prime$. A convolutional neural network is trained to identify interactions of the DM particles in the liquid-argon time projection chamber (LArTPC) exploiting its image-like reconstruction capability. In the absence of a DM signal, we provide limits at the $90\%$ confidence level on the squared kinematic mixing parameter $\varepsilon^2$ as a function of the dark-photon mass in the range $10\le M_{A^\prime}\le 400$ MeV. The limits cover previously unconstrained parameter space for the production of fermion or scalar DM particles $χ$ for two benchmark models with mass ratios $M_χ/M_{A^\prime}=0.6$ and $2$ and for dark fine-structure constants $0.1\leα_D\le 1$.
Submitted 16 May, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Combinatorial Stationary Prophet Inequalities
Authors:
Neel Patel,
David Wajc
Abstract:
Numerous recent papers have studied the tension between thickening and clearing a market in (uncertain, online) long-time horizon Markovian settings. In particular, (Aouad and Saritaç EC'20, Collina et al. WINE'20, Kessel et al. EC'22) studied what the latter referred to as the Stationary Prophet Inequality Problem, due to its similarity to the classic finite-time horizon prophet inequality problem. These works all consider unit-demand buyers. Mirroring the long line of work on the classic prophet inequality problem subject to combinatorial constraints, we initiate the study of the stationary prophet inequality problem subject to combinatorially-constrained buyers.
Our results can be summarized succinctly as unearthing an algorithmic connection between contention resolution schemes (CRS) and stationary prophet inequalities. While the classic prophet inequality problem has a tight connection to online CRS (Feldman et al. SODA'16, Lee and Singla ESA'18), we show that for the stationary prophet inequality problem, offline CRS play a similarly central role. We show that, up to small constant factors, the best (ex-ante) competitive ratio achievable for the combinatorial prophet inequality equals the best possible balancedness achievable by offline CRS for the same combinatorial constraints.
Submitted 14 December, 2023; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Metadata for the Flux Density Calibration of the April 2018 Event Horizon Telescope Data
Authors:
J. Y. Koay,
C. Romero-Cañizales,
L. D. Matthews,
M. Janssen,
L. Blackburn,
R. P. J. Tilanus,
J. Park,
K. Asada,
S. Matsushita,
A. -K. Baczko,
N. La Bella,
C. -K. Chan,
G. B. Crew,
V. Fish,
N. Patel,
V. Ramakrishnan,
H. Rottmann,
J. Wagner,
K. Wiik,
P. Friberg,
C. Goddi,
S. Issaoun,
G. Keating,
J. Kim,
T. P. Krichbaum
, et al. (7 additional authors not shown)
Abstract:
The Event Horizon Telescope (EHT) observations carried out in 2018 April at 1.3 mm wavelengths included 9 stations in the array, comprising 7 single-dish telescopes and 2 phased arrays. The metadata package for the 2018 EHT observing campaign contains calibration tables required for the a-priori amplitude calibration of the 2018 April visibility data. This memo is the official documentation accompanying the release of the 2018 EHT metadata package, providing an overview of the contents of the package. We describe how telescope sensitivities, gain curves and other relevant parameters for each station in the EHT array were collected, processed, and validated to produce the calibration tables.
Submitted 6 December, 2023;
originally announced December 2023.
-
Absolute Flux Density Calibration of the Greenland Telescope Data for Event Horizon Telescope Observations
Authors:
J. Y. Koay,
K. Asada,
S. Matsushita,
C. -Y. Kuo,
C. -W. L. Huang,
C. Romero-Cañizales,
S. Koyama,
J. Park,
W. -P. Lo,
G. Bower,
M. -T. Chen,
S. -H. Chang,
C. -C. Chen,
R. Chilson,
C. C. Han,
P. T. P. Ho,
Y. -D. Huang,
M. Inoue,
B. Jeter,
H. Jiang,
P. M. Koch,
D. Kubo,
C. -T. Li,
C. -T. Liu,
K. -Y. Liu
, et al. (13 additional authors not shown)
Abstract:
Starting from the observing campaign in April 2018, the Greenland Telescope (GLT) has been added as a new station of the Event Horizon Telescope (EHT) array. Visibilities on baselines to the GLT, particularly in the North-South direction, potentially provide valuable new constraints for the modeling and imaging of sources such as M87*. The GLT's location at high Northern latitudes adds unique challenges to its calibration strategies. Additionally, the performance of the GLT was not optimal during the 2018 observations due to it being only partially commissioned at the time. This document describes the steps taken to estimate the various parameters (and their uncertainties) required for the absolute flux calibration of the GLT data as part of the EHT. In particular, we consider the non-optimized status of the GLT in 2018, as well as its improved performance during the 2021 EHT campaign.
Submitted 5 December, 2023;
originally announced December 2023.
-
Bridging the Gap: Addressing Discrepancies in Diffusion Model Training for Classifier-Free Guidance
Authors:
Niket Patel,
Luis Salamanca,
Luis Barba
Abstract:
Diffusion models have emerged as a pivotal advancement in generative models, setting new standards for the quality of the generated instances. In the current paper we aim to underscore a discrepancy between conventional training methods and the desired conditional sampling behavior of these models. While the prevalent classifier-free guidance technique works well, it's not without flaws. At higher values for the guidance scale parameter $w$, we often get out-of-distribution samples and mode collapse, whereas at lower values for $w$ we may not get the desired specificity. To address these challenges, we introduce an updated loss function that better aligns training objectives with sampling behaviors. Experimental validation with FID scores on CIFAR-10 elucidates our method's ability to produce higher quality samples with fewer sampling timesteps, and to be more robust to the choice of guidance scale $w$. We also experiment with fine-tuning Stable Diffusion on the proposed loss, to provide early evidence that large diffusion models may also benefit from this refined loss function.
Submitted 1 November, 2023;
originally announced November 2023.
-
AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification
Authors:
Yongxin Huang,
Kexin Wang,
Sourav Dutta,
Raj Nath Patel,
Goran Glavaš,
Iryna Gurevych
Abstract:
Recent work has found that few-shot sentence classification based on pre-trained Sentence Encoders (SEs) is efficient, robust, and effective. In this work, we investigate strategies for domain-specialization in the context of few-shot sentence classification with SEs. We first establish that unsupervised Domain-Adaptive Pre-Training (DAPT) of a base Pre-trained Language Model (PLM) (i.e., not an SE) substantially improves the accuracy of few-shot sentence classification by up to 8.4 points. However, applying DAPT on SEs, on the one hand, disrupts the effects of their (general-domain) Sentence Embedding Pre-Training (SEPT). On the other hand, applying general-domain SEPT on top of a domain-adapted base PLM (i.e., after DAPT) is effective but inefficient, since the computationally expensive SEPT needs to be executed on top of a DAPT-ed PLM of each domain. As a solution, we propose AdaSent, which decouples SEPT from DAPT by training a SEPT adapter on the base PLM. The adapter can be inserted into DAPT-ed PLMs from any domain. We demonstrate AdaSent's effectiveness in extensive experiments on 17 different few-shot sentence classification datasets. AdaSent matches or surpasses the performance of full SEPT on DAPT-ed PLM, while substantially reducing the training costs. The code for AdaSent is available.
Submitted 1 November, 2023;
originally announced November 2023.
-
Towards Quantum Dynamics Simulation of Physical Systems: A Survey
Authors:
Rikteem Bhowmick,
Navaneeth Krishnan Mohan,
Devesh Kumar,
Rohit Chaurasiya,
Nixon Patel
Abstract:
After the emergence of quantum mechanics and the recognition that it is needed for an accurate understanding of physical systems, numerical methods were used to carry out quantum mechanical treatments. With increasing system correlations and size, numerical methods became rather inefficient, and there was a need to simulate quantum mechanical phenomena on actual quantum computing hardware. Now that noisy quantum computing machines have been built and made available, quantum simulation is edging towards practical reality. In this paper, we discuss the progress made in quantum simulation on actual quantum computing hardware and survey some of the fields into which it has expanded. We also review the software tool-sets available to date, which lay the foundation for realising quantum simulations in a much more comprehensive manner.
Submitted 18 October, 2023;
originally announced October 2023.
-
A New Approach Towards Autoformalization
Authors:
Nilay Patel,
Rahul Saha,
Jeffrey Flanigan
Abstract:
Verifying mathematical proofs is difficult, but can be automated with the assistance of a computer. Autoformalization is the task of automatically translating natural language mathematics into a formal language that can be verified by a program. This is a challenging task, especially for higher-level mathematics found in research papers. Research paper mathematics requires large amounts of background and context. In this paper, we propose an avenue towards tackling autoformalization for research-level mathematics, by breaking the task into easier and more approachable subtasks: unlinked formalization (formalization with unlinked definitions and theorems), entity linking (linking to the proper theorems and definitions), and finally adjusting types so it passes the type checker. In addition, we present arXiv2Formal, a benchmark dataset for unlinked formalization consisting of 50 theorems formalized for the Lean theorem prover sampled from papers on arXiv.org. We welcome any contributions from the community to future versions of this dataset.
Submitted 9 July, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Search for heavy neutral leptons in electron-positron and neutral-pion final states with the MicroBooNE detector
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
M. B. Brunetti,
L. Camilleri
, et al. (163 additional authors not shown)
Abstract:
We present the first search for heavy neutral leptons (HNL) decaying into $νe^+e^-$ or $νπ^0$ final states in a liquid-argon time projection chamber using data collected with the MicroBooNE detector. The data were recorded synchronously with the NuMI neutrino beam from Fermilab's Main Injector corresponding to a total exposure of $7.01 \times 10^{20}$ protons on target. We set upper limits at the $90\%$ confidence level on the mixing parameter $\lvert U_{μ4}\rvert^2$ in the mass ranges $10\le m_{\rm HNL}\le 150$ MeV for the $νe^+e^-$ channel and $150\le m_{\rm HNL}\le 245$ MeV for the $νπ^0$ channel, assuming $\lvert U_{e 4}\rvert^2 = \lvert U_{τ4}\rvert^2 = 0$. These limits represent the most stringent constraints in the mass range $35<m_{\rm HNL}<175$ MeV and the first constraints from a direct search for $νπ^0$ decays.
Submitted 12 January, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Measurement of nuclear effects in neutrino-argon interactions using generalized kinematic imbalance variables with the MicroBooNE detector
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
M. B. Brunetti,
L. Camilleri
, et al. (163 additional authors not shown)
Abstract:
We present a set of new generalized kinematic imbalance variables that can be measured in neutrino scattering. These variables extend previous measurements of kinematic imbalance on the transverse plane, and are more sensitive to modeling of nuclear effects. We demonstrate the enhanced power of these variables using simulation, and then use the MicroBooNE detector to measure them for the first time. We report flux-integrated single- and double-differential measurements of charged-current muon neutrino scattering on argon using a topology with one muon and one proton in the final state as a function of these novel kinematic imbalance variables. These measurements allow us to demonstrate that the treatment of charged current quasielastic interactions in GENIE version 2 is inadequate to describe data. Further, they reveal tensions with more modern generator predictions particularly in regions of phase space where final state interactions are important.
Submitted 16 May, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Exploring Embeddings for Measuring Text Relatedness: Unveiling Sentiments and Relationships in Online Comments
Authors:
Anthony Olakangil,
Cindy Wang,
Justin Nguyen,
Qunbo Zhou,
Kaavya Jethwa,
Jason Li,
Aryan Narendra,
Nishk Patel,
Arjun Rajaram
Abstract:
After the COVID-19 pandemic caused internet usage to grow by 70%, there has been an increased number of people all across the world using social media. Applications like Twitter, Meta Threads, YouTube, and Reddit have become increasingly pervasive, leaving almost no digital space where public opinion is not expressed. This paper investigates sentiment and semantic relationships among comments across various social media platforms, and discusses the importance of shared opinions across these different media platforms, using word embeddings to analyze components in sentences and documents. It allows researchers, politicians, and business representatives to trace a path of shared sentiment among users across the world. This research paper presents multiple approaches that measure the relatedness of text extracted from user comments on these popular online platforms. By leveraging embeddings, which capture semantic relationships between words and help analyze sentiments across the web, we can uncover connections regarding public opinion as a whole. The study utilizes pre-existing datasets from YouTube, Reddit, Twitter, and more. We made use of popular natural language processing models like Bidirectional Encoder Representations from Transformers (BERT) to analyze sentiments and explore relationships between comment embeddings. Additionally, we aim to utilize clustering and KL-divergence to find semantic relationships within these comment embeddings across various social media platforms. Our analysis will enable a deeper understanding of the interconnectedness of online comments and will investigate the notion of the internet functioning as a large interconnected brain.
Submitted 30 October, 2023; v1 submitted 15 September, 2023;
originally announced October 2023.
-
Limitations of Stochastic Selection with Pairwise Independent Priors
Authors:
Shaddin Dughmi,
Yusuf Hakan Kalayci,
Neel Patel
Abstract:
Motivated by the growing interest in correlation-robust stochastic optimization, we investigate stochastic selection problems beyond independence. Specifically, we consider the instructive case of pairwise-independent priors and matroid constraints. We obtain essentially-optimal bounds for contention resolution and prophet inequalities. The impetus for our work comes from the recent work of Caragiannis et al., who derived a constant-approximation for the single-choice prophet inequality with pairwise-independent priors.
For general matroids, our results are tight and largely negative. For both contention resolution and prophet inequalities, our impossibility results hold for the full linear matroid over a finite field. We explicitly construct pairwise-independent distributions which rule out an omega(1/Rank)-balanced offline CRS and an omega(1/log Rank)-competitive prophet inequality against the (usual) oblivious adversary. For both results, we employ a generic approach for constructing pairwise-independent random vectors -- one which unifies and generalizes existing pairwise-independence constructions from the literature on universal hash functions and pseudorandomness. Specifically, our approach is based on our observation that random linear maps turn linear independence into stochastic independence.
We then examine the class of matroids which satisfy the so-called partition property -- these include most common matroids encountered in optimization. We obtain positive results for both online contention resolution and prophet inequalities with pairwise-independent priors on such matroids, approximately matching the corresponding guarantees for fully independent priors. These algorithmic results hold against the almighty adversary for both problems.
Submitted 17 March, 2024; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Room Temperature Dynamics of an Optically Addressable Single Spin in Hexagonal Boron Nitride
Authors:
Raj N. Patel,
Rebecca E. K. Fishman,
Tzu-Yung Huang,
Jordan A. Gusdorff,
David A. Fehr,
David A. Hopper,
S. Alex Breitweiser,
Benjamin Porat,
Michael E. Flatté,
Lee C. Bassett
Abstract:
Hexagonal boron nitride (h-BN) hosts pure single-photon emitters that have shown evidence of optically detected electronic spin dynamics. However, the electrical and chemical structure of these optically addressable spins is unknown, and the nature of their spin-optical interactions remains mysterious. Here, we use time-domain optical and microwave experiments to characterize a single emitter in h-BN exhibiting room temperature optically detected magnetic resonance. Using dynamical simulations, we constrain and quantify transition rates in the model, and we design optical control protocols that optimize the signal-to-noise ratio for spin readout. This constitutes a necessary step towards quantum control of spin states in h-BN.
Submitted 8 November, 2023; v1 submitted 11 September, 2023;
originally announced September 2023.
-
Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?
Authors:
Ayushi Agarwal,
Nisarg Patel,
Neeraj Varshney,
Mihir Parmar,
Pavan Mallina,
Aryan Bhavin Shah,
Srihari Raju Sangaraju,
Tirth Patel,
Nihar Thakkar,
Chitta Baral
Abstract:
Though state-of-the-art (SOTA) NLP systems have achieved remarkable performance on a variety of language understanding tasks, they primarily focus on questions that have a correct and a definitive answer. However, in real-world applications, users often ask questions that don't have a definitive answer. Incorrectly answering such questions certainly hampers a system's reliability and trustworthiness. Can SOTA models accurately identify such questions and provide a reasonable response?
To investigate the above question, we introduce QnotA, a dataset consisting of five different categories of questions that don't have definitive answers. Furthermore, for each QnotA instance, we also provide a corresponding QA instance, i.e., an alternate question that "can be" answered. With this data, we formulate three evaluation tasks that test a system's ability to 'identify', 'distinguish', and 'justify' QnotA questions. Through comprehensive experiments, we show that even SOTA models including GPT-3 and Flan T5 do not fare well on these tasks and lag considerably behind the human performance baseline. We conduct a thorough analysis which further leads to several interesting findings. Overall, we believe our work and findings will encourage and facilitate further research in this important area and help develop more robust models.
Submitted 8 September, 2023;
originally announced September 2023.
-
Financial Fraud Detection using Quantum Graph Neural Networks
Authors:
Nouhaila Innan,
Abhishek Sawaika,
Ashim Dhor,
Siddhant Dutta,
Sairupa Thota,
Husayn Gokal,
Nandan Patel,
Muhammad Al-Zafar Khan,
Ioannis Theodonis,
Mohamed Bennai
Abstract:
Financial fraud detection is essential for preventing significant financial losses and maintaining the reputation of financial institutions. However, conventional methods of detecting financial fraud have limited effectiveness, necessitating new approaches to improve detection rates. In this paper, we propose a novel approach for detecting financial fraud using Quantum Graph Neural Networks (QGNNs). QGNNs are a type of neural network that can process graph-structured data and leverage the power of Quantum Computing (QC) to perform computations more efficiently than classical neural networks. Our approach uses Variational Quantum Circuits (VQC) to enhance the performance of the QGNN. In order to evaluate the efficiency of our proposed method, we compared the performance of QGNNs to Classical Graph Neural Networks using a real-world financial fraud detection dataset. The results of our experiments showed that QGNNs achieved an AUC of $0.85$, which outperformed classical GNNs. Our research highlights the potential of QGNNs and suggests that QGNNs are a promising new approach for improving financial fraud detection.
Submitted 3 September, 2023;
originally announced September 2023.
-
A search for pulsars around Sgr A* in the first Event Horizon Telescope dataset
Authors:
Pablo Torne,
Kuo Liu,
Ralph P. Eatough,
Jompoj Wongphechauxsorn,
James M. Cordes,
Gregory Desvignes,
Mariafelicia De Laurentis,
Michael Kramer,
Scott M. Ransom,
Shami Chatterjee,
Robert Wharton,
Ramesh Karuppusamy,
Lindy Blackburn,
Michael Janssen,
Chi-kwan Chan,
Geoffrey B. Crew,
Lynn D. Matthews,
Ciriaco Goddi,
Helge Rottmann,
Jan Wagner,
Salvador Sanchez,
Ignacio Ruiz,
Federico Abbate,
Geoffrey C. Bower,
Juan J. Salamanca
, et al. (261 additional authors not shown)
Abstract:
The Event Horizon Telescope (EHT) observed in 2017 the supermassive black hole at the center of the Milky Way, Sagittarius A* (Sgr A*), at a frequency of 228.1 GHz ($λ$=1.3 mm). The fundamental physics tests that even a single pulsar orbiting Sgr A* would enable motivate searching for pulsars in EHT datasets. The high observing frequency means that pulsars - which typically exhibit steep emission spectra - are expected to be very faint. However, it also negates pulse scattering, an effect that could hinder pulsar detections in the Galactic Center. Additionally, magnetars or a secondary inverse Compton emission could be stronger at millimeter wavelengths than at lower frequencies. We present a search for pulsars close to Sgr A* using the data from the three most-sensitive stations in the EHT 2017 campaign: the Atacama Large Millimeter/submillimeter Array, the Large Millimeter Telescope and the IRAM 30 m Telescope. We apply three detection methods based on Fourier-domain analysis, the Fast-Folding-Algorithm and single pulse search targeting both pulsars and burst-like transient emission; using the simultaneity of the observations to confirm potential candidates. No new pulsars or significant bursts were found. Being the first pulsar search ever carried out at such high radio frequencies, we detail our analysis methods and give a detailed estimation of the sensitivity of the search. We conclude that the EHT 2017 observations are only sensitive to a small fraction ($\lesssim$2.2%) of the pulsars that may exist close to Sgr A*, motivating further searches for fainter pulsars in the region.
Submitted 29 August, 2023;
originally announced August 2023.
-
On Supermodular Contracts and Dense Subgraphs
Authors:
Ramiro Deo-Campo Vuong,
Shaddin Dughmi,
Neel Patel,
Aditya Prasad
Abstract:
We study the combinatorial contract design problem, introduced and studied by Dutting et al. (2021, 2022), in both the single and multi-agent settings. Prior work has examined the problem when the principal's utility function is submodular in the actions chosen by the agent(s).
We complement this emerging literature with an examination of the problem when the principal's utility is supermodular.
In the single-agent setting, we obtain a strongly polynomial time algorithm for the optimal contract.
This stands in contrast to the NP-hardness of the problem with submodular principal utility due to Dutting et al. (2021).
This result has two technical components, the first of which applies beyond supermodular or submodular utilities.
This result strengthens and simplifies analogous enumeration algorithms from Dutting et al. (2021), and applies to any nondecreasing valuation function for the principal.
Second, we show that supermodular valuations lead to a polynomial number of breakpoints, analogous to a similar result by Dutting et al. (2021) for gross substitutes valuations.
In the multi-agent setting, we obtain a mixed bag of positive and negative results.
First, we show that it is NP-hard to obtain any finite multiplicative approximation, or an additive FPTAS.
This stands in contrast to the submodular case, where efficient computation of approximately optimal contracts was shown by Dutting et al. (2022).
Second, we derive an additive PTAS for the problem in the instructive special case of graph-based supermodular valuations, and equal costs.
En-route to this result, we discover an intimate connection between the multi-agent contract problem and the notorious k-densest subgraph problem.
We build on and combine techniques from the literature on dense subgraph problems to obtain our additive PTAS.
Submitted 14 August, 2023;
originally announced August 2023.
-
First application of a liquid argon time projection chamber for the search for intranuclear neutron-antineutron transitions and annihilation in $^{40}$Ar using the MicroBooNE detector
Authors:
MicroBooNE collaboration,
P. Abratenko,
O. Alterkait,
D. Andrade Aldana,
L. Arellano,
J. Asaadi,
A. Ashkenazi,
S. Balasubramanian,
B. Baller,
G. Barr,
D. Barrow,
J. Barrow,
V. Basque,
O. Benevides Rodrigues,
S. Berkman,
A. Bhanderi,
A. Bhat,
M. Bhattacharya,
M. Bishai,
A. Blake,
B. Bogart,
T. Bolton,
J. Y. Book,
L. Camilleri,
Y. Cao
, et al. (164 additional authors not shown)
Abstract:
We present a novel methodology to search for intranuclear neutron-antineutron transition ($n\rightarrow\bar{n}$) followed by $\bar{n}$-nucleon annihilation within an $^{40}$Ar nucleus, using the MicroBooNE liquid argon time projection chamber (LArTPC) detector. A discovery of $n\rightarrow\bar{n}$ transition or a new best limit on the lifetime of this process would either constitute physics beyond the Standard Model or greatly constrain theories of baryogenesis, respectively. The approach presented in this paper makes use of deep learning methods to select $n\rightarrow\bar{n}$ events based on their unique features and differentiate them from cosmogenic backgrounds. The achieved signal and background efficiencies are (70.22$\pm$6.04)\% and (0.0020$\pm$0.0003)\%, respectively. A demonstration of a search is performed with a data set corresponding to an exposure of $3.32 \times10^{26}\,$neutron-years, and where the background rate is constrained through direct measurement, assuming the presence of a negligible signal. With this approach, no excess of events over the background prediction is observed, setting a demonstrative lower bound on the $n\rightarrow\bar{n}$ lifetime in $^{40}$Ar of $\tau_{\textrm{m}} \gtrsim 1.1\times10^{26}\,$years, and on the free $n\rightarrow\bar{n}$ transition time of $\tau_{n\rightarrow\bar{n}} \gtrsim 2.6\times10^{5}\,$s, each at the $90\%$ confidence level. This analysis represents a first-ever proof-of-principle demonstration of the ability to search for this rare process in LArTPCs with high efficiency and low background.
Submitted 27 June, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
Dimensionality Reduction for Improving Out-of-Distribution Detection in Medical Image Segmentation
Authors:
McKell Woodland,
Nihil Patel,
Mais Al Taie,
Joshua P. Yung,
Tucker J. Netherton,
Ankit B. Patel,
Kristy K. Brock
Abstract:
Clinically deployed segmentation models are known to fail on data outside of their training distribution. As these models perform well on most cases, it is imperative to detect out-of-distribution (OOD) images at inference to protect against automation bias. This work applies the Mahalanobis distance post hoc to the bottleneck features of a Swin UNETR model that segments the liver on T1-weighted magnetic resonance imaging. By reducing the dimensions of the bottleneck features with principal component analysis, OOD images were detected with high performance and minimal computational load.
Submitted 19 October, 2023; v1 submitted 7 August, 2023;
originally announced August 2023.
-
The Greenland Telescope: Construction, Commissioning, and Operations in Pituffik
Authors:
Ming-Tang Chen,
Keiichi Asada,
Satoki Matsushita,
Philippe Raffin,
Makoto Inoue,
Paul T. P. Ho,
Chih-Chiang Han,
Derek Kubo,
Timothy Norton,
Nimesh A. Patel,
George Nystrom,
Chih-Wei L. Huang,
Pierre Martin-Cocher,
Jun Yi Koay,
Cristina Romero-Cañizales,
Ching-Tang Liu,
Teddy Huang,
Kuan-Yu Liu,
Tashun Wei,
Shu-Hao Chang,
Ryan Chilson,
Peter Oshiro,
Homin Jiang,
Chao-Te Li,
Geoffrey Bower
, et al. (29 additional authors not shown)
Abstract:
In 2018, the Greenland Telescope (GLT) started scientific observations in Greenland. Since then, we have completed several significant improvements and added new capabilities to the telescope system. This paper presents a full review of the GLT system, a summary of our observation activities since 2018, the lessons learned from operations in the Arctic, and the future prospects of the telescope.
Submitted 19 July, 2023;
originally announced July 2023.
-
Demonstrating a long-coherence dual-rail erasure qubit using tunable transmons
Authors:
Harry Levine,
Arbel Haim,
Jimmy S. C. Hung,
Nasser Alidoust,
Mahmoud Kalaee,
Laura DeLorenzo,
E. Alex Wollack,
Patricio Arrangoiz-Arriola,
Amirhossein Khalajhedayati,
Rohan Sanil,
Hesam Moradinejad,
Yotam Vaknin,
Aleksander Kubica,
David Hover,
Shahriar Aghaeimeibodi,
Joshua Ari Alcid,
Christopher Baek,
James Barnett,
Kaustubh Bawdekar,
Przemyslaw Bienias,
Hugh Carson,
Cliff Chen,
Li Chen,
Harut Chinkezian,
Eric M. Chisholm
, et al. (88 additional authors not shown)
Abstract:
Quantum error correction with erasure qubits promises significant advantages over standard error correction due to favorable thresholds for erasure errors. Realizing this advantage in practice requires a qubit for which nearly all errors are such erasure errors, and the ability to check for erasure errors without dephasing the qubit. We demonstrate that a "dual-rail qubit" consisting of a pair of resonantly coupled transmons can form a highly coherent erasure qubit, where transmon $T_1$ errors are converted into erasure errors and residual dephasing is strongly suppressed, leading to millisecond-scale coherence within the qubit subspace. We show that single-qubit gates are limited primarily by erasure errors, with erasure probability $p_\text{erasure} = 2.19(2)\times 10^{-3}$ per gate while the residual errors are $\sim 40$ times lower. We further demonstrate mid-circuit detection of erasure errors while introducing $< 0.1\%$ dephasing error per check. Finally, we show that the suppression of transmon noise allows this dual-rail qubit to preserve high coherence over a broad tunable operating range, offering an improved capacity to avoid frequency collisions. This work establishes transmon-based dual-rail qubits as an attractive building block for hardware-efficient quantum error correction.
Submitted 20 March, 2024; v1 submitted 17 July, 2023;
originally announced July 2023.