-
Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR
Authors:
Abhishek Gupta,
Amruta Parulekar,
Sameep Chattopadhyay,
Preethi Jyothi
Abstract:
Automatic speech recognition (ASR) for low-resource languages remains a challenge due to the scarcity of labeled training data. Parameter-efficient fine-tuning and text-only adaptation are two popular methods that have been used to address such low-resource settings. In this work, we investigate how these techniques can be effectively combined using a multilingual multimodal model like SeamlessM4T. Multimodal models are able to leverage unlabeled text via text-only adaptation with further parameter-efficient ASR fine-tuning, thus boosting ASR performance. We also show cross-lingual transfer from a high-resource language, achieving up to a relative 17% WER reduction over a baseline in a zero-shot setting without any labeled speech.
Submitted 17 October, 2024;
originally announced October 2024.
-
REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models
Authors:
Ambuje Gupta,
Mrinal Rawat,
Andreas Stolcke,
Roberto Pieraccini
Abstract:
Retrieval augmented generation (RAG) pipelines are commonly used in tasks such as question-answering (QA), relying on retrieving relevant documents from a vector store computed using a pretrained embedding model. However, if the retrieved context is inaccurate, the answers generated using the large language model (LLM) may contain errors or hallucinations. Although pretrained embedding models have advanced, adapting them to new domains remains challenging. Fine-tuning is a potential solution, but industry settings often lack the necessary fine-tuning data. To address these challenges, we propose REFINE, a novel technique that generates synthetic data from available documents and then uses a model fusion approach to fine-tune embeddings for improved retrieval performance in new domains, while preserving out-of-domain capability. We conducted experiments on two public datasets, SQUAD and RAG-12000, and a proprietary TOURISM dataset. Results demonstrate that even standard fine-tuning with the proposed data augmentation technique outperforms the vanilla pretrained model. Furthermore, when combined with model fusion, the proposed approach achieves superior performance, with a 5.76% improvement in recall on the TOURISM dataset, and 6.58% and 0.32% enhancement on SQUAD and RAG-12000, respectively.
Submitted 16 October, 2024;
originally announced October 2024.
-
DAXA: Traversing the X-ray desert by Democratising Archival X-ray Astronomy
Authors:
David J. Turner,
Jessica E. Pilling,
Megan Donahue,
Paul A. Giles,
Kathy Romer,
Agrim Gupta,
Toby Wallage,
Ray Wang
Abstract:
We introduce a new, open-source, Python module for the acquisition and processing of archival data from many X-ray telescopes - Democratising Archival X-ray Astronomy (hereafter referred to as DAXA). Our software is built to increase access to, and use of, large archives of X-ray astronomy data; providing a unified, easy-to-use, Python interface to the disparate archives and processing tools. We provide this interface for the majority of X-ray telescopes launched within the last 30 years. This module enables much greater access to X-ray data for non-specialists, while preserving low-level control of processing for X-ray experts. It is useful for identifying relevant observations of a single object of interest but it excels at creating multi-mission datasets for serendipitous or targeted studies of large samples of X-ray emitting objects. The management and organization of datasets is also made easier; DAXA archives can be version controlled and updated if new data become available. Once relevant observations are identified, the raw data can be downloaded (and optionally processed) through DAXA, or pre-processed event lists, images, and exposure maps can be downloaded if they are available. X-ray observations are perfectly suited to serendipitous discoveries and archival analyses, and with a decade-long 'X-ray desert' potentially on the horizon, archival data will take on even greater importance; enhanced access to those archives will be vital to the continuation of X-ray astronomy.
Submitted 15 October, 2024;
originally announced October 2024.
-
Unbiased estimation of second-order parameter sensitivities for stochastic reaction networks
Authors:
Quentin Badolle,
Ankit Gupta,
Mustafa Khammash
Abstract:
Stochastic models for chemical reaction networks are increasingly popular in systems and synthetic biology. These models formulate the reaction dynamics as Continuous-Time Markov Chains (CTMCs) whose propensities are parameterized by a vector $θ$ and parameter sensitivities are introduced as derivatives of their expected outputs with respect to components of the parameter vector. Sensitivities characterise key properties of the output like robustness and are also at the heart of numerically efficient optimisation routines like Newton-type algorithms used in parameter inference and the design of control mechanisms. Currently the only unbiased estimator for second-order sensitivities is based on the Girsanov transform and it often suffers from high estimator variance. We develop a novel estimator for second-order sensitivities by first rigorously deriving an integral representation of these sensitivities. We call the resulting method the Double Bernoulli Path Algorithm and illustrate its efficiency through numerical examples.
Submitted 15 October, 2024;
originally announced October 2024.
-
Code-Mixer Ya Nahi: Novel Approaches to Measuring Multilingual LLMs' Code-Mixing Capabilities
Authors:
Ayushman Gupta,
Akhil Bhogal,
Kripabandhu Ghosh
Abstract:
Multilingual Large Language Models (LLMs) have demonstrated exceptional performance in Machine Translation (MT) tasks. However, their MT abilities in the context of code-switching (the practice of mixing two or more languages in an utterance) remain under-explored. In this paper, we introduce Rule-Based Prompting, a novel prompting technique to generate code-mixed sentences. We measure and compare the code-mixed MT abilities of 3 popular multilingual LLMs: GPT-3.5-turbo, GPT-4, and Gemini Pro across five language pairs: English-{Hindi, Bengali, Gujarati, French, Spanish} using $k$-shot prompting ($k\in\{0, 1, 10, 20\}$) and Rule-Based Prompting. Our findings suggest that though $k$-shot prompting often leads to the best results, Rule-Based prompting shows promise in generating unique code-mixed sentences that vary in their style of code-mixing. We also use $k$-shot prompting to gauge the code-mixed to English translation abilities of multilingual LLMs. For this purpose, we create a gold-standard code-mixed dataset spanning five language pairs: English-{Hindi, Bengali, Gujarati, French, Spanish}. As a real-world application of our work, we create a code-mixed chatbot.
Submitted 14 October, 2024;
originally announced October 2024.
-
Multilingual Controlled Generation And Gold-Standard-Agnostic Evaluation of Code-Mixed Sentences
Authors:
Ayushman Gupta,
Akhil Bhogal,
Kripabandhu Ghosh
Abstract:
Code-mixing, the practice of alternating between two or more languages in an utterance, is a common phenomenon in multilingual communities. Due to the colloquial nature of code-mixing, there is no singular correct way to translate an English sentence into a code-mixed sentence. For this reason, standard n-gram-based MT evaluation metrics such as the BLEU score are not appropriate for code-mixed evaluation. To demonstrate this, we propose a novel method for code-mixed text generation: Controlled Generation, which parameterizes the code-mixing degree (CMD) and enables the generation of multiple semantically equivalent code-mixed sentences from a given English sentence. We introduce a robust new evaluation metric: GAME: A Gold-Standard Agnostic Measure for Evaluation of Code-Mixed Sentences. GAME is both language-agnostic and gold-standard-agnostic, i.e. unlike other metrics, GAME does not require gold-standard code-mixed sentences for evaluation, thus eliminating the need for human annotators in the code-mixed evaluation process. When used to evaluate semantically equivalent code-mixed sentences, we find that GAME scores have a lower standard deviation than BLEU scores. Further, we create and release a dataset containing gold-standard code-mixed sentences across 4 language pairs: English-{Hindi, Bengali, French, Spanish} to encourage more computational research on code-mixing.
Submitted 14 October, 2024;
originally announced October 2024.
-
European Option Pricing in Regime Switching Framework via Physics-Informed Residual Learning
Authors:
Naman Krishna Pande,
Puneet Pasricha,
Arun Kumar,
Arvind Kumar Gupta
Abstract:
In this article, we employ physics-informed residual learning (PIRL) and propose a pricing method for European options under a regime-switching framework, where closed-form solutions are not available. We demonstrate that the proposed approach serves as an efficient alternative to competing pricing techniques for regime-switching models in the literature. Specifically, we demonstrate that PIRLs eliminate the need for retraining and become nearly instantaneous once trained, thus offering an efficient and flexible tool for pricing options across a broad range of specifications and parameters.
Submitted 14 October, 2024;
originally announced October 2024.
-
A search using GEO600 for gravitational waves coincident with fast radio bursts from SGR 1935+2154
Authors:
The LIGO Scientific Collaboration,
the Virgo Collaboration,
the KAGRA Collaboration,
A. G. Abac,
R. Abbott,
I. Abouelfettouh,
F. Acernese,
K. Ackley,
S. Adhicary,
N. Adhikari,
R. X. Adhikari,
V. K. Adkins,
D. Agarwal,
M. Agathos,
M. Aghaei Abchouyeh,
O. D. Aguiar,
I. Aguilar,
L. Aiello,
A. Ain,
P. Ajith,
T. Akutsu,
S. Albanesi,
R. A. Alfaidi,
A. Al-Jodah,
C. Alléné
, et al. (1758 additional authors not shown)
Abstract:
The magnetar SGR 1935+2154 is the only known Galactic source of fast radio bursts (FRBs). FRBs from SGR 1935+2154 were first detected by CHIME/FRB and STARE2 in 2020 April, after the conclusion of the LIGO, Virgo, and KAGRA Collaborations' O3 observing run. Here we analyze four periods of gravitational wave (GW) data from the GEO600 detector coincident with four periods of FRB activity detected by CHIME/FRB, as well as X-ray glitches and X-ray bursts detected by NICER and NuSTAR close to the time of one of the FRBs. We do not detect any significant GW emission from any of the events. Instead, using a short-duration GW search (for bursts $\leq$ 1 s) we derive 50% (90%) upper limits of $10^{48}$ ($10^{49}$) erg for GWs at 300 Hz and $10^{49}$ ($10^{50}$) erg at 2 kHz, and constrain the GW-to-radio energy ratio to $\leq 10^{14} - 10^{16}$. We also derive upper limits from a long-duration search for bursts with durations between 1 and 10 s. These represent the strictest upper limits on concurrent GW emission from FRBs.
Submitted 11 October, 2024;
originally announced October 2024.
-
BD+44 493: Chemo-Dynamical Analysis and Constraints on Companion Planetary Masses from WIYN/NEID Spectroscopy
Authors:
Vinicius M. Placco,
Arvind F. Gupta,
Felipe Almeida-Fernandes,
Sarah E. Logsdon,
Jayadev Rajagopal,
Erika M. Holmbeck,
Ian U. Roederer,
John Della Costa,
Pipa Fernandez,
Eli Golub,
Jesus Higuera,
Yatrik Patel,
Susan Ridgway,
Heidi Schweiker
Abstract:
In this work, we present high-resolution (R~100,000), high signal-to-noise (S/N~800) spectroscopic observations for the well-known, bright, extremely metal-poor, carbon-enhanced star BD+44 493. We determined chemical abundances and upper limits for 17 elements from WIYN/NEID data, complemented with 11 abundances re-determined from Subaru and Hubble data, using the new, more accurate, stellar atmospheric parameters calculated in this work. Our analysis suggests that BD+44 493 is a low-mass (0.83Msun) old (12.1-13.2Gyr) second-generation star likely formed from a gas cloud enriched by a single metal-free 20.5Msun Population III star in the early Universe. With a disk-like orbit, BD+44 493 does not appear to be associated with any major merger event in the early history of the Milky Way. From the precision radial-velocity NEID measurements (median absolute deviation - MAD=16m/s), we were able to constrain companion planetary masses around BD+44 493 and rule out the presence of planets as small as msin(i)=2MJ out to periods of 100 days. This study opens a new avenue of exploration for the intersection between stellar archaeology and exoplanet science using NEID.
Submitted 11 October, 2024;
originally announced October 2024.
-
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
Authors:
Cheol Jun Cho,
Nicholas Lee,
Akshat Gupta,
Dhruv Agarwal,
Ethan Chen,
Alan W Black,
Gopala K. Anumanchipalli
Abstract:
Syllables are compositional units of spoken language that play a crucial role in human speech perception and production. However, current neural speech representations lack structure, resulting in dense token sequences that are costly to process. To bridge this gap, we propose a new model, Sylber, that produces speech representations with clean and robust syllabic structure. Specifically, we propose a self-supervised model that regresses features on syllabic segments distilled from a teacher model which is an exponential moving average of the model in training. This results in a highly structured representation of speech features, offering three key benefits: 1) a fast, linear-time syllable segmentation algorithm, 2) efficient syllabic tokenization with an average of 4.27 tokens per second, and 3) syllabic units better suited for lexical and syntactic understanding. We also train token-to-speech generative models with our syllabic units and show that fully intelligible speech can be reconstructed from these tokens. Lastly, we observe that categorical perception, a linguistic phenomenon of speech perception, emerges naturally in our model, making the embedding space more categorical and sparse than previous self-supervised learning approaches. Together, we present a novel self-supervised approach for representing speech as syllables, with significant potential for efficient speech tokenization and spoken language modeling.
Submitted 9 October, 2024;
originally announced October 2024.
-
Emission-Line Ratios and Ionization Conditions of CEERS Star-Forming Galaxies with JWST/NIRSpec
Authors:
Ansh R. Gupta,
Allison Kirkpatrick,
Vital Fernandez,
Pablo Arrabal Haro,
Bren E. Backhaus,
Nikko J. Cleri,
Norman A. Grogin,
Anton M. Koekemoer
Abstract:
Galaxy emission-line fluxes can be analyzed to determine star formation rates (SFR) and ISM ionization. Here, we investigate rest-frame optical emission lines of 71 star-forming galaxies at redshift 0.7 < z < 7 from the Cosmic Evolution Early Release Science (CEERS) survey using JWST/NIRSpec. We use H$α$ line fluxes to measure SFRs. We combine these with HST CANDELS stellar mass estimates to determine the redshift evolution of specific SFR (sSFR) and compare our sample with the star-forming galaxy main sequence. We create [O III]$λ$5008/H$β$ versus [Ne III]$λ$3870/[O II]$λ$3728 line ratio diagrams and correlate these ratios with sSFR and the distance of each galaxy from the main sequence (excess sSFR). We find a modest correlation between the line ratios and sSFR, which is consistent with previous work analyzing similar samples. However, we find a weak correlation between the line ratios and excess sSFR. Taken together, our results suggest that sSFR is the parameter that governs ionization conditions rather than SFR or a galaxy's distance from the main sequence. These measurements reveal a rich diversity of ISM conditions and physical galaxy properties throughout cosmic time.
Submitted 3 October, 2024;
originally announced October 2024.
-
Asteroseismology of the mild Am $δ$ Sct star HD 118660: TESS photometry and modelling
Authors:
Mrinmoy Sarkar,
Santosh Joshi,
Marc-Antoine Dupret,
Otto Trust,
Peter De Cat,
Eugene Semenko,
Patricia Lampens,
Aruna Goswami,
David Mkrtichian,
Drisya Karinkuzhi,
Ilya Yakunin,
Archana Gupta
Abstract:
We present the results of an asteroseismic study of HD 118660 (TIC 171729860), a chemically peculiar (mild Am) star exhibiting $δ$ Scuti ($δ$ Sct) pulsations. It is based on the analysis of two sectors of time-series photometry from the space mission TESS and seismic modelling. It yielded the detection of 15 and 16 frequencies for TESS sectors 23 and 50, respectively. The identified pulsation modes include four radial ($\ell=0$) and five dipolar ($\ell=1$) ones. The radial modes are overtones with order $n$ ranging from $3$ to $6$. Such high values of $n$ are theoretically not expected for stars with the effective temperature of HD 118660 ($\rm T_{\rm eff}\approx 7550 \rm K$) located near the red edge of the $δ$ Sct instability strip. To estimate the asteroseismic parameters, we have generated a grid of stellar models assuming a solar metallicity ($Z=0.014$) and different values for the convective overshooting parameter ($0.1\leq α_{\rm ov}\leq 0.3$). We conclude that the analysis of the radial modes is insufficient to constrain $α_{\rm ov}$ and $Z$ for $δ$ Sct stars. The value for the equatorial velocity of HD 118660 derived from the seismic radius and the rotational frequency is consistent with values found in the literature.
Submitted 3 October, 2024;
originally announced October 2024.
-
FLAG: Financial Long Document Classification via AMR-based GNN
Authors:
Bolun "Namir" Xia,
Mohammed J. Zaki,
Aparna Gupta
Abstract:
The advent of large language models (LLMs) has initiated much research into their various financial applications. However, in applying LLMs on long documents, semantic relations are not explicitly incorporated, and a full or arbitrarily sparse attention operation is employed. In recent years, progress has been made in Abstract Meaning Representation (AMR), which is a graph-based representation of text to preserve its semantic relations. Since AMR can represent semantic relationships at a deeper level, it can be beneficially utilized by graph neural networks (GNNs) for constructing effective document-level graph representations built upon LLM embeddings to predict target metrics in the financial domain. We propose FLAG: Financial Long document classification via AMR-based GNN, an AMR graph based framework to generate document-level embeddings for long financial document classification. We construct document-level graphs from sentence-level AMR graphs, endow them with specialized LLM word embeddings in the financial domain, apply a deep learning mechanism that utilizes a GNN, and examine the efficacy of our AMR-based approach in predicting labeled target data from long financial documents. Extensive experiments are conducted on a dataset of quarterly earnings calls transcripts of companies in various sectors of the economy, as well as on a corpus of more recent earnings calls of companies in the S&P 1500 Composite Index. We find that our AMR-based approach outperforms fine-tuning LLMs directly on text in predicting stock price movement trends at different time horizons in both datasets. Our work also outperforms previous work utilizing document graphs and GNNs for text classification.
Submitted 14 October, 2024; v1 submitted 2 October, 2024;
originally announced October 2024.
-
Self-Tuning Spectral Clustering for Speaker Diarization
Authors:
Nikhil Raghav,
Avisek Gupta,
Md Sahidullah,
Swagatam Das
Abstract:
Spectral clustering has proven effective in grouping speech representations for speaker diarization tasks, although post-processing the affinity matrix remains difficult due to the need for careful tuning before constructing the Laplacian. In this study, we present a novel pruning algorithm to create a sparse affinity matrix called spectral clustering on p-neighborhood retained affinity matrix (SC-pNA). Our method improves on node-specific fixed neighbor selection by allowing a variable number of neighbors, eliminating the need for external tuning data as the pruning parameters are derived directly from the affinity matrix. SC-pNA does so by identifying two clusters in every row of the initial affinity matrix, and retains only the top $p\%$ similarity scores from the cluster containing larger similarities. Spectral clustering is performed subsequently, with the number of clusters determined as the maximum eigengap. Experimental results on the challenging DIHARD-III dataset highlight the superiority of SC-pNA, which is also computationally more efficient than existing auto-tuning approaches.
Submitted 16 September, 2024;
originally announced October 2024.
-
Robi Butler: Remote Multimodal Interactions with Household Robot Assistant
Authors:
Anxing Xiao,
Nuwan Janaka,
Tianrun Hu,
Anshul Gupta,
Kaixin Li,
Cunjun Yu,
David Hsu
Abstract:
In this paper, we introduce Robi Butler, a novel household robotic system that enables multimodal interactions with remote users. Building on the advanced communication interfaces, Robi Butler allows users to monitor the robot's status, send text or voice instructions, and select target objects by hand pointing. At the core of our system is a high-level behavior module, powered by Large Language Models (LLMs), that interprets multimodal instructions to generate action plans. These plans are composed of a set of open vocabulary primitives supported by Vision Language Models (VLMs) that handle both text and pointing queries. The integration of the above components allows Robi Butler to ground remote multimodal instructions in the real-world home environment in a zero-shot manner. We demonstrate the effectiveness and efficiency of this system using a variety of daily household tasks that involve remote users giving multimodal instructions. Additionally, we conducted a user study to analyze how multimodal interactions affect efficiency and user experience during remote human-robot interaction and discuss the potential improvements.
Submitted 30 September, 2024;
originally announced September 2024.
-
Spectroscopic Visualization of Hard Quasi-1D Superconductivity Induced in Nanowires Deposited on a Quasi-2D Indium film
Authors:
Ambikesh Gupta,
Pranab Kumar Nag,
Shai Kiriati,
Samuel D. Escribano,
Man Suk Song,
Hadas Shtrikman,
Yuval Oreg,
Nurit Avraham,
Haim Beidenkopf
Abstract:
Following significant progress in the visualization and characterization of hybrid superconducting-semiconducting systems, greatly propelled by reports of Majorana zero modes in nanowire devices, considerable attention has been devoted to investigating the electronic structure at the buried superconducting-semiconducting interface and the nature of the induced superconducting correlations. The properties of that interface and the structure of the electronic wave functions that occupy it determine the functionality and the topological nature of the induced superconducting state. Here, we introduce a novel hybrid platform for proximity-inducing superconductivity in InAs$_{0.6}$Sb$_{0.4}$ nanowires, leveraging a unique architecture and material combination. By dispersing these nanowires over a superconducting Indium film we exploit Indium's high critical temperature of 3.7 K and the anticipated high spin-orbit and Zeeman couplings of InAs$_{0.6}$Sb$_{0.4}$. This design preserves the pristine top facet of the nanowires, making it highly compatible with scanning tunneling spectroscopy. Using this architecture we demonstrate that the mechanical contact supports Cooper-pair transparency as high as 90%, comparable with epitaxial interfaces. The anisotropic angular response to an applied magnetic field shows the quasi-two-dimensional nature of the parent superconductivity in the Indium film and the quasi-one-dimensional nature of the induced superconductivity in the nanowires. Our platform offers robust and advantageous foundations for studying the emergence of topological superconductivity and the interplay of superconductivity and magnetism using atomic-scale spectroscopic tools.
Submitted 29 September, 2024;
originally announced September 2024.
-
Training the Next Generation of Seismologists: Delivering Research-Grade Software Education for Cloud and HPC Computing through Diverse Training Modalities
Authors:
M. Denolle,
C. Tape,
E. Bozdağ,
Y. Wang,
F. Waldhauser,
A. A. Gabriel,
J. Braunmiller,
B. Chow,
L. Ding,
K. F. Feng,
A. Ghosh,
N. Groebner,
A. Gupta,
Z. Krauss,
A. McPherson,
M. Nagaso,
Z. Niu,
Y. Ni,
R. Örsvuran,
G. Pavlis,
F. Rodriguez-Cardozo,
T. Sawi,
N. Schliwa,
D. Schneller,
Q. Shi
, et al. (6 additional authors not shown)
Abstract:
With the rise of data volume and computing power, seismological research requires more advanced skills in data processing, numerical methods, and parallel computing. We present the experience of conducting training workshops over various forms of delivery to support the adoption of large-scale High-Performance Computing and Cloud computing to advance seismological research. The seismological foci were on earthquake source parameter estimation in catalogs, forward and adjoint wavefield simulations in 2 and 3 dimensions at local, regional, and global scales, earthquake dynamics, ambient noise seismology, and machine learning. This contribution describes the series of workshops, the learning outcomes of the participants, and lessons learned by the instructors. Our curriculum was grounded on open and reproducible science, large-scale scientific computing and data mining, and computing infrastructure (access and usage) for HPC and the cloud. We also describe the types of teaching materials that have proven beneficial to the instruction and the sustainability of the program. We propose guidelines to deliver future workshops on these topics.
Submitted 27 September, 2024;
originally announced September 2024.
-
The hypothetical track-length fitting algorithm for energy measurement in liquid argon TPCs
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
N. S. Alex,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
C. Andreopoulos
, et al. (1348 additional authors not shown)
Abstract:
This paper introduces the hypothetical track-length fitting algorithm, a novel method for measuring the kinetic energies of ionizing particles in liquid argon time projection chambers (LArTPCs). The algorithm finds the most probable offset in track length for a track-like object by comparing the measured ionization density as a function of position with a theoretical prediction of the energy loss as a function of the energy, including models of electron recombination and detector response. The algorithm can be used to measure the energies of particles that interact before they stop, such as charged pions that are absorbed by argon nuclei. The algorithm's energy measurement resolutions and fractional biases are presented as functions of particle kinetic energy and number of track hits using samples of stopping secondary charged pions in data collected by the ProtoDUNE-SP detector, and also in a detailed simulation. Additional studies describe the impact of the dE/dx model on energy measurement performance. The method described in this paper to characterize the energy measurement performance can be repeated in any LArTPC experiment using stopping secondary charged pions.
Submitted 1 October, 2024; v1 submitted 26 September, 2024;
originally announced September 2024.
-
Spatially correlated stellar accretion in the Lupus star forming region: Evidence for ongoing infall from the interstellar medium
Authors:
Andrew J. Winter,
Myriam Benisty,
Carlo F. Manara,
Aashish Gupta
Abstract:
Growing evidence suggests that protoplanetary discs may be influenced by late stage infall from the interstellar medium (ISM). The degree to which infall shapes disc populations at ages $\gtrsim 1$~Myr remains unclear. We explore possible spatial correlations between stellar accretion rates in the Lupus star forming region, which would support the hypothesis that infall can regulate stellar accretion. We consider both the `clustered' stars towards the center of Lupus 3, and the `distributed' stars that are more sparsely distributed across the Lupus complex. We take the observed accretion rates in the literature and explore spatial correlations. In particular, we test whether the clustered stars exhibit a radial gradient in normalised accretion rates, and whether the distributed stars have spatially correlated accretion rates. We find statistically significant correlations for both the clustered and distributed samples. The clustered sample exhibits higher accretion rates in the central region, consistent with the expected Bondi-Hoyle-Lyttleton accretion rate. Stars that are spatially closer among the distributed population also exhibit more similar accretion rates. These results cannot be explained by the stellar mass distribution for either sample. Age gradients are disfavoured, though not discounted, because normalised disc dust masses are not spatially correlated across the region. Spatially correlated stellar accretion rates within the Lupus star forming region argue in favour of an environmental influence on stellar accretion, possibly combined with internal processes in the inner disc. Refined age measurements and searches for evidence of infalling material are potential ways to further test this finding.
Submitted 7 October, 2024; v1 submitted 25 September, 2024;
originally announced September 2024.
-
FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression
Authors:
Fazal Mittu,
Yihuan Bu,
Akshat Gupta,
Ashok Devireddy,
Alp Eren Ozdarendeli,
Anant Singh,
Gopala Anumanchipalli
Abstract:
While the language modeling objective has been shown to be deeply connected with compression, it is surprising that modern LLMs are not employed in practical text compression systems. In this paper, we provide an in-depth analysis of neural network and transformer-based compression techniques to answer this question. We compare traditional text compression systems with neural network and LLM-based text compression methods. Although LLM-based systems significantly outperform conventional compression methods, they are highly impractical. Specifically, LLMZip, a recent text compression system using Llama3-8B, requires 9.5 days to compress just 10 MB of text, although with huge improvements in compression ratios. To overcome this, we present FineZip, a novel LLM-based text compression system that combines ideas of online memorization and dynamic context to reduce the compression time immensely. FineZip can compress the above corpus in approximately 4 hours compared to 9.5 days, a 54 times improvement over LLMZip with comparable performance. FineZip outperforms traditional algorithmic compression methods by a large margin, improving compression ratios by approximately 50\%. With this work, we take the first step towards making lossless text compression with LLMs a reality. While FineZip presents a significant step in that direction, LLMs are still not a viable solution for large-scale text compression. We hope our work paves the way for future research and innovation to solve this problem.
Submitted 25 September, 2024;
originally announced September 2024.
-
Searching for GEMS: TOI-6383Ab, a giant planet transiting an M3-dwarf star in a binary system
Authors:
Lia Marta Bernabò,
Shubham Kanodia,
Caleb I. Cañas,
William D. Cochran,
Szilárd Csizmadia,
Suvrath Mahadevan,
Guðmundur Stefánsson,
Arvind F. Gupta,
Andrew Monson,
Henry A. Kobulnicky,
Alexander K. Larsen,
Ethan G. Cotter,
Alexina Birkholz,
Tera N. Swaby,
Gregory Zeimann,
Chad F. Bender,
Scott A. Diddams,
Jessica E. Libby-Roberts,
Andrea S. J. Lin,
Joe P. Ninan,
Heike Rauer,
Varghese Reji,
Paul Robertson,
Arpita Roy,
Christian Schwab
Abstract:
We report on the discovery of a transiting giant planet around the 3500 K M3-dwarf star TOI-6383A located 172 pc from Earth. It was detected by the Transiting Exoplanet Survey Satellite (TESS) and confirmed by a combination of ground-based follow-up photometry and precise radial velocity measurements. This planet has an orbital period of $\sim$1.791 days, mass of 1.040$\pm$0.094 $M_J$ and a radius of 1.008$^{+0.036}_{-0.033} ~R_J$, resulting in a mean bulk density of 1.26$^{+0.18}_{-0.17}$ g cm$^{-3}$. TOI-6383A has an M-dwarf companion star, TOI-6383B, which has a stellar effective temperature $T_{eff}$ $\sim$ 3100 K and a projected orbital separation of 3100 AU. TOI-6383A is a low-mass dwarf star hosting a giant planet and is an intriguing object for planetary evolution studies due to its high planet-to-star mass ratio. This discovery is part of the \textit{Searching for Giant Exoplanets around M-dwarf Stars (GEMS)} Survey, intending to provide robust and accurate estimates of the occurrence of GEMS and the statistics on their physical and orbital parameters. This paper presents an interesting addition to the small number of confirmed GEMS, particularly notable since its formation necessitates massive, dust-rich protoplanetary discs and high accretion efficiency ($>$ 10\%).
Submitted 25 September, 2024;
originally announced September 2024.
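As a quick consistency check on the TOI-6383Ab abstract above, the quoted mean bulk density follows from the quoted mass and radius if the planet is treated as a sphere. The sketch below is illustrative only: the Jupiter mass and equatorial radius are assumed nominal values that are not given in the abstract, and `bulk_density` is a hypothetical helper name. With these assumptions the result should land close to the quoted 1.26 g cm$^{-3}$.

```python
import math

# Assumed constants (not stated in the abstract): nominal Jovian values.
M_JUP_G = 1.898e30    # Jupiter mass in grams
R_JUP_CM = 7.1492e9   # Jupiter equatorial radius in centimetres


def bulk_density(mass_mj: float, radius_rj: float) -> float:
    """Mean density in g/cm^3 of a sphere with mass in M_J and radius in R_J."""
    mass_g = mass_mj * M_JUP_G
    radius_cm = radius_rj * R_JUP_CM
    volume_cm3 = 4.0 / 3.0 * math.pi * radius_cm ** 3
    return mass_g / volume_cm3


# Central values quoted in the abstract: 1.040 M_J and 1.008 R_J.
rho = bulk_density(1.040, 1.008)
print(f"mean bulk density: {rho:.2f} g/cm^3")
```

The check uses central values only and does not propagate the quoted uncertainties.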
-
Design and Fabrication of Robust Hybrid Photonic Crystal Cavities
Authors:
Alex Abulnaga,
Sean Karg,
Sounak Mukherjee,
Adbhut Gupta,
Kirk W. Baldwin,
Loren N. Pfeiffer,
Nathalie P. de Leon
Abstract:
Heterogeneously integrated hybrid photonic crystal cavities enable strong light-matter interactions with solid-state, optically addressable quantum memories. A key challenge to realizing high quality factor (Q) hybrid photonic crystals is the reduced index contrast on the substrate compared to suspended devices in air. This challenge is particularly acute for color centers in diamond because of diamond's high refractive index, which leads to increased scattering loss into the substrate. Here we develop a design methodology for hybrid photonic crystals utilizing a detailed understanding of substrate-mediated loss, which incorporates sensitivity to fabrication errors as a critical parameter. Using this methodology we design robust, high-Q, GaAs-on-diamond photonic crystal cavities, and by optimizing our fabrication procedure we experimentally realize cavities with Q approaching 30,000 at a resonance wavelength of 955 nm.
Submitted 25 September, 2024;
originally announced September 2024.
-
Spectrophotometric reverberation mapping of Intermediate-mass black hole NGC 4395
Authors:
Shivangi Pandey,
Suvendu Rakshit,
Krishan Chand,
C. S. Stalin,
Hojin Cho,
Jong-Hak Woo,
Priyanka Jalan,
Amit Kumar Mandal,
Amitesh Omar,
Jincen Jose,
Archana Gupta
Abstract:
Understanding the origins of massive black hole seeds and their co-evolution with their host galaxy requires studying intermediate-mass black holes (IMBHs) and estimating their mass. However, measuring the mass of these IMBHs is challenging due to the high spatial resolution requirement. A spectrophotometric reverberation monitoring is performed for a low-luminosity Seyfert 1 galaxy NGC 4395 to measure the size of the broad line region (BLR) and black hole mass. The data were collected using the 1.3-m Devasthal fast optical telescope (DFOT) and 3.6-m Devasthal optical telescope (DOT) at ARIES, Nainital, over two consecutive days in March 2022. The analysis revealed strong emission lines in the spectra and light curves of merged 5100Å spectroscopic continuum flux ($f_{\mathrm{5100}}$) with photometric continuum V-band and H$α$, with fractional variabilities of 6.38\% and 6.31\% respectively. In comparison to several previous studies with lag estimation $<$ 90 minutes, our calculated H$α$ lag supersedes by $125.0^{+6.2}_{-6.1}$ minutes using ICCF and {\small JAVELIN} methods. The velocity dispersion ($σ_{\mathrm{line}}$) of the broad line clouds is measured to be $544.7^{+22.4}_{-25.1}$ km s$^{-1}$, yielding a black hole mass of $\sim$ $2.2^{+0.2}_{-0.2}\times 10^{4}M_{\mathrm{\odot}}$ and an Eddington ratio of 0.06.
Submitted 25 September, 2024;
originally announced September 2024.
-
Transport properties in a two-dimensional Su-Schrieffer-Heeger model in Quantum Hall Regime
Authors:
Aruna Gupta,
Shaina Gandhi,
Niladri Sarkar,
Jayendra N. Bandyopadhyay
Abstract:
We investigate the transport properties of a two-dimensional Su-Schrieffer-Heeger (2D SSH) model in the quantum Hall regime using non-equilibrium Green's function formalism (NEGF). The device Hamiltonian, where the 2D SSH model serves as the channel, is constructed using a nearest-neighbor tight-binding model. The effect of an external perpendicular magnetic field is incorporated into the contacts via Peierls substitution. We observe a transition from a gapped phase to a flat band regime at zero energy by varying the magnetic field. This transition is characterized by the emergence of highly localized states in the bulk or edges, which we observe by calculating local density-of-states (LDOS). We analyze transport in the system along two directions ($x$ and $y$) via transmission measurements, indicating a magnetic field-induced transition from insulating to metallic phase. The study of the energy spectrum of the system shows the formation of Landau levels. Moreover, the quantum number of the non-degenerate and degenerate Landau levels (transmission modes) can be any integer or only an odd integer, depending on diagonal, inter-cell, and intra-cell hopping strengths. From the analysis of the transport properties along $y$-direction, we find that edge modes play a crucial role in facilitating ballistic transport.
Submitted 25 September, 2024;
originally announced September 2024.
-
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
Authors:
Homanga Bharadhwaj,
Debidatta Dwibedi,
Abhinav Gupta,
Shubham Tulsiani,
Carl Doersch,
Ted Xiao,
Dhruv Shah,
Fei Xia,
Dorsa Sadigh,
Sean Kirmani
Abstract:
How can robot manipulation policies generalize to novel tasks involving unseen object types and new motions? In this paper, we provide a solution in terms of predicting motion information from web data through human video generation and conditioning a robot policy on the generated video. Instead of attempting to scale robot data collection which is expensive, we show how we can leverage video generation models trained on easily available web data, for enabling generalization. Our approach Gen2Act casts language-conditioned manipulation as zero-shot human video generation followed by execution with a single policy conditioned on the generated video. To train the policy, we use an order of magnitude less robot interaction data compared to what the video prediction model was trained on. Gen2Act doesn't require fine-tuning the video model at all and we directly use a pre-trained model for generating human videos. Our results on diverse real-world scenarios show how Gen2Act enables manipulating unseen object types and performing novel motions for tasks not present in the robot data. Videos are at https://homangab.github.io/gen2act/
Submitted 24 September, 2024;
originally announced September 2024.
-
A Sinkhorn Regularized Adversarial Network for Image Guided DEM Super-resolution using Frequency Selective Hybrid Graph Transformer
Authors:
Subhajit Paul,
Ashutosh Gupta
Abstract:
Digital Elevation Model (DEM) is an essential aspect in the remote sensing (RS) domain to analyze various applications related to surface elevations. Here, we address the generation of high-resolution (HR) DEMs using HR multi-spectral (MX) satellite imagery as a guide by introducing a novel hybrid transformer model consisting of Densely connected Multi-Residual Block (DMRB) and multi-headed Frequency Selective Graph Attention (M-FSGA). To promptly regulate this process, we utilize the notion of discriminator spatial maps as the conditional attention to the MX guide. Further, we present a novel adversarial objective related to optimizing Sinkhorn distance with classical GAN. In this regard, we provide both theoretical and empirical substantiation of better performance in terms of vanishing gradient issues and numerical convergence. Based on our experiments on 4 different DEM datasets, we demonstrate both qualitative and quantitative comparisons with available baseline methods and show that the performance of our proposed model is superior to others with sharper details and minimal errors.
Submitted 21 September, 2024;
originally announced September 2024.
-
Prithvi WxC: Foundation Model for Weather and Climate
Authors:
Johannes Schmude,
Sujit Roy,
Will Trojak,
Johannes Jakubik,
Daniel Salles Civitarese,
Shraddha Singh,
Julian Kuehnert,
Kumar Ankur,
Aman Gupta,
Christopher E Phillips,
Romeo Kienzler,
Daniela Szwarcman,
Vishal Gaur,
Rajat Shinde,
Rohit Lal,
Arlindo Da Silva,
Jorge Luis Guevara Diaz,
Anne Jones,
Simon Pfreundschuh,
Amy Lin,
Aditi Sheshadri,
Udaysankar Nair,
Valentine Anantharaj,
Hendrik Hamann,
Campbell Watson
, et al. (4 additional authors not shown)
Abstract:
Triggered by the realization that AI emulators can rival the performance of traditional numerical weather prediction models running on HPC systems, there is now an increasing number of large AI models that address use cases such as forecasting, downscaling, or nowcasting. While the parallel developments in the AI literature focus on foundation models -- models that can be effectively tuned to address multiple, different use cases -- the developments on the weather and climate side largely focus on single-use cases with particular emphasis on mid-range forecasting. We close this gap by introducing Prithvi WxC, a 2.3 billion parameter foundation model developed using 160 variables from the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Prithvi WxC employs an encoder-decoder-based architecture, incorporating concepts from various recent transformer models to effectively capture both regional and global dependencies in the input data. The model has been designed to accommodate large token counts to model weather phenomena in different topologies at fine resolutions. Furthermore, it is trained with a mixed objective that combines the paradigms of masked reconstruction with forecasting. We test the model on a set of challenging downstream tasks namely: Autoregressive rollout forecasting, Downscaling, Gravity wave flux parameterization, and Extreme events estimation. The pretrained model with 2.3 billion parameters, along with the associated fine-tuning workflows, has been publicly released as an open-source contribution via Hugging Face.
Submitted 20 September, 2024;
originally announced September 2024.
-
Error-Minimizing Measurements in Postselected One-Shot Symmetric Quantum State Discrimination and Acceptance as a Performance Metric
Authors:
Saurabh Kumar Gupta,
Abhishek K. Gupta
Abstract:
In hypothesis testing with quantum states, given a black box containing one of the two possible states, measurement is performed to detect in favor of one of the hypotheses. In postselected hypothesis testing, a third outcome is added, corresponding to not selecting any of the hypotheses. In the postselected scenario, minimum-error one-shot symmetric hypothesis testing is characterized in the literature, conditioned on one of the selected outcomes occurring. We proceed further in this direction to give the set of all possible measurements that lead to the minimum error. We have given an arbitrary error-minimizing measurement in a parametric form. Note that not selecting any of the hypotheses decimates the quality of testing. We further give an example to show that these measurements vary in quality. There is a need to discuss the quality of postselected hypothesis testing. We then characterize the quality of postselected hypothesis testing by defining a new metric, acceptance, and give an expression for acceptance for an arbitrary error-minimizing measurement in terms of some parameters of the measurement. On the set of measurements that achieve minimum error, we have maximized the acceptance, and given an example which achieves that, thus giving an example of the best possible measurement in terms of acceptance.
Submitted 20 September, 2024;
originally announced September 2024.
-
Multiscale Encoder and Omni-Dimensional Dynamic Convolution Enrichment in nnU-Net for Brain Tumor Segmentation
Authors:
Sahaj K. Mistry,
Sourav Saini,
Aashray Gupta,
Aayush Gupta,
Sunny Rai,
Vinit Jakhetiya,
Ujjwal Baid,
Sharath Chandra Guntuku
Abstract:
Brain tumor segmentation plays a crucial role in computer-aided diagnosis. This study introduces a novel segmentation algorithm utilizing a modified nnU-Net architecture. Within the nnU-Net architecture's encoder section, we enhance conventional convolution layers by incorporating omni-dimensional dynamic convolution layers, resulting in improved feature representation. Simultaneously, we propose a multi-scale attention strategy that harnesses contemporary insights from various scales. Our model's efficacy is demonstrated on diverse datasets from the BraTS-2023 challenge. Integrating omni-dimensional dynamic convolution (ODConv) layers and multi-scale features yields substantial improvement in the nnU-Net architecture's performance across multiple tumor segmentation datasets. Remarkably, our proposed model attains good accuracy during validation for the BraTS Africa dataset. The ODConv source code along with full training code is available on GitHub.
Submitted 20 September, 2024;
originally announced September 2024.
-
Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm
Authors:
Akshat Gupta,
Atahan Ozdemir,
Gopala Anumanchipalli
Abstract:
Layer normalization is a pivotal step in the transformer architecture. This paper delves into the less explored geometric implications of this process, examining how LayerNorm influences the norm and orientation of hidden vectors in the representation space. We show that the definition of LayerNorm is innately linked to the uniform vector, defined as $\boldsymbol{1} = [1, 1, 1, 1, \cdots, 1]^T \in \mathbb{R}^d$. We then show that the standardization step in LayerNorm can be understood in three simple steps: (i) remove the component of a vector along the uniform vector, (ii) normalize the remaining vector, and (iii) scale the resultant vector by $\sqrt{d}$, where $d$ is the dimensionality of the representation space. We also introduce the property of "irreversibility" for LayerNorm, where we show that the information lost during the normalization process cannot be recovered. In other words, unlike batch normalization, LayerNorm cannot learn an identity transform. While we present possible arguments for removing the component along the uniform vector, the choice of removing this component seems arbitrary and not well motivated by the original authors. To evaluate the usefulness of this step, we compare the hidden representations of LayerNorm-based LLMs with models trained using RMSNorm and show that all LLMs naturally align representations orthogonal to the uniform vector, presenting the first mechanistic evidence that removing the component along the uniform vector in LayerNorm is a redundant step. Our findings support the use of RMSNorm over LayerNorm as it is not only more computationally efficient with comparable downstream performance, but also learns a similar distribution of hidden representations that operate orthogonal to the uniform vector.
Submitted 19 September, 2024;
originally announced September 2024.
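The three-step geometric reading of the LayerNorm standardization step described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: it covers only the standardization step, omits the learned gain and bias and the small epsilon that practical implementations add to the variance, and the function names `standardize` and `standardize_geometric` are hypothetical.

```python
import math


def standardize(x):
    """Textbook standardization: subtract the mean, divide by the (population) std."""
    d = len(x)
    mean = sum(x) / d
    var = sum((v - mean) ** 2 for v in x) / d
    return [(v - mean) / math.sqrt(var) for v in x]


def standardize_geometric(x):
    """Same result via the three geometric steps listed in the abstract."""
    d = len(x)
    # (i) Remove the component along the uniform vector 1 = [1, ..., 1]^T.
    #     The projection coefficient (x . 1) / (1 . 1) equals the mean of x.
    coeff = sum(x) / d
    residual = [v - coeff for v in x]
    # (ii) Normalize the remaining vector to unit length.
    norm = math.sqrt(sum(v * v for v in residual))
    unit = [v / norm for v in residual]
    # (iii) Scale the result by sqrt(d).
    return [math.sqrt(d) * v for v in unit]


x = [1.0, 2.0, 4.0, 7.0]
a = standardize(x)
b = standardize_geometric(x)
print(max(abs(p - q) for p, q in zip(a, b)))  # difference between the two routes
```

The two routes agree because the standard deviation equals the norm of the mean-removed vector divided by $\sqrt{d}$. The output is always orthogonal to the uniform vector and has norm $\sqrt{d}$, which is why the original mean and norm cannot be recovered from it.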
-
Physics-Informed Neural Networks can accurately model cardiac electrophysiology in 3D geometries and fibrillatory conditions
Authors:
Ching-En Chiu,
Aditi Roy,
Sarah Cechnicka,
Ashvin Gupta,
Arieh Levy Pinto,
Christoforos Galazis,
Kim Christensen,
Danilo Mandic,
Marta Varela
Abstract:
Physics-Informed Neural Networks (PINNs) are fast becoming an important tool to solve differential equations rapidly and accurately, and to identify the system parameters that best agree with a given set of measurements. PINNs have been used for cardiac electrophysiology (EP), but only in simple 1D and 2D geometries and for sinus rhythm or single rotor dynamics. Here, we demonstrate how PINNs can be used to accurately reconstruct the propagation of cardiac action potential in more complex geometries and dynamical regimes. These include 3D spherical geometries and spiral break-up conditions that model cardiac fibrillation, with a mean RMSE $< 5.1\times 10^{-2}$ overall.
We also demonstrate that PINNs can be used to reliably parameterise cardiac EP models with some biological detail. We estimate the diffusion coefficient and parameters related to ion channel conductances in the Fenton-Karma model in a 2D setup, achieving a mean relative error of $-0.09\pm 0.33$. Our results are an important step towards the deployment of PINNs to realistic cardiac geometries and arrhythmic conditions.
Submitted 18 September, 2024;
originally announced September 2024.
-
The NEID Earth Twin Survey. I. Confirmation of a 31-day planet orbiting HD 86728
Authors:
Arvind F. Gupta,
Jacob K. Luhn,
Jason T. Wright,
Suvrath Mahadevan,
Paul Robertson,
Daniel M. Krolikowski,
Eric B. Ford,
Caleb I. Cañas,
Samuel Halverson,
Andrea S. J. Lin,
Shubham Kanodia,
Evan Fitzmaurice,
Christian Gilbertson,
Chad F. Bender,
Cullen H. Blake,
Jiayin Dong,
Mark R. Giovinazzi,
Sarah E. Logsdon,
Andrew Monson,
Joe P. Ninan,
Jayadev Rajagopal,
Arpita Roy,
Christian Schwab,
Guðmundur Stefánsson
Abstract:
With close to three years of observations in hand, the NEID Earth Twin Survey (NETS) is starting to unearth new astrophysical signals for a curated sample of bright, radial velocity (RV)-quiet stars. We present the discovery of the first NETS exoplanet, HD 86728 b, a $m_p\sin i = 9.16^{+0.55}_{-0.56}\ \rm{M}_\oplus$ planet on a circular, $P=31.1503^{+0.0062}_{-0.0066}$ d orbit, thereby confirming a candidate signal identified by Hirsch et al. (2021). We confirm the planetary origin of the detected signal, which has a semi-amplitude of just $K=1.91^{+0.11}_{-0.12}$ m s$^{-1}$, via careful analysis of the NEID RVs and spectral activity indicators, and we constrain the mass and orbit via fits to NEID and archival RV measurements. The host star is intrinsically quiet at the $\sim1$ m s$^{-1}$ level, with the majority of this variability likely stemming from short-timescale granulation. HD 86728 b is among the small fraction of exoplanets with similar masses and periods that have no known planetary siblings.
Submitted 18 September, 2024;
originally announced September 2024.
-
How to Build the Virtual Cell with Artificial Intelligence: Priorities and Opportunities
Authors:
Charlotte Bunne,
Yusuf Roohani,
Yanay Rosen,
Ankit Gupta,
Xikun Zhang,
Marcel Roed,
Theo Alexandrov,
Mohammed AlQuraishi,
Patricia Brennan,
Daniel B. Burkhardt,
Andrea Califano,
Jonah Cool,
Abby F. Dernburg,
Kirsty Ewing,
Emily B. Fox,
Matthias Haury,
Amy E. Herr,
Eric Horvitz,
Patrick D. Hsu,
Viren Jain,
Gregory R. Johnson,
Thomas Kalil,
David R. Kelley,
Shana O. Kelley,
Anna Kreshuk
, et al. (17 additional authors not shown)
Abstract:
The cell is arguably the most fundamental unit of life and is central to understanding biology. Accurate modeling of cells is important for this understanding as well as for determining the root causes of disease. Recent advances in artificial intelligence (AI), combined with the ability to generate large-scale experimental data, present novel opportunities to model cells. Here we propose a vision of leveraging advances in AI to construct virtual cells, high-fidelity simulations of cells and cellular systems under different conditions that are directly learned from biological data across measurements and scales. We discuss desired capabilities of such AI Virtual Cells, including generating universal representations of biological entities across scales, and facilitating interpretable in silico experiments to predict and understand their behavior using virtual instruments. We further address the challenges, opportunities and requirements to realize this vision including data needs, evaluation strategies, and community standards and engagement to ensure biological accuracy and broad utility. We envision a future where AI Virtual Cells help identify new drug targets, predict cellular responses to perturbations, as well as scale hypothesis exploration. With open science collaborations across the biomedical ecosystem that includes academia, philanthropy, and the biopharma and AI industries, a comprehensive predictive understanding of cell mechanisms and interactions has come into reach.
Submitted 14 October, 2024; v1 submitted 17 September, 2024;
originally announced September 2024.
-
AD-Lite Net: A Lightweight and Concatenated CNN Model for Alzheimer's Detection from MRI Images
Authors:
Santanu Roy,
Archit Gupta,
Shubhi Tiwari,
Palak Sahu
Abstract:
Alzheimer's Disease (AD) is a non-curable progressive neurodegenerative disorder that affects the human brain, leading to a decline in memory, cognitive abilities, and eventually, the ability to carry out daily tasks. Manual diagnosis of Alzheimer's disease from MRI images suffers from low sensitivity and is a very tedious process for neurologists. Therefore, there is a need for an automatic Computer Assisted Diagnosis (CAD) system, which can detect AD at early stages with higher accuracy. In this research, we have proposed a novel AD-Lite Net model (trained from scratch) that could alleviate the aforementioned problem. The novelties of this research are: (I) We have proposed a very lightweight CNN model by incorporating Depth Wise Separable Convolutional (DWSC) layers and Global Average Pooling (GAP) layers. (II) We have leveraged a "parallel concatenation block" (pcb) in the proposed AD-Lite Net model. This pcb consists of a Transformation layer (Tx-layer), followed by two convolutional layers, which are then concatenated with the original base model. This Tx-layer converts the features into very distinct features, which are important for Alzheimer's disease detection. As a consequence, the proposed AD-Lite Net model with "parallel concatenation" converges faster and automatically mitigates the class imbalance problem in the MRI datasets in a very generalized way. To validate our proposed model, we have implemented it on three different MRI datasets. Furthermore, we have combined the ADNI and AD datasets and subsequently performed a 10-fold cross-validation experiment to verify the model's generalization ability. Extensive experimental results showed that our proposed model has outperformed all the existing CNN models, and one recent Vision Transformer (ViT) model, by a significant margin.
Submitted 12 September, 2024;
originally announced September 2024.
-
Development of an embedded-atom method potential of Ni-Mo alloys for electrocatalysis / surface compositional studies
Authors:
Ambesh Gupta,
Chinmay Dahale,
Soumyadipta Maiti,
Sriram Goverapet Srinivasan,
Beena Rai
Abstract:
Ni-Mo superalloys have emerged as materials of choice for a diverse array of applications owing to their superior mechanical properties, exceptional corrosion and oxidation resistance, electrocatalytic behavior, and surface stability. Understanding and optimizing the surface composition of Ni-Mo alloys is critical for enhancing their performance in practical applications. Traditional experimental surface analysis techniques, while informative, are often prohibitive in terms of cost and time. Likewise, theoretical approaches such as first-principles calculations demand substantial computational resources and make it difficult to simulate large structures. This study introduces an alternative approach utilizing hybrid Monte-Carlo / Molecular Dynamics (MC/MD) simulations to investigate the surface composition of Ni-Mo alloys. We report the development of an optimized Embedded-Atom Method (EAM) potential specifically for Ni-Mo alloys, carefully parameterized using empirical lattice constants and formation energies of elemental and face-centered cubic (FCC) Ni-Mo solid solution alloys. The reliability of the EAM potential is corroborated via the evaluation of equations of state, with a particular focus on reproducing structural properties. Utilizing this validated potential, MC/MD simulations were performed to understand the depth-wise variations in the compositions of Ni-Mo alloy nanoparticles and extended surfaces. These simulations reveal a preferential segregation of nickel at the surface and of molybdenum in the sub-surface layer. Because of this preferential segregation, it is imperative to consider surface segregation while tailoring the surface properties for targeted applications.
Submitted 11 September, 2024;
originally announced September 2024.
-
LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation
Authors:
Archana Swaminathan,
Anubhav Gupta,
Kamal Gupta,
Shishira R. Maiya,
Vatsal Agarwal,
Abhinav Shrivastava
Abstract:
Neural Radiance Fields (NeRFs) have revolutionized the reconstruction of static scenes and objects in 3D, offering unprecedented quality. However, extending NeRFs to model dynamic objects or object articulations remains a challenging problem. Previous works have tackled this issue by focusing on part-level reconstruction and motion estimation for objects, but they often rely on heuristics regarding the number of moving parts or object categories, which can limit their practical use. In this work, we introduce LEIA, a novel approach for representing dynamic 3D objects. Our method involves observing the object at distinct time steps or "states" and conditioning a hypernetwork on the current state, using this to parameterize our NeRF. This approach allows us to learn a view-invariant latent representation for each state. We further demonstrate that by interpolating between these states, we can generate novel articulation configurations in 3D space that were previously unseen. Our experimental results highlight the effectiveness of our method in articulating objects in a manner that is independent of the viewing angle and joint configuration. Notably, our approach outperforms previous methods that rely on motion information for articulation registration.
Submitted 10 September, 2024;
originally announced September 2024.
-
Hevelius Report: Visualizing Web-Based Mobility Test Data For Clinical Decision and Learning Support
Authors:
Hongjin Lin,
Tessa Han,
Krzysztof Z. Gajos,
Anoopum S. Gupta
Abstract:
Hevelius, a web-based computer mouse test, measures arm movement and has been shown to accurately evaluate severity for patients with Parkinson's disease and ataxias. A Hevelius session produces 32 numeric features, which may be hard to interpret, especially in time-constrained clinical settings. This work aims to support clinicians (and other stakeholders) in interpreting and connecting Hevelius features to clinical concepts. Through an iterative design process, we developed a visualization tool (Hevelius Report) that (1) abstracts six clinically relevant concepts from 32 features, (2) visualizes patient test results, and compares them to results from healthy controls and other patients, and (3) is an interactive app to meet the specific needs in different usage scenarios. Then, we conducted a preliminary user study through an online interview with three clinicians who were not involved in the project. They expressed interest in using Hevelius Report, especially for identifying subtle changes in their patients' mobility that are hard to capture with existing clinical tests. Future work will integrate the visualization tool into the current clinical workflow of a neurology team and conduct systematic evaluations of the tool's usefulness, usability, and effectiveness. Hevelius Report represents a promising solution for analyzing fine-motor test results and monitoring patients' conditions and progressions.
Submitted 9 September, 2024;
originally announced September 2024.
-
SpikingRx: From Neural to Spiking Receiver
Authors:
Ankit Gupta,
Onur Dizdar,
Yun Chen,
Stephen Wang
Abstract:
In this work, we propose an energy efficient neuromorphic receiver to replace multiple signal-processing blocks at the receiver by a Spiking Neural Network (SNN) based module, called SpikingRx. We propose a deep convolutional SNN with spike-element-wise ResNet layers which takes a whole OFDM grid compliant with 5G specifications and provides soft outputs for decoded bits that can be used as log-likelihood ratios. We propose to employ the surrogate gradient descent method for training the SpikingRx and focus on its generalizability and robustness to quantization. Moreover, the interpretability of the proposed SpikingRx is studied by a comprehensive ablation study. Our extensive numerical simulations show that SpikingRx is capable of achieving significant block error rate performance gain compared to conventional 5G receivers and similar performance compared to its traditional NN-based counterparts with approximately 9x less energy consumption.
Submitted 9 September, 2024;
originally announced September 2024.
-
An efficient hp-Variational PINNs framework for incompressible Navier-Stokes equations
Authors:
Thivin Anandh,
Divij Ghose,
Ankit Tyagi,
Abhineet Gupta,
Suranjan Sarkar,
Sashikumaar Ganesan
Abstract:
Physics-informed neural networks (PINNs) are able to solve partial differential equations (PDEs) by incorporating the residuals of the PDEs into their loss functions. Variational Physics-Informed Neural Networks (VPINNs) and hp-VPINNs use the variational form of the PDE residuals in their loss function. Although hp-VPINNs have shown promise over traditional PINNs, they suffer from higher training times and lack a framework capable of handling complex geometries, which limits their application to more complex PDEs. As such, hp-VPINNs have not been applied in solving the Navier-Stokes equations, amongst other problems in CFD, thus far. FastVPINNs was introduced to address these challenges by incorporating tensor-based loss computations, significantly improving the training efficiency. Moreover, by using the bilinear transformation, the FastVPINNs framework was able to solve PDEs on complex geometries. In the present work, we extend the FastVPINNs framework to vector-valued problems, with a particular focus on solving the incompressible Navier-Stokes equations for two-dimensional forward and inverse problems, including problems such as the lid-driven cavity flow, the Kovasznay flow, and flow past a backward-facing step for Reynolds numbers up to 200. Our results demonstrate a 2x improvement in training time while maintaining the same order of accuracy compared to PINNs algorithms documented in the literature. We further showcase the framework's efficiency in solving inverse problems for the incompressible Navier-Stokes equations by accurately identifying the Reynolds number of the underlying flow. Additionally, the framework's ability to handle complex geometries highlights its potential for broader applications in computational fluid dynamics. This implementation opens new avenues for research on hp-VPINNs, potentially extending their applicability to more complex problems.
Submitted 6 September, 2024;
originally announced September 2024.
-
Optical Coherence Tomography Angiography-OCTA dataset for the study of Diabetic Retinopathy
Authors:
Pooja Bidwai,
Shilpa Gite,
Biswajeet Pradhan,
Aditi Gupta,
Kishore Pahuja
Abstract:
This study presents a dataset consisting of 268 retinal images from 179 individuals, including 133 left-eye and 135 right-eye images, collected from Natasha Eye Care and Research Institute in Pune, Maharashtra, India. The images were captured using a nonmydriatic Optical Coherence Tomography Angiography (OCTA) device, specifically the Optovue Avanti Edition machine as per the protocol mentioned in this paper. Two ophthalmologists then annotated the images. This dataset can be used by researchers and doctors to develop automated diagnostic tools for early detection of diabetic retinopathy (DR).
Submitted 6 September, 2024;
originally announced September 2024.
-
Modular Parallel Manipulator for Long-Term Soft Robotic Data Collection
Authors:
Kiyn Chin,
Carmel Majidi,
Abhinav Gupta
Abstract:
Performing long-term experimentation or large-scale data collection for machine learning in the field of soft robotics is challenging, due to the hardware robustness and experimental flexibility required. In this work, we propose a modular parallel robotic manipulation platform suitable for such large-scale data collection and compatible with various soft-robotic fabrication methods. Considering the computational and theoretical difficulty of replicating the high-fidelity, faster-than-real-time simulations that enable large-scale data collection in rigid robotic systems, a robust soft-robotic hardware platform becomes a high priority development task for the field.
The platform's modules consist of a pair of off-the-shelf electrical motors which actuate a customizable finger consisting of a compliant parallel structure. The parallel mechanism of the finger can be as simple as a single 3D-printed urethane or molded silicone bulk structure, because the motors are able to fully actuate a passive structure. This design flexibility allows experimentation with varied soft-mechanism geometries, bulk properties and surface properties. Additionally, while the parallel mechanism does not require separate electronics or additional parts, these can be included, and it can be constructed using multi-functional soft materials to study compatible soft sensors and actuators in the learning process. In this work, we validate the platform's ability to be used for policy gradient reinforcement learning directly on hardware in a benchmark 2D manipulation task. We additionally demonstrate compatibility with multiple fingers and characterize the design constraints for compatible extensions.
Submitted 5 September, 2024;
originally announced September 2024.
-
Inhomogeneous hysteresis in local STM tunnel conductance with gate-voltage in single-layer MoS$_2$ on SiO$_2$
Authors:
Santu Prasad Jana,
Suraina Gupta,
Anjan Kumar Gupta
Abstract:
Randomly distributed traps at the MoS$_2$/SiO$_2$ interface result in non-ideal transport behavior, including hysteresis in MoS$_2$/SiO$_2$ field effect transistors (FETs). Thus traps are mostly detrimental to the FET performance but they also offer some application potential. Our STM/S measurements on atomically resolved few-layer and single-layer MoS$_2$ on SiO$_2$ show n-doped behavior with the expected band gap close to 2.0 and 1.4 eV, respectively. The local tunnel conductance with gate-voltage $V_{\rm g}$ sweep exhibits a turn-on/off at a threshold $V_{\rm g}$ at which the tip's Fermi energy nearly coincides with the local conduction band minimum. This threshold value is found to depend on the $V_{\rm g}$ sweep direction, amounting to local hysteresis. The hysteresis is, as expected, found to depend on both the extent and rate of the $V_{\rm g}$ sweep. Further, the spatial variation in the local $V_{\rm g}$ threshold and the details of the tunnel conductance vs. $V_{\rm g}$ behavior indicate inhomogeneities in both the traps' density and their energy distribution. The latter even leads to the pinning of the local Fermi energy in some regions. Further, some rare locations exhibit p-doping with both p- and n-type $V_{\rm g}$ thresholds in local conductance and an unusual hysteresis.
Submitted 5 September, 2024;
originally announced September 2024.
-
F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring
Authors:
Subhajit Paul,
Sahil Kumawat,
Ashutosh Gupta,
Deepak Mishra
Abstract:
Recent progress in image deblurring techniques focuses mainly on operating in both frequency and spatial domains using the Fourier transform (FT) properties. However, their performance is limited due to the dependency of FT on stationary signals and its lack of capability to extract spatial-frequency properties. In this paper, we propose a novel approach based on the Fractional Fourier Transform (FRFT), a unified spatial-frequency representation leveraging both spatial and frequency components simultaneously, making it ideal for processing non-stationary signals like images. Specifically, we introduce a Fractional Fourier Transformer (F2former), where we combine the classical fractional Fourier based Wiener deconvolution (F2WD) as well as a multi-branch encoder-decoder transformer based on a new fractional frequency aware transformer block (F2TB). We design F2TB consisting of a fractional frequency aware self-attention (F2SA) to estimate element-wise product attention based on important frequency components and a novel feed-forward network based on frequency division multiplexing (FM-FFN) to refine high and low frequency features separately for efficient latent clear image restoration. Experimental results for the cases of both motion deblurring as well as defocus deblurring show that the performance of our proposed method is superior to other state-of-the-art (SOTA) approaches.
Submitted 3 September, 2024;
originally announced September 2024.
-
Searching for GEMS: TOI-5688 A b, a low-density giant orbiting a high-metallicity early M-dwarf
Authors:
Varghese Reji,
Shubham Kanodia,
Joe Ninan,
Caleb I. Cañas,
Jessica Libby-Roberts,
Andrea S. J. Lin,
Arvind F. Gupta,
Tera N. Sewaby,
Alexander Larsen,
Henry A. Kobulnicky,
Philip I. Choi,
Nez Evans,
Sage Santomenna,
Isabelle Winnick,
Larry Yu,
Jaime A. Alvarado-Montes,
Chad Bender,
Lia Marta Bernabò,
Cullen H. Blake,
William D. Cochran,
Scott A. Diddams,
Samuel Halverson,
Te Han,
Fred Hearty,
Sarah E. Logsdon,
et al. (9 additional authors not shown)
Abstract:
We present the discovery of a low-density planet, TOI-5688 A b, transiting a high-metallicity M2V star. This planet was discovered as part of the search for transiting giant planets ($R \gtrsim 8$ R$_\oplus$) through the Searching for GEMS (Giant Exoplanets around M-dwarf Stars) survey. The planet TOI-5688 A b was discovered with the Transiting Exoplanet Survey Satellite (TESS), and characterized with ground-based transits from Red Buttes Observatory (RBO), the Table Mountain Observatory of Pomona College, and radial velocity (RV) measurements with the Habitable-Zone Planet Finder (HPF) on the 10 m Hobby Eberly Telescope (HET) and NEID on the WIYN 3.5 m telescope. From the joint fit of transit and RV data, the mass of the planet is $124\pm24$ M$_\oplus$ and the radius is $10.4\pm0.7$ R$_\oplus$. This planet has a density of $0.61^{+0.20}_{-0.15}$ g/cm${}^3$, and is on a $\sim2.95$ day orbit around its host star. The spectroscopic and photometric analysis of the host star TOI-5688 A shows that it is a high-metallicity ([Fe/H] $ = 0.47\pm0.16$ dex) M2V star, favoring core accretion as the likely formation pathway for this planet. In this paper, we analyze potential mechanisms of planet formation in the context of the formation of TOI-5688 A b. Additionally, observations with Gaia suggest the presence of a wide-separation binary companion, TOI-5688 B, which has a projected separation of $\sim5"$ (1110 AU) and is an M4V. This makes TOI-5688 A b part of a growing number of GEMS in wide-separation binary systems.
Submitted 4 September, 2024; v1 submitted 2 September, 2024;
originally announced September 2024.
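The bulk density quoted in the TOI-5688 A b abstract above can be roughly cross-checked from the quoted median mass and radius. The sketch below is only an illustrative consistency check, not the authors' analysis: the function name and the Earth constants are assumptions introduced here, and the published value comes from a joint posterior, so agreement is expected only approximately.

```python
import math

# Assumed reference values (not taken from the abstract).
EARTH_MASS_G = 5.972e27     # Earth mass in grams
EARTH_RADIUS_CM = 6.371e8   # mean Earth radius in centimetres


def bulk_density_g_cm3(mass_earth, radius_earth):
    """Mean density of a sphere given mass and radius in Earth units."""
    mass_g = mass_earth * EARTH_MASS_G
    radius_cm = radius_earth * EARTH_RADIUS_CM
    volume_cm3 = (4.0 / 3.0) * math.pi * radius_cm ** 3
    return mass_g / volume_cm3


# Median values quoted in the abstract: 124 Earth masses, 10.4 Earth radii.
rho = bulk_density_g_cm3(124.0, 10.4)
print(f"{rho:.2f} g/cm^3")
```

With these inputs the point estimate lands close to the quoted 0.61 g/cm^3; the asymmetric uncertainties in the abstract cannot be reproduced from the medians alone.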
-
Semantically Controllable Augmentations for Generalizable Robot Learning
Authors:
Zoey Chen,
Zhao Mandi,
Homanga Bharadhwaj,
Mohit Sharma,
Shuran Song,
Abhishek Gupta,
Vikash Kumar
Abstract:
Generalization to unseen real-world scenarios for robot manipulation requires exposure to diverse datasets during training. However, collecting large real-world datasets is intractable due to high operational costs. For robot learning to generalize despite these challenges, it is essential to leverage sources of data or priors beyond the robot's direct experience. In this work, we posit that image-text generative models, which are pre-trained on large corpora of web-scraped data, can serve as such a data source. These generative models encompass a broad range of real-world scenarios beyond a robot's direct experience and can synthesize novel synthetic experiences that expose robotic agents to additional world priors aiding real-world generalization at no extra cost.
In particular, our approach leverages pre-trained generative models as an effective tool for data augmentation. We propose a generative augmentation framework for semantically controllable augmentations and rapidly multiplying robot datasets while inducing rich variations that enable real-world generalization. Based on diverse augmentations of robot data, we show how scalable robot manipulation policies can be trained and deployed both in simulation and in unseen real-world environments such as kitchens and table-tops. By demonstrating the effectiveness of image-text generative models in diverse real-world robotic applications, our generative augmentation framework provides a scalable and efficient path for boosting generalization in robot learning at no extra human cost.
Submitted 2 September, 2024;
originally announced September 2024.
-
gaspery: Optimized Scheduling of Radial Velocity Follow-Up Observations for Active Host Stars
Authors:
Christopher Lam,
Megan Bedell,
Lily L. Zhao,
Arvind F. Gupta,
Sarah A. Ballard
Abstract:
Radial velocity (RV) follow-up is a critical complement to transiting exoplanet surveys like the Transiting Exoplanet Survey Satellite (TESS), both for validating discoveries of exoplanets and measuring their masses. Stellar activity introduces challenges to interpreting these measurements because the noise from the host star, which is often correlated in time, can result in high RV uncertainty. A robust understanding of stellar activity and how its timescales interact with the observing cadence can optimize limited RV resources. For this reason, in the era of over-subscribed, high-precision RV measurements, folding stellar activity timescales into the scheduling of observation campaigns is ideal. We present gaspery, an open-source code implementation to enable the optimization of RV observing strategies. Gaspery employs a generalized formulation of the Fisher Information for RV time series that also incorporates information about stellar correlated noise. We show that the information contained in an observing strategy can be significantly affected by beat frequencies between the orbital period of the planet, the stellar rotation period, and the observation epochs. We investigate how the follow-up observing strategy will affect the resulting radial velocity uncertainty, as a function of stellar properties such as the spot decay timescale and rotation period. We then describe two example use cases for gaspery: 1) calculating the minimum number of observations to reach an uncertainty tolerance in a correlated noise regime and 2) finding an optimal strategy given a fixed observing budget. Finally, we outline a prescription for selecting an observing strategy that is generalizable to different targets.
Submitted 29 August, 2024;
originally announced August 2024.
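To illustrate why observation timing matters in the gaspery abstract above, here is a minimal sketch of the Fisher information for the RV semi-amplitude K. It is a deliberate simplification and not gaspery's API or its formulation: it assumes a circular orbit, independent (white) Gaussian noise of equal size on every epoch, and all other orbital parameters known. The function names are hypothetical, and the correlated stellar noise that the abstract emphasizes is ignored here.

```python
import math


def rv_partial_k(t, period, t0):
    """Derivative of the model v(t) = K * sin(2*pi*(t - t0)/period) w.r.t. K."""
    return math.sin(2.0 * math.pi * (t - t0) / period)


def fisher_info_k(times, period, t0, sigma):
    """Fisher information for K under white noise with per-point error sigma."""
    return sum(rv_partial_k(t, period, t0) ** 2 for t in times) / sigma ** 2


def sigma_k_lower_bound(times, period, t0, sigma):
    """Cramer-Rao lower bound on the uncertainty of K (other parameters fixed)."""
    info = fisher_info_k(times, period, t0, sigma)
    if info == 0.0:
        raise ValueError("observations carry no information about K")
    return 1.0 / math.sqrt(info)


# Hypothetical 10-day orbit observed four times with 1 m/s errors.
period, t0, sigma = 10.0, 0.0, 1.0
at_quadrature = [2.5, 7.5, 12.5, 17.5]  # RV extrema: most informative epochs
near_nodes = [0.1, 5.1, 10.1, 15.1]     # RV near zero: least informative epochs

print(sigma_k_lower_bound(at_quadrature, period, t0, sigma))
print(sigma_k_lower_bound(near_nodes, period, t0, sigma))
```

Even in this white-noise toy case, the same number of observations gives very different bounds on K depending on orbital phase. Handling correlated noise would replace the simple sum with a quadratic form involving the inverse noise covariance matrix.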
-
sEMG-Driven Physics-Informed Gated Recurrent Networks for Modeling Upper Limb Multi-Joint Movement Dynamics
Authors:
Rajnish Kumar,
Anand Gupta,
Suriya Prakash Muthukrishnan,
Lalan Kumar,
Sitikantha Roy
Abstract:
Exoskeletons and rehabilitation systems offer great potential for enhancing human strength and recovery through advanced human-machine interfaces (HMIs) that adapt to movement dynamics. However, the real-time application of physics-informed neural networks (PINNs) is limited by their reliance on fixed input lengths and surrogate models. This study introduces a novel physics-informed Gated Recurrent Network (PiGRN) designed to predict multi-joint torques using surface electromyography (sEMG) data. The PiGRN model employs a Gated Recurrent Unit (GRU) to convert time-series sEMG inputs into multi-joint kinematics and external loads, which are then integrated into an equation of motion to ensure consistency with physical laws. Experimental validation with sEMG data from five participants performing elbow flexion-extension tasks showed that the PiGRN model accurately predicted joint torques for 10 unfamiliar movements, with RMSE values between 4.02% and 11.40% and correlation coefficients ranging from 0.87 to 0.98. These findings highlight the PiGRN's potential for real-time exoskeleton and rehabilitation applications. Future research will explore more diverse datasets, improve musculoskeletal models, and investigate unsupervised learning methods.
Submitted 29 August, 2024;
originally announced August 2024.
-
AAVENUE: Detecting LLM Biases on NLU Tasks in AAVE via a Novel Benchmark
Authors:
Abhay Gupta,
Philip Meng,
Ece Yurtseven,
Sean O'Brien,
Kevin Zhu
Abstract:
Detecting biases in natural language understanding (NLU) for African American Vernacular English (AAVE) is crucial to developing inclusive natural language processing (NLP) systems. To address dialect-induced performance discrepancies, we introduce AAVENUE (AAVE Natural Language Understanding Evaluation), a benchmark for evaluating large language model (LLM) performance on NLU tasks in AAVE and Standard American English (SAE). AAVENUE builds upon and extends existing benchmarks like VALUE, replacing deterministic syntactic and morphological transformations with a more flexible methodology leveraging LLM-based translation with few-shot prompting, improving performance across our evaluation metrics when translating key tasks from the GLUE and SuperGLUE benchmarks. We compare AAVENUE and VALUE translations using five popular LLMs and a comprehensive set of metrics including fluency, BARTScore, quality, coherence, and understandability. Additionally, we recruit fluent AAVE speakers to validate our translations for authenticity. Our evaluations reveal that LLMs consistently perform better on SAE tasks than AAVE-translated versions, underscoring inherent biases and highlighting the need for more inclusive NLP models. We have open-sourced our source code on GitHub and created a website to showcase our work at https://aavenue.live.
Submitted 27 August, 2024;
originally announced August 2024.
-
Searching for GEMS: Characterizing Six Giant Planets around Cool Dwarfs
Authors:
Shubham Kanodia,
Arvind F. Gupta,
Caleb I. Canas,
Lia Marta Bernabo,
Varghese Reji,
Te Han,
Madison Brady,
Andreas Seifahrt,
William D. Cochran,
Nidia Morrell,
Ritvik Basant,
Jacob Bean,
Chad F. Bender,
Zoe L. de Beurs,
Allyson Bieryla,
Alexina Birkholz,
Nina Brown,
Franklin Chapman,
David R. Ciardi,
Catherine A. Clark,
Ethan G. Cotter,
Scott A. Diddams,
Samuel Halverson,
Suzanne Hawley,
Leslie Hebb
, et al. (20 additional authors not shown)
Abstract:
Transiting giant exoplanets around M-dwarf stars (GEMS) are rare, owing to the low-mass host stars. However, the all-sky coverage of TESS has enabled the detection of an increasingly large number of them, making possible statistical surveys like the Searching for GEMS survey. As part of this endeavour, we describe the observations of six transiting giant planets. These include precise mass measurements for two GEMS (K2-419Ab, TOI-6034b) and statistical validation for four systems: three are validated with mass upper limits (TOI-5218b, TOI-5616b, TOI-5634Ab), while the fourth, TOI-5414b, is classified as a 'likely planet'. Our observations include radial velocities from the Habitable-zone Planet Finder on the Hobby-Eberly Telescope and MAROON-X on Gemini-North, along with photometry and high-contrast imaging from multiple ground-based facilities. In addition to TESS photometry, K2-419Ab was also observed and statistically validated as part of the K2 mission in Campaigns 5 and 18, which provides precise orbital and planetary constraints despite the faint host star and long orbital period of ~20.4 days. With an equilibrium temperature of only 380 K, K2-419Ab is one of the coolest known well-characterized transiting planets. TOI-6034 has a late F-type companion about 40 arcseconds away, making it the first GEMS host star to have an earlier main-sequence binary companion. These confirmations add to the existing small sample of confirmed transiting GEMS.
Submitted 27 August, 2024; v1 submitted 26 August, 2024;
originally announced August 2024.
-
Multi-Beam Object-Localization for Millimeter-Wave ISAC-Aided Connected Autonomous Vehicles
Authors:
Jitendra Singh,
Awadhesh Gupta,
Aditya K. Jagannatham,
Lajos Hanzo
Abstract:
Millimeter wave (mmWave) multiple-input multiple-output (MIMO) systems capable of integrated sensing and communication (ISAC) constitute a key technology for connected autonomous vehicles (CAVs). In this context, we propose a multi-beam object-localization (MBOL) model for enhancing the sensing beampattern (SBP) gain of adjacent objects in CAV scenarios. Given the ultra-narrow beams of mmWave MIMO systems, a single pencil beam is unsuitable for closely located objects, which tend to require multiple beams. Hence, we formulate the SBP gain maximization problem, subject to constraints on the signal-to-interference-plus-noise ratio (SINR) of the communication users (CUs), on the transmit power, and on the constant modulus of the phase-shifters in the mmWave hybrid transceiver. To solve this non-convex problem, we propose a penalty-based triple alternating optimization algorithm to design the hybrid beamformer. Finally, simulation results are provided to demonstrate the efficacy of the proposed model.
Submitted 26 August, 2024;
originally announced August 2024.