Search | arXiv e-print repository

arXiv:2407.12097 [pdf, other]

Radio afterglows from tidal disruption events: An unbiased sample from ASKAP RACS

Authors: Akash Anumarlapudi, Dougal Dobie, David L. Kaplan, Tara Murphy, Assaf Horesh, Emil Lenc, Laura N. Driessen, Stefan W. Duchesne, Ms. Hannah Dykaar, Bryan M. Gaensler, Timothy J. Galvin, J. A. Grundy, George Heald, Aidan Hotan, Minh Huynh, James Leung, David McConnell, Vanessa A. Moss, Joshua Pritchard, Wasim Raja, Kovi Rose, Gregory R. Sivakoff, Yuanming Wang, Ziteng Wang, Mark Wieringa, et al. (1 additional author not shown)

Abstract: Late-time ($\sim$ year) radio follow-up of optically-discovered tidal disruption events (TDEs) is increasingly resulting in detections at radio wavelengths, and there is growing evidence for this late-time radio activity to be common to the broad class of sub-relativistic TDEs. Detailed studies of some of these TDEs at radio wavelengths are also challenging the existing models for radio emission. Using all-sky multi-epoch data from the Australian Square Kilometre Array Pathfinder (ASKAP), taken as a part of the Rapid ASKAP Continuum Survey (RACS), we searched for radio counterparts to a sample of optically-discovered TDEs. We detected late-time emission at RACS frequencies (742-1032\,MHz) in five TDEs, reporting the independent discovery of radio emission from TDE AT2019ahk and extending the time baseline out to almost 3000\,days for some events. Overall, we find that at least $22^{+15}_{-11}$\% of the population of optically-discovered TDEs has detectable radio emission in the RACS survey, while also noting that the true fraction can be higher given the limited cadence (2 epochs separated by $\sim 3\,$ years) of the survey. Finally, we project that the ongoing higher-cadence ($\sim 2$\,months) ASKAP Variable and Slow Transients (VAST) survey can detect $\sim 20$ TDEs in its operational span (4\,yrs), given the current rate from optical surveys.

Submitted 16 July, 2024; originally announced July 2024.

Comments: Accepted for publication in ApJ, comments welcome

arXiv:2407.12003 [pdf, other]

Evaluation and Continual Improvement for an Enterprise AI Assistant

Authors: Akash V. Maharaj, Kun Qian, Uttaran Bhattacharya, Sally Fang, Horia Galatanu, Manas Garg, Rachel Hanessian, Nishant Kapoor, Ken Russell, Shivakumar Vaithyanathan, Yunyao Li

Abstract: The development of conversational AI assistants is an iterative process with multiple components. As such, the evaluation and continual improvement of these assistants is a complex and multifaceted problem. This paper introduces the challenges in evaluating and improving a generative AI assistant for enterprises, which is under active development, and how we address these challenges. We also share preliminary results and discuss lessons learned.

Submitted 15 June, 2024; originally announced July 2024.

Comments: Accepted to DaSH Workshop at NAACL 2024

arXiv:2407.11694 [pdf, ps, other]

Nonvanishing of Second Coefficients of Hecke Polynomials on the Newspace

Authors: William Cason, Akash Jim, Charlie Medlock, Erick Ross, Trevor Vilardi, Hui Xue

Abstract: For $m \geq 1$, let $N \geq 1$ be coprime to $m$, $k \geq 2$, and $χ$ be a Dirichlet character modulo $N$ with $χ(-1)=(-1)^k$. Then let $T_m^{\text{new}}(N,k,χ)$ denote the restriction of the $m$-th Hecke operator to the space $S_k^{\text{new}}(Γ_0(N), χ)$. We demonstrate that for fixed $m$ and trivial character $χ$, the second coefficient of the characteristic polynomial of $T_m^{\text{new}}(N,k)$ vanishes for only finitely many pairs $(N,k)$, and we further determine the sign. To demonstrate our method, for $m=2,4$, we also compute all pairs $(N,k)$ for which the second coefficient vanishes. In the general character case, we also show that excluding an infinite family where $S_k^{\text{new}}(Γ_0(N), χ)$ is trivial, the second coefficient of the characteristic polynomial of $T_m^{\text{new}}(N,k,χ)$ vanishes for only finitely many triples $(N,k,χ)$.

Submitted 16 July, 2024; originally announced July 2024.

Comments: 30 pages

MSC Class: 11F25; 11F72; 11F11

arXiv:2407.10731 [pdf, other]

Unitary tetrahedron quantum gates

Authors: Vivek Kumar Singh, Akash Sinha, Pramod Padmanabhan, Vladimir Korepin

Abstract: Quantum simulations of many-body systems using 2-qubit Yang-Baxter gates offer a benchmark for quantum hardware. This can be extended to the higher dimensional case with $n$-qubit generalisations of Yang-Baxter gates called $n$-simplex operators. Such multi-qubit gates potentially lead to shallower and more efficient quantum circuits as well. Finding them amounts to identifying unitary solutions of the $n$-simplex equations, the building blocks of higher dimensional integrable systems. These are a set of highly non-linear and over determined system of equations making it notoriously hard to solve even when the local Hilbert spaces are spanned by qubits. We systematically overcome this for higher simplex operators constructed using two methods: from Clifford algebras and by lifting Yang-Baxter operators. The $n=3$ or the tetrahedron case is analyzed in detail. For the qubit case our methods produce 13 inequivalent families of unitary tetrahedron operators. 12 of these families are obtained by appending the 5 unitary families of 4 by 4 constant Yang-Baxter operators of Dye-Hietarinta, with a single qubit operator. As applications, universal sets of single, two and three qubit gates are realized using such unitary tetrahedron operators. The ideas presented in this work can be naturally extended to the higher simplex cases.

Submitted 15 July, 2024; originally announced July 2024.

Comments: 34 pages of main text + 4 pages of appendices + 8 pages of references

arXiv:2407.09941 [pdf, other]

Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers

Authors: Sukjun Hwang, Aakash Lahoti, Tri Dao, Albert Gu

Abstract: A wide array of sequence models are built on a framework modeled after Transformers, comprising alternating sequence mixer and channel mixer layers. This paper studies a unifying matrix mixer view of sequence mixers that can be conceptualized as a linear map on the input sequence. This framework encompasses a broad range of well-known sequence models, including the self-attention of Transformers as well as recent strong alternatives such as structured state space models (SSMs), and allows understanding downstream characteristics such as efficiency and expressivity through properties of their structured matrix class. We identify a key axis of matrix parameterizations termed sequence alignment, which increases the flexibility and performance of matrix mixers, providing insights into the strong performance of Transformers and recent SSMs such as Mamba. Furthermore, the matrix mixer framework offers a systematic approach to developing sequence mixers with desired properties, allowing us to develop several new sub-quadratic sequence models. In particular, we propose a natural bidirectional extension of the Mamba model (Hydra), parameterized as a quasiseparable matrix mixer, which demonstrates superior performance over other sequence models including Transformers on non-causal tasks. As a drop-in replacement for attention layers, Hydra outperforms BERT by 0.8 points on the GLUE benchmark and ViT by 2% Top-1 accuracy on ImageNet.

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.09298 [pdf, other]

Transformer Layers as Painters

Authors: Qi Sun, Marc Pickett, Aakash Kumar Nain, Llion Jones

Abstract: Despite their nearly universal adoption for large language models, the internal workings of transformers are not well understood. We aim to better understand the impact of removing or reorganizing information throughout the layers of a pretrained transformer. Such an understanding could both yield better usage of existing models as well as to make architectural improvements to produce new variants. We present a series of empirical studies on frozen models that show that the lower and final layers of pretrained transformers differ from middle layers, but that middle layers have a surprising amount of uniformity. We further show that some classes of problems have robustness to skipping layers, running the layers in an order different from how they were trained, or running the layers in parallel. Our observations suggest that even frozen pretrained models may gracefully trade accuracy for latency by skipping layers or running layers in parallel.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 15 pages total, including references and appendices
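The layer-skipping and layer-reordering experiments described in the abstract above can be pictured with a toy residual stack. This is only an illustrative sketch, not the authors' code: `run_layers`, `skip_span`, and `toy_layers` are hypothetical names, and each "layer" is a plain function that returns a residual update rather than a real transformer block.

```python
def run_layers(x, layers, order=None):
    """Apply residual layers to x in the given order (default: trained order)."""
    if order is None:
        order = range(len(layers))
    for i in order:
        # Residual connection: each layer adds its update to the running state.
        x = x + layers[i](x)
    return x


def skip_span(num_layers, start, end):
    """Return layer indices with the half-open span [start, end) removed."""
    return [i for i in range(num_layers) if not (start <= i < end)]


# Toy stand-ins for frozen layers: layer k always contributes an update of k.
toy_layers = [lambda x, k=k: k for k in range(5)]

full_output = run_layers(0, toy_layers)                          # all five layers
skipped_output = run_layers(0, toy_layers, skip_span(5, 1, 3))   # layers 1 and 2 skipped
reversed_output = run_layers(0, toy_layers, [4, 3, 2, 1, 0])     # reversed execution order
```

Because the toy updates ignore their input, reordering leaves the result unchanged here; in a real model the updates depend on the running state, which is why the abstract treats robustness to reordering as an empirical finding.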

arXiv:2407.06482 [pdf, other]

The Anomalous Acceleration of PSR J2043+1711: Long-Period Orbital Companion or Stellar Flyby?

Authors: Thomas Donlon II, Sukanya Chakrabarti, Michael T. Lam, Daniel Huber, Daniel Hey, Enrico Ramirez-Ruiz, Benjamin Shappee, David L. Kaplan, Gabriella Agazie, Akash Anumarlapudi, Anne M. Archibald, Zaven Arzoumanian, Paul T. Baker, Paul R. Brook, H. Thankful Cromartie, Kathryn Crowter, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Gabriel E. Freedman, Nate Garver-Daniels, Peter A. Gentile, et al. (31 additional authors not shown)

Abstract: Based on the rate of change of its orbital period, PSR J2043+1711 has a substantial peculiar acceleration of 3.5 $\pm$ 0.8 mm/s/yr, which deviates from the acceleration predicted by equilibrium Milky Way models at a $4σ$ level. The magnitude of the peculiar acceleration is too large to be explained by disequilibrium effects of the Milky Way interacting with orbiting dwarf galaxies ($\sim$1 mm/s/yr), and too small to be caused by period variations due to the pulsar being a redback. We identify and examine two plausible causes for the anomalous acceleration: a stellar flyby, and a long-period orbital companion. We identify a main-sequence star in \textit{Gaia} DR3 and Pan-STARRS DR2 with the correct mass, distance, and on-sky position to potentially explain the observed peculiar acceleration. However, the star and the pulsar system have substantially different proper motions, indicating that they are not gravitationally bound. However, it is possible that this is an unrelated star that just happens to be located near J2043+1711 along our line of sight (chance probability of 1.6\%). Therefore, we also constrain possible orbital parameters for a circumbinary companion in a hierarchical triple system with J2043+1711; the changes in the spindown rate of the pulsar are consistent with an outer object that has an orbital period of 80 kyr, a companion mass of 0.3 $M_\odot$ (indicative of a white dwarf or low-mass star), and a semi-major axis of 2000 AU. Continued timing and/or future faint optical observations of J2043+1711 may eventually allow us to differentiate between these scenarios.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.04685 [pdf, other]

The miscibility of hydrogen and water in planetary atmospheres and interiors

Authors: Akash Gupta, Lars Stixrude, Hilke E. Schlichting

Abstract: Many planets in the solar system and across the galaxy have hydrogen-rich atmospheres overlying more heavy element-rich interiors with which they interact for billions of years. Atmosphere-interior interactions are thus crucial to understanding the formation and evolution of these bodies. However, this understanding is still lacking in part because the relevant pressure-temperature conditions are extreme. We conduct molecular dynamics simulations based on Density Functional Theory to investigate how hydrogen and water interact over a wide range of pressure and temperature, encompassing the interiors of Neptune-sized and smaller planets. We determine the critical curve at which a single homogeneous phase exsolves into two separate, hydrogen-rich and water-rich phases, finding good agreement with existing experimental data. We find that the temperature along the critical curve increases with increasing pressure and shows the influence of a change in fluid structure from molecular to atomic near 30 GPa and 3000 K, which may impact magnetic field generation. The internal temperatures of many exoplanets, including TOI-270 d and K2-18 b may lie entirely above the critical curve: the envelope is expected to consist of a single homogeneous hydrogen-water fluid, that is much less susceptible to atmospheric loss as compared with a pure hydrogen envelope. As planets cool, they cross the critical curve, leading to rainout of water-rich fluid and an increase in internal luminosity. Compositions of the resulting outer, hydrogen-rich, and inner, water-rich envelopes depend on age and instellation and are governed by thermodynamics. Rainout of water may be occurring in Uranus and Neptune at present.

Submitted 5 July, 2024; originally announced July 2024.

Comments: 17 pages, 4 figures

arXiv:2407.04386 [pdf, other]

doi 10.1109/TRO.2024.3422052

A Tree-based Next-best-trajectory Method for 3D UAV Exploration

Authors: Björn Lindqvist, Akash Patel, Kalle Löfgren, George Nikolakopoulos

Abstract: This work presents a fully integrated tree-based combined exploration-planning algorithm: Exploration-RRT (ERRT). The algorithm is focused on providing real-time solutions for local exploration in a fully unknown and unstructured environment while directly incorporating exploratory behavior, robot-safe path planning, and robot actuation into the central problem. ERRT provides a complete sampling and tree-based solution for evaluating "where to go next" by considering a trade-off between maximizing information gain, and minimizing the distances travelled and the robot actuation along the path. The complete scheme is evaluated in extensive simulations, comparisons, as well as real-world field experiments in constrained and narrow subterranean and GPS-denied environments. The framework is fully ROS-integrated, straight-forward to use, and we open-source it at https://github.com/LTU-RAI/ExplorationRRT.

Submitted 5 July, 2024; originally announced July 2024.

Comments: 19 pages, 29 figures, Transactions on Robotics

ACM Class: I.2.9

arXiv:2407.03666 [pdf, ps, other]

Greedy on Preorder is Linear for Preorder Initial Tree

Authors: Akash Pareek

Abstract: The (preorder) traversal conjecture states that starting with an initial tree, the cost to search a sequence $S=(s_1,s_2,\dots,s_n) \in [n]^n$ in a binary search tree (BST) algorithm is $O(n)$, where $S$ is obtained by a preorder traversal of some BST. The sequence $S$ is called a preorder sequence. For Splay trees (candidate for dynamic optimality conjecture), the preorder traversal holds only when the initial tree is empty (Levy and Tarjan, WADS 2019). The preorder traversal conjecture for GREEDY (candidate for dynamic optimality conjecture) was known to be $n2^{α(n)^{O(1)}}$ (Chalermsook et al., FOCS 2015), which was recently improved to $O(n2^{α(n)})$ (Chalermsook et al., SODA 2023), here $α(n)$ is the inverse Ackermann function of $n$. For a special case when the initial tree is flat, GREEDY is known to satisfy the traversal conjecture, i.e., $O(n)$ (Chalermsook et al., FOCS 2015). In this paper, we show that for every preorder sequence $S$, there exists an initial tree called the preorder initial tree for which GREEDY satisfies the preorder traversal conjecture.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 10 pages, 5 figures

arXiv:2407.03424 [pdf, other]

Supernova Shocks Cannot Explain the Inflated State of Hypervelocity Runaways from White Dwarf Binaries

Authors: Aakash Bhat, Evan B. Bauer, Rüdiger Pakmor, Ken J. Shen, Ilaria Caiazzo, Abinaya Swaruba Rajamuthukumar, Kareem El-Badry, Wolfgang E. Kerzendorf

Abstract: Recent observations have found a growing number of hypervelocity stars with speeds of $\approx 1500-2500\,$km\,s$^{-1}$ which could have only been produced through thermonuclear supernovae in white dwarf binaries. Most of the observed hypervelocity runaways in this class display a surprising inflated structure: their current radii are roughly an order of magnitude greater than they would have been as white dwarfs filling their Roche lobe. While many simulations exist studying the dynamical phase leading to supernova detonation in these systems, no detailed calculations of the long-term structure of the runaways have yet been performed. We use an existing \textsc{Arepo} hydrodynamical simulation of a supernova in a white dwarf binary as a starting point for the evolution of these stars with the 1 dimensional stellar evolution code MESA. We show that the supernova shock is not enough to inflate the white dwarf over timescales longer than a few thousand years, significantly shorter than the $10^{5-6}$ year lifetimes inferred for observed hypervelocity runaways. Despite experiencing a shock from a supernova less than $\approx 0.02\,R_\odot$ away, our models do not experience significant interior heating, and all contract back to radii around $0.01\,R_\odot$ within about $10^4$\,years. Explaining the observed inflated states requires either an additional source of significant heating or some other physics that is not yet accounted for in the subsequent evolution.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Submitted to A&A. 15 pages, 17 figures

arXiv:2407.02238 [pdf, other]

MIREncoder: Multi-modal IR-based Pretrained Embeddings for Performance Optimizations

Authors: Akash Dutta, Ali Jannesari

Abstract: One of the primary areas of interest in High Performance Computing is the improvement of performance of parallel workloads. Nowadays, compilable source code-based optimization tasks that employ deep learning often exploit LLVM Intermediate Representations (IRs) for extracting features from source code. Most such works target specific tasks, or are designed with a pre-defined set of heuristics. So far, pre-trained models are rare in this domain, but the possibilities have been widely discussed. Especially approaches mimicking large-language models (LLMs) have been proposed. But these have prohibitively large training costs. In this paper, we propose MIREncoder, a Multi-modal IR-based Auto-Encoder that can be pre-trained to generate a learned embedding space to be used for downstream tasks by machine learning-based approaches. A multi-modal approach enables us to better extract features from compilable programs. It allows us to better model code syntax, semantics and structure. For code-based performance optimizations, these features are very important while making optimization decisions. A pre-trained model/embedding implicitly enables the usage of transfer learning, and helps move away from task-specific trained models. Additionally, a pre-trained model used for downstream performance optimization should itself have reduced overhead, and be easily usable. These considerations have led us to propose a modeling approach that i) understands code semantics and structure, ii) enables use of transfer learning, and iii) is small and simple enough to be easily re-purposed or reused even with low resource availability. Our evaluations will show that our proposed approach can outperform the state of the art while reducing overhead.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 12 pages, 6 figures, 9 tables, PACT '24 conference

arXiv:2407.01283 [pdf, other]

Energy-Aware Decentralized Learning with Intermittent Model Training

Authors: Akash Dhasade, Paolo Dini, Elia Guerra, Anne-Marie Kermarrec, Marco Miozzo, Rafael Pires, Rishi Sharma, Martijn de Vos

Abstract: Decentralized learning (DL) offers a powerful framework where nodes collaboratively train models without sharing raw data and without the coordination of a central server. In the iterative rounds of DL, models are trained locally, shared with neighbors in the topology, and aggregated with other models received from neighbors. Sharing and merging models contribute to convergence towards a consensus model that generalizes better across the collective data captured at training time. In addition, the energy consumption while sharing and merging model parameters is negligible compared to the energy spent during the training phase. Leveraging this fact, we present SkipTrain, a novel DL algorithm, which minimizes energy consumption in decentralized learning by strategically skipping some training rounds and substituting them with synchronization rounds. These training-silent periods, besides saving energy, also allow models to better mix and finally produce models with superior accuracy than typical DL algorithms that train at every round. Our empirical evaluations with 256 nodes demonstrate that SkipTrain reduces energy consumption by 50% and increases model accuracy by up to 12% compared to D-PSGD, the conventional DL algorithm.

Submitted 1 July, 2024; originally announced July 2024.
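The round structure described in the SkipTrain abstract above (train locally, share, aggregate, with some training rounds replaced by synchronization-only rounds) might be sketched as follows. Everything here is an assumption made for illustration: the fixed alternating schedule, the ring topology, the scalar "models", and the names `skiptrain_schedule`, `aggregate`, and `run` are hypothetical and are not taken from the paper.

```python
import random


def skiptrain_schedule(num_rounds, train_rounds, sync_rounds):
    """Label each round 'train' or 'sync': blocks of training rounds
    alternate with blocks of synchronization-only rounds (assumed schedule)."""
    period = train_rounds + sync_rounds
    return ["train" if r % period < train_rounds else "sync"
            for r in range(num_rounds)]


def aggregate(models, neighbors):
    """Each node averages its own model with the models of its neighbors."""
    return [
        (models[i] + sum(models[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
        for i in range(len(models))
    ]


def run(num_nodes=8, num_rounds=12, train_rounds=2, sync_rounds=2, seed=0):
    """Simulate scalar models on a ring; return final models and the number
    of local training steps performed (a crude stand-in for training cost)."""
    rng = random.Random(seed)
    targets = [rng.uniform(-1.0, 1.0) for _ in range(num_nodes)]  # local optima
    models = [0.0] * num_nodes
    neighbors = {i: [(i - 1) % num_nodes, (i + 1) % num_nodes]
                 for i in range(num_nodes)}
    local_training_steps = 0
    for phase in skiptrain_schedule(num_rounds, train_rounds, sync_rounds):
        if phase == "train":
            # Local step: move each model halfway toward its node's optimum.
            models = [m + 0.5 * (t - m) for m, t in zip(models, targets)]
            local_training_steps += num_nodes
        # Sharing and aggregation happen in every round, trained or not.
        models = aggregate(models, neighbors)
    return models, local_training_steps
```

Calling `run()` and `run(train_rounds=1, sync_rounds=0)` compares the skipping schedule with a baseline that trains in every round. The step counts in this toy are not energy measurements and say nothing about the accuracy results reported in the abstract.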

arXiv:2407.00129 [pdf]

Multimodal Learning and Cognitive Processes in Radiology: MedGaze for Chest X-ray Scanpath Prediction

Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Rishi Agrawal, Carol C. Wu, Hien Van Nguyen

Abstract: Predicting human gaze behavior within computer vision is integral for developing interactive systems that can anticipate user attention, address fundamental questions in cognitive science, and hold implications for fields like human-computer interaction (HCI) and augmented/virtual reality (AR/VR) systems. Despite methodologies introduced for modeling human eye gaze behavior, applying these models to medical imaging for scanpath prediction remains unexplored. Our proposed system aims to predict eye gaze sequences from radiology reports and CXR images, potentially streamlining data collection and enhancing AI systems using larger datasets. However, predicting human scanpaths on medical images presents unique challenges due to the diverse nature of abnormal regions. Our model predicts fixation coordinates and durations critical for medical scanpath prediction, outperforming existing models in the computer vision community. Utilizing a two-stage training process and large publicly available datasets, our approach generates static heatmaps and eye gaze videos aligned with radiology reports, facilitating comprehensive analysis. We validate our approach by comparing its performance with state-of-the-art methods and assessing its generalizability among different radiologists, introducing novel strategies to model radiologists' search patterns during CXR image diagnosis. Based on the radiologist's evaluation, MedGaze can generate human-like gaze sequences with a high focus on relevant regions over the CXR images. It sometimes also outperforms humans in terms of redundancy and randomness in the scanpaths.

Submitted 28 June, 2024; originally announced July 2024.

Comments: Submitted to the Journal

arXiv:2407.00101 [pdf, other]

Hybrid Approach to Parallel Stochastic Gradient Descent

Authors: Aakash Sudhirbhai Vora, Dhrumil Chetankumar Joshi, Aksh Kantibhai Patel

Abstract: Stochastic Gradient Descent is used for large datasets to train models to reduce the training time. On top of that data parallelism is widely used as a method to efficiently train neural networks using multiple worker nodes in parallel. Synchronous and asynchronous approach to data parallelism is used by most systems to train the model in parallel. However, both of them have their drawbacks. We propose a third approach to data parallelism which is a hybrid between synchronous and asynchronous approaches, using both approaches to train the neural network. When the threshold function is selected appropriately to gradually shift all parameter aggregation from asynchronous to synchronous, we show that in a given time period our hybrid approach outperforms both asynchronous and synchronous approaches.

Submitted 27 June, 2024; originally announced July 2024.

arXiv:2406.19686 [pdf]

Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction

Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Carol C. Wu, Hien Van Nguyen

Abstract: Human-AI collaboration to identify and correct perceptual errors in chest radiographs has not been previously explored. This study aimed to develop a collaborative AI system, CoRaX, which integrates eye gaze data and radiology reports to enhance diagnostic accuracy in chest radiology by pinpointing perceptual errors and refining the decision-making process. Using public datasets REFLACX and EGD-CXR, the study retrospectively developed CoRaX, employing a large multimodal model to analyze image embeddings, eye gaze data, and radiology reports. The system's effectiveness was evaluated based on its referral-making process, the quality of referrals, and performance in collaborative diagnostic settings. CoRaX was tested on a simulated error dataset of 271 samples with 28% (93 of 332) missed abnormalities. The system corrected 21% (71 of 332) of these errors, leaving 7% (22 of 312) unresolved. The Referral-Usefulness score, indicating the accuracy of predicted regions for all true referrals, was 0.63 (95% CI 0.59, 0.68). The Total-Usefulness score, reflecting the diagnostic accuracy of CoRaX's interactions with radiologists, showed that 84% (237 of 280) of these interactions had a score above 0.40. In conclusion, CoRaX efficiently collaborates with radiologists to address perceptual errors across various abnormalities, with potential applications in the education and training of novice radiologists.

Submitted 28 June, 2024; originally announced June 2024.

Comments: Under Review in Journal

arXiv:2406.18258 [pdf, other]

Sunburst quantum Ising battery

Authors: Akash Mitra, Shashi C. L. Srivastava

Abstract: We study the energy transfer process in the recently proposed sunburst quantum Ising model, which consists of two interacting integrable systems: a transverse Ising chain with a very small transverse field and a finite number of external isolated qubits. We show that in this model of the quantum battery, coupling between the battery and charger can be used to optimize the ergotropy, which is the maximum amount of energy that can be extracted from the battery. At the same time, maximum charging power increases with the coupling strength, allowing for the simultaneous optimization of both ergotropy and charging power in the strong coupling limit. Furthermore, we show that both ergotropy and charging power are independent of the initial state of the charger.

Submitted 26 June, 2024; originally announced June 2024.
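
Note on the term "ergotropy" used in the abstract above: the expression below is the commonly used textbook definition (maximum energy extractable by unitary operations), given here as a reading aid and not as a formula taken from the paper. The symbols $\rho$ (battery state), $H$ (battery Hamiltonian), $r_k$ (eigenvalues of $\rho$) and $\epsilon_k$ (eigenvalues of $H$) are notation assumed for this sketch.

```latex
\mathcal{W}(\rho)
  = \operatorname{Tr}(\rho H) - \min_{U} \operatorname{Tr}\!\left(U \rho\, U^{\dagger} H\right)
  = \operatorname{Tr}(\rho H) - \sum_{k} r_k\, \epsilon_k ,
\qquad
r_1 \ge r_2 \ge \cdots , \quad \epsilon_1 \le \epsilon_2 \le \cdots
```

The second equality pairs the largest populations with the lowest energy levels, which gives the energy of the passive state that no unitary can lower further.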

arXiv:2406.17630 [pdf, other]

KANQAS: Kolmogorov Arnold Network for Quantum Architecture Search

Authors: Akash Kundu, Aritra Sarkar, Abhishek Sadhu

Abstract: Quantum architecture search~(QAS) is a promising direction for optimization and automated design of quantum circuits towards quantum advantage. Recent techniques in QAS focus on machine learning-based approaches from reinforcement learning, like deep Q-network. While multi-layer perceptron-based deep Q-networks have been applied for QAS, their interpretability remains challenging due to the high number of parameters. In this work, we evaluate the practicality of KANs in quantum architecture search problems, analyzing their efficiency in terms of the probability of success, frequency of optimal solutions and their dependencies on various degrees of freedom of the network. In a noiseless scenario, the probability of success and the number of optimal quantum circuit configurations to generate the multi-qubit maximally entangled states are significantly higher than MLPs. Moreover in noisy scenarios, KAN can achieve a better fidelity in approximating maximally entangled state than MLPs, where the performance of the MLP significantly depends on the choice of activation function. Further investigation reveals that KAN requires a very small number of learnable parameters compared to MLPs, however, the average time of executing each episode for KAN is much higher.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 10 pages and 4 figures

arXiv:2406.17610 [pdf, other]

YAQQ: Yet Another Quantum Quantizer -- Design Space Exploration of Quantum Gate Sets using Novelty Search

Authors: Aritra Sarkar, Akash Kundu, Matthew Steinberg, Sibasish Mishra, Sebastiaan Fauquenot, Tamal Acharya, Jarosław A. Miszczak, Sebastian Feld

Abstract: In the standard circuit model of quantum computation, the number and quality of the quantum gates composing the circuit influence the runtime and fidelity of the computation. The fidelity of the decomposition of quantum algorithms, represented as unitary matrices, to bounded depth quantum circuits depends strongly on the set of gates available for the decomposition routine. To investigate this dependence, we explore the design space of discrete quantum gate sets and present a software tool for comparative analysis of quantum processing units and control protocols based on their native gates. The evaluation is conditioned on a set of unitary transformations representing target use cases on the quantum processors. The cost function considers three key factors: (i) the statistical distribution of the decomposed circuits' depth, (ii) the statistical distribution of process fidelities for the approximate decomposition, and (iii) the relative novelty of a gate set compared to other gate sets in terms of the aforementioned properties. The developed software, YAQQ (Yet Another Quantum Quantizer), enables the discovery of an optimized set of quantum gates through this tunable joint cost function. To identify these gate sets, we use the novelty search algorithm, circuit decomposition techniques, and stochastic optimization to implement YAQQ within the Qiskit quantum simulator environment. YAQQ exploits reachability tradeoffs conceptually derived from quantum algorithmic information theory. Our results demonstrate the pragmatic application of identifying gate sets that are advantageous to popularly used quantum gate sets in representing quantum algorithms. Consequently, we demonstrate pragmatic use cases of YAQQ in comparing transversal logical gate sets in quantum error correction codes, designing optimal quantum instruction sets, and compiling to specific quantum processors.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.15371 [pdf, ps, other]

Affirmative safety: An approach to risk management for high-risk AI

Authors: Akash R. Wasil, Joshua Clymer, David Krueger, Emily Dardaman, Simeon Campos, Evan R. Murphy

Abstract: Prominent AI experts have suggested that companies developing high-risk AI systems should be required to show that such systems are safe before they can be developed or deployed. The goal of this paper is to expand on this idea and explore its implications for risk management. We argue that entities developing or deploying high-risk AI systems should be required to present evidence of affirmative safety: a proactive case that their activities keep risks below acceptable thresholds. We begin the paper by highlighting global security risks from AI that have been acknowledged by AI experts and world governments. Next, we briefly describe principles of risk management from other high-risk fields (e.g., nuclear safety). Then, we propose a risk management approach for advanced AI in which model developers must provide evidence that their activities keep certain risks below regulator-set thresholds. As a first step toward understanding what affirmative safety cases should include, we illustrate how certain kinds of technical evidence and operational evidence can support an affirmative safety case. In the technical section, we discuss behavioral evidence (evidence about model outputs), cognitive evidence (evidence about model internals), and developmental evidence (evidence about the training process). In the operational section, we offer examples of organizational practices that could contribute to affirmative safety cases: information security practices, safety culture, and emergency response capacity. Finally, we briefly compare our approach to the NIST AI Risk Management Framework. Overall, we hope our work contributes to ongoing discussions about national and global security risks posed by AI and regulatory approaches to address these risks.

Submitted 14 April, 2024; originally announced June 2024.

arXiv:2406.13881 [pdf, other]

Static Generation of Efficient OpenMP Offload Data Mappings

Authors: Luke Marzen, Akash Dutta, Ali Jannesari

Abstract: Increasing heterogeneity in HPC architectures and compiler advancements have led to OpenMP being frequently used to enable computations on heterogeneous devices. However, the efficient movement of data on heterogeneous computing platforms is crucial for achieving high utilization. The implicit OpenMP data-mapping rules often result in redundant data transfer, which can be a bottleneck for program performance. Programmers must explicitly map data between the host and connected accelerator devices to achieve efficient data movement. For this, OpenMP offers the target data and target update constructs. Ensuring efficient data transfer requires programmers to reason about complex data flow. This can be a laborious and error-prone process since the programmer must keep a mental model of data validity and lifetime spanning multiple data environments. Any automated analysis should maximize data reuse, minimize data transfer, and must consider control flow and context from function call sites, making the analysis interprocedural and context sensitive. In this paper, we present a static analysis tool, OMPDart (OpenMP DAta Reduction Tool), for OpenMP programs that models data dependencies between host and device regions and applies source code transformations to achieve efficient data transfer. The analysis is based on a hybrid data structure that joins an Abstract Syntax Tree (AST) with a Control Flow Graph (CFG). Our evaluations on nine HPC benchmarks demonstrate that OMPDart is capable of generating effective data mapping constructs that substantially reduce data transfer between host and device. OMPDart helps reduce data transfers by 85% and improves runtime performance by 1.6x over an expert-defined implementation of LULESH 2.0.

Submitted 19 June, 2024; originally announced June 2024.

Comments: Accepted to the 2024 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC24)

arXiv:2406.12352 [pdf, other]

A two-minute burst of highly polarised radio emission originating from low Galactic latitude

Authors: Dougal Dobie, Andrew Zic, Lucy S. Oswald, Joshua Pritchard, Marcus E. Lower, Ziteng Wang, Hao Qiu, Natasha Hurley-Walker, Yuanming Wang, Emil Lenc, David L. Kaplan, Akash Anumarlapudi, Katie Auchettl, Matthew Bailes, Andrew D. Cameron, Jeffrey Cooke, Adam Deller, Laura N. Driessen, James Freeburn, Tara Murphy, Ryan M. Shannon, Adam J. Stewart

Abstract: Several sources of repeating coherent bursts of radio emission with periods of many minutes have now been reported in the literature. These ``ultra-long period'' (ULP) sources have no clear multi-wavelength counterparts and challenge canonical pulsar emission models, leading to debate regarding their nature. In this work we report the discovery of a bright, highly-polarised burst of radio emission at low Galactic latitude as part of a wide-field survey for transient and variable radio sources. ASKAP\,J175534.9$-$252749.1 does not appear to repeat, with only a single intense two-minute $\sim 200$\,mJy burst detected from 60~hours of observations. The burst morphology and polarisation properties are comparable to those of classical pulsars but the duration is more than one hundred times longer, analogous to ULPs. No comparable bursts are detected in the rest of our widefield survey to date. Combined with the existing ULP population, this suggests that these sources have a strong Galactic latitude dependence and hints at an unexplored population of transient and variable radio sources in the thin disk of the Milky Way. The resemblance of this burst with both ULPs and pulsars calls for a unified coherent emission model for objects with spin periods from milliseconds to tens of minutes. However, whether or not these are all neutron stars or have the same underlying power source remains open for debate.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.09208 [pdf, other]

Python-based DSL for generating Verilog model of Synchronous Digital Circuits

Authors: Mandar Datar, Dhruva S. Hegde, Vendra Durga Prasad, Manish Prajapati, Neralla Manikanta, Devansh Gupta, Janampalli Pavanija, Pratyush Pare, Akash, Shivam Gupta, Sachin B. Patkar

Abstract: We have designed a Python-based Domain Specific Language (DSL) for modeling synchronous digital circuits. In this DSL, hardware is modeled as a collection of transactions -- running in series, parallel, and loops. When the model is executed by a Python interpreter, synthesizable and behavioural Verilog is generated as output, which can be integrated with other RTL designs or directly used for FPGA and ASIC flows. In this paper, we describe - 1) the language (DSL), which allows users to express computation in series/parallel/loop constructs, with explicit cycle boundaries, 2) the internals of a simple Python implementation to produce synthesizable Verilog, and 3) several design examples and case studies for applications in post-quantum cryptography, stereo-vision, digital signal processing and optimization techniques. In the end, we list ideas to extend this framework.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 9 pages, 13 figures

arXiv:2406.08521 [pdf, other]

Embedding-based Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes

Authors: Asim Waqas, Aakash Tripathi, Paul Stewart, Mia Naeini, Ghulam Rasool

Abstract: Cancer clinics capture disease data at various scales, from genetic to organ level. Current bioinformatic methods struggle to handle the heterogeneous nature of this data, especially with missing modalities. We propose PARADIGM, a Graph Neural Network (GNN) framework that learns from multimodal, heterogeneous datasets to improve clinical outcome prediction. PARADIGM generates embeddings from multi-resolution data using foundation models, aggregates them into patient-level representations, fuses them into a unified graph, and enhances performance for tasks like survival analysis. We train GNNs on pan-Squamous Cell Carcinomas and validate our approach on Moffitt Cancer Center lung SCC data. Multimodal GNN outperforms other models in patient survival prediction. Converging individual data modalities across varying scales provides a more insightful disease view. Our solution aims to understand the patient's circumstances comprehensively, offering insights on heterogeneous data integration and the benefits of converging maximum data views.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.08371 [pdf, other]

An Untargeted Search for Radio-Emitting Tidal Disruption Events in the VAST Pilot Survey

Authors: Hannah Dykaar, Maria R. Drout, B. M. Gaensler, David L. Kaplan, Tara Murphy, Assaf Horesh, Akash Anumarlapudi, Dougal Dobie, Laura N. Driessen, Emil Lenc, Adam Stewart

Abstract: We present a systematic search for tidal disruption events (TDEs) using radio data from the Variables and Slow Transients (VAST) Pilot Survey conducted using the Australian Square Kilometre Array Pathfinder (ASKAP). Historically, TDEs have been identified using observations at X-ray, optical, and ultraviolet wavelengths. After discovery, a few dozen TDEs have been shown to have radio counterparts through follow-up observations. With systematic time-domain radio surveys becoming available, we can now identify new TDEs in the radio regime. A population of radio-discovered TDEs has the potential to provide several key insights including an independent constraint on their volumetric rate. We conducted a search to select variable radio sources with a single prominent radio flare and a position consistent within 2$σ$ of the nucleus of a known galaxy. While TDEs were the primary target of our search, sources identified in this search may also be consistent with active galactic nuclei exhibiting unusual flux density changes at the timescales probed, uncharacteristically bright supernovae, or a population of gamma-ray bursts. We identify a sample of 12 radio-bright candidate TDEs. The timescales and luminosities range from ~6 to 230 days and ~10$^{38}$ to 10$^{41}$ erg s$^{-1}$, consistent with models of radio emission from TDEs that launch relativistic jets. After calculating the detection efficiency of our search using a Monte Carlo simulation of TDEs, and assuming all 12 sources are jetted TDEs, we derive a volumetric rate for jetted TDEs of 0.80$^{+0.31}_{-0.23}$ Gpc$^{-3}$ yr$^{-1}$, consistent with previous empirically estimated rates.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 37 pages, 37 figures, accepted to ApJ

arXiv:2406.07334 [pdf, other]

Fractonic solids

Authors: Akash Jain

Abstract: Fractons are exotic quasiparticles whose mobility in space is restricted by symmetries. In potential real-world realisations, fractons are likely lodged to a physical material rather than absolute space. Motivated by this, we propose and explore a new symmetry principle that restricts the motion of fractons relative to a physical solid. Unlike models with restricted mobility in absolute space, these fractonic solids admit gauge-invariant momentum density, are compatible with boost symmetry, and can consistently be coupled to gravity. We also propose a holographic model for fractonic solids.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 5 pages + bibliography and supplementary material; a supplementary mathematica notebook is included containing the details of dispersion relations

arXiv:2406.06739 [pdf, other]

Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval

Authors: Ravisri Valluri, Akash Kumar Mohankumar, Kushal Dave, Amit Singh, Jian Jiao, Manik Varma, Gaurav Sinha

Abstract: Generative Retrieval introduces a new approach to Information Retrieval by reframing it as a constrained generation task, leveraging recent advancements in Autoregressive (AR) language models. However, AR-based Generative Retrieval methods suffer from high inference latency and cost compared to traditional dense retrieval techniques, limiting their practical applicability. This paper investigates fully Non-autoregressive (NAR) language models as a more efficient alternative for generative retrieval. While standard NAR models alleviate latency and cost concerns, they exhibit a significant drop in retrieval performance (compared to AR models) due to their inability to capture dependencies between target tokens. To address this, we question the conventional choice of limiting the target token space to solely words or sub-words. We propose PIXAR, a novel approach that expands the target vocabulary of NAR models to include multi-word entities and common phrases (up to 5 million tokens), thereby reducing token dependencies. PIXAR employs inference optimization strategies to maintain low inference latency despite the significantly larger vocabulary. Our results demonstrate that PIXAR achieves a relative improvement of 31.0% in MRR@10 on MS MARCO and 23.2% in Hits@5 on Natural Questions compared to standard NAR models with similar latency and cost. Furthermore, online A/B experiments on a large commercial search engine show that PIXAR increases ad clicks by 5.08% and revenue by 4.02%.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 14 pages, 6 tables, 2 figures
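
The abstract above reports results in MRR@10 and Hits@5. As a reading aid, here is a minimal sketch of how these two retrieval metrics are commonly computed. It is not the paper's evaluation code; the function names and the input layout (a list of `(ranked_ids, relevant_ids)` pairs, one per query) are assumptions made for this illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple

# One query = (ranked result ids, set of relevant ids). Hypothetical layout.
Run = Tuple[Sequence[Hashable], Set[Hashable]]


def reciprocal_rank_at_k(ranked: Sequence[Hashable],
                         relevant: Set[Hashable], k: int) -> float:
    """Return 1/rank of the first relevant result in the top k, else 0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mrr_at_k(runs: Iterable[Run], k: int = 10) -> float:
    """Mean reciprocal rank over all queries, truncated at rank k."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank_at_k(r, rel, k) for r, rel in runs) / len(runs)


def hits_at_k(runs: Iterable[Run], k: int = 5) -> float:
    """Fraction of queries with at least one relevant result in the top k."""
    runs = list(runs)
    if not runs:
        return 0.0
    hits = sum(1 for r, rel in runs if any(d in rel for d in r[:k]))
    return hits / len(runs)


if __name__ == "__main__":
    example = [
        (["a", "b", "c"], {"b"}),  # first relevant at rank 2
        (["x", "y"], {"x"}),       # first relevant at rank 1
        (["p", "q"], {"z"}),       # no relevant result retrieved
    ]
    print(mrr_at_k(example, k=10), hits_at_k(example, k=5))
```

A "relative improvement" as quoted in the abstract would then be the ratio `(new - baseline) / baseline` of two such scores.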

arXiv:2406.05828 [pdf, other]

Multi-Stain Multi-Level Convolutional Network for Multi-Tissue Breast Cancer Image Segmentation

Authors: Akash Modi, Sumit Kumar Jha, Purnendu Mishra, Rajiv Kumar, Kiran Aatre, Gursewak Singh, Shubham Mathur

Abstract: Digital pathology and microscopy image analysis are widely employed in the segmentation of digitally scanned IHC slides, primarily to identify cancer and pinpoint regions of interest (ROI) indicative of tumor presence. However, current ROI segmentation models are either stain-specific or suffer from the issues of stain and scanner variance due to different staining protocols or modalities across multiple labs. Also, tissues like Ductal Carcinoma in Situ (DCIS), acini, etc. are often classified as Tumors due to their structural similarities and color compositions. In this paper, we proposed a novel convolutional neural network (CNN) based Multi-class Tissue Segmentation model for histopathology whole-slide Breast slides which classify tumors and segments other tissue regions such as Ducts, acini, DCIS, Squamous epithelium, Blood Vessels, Necrosis, etc. as a separate class. Our unique pixel-aligned non-linear merge across spatial resolutions empowers models with both local and global fields of view for accurate detection of various classes. Our proposed model is also able to separate bad regions such as folds, artifacts, blurry regions, bubbles, etc. from tissue regions using multi-level context from different resolutions of WSI. Multi-phase iterative training with context-aware augmentation and increasing noise was used to efficiently train a multi-stain generic model with partial and noisy annotations from 513 slides. Our training pipeline used 12 million patches generated using context-aware augmentations which made our model stain and scanner invariant across data sources. To extrapolate stain and scanner invariance, our model was evaluated on 23000 patches which were for a completely new stain (Hematoxylin and Eosin) from a completely new scanner (Motic) from a different lab. The mean IOU was 0.72 which is on par with model performance on other data sources and scanners.

Submitted 9 June, 2024; originally announced June 2024.
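
The abstract above reports a mean IOU (intersection over union) of 0.72. The following is a minimal sketch of one common way to compute mean IoU for multi-class segmentation, assuming flattened per-pixel label lists and averaging only over classes that appear in either the prediction or the ground truth. It is illustrative only; the function name `mean_iou` is hypothetical and the paper's exact averaging convention is not stated in the abstract.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU over classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) or in either (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Class 0: IoU 1/2, class 1: IoU 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```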

arXiv:2406.04138 [pdf, other]

The 3D-PC: a benchmark for visual perspective taking in humans and machines

Authors: Drew Linsley, Peisen Zhou, Alekh Karkada Ashok, Akash Nagaraj, Gaurav Gaonkar, Francis E Lewis, Zygmunt Pizlo, Thomas Serre

Abstract: Visual perspective taking (VPT) is the ability to perceive and reason about the perspectives of others. It is an essential feature of human intelligence, which develops over the first decade of life and requires an ability to process the 3D structure of visual scenes. A growing number of reports have indicated that deep neural networks (DNNs) become capable of analyzing 3D scenes after training on large image datasets. We investigated if this emergent ability for 3D analysis in DNNs is sufficient for VPT with the 3D perception challenge (3D-PC): a novel benchmark for 3D perception in humans and DNNs. The 3D-PC is comprised of three 3D-analysis tasks posed within natural scene images: 1. a simple test of object depth order, 2. a basic VPT task (VPT-basic), and 3. another version of VPT (VPT-Strategy) designed to limit the effectiveness of "shortcut" visual strategies. We tested human participants (N=33) and linearly probed or text-prompted over 300 DNNs on the challenge and found that nearly all of the DNNs approached or exceeded human accuracy in analyzing object depth order. Surprisingly, DNN accuracy on this task correlated with their object recognition performance. In contrast, there was an extraordinary gap between DNNs and humans on VPT-basic. Humans were nearly perfect, whereas most DNNs were near chance. Fine-tuning DNNs on VPT-basic brought them close to human performance, but they, unlike humans, dropped back to chance when tested on VPT-perturb. Our challenge demonstrates that the training routines and architectures of today's DNNs are well-suited for learning basic 3D properties of scenes and objects but are ill-suited for reasoning about these properties like humans do. We release our 3D-PC datasets and code to help bridge this gap in 3D perception between humans and machines.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03776 [pdf, other]

XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags

Authors: Faisal Tareque Shohan, Mir Tafseer Nayeem, Samsul Islam, Abu Ubaida Akash, Shafiq Joty

Abstract: Millions of news articles published online daily can overwhelm readers. Headlines and entity (topic) tags are essential for guiding readers to decide if the content is worth their time. While headline generation has been extensively studied, tag generation remains largely unexplored, yet it offers readers better access to topics of interest. The need for conciseness in capturing readers' attention necessitates improved content selection strategies for identifying salient and relevant segments within lengthy articles, thereby guiding language models effectively. To address this, we propose to leverage auxiliary information such as images and captions embedded in the articles to retrieve relevant sentences and utilize instruction tuning with variations to generate both headlines and tags for news articles in a multilingual context. To make use of the auxiliary information, we have compiled a dataset named XL-HeadTags, which includes 20 languages across 6 diverse language families. Through extensive evaluation, we demonstrate the effectiveness of our plug-and-play multimodal-multilingual retrievers for both tasks. Additionally, we have developed a suite of tools for processing and evaluating multilingual texts, significantly contributing to the research community by enabling more accurate and efficient analysis across languages.

Submitted 7 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

Comments: ACL 2024 camera ready. The first two authors contributed equally

arXiv:2406.03245 [pdf, other]

doi 10.1145/3661455.3669867

Reconfiguring Participatory Design to Resist AI Realism

Authors: Aakash Gautam

Abstract: The growing trend of artificial intelligence (AI) as a solution to social and technical problems reinforces AI Realism -- the belief that AI is an inevitable and natural order. In response, this paper argues that participatory design (PD), with its focus on democratic values and processes, can play a role in questioning and resisting AI Realism. I examine three concerning aspects of AI Realism: the facade of democratization that lacks true empowerment, demands for human adaptability in contrast to AI systems' inflexibility, and the obfuscation of essential human labor enabling the AI system. I propose resisting AI Realism by reconfiguring PD to continue engaging with value-centered visions, increasing its exploration of non-AI alternatives, and making the essential human labor underpinning AI systems visible. I position PD as a means to generate friction against AI Realism and open space for alternative futures centered on human needs and values.

Submitted 8 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

Comments: 6 pages, 1 table

Journal ref: Participatory Design Conference 2024

arXiv:2406.01247 [pdf]

Modal Analysis of Cellular Dynamics in the Morphospace in Epithelial-Mesenchymal Transition

Authors: Akash Chandra Das, Debanga Raj Neog, Biplab Bose

Abstract: During epithelial-mesenchymal transition (EMT), epithelial cells change their morphology, disperse, and gain mesenchymal-like characteristics. Usually, cells are categorized into discrete cell types or states based on gene expression and other cellular features. Subsequently, EMT is investigated as a dynamical process where cells jump from one discrete state to another. In the current work, we moved away from this idea of discrete state transition and investigated EMT dynamics in a continuous phenotypic space. We used morphology to define the phenotype of a cell. We used the data from quantitative image analysis of MDA-MB-468 cells undergoing EGF-induced EMT. We defined the morphological state space or 'morphospace' using the morphological features extracted through image analysis. During EMT, as the morphology changed, the distribution of cells in the morphospace also changed. However, this morphospace had a very high dimension. We reduced it to a 2-dimensional "reduced morphospace" and investigated the temporal change in the spatial distribution of cells in this reduced space. We used proper orthogonal decomposition to find dominant dynamical features of this spatio-temporal data. The modal analysis detected key features of EMT in this experimental system - reversible transition, distinct paths of phenotypic transition during induction and reversal of EMT, and enhanced diversity of cells during reversal of EMT. We also provide some intuitive physical meaning of the spatial modes and connect them to the key molecular event during EMT.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 33 pages, 8 figures

arXiv:2406.01000 [pdf, other]

doi 10.1016/j.asr.2024.06.040

Seasonal variation in nighttime NO radiative cooling as observed by TIMED/SABER in lower thermosphere during solar maximum and solar minimum

Authors: Alok Kumar Ranjan, MV Sunil Krishna, Akash Kumar, Dayakrishna Nailwal, Sumanta Sarkhel

Abstract: Both composition and temperature play a crucial role in determining the NO radiative cooling in the lower thermosphere as observed by TIMED/SABER. In this work, we present a detailed investigation of seasonal variation in thermospheric NO radiative cooling. We have carried forward the investigation of \cite{li2018} regarding the variations in local nighttime peak NO radiative cooling and its altitude during solar maximum and solar minimum conditions. By analyzing latitudinal changes over quiet times for each month in year 2018, it is evident that both the investigative parameters exhibit summer-winter variability. The qualitative contribution of different species (i.e., NO and O) and temperatures in determining the vertical profile of NO radiative cooling for different latitudes is investigated by utilizing the NRLMSISE-00 estimated parameters and SNOE observed NO density. The temperature, NO density, meridional wind, and associated compositional variations due to asymmetrical solar heating in both the hemispheres during solar minimum conditions seem to be the dominating factors in controlling the NO radiative cooling during different seasons. The altitudes at which maximum cooling by NO occurs exhibit an inverse correlation with the amount of radiative cooling.
Regions of enhanced NO densities (polar and summer hemispheric low-mid latitude regions) have larger NO radiative cooling with lower peak altitudes in comparison to other regions (equatorial to winter hemispheric low-mid latitude regions), where NO radiative cooling is low with higher peak altitude values.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 19 pages, 10 figures

arXiv:2405.21050 [pdf, other]

Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models

Authors: Xinxi Zhang, Song Wen, Ligong Han, Felix Juefei-Xu, Akash Srivastava, Junzhou Huang, Hao Wang, Molei Tao, Dimitris N. Metaxas

Abstract: Adapting large-scale pre-trained generative models in a parameter-efficient manner is gaining traction. Traditional methods like low rank adaptation achieve parameter efficiency by imposing constraints but may not be optimal for tasks requiring high representation capacity. We propose a novel spectrum-aware adaptation framework for generative models. Our method adjusts both singular values and their basis vectors of pretrained weights. Using the Kronecker product and efficient Stiefel optimizers, we achieve parameter-efficient adaptation of orthogonal matrices. We introduce Spectral Orthogonal Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity. Extensive evaluations on text-to-image diffusion models demonstrate SODA's effectiveness, offering a spectrum-aware alternative to existing fine-tuning methods.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20755 [pdf]

Improving code-mixed hate detection by native sample mixing: A case study for Hindi-English code-mixed scenario

Authors: Debajyoti Mazumder, Aakash Kumar, Jasabanta Patro

Abstract: Hate detection has long been a challenging task for the NLP community. The task becomes complex in a code-mixed environment because the models must understand the context and the hate expressed through language alteration. Compared to the monolingual setup, we see very little work on code-mixed hate, as large-scale annotated hate corpora are unavailable for such study. To overcome this bottleneck, we propose using native language hate samples. We hypothesise that in the era of multilingual language models (MLMs), hate in code-mixed settings can be detected by relying mainly on the native language samples. Even though the NLP literature reports the effectiveness of MLMs on hate detection in many cross-lingual settings, their extensive evaluation in a code-mixed scenario is yet to be done. This paper attempts to fill this gap through rigorous empirical experiments. We considered the Hindi-English code-mixed setup as a case study as we have the linguistic expertise for the same.
Some of the interesting observations we got are: (i) adding native hate samples in the code-mixed training set, even in small quantity, improved the performance of MLMs for code-mixed hate detection, (ii) MLMs trained with native samples alone were observed to detect code-mixed hate to a large extent, (iii) the visualisation of attention scores revealed that, when native samples were included in training, MLMs could better focus on the hate emitting words in the code-mixed context, and (iv) finally, when hate is subjective or sarcastic, naively mixing native samples doesn't help much to detect code-mixed hate. We will release the data and code repository to reproduce the reported results.

Submitted 31 May, 2024; originally announced May 2024.

Comments: Generated from XeLaTeX

arXiv:2405.20739 [pdf]

Elucidating the Role of Stacking Faults in TlGaSe$_{2}$ on its Thermoelectric Properties

Authors: Tigran Simonian, Ahin Roy, Akash Bajaj, Rui Dong, Zheng Lei, Zdeněk Sofer, Stefano Sanvito, Valeria Nicolosi

Abstract: Thermoelectric materials are of great interest for heat energy harvesting applications. One such promising material is TlGaSe$_{2}$, a p-type semiconducting ternary chalcogenide. Recent reports show it can be processed as a thin film, opening the door for large-scale commercialization. However, TlGaSe$_{2}$ is prone to stacking faults along the [001] stacking direction and their role in its thermoelectric properties has not been understood to date. Herein, TlGaSe$_{2}$ is investigated via (scanning) transmission electron microscopy and first-principles calculations. Stacking faults are found to be present throughout the material, as density functional theory calculations reveal a lack of preferential stacking order. Electron transport calculations show an enhancement of thermoelectric power factors when stacking faults are present. This implies the presence of stacking faults is key to the material's excellent thermoelectric properties along the [001] stacking direction, which can be further enhanced by doping the material to hole carrier concentrations of approx. 10$^{19}$ cm$^{-3}$.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20592 [pdf, other]

LInK: Learning Joint Representations of Design and Performance Spaces through Contrastive Learning for Mechanism Synthesis

Authors: Amin Heyrani Nobari, Akash Srivastava, Dan Gutfreund, Kai Xu, Faez Ahmed

Abstract: In this paper, we introduce LInK, a novel framework that integrates contrastive learning of performance and design space with optimization techniques for solving complex inverse problems in engineering design with discrete and continuous variables. We focus on the path synthesis problem for planar linkage mechanisms. By leveraging a multi-modal and transformation-invariant contrastive learning framework, LInK learns a joint representation that captures complex physics and design representations of mechanisms, enabling rapid retrieval from a vast dataset of over 10 million mechanisms. This approach improves precision through the warm start of a hierarchical unconstrained nonlinear optimization algorithm, combining the robustness of traditional optimization with the speed and adaptability of modern deep learning methods. Our results on an existing benchmark demonstrate that LInK outperforms existing methods, with 28 times less error than a state-of-the-art approach while taking 20 times less time. Moreover, we introduce a significantly more challenging benchmark, named LINK-ABC, which involves synthesizing linkages that trace the trajectories of English capital letters - an inverse design benchmark task that existing methods struggle with due to large non-linearities and tiny feasible space. Our results demonstrate that LInK not only advances the field of mechanism design but also broadens the applicability of contrastive learning and optimization to other areas of engineering.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19927 [pdf, ps, other]

Adsorption of Mo and O at S-vacancy on ReS2 surface of ReS2/MoTe2 vdW heterointerface

Authors: Puneet Kumar Shaw, Jehan Taraporewalla, Sohaib Raza, Akash Kumar, Rimisha Duttagupta, Hafizur Rahaman, Dipankar Saha

Abstract: Applications like high density information storage, neuromorphic computing, nanophotonics, etc. require ultra-thin electronic devices which can be controlled with an applied electric field. Of late, atomically thin two-dimensional (2D) materials and their van der Waals (vdW) heterointerfaces have emerged as suitable candidates for such ultra-low power nanoelectronic devices. In this work, employing density functional theory (DFT), the monolayer ReS2 / monolayer MoTe2 vdW heterostructure with Sulphur vacancy is studied to examine various ground state electronic properties. Changes in effective band gap owing to defect-induced states and modulation of the energy gap value with Molybdenum (Mo) and Oxygen (O) adsorption at the defect site are examined. Since 2D material-based nanoscale devices exhibit promising switching between non-conducting and conducting states, determining the role of defect-induced states and the adsorption of atoms/molecules on surfaces is crucial. Here, a detailed theoretical study to determine surface properties and relative energetic stability of the vdW heterostructures is carried out. The charge re-distribution between the constituent layers is also analyzed by obtaining Electron Difference Density (EDD) for different heterointerfaces. Finally, the efficacy of switching between non-conducting and conducting states is assessed based on adsorption energy of adatoms binding at the defect site.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 21 pages | 10 figures

arXiv:2405.18670 [pdf, other]

Adapting Differentially Private Synthetic Data to Relational Databases

Authors: Kaveh Alimohammadi, Hao Wang, Ojas Gulati, Akash Srivastava, Navid Azizan

Abstract: Existing differentially private (DP) synthetic data generation mechanisms typically assume a single-source table. In practice, data is often distributed across multiple tables with relationships across tables. In this paper, we introduce the first-of-its-kind algorithm that can be combined with any existing DP mechanism to generate synthetic relational databases. Our algorithm iteratively refines the relationship between individual synthetic tables to minimize their approximation errors in terms of low-order marginal distributions while maintaining referential integrity. Finally, we provide both DP and theoretical utility guarantees for our algorithm.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.18573 [pdf, other]

Programmer Visual Attention During Context-Aware Code Summarization

Authors: Aakash Bansal, Robert Wallace, Zachary Karas, Ningzhi Tang, Yu Huang, Toby Jia-Jun Li, Collin McMillan

Abstract: Abridged: Programmer attention represents the visual focus of programmers on parts of the source code in pursuit of programming tasks. We conducted an in-depth human study with XY Java programmers, where each programmer generated summaries for 40 methods from five large Java projects over five one-hour sessions. We used eye-tracking equipment to map the visual attention of programmers while they wrote the summaries. We also rate the quality of each summary. We found eye-gaze patterns and metrics that define common behaviors between programmer attention during context-aware code summarization. Specifically, we found that programmers need to read significantly (p<0.01) fewer words and make significantly fewer revisits to words (p<0.03) as they summarize more methods during a session, while maintaining the quality of summaries. We also found that the amount of source code a participant looks at correlates with a higher quality summary, but this trend follows a bell-shaped curve, such that after a threshold reading more source code leads to a significant decrease (p<0.01) in the quality of summaries. We also gathered insight into the type of methods in the project that provide the most contextual information for code summarization based on programmer attention. Specifically, we observed that programmers spent a majority of their time looking at methods inside the same class as the target method to be summarized. Surprisingly, we found that programmers spent significantly less time looking at methods in the call graph of the target method.
We discuss how our empirical observations may aid future studies towards modeling programmer attention and improving context-aware automatic source code summarization.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 10 pages, 4 figures, 4 tables. this is a pre-print submitted to IEEE Transactions on Software Engineering for review

arXiv:2405.16477 [pdf, other]

Toffoli gates solve the tetrahedron equations

Authors: Akash Sinha, Pramod Padmanabhan, Vladimir Korepin

Abstract: The circuit model of quantum computation can be interpreted as a scattering process. In particular, factorised scattering operators result in integrable quantum circuits that provide universal quantum computation and are potentially less noisy. These are realized through Yang-Baxter or 2-simplex operators. A natural question is to extend this construction to higher qubit gates, like the Toffoli gates, which also lead to universal quantum computation but with shallower circuits. We show that unitary families of such operators are constructed by the 3-dimensional generalizations of the Yang-Baxter operators known as tetrahedron or 3-simplex operators. The latter satisfy a spectral parameter-dependent tetrahedron equation. This construction goes through for $n$-Toffoli gates realized using $n$-simplex operators.

Submitted 26 May, 2024; originally announced May 2024.

Comments: 8 pages + References

arXiv:2405.16081 [pdf, other]

A Study on Developer Behaviors for Validating and Repairing LLM-Generated Code Using Eye Tracking and IDE Actions

Authors: Ningzhi Tang, Meng Chen, Zheng Ning, Aakash Bansal, Yu Huang, Collin McMillan, Toby Jia-Jun Li

Abstract: The increasing use of large language model (LLM)-powered code generation tools, such as GitHub Copilot, is transforming software engineering practices. This paper investigates how developers validate and repair code generated by Copilot and examines the impact of code provenance awareness during these processes. We conducted a lab study with 28 participants, who were tasked with validating and repairing Copilot-generated code in three software projects. Participants were randomly divided into two groups: one informed about the provenance of LLM-generated code and the other not. We collected data on IDE interactions, eye-tracking, cognitive workload assessments, and conducted semi-structured interviews. Our results indicate that, without explicit information, developers often fail to identify the LLM origin of the code. Developers generally employ similar validation and repair strategies for LLM-generated code, but exhibit behaviors such as frequent switching between code and comments, different attentional focus, and a tendency to delete and rewrite code. Being aware of the code's provenance led to improved performance, increased search efforts, more frequent Copilot usage, and higher cognitive workload. These findings enhance our understanding of how developers interact with LLM-generated code and carry implications for designing tools that facilitate effective human-LLM collaboration in software development.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.15669 [pdf, other]

doi 10.1145/3643834.3660730

Enhancing Reentry Support Programs Through Digital Literacy Integration

Authors: Aakash Gautam, Khushboo Gandhi, Jessica Eileen Sendejo

Abstract: Challenges faced by formerly incarcerated individuals in the United States raise questions about our society's ability to truly provide second chances. This paper presents the outcomes of our ongoing collaboration with a non-profit organization dedicated to reentry support. We highlight the multifaceted challenges individuals face during their reentry journey, including support programs that prioritize supervision over service, unresponsive support systems, limited access to resources, financial struggles exacerbated by restricted employment opportunities, and technological barriers. In the face of such complex social challenges, our work aims to facilitate our partner organization's ongoing efforts to promote digital literacy through a web application that is integrated into their existing processes. We share initial feedback from the stakeholders, draw out four implications: supporting continuity of care, promoting reflection through slow technology, building in flexibility, and reconfiguring toward existing infrastructure, and conclude with a reflection on our role as partners on the side.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 15 pages, 1 table, 3 figures

Journal ref: Designing Interactive Systems Conference 2024

arXiv:2405.15644 [pdf, other]

Harnessing Increased Client Participation with Cohort-Parallel Federated Learning

Authors: Akash Dhasade, Anne-Marie Kermarrec, Tuan-Anh Nguyen, Rafael Pires, Martijn de Vos

Abstract: Federated Learning (FL) is a machine learning approach where nodes collaboratively train a global model. As more nodes participate in a round of FL, the effectiveness of individual model updates by nodes also diminishes. In this study, we increase the effectiveness of client updates by dividing the network into smaller partitions, or cohorts. We introduce Cohort-Parallel Federated Learning (CPFL): a novel learning approach where each cohort independently trains a global model using FL, until convergence, and the models produced by each cohort are then unified using one-shot Knowledge Distillation (KD) and a cross-domain, unlabeled dataset. The insight behind CPFL is that smaller, isolated networks converge quicker than in a one-network setting where all nodes participate. Through exhaustive experiments involving realistic traces and non-IID data distributions on the CIFAR-10 and FEMNIST image classification tasks, we investigate the balance between the number of cohorts, model accuracy, training time, and compute and communication resources. Compared to traditional FL, CPFL with four cohorts, non-IID data distribution, and CIFAR-10 yields a 1.9$\times$ reduction in train time and a 1.3$\times$ reduction in resource usage, with a minimal drop in test accuracy.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.15445 [pdf, other]

Cracking of submerged beds

Authors: Satyanu Bhadra, Anit Sane, Akash Ghosh, Shankar Ghosh, Kirti Chandra Sahu

Abstract: We investigate the phenomena of crater formation and gas release caused by projectile impact on underwater beds, which occurs in many natural, geophysical, and industrial applications. The bed in our experiment is constructed of hydrophobic particles, which trap a substantial amount of air in its pores. In contrast to dry beds, the air-water interface in a submerged bed generates a granular skin that provides rigidity to the medium by producing skin over the bulk. The projectile's energy is used to reorganise the grains, which causes the skin to crack, allowing the trapped air to escape. The morphology of the craters as a function of impact energy in submerged beds exhibits different scaling laws than what is known for dry beds. This phenomenon is attributed to the contact line motion on the hydrophobic fractal-like surface of submerged grains. The volume of the gas released is a function of multiple factors, chiefly the velocity of the projectile, depth of the bed and depth of the water column.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 15 pages, 10 figures

arXiv:2405.14941 [pdf, other]

The NANOGrav 15 yr Data Set: Chromatic Gaussian Process Noise Models for Six Pulsars

Authors: Bjorn Larsen, Chiara M. F. Mingarelli, Jeffrey S. Hazboun, Aurelien Chalumeau, Deborah C. Good, Joseph Simon, Gabriella Agazie, Akash Anumarlapudi, Anne M. Archibald, Zaven Arzoumanian, Paul T. Baker, Paul R. Brook, H. Thankful Cromartie, Kathryn Crowter, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Gabriel E. Freedman, Nate Garver-Daniels, Peter A. Gentile, Joseph Glaser, Ross J. Jennings , et al. (39 additional authors not shown)

Abstract: Pulsar timing arrays (PTAs) are designed to detect low-frequency gravitational waves (GWs). GWs induce achromatic signals in PTA data, meaning that the timing delays do not depend on radio-frequency. However, pulse arrival times are also affected by radio-frequency dependent "chromatic" noise from sources such as dispersion measure (DM) and scattering delay variations. Furthermore, the characterization of GW signals may be influenced by the choice of chromatic noise model for each pulsar. To better understand this effect, we assess if and how different chromatic noise models affect achromatic noise properties in each pulsar. The models we compare include existing DM models used by NANOGrav and noise models used for the European PTA Data Release 2 (EPTA DR2). We perform this comparison using a subsample of six pulsars from the NANOGrav 15 yr data set, selecting the same six pulsars as from the EPTA DR2 six-pulsar dataset. We find that the choice of chromatic noise model noticeably affects the achromatic noise properties of several pulsars. This is most dramatic for PSR J1713+0747, where the amplitude of its achromatic red noise lowers from $\log_{10}A_{\text{RN}} = -14.1^{+0.1}_{-0.1}$ to $-14.7^{+0.3}_{-0.5}$, and the spectral index broadens from $γ_{\text{RN}} = 2.6^{+0.5}_{-0.4}$ to $γ_{\text{RN}} = 3.5^{+1.2}_{-0.9}$. We also compare each pulsar's noise properties with those inferred from the EPTA DR2, using the same models. From the discrepancies, we identify potential areas where the noise models could be improved.
These results highlight the potential for custom chromatic noise models to improve PTA sensitivity to GWs.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.12403 [pdf, other]

Searching for gravitational wave optical counterparts with the Zwicky Transient Facility: summary of O4a

Authors: Tomás Ahumada, Shreya Anand, Michael W. Coughlin, Vaidehi Gupta, Mansi M. Kasliwal, Viraj R. Karambelkar, Robert D. Stein, Gaurav Waratkar, Vishwajeet Swain, Theophile Jegou du Laz, Akash Anumarlapudi, Igor Andreoni, Mattia Bulla, Gokul P. Srinivasaragavan, Andrew Toivonen, Avery Wold, Eric C. Bellm, S. Bradley Cenko, David L. Kaplan, Jesper Sollerman, Varun Bhalerao, Daniel Perley, Anirudh Salgundi, Aswin Suresh, K-Ryan Hinds , et al. (27 additional authors not shown)

Abstract: During the first half of the fourth observing run (O4a) of the International Gravitational Wave Network (IGWN), the Zwicky Transient Facility (ZTF) conducted a systematic search for kilonova (KN) counterparts to binary neutron star (BNS) and neutron star-black hole (NSBH) merger candidates. Here, we present a comprehensive study of the five high-significance (FAR < 1 per year) BNS and NSBH candidates in O4a. Our follow-up campaigns relied on both target-of-opportunity observations (ToO) and re-weighting of the nominal survey schedule to maximize coverage. We describe the toolkit we have been developing, Fritz, an instance of SkyPortal, instrumental in coordinating and managing our telescope scheduling, candidate vetting, and follow-up observations through a user-friendly interface. ZTF covered a total of 2841 deg$^2$ within the skymaps of the high-significance GW events, reaching a median depth of g~20.2 mag. We circulated 15 candidates, but found no viable KN counterpart to any of the GW events. Based on the ZTF non-detections of the high-significance events in O4a, we used a Bayesian approach, nimbus, to quantify the posterior probability of KN model parameters that are consistent with our non-detections. Our analysis favors KNe with initial absolute magnitude fainter than -16 mag. The joint posterior probability of a GW170817-like KN associated with all our O4a follow-ups was 64%.
Additionally, we use a survey simulation software, simsurvey, to determine that our combined filtered efficiency to detect a GW170817-like KN is 36%, when considering the 5 confirmed astrophysical events in O3 (1 BNS and 4 NSBH), along with our O4a follow-ups. Following Kasliwal et al. (2020), we derived joint constraints on the underlying KN luminosity function based on our O3 and O4a follow-ups, determining that no more than 76% of KNe fading at 1 mag/day can peak at a magnitude brighter than -17.5 mag.

Submitted 20 May, 2024; originally announced May 2024.

Comments: submitted

arXiv:2405.11023 [pdf, ps, other]

Hydrodynamics of thermal active matter

Authors: Jay Armas, Akash Jain, Ruben Lier

Abstract: Active matter concerns many-body systems comprised of living or self-driven agents that collectively exhibit macroscopic phenomena distinct from conventional passive matter. Using Schwinger-Keldysh effective field theory, we develop a novel hydrodynamic framework for thermal active matter that accounts for local temperature variations and the ensuing stochastic effects. This framework provides a deeper understanding of energy balance, second law of thermodynamics, and thermostated steady states in active matter, while also addressing the systematic violations of fluctuation-dissipation theorem and detailed balance. We use our framework of active hydrodynamics to develop effective field theory actions for active superfluids and active nematics that offer a first-principle derivation of various active transport coefficients and feature activity-induced phase transitions.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2405.10080 [pdf, other]

The Tracking Tapered Gridded Estimator for the 21-cm power spectrum from MWA drift scan observations I: Validation and preliminary results

Authors: Suman Chatterjee, Khandakar Md Asif Elahi, Somnath Bharadwaj, Shouvik Sarkar, Samir Choudhuri, Shiv Sethi, Akash Kumar Patwa

Abstract: Drift scan observations provide the broad sky coverage and instrumental stability needed to measure the Epoch of Reionization (EoR) 21-cm signal. In such observations, the telescope's pointing center (PC) moves continuously on the sky. The Tracking Tapered Gridded Estimator (TTGE) combines observations from different PC to estimate $P(k_{\perp}, k_{\parallel})$ the 21-cm power spectrum, centered on a tracking center (TC) which remains fixed on the sky. The tapering further restricts the sky response to a small angular region around TC, thereby mitigating wide-field foregrounds. Here we consider $154.2 \, {\rm MHz}$ ($z = 8.2$) Murchison Widefield Array (MWA) drift scan observations. The periodic pattern of flagged channels, present in MWA data, is known to introduce artefacts which pose a challenge for estimating $P(k_{\perp}, k_{\parallel})$. We demonstrate that the TTGE is able to recover $P(k_{\perp}, k_{\parallel})$ without any artefacts, and estimate $P(k)$ within $5 \%$ accuracy over a large $k$-range. We also present preliminary results for a single PC, combining 9 nights of observation ($17 \, {\rm min}$ total). We find that $P(k_{\perp}, k_{\parallel})$ exhibits streaks at a fixed interval of $k_{\parallel}=0.29 \, {\rm Mpc}^{-1}$, which matches $\Delta\nu_{\rm per}=1.28 \, {\rm MHz}$ that is the period of the flagged channels. The streaks are not as pronounced at larger $k_{\parallel}$, and in some cases they do not appear to extend across the entire $k_{\perp}$ range. The rectangular region $0.05 \leq k_{\perp} \leq 0.16 \, {\rm Mpc^{-1}}$ and $0.9 \leq k_{\parallel} \leq 4.6 \, {\rm Mpc^{-1}}$ is found to be relatively free of foreground contamination and artefacts, and we have used this to place the $2\sigma$ upper limit $\Delta^2(k) < (1.85 \times 10^4)^2\, {\rm mK^2}$ on the EoR 21-cm mean squared brightness temperature fluctuations at $k=1 \,{\rm Mpc}^{-1}$.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 15 pages, 11 figures, accepted for publication in PASA

arXiv:2405.09589 [pdf, other]

Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey

Authors: Pranab Sahoo, Prabhash Meharia, Akash Ghosh, Sriparna Saha, Vinija Jain, Aman Chadha

Abstract: The rapid advancement of foundation models (FMs) across language, image, audio, and video domains has shown remarkable capabilities in diverse tasks. However, the proliferation of FMs brings forth a critical challenge: the potential to generate hallucinated outputs, particularly in high-stakes applications. The tendency of foundation models to produce hallucinated content arguably represents the biggest hindrance to their widespread adoption in real-world scenarios, especially in domains where reliability and accuracy are paramount. This survey paper presents a comprehensive overview of recent developments that aim to identify and mitigate the problem of hallucination in FMs, spanning text, image, video, and audio modalities. By synthesizing recent advancements in detecting and mitigating hallucination across various modalities, the paper aims to provide valuable insights for researchers, developers, and practitioners. Essentially, it establishes a clear framework encompassing definition, taxonomy, and detection strategies for addressing hallucination in multimodal foundation models, laying the foundation for future research in this pivotal area.

Submitted 20 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Showing 1–50 of 1,052 results for author: Aakash