Search | arXiv e-print repository

XDT-CXR: Investigating Cross-Disease Transferability in Zero-Shot Binary Classification of Chest X-Rays

Authors: Umaima Rahman, Abhishek Basu, Muhammad Uzair Khattak, Aniq Ur Rahman

Abstract: This study explores the concept of cross-disease transferability (XDT) in medical imaging, focusing on the potential of binary classifiers trained on one disease to perform zero-shot classification on another disease affecting the same organ. Utilizing chest X-rays (CXR) as the primary modality, we investigate whether a model trained on one pulmonary disease can make predictions about another novel pulmonary disease, a scenario with significant implications for medical settings with limited data on emerging diseases. The XDT framework leverages the embedding space of a vision encoder, which, through kernel transformation, aids in distinguishing between diseased and non-diseased classes in the latent space. This capability is especially beneficial in resource-limited environments or in regions with low prevalence of certain diseases, where conventional diagnostic practices may fail. However, the XDT framework is currently limited to binary classification, determining only the presence or absence of a disease rather than differentiating among multiple diseases. This limitation underscores the supplementary role of XDT to traditional diagnostic tests in clinical settings. Furthermore, results show that XDT-CXR as a framework is able to make better predictions compared to other zero-shot learning (ZSL) baselines.

Submitted 21 August, 2024; originally announced August 2024.

Comments: Accepted in Machine Learning for Healthcare Conference MLHC 2024

arXiv:2408.04581 [pdf, other]

Probing the morphology of polarized emission induced by fluctuation dynamo using Minkowski functionals

Authors: Riju Dutta, Sharanya Sur, Aritra Basu

Abstract: The morphology and the characteristic scale of polarized structures provide crucial insights into the mechanisms that drive turbulence and maintain magnetic fields in magneto-ionic plasma. We aim to establish the efficacy of Minkowski functionals as quantitative statistical probes of filamentary morphology of polarized synchrotron emission resulting from fluctuation dynamo action. Using synthetic observations generated from magnetohydrodynamic simulations of fluctuation dynamos with varying driving scales ($\ell_{\rm f}$) of turbulence in isothermal, incompressible, and subsonic media, we study the relation between different morphological measures, and their connection to fractional polarization ($p_{\rm f}$). We find that Faraday depolarization at low frequencies gives rise to small-scale polarized structures that have higher filamentarity as compared to the intrinsic structures that are comparable to $\ell_{\rm f}$. Above $\sim3\,{\rm GHz}$, the number of connected polarized structures per unit area ($N_{\rm CC, peak}$) is related to the mean $p_{\rm f}$ ($\langle p_{\rm f} \rangle$) of the emitting region as $\langle p_{\rm f} \rangle \propto N_{\rm CC, peak}^{-1/4}$, provided the scale of the detectable emitting region is larger than $\ell_{\rm f}$. This implies that $N_{\rm CC,peak}$ represents the number of turbulent cells projected on the plane of the sky and can be directly used to infer $\ell_{\rm f}$ via the relation $\ell_{\rm f} \propto N_{\rm CC,peak}^{-1/2}$. An estimate of $\ell_{\rm f}$ thus directly allows for pinning down the turbulence driving mechanism in astrophysical systems. While the simulated conditions are mostly prevalent in the intracluster medium of galaxy clusters, the qualitative morphological features are also applicable in the context of the interstellar medium in galaxies.

Submitted 8 August, 2024; originally announced August 2024.

Comments: Version resubmitted to ApJ after addressing comments of the referee

arXiv:2408.03407 [pdf, other]

Deep Clustering via Distribution Learning

Authors: Guanfang Dong, Zijie Tan, Chenqiu Zhao, Anup Basu

Abstract: Distribution learning finds probability density functions from a set of data samples, whereas clustering aims to group similar data points to form clusters. Although there are deep clustering methods that employ distribution learning methods, past work still lacks theoretical analysis regarding the relationship between clustering and distribution learning. Thus, in this work, we provide a theoretical analysis to guide the optimization of clustering via distribution learning. To achieve better results, we embed deep clustering guided by a theoretical analysis. Furthermore, the distribution learning method cannot always be directly applied to data. To overcome this issue, we introduce a clustering-oriented distribution learning method called Monte-Carlo Marginalization for Clustering. We integrate Monte-Carlo Marginalization for Clustering into Deep Clustering, resulting in Deep Clustering via Distribution Learning (DCDL). Eventually, the proposed DCDL achieves promising results compared to state-of-the-art methods on popular datasets. Considering a clustering task, the new distribution learning method outperforms previous methods as well.

Submitted 6 August, 2024; originally announced August 2024.

arXiv:2408.01627 [pdf, other]

JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model

Authors: Farzaneh Jafari, Stefano Berretti, Anup Basu

Abstract: In recent years, talking head generation has become a focal point for researchers. Considerable effort is being made to refine lip-sync motion, capture expressive facial expressions, generate natural head poses, and achieve high video quality. However, no single model has yet achieved equivalence across all these metrics. This paper aims to animate a 3D face using Jamba, a hybrid Transformer-Mamba model. Mamba, a pioneering Structured State Space Model (SSM) architecture, was designed to address the constraints of the conventional Transformer architecture. Nevertheless, it has several drawbacks. Jamba merges the advantages of both Transformer and Mamba approaches, providing a holistic solution. Based on the foundational Jamba block, we present JambaTalk to enhance motion variety and speed through multimodal integration. Extensive experiments reveal that our method achieves performance comparable or superior to state-of-the-art models.

Submitted 2 August, 2024; originally announced August 2024.

Comments: 12 pages with 3 figures

arXiv:2408.01364 [pdf, ps, other]

Availability versus carrying capacity: Phases of asymmetric exclusion processes competing for finite pools of resources

Authors: Astik Haldar, Parna Roy, Erwin Frey, Abhik Basu

Abstract: We address how the interplay between the finite availability and carrying capacity of particles at different parts of a spatially extended system can control the steady state currents and density profiles in the one-dimensional current-carrying lanes connecting the different parts of the system. To study this, we set up a minimal model consisting of two particle reservoirs of the same finite carrying capacity connected by two equally sized anti-parallel asymmetric exclusion processes (TASEP). We focus on the steady-state currents and particle density profiles in the two TASEP lanes. The ensuing phases and the phase diagrams, which can be remarkably complex, are parametrized by the model parameters defining particle exchange between the TASEP lanes and the reservoirs and the filling fraction of the particles that determines the total resources available. These parameters may be tuned to make the densities of the two TASEP lanes globally uniform or piece-wise continuous in the form of a combination of a single localized domain wall and a spatially constant density or a pair of delocalized domain walls. Our model reveals that the two reservoirs can be preferentially populated or depopulated in the steady states.

Submitted 2 August, 2024; originally announced August 2024.

Comments: 18 pages, 4 figures

arXiv:2407.16748 [pdf, other]

The dispersion measure and rotation measure from fast radio burst host galaxies based on the IllustrisTNG50 simulation

Authors: Timea Orsolya Kovacs, Sui Ann Mao, Aritra Basu, Yik Ki Ma, Laura G. Spitler, Charles R. H. Walker

Abstract: Fast radio bursts (FRBs) will become important cosmological tools, as the number of observed FRBs is increasing rapidly with more surveys being carried out. A large sample of FRBs with dispersion measures (DM) and rotation measures (RM) can be used to study the intergalactic magnetic field. However, the observed DM and RM of FRBs have multiple contributors which must be quantified to obtain the intergalactic medium's (IGM) DM and RM. In this paper, we estimate one such contribution to DM and RM: that of FRB host galaxies. We show how it changes with redshift, galaxy type, and the stellar mass of the galaxies, inclination, and FRB's projected offset. Using the IllustrisTNG50 simulations, we selected 16500 galaxies at redshifts of 0<=z<=2, with stellar masses in the range 9<=log(M*/Msun)<=12. In each galaxy, we calculate the DM and RM contributions of 1000 sightlines, and construct DM and RM probability density functions. We find that the rest frame DM distributions of all galaxies at a given redshift can be fitted by a lognormal function, and the rest frame RM distribution is symmetric around 0 rad m$^{-2}$, and can be fitted by the combination of a Lorentzian and two Gaussian functions. The parameters of these functions change for different subsets of galaxies with different redshift, stellar mass, inclination, and FRB offset. These changes are due to an increasing $n_e$ with redshift, SFR, and stellar mass, and we find a more ordered B field at lower z compared to higher z, suggested by more galaxies with B field reversals and B fields dominated by random B field at higher z. We estimate the FRB host DM and RM contributions, which can be used in the future to isolate the IGM's contribution from the observed DM and RM of FRBs. We predict that to constrain a $\sigma_{\rm RM,IGM}$ of 2 rad m$^{-2}$ to 95% confidence level we need to observe 95000 FRBs at z=0.5, but only 9500 FRBs at z=2.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 24 pages, 15 figures. Accepted for publication in A&A

arXiv:2407.09957 [pdf, other]

Leptonic CP-violation in the sneutrino sector of the BLSSM with Inverse Seesaw

Authors: Arindam Basu, Amit Chakraborty, Yi Liu, Stefano Moretti, Harri Waltari

Abstract: We study CP violation (CPV) in the sneutrino sector within the B-L extension of the Minimal Supersymmetric Standard Model (BLSSM), wherein an inverse seesaw mechanism has been implemented. CPV arises from the new superpotential couplings in the (s)neutrino sector, which can be complex, and the mixing of CP-eigenstates induced by those couplings. CPV leads to asymmetries in so-called T-odd observables, but we argue that such asymmetries also lead to a wider distribution of those observables. We look at a final state where a sneutrino decays to a lepton, two jets and missing transverse momentum at the Future Circular Collider operating in hadron-hadron mode at $100$ TeV and with a luminosity of 3 ab$^{-1}$. In order to exclude the CP conserving scenario we need to improve traditional analysis by introducing boosted decision trees using both standard kinematic variables and T-odd observables, and we need a $Z^{\prime}$ boson not too much above current bounds as a portal to produce sneutrinos efficiently.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 27 pages, 21 figures, 10 tables

arXiv:2407.09180 [pdf, other]

iMIV: in-Memory Integrity Verification for NVM

Authors: Rajat Jain, Aravinda Prasad, Sreenivas Subramoney, Arkaprava Basu

Abstract: Non-volatile Memory (NVM) could bridge the gap between memory and storage. However, NVMs are susceptible to data remanence attacks. Thus, multiple security metadata must persist along with the data to protect the confidentiality and integrity of NVM-resident data. Persisting Bonsai Merkle Tree (BMT) nodes, critical for data integrity, can add significant overheads due to the need to write large amounts of metadata off-chip to the bandwidth-constrained NVMs. We propose iMIV for low-overhead, fine-grained integrity verification through in-memory computing. We argue that memory-intensive integrity verification operations (BMT updates and verification) should be employed close to the NVM to limit off-chip data movement. We design iMIV based on typical NVDIMM designs that have an onboard logic chip with a trusted encryption engine, separate from the untrusted storage media. iMIV reduces the performance overheads from 205% to 55% when integrity verification operations are offloaded to NVM compared to when all the security operations are employed at the memory controller.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.03081 [pdf]

Optically Pumped Terahertz Amplitude Modulation in Type-II Ge QD/Si heterostructures grown via Molecular Beam Epitaxy

Authors: Suprovat Ghosh, Abir Mukherjee, Sudarshan Singh, Samit K Ray, Ananjan Basu, Santanu Manna, Samaresh Das

Abstract: This article exploits group-IV germanium (Ge) quantum dots (QDs) on Silicon-on-Insulator (SOI) grown by molecular beam epitaxy (MBE) in order to explore their optical behaviour in the Terahertz (THz) regime. In this work, Ge QDs, pumped by an above bandgap near infrared wavelength, exhibit THz amplitude modulation in the frequency range of 0.1-1.0 THz. The epitaxial Ge QDs outperform the reference SOI substrate in THz amplitude modulation owing to higher carrier generation in weakly confined dots compared to their bulk counterpart. This is further corroborated using a theoretical model based on the non-equilibrium Green's function (NEGF) method. This model enables the calculation of photo carriers generated (PCG) and their confinement in the Ge QD region. Our model also reroutes the calculation from PCG to corresponding plasma frequency and hence to refractive index and THz photo-conductivity. Moreover, the photo-generated confined holes accumulation at the Ge QDs-Si interface is elevated after optical illumination, leading to a decreased THz photo-conductivity. This augmentation in THz photo-conductivity contributes to a significant enhancement of THz modulation depth ~77% at Ge QDs-Si interfaces compared to bare SOI at 0.1 THz.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 15 pages, 5 figures

arXiv:2406.14863 [pdf, other]

Older and Wiser: The Marriage of Device Aging and Intellectual Property Protection of Deep Neural Networks

Authors: Ning Lin, Shaocong Wang, Yue Zhang, Yangu He, Kwunhang Wong, Arindam Basu, Dashan Shang, Xiaoming Chen, Zhongrui Wang

Abstract: Deep neural networks (DNNs), such as the widely-used GPT-3 with billions of parameters, are often kept secret due to high training costs and privacy concerns surrounding the data used to train them. Previous approaches to securing DNNs typically require expensive circuit redesign, resulting in additional overheads such as increased area, energy consumption, and latency. To address these issues, we propose a novel hardware-software co-design approach for DNN intellectual property (IP) protection that capitalizes on the inherent aging characteristics of circuits and a novel differential orientation fine-tuning (DOFT) to ensure effective protection. Hardware-wise, we employ random aging to produce authorized chips. This process circumvents the need for chip redesign, thereby eliminating any additional hardware overhead during the inference procedure of DNNs. Moreover, the authorized chips demonstrate a considerable disparity in DNN inference performance when compared to unauthorized chips. Software-wise, we propose a novel DOFT, which allows pre-trained DNNs to maintain their original accuracy on authorized chips with minimal fine-tuning, while the model's performance on unauthorized chips is reduced to random guessing. Extensive experiments on various models, including MLP, VGG, ResNet, Mixer, and SwinTransformer, with lightweight binary and practical multi-bit weights demonstrate that the proposed method achieves effective IP protection, with only 10% accuracy on unauthorized chips, while preserving nearly the original accuracy on authorized ones.

Submitted 21 June, 2024; originally announced June 2024.

Comments: Design Automation Conference 2024

arXiv:2406.09219 [pdf, other]

doi 10.1093/mnras/stae1499

Measuring glitch recoveries and braking indices with Bayesian model selection

Authors: Yang Liu, Michael J. Keith, Danai Antonopoulou, Patrick Weltevrede, Benjamin Shaw, Benjamin W. Stappers, Andrew G. Lyne, Mitchell B. Mickaliger, Avishek Basu

Abstract: For a selection of 35 pulsars with large spin-up glitches ($\Delta\nu/\nu\geq10^{-6}$), which are monitored by the Jodrell Bank Observatory, we analyse 157 glitches and their recoveries. All parameters are measured consistently and we choose the best model to describe the post-glitch recovery based on Bayesian evidence. We present updated glitch epochs, sizes, changes of spin-down rate, exponentially recovering components (amplitude and corresponding timescale) when present, as well as pulsars' second frequency derivatives and their glitch associated changes if detected. We discuss the different observed styles of post-glitch recovery as well as some particularly interesting sources. Several correlations are revealed between glitch parameters and pulsar spin parameters, including a very strong correlation between a pulsar's interglitch $|\ddot{\nu}|$ and $\dot{\nu}$, as well as between the glitch-induced spin-down rate change $\Delta\dot{\nu}_{\rm p}$ that does not relax exponentially and $\dot{\nu}$. We find that the ratio $\left|\Delta\dot{\nu}_{\mathrm{p}}/\ddot{\nu}\right|$ can be used as an estimate of glitch recurrence times, especially for those pulsars for which there are indications of a characteristic glitch size and interglitch waiting time. We calculate the interglitch braking index $n$ and find that pulsars with large glitches typically have $n$ greater than $3$, suggesting that internal torques dominate the rotational evolution between glitches. The external torque, e.g. from electromagnetic dipole radiation, could dominate the observed $\ddot{\nu}$ for the youngest pulsars ($\lesssim10^{4}\;\mathrm{yr}$), which may be expected to display $n\sim3$.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 24 pages, 14 figures, 8 tables. Accepted for publication in MNRAS

Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 532, Issue 1, pp.859-882 July 2024

arXiv:2406.06689 [pdf, other]

doi 10.1051/0004-6361/202348588

A spatially resolved radio spectral study of the galaxy M 51

Authors: L. Gajović, B. Adebahr, A. Basu, V. Heesen, M. Brüggen, F. de Gasperin, M. A. Lara-Lopez, J. B. R. Oonk, H. W. Edler, D. J. Bomans, R. Paladino, L. E. Garduño, O. López-Cruz, M. Stein, J. Fritz, J. Piotrowska, A. Sinha

Abstract: Radio continuum emission from galaxies at gigahertz frequencies can be used as an extinction-free tracer of star formation. However, at frequencies of a few hundred megahertz, there is evidence for low-frequency spectral flattening. We wish to better understand the origin of this low-frequency flattening and, to this end, perform a spatially resolved study of the nearby spiral galaxy M 51. We explore the different effects that can cause flattening of the spectrum towards lower frequencies, such as free-free absorption and cosmic-ray ionisation losses. We used radio continuum intensity maps between 54 and 8350 MHz at eight different frequencies, with observations at 240 MHz from the Giant Metrewave Radio Telescope presented for the first time. We corrected for contribution from thermal free-free emission using an H$\alpha$ map that has been extinction-corrected with 24 $\mu$m data. We fitted free-free absorption models to the radio spectra to determine the emission measure (EM) as well as polynomial functions to measure the non-thermal spectral curvature. The non-thermal low-frequency radio continuum spectrum between 54 and 144 MHz is very flat and even partially inverted, particularly in the spiral arms; in contrast, the spectrum at higher frequencies shows the typical non-thermal radio continuum spectrum. However, we do not find any correlation between the EMs calculated from radio and from H$\alpha$ observations; instead, the non-thermal spectral curvature weakly correlates with the HI gas mass surface density. This suggests that cosmic-ray ionisation losses play an important role in the low-frequency spectral flattening. The observed spectral flattening towards low frequencies in M 51 is caused by a combination of ionisation losses and free-free absorption. The reasons for this flattening need to be understood in order to use sub-GHz frequencies as a star-formation tracer.

Submitted 10 June, 2024; originally announced June 2024.

Journal ref: A&A 689, A68 (2024)

arXiv:2406.00576 [pdf, other]

A Universal Transfer Theorem for Convex Optimization Algorithms Using Inexact First-order Oracles

Authors: Phillip Kerger, Marco Molinaro, Hongyi Jiang, Amitabh Basu

Abstract: Given any algorithm for convex optimization that uses exact first-order information (i.e., function values and subgradients), we show how to use such an algorithm to solve the problem with access to inexact first-order information. This is done in a "black-box" manner without knowledge of the internal workings of the algorithm. This complements previous work that considers the performance of specific algorithms like (accelerated) gradient descent with inexact information. In particular, our results apply to a wider range of algorithms beyond variants of gradient descent, e.g., projection-free methods, cutting-plane methods, or any other first-order methods formulated in the future. Further, they also apply to algorithms that handle structured nonconvexities like mixed-integer decision variables.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 15 pages, 1 figure

arXiv:2406.00314 [pdf, other]

CASE: Efficient Curricular Data Pre-training for Building Assistive Psychology Expert Models

Authors: Sarthak Harne, Monjoy Narayan Choudhury, Madhav Rao, TK Srikanth, Seema Mehrotra, Apoorva Vashisht, Aarushi Basu, Manjit Sodhi

Abstract: The limited availability of psychologists necessitates efficient identification of individuals requiring urgent mental healthcare. This study explores the use of Natural Language Processing (NLP) pipelines to analyze text data from online mental health forums used for consultations. By analyzing forum posts, these pipelines can flag users who may require immediate professional attention. A crucial challenge in this domain is data privacy and scarcity. To address this, we propose utilizing readily available curricular texts used in institutes specializing in mental health for pre-training the NLP pipelines. This helps us mimic the training process of a psychologist. Our work presents CASE-BERT that flags potential mental health disorders based on forum text. CASE-BERT demonstrates superior performance compared to existing methods, achieving an F1 score of 0.91 for Depression and 0.88 for Anxiety, two of the most commonly reported mental health disorders. Our code is publicly available.

Submitted 16 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.20070 [pdf]

doi 10.1038/s41598-024-58935-6

Pick-up and assembling of chemically sensitive van der Waals heterostructures using dry cryogenic exfoliation

Authors: Vilas Patil, Sanat Ghosh, Amit Basu, Kuldeep, Achintya Dutta, Khushabu Agrawal, Neha Bhatia, Amit Shah, Digambar A. Jangade, Ruta Kulkarni, A. Thamizhavel, Mandar M. Deshmukh

Abstract: Assembling atomic layers of van der Waals (vdW) materials combines the physics of two materials, offering opportunities for novel functional devices. Realization of this has been possible because of advancements in nanofabrication processes which often involve chemical processing of the materials under study; this can be detrimental to device performance. To address this issue, we have developed a modified micro-manipulator setup for cryogenic exfoliation, pick up, and transfer of vdW materials to assemble heterostructures. We use the glass transition of a polymer PDMS to cleave a flake into two, followed by its pick-up and drop to form pristine twisted junctions. To demonstrate the potential of the technique, we fabricated a twisted heterostructure of Bi$_2$Sr$_2$CaCu$_2$O$_{8+x}$ (BSCCO), a van der Waals high-temperature cuprate superconductor. We also employed this method to re-exfoliate NbSe$_2$ and make a twisted heterostructure. Transport measurements of the fabricated devices indicate the high quality of the artificial twisted interface. In addition, we extend this cryogenic exfoliation method to other vdW materials, offering an effective way of assembling heterostructures and twisted junctions with pristine interfaces.

Submitted 30 May, 2024; originally announced May 2024.

Journal ref: Scientific Reports 14, Article number: 11097 (2024)

arXiv:2405.13992 [pdf, other]

Learning Cut Generating Functions for Integer Programming

Authors: Hongyu Cheng, Amitabh Basu

Abstract: The branch-and-cut algorithm is the method of choice to solve large scale integer programming problems in practice. A key ingredient of branch-and-cut is the use of cutting planes which are derived constraints that reduce the search space for an optimal solution. Selecting effective cutting planes to produce small branch-and-cut trees is a critical challenge in the branch-and-cut algorithm. Recent advances have employed a data-driven approach to select optimal cutting planes from a parameterized family, aimed at reducing the branch-and-bound tree size (in expectation) for a given distribution of integer programming instances. We extend this idea to the selection of the best cut generating function (CGF), which is a tool in the integer programming literature for generating a wide variety of cutting planes that generalize the well-known Gomory Mixed-Integer (GMI) cutting planes. We provide rigorous sample complexity bounds for the selection of an effective CGF from certain parameterized families that provably performs well for any specified distribution on the problem instances. Our empirical results show that the selected CGF can outperform the GMI cuts for certain distributions. Additionally, we explore the sample complexity of using neural networks for instance-dependent CGF selection.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.08428 [pdf, other]

A Low-Power Spike Detector Using In-Memory Computing for Event-based Neural Frontend

Authors: Ye Ke, Arindam Basu

Abstract: With the sensor scaling of next-generation Brain-Machine Interface (BMI) systems, the massive A/D conversion and analog multiplexing at the neural frontend poses a challenge in terms of power and data rates for wireless and implantable BMIs. While previous works have reported the neuromorphic compression of neural signal, further compression requires integration of spike detectors on chip. In this work, we propose an efficient HRAM-based spike detector using In-memory computing for compressive event-based neural frontend. Our proposed method involves detecting spikes from event pulses without reconstructing the signal and uses a 10T hybrid in-memory computing bitcell for the accumulation and thresholding operations. We show that our method ensures a spike detection accuracy of 92-99% for neural signal inputs while consuming only 13.8 nW per channel in 65 nm CMOS.

Submitted 14 May, 2024; originally announced May 2024.

Comments: Originally submitted at IEEE ISCAS 2024

arXiv:2405.08292 [pdf, other]

Hybrid Event-Frame Neural Spike Detector for Neuromorphic Implantable BMI

Authors: Vivek Mohan, Wee Peng Tay, Arindam Basu

Abstract: This work introduces two novel neural spike detection schemes intended for use in next-generation neuromorphic brain-machine interfaces (iBMIs). The first, an Event-based Spike Detector (Ev-SPD) which examines the temporal neighborhood of a neural event for spike detection, is designed for in-vivo processing and offers high sensitivity and decent accuracy (94-97%). The second, Neural Network-based Spike Detector (NN-SPD) which operates on hybrid temporal event frames, provides an off-implant solution using shallow neural networks with impressive detection accuracy (96-99%) and minimal false detections. These methods are evaluated using a synthetic dataset with varying noise levels and validated through comparison with ground truth data. The results highlight their potential in next-gen neuromorphic iBMI systems and emphasize the need to explore this direction further to understand their resource-efficient and high-performance capabilities for practical iBMI settings.

Submitted 13 May, 2024; originally announced May 2024.

Comments: This paper has been accepted for 2024 IEEE International Symposium on Circuits and Systems (ISCAS), Singapore

arXiv:2405.07763 [pdf, ps, other]

The number of cliques in hypergraphs with forbidden subgraphs

Authors: Ayush Basu, Vojtech Rodl, Yi Zhao

Abstract: We study the maximum number of $r$-vertex cliques in $(r-1)$-uniform hypergraphs not containing complete $r$-partite hypergraphs $K_r^{(r-1)}(a_1, \dots, a_r)$. By using the hypergraph removal lemma, we show that this maximum is $o( n^{r - 1/(a_1 \cdots a_{r-1})} )$. This immediately implies the corresponding results of Mubayi and Mukherjee and of Balogh, Jiang, and Luo for graphs. We also provide a lower bound by using hypergraph Turán numbers.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2404.15549 [pdf, other]

PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models

Authors: Shashi Kant Gupta, Aditya Basu, Mauro Nievas, Jerrin Thomas, Nathan Wolfrath, Adhitya Ramamurthi, Bradley Taylor, Anai N. Kothari, Regina Schwind, Therica M. Miller, Sorena Nadaf-Rahrov, Yanshan Wang, Hrituraj Singh

Abstract: Clinical trial matching is the task of identifying trials for which patients may be potentially eligible. Typically, this task is labor-intensive and requires detailed verification of patient electronic health records (EHRs) against the stringent inclusion and exclusion criteria of clinical trials. This process is manual, time-intensive, and challenging to scale up, resulting in many patients missing out on potential therapeutic options. Recent advancements in Large Language Models (LLMs) have made automating patient-trial matching possible, as shown in multiple concurrent research studies. However, the current approaches are confined to constrained, often synthetic datasets that do not adequately mirror the complexities encountered in real-world medical data. In this study, we present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs. Our study showcases the capability of LLMs to accurately match patients with appropriate clinical trials. We perform experiments with proprietary LLMs, including GPT-4 and GPT-3.5, as well as our custom fine-tuned model called OncoLLM and show that OncoLLM, despite its significantly smaller size, not only outperforms GPT-3.5 but also matches the performance of qualified medical doctors. All experiments were carried out on real-world EHRs that include clinical notes and available clinical trials from a single cancer center in the United States.

Submitted 26 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: 30 Pages, 8 Figures, Supplementary Work Attached

arXiv:2404.06680 [pdf, other]

Onco-Retriever: Generative Classifier for Retrieval of EHR Records in Oncology

Authors: Shashi Kant Gupta, Aditya Basu, Bradley Taylor, Anai Kothari, Hrituraj Singh

Abstract: Retrieving information from EHR systems is essential for answering specific questions about patient journeys and improving the delivery of clinical care. Despite this fact, most EHR systems still rely on keyword-based searches. With the advent of generative large language models (LLMs), retrieving information can lead to better search and summarization capabilities. Such retrievers can also feed Retrieval-augmented generation (RAG) pipelines to answer any query. However, the task of retrieving information from real-world clinical data contained within EHR systems in order to solve several downstream use cases is challenging due to the difficulty in creating query-document support pairs. We provide a blueprint for creating such datasets in an affordable manner using large language models. Our method results in a retriever that is 30-50 F-1 points better than proprietary counterparts such as Ada and Mistral for oncology data elements. We further compare our model, called Onco-Retriever, against a fine-tuned PubMedBERT model as well. We conduct an extensive manual evaluation on real-world EHR data along with latency analysis of the different models and provide a path forward for healthcare organizations to build domain-specific retrievers.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 18 pages

arXiv:2404.06409 [pdf, other]

Helium Reionization from Empirical Quasar Luminosity Functions before and after JWST

Authors: Arghyadeep Basu, Enrico Garaldi, Benedetta Ciardi

Abstract: Recently, models of the quasar luminosity function (QLF) rooted on large observational compilations have been produced that, unlike their predecessors, feature a smooth evolution with time. This bypasses the need to assume an ionizing emissivity evolution when simulating helium reionization with observations-based QLF, thus yielding more robust constraints. We combine one such QLF with a cosmological hydrodynamical simulation and 3D multi-frequency radiative transfer. The simulated reionization history is consistently delayed in comparison to most other models in the literature. The predicted intergalactic medium temperature is larger than the observed one at $z \lesssim 3$. Through forward modeling of the He II Lyman-$α$ forest, we show that our model produces an extended helium reionization and successfully matches the bulk of the observed effective optical depth distribution, although it over-ionizes the Universe at $z\lesssim2.8$ as the effect of small-scale Lyman Limit Systems not being resolved. We thoroughly characterize transmission regions and dark gaps in He II Lyman-$α$ forest sightlines. We quantify their sensitivity to the helium reionization, opening a new avenue for further observational studies of this epoch. Finally, we explore the implications for helium reionization of the large number of active galactic nuclei revealed at $z\gtrsim5$ by JWST. We find that such modifications do not affect any observable at $z\leq4$, except in our most extreme model, indicating that the observed abundance of high-$z$ AGNs does not bear consequences for helium reionization.

Submitted 12 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 18 pages, 21 figures, Version accepted for publication in MNRAS Main Journal

arXiv:2404.06294 [pdf, other]

Fortifying Fully Convolutional Generative Adversarial Networks for Image Super-Resolution Using Divergence Measures

Authors: Arkaprabha Basu, Kushal Bose, Sankha Subhra Mullick, Anish Chakrabarty, Swagatam Das

Abstract: Super-Resolution (SR) is a time-hallowed image processing problem that aims to improve the quality of a Low-Resolution (LR) sample up to the standard of its High-Resolution (HR) counterpart. We aim to address this by introducing Super-Resolution Generator (SuRGe), a fully-convolutional Generative Adversarial Network (GAN)-based architecture for SR. We show that distinct convolutional features obtained at increasing depths of a GAN generator can be optimally combined by a set of learnable convex weights to improve the quality of generated SR samples. In the process, we employ the Jensen-Shannon and the Gromov-Wasserstein losses respectively between the SR-HR and LR-SR pairs of distributions to further aid the generator of SuRGe to better exploit the available information in an attempt to improve SR. Moreover, we train the discriminator of SuRGe with the Wasserstein loss with gradient penalty, to primarily prevent mode collapse. The proposed SuRGe, as an end-to-end GAN workflow tailor-made for super-resolution, offers improved performance while maintaining low inference time. The efficacy of SuRGe is substantiated by its superior performance compared to 18 state-of-the-art contenders on 10 benchmark datasets.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.03404 [pdf, other]

Robust inference for linear regression models with possibly skewed error distribution

Authors: Amarnath Nandy, Ayanendranath Basu, Abhik Ghosh

Abstract: Traditional methods for linear regression generally assume that the underlying error distribution, equivalently the distribution of the responses, is normal. Yet, sometimes real life response data may exhibit a skewed pattern, and assuming normality would not give reliable results in such cases. This is often observed in cases of some biomedical, behavioral, socio-economic and other variables. In this paper, we propose to use the class of skew normal (SN) distributions, which also includes the ordinary normal distribution as its special case, as the model for the errors in a linear regression setup and perform subsequent statistical inference using the popular and robust minimum density power divergence approach to get stable insights in the presence of possible data contamination (e.g., outliers). We provide the asymptotic distribution of the proposed estimator of the regression parameters and also propose robust Wald-type tests of significance for these parameters. We provide an influence function analysis of these estimators and test statistics, and also provide level and power influence functions. Numerical verification including simulation studies and real data analysis is provided to substantiate the theory developed.

Submitted 4 April, 2024; originally announced April 2024.

Comments: Pre-print; under review

arXiv:2404.02051 [pdf, other]

The Thousand-Pulsar-Array programme on MeerKAT XIII: Timing, flux density, rotation measure and dispersion measure timeseries of 597 pulsars

Authors: M. J. Keith, S. Johnston, A. Karastergiou, P. Weltevrede, M. E. Lower, A. Basu, B. Posselt, L. S. Oswald, A. Parthasarathy, A. D. Cameron, M. Serylak, S. Buchner

Abstract: We report here on the timing of 597 pulsars over the last four years with the MeerKAT telescope. We provide Times-of-Arrival, pulsar ephemeris files and per-epoch measurements of the flux density, dispersion measure (DM) and rotation measure (RM) for each pulsar. In addition we use a Gaussian process to model the timing residuals to measure the spin frequency derivative at each epoch. We also report the detection of 11 glitches in 9 individual pulsars. We find significant DM and RM variations in 87 and 76 pulsars respectively. We find that the DM variations scale approximately linearly with DM, which is broadly in agreement with models of the ionised interstellar medium. The observed RM variations seem largely independent of DM, which may suggest that the RM variations are dominated by variations in the interstellar magnetic field on the line of sight, rather than varying electron density. We also find that normal pulsars have around 5 times greater amplitude of DM variability compared to millisecond pulsars, and surmise that this is due to the known difference in their velocity distributions.

Submitted 2 April, 2024; originally announced April 2024.

Comments: This is a pre-copyedited, author-produced PDF of an article accepted for publication in MNRAS following peer review. 12 pages 7 figures

arXiv:2403.06256 [pdf, other]

Flat or crumpled: states of active symmetric membranes

Authors: Sudip Mukherjee, Abhik Basu

Abstract: We set up and study the hydrodynamic theory for active fluid and tethered membranes. We focus on those membranes which are inversion-symmetric. We show that such membranes are either described by appropriate linear hydrodynamic equations, which are exact in the asymptotic long wavelength limit, giving stable flat phases with positional quasi long range orders, or are linearly unstable, implying crumpling of the membranes. We argue that in such an active membrane thermal noises dominate over any active noises, and use it to calculate the correlation functions of the undulation and in-plane displacements of the membrane in the stable case, and the associated correlation functions of the embedding bulk flow velocities.

Submitted 10 March, 2024; originally announced March 2024.

Comments: Preliminary version, 8 pages, 2 figures

arXiv:2403.05945 [pdf, other]

Distributed fixed resources exchanging particles: Phases of an asymmetric exclusion process connected to two reservoirs

Authors: Sourav Pal, Parna Roy, Abhik Basu

Abstract: We propose and study a conceptual one-dimensional model to explore how the combined interplay between fixed resources and particle exchanges between different parts of an extended system can affect the stationary densities in a current carrying channel connecting different parts of the system. To this end, we consider a model composed of a totally asymmetric simple exclusion process (TASEP) connecting two particle reservoirs without any internal dynamics but which can directly exchange particles between each other, ensuring nonvanishing currents in the steady states. The total particle number in the system that defines the "resources" available, although is kept constant by the model dynamics, can take any value independent of the model parameters that define the dynamics of the model. We show how the resulting phase diagrams of the model are controlled by the parameters, which define the various dynamical update rules together with the total available resources. These control parameters can be tuned to make the density on the TASEP lane globally uniform or piecewise continuous with localized domain walls, and can also control populations of the two reservoirs. In general, the phase diagrams are quite different from a TASEP with open boundaries. In the limit of large amount of resources, the phase diagrams in the plane of the control parameters become topologically identical to that for an open TASEP together with delocalization of domain walls.

Submitted 24 May, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

Comments: 21 pages, 14 figures

arXiv:2403.01563 [pdf, ps, other]

The $k$-representation number of the random graph

Authors: Ayush Basu, Vojtěch Rödl, Marcelo Sales

Abstract: The $k$-representation number of a graph $G$ is the minimum cardinality of the system of vertex subsets with the property that every edge of $G$ is covered at least $k$ times while every non-edge is covered at most $(k-1)$ times. In particular, for $k=1$ this notion is equivalent to the clique number of a graph $G$. Extending results of Frieze and Reed, and Eaton and Grable, we study the $k$-representation number of $G(n,1/2)$. As a tool, we will prove a sharp concentration result counting the number of induced subgraphs of $G(n,1/2)$ with density $(\frac{1}{2}+α)$. In Lemma 3.7, we will show that the number of such subgraphs is close to its expected value with probability $1-\exp(-n^C)$.

Submitted 3 March, 2024; originally announced March 2024.
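
The covering condition in the abstract above can be checked mechanically on small graphs. The following is a minimal sketch, not taken from the paper: it assumes that a pair of vertices is "covered" by a subset when both vertices belong to it, and the function name `is_k_representation` is hypothetical.

```python
from itertools import combinations


def is_k_representation(n, edges, subsets, k):
    """Check the covering condition on vertices 0..n-1.

    Assumption: a pair {u, v} is covered by a subset when both u and v lie in it.
    Every edge must be covered at least k times, every non-edge at most k-1 times.
    """
    edge_set = {frozenset(e) for e in edges}
    for u, v in combinations(range(n), 2):
        cover = sum(1 for s in subsets if u in s and v in s)
        if frozenset((u, v)) in edge_set:
            if cover < k:
                return False
        elif cover > k - 1:
            return False
    return True


# Path 0-1-2: the pair {0, 2} is the only non-edge.
path = [(0, 1), (1, 2)]
one_rep = [{0, 1}, {1, 2}]             # each edge covered once, non-edge never
two_rep = [{0, 1}, {1, 2}, {0, 1, 2}]  # each edge covered twice, non-edge once
```

Here `is_k_representation(3, path, one_rep, 1)` and `is_k_representation(3, path, two_rep, 2)` both hold, while `two_rep` fails for `k=1` because the non-edge is covered once.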

arXiv:2402.18206 [pdf, other]

Balancing Act: Distribution-Guided Debiasing in Diffusion Models

Authors: Rishubh Parihar, Abhijnya Bhat, Abhipsa Basu, Saswat Mallick, Jogendra Nath Kundu, R. Venkatesh Babu

Abstract: Diffusion Models (DMs) have emerged as powerful generative models with unprecedented image generation capability. These models are widely used for data augmentation and creative applications. However, DMs reflect the biases present in the training datasets. This is especially concerning in the context of faces, where the DM prefers one demographic subgroup vs others (e.g., female vs male). In this work, we present a method for debiasing DMs without relying on additional data or model retraining. Specifically, we propose Distribution Guidance, which enforces the generated images to follow the prescribed attribute distribution. To realize this, we build on the key insight that the latent features of denoising UNet hold rich demographic semantics, and the same can be leveraged to guide debiased generation. We train Attribute Distribution Predictor (ADP) - a small MLP that maps the latent features to the distribution of attributes. ADP is trained with pseudo labels generated from existing attribute classifiers. The proposed Distribution Guidance with ADP enables us to do fair generation. Our method reduces bias across single/multiple attributes and outperforms the baseline by a significant margin for unconditional and text-conditional diffusion models. Further, we present a downstream task of training a fair attribute classifier by rebalancing the training set with our generated data.

Submitted 29 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: CVPR 2024. Project Page : https://ab-34.github.io/balancing_act/

arXiv:2402.16984 [pdf, ps, other]

Note on set representation of bounded degree hypergaphs

Authors: Ayush Basu, Griffin Johnston, Vojtěch Rödl, Marcelo Sales

Abstract: In their classical paper, Erdős, Goodman and Posa studied the representation of a graph by vertex set $[n]$ with a family of subsets $S_1,\dots, S_n$ with the property that $\{i,j\}$ is an edge iff $S_i\cap S_j\neq \emptyset$. In this note, we consider a similar representation of bounded degree $r$-uniform hypergraphs and establish some bounds for a corresponding problem.

Submitted 26 February, 2024; originally announced February 2024.
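
The graph case of the set representation described in the abstract above ($\{i,j\}$ is an edge iff $S_i\cap S_j\neq \emptyset$) is easy to verify on a toy example. This is an illustrative sketch only; the function name `represents` and the example sets are hypothetical and not from the note.

```python
from itertools import combinations


def represents(edges, sets):
    """Return True if {i, j} is an edge exactly when sets[i] and sets[j] intersect.

    `sets` maps each vertex label to its subset S_i.
    """
    edge_set = {frozenset(e) for e in edges}
    for i, j in combinations(sorted(sets), 2):
        intersects = bool(sets[i] & sets[j])
        if intersects != (frozenset((i, j)) in edge_set):
            return False
    return True


# Path 1-2-3: give each edge its own ground element.
path_edges = [(1, 2), (2, 3)]
good_sets = {1: {"a"}, 2: {"a", "b"}, 3: {"b"}}
bad_sets = {1: {"a"}, 2: {"a"}, 3: {"a"}}  # S_1 and S_3 meet, but {1, 3} is not an edge
```

Assigning one ground element per edge always yields a valid representation of a graph; the interesting question is how few ground elements suffice.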

arXiv:2402.09065 [pdf, other]

The Thousand-Pulsar-Array programme on MeerKAT -- XII. Discovery of long-term pulse profile evolution in 7 young pulsars

Authors: A. Basu, P. Weltevrede, M. J. Keith, S. Johnston, A. Karastergiou, L. S. Oswald, B. Posselt, X. Song, A. D. Cameron

Abstract: A number of pulsars are known to have profile evolution on timescales of months, often correlated with spin-down rate changes. Here, we present the first result from 3 years of monitoring observations from MeerKAT as part of the Thousand Pulsar Array programme. This programme obtains high-fidelity pulse profiles for $\sim$ 500 pulsars, which enabled the detection of subtle changes in seven sources not previously known to exhibit long-term profile evolution. A 2D Gaussian convolution is used to highlight correlated emission variability in both the pulse phase and observing epoch direction. Simulations show that for one additional source the observed profile variability is likely to originate from stochastic single-pulse shape variability (jitter). We find that it is common for long-term profile variability to be associated with changes in polarization fractions, but not with polarisation position angle (PA) changes. PA changes are expected if emission height changes or precession is responsible for the profile variability. PSR J1741$-$3927 is the only pulsar in our sample that shows correlated PA variability, and this is associated with orthogonal polarization mode activity. For the six other pulsars, limits on possible emission height changes and impact angle changes are derived. These limits are consistent with the small changes in the total intensity profile shape. None of the sources show detectable spin-down variability correlated with the emission changes, which are thought to be driven by magnetospheric current fluctuations. Therefore the absence of correlated spin-down rate variability allows upper limits to be placed on changes in the magnetospheric charge density.

Submitted 14 February, 2024; originally announced February 2024.

Comments: 19 pages, 12 figures, 3 Tables. Accepted for publication in MNRAS

arXiv:2402.02328 [pdf, other]

Sample Complexity of Algorithm Selection Using Neural Networks and Its Applications to Branch-and-Cut

Authors: Hongyu Cheng, Sammy Khalife, Barbara Fiedorowicz, Amitabh Basu

Abstract: Data-driven algorithm design is a paradigm that uses statistical and machine learning techniques to select from a class of algorithms for a computational problem an algorithm that has the best expected performance with respect to some (unknown) distribution on the instances of the problem. We build upon recent work in this line of research by considering the setup where, instead of selecting a single algorithm that has the best performance, we allow the possibility of selecting an algorithm based on the instance to be solved, using neural networks. In particular, given a representative sample of instances, we learn a neural network that maps an instance of the problem to the most appropriate algorithm for that instance. We formalize this idea and derive rigorous sample complexity bounds for this learning problem, in the spirit of recent work in data-driven algorithm design. We then apply this approach to the problem of making good decisions in the branch-and-cut framework for mixed-integer optimization (e.g., which cut to add?). In other words, the neural network will take as input a mixed-integer optimization instance and output a decision that will result in a small branch-and-cut tree for that instance. Our computational results provide evidence that our particular way of using neural networks for cut selection can make a significant impact in reducing branch-and-cut tree sizes, compared to previous data-driven approaches.

Submitted 4 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

arXiv:2401.09797 [pdf, other]

Memory Efficient Corner Detection for Event-driven Dynamic Vision Sensors

Authors: Pao-Sheng Vincent Sun, Arren Glover, Chiara Bartolozzi, Arindam Basu

Abstract: Event cameras offer low-latency and data compression for visual applications, through event-driven operation, that can be exploited for edge processing in tiny autonomous agents. Robust, accurate and low latency extraction of highly informative features such as corners is key for most visual processing. While several corner detection algorithms have been proposed, state-of-the-art performance is achieved by luvHarris. However, this algorithm requires a high number of memory accesses per event, making it less than ideal for low-latency, low-energy implementation in tiny edge processors. In this paper, we propose a new event-driven corner detection implementation tailored for edge computing devices, which requires much lower memory access than luvHarris while also improving accuracy. Our method trades computation for memory access, which is more expensive for large memories. For a DAVIS346 camera, our method requires ~3.8X less memory, ~36.6X fewer memory accesses with only ~2.3X more computes.

Submitted 18 January, 2024; originally announced January 2024.

arXiv:2312.15889 [pdf, other]

ANN vs SNN: A case study for Neural Decoding in Implantable Brain-Machine Interfaces

Authors: Biyan Zhou, Pao-Sheng Vincent Sun, Arindam Basu

Abstract: While it is important to make implantable brain-machine interfaces (iBMI) wireless to increase patient comfort and safety, the trend of increased channel count in recent neural probes poses a challenge due to the concomitant increase in the data rate. Extracting information from raw data at the source by using edge computing is a promising solution to this problem, with integrated intention decoders providing the best compression ratio. In this work, we compare different neural networks (NN) for motor decoding in terms of accuracy and implementation cost. We further show that combining traditional signal processing techniques with machine learning ones delivers surprisingly good performance even with simple NNs. Adding a block Bidirectional Bessel filter provided maximum gains of $\approx 0.05$, $0.04$ and $0.03$ in $R^2$ for ANN\_3D, SNN\_3D and ANN models, while the gains were lower ($\approx 0.02$ or less) for LSTM and SNN\_streaming models. Increasing training data helped improve the $R^2$ of all models by $0.03-0.04$ indicating they have more capacity for future improvement. In general, LSTM and SNN\_streaming models occupy the high and low ends of the pareto curves (for accuracy vs. memory/operations) respectively while SNN\_3D and ANN\_3D occupy intermediate positions. Our work presents state of the art results for this dataset and paves the way for decoder-integrated-implants of the future.

Submitted 26 December, 2023; originally announced December 2023.

arXiv:2312.09958 [pdf, other]

Distilling Large Language Models for Matching Patients to Clinical Trials

Authors: Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh

Abstract: The recent success of large language models (LLMs) has paved the way for their adoption in the high-stakes domain of healthcare. Specifically, the application of LLMs in patient-trial matching, which involves assessing patient eligibility against clinical trial's nuanced inclusion and exclusion criteria, has shown promise. Recent research has shown that GPT-3.5, a widely recognized LLM developed by OpenAI, can outperform existing methods with minimal 'variable engineering' by simply comparing clinical trial information against patient summaries. However, there are significant challenges associated with using closed-source proprietary LLMs like GPT-3.5 in practical healthcare applications, such as cost, privacy and reproducibility concerns. To address these issues, this study presents the first systematic examination of the efficacy of both proprietary (GPT-3.5 and GPT-4) and open-source LLMs (LLAMA 7B, 13B, and 70B) for the task of patient-trial matching. Employing a multifaceted evaluation framework, we conducted extensive automated and human-centric assessments coupled with a detailed error analysis for each model. To enhance the adaptability of open-source LLMs, we have created a specialized synthetic dataset utilizing GPT-4, enabling effective fine-tuning under constrained data conditions. Our findings reveal that open-source LLMs, when fine-tuned on this limited and synthetic dataset, demonstrate performance parity with their proprietary counterparts. This presents a massive opportunity for their deployment in real-world healthcare applications. To foster further research and applications in this field, we release both the annotated evaluation dataset and the fine-tuned LLM -- Trial-LLAMA -- for public use.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2312.09503 [pdf, other]

Towards Neuromorphic Compression based Neural Sensing for Next-Generation Wireless Implantable Brain Machine Interface

Authors: Vivek Mohan, Wee Peng Tay, Arindam Basu

Abstract: This work introduces a neuromorphic compression based neural sensing architecture with address-event representation inspired readout protocol for massively parallel, next-gen wireless iBMI. The architectural trade-offs and implications of the proposed method are quantitatively analyzed in terms of compression ratio and spike information preservation. For the latter, we use metrics such as root-mean-square error and correlation coefficient between the original and recovered signal to assess the effect of neuromorphic compression on spike shape. Furthermore, we use accuracy, sensitivity, and false detection rate to understand the effect of compression on downstream iBMI tasks, specifically, spike detection. We demonstrate that a data compression ratio of $50-100$ can be achieved, $5-18\times$ more than prior work, by selective transmission of event pulses corresponding to neural spikes. We obtain a correlation coefficient of $\approx0.9$ and spike detection accuracy of over $90\%$ for the worst-case analysis involving $10K$-channel simulated recording and typical analysis using $100$ or $384$-channel real neural recordings. We also analyze the collision handling capability and scalability of the proposed pipeline.

Submitted 14 December, 2023; originally announced December 2023.

Comments: 14 pages, 8 figures, IEEE Transaction submission manuscript. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2312.06591 [pdf, other]

Concurrent Density Estimation with Wasserstein Autoencoders: Some Statistical Insights

Authors: Anish Chakrabarty, Arkaprabha Basu, Swagatam Das

Abstract: Variational Autoencoders (VAEs) have been a pioneering force in the realm of deep generative models. Amongst their legions of progenies, Wasserstein Autoencoders (WAEs) stand out in particular due to the dual offering of heightened generative quality and a strong theoretical backbone. WAEs consist of an encoding and a decoding network forming a bottleneck with the prime objective of generating new samples resembling the ones it was catered to. In the process, they aim to achieve a target latent representation of the encoded data. Our work is an attempt to offer a theoretical understanding of the machinery behind WAEs. From a statistical viewpoint, we pose the problem as concurrent density estimation tasks based on neural network-induced transformations. This allows us to establish deterministic upper bounds on the realized errors WAEs commit. We also analyze the propagation of these stochastic errors in the presence of adversaries. As a result, both the large sample properties of the reconstructed distribution and the resilience of WAE models are explored.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2311.10056 [pdf, other]

A Collection of German Science Interests in the Next Generation Very Large Array

Authors: M. Kadler, D. A. Riechers, J. Agarwal, A. -K. Baczko, H. Beuther, F. Bigiel, T. Birnstiel, B. Boccardi, D. J. Bomans, L. Boogaard, T. T. Braun, S. Britzen, M. Brüggen, A. Brunthaler, P. Caselli, D. Elsässer, S. von Fellenberg, M. Flock, C. M. Fromm, L. Fuhrmann, P. Hartogh, M. Hoeft, R. P. Keenan, Y. Kovalev, K. Kreckel , et al. (66 additional authors not shown)

Abstract: The Next Generation Very Large Array (ngVLA) is a planned radio interferometer providing unprecedented sensitivity at wavelengths between 21 cm and 3 mm. Its 263 antenna element array will be spatially distributed across North America to enable both superb low surface brightness recovery and sub-milliarcsecond angular resolution imaging. The project was developed by the international astronomy community under the lead of the National Radio Astronomy Observatory (NRAO), and is anticipated to be built between 2027 and 2037. Two workshops have been held in 2022 and 2023 with the goal to discuss and consolidate the scientific interests in the ngVLA within the German astronomical community. This community paper constitutes a collection of 48 science ideas which the German community aims to pursue with the ngVLA in the 2030s. This is not a complete list and the ideas are not developed at the level of a "Science Book", such that the present document is mainly meant to provide a basis for further discussion within the community. As such, additional contributions are welcome, and will be considered for inclusion in future revisions.

Submitted 18 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: Version 2.0 (status June 18, 2024): 169 pages, comments and future contributions welcome [v2.0: 7 new science cases added, some minor revisions to other chapters]

arXiv:2310.20101 [pdf, other]

Medical Image Denosing via Explainable AI Feature Preserving Loss

Authors: Guanfang Dong, Anup Basu

Abstract: Denoising algorithms play a crucial role in medical image processing and analysis. However, classical denoising algorithms often ignore the preservation of explanatory and critical medical features, which may lead to misdiagnosis and legal liabilities. In this work, we propose a new denoising method for medical images that not only efficiently removes various types of noise, but also preserves key medical features throughout the process. To achieve this goal, we utilize a gradient-based eXplainable Artificial Intelligence (XAI) approach to design a feature preserving loss function. Our feature preserving loss function is motivated by the characteristic that gradient-based XAI is sensitive to noise. Through backpropagation, medical image features before and after denoising can be kept consistent. We conducted extensive experiments on three available medical image datasets, including 13 different types of synthesized noise and artifacts. The experimental results demonstrate the superiority of our method in terms of denoising performance, model explainability, and generalization.

Submitted 7 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

arXiv:2310.09349 [pdf, other]

Viability of Boosted Light Dark Matter in a Two-Component Scenario

Authors: Arindam Basu, Amit Chakraborty, Nilanjana Kumar, Soumya Sadhukhan

Abstract: We study the two-component boosted dark matter (DM) scenario in a neutrinophilic two-Higgs doublet model ($ν2$HDM), which comprises one extra Higgs doublet with a MeV scale CP-even scalar $H$. This model is extended with a light ($\sim 10$~MeV) singlet scalar DM $φ_3$, which is stabilized under the existing $Z_2$ symmetry and can only effectively annihilate through scalar $H$. As the presence of a light $H$ modifies the oblique parameters to put tight constraints on the model, introduction of vectorlike leptons (VLL) can potentially salvage the issue. The additional vector-like doublet $N$ and vector-like singlet $χ$ are also stabilized through the $Z_2$ symmetry. The lightest vectorlike mass eigenstate $χ_1$ ($\sim 100$ GeV) can be the potential second DM component of the model. Individual scalar and fermionic DM candidates have Higgs/Z mediated annihilation, restricting the fermion DM in a narrow mass region while a somewhat broader mass region is allowed for the scalar DM. In a coupled scenario, light DM $φ_3$ gets its boost from the $χ_1$ annihilation while the fermionic DM opens up a new annihilation channel $χ_1 χ_1 \to φ_3 φ_3$ that decreases the relic density. This paves the way for a wider fermionic DM mass range with under-abundant relic, a region of [35-60] GeV compared to a smaller [40-50] GeV window for the single component fermion DM. On the other hand, the $φ_3$ resonant annihilation gets diluted due to boosting effects in kinematics, which increases the DM relic leading to a smaller allowed region. To achieve an under-abundant relic, the total DM relic will be dominated by the $χ_1$ contribution. While there is a region with $φ_3$ contribution dominating the total DM, the combined relic becomes over-abundant. Therefore, a sub-dominant ($\sim 5 \%$) boosted scalar is the most favorable light DM candidate to be probed for detection.

Submitted 13 October, 2023; originally announced October 2023.

Comments: 29 pages, 10 figures, 5 tables

arXiv:2310.08280 [pdf]

Optimizing Odia Braille Literacy: The Influence of Speed on Error Reduction and Enhanced Comprehension

Authors: Monnie Parida, Manjira Sinha, Anupam Basu, Pabitra Mitra

Abstract: This study aims to conduct an extensive, detailed analysis of the Odia Braille reading comprehension among students with visual disability. Specifically, the study explores their reading speed and hand or finger movements. The study also aims to investigate any comprehension difficulties and reading errors they may encounter. Six students from the 9th and 10th grades, aged between 14 and 16, participated in the study. We observed participants' hand movements to understand how reading errors were connected to hand movement and identify the students' reading difficulties. We also evaluated the participants' Odia Braille reading skills, including their reading speed (in words per minute), errors, and comprehension. The average speed of an Odia Braille reader is 17.64 wpm. According to the study, there was a noticeable correlation between reading speed and reading errors. As reading speed decreased, the number of reading errors tended to increase. Moreover, the study established a link between reduced Braille reading errors and improved reading comprehension. In contrast, the study found that better comprehension was associated with increased reading speed. The researchers concluded with some interesting findings about preferred Braille reading patterns. These findings have important theoretical, developmental, and methodological implications for instruction.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 4 Pages, Paper accepted in Diversity and Inclusion track at CODS-COMAD 2024

arXiv:2309.13531 [pdf, other]

Robust Principal Component Analysis using Density Power Divergence

Authors: Subhrajyoty Roy, Ayanendranath Basu, Abhik Ghosh

Abstract: Principal component analysis (PCA) is a widely employed statistical tool used primarily for dimensionality reduction. However, it is known to be adversely affected by the presence of outlying observations in the sample, which is quite common. Robust PCA methods using M-estimators have theoretical benefits, but their robustness drops substantially for high dimensional data. On the other end of the spectrum, robust PCA algorithms solving principal component pursuit or similar optimization problems have high breakdown, but lack theoretical richness and demand high computational power compared to the M-estimators. We introduce a novel robust PCA estimator based on the minimum density power divergence estimator. This combines the theoretical strength of the M-estimators and the minimum divergence estimators with a high breakdown guarantee regardless of data dimension. We present a computationally efficient algorithm for this estimate. Our theoretical findings are supported by extensive simulations and comparisons with existing robust PCA methods. We also showcase the proposed algorithm's applicability on two benchmark datasets and a credit card transactions dataset for fraud detection.

Submitted 23 September, 2023; originally announced September 2023.

arXiv:2309.00752 [pdf, other]

Affine-Transformation-Invariant Image Classification by Differentiable Arithmetic Distribution Module

Authors: Zijie Tan, Guanfang Dong, Chenqiu Zhao, Anup Basu

Abstract: Although Convolutional Neural Networks (CNNs) have achieved promising results in image classification, they are still vulnerable to affine transformations including rotation, translation, flip and shuffle. This drawback motivates us to design a module which can alleviate the impact of different affine transformations. Thus, in this work, we introduce a more robust substitute by incorporating distribution learning techniques, focusing particularly on learning the spatial distribution information of pixels in images. To rectify the issue of non-differentiability of prior distribution learning methods that rely on traditional histograms, we adopt the Kernel Density Estimation (KDE) to formulate differentiable histograms. On this foundation, we present a novel Differentiable Arithmetic Distribution Module (DADM), which is designed to extract the intrinsic probability distributions from images. The proposed approach is able to enhance the model's robustness to affine transformations without sacrificing its feature extraction capabilities, thus bridging the gap between traditional CNNs and distribution-based learning. We validate the effectiveness of the proposed approach through ablation study and comparative experiments with LeNet.

Submitted 12 December, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

arXiv:2308.15667 [pdf, other]

Bridging Distribution Learning and Image Clustering in High-dimensional Space

Authors: Guanfang Dong, Chenqiu Zhao, Anup Basu

Abstract: Distribution learning focuses on learning the probability density function from a set of data samples. In contrast, clustering aims to group similar objects together in an unsupervised manner. Usually, these two tasks are considered unrelated. However, the two may be indirectly related, with Gaussian Mixture Models (GMM) acting as a bridge. In this paper, we focus on exploring the correlation between distribution learning and clustering, with the motivation to fill the gap between these two fields, utilizing an autoencoder (AE) to encode images into a high-dimensional latent space. Then, Monte-Carlo Marginalization (MCMarg) and Kullback-Leibler (KL) divergence loss are used to fit the Gaussian components of the GMM and learn the data distribution. Finally, image clustering is achieved through each Gaussian component of GMM. Yet, the "curse of dimensionality" poses severe challenges for most clustering algorithms. Compared with the classic Expectation-Maximization (EM) Algorithm, experimental results show that MCMarg and KL divergence can greatly alleviate the difficulty. Based on the experimental results, we believe distribution learning can exploit the potential of GMM in image clustering within high-dimensional space.

Submitted 29 August, 2023; originally announced August 2023.

arXiv:2308.13612

Is Deep Learning Network Necessary for Image Generation?

Authors: Chenqiu Zhao, Guanfang Dong, Anup Basu

Abstract: Recently, images are considered samples from a high-dimensional distribution, and deep learning has become almost synonymous with image generation. However, is a deep learning network truly necessary for image generation? In this paper, we investigate the possibility of image generation without using a deep learning network, motivated by validating the assumption that images follow a high-dimensional distribution. Since images are assumed to be samples from such a distribution, we utilize the Gaussian Mixture Model (GMM) to describe it. In particular, we employ a recent distribution learning technique named Monte-Carlo Marginalization to capture the parameters of the GMM based on image samples. Moreover, we also use the Singular Value Decomposition (SVD) for dimensionality reduction to decrease computational complexity. During our evaluation experiment, we first attempt to model the distribution of image samples directly to verify the assumption that images truly follow a distribution. We then use the SVD for dimensionality reduction. The principal components, rather than raw image data, are used for distribution learning. Compared to methods relying on deep learning networks, our approach is more explainable, and its performance is promising. Experiments show that our images have a lower FID value compared to those generated by variational auto-encoders, demonstrating the feasibility of image generation without deep learning networks.

Submitted 2 November, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

Comments: This paper has been rejected. I am planning to combine this paper with another paper of mine to make one strong paper

arXiv:2308.11391 [pdf, other]

doi 10.1051/0004-6361/202346777

Probing magnetic fields in the circumgalactic medium using polarization data from MIGHTEE

Authors: K. Böckmann, M. Brüggen, V. Heesen, A. Basu, S. P. O'Sullivan, I. Heywood, M. Jarvis, A. Scaife, J. Stil, R. Taylor, N. J. Adams, R. A. A. Bowler, M. N. Tudorache

Abstract: The detection and study of magnetic fields surrounding galaxies is important to understand galaxy evolution since magnetic fields are tracers for dynamical processes in the circumgalactic medium (CGM) and can have a significant impact on the evolution of the CGM. The Faraday rotation measure (RM) of the polarized light of background radio sources passing through the magnetized CGM of intervening galaxies can be used as a tracer for the strength and extent of magnetic fields around galaxies. We use rotation measures observed by the MIGHTEE-POL (MeerKAT International GHz Tiered Extragalactic Exploration POLarisation) survey by MeerKAT in the XMM-LSS and COSMOS fields to investigate the RM around foreground star-forming galaxies. We use spectroscopic catalogs of star-forming and blue cloud galaxies to measure the RM of MIGHTEE-POL sources as a function of the impact parameter from the intervening galaxy. We then repeat this procedure using a deeper galaxy catalog with photometric redshifts. For the spectroscopic star-forming sample we find a redshift-corrected |RM| excess of 5.6 +/- 2.3 rad m$^{-2}$, which corresponds to a 2.5 sigma significance, around galaxies with a median redshift of z = 0.46 for impact parameters below 130 kpc, only selecting the intervenor with the smallest impact parameter. Making use of a photometric galaxy catalog and taking into account all intervenors with $M_g < -13.6$ mag, the signal disappears. We find no indication for a correlation between redshift and RM, nor do we find a connection between the total number of intervenors and the total |RM|. We have presented tentative evidence that the CGM of star-forming galaxies is permeated by coherent magnetic fields within the virial radius. We conclude that mostly bright, star-forming galaxies with impact parameters less than 130 kpc significantly contribute to the RM of the background radio source.

Submitted 22 August, 2023; originally announced August 2023.

Comments: 11 pages, 8 figures, accepted for publication in A&A

Journal ref: A&A 678, A56 (2023)

arXiv:2308.11153 [pdf, other]

Information Complexity of Mixed-integer Convex Optimization

Authors: Amitabh Basu, Hongyi Jiang, Phillip Kerger, Marco Molinaro

Abstract: We investigate the information complexity of mixed-integer convex optimization under different types of oracles. We establish new lower bounds for the standard first-order oracle, improving upon the previous best known lower bound. This leaves only a lower order linear term (in the dimension) as the gap between the lower and upper bounds. This is derived as a corollary of a more fundamental ``transfer'' result that shows how lower bounds on information complexity of continuous convex optimization under different oracles can be transferred to the mixed-integer setting in a black-box manner. Further, we (to the best of our knowledge) initiate the study of, and obtain the first set of results on, information complexity under oracles that only reveal \emph{partial} first-order information, e.g., where one can only make a binary query over the function value or subgradient at a given point. We give algorithms for (mixed-integer) convex optimization that work under these less informative oracles. We also give lower bounds showing that, for some of these oracles, every algorithm requires more iterations to achieve a target error compared to when complete first-order information is available. That is, these oracles are provably less informative than full first-order oracles for the purpose of optimization.

Submitted 21 August, 2023; originally announced August 2023.

Comments: 35 pages, 4 figures

arXiv:2308.08384 [pdf, other]

Availability, storage capacity, and diffusion: Stationary states of an asymmetric exclusion process connected to two reservoirs

Authors: Sourav Pal, Parna Roy, Abhik Basu

Abstract: We explore how the interplay of finite availability, carrying capacity of particles at different parts of a spatially extended system and particle diffusion between them control the steady state currents and density profiles in a one-dimensional current-carrying channel connecting the different parts of the system. To study this, we construct a minimal model consisting of two particle reservoirs of finite carrying capacities connected by a totally asymmetric simple exclusion process (TASEP). In addition to particle transport via TASEP between the reservoirs, the latter can also directly exchange particles, modeling particle diffusion between them that can maintain a steady current in the system. We investigate the steady state density profiles and the associated particle currents in the TASEP lane. The resulting phases and the phase diagrams are quite different from an open TASEP, and are characterised by the model parameters defining particle exchanges between the TASEP and the reservoirs, direct particle exchanges between the reservoirs, and the filling fraction of the particles that determines the total resources available. These parameters can be tuned to make the density on the TASEP lane globally uniform or piecewise continuous, and can make the two reservoirs preferentially populated or depopulated.

Submitted 17 February, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: 29 pages, 22 figures

arXiv:2308.06352 [pdf, other]

Learning Distributions via Monte-Carlo Marginalization

Authors: Chenqiu Zhao, Guanfang Dong, Anup Basu

Abstract: We propose a novel method to learn intractable distributions from their samples. The main idea is to use a parametric distribution model, such as a Gaussian Mixture Model (GMM), to approximate intractable distributions by minimizing the KL-divergence. Based on this idea, there are two challenges that need to be addressed. First, the computational complexity of KL-divergence is unacceptable when the dimension of the distributions increases. The Monte-Carlo Marginalization (MCMarg) is proposed to address this issue. The second challenge is the differentiability of the optimization process, since the target distribution is intractable. We handle this problem by using Kernel Density Estimation (KDE). The proposed approach is a powerful tool to learn complex distributions and the entire process is differentiable. Thus, it can be a better substitute for the variational inference in variational auto-encoders (VAE). One strong piece of evidence of the benefit of our method is that the distributions learned by the proposed approach can generate better images even based on a pre-trained VAE's decoder. Based on this point, we devise a distribution learning auto-encoder which is better than VAE under the same network architecture. Experiments on standard datasets and synthetic data demonstrate the efficiency of the proposed approach.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2307.10591 [pdf, ps, other]

Analysis of the rSVDdpd Algorithm: A Robust Singular Value Decomposition Method using Density Power Divergence

Authors: Subhrajyoty Roy, Abhik Ghosh, Ayanendranath Basu

Abstract: The traditional method of computing the singular value decomposition (SVD) of a data matrix is based on a least squares principle and is thus very sensitive to the presence of outliers. Hence the resulting inferences across different applications using the classical SVD are extremely degraded in the presence of data contamination (e.g., video surveillance background modelling tasks). A robust singular value decomposition method using the minimum density power divergence estimator (rSVDdpd) has been found to provide a satisfactory solution to this problem and works well in applications. For example, it provides a neat solution to the background modelling problem of video surveillance data in the presence of camera tampering. In this paper, we investigate the theoretical properties of the rSVDdpd estimator such as convergence, equivariance and consistency under reasonable assumptions. Since the dimension of the parameters, i.e., the number of singular values and the dimension of singular vectors, can grow linearly with the size of the data, the usual M-estimation theory has to be suitably modified with concentration bounds to establish the asymptotic properties. We believe that we have been able to accomplish this satisfactorily in the present work. We also demonstrate the efficiency of rSVDdpd through extensive simulations.

Submitted 20 July, 2023; originally announced July 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2109.10680

Showing 1–50 of 460 results for author: Basu, A