First Direct Search for Light Dark Matter Using the NEON Experiment at a Nuclear Reactor

Authors: J. J. Choi, C. Ha, E. J. Jeon, J. Y. Kim, K. W. Kim, S. H. Kim, S. K. Kim, Y. D. Kim, Y. J. Ko, B. C. Koh, S. H. Lee, I. S. Lee, H. Lee, H. S. Lee, J. S. Lee, Y. M. Oh, B. J. Park

Abstract: We report new results from the Neutrino Elastic Scattering Observation with NaI (NEON) experiment in the search for light dark matter (LDM) using 2,636 kg$\cdot$days of NaI(Tl) exposure. The experiment employs an array of NaI(Tl) crystals with a total mass of 16.7 kg, located 23.7 meters away from a 2.8 GW thermal power nuclear reactor. We investigated LDM produced by the $\textit{invisible decay}$ of dark photons generated by high-flux photons during reactor operation. The energy spectra collected during reactor-on and reactor-off periods were compared within the LDM signal region of $1-10$ keV. No signal consistent with LDM interaction with electrons was observed, allowing us to set 90% confidence level exclusion limits for the dark matter-electron scattering cross-section ($σ_e$) across dark matter masses ranging from 1 keV/c$^2$ to 1 MeV/c$^2$. Our results set a 90% confidence level upper limit of $σ_e = 3.17\times10^{-35}~\mathrm{cm^2}$ for a dark matter mass of 100 keV/c$^2$, marking the best laboratory result in this mass range. Additionally, our search extends the coverage of LDM below 100 keV/c$^2$ for the first time.

Submitted 23 July, 2024; originally announced July 2024.

arXiv:2407.15573 [pdf, other]

Machine Learning-Enhanced Design of Lead-Free Halide Perovskite Materials Using Density Functional Theory

Authors: Upendra Kumar, Hyeon Woo Kim, Gyanendra Kumar Maurya, Bincy Babu Raj, Sobhit Singh, Ajay Kumar Kushwaha, Sung Beom Cho, Hyunseok Ko

Abstract: The investigation of emerging non-toxic perovskite materials has been undertaken to advance the fabrication of environmentally sustainable lead-free perovskite solar cells. This study introduces a machine learning methodology aimed at predicting innovative halide perovskite materials that hold promise for use in photovoltaic applications. The seven newly predicted materials are as follows: CsMnCl$_4$, Rb$_3$Mn$_2$Cl$_9$, Rb$_4$MnCl$_6$, Rb$_3$MnCl$_5$, RbMn$_2$Cl$_7$, RbMn$_4$Cl$_9$, and CsIn$_2$Cl$_7$. The predicted compounds are first screened using a machine learning approach, and their validity is subsequently verified through density functional theory calculations. CsMnCl$_4$ is notable among them, displaying a bandgap of 1.37 eV, falling within the Shockley-Queisser limit, making it suitable for photovoltaic applications. Through the integration of machine learning and density functional theory, this study presents a methodology that is more effective and thorough for the discovery and design of materials.

Submitted 22 July, 2024; originally announced July 2024.

arXiv:2407.14338 [pdf, other]

Star Formation in Extreme Environments: A 200 pc High Velocity Gas Stream in the Galactic Centre

Authors: V. S. Veena, W. -J. Kim, Alvaro Sanchez-Monge, P. Schilke, K. M. Menten, G. A. Fuller, M. C. Sormani, F. Wyrowski, W. E. Banda-Barragan, D. Riquelme, P. Tarrio, P. de Vicente

Abstract: The expanding molecular ring (EMR) manifests itself as a parallelogram in the position-velocity diagram of spectral line emission from the Central Molecular Zone (CMZ) surrounding the Galactic centre (GC). Using multiwavelength data, we investigate the gas kinematics, star formation activity, and the presence of shocked gas in a 200 pc long high velocity gas stream (V~ +150 km/s) with a double helix morphology named the helix stream, that is located 15-55 pc above the CMZ and is kinematically associated with the EMR/parallelogram. We carried out molecular line observations using the IRAM 30m, Yebes 40m, and APEX 12m telescopes. The detection of four rotational transitions of the SiO molecule indicates the presence of shocks. We derived the SiO column densities and abundances in different regions of the helix stream. The presence of protostellar clumps and a candidate HII region signifies the ongoing star formation activity within the helix stream. The cloud is massive (2.5x10^6 M_sun) and highly turbulent. We find evidence of cloud-cloud collisions towards the eastern edge (l~1.3°), suggesting a dynamic interaction with the CMZ. An expanding shell is detected within the cloud with radius of 6.7 pc and an expansion velocity of 35 km/s. The shell might be powered by several supernovae or a single hypernova. The SiO abundance within the helix stream implies extensive shock processes occurring on large scales. The helical or cork-screw velocity structure of the helix stream indicates twisting and turning motions within the cloud. We propose that the helix stream is the continuation of the near side bar lane, that is overshooting after brushing the CMZ. Our findings carry profound implications for understanding star formation in extreme conditions and elucidate the intricate properties of gas and dust associated with nuclear inflows in barred spiral galaxies.

Submitted 19 July, 2024; originally announced July 2024.

Comments: 20 pages, 20 figures, 4 tables, accepted for publication in A&A

arXiv:2407.12998 [pdf, other]

Surgical Robot Transformer (SRT): Imitation Learning for Surgical Tasks

Authors: Ji Woong Kim, Tony Z. Zhao, Samuel Schmidgall, Anton Deguet, Marin Kobilarov, Chelsea Finn, Axel Krieger

Abstract: We explore whether surgical manipulation tasks can be learned on the da Vinci robot via imitation learning. However, the da Vinci system presents unique challenges which hinder straight-forward implementation of imitation learning. Notably, its forward kinematics is inconsistent due to imprecise joint measurements, and naively training a policy using such approximate kinematics data often leads to task failure. To overcome this limitation, we introduce a relative action formulation which enables successful policy training and deployment using its approximate kinematics data. A promising outcome of this approach is that the large repository of clinical data, which contains approximate kinematics, may be directly utilized for robot learning without further corrections. We demonstrate our findings through successful execution of three fundamental surgical tasks, including tissue manipulation, needle handling, and knot-tying.

Submitted 17 July, 2024; originally announced July 2024.

Comments: 8 pages

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalogs (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum--likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW--BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 50 pages, 10 figures, 4 tables

arXiv:2407.12227 [pdf, other]

Development of MMC-based lithium molybdate cryogenic calorimeters for AMoRE-II

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, H. Bae, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, S. Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Y. M. Gavrilyuk, A. M. Gezhaev, et al. (84 additional authors not shown)

Abstract: The AMoRE collaboration searches for neutrinoless double beta decay of $^{100}$Mo using molybdate scintillating crystals via low temperature thermal calorimetric detection. The early phases of the experiment, AMoRE-pilot and AMoRE-I, have demonstrated competitive discovery potential. Presently, the AMoRE-II experiment, featuring a large detector array with about 90 kg of $^{100}$Mo isotope, is under construction. This paper discusses the baseline design and characterization of the lithium molybdate cryogenic calorimeters to be used in the AMoRE-II detector modules. The results from prototype setups that incorporate new housing structures and two different crystal masses (316 g and 517 - 521 g), operated at 10 mK temperature, show energy resolutions (FWHM) of 7.55 - 8.82 keV at the 2.615 MeV $^{208}$Tl $γ$ line, and effective light detection of 0.79 - 0.96 keV/MeV. The simultaneous heat and light detection enables clear separation of alpha particles with a discrimination power of 12.37 - 19.50 at the energy region around $^6$Li(n, $α$)$^3$H with Q-value = 4.785 MeV. Promising detector performances were demonstrated at temperatures as high as 30 mK, which relaxes the temperature constraints for operating the large AMoRE-II array.

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.10733 [pdf, other]

Joint-Embedding Predictive Architecture for Self-Supervised Learning of Mask Classification Architecture

Authors: Dong-Hee Kim, Sungduk Cho, Hyeonwoo Cho, Chanmin Park, Jinyoung Kim, Won Hwa Kim

Abstract: In this work, we introduce Mask-JEPA, a self-supervised learning framework tailored for mask classification architectures (MCA), to overcome the traditional constraints associated with training segmentation models. Mask-JEPA combines a Joint Embedding Predictive Architecture with MCA to adeptly capture intricate semantics and precise object boundaries. Our approach addresses two critical challenges in self-supervised learning: 1) extracting comprehensive representations for universal image segmentation from a pixel decoder, and 2) effectively training the transformer decoder. The use of the transformer decoder as a predictor within the JEPA framework allows proficient training in universal image segmentation tasks. Through rigorous evaluations on datasets such as ADE20K, Cityscapes and COCO, Mask-JEPA demonstrates not only competitive results but also exceptional adaptability and robustness across various training scenarios. The architecture-agnostic nature of Mask-JEPA further underscores its versatility, allowing seamless adaptation to various members of the mask classification family.

Submitted 15 July, 2024; originally announced July 2024.

Comments: 27 pages, 5 figures

arXiv:2407.09303 [pdf, other]

ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion

Authors: Sungmin Woo, Wonjoon Lee, Woo Jin Kim, Dogyoon Lee, Sangyoun Lee

Abstract: Self-supervised multi-frame monocular depth estimation relies on the geometric consistency between successive frames under the assumption of a static scene. However, the presence of moving objects in dynamic scenes introduces inevitable inconsistencies, causing misaligned multi-frame feature matching and misleading self-supervision during training. In this paper, we propose a novel framework called ProDepth, which effectively addresses the mismatch problem caused by dynamic objects using a probabilistic approach. We initially deduce the uncertainty associated with static scene assumption by adopting an auxiliary decoder. This decoder analyzes inconsistencies embedded in the cost volume, inferring the probability of areas being dynamic. We then directly rectify the erroneous cost volume for dynamic areas through a Probabilistic Cost Volume Modulation (PCVM) module. Specifically, we derive probability distributions of depth candidates from both single-frame and multi-frame cues, modulating the cost volume by adaptively fusing those distributions based on the inferred uncertainty. Additionally, we present a self-supervision loss reweighting strategy that not only masks out incorrect supervision with high uncertainty but also mitigates the risks in remaining possible dynamic areas in accordance with the probability. Our proposed method excels over state-of-the-art approaches in all metrics on both Cityscapes and KITTI datasets, and demonstrates superior generalization ability on the Waymo Open dataset.

Submitted 12 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV 2024. Project Page: https://sungmin-woo.github.io/prodepth/

arXiv:2407.09153 [pdf]

doi 10.1038/s41467-024-49841-6

Topological Fermi-arc surface state covered by floating electrons on a two-dimensional electride

Authors: Chan-young Lim, Min-Seok Kim, Dong Cheol Lim, Sunghun Kim, Yeonghoon Lee, Jaehoon Cha, Gyubin Lee, Sang Yong Song, Dinesh Thapa, Jonathan D. Denlinger, Seong-Gon Kim, Sung Wng Kim, Jungpil Seo, Yeongkwan Kim

Abstract: Two-dimensional electrides can acquire topologically non-trivial phases due to intriguing interplay between the cationic atomic layers and anionic electron layers. However, experimental evidence of topological surface states has yet to be verified. Here, via angle-resolved photoemission spectroscopy (ARPES) and scanning tunnelling microscopy (STM), we probe the magnetic Weyl states of the ferromagnetic electride $[\mathrm{Gd}_{2}\mathrm{C}]^{2+}\cdot 2e^{-}$. In particular, the presence of Weyl cones and Fermi-arc states is demonstrated through photon energy-dependent ARPES measurements, agreeing with theoretical band structure calculations. Notably, the STM measurements reveal that the Fermi-arc states exist underneath a floating quantum electron liquid on the top Gd layer, forming double-stacked surface states in a heterostructure. Our work thus not only unveils the non-trivial topology of the $[\mathrm{Gd}_{2}\mathrm{C}]^{2+}\cdot 2e^{-}$ electride but also realizes a surface heterostructure that can host phenomena distinct from the bulk.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 22 pages, 6 figures

Journal ref: Nat. Commun. 15 (2024) 5615

arXiv:2407.07317 [pdf, other]

Flow-acoustic resonance in deep and inclined cavities

Authors: You Wei Ho, Jae Wook Kim

Abstract: This paper presents numerical investigations of flow-acoustic resonances in deep and inclined cavities using wall-resolved large eddy simulations. The study focuses on cavity configurations with an aspect ratio of $D/L = 2.632$, subjected to two Mach numbers of $0.2$ and $0.3$ at three different inclination angles ($α=30^{\circ}$, $60^{\circ}$, and $90^{\circ}$). Fully turbulent boundary layers generated from independent precursor simulations are employed upstream of the cavities. Initial results highlight distinct aeroacoustic responses between inclined and orthogonal cavities, particularly at $M_{\infty}=0.3$, where inclined cavities exhibit stronger resonances at a lower peak frequency ($St\approx 0.27$) compared to the orthogonal cavity. Further analysis reveals that this lower Strouhal number corresponds to a reduced vortex convection speed linked to large shear-layer oscillations. Additionally, the acoustic input-output analysis indicates that the inclined cavities amplify acoustic responses more effectively and exhibit weaker source-sink cancellations compared to the orthogonal cavity. These mechanisms are identified as the primary contributors to the enhanced aeroacoustic responses in the inclined cavities. Finally, this paper proposes that the ratio between acoustic particle displacement and momentum thickness may be used as a criterion to predict the onset of the distinctive resonance at $St\approx 0.27$. It is suggested that the amplified resonances may be linked to a nonlinear mode shift of the first hydrodynamic mode through enhanced shear-layer oscillation taking place when the proposed criterion is met.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.05618 [pdf, other]

Improved limit on neutrinoless double beta decay of $^{100}$Mo from AMoRE-I

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, Seonho Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Y. M. Gavrilyuk, A. M. Gezhaev, O. Gileva, et al. (83 additional authors not shown)

Abstract: AMoRE searches for the signature of neutrinoless double beta decay of $^{100}$Mo with a 100 kg sample of enriched $^{100}$Mo. Scintillating molybdate crystals coupled with a metallic magnetic calorimeter operate at milli-Kelvin temperatures to measure the energy of electrons emitted in the decay. As a demonstration of the full-scale AMoRE, we conducted AMoRE-I, a pre-experiment with 18 molybdate crystals, at the Yangyang Underground Laboratory for over two years. The exposure was 8.02 kg$\cdot$year (or 3.89 kg$_{\mathrm{^{100}Mo}}\cdot$year) and the total background rate near the Q-value was 0.025 $\pm$ 0.002 counts/keV/kg/year. We observed no indication of $0νββ$ decay and report a new lower limit of the half-life of $^{100}$Mo $0νββ$ decay as $T^{0ν}_{1/2}>3.0\times10^{24}~\mathrm{years}$ at 90% confidence level. The effective Majorana mass limit range is $m_{ββ}<$(210--610) meV using nuclear matrix elements estimated in the framework of different models, including the recent shell model calculations.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 7 pages, 4 figures

arXiv:2407.02622 [pdf, other]

doi 10.1109/ICAIIC60209.2024.10463391

RISC-V R-Extension: Advancing Efficiency with Rented-Pipeline for Edge DNN Processing

Authors: Won Hyeok Kim, Hyeong Jin Kim, Tae Hee Han

Abstract: The proliferation of edge devices necessitates efficient computational architectures for lightweight tasks, particularly deep neural network (DNN) inference. Traditional NPUs, though effective for such operations, face challenges in power, cost, and area when integrated into lightweight edge devices. The RISC-V architecture, known for its modularity and open-source nature, offers a viable alternative. This paper introduces the RISC-V R-extension, a novel approach to enhancing DNN process efficiency on edge devices. The extension features rented-pipeline stages and architectural pipeline registers (APR), which optimize critical operation execution, thereby reducing latency and memory access frequency. Furthermore, this extension includes new custom instructions to support these architectural improvements. Through comprehensive analysis, this study demonstrates the boost of R-extension in edge device processing, setting the stage for more responsive and intelligent edge applications.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 6 pages, 6 figures, ICAIIC 2024

arXiv:2406.19287 [pdf, other]

Isotropy of cosmic rays beyond $10^{20}$ eV favors their heavy mass composition

Authors: Telescope Array Collaboration, R. U. Abbasi, Y. Abe, T. Abu-Zayyad, M. Allen, Y. Arai, R. Arimura, E. Barcikowski, J. W. Belz, D. R. Bergman, S. A. Blake, I. Buckland, B. G. Cheon, M. Chikawa, T. Fujii, K. Fujisue, K. Fujita, R. Fujiwara, M. Fukushima, G. Furlich, N. Globus, R. Gonzalez, W. Hanlon, N. Hayashida, H. He, et al. (118 additional authors not shown)

Abstract: We report an estimation of the injected mass composition of ultra-high energy cosmic rays (UHECRs) at energies higher than 10 EeV. The composition is inferred from an energy-dependent sky distribution of UHECR events observed by the Telescope Array surface detector by comparing it to the Large Scale Structure of the local Universe. In the case of negligible extra-galactic magnetic fields the results are consistent with a relatively heavy injected composition at E ~ 10 EeV that becomes lighter up to E ~ 100 EeV, while the composition at E > 100 EeV is very heavy. The latter is true even in the presence of highest experimentally allowed extra-galactic magnetic fields, while the composition at lower energies can be light if a strong EGMF is present. The effect of the uncertainty in the galactic magnetic field on these results is subdominant.

Submitted 3 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: 8 pages, 3 figures, accepted for publication in PRL

arXiv:2406.19286 [pdf, other]

Mass composition of ultra-high energy cosmic rays from distribution of their arrival directions with the Telescope Array

Authors: Telescope Array Collaboration, R. U. Abbasi, Y. Abe, T. Abu-Zayyad, M. Allen, Y. Arai, R. Arimura, E. Barcikowski, J. W. Belz, D. R. Bergman, S. A. Blake, I. Buckland, B. G. Cheon, M. Chikawa, T. Fujii, K. Fujisue, K. Fujita, R. Fujiwara, M. Fukushima, G. Furlich, N. Globus, R. Gonzalez, W. Hanlon, N. Hayashida, H. He, et al. (118 additional authors not shown)

Abstract: We use a new method to estimate the injected mass composition of ultra-high energy cosmic rays (UHECRs) at energies higher than 10 EeV. The method is based on comparison of the energy-dependent distribution of cosmic ray arrival directions as measured by the Telescope Array experiment (TA) with that calculated in a given putative model of UHECR under the assumption that sources trace the large-scale structure (LSS) of the Universe. As we report in the companion letter, the TA data show large deflections with respect to the LSS which can be explained, assuming small extra-galactic magnetic fields (EGMF), by an intermediate composition changing to a heavy one (iron) in the highest energy bin. Here we show that these results are robust to uncertainties in UHECR injection spectra, the energy scale of the experiment and galactic magnetic fields (GMF). The assumption of weak EGMF, however, strongly affects this interpretation at all but the highest energies E > 100 EeV, where the remarkable isotropy of the data implies a heavy injected composition even in the case of strong EGMF. This result also holds if UHECR sources are as rare as $2 \times 10^{-5}$ Mpc$^{-3}$, that is the conservative lower limit for the source number density.

Submitted 3 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: 18 pages, 11 figures, accepted for publication in PRD

arXiv:2406.19148 [pdf, other]

BackMix: Mitigating Shortcut Learning in Echocardiography with Minimal Supervision

Authors: Kit Mills Bransby, Arian Beqiri, Woo-Jin Cho Kim, Jorge Oliveira, Agisilaos Chartsias, Alberto Gomez

Abstract: Neural networks can learn spurious correlations that lead to the correct prediction in a validation set, but generalise poorly because the predictions are right for the wrong reason. This undesired learning of naive shortcuts (Clever Hans effect) can happen for example in echocardiogram view classification when background cues (e.g. metadata) are biased towards a class and the model learns to focus on those background features instead of on the image content. We propose a simple, yet effective random background augmentation method called BackMix, which samples random backgrounds from other examples in the training set. By enforcing the background to be uncorrelated with the outcome, the model learns to focus on the data within the ultrasound sector and becomes invariant to the regions outside this. We extend our method in a semi-supervised setting, finding that the positive effects of BackMix are maintained with as few as 5% of segmentation labels. A loss weighting mechanism, wBackMix, is also proposed to increase the contribution of the augmented examples. We validate our method on both in-distribution and out-of-distribution datasets, demonstrating significant improvements in classification accuracy, region focus and generalisability. Our source code is available at: https://github.com/kitbransby/BackMix

Submitted 27 June, 2024; originally announced June 2024.

Comments: Accepted at MICCAI 2024 (Pre-print)

arXiv:2406.18823 [pdf, other]

Emergence of metachronal waves in a chain of symmetrically beating filaments

Authors: Narina Jung, Won Kyu Kim, Changbong Hyeon

Abstract: Recent experiments have shown that metachronal waves (MCWs) can emerge from a chain of symmetrically beating nematodes aligned at the edge of sessile droplets. Our study, employing a coupled elastohydrodynamic model of active filaments, elucidates that a misalignment caused by a tilt against the bounding wall disrupts the synchronization and generates a constant time lag between adjacent filaments, leading to MCWs. The MCWs, enhancing the fluid circulation, achieve their maximum thermodynamic efficiency over the same range of tilt angles observed in the nematode experiments.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 12 pages, 8 figures

arXiv:2406.17869 [pdf, other]

Burst Image Super-Resolution with Base Frame Selection

Authors: Sanghyun Kim, Min Jung Lee, Woohyeok Kim, Deunsol Jung, Jaesung Rim, Sunghyun Cho, Minsu Cho

Abstract: Burst image super-resolution has been a topic of active research in recent years due to its ability to obtain a high-resolution image by using complementary information between multiple frames in the burst. In this work, we explore using burst shots with non-uniform exposures to confront real-world practical scenarios by introducing a new benchmark dataset, dubbed Non-uniformly Exposed Burst Image (NEBI), that includes the burst frames at varying exposure times to obtain a broader range of irradiance and motion characteristics within a scene. As burst shots with non-uniform exposures exhibit varying levels of degradation, fusing information of the burst shots into the first frame as a base frame may not result in optimal image quality. To address this limitation, we propose a Frame Selection Network (FSN) for non-uniform scenarios. This network seamlessly integrates into existing super-resolution methods in a plug-and-play manner with low computational costs. The comparative analysis reveals the effectiveness of the non-uniform setting for the practical scenario and of our FSN on synthetic and real NEBI datasets.

Submitted 25 June, 2024; originally announced June 2024.

Comments: CVPR2024W NTIRE accepted

arXiv:2406.15539 [pdf, other]

First Measurement of Deeply Virtual Compton Scattering on the Neutron with Detection of the Active Neutron

Authors: CLAS Collaboration, A. Hobart, S. Niccolai, M. Čuić, K. Kumerički, P. Achenbach, J. S. Alvarado, W. R. Armstrong, H. Atac, H. Avakian, L. Baashen, N. A. Baltzell, L. Barion, M. Bashkanov, M. Battaglieri, B. Benkel, F. Benmokhtar, A. Bianconi, A. S. Biselli, S. Boiarinov, M. Bondi, W. A. Booth, F. Bossù, K. -Th. Brinkmann, W. J. Briscoe, et al. (124 additional authors not shown)

Abstract: Measuring Deeply Virtual Compton Scattering on the neutron is one of the necessary steps to understand the structure of the nucleon in terms of Generalized Parton Distributions (GPDs). Neutron targets play a complementary role to transversely polarized proton targets in the determination of the GPD $E$. This poorly known and poorly constrained GPD is essential to obtain the contribution of the quarks' angular momentum to the spin of the nucleon. DVCS on the neutron was measured for the first time selecting the exclusive final state by detecting the neutron, using the Jefferson Lab longitudinally polarized electron beam, with energies up to 10.6 GeV, and the CLAS12 detector. The extracted beam-spin asymmetries, combined with DVCS observables measured on the proton, allow a clean quark-flavor separation of the imaginary parts of the GPDs $H$ and $E$.

Submitted 25 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

Comments: 7 pages, 6 figures

Report number: JLAB-PHY-24-4089

arXiv:2406.12246 [pdf, other]

TroL: Traversal of Layers for Large Language and Vision Models

Authors: Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro

Abstract: Large language and vision models (LLVMs) have been driven by the generalization power of large language models (LLMs) and the advent of visual instruction tuning. Along with scaling them up directly, these models enable LLVMs to showcase powerful vision language (VL) performances by covering diverse tasks via natural language instructions. However, existing open-source LLVMs that perform comparably to closed-source LLVMs such as GPT-4V are often considered too large (e.g., 26B, 34B, and 110B parameters), having a larger number of layers. These large models demand costly, high-end resources for both training and inference. To address this issue, we present a new efficient LLVM family with 1.8B, 3.8B, and 7B LLM model sizes, Traversal of Layers (TroL), which enables the reuse of layers in a token-wise manner. This layer traversing technique simulates the effect of looking back and retracing the answering stream while increasing the number of forward propagation layers without physically adding more layers. We demonstrate that TroL employs a simple layer traversing approach yet efficiently outperforms the open-source LLVMs with larger model sizes and rivals the performances of the closed-source LLVMs with substantial sizes.

Submitted 19 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: Code is available at https://github.com/ByungKwanLee/TroL

arXiv:2406.12095 [pdf, other]

DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features

Authors: Letian Wang, Seung Wook Kim, Jiawei Yang, Cunjun Yu, Boris Ivanovic, Steven L. Waslander, Yue Wang, Sanja Fidler, Marco Pavone, Peter Karkus

Abstract: We propose DistillNeRF, a self-supervised learning framework addressing the challenge of understanding 3D environments from limited 2D observations in autonomous driving. Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs, and is trained self-supervised with differentiable rendering to reconstruct RGB, depth, or feature images. Our first insight is to exploit per-scene optimized Neural Radiance Fields (NeRFs) by generating dense depth and virtual camera targets for training, thereby helping our model to learn 3D geometry from sparse non-overlapping image inputs. Second, to learn a semantically rich 3D representation, we propose distilling features from pre-trained 2D foundation models, such as CLIP or DINOv2, thereby enabling various downstream tasks without the need for costly 3D human annotations. To leverage these two insights, we introduce a novel model architecture with a two-stage lift-splat-shoot encoder and a parameterized sparse hierarchical voxel representation. Experimental results on the NuScenes dataset demonstrate that DistillNeRF significantly outperforms existing comparable self-supervised methods for scene reconstruction, novel view synthesis, and depth estimation; and it allows for competitive zero-shot 3D semantic occupancy prediction, as well as open-world scene understanding through distilled foundation model features. Demos and code will be available at https://distillnerf.github.io/.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11427 [pdf, other]

DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer

Authors: Keon Lee, Dong Won Kim, Jaehyeon Kim, Jaewoong Cho

Abstract: Large-scale diffusion models have shown outstanding generative abilities across multiple modalities including images, videos, and audio. However, text-to-speech (TTS) systems typically involve domain-specific modeling factors (e.g., phonemes and phoneme-level durations) to ensure precise temporal alignments between text and speech, which hinders the efficiency and scalability of diffusion models for TTS. In this work, we present an efficient and scalable Diffusion Transformer (DiT) that utilizes off-the-shelf pre-trained text and speech encoders. Our approach addresses the challenge of text-speech alignment via cross-attention mechanisms with the prediction of the total length of speech representations. To achieve this, we enhance the DiT architecture to suit TTS and improve the alignment by incorporating semantic guidance into the latent space of speech. We scale the training dataset and the model size to 82K hours and 790M parameters, respectively. Our extensive experiments demonstrate that the large-scale diffusion model for TTS without domain-specific modeling not only simplifies the training pipeline but also yields superior or comparable zero-shot performance to state-of-the-art TTS models in terms of naturalness, intelligibility, and speaker similarity. Our speech samples are available at https://ditto-tts.github.io.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11313 [pdf, other]

Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection

Authors: Yecheol Kim, Junho Lee, Changsoo Park, Hyoung won Kim, Inho Lim, Christopher Chang, Jun Won Choi

Abstract: 3D object detection is crucial for applications like autonomous driving and robotics. However, in real-world environments, variations in sensor data distribution due to sensor upgrades, weather changes, and geographic differences can adversely affect detection performance. Semi-Supervised Domain Adaptation (SSDA) aims to mitigate these challenges by transferring knowledge from a source domain, abundant in labeled data, to a target domain where labels are scarce. This paper presents a new SSDA method referred to as Target-Oriented Domain Augmentation (TODA) specifically tailored for LiDAR-based 3D object detection. TODA efficiently utilizes all available data, including labeled data in the source domain, and both labeled data and unlabeled data in the target domain to enhance domain adaptation performance. TODA consists of two stages: TargetMix and AdvMix. TargetMix employs mixing augmentation accounting for LiDAR sensor characteristics to facilitate feature alignment between the source-domain and target-domain. AdvMix applies point-wise adversarial augmentation with mixing augmentation, which perturbs the unlabeled data to align the features within both labeled and unlabeled data in the target domain. Our experiments conducted on the challenging domain adaptation tasks demonstrate that TODA outperforms existing domain adaptation techniques designed for 3D object detection by significant margins. The code is available at: https://github.com/rasd3/TODA.

Submitted 17 June, 2024; originally announced June 2024.

Comments: Accepted to IEEE Transactions on Intelligent Vehicles (T-IV). The code is available at: https://github.com/rasd3/TODA

arXiv:2406.10324 [pdf, other]

L4GM: Large 4D Gaussian Reconstruction Model

Authors: Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling

Abstract: We present L4GM, the first 4D Large Reconstruction Model that produces animated objects from a single-view video input -- in a single feed-forward pass that takes only a second. Key to our success is a novel dataset of multiview videos containing curated, rendered animated objects from Objaverse. This dataset depicts 44K diverse objects with 110K animations rendered in 48 viewpoints, resulting in 12M videos with a total of 300M frames. We keep our L4GM simple for scalability and build directly on top of LGM, a pretrained 3D Large Reconstruction Model that outputs 3D Gaussian ellipsoids from multiview image input. L4GM outputs a per-frame 3D Gaussian Splatting representation from video frames sampled at a low fps and then upsamples the representation to a higher fps to achieve temporal smoothness. We add temporal self-attention layers to the base LGM to help it learn consistency across time, and utilize a per-timestep multiview rendering loss to train the model. The representation is upsampled to a higher framerate by training an interpolation model which produces intermediate 3D Gaussian representations. We show that L4GM, although trained only on synthetic data, generalizes extremely well to in-the-wild videos, producing high quality animated 3D assets.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Project page: https://research.nvidia.com/labs/toronto-ai/l4gm

arXiv:2406.09698 [pdf, other]

Projected background and sensitivity of AMoRE-II

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, Seonho Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Y. M. Gavrilyuk, A. M. Gezhaev, O. Gileva, et al. (81 additional authors not shown)

Abstract: AMoRE-II aims to search for neutrinoless double beta decay with an array of 423 Li$_2$$^{100}$MoO$_4$ crystals operating in the cryogenic system as the main phase of the Advanced Molybdenum-based Rare process Experiment (AMoRE). AMoRE has been planned to operate in three phases: AMoRE-pilot, AMoRE-I, and AMoRE-II. AMoRE-II is currently being installed at the Yemi Underground Laboratory, located approximately 1000 meters deep in Jeongseon, Korea. The goal of AMoRE-II is to reach up to $T^{0νββ}_{1/2}$ $\sim$ 6 $\times$ 10$^{26}$ years, corresponding to an effective Majorana mass of 15 - 29 meV, covering all the inverted mass hierarchy regions. To achieve this, the background level of the experimental configurations and possible background sources of gamma and beta events should be well understood. We have intensively performed Monte Carlo simulations using the GEANT4 toolkit in all the experimental configurations with potential sources. We report the estimated background level that meets the 10$^{-4}$ counts/(keV$\cdot$kg$\cdot$yr) requirement for AMoRE-II in the region of interest (ROI) and show the projected half-life sensitivity based on the simulation study.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09188 [pdf, ps, other]

Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval

Authors: Jaeseok Byun, Seokhyeon Jeong, Wonjae Kim, Sanghyuk Chun, Taesup Moon

Abstract: Composed Image Retrieval (CIR) aims to retrieve a target image based on a reference image and conditioning text, enabling controllable searches. Due to the expensive dataset construction cost for CIR triplets, a zero-shot (ZS) CIR setting has been actively studied to eliminate the need for human-collected triplet datasets. The mainstream of ZS-CIR employs an efficient projection module that projects a CLIP image embedding to the CLIP text token embedding space, while fixing the CLIP encoders. Using the projected image embedding, these methods generate image-text composed features by using the pre-trained text encoder. However, their CLIP image and text encoders suffer from the task discrepancy between the pre-training task (text $\leftrightarrow$ image) and the target CIR task (image + text $\leftrightarrow$ image). Conceptually, we need expensive triplet samples to reduce the discrepancy, but we use cheap text triplets instead and update the text encoder. To that end, we introduce the Reducing Task Discrepancy of text encoders for Composed Image Retrieval (RTD), a plug-and-play training scheme for the text encoder that enhances its capability using a novel target-anchored text contrastive learning. We also propose two additional techniques to improve the proposed learning scheme: a hard negatives-based refined batch sampling strategy and a sophisticated concatenation scheme. Integrating RTD into the state-of-the-art projection-based ZS-CIR methods significantly improves performance across various datasets and backbones, demonstrating its efficiency and generalizability.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 17 pages

arXiv:2406.09145 [pdf, other]

The Cygnus Allscale Survey of Chemistry and Dynamical Environments: CASCADE III. The large scale distribution of DCO+, DNC and DCN in the DR21 filament

Authors: I. Barlach Christensen, F. Wyrowski, V. S. Veena, H. Beuther, D. Semenov, K. M. Menten, A. M. Jacob, W. -J. Kim, N. Cunningham, C. Gieser, A. Hacar, S. Li, N. Schneider, I. Skretas, J. M. Winters

Abstract: Deuterated molecules and their molecular D/H-ratios (RD(D)) are important diagnostic tools to study the physical conditions of star-forming regions. The degree of deuteration, RD(D), can be significantly enhanced over the elemental D/H-ratio depending on physical parameters. Within the Cygnus Allscale Survey of Chemistry and Dynamical Environments (CASCADE), we aim to explore the large-scale distribution of deuterated molecules in the nearby Cygnus-X region. We focus on the analysis of large-scale structures of deuterated molecules in the filamentary region hosting the prominent H II region DR21 and DR21(OH). Here we discuss the HCO+, HNC and HCN molecules and their deuterated isotopologues DCO+, DNC and DCN. The spatial distributions of integrated line emissions from DCO+, DNC, and DCN reveal morphological differences. DCO+ displays the most extended emission, characterized by several prominent peaks. Likewise, DNC exhibits multiple peaks, although its emission appears less extended compared to DCO+. In contrast to the extended emission of DCO+ and DNC, DCN appears the least extended, with distinct peaks. Focusing only on the regions where all three molecules are observed, the mean deuteration ratios for each species are 0.01 for both DNC and DCN, and 0.005 for DCO+. Anti-correlations are found with deuterated molecules and dust temperature or N(H2). The strongest anti-correlation is found with RD(DCO+) and N(H2). The anti-correlation of RD(DCO+) and N(H2) is suggested to be a result of a combination of an increased photodissociation degree and shocks. A strong positive correlation between the ratio of integrated intensities of DCN and DNC with their 13C-isotopologues is found in high column density regions. The positive relationship between the ratios implies that the D-isotopologue of the isomers could potentially serve as a tracer for the kinetic gas temperature.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 24 pages, 21 figures, accepted to A&A

arXiv:2406.08823 [pdf, other]

Effects of Halo Spin on Bar Formation in Disk Galaxies

Authors: Dajeong Jang, Woong-Tae Kim

Abstract: The spin of dark halos has been shown to significantly affect bar formation and evolution in disk galaxies. To understand the physical role of the halo spin on bar formation, we run $N$-body simulations of isolated, Milky Way-sized galaxies by varying the halo spin parameter in the range $-0.16 \leq λ\leq 0.16$ and the bulge mass. We find that our adopted halo \emph{alone} is subject to swing amplification of an $m=2$ non-axisymmetric mode rotating in the same sense as the halo, which assists or inhibits the bar formation in a disk depending on its sense of rotation. The $m=2$ mode in the disk, growing via swing amplification, interacts constructively (destructively) with the $m=2$ mode in the prograde (retrograde) halo, promoting (delaying) bar formation. A bar grows by losing its angular momentum primarily to a halo. Since the halo particles inside (outside) the corotation resonance with the bar can emit (absorb) angular momentum to (from) the bar, the bar pattern speed decays slower for larger $λ>0$, while it decreases relatively fast almost independent of $λ\leq0$. Models with a strong bar develop a boxy peanut-shaped bulge. In models without a bulge, this occurs rapidly via buckling instability, while the bars with a bulge thicken gradually without undergoing buckling instability. Among the models considered in the present work, the bar in the $λ= 0.06$ model with a bulge of 10\% of the disk mass best describes the Milky Way in terms of the bar length and pattern speed.

Submitted 13 June, 2024; originally announced June 2024.

Comments: accepted for publication in ApJ

arXiv:2406.08612 [pdf, other]

Observation of Declination Dependence in the Cosmic Ray Energy Spectrum

Authors: The Telescope Array Collaboration, R. U. Abbasi, T. Abu-Zayyad, M. Allen, J. W. Belz, D. R. Bergman, I. Buckland, W. Campbell, B. G. Cheon, K. Endo, A. Fedynitch, T. Fujii, K. Fujisue, K. Fujita, M. Fukushima, G. Furlich, Z. Gerber, N. Globus, W. Hanlon, N. Hayashida, H. He, K. Hibino, R. Higuchi, D. Ikeda, T. Ishii, et al. (101 additional authors not shown)

Abstract: We report on an observation of the difference between northern and southern skies of the ultrahigh energy cosmic ray energy spectrum with a significance of ${\sim}8σ$. We use measurements from the two largest experiments: the Telescope Array observing the northern hemisphere and the Pierre Auger Observatory viewing the southern hemisphere. Since the comparison of two measurements from different observatories introduces the issue of possible systematic differences between detectors and analyses, we validate the methodology of the comparison by examining the region of the sky where the apertures of the two observatories overlap. Although the spectra differ in this region, we find that there is only a $1.8σ$ difference between the spectrum measurements when anisotropic regions are removed and a fiducial cut in the aperture is applied.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 8 pages, 6 figures

arXiv:2406.08301 [pdf, other]

Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri, et al. (510 additional authors not shown)

Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

arXiv:2406.07867 [pdf, other]

Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation

Authors: Se Jin Park, Chae Won Kim, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeong Hun Yeo, Yong Man Ro

Abstract: In this paper, we introduce a novel Face-to-Face spoken dialogue model. It processes audio-visual speech from user input and generates audio-visual speech as the response, marking the initial step towards creating an avatar chatbot system without relying on intermediate text. To this end, we newly introduce MultiDialog, the first large-scale multimodal (i.e., audio and visual) spoken dialogue corpus containing 340 hours of approximately 9,000 dialogues, recorded based on the open domain dialogue dataset, TopicalChat. The MultiDialog contains parallel audio-visual recordings of conversation partners acting according to the given script with emotion annotations, which we expect to open up research opportunities in multimodal synthesis. Our Face-to-Face spoken dialogue model incorporates a textually pretrained large language model and adapts it into the audio-visual spoken dialogue domain by incorporating speech-text joint pretraining. Through extensive experiments, we validate the effectiveness of our model in facilitating a face-to-face conversation. Demo and data are available at https://multidialog.github.io and https://huggingface.co/datasets/IVLLab/MultiDialog, respectively.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Accepted to ACL 2024

arXiv:2406.06650 [pdf, other]

Predicting the risk of early-stage breast cancer recurrence using H&E-stained tissue images

Authors: Geongyu Lee, Joonho Lee, Tae-Yeong Kwak, Sun Woo Kim, Youngmee Kwon, Chungyeul Kim, Hyeyoon Chang

Abstract: Accurate prediction of the likelihood of recurrence is important in the selection of postoperative treatment for patients with early-stage breast cancer. In this study, we investigated whether deep learning algorithms can predict patients' risk of recurrence by analyzing the pathology images of their cancer histology. A total of 125 hematoxylin and eosin stained breast cancer whole slide images labeled with the risk prediction via genomics assays were used, and we obtained sensitivity of 0.857, 0.746, and 0.529 for predicting low, intermediate, and high risk, and specificity of 0.816, 0.803, and 0.972. When compared to the expert pathologist's regional histology grade information, a Pearson's correlation coefficient of 0.61 was obtained. When we checked the model learned through these studies through the class activation map, we found that it actually considered tubule formation and mitotic rate when predicting different risk groups.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 12 pages, 7 figures

arXiv:2406.06149 [pdf, other]

Decoupled Marked Temporal Point Process using Neural Ordinary Differential Equations

Authors: Yujee Song, Donghyun Lee, Rui Meng, Won Hwa Kim

Abstract: A Marked Temporal Point Process (MTPP) is a stochastic process whose realization is a set of event-time data. MTPP is often used to understand complex dynamics of asynchronous temporal events such as money transactions, social media, healthcare, etc. Recent studies have utilized deep neural networks to capture complex temporal dependencies of events and generate embeddings that aptly represent the observed events. While most previous studies focus on the inter-event dependencies and their representations, how individual events influence the overall dynamics over time has been under-explored. In this regime, we propose a Decoupled MTPP framework that disentangles characterization of a stochastic process into a set of evolving influences from different events. Our approach employs Neural Ordinary Differential Equations (Neural ODEs) to learn flexible continuous dynamics of these influences while simultaneously addressing multiple inference problems, such as density estimation and survival rate computation. We emphasize the significance of disentangling the influences by comparing our framework with state-of-the-art methods on real-life datasets, and provide analysis on the model behavior for potential applications.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 18 pages, 8 figures, The Twelfth International Conference on Learning Representations (ICLR 2024)

arXiv:2406.01897 [pdf, ps, other]

The quasilocal energy and thermodynamic first law in accelerating AdS black holes

Authors: Wontae Kim, Mungon Nam, Sang-Heon Yi

Abstract: We scrutinize the conserved energy of an accelerating AdS black hole by employing the off-shell quasilocal formalism, which amalgamates the ADT formalism with the covariant phase space approach. In the presence of conical singularities in the accelerating black hole, the energy expression is articulated through the surface term derived from our formalism. The essence of our analysis of the quasilocal energy resides in the surface contributions coming from the conical singularities as well as the conventional radial boundary. Consequently, the resultant conserved quasilocal energy naturally conforms to the thermodynamic first law for the black hole without necessitating any augmentation of thermodynamic variables. Additionally, we obtain the Smarr relation for the black hole using the differential operator method and the scaling argument of the relevant thermodynamic quantities.

Submitted 18 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: 15 pages, no figure, minor edits, references added

arXiv:2406.00267 [pdf, ps, other]

doi 10.1063/5.0202862

General Framework for Quantifying Dissipation Pathways in Open Quantum Systems. II. Numerical Validation and the Role of Non-Markovianity

Authors: Chang Woo Kim, Ignacio Franco

Abstract: In the previous paper [C. W. Kim and I. Franco, J. Chem. Phys. 160, 214111 (2024)], we developed a theory called MQME-D, which allows us to decompose the overall energy dissipation process in open quantum system dynamics into contributions by individual components of the bath when the subsystem dynamics is governed by a Markovian quantum master equation (MQME). Here, we contrast the predictions of MQME-D against the numerically exact results obtained by combining hierarchical equations of motion (HEOM) with a recently reported protocol for monitoring the statistics of the bath. Overall, MQME-D accurately captures the contributions of specific bath components to the overall dissipation while greatly reducing the computational cost as compared to exact computations using HEOM. The computations show that MQME-D exhibits errors originating from its inherent Markov approximation. We demonstrate that its accuracy can be significantly increased by incorporating non-Markovianity by exploiting time scale separations (TSS) in different components of the bath. Our work demonstrates that MQME-D combined with TSS can be reliably used to understand how energy is dissipated in realistic open quantum system dynamics.

Submitted 8 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

Journal ref: J. Chem. Phys. 160, 214112 (2024)

arXiv:2406.00266 [pdf, ps, other]

doi 10.1063/5.0202860

General Framework for Quantifying Dissipation Pathways in Open Quantum Systems. I. Theoretical Formulation

Authors: Chang Woo Kim, Ignacio Franco

Abstract: We present a general and practical theoretical framework to investigate how energy is dissipated in open quantum system dynamics. This is done by quantifying the contributions of individual bath components to the overall dissipation of the system. The framework is based on the Nakajima-Zwanzig projection operator technique which allows us to express the rate of energy dissipation into a specific bath degree of freedom by using traces of operator products. The approach captures system-bath interactions to all orders, but is based on second-order perturbation theory on the off-diagonal subsystem's couplings and a Markovian description of the bath. The usefulness of our theory is demonstrated by applying it to various models of open quantum systems involving harmonic oscillator or spin baths, and connecting the outcomes to existing results such as our previously reported formula derived for locally coupled harmonic bath [J. Chem. Phys. 154, 084109 (2021)]. We also prove that the dissipation calculated by our theory rigorously satisfies thermodynamic principles such as energy conservation and detailed balance. Overall, the strategy can be used to develop the theory and simulation of dissipation pathways to interpret and engineer the dynamics of open quantum systems.

Submitted 8 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

Journal ref: J. Chem. Phys. 160, 214111 (2024)

arXiv:2405.19961 [pdf, other]

Collective Variable Free Transition Path Sampling with Generative Flow Network

Authors: Kiyoung Seong, Seonghyun Park, Seonghwan Kim, Woo Youn Kim, Sungsoo Ahn

Abstract: Understanding transition paths between meta-stable states in molecular systems is fundamental for material design and drug discovery. However, sampling these paths via unbiased molecular dynamics simulations is computationally prohibitive due to the high energy barriers between the meta-stable states. Recent machine learning approaches are often restricted to simple systems or rely on collective variables (CVs) extracted from expensive domain knowledge. In this work, we propose to leverage generative flow networks (GFlowNets) to sample transition paths without relying on CVs. We reformulate the problem as amortized energy-based sampling over transition paths and train a neural bias potential by minimizing the squared log-ratio between the target distribution and the generator, derived from the flow matching objective of GFlowNets. Our evaluation on three proteins (Alanine Dipeptide, Polyproline Helix, and Chignolin) demonstrates that our approach, called TPS-GFN, generates more realistic and diverse transition paths than the previous CV-free machine learning approach.

Submitted 18 July, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: 8 pages, 5 figures, 2 tables

arXiv:2405.16861 [pdf, other]

NCIDiff: Non-covalent Interaction-generative Diffusion Model for Improving Reliability of 3D Molecule Generation Inside Protein Pocket

Authors: Joongwon Lee, Wonho Zhung, Woo Youn Kim

Abstract: Advancements in deep generative modeling have changed the paradigm of drug discovery. Among such approaches, target-aware methods that exploit 3D structures of protein pockets were spotlighted for generating ligand molecules with their plausible binding modes. While docking scores superficially assess the quality of generated ligands, closer inspection of the binding structures reveals the inconsistency in local interactions between a pocket and generated ligands. Here, we address the issue by explicitly generating non-covalent interactions (NCIs), which are universal patterns throughout protein-ligand complexes. Our proposed model, NCIDiff, simultaneously denoises NCI types of protein-ligand edges along with a 3D graph of a ligand molecule during the sampling. With the NCI-generating strategy, our model generates ligands with more reliable NCIs, especially outperforming the baseline diffusion-based models. We also adopted inpainting techniques on NCIs to further improve the quality of the generated molecules. Finally, we showcase the applicability of NCIDiff on drug design tasks for real-world settings with specialized objectives by guiding the generation process with desired NCI patterns.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16357 [pdf, other]

Exploring the Enigma of Neural Dynamics Through A Scattering-Transform Mixer Landscape for Riemannian Manifold

Authors: Tingting Dan, Ziquan Wei, Won Hwa Kim, Guorong Wu

Abstract: The human brain is a complex inter-wired system that exhibits spontaneous functional fluctuations. In spite of tremendous success in the experimental neuroscience field, a system-level understanding of how brain anatomy supports various neural activities remains elusive. Capitalizing on the unprecedented amount of neuroimaging data, we present a physics-informed deep model to uncover the coupling mechanism between brain structure and function through the lens of data geometry that is rooted in the widespread wiring topology of connections between distant brain regions. Since deciphering the puzzle of self-organized patterns in functional fluctuations is the gateway to understanding the emergence of cognition and behavior, we devise a geometric deep model to uncover manifold mapping functions that characterize the intrinsic feature representations of evolving functional fluctuations on the Riemannian manifold. In lieu of learning unconstrained mapping functions, we introduce a set of graph-harmonic scattering transforms to impose the brain-wide geometry on top of manifold mapping functions, which allows us to cast the manifold-based deep learning into an architecture reminiscent of MLP-Mixer (in computer vision) for Riemannian manifold.
As a proof-of-concept approach, we explore a neural-manifold perspective to understand the relationship between (static) brain structure and (dynamic) function, challenging the prevailing notion in cognitive neuroscience by proposing that neural activities are essentially excited by brain-wide oscillation waves living on the geometry of human connectomes, instead of being confined to focal areas.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 15 pages, 6 figures

MSC Class: 51H30 ACM Class: I.3.5

arXiv:2405.15574 [pdf, other]

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Authors: Byung-Kwan Lee, Chae Won Kim, Beomchan Park, Yong Man Ro

Abstract: The rapid development of large language and vision models (LLVMs) has been driven by advances in visual instruction tuning. Recently, open-source LLVMs have curated high-quality visual instruction tuning datasets and utilized additional vision encoders or multiple computer vision models in order to narrow the performance gap with powerful closed-source LLVMs. These advancements are attributed to multifaceted information required for diverse capabilities, including fundamental image understanding, real-world knowledge about common-sense and non-object concepts (e.g., charts, diagrams, symbols, signs, and math problems), and step-by-step procedures for solving complex questions. Drawing from the multifaceted information, we present a new efficient LLVM, Mamba-based traversal of rationales (Meteor), which leverages multifaceted rationale to enhance understanding and answering capabilities. To embed lengthy rationales containing abundant information, we employ the Mamba architecture, capable of processing sequential data with linear time complexity. We introduce a new concept of traversal of rationale that facilitates efficient embedding of rationale. Subsequently, the backbone multimodal language model (MLM) is trained to generate answers with the aid of rationale. Through these steps, Meteor achieves significant improvements in vision language performances across multiple evaluation benchmarks requiring diverse capabilities, without scaling up the model size or employing additional vision encoders and computer vision models.

Submitted 27 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: Code is available in https://github.com/ByungKwanLee/Meteor

arXiv:2405.14126 [pdf, other]

The Disappearance of Timestep Embedding in Modern Time-Dependent Neural Networks

Authors: Bum Jun Kim, Yoshinobu Kawahara, Sang Woo Kim

Abstract: Dynamical systems are often time-varying, whose modeling requires a function that evolves with respect to time. Recent studies such as the neural ordinary differential equation proposed a time-dependent neural network, which provides a neural network varying with respect to time. However, we claim that the architectural choice to build a time-dependent neural network significantly affects its time-awareness but still lacks sufficient validation in its current states. In this study, we conduct an in-depth analysis of the architecture of modern time-dependent neural networks. Here, we report a vulnerability of vanishing timestep embedding, which disables the time-awareness of a time-dependent neural network. Furthermore, we find that this vulnerability can also be observed in diffusion models because they employ a similar architecture that incorporates timestep embedding to discriminate between different timesteps during a diffusion process. Our analysis provides a detailed description of this phenomenon as well as several solutions to address the root cause. Through experiments on neural ordinary differential equations and diffusion models, we observed that ensuring alive time-awareness via proposed solutions boosted their performance, which implies that their current implementations lack sufficient time-dependency.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 14 pages, 7 figures

arXiv:2405.14115 [pdf, other]

Configuring Data Augmentations to Reduce Variance Shift in Positional Embedding of Vision Transformers

Authors: Bum Jun Kim, Sang Woo Kim

Abstract: Vision transformers (ViTs) have demonstrated remarkable performance in a variety of vision tasks. Despite their promising capabilities, training a ViT requires a large amount of diverse data. Several studies empirically found that using rich data augmentations, such as Mixup, Cutmix, and random erasing, is critical to the successful training of ViTs. Now, the use of rich data augmentations has become a standard practice in the current state. However, we report a vulnerability to this practice: Certain data augmentations such as Mixup cause a variance shift in the positional embedding of ViT, which has been a hidden factor that degrades the performance of ViT during the test phase. We claim that achieving a stable effect from positional embedding requires a specific condition on the image, which is often broken for the current data augmentation methods. We provide a detailed analysis of this problem as well as the correct configuration for these data augmentations to remove the side effects of variance shift. Experiments showed that adopting our guidelines improves the performance of ViTs compared with the current configuration of data augmentations.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 16 pages, 4 figures

arXiv:2405.07650 [pdf, other]

Arrow of Time in Estimation and Control: Duality Theory Beyond the Linear Gaussian Model

Authors: Jin Won Kim, Prashant G. Mehta

Abstract: Duality between estimation and control is a foundational concept in Control Theory. Most students learn about the elementary duality -- between observability and controllability -- in their first graduate course in linear systems theory. Therefore, it comes as a surprise that for a more general class of nonlinear stochastic systems (hidden Markov models or HMMs), duality is incomplete. Our objective in writing this article is two-fold: (i) To describe the difficulty in extending duality to HMMs; and (ii) To discuss its recent resolution by the authors. A key message is that the main difficulty in extending duality comes from time reversal in going from estimation to control. The reason for time reversal is explained with the aid of the familiar linear deterministic and linear Gaussian models. The explanation is used to motivate the difference between the linear and the nonlinear models. Once the difference is understood, duality for HMMs is described based on our recent work. The article also includes a comparison and discussion of the different types of duality considered in literature.

Submitted 27 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.02066 [pdf, other]

WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights

Authors: Youngdong Jang, Dong In Lee, MinHyuk Jang, Jong Wook Kim, Feng Yang, Sangpil Kim

Abstract: The advances in the Neural Radiance Fields (NeRF) research offer extensive applications in diverse domains, but protecting their copyrights has not yet been researched in depth. Recently, NeRF watermarking has been considered one of the pivotal solutions for safely deploying NeRF-based 3D representations. However, existing methods are designed to apply only to implicit or explicit NeRF representations. In this work, we introduce an innovative watermarking method that can be employed in both representations of NeRF. This is achieved by fine-tuning NeRF to embed binary messages in the rendering process. In detail, we propose utilizing the discrete wavelet transform in the NeRF space for watermarking. Furthermore, we adopt a deferred back-propagation technique and introduce a combination with the patch-wise loss to improve rendering quality and bit accuracy with minimum trade-offs. We evaluate our method in three different aspects: capacity, invisibility, and robustness of the embedded watermarks in the 2D-rendered images. Our method achieves state-of-the-art performance with faster training speed over the compared state-of-the-art methods.

Submitted 11 July, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

arXiv:2405.01127 [pdf, other]

Backward Map for Filter Stability Analysis

Authors: Jin Won Kim, Anant A. Joshi, Prashant G. Mehta

Abstract: In this paper, a backward map is introduced for the purposes of analysis of the nonlinear (stochastic) filter stability. The backward map is important because the filter-stability in the sense of $\chi^2$-divergence follows from showing a certain variance decay property for the backward map. To show this property requires additional assumptions on the model properties of the hidden Markov model (HMM). The analysis in this paper is based on introducing a Poincaré Inequality (PI) for HMMs with white noise observations. In finite state-space settings, PI is related to both the ergodicity of the Markov process as well as the observability of the HMM. It is shown that the Poincaré constant is positive if and only if the HMM is detectable.

Submitted 2 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2305.12850

arXiv:2405.00748 [pdf, other]

ChatGPT in Data Visualization Education: A Student Perspective

Authors: Nam Wook Kim, Hyung-Kwon Ko, Grace Myers, Benjamin Bach

Abstract: Unlike traditional educational chatbots that rely on pre-programmed responses, large-language model-driven chatbots, such as ChatGPT, demonstrate remarkable versatility and have the potential to serve as a dynamic resource for addressing student needs from understanding advanced concepts to solving complex problems. This work explores the impact of such technology on student learning in an interdisciplinary, project-oriented data visualization course. Throughout the semester, students engaged with ChatGPT across four distinct projects, including data visualizations and implementing them using a variety of tools including Tableau, D3, and Vega-lite. We collected conversation logs and reflection surveys from the students after each assignment. In addition, we conducted interviews with selected students to gain deeper insights into their overall experiences with ChatGPT. Our analysis examined the advantages and barriers of using ChatGPT, students' querying behavior, the types of assistance sought, and its impact on assignment outcomes and engagement. Based on the findings, we discuss design considerations for an educational solution that goes beyond the basic interface of ChatGPT, specifically tailored for data visualization education.

Submitted 30 April, 2024; originally announced May 2024.

Comments: 12 pages; 3 figures

arXiv:2405.00107 [pdf, other]

Impacts of bar-driven shear and shocks on star formation

Authors: Taehyun Kim, Dimitri A. Gadotti, Miguel Querejeta, Isabel Pérez, Almudena Zurita, Justus Neumann, Glenn van de Ven, Jairo Méndez-Abreu, Adriana de Lorenzo-Cáceres, Patricia Sánchez-Blázquez, Francesca Fragkoudi, Lucimara P. Martins, Luiz A. Silva-Lima, Woong-Tae Kim, Myeong-gu Park

Abstract: Bars drive gas inflow. As the gas flows inwards, shocks and shear occur along the bar dust lanes. Such shocks and shear can affect the star formation and change the gas properties. For four barred galaxies, we present Hα velocity gradient maps that highlight bar-driven shocks and shear using data from the PHANGS-MUSE and PHANGS-ALMA surveys which allow us to study bar kinematics in unprecedented detail. Velocity gradients are enhanced along the bar dust lanes, where shocks and shear are shown to occur in numerical simulations. Velocity gradient maps also efficiently pick up expanding shells around HII regions. We put pseudo slits on the regions where velocity gradients are enhanced and find that Hα and CO velocities jump up to ~170 km/s, even after removing the effects of circular motions due to the galaxy rotation. Enhanced velocity gradients either coincide with the peak of CO intensity along the bar dust lanes or are slightly offset from CO intensity peaks, depending on the objects. Using the BPT diagnostic, we identify the source of ionization on each spaxel and find that star formation is inhibited in the high velocity gradient regions of the bar, and the majority of those regions are classified as LINER or composite. This implies that star formation is inhibited where bar-driven shear and shocks are strong. Our results are consistent with the results from the numerical simulations that show star formation is inhibited in the bar where shear force is strong.

Submitted 30 April, 2024; originally announced May 2024.

Comments: 26 pages, Accepted for publication in ApJ

arXiv:2405.00021 [pdf, other]

SIMPLOT: Enhancing Chart Question Answering by Distilling Essentials

Authors: Wonjoong Kim, Sangwu Park, Yeonjun In, Seokwon Han, Chanyoung Park

Abstract: Recently, interpreting complex charts with logical reasoning has emerged as a challenge due to the development of vision-language models. A prior state-of-the-art (SOTA) model has presented an end-to-end method that leverages the vision-language model to convert charts into table format utilizing Large Language Model (LLM) for reasoning. However, unlike natural images, charts contain a mix of essential and irrelevant information required for chart reasoning, and we discover that this characteristic can lower the performance of chart-to-table extraction. In this paper, we introduce SIMPLOT, a method designed to extract only the elements necessary for chart reasoning. The proposed method involves two steps: 1) training to mimic a simple plot that contains only the essential information from a complex chart for table extraction, followed by 2) performing reasoning based on the table. Our model enables accurate chart reasoning without the need for additional annotations or datasets, and its effectiveness is demonstrated through various experiments. Furthermore, we propose a novel prompt mimicking how humans interpret charts for more accurate reasoning. Our source code is available at https://github.com/sangwu99/Simplot.

Submitted 17 June, 2024; v1 submitted 22 February, 2024; originally announced May 2024.

arXiv:2404.19111 [pdf, ps, other]

Hölder regularity for degenerate parabolic double-phase equations

Authors: Wontae Kim, Kristian Moring, Lauri Särkiö

Abstract: We prove that bounded weak solutions to degenerate parabolic double-phase equations of $p$-Laplace type are locally Hölder continuous. The proof is based on phase analysis and methods for the $p$-Laplace equation. In particular, the phase analysis determines whether the double-phase equation is locally similar to the $p$-Laplace or the $q$-Laplace equation.

Submitted 29 April, 2024; originally announced April 2024.

MSC Class: 35D30; 35K65; 35K92

arXiv:2404.17507 [pdf, other]

HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts

Authors: Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun

Abstract: In an era where the volume of data drives the effectiveness of self-supervised learning, the specificity and clarity of data semantics play a crucial role in model training. Addressing this, we introduce HYPerbolic Entailment filtering (HYPE), a novel methodology designed to meticulously extract modality-wise meaningful and well-aligned data from extensive, noisy image-text pair datasets. Our approach leverages hyperbolic embeddings and the concept of entailment cones to evaluate and filter out samples with meaningless or underspecified semantics, focusing on enhancing the specificity of each data sample. HYPE not only demonstrates a significant improvement in filtering efficiency but also sets a new state-of-the-art in the DataComp benchmark when combined with existing filtering techniques. This breakthrough showcases the potential of HYPE to refine the data selection process, thereby contributing to the development of more accurate and efficient self-supervised learning models. Additionally, the image specificity $ε_{i}$ can be independently applied to induce an image-only dataset from an image-text or image-only data pool for training image-only self-supervised models and showed superior performance when compared to the dataset induced by CLIP score.

Submitted 16 July, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

Comments: ECCV 2024; 33 pages, 4.5 MB

arXiv:2404.16344 [pdf]

Imaging Tunable Luttinger Liquid Systems in van der Waals Heterostructures

Authors: Hongyuan Li, Ziyu Xiang, Tianle Wang, Mit H. Naik, Woochang Kim, Jiahui Nie, Shiyu Li, Zhehao Ge, Zehao He, Yunbo Ou, Rounak Banerjee, Takashi Taniguchi, Kenji Watanabe, Sefaattin Tongay, Alex Zettl, Steven G. Louie, Michael P. Zaletel, Michael F. Crommie, Feng Wang

Abstract: One-dimensional (1D) interacting electrons are often described as a Luttinger liquid [1-4] having properties that are intrinsically different from Fermi liquids in higher dimensions [5,6]. 1D electrons in materials systems exhibit exotic quantum phenomena that can be tuned by both intra- and inter-1D-chain electronic interactions, but their experimental characterization can be challenging. Here we demonstrate that layer-stacking domain walls (DWs) in van der Waals heterostructures form a broadly tunable Luttinger liquid system including both isolated and coupled arrays. We have imaged the evolution of DW Luttinger liquids under different interaction regimes tuned by electron density using a novel scanning tunneling microscopy (STM) technique. Single DWs at low carrier density are highly susceptible to Wigner crystallization consistent with a spin-incoherent Luttinger liquid, while at intermediate densities dimerized Wigner crystals form due to an enhanced magneto-elastic coupling. Periodic arrays of DWs exhibit an interplay between intra- and inter-chain interactions that gives rise to new quantum phases. At low electron densities inter-chain interactions are dominant and induce a 2D electron crystal composed of phase-locked 1D Wigner crystals in a staggered configuration. Increased electron density causes intra-chain fluctuation potentials to dominate, leading to an electronic smectic liquid crystal phase where electrons are ordered with algebraic correlation decay along the chain direction but disordered between chains. Our work shows that layer-stacking DWs in 2D heterostructures offer new opportunities to explore Luttinger liquid physics.

Submitted 25 April, 2024; originally announced April 2024.

Showing 1–50 of 1,583 results for author: Kim, W