Search | arXiv e-print repository

Density-dependent quark mean-field model for nuclear matter and neutron stars

Authors: Kaixuan Huang, Jinniu Hu, Ying Zhang, Hong Shen

Abstract: We develop a density-dependent quark mean-field (DDQMF) model to study the properties of nuclear matter and neutron stars, where the coupling strength between $σ$ meson and nucleon is generated by the degree of freedom of quarks, while other meson coupling constants are regarded as density-dependent ones. Two values for the nucleon effective mass, $M^*_{N0}/M_N=0.556,~0.70$ at the saturation density are chosen based on the consideration of the core-collapse supernova simulation and finite nuclei when the meson-nucleon coupling constants are fixed. We find that the equation of state (EOS) of nuclear matter, the symmetry energy, the mass-radius relations, and the tidal deformabilities of neutron stars with larger nucleon effective mass are more sensitive to the skewness coefficient $J_0$. The EOSs with $M^*_{N0}/M_N=0.70$ are softer when the skewness coefficient $J_0=-800$ MeV. However, the maximum masses of the neutron star can be around $2.32M_\odot$ with $J_0=400$ MeV regardless of the value of the nucleon effective mass. By manipulating the coupling strength of the isovector meson to generate different slopes of symmetry energy, we construct the neutron star EOSs that can satisfy the different variables from the simultaneous mass-radius measurements of PSR J0030+0451, PSR J0740+6620 by the NICER collaboration, the mass-radius relations of HESS J1731-347, and the radius constraints from the gravitational-wave signal GW170817 in the framework of DDQMF model. At the same time, most of these constructed EOSs can also satisfy the constraints of the tidal deformability from GW170817 event.

Submitted 3 March, 2024; originally announced March 2024.

arXiv:2403.01507 [pdf, other]

ISSF: The Intelligent Security Service Framework for Cloud-Native Operation

Authors: Yikuan Yan, Keman Huang, Michael Siegel

Abstract: The growing system complexity from microservice architectures and the bilateral enhancement of artificial intelligence (AI) for both attackers and defenders presents increasing security challenges for cloud-native operations. In particular, cloud-native operators require a holistic view of the dynamic security posture for the cloud-native environment from a defense aspect. Additionally, both attackers and defenders can adopt advanced AI technologies. This makes the dynamic interaction and benchmark among different intelligent offense and defense strategies more crucial. Hence, following the multi-agent deep reinforcement learning (RL) paradigm, this research develops an agent-based intelligent security service framework (ISSF) for cloud-native operation. It includes a dynamic access graph model to represent the cloud-native environment and an action model to represent offense and defense actions. Then we develop an approach to enable the training, publishing, and evaluating of intelligent security services using diverse deep RL algorithms and training strategies, facilitating their systematic development and benchmark. The experiments demonstrate that our framework can sufficiently model the security posture of a cloud-native system for defenders, effectively develop and quantitatively benchmark different services for both attackers and defenders and guide further service optimization.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 17 pages

arXiv:2403.01364 [pdf, other]

Improving Cross-lingual Representation for Semantic Retrieval with Code-switching

Authors: Mieradilijiang Maimaiti, Yuanhang Zheng, Ji Zhang, Fei Huang, Yue Zhang, Wenpei Luo, Kaiyu Huang

Abstract: Semantic Retrieval (SR) has become an indispensable part of the FAQ system in the task-oriented question-answering (QA) dialogue scenario. The demands for a cross-lingual smart-customer-service system for an e-commerce platform or some particular business conditions have been increasing recently. Most previous studies exploit cross-lingual pre-trained models (PTMs) for multi-lingual knowledge retrieval directly, while some others also leverage the continual pre-training before fine-tuning PTMs on the downstream tasks. However, no matter which schema is used, the previous work fails to inform PTMs of some features of the downstream task, i.e., it trains the PTMs without providing any signals related to SR. To this end, in this work, we propose an Alternative Cross-lingual PTM for SR via code-switching. We are the first to utilize the code-switching approach for cross-lingual SR. Besides, we introduce the novel code-switched continual pre-training instead of directly using the PTMs on the SR tasks. The experimental results show that our proposed approach consistently outperforms the previous SOTA methods on SR and semantic textual similarity (STS) tasks with three business corpora and four open datasets in 20+ languages.

Submitted 2 March, 2024; originally announced March 2024.

arXiv:2403.00944 [pdf, other]

Optimizing Dynamic Balance in a Rat Robot via the Lateral Flexion of a Soft Actuated Spine

Authors: Yuhong Huang, Zhenshan Bing, Zitao Zhang, Genghang Zhuang, Kai Huang, Alois Knoll

Abstract: Balancing oneself using the spine is a physiological alignment of the body posture in the most efficient manner by the muscular forces for mammals. For this reason, we can see many disabled quadruped animals can still stand or walk even with three limbs. This paper investigates the optimization of dynamic balance during trot gait based on the spatial relationship between the center of mass (CoM) and support area influenced by spinal flexion. During trotting, the robot balance is significantly influenced by the distance of the CoM to the support area formed by diagonal footholds. In this context, lateral spinal flexion, which is able to modify the position of footholds, holds promise for optimizing balance during trotting. This paper explores this phenomenon using a rat robot equipped with a soft actuated spine. Based on the lateral flexion of the spine, we establish a kinematic model to quantify the impact of spinal flexion on robot balance during trot gait. Subsequently, we develop an optimized controller for spinal flexion, designed to enhance balance without altering the leg locomotion. The effectiveness of our proposed controller is evaluated through extensive simulations and physical experiments conducted on a rat robot. Compared to both a non-spine based trot gait controller and a trot gait controller with lateral spinal flexion, our proposed optimized controller effectively improves the dynamic balance of the robot and retains the desired locomotion during trotting.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.18853 [pdf, other]

Rethinking Multi-domain Generalization with A General Learning Objective

Authors: Zhaorui Tan, Xi Yang, Kaizhu Huang

Abstract: Multi-domain generalization (mDG) is universally aimed to minimize the discrepancy between training and testing distributions to enhance marginal-to-label distribution mapping. However, existing mDG literature lacks a general learning objective paradigm and often imposes constraints on static target marginal distributions. In this paper, we propose to leverage a $Y$-mapping to relax the constraint. We rethink the learning objective for mDG and design a new general learning objective to interpret and analyze most existing mDG wisdom. This general objective is bifurcated into two synergistic aims: learning domain-independent conditional features and maximizing a posterior. Explorations also extend to two effective regularization terms that incorporate prior information and suppress invalid causality, alleviating the issues that come with relaxed constraints. We theoretically contribute an upper bound for the domain alignment of domain-independent conditional features, disclosing that many previous mDG endeavors actually optimize the objective only partially and thus lead to limited performance. As such, our study distills a general learning objective into four practical components, providing a general, robust, and flexible mechanism to handle complex domain shifts. Extensive empirical results indicate that the proposed objective with $Y$-mapping leads to substantially better mDG performance in various downstream tasks, including regression, segmentation, and classification.

Submitted 29 February, 2024; originally announced February 2024.

Comments: Accepted by CVPR24

arXiv:2402.16070 [pdf, other]

High-order topological pumping on a superconducting quantum processor

Authors: Cheng-Lin Deng, Yu Liu, Yu-Ran Zhang, Xue-Gang Li, Tao Liu, Chi-Tong Chen, Tong Liu, Cong-Wei Lu, Yong-Yi Wang, Tian-Ming Li, Cai-Ping Fang, Si-Yun Zhou, Jia-Cheng Song, Yue-Shan Xu, Yang He, Zheng-He Liu, Kai-Xuan Huang, Zhong-Cheng Xiang, Jie-Ci Wang, Dong-Ning Zheng, Guang-Ming Xue, Kai Xu, H. F. Yu, Heng Fan

Abstract: High-order topological phases of matter refer to the systems of $n$-dimensional bulk with the topology of $m$-th order, exhibiting $(n-m)$-dimensional boundary modes and can be characterized by topological pumping. Here, we experimentally demonstrate two types of second-order topological pumps, forming four 0-dimensional corner localized states on a 4$\times$4 square lattice array of 16 superconducting qubits. The initial ground state of the system for half-filling, as a product of four identical entangled 4-qubit states, is prepared using an adiabatic scheme. During the pumping procedure, we adiabatically modulate the superlattice Bose-Hubbard Hamiltonian by precisely controlling both the hopping strengths and on-site potentials. At the half pumping period, the system evolves to a corner-localized state in a quadrupole configuration. The robustness of the second-order topological pump is also investigated by introducing different on-site disorder. Our work studies the topological properties of high-order topological phases from the dynamical transport picture using superconducting qubits, which would inspire further research on high-order topological phases.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.11207 [pdf, ps, other]

Search for the production of deuterons and antideuterons in $e^+e^-$ annihilation at center-of-mass energies between 4.13 and 4.70 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (593 additional authors not shown)

Abstract: Using a data sample of $e^+e^-$ collision data corresponding to an integrated luminosity of 19 fb$^{-1}$ collected with the BESIII detector at the BEPCII collider, we search for the production of deuterons and antideuterons via $e^+e^-\to ppπ^-\bar{d}+c.c.$ for the first time at center-of-mass energies between 4.13 and 4.70 GeV. No significant signal is observed and the upper limit of the $e^+e^-\to ppπ^-\bar{d}+c.c.$ cross section is determined to be from 9.0 to 145 fb depending on the center-of-mass energy at the $90\%$ confidence level.

Submitted 17 February, 2024; originally announced February 2024.

arXiv:2402.10721 [pdf, other]

doi 10.1051/0004-6361/202348787

Molecular isotopologue measurements toward super star clusters and the relation to their ages in NGC253 with ALCHEMI

Authors: J. Butterworth, S. Viti, P. P. Van der Werf, J. G. Mangum, S. Martín, N. Harada, K. L. Emig, S. Muller, K. Sakamoto, Y. Yoshimura, K. Tanaka, R. Herrero-Illana, L. Colzi, V. M. Rivilla, K. Y. Huang, M. Bouvier, E. Behrens, C. Henkel, Y. T. Yan, D. S. Meier, D. Zhou

Abstract: Determining the evolution of the CNO isotopes in the interstellar medium (ISM) of starburst galaxies can yield important constraints on the ages of superstar clusters (SSCs), or on other aspects and contributing factors of their evolution. Due to the time-dependent nature of the abundances of isotopes within the ISM as they are supplied from processes such as nucleosynthesis or chemical fractionation, this provides the possible opportunity to probe the ability of isotope ratios to trace the ages of high star forming regions, such as SSCs. The goal of this study is to investigate whether the isotopic variations in SSC regions within NGC253 are correlated with their different ages as derived from stellar population modelling. We have measured abundance ratios of CO, HCN and HCO$^+$ isotopologues in six regions containing SSCs within NGC253 using high spatial resolution (1.6", $\sim 28$ pc) data from the ALCHEMI (ALma Comprehensive High-resolution Extragalactic Molecular Inventory) ALMA Large program. We have then analysed these ratios using RADEX radiative transfer modelling, with the parameter space sampled using the nested sampling Monte Carlo algorithm MLFriends. These abundance ratios were then compared to ages predicted in each region via the fitting of observed star formation tracers (such as Br$γ$) to starburst stellar population evolution models. We do not find any significant trend with age for the CO and HCN isotopologue ratios on the timescales for the ages of the SSC* regions observed. The driving factors of these ratios within SSCs could be the Initial Mass Function as well as possibly fractionation effects. To further probe these effects in SSCs over time a larger sample of SSCs must be observed spanning a larger age range.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 44 pages, 43 Figures, Accepted for Publication to A&A

Journal ref: A&A 686, A31 (2024)

arXiv:2402.10158 [pdf, other]

InfoNet: Neural Estimation of Mutual Information without Test-Time Optimization

Authors: Zhengyang Hu, Song Kang, Qunsong Zeng, Kaibin Huang, Yanchao Yang

Abstract: Estimating mutual correlations between random variables or data streams is essential for intelligent behavior and decision-making. As a fundamental quantity for measuring statistical relationships, mutual information has been extensively studied and utilized for its generality and equitability. However, existing methods often lack the efficiency needed for real-time applications, such as test-time optimization of a neural network, or the differentiability required for end-to-end learning, like histograms. We introduce a neural network called InfoNet, which directly outputs mutual information estimations of data streams by leveraging the attention mechanism and the computational efficiency of deep learning infrastructures. By maximizing a dual formulation of mutual information through large-scale simulated training, our approach circumvents time-consuming test-time optimization and offers generalization ability. We evaluate the effectiveness and generalization of our proposed mutual information estimation scheme on various families of distributions and applications. Our results demonstrate that InfoNet and its training process provide a graceful efficiency-accuracy trade-off and order-preserving properties. We will make the code and models available as a comprehensive toolbox to facilitate studies in different fields requiring real-time mutual information estimation.

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.09714 [pdf, other]

An Accelerated Distributed Stochastic Gradient Method with Momentum

Authors: Kun Huang, Shi Pu, Angelia Nedić

Abstract: In this paper, we introduce an accelerated distributed stochastic gradient method with momentum for solving the distributed optimization problem, where a group of $n$ agents collaboratively minimize the average of the local objective functions over a connected network. The method, termed ``Distributed Stochastic Momentum Tracking (DSMT)'', is a single-loop algorithm that utilizes the momentum tracking technique as well as the Loopless Chebyshev Acceleration (LCA) method. We show that DSMT can asymptotically achieve comparable convergence rates as centralized stochastic gradient descent (SGD) method under a general variance condition regarding the stochastic gradients. Moreover, the number of iterations (transient times) required for DSMT to achieve such rates behaves as $\mathcal{O}(n^{5/3}/(1-λ))$ for minimizing general smooth objective functions, and $\mathcal{O}(\sqrt{n/(1-λ)})$ under the Polyak-Łojasiewicz (PL) condition. Here, the term $1-λ$ denotes the spectral gap of the mixing matrix related to the underlying network topology. Notably, the obtained results do not rely on multiple inter-node communications or stochastic gradient accumulation per iteration, and the transient times are the shortest under the setting to the best of our knowledge.

Submitted 18 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: 32 pages, 5 figures

arXiv:2402.07401 [pdf, other]

Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate

Authors: Kyungha Kim, Sangyun Lee, Kung-Hsiang Huang, Hou Pong Chan, Manling Li, Heng Ji

Abstract: Fact-checking research has extensively explored verification but less so the generation of natural-language explanations, crucial for user trust. While Large Language Models (LLMs) excel in text generation, their capability for producing faithful explanations in fact-checking remains underexamined. Our study investigates LLMs' ability to generate such explanations, finding that zero-shot prompts often result in unfaithfulness. To address these challenges, we propose the Multi-Agent Debate Refinement (MADR) framework, leveraging multiple LLMs as agents with diverse roles in an iterative refining process aimed at enhancing faithfulness in generated explanations. MADR ensures that the final explanation undergoes rigorous validation, significantly reducing the likelihood of unfaithful elements and aligning closely with the provided evidence. Experimental results demonstrate that MADR significantly improves the faithfulness of LLM-generated explanations to the evidence, advancing the credibility and trustworthiness of these explanations.

Submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.05162 [pdf, other]

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Authors: Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson

Abstract: Large language models (LLMs) show inherent brittleness in their safety mechanisms, as evidenced by their susceptibility to jailbreaking and even non-malicious fine-tuning. This study explores this brittleness of safety alignment by leveraging pruning and low-rank modifications. We develop methods to identify critical regions that are vital for safety guardrails, and that are disentangled from utility-relevant regions at both the neuron and rank levels. Surprisingly, the isolated regions we find are sparse, comprising about $3\%$ at the parameter level and $2.5\%$ at the rank level. Removing these regions compromises safety without significantly impacting utility, corroborating the inherent brittleness of the model's safety mechanisms. Moreover, we show that LLMs remain vulnerable to low-cost fine-tuning attacks even when modifications to the safety-critical regions are restricted. These findings underscore the urgent need for more robust safety strategies in LLMs.

Submitted 1 July, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: 22 pages, 9 figures. Project page is available at https://boyiwei.com/alignment-attribution/

arXiv:2402.03829 [pdf, ps, other]

Precise Measurement of Born Cross Sections for $e^+e^-\to D\bar{D}$ and Observation of One Structure between $\sqrt{s} = 3.80-4.95$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (604 additional authors not shown)

Abstract: Using data samples collected with the BESIII detector at the BEPCII collider at center-of-mass energies ranging from 3.80 to 4.95 GeV, corresponding to an integrated luminosity of 20 fb$^{-1}$, a measurement of Born cross sections for the $e^+e^-\to D^{0}\bar{D}^{0}$ and $D^{+}D^{-}$ processes is presented with unprecedented precision. By performing a simultaneous fit to the dressed cross sections for both processes, one possible new structure around 3.9 GeV/$c^2$ is observed for the first time, in addition to seven known resonances $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $Y(4230)$, $Y(4360)$, $ψ(4415)$, and $Y(4660)$. These results offer crucial experimental insights into the nature of hadron production in the open charm region.

Submitted 6 February, 2024; originally announced February 2024.

Comments: 9 pages, 4 figures, 1 table, 1 Supplemental_Material

arXiv:2402.03470 [pdf, ps, other]

A Study Guide for "A Restriction Estimate using Polynomial Partitioning"

Authors: John Green, Terry Harris, Kaiyi Huang, Arian Nadjimzadah

Abstract: This manuscript is intended as an accompaniment to Guth's "A restriction estimate using polynomial partitioning". We begin by summarizing the core ideas of the proof, elaborating the history and development of the techniques therein. From there, we provide supplementary details on some of the standard methods and more technical arguments which may be unfamiliar or less accessible to readers not yet acquainted with the paper. We also provide a summary of some more recent developments since the publication of Guth's work.

Submitted 5 February, 2024; originally announced February 2024.

Comments: Study guide written at UPenn Study Guide Writing Workshop 2023 https://sites.google.com/view/studyguideworkshop2023/home

MSC Class: 42B20; 42-02

arXiv:2402.02935 [pdf, other]

doi 10.1016/j.adt.2024.101661

Nuclear mass table in deformed relativistic Hartree-Bogoliubov theory in continuum, II: Even-$Z$ nuclei

Authors: DRHBc Mass Table Collaboration, Peng Guo, Xiaojie Cao, Kangmin Chen, Zhihui Chen, Myung-Ki Cheoun, Yong-Beom Choi, Pak Chung Lam, Wenmin Deng, Jianmin Dong, Pengxiang Du, Xiaokai Du, Kangda Duan, Xiaohua Fan, Wei Gao, Lisheng Geng, Eunja Ha, Xiao-Tao He, Jinniu Hu, Jingke Huang, Kun Huang, Yanan Huang, Zidan Huang, Kim Da Hyung, Hoi Yat Chan, et al. (58 additional authors not shown)

Abstract: The mass table in the deformed relativistic Hartree-Bogoliubov theory in continuum (DRHBc) with the PC-PK1 density functional has been established for even-$Z$ nuclei with $8\le Z\le120$, extended from the previous work for even-even nuclei [Zhang et al. (DRHBc Mass Table Collaboration), At. Data Nucl. Data Tables 144, 101488 (2022)]. The calculated binding energies, two-nucleon and one-neutron separation energies, root-mean-square (rms) radii of neutron, proton, matter, and charge distributions, quadrupole deformations, and neutron and proton Fermi surfaces are tabulated and compared with available experimental data. A total of 4829 even-$Z$ nuclei are predicted to be bound, with an rms deviation of 1.477 MeV from the 1244 mass data. Good agreement with the available experimental odd-even mass differences, $α$ decay energies, and charge radii is also achieved. The description accuracy for nuclear masses and nucleon separation energies as well as the prediction for drip lines is compared with the results obtained from other relativistic and nonrelativistic density functionals. The comparison shows that the DRHBc theory with PC-PK1 provides an excellent microscopic description for the masses of even-$Z$ nuclei. The systematics of the nucleon separation energies, odd-even mass differences, pairing energies, two-nucleon gaps, $α$ decay energies, rms radii, quadrupole deformations, potential energy curves, neutron density distributions, and neutron mean-field potentials are discussed.

Submitted 10 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: 394 pages, 17 figures, 2 tables, published in Atomic Data and Nuclear Data Tables, data file in the TXT form is available for download under "Ancillary files"

Journal ref: Peng Guo, et. al. (DRHBc Mass Table Collaboration), Atomic Data and Nuclear Data Tables 158 (2024) 101661

arXiv:2402.01993 [pdf, other]

Measurement of the Electromagnetic Transition Form-factors in the decays $η'\rightarrowπ^+π^-l^+l^-$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (618 additional authors not shown)

Abstract: With a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events accumulated with the BESIII detector, we analyze the decays $η'\rightarrowπ^+π^-l^+l^-(l=e,$ $μ)$ via the process $J/ψ\rightarrowγη'$. The branching fractions are measured to be $\mathcal{B}(η'\rightarrowπ^+π^-e^+e^-)=(2.45\pm0.02(\rm{stat.})\pm0.08(\rm{syst.})) \times10^{-3}$ and $\mathcal{B}(η'\rightarrowπ^+π^-μ^+μ^-)=(2.16\pm0.12(\rm{stat.})\pm0.06(\rm{syst.}))\times10^{-5}$, and the ratio is $\frac{\mathcal{B}(η'\rightarrowπ^{+}π^{-}e^{+}e^{-})}{\mathcal{B}(η'\rightarrowπ^{+}π^{-}μ^{+}μ^{-})} = 113.4\pm0.9(\rm{stat.})\pm3.7(\rm{syst.})$. In addition, by combining the $η'\rightarrowπ^+π^-e^+e^-$ and $η'\rightarrowπ^+π^-μ^+μ^-$ decays, the slope parameter of the electromagnetic transition form factor is measured to be $b_{η'}=1.30\pm0.19\ (\mathrm{GeV}/c^{2})^{-2}$, which is consistent with previous measurements from BESIII and theoretical predictions from the VMD model. The asymmetry in the angle between the $π^+π^-$ and $l^+l^-$ decay planes, which has the potential to reveal the $CP$-violation originating from an unconventional electric dipole transition, is also investigated. The asymmetry parameters are determined to be $\mathcal{A}_{CP}(η'\rightarrowπ^+π^-e^+e^-)=(-0.21\pm0.73(\rm{stat.})\pm0.01(\rm{syst.}))\%$ and $\mathcal{A}_{CP}(η'\rightarrowπ^+π^-μ^+μ^-)=(0.62\pm4.71(\rm{stat.})\pm0.08(\rm{syst.}))\%$, implying that no evidence of $CP$-violation is observed at the present statistics. Finally, an axion-like particle is searched for via the decay $η'\rightarrowπ^+π^-a, a\rightarrow e^+e^-$, and upper limits of the branching fractions are presented for the mass assumptions of the axion-like particle in the range of $0-500\ \mathrm{MeV}/c^{2}$.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.17873 [pdf, other]

Measurements of Normalized Differential Cross Sections of Inclusive $η$ Production in $e^{+}e^{-}$ Annihilation at Energy from 2.0000 to 3.6710 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, D. Anderle, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, et al. (641 additional authors not shown)

Abstract: Using data samples collected with the BESIII detector operating at the BEPCII storage ring, the cross section of the inclusive process $e^{+}e^{-} \to η+ X$, normalized by the total cross section of $e^{+}e^{-} \to \text{hadrons}$, is measured at eight center-of-mass energy points from 2.0000 GeV to 3.6710 GeV. These are the first measurements with momentum dependence in this energy region. Our measurement shows a significant discrepancy from calculations with the existing fragmentation functions. To address this discrepancy, a new QCD analysis is performed at the next-to-next-to-leading order with hadron mass corrections and higher twist effects, which can explain both the established high-energy data and our measurements reasonably well.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 9 pages, 2 figures

arXiv:2401.16659 [pdf, other]

History-Aware Conversational Dense Retrieval

Authors: Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

Abstract: Conversational search facilitates complex information retrieval by enabling multi-turn interactions between users and the system. Supporting such interactions requires a comprehensive understanding of the conversational inputs to formulate a good search query based on historical information. In particular, the search query should include the relevant information from the previous conversation turns. However, current approaches for conversational dense retrieval primarily rely on fine-tuning a pre-trained ad-hoc retriever using the whole conversational search session, which can be lengthy and noisy. Moreover, existing approaches are limited by the amount of manual supervision signals in the existing datasets. To address the aforementioned issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns. Experiments on two public conversational search datasets demonstrate the improved history modeling capability of HAConvDR, in particular for long conversations with topic shifts.

Submitted 28 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted to Findings of ACL 2024

arXiv:2401.15071 [pdf, other]

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Authors: Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, Limin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, et al. (11 additional authors not shown)

Abstract: Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents. However, there is still a wide gap between the performance of recent MLLM-based applications and the expectation of the broad public, even though the most powerful OpenAI's GPT-4 and Google's Gemini have been deployed. This paper strives to enhance understanding of the gap through the lens of a qualitative study on the generalizability, trustworthiness, and causal reasoning capabilities of recent proprietary and open-source MLLMs across four modalities, i.e., text, code, image, and video, ultimately aiming to improve the transparency of MLLMs. We believe these properties are several representative factors that define the reliability of MLLMs in supporting various downstream applications. To be specific, we evaluate the closed-source GPT-4 and Gemini and 6 open-source LLMs and MLLMs. Overall we evaluate 230 manually designed cases, where the qualitative results are then summarized into 12 scores (i.e., 4 modalities times 3 properties). In total, we uncover 14 empirical findings that are useful to understand the capabilities and limitations of both proprietary and open-source MLLMs, towards more reliable downstream multi-modal applications.

Submitted 29 January, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.14720 [pdf, ps, other]

Observation of structures in the processes $e^+e^-\rightarrowωχ_{c1}$ and $ωχ_{c2}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (608 additional authors not shown)

Abstract: We present measurements of the Born cross sections for the processes $e^+e^-\rightarrowωχ_{c1}$ and $ωχ_{c2}$ at center-of-mass energies $\sqrt{s}$ from 4.308 to 4.951 GeV. The measurements are performed with data samples corresponding to an integrated luminosity of 11.0 $\rm{fb}^{-1}$ collected with the BESIII detector operating at the BEPCII storage ring. Assuming the $e^+e^-\rightarrowωχ_{c2}$ signals come from a single resonance, the mass and width are determined to be $M=(4413.6\pm9.0\pm0.8)$ MeV/$c^2$ and $Γ=(110.5\pm15.0\pm2.9)$ MeV, respectively, which is consistent with the parameters of the well-established resonance $ψ(4415)$. In addition, we also use one single resonance to describe the $e^+e^-\rightarrowωχ_{c1}$ lineshape, and determine the mass and width to be $M=(4544.2\pm18.7\pm1.7)$ MeV/$c^2$ and $Γ=(116.1\pm33.5\pm1.7)$ MeV, respectively. The structure of this lineshape, observed for the first time, requires further understanding.

Submitted 24 March, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

Comments: 11 pages, 8 figures, with Supplemental Material

arXiv:2401.14711 [pdf, other]

Study of $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ at $\sqrt{s}$ from 2.00 to 3.08 GeV at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (608 additional authors not shown)

Abstract: With the data samples taken at center-of-mass energies from 2.00 to 3.08 GeV with the BESIII detector at the BEPCII collider, a partial wave analysis on the $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ process is performed. The Born cross sections for $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ and its intermediate processes $e^{+}e^{-}\rightarrowρπ$ and $ρ(1450)π$ are measured as functions of $\sqrt{s}$. The results for $e^{+}e^{-}\rightarrowπ^{+}π^{-}π^{0}$ are consistent with previous results measured with the initial state radiation method within one standard deviation, and improve the uncertainty by a factor of ten. By fitting the line shapes of the Born cross sections for the $e^{+}e^{-}\rightarrowρπ$ and $ρ(1450)π$, a structure with mass $M = 2119\pm11\pm15\ {\rm MeV}/c^2$ and width $Γ=69\pm30\pm5\ {\rm MeV}$ is observed with a significance of $5.9σ$, where the first uncertainties are statistical and the second ones are systematic. This structure can be interpreted as an excited $ω$ state.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.13598 [pdf, other]

Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction

Authors: Qi Sun, Kun Huang, Xiaocui Yang, Rong Tong, Kun Zhang, Soujanya Poria

Abstract: Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities with semantic relations from a document. Existing methods heavily rely on a substantial amount of fully labeled data. However, collecting and annotating data for newly emerging relations is time-consuming and labor-intensive. Recent advanced Large Language Models (LLMs), such as ChatGPT and LLaMA, exhibit impressive long-text generation capabilities, inspiring us to explore an alternative approach for obtaining auto-labeled documents with new relations. In this paper, we propose a Zero-shot Document-level Relation Triplet Extraction (ZeroDocRTE) framework, which generates labeled data by retrieval and denoising knowledge from LLMs, called GenRDK. Specifically, we propose a chain-of-retrieval prompt to guide ChatGPT to generate labeled long-text data step by step. To improve the quality of synthetic data, we propose a denoising strategy based on the consistency of cross-document knowledge. Leveraging our denoised synthetic data, we proceed to fine-tune the LLaMA2-13B-Chat for extracting document-level relation triplets. We perform experiments for both zero-shot document-level relation and triplet extraction on two public datasets. The experimental results illustrate that our GenRDK framework outperforms strong baselines.

Submitted 24 January, 2024; originally announced January 2024.

Comments: Accepted by WWW 2024

arXiv:2401.13225 [pdf, ps, other]

A New Look at the Scalar Meson $f_0(500)$ via $D^+\to π^+π^-\ell^+ν_\ell$ Decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, X. Cai, et al. (615 additional authors not shown)

Abstract: Using $2.93~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy of 3.773 GeV, we investigate the semileptonic decays $D^+\to π^+π^- \ell^+ν_\ell$ ($\ell=e$ and $μ$). The $D^+\to f_0(500)μ^+ν_μ$ decay is observed for the first time. By analyzing simultaneously the differential decay rates of $D^+\to f_0(500) μ^+ν_μ$ and $D^+\to f_0(500) e^+ν_e$ in different $\ell^+ν_\ell$ four-momentum transfer intervals, the product of the relevant hadronic form factor $f^{f_0}_{+}(0)$ and the magnitude of the $c\to d$ Cabibbo-Kobayashi-Maskawa matrix element $|V_{cd}|$ is determined to be $f_{+}^{f_0} (0)|V_{cd}|=0.0787\pm0.0060_{\rm stat}\pm0.0033_{\rm syst}$ for the first time. With the input of $|V_{cd}|$ from the global fit in the standard model, we determine $f_{+}^{f_0} (0)=0.350\pm0.027_{\rm stat}\pm0.015_{\rm syst}$. The absolute branching fractions of $D^+\to f_0(500)_{(π^+π^-)}μ^+ν_μ$ and $D^+\to ρ^0_{(π^+π^-)} μ^+ν_μ$ are determined as $(0.72\pm0.13_{\rm stat}\pm0.10_{\rm syst})\times10^{-3}$ and $(1.64\pm0.13_{\rm stat}\pm0.11_{\rm syst})\times 10^{-3}$. Combining these results with those of previous BESIII measurements on their semielectronic counterparts from the same data sample, we test lepton flavor universality by measuring the branching fraction ratios ${\mathcal B}_{D^+\to ρ^0 μ^+ν_μ}/{\mathcal B}_{D^+\to ρ^0 e^+ν_e}=0.88\pm0.10$ and ${\mathcal B}_{D^+\to f_0(500) μ^+ν_μ}/{\mathcal B}_{D^+\to f_0(500) e^+ν_e}=1.14\pm0.28$, which are compatible with the standard model expectation.

Submitted 4 February, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: Supplemental Materials added in this version

Report number: BAM-00660

arXiv:2401.12447 [pdf, other]

NIV-SSD: Neighbor IoU-Voting Single-Stage Object Detector From Point Cloud

Authors: Shuai Liu, Di Wang, Quan Wang, Kai Huang

Abstract: Previous single-stage detectors typically suffer from misalignment between localization accuracy and classification confidence. To solve the misalignment problem, we introduce a novel rectification method named neighbor IoU-voting (NIV) strategy. Typically, classification and regression are treated as separate branches, making it challenging to establish a connection between them. Consequently, the classification confidence cannot accurately reflect the regression quality. NIV strategy can serve as a bridge between classification and regression branches by calculating two types of statistical data from the regression output to correct the classification confidence. Furthermore, to alleviate the imbalance of detection accuracy for complete objects with dense points (easy objects) and incomplete objects with sparse points (difficult objects), we propose a new data augmentation scheme named object resampling. It undersamples easy objects and oversamples difficult objects by randomly transforming part of easy objects into difficult objects. Finally, combining the NIV strategy and object resampling augmentation, we design an efficient single-stage detector termed NIV-SSD. Extensive experiments on several datasets indicate the effectiveness of the NIV strategy and the competitive performance of the NIV-SSD detector. The code will be available at https://github.com/Say2L/NIV-SSD.

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2401.09468 [pdf, other]

doi 10.1007/JHEP05(2024)022

Measurement of Born cross section of $e^{+}e^{-}\rightarrowΣ^{+}\barΣ^{-}$ at center-of-mass energies between 3.510 and 4.951 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (632 additional authors not shown)

Abstract: Using 24.1 fb$^{-1}$ of $e^{+}e^{-}$ collision data collected with the BESIII detector at the BEPCII collider, the Born cross sections and effective form factors of the $e^{+}e^{-}\rightarrowΣ^{+}\barΣ^{-}$ reaction are measured. The measurements are performed at center-of-mass energies ranging from 3.510 to 4.951 GeV. No significant evidence for the decay of the charmonium(-like) states, $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $Y(4230)$, $Y(4360)$, $ψ(4415)$, and $Y(4660)$, into a $Σ^{+}\barΣ^{-}$ final state is observed. Consequently, upper limits for the products of the branching fractions and the electronic partial widths at the 90% confidence level are reported for these decays.

Submitted 6 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

Comments: 22 pages, 3 figures, 3 tables, consistent with the publication in JHEP05(2024)022

Journal ref: JHEP05(2024)022

arXiv:2401.09225 [pdf, other]

First measurements of the absolute branching fraction of $Λ_{c}(2625)^{+}\to Λ^{+}_{c}π^+π^-$ and upper limit on $Λ_{c}(2595)^{+}\to Λ^{+}_{c}π^+π^-$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, et al. (603 additional authors not shown)

Abstract: The absolute branching fraction of the decay $Λ_{c}(2625)^{+}\to Λ^{+}_{c}π^+π^-$ is measured for the first time to be $(50.7 \pm 5.0_{\rm{stat.}} \pm 4.9_{\rm{syst.}} )\%$ with 368.48 pb$^{-1}$ of $e^+e^-$ collision data collected by the BESIII detector at the center-of-mass energies of $\sqrt{s} = 4.918$ and $4.950$ GeV. This result is lower than the naive prediction of 67%, obtained from isospin symmetry, by more than $2σ$, thereby indicating that the novel mechanism referred to as the "threshold effect", proposed for the strong decays of $Λ_{c}(2595)^{+}$, also applies to $Λ_{c}(2625)^{+}$. This measurement is necessary to obtain the coupling constants for the transitions between $s$-wave and $p$-wave charmed baryons in heavy hadron chiral perturbation theory. In addition, we search for the decay $Λ_{c}(2595)^{+}\to Λ^{+}_{c}π^+π^-$. No significant signal is observed, and the upper limit on its branching fraction is determined to be 80.8% at the 90% confidence level.

Submitted 17 January, 2024; originally announced January 2024.

Comments: 8 pages, 6 figures

arXiv:2401.09136 [pdf, other]

doi 10.1103/PhysRevD.109.072001

Improved measurements of the Dalitz decays $η/η'\rightarrowγe^{+}e^{-}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (618 additional authors not shown)

Abstract: Based on a data sample of 10 billion $J/ψ$ events collected with the BESIII detector, improved measurements of the Dalitz decays $η/η'\rightarrowγe^+e^-$ are performed, where the $η$ and $η'$ are produced through the radiative decays $J/ψ\rightarrowγη/η'$. The branching fractions of $η\rightarrowγe^+e^-$ and $η'\rightarrowγe^+e^-$ are measured to be $(7.07 \pm 0.05 \pm 0.23)\times10^{-3}$ and $(4.83\pm0.07\pm0.14)\times10^{-4}$, respectively. Within the single pole model, the parameter of electromagnetic transition form factor for $η\rightarrowγe^+e^-$ is determined to be $Λ_η=(0.749 \pm 0.027 \pm 0.007)~ {\rm GeV}/c^{2}$. Within the multi-pole model, we extract the electromagnetic transition form factors for $η'\rightarrowγe^+e^-$ to be $Λ_{η'} = (0.802 \pm 0.007\pm 0.008)~ {\rm GeV}/c^{2}$ and $γ_{η'} = (0.113\pm0.010\pm0.002)~ {\rm GeV}/c^{2}$. The results are consistent with both theoretical predictions and previous measurements. The characteristic sizes of the interaction regions for the $η$ and $η'$ are calculated to be $(0.645 \pm 0.023 \pm 0.007 )~ {\rm fm}$ and $(0.596 \pm 0.005 \pm 0.006)~ {\rm fm}$, respectively. In addition, we search for the dark photon in $η/η^\prime\rightarrowγe^{+}e^{-}$, and the upper limits of the branching fractions as a function of the dark photon are given at 90% confidence level.

Submitted 5 April, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

Journal ref: Phys.Rev.D 109 (2024) 7, 072001

arXiv:2401.09071 [pdf, other]

Rethinking Spectral Graph Neural Networks with Spatially Adaptive Filtering

Authors: Jingwei Guo, Kaizhu Huang, Xinping Yi, Zixian Su, Rui Zhang

Abstract: Whilst spectral Graph Neural Networks (GNNs) are theoretically well-founded in the spectral domain, their practical reliance on polynomial approximation implies a profound linkage to the spatial domain. As previous studies rarely examine spectral GNNs from the spatial perspective, their spatial-domain interpretability remains elusive, e.g., what information is essentially encoded by spectral GNNs in the spatial domain? In this paper, to answer this question, we investigate the theoretical connection between spectral filtering and spatial aggregation, unveiling an intrinsic interaction that spectral filtering implicitly leads the original graph to an adapted new graph, explicitly computed for spatial aggregation. Both theoretical and empirical investigations reveal that the adapted new graph not only exhibits non-locality but also accommodates signed edge weights to reflect label consistency among nodes. These findings thus highlight the interpretable role of spectral GNNs in the spatial domain and inspire us to rethink graph spectral filters beyond the fixed-order polynomials, which neglect global information. Built upon the theoretical findings, we revisit the state-of-the-art spectral GNNs and propose a novel Spatially Adaptive Filtering (SAF) framework, which leverages the adapted new graph by spectral filtering for an auxiliary non-local aggregation. Notably, our SAF comprehensively models both node similarity and dissimilarity from a global perspective, therefore alleviating persistent deficiencies of GNNs related to long-range dependencies and graph heterophily. Extensive experiments over 13 node classification benchmarks demonstrate the superiority of our proposed framework to the state-of-the-art methods.

Submitted 22 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.09012 [pdf, other]

First study of antihyperon-nucleon scattering $\barΛp\rightarrow\barΛp$ and measurement of $Λp\rightarrowΛp$ cross section

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

Abstract: Using $(10.087\pm0.044)\times10^{9}$ $J/ψ$ events collected with the BESIII detector at the BEPCII storage ring, the processes $Λp\rightarrowΛp$ and $\barΛp\rightarrow\barΛp$ are studied, where the $Λ/\barΛ$ baryons are produced in the process $J/ψ\rightarrowΛ\barΛ$ and the protons are the hydrogen nuclei in the cooling oil of the beam pipe. Clear signals are observed for the two reactions. The cross sections in $-0.9\leq\rm{cos}θ_{Λ/\barΛ}\leq0.9$ are measured to be $σ(Λp\rightarrowΛp)=(12.2\pm1.6_{\rm{stat}}\pm1.1_{\rm{sys}})$ mb and $σ(\barΛ p\rightarrow\barΛ p)=(17.5\pm2.1_{\rm{stat}}\pm1.6_{\rm{sys}})$ mb at the $Λ/\barΛ$ momentum of $1.074$ GeV/$c$ within a range of $\pm0.017$ GeV/$c$, where the $θ_{Λ/\barΛ}$ are the scattering angles of the $Λ/\barΛ$ in the $Λp/\barΛp$ rest frames. Furthermore, the differential cross sections of the two reactions are also measured, where there is a slight tendency of forward scattering for $Λp\rightarrowΛp$, and a strong forward peak for $\barΛp\rightarrow\barΛp$. We present an approach to extract the total elastic cross sections by extrapolation. The study of $\barΛp\rightarrow\barΛp$ represents the first study of antihyperon-nucleon scattering, and these new measurements will serve as important inputs for the theoretical understanding of the (anti)hyperon-nucleon interaction.

Submitted 18 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

Comments: 9 pages, 5 figures

arXiv:2401.08478 [pdf, other]

Solving Continual Offline Reinforcement Learning with Decision Transformer

Authors: Kaixin Huang, Li Shen, Chen Zhao, Chun Yuan, Dacheng Tao

Abstract: Continuous offline reinforcement learning (CORL) combines continuous and offline reinforcement learning, enabling agents to learn multiple tasks from static datasets without forgetting prior tasks. However, CORL faces challenges in balancing stability and plasticity. Existing methods, employing Actor-Critic structures and experience replay (ER), suffer from distribution shifts, low efficiency, and weak knowledge-sharing. We aim to investigate whether Decision Transformer (DT), another offline RL paradigm, can serve as a more suitable offline continuous learner to address these issues. We first compare AC-based offline algorithms with DT in the CORL framework. DT offers advantages in learning efficiency, distribution shift mitigation, and zero-shot generalization but exacerbates the forgetting problem during supervised parameter updates. We introduce multi-head DT (MH-DT) and low-rank adaptation DT (LoRA-DT) to mitigate DT's forgetting problem. MH-DT stores task-specific knowledge using multiple heads, facilitating knowledge sharing with common components. It employs distillation and selective rehearsal to enhance current task learning when a replay buffer is available. In buffer-unavailable scenarios, LoRA-DT merges less influential weights and fine-tunes DT's decisive MLP layer to adapt to the current task. Extensive experiments on MuJoCo and Meta-World benchmarks demonstrate that our methods outperform SOTA CORL baselines and showcase enhanced learning capabilities and superior memory efficiency.

Submitted 7 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

Comments: 11 pages, 6 figures

arXiv:2401.08252 [pdf, other]

Observation of $ψ(3686) \to Ω^- K^+ \barΞ^0$ + c.c.

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (630 additional authors not shown)

Abstract: Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected with the BESIII detector at BEPCII, the decay of $ψ(3686) \to Ω^- K^+ \barΞ^0 +c.c.$ is observed for the first time. The branching fraction of this decay is measured to be $\mathcal{B}_{ψ(3686) \to Ω^- K^+ \barΞ^0 +c.c.}=(2.78 \pm 0.40 \pm 0.18 ) \times 10^{-6}$, where the first uncertainty is statistical and the second is systematic. Possible baryon excited states are searched for in this decay, but no evident intermediate state is observed with the current sample size.

Submitted 15 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

arXiv:2401.07553 [pdf, other]

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models

Authors: Xingzhou Lou, Junge Zhang, Ziyan Wang, Kaiqi Huang, Yali Du

Abstract: Safe reinforcement learning (RL) agents accomplish given tasks while adhering to specific constraints. Employing constraints expressed via easily-understandable human language offers considerable potential for real-world applications due to its accessibility and non-reliance on domain expertise. Previous safe RL methods with natural language constraints typically adopt a recurrent neural network, which leads to limited capabilities when dealing with various forms of human language input. Furthermore, these methods often require a ground-truth cost function, necessitating domain expertise for the conversion of language constraints into a well-defined cost function that determines constraint violation. To address these issues, we propose to use pre-trained language models (LM) to facilitate RL agents' comprehension of natural language constraints and allow them to infer costs for safe policy learning. Through the use of pre-trained LMs and the elimination of the need for a ground-truth cost, our method enhances safe policy learning under a diverse set of human-derived free-form natural language constraints. Experiments on grid-world navigation and robot control show that the proposed method can achieve strong performance while adhering to given constraints. The usage of pre-trained LMs allows our method to comprehend complicated constraints and learn safe policies without the need for ground-truth cost at any stage of training or evaluation. Extensive ablation studies are conducted to demonstrate the efficacy of each part of our method.

Submitted 15 May, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.07240 [pdf, other]

DCDet: Dynamic Cross-based 3D Object Detector

Authors: Shuai Liu, Boyang Li, Zhiyu Fang, Kai Huang

Abstract: Recently, significant progress has been made in the research of 3D object detection. However, most prior studies have focused on the utilization of center-based or anchor-based label assignment schemes. Alternative label assignment strategies remain unexplored in 3D object detection. We find that the center-based label assignment often fails to generate sufficient positive samples for training, while the anchor-based label assignment tends to encounter an imbalance issue when handling objects of varying scales. To solve these issues, we introduce a dynamic cross label assignment (DCLA) scheme, which dynamically assigns positive samples for each object from a cross-shaped region, thus providing sufficient and balanced positive samples for training. Furthermore, to address the challenge of accurately regressing objects with varying scales, we put forth a rotation-weighted Intersection over Union (RWIoU) metric to replace the widely used L1 metric in regression loss. Extensive experiments demonstrate the generality and effectiveness of our DCLA and RWIoU-based regression loss. The code will be available at https://github.com/Say2L/DCDet.git.

Submitted 22 May, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

arXiv:2401.07122 [pdf, other]

Decentralized Federated Learning with Asynchronous Parameter Sharing for Large-scale IoT Networks

Authors: Haihui Xie, Minghua Xia, Peiran Wu, Shuai Wang, Kaibin Huang

Abstract: Federated learning (FL) enables wireless terminals to collaboratively learn a shared parameter model while keeping all the training data on devices per se. Parameter sharing can be synchronous or asynchronous: the former transmits parameters as blocks or frames and waits until all transmissions finish, whereas the latter provides messages about the status of pending and failed parameter transmission requests. Whether synchronous or asynchronous parameter sharing is applied, the learning model must adapt to distinct network architectures, as an improper learning model will deteriorate learning performance and, even worse, lead to model divergence for the asynchronous transmission in resource-limited large-scale Internet-of-Things (IoT) networks. This paper proposes a decentralized learning model and develops an asynchronous parameter-sharing algorithm for resource-limited distributed IoT networks. This decentralized learning model approaches a convex function as the number of nodes increases, and its learning process converges to a global stationary point with a higher probability than the centralized FL model. Moreover, by jointly accounting for the convergence bound of federated learning and the transmission delay of wireless communications, we develop a node scheduling and bandwidth allocation algorithm to minimize the transmission delay. Extensive simulation results corroborate the effectiveness of the distributed algorithm in terms of fast learning model convergence and low transmission delay.

Submitted 13 January, 2024; originally announced January 2024.

Comments: 17 pages, 8 figures, to appear in IEEE Internet of Things Journal

arXiv:2401.06813 [pdf, other]

doi 10.1103/PhysRevD.109.053005

First observation of the decay $Λ^+_c\to nK^{0}_{S}π^+π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (630 additional authors not shown)

Abstract: Based on 4.5 fb$^{-1}$ of $e^{+}e^{-}$ collision data accumulated at center-of-mass energies between $4599.53$ MeV and $4698.82$ MeV with the BESIII detector, the decay $Λ_{c}^{+}\to nK_{S}^{0}π^+π^0$ is observed for the first time with a significance of $9.2σ$. The branching fraction is measured to be $(0.85\pm0.13\pm0.03)\%$, where the first uncertainty is statistical and the second systematic, which differs from the theoretical prediction based on isospin by 4.4$σ$. This indicates that there may be resonant contributions or some unknown dynamics in this decay.

Submitted 28 March, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

Journal ref: Phys.Rev.D,109,053005 (2024)

arXiv:2401.06394 [pdf, other]

Adaptive Data Augmentation for Aspect Sentiment Quad Prediction

Authors: Wenyuan Zhang, Xinghua Zhang, Shiyao Cui, Kun Huang, Xuebin Wang, Tingwen Liu

Abstract: Aspect sentiment quad prediction (ASQP) aims to predict the quad sentiment elements for a given sentence, which is a critical task in the field of aspect-based sentiment analysis. However, the data imbalance issue has not received sufficient attention in the ASQP task. In this paper, we divide the issue into two parts, quad-pattern imbalance and aspect-category imbalance, and propose an Adaptive Data Augmentation (ADA) framework to tackle the imbalance issue. Specifically, a data augmentation process with a condition function adaptively enhances the tail quad patterns and aspect categories, alleviating the data imbalance in ASQP. Following previous studies, we also further explore the generative framework for extracting complete quads by introducing the category prior knowledge and syntax-guided decoding target. Experimental results demonstrate that data augmentation for imbalance in the ASQP task can improve the performance, and the proposed ADA method is superior to naive data oversampling.

Submitted 12 January, 2024; originally announced January 2024.

Comments: Accepted by ICASSP 2024, 5 pages

arXiv:2401.03634 [pdf]

Propagation properties of partially coherent electromagnetic hyperbolic-sine-Gaussian vortex beam through anisotropic atmospheric turbulence

Authors: Jin Cao, Rufeng Tang, Kai Huang, Yuqiang Li, Yonggen Xu

Abstract: Utilizing the extended Huygens-Fresnel principle and the Rytov approximation, the analytical formula for the propagation of a partially coherent electromagnetic hyperbolic-sine-Gaussian vortex beam (PCEShVB) in anisotropic atmospheric turbulence has been theoretically derived. Detailed studies have been conducted on the evolution characteristics of average intensity, the degree of coherence (DOC) and the degree of polarization (DOP) of the beam in turbulence. The results show that during propagation, the intensity distribution of the beam will exhibit a spiral structure, and the overall distribution of light spots will rotate in a direction related to the sign of the topological charge. The DOC distribution of PCEShVB will display a pattern reminiscent of beam interference fringes with an increase in propagation distance, with the number of 'interference fringes' greatly impacted by the hyperbolic sine parameter. Furthermore, PCEShVB with large initial coherent length and hyperbolic sine parameter will increase the degree of separation of the spots and yield a large DOP. Finally, for the validation of the theoretical findings, the random phase screen method was employed to simulate the propagation of PCEShVB through anisotropic atmospheric turbulence. The studies revealed a consistent alignment between the simulation results and the theoretical predictions.

Submitted 7 January, 2024; originally announced January 2024.

arXiv:2401.03412 [pdf, other]

N$^{3}$-Mapping: Normal Guided Neural Non-Projective Signed Distance Fields for Large-scale 3D Mapping

Authors: Shuangfu Song, Junqiao Zhao, Kai Huang, Jiaye Lin, Chen Ye, Tiantian Feng

Abstract: Accurate and dense mapping in large-scale environments is essential for various robot applications. Recently, implicit neural signed distance fields (SDFs) have shown promising advances in this task. However, most existing approaches employ projective distances from range data as SDF supervision, introducing approximation errors and thus degrading the mapping quality. To address this problem, we introduce N$^{3}$-Mapping, an implicit neural mapping system featuring normal-guided neural non-projective signed distance fields. Specifically, we directly sample points along the surface normal, instead of the ray, to obtain more accurate non-projective distance values from range data. Then these distance values are used as supervision to train the implicit map. For large-scale mapping, we apply a voxel-oriented sliding window mechanism to alleviate the forgetting issue with a bounded memory footprint. Besides, considering the uneven distribution of measured point clouds, a hierarchical sampling strategy is designed to improve training efficiency. Experiments demonstrate that our method effectively mitigates SDF approximation errors and achieves state-of-the-art mapping quality compared to existing approaches.

Submitted 29 April, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

Comments: 8 pages, 10 figures. Accepted by RAL2024

arXiv:2401.02578 [pdf, other]

The ALCHEMI atlas: principal component analysis reveals starburst evolution in NGC 253

Authors: Nanase Harada, David S. Meier, Sergio Martín, Sebastien Muller, Kazushi Sakamoto, Toshiki Saito, Mark D. Gorski, Christian Henkel, Kunihiko Tanaka, Jeffrey G. Mangum, Susanne Aalto, Rebeca Aladro, Mathilde Bouvier, Laura Colzi, Kimberly L. Emig, Rubén Herrero-Illana, Ko-Yun Huang, Kotaro Kohno, Sabine König, Kouichiro Nakanishi, Yuri Nishimura, Shuro Takano, Víctor M. Rivilla, Serena Viti, Yoshimasa Watanabe , et al. (2 additional authors not shown)

Abstract: Molecular lines are powerful diagnostics of the physical and chemical properties of the interstellar medium (ISM). These ISM properties, which affect future star formation, are expected to differ in starburst galaxies from those of more quiescent galaxies. We investigate the ISM properties in the central molecular zone of the nearby starburst galaxy NGC 253 using the ultra-wide millimeter spectral scan survey from the ALMA Large Program ALCHEMI. We present an atlas of velocity-integrated images at a 1".6 resolution of 148 unblended transitions from 44 species, including the first extragalactic detection of HCNH$^+$ and the first interferometric images of C$_3$H$^+$, NO, HCS$^+$. We conduct a principal component analysis (PCA) on these images to extract correlated chemical species and to identify key groups of diagnostic transitions. To the best of our knowledge, our dataset is currently the largest astronomical set of molecular lines to which PCA has been applied. The PCA can categorize transitions coming from different physical components in NGC 253 such as i) young starburst tracers characterized by high-excitation transitions of HC$_3$N and complex organic molecules (COMs) versus tracers of on-going star formation (radio recombination lines) and high-excitation transitions of CCH and CN tracing PDRs, ii) tracers of cloud-collision-induced shocks (low-excitation transitions of CH$_3$OH, HNCO, HOCO$^+$, and OCS) versus shocks from star-formation-induced outflows (high-excitation transitions of SiO), as well as iii) outflows showing emission from HOC$^+$, CCH, H$_3$O$^+$, CO isotopologues, HCN, HCO$^+$, CS, and CN. Our findings show these intensities vary with galactic dynamics, star formation activities, and stellar feedback.

Submitted 4 January, 2024; originally announced January 2024.

Comments: 65 pages, 39 figures. Accepted for publication in ApJS

arXiv:2401.02510 [pdf, ps, other]

Inequalities of Brascamp-Lieb type on the Heisenberg group

Authors: Kaiyi Huang, Betsy Stovall

Abstract: We prove both necessary and sufficient conditions for $L^p$-boundedness of certain multilinear generalized Radon transforms that arise as Heisenberg group analogues of the Brascamp--Lieb inequalities on Euclidean space. The necessary and sufficient conditions coincide in some important special cases, but differ in others.

Submitted 5 April, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

Comments: We corrected the formulae of conditions (C1) and (C2), modified the statement of Proposition 2.2, and rewrote the proof of Lemma 4.4 for clarity; 13 pages

MSC Class: 42B20

arXiv:2401.01964 [pdf]

doi 10.1039/D4SC00322E

Bulk synthesis of Zn$_3$WN$_4$ via solid-state metathesis

Authors: Christopher L. Rom, Shaun O'Donnell, Kayla Huang, Ryan A. Klein, Morgan J. Kramer, Rebecca W. Smaha, Andriy Zakutayev

Abstract: Ternary nitrides are of growing technological importance, with applications as semiconductors, catalysts, and magnetic materials; however, new synthetic tools are needed to advance materials discovery efforts. Here, we show that Zn$_3$WN$_4$ can be synthesized via metathesis reactions between Li$_6$WN$_4$ and Zn$X_2$ ($X$ = Br, Cl, F). In situ synchrotron powder X-ray diffraction and differential scanning calorimetry show that the reaction onset is correlated with the Zn$X_2$ melting point and that product purity is inversely correlated with the reaction's exothermicity. High resolution synchrotron powder X-ray diffraction measurements show that this bulk synthesis produces a structure with substantial cation ordering, as opposed to the disordered structure initially discovered via thin film sputtering. Diffuse reflectance spectroscopy reveals that Zn$_3$WN$_4$ powders exhibit two optical absorption onsets at 2.5 eV and 4.0 eV, indicating wide-bandgap semiconducting behavior and suggesting a small amount of structural disorder. We hypothesize that this synthesis strategy is generalizable because many potential Li-$M$-N precursors (where $M$ is a metal) are available for synthesizing new ternary nitride materials. This work introduces a promising synthesis strategy that will accelerate the discovery of novel functional ternary nitrides and other currently inaccessible materials.

Submitted 5 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

Journal ref: Chem. Sci. 2024

arXiv:2401.01530 [pdf, other]

Disorder-induced topological pumping on a superconducting quantum processor

Authors: Yu Liu, Yu-Ran Zhang, Yun-Hao Shi, Tao Liu, Congwei Lu, Yong-Yi Wang, Hao Li, Tian-Ming Li, Cheng-Lin Deng, Si-Yun Zhou, Tong Liu, Jia-Chi Zhang, Gui-Han Liang, Zheng-Yang Mei, Wei-Guo Ma, Hao-Tian Liu, Zheng-He Liu, Chi-Tong Chen, Kaixuan Huang, Xiaohui Song, SP Zhao, Ye Tian, Zhongcheng Xiang, Dongning Zheng, Franco Nori , et al. (2 additional authors not shown)

Abstract: Thouless pumping, a dynamical version of the integer quantum Hall effect, represents the quantized charge pumped during an adiabatic cyclic evolution. Here we report experimental observations of nontrivial topological pumping that is induced by disorder even during a topologically trivial pumping trajectory. With a 41-qubit superconducting quantum processor, we develop a Floquet engineering technique to realize cycles of adiabatic pumping by simultaneously varying the on-site potentials and the hopping couplings. We demonstrate Thouless pumping in the presence of disorder and show its breakdown as the strength of disorder increases. Moreover, we observe two types of topological pumping that are induced by on-site potential disorder and hopping disorder, respectively. In particular, an intrinsic topological pump that is induced by quasi-periodic hopping disorder has never been experimentally realized before. Our highly controllable system provides a valuable quantum simulating platform for studying various aspects of topological physics in the presence of disorder.

Submitted 2 January, 2024; originally announced January 2024.

arXiv:2401.00918 [pdf, ps, other]

Partial Wave Analysis of $J/ψ\rightarrow γγφ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (603 additional authors not shown)

Abstract: Using a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events collected with the BESIII detector at the BEPCII collider, a partial wave analysis on the decay $γγφ$ is performed to investigate the intermediate resonances in $J/ψ\rightarrowγX, X\rightarrowγφ$. The resonances $f_{1}(1285)$, $η(1405)$, $f_{1}(1420)$, $f_{1}(1510)$, $f_{2}(1525)$, $X(1835)$, $f_{2}(1950)$, $f_{2}(2010)$, $f_{0}(2200)$ and $η_{c}$ are observed with statistical significance greater than 5$σ$. The product branching fractions $\mathcal{B}(J/ψ\rightarrowγX, X\rightarrow γφ)$ are reported. The resonance parameters of $η(1405)$ and $X(1835)$ are also measured.

Submitted 1 January, 2024; originally announced January 2024.

arXiv:2401.00878 [pdf, ps, other]

Observation of $\mathcal R(3810)$ in $e^+e^-\rightarrow {\rm hadrons}$ and Improved Measurements of the Resonance Parameters of $\mathcal R(3760)$ and $\mathcal R(3780)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (596 additional authors not shown)

Abstract: We report the measurement of the cross sections for $e^+e^-\rightarrow {\rm hadrons}$ at center-of-mass (c.m.) energies from 3.645 to 3.871 GeV. We observe a new resonance $\mathcal R(3810)$ in the cross sections for the first time, and observe the $\mathcal R(3760)$ resonance with high significance in the cross sections. The $\mathcal R(3810)$ has a mass of $(3804.5 \pm 0.9 \pm 0.9)$ MeV/$c^2$, a total width of $(5.4 \pm 3.5 \pm 3.2)$ MeV, and an electronic partial width of $(19.4 \pm 7.4 \pm 12.1)$ eV. Its significance is $7.7σ$. The $\mathcal R(3810)$ could be interpreted as a hadro-charmonium resonance predicted by Quantum Chromodynamics (QCD). In addition, we measure the mass $(3751.9\pm 3.8\pm 2.8)$ MeV/$c^2$, the total width $(32.8 \pm 5.8 \pm 8.7)$ MeV, and the electronic partial width $(184\pm 75\pm 86)$ eV with improved precision for the $\mathcal R(3760)$. Furthermore, for the $\mathcal R(3780)$ we measure the mass $(3778.7\pm 0.5\pm 0.3)$ MeV/$c^2$ and total width $(20.3 \pm 0.8 \pm 1.7)$ MeV with improved precision, and the electronic partial width $(265\pm 69\pm 83)$ eV. The $\mathcal R(3780)$ can be interpreted as the $1^3D_1$ state of charmonium. Its mass and total width differ significantly from the corresponding fitted values given by the Particle Data Group in 2022 by 7.1 and 3.2 times the uncertainties for $ψ(3770)$, respectively. $ψ(3770)$ has been interpreted as the $1^3D_1$ state for 45 years.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2401.00273 [pdf, ps, other]

Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision

Authors: Chih-Kai Yang, Kuan-Po Huang, Ke-Han Lu, Chun-Yi Kuan, Chi-Yuan Hsiao, Hung-yi Lee

Abstract: This work evaluated several cutting-edge large-scale foundation models based on self-supervision or weak supervision, including SeamlessM4T, SeamlessM4T v2, and Whisper-large-v3, on three code-switched corpora. We found that self-supervised models can achieve performances close to the supervised model, indicating the effectiveness of multilingual self-supervised pre-training. We also observed that these models still have room for improvement as they kept making similar mistakes and had unsatisfactory performances on modeling intra-sentential code-switching. In addition, the validity of several variants of Whisper was explored, and we concluded that they remained effective in a code-switching scenario, and similar techniques for self-supervised models are worth studying to boost the performance of code-switched tasks.

Submitted 30 December, 2023; originally announced January 2024.

Comments: Submitted to ICASSP 2024 Self-supervision in Audio, Speech and Beyond workshop

arXiv:2312.17063 [pdf, other]

doi 10.1016/j.physletb.2024.138614

Search for a massless particle beyond the Standard Model in the $Σ^+\rightarrow p+{\rm invisible}$ decay

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: A massless particle beyond the Standard Model is searched for in the two-body decay $Σ^+\rightarrow p+{\rm invisible}$ using $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected at a center-of-mass energy of $\sqrt{s}=3.097$ GeV with the BESIII detector at the BEPCII collider. No significant signal is observed, and the upper limit on the branching fraction $B(Σ^+\rightarrow p+{\rm invisible})$ is determined to be $3.2\times10^{-5}$ at the 90% confidence level. This is the first search for a flavor-changing neutral current process with missing energy in hyperon decays which plays an important role in constraining new physics models.

Submitted 5 April, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

Comments: 11 pages, 5 figures

Journal ref: Phys. Lett. B 852 (2024) 138614

arXiv:2312.16405 [pdf, ps, other]

Observation of $χ_{cJ}\to 3(K^+K^-)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (632 additional authors not shown)

Abstract: By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decay processes $χ_{cJ} \to 3(K^+K^-)$ ($J=0,1,2$) are observed for the first time with statistical significances of 8.2$σ$, 8.1$σ$, and 12.4$σ$, respectively. The product branching fractions of $ψ(3686)\toγχ_{cJ}$, $χ_{cJ}\to 3(K^+K^-)$ are presented and the branching fractions of $χ_{cJ}\to 3(K^+K^-)$ decays are determined to be $\mathcal{B}_{χ_{c0}\to 3(K^+K^-)}$=$(10.7\pm1.8\pm1.1)$$\times10^{-6}$, $\mathcal{B}_{χ_{c1}\to 3(K^+K^-)}$=$(4.2\pm0.9\pm0.5)$$\times10^{-6}$, and $\mathcal{B}_{χ_{c2}\to 3(K^+K^-)}$=$(7.2\pm1.1\pm0.8)$$\times10^{-6}$, where the first uncertainties are statistical and the second are systematic.

Submitted 26 December, 2023; originally announced December 2023.

Comments: 8 pages, 2 figures

arXiv:2312.15667 [pdf, other]

TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient

Authors: Xingzhou Lou, Junge Zhang, Timothy J. Norman, Kaiqi Huang, Yali Du

Abstract: Multi-Agent Policy Gradient (MAPG) has made significant progress in recent years. However, centralized critics in state-of-the-art MAPG methods still face the centralized-decentralized mismatch (CDM) issue, which means sub-optimal actions by some agents will affect other agents' policy learning. While using individual critics for policy updates can avoid this issue, they severely limit cooperation among agents. To address this issue, we propose an agent topology framework, which decides whether other agents should be considered in policy gradient and achieves a compromise between facilitating cooperation and alleviating the CDM issue. The agent topology allows agents to use coalition utility as the learning objective instead of global utility by centralized critics or local utility by individual critics. To constitute the agent topology, various models are studied. We propose Topology-based multi-Agent Policy gradiEnt (TAPE) for both stochastic and deterministic MAPG methods. We prove the policy improvement theorem for stochastic TAPE and give a theoretical explanation for the improved cooperation among agents. Experimental results on several benchmarks show that the agent topology is able to both facilitate agent cooperation and alleviate the CDM issue, improving the performance of TAPE. Finally, multiple ablation studies and a heuristic graph search algorithm are devised to show the efficacy of the agent topology.

Submitted 15 January, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

arXiv:2312.14229 [pdf, other]

doi 10.1145/3495243.3560551

Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI

Authors: Kai Huang, Wei Gao

Abstract: With the wide adoption of AI applications, there is a pressing need of enabling real-time neural network (NN) inference on small embedded devices, but deploying NNs and achieving high performance of NN inference on these small devices is challenging due to their extremely weak capabilities. Although NN partitioning and offloading can contribute to such deployment, they are incapable of minimizing the local costs at embedded devices. Instead, we suggest to address this challenge via agile NN offloading, which migrates the required computations in NN offloading from online inference to offline learning. In this paper, we present AgileNN, a new NN offloading technique that achieves real-time NN inference on weak embedded devices by leveraging eXplainable AI techniques, so as to explicitly enforce feature sparsity during the training phase and minimize the online computation and communication costs. Experiment results show that AgileNN's inference latency is >6x lower than the existing schemes, ensuring that sensory data on embedded devices can be timely consumed. It also reduces the local device's resource consumption by >8x, without impairing the inference accuracy.

Submitted 21 December, 2023; originally announced December 2023.

Comments: published at ACM MobiCom 2022. 14 pages

arXiv:2312.14227 [pdf, other]

doi 10.1145/3581791.3596852

ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection

Authors: Kai Huang, Boyuan Yang, Wei Gao

Abstract: On-device training is essential for neural networks (NNs) to continuously adapt to new online data, but can be time-consuming due to the device's limited computing power. To speed up on-device training, existing schemes select trainable NN portion offline or conduct unrecoverable selection at runtime, but the evolution of trainable NN portion is constrained and cannot adapt to the current need for training. Instead, runtime adaptation of on-device training should be fully elastic, i.e., every NN substructure can be freely removed from or added to the trainable NN portion at any time in training. In this paper, we present ElasticTrainer, a new technique that enforces such elasticity to achieve the required training speedup with the minimum NN accuracy loss. Experiment results show that ElasticTrainer achieves up to 3.5x more training speedup in wall-clock time and reduces energy consumption by 2x-3x more compared to the existing schemes, without noticeable accuracy loss.

Submitted 21 December, 2023; originally announced December 2023.

Comments: published at ACM MobiSys 2023. 14 pages

Showing 151–200 of 1,656 results for author: Huang, K