Efficient Meta Neural Heuristic for Multi-Objective Combinatorial Optimization

Authors: Jinbiao Chen, Jiahai Wang, Zizhen Zhang, Zhiguang Cao, Te Ye, Siyuan Chen

Abstract: Recently, neural heuristics based on deep reinforcement learning have exhibited promise in solving multi-objective combinatorial optimization problems (MOCOPs). However, they are still struggling to achieve high learning efficiency and solution quality. To tackle this issue, we propose an efficient meta neural heuristic (EMNH), in which a meta-model is first trained and then fine-tuned with a few steps to solve corresponding single-objective subproblems. Specifically, for the training process, a (partial) architecture-shared multi-task model is leveraged to achieve parallel learning for the meta-model, so as to speed up the training; meanwhile, a scaled symmetric sampling method with respect to the weight vectors is designed to stabilize the training. For the fine-tuning process, an efficient hierarchical method is proposed to systematically tackle all the subproblems. Experimental results on the multi-objective traveling salesman problem (MOTSP), multi-objective capacitated vehicle routing problem (MOCVRP), and multi-objective knapsack problem (MOKP) show that, EMNH is able to outperform the state-of-the-art neural heuristics in terms of solution quality and learning efficiency, and yield competitive solutions to the strong traditional heuristics while consuming much shorter time.

Submitted 22 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS 2023
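
Illustrative note (not part of the arXiv record): the abstract above speaks of weight vectors and single-objective subproblems. A minimal sketch of that decomposition idea follows, assuming a bi-objective minimization problem and weighted-sum scalarization; the scalarization choice, the brute-force search over a fixed candidate list, and every function name are assumptions for illustration, not the EMNH method itself.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def weight_vectors(n: int) -> List[Point]:
    """Return n evenly spaced bi-objective weight vectors (w1, w2) with w1 + w2 = 1."""
    if n < 2:
        raise ValueError("need at least two weight vectors")
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def weighted_sum(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Scalarize an objective vector into a single cost (assumed weighted-sum form)."""
    return sum(w * f for w, f in zip(weights, objectives))


def solve_subproblems(candidates: List[Point], n_weights: int) -> List[Point]:
    """For each weight vector, pick the candidate with the lowest scalarized cost.

    A neural heuristic would construct a solution per subproblem; here a
    hypothetical fixed candidate list stands in for that search.
    """
    return [
        min(candidates, key=lambda c: weighted_sum(c, w))
        for w in weight_vectors(n_weights)
    ]


if __name__ == "__main__":
    # Hypothetical objective vectors of four candidate solutions.
    candidates = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0), (4.0, 4.0)]
    print(solve_subproblems(candidates, 3))
```

Each weight vector defines one single-objective subproblem, and the collected subproblem solutions approximate the Pareto set.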

arXiv:2310.15195 [pdf, other]

Neural Multi-Objective Combinatorial Optimization with Diversity Enhancement

Authors: Jinbiao Chen, Zizhen Zhang, Zhiguang Cao, Yaoxin Wu, Yining Ma, Te Ye, Jiahai Wang

Abstract: Most of existing neural methods for multi-objective combinatorial optimization (MOCO) problems solely rely on decomposition, which often leads to repetitive solutions for the respective subproblems, thus a limited Pareto set. Beyond decomposition, we propose a novel neural heuristic with diversity enhancement (NHDE) to produce more Pareto solutions from two perspectives. On the one hand, to hinder duplicated solutions for different subproblems, we propose an indicator-enhanced deep reinforcement learning method to guide the model, and design a heterogeneous graph attention mechanism to capture the relations between the instance graph and the Pareto front graph. On the other hand, to excavate more solutions in the neighborhood of each subproblem, we present a multiple Pareto optima strategy to sample and preserve desirable solutions. Experimental results on classic MOCO problems show that our NHDE is able to generate a Pareto front with higher diversity, thereby achieving superior overall performance. Moreover, our NHDE is generic and can be applied to different neural methods for MOCO.

Submitted 22 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS 2023
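
Illustrative note (not part of the arXiv record): the abstract above is concerned with duplicated subproblem solutions and the resulting Pareto set. A minimal sketch of how a non-dominated set can be extracted from a pool of objective vectors follows, assuming minimization of every objective; it is a generic Pareto filter with hypothetical names, not the NHDE procedure.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a dominates b under minimization:
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Drop duplicates, then keep only points that no other point dominates."""
    unique = list(dict.fromkeys(points))  # de-duplicate while preserving order
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


if __name__ == "__main__":
    # Hypothetical pool: (2, 2) appears twice, as when two subproblems
    # return the same solution, and (4, 4) is dominated by (2, 2).
    pool = [(1, 5), (2, 2), (2, 2), (5, 1), (4, 4)]
    print(pareto_front(pool))
```

The duplicate entry illustrates why repeated subproblem solutions add nothing to the final Pareto set.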

arXiv:2310.14057 [pdf, other]

Ab Initio Studies on Interactions in K$_3$C$_{60}$ under High Pressure

Authors: Jianyu Li, Zhangkai Cao, Jiahao Su, Ruipeng Wang, Haipeng Li, Yusuke Nomura, Xiaosen Yang, Ho-Kin Tang

Abstract: Fullerene solids doped with alkali metals (A$_3$C$_{60}$, A = K, Rb, Cs) exhibit a superconducting transition temperature ($T_c$) as high as 40 K, and their unconventional superconducting properties have been a subject of debate. With application of high pressure on K$_3$C$_{60}$ and Rb$_3$C$_{60}$, the experiments demonstrate the decrease of $T_c$. In this paper, we focus on K$_3$C$_{60}$ and derive the structure of K$_3$C$_{60}$ under different pressures based on first-principles calculations, exploring the trends of Coulomb interactions at various pressures. By utilizing the Maximally Localized Wannier function approach, Constrained Density Functional Perturbation Theory (cDFPT), and Constrained Random Phase Approximation (cRPA), we construct a microscopic low-energy model near the Fermi level. Our results strongly indicate that, in the K$_3$C$_{60}$ system, as pressure increases, the effect of phonons is the key to intraorbital electron pairing. There is a dominance of the phonon-driven superconducting mechanism at high pressure.

Submitted 21 October, 2023; originally announced October 2023.

arXiv:2310.12533 [pdf, other]

doi 10.1088/1367-2630/acffec

Quantum Private Function Evaluation

Authors: Zhu Cao

Abstract: Private function evaluation is a task that aims to obtain the output of a function while keeping the function secret. So far its quantum analogue has not yet been articulated. In this study, we initiate the study of quantum private function evaluation, the quantum analogue of classical private function evaluation. We give a formal definition of quantum private function evaluation and present two schemes together with their security proofs. We then give an experimental demonstration of the scheme. Finally we apply quantum private function evaluation to quantum copy protection to illustrate its usage.

Submitted 19 October, 2023; originally announced October 2023.

Comments: 14 pages, 6 figures

Journal ref: New Journal of Physics 25, 103027 (2023)

arXiv:2310.11805 [pdf, other]

doi 10.1109/ROBIO58561.2023.10354990

GMC-Pos: Graph-Based Multi-Robot Coverage Positioning Method

Authors: Khattiya Pongsirijinda, Zhiqiang Cao, Muhammad Shalihan, Benny Kai Kiat Ng, Billy Pik Lik Lau, Chau Yuen, U-Xuan Tan

Abstract: Nowadays, several real-world tasks require adequate environment coverage for maintaining communication between multiple robots, for example, target search tasks, environmental monitoring, and post-disaster rescues. In this study, we look into a situation where there are a human operator and multiple robots, and we assume that each human or robot covers a certain range of areas. We want them to maximize their area of coverage collectively. Therefore, in this paper, we propose the Graph-Based Multi-Robot Coverage Positioning Method (GMC-Pos) to find strategic positions for robots that maximize the area coverage. Our novel approach consists of two main modules: graph generation and node selection. Firstly, graph generation represents the environment using a weighted connected graph. Then, we present a novel generalized graph-based distance and utilize it together with the graph degrees to be the conditions for node selection in a recursive manner. Our method is deployed in three environments with different settings. The results show that it outperforms the benchmark method by 15.13% to 24.88% regarding the area coverage percentage.

Submitted 18 October, 2023; originally announced October 2023.

Comments: This paper has been accepted by the 2023 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2023)

arXiv:2310.10289 [pdf, other]

Moving Object Localization based on the Fusion of Ultra-WideBand and LiDAR with a Mobile Robot

Authors: Muhammad Shalihan, Zhiqiang Cao, Khattiya Pongsirijinda, Lin Guo, Billy Pik Lik Lau, Ran Liu, Chau Yuen, U-Xuan Tan

Abstract: Localization of objects is vital for robot-object interaction. Light Detection and Ranging (LiDAR) application in robotics is an emerging and widely used object localization technique due to its accurate distance measurement, long-range, wide field of view, and robustness in different conditions. However, LiDAR is unable to identify the objects when they are obstructed by obstacles, resulting in inaccuracy and noise in localization. To address this issue, we present an approach incorporating LiDAR and Ultra-Wideband (UWB) ranging for object localization. The UWB is popular in sensor fusion localization algorithms due to its low weight and low power consumption. In addition, the UWB is able to return ranging measurements even when the object is not within line-of-sight. Our approach provides an efficient solution to combine an anonymous optical sensor (LiDAR) with an identity-based radio sensor (UWB) to improve the localization accuracy of the object. Our approach consists of three modules. The first module is an object-identification algorithm that compares successive scans from the LiDAR to detect a moving object in the environment and returns the position with the closest range to UWB ranging. The second module estimates the moving object's moving direction using the previous and current estimated position from our object-identification module. It removes the suspicious estimations through an outlier rejection criterion. Lastly, we fuse the LiDAR, UWB ranging, and odometry measurements in pose graph optimization (PGO) to recover the entire trajectory of the robot and object. Extensive experiments were performed to evaluate the performance of the proposed approach.

Submitted 16 October, 2023; originally announced October 2023.

Comments: This paper has been accepted by The 2023 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2023)

arXiv:2310.08845 [pdf, other]

doi 10.1126/sciadv.adj2778

Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A

Authors: Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hints more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.

Submitted 22 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: 49 pages, 11 figures

Journal ref: Science Advances, 9, eadj2778 (2023) 15 November 2023

arXiv:2310.08252 [pdf, other]

MetaBox: A Benchmark Platform for Meta-Black-Box Optimization with Reinforcement Learning

Authors: Zeyuan Ma, Hongshu Guo, Jiacheng Chen, Zhenrui Li, Guojun Peng, Yue-Jiao Gong, Yining Ma, Zhiguang Cao

Abstract: Recently, Meta-Black-Box Optimization with Reinforcement Learning (MetaBBO-RL) has showcased the power of leveraging RL at the meta-level to mitigate manual fine-tuning of low-level black-box optimizers. However, this field is hindered by the lack of a unified benchmark. To fill this gap, we introduce MetaBox, the first benchmark platform expressly tailored for developing and evaluating MetaBBO-RL methods. MetaBox offers a flexible algorithmic template that allows users to effortlessly implement their unique designs within the platform. Moreover, it provides a broad spectrum of over 300 problem instances, collected from synthetic to realistic scenarios, and an extensive library of 19 baseline methods, including both traditional black-box optimizers and recent MetaBBO-RL methods. Besides, MetaBox introduces three standardized performance metrics, enabling a more thorough assessment of the methods. In a bid to illustrate the utility of MetaBox for facilitating rigorous evaluation and in-depth analysis, we carry out a wide-ranging benchmarking study on existing MetaBBO-RL methods. Our MetaBox is open-source and accessible at: https://github.com/GMC-DRL/MetaBox.

Submitted 27 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2310.08160 [pdf, other]

doi 10.1103/PhysRevD.109.104041

Eccentricity enables the earliest warning and localization of gravitational waves with ground-based detectors

Authors: Tao Yang, Rong-Gen Cai, Zhoujian Cao, Hyung Mok Lee

Abstract: The early and precise localization of gravitational waves (GWs) is pivotal in detecting their electromagnetic (EM) counterparts, especially for binary neutron stars (BNS) and neutron star-black hole binaries (NSBH). In this letter, we pioneer the exploration of utilizing the higher harmonic modes induced by the eccentricity of compact binaries to localize GWs with ground-based detectors even before the quadrupole baseline $\ell=2$ mode enters the detector band. Our theoretical analysis marks a first in proposing a strategy for gaining the earliest possible warning and maximizing preparation time for observing pre- and/or post-merger EM counterparts. We simulate three typical binaries from GWTC-3 with eccentricities ranging from 0.05 to 0.4. Our results reveal that the third-generation (3G) detectors (low frequency cut-off $f_0=5$ Hz) can accumulate sufficient signal-to-noise ratios through higher modes before the onset of the baseline $\ell=2$ mode entry into the band. Notably, relying solely on the higher modes, the 3G detector network ET+2CE achieves an average localization on the order of $1-10^2~\rm deg^2$ around 1-1.8 hours before the merger of a GW170817-like BNS, and $10-10^3~\rm deg^2$ approximately 18-30 minutes prior to the merger of a GW200115-like NSBH. A $100~\rm deg^2$ localization is attainable even 2-4 hours prior to a BNS merger. Moreover, in the near face-on orientations which are generally more favorable for EM counterpart detection, the localization can be further improved.

Submitted 13 May, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: 8 pages, 2 figures, published in PRD

Journal ref: Phys.Rev.D 109 (2024) 104041

arXiv:2310.05917 [pdf, other]

doi 10.1145/3610548.3618136

Drivable Avatar Clothing: Faithful Full-Body Telepresence with Dynamic Clothing Driven by Sparse RGB-D Input

Authors: Donglai Xiang, Fabian Prada, Zhe Cao, Kaiwen Guo, Chenglei Wu, Jessica Hodgins, Timur Bagautdinov

Abstract: Clothing is an important part of human appearance but challenging to model in photorealistic avatars. In this work we present avatars with dynamically moving loose clothing that can be faithfully driven by sparse RGB-D inputs as well as body and face motion. We propose a Neural Iterative Closest Point (N-ICP) algorithm that can efficiently track the coarse garment shape given sparse depth input. Given the coarse tracking results, the input RGB-D images are then remapped to texel-aligned features, which are fed into the drivable avatar models to faithfully reconstruct appearance details. We evaluate our method against recent image-driven synthesis baselines, and conduct a comprehensive analysis of the N-ICP algorithm. We demonstrate that our method can generalize to a novel testing environment, while preserving the ability to produce high-fidelity and faithful clothing dynamics and appearance.

Submitted 11 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: SIGGRAPH Asia 2023 Conference Paper. Project website: https://xiangdonglai.github.io/www-sa23-drivable-clothing/

arXiv:2310.04552 [pdf, ps, other]

Effective-One-Body Numerical-Relativity waveform model for Eccentric spin-precessing binary black hole coalescence

Authors: Xiaolin Liu, Zhoujian Cao, Zong-Hong Zhu

Abstract: Waveform models are important to gravitational wave data analysis. People recently pay much attention to the waveform model construction for eccentric binary black hole coalescence. Several Effective-One-Body Numerical-Relativity waveform models of eccentric binary black hole coalescence have been constructed. But none of them can treat orbit eccentricity and spin-precessing simultaneously. The current paper focuses on this problem. The authors previously have constructed waveform model for spin-aligned eccentric binary black hole coalescence $\texttt{SEOBNRE}$. Here we extend such waveform model to describe eccentric spin-precessing binary black hole coalescence. We calculate the 2PN orbital radiation-reaction forces and the instantaneous part of the decomposed waveform for a general spinning precessing binary black hole system in effective-one-body (EOB) coordinates. We implement these results based on our previous $\texttt{SEOBNRE}$ waveform model. We have also compared our model waveforms to both SXS and RIT numerical relativity waveforms. We find good consistency between our model and numerical relativity. Based on our new waveform model, we analyze the impact of the non-perpendicular spin contributions on waveform accuracy. We find that the non-perpendicular spin contributions primarily affect the phase of the gravitational waveforms. For the current gravitational wave detectors, this contribution is not significant. The future detectors may be affected by such non perpendicular spin contributions. More importantly our $\texttt{SEOBNRE}$ waveform model, as the first theoretical waveform model to describe eccentric spin-precessing binary black hole coalescence, can help people to analyze orbit eccentricity and spin precession simultaneously for gravitational wave detection data.

Submitted 6 October, 2023; originally announced October 2023.

Comments: 19 pages, 12 figures

arXiv:2310.02944 [pdf, other]

Adaptive Landmark Color for AUV Docking in Visually Dynamic Environments

Authors: Corey Knutson, Zhipeng Cao, Junaed Sattar

Abstract: Autonomous Underwater Vehicles (AUVs) conduct missions underwater without the need for human intervention. A docking station (DS) can extend mission times of an AUV by providing a location for the AUV to recharge its batteries and receive updated mission information. Various methods for locating and tracking a DS exist, but most rely on expensive acoustic sensors, or are vision-based, which is significantly affected by water quality. In this paper, we present a vision-based method that utilizes adaptive color LED markers and dynamic color filtering to maximize landmark visibility in varying water conditions. Both AUV and DS utilize cameras to determine the water background color in order to calculate the desired marker color. No communication between AUV and DS is needed to determine marker color. Experiments conducted in a pool and lake show our method performs 10 times better than static color thresholding methods as background color varies. DS detection is possible at a range of 5 meters in clear water with minimal false positives.

Submitted 19 May, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Submitted to ICRA 2024 for review

arXiv:2310.01744 [pdf, other]

doi 10.1146/annurev-nucl-112822-025357

Ultra-High-Energy Gamma-Ray Astronomy

Authors: Zhen Cao, Songzhan Chen, Ruoyu Liu, Ruizhi Yang

Abstract: Ultra-High Energy (UHE, $>$0.1\,PeV) $γ$-ray Astronomy is rapidly evolving into an expanding branch of the $γ$-ray astronomy with the surprising discovery of 12 PeVatrons and the detection of a handful of photons above 1 PeV. Nearly all known celestial object types that have emissions in the TeV band are found also emitting UHE photons. UHE $γ$-rays have a well-defined horizon inside our galaxy due to the absorption of infrared and cosmic microwave backgrounds in the universe. With the last 30 years, traditional cosmic ray (CR) detection techniques allow the detection of UHE $γ$-rays, and opened up the last observation window. For leptonic sources, UHE radiation is in the deep Klein-Nishina regime which is largely suppressed. Therefore UHE $γ$-ray detection will help to locate and identify hadronic radiation sources, tracing the historic pursuit for the origin of CRs around the knee of the spectrum. The Crab Nebula is again the focus of attention with measured photon emissions above 1\,PeV. In the absence of hadronic processes, this may indicate the existence of an extreme accelerator of e$^+$/e$^-$. Utilization of the CR extensive air shower detection techniques broadens the field of view of the source observations, enabling the measurement of UHE radiation surrounding the sources. These observations can probe the particle propagation inside and outside the accelerators and the subsequent injection/escape into the interstellar medium.

Submitted 2 October, 2023; originally announced October 2023.

Comments: 23 pages, 6 figures, to appear in Annual Review of Nuclear and Particle Science

Journal ref: Annual Review of Nuclear and Particle Science, Volume 73 (2023), page 341-363

arXiv:2310.01047 [pdf, other]

Charge equilibration of Laser-accelerated Carbon Ions in Foam Target

Authors: Bubo Ma, Jieru Ren, Lirong Liu, Wenqing Wei, Benzheng Chen, Shizheng Zhang, Hao Xu, Zhongmin Hu, Fangfang Li, Xing Wang, Shuai Yin, Jianhua Feng, Xianming Zhou, Yifang Gao, Yuan Li, Xiaohua Shi, Jianxing Li, Xueguang Ren, Zhongfeng Xu, Zhigang Deng, Wei Qi, Shaoyi Wang, Quanping Fan, Bo Cui, Weiwu Wang , et al. (17 additional authors not shown)

Abstract: The charge equilibration of laser-accelerated carbon ion beams in 2 mg/cm3 foam target was investigated experimentally. The ions were generated through target normal sheath acceleration mechanism in laser-foil interaction scheme. This allows to get the equilibrium charge state in wide energy range near Bragg peak within a single shot. By using foam, the charge equilibration measurement in density regime between gas and solid state was firstly reached out experimentally. It was found that the theoretical predictions with tabulated cross section data for gas target greatly underestimated the charge states. The experimental data are in close agreement with both semi-empirical formula as well as rate equation predictions based on ion-solid interactions. The important role of target density effects that increase the ionization probability and decrease the electron capture probability through frequent multi-collisions in foam are demonstrated. The double electron processes are shown to have little influence on the average charge states. The findings are essential for high energy density physics research where the foams are widely used, and have impacts on a broad range of applications in medical, biological and material fields. The method also provides a new approach to investigate the interaction mechanism of swift heavy ions in matter by taking advantage of the laser-accelerated short-pulse wide-energy range ions.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2310.00259 [pdf, other]

AutoHall: Automated Hallucination Dataset Generation for Large Language Models

Authors: Zouying Cao, Yifei Yang, Hai Zhao

Abstract: While Large language models (LLMs) have garnered widespread applications across various domains due to their powerful language understanding and generation capabilities, the detection of non-factual or hallucinatory content generated by LLMs remains scarce. Currently, one significant challenge in hallucination detection is the laborious task of time-consuming and expensive manual annotation of the hallucinatory generation. To address this issue, this paper first introduces a method for automatically constructing model-specific hallucination datasets based on existing fact-checking datasets called AutoHall. Furthermore, we propose a zero-resource and black-box hallucination detection method based on self-contradiction. We conduct experiments towards prevalent open-/closed-source LLMs, achieving superior hallucination detection performance compared to extant baselines. Moreover, our experiments reveal variations in hallucination proportions and types among different models.

Submitted 30 September, 2023; originally announced October 2023.

arXiv:2310.00109 [pdf, other]

FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things

Authors: Samiul Alam, Tuo Zhang, Tiantian Feng, Hui Shen, Zhichao Cao, Dong Zhao, JeongGil Ko, Kiran Somasundaram, Shrikanth S. Narayanan, Salman Avestimehr, Mi Zhang

Abstract: There is a significant relevance of federated learning (FL) in the realm of Artificial Intelligence of Things (AIoT). However, most existing FL works do not use datasets collected from authentic IoT devices and thus do not capture unique modalities and inherent challenges of IoT data. To fill this critical gap, in this work, we introduce FedAIoT, an FL benchmark for AIoT. FedAIoT includes eight datasets collected from a wide range of IoT devices. These datasets cover unique IoT modalities and target representative applications of AIoT. FedAIoT also includes a unified end-to-end FL framework for AIoT that simplifies benchmarking the performance of the datasets. Our benchmark results shed light on the opportunities and challenges of FL for AIoT. We hope FedAIoT could serve as an invaluable resource to foster advancements in the important field of FL for AIoT. The repository of FedAIoT is maintained at https://github.com/AIoT-MLSys-Lab/FedAIoT.

Submitted 19 June, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

arXiv:2309.17218 [pdf, other]

When Epipolar Constraint Meets Non-local Operators in Multi-View Stereo

Authors: Tianqi Liu, Xinyi Ye, Weiyue Zhao, Zhiyu Pan, Min Shi, Zhiguo Cao

Abstract: Learning-based multi-view stereo (MVS) methods rely heavily on feature matching, which requires distinctive and descriptive representations. An effective solution is to apply non-local feature aggregation, e.g., Transformer. Albeit useful, these techniques introduce heavy computation overheads for MVS. Each pixel densely attends to the whole image. In contrast, we propose to constrain non-local feature augmentation within a pair of lines: each point only attends to the corresponding pair of epipolar lines. Our idea takes inspiration from the classic epipolar geometry, which shows that one point with different depth hypotheses will be projected to the epipolar line on the other view. This constraint reduces the 2D search space into the epipolar line in stereo matching. Similarly, this suggests that the matching of MVS is to distinguish a series of points lying on the same line. Inspired by this point-to-line search, we devise a line-to-point non-local augmentation strategy. We first devise an optimized searching algorithm to split the 2D feature maps into epipolar line pairs. Then, an Epipolar Transformer (ET) performs non-local feature augmentation among epipolar line pairs. We incorporate the ET into a learning-based MVS baseline, named ET-MVSNet. ET-MVSNet achieves state-of-the-art reconstruction performance on both the DTU and Tanks-and-Temples benchmarks with high efficiency. Code is available at https://github.com/TQTQliu/ET-MVSNet.

Submitted 29 September, 2023; originally announced September 2023.

Comments: ICCV2023

arXiv:2309.14045 [pdf, other]

Impacts of Gravitational-Wave Background from Supermassive Black Hole Binaries on the Detection of Compact Binaries by LISA

Authors: Fan Huang, Yan-Chen Bi, Zhoujian Cao, Qing-Guo Huang

Abstract: In the frequency band of the Laser Interferometer Space Antenna (LISA), extensive research has been conducted on the impact of foreground confusion noise generated by galactic binaries within the Milky Way galaxy. Additionally, the recent evidence for a stochastic signal, announced by NANOGrav, EPTA, PPTA, CPTA and InPTA, indicates that the stochastic gravitational-wave background (SGWB) generated by supermassive black hole binaries (SMBHBs) can contribute a strong background noise within the LISA band. Given the presence of such strong noise, it is expected to have a considerable impact on LISA's scientific missions. In this work, we investigate the impacts of the SGWB generated by SMBHBs on the detection of massive black hole binaries (MBHBs), verified galactic binaries (VGBs) and extreme mass ratio inspirals (EMRIs) in the context of LISA, and find it crucial to resolve and eliminate the excess noise from the SGWB to ensure the success of LISA's missions.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 6 pages, 3 figures

arXiv:2309.14032 [pdf, other]

DeepACO: Neural-enhanced Ant Systems for Combinatorial Optimization

Authors: Haoran Ye, Jiarui Wang, Zhiguang Cao, Helan Liang, Yong Li

Abstract: Ant Colony Optimization (ACO) is a meta-heuristic algorithm that has been successfully applied to various Combinatorial Optimization Problems (COPs). Traditionally, customizing ACO for a specific problem requires the expert design of knowledge-driven heuristics. In this paper, we propose DeepACO, a generic framework that leverages deep reinforcement learning to automate heuristic designs. DeepACO serves to strengthen the heuristic measures of existing ACO algorithms and dispense with laborious manual design in future ACO applications. As a neural-enhanced meta-heuristic, DeepACO consistently outperforms its ACO counterparts on eight COPs using a single neural architecture and a single set of hyperparameters. As a Neural Combinatorial Optimization method, DeepACO performs better than or on par with problem-specific methods on canonical routing problems. Our code is publicly available at https://github.com/henry-yeh/DeepACO.

Submitted 4 November, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2309.13902 [pdf, other]

doi 10.1109/TVT.2023.3293189

NoncovANM: Gridless DOA Estimation for LPDF System

Authors: Yangying Zhao, Peng Chen, Zhenxin Cao, Xianbin Wang

Abstract: Direction of arrival (DOA) estimation is an important research topic in the area of array signal processing, and has been studied for decades. High-resolution DOA estimation requires a large array aperture, which increases the hardware cost. Besides, high-accuracy DOA estimation methods usually have high computational complexity. In this paper, the problem of decreasing the hardware cost and algorithm complexity is addressed. First, considering its low cost and its ability to flexibly control electromagnetic waves, an intelligent reconfigurable surface (IRS)-aided low-cost passive direction finding (LPDF) system is developed, where only one fully functional receiving channel is adopted. Then, the sparsity of target directions in the spatial domain is exploited by formulating an atomic norm minimization (ANM) problem to estimate the DOA. Traditionally, solving the ANM problem is complex and cannot be realized efficiently. Hence, a novel nonconvex-based ANM (NC-ANM) method is proposed by gradient threshold iteration, where a perturbation is introduced to avoid falling into saddle points. The theoretical analysis for the convergence of the NC-ANM method is also given. Moreover, the corresponding Cramér-Rao lower bound (CRLB) in the LPDF system is derived, and taken as the reference bound of the DOA estimation. Simulation results show that the proposed method outperforms the compared methods in the DOA estimation with lower computational complexity in the LPDF system.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 11 pages, 8 figures

Journal ref: IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023

arXiv:2309.12842 [pdf, other]

SRFNet: Monocular Depth Estimation with Fine-grained Structure via Spatial Reliability-oriented Fusion of Frames and Events

Authors: Tianbo Pan, Zidong Cao, Lin Wang

Abstract: Monocular depth estimation is a crucial task to measure distance relative to a camera, which is important for applications such as robot navigation and self-driving. Traditional frame-based methods suffer from performance drops due to the limited dynamic range and motion blur. Therefore, recent works leverage novel event cameras to complement or guide the frame modality via frame-event feature fusion. However, event streams exhibit spatial sparsity, leaving some areas unperceived, especially in regions with marginal light changes. Therefore, direct fusion methods, e.g., RAMNet, often ignore the contribution of the most confident regions of each modality. This leads to structural ambiguity in the modality fusion process, thus degrading the depth estimation performance. In this paper, we propose a novel Spatial Reliability-oriented Fusion Network (SRFNet) that can estimate depth with fine-grained structure at both daytime and nighttime. Our method consists of two key technical components. Firstly, we propose an attention-based interactive fusion (AIF) module that applies spatial priors of events and frames as the initial masks and learns the consensus regions to guide the inter-modal feature fusion. The fused features are then fed back to enhance the frame and event feature learning. Meanwhile, it utilizes an output head to generate a fused mask, which is iteratively updated for learning consensual spatial priors. Secondly, we propose the Reliability-oriented Depth Refinement (RDR) module to estimate dense depth with the fine-grained structure based on the fused features and masks. We evaluate the effectiveness of our method on the synthetic and real-world datasets, which shows that, even without pretraining, our method outperforms the prior methods, e.g., RAMNet, especially in night scenes. Our project homepage: https://vlislab22.github.io/SRFNet.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.11687 [pdf, other]

Large-scale Pretraining Improves Sample Efficiency of Active Learning based Molecule Virtual Screening

Authors: Zhonglin Cao, Simone Sciabola, Ye Wang

Abstract: Virtual screening of large compound libraries to identify potential hit candidates is one of the earliest steps in drug discovery. As the size of commercially available compound collections grows exponentially to the scale of billions, brute-force virtual screening using traditional tools such as docking becomes infeasible in terms of time and computational resources. Active learning and Bayesian optimization have recently been proven to be effective methods of narrowing down the search space. An essential component in those methods is a surrogate machine learning model that is trained with a small subset of the library to predict the desired properties of compounds. An accurate model can achieve high sample efficiency by finding the most promising compounds with only a fraction of the whole library being virtually screened. In this study, we examined the performance of a pretrained transformer-based language model and a graph neural network in a Bayesian optimization active learning framework. The best pretrained model identifies 58.97% of the top-50000 compounds by docking score after screening only 0.6% of an ultra-large library containing 99.5 million compounds, improving by 8% over the previous state-of-the-art baseline. Through extensive benchmarks, we show that the superior performance of pretrained models persists in both structure-based and ligand-based drug discovery. Such a model can serve as a boost to the accuracy and sample efficiency of active learning based molecule virtual screening.

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.11149 [pdf, other]

Electrostatic environment and Majorana bound states in full-shell topological insulator nanowires

Authors: Li Chen, Xiao-Hong Pan, Zhan Cao, Dong E. Liu, Xin Liu

Abstract: The combination of a superconductor (SC) and a topological insulator (TI) nanowire was proposed as a potential candidate for realizing Majorana zero modes (MZMs). In this study, we adopt the Schrödinger-Poisson formalism to incorporate the electrostatic environment inside the nanowire and systematically explore its topological properties. Our calculations reveal that the proximity to the SC induces a band bending effect, leading to a non-uniform potential across the TI nanowire. As a consequence, there is an upward shift of the Fermi level within the conduction band. This gives rise to the coexistence of surface and bulk states, localized in an accumulation layer adjacent to the TI-SC interface. When magnetic flux is applied, these occupied states have different flux-penetration areas, suppressing the superconducting gap. However, this impact can be mitigated by increasing the radius of the nanowire. Finally, we demonstrate that MZMs can be achieved across a wide range of parameters centered around one applied flux quantum, $φ_0 = h/2e$. Within this regime, MZMs can be realized even in the presence of conduction bands, which are not affected by the band bending effect. These findings provide valuable insights into the practical realization of MZMs in TI nanowire-based devices, especially in the presence of a complicated electrostatic environment.

Submitted 22 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.07920 [pdf, other]

Large-Vocabulary 3D Diffusion Model with Transformer

Authors: Ziang Cao, Fangzhou Hong, Tong Wu, Liang Pan, Ziwei Liu

Abstract: Creating diverse and high-quality 3D assets with an automatic generative model is highly desirable. Despite extensive efforts on 3D generation, most existing works focus on the generation of a single category or a few categories. In this paper, we introduce a diffusion-based feed-forward framework for synthesizing massive categories of real-world 3D objects with a single generative model. Notably, there are three major challenges for this large-vocabulary 3D generation: a) the need for expressive yet efficient 3D representation; b) large diversity in geometry and texture across categories; c) complexity in the appearances of real-world objects. To this end, we propose a novel triplane-based 3D-aware Diffusion model with TransFormer, DiffTF, for handling challenges via three aspects. 1) Considering efficiency and robustness, we adopt a revised triplane representation and improve the fitting speed and accuracy. 2) To handle the drastic variations in geometry and texture, we regard the features of all 3D objects as a combination of generalized 3D knowledge and specialized 3D features. To extract generalized 3D knowledge from diverse categories, we propose a novel 3D-aware transformer with shared cross-plane attention. It learns the cross-plane relations across different planes and aggregates the generalized 3D knowledge with specialized 3D features. 3) In addition, we devise the 3D-aware encoder/decoder to enhance the generalized 3D knowledge in the encoded triplanes for handling categories with complex appearances. Extensive experiments on ShapeNet and OmniObject3D (over 200 diverse real-world categories) convincingly demonstrate that a single DiffTF model achieves state-of-the-art large-vocabulary 3D object generation performance with large diversity, rich semantics, and high quality.

Submitted 15 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

Comments: Project page at https://ziangcao0312.github.io/difftf_pages/

arXiv:2309.07026 [pdf, other]

APICom: Automatic API Completion via Prompt Learning and Adversarial Training-based Data Augmentation

Authors: Yafeng Gu, Yiheng Shen, Xiang Chen, Shaoyu Yang, Yiling Huang, Zhixiang Cao

Abstract: Based on developer needs and usage scenarios, API (Application Programming Interface) recommendation is the process of assisting developers in finding the required API among numerous candidate APIs. Previous studies mainly modeled API recommendation as a recommendation task, which can recommend multiple candidate APIs for the given query, and developers may not yet be able to find what they need. Motivated by the neural machine translation research domain, we can model this problem as a generation task, which aims to directly generate the required API for the developer query. After our preliminary investigation, we find the performance of this intuitive approach is not promising. The reason is that errors occur when generating the prefixes of the API. However, developers may know certain API prefix information during actual development in most cases. Therefore, we model this problem as an automatic completion task and propose a novel approach, APICom, based on prompt learning, which can generate APIs related to the query according to the prompts (i.e., API prefix information). Moreover, the effectiveness of APICom highly depends on the quality of the training dataset. In this study, we further design a novel gradient-based adversarial training method for data augmentation, which can improve the normalized stability when generating adversarial examples. To evaluate the effectiveness of APICom, we consider a corpus of 33k developer queries and corresponding APIs. Compared with the state-of-the-art baselines, our experimental results show that APICom can outperform all baselines by at least 40.02%, 13.20%, and 16.31% in terms of the performance measures EM@1, MRR, and MAP. Finally, our ablation studies confirm the effectiveness of our component setting (such as our designed adversarial training method, our used pre-trained model, and prompt learning) in APICom.

Submitted 13 September, 2023; originally announced September 2023.

Comments: accepted in Internetware 2023

arXiv:2309.06037 [pdf, other]

doi 10.1103/PhysRevD.107.123029

Fast resolving Galactic binaries in LISA data and its ability to study the Milky Way

Authors: Pin Gao, Xi-Long Fan, Zhou-Jian Cao, Xue-Hao Zhang

Abstract: Resolving individual gravitational waves from tens of millions of double white dwarf (DWD) binaries in the Milky Way is a challenge for future space-based gravitational wave detection programs. By using previous data to define the priors for the next search, we propose an accelerated approach of searching the DWD binaries and demonstrate its efficiency based on the GBSIEVER detection pipeline. Compared to the traditional GBSIEVER method, our method can obtain $\sim 50\%$ of sources with 2.5% of the searching time for LDC1-4 data. In addition, we find that both methods have a similar ability to detect the Milky Way structure by their confirmed sources. The relative error of distance and chirp mass is about 20% for DWD binaries whose gravitational wave frequency is higher than $4\times10^{-3}$ Hz, even if they are close to the Galactic center. Finally, we propose a signal-to-noise ratio (SNR) threshold for LISA to confirm the detection of DWD binaries. The threshold should be 16 when the gravitational wave frequency is lower than $4\times10^{-3}$ Hz and 9 when the frequency range is from $4\times10^{-3}$ Hz to $1.5\times10^{-2}$ Hz.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 16 pages, 19 figures

Journal ref: Phys. Rev. D 107, 123029, 2023

arXiv:2309.02030 [pdf, other]

doi 10.1103/PhysRevD.109.043009

Matched filtering for gravitational wave detection without template bank driven by deep learning template prediction model bank

Authors: CunLiang Ma, Sen Wang, Wei Wang, Zhoujiang Cao

Abstract: The existing matched filtering method for gravitational wave (GW) search relies on a template bank. The computational efficiency of this method scales with the size of the templates within the bank. Higher-order modes and eccentricity will play an important role when third-generation detectors operate in the future. In this case, traditional GW search methods will hit computational limits. To improve the computational efficiency of GW search, we propose the utilization of a deep learning (DL) model bank as a substitute for the template bank. This model bank predicts the latent templates embedded in the strain data. Combining an envelope extraction network and an astrophysical origin discrimination network, we realize a novel GW search framework. The framework can predict the GW signal's matched filtering signal-to-noise ratio (SNR). Unlike the end-to-end DL-based GW search method, our statistical SNR holds greater physical interpretability than the $p_{score}$ metric. Moreover, the intermediate results generated by our approach, including the predicted template, offer valuable assistance in subsequent GW data processing tasks such as parameter estimation and source localization. Compared to the traditional matched filtering method, the proposed method can realize real-time analysis. With minor improvements in the future, the proposed method may extend to other scopes of GW search, such as GWs emitted by supernova explosions.

Submitted 5 September, 2023; originally announced September 2023.

Comments: 20 pages, 12 figures

Journal ref: Physical Review D, 2024, 109(4): 043009

arXiv:2309.01579 [pdf, other]

Direct observation of topological surface states in the layered kagome lattice with broken time-reversal symmetry

Authors: Zhicheng Jiang, Tongrui Li, Jian Yuan, Zhengtai Liu, Zhipeng Cao, Soohyun Cho, Mingfang Shu, Yichen Yang, Jianyang Ding, Zhikai Li, Jiayu Liu, Zhonghao Liu, Jishan Liu, Jie Ma, Zhe Sun, Yanfeng Guo, Dawei Shen

Abstract: Magnetic topological quantum materials display a diverse range of fascinating physical properties which arise from their intrinsic magnetism and the breaking of time-reversal symmetry. However, so far, few examples of intrinsic magnetic topological materials have been confirmed experimentally, which significantly hinders our comprehensive understanding of the abundant physical properties in this system. The kagome lattices, which host a diversity of electronic structure signatures such as Dirac nodes, flat bands, and saddle points, provide an alternative and promising platform for in-depth investigations into correlations and band topology. In this article, drawing inspiration from the stacking configuration of MnBi$_2$Te$_4$, we conceive and then synthesize a high-quality single crystal EuTi$_3$Bi$_4$, which is a unique natural heterostructure consisting of both topological kagome layers and magnetic interlayers. We investigate the electronic structure of EuTi$_3$Bi$_4$ and uncover distinct features of anisotropic multiple Van Hove singularities (VHSs) that might prevent Fermi surface nesting, leading to the absence of a charge density wave (CDW). In addition, we identify the topologically nontrivial surface states that serve as connections between different saddle bands in the vicinity of the Fermi level. Combined with calculations, we establish that the effective time-reversal symmetry $S=θτ_{1/2}$ plays a crucial role in the antiferromagnetic ground state of EuTi$_3$Bi$_4$, which ensures the stability of the topological surface states and gives rise to their intriguing topological nature. Therefore, EuTi$_3$Bi$_4$ offers the rare opportunity to investigate correlated topological states in magnetic kagome materials.

Submitted 4 September, 2023; originally announced September 2023.

Comments: 9 pages, 4 figures

arXiv:2308.16783 [pdf, other]

Neutron Star vs Quark Star in the Multimessenger Era

Authors: Zheng Cao, Lie-Wen Chen

Abstract: Neutron stars (NSs) which could contain exotic degrees of freedom in the core and the self-bound quark stars (QSs) made purely of absolutely stable deconfined quark matter are still two main candidates for the compact objects observed in pulsars and gravitational wave (GW) events in binary star mergers. We perform a Bayesian model-agnostic inference of the properties of NSs and QSs by combining multi-messenger data of GW170817, GW190425, PSR J0030+0451, PSR J0740+6620, PSR J1614-2230, PSR J0348+0432 as well as ab initio calculations from perturbative quantum chromodynamics and chiral effective field theory. We find the NS scenario is strongly favored against the QS scenario with a Bayes factor of NS over QS $\mathcal{B}^\text{NS}_\text{QS} = 11.5$. In addition, the peak of the squared sound velocity $c_s^2 \sim 0.5c^2$ around $3.5$ times nuclear saturation density $n_0$ observed in the NS case disappears in the QS case which suggests that the $c_s^2$ first increases and then saturates at $c_s^2 \sim 0.5c^2$ above $\sim 4n_0$. The sound velocity and trace anomaly are found to approach the conformal limit in the core of heavy NSs with mass $M \gtrsim 2M_{\odot}$, but not in the core of QSs.

Submitted 31 August, 2023; originally announced August 2023.

Comments: 6 pages, 2 figures

arXiv:2308.16422 [pdf, other]

doi 10.1103/PhysRevD.109.084054

Dilated convolutional neural network for detecting extreme-mass-ratio inspirals

Authors: Tianyu Zhao, Yue Zhou, Ruijun Shi, Zhoujian Cao, Zhixiang Ren

Abstract: The detection of Extreme Mass Ratio Inspirals (EMRIs) is intricate due to their complex waveforms, extended duration, and low signal-to-noise ratio (SNR), making them more challenging to identify than compact binary coalescences. While matched filtering-based techniques are known for their computational demands, existing deep learning-based methods primarily handle time-domain data and are often constrained by data duration and SNR. In addition, most existing work ignores time-delay interferometry (TDI) and applies the long-wavelength approximation in detector response calculations, thus limiting their ability to handle laser frequency noise. In this study, we introduce DECODE, an end-to-end model focusing on EMRI signal detection by sequence modeling in the frequency domain. Centered around a dilated causal convolutional neural network, trained on synthetic data considering TDI-1.5 detector response, DECODE can efficiently process a year's worth of multichannel TDI data with an SNR of around 50. We evaluate our model on 1-year data with accumulated SNR ranging from 50 to 120 and achieve a true positive rate of 96.3% at a false positive rate of 1%, keeping an inference time of less than 0.01 seconds. With the visualization of three showcased EMRI signals for interpretability and generalization, DECODE exhibits strong potential for future space-based gravitational wave data analyses.

Submitted 14 May, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

Comments: 11 pages, 5 figures, and 2 tables

Journal ref: Phys. Rev. D 109, 084054 (2024)

arXiv:2308.15711 [pdf, other]

Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection

Authors: Hongjin Qian, Zhicheng Dou, Jiejun Tan, Haonan Chen, Haoqi Gu, Ruofei Lai, Xinyu Zhang, Zhao Cao, Ji-Rong Wen

Abstract: Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text, raising concerns about their reliability. Previous methods use external knowledge as references for text generation to enhance factuality but often struggle with the knowledge mix-up (e.g., entity mismatch) of irrelevant references. Besides, as the length of the output text grows, the randomness of sampling can escalate, detrimentally impacting the factual accuracy of the generated text. In this paper, we present DKGen, which divides the text generation process into an iterative process. In each iteration, DKGen takes the input query, the previously generated text and a subset of the reference passages as input to generate short text. During the process, the subset is dynamically selected from the full passage set based on their relevance to the previously generated text and the query, largely eliminating the irrelevant references from input. To further enhance DKGen's ability to correctly use this external knowledge, DKGen distills the relevance order of reference passages to the cross-attention distribution of the decoder. We train and evaluate DKGen on a large-scale benchmark dataset. Experimental results show that DKGen outperforms all baseline models.

Submitted 29 August, 2023; originally announced August 2023.

Comments: 15 pages

arXiv:2308.15371 [pdf, other]

doi 10.1103/PhysRevD.109.044031

Influence of dark matter equation of state on the axial gravitational ringing of supermassive black holes

Authors: Yuqian Zhao, Bing Sun, Zhoujian Cao, Kai Lin, Wei-Liang Qian

Abstract: In this work, we explore the effects of surrounding dark matter featuring different equations of state on the axial gravitational quasinormal modes of supermassive black holes situated at the center of galaxies. Our attention primarily rests on dark matter exhibiting a spike structure, originating from relativistic Bondi accretion through an adiabatic process, which diminishes at a certain distance from the black hole. We analyze how varying the equation of state of the dark matter influences the properties of the spacetime in the black hole's vicinity. Our findings reveal that different states of dark matter spikes correspondingly affect the black hole's quasinormal modes. In particular, we identify deviations in both the ringing frequency and damping time, reaching magnitudes of up to $10^{-3}$ for certain parameter values. These variations can potentially be detected by upcoming space-borne detectors. Our findings thus indicate the feasibility of discerning and limiting the essential properties of dark matter surrounding supermassive black holes using future gravitational wave detections, particularly in the case of extreme mass ratio inspiral systems.

Submitted 22 February, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

Comments: 23 pages, 6 figures, 1 table

Journal ref: Phys. Rev. D 109, 044031 (2024)

arXiv:2308.15085 [pdf, other]

Learning to Upsample by Learning to Sample

Authors: Wenze Liu, Hao Lu, Hongtao Fu, Zhiguo Cao

Abstract: We present DySample, an ultra-lightweight and effective dynamic upsampler. While impressive performance gains have been witnessed from recent kernel-based dynamic upsamplers such as CARAFE, FADE, and SAPA, they introduce much workload, mostly due to the time-consuming dynamic convolution and the additional sub-network used to generate dynamic kernels. Further, the need for high-res feature guidance of FADE and SAPA somehow limits their application scenarios. To address these concerns, we bypass dynamic convolution and formulate upsampling from the perspective of point sampling, which is more resource-efficient and can be easily implemented with the standard built-in function in PyTorch. We first showcase a naive design, and then demonstrate how to strengthen its upsampling behavior step by step towards our new upsampler, DySample. Compared with former kernel-based dynamic upsamplers, DySample requires no customized CUDA package and has much fewer parameters, FLOPs, GPU memory, and latency. Besides the light-weight characteristics, DySample outperforms other upsamplers across five dense prediction tasks, including semantic segmentation, object detection, instance segmentation, panoptic segmentation, and monocular depth estimation. Code is available at https://github.com/tiny-smart/dysample.

Submitted 29 August, 2023; originally announced August 2023.

Comments: Accepted by ICCV 2023

arXiv:2308.15003 [pdf, other]

Generative Model for Models: Rapid DNN Customization for Diverse Tasks and Resource Constraints

Authors: Wenxing Xu, Yuanchun Li, Jiacheng Liu, Yi Sun, Zhengyang Cao, Yixuan Li, Hao Wen, Yunxin Liu

Abstract: Unlike cloud-based deep learning models that are often large and uniform, edge-deployed models usually demand customization for domain-specific tasks and resource-limited environments. Such customization processes can be costly and time-consuming due to the diversity of edge scenarios and the training load for each scenario. Although various approaches have been proposed for rapid resource-oriented customization and task-oriented customization respectively, achieving both of them at the same time is challenging. Drawing inspiration from the generative AI and the modular composability of neural networks, we introduce NN-Factory, a one-for-all framework to generate customized lightweight models for diverse edge scenarios. The key idea is to use a generative model to directly produce the customized models, instead of training them. The main components of NN-Factory include a modular supernet with pretrained modules that can be conditionally activated to accomplish different tasks and a generative module assembler that manipulates the modules according to task and sparsity requirements. Given an edge scenario, NN-Factory can efficiently customize a compact model specialized in the edge task while satisfying the edge resource constraints by searching for the optimal strategy to assemble the modules. Based on experiments on image classification and object detection tasks with different edge devices, NN-Factory is able to generate high-quality task- and resource-specific models within a few seconds, faster than conventional model customization approaches by orders of magnitude.

Submitted 28 August, 2023; originally announced August 2023.

arXiv:2308.13814 [pdf, other]

Point-Query Quadtree for Crowd Counting, Localization, and More

Authors: Chengxin Liu, Hao Lu, Zhiguo Cao, Tongliang Liu

Abstract: We show that crowd counting can be viewed as a decomposable point querying process. This formulation enables arbitrary points as input and jointly reasons whether the points are crowd and where they locate. The querying processing, however, raises an underlying problem on the number of necessary querying points. Too few imply underestimation; too many increase computational overhead. To address this dilemma, we introduce a decomposable structure, i.e., the point-query quadtree, and propose a new counting model, termed Point quEry Transformer (PET). PET implements decomposable point querying via data-dependent quadtree splitting, where each querying point could split into four new points when necessary, thus enabling dynamic processing of sparse and dense regions. Such a querying process yields an intuitive, universal modeling of crowd as both the input and output are interpretable and steerable. We demonstrate the applications of PET on a number of crowd-related tasks, including fully-supervised crowd counting and localization, partial annotation learning, and point annotation refinement, and also report state-of-the-art performance. For the first time, we show that a single counting model can address multiple crowd-related tasks across different learning paradigms. Code is available at https://github.com/cxliu0/PET.

Submitted 26 August, 2023; originally announced August 2023.

Comments: Accepted by ICCV 2023

arXiv:2308.13690 [pdf, other]

Including higher harmonics in gravitational-wave parameter estimation and cosmological implications for LISA

Authors: Yi Gong, Zhoujian Cao, Junjie Zhao, Lijing Shao

Abstract: Massive black holes (MBHs) are crucial in shaping their host galaxies. How the MBH co-evolves with its host galaxy is a pressing problem in astrophysics and cosmology. The valuable information carried by the binary MBH is encoded in the gravitational waves (GWs), which will be detectable by the space-borne GW detector LISA. In the GW data analysis, usually, only the dominant $(2,2)$ mode of the GW signal is considered in the parameter estimation for LISA. However, including the higher harmonics in parameter estimation can break the degeneracy between the parameters, especially for the inclination angle and luminosity distance. This may enable the identification of GW signals without electromagnetic counterparts, known as "dark sirens". Thus, incorporating higher harmonics will be beneficial to resolve the Hubble tension and constrain the cosmological model. In this paper, we investigate the role of higher harmonics in the parameter estimation for GWs emitted by binary MBHs. We demonstrate that including the $(3,3)$ mode can lead to a $10^3$-times improvement in angular resolution and a $10^4$-times improvement in luminosity distance. Meanwhile, our results indicate that considering higher harmonics increases the probability of identifying over 70% host galaxies from $10^{-2}\,\rm{Gpc}^3$ cosmological volume threshold (corresponding to $10^5$ host galaxies), while the probability is less than 8% for only the $(2,2)$ mode. Thus, our results underscore the importance of including higher modes in the GW signal from binary MBHs, for LISA at least the $(3,3)$ mode.

Submitted 25 August, 2023; originally announced August 2023.

Comments: 14 pages, 11 figures, 4 tables; accepted by Physical Review D

Journal ref: Phys. Rev. D 108, 064046 (2023)

arXiv:2308.13666 [pdf, other]

A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko, et al. (1674 additional authors not shown)

Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses, the Targeted Search and the Untargeted Search, we investigate whether there are any coincident GRBs associated with the GWs. We also search the Swift-BAT rate data around the GW times to determine whether a GRB counterpart is present. No counterparts are found. Using both the Fermi-GBM Targeted Search and the Swift-BAT search, we calculate flux upper limits and present joint upper limits on the gamma-ray luminosity of each GW. Given these limits, we constrain theoretical models for the emission of gamma-rays from binary black hole mergers.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.12605 [pdf, other]

APLA: Additional Perturbation for Latent Noise with Adversarial Training Enables Consistency

Authors: Yupu Yao, Shangqi Deng, Zihan Cao, Harry Zhang, Liang-Jian Deng

Abstract: Diffusion models have exhibited promising progress in video generation. However, they often struggle to retain consistent details within local regions across frames. One underlying cause is that traditional diffusion models approximate Gaussian noise distribution by utilizing predictive noise, without fully accounting for the impact of inherent information within the input itself. Additionally, these models emphasize the distinction between predictions and references, neglecting information intrinsic to the videos. To address this limitation, inspired by the self-attention mechanism, we propose a novel text-to-video (T2V) generation network structure based on diffusion models, dubbed Additional Perturbation for Latent noise with Adversarial training (APLA). Our approach only necessitates a single video as input and builds upon pre-trained stable diffusion networks. Notably, we introduce an additional compact network, known as the Video Generation Transformer (VGT). This auxiliary component is designed to extract perturbations from the inherent information contained within the input, thereby refining inconsistent pixels during temporal predictions. We leverage a hybrid architecture of transformers and convolutions to compensate for temporal intricacies, enhancing consistency between different frames within the video. Experiments demonstrate a noticeable improvement in the consistency of the generated videos both qualitatively and quantitatively.

Submitted 1 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.10330 [pdf, other]

Towards Real-World Visual Tracking with Temporal Contexts

Authors: Ziang Cao, Ziyuan Huang, Liang Pan, Shiwei Zhang, Ziwei Liu, Changhong Fu

Abstract: Visual tracking has made significant improvements in the past few decades. Most existing state-of-the-art trackers 1) merely aim for performance in ideal conditions while overlooking the real-world conditions; 2) adopt the tracking-by-detection paradigm, neglecting rich temporal contexts; 3) only integrate the temporal information into the template, where temporal contexts among consecutive frames are far from being fully utilized. To handle those problems, we propose a two-level framework (TCTrack) that can exploit temporal contexts efficiently. Based on it, we propose a stronger version for real-world visual tracking, i.e., TCTrack++. It boils down to two levels: features and similarity maps. Specifically, for feature extraction, we propose an attention-based temporally adaptive convolution to enhance the spatial features using temporal information, which is achieved by dynamically calibrating the convolution weights. For similarity map refinement, we introduce an adaptive temporal transformer to encode the temporal knowledge efficiently and decode it for the accurate refinement of the similarity map. To further improve the performance, we additionally introduce a curriculum learning strategy. Also, we adopt online evaluation to measure performance in real-world conditions. Exhaustive experiments on 8 well-known benchmarks demonstrate the superiority of TCTrack++. Real-world tests directly verify that TCTrack++ can be readily used in real-world applications.

Submitted 20 August, 2023; originally announced August 2023.

Comments: Accepted by IEEE TPAMI, Code: https://github.com/vision4robotics/TCTrack

arXiv:2308.10257 [pdf, other]

doi 10.1145/3581783.3612033

Make-It-4D: Synthesizing a Consistent Long-Term Dynamic Scene Video from a Single Image

Authors: Liao Shen, Xingyi Li, Huiqiang Sun, Juewen Peng, Ke Xian, Zhiguo Cao, Guosheng Lin

Abstract: We study the problem of synthesizing a long-term dynamic video from only a single image. This is challenging since it requires consistent visual content movements given large camera motions. Existing methods either hallucinate inconsistent perpetual views or struggle with long camera trajectories. To address these issues, it is essential to estimate the underlying 4D (including 3D geometry and scene motion) and fill in the occluded regions. To this end, we present Make-It-4D, a novel method that can generate a consistent long-term dynamic video from a single image. On the one hand, we utilize layered depth images (LDIs) to represent a scene, and they are then unprojected to form a feature point cloud. To animate the visual content, the feature point cloud is displaced based on the scene flow derived from motion estimation and the corresponding camera pose. Such 4D representation enables our method to maintain the global consistency of the generated dynamic video. On the other hand, we fill in the occluded regions by using a pretrained diffusion model to inpaint and outpaint the input image. This enables our method to work under large camera motions. Benefiting from our design, our method can be training-free which saves a significant amount of training time. Experimental results demonstrate the effectiveness of our approach, which showcases compelling rendering results.

Submitted 20 August, 2023; originally announced August 2023.

Comments: Accepted by ACM MM'23

arXiv:2308.08843 [pdf, other]

Dr.Bokeh: DiffeRentiable Occlusion-aware Bokeh Rendering

Authors: Yichen Sheng, Zixun Yu, Lu Ling, Zhiwen Cao, Cecilia Zhang, Xin Lu, Ke Xian, Haiting Lin, Bedrich Benes

Abstract: Bokeh is widely used in photography to draw attention to the subject while effectively isolating distractions in the background. Computational methods simulate bokeh effects without relying on a physical camera lens. However, in the realm of digital bokeh synthesis, the two main challenges for bokeh synthesis are color bleeding and partial occlusion at object boundaries. Our primary goal is to overcome these two major challenges using physics principles that define bokeh formation. To achieve this, we propose a novel and accurate filtering-based bokeh rendering equation and a physically-based occlusion-aware bokeh renderer, dubbed Dr.Bokeh, which addresses the aforementioned challenges during the rendering stage without the need of post-processing or data-driven approaches. Our rendering algorithm first preprocesses the input RGBD to obtain a layered scene representation. Dr.Bokeh then takes the layered representation and user-defined lens parameters to render photo-realistic lens blur. By softening non-differentiable operations, we make Dr.Bokeh differentiable such that it can be plugged into a machine-learning framework. We perform quantitative and qualitative evaluations on synthetic and real-world images to validate the effectiveness of the rendering quality and the differentiability of our method. We show Dr.Bokeh not only outperforms state-of-the-art bokeh rendering algorithms in terms of photo-realism but also improves the depth quality from depth-from-defocus.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.08176 [pdf, other]

RSpell: Retrieval-augmented Framework for Domain Adaptive Chinese Spelling Check

Authors: Siqi Song, Qi Lv, Lei Geng, Ziqiang Cao, Guohong Fu

Abstract: Chinese Spelling Check (CSC) refers to the detection and correction of spelling errors in Chinese texts. In practical application scenarios, it is important to make CSC models have the ability to correct errors across different domains. In this paper, we propose a retrieval-augmented spelling check framework called RSpell, which searches corresponding domain terms and incorporates them into CSC models. Specifically, we employ pinyin fuzzy matching to search for terms, which are combined with the input and fed into the CSC model. Then, we introduce an adaptive process control mechanism to dynamically adjust the impact of external knowledge on the model. Additionally, we develop an iterative strategy for the RSpell framework to enhance reasoning capabilities. We conducted experiments on CSC datasets in three domains: law, medicine, and official document writing. The results demonstrate that RSpell achieves state-of-the-art performance in both zero-shot and fine-tuning scenarios, demonstrating the effectiveness of the retrieval-augmented CSC framework. Our code is available at https://github.com/47777777/Rspell.

Submitted 16 August, 2023; originally announced August 2023.

Journal ref: NLPCC 2023

arXiv:2308.08114 [pdf, other]

OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution

Authors: Zidong Cao, Hao Ai, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Lin Wang

Abstract: Omnidirectional images (ODIs) have become increasingly popular, as their large field-of-view (FoV) can offer viewers the chance to freely choose the view directions in immersive environments such as virtual reality. The Möbius transformation is typically employed to further provide the opportunity for movement and zoom on ODIs, but applying it to the image level often results in blurry effect and aliasing problem. In this paper, we propose a novel deep learning-based approach, called OmniZoomer, to incorporate the Möbius transformation into the network for movement and zoom on ODIs. By learning various transformed feature maps under different conditions, the network is enhanced to handle the increasing edge curvatures, which alleviates the blurry effect. Moreover, to address the aliasing problem, we propose two key components. Firstly, to compensate for the lack of pixels for describing curves, we enhance the feature maps in the high-resolution (HR) space and calculate the transformed index map with a spatial index generation module. Secondly, considering that ODIs are inherently represented in the spherical space, we propose a spherical resampling module that combines the index map and HR feature maps to transform the feature maps for better spherical correlation. The transformed feature maps are decoded to output a zoomed ODI. Experiments show that our method can produce HR and high-quality ODIs with the flexibility to move and zoom in to the object of interest. Project page is available at http://vlislab22.github.io/OmniZoomer/.

Submitted 18 August, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

Comments: Accepted by ICCV 2023

arXiv:2308.06926 [pdf, other]

OpenGCD: Assisting Open World Recognition with Generalized Category Discovery

Authors: Fulin Gao, Weimin Zhong, Zhixing Cao, Xin Peng, Zhi Li

Abstract: A desirable open world recognition (OWR) system requires performing three tasks: (1) Open set recognition (OSR), i.e., classifying the known (classes seen during training) and rejecting the unknown (unseen/novel classes) online; (2) Grouping and labeling these unknown as novel known classes; (3) Incremental learning (IL), i.e., continual learning these novel classes and retaining the memory of old classes. Ideally, all of these steps should be automated. However, existing methods mostly assume that the second task is completely done manually. To bridge this gap, we propose OpenGCD that combines three key ideas to solve the above problems sequentially: (a) We score the origin of instances (unknown or specifically known) based on the uncertainty of the classifier's prediction; (b) For the first time, we introduce generalized category discovery (GCD) techniques in OWR to assist humans in grouping unlabeled data; (c) For the smooth execution of IL and GCD, we retain an equal number of informative exemplars for each class with diversity as the goal. Moreover, we present a new performance evaluation metric for GCD called harmonic clustering accuracy. Experiments on two standard classification benchmarks and a challenging dataset demonstrate that OpenGCD not only offers excellent compatibility but also substantially outperforms other baselines. Code: https://github.com/Fulin-Gao/OpenGCD.

Submitted 14 August, 2023; originally announced August 2023.

arXiv:2308.03822 [pdf, other]

Search for Eccentric Black Hole Coalescences during the Third Observing Run of LIGO and Virgo

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1750 additional authors not shown)

Abstract: Despite the growing number of confident binary black hole coalescences observed through gravitational waves so far, the astrophysical origin of these binaries remains uncertain. Orbital eccentricity is one of the clearest tracers of binary formation channels. Identifying binary eccentricity, however, remains challenging due to the limited availability of gravitational waveforms that include effects of eccentricity. Here, we present observational results for a waveform-independent search sensitive to eccentric black hole coalescences, covering the third observing run (O3) of the LIGO and Virgo detectors. We identified no new high-significance candidates beyond those that were already identified with searches focusing on quasi-circular binaries. We determine the sensitivity of our search to high-mass (total mass $M>70$ $M_\odot$) binaries covering eccentricities up to 0.3 at 15 Hz orbital frequency, and use this to compare model predictions to search results. Assuming all detections are indeed quasi-circular, for our fiducial population model, we place an upper limit for the merger rate density of high-mass binaries with eccentricities $0 < e \leq 0.3$ at $0.33$ Gpc$^{-3}$ yr$^{-1}$ at 90% confidence level.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 24 pages, 5 figures

Report number: LIGO-P2300080

arXiv:2308.02994 [pdf]

Efficient production of nuclear isomer $^{93m}$Mo with laser-accelerated proton beam and an astrophysical implication on $^{92m}$Mo production

Authors: Wenru Fan, Wei Qi, Jingli Zhang, Zongwei Cao, Haoyang Lan, Xinxiang Li, Yi Xu, Yuqiu Gu, Zhigang Deng, Zhimeng Zhang, Changxiang Tan, Wen Luo, Yun Yuan, Weimin Zhou

Abstract: Nuclear isomers play a key role in the creation of the elements in the universe and have a number of fascinating potential applications related to the controlled release of nuclear energy on demand. Particularly, the $^{93m}$Mo isomer is a good candidate for studying the depletion of nuclear isomers via nuclear excitation by electron capture. For such purposes, an efficient approach for $^{93m}$Mo production needs to be explored. In the present work, we demonstrate experimentally an efficient production of $^{93m}$Mo through the $^{93}$Nb(p, n) reaction induced by an intense laser pulse. When a ps-duration, 100-J laser pulse is employed, the $^{93m}$Mo isomer at 2425 keV (21/2$^+$, $T_{1/2}$ = 6.85 h) is generated with a high yield of $1.8\times10^6$ particles/shot. The resulting peak efficiency is expected to be $10^{17}$ particles/s, which is at least five orders of magnitude higher than using a classical proton accelerator. The effects of production and destruction of $^{93m}$Mo on the controversial astrophysical p-isotope $^{92}$Mo are studied. It is found that the $^{93}$Nb(p, n)-$^{93m}$Mo reaction is an important production path for the $^{93m}$Mo seed nucleus, and the influence of the $^{93m}$Mo-$^{92}$Mo reaction flow on $^{92}$Mo production cannot be ignored. In addition, we propose to directly measure the astrophysical rate of the (p, n) reaction using a laser-induced proton beam since the latter fits the Maxwell-Boltzmann distribution well. We conclude that the laser-induced proton beam opens a new path to produce nuclear isomers with high peak efficiency towards the understanding of p-nuclei nucleosynthesis.

Submitted 5 August, 2023; originally announced August 2023.

Comments: 21 pages, 9 figures

arXiv:2308.02283 [pdf, other]

doi 10.1145/3581783.3611807

Diffusion-Augmented Depth Prediction with Sparse Annotations

Authors: Jiaqi Li, Yiran Wang, Zihao Huang, Jinghong Zheng, Ke Xian, Zhiguo Cao, Jianming Zhang

Abstract: Depth estimation aims to predict dense depth maps. In autonomous driving scenes, sparsity of annotations makes the task challenging. Supervised models produce concave objects due to insufficient structural information. They overfit to valid pixels and fail to restore spatial structures. Self-supervised methods are proposed for the problem. Their robustness is limited by pose estimation, leading to erroneous results in natural scenes. In this paper, we propose a supervised framework termed Diffusion-Augmented Depth Prediction (DADP). We leverage the structural characteristics of diffusion model to enforce depth structures of depth models in a plug-and-play manner. An object-guided integrality loss is also proposed to further enhance regional structure integrality by fetching objective information. We evaluate DADP on three driving benchmarks and achieve significant improvements in depth structures and robustness. Our work provides a new perspective on depth estimation with sparse annotations in autonomous driving scenes.

Submitted 4 August, 2023; originally announced August 2023.

Comments: Accepted by ACM MM'2023

arXiv:2308.01907 [pdf, other]

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

Authors: Weiyun Wang, Min Shi, Qingyun Li, Wenhai Wang, Zhenhang Huang, Linjie Xing, Zhe Chen, Hao Li, Xizhou Zhu, Zhiguo Cao, Yushi Chen, Tong Lu, Jifeng Dai, Yu Qiao

Abstract: We present the All-Seeing (AS) project: a large-scale data and model for recognizing and understanding everything in the open world. Using a scalable data engine that incorporates human feedback and efficient models in the loop, we create a new dataset (AS-1B) with over 1 billion regions annotated with semantic tags, question-answering pairs, and detailed captions. It covers a wide range of 3.5 million common and rare concepts in the real world, and has 132.2 billion tokens that describe the concepts and their attributes. Leveraging this new dataset, we develop the All-Seeing model (ASM), a unified framework for panoptic visual recognition and understanding. The model is trained with open-ended language prompts and locations, which allows it to generalize to various vision and language tasks with remarkable zero-shot performance, including region-text retrieval, region recognition, captioning, and question-answering. We hope that this project can serve as a foundation for vision-language artificial general intelligence research. Models and the dataset shall be released at https://github.com/OpenGVLab/All-Seeing, and demo can be seen at https://huggingface.co/spaces/OpenGVLab/all-seeing.

Submitted 3 August, 2023; originally announced August 2023.

Comments: Technical Report

arXiv:2308.00233 [pdf, ps, other]

doi 10.1016/j.physletb.2023.138350

Probe the gravitational constant variation via the propagation of gravitational waves

Authors: Bing Sun, Jiachen An, Zhoujian Cao

Abstract: The gravitational constant variation means the breakdown of the strong equivalence principle. Because this principle is a cornerstone of general relativity, the validity of general relativity can be examined by studying the gravitational constant variation. Such variations have the potential to affect both the generation and propagation of gravitational waves. In this paper, our focus lies on the effect of gravitational constant variation specifically on the propagation of gravitational waves. We employ two analytical methods, based respectively on the Fierz-Pauli action and on the perturbation of the Einstein-Hilbert action around Minkowski spacetime, both leading to the same gravitational wave equation. By solving this equation, we find the effects of gravitational constant variation on gravitational wave propagation. The result is consistent with previous investigations based on Maxwell-like equations for gravitational waves. Notably, we find that small variations in the gravitational constant result in an amplitude correction at the leading order and a phase correction at the sub-leading order for gravitational waves. These results provide valuable insights for probing gravitational constant variation and can be directly applied to gravitational wave data analysis.

Submitted 31 July, 2023; originally announced August 2023.

Comments: 9 pages, 1 figure

Journal ref: Physics Letters B, Volume 848, January 2024, 138350

arXiv:2307.15382 [pdf, ps, other]

doi 10.1016/j.physletb.2023.138108

The effect of the gravitational constant variation on the propagation of gravitational waves

Authors: Jiachen An, Yadong Xue, Zhoujian Cao, Xiaokai He, Bing Sun

Abstract: Since the first detection of gravitational waves, they have been used to investigate various fundamental problems, including the variation of physical constants. Regarding the gravitational constant, previous works focused on the effect of the gravitational constant variation on gravitational wave generation. In this paper, we investigate the effect of the gravitational constant variation on gravitational wave propagation. The Maxwell-like equation that describes the propagation of gravitational waves is extended in this paper to account for situations where the gravitational constant varies. Based on this equation, we find that the amplitude of gravitational waves will be corrected. Consequently, the estimated distance to the gravitational wave source may be biased if this correction is not considered. Applying our correction result to the well-known binary neutron star coalescence event GW170817, we get a constraint on the variation of the gravitational constant. Relating our result to the Yukawa deviation of gravity, we obtain, for the first time, a constraint on the Yukawa parameters at the 10 Mpc scale. This scale corresponds to a graviton mass $m_g\sim10^{-31}$ eV.

Submitted 28 July, 2023; originally announced July 2023.

Showing 151–200 of 1,131 results for author: Cao, Z