Dance of the ADS: Orchestrating Failures through Historically-Informed Scenario Fuzzing

Authors: Tong Wang, Taotao Gu, Huan Deng, Hu Li, Xiaohui Kuang, Gang Zhao

Abstract: As autonomous driving systems (ADS) advance towards higher levels of autonomy, orchestrating their safety verification becomes increasingly intricate. This paper unveils ScenarioFuzz, a pioneering scenario-based fuzz testing methodology. Designed like a choreographer who understands the past performances, it uncovers vulnerabilities in ADS without the crutch of predefined scenarios. Leveraging map road networks, such as OPENDRIVE, we extract essential data to form a foundational scenario seed corpus. This corpus, enriched with pertinent information, provides the necessary boundaries for fuzz testing in the absence of starting scenarios. Our approach integrates specialized mutators and mutation techniques, combined with a graph neural network model, to predict and filter out high-risk scenario seeds, optimizing the fuzzing process using historical test data. Compared to other methods, our approach reduces the time cost by an average of 60.3%, while the number of error scenarios discovered per unit of time increases by 103%. Furthermore, we propose a self-supervised collision trajectory clustering method, which aids in identifying and summarizing 54 high-risk scenario categories prone to inducing ADS faults. Our experiments have successfully uncovered 58 bugs across six tested systems, emphasizing the critical safety concerns of ADS.

Submitted 5 July, 2024; originally announced July 2024.

Comments: This paper was accepted by the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024)

MSC Class: 68Txx (Primary) ACM Class: D.2.4; I.2.9; I.6.7

arXiv:2407.01601 [pdf, other]

Unveiling and Controlling Anomalous Attention Distribution in Transformers

Authors: Ruiqing Yan, Xingbo Du, Haoyu Deng, Linghan Zheng, Qiuzhuang Sun, Jifang Hu, Yuhang Shao, Penghao Jiang, Jinrong Jiang, Lian Zhao

Abstract: With the advent of large models based on the Transformer architecture, researchers have observed an anomalous phenomenon in the Attention mechanism: very high attention on the first element, which is prevalent across Transformer-based models. Understanding it is crucial for the development of techniques focusing on attention distribution, such as Key-Value (KV) Cache compression and infinite extrapolation; however, the latent cause remains unknown. In this paper, we analyze this phenomenon from the perspective of the waiver phenomenon, which involves reducing the internal values of certain elements in the sequence, allowing them to absorb excess attention without affecting their contribution to information. In specific models, due to differences in positional encoding and attention patterns, we have found that the selection of waiver elements by the model can be categorized into two methods: positional-encoding-based and feature-distribution-within-elements-based.

Submitted 3 July, 2024; v1 submitted 26 June, 2024; originally announced July 2024.

arXiv:2407.01050 [pdf, other]

Evolutionary Morphology Towards Overconstrained Locomotion via Large-Scale, Multi-Terrain Deep Reinforcement Learning

Authors: Yenan Chen, Chuye Zhang, Pengxi Gu, Jianuo Qiu, Jiayi Yin, Nuofan Qiu, Guojing Huang, Bangchao Huang, Zishang Zhang, Hui Deng, Wei Zhang, Fang Wan, Chaoyang Song

Abstract: While animals' fin-to-limb evolution has been well-researched in biology, such morphological transformation remains under-adopted in the modern design of advanced robotic limbs. This paper investigates a novel class of overconstrained locomotion from a design and learning perspective inspired by evolutionary morphology, aiming to integrate the concept of 'intelligent design under constraints' - hereafter referred to as constraint-driven design intelligence - in developing modern robotic limbs with superior energy efficiency. We propose a 3D-printable design of robotic limbs parametrically reconfigurable as a classical planar 4-bar linkage, an overconstrained Bennett linkage, and a spherical 4-bar linkage. These limbs adopt a co-axial actuation, identical to the modern legged robot platforms, with the added capability of upgrading into a wheel-legged system. Then, we implemented a large-scale, multi-terrain deep reinforcement learning framework to train these reconfigurable limbs for a comparative analysis of overconstrained locomotion in energy efficiency. Results show that the overconstrained limbs exhibit more efficient locomotion than planar limbs during forward and sideways walking over different terrains, including floors, slopes, and stairs, with or without random noise, by saving at least 22% mechanical energy in completing the traverse task, with the spherical limbs being the least efficient. The overconstrained limbs also achieve the highest average speed of 0.85 meters per second on flat terrain, which is 20% faster than the planar limbs. This study paves the path for an exciting direction for future research in overconstrained robotics leveraging evolutionary morphology and reconfigurable mechanism intelligence when combined with state-of-the-art methods in deep reinforcement learning.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 13 pages, 5 figures, Accepted and Presented at ReMAR2024

arXiv:2406.18548 [pdf]

Exploration of Multi-Scale Image Fusion Systems in Intelligent Medical Image Analysis

Authors: Yuxiang Hu, Haowei Yang, Ting Xu, Shuyao He, Jiajie Yuan, Haozhang Deng

Abstract: The diagnosis of brain cancer relies heavily on medical imaging techniques, with MRI being the most commonly used. It is necessary to perform automatic segmentation of brain tumors on MRI images. This project intends to build an MRI algorithm based on U-Net. The residual network and the module used to enhance the context information are combined, and the void space convolution pooling pyramid is added to the network for processing. The brain glioma MRI image dataset provided by cancer imaging archives was experimentally verified. A multi-scale segmentation method based on a weighted least squares filter was used to complete the 3D reconstruction of brain tumors. Thus, the accuracy of three-dimensional reconstruction is further improved. Experiments show that the local texture features obtained by the proposed algorithm are similar to those obtained by laser scanning. The algorithm is improved by using the U-Net method and an accuracy of 0.9851 is obtained. This approach significantly enhances the precision of image segmentation and boosts the efficiency of image classification.

Submitted 23 May, 2024; originally announced June 2024.

arXiv:2406.17538 [pdf, other]

SKD-TSTSAN: Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition

Authors: Guanghao Zhu, Lin Liu, Yuhao Hu, Haixin Sun, Fang Liu, Xiaohui Du, Ruqian Hao, Juanxiu Liu, Yong Liu, Hao Deng, Jing Zhang

Abstract: Micro-expressions (MEs) are subtle facial movements that occur spontaneously when people try to conceal their real emotions. Micro-expression recognition (MER) is crucial in many fields, including criminal analysis and psychotherapy. However, MER is challenging since MEs have low intensity and ME datasets are small in size. To this end, a three-stream temporal-shift attention network based on self-knowledge distillation (SKD-TSTSAN) is proposed in this paper. Firstly, to address the low intensity of ME muscle movements, we utilize learning-based motion magnification modules to enhance the intensity of ME muscle movements. Secondly, we employ efficient channel attention (ECA) modules in the local-spatial stream to make the network focus on facial regions that are highly relevant to MEs. In addition, temporal shift modules (TSMs) are used in the dynamic-temporal stream, which enables temporal modeling with no additional parameters by mixing ME motion information from two different temporal domains. Furthermore, we introduce self-knowledge distillation (SKD) into the MER task by introducing auxiliary classifiers and using the deepest section of the network for supervision, encouraging all blocks to fully explore the features of the training set. Finally, extensive experiments are conducted on four ME datasets: CASME II, SAMM, MMEW, and CAS(ME)3. The experimental results demonstrate that our SKD-TSTSAN outperforms other existing methods and achieves new state-of-the-art performance. Our code will be available at https://github.com/GuanghaoZhu663/SKD-TSTSAN.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.12769 [pdf, other]

Latent Intuitive Physics: Learning to Transfer Hidden Physics from A 3D Video

Authors: Xiangming Zhu, Huayu Deng, Haochen Yuan, Yunbo Wang, Xiaokang Yang

Abstract: We introduce latent intuitive physics, a transfer learning framework for physics simulation that can infer hidden properties of fluids from a single 3D video and simulate the observed fluid in novel scenes. Our key insight is to use latent features drawn from a learnable prior distribution conditioned on the underlying particle states to capture the invisible and complex physical properties. To achieve this, we train a parametrized prior learner given visual observations to approximate the visual posterior of inverse graphics, and both the particle states and the visual posterior are obtained from a learned neural renderer. The converged prior learner is embedded in our probabilistic physics engine, allowing us to perform novel simulations on unseen geometries, boundaries, and dynamics without knowledge of the true physical parameters. We validate our model in three ways: (i) novel scene simulation with the learned visual-world physics, (ii) future prediction of the observed fluid dynamics, and (iii) supervised particle simulation. Our model demonstrates strong performance in all three tasks.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Published as a conference paper at ICLR 2024

Journal ref: ICLR 2024

arXiv:2406.08864 [pdf]

Research on Early Warning Model of Cardiovascular Disease Based on Computer Deep Learning

Authors: Yuxiang Hu, Jinxin Hu, Ting Xu, Bo Zhang, Jiajie Yuan, Haozhang Deng

Abstract: This project intends to study a cardiovascular disease risk early warning model based on one-dimensional convolutional neural networks. First, the missing values of 13 physiological and symptom indicators such as patient age, blood glucose, cholesterol, and chest pain were filled and Z-score was standardized. The convolutional neural network is converted into a 2D matrix, the convolution function of 1, 3, and 5 is used for the first-order convolution operation, and the Max Pooling algorithm is adopted for dimension reduction. The learning rate and output rate are then set, and the model is optimized by the Adam algorithm. The result of classification is output by a soft classifier. This study was conducted based on Statlog in the UCI database and heart disease database respectively. The empirical data indicate that the forecasting precision of this technique has been enhanced by 11.2%, relative to conventional approaches, while there is a significant improvement in the logarithmic curve fitting. The efficacy and applicability of the novel approach are corroborated through the examination employing a one-dimensional convolutional neural network.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 6 pages

arXiv:2406.04593 [pdf, other]

SynAsk: Unleashing the Power of Large Language Models in Organic Synthesis

Authors: Chonghuan Zhang, Qianghua Lin, Biwei Zhu, Haopeng Yang, Xiao Lian, Hao Deng, Jiajun Zheng, Kuangbiao Liao

Abstract: The field of natural language processing (NLP) has witnessed a transformative shift with the emergence of large language models (LLMs), revolutionizing various language tasks and applications, and the integration of LLMs into specialized domains enhances their capabilities for domain-specific applications. Notably, NLP has made significant strides in organic chemistry, particularly in predicting synthetic tasks, paving the way for the development of LLMs tailored to the organic chemistry field. In this work, we introduce SynAsk, a comprehensive organic chemistry domain-specific LLM platform developed by AIChemEco Inc. By finetuning an LLM with domain-specific data and integrating it with a chain of thought approach, SynAsk seamlessly accesses our knowledge base and advanced chemistry tools in a question-and-answer format. This includes functionalities such as a basic chemistry knowledge base, molecular information retrieval, reaction performance prediction, retrosynthesis prediction, chemical literature acquisition, and more. This novel methodology synergizes fine-tuning techniques with external resource integration, resulting in an organic chemistry-specific model poised to facilitate research and discovery in the field. Accessible via http://synask.aichemeco.com, SynAsk represents a significant advancement in leveraging NLP for synthetic applications.

Submitted 13 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.20071 [pdf]

A Staged Approach using Machine Learning and Uncertainty Quantification to Predict the Risk of Hip Fracture

Authors: Anjum Shaik, Kristoffer Larsen, Nancy E. Lane, Chen Zhao, Kuan-Jui Su, Joyce H. Keyak, Qing Tian, Qiuying Sha, Hui Shen, Hong-Wen Deng, Weihua Zhou

Abstract: Despite advancements in medical care, hip fractures impose a significant burden on individuals and healthcare systems. This paper focuses on the prediction of hip fracture risk in older and middle-aged adults, where falls and compromised bone quality are predominant factors. We propose a novel staged model that combines advanced imaging and clinical data to improve predictive performance. By using CNNs to extract features from hip DXA images, along with clinical variables, shape measurements, and texture features, our method provides a comprehensive framework for assessing fracture risk. A staged machine learning-based model was developed using two ensemble models: Ensemble 1 (clinical variables only) and Ensemble 2 (clinical variables and DXA imaging features). This staged approach used uncertainty quantification from Ensemble 1 to decide if DXA features are necessary for further prediction. Ensemble 2 exhibited the highest performance, achieving an AUC of 0.9541, an accuracy of 0.9195, a sensitivity of 0.8078, and a specificity of 0.9427. The staged model also performed well, with an AUC of 0.8486, an accuracy of 0.8611, a sensitivity of 0.5578, and a specificity of 0.9249, outperforming Ensemble 1, which had an AUC of 0.5549, an accuracy of 0.7239, a sensitivity of 0.1956, and a specificity of 0.8343. Furthermore, the staged model suggested that 54.49% of patients did not require DXA scanning. It effectively balanced accuracy and specificity, offering a robust solution when DXA data acquisition is not always feasible. Statistical tests confirmed significant differences between the models, highlighting the advantages of the advanced modeling strategies. Our staged approach could identify individuals at risk with high accuracy while reducing unnecessary DXA scanning. It has great promise to guide interventions to prevent hip fractures with reduced cost and radiation.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 29 pages, 5 figures, 6 tables

arXiv:2405.14588 [pdf, other]

A Study of the Spectral properties of Gamma-Ray Bursts with the Precursors and Main bursts

Authors: Hui-Ying Deng, Zhao-Yang Peng, Jia-Ming Chen, Yue Yin, Ting Li

Abstract: There is no consensus yet on whether the precursor and the main burst of gamma-ray bursts (GRBs) have the same origin, and their jet composition is still unclear. In order to further investigate this issue, we systematically search 21 Fermi GRBs with both precursor and main burst for spectral analysis. We first perform Bayesian time-resolved spectral analysis and find that almost all the precursors and the main bursts (94.4$\%$) exhibit thermal components, and the vast majority of them (72.2$\%$) have a low-energy spectral index ($α$) that exceeds the limit of synchrotron radiation. We then analyse the evolution and correlation of the spectral parameters and find that in approximately half of the cases (50$\%$) the $α$ of the precursors and the main bursts evolves in a similar pattern, while the peak energy ($E_{p}$) behaves similarly in 55.6$\%$ of cases, and their evolution is mainly characterized by flux tracking; for the $α-F$ (the flux) relation, more than half of the precursors and the main bursts (61.1$\%$) exhibit roughly similar patterns; the $E_{p}-F$ relation in both the precursor and main burst (100$\%$) exhibits a positive correlation of at least moderate strength. Next, we constrain the outflow properties of the precursors and the main bursts and find that most of them exhibit typical properties of photosphere radiation. Finally, we compare the time-integrated spectra of the precursors and the main bursts and find that nearly all of them are located in similar regions of the Amati relation and follow the Yonetoku relation. Therefore, we conclude that main bursts are continuations of precursors and they may share a common physical origin.

Submitted 23 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: 36 pages, 13 figures. Accepted for publication in ApJ

arXiv:2405.10691 [pdf, other]

LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

Abstract: The infant brain undergoes rapid development in the first few years after birth. Compared to cross-sectional studies, longitudinal studies can depict the trajectories of infant brain development with higher accuracy, statistical power and flexibility. However, the collection of infant longitudinal magnetic resonance (MR) data suffers a notorious dropout problem, resulting in incomplete datasets with missing time points. This limitation significantly impedes subsequent neuroscience and clinical modeling. Yet, existing deep generative models are facing difficulties in missing brain image completion, due to sparse data and the nonlinear, dramatic contrast/geometric variations in the developing brain. We propose LoCI-DiffCom, a novel Longitudinal Consistency-Informed Diffusion model for infant brain image Completion, which integrates the images from preceding and subsequent time points to guide a diffusion model for generating high-fidelity missing data. Our designed LoCI module can work on highly sparse sequences, relying solely on data from two temporal points. Despite wide separation and diversity between age time points, our approach can extract individualized developmental features while ensuring context-aware consistency. Our experiments on a large infant brain MR dataset demonstrate its effectiveness with consistent performance on missing infant brain MR completion even in big gap scenarios, aiding in better delineation of early developmental trajectories.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2405.07977 [pdf, other]

A Demographic-Conditioned Variational Autoencoder for fMRI Distribution Sampling and Removal of Confounds

Authors: Anton Orlichenko, Gang Qu, Ziyu Zhou, Anqi Liu, Hong-Wen Deng, Zhengming Ding, Julia M. Stephen, Tony W. Wilson, Vince D. Calhoun, Yu-Ping Wang

Abstract: Objective: fMRI and derived measures such as functional connectivity (FC) have been used to predict brain age, general fluid intelligence, psychiatric disease status, and preclinical neurodegenerative disease. However, it is not always clear that all demographic confounds, such as age, sex, and race, have been removed from fMRI data. Additionally, many fMRI datasets are restricted to authorized researchers, making dissemination of these valuable data sources challenging. Methods: We create a variational autoencoder (VAE)-based model, DemoVAE, to decorrelate fMRI features from demographics and generate high-quality synthetic fMRI data based on user-supplied demographics. We train and validate our model using two large, widely used datasets, the Philadelphia Neurodevelopmental Cohort (PNC) and Bipolar and Schizophrenia Network for Intermediate Phenotypes (BSNIP). Results: We find that DemoVAE recapitulates group differences in fMRI data while capturing the full breadth of individual variations. Significantly, we also find that most clinical and computerized battery fields that are correlated with fMRI data are not correlated with DemoVAE latents. The exceptions are several fields related to schizophrenia medication and symptom severity. Conclusion: Our model generates fMRI data that captures the full distribution of FC better than traditional VAE or GAN models. We also find that most prediction using fMRI data is dependent on correlation with, and prediction of, demographics. Significance: Our DemoVAE model allows for generation of high-quality synthetic data conditioned on subject demographics as well as the removal of the confounding effects of demographics. We identify that FC-based prediction tasks are highly influenced by demographic confounds.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 12 pages

arXiv:2405.07919 [pdf, other]

Exploring the Low-Pass Filtering Behavior in Image Super-Resolution

Authors: Haoyu Deng, Zijing Xu, Yule Duan, Xiao Wu, Wenjie Shu, Liang-Jian Deng

Abstract: Deep neural networks for image super-resolution (ISR) have shown significant advantages over traditional approaches such as interpolation. However, they are often criticized as 'black boxes' compared to traditional approaches with solid mathematical foundations. In this paper, we attempt to interpret the behavior of deep neural networks in ISR using theories from the field of signal processing. First, we report an intriguing phenomenon, referred to as 'the sinc phenomenon.' It occurs when an impulse input is fed to a neural network. Then, building on this observation, we propose a method named Hybrid Response Analysis (HyRA) to analyze the behavior of neural networks in ISR tasks. Specifically, HyRA decomposes a neural network into a parallel connection of a linear system and a non-linear system and demonstrates that the linear system functions as a low-pass filter while the non-linear system injects high-frequency information. Finally, to quantify the injected high-frequency information, we introduce a metric for image-to-image tasks called Frequency Spectrum Distribution Similarity (FSDS). FSDS reflects the distribution similarity of different frequency components and can capture nuances that traditional metrics may overlook. Code, videos and raw experimental results for this paper can be found at https://github.com/RisingEntropy/LPFInISR.

Submitted 23 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

Comments: Accepted by ICML 2024

arXiv:2405.05481 [pdf, other]

Achieving millisecond coherence fluxonium through overlap Josephson junctions

Authors: Fei Wang, Kannan Lu, Huijuan Zhan, Lu Ma, Feng Wu, Hantao Sun, Hao Deng, Yang Bai, Feng Bao, Xu Chang, Ran Gao, Xun Gao, Guicheng Gong, Lijuan Hu, Ruizi Hu, Honghong Ji, Xizheng Ma, Liyong Mao, Zhijun Song, Chengchun Tang, Hongcheng Wang, Tenghui Wang, Ziang Wang, Tian Xia, Hongxin Xu, et al. (10 additional authors not shown)

Abstract: Fluxonium qubits are recognized for their high coherence times and high operation fidelities, attributed to their unique design incorporating over 100 Josephson junctions per superconducting loop. However, this complexity poses significant fabrication challenges, particularly in achieving high yield and junction uniformity with traditional methods. Here, we introduce an overlap process for Josephson junction fabrication that achieves nearly 100% yield and maintains uniformity across a 2-inch wafer with less than 5% variation for the phase slip junction and less than 2% for the junction array. Our compact junction array design facilitates fluxonium qubits with energy relaxation times exceeding 1 millisecond at the flux frustration point, demonstrating consistency with state-of-the-art dielectric loss tangents and flux noise across multiple devices. This work suggests the scalability of high coherence fluxonium processors using CMOS-compatible processes, marking a significant step towards practical quantum computing.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.04782 [pdf, other]

Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection

Authors: Zhaoxiang Zhang, Hanqiu Deng, Jinan Bao, Xingyu Li

Abstract: Image anomaly detection has been a challenging task in the computer vision field. The advent of Vision-Language models, particularly the rise of CLIP-based frameworks, has opened new avenues for zero-shot anomaly detection. Recent studies have explored the use of CLIP by aligning images with normal and prompt descriptions. However, the exclusive dependence on textual guidance often falls short, highlighting the critical importance of additional visual references. In this work, we introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system. Our method processes pairs of images, utilizing each as a visual reference for the other, thereby enriching the inference process with visual context. This dual-image strategy markedly enhances both anomaly classification and localization performance. Furthermore, we have strengthened our model with a test-time adaptation module that incorporates synthesized anomalies to refine localization capabilities. Our approach significantly exploits the potential of vision-language joint anomaly detection and demonstrates comparable performance with current SOTA methods across various datasets.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.04318 [pdf, other]

doi 10.1103/PhysRevResearch.6.023117

Long-range magnetic order in CePdAl$_3$ enabled by orthorhombic deformation

Authors: M. Stekiel, P. Čermák, C. Franz, M. Meven, D. Legut, W. Simeth, U. B. Hansen, B. Fåk, S. Weber, R. Schönmann, V. Kumar, K. Nemkovski, H. Deng, A. Bauer, C. Pfleiderer, A. Schneidewind

Abstract: We investigate the effect of structural deformation on the magnetic properties of orthorhombic CePdAl$_3$ in relation to its tetragonal polymorph. Utilizing x-ray and neutron diffraction we establish that the crystal structure has the $Cmcm$ space group symmetry and exhibits pseudo-tetragonal twinning. According to density-functional calculations, the tetragonal-orthorhombic deformation mechanism has its grounds in a relatively small free enthalpy difference between the polymorphs, which allows either phase to be quenched and fully accounts for the twinned microstructure of the orthorhombic phase. Neutron diffraction measurements show that orthorhombic CePdAl$_3$ establishes long-range magnetic order below $T_\mathrm{N} = 5.29(5)$ K characterized by a collinear, antiferromagnetic arrangement of magnetic moments. Magnetic anisotropies of orthorhombic CePdAl$_3$ arise from strong spin-orbit coupling as evidenced by the crystal-field splitting of the $4f$ multiplet, fully characterised with neutron spectroscopy. We discuss the potential mechanism of frustration posed by antiferromagnetic interactions between nearest neighbours in the tetragonal phase, which hinders the formation of long-range magnetic order in tetragonal CePdAl$_3$. We propose that orthorhombic deformation releases the frustration and allows for long-range magnetic order.

Submitted 7 May, 2024; originally announced May 2024.

Comments: Finalized paper from the splitting of arxiv.org/abs/2106.08194v1

Journal ref: Phys. Rev. Research 6, 023117, (2024)

arXiv:2405.04309 [pdf, other]

Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling

Authors: Jiawei Shi, Hui Deng, Yuchao Dai

Abstract: Even though Non-rigid Structure-from-Motion (NRSfM) has been extensively studied and great progress has been made, there are still key challenges that hinder its broad real-world application: 1) the inherent motion/rotation ambiguity requires either explicit camera motion recovery with extra constraints or complex Procrustean Alignment; 2) existing low-rank modeling of the global shape can over-penalize drastic deformations in the 3D shape sequence. This paper proposes to resolve the above issues from a spatial-temporal modeling perspective. First, we propose a novel Temporally-smooth Procrustean Alignment module that estimates 3D deforming shapes and adjusts the camera motion by aligning the 3D shape sequence consecutively. Our new alignment module removes the requirement of a complex reference 3D shape during alignment, which is more conducive to non-isotropic deformation modeling. Second, we propose a spatial-weighted approach to enforce the low-rank constraint adaptively at different locations to better accommodate drastic spatially-variant deformation reconstruction. Our modeling outperforms existing low-rank based methods, and extensive experiments across different datasets validate the effectiveness of our method.

Submitted 23 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: Accepted by CVPR 2024; V2 adds new experiments

arXiv:2405.04294 [pdf, other]

Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework

Authors: Xiangpeng Wan, Haicheng Deng, Kai Zou, Shiqi Xu

Abstract: Structured finance, which involves restructuring diverse assets into securities like MBS, ABS, and CDOs, enhances capital market efficiency but presents significant due diligence challenges. This study explores the integration of artificial intelligence (AI) with traditional asset review processes to improve efficiency and accuracy in structured finance. Using both open-source and closed-source large language models (LLMs), we demonstrate that AI can automate the verification of information between loan applications and bank statements effectively. While closed-source models such as GPT-4 show superior performance, open-source models like LLAMA3 offer a cost-effective alternative. Dual-agent systems further increase accuracy, though this comes with higher operational costs. This research highlights AI's potential to minimize manual errors and streamline due diligence, suggesting a broader application of AI in financial document analysis and risk management.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.00245 [pdf, other]

Flexible multi-bunch-length operation for continuous-wave x-ray free-electron lasers

Authors: Zihan Zhu, Jiawei Yan, Hanxiang Yang, Duan Gu, Bart Faatz, Haixiao Deng, Qiang Gu

Abstract: X-ray free-electron lasers (XFELs) are cutting-edge instruments pivotal in a broad range of fields, providing high-power X-ray pulses with durations spanning from femtoseconds to attoseconds. One of the critical challenges in XFEL facilities is the simultaneous accommodation of diverse requirements for XFEL operation modes and photon properties across different undulator lines. This paper proposes a dipole-kicker combination in the bunch compressors to vary the electron bunch length for continuous-wave XFEL facilities driven by a superconducting linac. This method enables optimization of the electron bunch length on a per-bunch basis, tailored to the specific needs of each undulator. Through start-to-end simulations based on the parameters of the Shanghai high-repetition-rate XFEL and extreme light facility, we demonstrate the feasibility of this technique. The results show its effectiveness in enabling simultaneous operations of self-amplified spontaneous emission and externally seeded FEL across different undulator lines, ensuring optimal electron bunch compression for each undulator line.

Submitted 30 April, 2024; originally announced May 2024.

arXiv:2405.00236 [pdf, other]

STT: Stateful Tracking with Transformers for Autonomous Driving

Authors: Longlong Jing, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sangjin Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Tan, Weilong Yang, Congcong Li

Abstract: Tracking objects in three-dimensional space is critical for autonomous driving. To ensure safety while driving, the tracker must be able to reliably track objects across frames and accurately estimate their states such as velocity and acceleration in the present. Existing works frequently focus on the association task while either neglecting the model performance on state estimation or deploying complex heuristics to predict the states. In this paper, we propose STT, a Stateful Tracking model built with Transformers, that can consistently track objects in the scenes while also predicting their states accurately. STT consumes rich appearance, geometry, and motion signals through long term history of detections and is jointly optimized for both data association and state estimation tasks. Since the standard tracking metrics like MOTA and MOTP do not capture the combined performance of the two tasks in the wider spectrum of object states, we extend them with new metrics called S-MOTA and MOTPS that address this limitation. STT achieves competitive real-time performance on the Waymo Open Dataset.

Submitted 30 April, 2024; originally announced May 2024.

Comments: ICRA 2024

arXiv:2404.16522 [pdf, other]

A Deep Learning-Driven Pipeline for Differentiating Hypertrophic Cardiomyopathy from Cardiac Amyloidosis Using 2D Multi-View Echocardiography

Authors: Bo Peng, Xiaofeng Li, Xinyu Li, Zhenghan Wang, Hui Deng, Xiaoxian Luo, Lixue Yin, Hongmei Zhang

Abstract: Hypertrophic cardiomyopathy (HCM) and cardiac amyloidosis (CA) are both heart conditions that can progress to heart failure if untreated. They exhibit similar echocardiographic characteristics, often leading to diagnostic challenges. This paper introduces a novel multi-view deep learning approach that utilizes 2D echocardiography for differentiating between HCM and CA. The method begins by classifying 2D echocardiography data into five distinct echocardiographic views: apical 4-chamber, parasternal long axis of left ventricle, parasternal short axis at levels of the mitral valve, papillary muscle, and apex. It then extracts features of each view separately and combines the five features for disease classification. A total of 212 patients diagnosed with HCM and 30 patients diagnosed with CA, along with 200 individuals with normal cardiac function (Normal), were enrolled in this study from 2018 to 2022. This approach achieved a precision, recall of 0.905, and micro-F1 score of 0.904, demonstrating its effectiveness in accurately identifying HCM and CA using a multi-view analysis.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.13860 [pdf, other]

Distributional Black-Box Model Inversion Attack with Multi-Agent Reinforcement Learning

Authors: Huan Bao, Kaimin Wei, Yongdong Wu, Jin Qian, Robert H. Deng

Abstract: A Model Inversion (MI) attack based on Generative Adversarial Networks (GAN) aims to recover the private training data from complex deep learning models by searching codes in the latent space. However, such attacks merely search a deterministic latent space, so the latent code found is usually suboptimal. In addition, the existing distributional MI schemes assume that an attacker can access the structures and parameters of the target model, which is not always viable in practice. To overcome the above shortcomings, this paper proposes a novel Distributional Black-Box Model Inversion (DBB-MI) attack by constructing the probabilistic latent space for searching the target privacy data. Specifically, DBB-MI does not need the target model parameters or specialized GAN training. Instead, it finds the latent probability distribution by combining the output of the target model with multi-agent reinforcement learning techniques. Then, it randomly chooses latent codes from the latent probability distribution for recovering the private data. As the latent probability distribution closely aligns with the target privacy data in latent space, the recovered data will leak the privacy of training samples of the target model significantly. Abundant experiments conducted on diverse datasets and networks show that the present DBB-MI has better performance than the state of the art in attack accuracy, K-nearest neighbor feature distance, and Peak Signal-to-Noise Ratio.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.11206 [pdf, other]

Prompt-tuning for Clickbait Detection via Text Summarization

Authors: Haoxiang Deng, Yi Zhu, Ye Wang, Jipeng Qiang, Yunhao Yuan, Yun Li, Runmei Zhang

Abstract: Clickbaits are surprising social posts or deceptive news headlines that attempt to lure users for more clicks, and they have been posted at unprecedented rates for more profit or commercial revenue. The spread of clickbait has significant negative impacts on users, exposing them to misleading content or even click-jacking attacks. Different from fake news, the crucial problem in clickbait detection is determining whether the headline matches the corresponding content. Most existing methods compute the semantic similarity between the headlines and contents for detecting clickbait. However, due to significant differences in length and semantic features between headlines and contents, directly calculated semantic similarity often fails to summarize the relationship between them. To address this problem, we propose a prompt-tuning method for clickbait detection via text summarization in this paper: text summarization is introduced to summarize the contents, and clickbait detection is performed based on the similarity between the generated summary and the contents. Specifically, we first introduce a two-stage text summarization model to produce high-quality news summaries based on pre-trained language models, and then both the headlines and the newly generated summaries are incorporated as the inputs for prompt-tuning. Additionally, a variety of strategies are applied to incorporate external knowledge for improving the performance of clickbait detection. The extensive experiments on well-known clickbait detection datasets demonstrate that our method achieved state-of-the-art performance.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.08750 [pdf, other]

FastLogAD: Log Anomaly Detection with Mask-Guided Pseudo Anomaly Generation and Discrimination

Authors: Yifei Lin, Hanqiu Deng, Xingyu Li

Abstract: Nowadays large computers extensively output logs to record the runtime status, and it has become crucial to identify any suspicious or malicious activities from the information provided by the real-time logs. Thus, fast log anomaly detection is a necessary task to be implemented for automating the infeasible manual detection. Most of the existing unsupervised methods are trained only on normal log data, but they usually require either additional abnormal data for hyperparameter selection or auxiliary datasets for discriminative model optimization. In this paper, aiming for a highly effective discriminative model that enables rapid anomaly detection, we propose FastLogAD, a generator-discriminator framework trained to exhibit the capability of generating pseudo-abnormal logs through the Mask-Guided Anomaly Generation (MGAG) model and efficiently identifying the anomalous logs via the Discriminative Abnormality Separation (DAS) model. Particularly, pseudo-abnormal logs are generated by replacing randomly masked tokens in a normal sequence with unlikely candidates. During the discriminative stage, FastLogAD learns a distinct separation between normal and pseudo-abnormal samples based on their embedding norms, allowing the selection of a threshold without exposure to any test data and achieving competitive performance. Extensive experiments on several common benchmarks show that our proposed FastLogAD outperforms existing anomaly detection approaches. Furthermore, FastLogAD achieves at least a 10x speed increase in anomaly detection over prior work. Our implementation is available at https://github.com/YifeiLin0226/FastLogAD.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 10 pages

arXiv:2404.07932 [pdf, other]

FusionMamba: Efficient Image Fusion with State Space Model

Authors: Siran Peng, Xiangyu Zhu, Haoyu Deng, Zhen Lei, Liang-Jian Deng

Abstract: Image fusion aims to generate a high-resolution multi/hyper-spectral image by combining a high-resolution image with limited spectral information and a low-resolution image with abundant spectral data. Current deep learning (DL)-based methods for image fusion primarily rely on CNNs or Transformers to extract features and merge different types of data. While CNNs are efficient, their receptive fields are limited, restricting their capacity to capture global context. Conversely, Transformers excel at learning global information but are hindered by their quadratic complexity. Fortunately, recent advancements in the State Space Model (SSM), particularly Mamba, offer a promising solution to this issue by enabling global awareness with linear complexity. However, there have been few attempts to explore the potential of the SSM in information fusion, which is a crucial ability in domains like image fusion. Therefore, we propose FusionMamba, an innovative method for efficient image fusion. Our contributions mainly focus on two aspects. Firstly, recognizing that images from different sources possess distinct properties, we incorporate Mamba blocks into two U-shaped networks, presenting a novel architecture that extracts spatial and spectral features in an efficient, independent, and hierarchical manner. Secondly, to effectively combine spatial and spectral information, we extend the Mamba block to accommodate dual inputs. This expansion leads to the creation of a new module called the FusionMamba block, which outperforms existing fusion techniques such as concatenation and cross-attention. We conduct a series of experiments on five datasets related to three image fusion tasks. The quantitative and qualitative evaluation results demonstrate that our method achieves SOTA performance, underscoring the superiority of FusionMamba. The code is available at https://github.com/PSRben/FusionMamba.

Submitted 10 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07833 [pdf]

Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution

Authors: Handi Deng, Yucheng Zhou, Jiaxuan Xiang, Liujie Gu, Yan Luo, Hai Feng, Mingyuan Liu, Cheng Ma

Abstract: Foundation models have rapidly evolved and have achieved significant accomplishments in computer vision tasks. Specifically, the prompt mechanism conveniently allows users to integrate image prior information into the model, making it possible to apply models without any training. Therefore, we propose a method based on foundation models and zero training to solve the tasks of photoacoustic (PA) image segmentation. We employed the segment anything model (SAM) by setting simple prompts and integrating the model's outputs with prior knowledge of the imaged objects to accomplish various tasks, including: (1) removing the skin signal in three-dimensional PA image rendering; (2) dual speed-of-sound reconstruction; and (3) segmentation of finger blood vessels. Through these demonstrations, we have concluded that deep learning can be directly applied in PA imaging without the requirement for network design and training. This potentially allows for a hands-on, convenient approach to achieving efficient and accurate segmentation of PA images. This letter serves as a comprehensive tutorial, facilitating the mastery of the technique through the provision of code and sample datasets.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07543 [pdf, other]

Content-Adaptive Non-Local Convolution for Remote Sensing Pansharpening

Authors: Yule Duan, Xiao Wu, Haoyu Deng, Liang-Jian Deng

Abstract: Currently, machine learning-based methods for remote sensing pansharpening have progressed rapidly. However, existing pansharpening methods often do not fully exploit differentiating regional information in non-local spaces, thereby limiting the effectiveness of the methods and resulting in redundant learning parameters. In this paper, we introduce a so-called content-adaptive non-local convolution (CANConv), a novel method tailored for remote sensing image pansharpening. Specifically, CANConv employs adaptive convolution, ensuring spatial adaptability, and incorporates non-local self-similarity through the similarity relationship partition (SRP) and the partition-wise adaptive convolution (PWAC) sub-modules. Furthermore, we also propose a corresponding network architecture, called CANNet, which mainly utilizes the multi-scale self-similarity. Extensive experiments demonstrate the superior performance of CANConv, compared with recent promising fusion methods. Besides, we substantiate the method's effectiveness through visualization, ablation experiments, and comparison with existing methods on multiple test sets. The source code is publicly available at https://github.com/duanyll/CANConv.

Submitted 11 April, 2024; originally announced April 2024.

Comments: Accepted by CVPR 2024

arXiv:2404.07020 [pdf, other]

The NANOGrav 15 yr Data Set: Looking for Signs of Discreteness in the Gravitational-wave Background

Authors: Gabriella Agazie, Paul T. Baker, Bence Bécsy, Laura Blecha, Adam Brazier, Paul R. Brook, Lucas Brown, Sarah Burke-Spolaor, J. Andrew Casey-Clyde, Maria Charisi, Shami Chatterjee, Tyler Cohen, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Heling Deng, Timothy Dolch, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Gabriel E. Freedman, Nate Garver-Daniels , et al. (58 additional authors not shown)

Abstract: The cosmic merger history of supermassive black hole binaries (SMBHBs) is expected to produce a low-frequency gravitational wave background (GWB). Here we investigate how signs of the discrete nature of this GWB can manifest in pulsar timing arrays through excursions from, and breaks in, the expected $f_{\mathrm{GW}}^{-2/3}$ power-law of the GWB strain spectrum. To do this, we create a semi-analytic SMBHB population model, fit to NANOGrav's 15 yr GWB amplitude, and with 1,000 realizations we study the populations' characteristic strain and residual spectra. Comparing our models to the NANOGrav 15 yr spectrum, we find two interesting excursions from the power-law. The first, at $2 \; \mathrm{nHz}$, is below our GWB realizations with $p$-value significance $p = 0.05$ to $0.06$ ($\approx 1.8\sigma$ to $1.9\sigma$). The second, at $16 \; \mathrm{nHz}$, is above our GWB realizations with $p = 0.04$ to $0.15$ ($\approx 1.4\sigma$ to $2.1\sigma$). We explore the properties of a loud SMBHB which could cause such an excursion. Our simulations also show that the expected number of SMBHBs decreases by three orders of magnitude, from $\sim 10^6$ to $\sim 10^3$, between $2\; \mathrm{nHz}$ and $20 \; \mathrm{nHz}$. This causes a break in the strain spectrum as the stochasticity of the background breaks down at $26^{+28}_{-19} \; \mathrm{nHz}$, consistent with predictions pre-dating GWB measurements. The diminished GWB signal from SMBHBs at frequencies above the $26 \; \mathrm{nHz}$ break opens a window for PTAs to detect continuous GWs from individual SMBHBs or GWs from the early universe.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 10 pages, 8 figures, 1 appendix, submitted to ApJ

arXiv:2404.04897 [pdf, other]

Electronic origin of solute effects on the mobility of screw dislocation in bcc molybdenum

Authors: Kangzhi Zhou, Jiajun Feng, Ziran Liu, Huiqiu Deng, Lixia Jia, Xinfu He

Abstract: In body-centered cubic (bcc) metals such as molybdenum, screw dislocations often exhibit non-Schmid behavior, moving in directions unpredicted by the Schmid law. The mobility of these dislocations is notably influenced by the presence of solute atoms within the alloy matrix. In this study, employing first-principles calculations, we delve into the electronic origins of these influences. Initially, we construct both single atomic column and triple atomic column models to simulate the formation of screw dislocations with solute atoms. Our investigation reveals that tantalum (Ta) and tungsten (W) increase the formation energy of solute-dislocation complexes, in contrast to osmium (Os), iridium (Ir), and platinum (Pt). Subsequently, employing a comprehensive screw dislocation dipole model under shear deformation, we explore the combined effects of solute atoms and deformation on dislocation core movement. Our findings demonstrate that Ta and W, positioned as first nearest neighbors, reduce the stress required to move dislocation cores away from corresponding dislocation dipoles. Conversely, Os, Ir, and Pt exhibit an attractive effect on dislocation cores, lowering the energy barrier for screw dislocation formation and enticing dislocation cores towards these solute atoms.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.00996 [pdf]

Charge density wave without long-range structural modulation in canted antiferromagnetic kagome FeGe

Authors: Chenfei Shi, Hanbin Deng, Surya Rohith Kotla, Yi Liu, Sitaram Ramakrishnan, Claudio Eisele, Harshit Agarwal, Leila Noohinejad, Ji-Yong Liu, Tianyu Yang, Guowei Liu, Bishal Baran Maity, Qi Wang, Zhaodi Lin, Baojuan Kang, Wanting Yang, Yongchang Li, Zhihua Yang, Yuke Li, Yanpeng Qi, Arumugam Thamizhavel, Wei Ren, Guang-Han Cao, Jia-Xin Yin, Sander van Smaalen , et al. (2 additional authors not shown)

Abstract: Strongly correlated electron systems with a kagome lattice can host abundant exotic quantum states such as superconductivity and spin/charge density waves (CDW) due to the complicated interactions between different degrees of freedom in the framework of a unique two-dimensional geometrically frustrated lattice structure. Recently, successive orders of A-type antiferromagnetism (AFM), $2\times2\times2$ CDW and canted double-cone AFM have been manifested upon cooling in magnetic kagome FeGe. However, the mechanism of the CDW order and its interaction with magnetism are presently enigmatic at best. Here we investigate the evolution of CDW order with temperature across the spin canting transition in FeGe by single-crystal x-ray diffraction. Refinements of its modulated structure are presented using the superspace approach. Interestingly, the superlattice reflections originating from CDW-induced long-range structural modulation become extremely weak after the system enters the canted AFM while a $2\times2$ CDW in the $ab$ plane persists as a long-range order demonstrated by strong electronic modulation in the d$I$/d$V$ map of scanning tunneling spectroscopy. We discovered a novel CDW order without long-range structural modulation in FeGe, probably because of the competition between CDW and canted AFM in determining the underlying crystal structure. In addition, occupational modulations of Ge1 atoms located in the kagome plane and displacive modulations of all the atoms were extracted from the refinements, confirming the existence of Ge atom dimerization along the $c$ axis as the major distortion and indicating a dynamic transformation between different CDW domains.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 22 pages, 6 figures. Comments on the manuscript are welcome

arXiv:2404.00229 [pdf, other]

Rotating-modulated Higher-Order Topological States in a Split-ring Photonic Insulator

Authors: Hui Chang Li, Xiang Zhou, Hai Lin Chi, Wen Wen Wang, Yun Shen, Xiao Hua Deng

Abstract: The emerging field of topology has brought device effects to a new level. Higher-order topological insulators (HOTIs) go beyond traditional descriptions of bulk-edge correspondence, broadening the understanding of topologically insulating phases. In this paper, a second-order split-ring photonic crystal (SSPC) with zero-dimensional (0D) corner states and one-dimensional (1D) edge states is proposed. Based on the coupling strength determined by the opening direction between the split-rings, the electronic transition strength of the electronic system is imitated, and the topological trivial and non-trivial transformation of the topological two-dimensional (2D) SSH model are realized by using the rotating split-ring lattice. Theory and simulation find that SSPC has non-trivial topological edge states that can be quantified by bulk polarization. As the opening direction of the split-rings gradually changes within one period, there will be transitions between four different topological polarizations of the lowest energy bands, which can be conveniently used to achieve transitions between different topological phases. Our research can be extended to higher dimensions and broaden research paths for higher-order photonic topological insulators and semimetals.

Submitted 1 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures

arXiv:2403.16570 [pdf, other]

Spectropolarimetry of Fraunhofer lines in local upper solar atmosphere

Authors: Z. Q. Qu, L. Chang, G. T. Dun, X. M. Cheng, C. Fang, Z. Xu, D. Yuan, L. H. Deng, X. Y. Zhang

Abstract: Spectropolarimetric results of Fraunhofer lines between 516.3nm and 532.6nm are presented in local upper solar chromosphere, transition zone and inner corona below a height of about 0.04 solar radius above the solar limb. The data were acquired on Nov. 3, 2013 during a total solar eclipse in Gabon by the prototype Fiber Arrayed Solar Optical Telescope (FASOT). It is found that the polarization amplitudes of the Fraunhofer lines in these layers depend strongly on specific spectral lines. Fraunhofer line at MgI$b_{1}$518.4nm can have a polarization amplitude up to 0.36$\%$ with respect to the continuum polarization level, while the polarizations of some lines like FeI/CrI524.7nm and FeI525.0nm are often under the detection limit 6.0$\times 10^{-4}$. The polarizations of the Fraunhofer lines, like the emission lines and the continuum, increase with height as an overall trend. The fractional linear polarization amplitudes of inner F-corona can be close to those of inner E-corona, and in general larger than those of inner K-corona. Rotation of the polarization direction of Fraunhofer lines is often accompanied by variations in their polarization amplitudes and profile shapes. It is also judged from these polarimetric properties, along with evidences, that neutral atoms exist in these atmospheric layers. Thus the inner F-corona described here is induced by the neutral atoms, and the entropy of the inner corona evaluated becomes larger than those in the underneath layers due to more microstates found.

Submitted 25 March, 2024; originally announced March 2024.

Comments: also Submitted to ApJ

arXiv:2403.15432 [pdf, other]

BRIEDGE: EEG-Adaptive Edge AI for Multi-Brain to Multi-Robot Interaction

Authors: Jinhui Ouyang, Mingzhu Wu, Xinglin Li, Hanhui Deng, Di Wu

Abstract: Recent advances in EEG-based BCI technologies have revealed the potential of brain-to-robot collaboration through the integration of sensing, computing, communication, and control. In this paper, we present BRIEDGE as an end-to-end system for multi-brain to multi-robot interaction through an EEG-adaptive neural network and an encoding-decoding communication framework, as illustrated in Fig.1. As depicted, the edge mobile server or edge portable server will collect EEG data from the users and utilize the EEG-adaptive neural network to identify the users' intentions. The encoding-decoding communication framework then encodes the EEG-based semantic information and decodes it into commands in the process of data transmission. To better extract the joint features of heterogeneous EEG data as well as enhance classification accuracy, BRIEDGE introduces an informer-based ProbSparse self-attention mechanism. Meanwhile, parallel and secure transmissions for multi-user multi-task scenarios under physical channels are addressed by dynamic autoencoder and autodecoder communications. From mobile computing and edge AI perspectives, model compression schemes composed of pruning, weight sharing, and quantization are also used to deploy lightweight EEG-adaptive models running on both transmitter and receiver sides. Based on the effectiveness of these components, a code map representing various commands enables multiple users to control multiple intelligent agents concurrently. Our experiments in comparison with state-of-the-art works show that BRIEDGE achieves the best classification accuracy of heterogeneous EEG data, and more stable performance under noisy environments.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.13897 [pdf, other]

Large Exciton Binding Energy in the Bulk van der Waals Magnet CrSBr

Authors: Shane Smolenski, Ming Wen, Qiuyang Li, Eoghan Downey, Adam Alfrey, Wenhao Liu, Aswin L. N. Kondusamy, Aaron Bostwick, Chris Jozwiak, Eli Rotenberg, Liuyan Zhao, Hui Deng, Bing Lv, Dominika Zgid, Emanuel Gull, Na Hyun Jo

Abstract: Excitons, bound electron-hole pairs, influence the optical properties in strongly interacting solid state systems. Excitons and their associated many-body physics are typically most stable and pronounced in monolayer materials. Bulk systems with large exciton binding energies, on the other hand, are rare and the mechanisms driving their stability are still relatively unexplored. Here, we report an exceptionally large exciton binding energy in single crystals of the bulk van der Waals antiferromagnet CrSBr. Utilizing state-of-the-art angle-resolved photoemission spectroscopy and self-consistent ab-initio GW calculations, we present direct spectroscopic evidence that robust electronic and structural anisotropy can significantly amplify the exciton binding energy within bulk crystals. Furthermore, the application of a vertical electric field enables broad tunability of the optical and electronic properties. Our results indicate that CrSBr is a promising material for the study of the role of anisotropy in strongly interacting bulk systems and for the development of exciton-based optoelectronics.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.12552 [pdf, other]

M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving

Authors: Dongyang Xu, Haokun Li, Qingfan Wang, Ziying Song, Lei Chen, Hanming Deng

Abstract: End-to-end autonomous driving has witnessed remarkable progress. However, the extensive deployment of autonomous vehicles has yet to be realized, primarily due to 1) inefficient multi-modal environment perception: how to integrate data from multi-modal sensors more efficiently; 2) non-human-like scene understanding: how to effectively locate and predict critical risky agents in traffic scenarios like an experienced driver. To overcome these challenges, in this paper, we propose a Multi-Modal fusion transformer incorporating Driver Attention (M2DA) for autonomous driving. To better fuse multi-modal data and achieve higher alignment between different modalities, a novel Lidar-Vision-Attention-based Fusion (LVAFusion) module is proposed. By incorporating driver attention, we give autonomous vehicles a human-like scene understanding ability, so that they can precisely identify crucial areas within complex scenarios and ensure safety. We conduct experiments on the CARLA simulator and achieve state-of-the-art performance with less data in closed-loop benchmarks. Source codes are available at https://anonymous.4open.science/r/M2DA-4772.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.06784 [pdf, ps, other]

Uniqueness of the critical points of solutions to two kinds of semilinear elliptic equations in higher dimensional domains

Authors: Haiyun Deng, Jingwen Ji, Feida Jiang, Jiabin Yin

Abstract: In this paper, we provide an affirmative answer to the conjecture A for bounded simple rotationally symmetric domains $Ω\subset \mathbb{R}^n(n\geq 3)$ along $x_n$ axis. Precisely, we use a new simple argument to study the symmetry of positive solutions for two kinds of semilinear elliptic equations. To do this, when $f(\cdot,s)$ is strictly convex with respect to $s$, we show that the nonnegativity of the first eigenvalue of the corresponding linearized operator in somehow symmetric domains is a sufficient condition for the symmetry of $u$. Moreover, we prove the uniqueness of critical points of a positive solution to semilinear elliptic equation $-\triangle u=f(\cdot,u)$ with zero Dirichlet boundary condition for simple rotationally symmetric domains in $\mathbb{R}^n$ by continuity method and a variety of maximum principles.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 18 pages

MSC Class: 35B38; 35J05; 35J25

arXiv:2403.05649 [pdf]

Reconfigurable inverse designed phase-change photonics

Authors: Changming Wu, Ziyu Jiao, Haoqin Deng, Yi-Siou Huang, Heshan Yu, Ichiro Takeuchi, Carlos A. Ríos Ocampo, Mo Li

Abstract: Integrated photonic network-on-chip (NoC) fabrics can provide multiple terabits per second (Tbps) bandwidth for high-performance computing servers supporting the latest artificial intelligence. To maximize the utility of the photonic hardware, these optical networks can multiplex data streams in multi-dimensional channels encoded in optical wavelengths, modes, and polarization states. A generic photonic platform that can be reconfigured to implement functionalities in those dimensions is highly desirable to streamline the design and manufacturing of integrated photonic networks. Here, we demonstrate a multi-functional photonic device using phase-change material Sb2Se3 that can be reconfigured from a wavelength-division demultiplexer to a mode-division demultiplexer. The reconfiguration is achieved by direct laser writing of phase patterns optimized with the inverse design technique. The demonstrated programmable phase-change photonic devices hold immense promise for a wide range of photonic applications requiring adaptable functionalities, including on-chip and in-package optical interconnects, optical signal processing, photonic tensor cores, and future optical computing for artificial intelligence.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 12 pages, 4 figures

arXiv:2403.02361 [pdf]

Renal function changes in chronic hepatitis B patients

Authors: Jinhua Zhao, Lili Wu, Xiaoan Yang, Zhilaing Gao, Hong Deng

Abstract: The best way to treat chronic hepatitis B is with pegylated interferon alone or with oral antiviral drugs. There is limited research comparing the renal safety of entecavir and tenofovir when used with pegylated interferon. This study will compare changes in renal function in chronic hepatitis B patients treated with pegylated interferon and either entecavir or tenofovir. The study included a cohort of 836 patients with chronic hepatitis B (CHB) who received treatment with pegylated interferon (IFN) either alone or in combination with entecavir (ETV) and tenofovir (TDF) between the years 2018 and 2021. Of these patients, 713 were included in a matched analysis comparing outcomes between those who were cured and those who were uncured, while 123 patients received IFN alone as a control group for comparison with the ETV and TDF treatment groups. The primary outcome measured was the change in renal function, specifically estimated glomerular filtration rate (eGFR), cystatin C (CysC), and inorganic phosphorus (IPHOS). Patients were categorized into stage 1 or stage 2 based on a baseline eGFR of less than 90 ml/min/m^2. Results: 125 CHB patients were matched 1:1 in both the combined treatment and cured groups. Baseline eGFR, CysC, and IPHOS levels were similar between the groups. Renal function in stage 1 and stage 2 groups showed a decreasing trend at 48 weeks after an initial increase. Correlation analysis showed significant relationships between changes in ALT and eGFR values at 12 weeks in both non-cured and cured groups. Conclusions: Over the 48-week duration of combined treatment in patients with chronic hepatitis B (CHB), it was found that both Tenofovir Disoproxil Fumarate (TDF) and Entecavir (ETV) did not lead to an increase in renal injury.

Submitted 4 March, 2024; originally announced March 2024.

Comments: Over the 48-week duration of combined treatment in patients with chronic hepatitis B (CHB), it was found that both Tenofovir Disoproxil Fumarate (TDF) and Entecavir (ETV) did not lead to an increase in renal injury

ACM Class: G.1

arXiv:2403.01774 [pdf, other]

WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations

Authors: Haolin Deng, Chang Wang, Xin Li, Dezhang Yuan, Junlang Zhan, Tianhua Zhou, Jin Ma, Jun Gao, Ruifeng Xu

Abstract: Enhancing the attribution in large language models (LLMs) is a crucial task. One feasible approach is to enable LLMs to cite external sources that support their generations. However, existing datasets and evaluation methods in this domain still exhibit notable limitations. In this work, we formulate the task of attributed query-focused summarization (AQFS) and present WebCiteS, a Chinese dataset featuring 7k human-annotated summaries with citations. WebCiteS derives from real-world user queries and web search results, offering a valuable resource for model training and evaluation. Prior works in attribution evaluation do not differentiate between groundedness errors and citation errors. They also fall short in automatically verifying sentences that draw partial support from multiple sources. We tackle these issues by developing detailed metrics and enabling the automatic evaluator to decompose the sentences into sub-claims for fine-grained verification. Our comprehensive evaluation of both open-source and proprietary models on WebCiteS highlights the challenge LLMs face in correctly citing sources, underscoring the necessity for further improvement. The dataset and code will be open-sourced to facilitate further research in this crucial field.

Submitted 28 May, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: 20 pages, 7 figures, accepted to ACL 2024 main conference

arXiv:2403.01458 [pdf, ps, other]

Semi-vortex solitons and their excited states in spin-orbit-coupled binary bosonic condensates

Authors: Haiming Deng, Jinqing Li, Zhaopin Chen, Yaohui Liu, Dong Liu, Chunzhi Jiang, Chao Kong, Boris A. Malomed

Abstract: It is known that two-dimensional two-component fundamental solitons of the semi-vortex (SV) type, with vorticities $(s_{+},s_{-})=(0,1)$ in their components, are stable ground states (GSs) in the spin-orbit-coupled (SOC) binary Bose-Einstein condensate with the contact self-attraction acting in both components, in spite of the possibility of the critical collapse in the system. However, excited states (ESs) of the SV solitons, with the vorticity set $(s_{+},s_{-})=(S_{+},S_{+}+1)$ and $S_{+}=1,2,3,...$, are unstable in the same system. We construct ESs of SV solitons in the SOC system with opposite signs of the self-interaction in the two components. The main finding is stability of the ES-SV solitons, with the extra vorticity (at least) up to $S_{+}=6$. The threshold value of the norm for the onset of the critical collapse, $N_{\mathrm{thr}}$, in these excited states is higher than the commonly known critical value, $N_{c}\approx 5.85$, associated with the single-component Townes solitons, $N_{\mathrm{thr}}$ increasing with the growth of $S_{+}$. A velocity interval for stable motion of the GS-SV solitons is found too. The results suggest a solution for the challenging problem of the creation of stable vortex solitons with high topological charges.

Submitted 12 May, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

Comments: To be published in Physical Review E

arXiv:2403.01120 [pdf]

Symmetry-breaking-dependent electronic structures and strain regulation in ReSeS monolayer

Authors: Texture Lin, J. W. Ma, H. C. Deng, L. Z. Liu

Abstract: Electronic devices for information storage and processing can be further optimized by introducing the degree of freedom of anisotropy, which is strongly dependent on their structural symmetry. Herein, a ReSeS monolayer with asymmetrical double faces is proposed to disclose the anisotropic electronic structure. Meanwhile, an infrared fingerprint based on the lattice vibration is also adopted to demonstrate the symmetry-breaking-dependent structural transformation. First-principles calculations demonstrate that the geometry deformation will induce the reconstruction of the electronic structure. Furthermore, both the dynamic properties of carriers and the spectroscopic response can be regulated by external strain and display anisotropic behaviors. Our idea provides threads for designing new regulable optoelectronic devices.

Submitted 2 March, 2024; originally announced March 2024.

arXiv:2403.00862 [pdf, other]

NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism

Authors: Miao Li, Ming-Bin Chen, Bo Tang, Shengbin Hou, Pengyu Wang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Keming Mao, Peng Cheng, Yi Luo

Abstract: We present NewsBench, a novel evaluation framework to systematically assess the capabilities of Large Language Models (LLMs) for editorial capabilities in Chinese journalism. Our constructed benchmark dataset is focused on four facets of writing proficiency and six facets of safety adherence, and it comprises 1,267 manually and carefully designed test samples in the types of multiple choice questions and short answer questions for five editorial tasks in 24 news domains. To measure performances, we propose different GPT-4 based automatic evaluation protocols to assess LLM generations for short answer questions in terms of writing proficiency and safety adherence, and both are validated by the high correlations with human evaluations. Based on the systematic evaluation framework, we conduct a comprehensive analysis of ten popular LLMs which can handle Chinese. The experimental results highlight GPT-4 and ERNIE Bot as top performers, yet reveal a relative deficiency in journalistic safety adherence in creative writing tasks. Our findings also underscore the need for enhanced ethical guidance in machine-generated journalistic content, marking a step forward in aligning LLMs with journalistic standards and safety considerations.

Submitted 4 June, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

Comments: Long paper, ACL 2024 Main

arXiv:2402.17091 [pdf, other]

Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization

Authors: Hanqiu Deng, Xingyu Li

Abstract: Visual anomaly detection is a challenging open-set task aimed at identifying unknown anomalous patterns while modeling normal data. The knowledge distillation paradigm has shown remarkable performance in one-class anomaly detection by leveraging teacher-student network feature comparisons. However, extending this paradigm to multi-class anomaly detection introduces novel scalability challenges. In this study, we address the significant performance degradation observed in previous teacher-student models when applied to multi-class anomaly detection, which we identify as resulting from cross-class interference. To tackle this issue, we introduce a novel approach known as Structural Teacher-Student Normality Learning (SNL): (1) We propose spatial-channel distillation and intra-&inter-affinity distillation techniques to measure structural distance between the teacher and student networks. (2) We introduce a central residual aggregation module (CRAM) to encapsulate the normal representation space of the student network. We evaluate our proposed approach on two anomaly detection datasets, MVTecAD and VisA. Our method surpasses the state-of-the-art distillation-based algorithms by a significant margin of 3.9% and 1.5% on MVTecAD and 1.2% and 2.5% on VisA in the multi-class anomaly detection and localization tasks, respectively. Furthermore, our algorithm outperforms the current state-of-the-art unified models on both MVTecAD and VisA.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.15823 [pdf, other]

Parameter-efficient Prompt Learning for 3D Point Cloud Understanding

Authors: Hongyu Sun, Yongcai Wang, Wang Chen, Haoran Deng, Deying Li

Abstract: This paper presents a parameter-efficient prompt tuning method, named PPT, to adapt a large multi-modal model for 3D point cloud understanding. Existing strategies are quite expensive in computation and storage, and depend on time-consuming prompt engineering. We address the problems from three aspects. Firstly, a PromptLearner module is devised to replace hand-crafted prompts with learnable contexts to automate the prompt tuning process. Then, we lock the pre-trained backbone instead of adopting the full fine-tuning paradigm to substantially improve the parameter efficiency. Finally, a lightweight PointAdapter module is arranged near target tasks to enhance prompt tuning for 3D point cloud understanding. Comprehensive experiments are conducted to demonstrate the superior parameter and data efficiency of the proposed method. Meanwhile, we obtain new records on 4 public datasets and multiple 3D tasks, i.e., point cloud recognition, few-shot learning, and part segmentation. The implementation is available at https://github.com/auniquesun/PPT.

Submitted 24 February, 2024; originally announced February 2024.

Comments: 9 pages, 5 figures, 6 tables; accepted by ICRA 2024

arXiv:2402.15805 [pdf, other]

Distinguishable-particle Glassy Crystal: the simplest molecular model of glass

Authors: Leo S. I. Lam, Gautham Gopinath, Zichen Zhao, Shuling Wang, Chun-Shing Lee, Hai-Yao Deng, Feng Wang, Yilong Han, Cho-Tung Yip, Chi-Hang Lam

Abstract: The nature of glassy dynamics and the glass transition are long-standing problems under active debate. In the presence of a structural disorder widely believed to be an essential characteristic of structural glass, identifying and understanding key dynamical behaviors are very challenging. In this work, we demonstrate that an energetic disorder, which usually results from a structural disorder, is instead a more essential feature of glass. Specifically, we develop a distinguishable-particle glassy crystal (DPGC) in which particles are ordered in a face-centered cubic lattice and follow particle-dependent random interactions, leading to an energetic disorder in the particle configuration space. Molecular dynamics simulations in the presence of vacancy-induced particle diffusion show typical glassy behaviors. A unique feature of this molecular model is the knowledge of the complete set of inherent structures with easily calculable free energies, implying a well-understood potential energy landscape. Due to its simplicity, the study of the DPGC provides a promising direction to unlock the mysteries of glass.

Submitted 24 February, 2024; originally announced February 2024.

arXiv:2402.14518 [pdf, other]

Bulk Boundary Paradox in the Surface Reconstructed Magnetic Weyl Semimetal NdAlSi

Authors: Cong Li, Jianfeng Zhang, Hongxiong Liu, Wanyu Chen, Guowei Liu, Hanbin Deng, Craig Polley, Balasubramanian Thiagarajan, Timur Kim, Jiaxin Yin, Youguo Shi, Tao Xiang, Oscar Tjernberg

Abstract: The bulk boundary correspondence in the context of Weyl semimetals is a fundamental topological principle that establishes a connection between the bulk properties of the material and the emergence of specific surface states. In Weyl semimetals, the bulk boundary correspondence is manifested by the presence of surface Fermi arcs connecting pairs of Weyl nodes with opposite chirality. Here we demonstrate that this bulk boundary correspondence is challenged in the case of the surface selectively reconstructed noncentrosymmetric magnetic Weyl semimetal NdAlSi. By comparing angle-resolved photoemission spectroscopy measurements with surface projected density functional theory calculations and scanning tunneling microscope measurements, the existence of surface selective spontaneous reconstruction is demonstrated. The surface reconstruction in NdAlSi not only leads to the reconstruction of the surface Fermi arcs, but also generates new surface Fermi arcs that do not connect corresponding Weyl nodes. This observation challenges the conventional view of the bulk boundary correspondence in Weyl semimetals.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 18 pages, 4 figures

arXiv:2402.11210 [pdf, other]

doi 10.1039/D4SM00221K

Surface mobility gradient and emergent facilitation in glassy films

Authors: Qiang Zhai, Xin-Yuan Gao, Chun-Shing Lee, Chin-Yuan Ong, Ke Yan, Hai-Yao Deng, Sen Yang, Chi-Hang Lam

Abstract: Confining glassy polymers into films can substantially modify their local and film-averaged properties. We present a lattice model of film geometry with void-mediated facilitation behaviors but free from any elasticity effect. We analyze the spatially varying viscosity to delineate the transport property of glassy films. The film mobility measurements reported by [Yang et al., Science, 2010, 328, 1676] are successfully reproduced. The flow exhibits a crossover from simple viscous flow to a surface-dominated regime as temperature decreases. The propagation of a highly mobile front induced by the free surface is visualized in real space. Our approach provides a microscopic treatment of the observed glassy phenomena.

Submitted 17 February, 2024; originally announced February 2024.

arXiv:2402.11130 [pdf, other]

Depth-dependent study of time-reversal symmetry-breaking in the kagome superconductor $A$V$_{3}$Sb$_{5}$

Authors: J. N. Graham, C. Mielke III, D. Das, T. Morresi, V. Sazgari, A. Suter, T. Prokscha, H. Deng, R. Khasanov, S. D. Wilson, A. C. Salinas, M. M. Martins, Y. Zhong, K. Okazaki, Z. Wang, M. Z. Hasan, M. Fischer, T. Neupert, J. -X. Yin, S. Sanna, H. Luetkens, Z. Salman, P. Bonfa, Z. Guguchia

Abstract: The breaking of time-reversal symmetry (TRS) in the normal state of kagome superconductors $A$V$_{3}$Sb$_{5}$ stands out as a significant feature. Yet the extent to which this effect can be tuned remains uncertain, a crucial aspect to grasp in light of the varying details of TRS breaking observed through different techniques. Here, we employ the unique low-energy muon spin rotation technique combined with local field numerical analysis to study the TRS breaking response as a function of depth from the surface in single crystals of RbV$_{3}$Sb$_{5}$ with charge order and Cs(V$_{0.86}$Ta$_{0.14}$)$_{3}$Sb$_{5}$ without charge order. In the bulk (i.e., > 33 nm from the surface) of RbV$_{3}$Sb$_{5}$, we have detected a notable increase in the internal magnetic field width experienced by the muon ensemble. This increase occurs only within the charge ordered state. Intriguingly, the muon spin relaxation rate is significantly enhanced near the surface (i.e., < 33 nm from the surface) of RbV$_{3}$Sb$_{5}$, and this effect commences at temperatures significantly higher than the onset of charge order. Conversely, in Cs(V$_{0.86}$Ta$_{0.14}$)$_{3}$Sb$_{5}$, we do not observe a similar enhancement in the internal field width, either in the bulk or near the surface. These observations indicate a strong connection between charge order and TRS breaking on one hand, and on the other hand, suggest that TRS breaking can occur prior to long-range charge order.
This research offers compelling evidence for depth-dependent magnetism in $A$V$_{3}$Sb$_{5}$ superconductors in the presence of charge order. Such findings are likely to elucidate the intricate microscopic mechanisms that underpin the TRS breaking phenomena in these materials.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 16 pages, 13 figures

arXiv:2402.09565 [pdf, other]

doi 10.1145/3589334.3645452

Graph-Skeleton: ~1% Nodes are Sufficient to Represent Billion-Scale Graph

Authors: Linfeng Cao, Haoran Deng, Yang Yang, Chunping Wang, Lei Chen

Abstract: Due to the ubiquity of graph data on the web, web graph mining has become a hot research spot. Nonetheless, the prevalence of large-scale web graphs in real applications poses significant challenges to storage, computational capacity and graph model design. Despite numerous studies to enhance the scalability of graph models, a noticeable gap remains between academic research and practical web graph mining applications. One major cause is that in most industrial scenarios, only a small part of the nodes in a web graph actually need to be analyzed; we term these nodes target nodes and the others background nodes. In this paper, we argue that properly fetching and condensing the background nodes from massive web graph data might be a more economical shortcut to tackle the obstacles fundamentally. To this end, we make the first attempt to study the problem of massive background nodes compression for target nodes classification. Through extensive experiments, we reveal two critical roles played by the background nodes in target node classification: enhancing structural connectivity between target nodes, and feature correlation with target nodes. Following this, we propose a novel Graph-Skeleton model, which properly fetches the background nodes, and further condenses the semantic and topological information of background nodes within similar target-background local structures. Extensive experiments on various web graph datasets demonstrate the effectiveness and efficiency of the proposed method.
In particular, for the MAG240M dataset with 0.24 billion nodes, our generated skeleton graph achieves highly comparable performance while containing only 1.8% of the nodes of the original graph.

Submitted 6 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

Comments: 21 pages, 11 figures, In Proceedings of the ACM Web Conference 2024 (WWW'24)

arXiv:2402.06267 [pdf, other]

Efficient initialization of fluxonium qubits based on auxiliary energy levels

Authors: Tenghui Wang, Feng Wu, Fei Wang, Xizheng Ma, Gengyan Zhang, Jianjun Chen, Hao Deng, Ran Gao, Ruizi Hu, Lu Ma, Zhijun Song, Tian Xia, Make Ying, Huijuan Zhan, Hui-Hai Zhao, Chunqing Deng

Abstract: Fast and high-fidelity qubit initialization is crucial for low-frequency qubits such as fluxonium, and in applications of many quantum algorithms and quantum error correction codes. In a circuit quantum electrodynamics system, the initialization is typically achieved by transferring the state between the qubit and a short-lived cavity through microwave driving, also known as the sideband cooling process in atomic systems. Constrained by the selection rules from the parity symmetry of the wavefunctions, the sideband transitions are only enabled by multi-photon processes, which require multi-tone or strong driving. Leveraging the flux-tunability of fluxonium, we circumvent this limitation by breaking flux symmetry to enable an interaction between a non-computational qubit transition and the cavity excitation. With single-tone sideband driving, we realize qubit initialization with a fidelity exceeding 99% within a duration of 300 ns, robust against the variation of control parameters. Furthermore, we show that our initialization scheme has a built-in benefit in simultaneously removing the second-excited state population of the qubit, and can be easily incorporated into a large-scale fluxonium processor.

Submitted 9 February, 2024; originally announced February 2024.

Showing 1–50 of 703 results for author: Deng, H