Search | arXiv e-print repository

arXiv:2410.03121 [pdf]

Ordering of Interstitial Iron Atoms and Local Structural Distortion Induced by Iron Polycomplex in Fe1+yTe1-xSex as Seen via Transmission Electron Microscopy

Authors: Xiao-Ping Ma, Lu Zhang, Wen-Tao Wang, Jing-Zhe Nie, Huan-Fang Tian, Shi-Long Wu, Shuai-Shuai Sun, Tian-Long Xia, Jun Li, Jian-Qi Li, Huai-Xin Yang

Abstract: Employing aberration-corrected scanning transmission electron microscopy (STEM), we meticulously investigated the intrinsic chemical heterogeneity of Fe1+yTe, Fe1+yTe0.8Se0.2, and Fe1+yTe0.5Se0.5. Comprehensive analysis reveals the presence of interstitial iron atoms (Feint) across all samples, predominantly occupying the 2c site of the P4/nmm space group. Moreover, a superstructure phase characterized by a wave vector q = 2/5a + 1/2c, originating from the ordering of Feint, is distinctly observable in the parent compound Fe1+yTe. In this scenario, the Feint atoms interact with adjacent Fe atoms, forming an iron polycomplex and leading to an evident distortion of the FeTe4 tetrahedra. Experimental results further demonstrate effective suppression of Feint concentration and ordering through appropriate Se substitution; notably, Fe1+yTe0.5Se0.5 manifests the lowest concentration of Feint atoms. Our findings additionally indicate that Se substitution is random, and nanoscale phase separation induced by Te/Se chemical heterogeneity is commonly observed within Fe1+yTe1-xSex crystals.

Submitted 3 October, 2024; originally announced October 2024.

Comments: 14 pages, 5 figures

arXiv:2410.01176 [pdf, other]

Generative Diffusion-based Contract Design for Efficient AI Twins Migration in Vehicular Embodied AI Networks

Authors: Yue Zhong, Jiawen Kang, Jinbo Wen, Dongdong Ye, Jiangtian Nie, Dusit Niyato, Xiaozheng Gao, Shengli Xie

Abstract: Embodied AI is a rapidly advancing field that bridges the gap between cyberspace and physical space, enabling a wide range of applications. This evolution has led to the development of the Vehicular Embodied AI NETwork (VEANET), where advanced AI capabilities are integrated into vehicular systems to enhance autonomous operations and decision-making. Embodied agents, such as Autonomous Vehicles (AVs), are autonomous entities that can perceive their environment and take actions to achieve specific goals, actively interacting with the physical world. Embodied twins are digital models of these embodied agents, with various embodied AI twins for intelligent applications in cyberspace. In VEANET, embodied AI twins act as in-vehicle AI assistants to perform diverse tasks supporting autonomous driving using generative AI models. Because AVs have limited computational resources, they often offload computationally intensive tasks, such as constructing and updating embodied AI twins, to nearby roadside units (RSUs). However, because of the rapid mobility of AVs and the limited coverage of a single RSU, embodied AI twins require dynamic migrations from the current RSU to other RSUs in real time, resulting in the challenge of selecting suitable RSUs for efficient embodied AI twin migrations. Given information asymmetry, AVs cannot know the detailed information of RSUs. To this end, in this paper, we construct a multi-dimensional contract theoretical model between AVs and alternative RSUs. Considering that AVs may exhibit irrational behavior, we utilize prospect theory instead of expected utility theory to model the actual utilities of AVs. Finally, we employ a generative diffusion model-based algorithm to identify the optimal contract designs. Numerical results demonstrate the effectiveness of the proposed scheme compared with traditional deep reinforcement learning algorithms.

Submitted 1 October, 2024; originally announced October 2024.

arXiv:2410.00057 [pdf, other]

STTM: A New Approach Based Spatial-Temporal Transformer And Memory Network For Real-time Pressure Signal In On-demand Food Delivery

Authors: Jiang Wang, Haibin Wei, Xiaowei Xu, Jiacheng Shi, Jian Nie, Longzhi Du, Taixu Jiang

Abstract: On-demand Food Delivery (OFD) services have become very common around the world. For example, on the Ele.me platform, users place more than 15 million food orders every day. Predicting the Real-time Pressure Signal (RPS) is crucial for OFD services, as it is primarily used to measure the current status of pressure on the logistics system. When RPS rises, the pressure increases, and the platform needs to quickly take measures to prevent the logistics system from being overloaded. Usually, the average delivery time for all orders within a business district is used to represent RPS. Existing research on OFD services primarily focuses on predicting the delivery time of orders, while relatively less attention has been given to the study of the RPS. Previous research directly applies general models such as DeepFM, RNN, and GNN for prediction, but fails to adequately utilize the unique temporal and spatial characteristics of OFD services, and faces issues with insufficient sensitivity during sudden severe weather conditions or peak periods. To address these problems, this paper proposes a new method based on Spatio-Temporal Transformer and Memory Network (STTM). Specifically, we use a novel Spatio-Temporal Transformer structure to learn logistics features across temporal and spatial dimensions and encode the historical information of a business district and its neighbors, thereby learning both temporal and spatial information. Additionally, a Memory Network is employed to increase sensitivity to abnormal events. Experimental results on the real-world dataset show that STTM significantly outperforms previous methods in both offline experiments and the online A/B test, demonstrating the effectiveness of this method.

Submitted 29 September, 2024; originally announced October 2024.

arXiv:2409.04050 [pdf, other]

EigenSR: Eigenimage-Bridged Pre-Trained RGB Learners for Single Hyperspectral Image Super-Resolution

Authors: Xi Su, Xiangfei Shen, Mingyang Wan, Jing Nie, Lihui Chen, Haijun Liu, Xichuan Zhou

Abstract: Single hyperspectral image super-resolution (single-HSI-SR) aims to improve the resolution of a single input low-resolution HSI. Due to the bottleneck of data scarcity, the development of single-HSI-SR lags far behind that of RGB natural images. In recent years, research on RGB SR has shown that models pre-trained on large-scale benchmark datasets can greatly improve performance on unseen data, which may stand as a remedy for HSI. But how can we transfer the pre-trained RGB model to HSI, to overcome the data-scarcity bottleneck? Because of the significant difference in the channels between the pre-trained RGB model and the HSI, the model cannot focus on the correlation along the spectral dimension, thus limiting its usefulness on HSI. Inspired by the HSI spatial-spectral decoupling, we propose a new framework that first fine-tunes the pre-trained model with the spatial components (known as eigenimages), and then infers on unseen HSI using an iterative spectral regularization (ISR) to maintain the spectral correlation. The advantages of our method are that: 1) we effectively inject the spatial texture processing capabilities of the pre-trained RGB model into HSI while keeping spectral fidelity, 2) learning in the spectral-decorrelated domain can improve the generalizability to spectral-agnostic data, and 3) our inference in the eigenimage domain naturally exploits the spectral low-rank property of HSI, thereby reducing the complexity. This work bridges the gap between pre-trained RGB models and HSI via eigenimages, addressing the issue of limited HSI training data, hence the name EigenSR. Extensive experiments show that EigenSR outperforms the state-of-the-art (SOTA) methods in both spatial and spectral metrics. Our code will be released.

Submitted 6 September, 2024; originally announced September 2024.

Comments: Submitted to AAAI 2025

arXiv:2409.01442 [pdf, other]

Off-diagonal Ramsey numbers for slowly growing hypergraphs

Authors: Sam Mattheus, Dhruv Mubayi, Jiaxi Nie, Jacques Verstraëte

Abstract: For a $k$-uniform hypergraph $F$ and a positive integer $n$, the Ramsey number $r(F,n)$ denotes the minimum $N$ such that every $N$-vertex $F$-free $k$-uniform hypergraph contains an independent set of $n$ vertices. A hypergraph is $\textit{slowly growing}$ if there is an ordering $e_1,e_2,\dots,e_t$ of its edges such that $|e_i \setminus \bigcup_{j = 1}^{i - 1}e_j| \leq 1$ for each $i \in \{2, \ldots, t\}$. We prove that if $k \geq 3$ is fixed and $F$ is any non $k$-partite slowly growing $k$-uniform hypergraph, then for $n\ge2$, \[ r(F,n) = \Omega\Bigl(\frac{n^k}{(\log n)^{2k - 2}}\Bigr).\] In particular, we deduce that the off-diagonal Ramsey number $r(F_5,n)$ is of order $n^{3}/\mbox{polylog}(n)$, where $F_5$ is the triple system $\{123, 124, 345\}$. This is the only 3-uniform Berge triangle for which the polynomial power of its off-diagonal Ramsey number was not previously known. Our constructions use pseudorandom graphs, martingales, and hypergraph containers.

Submitted 2 September, 2024; originally announced September 2024.

Comments: 11 pages, 2 figures

MSC Class: 05D10
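
Illustration (not part of the arXiv record): the "slowly growing" condition in the abstract above can be checked by brute force for small hypergraphs. The sketch below is a minimal reading of the definition, not the authors' code; the function name is_slowly_growing and the encoding of edges as digit strings are assumptions made for this example. It tries every ordering of the edges, so it is only practical for a handful of edges.

```python
from itertools import permutations


def is_slowly_growing(edges):
    """Return True if some ordering e_1, ..., e_t of the edges satisfies
    |e_i minus (e_1 ∪ ... ∪ e_{i-1})| <= 1 for every i >= 2.

    `edges` is an iterable of vertex collections, e.g. ["123", "124"].
    Brute force over all orderings (hypothetical helper, small inputs only).
    """
    edge_sets = [frozenset(e) for e in edges]
    for order in permutations(edge_sets):
        covered = set(order[0])
        ok = True
        for edge in order[1:]:
            # Each later edge may introduce at most one new vertex.
            if len(edge - covered) > 1:
                ok = False
                break
            covered |= edge
        if ok:
            return True
    return False


# F_5 from the abstract: the triple system {123, 124, 345}.
print(is_slowly_growing(["123", "124", "345"]))
# Two disjoint triples cannot be ordered this way.
print(is_slowly_growing(["123", "456"]))
```

For F_5 the ordering 123, 124, 345 works: the second edge adds only vertex 4 and the third adds only vertex 5.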

arXiv:2408.16031 [pdf, other]

EMP: Enhance Memory in Data Pruning

Authors: Jinying Xiao, Ping Li, Jie Nie, Zhe Tang

Abstract: Recently, large language and vision models have shown strong performance, but due to high pre-training and fine-tuning costs, research has shifted towards faster training via dataset pruning. Previous methods used sample loss as an evaluation criterion, aiming to select the most "difficult" samples for training. However, when the pruning rate increases, the number of times each sample is trained becomes more evenly distributed, which causes many critical or general samples to not be effectively fitted. We refer to this as Low-Frequency Learning (LFL). In other words, LFL prevents the model from remembering most samples. In our work, we decompose the scoring function of LFL, provide a theoretical explanation for the inefficiency of LFL, and propose adding a memory term to the scoring function to enhance the model's memory capability, along with an approximation of this memory term. Similarly, we explore memory in Self-Supervised Learning (SSL), marking the first discussion on SSL memory. Using contrastive learning, we derive the memory term both theoretically and experimentally. Finally, we propose Enhance Memory Pruning (EMP), which addresses the issue of insufficient memory under high pruning rates by enhancing the model's memory of data, thereby improving its performance. We evaluated the performance of EMP in tasks such as image classification, natural language understanding, and model pre-training. The results show that EMP can improve model performance under extreme pruning rates. For example, in the CIFAR100-ResNet50 pre-training task, with 70\% pruning, EMP outperforms current methods by 2.2\%.

Submitted 28 August, 2024; originally announced August 2024.

arXiv:2408.14818 [pdf]

Irrelevance of 1H composition to the superconductivity in the infinite-layer nickelates: judging from the MeV energy scale

Authors: Jia-Cai Nie, Xing-Yu Chen, Yi Bian, Xue-Yan Wang, Ting-Na Shao, Jing-Xin Gao, Wei Mao, Bing-Hui Ge, Arnold Muller, Jikun Chen

Abstract: The discovery of superconductivity in the infinite-layer nickelates, as topotactically reduced from their respective perovskite precursors via co-annealing with CaH2, extends the understanding of superconductivity. Nevertheless, whether the incorporated 1H composition is critical to the infinite-layer superconductivity has recently aroused considerable debate, while the central challenge lies in the quantification of 1H, which is easily interfered with by the conventional electron or orbital associated processes. Herein, we demonstrate the irrelevance between the superconductivity in the infinite-layer nickelates and their incorporated 1H composition, assisted by nuclear reaction analysis (NRA) and heavy ion energy recoil detection analysis (HIERDA) based on the nuclear interactions at MeV energy scale. These approaches completely overwhelm the conventional interferences, such as ionization, activation and chemical bonds, and achieve the 1H quantification within superconducting La0.8Sr0.2NiO2 (or Nd0.8Sr0.2NiO2). A large diversity of 1H composition far beyond the previously expected critical dome was observed, while their TC were not changed significantly. Furthermore, the superconductivity was demonstrated to be achievable for La0.8Sr0.2NiO2 reduced by Al without any hydrogen associated process, while the superconducting properties of the CaH2 reduced La0.8Sr0.2NiO2 are rather stable after long term exposure in air, despite the high volatility of 1H within oxides. All these results indicate that the 1H incorporation composition is not critical to the superconductivity of the infinite-layer nickelates.

Submitted 27 August, 2024; originally announced August 2024.

arXiv:2408.13842 [pdf, other]

DESI Peculiar Velocity Survey -- Fundamental Plane

Authors: Khaled Said, Cullan Howlett, Tamara Davis, John Lucey, Christoph Saulder, Kelly Douglass, Alex G. Kim, Anthony Kremin, Caitlin Ross, Greg Aldering, Jessica Nicole Aguilar, Steven Ahlen, Segev BenZvi, Davide Bianchi, David Brooks, Todd Claybaugh, Kyle Dawson, Axel de la Macorra, Biprateep Dey, Peter Doel, Kevin Fanning, Simone Ferraro, Andreu Font-Ribera, Jaime E. Forero-Romero, Enrique Gaztañaga, et al. (30 additional authors not shown)

Abstract: The Dark Energy Spectroscopic Instrument (DESI) Peculiar Velocity Survey aims to measure the peculiar velocities of early and late type galaxies within the DESI footprint using both the Fundamental Plane and Tully-Fisher relations. Direct measurements of peculiar velocities can significantly improve constraints on the growth rate of structure, reducing uncertainty by a factor of approximately 2.5 at redshift 0.1 compared to the DESI Bright Galaxy Survey's redshift space distortion measurements alone. We assess the quality of stellar velocity dispersion measurements from DESI spectroscopic data. These measurements, along with photometric data from the Legacy Survey, establish the Fundamental Plane relation and determine distances and peculiar velocities of early-type galaxies. During Survey Validation, we obtain spectra for 6698 unique early-type galaxies, up to a photometric redshift of 0.15. 64\% of observed galaxies (4267) have relative velocity dispersion errors below 10\%. This percentage increases to 75\% if we restrict our sample to galaxies with spectroscopic redshifts below 0.1. We use the measured central velocity dispersion, along with photometry from the DESI Legacy Imaging Surveys, to fit the Fundamental Plane parameters using a 3D Gaussian maximum likelihood algorithm that accounts for measurement uncertainties and selection cuts. In addition, we conduct zero-point calibration using the absolute distance measurements to the Coma cluster, leading to a value of the Hubble constant, $H_0 = 76.05 \pm 0.35$(statistical) $\pm 0.49$(systematic FP) $\pm 4.86$(statistical due to calibration) $\mathrm{km \ s^{-1} Mpc^{-1}}$. This $H_0$ value is within $2\sigma$ of Planck Cosmic Microwave Background results and within $1\sigma$ of other low redshift distance indicator-based measurements.

Submitted 25 August, 2024; originally announced August 2024.

Comments: 18 pages, 9 figures, 2 tables. Submitted for publication in MNRAS
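
Illustration (not part of the arXiv record): the abstract above quotes three separate uncertainty terms on $H_0$ and a galaxy fraction. The sketch below recomputes both from the quoted numbers. Adding the three terms in quadrature assumes they are independent, which the abstract does not state, so the combined value is only a rough guide; the helper name combine_in_quadrature is hypothetical.

```python
import math


def combine_in_quadrature(*errors):
    """Root-sum-square of error terms (assumes the terms are independent)."""
    return math.sqrt(sum(e ** 2 for e in errors))


# Uncertainty terms quoted in the abstract, in km/s/Mpc.
statistical = 0.35
systematic_fp = 0.49
calibration = 4.86

total = combine_in_quadrature(statistical, systematic_fp, calibration)
print(f"combined uncertainty on H0: {total:.2f} km/s/Mpc")

# Fraction of galaxies with velocity dispersion errors below 10%,
# from the counts quoted in the abstract (4267 of 6698).
fraction = 4267 / 6698
print(f"fraction below 10% error: {fraction:.1%}")
```

Under that independence assumption the calibration term dominates the total, which is why the combined uncertainty is close to 4.86 itself.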

arXiv:2408.11878 [pdf, other]

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Authors: Qianqian Xie, Dong Li, Mengxi Xiao, Zihao Jiang, Ruoyu Xiang, Xiao Zhang, Zhengyu Chen, Yueru He, Weiguang Han, Yuzhe Yang, Shunian Chen, Yifei Zhang, Lihang Shen, Daniel Kim, Zhiwei Liu, Zheheng Luo, Yangyang Yu, Yupeng Cao, Zhiyang Deng, Zhiyuan Yao, Haohang Li, Duanyu Feng, Yongfu Dai, VijayaSai Somasundaram, Peng Lu, et al. (14 additional authors not shown)

Abstract: Large language models (LLMs) have advanced financial applications, yet they often lack sufficient financial knowledge and struggle with tasks involving multi-modal inputs like tables and time series data. To address these limitations, we introduce \textit{Open-FinLLMs}, a series of Financial LLMs. We begin with FinLLaMA, pre-trained on a 52 billion token financial corpus, incorporating text, tables, and time-series data to embed comprehensive financial knowledge. FinLLaMA is then instruction fine-tuned with 573K financial instructions, resulting in FinLLaMA-instruct, which enhances task performance. Finally, we present FinLLaVA, a multimodal LLM trained with 1.43M image-text instructions to handle complex financial data types. Extensive evaluations demonstrate FinLLaMA's superior performance over LLaMA3-8B, LLaMA3.1-8B, and BloombergGPT in both zero-shot and few-shot settings across 19 and 4 datasets, respectively. FinLLaMA-instruct outperforms GPT-4 and other Financial LLMs on 15 datasets. FinLLaVA excels in understanding tables and charts across 4 multimodal tasks. Additionally, FinLLaMA achieves impressive Sharpe Ratios in trading simulations, highlighting its robust financial application capabilities. We will continually maintain and improve our models and benchmarks to support ongoing innovation in academia and industry.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 33 pages, 13 figures

arXiv:2408.11272 [pdf, other]

High-Dimensional Overdispersed Generalized Factor Model with Application to Single-Cell Sequencing Data Analysis

Authors: Jinyu Nie, Zhilong Qin, Wei Liu

Abstract: The current high-dimensional linear factor models fail to account for the different types of variables, while high-dimensional nonlinear factor models often overlook the overdispersion present in mixed-type data. However, overdispersion is prevalent in practical applications, particularly in fields like biomedical and genomics studies. To address this practical demand, we propose an overdispersed generalized factor model (OverGFM) for performing high-dimensional nonlinear factor analysis on overdispersed mixed-type data. Our approach incorporates an additional error term to capture the overdispersion that cannot be accounted for by factors alone. However, this introduces significant computational challenges due to the involvement of two high-dimensional latent random matrices in the nonlinear model. To overcome these challenges, we propose a novel variational EM algorithm that integrates Laplace and Taylor approximations. This algorithm provides iterative explicit solutions for the complex variational parameters and is proven to possess excellent convergence properties. We also develop a criterion based on the singular value ratio to determine the optimal number of factors. Numerical results demonstrate the effectiveness of this criterion. Through comprehensive simulation studies, we show that OverGFM outperforms state-of-the-art methods in terms of estimation accuracy and computational efficiency. Furthermore, we demonstrate the practical merit of our method through its application to two datasets from genomics. To facilitate its usage, we have integrated the implementation of OverGFM into the R package GFM.

Submitted 20 August, 2024; originally announced August 2024.

arXiv:2408.08598 [pdf, ps, other]

On odd covers of cliques and disjoint unions

Authors: Calum Buchanan, Alexander Clifton, Eric Culver, Péter Frankl, Jiaxi Nie, Kenta Ozeki, Puck Rombach, Mei Yin

Abstract: Babai and Frankl posed the ``odd cover problem'' of finding the minimum cardinality of a collection of complete bipartite graphs such that every edge of the complete graph of order $n$ is covered an odd number of times. In a previous paper with O'Neill, some of the authors proved that this value is always $\lceil n / 2 \rceil$ or $\lceil n / 2 \rceil + 1$ and that it is the former whenever $n$ is a multiple of $8$. In this paper, we determine this value to be $\lceil n / 2 \rceil$ whenever $n$ is odd or equivalent to $18$ modulo $24$. We also further the study of odd covers of graphs which are not complete, wherein edges are covered an odd number of times and nonedges an even number of times by the complete bipartite graphs in the collection. Among various results on disjoint unions, we find the minimum cardinality of an odd cover of a union of odd cliques and of a union of cycles.

Submitted 16 August, 2024; originally announced August 2024.

Comments: 19 pages, 6 figures

MSC Class: 05C70; 05C50
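
Illustration (not part of the arXiv record): the odd cover condition in the abstract above is easy to verify mechanically for a given collection of complete bipartite graphs. The sketch below is a minimal checker under assumed conventions: vertices are 0, ..., n-1, each biclique is a pair (left, right) of disjoint vertex collections, and the function name is_odd_cover_of_clique is hypothetical. It only verifies a given collection; it does not search for a minimum one.

```python
from collections import Counter
from itertools import combinations


def is_odd_cover_of_clique(n, bicliques):
    """Check that every edge of K_n on vertices 0..n-1 is covered an odd
    number of times by the given complete bipartite graphs.

    Each biclique is a pair (left, right) of disjoint vertex collections.
    """
    counts = Counter()
    for left, right in bicliques:
        left, right = set(left), set(right)
        if left & right:
            raise ValueError("the two sides of a biclique must be disjoint")
        if not (left | right) <= set(range(n)):
            raise ValueError("biclique uses a vertex outside 0..n-1")
        for u in left:
            for v in right:
                counts[frozenset((u, v))] += 1
    # Every pair of vertices is an edge of K_n and must have odd coverage.
    return all(counts[frozenset(e)] % 2 == 1
               for e in combinations(range(n), 2))


# K_3 with two bicliques: a star at vertex 0 plus the single edge {1, 2}.
print(is_odd_cover_of_clique(3, [([0], [1, 2]), ([1], [2])]))
# Two bicliques on K_4 that cover edges {0,3} and {1,2} twice and so fail.
print(is_odd_cover_of_clique(4, [([0, 1], [2, 3]), ([0, 2], [1, 3])]))
```

The K_3 example uses 2 bicliques, which matches $\lceil 3/2 \rceil$, the value the abstract reports for odd $n$.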

arXiv:2408.03996 [pdf, other]

The atomic gas sequence and mass-metallicity relation from dwarfs to massive galaxies

Authors: D. Scholte, A. Saintonge, J. Moustakas, B. Catinella, H. Zou, B. Dey, J. Aguilar, S. Ahlen, A. Anand, R. Blum, D. Brooks, C. Circosta, T. Claybaugh, A. de la Macorra, P. Doel, A. Font-Ribera, P. U. Förster, J. E. Forero-Romero, E. Gaztañaga, S. Gontcho A Gontcho, S. Juneau, R. Kehoe, T. Kisner, S. E. Koposov, A. Kremin , et al. (21 additional authors not shown)

Abstract: Galaxy scaling relations provide insights into the processes that drive galaxy evolution. The extension of these scaling relations into the dwarf galaxy regime is of particular interest. This is because dwarf galaxies represent a crucial stage in galaxy evolution, and understanding them could also shed light on their role in reionising the early Universe. There is currently no consensus on the processes that dominate the evolution of dwarfs. In this work we constrain the atomic gas sequence (stellar mass vs. atomic gas fraction) and mass-metallicity relation (stellar mass vs. gas phase metallicity) from dwarf ($10^{6.5}$ $\textrm{M}_{\odot}$) to massive ($10^{11.5}$ $\textrm{M}_{\odot}$) galaxies in the local Universe. The combined optical and 21-cm spectroscopic observations of the DESI and ALFALFA surveys allow us to simultaneously constrain both scaling relations. We find a slope change of the atomic gas sequence at a stellar mass of $\sim 10^{9} ~\textrm{M}_{\odot}$. We also find that the shape and scatter of the atomic gas sequence and mass-metallicity relation are strongly linked for both dwarfs and more massive galaxies. Consequently, the low mass slope change of the atomic gas sequence is imprinted onto the mass-metallicity relation of dwarf galaxies. The mass scale of the measured slope change is consistent with a predicted escape velocity threshold below which low mass galaxies experience significant supernova-driven gas loss, as well as with a reduction in cold gas accretion onto more massive galaxies.

Submitted 7 August, 2024; originally announced August 2024.

Comments: 16 pages, 10 figures, submitted to MNRAS

arXiv:2408.03406 [pdf, ps, other]

Random Turán Problems for Hypergraph Expansions

Authors: Jiaxi Nie, Sam Spiro

Abstract: The random Turán number $\mathrm{ex}(G_{n,p}^r,F)$ is the maximum number of edges in an $F$-free subgraph of the random $r$-uniform hypergraph $G_{n,p}^r$. We prove general results which (informally) show that if $F$ is an $r_0$-graph, then upper bounds for $\mathrm{ex}(G_{n,p}^{r_0},F)$ can be lifted into upper bounds for $\mathrm{ex}(G_{n,p}^{r},F^{(r)})$ where $F^{(r)}$ is the $r$-uniform expansion of $F$, i.e.\ the hypergraph obtained from $F$ by inserting $r-r_0$ distinct vertices into each edge of $F$. These results unify and generalize most known upper bounds for random Turán numbers of degenerate hypergraphs of uniformity at least 3, and also provide new tight bounds for the random Turán numbers of expansions of theta graphs and complete bipartite graphs.

Submitted 6 August, 2024; originally announced August 2024.

Comments: 23 pages+5 page appendix, comments welcome!

MSC Class: 05D40; 05C35; 05C65; 05C80

arXiv:2408.02263 [pdf, other]

VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking

Authors: Yuxuan Lu, Jiahao Nie, Zhiwei He, Hongjie Gu, Xudong Lv

Abstract: Current LiDAR point cloud-based 3D single object tracking (SOT) methods typically rely on point-based representation networks. Despite demonstrated success, such networks suffer from some fundamental problems: 1) They contain pooling operations to cope with inherently disordered point clouds, hindering the capture of 3D spatial information that is useful for tracking, a regression task. 2) The adopted set abstraction operation hardly handles density-inconsistent point clouds, also preventing 3D spatial information from being modeled. To solve these problems, we introduce a novel tracking framework, termed VoxelTrack. By voxelizing inherently disordered point clouds into 3D voxels and extracting their features via sparse convolution blocks, VoxelTrack effectively models precise and robust 3D spatial information, thereby guiding accurate position prediction for tracked objects. Moreover, VoxelTrack incorporates a dual-stream encoder with a cross-iterative feature fusion module to further explore fine-grained 3D spatial information for tracking. Benefiting from accurate 3D spatial information being modeled, our VoxelTrack simplifies the tracking pipeline with a single regression loss. Extensive experiments are conducted on three widely-adopted datasets including KITTI, NuScenes and Waymo Open Dataset. The experimental results confirm that VoxelTrack achieves state-of-the-art performance (88.3%, 71.4% and 63.6% mean precision on the three datasets, respectively), and outperforms the existing trackers with a real-time speed of 36 FPS on a single TITAN RTX GPU. The source code and model will be released.

Submitted 5 August, 2024; originally announced August 2024.

arXiv:2407.20189 [pdf, other]

Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search

Authors: Fengran Mo, Chen Qu, Kelong Mao, Yihong Wu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

Abstract: Conversational search supports multi-turn user-system interactions to solve complex information needs. Different from the traditional single-turn ad-hoc search, conversational search encounters a more challenging problem of context-dependent query understanding with the lengthy and long-tail conversational history context. While conversational query rewriting methods leverage explicit rewritten queries to train a rewriting model to transform the context-dependent query into a stand-alone search query, this is usually done without considering the quality of search results. Conversational dense retrieval methods use fine-tuning to improve a pre-trained ad-hoc query encoder, but they are limited by the conversational search data available for training. In this paper, we leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model. The key idea is to align the query representation with those of rewritten queries and relevant documents. The proposed model -- Query Representation Alignment Conversational Dense Retriever, QRACDR, is tested on eight datasets, including various settings in conversational search and ad-hoc search. The results demonstrate the strong performance of QRACDR compared with state-of-the-art methods, and confirm the effectiveness of representation alignment.

Submitted 29 July, 2024; originally announced July 2024.

Comments: Accepted by CIKM 2024

arXiv:2407.18424 [pdf, other]

Model-driven Heart Rate Estimation and Heart Murmur Detection based on Phonocardiogram

Authors: Jingping Nie, Ran Liu, Behrooz Mahasseni, Erdrin Azemi, Vikramjit Mitra

Abstract: Acoustic signals are crucial for health monitoring, particularly heart sounds, which provide essential data like heart rate and help detect cardiac anomalies such as murmurs. This study utilizes a publicly available phonocardiogram (PCG) dataset to estimate heart rate using model-driven methods and extends the best-performing model to a multi-task learning (MTL) framework for simultaneous heart rate estimation and murmur detection. Heart rate estimates are derived using a sliding window technique on heart sound snippets, analyzed with a combination of acoustic features (Mel spectrogram, cepstral coefficients, power spectral density, root mean square energy). Our findings indicate that a 2D convolutional neural network (2dCNN) is most effective for heart rate estimation, achieving a mean absolute error (MAE) of 1.312 bpm. We systematically investigate the impact of different feature combinations and find that utilizing all four features yields the best results. The MTL model (2dCNN-MTL) achieves accuracy over 95% in murmur detection, surpassing existing models, while maintaining an MAE of 1.636 bpm in heart rate estimation, satisfying the requirements stated by the Association for the Advancement of Medical Instrumentation (AAMI).

Submitted 25 July, 2024; originally announced July 2024.

Comments: 6 pages, 10 figures

arXiv:2407.16569 [pdf]

Regulated magnetic anisotropy and charge density wave in uniformly fabricated Janus CrTeSe monolayer

Authors: Jin-Hua Nie, Cong Wang, Mao-Peng Miao, Kang-Di Niu, Tao Xie, Ting-Fei Guo, Wen-Hao Zhang, Chao-Fei Liu, Rui-Jing Sun, Jian-Wang Zhou, Jun-Hao Lin, Wei Ji, Ying-Shuang Fu

Abstract: Two-dimensional materials with Janus structure host novel physical properties due to their inversional symmetry breaking. However, it remains elusive to synthesize Janus monolayer crystals with tailored long-range magnetic orders. Here, we have developed a general method to fabricate uniform Janus CrTeSe monolayers by selective selenization of preformed CrTe2 monolayers with molecular beam epitaxy. The uniform Janus structure of CrTeSe with high crystal quality is confirmed by high-resolution scanning transmission electron microscopy. Spin-polarized scanning tunneling microscopy/spectroscopy measurements unveil that the Janus CrTeSe undergoes a charge density wave (CDW) transition and a robust antiferromagnetic order. The magnetic anisotropy of CrTeSe is drastically altered compared to monolayer CrTe2 by the breaking symmetries induced from the Janus structure and the CDW transition, as is substantiated with first principles calculations. Our research achieves the construction of large-area Janus structures, and artificially tailors the electronic and magnetic properties of Janus systems at the two-dimensional limit.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 19 pages, 4 figures

arXiv:2407.16192 [pdf, other]

How to Leverage Personal Textual Knowledge for Personalized Conversational Information Retrieval

Authors: Fengran Mo, Longxiang Zhao, Kaiyu Huang, Yue Dong, Degen Huang, Jian-Yun Nie

Abstract: Personalized conversational information retrieval (CIR) combines conversational and personalizable elements to satisfy various users' complex information needs through multi-turn interaction based on their backgrounds. The key promise is that the personal textual knowledge base (PTKB) can improve the CIR effectiveness because the retrieval results can be more related to the user's background. However, PTKB is noisy: not every piece of knowledge in PTKB is relevant to the specific query at hand. In this paper, we explore and test several ways to select knowledge from PTKB and use it for query reformulation by using a large language model (LLM). The experimental results show the PTKB might not always improve the search results when used alone, but LLM can help generate a more appropriate personalized query when high-quality guidance is provided.

Submitted 23 July, 2024; originally announced July 2024.

Comments: Accepted to CIKM 2024

arXiv:2407.15346 [pdf, other]

Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models

Authors: Wenbin An, Feng Tian, Jiahao Nie, Wenkai Shi, Haonan Lin, Yan Chen, QianYing Wang, Yaqiang Wu, Guang Dai, Ping Chen

Abstract: Knowledge-based Visual Question Answering (KVQA) requires both image and world knowledge to answer questions. Current methods first retrieve knowledge from the image and external knowledge base with the original complex question, then generate answers with Large Language Models (LLMs). However, since the original question contains complex elements that require knowledge from different sources, acquiring different kinds of knowledge in a coupled manner may confuse models and hinder them from retrieving precise knowledge. Furthermore, the "forward-only" answering process fails to explicitly capture the knowledge needs of LLMs, which can further hurt answering quality. To cope with the above limitations, we propose DKA: Disentangled Knowledge Acquisition from LLM feedback, a training-free framework that disentangles knowledge acquisition to avoid confusion and uses LLM's feedback to specify the required knowledge. Specifically, DKA requires LLMs to specify what knowledge they need to answer the question and decompose the original complex question into two simple sub-questions: Image-based sub-question and Knowledge-based sub-question. Then we use the two sub-questions to retrieve knowledge from the image and knowledge base, respectively. In this way, two knowledge acquisition models can focus on the content that corresponds to them and avoid disturbance of irrelevant elements in the original complex question, which can help to provide more precise knowledge and better align the knowledge needs of LLMs to yield correct answers. Experiments on benchmark datasets show that DKA significantly outperforms SOTA models. To facilitate future research, our data and code are available at https://github.com/Lackel/DKA.

Submitted 21 July, 2024; originally announced July 2024.

Comments: Pre-print

arXiv:2407.10979 [pdf, ps, other]

Diffusion Model-based Incentive Mechanism with Prospect Theory for Edge AIGC Services in 6G IoT

Authors: Jinbo Wen, Jiangtian Nie, Yue Zhong, Changyan Yi, Xiaohuan Li, Jiangming Jin, Yang Zhang, Dusit Niyato

Abstract: The fusion of the Internet of Things (IoT) with Sixth-Generation (6G) technology has significant potential to revolutionize the IoT landscape. With the ultra-reliable and low-latency communication capabilities of 6G, 6G-IoT networks can transmit high-quality and diverse data to enhance edge learning. Artificial Intelligence-Generated Content (AIGC) harnesses advanced AI algorithms to automatically generate various types of content. The emergence of edge AIGC integrates with edge networks, facilitating real-time provision of customized AIGC services by deploying AIGC models on edge devices. However, the current practice of edge devices as AIGC Service Providers (ASPs) lacks incentives, hindering the sustainable provision of high-quality edge AIGC services amidst information asymmetry. In this paper, we develop a user-centric incentive mechanism framework for edge AIGC services in 6G-IoT networks. Specifically, we first propose a contract theory model for incentivizing ASPs to provide AIGC services to clients. Recognizing the irrationality of clients towards personalized AIGC services, we utilize Prospect Theory (PT) to capture their subjective utility better. Furthermore, we adopt the diffusion-based soft actor-critic algorithm to generate the optimal contract design under PT, outperforming traditional deep reinforcement learning algorithms. Our numerical results demonstrate the effectiveness of the proposed scheme.

Submitted 25 July, 2024; v1 submitted 10 June, 2024; originally announced July 2024.

arXiv:2407.06280 [pdf, other]

DESI Early Data Release Milky Way Survey Value-Added Catalogue

Authors: Sergey E. Koposov, C. Allende-Prieto, A. P. Cooper, T. S. Li, L. Beraldo e Silva, B. Kim, A. Carrillo, A. Dey, C. J. Manser, F. Nikakhtar, A. H. Riley, C. Rockosi, M. Valluri, J. Aguilar, S. Ahlen, S. Bailey, R. Blum, D. Brooks, T. Claybaugh, S. Cole, A. de la Macorra, B. Dey, J. E. Forero-Romero, E. Gaztañaga, J. Guy , et al. (18 additional authors not shown)

Abstract: We present the stellar value-added catalogue based on the Dark Energy Spectroscopic Instrument (DESI) Early Data Release. The catalogue contains radial velocity and stellar parameter measurements for $\simeq$ 400,000 unique stars observed during commissioning and survey validation by DESI. These observations were made under conditions similar to the Milky Way Survey (MWS) currently carried out by DESI but also include multiple specially targeted fields, such as those containing well-studied dwarf galaxies and stellar streams. The majority of observed stars have $16<r<20$ with a median signal-to-noise ratio in the spectra of $\sim$ 20. In the paper, we describe the structure of the catalogue, give an overview of different target classes observed, as well as provide recipes for selecting clean stellar samples. We validate the catalogue using external high-resolution measurements and show that radial velocities, surface gravities, and iron abundances determined by DESI are accurate to 1 km/s, $0.3$ dex and $\sim$ 0.15 dex respectively. We also demonstrate possible uses of the catalogue for chemo-dynamical studies of the Milky Way stellar halo and Draco dwarf spheroidal. The value-added catalogue described in this paper is the very first DESI MWS catalogue. The next DESI data release, expected in less than a year, will add the data from the first year of DESI survey operations and will contain approximately 4 million stars, along with significant processing improvements.

Submitted 26 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

Comments: Accepted to MNRAS; Value added catalogue is available at https://data.desi.lbl.gov/public/edr/vac/edr/mws/fuji/

arXiv:2407.05238 [pdf, other]

P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds

Authors: Jiahao Nie, Fei Xie, Sifan Zhou, Xueyi Zhou, Dong-Kyu Chae, Zhiwei He

Abstract: 3D single object tracking (SOT) methods based on appearance matching have long suffered from insufficient appearance information incurred by incomplete, textureless and semantically deficient LiDAR point clouds. While the motion paradigm exploits motion cues instead of appearance matching for tracking, it incurs complex multi-stage processing and a segmentation module. In this paper, we first provide in-depth explorations on the motion paradigm, which prove that (i) it is feasible to directly infer target relative motion from point clouds across consecutive frames; (ii) fine-grained information comparison between consecutive point clouds facilitates target motion modeling. We thereby propose to perform part-to-part motion modeling for consecutive point clouds and introduce a novel tracking framework, termed P2P. The novel framework fuses each corresponding part information between consecutive point clouds, effectively exploring detailed information changes and thus modeling accurate target-related motion cues. Following this framework, we present P2P-point and P2P-voxel models, incorporating implicit and explicit part-to-part motion modeling by point- and voxel-based representation, respectively. Without bells and whistles, P2P-voxel sets a new state-of-the-art performance ($\sim$89%, 72% and 63% precision on KITTI, NuScenes and Waymo Open Dataset, respectively). Moreover, under the same point-based representation, P2P-point outperforms the previous motion tracker M$^2$Track by 3.3% and 6.7% on KITTI and NuScenes, while running at a considerably high speed of 107 FPS on a single RTX3090 GPU. The source code and pre-trained models are available at https://github.com/haooozi/P2P.

Submitted 8 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

Comments: The source code and pre-trained models are available at https://github.com/haooozi/P2P

arXiv:2407.05083 [pdf, other]

Exploring agent interaction patterns in the comment sections of fake and real news

Authors: Kailun Zhu, Songtao Peng, Jiaqi Nie, Zhongyuan Ruan, Shanqing Yu, Qi Xuan

Abstract: User comments on social media have been recognized as a crucial factor in distinguishing between fake and real news, with many studies focusing on the textual content of user reactions. However, the interactions among agents in the comment sections for fake and real news have not been fully explored. In this study, we analyze a dataset comprising both fake and real news from Reddit to investigate agent interaction patterns, considering both the network structure and the sentiment of the nodes. Our findings reveal that (i) comments on fake news are more likely to form groups, (ii) compared to fake news, where users generate more negative sentiment, real news tend to elicit more neutral and positive sentiments. Additionally, nodes with similar sentiments cluster together more tightly than anticipated. From a dynamic perspective, we found that the sentiment distribution among nodes stabilizes early and remains stable over time. These findings have both theoretical and practical implications, particularly for the early detection of real and fake news within social networks.

Submitted 11 October, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.03040 [pdf, other]

Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model

Authors: Xia Hou, Qifeng Li, Jian Yang, Tongliang Li, Linzheng Chai, Xianjie Wu, Hangyuan Ji, Zhoujun Li, Jixuan Nie, Jingbo Dun, Wenfeng Song

Abstract: Instruction tuning as an effective technique aligns the outputs of large language models (LLMs) with human preference. But how to generate the seasonal multi-turn dialogues from raw documents for instruction tuning still requires further exploration. In this paper, we present a novel framework named R2S that leverages the CoD-Chain of Dialogue logic to guide large language models (LLMs) in generating knowledge-intensive multi-turn dialogues for instruction tuning. By integrating raw documents from both open-source datasets and domain-specific web-crawled documents into a benchmark K-BENCH, we cover diverse areas such as Wikipedia (English), Science (Chinese), and Artifacts (Chinese). Our approach first decides the logic flow of the current dialogue and then prompts LLMs to produce key phrases for sourcing relevant response content. This methodology enables the creation of the GINSTRUCT instruction dataset, retaining raw document knowledge within dialogue-style interactions. Utilizing this dataset, we fine-tune GLLM, a model designed to transform raw documents into structured multi-turn dialogues, thereby injecting comprehensive domain knowledge into the SFT model for enhanced instruction tuning. This work signifies a stride towards refining the adaptability and effectiveness of LLMs in processing and generating more accurate, contextually nuanced responses across various fields.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 11 pages, 3 figures

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2407.02719 [pdf, other]

Boosting Biomedical Concept Extraction by Rule-Based Data Augmentation

Authors: Qiwei Shao, Fengran Mo, Jian-Yun Nie

Abstract: Document-level biomedical concept extraction is the task of identifying biomedical concepts mentioned in a given document. Recent advancements have adapted pre-trained language models for this task. However, the scarcity of domain-specific data and the deviation of concepts from their canonical names often hinder these models' effectiveness. To tackle this issue, we employ MetaMapLite, an existing rule-based concept mapping system, to generate additional pseudo-annotated data from PubMed and PMC. The annotated data are used to augment the limited training data. Through extensive experiments, this study demonstrates the utility of a manually crafted concept mapping tool for training a better concept extraction model.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.18868 [pdf, other]

Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models

Authors: Yicheng Xu, Yuxin Chen, Jiahao Nie, Yusong Wang, Huiping Zhuang, Manabu Okumura

Abstract: Continual learning (CL) with Vision-Language Models (VLMs) has overcome the constraints of traditional CL, which only focuses on previously encountered classes. During the CL of VLMs, we need not only to prevent the catastrophic forgetting on incrementally learned knowledge but also to preserve the zero-shot ability of VLMs. However, existing methods require additional reference datasets to maintain such zero-shot ability and rely on domain-identity hints to classify images across different domains. In this study, we propose Regression-based Analytic Incremental Learning (RAIL), which utilizes a recursive ridge regression-based adapter to learn from a sequence of domains in a non-forgetting manner and decouple the cross-domain correlations by projecting features to a higher-dimensional space. Cooperating with a training-free fusion module, RAIL absolutely preserves the VLM's zero-shot ability on unseen domains without any reference data. Additionally, we introduce Cross-domain Task-Agnostic Incremental Learning (X-TAIL) setting. In this setting, a CL learner is required to incrementally learn from multiple domains and classify test images from both seen and unseen domains without any domain-identity hint. We theoretically prove RAIL's absolute memorization on incrementally learned domains. Experiment results affirm RAIL's state-of-the-art performance in both X-TAIL and existing Multi-domain Task-Incremental Learning settings. The code will be released upon acceptance.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.13996 [pdf, ps, other]

doi 10.1145/3637528.3671840

Unifying Graph Convolution and Contrastive Learning in Collaborative Filtering

Authors: Yihong Wu, Le Zhang, Fengran Mo, Tianyu Zhu, Weizhi Ma, Jian-Yun Nie

Abstract: Graph-based models and contrastive learning have emerged as prominent methods in Collaborative Filtering (CF). While many existing models in CF incorporate these methods in their design, there seems to be a limited depth of analysis regarding the foundational principles behind them. This paper bridges graph convolution, a pivotal element of graph-based models, with contrastive learning through a theoretical framework. By examining the learning dynamics and equilibrium of the contrastive loss, we offer a fresh lens to understand contrastive learning via graph theory, emphasizing its capability to capture high-order connectivity. Building on this analysis, we further show that the graph convolutional layers often used in graph-based models are not essential for high-order connectivity modeling and might contribute to the risk of oversmoothing. Stemming from our findings, we introduce Simple Contrastive Collaborative Filtering (SCCF), a simple and effective algorithm based on a naive embedding model and a modified contrastive loss. The efficacy of the algorithm is demonstrated through extensive experiments across four public datasets. The experiment code is available at https://github.com/wu1hong/SCCF.

Submitted 21 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: KDD 2024

arXiv:2406.12718 [pdf, other]

AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

Authors: Wenbin An, Feng Tian, Sicong Leng, Jiahao Nie, Haonan Lin, QianYing Wang, Guang Dai, Ping Chen, Shijian Lu

Abstract: Despite their great success across various multimodal tasks, Large Vision-Language Models (LVLMs) are facing a prevalent problem with object hallucinations, where the generated textual responses are inconsistent with ground-truth objects in the given image. This paper investigates various LVLMs and pinpoints attention deficiency toward discriminative local image features as one root cause of object hallucinations. Specifically, LVLMs predominantly attend to prompt-independent global image features, while failing to capture prompt-relevant local features, consequently undermining the visual grounding capacity of LVLMs and leading to hallucinations. To this end, we propose Assembly of Global and Local Attention (AGLA), a training-free and plug-and-play approach that mitigates object hallucinations by exploring an ensemble of global features for response generation and local features for visual discrimination simultaneously. Our approach exhibits an image-prompt matching scheme that captures prompt-relevant local features from images, leading to an augmented view of the input image where prompt-relevant content is reserved while irrelevant distractions are masked. With the augmented view, a calibrated decoding distribution can be derived by integrating generative global features from the original image and discriminative local features from the augmented image. Extensive experiments show that AGLA consistently mitigates object hallucinations and enhances general perception capability for LVLMs across various discriminative and generative benchmarks. Our code will be released at https://github.com/Lackel/AGLA.

Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.09121 [pdf, other]

MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era

Authors: Jiahao Nie, Gongjie Zhang, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu

Abstract: Despite the recent advancements in Multi-modal Large Language Models (MLLMs), understanding inter-object relations, i.e., interactions or associations between distinct objects, remains a major challenge for such models. This issue significantly hinders their advanced reasoning capabilities and is primarily due to the lack of large-scale, high-quality, and diverse multi-modal data essential for training and evaluating MLLMs. In this paper, we provide a taxonomy of inter-object relations and introduce Multi-Modal Relation Understanding (MMRel), a comprehensive dataset designed to bridge this gap by providing large-scale, high-quality and diverse data for studying inter-object relations with MLLMs. MMRel features three distinctive attributes: (i) It includes over 15K question-answer pairs, which are sourced from three distinct domains, ensuring large scale and high diversity; (ii) It contains a subset featuring highly unusual relations, on which MLLMs often fail due to hallucinations, thus are very challenging; (iii) It provides manually verified high-quality labels for inter-object relations. Thanks to these features, MMRel is ideal for evaluating MLLMs on relation understanding, as well as being used to fine-tune MLLMs to enhance relation understanding and even benefit overall performance in various vision-language tasks. Extensive experiments on various popular MLLMs validate the effectiveness of MMRel. Both MMRel dataset and the complete labeling scripts have been made publicly available.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08921 [pdf, other]

doi 10.1093/mnras/stae1415

AuriDESI: Mock Catalogues for the DESI Milky Way Survey

Authors: Namitha Kizhuprakkat, Andrew P. Cooper, Alexander H. Riley, Sergey E. Koposov, Jessica Nicole Aguilar, Steven Ahlen, Carlos Allende Prieto, David Brooks, Todd Claybaugh, Kyle Dawson, Axel de la Macorra, Peter Doel, Jaime E. Forero-Romero, Carlos Frenk, Enrique Gaztañaga, Oleg Y. Gnedin, Robert J. J. Grand, Satya Gontcho A Gontcho, Klaus Honscheid, Robert Kehoe, Martin Landriau, Marc Manera, Aaron Meisner, Ramon Miquel, Jundan Nie , et al. (9 additional authors not shown)

Abstract: The Dark Energy Spectroscopic Instrument Milky Way Survey (DESI MWS) will explore the assembly history of the Milky Way by characterising remnants of ancient dwarf galaxy accretion events and improving constraints on the distribution of dark matter in the outer halo. We present mock catalogues that reproduce the selection criteria of MWS and the format of the final MWS data set. These catalogues can be used to test methods for quantifying the properties of stellar halo substructure and reconstructing the Milky Way's accretion history with the MWS data, including the effects of halo-to-halo variance. The mock catalogues are based on a phase-space kernel expansion technique applied to star particles in the Auriga suite of six high-resolution $\Lambda$CDM magneto-hydrodynamic zoom-in simulations. They include photometric properties (and associated errors) used in DESI target selection and the outputs of the MWS spectral analysis pipeline (radial velocity, metallicity, surface gravity, and temperature). They also include information from the underlying simulation, such as the total gravitational potential and information on the progenitors of accreted halo stars. We discuss how the subset of halo stars observable by MWS in these simulations corresponds to their true content and properties. These mock Milky Ways have rich accretion histories, resulting in a large number of substructures that span the whole stellar halo out to large distances and have substantial overlap in the space of orbital energy and angular momentum.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted for publication in MNRAS, 31 pages, 27 figures, 7 tables. The mock catalogues are available at https://data.desi.lbl.gov/public/papers/mws/auridesi/v1

arXiv:2406.06882 [pdf, ps, other]

A Characterization for Tightness of the Sparse Moment-SOS Hierarchy

Authors: Jiawang Nie, Zheng Qu, Xindong Tang, Linghao Zhang

Abstract: This paper studies the sparse Moment-SOS hierarchy of relaxations for solving sparse polynomial optimization problems. We show that this sparse hierarchy is tight if and only if the objective can be written as a sum of sparse nonnegative polynomials, each of which belongs to the sum of the ideal and quadratic module generated by the corresponding sparse constraints. Based on this characterization, we give several sufficient conditions for the sparse Moment-SOS hierarchy to be tight. In particular, we show that this sparse hierarchy is tight under some assumptions such as convexity, optimality conditions or finiteness of constraining sets.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 27 pages

arXiv:2406.05013 [pdf, other]

CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search

Authors: Fengran Mo, Abbas Ghaddar, Kelong Mao, Mehdi Rezagholizadeh, Boxing Chen, Qun Liu, Jian-Yun Nie

Abstract: In this paper, we study how open-source large language models (LLMs) can be effectively deployed for improving query rewriting in conversational search, especially for ambiguous queries. We introduce CHIQ, a two-step method that leverages the capabilities of LLMs to resolve ambiguities in the conversation history before query rewriting. This approach contrasts with prior studies that predominantly use closed-source LLMs to directly generate search queries from conversation history. We demonstrate on five well-established benchmarks that CHIQ leads to state-of-the-art results across most settings, showing highly competitive performances with systems leveraging closed-source LLMs. Our study provides a first step towards leveraging open-source LLMs in conversational search, as a competitive alternative to the prevailing reliance on commercial LLMs. Data, models, and source code will be publicly available upon acceptance at https://github.com/fengranMark/CHIQ.

Submitted 26 September, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted by EMNLP 2024

arXiv:2406.03249 [pdf, other]

Near-field Beam training for Extremely Large-scale MIMO Based on Deep Learning

Authors: Jiali Nie, Yuanhao Cui, Zhaohui Yang, Weijie Yuan, Xiaojun Jing

Abstract: Extremely Large-scale Array (ELAA) is considered a frontier technology for future communication systems, pivotal in improving wireless systems' rate and spectral efficiency. As ELAA employs a multitude of antennas operating at higher frequencies, users are typically situated in the near-field region where the spherical wavefront propagates. The near-field beam training in ELAA requires both angle and distance information, which inevitably leads to a significant increase in the beam training overhead. To address this problem, we propose a near-field beam training method based on deep learning. We use a convolutional neural network (CNN) to efficiently learn channel characteristics from historical data by strategically selecting padding and kernel sizes. The negative value of the user average achievable rate is utilized as the loss function to optimize the beamformer. This method maximizes multi-user networks' achievable rate without predefined beam codebooks. Upon deployment, the model requires solely the pre-estimated channel state information (CSI) to derive the optimal beamforming vector. The simulation results demonstrate that the proposed scheme achieves a more stable beamforming gain and significantly improves performance compared to the traditional beam training method. Furthermore, owing to the inherent traits of deep learning methodologies, this approach substantially diminishes the near-field beam training overhead.

Submitted 23 August, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.00972 [pdf, other]

doi 10.1088/1674-4527/ad26b6

All-sky Guide Star Catalog for CSST

Authors: Hui-Mei Feng, Zi-Huang Cao, Man I Lam, Ran Li, Hao Tian, Da-Yi Yin, Yuan-Yu Yang, Xin Zhang, Dong-Wei Fan, Yi-Qiao Dong, Xin-Feng Li, Wei Wang, Long Li, Hugh R. A. Jones, Yi-Han Tao, Jia-Lu Nie, Pei-Pei Wang, Mao-Yuan Liu, He-jun Yang, Chao Liu

Abstract: The China Space Station Telescope (CSST) is a two-meter space telescope with multiple back-end instruments. The Fine Guidance Sensor (FGS) is an essential subsystem of the CSST Precision Image Stability System to ensure the required absolute pointing accuracy and line-of-sight stabilization. In this study, we construct the Main Guide Star Catalog for FGS. To accomplish this, we utilize the information about the FGS and object information from the Gaia Data Release 3. We provide an FGS instrument magnitude and exclude variables, binaries, and high proper motion stars from the catalog to ensure uniform FGS guidance capabilities. Subsequently, we generate a HEALPix index, which provides a hierarchical tessellation of the celestial sphere, and employ the Voronoi algorithm to achieve a homogeneous distribution of stars across the catalog. This distribution ensures adequate coverage and sampling of the sky. The performance of the CSST guide star catalog was assessed by simulating the field of view of the FGS according to the CSST mock survey strategy catalog. The analysis of the results indicates that this catalog provides adequate coverage and accuracy. The catalog's performance meets the FGS requirements, ensuring the functioning of the FGS and its guidance capabilities.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Published in RAA

arXiv:2405.18589 [pdf, other]

Candidate strongly-lensed Type Ia supernovae in the Zwicky Transient Facility archive

Authors: A. Townsend, J. Nordin, A. Sagués Carracedo, M. Kowalski, N. Arendse, S. Dhawan, A. Goobar, J. Johansson, E. Mörtsell, S. Schulze, I. Andreoni, E. Fernández, A. G. Kim, P. E. Nugent, F. Prada, M. Rigault, N. Sarin, D. Sharma, E. C. Bellm, M. W. Coughlin, R. Dekany, S. L. Groom, L. Lacroix, R. R. Laher, R. Riddle, et al. (39 additional authors not shown)

Abstract: Gravitationally lensed Type Ia supernovae (glSNe Ia) are unique astronomical tools for studying cosmological parameters, distributions of dark matter, the astrophysics of the supernovae and the intervening lensing galaxies themselves. Only a few highly magnified glSNe Ia have been discovered by ground-based telescopes, such as the Zwicky Transient Facility (ZTF), but simulations predict the existence of a fainter, undetected population. We present a systematic search in the ZTF archive of alerts from 1 June 2019 to 1 September 2022. Using the AMPEL platform, we developed a pipeline that distinguishes candidate glSNe Ia from other variable sources. Initial cuts were applied to the ZTF alert photometry before forced photometry was obtained for the remaining candidates. Additional cuts were applied to refine the candidates based on their light curve colours, lens galaxy colours, and the resulting parameters from fits to the SALT2 SN Ia template. Candidates were also cross-matched with the DESI spectroscopic catalogue. Seven transients passed all the cuts and had an associated galaxy DESI redshift, which we present as glSN Ia candidates. While superluminous supernovae (SLSNe) cannot be fully rejected, two events, ZTF19abpjicm and ZTF22aahmovu, are significantly different from typical SLSNe and their light curves can be modelled as two-image glSN Ia systems. From this two-image modelling, we estimate time delays of 22 $\pm$ 3 and 34 $\pm$ 1 days for the two events, respectively, which suggests that we have uncovered a population with longer time delays.
The pipeline is efficient and sensitive enough to parse full alert streams. It is currently being applied to the live ZTF alert stream to identify and follow-up future candidates while active. This pipeline could be the foundation for glSNe Ia searches in future surveys, like the Vera C. Rubin Observatory's Legacy Survey of Space and Time.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 21 pages, 15 figures

arXiv:2405.16671 [pdf, other]

Mixture of Experts Using Tensor Products

Authors: Zhan Su, Fengran Mo, Prayag Tiwari, Benyou Wang, Jian-Yun Nie, Jakob Grue Simonsen

Abstract: In multi-task learning, the conventional approach involves training a model on multiple tasks simultaneously. However, the training signals from different tasks can interfere with one another, potentially leading to \textit{negative transfer}. To mitigate this, we investigate if modular language models can facilitate positive transfer and systematic generalization. Specifically, we propose a novel modular language model (\texttt{TensorPoly}), that balances parameter efficiency with nuanced routing methods. For \textit{modules}, we reparameterize Low-Rank Adaptation (\texttt{LoRA}) by employing an entangled tensor through the use of tensor product operations and name the resulting approach \texttt{TLoRA}. For \textit{routing function}, we tailor two innovative routing functions according to the granularity: \texttt{TensorPoly-I} which directs to each rank within the entangled tensor while \texttt{TensorPoly-II} offers a finer-grained routing approach targeting each order of the entangled tensor. The experimental results from the multi-task T0-benchmark demonstrate that: 1) all modular LMs surpass the corresponding dense approaches, highlighting the potential of modular language models to mitigate negative interference in multi-task learning and deliver superior outcomes. 2) \texttt{TensorPoly-I} achieves higher parameter efficiency in adaptation and outperforms other modular LMs, which shows the potential of our approach in multi-task transfer learning.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16657 [pdf, other]

ELG Spectroscopic Systematics Analysis of the DESI Data Release 1

Authors: Jiaxi Yu, Ashley J. Ross, Antoine Rocher, Otávio Alves, Arnaud de Mattia, Daniel Forero-Sánchez, Jean-Paul Kneib, Alex Krolewski, TingWen Lan, Michael Rashkovetskyi, Jessica Nicole Aguilar, Steven Ahlen, Stephen Bailey, David Brooks, Edmond Chaussidon, Todd Claybaugh, Axel de la Macorra, Arjun Dey, Biprateep Dey, Peter Doel, Kevin Fanning, Jaime E. Forero-Romero, Enrique Gaztañaga, Satya Gontcho A Gontcho, Klaus Honscheid, et al. (36 additional authors not shown)

Abstract: Dark Energy Spectroscopic Instrument (DESI) uses more than 2.4 million Emission Line Galaxies (ELGs) for 3D large-scale structure (LSS) analyses in its Data Release 1 (DR1). Such large statistics enable thorough research on systematic uncertainties. In this study, we focus on spectroscopic systematics of ELGs. The redshift success rate ($f_{\rm goodz}$) is the relative fraction of secure redshifts among all measurements. It depends on observing conditions, thus introduces non-cosmological variations to the LSS. We, therefore, develop the redshift failure weight ($w_{\rm zfail}$) and a per-fibre correction ($η_{\rm zfail}$) to mitigate these dependences. They have minor influences on the galaxy clustering. For ELGs with a secure redshift, there are two subtypes of systematics: 1) catastrophics (large) that only occur in a few samples; 2) redshift uncertainty (small) that exists for all samples. The catastrophics represent 0.26\% of the total DR1 ELGs, composed of the confusion between O\,\textsc{ii} and sky residuals, double objects, total catastrophics and others. We simulate the realistic 0.26\% catastrophics of DR1 ELGs, the hypothetical 1\% catastrophics, and the truncation of the contaminated $1.31<z<1.33$ in the \textsc{AbacusSummit} ELG mocks. Their $P_\ell$ show non-negligible bias from the uncontaminated mocks. But their influences on the redshift space distortions (RSD) parameters are smaller than $0.2σ$. The redshift uncertainty of \Yone ELGs is 8.5 km/s with a Lorentzian profile.
The code for implementing the catastrophics and redshift uncertainty on mocks can be found in https://github.com/Jiaxi-Yu/modelling_spectro_sys.

Submitted 26 September, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16593 [pdf, other]

The Construction of Large-scale Structure Catalogs for the Dark Energy Spectroscopic Instrument

Authors: A. J. Ross, J. Aguilar, S. Ahlen, S. Alam, A. Anand, S. Bailey, D. Bianchi, S. Brieden, D. Brooks, E. Burtin, A. Carnero Rosell, E. Chaussidon, T. Claybaugh, S. Cole, K. Dawson, A. de la Macorra, A. de Mattia, Arjun Dey, Biprateep Dey, P. Doel, K. Fanning, S. Ferraro, J. Ereza, A. Font-Ribera, J. E. Forero-Romero, et al. (61 additional authors not shown)

Abstract: We present the technical details on how large-scale structure (LSS) catalogs are constructed from redshifts measured from spectra observed by the Dark Energy Spectroscopic Instrument (DESI). The LSS catalogs provide the information needed to determine the relative number density of DESI tracers as a function of redshift and celestial coordinates and, e.g., determine clustering statistics. We produce catalogs that are weighted subsamples of the observed data, each matched to a weighted `random' catalog that forms an unclustered sampling of the probability density that DESI could have observed those data at each location. Precise knowledge of the DESI observing history and associated hardware performance allows for a determination of the DESI footprint and the number of times DESI has covered it at sub-arcsecond level precision. This enables the completeness of any DESI sample to be modeled at this same resolution. The pipeline developed to create LSS catalogs has been designed to easily allow robustness tests and enable future improvements. We describe how it allows ongoing work improving the match between galaxy and random catalogs, such as including further information when assigning redshifts to randoms, accounting for fluctuations in target density, accounting for variation in the redshift success rate, and accommodating blinding schemes.

Submitted 18 July, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

Comments: Accepted (by JCAP) version of supporting publication of DESI 2024II: Sample definitions, characteristics, and two-point clustering statistics

arXiv:2405.16299 [pdf, other]

Forward modeling fluctuations in the DESI LRGs target sample using image simulations

Authors: Hui Kong, Ashley J. Ross, Klaus Honscheid, Dustin Lang, Anna Porredon, Arnaud de Mattia, Mehdi Rezaie, Rongpu Zhou, Edward Schlafly, John Moustakas, Alberto Rosado-Marin, Jessica Nicole Aguilar, Steven Ahlen, David Brooks, Edmond Chaussidon, Todd Claybaugh, Shaun Cole, Axel de la Macorra, Arjun Dey, Biprateep Dey, Peter Doel, Kevin Fanning, Jaime E. Forero-Romero, Enrique Gaztanaga, Satya Gontcho A Gontcho, et al. (28 additional authors not shown)

Abstract: We use the forward modeling pipeline, Obiwan, to study the imaging systematics of the Luminous Red Galaxies (LRGs) targeted by the Dark Energy Spectroscopic Instrument (DESI). We update the Obiwan pipeline, which had previously been developed to simulate the optical images used to target DESI data, to further simulate WISE images in the infrared. This addition makes it possible to simulate the DESI LRGs sample, which utilizes WISE data in the target selection. Deep DESI imaging data combined with a method to account for biases in their shapes is used to define a truth sample of potential LRG targets. We simulate a total of 15 million galaxies to obtain a simulated LRG sample (Obiwan LRGs) that predicts the variations in target density due to imaging properties. We find that the simulations predict the trends with depth observed in the data, including how they depend on the intrinsic brightness of the galaxies. We observe that faint LRGs are the main contributing power of the imaging systematics trend induced by depth. We also find significant trends in the data against Galactic extinction that are not predicted by Obiwan. These trends depend strongly on the particular map of Galactic extinction chosen to test against, implying Large-Scale Structure systematic contamination (e.g. Cosmic-Infrared Background) in the Galactic extinction maps is a likely root cause.
We additionally observe that the DESI LRGs sample exhibits a complex dependency on a combination of seeing, depth, and intrinsic galaxy brightness, which is not replicated by Obiwan, suggesting discrepancies between the current simulation settings and the actual observations. The detailed findings we present should be used to guide any observational systematics mitigation treatment for the clustering of the DESI LRG sample.

Submitted 4 October, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: 46 pages, 26 figures

arXiv:2405.15829 [pdf, other]

Spatio-temporal Value Semantics-based Abstraction for Dense Deep Reinforcement Learning

Authors: Jihui Nie, Dehui Du, Jiangnan Zhao

Abstract: Intelligent Cyber-Physical Systems (ICPS) represent a specialized form of Cyber-Physical System (CPS) that incorporates intelligent components, notably Convolutional Neural Networks (CNNs) and Deep Reinforcement Learning (DRL), to undertake multifaceted tasks encompassing perception, decision-making, and control. The utilization of DRL for decision-making facilitates dynamic interaction with the environment, generating control actions aimed at maximizing cumulative rewards. Nevertheless, the inherent uncertainty of the operational environment and the intricate nature of ICPS necessitate exploration within complex and dynamic state spaces during the learning phase. DRL confronts challenges in terms of efficiency, generalization capabilities, and data scarcity during the decision-making process. In response to these challenges, we propose an innovative abstract modeling approach grounded in spatial-temporal value semantics, capturing the evolution in the distribution of semantic value across time and space. A semantics-based abstraction is introduced to construct an abstract Markov Decision Process (MDP) for the DRL learning process. Furthermore, optimization techniques for abstraction are delineated, aiming to refine the abstract model and mitigate semantic gaps between abstract and concrete states. The efficacy of the abstract modeling is assessed through the evaluation and analysis of the abstract MDP model using PRISM.
A series of experiments is conducted, involving diverse scenarios such as lane-keeping, adaptive cruise control, and intersection crossroad assistance, to demonstrate the effectiveness of our abstraction approach.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 24 pages, 7 figures, conference

MSC Class: 68N30; ACM Class: D.2.4

arXiv:2405.14988 [pdf, other]

CMB lensing and Lyα forest cross bispectrum from DESI's first-year quasar sample

Authors: N. G. Karaçaylı, P. Martini, D. H. Weinberg, S. Ferraro, R. de Belsunce, J. Aguilar, S. Ahlen, E. Armengaud, D. Brooks, T. Claybaugh, A. de la Macorra, B. Dey, P. Doel, K. Fanning, J. E. Forero-Romero, S. Gontcho A Gontcho, A. X. Gonzalez-Morales, G. Gutierrez, J. Guy, K. Honscheid, D. Kirkby, T. Kisner, A. Kremin, A. Lambert, M. Landriau, et al. (28 additional authors not shown)

Abstract: The squeezed cross-bispectrum between the gravitational lensing in the Cosmic Microwave Background and the 1D Lyα forest power spectrum can constrain bias parameters and break degeneracies between $σ_8$ and other cosmological parameters. We detect this cross-bispectrum with $4.8σ$ significance at an effective redshift $z_\mathrm{eff}=2.4$ using the Planck PR3 lensing map and over 280,000 quasar spectra from the Dark Energy Spectroscopic Instrument's first-year data. We test our measurement against metal contamination and foregrounds such as Galactic extinction and clusters of galaxies by deprojecting the thermal Sunyaev-Zeldovich effect. We compare our results to a tree-level perturbation theory calculation and find reasonable agreement between the model and measurement.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 13 pages excluding references, 8 figures

arXiv:2405.13325 [pdf, other]

DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying

Authors: Guanghui Wang, Dexi Liu, Jian-Yun Nie, Qizhi Wan, Rong Hu, Xiping Liu, Wanlong Liu, Jiaming Liu

Abstract: Recent advancements in event argument extraction (EAE) involve incorporating useful auxiliary information into models during training and inference, such as retrieved instances and event templates. These methods face two challenges: (1) the retrieval results may be irrelevant and (2) templates are developed independently for each event without considering their possible relationship. In this work, we propose DEGAP to address these challenges through simple yet effective components: dual prefixes, i.e. learnable prompt vectors, where the instance-oriented prefix and template-oriented prefix are trained to learn information from different event instances and templates. Additionally, we propose an event-guided adaptive gating mechanism, which can adaptively leverage possible connections between different events and thus capture relevant information from the prefix. Finally, these event-guided prefixes provide relevant information as cues to the EAE model without retrieval. Extensive experiments demonstrate that our method achieves new state-of-the-art performance on four datasets (ACE05, RAMS, WIKIEVENTS, and MLEE). Further analysis shows the impact of different components.

Submitted 15 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.10936 [pdf, other]

A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers

Authors: Kaiyu Huang, Fengran Mo, Hongliang Li, You Li, Yuanchi Zhang, Weijian Yi, Yulong Mao, Jinchen Liu, Yuzhuang Xu, Jinan Xu, Jian-Yun Nie, Yang Liu

Abstract: The rapid development of Large Language Models (LLMs) demonstrates remarkable multilingual capabilities in natural language processing, attracting global attention in both academia and industry. To mitigate potential discrimination and enhance the overall usability and accessibility for diverse language user groups, it is important to develop language-fair technology. Despite the breakthroughs of LLMs, the investigation into the multilingual scenario remains insufficient, and a comprehensive survey summarizing recent approaches, developments, limitations, and potential solutions is desirable. To this end, we provide a survey with multiple perspectives on the utilization of LLMs in the multilingual scenario. We first rethink the transitions between previous and current research on pre-trained language models. Then we introduce several perspectives on the multilingualism of LLMs, including training and inference methods, model security, multi-domain with language culture, and usage of datasets. We also discuss the major challenges that arise in these aspects, along with possible solutions. In addition, we highlight future research directions that aim at further enhancing LLMs with multilingualism. The survey aims to help the research community address multilingual problems and provide a comprehensive understanding of the core concepts, key techniques, and latest developments in multilingual natural language processing based on LLMs.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 54 pages, Work in Progress

arXiv:2405.09487 [pdf, other]

Color Space Learning for Cross-Color Person Re-Identification

Authors: Jiahao Nie, Shan Lin, Alex C. Kot

Abstract: The primary color profile of the same identity is assumed to remain consistent in typical Person Re-identification (Person ReID) tasks. However, this assumption may be invalid in real-world situations, where images exhibit varying color profiles because of cross-modality cameras or identities wearing different clothing. To address this issue, we propose Color Space Learning (CSL) for those Cross-Color Person ReID problems. Specifically, CSL guides the model to be less color-sensitive with two modules: Image-level Color-Augmentation and Pixel-level Color-Transformation. The first module increases the color diversity of the inputs and guides the model to focus more on the non-color information. The second module projects every pixel of input images onto a new color space. In addition, we introduce a new Person ReID benchmark across RGB and Infrared modalities, NTU-Corridor, which is the first with privacy agreements from all participants. To assess the effectiveness and robustness of our proposed CSL, we evaluate it on several Cross-Color Person ReID benchmarks. Our method surpasses the state-of-the-art methods consistently. The code and benchmark are available at: https://github.com/niejiahao1998/CSL

Submitted 15 May, 2024; originally announced May 2024.

Comments: Accepted by ICME 2024 (Oral)

arXiv:2405.08633 [pdf, other]

On the superconducting gap structure of the miassite Rh17S15: Nodal or nodeless?

Authors: J. Y. Nie, C. C. Zhao, C. Q. Xu, B. Li, C. P. Tu, X. Zhang, D. Z. Dai, H. R. Wang, S. Xu, Wenhe Jiao, B. M. Wang, Zhu'an Xu, Xiaofeng Xu, S. Y. Li

Abstract: A recent penetration depth measurement claimed the observation of unconventional superconductivity in miassite Rh$_{17}$S$_{15}$ single crystals, evidenced by the linear-in-temperature penetration depth at low temperatures, thereby arguing for the presence of line nodes in its superconducting gap structure. Here we measure the thermal conductivity of Rh$_{17}$S$_{15}$ single crystals down to 110 mK and up to a field of 8 T ($\simeq 0.4H{\rm_{c2}}$). In marked contrast to the penetration depth measurement, we observe a negligible residual linear term $κ_0/T$ in zero field, in line with a nodeless gap structure. The field dependence of $κ_0(H)/T$ shows a profile that is more consistent with either a highly anisotropic gap structure or multiple nodeless gaps with significantly different magnitudes. Moreover, first-principles calculations give two electronic bands with complex Fermi surface shapes. These results suggest multigap nodeless superconductivity in this multiband Rh$_{17}$S$_{15}$ superconductor.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 7 pages, 6 figures

arXiv:2405.08314 [pdf, other]

Probing the impact of radio-mode feedback on the properties of the cool circumgalactic medium

Authors: Yu-Ling Chang, Ting-Wen Lan, J. Xavier Prochaska, Lucas Napolitano, Abhijeet Anand, J. Aguilar, S. Ahlen, D. Brooks, T. Claybaugh, A. de la Macorra, Arjun Dey, P. Doel, S. Gontcho A Gontcho, J. Guy, S. Juneau, T. Kisner, A. Lambert, M. Landriau, L. Le Guillou, M. Manera, P. Martini, A. Meisner, R. Miquel, J. Moustakas, A. D. Myers , et al. (11 additional authors not shown)

Abstract: We explore the influence of radio-mode feedback on the properties of the cool circumgalactic medium (CGM). To this end, we assemble a statistical sample of approximately 30,000 radio galaxies with background quasars by combining optical spectroscopic measurements of luminous red galaxies (LRGs) and quasars from the year 1 dataset of Dark Energy Spectroscopic Instrument (DESI) and radio sources from the LOw-Frequency ARray Two-metre Sky Survey (LoTSS) DR2 catalog and the Very Large Array Sky Survey (VLASS) quick look catalog. Galaxies with similar optical properties but with no radio counterparts in LoTSS and VLASS are selected as the control group. We measure the cool CGM properties of radio galaxies and their control samples traced by MgII absorption lines, including covering fraction, rest equivalent width, and gas kinematics. Our results show no significant difference in the properties of gas around radio galaxies and their control sample, indicating that the operating radio-mode feedback of massive galaxies does not produce detectable effects on the properties of the cool CGM. Finally, we show that the CGM of radio galaxies contain a non-negligible amount of cool gas with approximately 10^10 solar masses. This abundance can place a stringent constraint on the radio-mode feedback models.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 20 pages, 12 figures

arXiv:2405.06743 [pdf, other]

New measurements of the Lyman-$α$ forest continuum and effective optical depth with LyCAN and DESI Y1 data

Authors: Wynne Turner, Paul Martini, Naim Göksel Karaçaylı, J. Aguilar, S. Ahlen, D. Brooks, T. Claybaugh, A. de la Macorra, A. Dey, P. Doel, K. Fanning, J. E. Forero-Romero, S. Gontcho A Gontcho, A. X. Gonzalez-Morales, G. Gutierrez, J. Guy, H. K. Herrera-Alcantar, K. Honscheid, S. Juneau, T. Kisner, A. Kremin, A. Lambert, M. Landriau, L. Le Guillou, A. Meisner , et al. (20 additional authors not shown)

Abstract: We present the Lyman-$α$ Continuum Analysis Network (LyCAN), a Convolutional Neural Network that predicts the unabsorbed quasar continuum within the rest-frame wavelength range of $1040-1600$ Angstroms based on the red side of the Lyman-$α$ emission line ($1216-1600$ Angstroms). We developed synthetic spectra based on a Gaussian Mixture Model representation of Nonnegative Matrix Factorization (NMF) coefficients. These coefficients were derived from high-resolution, low-redshift ($z<0.2$) Hubble Space Telescope/Cosmic Origins Spectrograph quasar spectra. We supplemented this COS-based synthetic sample with an equal number of DESI Year 5 mock spectra. LyCAN performs extremely well on testing sets, achieving a median error in the forest region of 1.5% on the DESI mock sample, 2.0% on the COS-based synthetic sample, and 4.1% on the original COS spectra. LyCAN outperforms Principal Component Analysis (PCA)- and NMF-based prediction methods using the same training set by 40% or more. We predict the intrinsic continua of 83,635 DESI Year 1 spectra in the redshift range of $2.1 \leq z \leq 4.2$ and perform an absolute measurement of the evolution of the effective optical depth. This is the largest sample employed to measure the optical depth evolution to date. We fit a power-law of the form $τ(z) = τ_0 (1+z)^γ$ to our measurements and find $τ_0 = (2.46 \pm 0.14)\times10^{-3}$ and $γ= 3.62 \pm 0.04$. Our results show particular agreement with high-resolution, ground-based observations around $z = 2$, indicating that LyCAN is able to predict the quasar continuum in the forest region with only spectral information outside the forest.

Submitted 6 September, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Comments: 23 pages, 15 figures, 3 tables; accepted to ApJ

arXiv:2405.03926 [pdf, ps, other]

Generalized Nash equilibrium problems with quasi-linear constraints

Authors: Jiyoung Choi, Jiawang Nie, Xindong Tang, Suhan Zhong

Abstract: We study generalized Nash equilibrium problems (GNEPs) such that objectives are polynomial functions, and each player's constraints are linear in their own strategy. For such GNEPs, the KKT sets can be represented as unions of simpler sets by Carathéodory's theorem. We give a convenient representation for KKT sets using partial Lagrange multiplier expressions. This produces a set of branch polynomial optimization problems, which can be efficiently solved by Moment-SOS relaxations. By doing this, we can compute all generalized Nash equilibria or detect their nonexistence. Numerical experiments are also provided to demonstrate the computational efficiency.

Submitted 29 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

MSC Class: 90C23; 90C33; 91A10; 65K05

arXiv:2405.03857 [pdf, other]

The MOST Hosts Survey: spectroscopic observation of the host galaxies of ~40,000 transients using DESI

Authors: Maayane T. Soumagnac, Peter Nugent, Robert A. Knop, Anna Y. Q. Ho, William Hohensee, Autumn Awbrey, Alexis Andersen, Greg Aldering, Matan Ventura, Jessica N. Aguilar, Steven Ahlen, Segev Y. Benzvi, David Brooks, Dillon Brout, Todd Claybaugh, Tamara M. Davis, Kyle Dawson, Axel de la Macorra, Arjun Dey, Biprateep Dey, Peter Doel, Kelly A. Douglass, Jaime E. Forero-Romero, Enrique Gaztanaga, Satya Gontcho A Gontcho , et al. (32 additional authors not shown)

Abstract: We present the MOST Hosts survey (Multi-Object Spectroscopy of Transient Hosts). The survey is planned to run throughout the five years of operation of the Dark Energy Spectroscopic Instrument (DESI) and will generate a spectroscopic catalog of the hosts of most transients observed to date, in particular all the supernovae observed by most public, untargeted, wide-field, optical surveys (PTF/iPTF, SDSS II, ZTF, DECAT, DESIRT). Scientific questions for which the MOST Hosts survey will be useful include Type Ia supernova cosmology, fundamental plane and peculiar velocity measurements, and the understanding of the correlations between transients and their host galaxy properties. Here, we present the first release of the MOST Hosts survey: 21,931 hosts of 20,235 transients. These numbers represent 36% of the final MOST Hosts sample, consisting of 60,212 potential host galaxies of 38,603 transients (a transient can be assigned multiple potential hosts). Of these galaxies, 40% do not appear in the DESI primary target list and therefore require a specific program like MOST Hosts. Of all the transients in the MOST Hosts list, only 26.7% have existing classifications, and so the survey will provide redshifts (and luminosities) for nearly 30,000 transients. A preliminary Hubble diagram and a transient luminosity-duration diagram are shown as examples of future potential uses of the MOST Hosts survey. The survey will also provide a training sample of spectroscopically observed transients for photometry-only classifiers, as we enter an era when most newly observed transients will lack spectroscopic classification. The MOST Hosts DESI survey data will be released through the Wiserep platform on a rolling cadence and updated to match the DESI releases. Dates of future releases and updates are available through the https://mosthosts.desi.lbl.gov website.

Submitted 6 May, 2024; originally announced May 2024.

Comments: Submitted to ApJS

arXiv:2405.03228 [pdf, other]

TED: Accelerate Model Training by Internal Generalization

Authors: Jinying Xiao, Ping Li, Jie Nie

Abstract: Large language models have demonstrated strong performance in recent years, but the high cost of training drives the need for efficient methods to compress dataset sizes. We propose TED pruning, a method that addresses the challenge of overfitting under high pruning ratios by quantifying the model's ability to improve performance on pruned data while fitting retained data, known as Internal Generalization (IG). TED uses an optimization objective based on Internal Generalization Distance (IGD), measuring changes in IG before and after pruning to align with true generalization performance and achieve implicit regularization. The IGD optimization objective was verified to allow the model to achieve the smallest upper bound on generalization error. The impact of small mask fluctuations on IG is studied through masks and Taylor approximation, and fast estimation of IGD is enabled. In analyzing continuous training dynamics, the prior effect of IGD is validated, and a progressive pruning strategy is proposed. Experiments on image classification, natural language understanding, and large language model fine-tuning show TED achieves lossless performance with 60-70% of the data. Upon acceptance, our code will be made publicly available.

Submitted 19 August, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

Showing 1–50 of 527 results for author: Nie, J