Search | arXiv e-print repository

FastForensics: Efficient Two-Stream Design for Real-Time Image Manipulation Detection

Authors: Yangxiang Zhang, Yuezun Li, Ao Luo, Jiaran Zhou, Junyu Dong

Abstract: With the rise in popularity of portable devices, the spread of falsified media on social platforms has become rampant. This necessitates the timely identification of authentic content. However, most advanced detection methods are computationally heavy, hindering their real-time application. In this paper, we describe an efficient two-stream architecture for real-time image manipulation detection. Our method consists of two-stream branches targeting the cognitive and inspective perspectives. In the cognitive branch, we propose efficient wavelet-guided Transformer blocks to capture the global manipulation traces related to frequency. This block contains an interactive wavelet-guided self-attention module that integrates wavelet transformation with efficient attention design, interacting with the knowledge from the inspective branch. The inspective branch consists of simple convolutions that capture fine-grained traces and interact bidirectionally with Transformer blocks to provide mutual support. Our method is lightweight ($\sim$ 8M) but achieves competitive performance compared to many other counterparts, demonstrating its efficacy in image manipulation detection and its potential for portable integration.

Submitted 29 August, 2024; originally announced August 2024.

Comments: BMVC 2024

arXiv:2408.16299 [pdf, other]

The evolution of molecular clouds: global radial collapse

Authors: An-Xu Luo, Hong-Li Liu, Jin-Zeng Li

Abstract: The star formation efficiency (SFE) measures the proportion of molecular gas converted into stars, while the star formation rate (SFR) indicates the rate at which gas is transformed into stars. Here we propose such a model in the framework of a global radial collapse of molecular clouds, where the collapse velocity depends on the density profile and the initial mass-to-radius ratio of molecular clouds, with the collapse velocity accelerating during the collapse process. This simplified analytical model allows us to estimate a lifetime of giant molecular clouds of approximately $0.44-7.36 \times 10^7\, \rm{yr}$, and a star formation timescale of approximately $0.5-5.88 \times 10^6\, \rm{yr}$. Additionally, we can predict an SFE of approximately $1.59\, \%$, and an SFR of roughly $1.85\, \rm{M_{\odot} \, yr^{-1}}$ for the Milky Way in agreement with observations.

Submitted 29 August, 2024; originally announced August 2024.

Comments: 12 pages, 3 figures, 2 tables. Submitted to the ApJL

arXiv:2408.12791 [pdf, other]

Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture

Authors: Chenqi Kong, Anwei Luo, Peijun Bao, Haoliang Li, Renjie Wan, Zengwei Zheng, Anderson Rocha, Alex C. Kot

Abstract: Open-set face forgery detection poses significant security threats and presents substantial challenges for existing detection models. These detectors primarily have two limitations: they cannot generalize across unknown forgery domains and inefficiently adapt to new data. To address these issues, we introduce an approach that is both general and parameter-efficient for face forgery detection. It builds on the assumption that different forgery source domains exhibit distinct style statistics. Previous methods typically require fully fine-tuning pre-trained networks, consuming substantial time and computational resources. In turn, we design a forgery-style mixture formulation that augments the diversity of forgery source domains, enhancing the model's generalizability across unseen domains. Drawing on recent advancements in vision transformers (ViT) for face forgery detection, we develop a parameter-efficient ViT-based detection model that includes lightweight forgery feature extraction modules and enables the model to extract global and local forgery clues simultaneously. We only optimize the inserted lightweight modules during training, maintaining the original ViT structure with its pre-trained ImageNet weights. This training strategy effectively preserves the informative pre-trained knowledge while flexibly adapting the model to the task of Deepfake detection. Extensive experimental results demonstrate that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters, representing an important step toward open-set Deepfake detection in the wild.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2407.21240 [pdf, other]

FCN4Flare: Fully Convolution Neural Networks for Flare Detection

Authors: Ming-Hui Jia, A-Li Luo, Bo Qiu

Abstract: Stellar flares originate from the sudden reconnection of magnetic flux lines within the star's atmosphere, particularly in the chromosphere. Automated flare detection enables exploiting vast photometric datasets from missions like Kepler. Prior methods rely on outlier detection, facing challenges of complexity, detection accuracy, and scalability. This paper presents FCN4Flare, a deep learning approach using fully convolutional networks for precise point-to-point flare prediction regardless of light curve length. Key innovations include the NaN Mask to handle missing data, dilated convolutions to preserve local information, and the MaskDice loss to mitigate severe class imbalance. Experiments demonstrate significantly improved detection performance over previous models, with a 0.64 Dice coefficient on Kepler data. Applying FCN4Flare to Kepler and LAMOST, we compile a catalog of 30,285 high-confidence flares across 1426 stars. Flare energies are estimated and stellar/exoplanet properties analyzed, identifying pronounced activity for an M-dwarf hosting a habitable zone planet. This work overcomes limitations of prior flare detection methods via deep learning, enabling new scientific discoveries through analysis of photometric time-series data.

Submitted 30 July, 2024; originally announced July 2024.

Comments: 13 pages, 6 figures
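
The FCN4Flare abstract above reports a Dice coefficient on point-wise flare labels and mentions a mask for missing data. As a rough illustration of that metric only, the sketch below computes a Dice coefficient over binary labels while skipping missing points. The function name `masked_dice`, the use of `None` as the missing-data marker, and the `eps` smoothing term are assumptions made for this example; this is not the paper's MaskDice loss or NaN Mask implementation.

```python
def masked_dice(pred, target, eps=1e-8):
    """Dice coefficient over binary point-wise labels, ignoring missing points.

    pred, target: equal-length sequences of 0/1 values. None marks a missing
    point (a hypothetical stand-in for NaN samples in a light curve).
    eps avoids division by zero when both sequences contain no positives.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Keep only positions where both the prediction and the label are present.
    pairs = [(p, t) for p, t in zip(pred, target)
             if p is not None and t is not None]

    intersection = sum(p * t for p, t in pairs)
    total = sum(p for p, _ in pairs) + sum(t for _, t in pairs)

    # Dice = 2 * |A ∩ B| / (|A| + |B|)
    return (2.0 * intersection + eps) / (total + eps)


if __name__ == "__main__":
    # One true positive, one false positive, no false negatives.
    print(masked_dice([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because Dice counts only positive points, it is less dominated by the many non-flare points than plain accuracy, which is one common reason to prefer it on imbalanced labels.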

arXiv:2407.18772 [pdf, other]

Learning production functions for supply chains with graph neural networks

Authors: Serina Chang, Zhiyin Lin, Benjamin Yan, Swapnil Bembde, Qi Xiu, Chi Heem Wong, Yu Qin, Frank Kloster, Alex Luo, Raj Palleti, Jure Leskovec

Abstract: The global economy relies on the flow of goods over supply chain networks, with nodes as firms and edges as transactions between firms. While we may observe these external transactions, they are governed by unseen production functions, which determine how firms internally transform the input products they receive into output products that they sell. In this setting, it can be extremely valuable to infer these production functions, to better understand and improve supply chains, and to forecast future transactions more accurately. However, existing graph neural networks (GNNs) cannot capture these hidden relationships between nodes' inputs and outputs. Here, we introduce a new class of models for this setting, by combining temporal GNNs with a novel inventory module, which learns production functions via attention weights and a special loss function. We evaluate our models extensively on real supply chains data, along with data generated from our new open-source simulator, SupplySim. Our models successfully infer production functions, with a 6-50% improvement over baselines, and forecast future transactions on real and synthetic data, outperforming baselines by 11-62%.

Submitted 26 July, 2024; originally announced July 2024.

arXiv:2407.18754 [pdf, other]

Deep learning interpretable analysis for carbon star identification in Gaia DR3

Authors: Shuo Ye, Wen-Yuan Cui, Yin-Bi Li, A-Li Luo, R. A. Hugh Jones

Abstract: Context. A large fraction of Asymptotic Giant Branch (AGB) stars develop carbon-rich atmospheres during their evolution. Based on their color and luminosity, these carbon stars can be easily distinguished from many other kinds of stars. However, a large number of G, K, and M giants are also distributed in the same region as carbon stars on the HR diagram. Their spectra have differences, especially in the prominent CN molecular bands. Aims. We aim to distinguish carbon stars from other kinds of stars using Gaia's XP spectra, while providing attribution explanations of key features necessary for identification, and even discovering additional new spectral key features. Methods. We proposed a classification model named `GaiaNet', an improved one-dimensional convolutional neural network specifically designed for handling Gaia's XP spectra. We utilized the SHAP interpretability model to calculate the SHAP value for each feature point in a spectrum, enabling us to explain the output of the `GaiaNet' model and provide further meaningful analysis. Results. Compared to four traditional machine-learning methods, the `GaiaNet' model exhibits an average classification accuracy improvement of approximately 0.3% on the validation set, with the highest accuracy even reaching 100%. Utilizing the SHAP algorithm, we present a clear spectroscopic heatmap highlighting molecular band absorption features distributed mainly around CN773.3 and CN895.0, and summarize five crucial feature regions for carbon star identification. Upon applying the trained classification model to the CSTAR sample with Gaia `xp_sampled_mean' spectra, we obtained 451 new candidate carbon stars as a by-product.

Submitted 26 July, 2024; originally announced July 2024.

Comments: 23 pages, 22 figures

arXiv:2407.13133 [pdf, other]

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

Authors: Jianwei Zhao, Xin Li, Fan Yang, Qiang Zhai, Ao Luo, Zicheng Jiao, Hong Cheng

Abstract: Detecting objects seamlessly blended into their surroundings represents a complex task for both human cognitive capabilities and advanced artificial intelligence algorithms. Currently, the majority of methodologies for detecting camouflaged objects mainly focus on utilizing discriminative models with various unique designs. However, it has been observed that generative models, such as Stable Diffusion, possess stronger capabilities for understanding various objects in complex environments, yet their potential for the cognition and detection of camouflaged objects has not been extensively explored. In this study, we present a novel denoising diffusion model, namely FocusDiffuser, to investigate how generative models can enhance the detection and interpretation of camouflaged objects. We believe that the secret to spotting camouflaged objects lies in catching the subtle nuances in details. Consequently, our FocusDiffuser innovatively integrates specialized enhancements, notably the Boundary-Driven LookUp (BDLU) module and Cyclic Positioning (CP) module, to elevate standard diffusion models, significantly boosting the detail-oriented analytical capabilities. Our experiments demonstrate that FocusDiffuser, from a generative perspective, effectively addresses the challenge of camouflaged object detection, surpassing leading models on benchmarks like CAMO, COD10K and NC4K.

Submitted 17 July, 2024; originally announced July 2024.

Comments: 18 pages, 7 figures

arXiv:2407.11333 [pdf, other]

Disentangled Acoustic Fields For Multimodal Physical Scene Understanding

Authors: Jie Yin, Andrew Luo, Yilun Du, Anoop Cherian, Tim K. Marks, Jonathan Le Roux, Chuang Gan

Abstract: We study the problem of multimodal physical scene understanding, where an embodied agent needs to find fallen objects by inferring object properties, direction, and distance of an impact sound source. Previous works adopt feed-forward neural networks to directly regress the variables from sound, leading to poor generalization and domain adaptation issues. In this paper, we illustrate that learning a disentangled model of acoustic formation, referred to as disentangled acoustic field (DAF), to capture the sound generation and propagation process, enables the embodied agent to construct a spatial uncertainty map over where the objects may have fallen. We demonstrate that our analysis-by-synthesis framework can jointly infer sound properties by explicitly decomposing and factorizing the latent space of the disentangled model. We further show that the spatial uncertainty map can significantly improve the success rate for the localization of fallen objects by proposing multiple plausible exploration locations.

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.08939 [pdf, other]

LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models

Authors: Hai Jiang, Ao Luo, Xiaohong Liu, Songchen Han, Shuaicheng Liu

Abstract: In this paper, we propose a diffusion-based unsupervised framework that incorporates physically explainable Retinex theory with diffusion models for low-light image enhancement, named LightenDiffusion. Specifically, we present a content-transfer decomposition network that performs Retinex decomposition within the latent space instead of image space as in previous approaches, enabling the encoded features of unpaired low-light and normal-light images to be decomposed into content-rich reflectance maps and content-free illumination maps. Subsequently, the reflectance map of the low-light image and the illumination map of the normal-light image are taken as input to the diffusion model for unsupervised restoration with the guidance of the low-light feature, where a self-constrained consistency loss is further proposed to eliminate the interference of normal-light content on the restored results to improve overall visual quality. Extensive experiments on publicly available real-world benchmarks show that the proposed LightenDiffusion outperforms state-of-the-art unsupervised competitors and is comparable to supervised methods while being more generalizable to various scenes. Our code is available at https://github.com/JianghaiSCU/LightenDiffusion.

Submitted 11 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV 2024
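
The LightenDiffusion abstract above builds on Retinex theory, which in its classical form models an image as the element-wise product of a reflectance map and an illumination map. The toy sketch below illustrates only that classical image-space relation, by pairing the reflectance of a dark image with a brighter illumination map. The function name `retinex_compose` and the tiny nested-list "images" are assumptions for illustration; the paper itself decomposes encoded features in latent space and restores them with a diffusion model, which this sketch does not attempt to reproduce.

```python
def retinex_compose(reflectance, illumination):
    """Classical Retinex composition: image = reflectance * illumination,
    applied element-wise to two equally sized 2D grids (lists of rows)."""
    if len(reflectance) != len(illumination):
        raise ValueError("maps must have the same number of rows")
    image = []
    for r_row, l_row in zip(reflectance, illumination):
        if len(r_row) != len(l_row):
            raise ValueError("rows must have the same length")
        image.append([r * l for r, l in zip(r_row, l_row)])
    return image


if __name__ == "__main__":
    # Hypothetical 1x2 maps with values in [0, 1].
    reflectance = [[0.5, 0.25]]          # scene content
    low_light = [[0.5, 0.5]]             # dim illumination
    normal_light = [[1.0, 1.0]]          # bright illumination

    dark = retinex_compose(reflectance, low_light)
    relit = retinex_compose(reflectance, normal_light)
    print(dark, relit)
```

Keeping the reflectance fixed and swapping only the illumination is what makes the decomposition useful for enhancement: the scene content is unchanged while the lighting is replaced.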

arXiv:2406.13735 [pdf, other]

StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images

Authors: Rushikesh Zawar, Shaurya Dewan, Andrew F. Luo, Margaret M. Henderson, Michael J. Tarr, Leila Wehbe

Abstract: Understanding the semantics of visual scenes is a fundamental challenge in Computer Vision. A key aspect of this challenge is that objects sharing similar semantic meanings or functions can exhibit striking visual differences, making accurate identification and categorization difficult. Recent advancements in text-to-image frameworks have led to models that implicitly capture natural scene statistics. These frameworks account for the visual variability of objects, as well as complex object co-occurrences and sources of noise such as diverse lighting conditions. By leveraging large-scale datasets and cross-attention conditioning, these models generate detailed and contextually rich scene representations. This capability opens new avenues for improving object recognition and scene understanding in varied and challenging environments. Our work presents StableSemantics, a dataset comprising 224 thousand human-curated prompts, processed natural language captions, over 2 million synthetic images, and 10 million attention maps corresponding to individual noun chunks. We explicitly leverage human-generated prompts that correspond to visually interesting stable diffusion generations, provide 10 generations per phrase, and extract cross-attention maps for each image. We explore the semantic distribution of generated images, examine the distribution of objects within images, and benchmark captioning and open vocabulary segmentation methods on our data. To the best of our knowledge, we are the first to release a diffusion dataset with semantic attributions. We expect our proposed dataset to catalyze advances in visual semantic understanding and provide a foundation for developing more sophisticated and effective visual models. Website: https://stablesemantics.github.io/StableSemantics

Submitted 19 June, 2024; originally announced June 2024.

Comments: Dataset website: https://stablesemantics.github.io/StableSemantics

arXiv:2406.10744 [pdf, other]

Technique Report of CVPR 2024 PBDL Challenges

Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu , et al. (75 additional authors not shown)

Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held at the CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.

Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

arXiv:2406.07761 [pdf]

Deep Learning of Structural Morphology Imaged by Scanning X-ray Diffraction Microscopy

Authors: Aileen Luo, Tao Zhou, Martin V. Holt, Andrej Singer, Mathew J. Cherukara

Abstract: Scanning X-ray nanodiffraction microscopy is a powerful technique for spatially resolving nanoscale structural morphologies by diffraction contrast. One of the critical challenges in experimental nanodiffraction data analysis is posed by the convergence angle of nanoscale focusing optics, which creates simultaneous dependency of the far-field scattering data on three independent components of the local strain tensor, corresponding to dilation and two potential rigid body rotations of the unit cell. All three components are in principle resolvable through a spatially mapped sample tilt series; however, traditional data analysis is computationally expensive and prone to artifacts. In this study, we implement NanobeamNN, a convolutional neural network specifically tailored to the analysis of scanning probe X-ray microscopy data. NanobeamNN learns lattice strain and rotation angles from simulated diffraction of a focused X-ray nanobeam by an epitaxial thin film and can directly make reasonable predictions on experimental data without the need for additional fine-tuning. We demonstrate that this approach represents a significant advancement in computational speed over conventional methods, as well as a potential improvement in accuracy over the current standard.

Submitted 24 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.05191 [pdf, other]

DiffusionPID: Interpreting Diffusion via Partial Information Decomposition

Authors: Shaurya Dewan, Rushikesh Zawar, Prakanshul Saxena, Yingshan Chang, Andrew Luo, Yonatan Bisk

Abstract: Text-to-image diffusion models have made significant progress in generating naturalistic images from textual inputs, and demonstrate the capacity to learn and represent complex visual-semantic relationships. While these diffusion models have achieved remarkable success, the underlying mechanisms driving their performance are not yet fully accounted for, with many unanswered questions surrounding what they learn, how they represent visual-semantic relationships, and why they sometimes fail to generalize. Our work presents Diffusion Partial Information Decomposition (DiffusionPID), a novel technique that applies information-theoretic principles to decompose the input text prompt into its elementary components, enabling a detailed examination of how individual tokens and their interactions shape the generated image. We introduce a formal approach to analyze the uniqueness, redundancy, and synergy terms by applying PID to the denoising model at both the image and pixel level. This approach enables us to characterize how individual tokens and their interactions affect the model output. We first present a fine-grained analysis of characteristics utilized by the model to uniquely localize specific concepts; we then apply our approach in bias analysis and show it can recover gender and ethnicity biases. Finally, we use our method to visually characterize word ambiguity and similarity from the model's perspective and illustrate the efficacy of our method for prompt intervention. Our results show that PID is a potent tool for evaluating and diagnosing text-to-image diffusion models.

Submitted 12 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.02659 [pdf, other]

Neural Representations of Dynamic Visual Stimuli

Authors: Jacob Yeung, Andrew F. Luo, Gabriel Sarch, Margaret M. Henderson, Deva Ramanan, Michael J. Tarr

Abstract: Humans experience the world through constantly changing visual stimuli, where scenes can shift and move, change in appearance, and vary in distance. The dynamic nature of visual perception is a fundamental aspect of our daily lives, yet the large majority of research on object and scene processing, particularly using fMRI, has focused on static stimuli. While studies of static image perception are attractive due to their computational simplicity, they impose a strong non-naturalistic constraint on our investigation of human vision. In contrast, dynamic visual stimuli offer a more ecologically-valid approach but present new challenges due to the interplay between spatial and temporal information, making it difficult to disentangle the representations of stable image features and motion. To overcome this limitation, given dynamic inputs, we explicitly decouple the modeling of static image representations and motion representations in the human brain. Three results demonstrate the feasibility of this approach. First, we show that visual motion information as optical flow can be predicted (or decoded) from brain activity as measured by fMRI. Second, we show that this predicted motion can be used to realistically animate static images using a motion-conditioned video diffusion model (where the motion is driven by fMRI brain activity). Third, we show prediction in the reverse direction: existing video encoders can be fine-tuned to predict fMRI brain activity from video imagery, and can do so more effectively than image encoders. This foundational work offers a novel, extensible framework for interpreting how the human brain processes dynamic visual information.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.19425 [pdf, other]

Adaptive In-conversation Team Building for Language Model Agents

Authors: Linxin Song, Jiale Liu, Jieyu Zhang, Shaokun Zhang, Ao Luo, Shijian Wang, Qingyun Wu, Chi Wang

Abstract: Leveraging multiple large language model (LLM) agents has been shown to be a promising approach for tackling complex tasks, while the effective design of multiple agents for a particular application remains an art. It is thus intriguing to answer a critical question: Given a task, how can we build a team of LLM agents to solve it effectively? Our new adaptive team-building paradigm offers a flexible solution, realized through a novel agent design named Captain Agent. It dynamically forms and manages teams for each step of a task-solving process, utilizing nested group conversations and reflection to ensure diverse expertise and prevent stereotypical outputs. It allows for a flexible yet structured approach to problem-solving and can help reduce redundancy and enhance output diversity. A comprehensive evaluation across six real-world scenarios demonstrates that Captain Agent significantly outperforms existing multi-agent methods with 21.94% improvement in average accuracy, providing outstanding performance without requiring task-specific prompt engineering.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.15981 [pdf, other]

A numerical method for designing topological superconductors induced by s-wave superconductivity

Authors: Jingnan Hu, Aiyun Luo, Zhijun Wang, Quansheng Wu, Gang Xu

Abstract: Topological superconductors, as one of the most important research directions at present, have attracted much attention because of their potential to realize topological quantum computation. However, a universal computational tool based on first-principles calculations for topological superconductivity has not yet been fully developed, which poses significant challenges in predicting topological superconducting materials. It is difficult to calculate the topological superconducting properties of the system in a self-consistent manner. In this paper, we develop a numerical method to characterize the superconducting band spectrum and superconducting topological invariants of two-dimensional (2D) slab systems from first-principles calculations, and implement it in the open-source software WannierTools. We hope that it will accelerate the discovery of topological superconductor candidates.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.10890 [pdf, other]

A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model

Authors: Mingxiang Fu, Yu Song, Jiameng Lv, Liang Cao, Peng Jia, Nan Li, Xiangru Li, Jifeng Liu, A-Li Luo, Bo Qiu, Shiyin Shen, Liangping Tu, Lili Wang, Shoulin Wei, Haifeng Yang, Zhenping Yi, Zhiqiang Zou

Abstract: The exponential growth of astronomical datasets provides an unprecedented opportunity for humans to gain insight into the Universe. However, effectively analyzing this vast amount of data poses a significant challenge. Astronomers are turning to deep learning techniques to address this, but the methods are limited by their specific training sets, leading to considerable duplicated workloads. Hence, as an example to present how to overcome the issue, we built a framework for general analysis of galaxy images, based on a large vision model (LVM) plus downstream tasks (DST), including galaxy morphological classification, image restoration, object detection, parameter extraction, and more. Considering the low signal-to-noise ratio of galaxy images and the imbalanced distribution of galaxy categories, we have incorporated a Human-in-the-loop (HITL) module into our large vision model, which leverages human knowledge to enhance the reliability and interpretability of processing galaxy images interactively. The proposed framework exhibits notable few-shot learning capabilities and versatile adaptability to all the abovementioned tasks on galaxy images in the DESI legacy imaging surveys. Specifically, for object detection, trained on 1000 data points, our DST upon the LVM achieves an accuracy of 96.7%, while ResNet50 plus Mask R-CNN gives an accuracy of 93.1%; for morphology classification, to obtain AUC ~0.9, LVM plus DST and HITL requires only 1/50 of the training set compared to ResNet18. Expectedly, multimodal data can be integrated similarly, which opens up possibilities for conducting joint analyses with datasets spanning diverse domains in the era of multi-messenger astronomy.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 26 pages, 10 figures, to be published in Chinese Physics C

arXiv:2405.10047 [pdf, other]

doi 10.1051/0004-6361/202348988

Stellar Chromospheric Activity Database of Solar-like Stars Based on the LAMOST Low-Resolution Spectroscopic Survey: II. the bolometric and photospheric calibration

Authors: Weitao Zhang, Jun Zhang, Han He, Ali Luo, Haotong Zhang

Abstract: The dependence of stellar magnetic activity on stellar parameters can be illuminated by chromospheric activity studies based on large-scale spectroscopic surveys. The Ca II H and K lines are employed to construct indicators for assessing and studying the chromospheric activity of solar-like stars. We investigate the widely used bolometric and photospheric calibrated chromospheric activity index $R'_{\rm HK}$, derived from the method in the classic literature ($R'_{\rm HK,classic}$) and the method based on the PHOENIX model ($R'_{\rm HK,PHOENIX}$). Since the detailed stellar atmospheric parameters, effective temperature ($T_{\rm eff}$), surface gravity ($\log\,g$), and metallicity ([Fe/H]), are available for LAMOST, we estimate the chromospheric activity index $R'_{\rm HK,PHOENIX}$, along with the corresponding bolometric calibrated index $R_{\rm HK,PHOENIX}$, taking these parameters into account. We provide the database of the derived chromospheric activity parameters for 1,122,495 LAMOST LRS spectra of solar-like stars. Our calculations show that $\log\,R'_{\rm HK,PHOENIX}$ is approximately linearly correlated with $\log\,R'_{\rm HK,classic}$. The results based on our extensive archive support the view that the dynamo mechanism of solar-like stars is generally consistent with that of the Sun, and that the value of the solar chromospheric activity index is located at the midpoint of the solar-like star sample. We further investigate the proportions of solar-like stars with different chromospheric activity levels (very active, active, inactive and very inactive). The investigation indicates that the occurrence rate of high levels of chromospheric activity is lower among the stars with effective temperatures between $5600$ and $5900 \,{\rm K}$.

Submitted 22 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: 18 pages, 20 figures, accepted for publication in A&A

Journal ref: A&A 688, A23 (2024)

arXiv:2405.05064 [pdf, other]

Implication of the velocity dispersion scalings on high-mass star formation in molecular clouds

Authors: An-Xu Luo, Hong-Li Liu, Sheng-Li Qin, Dong-Ting Yang, Sirong Pan

Abstract: This paper is aimed at exploring implications of velocity dispersion scalings on high-mass star formation in molecular clouds, including the scalings of Larson's linewidth--size ($σ$--$R$) and ratio--mass surface density ($\cal{L}$--$Σ$; here $\cal{L}=σ/R^{0.5}$). We have systematically analyzed the $σ$ parameter of 221 well-selected massive clumps, complemented with published samples of other hierarchical density structures of molecular clouds over spatial scales of 0.01--10 pc. Those massive clumps are classified into four phases: quiescent, protostellar, HII region, and PDR clumps in an evolutionary sequence. The velocity dispersion of clumps increases overall with the evolutionary sequence, reflecting enhanced stellar feedback in more evolved phases. The relations of $σ$--$R$ and $\cal{L}$--$Σ$ are weak with the clump sample alone, but become evident when combined with others spanning a much wider range of spatial scales. For $σ$--$R$, its tight relation indicates a kinematic connection between hierarchical density structures, supporting theoretical models of multiscale high-mass star formation. From the $\cal{L}$--$Σ$ relation, cloud structures can be found to transition from over-virial state ($α_\mathrm{vir} > 2$) to sub-virial state ($α_\mathrm{vir} < 2$) as they become smaller and denser, indicating a possible shift in the governing force from turbulence to gravity. This implies that the multiscale physical process of high-mass star formation hinges on the self-gravity of sub-virial molecular clouds. However, the influence of turbulence may not be dismissed until large-scale clouds attain a sub-virial state. This is pending confirmation from future multiscale kinematic observations of molecular clouds with uniform observing settings.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 18 pages, 9 figures, accepted by AJ

arXiv:2405.05063 [pdf, other]

What role of gravity, turbulence and magnetic fields in high-mass star formation clouds?

Authors: An-Xu Luo, Hong-Li Liu, Guang-Xing Li, Sirong Pan, Dong-Ting Yang

Abstract: To explore the potential role of gravity, turbulence and magnetic fields in high-mass star formation in molecular clouds, this study revisits the velocity dispersion--size ($σ$--$L$) and density--size ($ρ$--$L$) scalings and the associated turbulent energy spectrum using an extensive data sample. The sample includes various hierarchical density structures in high-mass star formation clouds, across scales of 0.01 to 100 pc. We observe $σ\propto L^{0.26}$ and $ρ\propto L^{-1.54}$ scalings, converging toward a virial equilibrium state. A nearly flat virial parameter--mass ($α_{\rm vir}-M$) distribution is seen across all density scales, with $α_{\rm vir}$ values centered around unity, suggesting a global equilibrium maintained by the interplay between gravity and turbulence across multiple scales. Our turbulent energy spectrum ($E(k)$) analysis, based on the $σ$--$L$ and $ρ$--$L$ scalings, yields a characteristic $E(k) \propto k^{-1.52}$. These findings indicate the potential significance of gravity, turbulence, and possibly magnetic fields all in regulating dynamics of molecular clouds and high-mass star formation therein.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 15 pages, 5 figures, accepted by RAA

arXiv:2404.08452 [pdf, other]

MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection

Authors: Chenqi Kong, Anwei Luo, Peijun Bao, Yi Yu, Haoliang Li, Zengwei Zheng, Shiqi Wang, Alex C. Kot

Abstract: Deepfakes have recently raised significant trust issues and security concerns among the public. Compared to CNN face forgery detectors, ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance. However, these approaches still exhibit the following limitations: (1) Fully fine-tuning ViT-based models from ImageNet weights demands substantial computational and storage resources; (2) ViT-based methods struggle to capture local forgery clues, leading to model bias; (3) These methods limit their scope on only one or few face forgery features, resulting in limited generalizability. To tackle these challenges, this work introduces Mixture-of-Experts modules for Face Forgery Detection (MoE-FFD), a generalized yet parameter-efficient ViT-based approach. MoE-FFD only updates lightweight Low-Rank Adaptation (LoRA) and Adapter layers while keeping the ViT backbone frozen, thereby achieving parameter-efficient training. Moreover, MoE-FFD leverages the expressivity of transformers and local priors of CNNs to simultaneously extract global and local forgery clues. Additionally, novel MoE modules are designed to scale the model's capacity and smartly select optimal forgery experts, further enhancing forgery detection performance. Our proposed learning scheme can be seamlessly adapted to various transformer backbones in a plug-and-play manner. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art face forgery detection performance with significantly reduced parameter overhead. The code is released at: https://github.com/LoveSiameseCat/MoE-FFD.

Submitted 7 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

arXiv:2403.19164 [pdf, other]

RecDiffusion: Rectangling for Image Stitching with Diffusion Models

Authors: Tianhao Zhou, Haipeng Li, Ziyi Wang, Ao Luo, Chen-Lin Zhang, Jiajun Li, Bing Zeng, Shuaicheng Liu

Abstract: Image stitching from different captures often results in non-rectangular boundaries, which are often considered unappealing. To address non-rectangular boundaries, current solutions involve cropping, which discards image content, inpainting, which can introduce unrelated content, or warping, which can distort non-linear features and introduce artifacts. To overcome these issues, we introduce a novel diffusion-based learning framework, RecDiffusion, for image stitching rectangling. This framework combines Motion Diffusion Models (MDM) to generate motion fields, effectively transitioning from the stitched image's irregular borders to a geometrically corrected intermediary, followed by Content Diffusion Models (CDM) for image detail refinement. Notably, our sampling process utilizes a weighted map to identify regions needing correction during each iteration of CDM. Our RecDiffusion ensures geometric accuracy and overall visual appeal, surpassing all previous methods in both quantitative and qualitative measures when evaluated on public benchmarks. Code is released at https://github.com/lhaippp/RecDiffusion.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.10719 [pdf]

X-ray Nano-imaging of a Heterogeneous Structural Phase Transition in V2O3

Authors: Ziming Shao, Aileen Luo, Eti Barazani, Tao Zhou, Zhonghou Cai, Martin V. Holt, Yoav Kalcheim, Andrej Singer

Abstract: Controlling the Mott transition through strain engineering is crucial for advancing the development and application of memristive and neuromorphic computing devices. Yet, Mott insulators are heterogeneous due to intrinsic phase boundaries and extrinsic defects, posing significant challenges to fully understanding the impact of local microscopic distortions on the local Mott transition. Addressing these challenges demands structural characterizations at the relevant length scale. Here, using a synchrotron-based scanning X-ray nanoprobe, we studied the real-space structural heterogeneity during the structural phase transition in a V2O3 thin film. Through temperature-dependent metal-insulator phase coexistence mapping, we report a variation in the local transition temperature of up to 7 K across the film and the presence of the transition hysteresis at the nanoscale. Furthermore, a detailed quantitative analysis demonstrates that the spatial heterogeneity of the transition is closely tied to the tilting of crystallographic planes in the pure insulating phase. Our work highlights the impact of local heterogeneity on the Mott transition and lays the groundwork for future innovations in harnessing strain heterogeneity within Mott systems for the next-generation computational technologies.

Submitted 30 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.09205 [pdf, other]

doi 10.1103/PhysRevLett.133.043401

Microscopic Study on Superexchange Dynamics of Composite Spin-1 Bosons

Authors: An Luo, Yong-Guang Zheng, Wei-Yong Zhang, Ming-Gen He, Ying-Chao Shen, Zi-Hang Zhu, Zhen-Sheng Yuan, Jian-Wei Pan

Abstract: We report on an experimental simulation of the spin-1 Heisenberg model with composite bosons in a one-dimensional chain based on the two-component Bose-Hubbard model. Exploiting our site- and spin-resolved quantum gas microscope, we observed faster superexchange dynamics of the spin-1 system compared to its spin-1/2 counterpart, which is attributed to the enhancement effect of multi-bosons. We further probed the non-equilibrium spin dynamics driven by the superexchange and single-ion anisotropy terms, unveiling the linear expansion of the spin-spin correlations, which is limited by the Lieb-Robinson bound. Based on the superexchange process, we prepared and verified entangled qutrit pairs with these composite spin-1 bosons, which could potentially be applied in qutrit-based quantum information processing.

Submitted 18 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Journal ref: Phys. Rev. Lett. 133, 043401 (2024)

arXiv:2403.03479 [pdf, other]

Observation of counterflow superfluidity in a two-component Mott insulator

Authors: Yong-Guang Zheng, An Luo, Ying-Chao Shen, Ming-Gen He, Zi-Hang Zhu, Ying Liu, Wei-Yong Zhang, Hui Sun, Youjin Deng, Zhen-Sheng Yuan, Jian-Wei Pan

Abstract: The counterflow superfluidity (CSF) was predicted two decades ago. Counterintuitively, while both components in the CSF have fluidity, their correlated counterflow currents cancel out, leading the overall system to an incompressible Mott insulator. However, realizing and identifying the CSF remain challenging due to the demand for extreme experimental capabilities in a single setup. Here, we observe the CSF in a binary Bose mixture in optical lattices. We prepare a low-entropy spin-Mott state by conveying and merging two spin-1/2 bosonic atoms at every site and drive it adiabatically to the CSF at $\sim$ 1 nK. Antipair correlations of the CSF are probed through a site- and spin-resolved quantum gas microscope in both real and momentum spaces. These techniques and observations provide accessibility to the symmetry-protected topological quantum matters.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 13 pages, 10 figures

arXiv:2401.09972 [pdf, other]

Better Explain Transformers by Illuminating Important Information

Authors: Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, Irene Li

Abstract: Transformer-based models excel in various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings. Prior methods explain Transformers by focusing on the raw gradient and attention as token attribution scores, where non-relevant information is often considered during explanation computation, resulting in confusing results. In this work, we propose highlighting the important information and eliminating irrelevant information by a refined information flow on top of the layer-wise relevance propagation (LRP) method. Specifically, we consider identifying syntactic and positional heads as important attention heads and focus on the relevance obtained from these important heads. Experimental results demonstrate that irrelevant information does distort output attribution scores and then should be masked during explanation computation. Compared to eight baselines on both classification and question-answering datasets, our method consistently outperforms with over 3% to 33% improvement on explanation metrics, providing superior explanation performance. Our anonymous code repository is available at: https://github.com/LinxinS97/Mask-LRP

Submitted 26 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2401.05573 [pdf, other]

doi 10.3847/2515-5172/ad19de

M31N 2013-10c: A Newly Identified Recurrent Nova in M31

Authors: Allen W. Shafter, Kamil Hornoch, Hana Kučáková, Petr Fatka, Jingyuan Zhao, Xing Gao, Shahidin Yaqup, Tuhong Zhong, Ali Esamdin, Chunhai Bai, Na Wang, Paul Benni, Aiden Luo, Ilana Yousuf

Abstract: The nova M31N 2023-11f (2023yoa) has been recently identified as the second eruption of a previously recognized nova, M31N 2013-10c, establishing the latter object as the 21st recurrent nova system thus far identified in M31. Here we present well sampled $R$-band lightcurves of both the 2013 and 2023 eruptions of this system. The photometric evolution of each eruption was quite similar as expected for the same progenitor system. The 2013 and 2023 eruptions each reached peak magnitudes just brighter than $R\sim16$, with fits to the declining branches of the eruptions yielding times to decline by two magnitudes of $t_2(R)=5.5\pm1.7$ and $t_2(R)=3.4\pm1.5$ days, respectively. M31N 2013-10c has an absolute magnitude at peak, $M_R=-8.8\pm0.2$, making it the most luminous known recurrent nova in M31.

Submitted 10 January, 2024; originally announced January 2024.

Comments: 4 pages, 1 figure, 1 table; Accepted for publication in RNAAS

Journal ref: Res. Notes AAS 8 5 (2024)

arXiv:2401.03959 [pdf, ps, other]

Projected rotational velocities for LAMOST stars with effective temperature lower than 9000 K

Authors: Fang Zuo, A-Li Luo, Bing Du, Yinbi Li, Hugh R. A. Jones, Yi-han Song, Xiao Kong, Yan-xin Guo

Abstract: In Data Release 9 of LAMOST, we present measurements of v sin i for a total of 121,698 stars measured using the Medium Resolution Spectrograph (MRS) and 80,108 stars using the Low Resolution Spectrograph (LRS). These values were obtained through a chi^2 minimisation process, comparing LAMOST spectra with corresponding grids of synthetically broadened spectra. Due to the resolution and the spectral range of LAMOST, v sin i measurements are limited to stars with effective temperature (Teff) ranging from 5000 K to 8500 K for MRS and 7000 K to 9000 K for LRS. The detectable v sin i for MRS is set between 27 km/s and 350 km/s, and for LRS between 110 km/s and 350 km/s. The upper limit arises because the convolved reference spectra become less informative beyond 350 km/s. The intrinsic precision of v sin i, determined from multi-epoch observations, is approximately 4.0 km/s for MRS and 10.0 km/s for LRS at signal-to-noise ratio (S/N) greater than 50. Our v sin i values are consistent with those from APOGEE17, displaying a scatter of 8.79 km/s. They are also in agreement with measurements from the Gaia DR3 and SUN catalogs. An observed trend in LAMOST MRS data is the decrease in v sin i with dropping Teff, with a transition around 7000 K for dwarfs and 6500 K for giants, primarily observed in stars with near-solar abundances.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 13 pages, 16 figures

arXiv:2311.08944 [pdf, other]

Stellar Atmospheric Parameters for Cool Dwarfs in Gaia DR3

Authors: Cai-Xia Qu, A-Li Luo, Rui Wang, Hugh R. A. Jones, Bing Du, Xiang-Lei Chen, You-Fen Wang

Abstract: We provide a catalogue of atmospheric parameters for 1,806,921 cool dwarfs from Gaia DR3 which lie within the range covered by LAMOST cool dwarf spectroscopic parameters: 3200 K < T_{eff} < 4300 K, -0.8 < [M/H] < 0.2 dex, and 4.5 < log{g} < 5.5 dex. Our values are derived based on Machine Learning models trained with multi-band photometry corrected for dust. The photometric data comprise optical bands from SDSS (r, i, z), near-infrared bands from 2MASS (J, H, K) and mid-infrared bands from ALLWISE (W1, W2). We used both random forest and LightGBM machine learning models and found similar results from both with an error dispersion of 68 K, 0.22 dex, and 0.05 dex for T_{eff}, [M/H], and log{g}, respectively. Assessment of the relative feature importance of different photometric colors indicated W1 -- W2 as most sensitive to both T_{eff} and log{g}, with J -- H most sensitive to [M/H]. We find that our values show a good agreement with APOGEE, but are significantly different from those provided as part of Gaia DR3.

Submitted 15 November, 2023; originally announced November 2023.

Comments: 14 pages, 12 figures, accepted by ApJS

arXiv:2311.00305 [pdf, ps, other]

doi 10.1073/pnas.2304179120

Planets Across Space and Time (PAST). V. The evolution of hot Jupiters revealed by the age distribution of their host stars

Authors: Di-Chang Chen, Ji-Wei Xie, Ji-Lin Zhou, Subo Dong, Jia-Yi Yang, Wei Zhu, Chao Liu, Yang Huang, Mao-Sheng Xiang, Hai-Feng Wang, Zheng Zheng, Ali Luo, Jing-Hua Zhang, Zi Zhu

Abstract: The unexpected discovery of hot Jupiters challenged the classical theory of planet formation inspired by our solar system. Until now, the origin and evolution of hot Jupiters are still uncertain. Determining their age distribution and temporal evolution can provide more clues into the mechanism of their formation and subsequent evolution. Using a sample of 383 giant planets around Sun-like stars collected from the kinematic catalogs of the Planets Across Space and Time (PAST) project, we find that hot Jupiters are preferentially hosted by relatively younger stars in the Galactic thin disk. We subsequently find that the frequency of hot Jupiters declines with age. In contrast, the frequency of warm/cold Jupiters shows no significant dependence on age. Such a trend is expected from the tidal evolution of hot Jupiters' orbits, and our result offers supporting evidence using a large sample. We also perform a joint analysis on the planet frequencies in the stellar age-metallicity plane. The result suggests that the frequencies of hot Jupiters and warm/cold Jupiters, after removing the age dependence, are both correlated with stellar metallicities. Moreover, we show that the above correlations can explain the bulk of the discrepancy in hot Jupiter frequencies inferred from the transit and radial velocity (RV) surveys, given that RV targets tend to be more metal-rich and younger than transit targets.

Submitted 1 November, 2023; originally announced November 2023.

Comments: Published in PNAS; 7 pages, 5 figures in the main text; 17 pages, 29 figures, 5 tables in the supporting information

arXiv:2310.20113 [pdf, other]

Planets Across Space and Time (PAST) IV: The Occurrence and Architecture of Kepler Planetary Systems as a Function of Kinematic Age Revealed by the LAMOST-Gaia-Kepler Sample

Authors: Jia-Yi Yang, Di-Chang Chen, Ji-Wei Xie, Ji-Lin Zhou, Subo Dong, Zi Zhu, Zheng Zheng, Chao Liu, Weikai Zong, Ali Luo

Abstract: One of the fundamental questions in astronomy is how planetary systems form and evolve. Measuring the planetary occurrence and architecture as a function of time directly addresses this question. In the fourth paper of the Planets Across Space and Time (PAST) series, we investigate the occurrence and architecture of Kepler planetary systems as a function of kinematic age by using the LAMOST-Gaia-Kepler sample. To isolate the age effect, other stellar properties (e.g., metallicity) have been controlled. We find the following results. (1) The fraction of stars with Kepler-like planets ($F_{\text{Kep}}$) is about 50% for all stars; no significant trend is found between $F_{\text{Kep}}$ and age. (2) The average planet multiplicity ($\bar{N}_p$) exhibits a decreasing trend (~2$σ$ significance) with age. It decreases from $\bar{N}_p$~3 for stars younger than 1 Gyr to $\bar{N}_p$~1.8 for stars about 8 Gyr. (3) The number of planets per star ($η=F_{\text{Kep}}\times\bar{N}_p$) also shows a decreasing trend (~2-3$σ$ significance). It decreases from $η$~1.6-1.7 for young stars to $η$~1.0 for old stars. (4) The mutual orbital inclination of the planets ($σ_{i,k}$) increases from $1.2^{+1.4}_{-0.5}$ to $3.5^{+8.1}_{-2.3}$ as stars age from 0.5 to 8 Gyr with a best fit of $\log{σ_{i,k}}=0.2+0.4\times\log{\frac{\text{Age}}{\text{1Gyr}}}$. Interestingly, the Solar System also fits such a trend. The near independence of $F_{\text{Kep}}$ (~50%) on age implies that planet formation is robust and stable across the Galaxy history. The age dependence of $\bar{N}_p$ and $σ_{i,k}$ demonstrates that planetary architecture is evolving, and planetary systems generally become dynamically hotter with fewer planets as they age.
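The two relations quoted in this abstract can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the logarithm is assumed to be base 10, and the inputs are the rounded values from the abstract rather than the paper's fitted data.

```python
import math


def mutual_inclination(age_gyr: float) -> float:
    """Evaluate the quoted best fit log(sigma_ik) = 0.2 + 0.4 * log(Age / 1 Gyr).

    Assumption: "log" is the base-10 logarithm, as is conventional in astronomy.
    """
    return 10 ** (0.2 + 0.4 * math.log10(age_gyr))


def planets_per_star(f_kep: float, mean_multiplicity: float) -> float:
    """Evaluate eta = F_Kep * N_p, the number of planets per star."""
    return f_kep * mean_multiplicity


if __name__ == "__main__":
    # Ages spanning the range discussed in the abstract (0.5 to 8 Gyr).
    for age in (0.5, 1.0, 8.0):
        print(f"age = {age:>3} Gyr -> sigma_ik = {mutual_inclination(age):.2f}")

    # Rounded abstract values: F_Kep ~ 0.5, N_p ~ 3 (young) and ~ 1.8 (old).
    print(f"eta (young) = {planets_per_star(0.5, 3.0):.2f}")
    print(f"eta (old)   = {planets_per_star(0.5, 1.8):.2f}")
```

Under the base-10 assumption the fit gives roughly 1.2 at 0.5 Gyr and roughly 3.6 at 8 Gyr, close to the quoted central values of 1.2 and 3.5. The product of the rounded $F_{\text{Kep}}$ and $\bar{N}_p$ values only approximately reproduces the quoted $η$ range, which is expected since the abstract reports rounded numbers.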

Submitted 30 October, 2023; originally announced October 2023.

Comments: 27 pages, 20 figures, 4 tables, accepted for publication in AJ

arXiv:2310.04420 [pdf, other]

BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity

Authors: Andrew F. Luo, Margaret M. Henderson, Michael J. Tarr, Leila Wehbe

Abstract: Understanding the functional organization of higher visual cortex is a central focus in neuroscience. Past studies have primarily mapped the visual and semantic selectivity of neural populations using hand-selected stimuli, which may potentially bias results towards pre-existing hypotheses of visual cortex functionality. Moving beyond conventional approaches, we introduce a data-driven method that generates natural language descriptions for images predicted to maximally activate individual voxels of interest. Our method -- Semantic Captioning Using Brain Alignments ("BrainSCUBA") -- builds upon the rich embedding space learned by a contrastive vision-language model and utilizes a pre-trained large language model to generate interpretable captions. We validate our method through fine-grained voxel-level captioning across higher-order visual regions. We further perform text-conditioned image synthesis with the captions, and show that our images are semantically coherent and yield high predicted activations. Finally, to demonstrate how our method enables scientific discovery, we perform exploratory investigations on the distribution of "person" representations in the brain, and discover fine-grained semantic selectivity in body-selective areas. Unlike earlier studies that decode text, our method derives voxel-wise captions of semantic selectivity. Our results show that BrainSCUBA is a promising means for understanding functional preferences in the brain, and provides motivation for further hypothesis-driven investigation of visual cortex.

Submitted 3 May, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: ICLR 2024. Project page: https://www.cs.cmu.edu/~afluo/BrainSCUBA

arXiv:2310.00234 [pdf, other]

Pixel-Inconsistency Modeling for Image Manipulation Localization

Authors: Chenqi Kong, Anwei Luo, Shiqi Wang, Haoliang Li, Anderson Rocha, Alex C. Kot

Abstract: Digital image forensics plays a crucial role in image authentication and manipulation localization. Despite the progress powered by deep neural networks, existing forgery localization methodologies exhibit limitations when deployed to unseen datasets and perturbed images (i.e., lack of generalization and robustness to real-world applications). To circumvent these problems and aid image integrity, this paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. The rationale is grounded on the observation that most image signal processors (ISP) involve the demosaicing process, which introduces pixel correlations in pristine images. Moreover, manipulating operations, including splicing, copy-move, and inpainting, directly affect such pixel regularity. We, therefore, first split the input image into several blocks and design masked self-attention mechanisms to model the global pixel dependency in input images. Simultaneously, we optimize another local pixel dependency stream to mine local manipulation clues within input forgery images. In addition, we design novel Learning-to-Weight Modules (LWM) to combine features from the two streams, thereby enhancing the final forgery localization performance. To improve the training process, we propose a novel Pixel-Inconsistency Data Augmentation (PIDA) strategy, driving the model to focus on capturing inherent pixel-level artifacts instead of mining semantic forgery traces. This work establishes a comprehensive benchmark integrating 15 representative detection models across 12 datasets. Extensive experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints and achieves state-of-the-art generalization and robustness performance in image manipulation localization.

Submitted 29 September, 2023; originally announced October 2023.

arXiv:2309.16217 [pdf, other]

GAFlow: Incorporating Gaussian Attention into Optical Flow

Authors: Ao Luo, Fan Yang, Xin Li, Lang Nie, Chunyu Lin, Haoqiang Fan, Shuaicheng Liu

Abstract: Optical flow, or the estimation of motion fields from image sequences, is one of the fundamental problems in computer vision. Unlike most pixel-wise tasks that aim at achieving consistent representations of the same category, optical flow raises extra demands for obtaining local discrimination and smoothness, which are not yet fully explored by existing approaches. In this paper, we push Gaussian Attention (GA) into the optical flow models to accentuate local properties during representation learning and enforce the motion affinity during matching. Specifically, we introduce a novel Gaussian-Constrained Layer (GCL) which can be easily plugged into existing Transformer blocks to highlight the local neighborhood that contains fine-grained structural information. Moreover, for reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM) which not only inherits properties from Gaussian distribution to instinctively revolve around the neighbor fields of each point but also is empowered to put the emphasis on contextually related regions during matching. Our fully-equipped model, namely Gaussian Attention Flow network (GAFlow), naturally incorporates a series of novel Gaussian-based modules into the conventional optical flow framework for reliable motion analysis. Extensive experiments on standard optical flow datasets consistently demonstrate the exceptional performance of the proposed approach in terms of both generalization ability evaluation and online benchmark testing. Code is available at https://github.com/LA30/GAFlow.

Submitted 28 September, 2023; originally announced September 2023.

Comments: To appear in ICCV-2023

arXiv:2309.11092 [pdf, other]

Generalized Face Forgery Detection via Adaptive Learning for Pre-trained Vision Transformer

Authors: Anwei Luo, Rizhao Cai, Chenqi Kong, Yakun Ju, Xiangui Kang, Jiwu Huang, Alex C. Kot

Abstract: With the rapid progress of generative models, the current challenge in face forgery detection is how to effectively detect realistic manipulated faces from different unseen domains. Though previous studies show that pre-trained Vision Transformer (ViT) based models can achieve some promising results after fully fine-tuning on the Deepfake dataset, their generalization performances are still unsatisfactory. One possible reason is that fully fine-tuned ViT-based models may disrupt the pre-trained features [1, 2] and overfit to some data-specific patterns [3]. To alleviate this issue, we present a Forgery-aware Adaptive Vision Transformer (FA-ViT) under the adaptive learning paradigm, where the parameters in the pre-trained ViT are kept fixed while the designed adaptive modules are optimized to capture forgery features. Specifically, a global adaptive module is designed to model long-range interactions among input tokens, which takes advantage of self-attention mechanism to mine global forgery clues. To further explore essential local forgery clues, a local adaptive module is proposed to expose local inconsistencies by enhancing the local contextual association. In addition, we introduce a fine-grained adaptive learning module that emphasizes the common compact representation of genuine faces through relationship learning in fine-grained pairs, driving these proposed adaptive modules to be aware of fine-grained forgery-aware information. Extensive experiments demonstrate that our FA-ViT achieves state-of-the-art results in the cross-dataset evaluation, and enhances the robustness against unseen perturbations. Particularly, FA-ViT achieves 93.83% and 78.32% AUC scores on Celeb-DF and DFDC datasets in the cross-dataset evaluation. The code and trained model have been released at: https://github.com/LoveSiameseCat/FAViT.

Submitted 21 August, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.05968 [pdf]

Neural Network Layer Matrix Decomposition reveals Latent Manifold Encoding and Memory Capacity

Authors: Ng Shyh-Chang, A-Li Luo, Bo Qiu

Abstract: We prove the converse of the universal approximation theorem, i.e. a neural network (NN) encoding theorem which shows that for every stably converged NN of continuous activation functions, its weight matrix actually encodes a continuous function that approximates its training dataset to within a finite margin of error over a bounded domain. We further show that using the Eckart-Young theorem for truncated singular value decomposition of the weight matrix for every NN layer, we can illuminate the nature of the latent space manifold of the training dataset encoded and represented by every NN layer, and the geometric nature of the mathematical operations performed by each NN layer. Our results have implications for understanding how NNs break the curse of dimensionality by harnessing memory capacity for expressivity, and that the two are complementary. This Layer Matrix Decomposition (LMD) further suggests a close relationship between eigen-decomposition of NN layers and the latest advances in conceptualizations of Hopfield networks and Transformer NN models.

Submitted 12 September, 2023; originally announced September 2023.
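
Note: the abstract above refers to the Eckart-Young theorem for truncated singular value decomposition of a layer's weight matrix. The following is only a minimal sketch of that standard result, not the authors' Layer Matrix Decomposition procedure. The matrix `W`, its shape, the rank `k`, and the helper name `truncated_svd` are all hypothetical choices made for illustration.

```python
import numpy as np


def truncated_svd(weight, rank):
    """Return the rank-`rank` truncated SVD approximation of `weight`
    together with the full vector of singular values."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep only the `rank` largest singular triplets.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return approx, s


# Hypothetical stand-in for one layer's weight matrix (random, not trained).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
k = 8

W_k, singular_values = truncated_svd(W, k)

# Eckart-Young: the Frobenius error of the best rank-k approximation equals
# the root of the sum of the squared discarded singular values.
frobenius_error = np.linalg.norm(W - W_k)
predicted_error = np.sqrt(np.sum(singular_values[k:] ** 2))
```

Comparing `frobenius_error` with `predicted_error` is a quick way to confirm that the truncation keeps the dominant directions of the matrix and discards exactly the energy carried by the remaining singular values.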

arXiv:2308.12535 [pdf, other]

SCP: Spherical-Coordinate-based Learned Point Cloud Compression

Authors: Ao Luo, Linxin Song, Keisuke Nonaka, Kyohei Unno, Heming Sun, Masayuki Goto, Jiro Katto

Abstract: In recent years, the task of learned point cloud compression has gained prominence. An important type of point cloud, the spinning LiDAR point cloud, is generated by spinning LiDAR on vehicles. This process results in numerous circular shapes and azimuthal angle invariance features within the point clouds. However, these two features have been largely overlooked by previous methodologies. In this paper, we introduce a model-agnostic method called Spherical-Coordinate-based learned Point cloud compression (SCP), designed to leverage the aforementioned features fully. Additionally, we propose a multi-level Octree for SCP to mitigate the reconstruction error for distant areas within the Spherical-coordinate-based Octree. SCP exhibits excellent universality, making it applicable to various learned point cloud compression techniques. Experimental results demonstrate that SCP surpasses previous state-of-the-art methods by up to 29.14% in point-to-point PSNR BD-Rate.

Submitted 8 February, 2024; v1 submitted 23 August, 2023; originally announced August 2023.
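
Note: the abstract above relies on representing spinning-LiDAR points in spherical coordinates, where circular scan patterns become simple. The sketch below shows only the generic coordinate conversion, not the SCP method or its Octree. The angle convention (azimuth `theta` from `arctan2`, polar angle `phi` measured from the z-axis), the function names, and the synthetic ring of points are assumptions made for illustration.

```python
import numpy as np


def cartesian_to_spherical(xyz):
    """Convert an (N, 3) float array of x, y, z to (rho, theta, phi).

    Assumed convention: theta is the azimuth in (-pi, pi], phi is the
    polar angle from the +z axis in [0, pi].
    """
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    rho = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arctan2(y, x)
    # Guard against division by zero for points at the origin.
    cos_phi = np.divide(z, rho, out=np.zeros_like(rho), where=rho > 0)
    phi = np.arccos(np.clip(cos_phi, -1.0, 1.0))
    return np.stack([rho, theta, phi], axis=1)


def spherical_to_cartesian(sph):
    """Inverse of cartesian_to_spherical under the same convention."""
    rho, theta, phi = sph[:, 0], sph[:, 1], sph[:, 2]
    x = rho * np.sin(phi) * np.cos(theta)
    y = rho * np.sin(phi) * np.sin(theta)
    z = rho * np.cos(phi)
    return np.stack([x, y, z], axis=1)


# Synthetic ring: one laser beam sweeping a circle of radius 10 at height -1.5.
angles = np.linspace(-np.pi, np.pi, 16, endpoint=False)
ring = np.stack(
    [10.0 * np.cos(angles), 10.0 * np.sin(angles), np.full_like(angles, -1.5)],
    axis=1,
)

ring_sph = cartesian_to_spherical(ring)
ring_back = spherical_to_cartesian(ring_sph)
rho_spread = ring_sph[:, 0].max() - ring_sph[:, 0].min()
phi_spread = ring_sph[:, 2].max() - ring_sph[:, 2].min()
```

On this ring, `rho` and `phi` are constant and only `theta` varies, which illustrates why a circular scan collapses to a straight line in spherical coordinates.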

arXiv:2307.11431 [pdf, other]

doi 10.1007/s10509-023-04219-w

$\mathrm{H}α$ chromospheric activity of F-, G-, and K-type stars observed by the LAMOST Medium-Resolution Spectroscopic Survey

Authors: Han He, Weitao Zhang, Haotong Zhang, Song Wang, Ali Luo, Jun Zhang

Abstract: The distribution of stellar $\mathrm{H}α$ chromospheric activity with respect to stellar atmospheric parameters (effective temperature $T_\mathrm{eff}$, surface gravity $\log\,g$, and metallicity $\mathrm{[Fe/H]}$) and main-sequence/giant categories is investigated for the F-, G-, and K-type stars observed by the LAMOST Medium-Resolution Spectroscopic Survey (MRS). A total of 329,294 MRS spectra from LAMOST DR8 are utilized in the analysis. The $\mathrm{H}α$ activity index ($I_{\mathrm{H}α}$) and the $\mathrm{H}α$ $R$-index ($R_{\mathrm{H}α}$) are evaluated for the MRS spectra. The $\mathrm{H}α$ chromospheric activity distributions with individual stellar parameters as well as in the $T_\mathrm{eff}$ -- $\log\,g$ and $T_\mathrm{eff}$ -- $\mathrm{[Fe/H]}$ parameter spaces are analyzed based on the $R_{\mathrm{H}α}$ index data. It is found that: (1) for the main-sequence sample, the $R_{\mathrm{H}α}$ distribution with $T_\mathrm{eff}$ has a bowl-shaped lower envelope with a minimum at about 6200 K, a hill-shaped middle envelope with a maximum at about 5600 K, and an upper envelope continuing to increase from hotter to cooler stars; (2) for the giant sample, the middle and upper envelopes of the $R_{\mathrm{H}α}$ distribution first increase with a decrease of $T_\mathrm{eff}$ and then drop to a lower activity level at about 4300 K, revealing different activity characteristics at different stages of stellar evolution; (3) for both the main-sequence and giant samples, the upper envelope of the $R_{\mathrm{H}α}$ distribution with metallicity is higher for stars with $\mathrm{[Fe/H]}$ greater than about $-1.0$, and the lowest-metallicity stars hardly exhibit high $\mathrm{H}α$ indices. A dataset of $\mathrm{H}α$ activity indices for the LAMOST MRS spectra analyzed is provided with this paper.

Submitted 31 July, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

Comments: 32 pages, 12 figures, 1 table, accepted for publication in Astrophysics and Space Science

Journal ref: Astrophysics and Space Science (2023) 368: 63

arXiv:2307.11207 [pdf, other]

doi 10.3847/1538-3881/ace322

Mass-accretion, spectral, and photometric properties of T Tauri stars in Taurus based on TESS and LAMOST

Authors: Chia-Lung Lin, Wing-Huen Ip, Yao Hsiao, Tzu-Hueng Chang, Yi-han Song, A-Li Luo

Abstract: We present the analysis of 16 classical T Tauri stars using LAMOST and TESS data, investigating spectral properties, photometric variations, and mass-accretion rates. All 16 stars exhibit emissions in H$α$ lines, from which the average mass-accretion rate of $1.76\times10^{-9}~M_{\odot}yr^{-1}$ is derived. Two of the stars, DL Tau and Haro 6-13, show mass-accretion bursts simultaneously in TESS, ASAS-SN, and/or ZTF survey. Based on these observations, we find that the mass-accretion rates of DL Tau and Haro 6-13 reach their maximums of $2.5 \times 10^{-8}~M_{\odot}yr^{-1}$ and $2 \times 10^{-10}~M_{\odot}yr^{-1}$ during the TESS observation, respectively. We detect thirteen flares among these stars. The flare frequency distribution shows that the CTTSs' flare activity is not only dominated by strong flares with high energy but much more active than those of solar-type and young low-mass stars. By comparing the variability classes reported in the literature, we find that the transition timescale between different classes of variability in CTTSs, such as from Stochastic (S) to Bursting (B) or from quasi-periodic symmetric (QPS) to quasi-periodic dipping (QPD), may range from 1.6 to 4 years. We observe no significant correlation between inclination and mass-accretion rates derived from the emission indicators. This suggests that inner disk properties may be more important than those of the outer disk. Finally, we find a relatively significant positive correlation between the asymmetric metric "M" and the cold disk inclination compared to the literature. A weak negative correlation between the periodicity metric "Q" value and inclination has also been found.

Submitted 20 July, 2023; originally announced July 2023.

Comments: 39 pages, 22 figures, 8 tables

Journal ref: The Astronomical Journal, Volume 166, Number 3, year 2023

arXiv:2307.06486 [pdf, ps, other]

doi 10.1038/s41563-024-01797-0

Absence of $3a_0$ Charge Density Wave Order in the Infinite Layer Nickelates

Authors: C. T. Parzyck, N. K. Gupta, Y. Wu, V. Anil, L. Bhatt, M. Bouliane, R. Gong, B. Z. Gregory, A. Luo, R. Sutarto, F. He, Y. -D. Chuang, T. Zhou, G. Herranz, L. F. Kourkoutis, A. Singer, D. G. Schlom, D. G. Hawthorn, K. M. Shen

Abstract: A hallmark of many unconventional superconductors is the presence of many-body interactions which give rise to broken symmetry states intertwined with superconductivity. Recent resonant soft x-ray scattering experiments report commensurate $3a_0$ charge density wave order in the infinite layer nickelates, which has important implications regarding the universal interplay between charge order and superconductivity in both the cuprates and nickelates. Here, we present x-ray scattering and spectroscopy measurements on a series of NdNiO$_{2+x}$ samples which reveal that the signatures of charge density wave order are absent in fully reduced, single-phase NdNiO$_2$. The $3a_0$ superlattice peak instead originates from a partially reduced impurity phase where excess apical oxygens form ordered rows with 3 unit cell periodicity. The absence of any observable charge density wave order in NdNiO$_2$ highlights a crucial difference between the phase diagrams of the cuprate and nickelate superconductors.

Submitted 12 July, 2023; originally announced July 2023.

Comments: Main Text: 8 pages, 4 figures. Supplemental: 12 pages, 12 figures

arXiv:2306.15611 [pdf, ps, other]

The Stellar Abundances and Galactic Evolution Survey (SAGES) -- I. General Description and the First Data Release (DR1)

Authors: Zhou Fan, Gang Zhao, Wei Wang, Jie Zheng, Jingkun Zhao, Chun Li, Yuqin Chen, Haibo Yuan, Haining Li, Kefeng Tan, Yihan Song, Fang Zuo, Yang Huang, Ali Luo, Ali Esamdin, Lu Ma, Bin Li, Nan Song, Frank Grupp, Haibin Zhao, Shuhrat A. Ehgamberdiev, Otabek A. Burkhonov, Guojie Feng, Chunhai Bai, Xuan Zhang , et al. (13 additional authors not shown)

Abstract: The Stellar Abundances and Galactic Evolution Survey (SAGES) of the northern sky is a specifically-designed multi-band photometric survey aiming to provide reliable stellar parameters with accuracy comparable to those from low-resolution optical spectra. It was carried out with the 2.3-m Bok telescope of Steward Observatory and three other telescopes. The observations in the $u_s$ and $v_s$ passbands produced over 36,092 frames of images in total, covering a sky area of $\sim9960$ degree$^2$. The median survey completeness over all observing fields is $u_s=20.4$ mag and $v_s=20.3$ mag for the two bands, respectively, while the limiting magnitudes with signal-to-noise ratio (S/N) of 100 are $u_s\sim17$ mag and $v_s\sim18$ mag. We combined our catalog with data release 1 (DR1) of the first Panoramic Survey Telescope And Rapid Response System (Pan-STARRS1, PS1) catalog and obtained a total of 48,553,987 sources that have at least one photometric measurement in each of the SAGES $u_s$ and $v_s$ and PS1 $grizy$ passbands; this constitutes the DR1 of SAGES, which will be released with our paper. We compared our $gri$ point-source photometry with that of PS1 and found an RMS scatter of $\sim2$% in the difference between PS1 and SAGES for the same band. We estimated the internal photometric precision of SAGES to be on the order of $\sim1$%. Astrometric precision is better than $0^{\prime\prime}.2$ based on comparison with the DR1 of the Gaia mission. In this paper, we also describe the final end-user database, and provide some science applications.

Submitted 28 June, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

Comments: 49 pages, 21 figures, 5 tables, accepted for publication in ApJS

arXiv:2306.10332 [pdf, other]

doi 10.3847/1538-4357/acdf42

Direct observational evidence of the multi-scale, dynamical mass accretion toward a high-mass star forming hub-filament system

Authors: Dongting Yang, Hong-Li Liu, Anandmayee Tej, Tie Liu, Patricio Sanhueza, Sheng-Li Qin, Xing Lu, Ke Wang, Sirong Pan, Feng-Wei Xu, Enrique Vazquez-Semadeni, Shanghuo Li, Gilberto C. Gomez, Aina Palau, Guido Garay, Paul F. Goldsmith, Mika Juvela, Anindya Saha, Leonardo Bronfman, Chang Won Lee, Kenichi Tatematsu, Lokesh Dewangan, Jianwen Zhou, Yong Zhang, Amelia Stutz , et al. (6 additional authors not shown)

Abstract: There is growing evidence that high-mass star formation and hub-filament systems (HFS) are intricately linked. The gas kinematics along the filaments and the forming high-mass star(s) in the central hub are in excellent agreement with the new generation of global hierarchical high-mass star formation models. In this paper, we present an observational investigation of a typical HFS cloud, G310.142+0.758 (G310 hereafter) which reveals unambiguous evidence of mass inflow from the cloud scale via the filaments onto the forming protostar(s) at the hub conforming with the model predictions. Continuum and molecular line data from the ATOMS and MALT90 surveys are used that cover different spatial scales. Three filaments (with total mass $5.7\pm1.1\times 10^3~M_{\odot}$) are identified converging toward the central hub region where several signposts of high-mass star formation have been observed. The hub region contains a massive clump ($1280\pm260~M_{\odot}$) harbouring a central massive core. Additionally, five outflow lobes are associated with the central massive core implying a forming cluster. The observed large-scale, smooth and coherent velocity gradients from the cloud down to the core scale, and the signatures of infall motion seen in the central massive clump and core, clearly unveil a nearly-continuous, multi-scale mass accretion/transfer process at a similar mass infall rate of $\sim 10^{-3}~M_{\odot}~yr^{-1}$ over all scales, feeding the central forming high-mass protostar(s) in the G310 HFS cloud.

Submitted 17 June, 2023; originally announced June 2023.

Comments: Accepted for publication in ApJ. 10 pages with 6 figures and 2 tables

arXiv:2306.03089 [pdf, other]

Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models

Authors: Andrew F. Luo, Margaret M. Henderson, Leila Wehbe, Michael J. Tarr

Abstract: A long standing goal in neuroscience has been to elucidate the functional organization of the brain. Within higher visual cortex, functional accounts have remained relatively coarse, focusing on regions of interest (ROIs) and taking the form of selectivity for broad categories such as faces, places, bodies, food, or words. Because the identification of such ROIs has typically relied on manually assembled stimulus sets consisting of isolated objects in non-ecological contexts, exploring functional organization without robust a priori hypotheses has been challenging. To overcome these limitations, we introduce a data-driven approach in which we synthesize images predicted to activate a given brain region using paired natural images and fMRI recordings, bypassing the need for category-specific stimuli. Our approach -- Brain Diffusion for Visual Exploration ("BrainDiVE") -- builds on recent generative methods by combining large-scale diffusion models with brain-guided image synthesis. Validating our method, we demonstrate the ability to synthesize preferred images with appropriate semantic specificity for well-characterized category-selective ROIs. We then show that BrainDiVE can characterize differences between ROIs selective for the same high-level category. Finally we identify novel functional subdivisions within these ROIs, validated with behavioral data. These results advance our understanding of the fine-grained functional organization of human visual cortex, and provide well-specified constraints for further examination of cortical organization using hypothesis-driven methods.

Submitted 28 November, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023 (Oral). Project page: https://www.cs.cmu.edu/~afluo/BrainDiVE/

arXiv:2306.00306 [pdf, other]

Low-Light Image Enhancement with Wavelet-based Diffusion Models

Authors: Hai Jiang, Ao Luo, Songchen Han, Haoqiang Fan, Shuaicheng Liu

Abstract: Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration. To address these issues, we propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL. Specifically, we present a wavelet-based conditional diffusion model (WCDM) that leverages the generative power of diffusion models to produce results with satisfactory perceptual fidelity. It also takes advantage of the strengths of wavelet transformation to greatly accelerate inference and reduce computational resource usage without sacrificing information. To avoid chaotic content and diversity, we perform both forward diffusion and denoising in the training phase of WCDM, enabling the model to achieve stable denoising and reduce randomness during inference. Moreover, we further design a high-frequency restoration module (HFRM) that utilizes the vertical and horizontal details of the image to complement the diagonal information for better fine-grained restoration. Extensive experiments on publicly available real-world benchmarks demonstrate that our method outperforms the existing state-of-the-art methods both quantitatively and visually, and it achieves remarkable improvements in efficiency compared to previous diffusion-based methods. In addition, we empirically show that the application for low-light face detection also reveals the latent practical values of our method. Code is available at https://github.com/JianghaiSCU/Diffusion-Low-Light.

Submitted 25 September, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

Comments: Accepted by SIGGRAPH Asia 2023 (ACM Transactions on Graphics)
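
Note: the abstract above says a wavelet transformation reduces computation "without sacrificing information". The sketch below illustrates that general property with a one-level 2D Haar transform, which quarters the spatial size of each sub-band yet is exactly invertible. Haar is an assumption chosen for simplicity (the abstract does not name the wavelet), and the function and sub-band names are hypothetical; this is not the DiffLL implementation.

```python
import numpy as np


def haar_dwt2(img):
    """One-level orthonormal 2D Haar transform of an array whose height
    and width are both even. Returns one average band and three detail
    bands, each with half the height and half the width of the input."""
    a = img[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    average = (a + b + c + d) / 2.0
    # Sub-band naming conventions differ between libraries; these names
    # describe the direction along which each difference is taken.
    diff_across_columns = (a - b + c - d) / 2.0
    diff_across_rows = (a + b - c - d) / 2.0
    diff_diagonal = (a - b - c + d) / 2.0
    return average, diff_across_columns, diff_across_rows, diff_diagonal


def haar_idwt2(average, diff_across_columns, diff_across_rows, diff_diagonal):
    """Exact inverse of haar_dwt2."""
    h, w = average.shape
    img = np.empty((2 * h, 2 * w), dtype=average.dtype)
    img[0::2, 0::2] = (average + diff_across_columns + diff_across_rows + diff_diagonal) / 2.0
    img[0::2, 1::2] = (average - diff_across_columns + diff_across_rows - diff_diagonal) / 2.0
    img[1::2, 0::2] = (average + diff_across_columns - diff_across_rows - diff_diagonal) / 2.0
    img[1::2, 1::2] = (average - diff_across_columns - diff_across_rows + diff_diagonal) / 2.0
    return img


# Hypothetical 8x8 grayscale patch with random values.
rng = np.random.default_rng(0)
patch = rng.random((8, 8))

bands = haar_dwt2(patch)
restored = haar_idwt2(*bands)
```

Each band in `bands` is 4x4, so a model operating on the average band sees a quarter of the pixels, while `restored` matches `patch` up to floating-point rounding.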

arXiv:2305.10217 [pdf, other]

doi 10.1088/1674-4527/acd67e

Deep Learning Applications Based on WISE Infrared Data: Classification of Stars, Galaxies and Quasars

Authors: Guiyu Zhao, Bo Qiu, A-Li Luo, Xiaoyu Guo, Lin Yao, Kun Wang, Yuanbo Liu

Abstract: The Wide-field Infrared Survey Explorer (WISE) has detected hundreds of millions of sources over the entire sky. However, classifying them reliably is a great challenge due to degeneracies in WISE multicolor space and low detection levels in its two longest-wavelength bandpasses. In this paper, the deep learning classification network, IICnet (Infrared Image Classification network), is designed to classify sources from WISE images to achieve a more accurate classification goal. IICnet shows good ability on the feature extraction of the WISE sources. Experiments demonstrate that the classification results of IICnet are superior to some other methods; it has obtained 96.2% accuracy for galaxies, 97.9% accuracy for quasars, and 96.4% accuracy for stars, and the Area Under Curve (AUC) of the IICnet classifier can reach more than 99%. In addition, the superiority of IICnet in processing infrared images has been demonstrated in the comparisons with VGG16, GoogleNet, ResNet34, MobileNet, EfficientNetV2, and RepVGG: IICnet has fewer parameters and faster inference. The above proves that IICnet is an effective method to classify infrared sources.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.09294 [pdf, other]

doi 10.3847/1538-4365/acd05b

S-type stars from LAMOST DR10: classification of intrinsic and extrinsic stars

Authors: Jing Chen, Yin-Bi Li, A-Li Luo, Xiao-Xiao Ma, Shuo Li

Abstract: In this paper, we found 2939 S-type stars from LAMOST Data Release 10 using two machine-learning methods, and 2306 of them were reported for the first time. The main purpose of this work is to study how to divide S-type stars into intrinsic and extrinsic stars with photometric data and LAMOST spectra. Using infrared photometric data, we adopted two methods to distinguish S-type stars, i.e., the XGBoost algorithm and color-color diagrams. We trained an XGBoost model with 15 input features consisting of colors and absolute magnitudes of Two Micron All Sky Survey (2MASS), AllWISE, AKARI, and IRAS, and found that the model trained by input features with 2MASS, AKARI, and IRAS data has the highest accuracy of 95.52%. Furthermore, using this XGBoost model, we found four color-color diagrams with six infrared color criteria to divide S-type stars, which has an accuracy of about 90%. Applying the two methods to the 2939 S-type stars, 381 (XGBoost)/336 (color-color diagrams) intrinsic and 495 (XGBoost)/82 (color-color diagrams) extrinsic stars were classified, respectively. Using these photometrically classified intrinsic and extrinsic stars, we retrained the XGBoost model with their blue and red medium-resolution spectra, and the 2939 stars were divided into 855 intrinsic and 2056 extrinsic stars from spectra with an accuracy of 94.82%. In addition, we also found that four spectral regions of Zr I (6451.6A), Ne II (6539.6A), Hα (6564.5A), and Fe I (6609.1A) and C I (6611.4A) are the most important features, which can reach an accuracy of 92.1% when using them to classify S-type stars.

Submitted 16 May, 2023; originally announced May 2023.

Comments: 21 pages,13 figures, Accepted by ApJS

arXiv:2305.09191 [pdf, other]

doi 10.3847/1538-4365/acd69c

Ionized gas metallicity of the strong [O III]λ5007 emission-line compact galaxies in the LAMOST survey

Authors: Siqi Liu, A-Li Luo, Wei Zhang, Xiao Kong, Yong-Heng Zhao

Abstract: This article reports a sample of 1830 strong [O III] λ5007 emission-line compact galaxies discovered with the LAMOST spectroscopic survey and the photometric catalog of SDSS. We newly identify 402 spectra of 346 strong [O III]λ5007 emission-line compact galaxies by finding compact isolated point sources. Combined with the samples in our previous work (Liu et al. 2022), this returns a sample of 1830 unique strong [O III]λ5007 emission-line compact galaxies with 2033 spectra of z <= 0.53. For the sources with 2σ [OIII]λ4363 detections, we calculate the gas-phase metallicity with the direct-Te method, and verify that the strong-line metallicity diagnostics calibrated with the direct-Te method also applies to this sample. The strong [O III]λ5007 emission-line compact galaxies fall below several Te-calibrated mass-metallicity relations. The N/O measurements of the strong [O iii]λ5007 emission-line compact galaxies mainly locate at a plateau at low metallicity, indicating the product of primary nucleosynthesis. The Ne3O2 and O32 relation follows a tight linear relation with no redshift evolution. The Ne3O2 anti-correlates with the stellar mass, and at fixed stellar mass the Ne3O2 increase with the redshift. Eight sources with asymmetric [O III]λ5007 emission-line profiles have been identified, however with no [O III]λ4363 detection, which proves the rich metal content and complex ionized gas kinematics within the galaxies. Higher-resolution spectroscopy will be necessary to identify the ionized gas components in detail.

Submitted 16 May, 2023; originally announced May 2023.

Comments: 20 pages, 13 pictures, accepted by ApJS

arXiv:2305.05854 [pdf, other]

doi 10.3847/1538-4365/acce36

Stellar Parameters and Chemical Abundances Estimated from LAMOST-II DR8 MRS based on Cycle-StarNet

Authors: Rui Wang, A-Li Luo, Shuo Zhang, Yuan-Sen Ting, Teaghan O'Briain, LAMOST MRS Collaboration

Abstract: Deriving stellar atmospheric parameters and chemical abundances from stellar spectra is crucial for understanding the evolution of the Milky Way. By performing a fitting with MARCS model atmospheric theoretical synthetic spectra combined with a domain-adaptation method, we estimate the fundamental stellar parameters (Teff, log g, [Fe/H], vmic, and vmac) and 11 chemical abundances for 1.38 million FGKM-type stars of the Medium-Resolution Spectroscopic Survey (MRS) from LAMOST-II DR8. The domain-adaptation method, Cycle-StarNet, is employed to reduce the gap between observed and synthetic spectra, and the L-BFGS algorithm is used to search for the best-fit synthetic spectra. By combining the 2MASS photometric survey data, Gaia EDR3 parallax, and MIST isochrones, the surface gravities of the stars are constrained after estimating their bolometric luminosities. The accuracy of Teff, log g, and [Fe/H] can reach 150 K, 0.11 dex, and 0.15 dex, evaluated by the PASTEL catalog, asteroseismic samples, and other spectroscopic surveys. The precision of these parameters and elemental abundances ([C/Fe], [Na/Fe], [Mg/Fe], [Si/Fe], [Ca/Fe], [Ti/Fe], [Cr/Fe], [Mn/Fe], [Co/Fe], [Ni/Fe], and [Cu/Fe]) is assessed by repeated observations and validated by cluster members. For spectra with signal-to-noise (S/N) ratios greater than 10, the precision of the three stellar parameters and elemental abundances can achieve 76 K, 0.014 dex, 0.096 dex, and 0.04-0.15 dex.
For spectra with S/N ratios higher than 100, the precision stabilizes at 22 K, 0.006 dex, 0.043 dex, and 0.01-0.06 dex. The full LAMOST MRS stellar properties catalog is available online.

Submitted 9 May, 2023; originally announced May 2023.

Comments: Accepted for publication in ApJS

arXiv:2305.05167 [pdf, other]

doi 10.1088/1674-4527/accb78

The HI gas fraction scaling relation of the Green Pea galaxies

Authors: Siqi Liu, A-Li Luo, Wei Zhang, Yan-Xia Zhang, Xiao Kong, Yong-Heng Zhao

Abstract: Green Pea galaxies are compact galaxies with high star formation rates. However, limited samples of Green Pea galaxies have HI 21 cm measurements. Whether the HI gas fraction f_{HI} = M_{HI}/M_{*} of Green Pea galaxies follows the existing scaling relations between the f_{HI} and NUV-r color or linear combinations of color and other physical quantities needs checking. Using archival data of HI 21cm observations, we investigate the scaling relation of the NUV-r color with the M_{HI}/M_{*} of 38 Green Pea galaxies, including 17 detections and 21 non-detections. The HI to stellar mass ratios (f_{HI}) of Green Pea galaxies deviate from the polynomial form, where a higher HI gas fraction is predicted given the current NUV-r color, even with the emission lines removed. The blue sources (NUV-r<1) from the comparison sample (ALFALFA-SDSS) follow a similar trend. The HI gas fraction scaling relations with linear combination forms of -0.34(NUV-r) - 0.64 log(mu_{*,z}) + 5.94 and -0.77 log mu_{*,i} + 0.26 log SFR/M_{*}+8.53, better predict the HI gas fraction of the Green Pea galaxies. In order to obtain accurate linear combined forms, higher-resolution photometry from space-based telescopes is needed.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 15 pages, 7 figures, to be published in RAA

arXiv:2304.12489 [pdf, other]

Beyond the Prior Forgery Knowledge: Mining Critical Clues for General Face Forgery Detection

Authors: Anwei Luo, Chenqi Kong, Jiwu Huang, Yongjian Hu, Xiangui Kang, Alex C. Kot

Abstract: Face forgery detection is essential in combating malicious digital face attacks. Previous methods mainly rely on prior expert knowledge to capture specific forgery clues, such as noise patterns, blending boundaries, and frequency artifacts. However, these methods tend to get trapped in local optima, resulting in limited robustness and generalization capability. To address these issues, we propose a novel Critical Forgery Mining (CFM) framework, which can be flexibly assembled with various backbones to boost their generalization and robustness performance. Specifically, we first build a fine-grained triplet and suppress specific forgery traces through prior knowledge-agnostic data augmentation. Subsequently, we propose a fine-grained relation learning prototype to mine critical information in forgeries through instance and local similarity-aware losses. Moreover, we design a novel progressive learning controller to guide the model to focus on principal feature components, enabling it to learn critical forgery features in a coarse-to-fine manner. The proposed method achieves state-of-the-art forgery detection performance under various challenging evaluation settings.

Submitted 24 April, 2023; originally announced April 2023.
