-
Noise-Resilient Unsupervised Graph Representation Learning via Multi-Hop Feature Quality Estimation
Authors:
Shiyuan Li,
Yixin Liu,
Qingfeng Chen,
Geoffrey I. Webb,
Shirui Pan
Abstract:
Unsupervised graph representation learning (UGRL) based on graph neural networks (GNNs), has received increasing attention owing to its efficacy in handling graph-structured data. However, existing UGRL methods ideally assume that the node features are noise-free, which makes them fail to distinguish between useful information and noise when applied to real data with noisy features, thus affecting…
▽ More
Unsupervised graph representation learning (UGRL) based on graph neural networks (GNNs), has received increasing attention owing to its efficacy in handling graph-structured data. However, existing UGRL methods ideally assume that the node features are noise-free, which makes them fail to distinguish between useful information and noise when applied to real data with noisy features, thus affecting the quality of learned representations. This urges us to take node noisy features into account in real-world UGRL. With empirical analysis, we reveal that feature propagation, the essential operation in GNNs, acts as a "double-edged sword" in handling noisy features - it can both denoise and diffuse noise, leading to varying feature quality across nodes, even within the same node at different hops. Building on this insight, we propose a novel UGRL method based on Multi-hop feature Quality Estimation (MQE for short). Unlike most UGRL models that directly utilize propagation-based GNNs to generate representations, our approach aims to learn representations through estimating the quality of propagated features at different hops. Specifically, we introduce a Gaussian model that utilizes a learnable "meta-representation" as a condition to estimate the expectation and variance of multi-hop propagated features via neural networks. In this way, the "meta representation" captures the semantic and structural information underlying multiple propagated features but is naturally less susceptible to interference by noise, thereby serving as high-quality node representations beneficial for downstream tasks. Extensive experiments on multiple real-world datasets demonstrate that MQE in learning reliable node representations in scenarios with diverse types of feature noise.
△ Less
Submitted 29 July, 2024;
originally announced July 2024.
-
Robust Preference for Dynamical Dark Energy in DESI BAO and SN Measurements
Authors:
William Giarè,
Mahdi Najafi,
Supriya Pan,
Eleonora Di Valentino,
Javad T. Firouzjaee
Abstract:
Recent Baryon Acoustic Oscillation (BAO) measurements released by DESI, when combined with Cosmic Microwave Background (CMB) data from Planck and two different samples of Type Ia supernovae (Pantheon-Plus and DESY5) reveal a preference for Dynamical Dark Energy (DDE) characterized by a present-day quintessence-like equation of state that crossed into the phantom regime in the past. A core ansatz f…
▽ More
Recent Baryon Acoustic Oscillation (BAO) measurements released by DESI, when combined with Cosmic Microwave Background (CMB) data from Planck and two different samples of Type Ia supernovae (Pantheon-Plus and DESY5) reveal a preference for Dynamical Dark Energy (DDE) characterized by a present-day quintessence-like equation of state that crossed into the phantom regime in the past. A core ansatz for this result is assuming a linear Chevallier-Polarski-Linder (CPL) parameterization $w(a) = w_0 + w_a (1-a)$ to describe the evolution of the DE equation of state (EoS). In this paper, we test if and to what extent this assumption impacts the results. To prevent broadening uncertainties in cosmological parameter inference and facilitate direct comparison with the baseline CPL case, we focus on 4 alternative well-known models that, just like CPL, consist of only two free parameters: the present-day DE EoS ($w_0$) and a parameter quantifying its dynamical evolution ($w_a$). We demonstrate that the preference for DDE remains robust regardless of the parameterization: $w_0$ consistently remains in the quintessence regime, while $w_a$ consistently indicates a preference for a dynamical evolution towards the phantom regime. This tendency is significantly strengthened by DESY5 SN measurements. By comparing the best-fit $χ^2$ obtained within each DDE model, we notice that the linear CPL parameterization is not the best-fitting case. Among the models considered, the EoS proposed by Barboza and Alcaniz consistently leads to the most significant improvement.
△ Less
Submitted 23 July, 2024;
originally announced July 2024.
-
Coupled phantom cosmological model motivated by the warm inflationary paradigm
Authors:
Sudip Halder,
Supriya Pan,
Paulo M. Sá,
Tapan Saha
Abstract:
In this article, we investigate a coupled phantom dark-energy cosmological model in which the coupling term between a phantom scalar field with an exponential potential and a pressureless dark-matter fluid is motivated by the warm inflationary paradigm. Using methods of qualitative analysis of dynamical systems, complemented by numerical solutions of the evolution equations, we study the late-time…
▽ More
In this article, we investigate a coupled phantom dark-energy cosmological model in which the coupling term between a phantom scalar field with an exponential potential and a pressureless dark-matter fluid is motivated by the warm inflationary paradigm. Using methods of qualitative analysis of dynamical systems, complemented by numerical solutions of the evolution equations, we study the late-time behavior of our model. We show that contrary to the uncoupled scenario, the coupled phantom model admits accelerated scaling solutions. However, they do not correspond to a final state of the universe's evolution and, therefore, cannot be used to solve the cosmological coincidence problem. Furthermore, we show that, for certain coupling parameter values, the total equation-of-state parameter's asymptotic behavior is significantly changed when compared to the uncoupled scenario, allowing for solutions less phantom even for steeper potentials of the phantom scalar field.
△ Less
Submitted 22 July, 2024;
originally announced July 2024.
-
A Solution toward Transparent and Practical AI Regulation: Privacy Nutrition Labels for Open-source Generative AI-based Applications
Authors:
Meixue Si,
Shidong Pan,
Dianshu Liao,
Xiaoyu Sun,
Zhen Tao,
Wenchang Shi,
Zhenchang Xing
Abstract:
The rapid development and widespread adoption of Generative Artificial Intelligence-based (GAI) applications have greatly enriched our daily lives, benefiting people by enhancing creativity, personalizing experiences, improving accessibility, and fostering innovation and efficiency across various domains. However, along with the development of GAI applications, concerns have been raised about tran…
▽ More
The rapid development and widespread adoption of Generative Artificial Intelligence-based (GAI) applications have greatly enriched our daily lives, benefiting people by enhancing creativity, personalizing experiences, improving accessibility, and fostering innovation and efficiency across various domains. However, along with the development of GAI applications, concerns have been raised about transparency in their privacy practices. Traditional privacy policies often fail to effectively communicate essential privacy information due to their complexity and length, and open-source community developers often neglect privacy practices even more. Only 12.2% of examined open-source GAI apps provide a privacy policy. To address this, we propose a regulation-driven GAI Privacy Label and introduce Repo2Label, a novel framework for automatically generating these labels based on code repositories. Our user study indicates a common endorsement of the proposed GAI privacy label format. Additionally, Repo2Label achieves a precision of 0.81, recall of 0.88, and F1-score of 0.84 based on the benchmark dataset, significantly outperforming the developer self-declared privacy notices. We also discuss the common regulatory (in)compliance of open-source GAI apps, comparison with other privacy notices, and broader impacts to different stakeholders. Our findings suggest that Repo2Label could serve as a significant tool for bolstering the privacy transparency of GAI apps and make them more practical and responsible.
△ Less
Submitted 22 July, 2024;
originally announced July 2024.
-
Dynamical dark energy confronted with multiple CMB missions
Authors:
Mahdi Najafi,
Supriya Pan,
Eleonora Di Valentino,
Javad T. Firouzjaee
Abstract:
The measurements of the cosmic microwave background (CMB) have played a significant role in understanding the nature of dark energy. In this article, we investigate the dynamics of the dark energy equation of state, utilizing high-precision CMB data from multiple experiments. We begin by examining the Chevallier-Polarski-Linder (CPL) parametrization, a commonly used and recognized framework for de…
▽ More
The measurements of the cosmic microwave background (CMB) have played a significant role in understanding the nature of dark energy. In this article, we investigate the dynamics of the dark energy equation of state, utilizing high-precision CMB data from multiple experiments. We begin by examining the Chevallier-Polarski-Linder (CPL) parametrization, a commonly used and recognized framework for describing the dark energy equation of state. We then explore the general Exponential parametrization, which incorporates CPL as its first-order approximation, and extensions of this parametrization that incorporate nonlinear terms. We constrain these scenarios using CMB data from various missions, including the Planck 2018 legacy release, the Wilkinson Microwave Anisotropy Probe (WMAP), the Atacama Cosmology Telescope (ACT), and the South Pole Telescope (SPT), as well as combinations with low-redshift cosmological probes such as Baryon Acoustic Oscillations (BAO) and the Pantheon sample. While the $Λ$CDM cosmology remains consistent within the 68\% confidence level, we observe that the extensions of the CPL parametrization are indistinguishable for Planck data. However, for ACT and SPT data, the inclusion of additional terms begins to reveal a peak in $w_{\rm a, DE}$ that was previously unconstrained.
△ Less
Submitted 20 July, 2024;
originally announced July 2024.
-
Supernova Pointing Capabilities of DUNE
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1340 additional authors not shown)
Abstract:
The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electr…
▽ More
The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.
△ Less
Submitted 14 July, 2024;
originally announced July 2024.
-
Inoculating solid-state homogeneous precipitation by impurity atoms through a spinodal decomposition like pathway
Authors:
Shiwei Pan,
Chunan Li,
Hanne-Sofie Søreide,
Dongdong Zhao,
Constantinos Hatzoglou,
Feng Qian,
Long-Qing Chen,
Yanjun Li
Abstract:
Solid-state homogeneous precipitation of nano-sized precipitates is one of the most effective processes to strengthen metal alloys, where the final density and size distribution of precipitates are largely controlled by the precipitation kinetics. Here, we report a strategy to inoculate the homogeneous precipitation of coherent precipitates to enhance the precipitation strengthening. Using the tec…
▽ More
Solid-state homogeneous precipitation of nano-sized precipitates is one of the most effective processes to strengthen metal alloys, where the final density and size distribution of precipitates are largely controlled by the precipitation kinetics. Here, we report a strategy to inoculate the homogeneous precipitation of coherent precipitates to enhance the precipitation strengthening. Using the technologically important dilute Al-Zr alloys as an example, we demonstrate that an addition of a trace level of economical and readily available, non-L1$_{2}$ phase forming impurity atoms, X (X= Sn, Sb, Bi or Cd) and Si, can significantly enhance the diffusivity of Zr atoms and overturn the precipitation of L1$_{2}$-structured Al$_{3}$Zr nanoparticles from the classical homogeneous nucleation and growth pathway into a nonclassical nucleation pathway: Al$_{3}$Zr forms through the spontaneous formation of nano-scale local concentration fluctuations of Zr atoms on Zr-X(-Si)-vacancy clusters followed by a continuous increase of the concentration and chemical short-range ordering (CSRO). Such an impurity atoms induced heterogeneous nucleation based on a "spinodal decomposition like" mechanism dramatically accelerates the precipitation kinetics, leading to an order of magnitude higher number density of precipitates and a record high hardening efficiency of solute Zr atoms. By formulating the generalized selection principles for inoculating impurity elements, this inoculation strategy should be extendable to a broader range of materials to further explore the precipitation strengthening potentials.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
Authors:
Yuheng Li,
Tianyu Luan,
Yizhou Wu,
Shaoyan Pan,
Yenho Chen,
Xiaofeng Yang
Abstract:
Due to the scarcity of labeled data, self-supervised learning (SSL) has gained much attention in 3D medical image segmentation, by extracting semantic representations from unlabeled data. Among SSL strategies, Masked image modeling (MIM) has shown effectiveness by reconstructing randomly masked images to learn detailed representations. However, conventional MIM methods require extensive training d…
▽ More
Due to the scarcity of labeled data, self-supervised learning (SSL) has gained much attention in 3D medical image segmentation, by extracting semantic representations from unlabeled data. Among SSL strategies, Masked image modeling (MIM) has shown effectiveness by reconstructing randomly masked images to learn detailed representations. However, conventional MIM methods require extensive training data to achieve good performance, which still poses a challenge for medical imaging. Since random masking uniformly samples all regions within medical images, it may overlook crucial anatomical regions and thus degrade the pretraining efficiency. We propose AnatoMask, a novel MIM method that leverages reconstruction loss to dynamically identify and mask out anatomically significant regions to improve pretraining efficacy. AnatoMask takes a self-distillation approach, where the model learns both how to find more significant regions to mask and how to reconstruct these masked regions. To avoid suboptimal learning, Anatomask adjusts the pretraining difficulty progressively using a masking dynamics function. We have evaluated our method on 4 public datasets with multiple imaging modalities (CT, MRI, and PET). AnatoMask demonstrates superior performance and scalability compared to existing SSL methods. The code is available at https://github.com/ricklisz/AnatoMask.
△ Less
Submitted 16 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
Cubic interactions for massless and partially massless spin-1 and spin-2 fields
Authors:
Nicolas Boulanger,
Sebastian Garcia-Saenz,
Songsong Pan,
Lucas Traina
Abstract:
We perform a complete classification of the consistent two-derivative cubic couplings for a system containing an arbitrary number of massless spin-1, massless spin-2, and partially massless (PM) spin-2 fields in $D$-dimensional (anti-)de Sitter space. In addition to previously known results, we find a unique candidate mixing between spin-1 and PM spin-2 fields. We derive all the quadratic constrai…
▽ More
We perform a complete classification of the consistent two-derivative cubic couplings for a system containing an arbitrary number of massless spin-1, massless spin-2, and partially massless (PM) spin-2 fields in $D$-dimensional (anti-)de Sitter space. In addition to previously known results, we find a unique candidate mixing between spin-1 and PM spin-2 fields. We derive all the quadratic constraints on the structure constants of the theory, allowing for relative ``wrong-sign'' kinetic terms for any of the fields. In the particular case when the kinetic terms in each sector have no relative signs, we find that the unique consistent non-trivial theory is given by multiple independent copies of conformal gravity coupled to a Yang-Mills sector in $D=4$. Our results strengthen the well-known no-go theorems on the absence of mutual interactions for massless and PM spin-2 fields.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Image-Conditional Diffusion Transformer for Underwater Image Enhancement
Authors:
Xingyang Nie,
Su Pan,
Xiaoyu Zhai,
Shifei Tao,
Fengzhong Qu,
Biao Wang,
Huilin Ge,
Guojie Xiao
Abstract:
Underwater image enhancement (UIE) has attracted much attention owing to its importance for underwater operation and marine engineering. Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). Our method takes the degraded underwater image as the conditional input and converts it into latent space where ICDT is ap…
▽ More
Underwater image enhancement (UIE) has attracted much attention owing to its importance for underwater operation and marine engineering. Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). Our method takes the degraded underwater image as the conditional input and converts it into latent space where ICDT is applied. ICDT replaces the conventional U-Net backbone in a denoising diffusion probabilistic model (DDPM) with a transformer, and thus inherits favorable properties such as scalability from transformers. Furthermore, we train ICDT with a hybrid loss function involving variances to achieve better log-likelihoods, which meanwhile significantly accelerates the sampling process. We experimentally assess the scalability of ICDTs and compare with prior works in UIE on the Underwater ImageNet dataset. Besides good scaling properties, our largest model, ICDT-XL/2, outperforms all comparison methods, achieving state-of-the-art (SOTA) quality of image enhancement.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
Automatically Analyzing Performance Issues in Android Apps: How Far Are We?
Authors:
Dianshu Liao,
Shidong Pan,
Siyuan Yang,
Yitong Wang,
Yanjie Zhao,
Zhenchang Xing,
Xiaoyu Sun
Abstract:
Performance plays a critical role in ensuring the smooth operation of any mobile application, directly influencing user engagement and retention. Android applications are no exception. However, unlike functionality issues, performance issues are more challenging to discover as their root causes are sophisticated and typically emerge under specific payloads. To tackle this problem, researchers have…
▽ More
Performance plays a critical role in ensuring the smooth operation of any mobile application, directly influencing user engagement and retention. Android applications are no exception. However, unlike functionality issues, performance issues are more challenging to discover as their root causes are sophisticated and typically emerge under specific payloads. To tackle this problem, researchers have dedicated substantial efforts to proposing automatic approaches for understanding, detecting, and resolving performance issues. Despite these endeavors, it still remains unknown what the status quo of Android performance analysis is, and whether existing approaches can indeed accurately reflect real performance issues. To fill this research gap, we conducted a systematic literature review followed by an explanatory study to explore relevant studies and real-world challenges. Our findings reveal that current tools have limited capabilities, covering only 17.50% of the performance issues. Additionally, existing datasets encompass only 27.50% of the issues and are very limited in size. We also show real-world issue patterns, underscoring the huge gap between the identified techniques and practical concerns. Furthermore, possible solutions are provided to guide future research towards achieving effective performance issue detection and resolution.
△ Less
Submitted 6 July, 2024;
originally announced July 2024.
-
The Solution for the sequential task continual learning track of the 2nd Greater Bay Area International Algorithm Competition
Authors:
Sishun Pan,
Xixian Wu,
Tingmin Li,
Longfei Huang,
Mingxu Feng,
Zhonghua Wan,
Yang Yang
Abstract:
This paper presents a data-free, parameter-isolation-based continual learning algorithm we developed for the sequential task continual learning track of the 2nd Greater Bay Area International Algorithm Competition. The method learns an independent parameter subspace for each task within the network's convolutional and linear layers and freezes the batch normalization layers after the first task. S…
▽ More
This paper presents a data-free, parameter-isolation-based continual learning algorithm we developed for the sequential task continual learning track of the 2nd Greater Bay Area International Algorithm Competition. The method learns an independent parameter subspace for each task within the network's convolutional and linear layers and freezes the batch normalization layers after the first task. Specifically, for domain incremental setting where all domains share a classification head, we freeze the shared classification head after first task is completed, effectively solving the issue of catastrophic forgetting. Additionally, facing the challenge of domain incremental settings without providing a task identity, we designed an inference task identity strategy, selecting an appropriate mask matrix for each sample. Furthermore, we introduced a gradient supplementation strategy to enhance the importance of unselected parameters for the current task, facilitating learning for new tasks. We also implemented an adaptive importance scoring strategy that dynamically adjusts the amount of parameters to optimize single-task performance while reducing parameter usage. Moreover, considering the limitations of storage space and inference time, we designed a mask matrix compression strategy to save storage space and improve the speed of encryption and decryption of the mask matrix. Our approach does not require expanding the core network or using external auxiliary networks or data, and performs well under both task incremental and domain incremental settings. This solution ultimately won a second-place prize in the competition.
△ Less
Submitted 6 July, 2024;
originally announced July 2024.
-
The Solution for the AIGC Inference Performance Optimization Competition
Authors:
Sishun Pan,
Haonan Xu,
Zhonghua Wan,
Yang Yang
Abstract:
In recent years, the rapid advancement of large-scale pre-trained language models based on transformer architectures has revolutionized natural language processing tasks. Among these, ChatGPT has gained widespread popularity, demonstrating human-level conversational abilities and attracting over 100 million monthly users by late 2022. Concurrently, Baidu's commercial deployment of the Ernie Wenxin…
▽ More
In recent years, the rapid advancement of large-scale pre-trained language models based on transformer architectures has revolutionized natural language processing tasks. Among these, ChatGPT has gained widespread popularity, demonstrating human-level conversational abilities and attracting over 100 million monthly users by late 2022. Concurrently, Baidu's commercial deployment of the Ernie Wenxin model has significantly enhanced marketing effectiveness through AI-driven technologies. This paper focuses on optimizing high-performance inference for Ernie models, emphasizing GPU acceleration and leveraging the Paddle inference framework. We employ techniques such as Faster Transformer for efficient model processing, embedding layer pruning to reduce computational overhead, and FP16 half-precision inference for enhanced computational efficiency. Additionally, our approach integrates efficient data handling strategies using multi-process parallel processing to minimize latency. Experimental results demonstrate that our optimized solution achieves up to an 8.96x improvement in inference speed compared to standard methods, while maintaining competitive performance.
△ Less
Submitted 6 July, 2024;
originally announced July 2024.
-
Classification of Power Quality Disturbances Using Resnet with Channel Attention Mechanism
Authors:
Su Pan,
Xingyang Nie,
Xiaoyu Zhai,
Biao Wang,
Huilin Ge,
Cheng He,
Zhenping Ding
Abstract:
The detection and classification of power quality disturbances (PQDs) carries significant importance for power systems. In response to this imperative, numerous intelligent diagnostic methods have been developed. However, existing identification methods usually concentrate on single-type signals or on complex signals with two types, rendering them susceptible to noisy labels and environmental effe…
▽ More
The detection and classification of power quality disturbances (PQDs) carries significant importance for power systems. In response to this imperative, numerous intelligent diagnostic methods have been developed. However, existing identification methods usually concentrate on single-type signals or on complex signals with two types, rendering them susceptible to noisy labels and environmental effects. This study proposes a novel method for the classification of PQDs, termed ST-GSResNet, which utilizes the S-Transform and an improved residual neural network (ResNet) with a channel attention mechanism. The ST-GSResNet approach initially uses the S-Transform to transform a time-series signal into a 2D time-frequency image for feature enhancement. Then, an improved ResNet model is introduced, which employs grouped convolution instead of the traditional convolution operation. This improvement aims to facilitate learning with a block-diagonal structured sparsity on the channel dimension, the highly-correlated filters are learned in a more structured way in the networks with filter groups. By reducing the number of parameters in the network in this significant manner, the model becomes less prone to overfitting. Furthermore, the SE module concentrates on primary components, which enhances the model's robustness in recognition and immunity to noise. Experimental results demonstrate that, compared to existing deep learning models, our approach has advantages in computational efficiency and classification accuracy.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps
Authors:
Shidong Pan,
Tianchen Guo,
Lihong Zhang,
Pei Liu,
Zhenchang Xing,
Xiaoyu Sun
Abstract:
Application Programming Interface (API) incompatibility is a long-standing issue in Android application development. The rapid evolution of Android APIs results in a significant number of API additions, removals, and changes between adjacent versions. Unfortunately, this high frequency of alterations may lead to compatibility issues, often without adequate notification to developers regarding thes…
▽ More
Application Programming Interface (API) incompatibility is a long-standing issue in Android application development. The rapid evolution of Android APIs results in a significant number of API additions, removals, and changes between adjacent versions. Unfortunately, this high frequency of alterations may lead to compatibility issues, often without adequate notification to developers regarding these changes. Although researchers have proposed some work on detecting compatibility issues caused by changes in API signatures, they often overlook compatibility issues stemming from sophisticated semantic changes. In response to this challenge, we conducted a large-scale discovery of incompatible APIs in the Android Open Source Project (AOSP) by leveraging static analysis and pre-trained Large Language Models (LLMs) across adjacent versions. We systematically formulate the problem and propose a unified framework to detect incompatible APIs, especially for semantic changes. It's worth highlighting that our approach achieves a 0.83 F1-score in identifying semantically incompatible APIs in the Android framework. Ultimately, our approach detects 5,481 incompatible APIs spanning from version 4 to version 33. We further demonstrate its effectiveness in supplementing the state-of-the-art methods in detecting a broader spectrum of compatibility issues (+92.3%) that have been previously overlooked.
△ Less
Submitted 26 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
MindSpore Quantum: A User-Friendly, High-Performance, and AI-Compatible Quantum Computing Framework
Authors:
Xusheng Xu,
Jiangyu Cui,
Zidong Cui,
Runhong He,
Qingyu Li,
Xiaowei Li,
Yanling Lin,
Jiale Liu,
Wuxin Liu,
Jiale Lu,
Maolin Luo,
Chufan Lyu,
Shijie Pan,
Mosharev Pavel,
Runqiu Shu,
Jialiang Tang,
Ruoqian Xu,
Shu Xu,
Kang Yang,
Fan Yu,
Qingguo Zeng,
Haiying Zhao,
Qiang Zheng,
Junyuan Zhou,
Xu Zhou
, et al. (14 additional authors not shown)
Abstract:
We introduce MindSpore Quantum, a pioneering hybrid quantum-classical framework with a primary focus on the design and implementation of noisy intermediate-scale quantum (NISQ) algorithms. Leveraging the robust support of MindSpore, an advanced open-source deep learning training/inference framework, MindSpore Quantum exhibits exceptional efficiency in the design and training of variational quantum…
▽ More
We introduce MindSpore Quantum, a pioneering hybrid quantum-classical framework with a primary focus on the design and implementation of noisy intermediate-scale quantum (NISQ) algorithms. Leveraging the robust support of MindSpore, an advanced open-source deep learning training/inference framework, MindSpore Quantum exhibits exceptional efficiency in the design and training of variational quantum algorithms on both CPU and GPU platforms, delivering remarkable performance. Furthermore, this framework places a strong emphasis on enhancing the operational efficiency of quantum algorithms when executed on real quantum hardware. This encompasses the development of algorithms for quantum circuit compilation and qubit mapping, crucial components for achieving optimal performance on quantum processors. In addition to the core framework, we introduce QuPack, a meticulously crafted quantum computing acceleration engine. QuPack significantly accelerates the simulation speed of MindSpore Quantum, particularly in variational quantum eigensolver (VQE), quantum approximate optimization algorithm (QAOA), and tensor network simulations, providing astonishing speed. This combination of cutting-edge technologies empowers researchers and practitioners to explore the frontiers of quantum computing with unprecedented efficiency and performance.
△ Less
Submitted 10 July, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
The Championship-Winning Solution for the 5th CLVISION Challenge 2024
Authors:
Sishun Pan,
Tingmin Li,
Yang Yang
Abstract:
In this paper, we introduce our approach to the 5th CLVision Challenge, which presents distinctive challenges beyond traditional class incremental learning. Unlike standard settings, this competition features the recurrence of previously encountered classes and includes unlabeled data that may contain Out-of-Distribution (OOD) categories. Our approach is based on Winning Subnetworks to allocate in…
▽ More
In this paper, we introduce our approach to the 5th CLVision Challenge, which presents distinctive challenges beyond traditional class incremental learning. Unlike standard settings, this competition features the recurrence of previously encountered classes and includes unlabeled data that may contain Out-of-Distribution (OOD) categories. Our approach is based on Winning Subnetworks to allocate independent parameter spaces for each task addressing the catastrophic forgetting problem in class incremental learning and employ three training strategies: supervised classification learning, unsupervised contrastive learning, and pseudo-label classification learning to fully utilize the information in both labeled and unlabeled data, enhancing the classification performance of each subnetwork. Furthermore, during the inference stage, we have devised an interaction strategy between subnetworks, where the prediction for a specific class of a particular sample is the average logits across different subnetworks corresponding to that class, leveraging the knowledge learned from different subnetworks on recurring classes to improve classification accuracy. These strategies can be simultaneously applied to the three scenarios of the competition, effectively solving the difficulties in the competition scenarios. Experimentally, our method ranks first in both the pre-selection and final evaluation stages, with an average accuracy of 0.4535 during the preselection stage and an average accuracy of 0.4805 during the final evaluation stage.
△ Less
Submitted 24 June, 2024;
originally announced June 2024.
-
LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning
Authors:
Guangsi Shi,
Xiaofeng Deng,
Linhao Luo,
Lijuan Xia,
Lei Bao,
Bei Ye,
Fei Du,
Shirui Pan,
Yuxiao Li
Abstract:
Recommender systems are pivotal in enhancing user experiences across various web applications by analyzing the complicated relationships between users and items. Knowledge graphs(KGs) have been widely used to enhance the performance of recommender systems. However, KGs are known to be noisy and incomplete, which are hard to provide reliable explanations for recommendation results. An explainable r…
▽ More
Recommender systems are pivotal in enhancing user experiences across various web applications by analyzing the complicated relationships between users and items. Knowledge graphs(KGs) have been widely used to enhance the performance of recommender systems. However, KGs are known to be noisy and incomplete, which are hard to provide reliable explanations for recommendation results. An explainable recommender system is crucial for the product development and subsequent decision-making. To address these challenges, we introduce a novel recommender that synergies Large Language Models (LLMs) and KGs to enhance the recommendation and provide interpretable results. Specifically, we first harness the power of LLMs to augment KG reconstruction. LLMs comprehend and decompose user reviews into new triples that are added into KG. In this way, we can enrich KGs with explainable paths that express user preferences. To enhance the recommendation on augmented KGs, we introduce a novel subgraph reasoning module that effectively measures the importance of nodes and discovers reasoning for recommendation. Finally, these reasoning paths are fed into the LLMs to generate interpretable explanations of the recommendation results. Our approach significantly enhances both the effectiveness and interpretability of recommender systems, especially in cross-selling scenarios where traditional methods falter. The effectiveness of our approach has been rigorously tested on four open real-world datasets, with our methods demonstrating a superior performance over contemporary state-of-the-art techniques by an average improvement of 12%. The application of our model in a multinational engineering and technology company cross-selling recommendation system further underscores its practical utility and potential to redefine recommendation practices through improved accuracy and user trust.
△ Less
Submitted 29 June, 2024; v1 submitted 22 June, 2024;
originally announced June 2024.
-
Adaptive Self-Supervised Consistency-Guided Diffusion Model for Accelerated MRI Reconstruction
Authors:
Mojtaba Safari,
Zach Eidex,
Shaoyan Pan,
Richard L. J. Qiu,
Xiaofeng Yang
Abstract:
Purpose: To propose a self-supervised deep learning-based compressed sensing MRI (DL-based CS-MRI) method named "Adaptive Self-Supervised Consistency Guided Diffusion Model (ASSCGD)" to accelerate data acquisition without requiring fully sampled datasets. Materials and Methods: We used the fastMRI multi-coil brain axial T2-weighted (T2-w) dataset from 1,376 cases and single-coil brain quantitative…
▽ More
Purpose: To propose a self-supervised deep learning-based compressed sensing MRI (DL-based CS-MRI) method named "Adaptive Self-Supervised Consistency Guided Diffusion Model (ASSCGD)" to accelerate data acquisition without requiring fully sampled datasets. Materials and Methods: We used the fastMRI multi-coil brain axial T2-weighted (T2-w) dataset from 1,376 cases and single-coil brain quantitative magnetization prepared 2 rapid acquisition gradient echoes (MP2RAGE) T1 maps from 318 cases to train and test our model. Robustness against domain shift was evaluated using two out-of-distribution (OOD) datasets: multi-coil brain axial postcontrast T1 -weighted (T1c) dataset from 50 cases and axial T1-weighted (T1-w) dataset from 50 patients. Data were retrospectively subsampled at acceleration rates R in {2x, 4x, 8x}. ASSCGD partitions a random sampling pattern into two disjoint sets, ensuring data consistency during training. We compared our method with ReconFormer Transformer and SS-MRI, assessing performance using normalized mean squared error (NMSE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM). Statistical tests included one-way analysis of variance (ANOVA) and multi-comparison Tukey's Honesty Significant Difference (HSD) tests. Results: ASSCGD preserved fine structures and brain abnormalities visually better than comparative methods at R = 8x for both multi-coil and single-coil datasets. It achieved the lowest NMSE at R in {4x, 8x}, and the highest PSNR and SSIM values at all acceleration rates for the multi-coil dataset. Similar trends were observed for the single-coil dataset, though SSIM values were comparable to ReconFormer at R in {2x, 8x}. These results were further confirmed by the voxel-wise correlation scatter plots. OOD results showed significant (p << 10^-5 ) improvements in undersampled image quality after reconstruction.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark
Authors:
Yili Wang,
Yixin Liu,
Xu Shen,
Chenyu Li,
Kaize Ding,
Rui Miao,
Ying Wang,
Shirui Pan,
Xin Wang
Abstract:
To build safe and reliable graph machine learning systems, unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection (GLOD) have received significant attention in recent years. Though those two lines of research indeed share the same objective, they have been studied independently in the community due to distinct evaluation setups, creating…
▽ More
To build safe and reliable graph machine learning systems, unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection (GLOD) have received significant attention in recent years. Though those two lines of research indeed share the same objective, they have been studied independently in the community due to distinct evaluation setups, creating a gap that hinders the application and evaluation of methods from one to the other. To bridge the gap, in this work, we present a Unified Benchmark for unsupervised Graph-level OOD and anomaly Detection (our method), a comprehensive evaluation framework that unifies GLAD and GLOD under the concept of generalized graph-level OOD detection. Our benchmark encompasses 35 datasets spanning four practical anomaly and OOD detection scenarios, facilitating the comparison of 16 representative GLAD/GLOD methods. We conduct multi-dimensional analyses to explore the effectiveness, generalizability, robustness, and efficiency of existing methods, shedding light on their strengths and limitations. Furthermore, we provide an open-source codebase (https://github.com/UB-GOLD/UB-GOLD) of our method to foster reproducible research and outline potential directions for future investigations based on our insights.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
A Recursive Relation for Bipartition Numbers
Authors:
Yen-Chi Roger Lin,
Shu-Yen Pan
Abstract:
We establish a recursive relation for the bipartition number $p_2(n)$ which might be regarded as an analogue of Euler's recursive relation for the partition number $p(n)$. Two proofs of the main result are proved in this article. The first one is using the generating function, and the second one is using combinatoric objects (called ``symbols'') created by Lusztig for studying representation theor…
▽ More
We establish a recursive relation for the bipartition number $p_2(n)$ which might be regarded as an analogue of Euler's recursive relation for the partition number $p(n)$. Two proofs of the main result are proved in this article. The first one is using the generating function, and the second one is using combinatoric objects (called ``symbols'') created by Lusztig for studying representation theory of finite classical groups.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models
Authors:
Tao Zhang,
Ziqian Zeng,
Yuxiang Xiao,
Huiping Zhuang,
Cen Chen,
James Foulds,
Shimei Pan
Abstract:
Large Language Models (LLMs) are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to better align with desired behaviors, is recognized as an effective approach to mitigate gender biases. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicl…
▽ More
Large Language Models (LLMs) are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to better align with desired behaviors, is recognized as an effective approach to mitigate gender biases. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicly available. The commonly used and publicly available alignment dataset, HH-RLHF, still exhibits gender bias to some extent. There is a lack of publicly available alignment datasets specifically designed to address gender bias. Hence, we developed a new dataset named GenderAlign, aiming at mitigating a comprehensive set of gender biases in LLMs. This dataset comprises 8k single-turn dialogues, each paired with a "chosen" and a "rejected" response. Compared to the "rejected" responses, the "chosen" responses demonstrate lower levels of gender bias and higher quality. Furthermore, we categorized the gender biases in the "rejected" responses of GenderAlign into 4 principal categories. The experimental results show the effectiveness of GenderAlign in reducing gender bias in LLMs.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
The Heterophilic Snowflake Hypothesis: Training and Empowering GNNs for Heterophilic Graphs
Authors:
Kun Wang,
Guibin Zhang,
Xinnan Zhang,
Junfeng Fang,
Xun Wu,
Guohao Li,
Shirui Pan,
Wei Huang,
Yuxuan Liang
Abstract:
Graph Neural Networks (GNNs) have become pivotal tools for a range of graph-based learning tasks. Notably, most current GNN architectures operate under the assumption of homophily, whether explicitly or implicitly. While this underlying assumption is frequently adopted, it is not universally applicable, which can result in potential shortcomings in learning effectiveness. In this paper, \textbf{fo…
▽ More
Graph Neural Networks (GNNs) have become pivotal tools for a range of graph-based learning tasks. Notably, most current GNN architectures operate under the assumption of homophily, whether explicitly or implicitly. While this underlying assumption is frequently adopted, it is not universally applicable, which can result in potential shortcomings in learning effectiveness. In this paper, \textbf{for the first time}, we transfer the prevailing concept of ``one node one receptive field" to the heterophilic graph. By constructing a proxy label predictor, we enable each node to possess a latent prediction distribution, which assists connected nodes in determining whether they should aggregate their associated neighbors. Ultimately, every node can have its own unique aggregation hop and pattern, much like each snowflake is unique and possesses its own characteristics. Based on observations, we innovatively introduce the Heterophily Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs and beyond. We conduct comprehensive experiments including (1) main results on 10 graphs with varying heterophily ratios across 10 backbones; (2) scalability on various deep GNN backbones (SGC, JKNet, etc.) across various large number of layers (2,4,6,8,16,32 layers); (3) comparison with conventional snowflake hypothesis; (4) efficiency comparison with existing graph pruning algorithms. Our observations show that our framework acts as a versatile operator for diverse tasks. It can be integrated into various GNN frameworks, boosting performance in-depth and offering an explainable approach to choosing the optimal network depth. The source code is available at \url{https://github.com/bingreeky/HeteroSnoH}.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Tilt stability of Ky-Fan $κ$-norm composite optimization
Authors:
Yulan Liu,
Shaohua Pan,
Wen Song
Abstract:
This paper concerns the tilt stability for the minimization of the sum of a twice continuously differentiable matrix-valued function and the Ky-Fan $κ$-norm. By using the expression of second subderivative of the Ky-Fan $κ$-norm, we derive a verifiable criterion to identify the tilt stability of a local minimum for this class of nonconvex and nonsmooth problems. As a byproduct, a practical criteri…
▽ More
This paper concerns the tilt stability for the minimization of the sum of a twice continuously differentiable matrix-valued function and the Ky-Fan $κ$-norm. By using the expression of second subderivative of the Ky-Fan $κ$-norm, we derive a verifiable criterion to identify the tilt stability of a local minimum for this class of nonconvex and nonsmooth problems. As a byproduct, a practical criterion is achieved for the tilt stable solution of the nuclear-norm regularized minimization.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
Convergence of ZH-type nonmonotone descent method for Kurdyka-Łojasiewicz optimization problems
Authors:
Yitian Qian,
Ting Tao,
Shaohua Pan,
Houduo Qi
Abstract:
This note concerns a class of nonmonotone descent methods for minimizing a proper lower semicontinuous Kurdyka-Ł$\ddot{o}$jasiewicz (KL) function $Φ$, whose iterate sequence obeys the ZH-type nonmonotone decrease condition and a relative error condition. We prove that the iterate sequence converges to a critical point of $Φ$, and if $Φ$ has the KL property of exponent $θ\in(0,1)$ at this critical…
▽ More
This note concerns a class of nonmonotone descent methods for minimizing a proper lower semicontinuous Kurdyka-Ł$\ddot{o}$jasiewicz (KL) function $Φ$, whose iterate sequence obeys the ZH-type nonmonotone decrease condition and a relative error condition. We prove that the iterate sequence converges to a critical point of $Φ$, and if $Φ$ has the KL property of exponent $θ\in(0,1)$ at this critical point, the convergence has a linear rate for $θ\in(0,1/2]$ and a sublinear rate of exponent $\frac{1-θ}{1-2θ}$ for $θ\in(1/2,1)$. Our results first resolve the full convergence of the iterate sequence generated by the ZH-type nonmonotone descent method for nonconvex and nonsmooth optimization problems, and extend the full convergence of monotone descent methods for KL optimization problems to the ZH-type nonmonotone descent method.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
Split-and-Fit: Learning B-Reps via Structure-Aware Voronoi Partitioning
Authors:
Yilin Liu,
Jiale Chen,
Shanshan Pan,
Daniel Cohen-Or,
Hao Zhang,
Hui Huang
Abstract:
We introduce a novel method for acquiring boundary representations (B-Reps) of 3D CAD models which involves a two-step process: it first applies a spatial partitioning, referred to as the ``split``, followed by a ``fit`` operation to derive a single primitive within each partition. Specifically, our partitioning aims to produce the classical Voronoi diagram of the set of ground-truth (GT) B-Rep pr…
▽ More
We introduce a novel method for acquiring boundary representations (B-Reps) of 3D CAD models which involves a two-step process: it first applies a spatial partitioning, referred to as the ``split``, followed by a ``fit`` operation to derive a single primitive within each partition. Specifically, our partitioning aims to produce the classical Voronoi diagram of the set of ground-truth (GT) B-Rep primitives. In contrast to prior B-Rep constructions which were bottom-up, either via direct primitive fitting or point clustering, our Split-and-Fit approach is top-down and structure-aware, since a Voronoi partition explicitly reveals both the number of and the connections between the primitives. We design a neural network to predict the Voronoi diagram from an input point cloud or distance field via a binary classification. We show that our network, coined NVD-Net for neural Voronoi diagrams, can effectively learn Voronoi partitions for CAD models from training data and exhibits superior generalization capabilities. Extensive experiments and evaluation demonstrate that the resulting B-Reps, consisting of parametric surfaces, curves, and vertices, are more plausible than those obtained by existing alternatives, with significant improvements in reconstruction quality. Code will be released on https://github.com/yilinliu77/NVDNet.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
Bayesian Inference for Spatial-temporal Non-Gaussian Data Using Predictive Stacking
Authors:
Soumyakanti Pan,
Lu Zhang,
Jonathan R. Bradley,
Sudipto Banerjee
Abstract:
Analysing non-Gaussian spatial-temporal data typically requires introducing spatial dependence in generalised linear models through the link function of an exponential family distribution. However, unlike in Gaussian likelihoods, inference is considerably encumbered by the inability to analytically integrate out the random effects and reduce the dimension of the parameter space. Iterative estimati…
▽ More
Analysing non-Gaussian spatial-temporal data typically requires introducing spatial dependence in generalised linear models through the link function of an exponential family distribution. However, unlike in Gaussian likelihoods, inference is considerably encumbered by the inability to analytically integrate out the random effects and reduce the dimension of the parameter space. Iterative estimation algorithms struggle to converge due to the presence of weakly identified parameters. We devise an approach that obviates these issues by exploiting generalised conjugate multivariate distribution theory for exponential families, which enables exact sampling from analytically available posterior distributions conditional upon some fixed process parameters. More specifically, we expand upon the Diaconis-Ylvisaker family of conjugate priors to achieve analytically tractable posterior inference for spatially-temporally varying regression models conditional on some kernel parameters. Subsequently, we assimilate inference from these individual posterior distributions over a range of values of these parameters using Bayesian predictive stacking. We evaluate inferential performance on simulated data, compare with fully Bayesian inference using Markov chain Monte Carlo and apply our proposed method to analyse spatially-temporally referenced avian count data from the North American Breeding Bird Survey database.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
A majorized PAM method with subspace correction for low-rank composite factorization model
Authors:
Ting Tao,
Yitian Qian,
Shaohua Pan
Abstract:
This paper concerns a class of low-rank composite factorization models arising from matrix completion. For this nonconvex and nonsmooth optimization problem, we propose a proximal alternating minimization algorithm (PAMA) with subspace correction, in which a subspace correction step is imposed on every proximal subproblem so as to guarantee that the corrected proximal subproblem has a closed-form…
▽ More
This paper concerns a class of low-rank composite factorization models arising from matrix completion. For this nonconvex and nonsmooth optimization problem, we propose a proximal alternating minimization algorithm (PAMA) with subspace correction, in which a subspace correction step is imposed on every proximal subproblem so as to guarantee that the corrected proximal subproblem has a closed-form solution. For this subspace correction PAMA, we prove the subsequence convergence of the iterate sequence, and establish the convergence of the whole iterate sequence and the column subspace sequences of factor pairs under the KL property of objective function and a restrictive condition that holds automatically for the column $\ell_{2,0}$-norm function. Numerical comparison with the proximal alternating linearized minimization method on one-bit matrix completion problems indicates that PAMA has an advantage in seeking lower relative error within less time.
△ Less
Submitted 6 June, 2024;
originally announced June 2024.
-
Coherent control of a triangular exchange-only spin qubit
Authors:
Edwin Acuna,
Joseph D. Broz,
Kaushal Shyamsundar,
Antonio B. Mei,
Colin P. Feeney,
Valerie Smetanka,
Tiffany Davis,
Kangmu Lee,
Maxwell D. Choi,
Brydon Boyd,
June Suh,
Wonill D. Ha,
Cameron Jennings,
Andrew S. Pan,
Daniel S. Sanchez,
Matthew D. Reed,
Jason R. Petta
Abstract:
We demonstrate coherent control of a three-electron exchange-only spin qubit with the quantum dots arranged in a close-packed triangular geometry. The device is tuned to confine one electron in each quantum dot, as evidenced by pairwise charge stability diagrams. Time-domain control of the exchange coupling is demonstrated and qubit performance is characterized using blind randomized benchmarking,…
▽ More
We demonstrate coherent control of a three-electron exchange-only spin qubit with the quantum dots arranged in a close-packed triangular geometry. The device is tuned to confine one electron in each quantum dot, as evidenced by pairwise charge stability diagrams. Time-domain control of the exchange coupling is demonstrated and qubit performance is characterized using blind randomized benchmarking, with an average single-qubit gate fidelity F = 99.84%. The compact triangular device geometry can be readily scaled to larger two-dimensional quantum dot arrays with high connectivity.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Decision-focused Graph Neural Networks for Combinatorial Optimization
Authors:
Yang Liu,
Chuan Zhou,
Peng Zhang,
Shirui Pan,
Zhao Li,
Hongyang Chen
Abstract:
In recent years, there has been notable interest in investigating combinatorial optimization (CO) problems by neural-based framework. An emerging strategy to tackle these challenging problems involves the adoption of graph neural networks (GNNs) as an alternative to traditional algorithms, a subject that has attracted considerable attention. Despite the growing popularity of GNNs and traditional a…
▽ More
In recent years, there has been notable interest in investigating combinatorial optimization (CO) problems by neural-based framework. An emerging strategy to tackle these challenging problems involves the adoption of graph neural networks (GNNs) as an alternative to traditional algorithms, a subject that has attracted considerable attention. Despite the growing popularity of GNNs and traditional algorithm solvers in the realm of CO, there is limited research on their integrated use and the correlation between them within an end-to-end framework. The primary focus of our work is to formulate a more efficient and precise framework for CO by employing decision-focused learning on graphs. Additionally, we introduce a decision-focused framework that utilizes GNNs to address CO problems with auxiliary support. To realize an end-to-end approach, we have designed two cascaded modules: (a) an unsupervised trained graph predictive model, and (b) a solver for quadratic binary unconstrained optimization. Empirical evaluations are conducted on various classical tasks, including maximum cut, maximum independent set, and minimum vertex cover. The experimental results on classical CO problems (i.e. MaxCut, MIS, and MVC) demonstrate the superiority of our method over both the standalone GNN approach and classical methods.
△ Less
Submitted 9 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
MagiNet: Mask-Aware Graph Imputation Network for Incomplete Traffic Data
Authors:
Jianping Zhou,
Bin Lu,
Zhanyu Liu,
Siyu Pan,
Xuejun Feng,
Hua Wei,
Guanjie Zheng,
Xinbing Wang,
Chenghu Zhou
Abstract:
Due to detector malfunctions and communication failures, missing data is ubiquitous during the collection of traffic data. Therefore, it is of vital importance to impute the missing values to facilitate data analysis and decision-making for Intelligent Transportation System (ITS). However, existing imputation methods generally perform zero pre-filling techniques to initialize missing values, intro…
▽ More
Due to detector malfunctions and communication failures, missing data is ubiquitous during the collection of traffic data. Therefore, it is of vital importance to impute the missing values to facilitate data analysis and decision-making for Intelligent Transportation System (ITS). However, existing imputation methods generally perform zero pre-filling techniques to initialize missing values, introducing inevitable noises. Moreover, we observe prevalent over-smoothing interpolations, falling short in revealing the intrinsic spatio-temporal correlations of incomplete traffic data. To this end, we propose Mask-Aware Graph imputation Network: MagiNet. Our method designs an adaptive mask spatio-temporal encoder to learn the latent representations of incomplete data, eliminating the reliance on pre-filling missing values. Furthermore, we devise a spatio-temporal decoder that stacks multiple blocks to capture the inherent spatial and temporal dependencies within incomplete traffic data, alleviating over-smoothing imputation. Extensive experiments demonstrate that our method outperforms state-of-the-art imputation methods on five real-world traffic datasets, yielding an average improvement of 4.31% in RMSE and 3.72% in MAPE.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Augmented Commonsense Knowledge for Remote Object Grounding
Authors:
Bahram Mohammadi,
Yicong Hong,
Yuankai Qi,
Qi Wu,
Shirui Pan,
Javen Qinfeng Shi
Abstract:
The vision-and-language navigation (VLN) task necessitates an agent to perceive the surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most of the existing methods employ the entire image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task…
▽ More
The vision-and-language navigation (VLN) task necessitates an agent to perceive the surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most of the existing methods employ the entire image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task, which uses concise high-level instructions, such as ''Bring me the blue cushion in the master bedroom''. To address enhancing representation, we propose an augmented commonsense knowledge model (ACK) to leverage commonsense information as a spatio-temporal knowledge graph for improving agent navigation. Specifically, the proposed approach involves constructing a knowledge base by retrieving commonsense information from ConceptNet, followed by a refinement module to remove noisy and irrelevant knowledge. We further present ACK which consists of knowledge graph-aware cross-modal and concept aggregation modules to enhance visual representation and visual-textual data alignment by integrating visible objects, commonsense knowledge, and concept history, which includes object and knowledge temporal information. Moreover, we add a new pipeline for the commonsense-based decision-making process which leads to more accurate local action prediction. Experimental results demonstrate our proposed model noticeably outperforms the baseline and archives the state-of-the-art on the REVERIE benchmark.
△ Less
Submitted 3 June, 2024;
originally announced June 2024.
-
Ta2Pd3Te5 topological thermometer
Authors:
Yupeng Li,
Anqi Wang,
Senyang Pan,
Dayu Yan,
Guang Yang,
Xingchen Guo,
Yu Hong,
Guangtong Liu,
Fanming Qu,
Zhijun Wang,
Tian Qian,
Jinglei Zhang,
Youguo Shi,
Li Lu,
Jie Shen
Abstract:
In recent decades, there has been a persistent pursuit of applications for surface/edge states in topological systems, driven by their dissipationless transport effects. However, there have been limited tangible breakthroughs in this field. This work demonstrates the remarkable properties of the topological insulator Ta2Pd3Te5, as a thermometer. This material exhibits a power-law correlation in te…
▽ More
In recent decades, there has been a persistent pursuit of applications for surface/edge states in topological systems, driven by their dissipationless transport effects. However, there have been limited tangible breakthroughs in this field. This work demonstrates the remarkable properties of the topological insulator Ta2Pd3Te5, as a thermometer. This material exhibits a power-law correlation in temperature-dependent resistance at low temperatures, stemming from its Luttinger liquid behavior of edge states, while exhibiting semiconductor behavior at high temperatures. The power-law behavior effectively addresses the issue of infinite resistance in semiconductor thermometers at ultra-low temperatures, thereby playing a crucial role in enabling efficient thermometry in refrigerators supporting millikelvin temperatures or below. By employing chemical doping, adjusting thickness, and controlling gate voltage, its power-law behavior and semiconductor behavior can be effectively modulated. This enables efficient thermometry spanning from millikelvin temperatures to room temperature, and allows for precise local temperature measurement. Furthermore, this thermometer exhibits excellent temperature sensitivity and resolution, and can be fine-tuned to show small magnetoresistance. In summary, the Ta2Pd3Te5 thermometer, also referred to as a topological thermometer, exhibits outstanding performance and significant potential for measuring a wider range of temperatures compared to conventional low-temperature thermometers.
△ Less
Submitted 2 June, 2024;
originally announced June 2024.
-
A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies
Authors:
Jinchao Zhu,
Yuxuan Wang,
Siyuan Pan,
Pengfei Wan,
Di Zhang,
Gao Huang
Abstract:
The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Despite various attempts at sampler optimization, model distillation, and network quantification, these approaches typically maintain the original network architecture. The extensive parameter scale and substantial computational demands have limited research into adjusti…
▽ More
The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Despite various attempts at sampler optimization, model distillation, and network quantification, these approaches typically maintain the original network architecture. The extensive parameter scale and substantial computational demands have limited research into adjusting the model architecture. This study focuses on reducing redundant computation in SDM and optimizes the model through both tuning and tuning-free methods. 1) For the tuning method, we design a model assembly strategy to reconstruct a lightweight model while preserving performance through distillation. Second, to mitigate performance loss due to pruning, we incorporate multi-expert conditional convolution (ME-CondConv) into compressed UNets to enhance network performance by increasing capacity without sacrificing speed. Third, we validate the effectiveness of the multi-UNet switching method for improving network speed. 2) For the tuning-free method, we propose a feature inheritance strategy to accelerate inference by skipping local computations at the block, layer, or unit level within the network structure. We also examine multiple sampling modes for feature inheritance at the time-step level. Experiments demonstrate that both the proposed tuning and the tuning-free methods can improve the speed and performance of the SDM. The lightweight model reconstructed by the model assembly strategy increases generation speed by $22.4%$, while the feature inheritance strategy enhances the SDM generation speed by $40.0%$.
△ Less
Submitted 17 June, 2024; v1 submitted 31 May, 2024;
originally announced June 2024.
-
Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs
Authors:
Langzhang Liang,
Sunwoo Kim,
Kijung Shin,
Zenglin Xu,
Shirui Pan,
Yuan Qi
Abstract:
Graph Neural Networks (GNNs) have gained significant attention as a powerful modeling and inference method, especially for homophilic graph-structured data. To empower GNNs in heterophilic graphs, where adjacent nodes exhibit dissimilar labels or features, Signed Message Passing (SMP) has been widely adopted. However, there is a lack of theoretical and empirical analysis regarding the limitations…
▽ More
Graph Neural Networks (GNNs) have gained significant attention as a powerful modeling and inference method, especially for homophilic graph-structured data. To empower GNNs in heterophilic graphs, where adjacent nodes exhibit dissimilar labels or features, Signed Message Passing (SMP) has been widely adopted. However, there is a lack of theoretical and empirical analysis regarding the limitations of SMP. In this work, we unveil some potential pitfalls of SMP and their remedies. We first identify two limitations of SMP: undesirable representation update for multi-hop neighbors and vulnerability against oversmoothing issues. To overcome these challenges, we propose a novel message passing function called Multiset to Multiset GNN(M2M-GNN). Our theoretical analyses and extensive experiments demonstrate that M2M-GNN effectively alleviates the aforementioned limitations of SMP, yielding superior performance in comparison
△ Less
Submitted 31 May, 2024;
originally announced May 2024.
-
Large Language Model Watermark Stealing With Mixed Integer Programming
Authors:
Zhaoxi Zhang,
Xiaomei Zhang,
Yanjun Zhang,
Leo Yu Zhang,
Chao Chen,
Shengshan Hu,
Asif Gill,
Shirui Pan
Abstract:
The Large Language Model (LLM) watermark is a newly emerging technique that shows promise in addressing concerns surrounding LLM copyright, monitoring AI-generated text, and preventing its misuse. The LLM watermark scheme commonly includes generating secret keys to partition the vocabulary into green and red lists, applying a perturbation to the logits of tokens in the green list to increase their…
▽ More
The Large Language Model (LLM) watermark is a newly emerging technique that shows promise in addressing concerns surrounding LLM copyright, monitoring AI-generated text, and preventing its misuse. The LLM watermark scheme commonly includes generating secret keys to partition the vocabulary into green and red lists, applying a perturbation to the logits of tokens in the green list to increase their sampling likelihood, thus facilitating watermark detection to identify AI-generated text if the proportion of green tokens exceeds a threshold. However, recent research indicates that watermarking methods using numerous keys are susceptible to removal attacks, such as token editing, synonym substitution, and paraphrasing, with robustness declining as the number of keys increases. Therefore, the state-of-the-art watermark schemes that employ fewer or single keys have been demonstrated to be more robust against text editing and paraphrasing. In this paper, we propose a novel green list stealing attack against the state-of-the-art LLM watermark scheme and systematically examine its vulnerability to this attack. We formalize the attack as a mixed integer programming problem with constraints. We evaluate our attack under a comprehensive threat model, including an extreme scenario where the attacker has no prior knowledge, lacks access to the watermark detector API, and possesses no information about the LLM's parameter settings or watermark injection/detection scheme. Extensive experiments on LLMs, such as OPT and LLaMA, demonstrate that our attack can successfully steal the green list and remove the watermark across all settings.
△ Less
Submitted 30 May, 2024;
originally announced May 2024.
-
Overcoming Negative Transfer by Online Selection: Distant Domain Adaptation for Fault Diagnosis
Authors:
Ziyan Wang,
Mohamed Ragab,
Wenmian Yang,
Min Wu,
Sinno Jialin Pan,
Jie Zhang,
Zhenghua Chen
Abstract:
Unsupervised domain adaptation (UDA) has achieved remarkable success in fault diagnosis, bringing significant benefits to diverse industrial applications. While most UDA methods focus on cross-working condition scenarios where the source and target domains are notably similar, real-world applications often grapple with severe domain shifts. We coin the term `distant domain adaptation problem' to d…
▽ More
Unsupervised domain adaptation (UDA) has achieved remarkable success in fault diagnosis, bringing significant benefits to diverse industrial applications. While most UDA methods focus on cross-working condition scenarios where the source and target domains are notably similar, real-world applications often grapple with severe domain shifts. We coin the term `distant domain adaptation problem' to describe the challenge of adapting from a labeled source domain to a significantly disparate unlabeled target domain. This problem exhibits the risk of negative transfer, where extraneous knowledge from the source domain adversely affects the target domain performance. Unfortunately, conventional UDA methods often falter in mitigating this negative transfer, leading to suboptimal performance. In response to this challenge, we propose a novel Online Selective Adversarial Alignment (OSAA) approach. Central to OSAA is its ability to dynamically identify and exclude distant source samples via an online gradient masking approach, focusing primarily on source samples that closely resemble the target samples. Furthermore, recognizing the inherent complexities in bridging the source and target domains, we construct an intermediate domain to act as a transitional domain and ease the adaptation process. Lastly, we develop a class-conditional adversarial adaptation to address the label distribution disparities while learning domain invariant representation to account for potential label distribution disparities between the domains. Through detailed experiments and ablation studies on two real-world datasets, we validate the superior performance of the OSAA method over state-of-the-art methods, underscoring its significant utility in practical scenarios with severe domain shifts.
△ Less
Submitted 25 May, 2024;
originally announced May 2024.
-
Balancing User Preferences by Social Networks: A Condition-Guided Social Recommendation Model for Mitigating Popularity Bias
Authors:
Xin He,
Wenqi Fan,
Ruobing Wang,
Yili Wang,
Ying Wang,
Shirui Pan,
Xin Wang
Abstract:
Social recommendation models weave social interactions into their design to provide uniquely personalized recommendation results for users. However, social networks not only amplify the popularity bias in recommendation models, resulting in more frequent recommendation of hot items and fewer long-tail items, but also include a substantial amount of redundant information that is essentially meaning…
▽ More
Social recommendation models weave social interactions into their design to provide uniquely personalized recommendation results for users. However, social networks not only amplify the popularity bias in recommendation models, resulting in more frequent recommendation of hot items and fewer long-tail items, but also include a substantial amount of redundant information that is essentially meaningless for the model's performance. Existing social recommendation models fail to address the issues of popularity bias and the redundancy of social information, as they directly characterize social influence across the entire social network without making targeted adjustments. In this paper, we propose a Condition-Guided Social Recommendation Model (named CGSoRec) to mitigate the model's popularity bias by denoising the social network and adjusting the weights of user's social preferences. More specifically, CGSoRec first includes a Condition-Guided Social Denoising Model (CSD) to remove redundant social relations in the social network for capturing users' social preferences with items more precisely. Then, CGSoRec calculates users' social preferences based on denoised social network and adjusts the weights in users' social preferences to make them can counteract the popularity bias present in the recommendation model. At last, CGSoRec includes a Condition-Guided Diffusion Recommendation Model (CGD) to introduce the adjusted social preferences as conditions to control the recommendation results for a debiased direction. Comprehensive experiments on three real-world datasets demonstrate the effectiveness of our proposed method. The code is in: https://github.com/hexin5515/CGSoRec.
△ Less
Submitted 26 May, 2024;
originally announced May 2024.
-
ARC: A Generalist Graph Anomaly Detector with In-Context Learning
Authors:
Yixin Liu,
Shiyuan Li,
Yu Zheng,
Qingfeng Chen,
Chengqi Zhang,
Shirui Pan
Abstract:
Graph anomaly detection (GAD), which aims to identify abnormal nodes that differ from the majority within a graph, has garnered significant attention. However, current GAD methods necessitate training specific to each dataset, resulting in high training costs, substantial data requirements, and limited generalizability when being applied to new datasets and domains. To address these limitations, t…
▽ More
Graph anomaly detection (GAD), which aims to identify abnormal nodes that differ from the majority within a graph, has garnered significant attention. However, current GAD methods necessitate training specific to each dataset, resulting in high training costs, substantial data requirements, and limited generalizability when being applied to new datasets and domains. To address these limitations, this paper proposes ARC, a generalist GAD approach that enables a ``one-for-all'' GAD model to detect anomalies across various graph datasets on-the-fly. Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset using few-shot normal samples at the inference stage, without the need for retraining or fine-tuning on the target dataset. ARC comprises three components that are well-crafted for capturing universal graph anomaly patterns: 1) smoothness-based feature Alignment module that unifies the features of different datasets into a common and anomaly-sensitive space; 2) ego-neighbor Residual graph encoder that learns abnormality-related node embeddings; and 3) cross-attentive in-Context anomaly scoring module that predicts node abnormality by leveraging few-shot normal samples. Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
△ Less
Submitted 26 May, 2024;
originally announced May 2024.
-
A Cross-Field Fusion Strategy for Drug-Target Interaction Prediction
Authors:
Hongzhi Zhang,
Xiuwen Gong,
Shirui Pan,
Jia Wu,
Bo Du,
Wenbin Hu
Abstract:
Drug-target interaction (DTI) prediction is a critical component of the drug discovery process. In the drug development engineering field, predicting novel drug-target interactions is extremely crucial.However, although existing methods have achieved high accuracy levels in predicting known drugs and drug targets, they fail to utilize global protein information during DTI prediction. This leads to…
▽ More
Drug-target interaction (DTI) prediction is a critical component of the drug discovery process. In the drug development engineering field, predicting novel drug-target interactions is extremely crucial.However, although existing methods have achieved high accuracy levels in predicting known drugs and drug targets, they fail to utilize global protein information during DTI prediction. This leads to an inability to effectively predict interaction the interactions between novel drugs and their targets. As a result, the cross-field information fusion strategy is employed to acquire local and global protein information. Thus, we propose the siamese drug-target interaction SiamDTI prediction method, which utilizes a double channel network structure for cross-field supervised learning.Experimental results on three benchmark datasets demonstrate that SiamDTI achieves higher accuracy levels than other state-of-the-art (SOTA) methods on novel drugs and targets.Additionally, SiamDTI's performance with known drugs and targets is comparable to that of SOTA approachs. The code is available at https://anonymous.4open.science/r/DDDTI-434D.
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Regressor-free Molecule Generation to Support Drug Response Prediction
Authors:
Kun Li,
Xiuwen Gong,
Shirui Pan,
Jia Wu,
Bo Du,
Wenbin Hu
Abstract:
Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results are heavily dependent on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, enabling sampling within the IC50 classification range. However, these methods fail to ensure the samplin…
▽ More
Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results are heavily dependent on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, enabling sampling within the IC50 classification range. However, these methods fail to ensure the sampling space range's effectiveness, generating numerous ineffective molecules. Through experimental and theoretical study, we hypothesize that conditional generation based on the target IC50 score can obtain a more effective sampling space. As a result, we introduce regressor-free guidance molecule generation to ensure sampling within a more effective space and support DRP. Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on number labels. To effectively map regression labels between drugs and cell lines, we design a common-sense numerical knowledge graph that constrains the order of text representations. Experimental results on the real-world dataset for the DRP task demonstrate our method's effectiveness in drug discovery. The code is available at:https://anonymous.4open.science/r/RMCD-DBD1.
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Gradient Transformation: Towards Efficient and Model-Agnostic Unlearning for Dynamic Graph Neural Networks
Authors:
He Zhang,
Bang Wu,
Xiangwen Yang,
Xingliang Yuan,
Chengqi Zhang,
Shirui Pan
Abstract:
Graph unlearning has emerged as an essential tool for safeguarding user privacy and mitigating the negative impacts of undesirable data. Meanwhile, the advent of dynamic graph neural networks (DGNNs) marks a significant advancement due to their superior capability in learning from dynamic graphs, which encapsulate spatial-temporal variations in diverse real-world applications (e.g., traffic foreca…
▽ More
Graph unlearning has emerged as an essential tool for safeguarding user privacy and mitigating the negative impacts of undesirable data. Meanwhile, the advent of dynamic graph neural networks (DGNNs) marks a significant advancement due to their superior capability in learning from dynamic graphs, which encapsulate spatial-temporal variations in diverse real-world applications (e.g., traffic forecasting). With the increasing prevalence of DGNNs, it becomes imperative to investigate the implementation of dynamic graph unlearning. However, current graph unlearning methodologies are designed for GNNs operating on static graphs and exhibit limitations including their serving in a pre-processing manner and impractical resource demands. Furthermore, the adaptation of these methods to DGNNs presents non-trivial challenges, owing to the distinctive nature of dynamic graphs. To this end, we propose an effective, efficient, model-agnostic, and post-processing method to implement DGNN unlearning. Specifically, we first define the unlearning requests and formulate dynamic graph unlearning in the context of continuous-time dynamic graphs. After conducting a role analysis on the unlearning data, the remaining data, and the target DGNN model, we propose a method called Gradient Transformation and a loss function to map the unlearning request to the desired parameter update. Evaluations on six real-world datasets and state-of-the-art DGNN backbones demonstrate its effectiveness (e.g., limited performance drop even obvious improvement) and efficiency (e.g., at most 7.23$\times$ speed-up) outperformance, and potential advantages in handling future unlearning requests (e.g., at most 32.59$\times$ speed-up).
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Graph Sparsification via Mixture of Graphs
Authors:
Guibin Zhang,
Xiangguo Sun,
Yanwei Yue,
Kun Wang,
Tianlong Chen,
Shirui Pan
Abstract:
Graph Neural Networks (GNNs) have demonstrated superior performance across various graph learning tasks but face significant computational challenges when applied to large-scale graphs. One effective approach to mitigate these challenges is graph sparsification, which involves removing non-essential edges to reduce computational overhead. However, previous graph sparsification methods often rely o…
▽ More
Graph Neural Networks (GNNs) have demonstrated superior performance across various graph learning tasks but face significant computational challenges when applied to large-scale graphs. One effective approach to mitigate these challenges is graph sparsification, which involves removing non-essential edges to reduce computational overhead. However, previous graph sparsification methods often rely on a single global sparsity setting and uniform pruning criteria, failing to provide customized sparsification schemes for each node's complex local context. In this paper, we introduce Mixture-of-Graphs (MoG), leveraging the concept of Mixture-of-Experts (MoE), to dynamically select tailored pruning solutions for each node. Specifically, MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node. Subsequently, MoG performs a mixture of the sparse graphs produced by different experts on the Grassmann manifold to derive an optimal sparse graph. One notable property of MoG is its entirely local nature, as it depends on the specific circumstances of each individual node. Extensive experiments on four large-scale OGB datasets and two superpixel datasets, equipped with five GNN backbones, demonstrate that MoG (I) identifies subgraphs at higher sparsity levels ($8.67\%\sim 50.85\%$), with performance equal to or better than the dense graph, (II) achieves $1.47-2.62\times$ speedup in GNN inference with negligible performance drop, and (III) boosts ``top-student'' GNN performance ($1.02\%\uparrow$ on RevGNN+\textsc{ogbn-proteins} and $1.74\%\uparrow$ on DeeperGCN+\textsc{ogbg-ppa}).
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning
Authors:
Jiapu Wang,
Kai Sun,
Linhao Luo,
Wei Wei,
Yongli Hu,
Alan Wee-Chung Liew,
Shirui Pan,
Baocai Yin
Abstract:
Temporal Knowledge Graph Reasoning (TKGR) is the process of utilizing temporal information to capture complex relations within a Temporal Knowledge Graph (TKG) to infer new knowledge. Conventional methods in TKGR typically depend on deep learning algorithms or temporal logical rules. However, deep learning-based TKGRs often lack interpretability, whereas rule-based TKGRs struggle to effectively le…
▽ More
Temporal Knowledge Graph Reasoning (TKGR) is the process of utilizing temporal information to capture complex relations within a Temporal Knowledge Graph (TKG) to infer new knowledge. Conventional methods in TKGR typically depend on deep learning algorithms or temporal logical rules. However, deep learning-based TKGRs often lack interpretability, whereas rule-based TKGRs struggle to effectively learn temporal rules that capture temporal patterns. Recently, Large Language Models (LLMs) have demonstrated extensive knowledge and remarkable proficiency in temporal reasoning. Consequently, the employment of LLMs for Temporal Knowledge Graph Reasoning (TKGR) has sparked increasing interest among researchers. Nonetheless, LLMs are known to function as black boxes, making it challenging to comprehend their reasoning process. Additionally, due to the resource-intensive nature of fine-tuning, promptly updating LLMs to integrate evolving knowledge within TKGs for reasoning is impractical. To address these challenges, in this paper, we propose a Large Language Models-guided Dynamic Adaptation (LLM-DA) method for reasoning on TKGs. Specifically, LLM-DA harnesses the capabilities of LLMs to analyze historical data and extract temporal logical rules. These rules unveil temporal patterns and facilitate interpretable reasoning. To account for the evolving nature of TKGs, a dynamic adaptation strategy is proposed to update the LLM-generated rules with the latest events. This ensures that the extracted rules always incorporate the most recent knowledge and better generalize to the predictions on future events. Experimental results show that without the need of fine-tuning, LLM-DA significantly improves the accuracy of reasoning over several common datasets, providing a robust framework for TKGR tasks.
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Resilience Analysis of Multi-modal Logistics Service Network Through Robust Optimization with Budget-of-Uncertainty
Authors:
Yaxin Pang,
Shenle Pan,
Eric Ballot
Abstract:
Supply chain resilience analysis aims to identify the critical elements in the supply chain, measure its reliability, and analyze solutions for improving vulnerabilities. While extensive methods like stochastic approaches have been dominant, robust optimization-widely applied in robust planning under uncertainties without specific probability distributions-remains relatively underexplored for this…
▽ More
Supply chain resilience analysis aims to identify the critical elements in the supply chain, measure its reliability, and analyze solutions for improving vulnerabilities. While extensive methods like stochastic approaches have been dominant, robust optimization-widely applied in robust planning under uncertainties without specific probability distributions-remains relatively underexplored for this research problem. This paper employs robust optimization with budget-of-uncertainty as a tool to analyze the resilience of multi-modal logistics service networks under time uncertainty. We examine the interactive effects of three critical factors: network size, disruption scale, disruption degree. The computational experiments offer valuable managerial insights for practitioners and researchers.
△ Less
Submitted 21 May, 2024;
originally announced May 2024.
-
Hi-GMAE: Hierarchical Graph Masked Autoencoders
Authors:
Chuang Liu,
Zelin Yao,
Yibing Zhan,
Xueqi Ma,
Dapeng Tao,
Jia Wu,
Wenbin Hu,
Shirui Pan,
Bo Du
Abstract:
Graph Masked Autoencoders (GMAEs) have emerged as a notable self-supervised learning approach for graph-structured data. Existing GMAE models primarily focus on reconstructing node-level information, categorizing them as single-scale GMAEs. This methodology, while effective in certain contexts, tends to overlook the complex hierarchical structures inherent in many real-world graphs. For instance,…
▽ More
Graph Masked Autoencoders (GMAEs) have emerged as a notable self-supervised learning approach for graph-structured data. Existing GMAE models primarily focus on reconstructing node-level information, categorizing them as single-scale GMAEs. This methodology, while effective in certain contexts, tends to overlook the complex hierarchical structures inherent in many real-world graphs. For instance, molecular graphs exhibit a clear hierarchical organization in the form of the atoms-functional groups-molecules structure. Hence, the inability of single-scale GMAE models to incorporate these hierarchical relationships often leads to their inadequate capture of crucial high-level graph information, resulting in a noticeable decline in performance. To address this limitation, we propose Hierarchical Graph Masked AutoEncoders (Hi-GMAE), a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs. First, Hi-GMAE constructs a multi-scale graph hierarchy through graph pooling, enabling the exploration of graph structures across different granularity levels. To ensure masking uniformity of subgraphs across these scales, we propose a novel coarse-to-fine strategy that initiates masking at the coarsest scale and progressively back-projects the mask to the finer scales. Furthermore, we integrate a gradual recovery strategy with the masking process to mitigate the learning challenges posed by completely masked subgraphs. Diverging from the standard graph neural network (GNN) used in GMAE models, Hi-GMAE modifies its encoder and decoder into hierarchical structures. This entails using GNN at the finer scales for detailed local graph analysis and employing a graph transformer at coarser scales to capture global information. Our experiments on 15 graph datasets consistently demonstrate that Hi-GMAE outperforms 17 state-of-the-art self-supervised competitors.
△ Less
Submitted 17 May, 2024;
originally announced May 2024.
-
All Nodes are created Not Equal: Node-Specific Layer Aggregation and Filtration for GNN
Authors:
Shilong Wang,
Hao Wu,
Yifan Duan,
Guibin Zhang,
Guohao Li,
Yuxuan Liang,
Shirui Pan,
Kun Wang,
Yang Wang
Abstract:
The ever-designed Graph Neural Networks, though opening a promising path for the modeling of the graph-structure data, unfortunately introduce two daunting obstacles to their deployment on devices. (I) Most of existing GNNs are shallow, due mostly to the over-smoothing and gradient-vanish problem as they go deeper as convolutional architectures. (II) The vast majority of GNNs adhere to the homophi…
▽ More
The ever-designed Graph Neural Networks, though opening a promising path for the modeling of the graph-structure data, unfortunately introduce two daunting obstacles to their deployment on devices. (I) Most of existing GNNs are shallow, due mostly to the over-smoothing and gradient-vanish problem as they go deeper as convolutional architectures. (II) The vast majority of GNNs adhere to the homophily assumption, where the central node and its adjacent nodes share the same label. This assumption often poses challenges for many GNNs working with heterophilic graphs. Addressing the aforementioned issue has become a looming challenge in enhancing the robustness and scalability of GNN applications. In this paper, we take a comprehensive and systematic approach to overcoming the two aforementioned challenges for the first time. We propose a Node-Specific Layer Aggregation and Filtration architecture, termed NoSAF, a framework capable of filtering and processing information from each individual nodes. NoSAF introduces the concept of "All Nodes are Created Not Equal" into every layer of deep networks, aiming to provide a reliable information filter for each layer's nodes to sieve out information beneficial for the subsequent layer. By incorporating a dynamically updated codebank, NoSAF dynamically optimizes the optimal information outputted downwards at each layer. This effectively overcomes heterophilic issues and aids in deepening the network. To compensate for the information loss caused by the continuous filtering in NoSAF, we also propose NoSAF-D (Deep), which incorporates a compensation mechanism that replenishes information in every layer of the model, allowing NoSAF to perform meaningful computations even in very deep layers.
△ Less
Submitted 13 May, 2024;
originally announced May 2024.
-
Don't Chase Your Tail! Missing Key Aspects Augmentation in Textual Vulnerability Descriptions of Long-tail Software through Feature Inference
Authors:
Linyi Han,
Shidong Pan,
Zhenchang Xing,
Jiamou Sun,
Sofonias Yitagesu,
Xiaowang Zhang,
Zhiyong Feng
Abstract:
Augmenting missing key aspects in Textual Vulnerability Descriptions (TVDs) for software with a large user base (referred to as non-long-tail software) has greatly advanced vulnerability analysis and software security research. However, these methods often overlook software instances that have a limited user base (referred to as long-tail software) due to limited TVDs, variations in software featu…
▽ More
Augmenting missing key aspects in Textual Vulnerability Descriptions (TVDs) for software with a large user base (referred to as non-long-tail software) has greatly advanced vulnerability analysis and software security research. However, these methods often overlook software instances that have a limited user base (referred to as long-tail software) due to limited TVDs, variations in software features, and domain-specific jargon, which hinders vulnerability analysis and software repairs. In this paper, we introduce a novel software feature inference framework designed to augment the missing key aspects of TVDs for long-tail software. Firstly, we tackle the issue of non-standard software names found in community-maintained vulnerability databases by cross-referencing government databases with Common Vulnerabilities and Exposures (CVEs). Next, we employ Large Language Models (LLMs) to generate the missing key aspects. However, the limited availability of historical TVDs restricts the variety of examples. To overcome this limitation, we utilize the Common Weakness Enumeration (CWE) to classify all TVDs and select cluster centers as representative examples. To ensure accuracy, we present Natural Language Inference (NLI) models specifically designed for long-tail software. These models identify and eliminate incorrect responses. Additionally, we use a wiki repository to provide explanations for proprietary terms. Our evaluations demonstrate that our approach significantly improves the accuracy of augmenting missing key aspects of TVDs for log-tail software from 0.27 to 0.56 (+107%). Interestingly, the accuracy of non-long-tail software also increases from 64% to 71%. As a result, our approach can be useful in various downstream tasks that require complete TVD information.
△ Less
Submitted 12 May, 2024;
originally announced May 2024.
-
Implication of the velocity dispersion scalings on high-mass star formation in molecular clouds
Authors:
An-Xu Luo,
Hong-Li Liu,
Sheng-Li Qin,
Dong-Ting Yang,
Sirong Pan
Abstract:
This paper is aimed at exploring implications of velocity dispersion scalings on high-mass star formation in molecular clouds, including the scalings of Larson's linewidth--size ($σ$--$R$) and ratio--mass surface density ($\cal{L}$--$Σ$; here $\cal{L}$$=σ/R^{0.5}$). We have systematically analyzed the $σ$ parameter of well-selected 221 massive clumps, complemented with published samples of other h…
▽ More
This paper is aimed at exploring implications of velocity dispersion scalings on high-mass star formation in molecular clouds, including the scalings of Larson's linewidth--size ($σ$--$R$) and ratio--mass surface density ($\cal{L}$--$Σ$; here $\cal{L}$$=σ/R^{0.5}$). We have systematically analyzed the $σ$ parameter of well-selected 221 massive clumps, complemented with published samples of other hierarchical density structures of molecular clouds over spatial scales of 0.01--10 pc. Those massive clumps are classified into four phases: quiescent, protostellar, HII region, and PDR clumps in an evolutionary sequence. The velocity dispersion of clumps increases overall with the evolutionary sequence, reflecting enhanced stellar feedback in more evolved phases. The relations of $σ$--$R$ and $\cal{L}$--$Σ$ are weak with the clump sample alone, but become evident when combined with others spanning a much wider spatial scales. For $σ$--$R$, its tight relation indicates a kinematic connection between hierarchical density structures, supporting theoretical models of multiscale high-mass star formation. From the $\cal{L}$--$Σ$ relation, cloud structures can be found to transition from over-virial state ($α_\mathrm{vir} > 2$) to sub-virial state ($α_\mathrm{vir} < 2$) as they become smaller and denser, indicating a possible shift in the governing force from turbulence to gravity. This implies that the multiscale physical process of high-mass star formation hinges on the self-gravity of sub-virial molecular clouds. However, the influence of turbulence may not be dismissed until large-scale clouds attain a sub-virial state. This is pending confirmation from future multiscale kinematic observations of molecular clouds with uniform observing settings.
△ Less
Submitted 8 May, 2024;
originally announced May 2024.
-
What role of gravity, turbulence and magnetic fields in high-mass star formation clouds?
Authors:
An-Xu Luo,
Hong-Li Liu,
Guang-Xing Li,
Sirong Pan,
Dong-Ting Yang
Abstract:
To explore the potential role of gravity, turbulence and magnetic fields in high-mass star formation in molecular clouds, this study revisits the velocity dispersion--size ($σ$--$L$) and density--size ($ρ$--$L$) scalings and the associated turbulent energy spectrum using an extensive data sample. The sample includes various hierarchical density structures in high-mass star formation clouds, across…
▽ More
To explore the potential role of gravity, turbulence and magnetic fields in high-mass star formation in molecular clouds, this study revisits the velocity dispersion--size ($σ$--$L$) and density--size ($ρ$--$L$) scalings and the associated turbulent energy spectrum using an extensive data sample. The sample includes various hierarchical density structures in high-mass star formation clouds, across scales of 0.01 to 100 pc. We observe $σ\propto L^{0.26}$ and $ρ\propto L^{-1.54}$ scalings, converging toward a virial equilibrium state. A nearly flat virial parameter--mass ($α_{\rm vir}-M$) distribution is seen across all density scales, with $α_{\rm vir}$ values centered around unity, suggesting a global equilibrium maintained by the interplay between gravity and turbulence across multiple scales. Our turbulent energy spectrum ($E(k)$) analysis, based on the $σ$--$L$ and $ρ$--$L$ scalings, yields a characteristic $E(k) \propto k^{-1.52}$. These findings indicate the potential significance of gravity, turbulence, and possibly magnetic fields all in regulating dynamics of molecular clouds and high-mass star formation therein.
△ Less
Submitted 8 May, 2024;
originally announced May 2024.