Search | arXiv e-print repository

On the Causal Sufficiency and Necessity of Multi-Modal Representation Learning

Authors: Jingyao Wang, Wenwen Qiang, Jiangmeng Li, Lingyu Si, Changwen Zheng, Bing Su

Abstract: An effective paradigm of multi-modal learning (MML) is to learn unified representations among modalities. From a causal perspective, constraining the consistency between different modalities can mine causal representations that convey primary events. However, such simple consistency may face the risk of learning insufficient or unnecessary information: a necessary but insufficient cause is invariant across modalities but may not have the required accuracy; a sufficient but unnecessary cause tends to adapt well to specific modalities but may be hard to adapt to new data. To address this issue, in this paper, we aim to learn representations that are both causal sufficient and necessary, i.e., Causal Complete Cause ($C^3$), for MML. Firstly, we define the concept of $C^3$ for MML, which reflects the probability of being causal sufficiency and necessity. We also propose the identifiability and measurement of $C^3$, i.e., $C^3$ risk, to ensure calculating the learned representations' $C^3$ scores in practice. Then, we theoretically prove the effectiveness of $C^3$ risk by establishing the performance guarantee of MML with a tight generalization bound. Based on these theoretical results, we propose a plug-and-play method, namely Causal Complete Cause Regularization ($C^3$R), to learn causal complete representations by constraining the $C^3$ risk bound. Extensive experiments conducted on various benchmark datasets empirically demonstrate the effectiveness of $C^3$R.

Submitted 19 July, 2024; originally announced July 2024.

arXiv:2406.18443 [pdf, other]

Unveiling the Unknown: Conditional Evidence Decoupling for Unknown Rejection

Authors: Zhaowei Wu, Binyi Su, Hua Zhang, Zhong Zhou

Abstract: In this paper, we focus on training an open-set object detector under the condition of scarce training samples, which should distinguish the known and unknown categories. Under this challenging scenario, the decision boundaries of unknowns are difficult to learn and often ambiguous. To mitigate this issue, we develop a novel open-set object detection framework, which delves into conditional evidence decoupling for the unknown rejection. Specifically, we select pseudo-unknown samples by leveraging the discrepancy in attribution gradients between known and unknown classes, alleviating the inadequate unknown distribution coverage of training data. Subsequently, we propose a Conditional Evidence Decoupling Loss (CEDL) based on Evidential Deep Learning (EDL) theory, which decouples known and unknown properties in pseudo-unknown samples to learn distinct knowledge, enhancing separability between knowns and unknowns. Additionally, we propose an Abnormality Calibration Loss (ACL), which serves as a regularization term to adjust the output probability distribution, establishing robust decision boundaries for the unknown rejection. Our method achieves superior performance over previous state-of-the-art approaches, improving the mean recall of the unknown class by 7.24% across all shots in VOC10-5-5 dataset settings and 1.38% in VOC-COCO dataset settings. The code is available via https://github.com/zjzwzw/CED-FOOD.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.11092 [pdf, other]

Guaranteed Sampling Flexibility for Low-tubal-rank Tensor Completion

Authors: Bowen Su, Juntao You, HanQin Cai, Longxiu Huang

Abstract: While Bernoulli sampling is extensively studied in tensor completion, t-CUR sampling approximates low-tubal-rank tensors via lateral and horizontal subtensors. However, both methods lack sufficient flexibility for diverse practical applications. To address this, we introduce Tensor Cross-Concentrated Sampling (t-CCS), a novel and straightforward sampling model that advances the matrix cross-concentrated sampling concept within a tensor framework. t-CCS effectively bridges the gap between Bernoulli and t-CUR sampling, offering additional flexibility that can lead to computational savings in various contexts. A key aspect of our work is the comprehensive theoretical analysis provided. We establish a sufficient condition for the successful recovery of a low-rank tensor from its t-CCS samples. In support of this, we also develop a theoretical framework validating the feasibility of t-CUR via uniform random sampling and conduct a detailed theoretical sampling complexity analysis for tensor completion problems utilizing the general Bernoulli sampling model. Moreover, we introduce an efficient non-convex algorithm, the Iterative t-CUR Tensor Completion (ITCURTC) algorithm, specifically designed to tackle the t-CCS-based tensor completion. We have intensively tested and validated the effectiveness of the t-CCS model and the ITCURTC algorithm across both synthetic and real-world datasets.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.04519 [pdf, other]

Multifidelity digital twin for real-time monitoring of structural dynamics in aquaculture net cages

Authors: Eirini Katsidoniotaki, Biao Su, Eleni Kelasidi, Themistoklis P. Sapsis

Abstract: As the global population grows and climate change intensifies, sustainable food production is critical. Marine aquaculture offers a viable solution, providing a sustainable protein source. However, the industry's expansion requires novel technologies for remote management and autonomous operations. Digital twin technology can advance the aquaculture industry, but its adoption has been limited. Fish net cages, which are flexible floating structures, are critical yet vulnerable components of aquaculture farms. Exposed to harsh and dynamic marine environments, the cages experience significant loads and risk damage, leading to fish escapes, environmental impacts, and financial losses. We propose a multifidelity surrogate modeling framework for integration into a digital twin for real-time monitoring of aquaculture net cage structural dynamics under stochastic marine conditions. Central to this framework is the nonlinear autoregressive Gaussian process method, which learns complex, nonlinear cross-correlations between models of varying fidelity. It combines low-fidelity simulation data with a small set of high-fidelity field sensor measurements, which offer the real dynamics but are costly and spatially sparse. Validated at the SINTEF ACE fish farm in Norway, our digital twin receives online metocean data and accurately predicts net cage displacements and mooring line loads, aligning closely with field measurements. The proposed framework is beneficial where application-specific data are scarce, offering rapid predictions and real-time system representation.
The developed digital twin prevents potential damages by assessing structural integrity and facilitates remote operations with unmanned underwater vehicles. Our work also compares GP and GCNs for predicting net cage deformation, highlighting the latter's effectiveness in complex structural applications.

Submitted 10 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.19654 [pdf, other]

Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training

Authors: Jinxia Yang, Bing Su, Wayne Xin Zhao, Ji-Rong Wen

Abstract: Medical vision-language pre-training methods mainly leverage the correspondence between paired medical images and radiological reports. Although multi-view spatial images and temporal sequences of image-report pairs are available in off-the-shelf multi-modal medical datasets, most existing methods have not thoroughly tapped into such extensive supervision signals. In this paper, we introduce the Med-ST framework for fine-grained spatial and temporal modeling to exploit information from multiple spatial views of chest radiographs and temporal historical records. For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views. To achieve a more comprehensive alignment, Med-ST not only establishes the global alignment between whole images and texts but also introduces modality-weighted local alignment between text tokens and spatial regions of images. For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective by forward mapping classification (FMC) and reverse mapping regression (RMR). By perceiving temporal information from simple to complex, Med-ST can learn temporal semantics. Experimental results across four distinct tasks demonstrate the effectiveness of Med-ST, especially in temporal classification tasks. Our code and model are available at https://github.com/SVT-Yang/MedST.

Submitted 29 May, 2024; originally announced May 2024.

Comments: Accepted at ICML 2024

arXiv:2405.01053 [pdf, other]

Explicitly Modeling Universality into Self-Supervised Learning

Authors: Jingyao Wang, Wenwen Qiang, Zeen Song, Lingyu Si, Jiangmeng Li, Changwen Zheng, Bing Su

Abstract: The goal of universality in self-supervised learning (SSL) is to learn universal representations from unlabeled data and achieve excellent performance on all samples and tasks. However, these methods lack explicit modeling of the universality in the learning objective, and the related theoretical understanding remains limited. This may cause models to overfit in data-scarce situations and generalize poorly in real life. To address these issues, we provide a theoretical definition of universality in SSL, which constrains both the learning and evaluation universality of the SSL models from the perspective of discriminability, transferability, and generalization. Then, we propose a $σ$-measurement to help quantify the score of one SSL model's universality. Based on the definition and measurement, we propose a general SSL framework, called GeSSL, to explicitly model universality into SSL. It introduces a self-motivated target based on $σ$-measurement, which enables the model to find the optimal update direction towards universality. Extensive theoretical and empirical evaluations demonstrate the superior performance of GeSSL.

Submitted 23 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: 28 pages, submitted to ICML24 with 7766

arXiv:2404.18377 [pdf, other]

Inference for the panel ARMA-GARCH model when both $N$ and $T$ are large

Authors: Bing Su, Ke Zhu

Abstract: We propose a panel ARMA-GARCH model to capture the dynamics of large panel data with $N$ individuals over $T$ time periods. For this model, we provide a two-step estimation procedure to estimate the ARMA parameters and GARCH parameters stepwisely. Under some regular conditions, we show that all of the proposed estimators are asymptotically normal with the convergence rate $(NT)^{-1/2}$, and they have the asymptotic biases when both $N$ and $T$ diverge to infinity at the same rate. Particularly, we find that the asymptotic biases result from the fixed effect, estimation effect, and unobservable initial values. To correct the biases, we further propose the bias-corrected version of estimators by using either the analytical asymptotics or jackknife method. Our asymptotic results are based on a new central limit theorem for the linear-quadratic form in the martingale difference sequence, when the weight matrix is uniformly bounded in row and column. Simulations and one real example are given to demonstrate the usefulness of our panel ARMA-GARCH model.

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.15131 [pdf, other]

Optimizing Multi-Touch Textile and Tactile Skin Sensing Through Circuit Parameter Estimation

Authors: Bo Ying Su, Yuchen Wu, Chengtao Wen, Changliu Liu

Abstract: Tactile and textile skin technologies have become increasingly important for enhancing human-robot interaction and allowing robots to adapt to different environments. Despite notable advancements, there are ongoing challenges in skin signal processing, particularly in achieving both accuracy and speed in dynamic touch sensing. This paper introduces a new framework that poses the touch sensing problem as an estimation problem of resistive sensory arrays. Utilizing a Regularized Least Squares objective function that estimates the resistance distribution of the skin, we enhance the touch sensing accuracy and mitigate the ghosting effects, where false or misleading touches may be registered. Furthermore, our study presents a streamlined skin design that simplifies manufacturing processes without sacrificing performance. Experimental outcomes substantiate the effectiveness of our method, showing a 26.9% improvement in multi-touch force-sensing accuracy for the tactile skin.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.04095 [pdf, other]

Dynamic Prompt Optimizing for Text-to-Image Generation

Authors: Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, Qing Yang

Abstract: Text-to-image generative models, specifically those based on diffusion models like Imagen and Stable Diffusion, have made substantial advancements. Recently, there has been a surge of interest in the delicate refinement of text prompts. Users assign weights or alter the injection time steps of certain words in the text prompts to improve the quality of generated images. However, the success of fine-control prompts depends on the accuracy of the text prompts and the careful selection of weights and time steps, which requires significant manual intervention. To address this, we introduce the Prompt Auto-Editing (PAE) method. Besides refining the original prompts for image generation, we further employ an online reinforcement learning strategy to explore the weights and injection time steps of each word, leading to the dynamic fine-control prompts. The reward function during training encourages the model to consider aesthetic score, semantic consistency, and user preferences. Experimental results demonstrate that our proposed method effectively improves the original prompts, generating visually more appealing images while maintaining semantic alignment. Code is available at https://github.com/Mowenyii/PAE.

Submitted 5 April, 2024; originally announced April 2024.

Comments: Accepted to CVPR 2024

arXiv:2404.00340 [pdf, other]

Deep Reinforcement Learning in Autonomous Car Path Planning and Control: A Survey

Authors: Yiyang Chen, Chao Ji, Yunrui Cai, Tong Yan, Bo Su

Abstract: Combining data-driven applications with control systems plays a key role in recent Autonomous Car research. This thesis offers a structured review of the latest literature on Deep Reinforcement Learning (DRL) within the realm of autonomous vehicle Path Planning and Control. It collects a series of DRL methodologies and algorithms and their applications in the field, focusing notably on their roles in trajectory planning and dynamic control. In this review, we delve into the application outcomes of DRL technologies in this domain. By summarizing these literatures, we highlight potential challenges, aiming to offer insights that might aid researchers engaged in related fields.

Submitted 30 March, 2024; originally announced April 2024.

arXiv:2403.07435 [pdf, other]

Broadened-beam Uniform Rectangular Array Coefficient Design in LEO SatComs Under Quality of Service and Constant Modulus Constraints

Authors: Weiting Lin, Yuchieh Wu, Borching Su

Abstract: Satellite communications (SatComs) are anticipated to provide global Internet access. Low Earth orbit (LEO) satellites (SATs) have the advantage of providing higher downlink capacity owing to their smaller link budget compared with medium Earth orbit (MEO) and geostationary Earth orbit (GEO) SATs. In this paper, beam-broadening algorithms for uniform rectangular arrays (URAs) in LEO SatComs were studied. The proposed method is the first of its kind that jointly considers the path loss variation from SAT to user terminal (UT) due to the Earth's curvature to guarantee quality of service (QoS) inspired by the synthesis of isoflux radiation patterns in the literature, constant modulus constraint (CMC) favored for maximizing power amplifier (PA) efficiency, and out-of-beam radiation suppression to avoid interference. A URA design problem is formulated and decomposed into two uniform linear array (ULA) design subproblems utilizing the idea of Kronecker product beamforming to reduce the computational complexity of designing URA. The non-convex ULA subproblems are solved by a convex iterative algorithm. Simulation results reveal the advantages of the proposed method for suppressing out-of-beam radiation and achieving design criteria. In addition, channel capacity evaluation is carried out and shows that the proposed "broadened-beam" beamformers can offer capacities that are at least four times greater than "narrow-beam" beamformers employing an array steering vector when beam transition time is taken into account.
The proposed method holds potential for LEO broadcasting applications such as digital video broadcasting (DVB).

Submitted 12 March, 2024; originally announced March 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2403.04988 [pdf, other]

doi 10.1140/epjc/s10052-024-12773-y

Constraining primordial black holes as dark matter using AMS-02 data

Authors: Bing-Yu Su, Xu Pan, Guan-Sen Wang, Lei Zu, Yupeng Yang, Lei Feng

Abstract: Primordial black holes (PBHs) are the plausible candidates for the cosmological dark matter. Theoretically, PBHs with masses $M_{\rm PBH}$ in the range of $4\times10^{14}\sim 10^{17}\,{\rm g}$ can emit sub-GeV electrons and positrons through Hawking radiation. Some of these particles could undergo diffusive reacceleration during propagation in the Milky Way, potentially reaching energies up to the GeV level observed by AMS-02. In this work, we utilize AMS-02 data to constrain the PBH abundance $f_{\rm PBH}$ by employing the reacceleration mechanism. Under the assumption of a monochromatic PBH mass distribution, our findings reveal that the limit is stricter than that derived from Voyager 1 data. This difference is particularly pronounced when $M_{\rm PBH}\lesssim10^{15}\,{\rm g}$, exceeding an order of magnitude. The constraints are even more robust in a more realistic scenario involving a log-normal mass distribution of PBHs. Moreover, we explore the impact of varying propagation parameters and solar modulation potential within reasonable ranges, and find that such variations have minimal effects on the final results.

Submitted 21 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

Comments: 6 pages, 3 figures

Journal ref: Eur. Phys. J. C 84, 606 (2024)

arXiv:2402.09204 [pdf, other]

Domain-adaptive and Subgroup-specific Cascaded Temperature Regression for Out-of-distribution Calibration

Authors: Jiexin Wang, Jiahao Chen, Bing Su

Abstract: Although deep neural networks yield high classification accuracy given sufficient training data, their predictions are typically overconfident or under-confident, i.e., the prediction confidences cannot truly reflect the accuracy. Post-hoc calibration tackles this problem by calibrating the prediction confidences without re-training the classification model. However, current approaches assume congruence between test and validation data distributions, limiting their applicability to out-of-distribution scenarios. To this end, we propose a novel meta-set-based cascaded temperature regression method for post-hoc calibration. Our method tailors fine-grained scaling functions to distinct test sets by simulating various domain shifts through data augmentation on the validation set. We partition each meta-set into subgroups based on predicted category and confidence level, capturing diverse uncertainties. A regression network is then trained to derive category-specific and confidence-level-specific scaling, achieving calibration across meta-sets. Extensive experimental results on MNIST, CIFAR-10, and TinyImageNet demonstrate the effectiveness of the proposed method.

Submitted 14 February, 2024; originally announced February 2024.

Journal ref: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), Seoul, Korea

arXiv:2401.15566 [pdf, other]

On the Robustness of Cross-Concentrated Sampling for Matrix Completion

Authors: HanQin Cai, Longxiu Huang, Chandra Kundu, Bowen Su

Abstract: Matrix completion is one of the crucial tools in modern data science research. Recently, a novel sampling model for matrix completion coined cross-concentrated sampling (CCS) has caught much attention. However, the robustness of the CCS model against sparse outliers remains unclear in the existing studies. In this paper, we aim to answer this question by exploring a novel Robust CCS Completion problem. A highly efficient non-convex iterative algorithm, dubbed Robust CUR Completion (RCURC), is proposed. The empirical performance of the proposed algorithm, in terms of both efficiency and robustness, is verified in synthetic and real datasets.

Submitted 27 January, 2024; originally announced January 2024.

Comments: 58th Annual Conference of Information Sciences and Systems

arXiv:2401.11359 [pdf, other]

The Exact Risks of Reference Panel-based Regularized Estimators

Authors: Buxin Su, Qiang Sun, Xiaochen Yang, Bingxin Zhao

Abstract: Reference panel-based estimators have become widely used in genetic prediction of complex traits due to their ability to address data privacy concerns and reduce computational and communication costs. These estimators estimate the covariance matrix of predictors using an external reference panel, instead of relying solely on the original training data. In this paper, we investigate the performance of reference panel-based $L_1$ and $L_2$ regularized estimators within a unified framework based on approximate message passing (AMP). We uncover several key factors that influence the accuracy of reference panel-based estimators, including the sample sizes of the training data and reference panels, the signal-to-noise ratio, the underlying sparsity of the signal, and the covariance matrix among predictors. Our findings reveal that, even when the sample size of the reference panel matches that of the training data, reference panel-based estimators tend to exhibit lower accuracy compared to traditional regularized estimators. Furthermore, we observe that this performance gap widens as the amount of training data increases, highlighting the importance of constructing large-scale reference panels to mitigate this issue. To support our theoretical analysis, we develop a novel non-separable matrix AMP framework capable of handling the complexities introduced by a general covariance matrix and the additional randomness associated with a reference panel.
We validate our theoretical results through extensive simulation studies and real data analyses using the UK Biobank database.

Submitted 20 January, 2024; originally announced January 2024.

Comments: 100 pages, 11 figures

arXiv:2401.00187 [pdf]

Anomalous size effects of effective stiffnesses in bistable counter-rotating mechanical metamaterials

Authors: Zehuan Tang, Tingfeng Ma, Boyue Su, Pengfei Kang, Bowei Wu, Hui Chen, Shuanghuizhi Li, Decai Wu, Yujie Zhang, Gen Zhao

Abstract: Counter-rotating mechanical metamaterials have previously been found to have anomalous characteristics or functions such as auxetics effects, shape changers, and soliton transports, which are all under monostable conditions. The properties of counter-rotating mechanical metamaterials under bistable conditions have not yet been explored. Here, we found that for a bistable counter-rotating metamaterial chain, the effective stiffnesses of the two steady states are different in the chain with even-numbered nodes. For the chain with odd-numbered nodes, the effective stiffnesses corresponding to the two steady states are exactly the same. This special property is not characterized by the characteristic attenuation lengths of the underlying mechanism, but depends on the different symmetries of the underlying mechanism of the chains with odd and even nodes. In addition, the relationship between the abnormal non-monotonic size effect and equilibrium angle is clarified. More interestingly, for one-dimensional chains with even-numbered nodes, the size effect of effective stiffness bifurcates at a specific equilibrium angle, and the corresponding mechanisms are revealed.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2311.07802 [pdf, ps, other]

Global Solutions For Systems of Quadratic Nonlinear Schrödinger Equations in 3D

Authors: Boyang Su

Abstract: In this work, we prove global well-posedness and scattering for systems of quadratic nonlinear Schrödinger equations in the critical three-dimensional case, for small, localized data. For the terms corresponding to the nonlinearity $u\bar{u}$, we need to do an $ε$ regularization of this part of the nonlinearity. In order to tackle quadratic space-time resonances, after performing a Littlewood-Paley decomposition, we use integration by parts in the Duhamel term, to take advantage of the oscillations when space-time resonances are absent.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 100 pages

arXiv:2310.19973 [pdf, other]

Unified Enhancement of Privacy Bounds for Mixture Mechanisms via $f$-Differential Privacy

Authors: Chendi Wang, Buxin Su, Jiayuan Ye, Reza Shokri, Weijie J. Su

Abstract: Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy bounds because it induces mixture distributions for the algorithm's output that are difficult to analyze. This paper focuses on improving privacy bounds for shuffling models and one-iteration differentially private gradient descent (DP-GD) with random initializations using $f$-DP. We derive a closed-form expression of the trade-off function for shuffling models that outperforms the most up-to-date results based on $(ε,δ)$-DP. Moreover, we investigate the effects of random initialization on the privacy of one-iteration DP-GD. Our numerical computations of the trade-off function indicate that random initialization can enhance the privacy of DP-GD. Our analysis of $f$-DP guarantees for these mixture mechanisms relies on an inequality for trade-off functions introduced in this paper. This inequality implies the joint convexity of $F$-divergences. Finally, we study an $f$-DP analog of the advanced joint convexity of the hockey-stick divergence related to $(ε,δ)$-DP and apply it to analyze the privacy of mixture mechanisms.

Submitted 1 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

arXiv:2309.16567 [pdf, other]

Universal Murray's law for optimised fluid transport in synthetic structures

Authors: Binghan Zhou, Qian Cheng, Zhuo Chen, Zesheng Chen, Dongfang Liang, Eric Anthony Munro, Guolin Yun, Yoshiki Kawai, Jinrui Chen, Tynee Bhowmick, Padmanathan Karthick Kannan, Luigi G. Occhipinti, Hidetoshi Matsumoto, Julian Gardner, Bao-Lian Su, Tawfique Hasan

Abstract: Materials following Murray's law are of significant interest due to their unique porous structure and optimal mass transfer ability. However, it is challenging to construct such biomimetic hierarchical channels with perfectly cylindrical pores in synthetic systems following the existing theory. Achieving superior mass transport capacity revealed by Murray's law in nanostructured materials has thus far remained out of reach. We propose a Universal Murray's law applicable to a wide range of hierarchical structures, shapes and generalised transfer processes. We experimentally demonstrate optimal flow of various fluids in hierarchically planar and tubular graphene aerogel structures to validate the proposed law. By adjusting the macroscopic pores in such aerogel-based gas sensors, we also show a significantly improved sensor response dynamic. Our work provides a solid framework for designing synthetic Murray materials with arbitrarily shaped channels for superior mass transfer capabilities, with future implications in catalysis, sensing and energy applications.

Submitted 14 April, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: 19 pages, 4 figures

arXiv:2309.15000 [pdf]

Elastic fractal higher-order topological states

Authors: Tingfeng Ma, Bowei Wu, Jiachao Xu, Hui Chen, Shuanghuizhi Li, Boyue Su, Pengfei Kang

Abstract: In this work, elastic fractal higher-order topological states are investigated. Bott index is adopted to characterize the topological property of elastic fractal structures. The topological corner and edge states of elastic waves in fractal structures are realized theoretically and experimentally. Different from traditional two-dimension (2D) high-order topological insulators based on periodic structures, the high-order topological states based on elastic fractal structures in this work support not only abundant topological inner corner states and edge states, but also topological outer corner states. The richness of corner states is much higher than that of topological insulators based on periodic structures. The strong robustness of the topological corner states in the fractal structure are verified by introducing disorders and defects. The topological phenomenon of elastic fractal structures revealed in this work enriches the topological physics of elastic systems and breaks the limitation of that relies on periodic elastic structures. The results have important application prospects in energy harvesting, information transmissions, elastic energy acquisitions and high-sensitivity detections.

Submitted 19 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

arXiv:2308.14125 [pdf]

doi 10.1002/advs.202101532

Persistence of Monoclinic Crystal Structure in Three-Dimensional Second-Order Topological Insulator Candidate 1T'-MoTe2 Thin Flake without Structural Phase transition

Authors: Bo Su, Yuan Huang, Yan Hui Hou, Jiawei Li, Rong Yang, Yongchang Ma, Yang Yang, Guangyu Zhang, Xingjiang Zhou, Jianlin Luo, Zhi-Guo Chen

Abstract: A van der Waals material, MoTe2 with a monoclinic 1T' crystal structure is a candidate for three-dimensional (3D) second-order topological insulators (SOTIs) hosting gapless hinge states and insulating surface states. However, due to the temperature-induced structural phase transition, the monoclinic 1T' structure of MoTe2 would be transformed into the orthorhombic Td structure as the temperature is lowered, which hinders the experimental verification and the electronic applications of the predicted SOTI state at low temperatures. Here, we present systematic Raman spectroscopy studies of the exfoliated MoTe2 thin flakes with variable thicknesses at different temperatures. As a spectroscopic signature of the orthorhombic Td structure of MoTe2, the out-of-plane vibration mode D at ~ 125 cm-1 is always visible below a certain temperature in the multilayer flakes thicker than ~ 27.7 nm, but vanishes in the temperature range from 80 K to 320 K when the flake thickness becomes lower than ~ 19.5 nm. The absence of the out-of-plane vibration mode D in the Raman spectra here demonstrates not only the disappearance of the monoclinic-to-orthorhombic phase transition but also the persistence of the monoclinic 1T' structure in the MoTe2 thin flakes thinner than ~ 19.5 nm at low temperatures down to 80 K, which may be caused by the high enough density of the holes introduced during the gold-enhanced exfoliation process and exposure to air. The MoTe2 thin flakes with the low-temperature monoclinic 1T' structure provide a material platform for realizing SOTI states in van der Waals materials at low temperatures, which paves the way for developing a new generation of electronic devices based on SOTIs.

Submitted 27 August, 2023; originally announced August 2023.

Comments: 20 pages, 5 figures

Journal ref: Advanced Science 9, 2101532 (2022)

arXiv:2308.14112 [pdf]

doi 10.1038/s43246-023-00392-1

Tightly-bound and room-temperature-stable excitons in van der Waals degenerate-semiconductor Bi4O4SeCl2 with high charge-carrier density

Authors: Yueshan Xu, Junjie Wang, Bo Su, Jun Deng, Cao Peng, Chunlong Wu, Qinghua Zhang, Lin Gu, Jianlin Luo, Nan Xu, Jian-gang Guo, Zhi-Guo Chen

Abstract: Excitons, which represent a type of quasi-particles consisting of electron-hole pairs bound by the mutual Coulomb interaction, were often observed in lowly-doped semiconductors or insulators. However, realizing excitons in the semiconductors or insulators with high charge carrier densities is a challenging task. Here, we perform infrared spectroscopy, electrical transport, ab initio calculation, and angle-resolved-photoemission spectroscopy studies of a van der Waals degenerate-semiconductor Bi4O4SeCl2. A peak-like feature (i.e., alpha peak) is present around ~ 125 meV in the optical conductivity spectra at low temperature T = 8 K and room temperature. After being excluded from the optical excitations of free carriers, interband transitions, localized states and polarons, the alpha peak is assigned as the exciton absorption. Moreover, assuming the existence of weakly-bound excitons--Wannier-type excitons in this material violates the Lyddane-Sachs-Teller relation. Besides, the exciton binding energy of ~ 375 meV, which is about an order of magnitude larger than those of conventional semiconductors, and the charge-carrier concentration of ~ 1.25 * 10^19 cm^-3, which is higher than the Mott density, further indicate that the excitons in this highly-doped system should be tightly bound. Our results pave the way for developing the optoelectronic devices based on the tightly-bound and room-temperature-stable excitons in highly-doped van der Waals degenerate semiconductors.

Submitted 27 August, 2023; originally announced August 2023.

Comments: Accepted by Communications Materials

Journal ref: Communications Materials 4, 69 (2023)

arXiv:2308.06358 [pdf, other]

CA2: Cyber Attacks Analytics

Authors: Luyu Cheng, Bairui Su, Yumeng Xue, Xiaoyu Liu, Yunhai Wang

Abstract: The VAST Challenge 2020 Mini-Challenge 1 requires participants to identify the responsible white hat groups behind a fictional Internet outage. To address this task, we have created a visual analytics system named CA2: Cyber Attacks Analytics. This system is designed to efficiently compare and match subgraphs within an extensive graph containing anonymized profiles. Additionally, we showcase an iterative workflow that utilizes our system's capabilities to pinpoint the responsible group.

Submitted 11 August, 2023; originally announced August 2023.

Comments: IEEE Conference on Visual Analytics Science and Technology (VAST) Challenge Workshop 2020

arXiv:2308.05648 [pdf, other]

Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization

Authors: Zezhong Lv, Bing Su, Ji-Rong Wen

Abstract: Video moment localization aims to retrieve the target segment of an untrimmed video according to the natural language query. Weakly supervised methods gains attention recently, as the precise temporal location of the target segment is not always available. However, one of the greatest challenges encountered by the weakly supervised method is implied in the mismatch between the video and language induced by the coarse temporal annotations. To refine the vision-language alignment, recent works contrast the cross-modality similarities driven by reconstructing masked queries between positive and negative video proposals. However, the reconstruction may be influenced by the latent spurious correlation between the unmasked and the masked parts, which distorts the restoring process and further degrades the efficacy of contrastive learning since the masked words are not completely reconstructed from the cross-modality knowledge. In this paper, we discover and mitigate this spurious correlation through a novel proposed counterfactual cross-modality reasoning method. Specifically, we first formulate query reconstruction as an aggregated causal effect of cross-modality and query knowledge. Then by introducing counterfactual cross-modality knowledge into this aggregation, the spurious impact of the unmasked part contributing to the reconstruction is explicitly modeled. Finally, by suppressing the unimodal effect of masked query, we can rectify the reconstructions of video proposals to perform reasonable contrastive learning. Extensive experimental evaluations demonstrate the effectiveness of our proposed method. The code is available at https://github.com/sLdZ0306/CCR.

Submitted 14 October, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: Accepted by ACM MM 2023

arXiv:2308.03950 [pdf, other]

doi 10.1145/3581783.3611888

Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization

Authors: Yujie Zhou, Wenwen Qiang, Anyi Rao, Ning Lin, Bing Su, Jiaqi Wang

Abstract: Zero-shot skeleton-based action recognition aims to recognize actions of unseen categories after training on data of seen categories. The key is to build the connection between visual and semantic space from seen to unseen classes. Previous studies have primarily focused on encoding sequences into a singular feature vector, with subsequent mapping the features to an identical anchor point within the embedded space. Their performance is hindered by 1) the ignorance of the global visual/semantic distribution alignment, which results in a limitation to capture the true interdependence between the two spaces. 2) the negligence of temporal information since the frame-wise features with rich action clues are directly pooled into a single feature vector. We propose a new zero-shot skeleton-based action recognition method via mutual information (MI) estimation and maximization. Specifically, 1) we maximize the MI between visual and semantic space for distribution alignment; 2) we leverage the temporal information for estimating the MI by encouraging MI to increase as more frames are observed. Extensive experiments on three large-scale skeleton action datasets confirm the effectiveness of our method. Code: https://github.com/YujieOuO/SMIE.

Submitted 7 August, 2023; originally announced August 2023.

Comments: Accepted by ACM MM 2023

arXiv:2308.03072 [pdf, other]

Customizing Textile and Tactile Skins for Interactive Industrial Robots

Authors: Bo Ying Su, Zhongqi Wei, James McCann, Wenzhen Yuan, Changliu Liu

Abstract: Tactile skins made from textiles enhance robot-human interaction by localizing contact points and measuring contact forces. This paper presents a solution for rapidly fabricating, calibrating, and deploying these skins on industrial robot arms. The novel automated skin calibration procedure maps skin locations to robot geometry and calibrates contact force. Through experiments on a FANUC LR Mate 200id/7L industrial robot, we demonstrate that tactile skins made from textiles can be effectively used for human-robot interaction in industrial environments, and can provide unique opportunities in robot control and learning, making them a promising technology for enhancing robot perception and interaction.

Submitted 6 August, 2023; originally announced August 2023.

arXiv:2308.01850 [pdf, other]

Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling

Authors: Zhao Yang, Bing Su, Ji-Rong Wen

Abstract: Text-to-motion generation has gained increasing attention, but most existing methods are limited to generating short-term motions that correspond to a single sentence describing a single action. However, when a text stream describes a sequence of continuous motions, the generated motions corresponding to each sentence may not be coherently linked. Existing long-term motion generation methods face two main issues. Firstly, they cannot directly generate coherent motions and require additional operations such as interpolation to process the generated actions. Secondly, they generate subsequent actions in an autoregressive manner without considering the influence of future actions on previous ones. To address these issues, we propose a novel approach that utilizes a past-conditioned diffusion model with two optional coherent sampling methods: Past Inpainting Sampling and Compositional Transition Sampling. Past Inpainting Sampling completes subsequent motions by treating previous motions as conditions, while Compositional Transition Sampling models the distribution of the transition as the composition of two adjacent motions guided by different text prompts. Our experimental results demonstrate that our proposed method is capable of generating compositional and coherent long-term 3D human motions controlled by a user-instructed long text stream. The code is available at https://github.com/yangzhao1230/PCMDM.

Submitted 3 August, 2023; originally announced August 2023.

Comments: Accepted at ACM MM 2023

arXiv:2308.01097

doi 10.1145/3581783.3612330

Spatio-Temporal Branching for Motion Prediction using Motion Increments

Authors: Jiexin Wang, Yujie Zhou, Wenwen Qiang, Ying Ba, Bing Su, Ji-Rong Wen

Abstract: Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications, but it remains a challenging task due to the stochastic and aperiodic nature of future poses. Traditional methods rely on hand-crafted features and machine learning techniques, which often struggle to model the complex dynamics of human motion. Recent deep learning-based methods have achieved success by learning spatio-temporal representations of motion, but these models often overlook the reliability of motion data. Additionally, the temporal and spatial dependencies of skeleton nodes are distinct. The temporal relationship captures motion information over time, while the spatial relationship describes body structure and the relationships between different nodes. In this paper, we propose a novel spatio-temporal branching network using incremental information for HMP, which decouples the learning of temporal-domain and spatial-domain features, extracts more motion information, and achieves complementary cross-domain knowledge learning through knowledge distillation. Our approach effectively reduces noise interference and provides more expressive information for characterizing motion by separately extracting temporal and spatial features. We evaluate our approach on standard HMP benchmarks and outperform state-of-the-art methods in terms of prediction accuracy.

Submitted 17 July, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

Comments: The incremental information of our paper includes the displacement information from the last frame of the historical sequence, derived from the motion information of the first frame in the future sequence and the motion information of the last frame of the historical sequence. This implicitly contains future information, inadvertently giving an unfair advantage in the human motion prediction task

Journal ref: ACM MM 2023

arXiv:2308.00070 [pdf, other]

Phase transition during inflation and the gravitational wave signal at pulsar timing arrays

Authors: Haipeng An, Boye Su, Hanwen Tai, Lian-Tao Wang, Chen Yang

Abstract: Gravitational wave signal offers a promising window into the dynamics of the early universe. The recent results from pulsar timing arrays (PTAs) could be the first glimpse of such new physics. In particular, they could point to new details during the inflation, which can not be probed by other means. We explore the possibility that the new results could come from the secondary gravitational wave sourced by curvature perturbations, generated by a first-order phase transition during the inflation. Based on the results of a field-theoretic lattice simulation of the phase transition process, we show that the gravitational wave signal generated through this mechanism can account for the new results from the PTAs. We analyze the spectral shape of the signal in detail. Future observations can use such information to distinguish the gravitational wave signal considered here from other possible sources.

Submitted 31 July, 2023; originally announced August 2023.

arXiv:2307.11374 [pdf, other]

Exploring the Dark Energy Equation of State with JWST

Authors: Pei Wang, Bing-Yu Su, Lei Zu, Yupeng Yang, Lei Feng

Abstract: Observations from the James Webb Space Telescope (JWST) have unveiled several galaxies with stellar mass $M_*\gtrsim10^{10} M_\odot$ at $7.4\lesssim z\lesssim 9.1$. These remarkable findings indicate an unexpectedly high stellar mass density, which contradicts the prediction of the $Λ\rm CDM$ model. We adopt the Chevallier--Polarski--Linder (CPL) parameterization, one of the dynamic dark energy models, to probe the role of dark energy on shaping the galaxy formation. By considering varying star formation efficiencies within this framework, our analysis demonstrates that an increased proportion of dark energy in the universe corresponds to the formation of more massive galaxies at higher redshifts, given a fixed perturbation amplitude observed today. Furthermore, through elaborately selecting CPL parameters, we successfully explain the JWST observations with star formation efficiencies $ε\gtrsim0.05$ at a confidence level of $95\%$. These intriguing results indicate the promising prospect of revealing the nature of dark energy by analyzing the high-redshift massive galaxies.

Submitted 21 July, 2023; originally announced July 2023.

Comments: 7 pages, 3 figures

arXiv:2307.04179 [pdf, other]

IANS: Intelligibility-aware Null-steering Beamforming for Dual-Microphone Arrays

Authors: Wen-Yuan Ting, Syu-Siang Wang, Yu Tsao, Borching Su

Abstract: Beamforming techniques are popular in speech-related applications due to their effective spatial filtering capabilities. Nonetheless, conventional beamforming techniques generally depend heavily on either the target's direction-of-arrival (DOA), relative transfer function (RTF) or covariance matrix. This paper presents a new approach, the intelligibility-aware null-steering (IANS) beamforming framework, which uses the STOI-Net intelligibility prediction model to improve speech intelligibility without prior knowledge of the speech signal parameters mentioned earlier. The IANS framework combines a null-steering beamformer (NSBF) to generate a set of beamformed outputs, and STOI-Net, to determine the optimal result. Experimental results indicate that IANS can produce intelligibility-enhanced signals using a small dual-microphone array. The results are comparable to those obtained by null-steering beamformers with given knowledge of DOAs.

Submitted 9 July, 2023; originally announced July 2023.

Comments: Preprint submitted to IEEE MLSP 2023

arXiv:2306.15959 [pdf, other]

doi 10.1002/prop.202200166

The Joule--Thomson and Joule--Thomson-like effects of the black holes in a cavity

Authors: Nan Li, Jin-Yu Li, Bing-Yu Su

Abstract: When a black hole is enclosed in a cavity in asymptotically flat space, an effective volume can be introduced, and an effective pressure can be further defined as its conjugate variable. By this means, an extended phase space is constructed in a cavity, which resembles that in the anti-de Sitter (AdS) space in many aspects. However, there are still some notable dissimilarities simultaneously. In this work, the Joule--Thomson (JT) effect of the black holes, widely discussed in the AdS space as an isenthalpic (constant-mass) process, is shown to only have cooling region in a cavity. On the contrary, in a constant-thermal-energy process (the JT-like effect), there is only heating region in a cavity. Altogether, different from the AdS case, there is no inversion temperature or inversion curve in a cavity. Our work reveals the subtle discrepancy between the two different extended phase spaces that is sensitive to the specific boundary conditions.

Submitted 2 July, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: 19 pages, 2 figures

Journal ref: Fortsch. Phys. 71 (2023), 2200166

arXiv:2306.05364 [pdf, other]

An inflation model for massive primordial black holes to interpret the JWST observations

Authors: Bing-Yu Su, Nan Li, Lei Feng

Abstract: The first observations of the James Webb Space Telescope (JWST) have identified six massive galaxy candidates with the stellar masses $M_\ast\gtrsim 10^{10}\,M_\odot$ at high redshifts $7.4\lesssim z\lesssim 9.1$, with two most massive high-$z$ objects having the cumulative comoving number densities $n_{\rm G}$ up to $1.6\times 10^{-5}\, {\rm Mpc}^{-3}$. The presence of such massive sources in the early universe challenges the standard $Λ$CDM model since the needed star formation efficiency is unrealistically high. This tension can be alleviated via the accretion of massive primordial black holes (PBHs). In this work, with the updated data from the first JWST observations, we find that the PBHs with mass $10^8\,M_\odot\lesssim M_{\rm PBH}\lesssim 10^{11}\,M_\odot$ can act as the seeds of extremely massive galaxies even with a low abundance $10^{-7}\lesssim f_{\rm PBH}\lesssim 10^{-3}$. We construct an ultraslow-roll inflation model and investigate its possibility of producing the required PBHs. We explore the model in two cases, depending on whether there is a perfect plateau on the inflaton potential. If the plateau is allowed to incline slightly, our model can produce the PBHs that cover the required PBH mass and abundance range to explain the JWST data.

Submitted 8 June, 2023; originally announced June 2023.

Comments: 19 pages, 2 figures

arXiv:2305.12618 [pdf, other]

Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning

Authors: Jiahao Chen, Yurou Liu, Jiangmeng Li, Bing Su, Jirong Wen

Abstract: Molecular representation learning is a crucial task in predicting molecular properties. Molecules are often modeled as graphs where atoms and chemical bonds are represented as nodes and edges, respectively, and Graph Neural Networks (GNNs) have been commonly utilized to predict atom-related properties, such as reactivity and solubility. However, functional groups (subgraphs) are closely related to some chemical properties of molecules, such as efficacy, and metabolic properties, which cannot be solely determined by individual atoms. In this paper, we introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA), which addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information. ASBA consists of two branches, one for atom-wise information and the other for subgraph-wise information. Considering existing atom-wise GNNs cannot properly extract invariant subgraph features, we propose a decomposition-polymerization GNN architecture for the subgraph-wise branch. Furthermore, we propose cooperative node-level and graph-level self-supervised learning strategies for ASBA to improve its generalization. Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications. Extensive experiments have demonstrated the effectiveness of our method.

Submitted 21 May, 2023; originally announced May 2023.

arXiv:2305.02373 [pdf, other]

Efficient estimation of weighted cumulative treatment effects by double/debiased machine learning

Authors: Shenbo Xu, Bang Zheng, Bowen Su, Stan Finkelstein, Roy Welsch, Kenney Ng, Ioanna Tzoulaki, Zach Shahn

Abstract: In empirical studies with time-to-event outcomes, investigators often leverage observational data to conduct causal inference on the effect of exposure when randomized controlled trial data is unavailable. Model misspecification and lack of overlap are common issues in observational studies, and they often lead to inconsistent and inefficient estimators of the average treatment effect. Estimators targeting overlap weighted effects have been proposed to address the challenge of poor overlap, and methods enabling flexible machine learning for nuisance models address model misspecification. However, the approaches that allow machine learning for nuisance models have not been extended to the setting of weighted average treatment effects for time-to-event outcomes when there is poor overlap. In this work, we propose a class of one-step cross-fitted double/debiased machine learning estimators for the weighted cumulative causal effect as a function of restriction time. We prove that the proposed estimators are consistent, asymptotically linear, and reach semiparametric efficiency bounds under regularity conditions. Our simulations show that the proposed estimators using nonparametric machine learning nuisance models perform as well as established methods that require correctly-specified parametric nuisance models, illustrating that our estimators mitigate the need for oracle parametric nuisance models. We apply the proposed methods to real-world observational data from a UK primary care database to compare the effects of anti-diabetic drugs on cancer clinical outcomes.

Submitted 3 May, 2023; originally announced May 2023.

arXiv:2304.08288 [pdf, other]

doi 10.1109/ICASSP49357.2023.10095211

Toward Auto-evaluation with Confidence-based Category Relation-aware Regression

Authors: Jiexin Wang, Jiahao Chen, Bing Su

Abstract: Auto-evaluation aims to automatically evaluate a trained model on any test dataset without human annotations. Most existing methods utilize global statistics of features extracted by the model as the representation of a dataset. This ignores the influence of the classification head and loses category-wise confusion information of the model. However, ratios of instances assigned to different categories together with their confidence scores reflect how many instances in which categories are difficult for the model to classify, which contain significant indicators for both overall and category-wise performances. In this paper, we propose a Confidence-based Category Relation-aware Regression ($C^2R^2$) method. $C^2R^2$ divides all instances in a meta-set into different categories according to their confidence scores and extracts the global representation from them. For each category, $C^2R^2$ encodes its local confusion relations to other categories into a local representation. The overall and category-wise performances are regressed from global and local representations, respectively. Extensive experiments show the effectiveness of our method.

Submitted 9 May, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

Journal ref: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5

arXiv:2304.06537 [pdf, other]

Transfer Knowledge from Head to Tail: Uncertainty Calibration under Long-tailed Distribution

Authors: Jiahao Chen, Bing Su

Abstract: How to estimate the uncertainty of a given model is a crucial problem. Current calibration techniques treat different classes equally and thus implicitly assume that the distribution of training data is balanced, but ignore the fact that real-world data often follows a long-tailed distribution. In this paper, we explore the problem of calibrating the model trained from a long-tailed distribution. Due to the difference between the imbalanced training distribution and balanced test distribution, existing calibration methods such as temperature scaling cannot generalize well to this problem. Specific calibration methods for domain adaptation are also not applicable because they rely on unlabeled target domain instances which are not available. Models trained from a long-tailed distribution tend to be more overconfident on head classes. To this end, we propose a novel knowledge-transferring-based calibration method by estimating the importance weights for samples of tail classes to realize long-tailed calibration. Our method models the distribution of each class as a Gaussian distribution and views the source statistics of head classes as a prior to calibrate the target distributions of tail classes. We adaptively transfer knowledge from head classes to get the target probability density of tail classes. The importance weight is estimated by the ratio of the target probability density over the source probability density. Extensive experiments on CIFAR-10-LT, MNIST-LT, CIFAR-100-LT, and ImageNet-LT datasets demonstrate the effectiveness of our method.
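
The abstract above describes the importance weight as the ratio of a target density over a source density, with each class modeled as a Gaussian. As a rough illustration only, the following sketch computes that ratio for one-dimensional Gaussians. The function names and the `(mean, std)` tuple layout are assumptions made for this example, not the authors' implementation; the paper's features are presumably multivariate, and its knowledge-transfer step for obtaining the target statistics is not shown.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, source, target):
    """Ratio of target density over source density at x.

    `source` and `target` are hypothetical (mean, std) tuples that
    describe the two Gaussian class models.
    """
    return gaussian_pdf(x, *target) / gaussian_pdf(x, *source)


# Identical source and target models give a weight of exactly 1.
w_same = importance_weight(0.3, (0.0, 1.0), (0.0, 1.0))

# A wider target (std 2) halves the density at the shared mean.
w_wide = importance_weight(0.0, (0.0, 1.0), (0.0, 2.0))
```

A weight above 1 marks a sample that is more likely under the target model than under the source model, which is the usual reading of an importance weight.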

Submitted 13 April, 2023; originally announced April 2023.

arXiv:2302.09018 [pdf, other]

doi 10.1609/aaai.v37i3.25495

Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences

Authors: Yujie Zhou, Haodong Duan, Anyi Rao, Bing Su, Jiaqi Wang

Abstract: Self-supervised learning has demonstrated remarkable capability in representation learning for skeleton-based action recognition. Existing methods mainly focus on applying global data augmentation to generate different views of the skeleton sequence for contrastive learning. However, due to the rich action clues in the skeleton sequences, existing methods may only take a global perspective to learn to discriminate different skeletons without thoroughly leveraging the local relationship between different skeleton joints and video frames, which is essential for real-world applications. In this work, we propose a Partial Spatio-Temporal Learning (PSTL) framework to exploit the local relationship from partial skeleton sequences built by a unique spatio-temporal masking strategy. Specifically, we construct a negative-sample-free triplet stream structure that is composed of an anchor stream without any masking, a spatial masking stream with Central Spatial Masking (CSM), and a temporal masking stream with Motion Attention Temporal Masking (MATM). The feature cross-correlation matrix is measured between the anchor stream and the other two masking streams, respectively. (1) Central Spatial Masking discards selected joints from the feature calculation process, where the joints with a higher degree of centrality have a higher possibility of being selected. (2) Motion Attention Temporal Masking leverages the motion of action and removes frames that move faster with a higher possibility. Our method achieves SOTA performance on NTU-60, NTU-120 and PKU-MMD under various downstream tasks. A practical evaluation is performed where some skeleton joints are lost in downstream tasks. In contrast to previous methods that suffer from large performance drops, our PSTL can still achieve remarkable results, validating the robustness of our method. Code: https://github.com/YujieOuO/PSTL.git.

Submitted 22 February, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: Accepted by AAAI 2023(Oral)

arXiv:2302.05787 [pdf, other]

Differentially Private Normalizing Flows for Density Estimation, Data Synthesis, and Variational Inference with Application to Electronic Health Records

Authors: Bingyue Su, Yu Wang, Daniele E. Schiavazzi, Fang Liu

Abstract: Electronic health records (EHR) often contain sensitive medical information about individual patients, posing significant limitations to sharing or releasing EHR data for downstream learning and inferential tasks. We use normalizing flows (NF), a family of deep generative models, to estimate the probability density of a dataset with differential privacy (DP) guarantees, from which privacy-preserving synthetic data are generated. We apply the technique to an EHR dataset containing patients with pulmonary hypertension. We assess the learning and inferential utility of the synthetic data by comparing the accuracy in the prediction of the hypertension status and variational posterior distribution of the parameters of a physics-based model. In addition, we use a simulated dataset from a nonlinear model to compare the results from variational inference (VI) based on privacy-preserving synthetic data, and privacy-preserving VI obtained from directly privatizing NFs for VI with DP guarantees given the original non-private dataset. The results suggest that synthetic data generated through differentially private density estimation with NF can yield good utility at a reasonable privacy cost. We also show that VI obtained from differentially private NF based on the free energy bound loss may produce variational approximations with significantly altered correlation structure, and loss formulations based on alternative dissimilarity metrics between two distributions might provide improved results.

Submitted 11 February, 2023; originally announced February 2023.

arXiv:2301.06658 [pdf, other]

Statistical inference for the logarithmic spatial heteroskedasticity model with exogenous variables

Authors: Bing Su, Fukang Zhu, Ke Zhu

Abstract: The spatial dependence in mean has been well studied by plenty of models in a large strand of literature; however, the investigation of spatial dependence in variance is lagging significantly behind. The existing models for the spatial dependence in variance are scarce, with neither probabilistic structure nor statistical inference procedure being explored. To circumvent this deficiency, this paper proposes a new generalized logarithmic spatial heteroscedasticity model with exogenous variables (denoted by the log-SHE model) to study the spatial dependence in variance. For the log-SHE model, its spatial near-epoch dependence (NED) property is investigated, and a systematic statistical inference procedure is provided, including the maximum likelihood and generalized method of moments estimators, the Wald, Lagrange multiplier and likelihood-ratio-type D tests for model parameter constraints, and the overidentification test for the model diagnostic checking. Using the tool of spatial NED, the asymptotics of all proposed estimators and tests are established under regular conditions. The usefulness of the proposed methodology is illustrated by simulation results and a real data example on the house selling price.

Submitted 16 January, 2023; originally announced January 2023.

arXiv:2210.15996 [pdf, other]

Towards Generalized Few-Shot Open-Set Object Detection

Authors: Binyi Su, Hua Zhang, Jingzhi Li, Zhong Zhou

Abstract: Open-set object detection (OSOD) aims to detect the known categories and reject unknown objects in a dynamic world, which has achieved significant attention. However, previous approaches only consider this problem in data-abundant conditions, while neglecting the few-shot scenes. In this paper, we seek a solution for the generalized few-shot open-set object detection (G-FOOD), which aims to avoid detecting unknown classes as known classes with a high confidence score while maintaining the performance of few-shot detection. The main challenge for this task is that few training samples induce the model to overfit on the known classes, resulting in a poor open-set performance. We propose a new G-FOOD algorithm to tackle this issue, named Few-shOt Open-set Detector (FOOD), which contains a novel class weight sparsification classifier (CWSC) and a novel unknown decoupling learner (UDL). To prevent over-fitting, CWSC randomly sparses parts of the normalized weights for the logit prediction of all classes, and then decreases the co-adaptability between the class and its neighbors. Alongside, UDL decouples training the unknown class and enables the model to form a compact unknown decision boundary. Thus, the unknown objects can be identified with a confidence probability without any threshold, prototype, or generation. We compare our method with several state-of-the-art OSOD methods in few-shot scenes and observe that our method improves the F-score of unknown classes by 4.80%-9.08% across all shots in VOC-COCO dataset settings. The source code is available at https://github.com/binyisu/food.

Submitted 21 February, 2024; v1 submitted 28 October, 2022; originally announced October 2022.

arXiv:2210.13446 [pdf, other]

Flying Trot Control Method for Quadruped Robot Based on Trajectory Planning

Authors: Hongge Wang, Hui Chai, Bin Chen, Aizhen Xie, Rui Song, Bo Su

Abstract: An intuitive control method for the flying trot, which combines offline trajectory planning with real-time balance control, is presented. The motion features of running animals in the vertical direction were analysed using the spring-load-inverted-pendulum (SLIP) model, and the foot trajectory of the robot was planned, so the robot could run similar to an animal capable of vertical flight, according to the given height and speed of the trunk. To improve the robustness of running, a posture control method based on a foot acceleration adjustment is proposed. A novel kinematics-based CoM observation method and a CoM regulation method are presented to enhance the stability of locomotion. To reduce the impact force when the robot interacts with the environment, the virtual model control method is used in the control of the foot trajectory to achieve active compliance. By selecting the proper parameters for the virtual model, the oscillation motion of the virtual model and the planning motion of the support foot are synchronized to avoid the large disturbance caused by the oscillation motion of the virtual model in relation to the robot motion. The simulation and experiment using the quadruped robot Billy are reported. In the experiment, the maximum speed of the robot could reach 4.73 times the body length per second, which verified the feasibility of the control method.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 30 pages, 20 figures, journal

arXiv:2210.10980 [pdf, ps, other]

Sieve Method and Prime Gaps via Probabilistic Method

Authors: Buxin Su

Abstract: Most prime gap results have been proven using tools from analytic or algebraic number theory in the last few centuries. In this paper, we would like to present a probabilistic way of proving many essential results. A major component of the proof is a probabilistic approach to the sieve method. In addition, we discuss their connections with recent work by Zhang and Maynard on small and large gaps in prime numbers.

Submitted 19 October, 2022; originally announced October 2022.

Comments: 15 pages, 0 figures

arXiv:2210.08564 [pdf, other]

doi 10.1109/TSP.2023.3291454

Waveform Design for Optimal PSL Under Spectral and Unimodular Constraints via Alternating Minimization

Authors: Chin-Wei Huang, Li-Fu Chen, Borching Su

Abstract: In an active sensing system, waveforms with good auto-correlations are preferred for accurate parameter estimation. Furthermore, spectral compatibility is required to avoid mutual interference between devices as the electromagnetic environment becomes increasingly crowded. Waveforms should also be unimodular due to hardware limits. In this paper, a new approach to generating a unimodular sequence with an approximately optimal peak side-lobe level (PSL) in auto-correlation and adjustable stopband attenuation is proposed. The proposed method is based on alternating minimization (AM) and numerical results suggest that it outperforms existing methods in terms of PSL. We also develop a theoretical lower bound for the PSL minimization problem under spectral constraints and unimodular constraints, which can be used for the evaluation of the results in various works about this waveform design problem. It is observed in the numerical results that the PSL of the proposed algorithm is close to the derived lower bound.
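
For readers unfamiliar with the metric named in the abstract above, the following sketch computes the peak side-lobe level of a sequence under one common definition: the largest magnitude of the aperiodic auto-correlation over all nonzero lags. The function names are chosen for this example, and the paper may normalize or define the PSL differently; the alternating-minimization design procedure itself is not shown.

```python
def autocorrelation(x, k):
    """Aperiodic auto-correlation of sequence x at lag k >= 0."""
    n = len(x)
    return sum(
        complex(x[i + k]) * complex(x[i]).conjugate() for i in range(n - k)
    )


def peak_sidelobe_level(x):
    """Largest auto-correlation magnitude over all nonzero lags."""
    return max(abs(autocorrelation(x, k)) for k in range(1, len(x)))


# The length-13 Barker code is a classic unimodular (+1/-1) sequence
# whose side-lobes never exceed magnitude 1.
barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
psl_barker = peak_sidelobe_level(barker13)

# A constant sequence has poor side-lobes: lag 1 sums n - 1 ones.
psl_constant = peak_sidelobe_level([1, 1, 1])
```

Lower values indicate a sharper auto-correlation peak, which is why waveform design methods of this kind minimize the PSL subject to the unimodular and spectral constraints.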

Submitted 17 May, 2023; v1 submitted 16 October, 2022; originally announced October 2022.

Comments: 13 pages, 7 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2209.07902 [pdf, other]

MetaMask: Revisiting Dimensional Confounder for Self-Supervised Learning

Authors: Jiangmeng Li, Wenwen Qiang, Yanan Zhang, Wenyi Mo, Changwen Zheng, Bing Su, Hui Xiong

Abstract: As a successful approach to self-supervised learning, contrastive learning aims to learn invariant information shared among distortions of the input sample. While contrastive learning has yielded continuous advancements in sampling strategy and architecture design, it still suffers from two persistent defects: the interference of task-irrelevant information and sample inefficiency, which are related to the recurring existence of trivial constant solutions. From the perspective of dimensional analysis, we find that the dimensional redundancy and dimensional confounder are the intrinsic issues behind the phenomena, and provide experimental evidence to support our viewpoint. We further propose a simple yet effective approach MetaMask, short for the dimensional Mask learned by Meta-learning, to learn representations against dimensional redundancy and confounder. MetaMask adopts the redundancy-reduction technique to tackle the dimensional redundancy issue and innovatively introduces a dimensional mask to reduce the gradient effects of specific dimensions containing the confounder, which is trained by employing a meta-learning paradigm with the objective of improving the performance of masked representations on a typical self-supervised task. We provide solid theoretical analyses to prove MetaMask can obtain tighter risk bounds for downstream classification compared to typical contrastive methods. Empirically, our method achieves state-of-the-art performance on various benchmarks.

Submitted 9 August, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

Comments: Accepted by NeurIPS 2022 as Spotlight

arXiv:2209.07811 [pdf, other]

doi 10.1109/TKDE.2022.3198746

Modeling Multiple Views via Implicitly Preserving Global Consistency and Local Complementarity

Authors: Jiangmeng Li, Wenwen Qiang, Changwen Zheng, Bing Su, Farid Razzak, Ji-Rong Wen, Hui Xiong

Abstract: While self-supervised learning techniques are often used to mine implicit knowledge from unlabeled data via modeling multiple views, it is unclear how to perform effective representation learning in a complex and inconsistent context. To this end, we propose a methodology, specifically consistency and complementarity network (CoCoNet), which avails of strict global inter-view consistency and local cross-view complementarity preserving regularization to comprehensively learn representations from multiple views. On the global stage, we reckon that the crucial knowledge is implicitly shared among views, and enhancing the encoder to capture such knowledge from data can improve the discriminability of the learned representations. Hence, preserving the global consistency of multiple views ensures the acquisition of common knowledge. CoCoNet aligns the probabilistic distribution of views by utilizing an efficient discrepancy metric measurement based on the generalized sliced Wasserstein distance. Lastly, on the local stage, we propose a heuristic complementarity-factor, which joins cross-view discriminative knowledge, and it guides the encoders to learn not only view-wise discriminability but also cross-view complementary information. Theoretically, we provide the information-theoretical-based analyses of our proposed CoCoNet. Empirically, to investigate the improvement gains of our approach, we conduct adequate experimental validations, which demonstrate that CoCoNet outperforms the state-of-the-art self-supervised methods by a significant margin and show that such implicit consistency and complementarity preserving regularization can enhance the discriminability of latent representations.

Submitted 9 August, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE) 2022; Refer to https://ieeexplore.ieee.org/document/9857632

arXiv:2209.06419 [pdf, other]

Frequency Reversal Alamouti Code-Based FBMC with Resilience to Inter-Antenna Frequency Offsets

Authors: Cheng-Yu Lin, Borching Su, Kwonhue Choi

Abstract: Transmit diversity schemes for filter bank multicarrier (FBMC) are known to be challenging. No existing schemes have considered the presence of inter-antenna frequency offset (IAFO), which will result in performance degradation. In this letter, a new transmit scheme based on the frequency reversal Alamouti code (FRAC)-based structure to address the issue of IAFO is proposed and is proven to inherently cancel the inter-antenna inter-carrier interference (ICI) while preserving spatial diversity. Moreover, the proposed FRAC structure is applicable in frequency-selective channels. Numerical results show that the proposed scheme undergoes negligible bit error rate (BER) degradation even with considerable IAFOs.

Submitted 14 September, 2022; originally announced September 2022.

arXiv:2209.05481 [pdf, other]

A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language

Authors: Bing Su, Dazhao Du, Zhao Yang, Yujie Zhou, Jiangmeng Li, Anyi Rao, Hao Sun, Zhiwu Lu, Ji-Rong Wen

Abstract: Although artificial intelligence (AI) has made significant progress in understanding molecules in a wide range of fields, existing models generally acquire the single cognitive ability from the single molecular modality. Since the hierarchy of molecular knowledge is profound, even humans learn from different modalities including both intuitive diagrams and professional texts to assist their understanding. Inspired by this, we propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data (crawled from published Scientific Citation Index papers) via contrastive learning. This AI model represents a critical attempt that directly bridges molecular graphs and natural language. Importantly, through capturing the specific and complementary information of the two modalities, our proposed model can better grasp molecular expertise. Experimental results show that our model not only exhibits promising performance in cross-modal tasks such as cross-modal retrieval and molecule caption, but also enhances molecular property prediction and possesses capability to generate meaningful molecular graphs from natural language descriptions. We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine, among others.

Submitted 11 September, 2022; originally announced September 2022.

arXiv:2208.08923 [pdf, other]

doi 10.1002/adom.202202639

Strong nonlinear optical response and transient symmetry switching in Type-II Weyl semimetal $β$-WP$_2$

Authors: Tianchen Hu, Bo Su, Liyu Shi, Zixiao Wang, Li Yue, Shuxiang Xu, Sijie Zhang, Qiaomei Liu, Qiong Wu, Rongsheng Li, Xinyu Zhou, Jiayu Yuan, Dong Wu, Zhiguo Chen, Tao Dong, Nanlin Wang

Abstract: The topological Weyl semimetals with peculiar band structure exhibit novel nonlinear optical enhancement phenomena even for light at optical wavelengths. While many intriguing nonlinear optical effects were constantly uncovered in type-I semimetals, few experimental works focused on basic nonlinear optical properties in type-II Weyl semimetals. Here we perform fundamental static and time-resolved second harmonic generation (SHG) measurements on the three-dimensional Type-II Weyl semimetal candidate $β$-WP$_2$. Although $β$-WP$_2$ exhibits extremely high conductivity and an extraordinarily large mean free path, the second harmonic generation is unscreened by conduction electrons, and we observe a rather strong SHG response compared to non-topological polar metals and archetypal ferroelectric insulators. Additionally, our time-resolved SHG experiment traces ultrafast symmetry switching and reveals that polar metal $β$-WP$_2$ tends to form an inversion-symmetric metastable state after photo-excitation. An intense femtosecond laser pulse could optically drive symmetry switching and tune the nonlinear optical response on ultrafast timescales although the interlayer coupling of $β$-WP$_2$ is very strong. Our work is illuminating for the polar metal nonlinear optics and potential ultrafast topological optoelectronic applications.

Submitted 18 August, 2022; originally announced August 2022.

Comments: 8 pages, 5 figures

Journal ref: Advanced Optical Materials 2023, 2202639

arXiv:2207.02454 [pdf, other]

Ordinal Regression via Binary Preference vs Simple Regression: Statistical and Experimental Perspectives

Authors: Bin Su, Shaoguang Mao, Frank Soong, Zhiyong Wu

Abstract: Ordinal regression with anchored reference samples (ORARS) has been proposed for predicting the subjective Mean Opinion Score (MOS) of input stimuli automatically. The ORARS addresses the MOS prediction problem by pairing a test sample with each of the pre-scored anchored reference samples. A trained binary classifier is then used to predict which sample, test or anchor, is better statistically. Posteriors of the binary preference decision are then used to predict the MOS of the test sample. In this paper, a rigorous framework, analysis, and experiments demonstrating that ORARS is advantageous over simple regression are presented. The contributions of this work are: 1) Show that traditional regression can be reformulated into multiple preference tests to yield a better performance, which is confirmed experimentally with simulations; 2) Generalize ORARS to other regression problems and verify its effectiveness; 3) Provide some prerequisite conditions which can ensure proper application of ORARS.

Submitted 6 July, 2022; originally announced July 2022.

Showing 1–50 of 140 results for author: Su, B