-
Deep learning for the design of non-Hermitian topolectrical circuits
Authors:
Xi Chen,
Jinyang Sun,
Xiumei Wang,
Hengxuan Jiang,
Dandan Zhu,
Xingping Zhou
Abstract:
Non-Hermitian topological phases can produce some remarkable properties compared with their Hermitian counterparts, such as the breakdown of conventional bulk-boundary correspondence and the non-Hermitian topological edge mode. Here, we introduce several deep learning algorithms based on the multi-layer perceptron (MLP) and the convolutional neural network (CNN) to predict the winding of eigenvalues of non-Hermitian Hamiltonians. Subsequently, we use the smallest module of the periodic circuit as one unit to construct high-dimensional circuit data features. Further, we use the Dense Convolutional Network (DenseNet), a type of convolutional neural network that utilizes dense connections between layers, to design a non-Hermitian topolectrical Chern circuit, as the DenseNet algorithm is more suitable for processing high-dimensional data. Our results demonstrate the effectiveness of the deep learning network in capturing the global topological characteristics of a non-Hermitian system based on training data.
Submitted 15 February, 2024;
originally announced February 2024.
-
VisLingInstruct: Elevating Zero-Shot Learning in Multi-Modal Language Models with Autonomous Instruction Optimization
Authors:
Dongsheng Zhu,
Xunzhu Tang,
Weidong Han,
Jinghui Lu,
Yukun Zhao,
Guoliang Xing,
Junfeng Wang,
Dawei Yin
Abstract:
This paper presents VisLingInstruct, a novel approach to advancing Multi-Modal Language Models (MMLMs) in zero-shot learning. Current MMLMs show impressive zero-shot abilities in multi-modal tasks, but their performance depends heavily on the quality of instructions. VisLingInstruct tackles this by autonomously evaluating and optimizing instructional texts through In-Context Learning, improving the synergy between visual perception and linguistic expression in MMLMs. Alongside this instructional advancement, we have also optimized the visual feature extraction modules in MMLMs, further augmenting their responsiveness to textual content. Our comprehensive experiments on MMLMs, based on FlanT5 and Vicuna, show that VisLingInstruct significantly improves zero-shot performance in visual multi-modal tasks. Notably, it achieves a 13.1% and 9% increase in accuracy over the prior state-of-the-art on the TextVQA and HatefulMemes datasets. Our main code is available at https://github.com/Zhudongsheng75/VisLingInstruct.
Submitted 20 June, 2024; v1 submitted 11 February, 2024;
originally announced February 2024.
-
Point cloud-based registration and image fusion between cardiac SPECT MPI and CTA
Authors:
Shaojie Tang,
Penpen Miao,
Xingyu Gao,
Yu Zhong,
Dantong Zhu,
Haixing Wen,
Zhihui Xu,
Qiuyue Wei,
Hongping Yao,
Xin Huang,
Rui Gao,
Chen Zhao,
Weihua Zhou
Abstract:
A method was proposed for the point cloud-based registration and image fusion between cardiac single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) and cardiac computed tomography angiograms (CTA). Firstly, the left ventricle (LV) epicardial regions (LVERs) in SPECT and CTA images were segmented by using different U-Net neural networks trained to generate the point clouds of the LV epicardial contours (LVECs). Secondly, according to the characteristics of cardiac anatomy, the special points of anterior and posterior interventricular grooves (APIGs) were manually marked in both SPECT and CTA image volumes. Thirdly, we developed an in-house program for coarsely registering the special points of APIGs to ensure a correct cardiac orientation alignment between SPECT and CTA images. Fourthly, we employed ICP, SICP or CPD algorithm to achieve a fine registration for the point clouds (together with the special points of APIGs) of the LV epicardial surfaces (LVERs) in SPECT and CTA images. Finally, the image fusion between SPECT and CTA was realized after the fine registration. The experimental results showed that the cardiac orientation was aligned well and the mean distance error of the optimal registration method (CPD with affine transform) was consistently less than 3 mm. The proposed method could effectively fuse the structures from cardiac CTA and SPECT functional images, and demonstrated a potential in assisting in accurate diagnosis of cardiac diseases by combining complementary advantages of the two imaging modalities.
Submitted 9 February, 2024;
originally announced February 2024.
-
Buffon-Laplace Needle Problem as a geometric probabilistic approach to filtration process
Authors:
Yan-Jie Min,
De-Quan Zhu,
Jin-Hua Zhao
Abstract:
Buffon-Laplace Needle Problem considers a needle of a length $l$ randomly dropped on a large plane distributed with vertically parallel lines with distances $a$ and $b$ ($a \geqslant b$), respectively. As a classical problem in stochastic probability, it serves as a mathematical basis of various physical literature, such as the efficiency of a filter and the emergence of clogging in filtration process. Yet its potential application is limited by previous focus on its original form of the `short' needle case of $l < b$ and its analytical difficulty in a general sense. Here, rather than a `short' two-dimensional needle, we analytically solve the problems with two- and three-dimensional needles and spherocylinders of arbitrary length and radius dropped on a grid with any rectangular shape. We further confirm our analytical theory with Monte Carlo simulation. Our framework here helps to provide a geometric analytical perspective to filtration process, and also extend the analytical power of the needle problem into unexplored parameter regions for physical problems involving stochastic processes.
Submitted 30 July, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
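As a rough illustration of the Monte Carlo check mentioned in the abstract above, the following sketch estimates the crossing probability for the classical short-needle case only ($l < b \leqslant a$) and compares it with the textbook Buffon-Laplace result $P = \frac{2l(a+b) - l^2}{\pi a b}$. The function names and parameter values are hypothetical, and this is not the authors' code; their general solution for arbitrary length and for spherocylinders is not reproduced here.

```python
import math
import random


def buffon_laplace_exact(l, a, b):
    """Classical short-needle crossing probability; valid only for l <= min(a, b)."""
    return (2 * l * (a + b) - l * l) / (math.pi * a * b)


def buffon_laplace_mc(l, a, b, trials=200_000, seed=0):
    """Estimate the probability that a needle of length l crosses at least one
    line of a rectangular grid with spacings a (along x) and b (along y)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Needle centre is uniform in one grid cell; orientation is uniform in [0, pi).
        cx = rng.uniform(0.0, a)
        cy = rng.uniform(0.0, b)
        theta = rng.uniform(0.0, math.pi)
        dx = 0.5 * l * math.cos(theta)
        dy = 0.5 * l * math.sin(theta)
        # The needle crosses a line if its two endpoints fall in different cells.
        crosses_x = math.floor((cx - dx) / a) != math.floor((cx + dx) / a)
        crosses_y = math.floor((cy - dy) / b) != math.floor((cy + dy) / b)
        if crosses_x or crosses_y:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    l, a, b = 0.5, 2.0, 1.0  # illustrative values with l < b <= a
    print("exact    :", buffon_laplace_exact(l, a, b))
    print("simulated:", buffon_laplace_mc(l, a, b))
```

The endpoint-in-different-cells test also counts crossings correctly for longer needles, but the closed-form comparison above holds only in the short-needle regime.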
-
Gradient Aligned Regression via Pairwise Losses
Authors:
Dixian Zhu,
Tianbao Yang,
Livnat Jerby
Abstract:
Regression is a fundamental task in machine learning that has garnered extensive attention over the past decades. The conventional approach to regression employs loss functions that primarily concentrate on aligning the model prediction with the ground truth for each individual data sample. Recent research has introduced novel perspectives by incorporating label similarity into regression, imposing extra pairwise regularization on the latent feature space, and has demonstrated its effectiveness. However, these approaches have two drawbacks: i) their pairwise operation in latent feature space is computationally more expensive than conventional regression losses; ii) they lack theoretical justification for such regularization. In this work, we propose GAR (Gradient Aligned Regression) as a competitive alternative method in label space, which consists of a conventional regression loss and two pairwise label difference losses for gradient alignment, covering magnitude and direction. GAR enjoys: i) the same level of efficiency as a conventional regression loss, because the quadratic complexity of the proposed pairwise losses can be reduced to linear complexity; ii) theoretical insights from learning the pairwise label difference to learning the gradient of the ground truth function. We limit our current scope to regression in the clean data setting, without noise, outliers, or distributional shifts. We demonstrate the effectiveness of the proposed method in practice on two synthetic datasets and on eight extensive real-world tasks from six benchmark datasets, against eight other competitive baselines. Running time experiments demonstrate the superior efficiency of the proposed GAR over existing methods with pairwise regularization in latent feature space, and ablation studies demonstrate the effectiveness of each component of GAR.
Submitted 22 May, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner
Authors:
Ying Zang,
Chenglong Fu,
Runlong Cao,
Didi Zhu,
Min Zhang,
Wenjun Hu,
Lanyun Zhu,
Tianrun Chen
Abstract:
Referring expression segmentation (RES), a task that involves localizing specific instance-level objects based on free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate understanding of both visual and textual contexts and often requires extensive training data. This paper introduces RESMatch, the first semi-supervised learning (SSL) approach for RES, aimed at reducing reliance on exhaustive data annotation. Extensive validation on multiple RES datasets demonstrates that RESMatch significantly outperforms baseline approaches, establishing a new state-of-the-art. Although existing SSL techniques are effective in image segmentation, we find that they fall short in RES. Facing the challenges including the comprehension of free-form linguistic descriptions and the variability in object attributes, RESMatch introduces a trifecta of adaptations: revised strong perturbation, text augmentation, and adjustments for pseudo-label quality and strong-weak supervision. This pioneering work lays the groundwork for future research in semi-supervised learning for referring expression segmentation.
Submitted 11 February, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Hidden domain boundary dynamics towards crystalline perfection
Authors:
A. Mangu,
V. A. Stoica,
H. Zheng,
T. Yang,
M. Zhang,
H. Wang,
Q. L. Nguyen,
S. Song,
S. Das,
P. Meisenheimer,
E. Donoway,
M. Chollet,
Y. Sun,
J. J. Turner,
J. W. Freeland,
H. Wen,
L. W. Martin,
L. -Q. Chen,
V. Gopalan,
D. Zhu,
Y. Cao,
A. M. Lindenberg
Abstract:
A central paradigm of non-equilibrium physics concerns the dynamics of heterogeneity and disorder, impacting processes ranging from the behavior of glasses to the emergent functionality of active matter. Understanding these complex mesoscopic systems requires probing the microscopic trajectories associated with irreversible processes, the role of fluctuations and entropy growth, and the timescales on which non-equilibrium responses are ultimately maintained. Approaches that illuminate these processes in model systems may enable a more general understanding of other heterogeneous non-equilibrium phenomena, and potentially define ultimate speed and energy cost limits for information processing technologies. Here, we apply ultrafast single shot x-ray photon correlation spectroscopy to resolve the non-equilibrium, heterogeneous, and irreversible mesoscale dynamics during a light-induced phase transition. This approach defines a new way of capturing the nucleation of the induced phase, the formation of transient mesoscale defects at the boundaries of the nuclei, and the eventual annihilation of these defects, even in systems with complex polarization topologies. A non-equilibrium response spanning >10 orders of magnitude in timescales is observed, with multistep behavior similar to the plateaus observed in supercooled liquids and glasses. We show how the observed time-dependent long-time correlations can be understood in terms of the stochastic dynamics of domain walls, encoded in effective waiting-time distributions with power-law tails. This work defines new possibilities for probing the non-equilibrium and correlated dynamics of disordered and heterogeneous media.
Submitted 21 March, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
RevOrder: A Novel Method for Enhanced Arithmetic in Language Models
Authors:
Si Shen,
Peijun Shen,
Danhao Zhu
Abstract:
This paper presents RevOrder, a novel technique aimed at improving arithmetic operations in large language models (LLMs) by reversing the output digits in addition, subtraction, and n-digit by 1-digit (nD by 1D) multiplication tasks. Our method significantly reduces the Count of Sequential Intermediate Digits (CSID) to $\mathcal{O}(1)$, a new metric we introduce to assess equation complexity. Through comprehensive testing, RevOrder not only achieves perfect accuracy in basic arithmetic operations but also substantially boosts LLM performance in division tasks, particularly with large numbers where traditional models struggle. Implementation of RevOrder is cost-effective for both training and inference phases. Moreover, applying RevOrder to fine-tune the LLaMA2-7B model on the GSM8K math task results in a considerable improvement, reducing equation calculation errors by 46% and increasing overall scores from 41.6 to 44.4.
Submitted 23 February, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Uncover the nature of overlapping community in cities
Authors:
Peng Luo,
Di Zhu
Abstract:
Urban spaces, though often perceived as discrete communities, are shared by various functional and social groups. Our study introduces a graph-based physics-aware deep learning framework, illuminating the intricate overlapping nature inherent in urban communities. Through analysis of individual mobile phone positioning data at Twin Cities metro area (TCMA) in Minnesota, USA, our findings reveal that 95.7 % of urban functional complexity stems from the overlapping structure of communities during weekdays. Significantly, our research not only quantifies these overlaps but also reveals their compelling correlations with income and racial indicators, unraveling the complex segregation patterns in U.S. cities. As the first to elucidate the overlapping nature of urban communities, this work offers a unique geospatial perspective on looking at urban structures, highlighting the nuanced interplay of socioeconomic dynamics within cities.
Submitted 31 January, 2024;
originally announced February 2024.
-
Measurements of Normalized Differential Cross Sections of Inclusive $η$ Production in $e^{+}e^{-}$ Annihilation at Energy from 2.0000 to 3.6710 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
D. Anderle,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko
, et al. (641 additional authors not shown)
Abstract:
Using data samples collected with the BESIII detector operating at the BEPCII storage ring, the cross section of the inclusive process $e^{+}e^{-} \to η+ X$, normalized by the total cross section of $e^{+}e^{-} \to \text{hadrons}$, is measured at eight center-of-mass energy points from 2.0000 GeV to 3.6710 GeV. These are the first measurements with momentum dependence in this energy region. Our measurement shows a significant discrepancy from calculations with the existing fragmentation functions. To address this discrepancy, a new QCD analysis is performed at the next-to-next-to-leading order with hadron mass corrections and higher twist effects, which can explain both the established high-energy data and our measurements reasonably well.
Submitted 15 July, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
A New Look at the Scalar Meson $f_0(500)$ via $D^+\to π^+π^-\ell^+ν_\ell$ Decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai,
X. Cai
, et al. (615 additional authors not shown)
Abstract:
Using $2.93~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy of 3.773 GeV, we investigate the semileptonic decays $D^+\to π^+π^- \ell^+ν_\ell$ ($\ell=e$ and $μ$). The $D^+\to f_0(500)μ^+ν_μ$ decay is observed for the first time. By analyzing simultaneously the differential decay rates of $D^+\to f_0(500) μ^+ν_μ$ and $D^+\to f_0(500) e^+ν_e$ in different $\ell^+ν_\ell$ four-momentum transfer intervals, the product of the relevant hadronic form factor $f^{f_0}_{+}(0)$ and the magnitude of the $c\to d$ Cabibbo-Kobayashi-Maskawa matrix element $|V_{cd}|$ is determined to be $f_{+}^{f_0} (0)|V_{cd}|=0.0787\pm0.0060_{\rm stat}\pm0.0033_{\rm syst}$ for the first time. With the input of $|V_{cd}|$ from the global fit in the standard model, we determine $f_{+}^{f_0} (0)=0.350\pm0.027_{\rm stat}\pm0.015_{\rm syst}$. The absolute branching fractions of $D^+\to f_0(500)_{(π^+π^-)}μ^+ν_μ$ and $D^+\to ρ^0_{(π^+π^-)} μ^+ν_μ$ are determined as $(0.72\pm0.13_{\rm stat}\pm0.10_{\rm syst})\times10^{-3}$ and $(1.64\pm0.13_{\rm stat}\pm0.11_{\rm syst})\times 10^{-3}$. Combining these results with those of previous BESIII measurements on their semielectronic counterparts from the same data sample, we test lepton flavor universality by measuring the branching fraction ratios ${\mathcal B}_{D^+\to ρ^0 μ^+ν_μ}/{\mathcal B}_{D^+\to ρ^0 e^+ν_e}=0.88\pm0.10$ and ${\mathcal B}_{D^+\to f_0(500) μ^+ν_μ}/{\mathcal B}_{D^+\to f_0(500) e^+ν_e}=1.14\pm0.28$, which are compatible with the standard model expectation.
Submitted 4 February, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
A Quantile Nelson-Siegel model
Authors:
Matteo Iacopini,
Aubrey Poon,
Luca Rossini,
Dan Zhu
Abstract:
A widespread approach to modelling the interaction between macroeconomic variables and the yield curve relies on three latent factors usually interpreted as the level, slope, and curvature (Diebold et al., 2006). This approach is inherently focused on the conditional mean of the yields and postulates a dynamic linear model where the latent factors smoothly change over time. However, periods of deep crisis, such as the Great Recession and the recent pandemic, have highlighted the importance of statistical models that account for asymmetric shocks and are able to forecast the tails of a variable's distribution. A new version of the dynamic three-factor model is proposed to address this issue based on quantile regressions. The novel approach leverages the potential of quantile regression to model the entire (conditional) distribution of the yields instead of restricting to its mean. An application to US data from the 1970s shows the significant heterogeneity of the interactions between financial and macroeconomic variables across different quantiles. Moreover, an out-of-sample forecasting exercise showcases the proposed method's advantages in predicting the yield distribution tails compared to the standard conditional mean model. Finally, by inspecting the posterior distribution of the three factors during the recent major crises, new evidence is found that supports the greater and longer-lasting negative impact of the great recession on the yields compared to the COVID-19 pandemic.
Submitted 18 January, 2024;
originally announced January 2024.
-
Measurement of Born cross section of $e^{+}e^{-}\rightarrowΣ^{+}\barΣ^{-}$ at center-of-mass energies between 3.510 and 4.951 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (632 additional authors not shown)
Abstract:
Using 24.1 fb$^{-1}$ of $e^{+}e^{-}$ collision data collected with the BESIII detector at the BEPCII collider, the Born cross sections and effective form factors of the $e^{+}e^{-}\rightarrowΣ^{+}\barΣ^{-}$ reaction are measured. The measurements are performed at center-of-mass energies ranging from 3.510 to 4.951 GeV. No significant evidence for the decay of the charmonium(-like) states, $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $Y(4230)$, $Y(4360)$, $ψ(4415)$, and $Y(4660)$, into a $Σ^{+}\barΣ^{-}$ final state is observed. Consequently, upper limits for the products of the branching fractions and the electronic partial widths at the 90% confidence level are reported for these decays.
Submitted 6 May, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
-
First study of antihyperon-nucleon scattering $\barΛp\rightarrow\barΛp$ and measurement of $Λp\rightarrowΛp$ cross section
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Using $(10.087\pm0.044)\times10^{9}$ $J/ψ$ events collected with the BESIII detector at the BEPCII storage ring, the processes $Λp\rightarrowΛp$ and $\barΛp\rightarrow\barΛp$ are studied, where the $Λ/\barΛ$ baryons are produced in the process $J/ψ\rightarrowΛ\barΛ$ and the protons are the hydrogen nuclei in the cooling oil of the beam pipe. Clear signals are observed for the two reactions. The cross sections in $-0.9\leq\rm{cos}θ_{Λ/\barΛ}\leq0.9$ are measured to be $σ(Λp\rightarrowΛp)=(12.2\pm1.6_{\rm{stat}}\pm1.1_{\rm{sys}})$ mb and $σ(\barΛ p\rightarrow\barΛ p)=(17.5\pm2.1_{\rm{stat}}\pm1.6_{\rm{sys}})$ mb at the $Λ/\barΛ$ momentum of $1.074$ GeV/$c$ within a range of $\pm0.017$ GeV/$c$, where the $θ_{Λ/\barΛ}$ are the scattering angles of the $Λ/\barΛ$ in the $Λp/\barΛp$ rest frames. Furthermore, the differential cross sections of the two reactions are also measured, where there is a slight tendency of forward scattering for $Λp\rightarrowΛp$, and a strong forward peak for $\barΛp\rightarrow\barΛp$. We present an approach to extract the total elastic cross sections by extrapolation. The study of $\barΛp\rightarrow\barΛp$ represents the first study of antihyperon-nucleon scattering, and these new measurements will serve as important inputs for the theoretical understanding of the (anti)hyperon-nucleon interaction.
Submitted 18 May, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Observation of $ψ(3686) \to Ω^- K^+ \barΞ^0 $+c.c
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (630 additional authors not shown)
Abstract:
Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected with the BESIII detector at BEPCII, the decay of $ψ(3686) \to Ω^- K^+ \barΞ^0 +c.c.$ is observed for the first time. The branching fraction of this decay is measured to be $\mathcal{B}_{ψ(3686) \to Ω^- K^+ \barΞ^0 +c.c.}=(2.78 \pm 0.40 \pm 0.18 ) \times 10^{-6}$, where the first uncertainty is statistical and the second is systematic. Possible baryon excited states are searched for in this decay, but no evident intermediate state is observed with the current sample size.
Submitted 15 April, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Photonic Modes Prediction via Multi-Modal Diffusion Model
Authors:
Jinyang Sun,
Xi Chen,
Xiumei Wang,
Dandan Zhu,
Xingping Zhou
Abstract:
The concept of photonic modes is a cornerstone of optics and photonics, describing the propagation of light. Maxwell's equations are used to calculate the mode field from the structure information, but this process requires a great deal of computation, especially when handling a three-dimensional model. To overcome this obstacle, we introduce the Multi-Modal Diffusion model to predict the photonic modes in a given structure. The Contrastive Language-Image Pre-training (CLIP) model is used to build the connections between photonic structures and the corresponding modes. We then use the Stable Diffusion (SD) model as an example to generate optical fields from structure information. Our work introduces Multi-Modal deep learning to construct a complex mapping between structural information and the light field as high-dimensional vectors, and generates light field images based on this mapping.
Submitted 22 February, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
First observation of the decay $Λ^+_c\to nK^{0}_{S}π^+π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (630 additional authors not shown)
Abstract:
Based on 4.5 fb$^{-1}$ of $e^{+}e^{-}$ collision data accumulated at center-of-mass energies between $4599.53$ MeV and $4698.82$ MeV with the BESIII detector, the decay $Λ_{c}^{+}\to nK_{S}^{0}π^+π^0$ is observed for the first time with a significance of $9.2σ$. The branching fraction is measured to be $(0.85\pm0.13\pm0.03)\%$, where the first uncertainty is statistical and the second systematic, which differs from the theoretical prediction based on isospin by 4.4$σ$. This indicates that there may be resonant contributions or some unknown dynamics in this decay.
Submitted 28 March, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
Current Effect-eliminated Optimal Target Assignment and Motion Planning for a Multi-UUV System
Authors:
Danjie Zhu,
Simon X. Yang
Abstract:
The paper presents an innovative approach (CBNNTAP) that addresses the complexities and challenges introduced by ocean currents when optimizing target assignment and motion planning for a multi-unmanned underwater vehicle (UUV) system. The core of the proposed algorithm involves the integration of several key components. Firstly, it incorporates a bio-inspired neural network-based (BINN) approach which predicts the most efficient paths for individual UUVs while simultaneously ensuring collision avoidance among the vehicles. Secondly, an efficient target assignment component is integrated by considering the path distances determined by the BINN algorithm. In addition, a critical innovation within the CBNNTAP algorithm is its capacity to address the disruptive effects of ocean currents, where an adjustment component is seamlessly integrated to counteract the deviations caused by these currents, which enhances the accuracy of both motion planning and target assignment for the UUVs. The effectiveness of the CBNNTAP algorithm is demonstrated through comprehensive simulation results and the outcomes underscore the superiority of the developed algorithm in nullifying the effects of static and dynamic ocean currents in 2D and 3D scenarios.
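The idea of assigning targets from path distances can be sketched as a small assignment problem. The example below is a minimal illustration and not the CBNNTAP algorithm: it assumes the path distances are already known, uses made-up numbers, and searches all permutations by brute force, which is only practical for a handful of vehicles. The name `assign_targets` is hypothetical.

```python
from itertools import permutations


def assign_targets(cost):
    """Return (assignment, total) minimizing the summed path distance.

    cost[i][j] is the path distance from vehicle i to target j.
    assignment[i] is the target index given to vehicle i.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


# Hypothetical path distances (arbitrary units), not taken from the paper.
path_distance = [
    [2.0, 1.0, 6.0],
    [5.0, 1.5, 7.0],
    [6.0, 4.0, 3.0],
]

assignment, total = assign_targets(path_distance)
print(assignment, total)
```

In this toy matrix, vehicles 0 and 1 are both closest to target 1, so letting each vehicle pick its nearest target in turn gives a worse total than the joint optimum.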
Submitted 10 January, 2024;
originally announced January 2024.
-
Search for a massless particle beyond the Standard Model in the $Σ^+\rightarrow p+{\rm invisible}$ decay
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere, et al. (634 additional authors not shown)
Abstract:
A massless particle beyond the Standard Model is searched for in the two-body decay $Σ^+\rightarrow p+{\rm invisible}$ using $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected at a center-of-mass energy of $\sqrt{s}=3.097$ GeV with the BESIII detector at the BEPCII collider. No significant signal is observed, and the upper limit on the branching fraction $B(Σ^+\rightarrow p+{\rm invisible})$ is determined to be $3.2\times10^{-5}$ at the 90% confidence level. This is the first search for a flavor-changing neutral current process with missing energy in hyperon decays, which plays an important role in constraining new physics models.
Submitted 5 April, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Hard X-ray Generation and Detection of Nanometer-Scale Localized Coherent Acoustic Wave Packets in SrTiO$_3$ and KTaO$_3$
Authors:
Yijing Huang,
Peihao Sun,
Samuel W. Teitelbaum,
Haoyuan Li,
Yanwen Sun,
Nan Wang,
Sanghoon Song,
Takahiro Sato,
Matthieu Chollet,
Taito Osaka,
Ichiro Inoue,
Ryan A. Duncan,
Hyun D. Shin,
Johann Haber,
Jinjian Zhou,
Marco Bernardi,
Mingqiang Gu,
James M. Rondinelli,
Mariano Trigo,
Makina Yabashi,
Alexei A. Maznev,
Keith A. Nelson,
Diling Zhu,
David A. Reis
Abstract:
Using an all x-ray pump and probe scattering experiment, we demonstrate that the absorption of femtosecond x-ray pulses can excite quasi-spherical high-wavevector coherent acoustic phonon wavepackets. The time- and momentum-resolved diffuse scattering signal is consistent with strain pulses induced by the rapid electron cascade dynamics following photoionization at uncorrelated excitation centers. We quantify key parameters of this process, including the localization size of the strain wavepacket and the energy absorption efficiency, which are determined by the photoelectron and Auger electron cascade dynamics, as well as the electron-phonon interaction. In particular, we obtain the localization size of the observed strain wavepacket to be 1.5 and 2.5 nm for bulk SrTiO$_3$ and KTaO$_3$ single crystals, respectively, even though there are no nanoscale structures or light-intensity patterns that would ordinarily be required to generate acoustic waves of wavelengths much shorter than the penetration depth. In GaAs and GaP, by contrast, we do not observe a signal above background. The results provide crucial information on x-ray matter interactions, shedding light on the mechanism of x-ray energy deposition and on the study of high-wavevector acoustic phonons and thermal transport at the nanoscale.
Submitted 2 January, 2024; v1 submitted 27 December, 2023;
originally announced December 2023.
-
Observation of $χ_{cJ}\to 3(K^+K^-)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere, et al. (632 additional authors not shown)
Abstract:
By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decay processes $χ_{cJ} \to 3(K^+K^-)$ ($J=0,1,2$) are observed for the first time with statistical significances of 8.2$σ$, 8.1$σ$, and 12.4$σ$, respectively. The product branching fractions of $ψ(3686)\toγχ_{cJ}$, $χ_{cJ}\to 3(K^+K^-)$ are presented and the branching fractions of $χ_{cJ}\to 3(K^+K^-)$ decays are determined to be
$\mathcal{B}_{χ_{c0}\to 3(K^+K^-)}=(10.7\pm1.8\pm1.1)\times10^{-6}$,
$\mathcal{B}_{χ_{c1}\to 3(K^+K^-)}=(4.2\pm0.9\pm0.5)\times10^{-6}$, and
$\mathcal{B}_{χ_{c2}\to 3(K^+K^-)}=(7.2\pm1.1\pm0.8)\times10^{-6}$,
where the first uncertainties are statistical and the second are systematic.
Submitted 26 December, 2023;
originally announced December 2023.
-
MFABA: A More Faithful and Accelerated Boundary-based Attribution Method for Deep Neural Networks
Authors:
Zhiyu Zhu,
Huaming Chen,
Jiayu Zhang,
Xinyi Wang,
Zhibo Jin,
Minhui Xue,
Dongxiao Zhu,
Kim-Kwang Raymond Choo
Abstract:
To better understand the output of deep neural networks (DNNs), attribution-based methods have been an important approach for model interpretability; they assign a score to each input dimension to indicate its importance towards the model outcome. Notably, attribution methods use the axioms of sensitivity and implementation invariance to ensure the validity and reliability of attribution results. Yet existing attribution methods present challenges for effective interpretation and efficient computation. In this work, we introduce MFABA, an attribution algorithm that adheres to these axioms, as a novel method for interpreting DNNs. Additionally, we provide a theoretical proof and in-depth analysis of the MFABA algorithm, and conduct a large-scale experiment. The results demonstrate its superiority: it runs over 101.5142 times faster than state-of-the-art attribution algorithms. The effectiveness of MFABA is thoroughly evaluated through statistical analysis in comparison to other methods, and the full implementation package is open-source at: https://github.com/LMBTough/MFABA
Submitted 21 December, 2023;
originally announced December 2023.
-
Joint Trading and Scheduling among Coupled Carbon-Electricity-Heat-Gas Industrial Clusters
Authors:
Dafeng Zhu,
Bo Yang,
Yu Wu,
Haoran Deng,
Zhaoyang Dong,
Kai Ma,
Xinping Guan
Abstract:
This paper presents a carbon-energy coupling management framework for an industrial park, where a carbon flow model accompanying multi-energy flows is adopted to track and suppress carbon emissions on the user side. To deal with the quadratic constraint of gas flows, a bound tightening algorithm for constraint relaxation is adopted. The synergies among carbon capture, energy storage, and power-to-gas further consume renewable energy and reduce carbon emissions. To address carbon emission disparities and supply-demand imbalances, this paper proposes a carbon trading ladder reward and punishment mechanism and an energy trading and scheduling method based on Lyapunov optimization and a matching game, which maximize the long-term benefits of each industrial cluster without prior knowledge of the random variables. Case studies show that the proposed trading method can reduce overall costs and carbon emissions while relieving energy pressure, which is important for Environmental, Social and Governance (ESG).
Submitted 20 December, 2023;
originally announced December 2023.
-
PointeNet: A Lightweight Framework for Effective and Efficient Point Cloud Analysis
Authors:
Lipeng Gu,
Xuefeng Yan,
Liangliang Nan,
Dingkun Zhu,
Honghua Chen,
Weiming Wang,
Mingqiang Wei
Abstract:
Current methodologies in point cloud analysis predominantly explore 3D geometries, often achieved through the introduction of intricate learnable geometric extractors in the encoder or by deepening networks with repeated blocks. However, these approaches inevitably lead to a significant number of learnable parameters, resulting in substantial computational costs and imposing memory burdens on CPU/GPU. Additionally, the existing strategies are primarily tailored for object-level point cloud classification and segmentation tasks, with limited extensions to crucial scene-level applications, such as autonomous driving. In response to these limitations, we introduce PointeNet, an efficient network designed specifically for point cloud analysis. PointeNet distinguishes itself with its lightweight architecture, low training cost, and plug-and-play capability, effectively capturing representative features. The network consists of a Multivariate Geometric Encoding (MGE) module and an optional Distance-aware Semantic Enhancement (DSE) module. The MGE module employs sampling, grouping, and multivariate geometric aggregation to capture and adaptively aggregate multivariate geometric features in a lightweight manner, providing a comprehensive depiction of 3D geometries. The DSE module, designed for real-world autonomous driving scenarios, enhances the semantic perception of point clouds, particularly for distant points. Our method demonstrates flexibility by seamlessly integrating with a classification/segmentation head or embedding into off-the-shelf 3D object detection networks, achieving notable performance improvements at a minimal cost. Extensive experiments on object-level datasets, including ModelNet40, ScanObjectNN, ShapeNetPart, and the scene-level dataset KITTI, demonstrate the superior performance of PointeNet over state-of-the-art methods in point cloud analysis.
Submitted 19 December, 2023;
originally announced December 2023.
-
GraphScope Flex: LEGO-like Graph Computing Stack
Authors:
Tao He,
Shuxian Hu,
Longbin Lai,
Dongze Li,
Neng Li,
Xue Li,
Lexiao Liu,
Xiaojian Luo,
Binqing Lyu,
Ke Meng,
Sijie Shen,
Li Su,
Lei Wang,
Jingbo Xu,
Wenyuan Yu,
Weibin Zeng,
Lei Zhang,
Siyuan Zhang,
Jingren Zhou,
Xiaoli Zhou,
Diwen Zhu
Abstract:
Graph computing has become increasingly crucial in processing large-scale graph data, with numerous systems developed for this purpose. Two years ago, we introduced GraphScope as a system addressing a wide array of graph computing needs, including graph traversal, analytics, and learning in one system. Since its inception, GraphScope has achieved significant technological advancements and gained widespread adoption across various industries. However, one key lesson from this journey has been understanding the limitations of a "one-size-fits-all" approach, especially when dealing with the diversity of programming interfaces, applications, and data storage formats in graph computing. In response to these challenges, we present GraphScope Flex, the next iteration of GraphScope. GraphScope Flex is designed to be both resource-efficient and cost-effective, while also providing flexibility and user-friendliness through its LEGO-like modularity. This paper explores the architectural innovations and fundamental design principles of GraphScope Flex, all of which are direct outcomes of the lessons learned during our ongoing development process. We validate the adaptability and efficiency of GraphScope Flex with extensive evaluations on synthetic and real-world datasets. The results show that GraphScope Flex achieves 2.4X throughput and up to 55.7X speedup over other systems on the LDBC Social Network and Graphalytics benchmarks, respectively. Furthermore, GraphScope Flex accomplishes up to a 2,400X performance gain in real-world applications, demonstrating its proficiency across a wide range of graph computing scenarios with increased effectiveness.
Submitted 19 December, 2023;
originally announced December 2023.
-
Enhancing Data Lakes with GraphAr: Efficient Graph Data Management with a Specialized Storage Scheme
Authors:
Xue Li,
Weibin Zeng,
Zhibin Wang,
Diwen Zhu,
Jingbo Xu,
Wenyuan Yu,
Jingren Zhou
Abstract:
Data lakes, increasingly adopted for their ability to store and analyze diverse types of data, commonly use columnar storage formats like Parquet and ORC for handling relational tables. However, these traditional setups fall short when it comes to efficiently managing graph data, particularly those conforming to the Labeled Property Graph (LPG) model. To address this gap, this paper introduces GraphAr, a specialized storage scheme designed to enhance existing data lakes for efficient graph data management. Leveraging the strengths of Parquet, GraphAr captures LPG semantics precisely and facilitates graph-specific operations such as neighbor retrieval and label filtering. Through innovative data organization, encoding, and decoding techniques, GraphAr dramatically improves performance. Our evaluations reveal that GraphAr outperforms conventional Parquet and Acero-based methods, achieving an average speedup of $3283\times$ for neighbor retrieval, $6.0\times$ for label filtering, and $29.5\times$ for end-to-end workloads. These findings highlight GraphAr's potential to extend the utility of data lakes by enabling efficient graph data management.
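To make the idea of a graph-specific operation over columnar data concrete, the sketch below shows neighbor retrieval using an offsets array over an edge table sorted by source vertex (a CSR-style layout). This is an assumption made for illustration and is not necessarily GraphAr's actual storage scheme; the names `build_offsets` and `neighbors` and the edge data are hypothetical.

```python
def build_offsets(sources, num_vertices):
    """Build CSR-style offsets for an edge list sorted by source vertex ID.

    After construction, the edges of vertex v occupy the half-open range
    [offsets[v], offsets[v + 1]) in the destination column.
    """
    offsets = [0] * (num_vertices + 1)
    for src in sources:
        offsets[src + 1] += 1          # count outgoing edges per vertex
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]   # prefix sum turns counts into offsets
    return offsets


def neighbors(vertex, offsets, destinations):
    """Return the out-neighbors of a vertex with one contiguous slice."""
    return destinations[offsets[vertex]:offsets[vertex + 1]]


# Hypothetical edge table with two columns, sorted by source vertex ID.
sources = [0, 0, 1, 3, 3, 3]
destinations = [1, 2, 2, 0, 1, 2]

offsets = build_offsets(sources, num_vertices=4)
print(offsets)
print(neighbors(3, offsets, destinations))
```

With such a layout, retrieving the neighbors of one vertex reads a single contiguous range instead of scanning the whole edge table, which is the kind of access pattern a plain relational scan does not provide.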
Submitted 21 December, 2023; v1 submitted 15 December, 2023;
originally announced December 2023.
-
Directly observing atomic-scale relaxations of a glass forming liquid using femtosecond X-ray photon correlation spectroscopy
Authors:
Tomoki Fujita,
Yanwen Sun,
Haoyuan Li,
Thies J. Albert,
Sanghoon Song,
Takahiro Sato,
Jens Moesgaard,
Antoine Cornet,
Peihao Sun,
Ying Chen,
Mianzhen Mo,
Narges Amini,
Fan Yang,
Arune Makareviciute,
Garrett Coleman,
Pierre Lucas,
Jan Peter Embs,
Vincent Esposito,
Joan Vila-Comamala,
Nan Wang,
Talgat Mamyrbayev,
Christian David,
Jerome Hastings,
Beatrice Ruta,
Paul Fuoss, et al. (3 additional authors not shown)
Abstract:
Glass-forming liquids exhibit structural relaxation behaviors, reflecting underlying atomic rearrangements on a wide range of timescales. These behaviors play a crucial role in determining many material properties. However, the relaxation processes on the atomic scale are not well understood due to the experimental difficulties in directly characterizing the evolving correlations of atomic order in disordered systems. Here, taking the model system Ge15Te85, we demonstrate an experimental approach that probes the relaxation dynamics by scattering coherent X-ray pulses of femtosecond duration produced by X-ray free electron lasers (XFELs). By collecting the summed speckle patterns from two rapidly successive, nearly identical X-ray pulses generated using a split-delay system, we can extract the contrast decay of speckle patterns originating from sample dynamics and observe the full decorrelation of local order on the sub-picosecond timescale. This provides direct atomic-level evidence of the fragile liquid behavior of Ge15Te85. Our results demonstrate a strategy for XFEL-based X-ray photon correlation spectroscopy (XPCS), attaining femtosecond temporal and atomic-scale spatial resolutions. This twelve-orders-of-magnitude extension from the millisecond regime of synchrotron-based XPCS opens a new avenue of experimental studies of relaxation dynamics in liquids, glasses, and other highly disordered systems.
Submitted 8 June, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Holistic Evaluation of GPT-4V for Biomedical Imaging
Authors:
Zhengliang Liu,
Hanqi Jiang,
Tianyang Zhong,
Zihao Wu,
Chong Ma,
Yiwei Li,
Xiaowei Yu,
Yutong Zhang,
Yi Pan,
Peng Shu,
Yanjun Lyu,
Lu Zhang,
Junjie Yao,
Peixin Dong,
Chao Cao,
Zhenxiang Xiao,
Jiaqi Wang,
Huan Zhao,
Shaochen Xu,
Yaonai Wei,
Jingyuan Chen,
Haixing Dai,
Peilong Wang,
Hao He,
Zewei Wang, et al. (25 additional authors not shown)
Abstract:
In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and more. Tasks include modality recognition, anatomy localization, disease diagnosis, report generation, and lesion detection. The extensive experiments provide insights into GPT-4V's strengths and weaknesses. Results show GPT-4V's proficiency in modality and anatomy recognition but difficulty with disease diagnosis and localization. GPT-4V excels at diagnostic report generation, indicating strong image captioning skills. While promising for biomedical imaging AI, GPT-4V requires further enhancement and validation before clinical deployment. We emphasize responsible development and testing for trustworthy integration of biomedical AGI. This rigorous evaluation of GPT-4V on diverse medical images advances understanding of multimodal large language models (LLMs) and guides future work toward impactful healthcare applications.
Submitted 10 November, 2023;
originally announced December 2023.
-
Uncovering Gender Stereotypes in Video Game Character Designs: A Multi-Modal Analysis of Honor of Kings
Authors:
Bingqing Liu,
Kyrie Zhixuan Zhou,
Danlei Zhu,
Jaihyun Park
Abstract:
In this paper, we conduct a comprehensive analysis of gender stereotypes in the character design of Honor of Kings, a popular multiplayer online battle arena (MOBA) game in China. We probe gender stereotypes through the lens of role assignments, visual designs, spoken lines, and background stories, combining qualitative analysis and text mining based on the moral foundation theory. Male heroes are commonly designed as masculine fighters with power and female heroes as feminine "ornaments" with ideal looks. We contribute with a culture-aware and multi-modal understanding of gender stereotypes in games, leveraging text-, visual-, and role-based evidence.
Submitted 23 November, 2023;
originally announced November 2023.
-
FedDRO: Federated Compositional Optimization for Distributionally Robust Learning
Authors:
Prashant Khanduri,
Chengyin Li,
Rafi Ibn Sultan,
Yao Qiang,
Joerg Kliewer,
Dongxiao Zhu
Abstract:
Recently, compositional optimization (CO) has gained popularity because of its applications in distributionally robust optimization (DRO) and many other machine learning problems. Large-scale and distributed availability of data demands the development of efficient federated learning (FL) algorithms for solving CO problems. Developing FL algorithms for CO is particularly challenging because of the compositional nature of the objective. Moreover, current state-of-the-art methods to solve such problems rely on large batch gradients (depending on the solution accuracy) not feasible for most practical settings. To address these challenges, in this work, we propose efficient FedAvg-type algorithms for solving non-convex CO in the FL setting. We first establish that vanilla FedAvg is not suitable to solve distributed CO problems because of the data heterogeneity in the compositional objective at each client which leads to the amplification of bias in the local compositional gradient estimates. To this end, we propose a novel FL framework FedDRO that utilizes the DRO problem structure to design a communication strategy that allows FedAvg to control the bias in the estimation of the compositional gradient. A key novelty of our work is to develop solution accuracy-independent algorithms that do not require large batch gradients (and function evaluations) for solving federated CO problems. We establish $\mathcal{O}(ε^{-2})$ sample and $\mathcal{O}(ε^{-3/2})$ communication complexity in the FL setting while achieving linear speedup with the number of clients. We corroborate our theoretical findings with empirical studies on large-scale DRO problems.
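The bias that arises from the compositional structure can be seen in a toy case. The sketch below is not the FedDRO algorithm and shows the bias in the objective value rather than in the gradient: it assumes an outer function f(u) = u² and an inner quantity that takes the values -1 and +1 with equal probability, then compares f applied to a mini-batch mean with f applied to the true mean. All names and values are illustrative.

```python
from itertools import product


def f(u):
    """Outer function of the toy compositional objective f(E[g])."""
    return u * u


# Assumption: the inner quantity g takes these values with equal probability.
SAMPLES = (-1.0, 1.0)


def expected_plugin_estimate(batch_size):
    """Exact E[f(batch mean)], enumerating every equally likely batch."""
    batches = list(product(SAMPLES, repeat=batch_size))
    return sum(f(sum(b) / batch_size) for b in batches) / len(batches)


# True compositional value: f applied to the exact mean of g.
true_value = f(sum(SAMPLES) / len(SAMPLES))

# Bias of the plug-in estimate for several batch sizes.
biases = {b: expected_plugin_estimate(b) - true_value for b in (1, 2, 4, 8)}
for batch_size, bias in biases.items():
    print(batch_size, bias)
```

In this toy case the bias equals the variance of g divided by the batch size, so it only vanishes as batches grow, which illustrates why methods that avoid large batches need some other way to control it.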
Submitted 21 November, 2023;
originally announced November 2023.
-
Deep Calibration of Market Simulations using Neural Density Estimators and Embedding Networks
Authors:
Namid R. Stillman,
Rory Baggott,
Justin Lyon,
Jianfei Zhang,
Dingqiu Zhu,
Tao Chen,
Perukrishnen Vytelingum
Abstract:
The ability to construct a realistic simulator of financial exchanges, including reproducing the dynamics of the limit order book, can give insight into many counterfactual scenarios, such as a flash crash, a margin call, or changes in macroeconomic outlook. In recent years, agent-based models have been developed that reproduce many features of an exchange, as summarised by a set of stylised facts and statistics. However, the ability to calibrate simulators to a specific period of trading remains an open challenge. In this work, we develop a novel approach to the calibration of market simulators by leveraging recent advances in deep learning, specifically using neural density estimators and embedding networks. We demonstrate that our approach is able to correctly identify high probability parameter sets, both when applied to synthetic and historical data, and without reliance on manually selected or weighted ensembles of stylised facts.
Submitted 27 November, 2023; v1 submitted 20 November, 2023;
originally announced November 2023.
-
GeoSAM: Fine-tuning SAM with Sparse and Dense Visual Prompting for Automated Segmentation of Mobility Infrastructure
Authors:
Rafi Ibn Sultan,
Chengyin Li,
Hui Zhu,
Prashant Khanduri,
Marco Brocanelli,
Dongxiao Zhu
Abstract:
The Segment Anything Model (SAM) has shown impressive performance when applied to natural image segmentation. However, it struggles with geographical images like aerial and satellite imagery, especially when segmenting mobility infrastructure including roads, sidewalks, and crosswalks. This inferior performance stems from the narrow features of these objects, their textures blending into the surroundings, and interference from objects like trees, buildings, vehicles, and pedestrians - all of which can disorient the model to produce inaccurate segmentation maps. To address these challenges, we propose Geographical SAM (GeoSAM), a novel SAM-based framework that implements a fine-tuning strategy using the dense visual prompt from zero-shot learning, and the sparse visual prompt from a pre-trained CNN segmentation model. The proposed GeoSAM outperforms existing approaches for geographical image segmentation, specifically by 26%, 7%, and 17% for road infrastructure, pedestrian infrastructure, and on average, respectively, representing a momentous leap in leveraging foundation models to segment mobility infrastructure including both road and pedestrian infrastructure in geographical images. The source code can be found on this GitHub repository: https://github.com/rafiibnsultan/GeoSAM/tree/main.
Submitted 30 January, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
Current manipulation of Giant tunneling altermagnetic resistance in collinear Antiferromagnetic RuO2/MgO/RuO2 sandwich structure
Authors:
Shijie Xu,
Yan Huang,
Farzad Mahfouzi,
Zhizhong Zhang,
Houyi Cheng,
Bingqian Dai,
Jinwoong Kim,
Wenlong Cai,
Kewen Shi,
Daoqian Zhu,
Zongxia Guo,
Caihua Cao,
Kun Zhang,
Albert Fert,
Yue Zhang,
Kang L. Wang,
Nicholas Kioussis,
Weisheng Zhao
Abstract:
As an emerging non-volatile memory technology, magnetic random access memory (MRAM) has key features and advantages including non-volatility, high speed, endurance, low power consumption and radiation tolerance. Conventional MRAM utilizes magnetic tunnel junctions (MTJs), which consist of two ferromagnetic layers separated by an insulating tunnel barrier. The orientation of the magnetic layers represents the binary data (0 or 1), and the electrical resistance changes depending on the relative orientation of these magnetic layers. Despite these advancements, the quest for a faster, more stable MRAM paradigm persists. To this end, we present room-temperature antiferromagnetic tunnel junctions devoid of any net magnetic moment. A tunneling altermagnetic resistance (TAR) ratio of over 200% was measured in a RuO2 (110)/MgO/RuO2 (110)/W structure, achieved by changing the antiferromagnetic Néel vector of RuO2 with an ultralow current density of 2 MA cm$^{-2}$.
Submitted 24 November, 2023; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Hijacking Large Language Models via Adversarial In-Context Learning
Authors:
Yao Qiang,
Xiangyu Zhou,
Dongxiao Zhu
Abstract:
In-context learning (ICL) has emerged as a powerful paradigm leveraging LLMs for specific downstream tasks by utilizing labeled examples as demonstrations (demos) in the precondition prompts. Despite its promising performance, ICL suffers from instability with the choice and arrangement of examples. Additionally, crafted adversarial attacks pose a notable threat to the robustness of ICL. However, existing attacks are either easy to detect, rely on external models, or lack specificity towards ICL. This work introduces a novel transferable attack against ICL to address these issues, aiming to hijack LLMs to generate the target response or jailbreak. Our hijacking attack leverages a gradient-based prompt search method to learn and append imperceptible adversarial suffixes to the in-context demos without directly contaminating the user queries. Comprehensive experimental results across different generation and jailbreaking tasks highlight the effectiveness of our hijacking attack, resulting in distracted attention towards adversarial tokens and consequently leading to unwanted target outputs. We also propose a defense strategy against hijacking attacks through the use of extra clean demos, which enhances the robustness of LLMs during ICL. Broadly, this work reveals the significant security vulnerabilities of LLMs and emphasizes the necessity for in-depth studies on their robustness.
Submitted 15 June, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
A Deep Reinforcement Learning Approach to Efficient Distributed Optimization
Authors:
Daokuan Zhu,
Tianqi Xu,
Jie Lu
Abstract:
In distributed optimization, the practical problem-solving performance is essentially sensitive to algorithm selection, parameter setting, problem type and data pattern. Thus, it is often laborious to acquire a highly efficient method for a given specific problem. In this paper, we propose a learning-based method to achieve efficient distributed optimization over networked systems. Specifically, a deep reinforcement learning (DRL) framework is developed for adaptive configuration within a parameterized unifying algorithmic form, which incorporates an abundance of decentralized first-order and second-order optimization algorithms. We exploit the local consensus and objective information to represent the regularities of problem instances and trace the solving progress, which constitute the states observed by a DRL agent. The framework is trained using Proximal Policy Optimization (PPO) on a number of practical problem instances of similar structures yet different problem data. Experiments on various smooth and non-smooth classes of objective functions demonstrate that our proposed learning-based method outperforms several state-of-the-art distributed optimization algorithms in terms of convergence speed and solution accuracy.
Submitted 3 January, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Testing learning-enabled cyber-physical systems with Large-Language Models: A Formal Approach
Authors:
Xi Zheng,
Aloysius K. Mok,
Ruzica Piskac,
Yong Jae Lee,
Bhaskar Krishnamachari,
Dakai Zhu,
Oleg Sokolsky,
Insup Lee
Abstract:
The integration of machine learning (ML) into cyber-physical systems (CPS) offers significant benefits, including enhanced efficiency, predictive capabilities, real-time responsiveness, and the enabling of autonomous operations. This convergence has accelerated the development and deployment of a range of real-world applications, such as autonomous vehicles, delivery drones, service robots, and telemedicine procedures. However, the software development life cycle (SDLC) for AI-infused CPS diverges significantly from traditional approaches, featuring data and learning as two critical components. Existing verification and validation techniques are often inadequate for these new paradigms. In this study, we pinpoint the main challenges in ensuring formal safety for learning-enabled CPS. We begin by examining testing as the most pragmatic method for verification and validation, summarizing the current state-of-the-art methodologies. Recognizing the limitations of current testing approaches in providing formal safety guarantees, we propose a roadmap to transition from foundational probabilistic testing to a more rigorous approach capable of delivering formal assurance.
Submitted 16 May, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Discordance Minimization-based Imputation Algorithms for Missing Values in Rating Data
Authors:
Young Woong Park,
Jinhak Kim,
Dan Zhu
Abstract:
Ratings are frequently used to evaluate and compare subjects in various applications, from education to healthcare, because ratings provide succinct yet credible measures for comparing subjects. However, when multiple rating lists are combined or considered together, subjects often have missing ratings, because most rating lists do not rate every subject in the combined list. In this study, we propose analyses on missing value patterns using six real-world data sets in various applications, as well as the conditions for applicability of imputation algorithms. Based on the special structures and properties derived from the analyses, we propose optimization models and algorithms that minimize the total rating discordance across rating providers to impute missing ratings in the combined rating lists, using only the known rating information. The total rating discordance is defined as the sum of the pairwise discordance metric, which can be written as a quadratic function. Computational experiments based on real-world and synthetic rating data sets show that the proposed methods outperform the state-of-the-art general imputation methods in the literature in terms of imputation accuracy.
Submitted 7 November, 2023;
originally announced November 2023.
-
Spin-flop magnetoresistance in a collinear antiferromagnetic tunnel junction
Authors:
Shijie Xu,
Zhizhong Zhang,
Farzad Mahfouzi,
Yan Huang,
Houyi Cheng,
Bingqian Dai,
Wenlong Cai,
Kewen Shi,
Daoqian Zhu,
Zongxia Guo,
Caihua Cao,
Yongshan Liu,
Albert Fert,
Nicholas Kioussis,
Kang L. Wang,
Yue Zhang,
Weisheng Zhao
Abstract:
Collinear antiferromagnetic (AFM) materials hold unique promise because they produce no stray fields, display ultrafast dynamics, and are robust against perturbation fields, which motivates extensive research on antiferromagnetic spintronics. However, the manipulation and detection of antiferromagnetic order remain formidable challenges. Here, we report the electrical detection of collinear antiferromagnetism in all-epitaxial RuO2/MgO/RuO2 three-terminal tunnel junctions (TJ) using spin-flop tunneling anisotropic magnetoresistance (TAMR). We measured a TAMR ratio of around 60% at room temperature, which arises between the parallel and perpendicular configurations of the adjacent collinear AFM state. Furthermore, we carried out angular-dependent measurements using this AFM-TJ and showed that the magnitude of anisotropic longitudinal magnetoresistance in the AFM-TJ can be controlled by the direction of the magnetic field. We also theoretically found that the collinear antiferromagnetic MTJ may produce a substantially large TAMR ratio as a result of the time-reversal, strong spin-orbit coupling (SOC) characteristic of antiferromagnetic RuO2. Our work not only propels antiferromagnetic materials to the forefront of spintronic device innovation but also unveils a novel paradigm for electrically governed antiferromagnetic spintronics, auguring transformative advancements in high-speed, low-energy information devices.
Submitted 4 November, 2023;
originally announced November 2023.
-
Coherent control of a superconducting qubit using light
Authors:
Hana K. Warner,
Jeffrey Holzgrafe,
Beatriz Yankelevich,
David Barton,
Stefano Poletto,
C. J. Xin,
Neil Sinclair,
Di Zhu,
Eyob Sete,
Brandon Langley,
Emma Batson,
Marco Colangelo,
Amirhassan Shams-Ansari,
Graham Joe,
Karl K. Berggren,
Liang Jiang,
Matthew Reagor,
Marko Loncar
Abstract:
Quantum science and technology promise the realization of a powerful computational resource that relies on a network of quantum processors connected with low loss and low noise communication channels capable of distributing entangled states [1,2]. While superconducting microwave qubits (3-8 GHz) operating in cryogenic environments have emerged as promising candidates for quantum processor nodes due to their strong Josephson nonlinearity and low loss [3], the information between spatially separated processor nodes will likely be carried at room temperature via telecommunication photons (200 THz) propagating in low loss optical fibers. Transduction of quantum information [4-10] between these disparate frequencies is therefore critical to leverage the advantages of each platform by interfacing quantum resources. Here, we demonstrate coherent optical control of a superconducting qubit. We achieve this by developing a microwave-optical quantum transducer that operates with up to 1.18% conversion efficiency (1.16% cooperativity) and demonstrate optically-driven Rabi oscillations (2.27 MHz) in a superconducting qubit without impacting qubit coherence times (800 ns). Finally, we discuss outlooks towards using the transducer to network quantum processor nodes.
Submitted 30 October, 2023; v1 submitted 24 October, 2023;
originally announced October 2023.
-
LUNA: A Model-Based Universal Analysis Framework for Large Language Models
Authors:
Da Song,
Xuan Xie,
Jiayang Song,
Derui Zhu,
Yuheng Huang,
Felix Juefei-Xu,
Lei Ma
Abstract:
Over the past decade, Artificial Intelligence (AI) has had great success and is being used in a wide range of academic and industrial fields. More recently, LLMs have made rapid advancements that have propelled AI to a new level, enabling even more diverse applications and industrial domains with intelligence, particularly in areas like software engineering and natural language processing. Nevertheless, a number of emerging trustworthiness concerns and issues exhibited by LLMs have received much attention; unless they are properly solved, the widespread adoption of LLMs could be greatly hindered in practice. The distinctive characteristics of LLMs, such as the self-attention mechanism, extremely large model scale, and autoregressive generation schema, differ from classic AI software based on CNNs and RNNs and present new challenges for quality analysis. To date, universal and systematic analysis techniques for LLMs are still lacking despite the urgent industrial demand. Towards bridging this gap, we initiate an early exploratory study and propose a universal analysis framework for LLMs, LUNA, designed to be general and extensible, to enable versatile analysis of LLMs from multiple quality perspectives in a human-interpretable manner. In particular, we first leverage the data from desired trustworthiness perspectives to construct an abstract model as an auxiliary analysis asset, which is empowered by various abstract model construction methods. To assess the quality of the abstract model, we collect and define a number of evaluation metrics, aiming at both the abstract model level and the semantics level. Then, the semantics, which is the degree of satisfaction of the LLM w.r.t. the trustworthiness perspective, is bound to the abstract model and enriches it, which enables more detailed analysis applications for diverse purposes.
Submitted 13 June, 2024; v1 submitted 22 October, 2023;
originally announced October 2023.
-
A Distributed Buffering Drift-Plus-Penalty Algorithm for Coupling Constrained Optimization
Authors:
Dandan Wang,
Daokuan Zhu,
Zichong Ou,
Jie Lu
Abstract:
This paper focuses on distributed constrained optimization over time-varying directed networks, where all agents cooperate to optimize the sum of their locally accessible objective functions subject to a coupled inequality constraint consisting of all their local constraint functions. To address this problem, we develop a buffering drift-plus-penalty algorithm, referred to as B-DPP. The proposed B-DPP algorithm utilizes the idea of drift-plus-penalty minimization in centralized optimization to control constraint violation and objective error, and adapts it to the distributed setting. It also innovatively incorporates a buffer variable into local virtual queue updates to acquire flexible and desirable tracking of constraint violation. We show that B-DPP achieves $O(1/\sqrt{t})$ rates of convergence to both optimality and feasibility, which outperform the alternative methods in the literature. Moreover, with a proper buffer parameter, B-DPP is capable of reaching feasibility within a finite number of iterations, which is a pioneering result in the area. Simulations on a resource allocation problem over 5G virtualized networks demonstrate the competitive convergence performance and efficiency of B-DPP.
Submitted 14 October, 2023;
originally announced October 2023.
-
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Authors:
Jun Chen,
Deyao Zhu,
Xiaoqian Shen,
Xiang Li,
Zechun Liu,
Pengchuan Zhang,
Raghuraman Krishnamoorthi,
Vikas Chandra,
Yunyang Xiong,
Mohamed Elhoseiny
Abstract:
Large language models have shown remarkable capabilities as a general interface for various language-related applications. Motivated by this, we aim to build a unified interface for completing many vision-language tasks, including image description, visual question answering, and visual grounding, among others. The challenge is to use a single model to perform diverse vision-language tasks effectively with simple multi-modal instructions. Towards this objective, we introduce MiniGPT-v2, a model that can be treated as a unified interface for better handling various vision-language tasks. We propose using unique identifiers for different tasks when training the model. These identifiers enable our model to distinguish each task instruction effortlessly and also improve the model's learning efficiency for each task. After the three-stage training, the experimental results show that MiniGPT-v2 achieves strong performance on many visual question-answering and visual grounding benchmarks compared to other vision-language generalist models. Our model and code are available at https://minigpt-v2.github.io/
Submitted 7 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
InfoCL: Alleviating Catastrophic Forgetting in Continual Text Classification from An Information Theoretic Perspective
Authors:
Yifan Song,
Peiyi Wang,
Weimin Xiong,
Dawei Zhu,
Tianyu Liu,
Zhifang Sui,
Sujian Li
Abstract:
Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks. We focus on continual text classification under the class-incremental setting. Recent CL studies have identified the severe performance decrease on analogous classes as a key factor for catastrophic forgetting. In this paper, through an in-depth exploration of the representation learning process in CL, we discover that the compression effect of the information bottleneck leads to confusion on analogous classes. To enable the model to learn more sufficient representations, we propose a novel replay-based continual text classification method, InfoCL. Our approach utilizes fast-slow and current-past contrastive learning to perform mutual information maximization and better recover the previously learned representations. In addition, InfoCL incorporates an adversarial memory augmentation strategy to alleviate the overfitting problem of replay. Experimental results demonstrate that InfoCL effectively mitigates forgetting and achieves state-of-the-art performance on three text classification tasks. The code is publicly available at https://github.com/Yifan-Song793/InfoCL.
Submitted 10 October, 2023;
originally announced October 2023.
-
Empirical Evaluation of the Segment Anything Model (SAM) for Brain Tumor Segmentation
Authors:
Mohammad Peivandi,
Jason Zhang,
Michael Lu,
Dongxiao Zhu,
Zhifeng Kou
Abstract:
Brain tumor segmentation presents a formidable challenge in the field of Medical Image Segmentation. While deep-learning models have been useful, human expert segmentation remains the most accurate method. The recently released Segment Anything Model (SAM) has opened up the opportunity to apply foundation models to this difficult task. However, SAM was primarily trained on diverse natural images. This makes applying SAM to biomedical segmentation, such as brain tumors with less defined boundaries, challenging. In this paper, we enhanced SAM's mask decoder using transfer learning with the Decathlon brain tumor dataset. We developed three methods to encapsulate the four-dimensional data into three dimensions for SAM. An on-the-fly data augmentation approach has been used with a combination of rotations and elastic deformations to increase the size of the training dataset. Two key metrics: the Dice Similarity Coefficient (DSC) and the Hausdorff Distance 95th Percentile (HD95), have been applied to assess the performance of our segmentation models. These metrics provided valuable insights into the quality of the segmentation results. In our evaluation, we compared this improved model to two benchmarks: the pretrained SAM and the widely used model, nnUNetv2. We find that the improved SAM shows considerable improvement over the pretrained SAM, while nnUNetv2 outperformed the improved SAM in terms of overall segmentation accuracy. Nevertheless, the improved SAM demonstrated slightly more consistent results than nnUNetv2, especially on challenging cases that can lead to larger Hausdorff distances. In the future, more advanced techniques can be applied in order to further improve the performance of SAM on brain tumor segmentation.
Submitted 9 October, 2023;
originally announced October 2023.
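The Dice Similarity Coefficient named in the abstract above has a standard definition, DSC = 2|A ∩ B| / (|A| + |B|), for a predicted mask A and a reference mask B. The following is a minimal Python sketch of that formula, not code from the paper: the function name `dice_coefficient`, the flat 0/1 mask representation, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the two mask sizes.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy example: each mask has 3 foreground voxels and they share 2 of them.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

A score of 1 means perfect overlap and 0 means none, so higher is better; the Hausdorff-based HD95 metric is a distance, where lower is better.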
-
ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data
Authors:
Tianyang Zhong,
Wei Zhao,
Yutong Zhang,
Yi Pan,
Peixin Dong,
Zuowei Jiang,
Xiaoyan Kui,
Youlan Shang,
Li Yang,
Yaonai Wei,
Longtao Yang,
Hao Chen,
Huan Zhao,
Yuxiao Liu,
Ning Zhu,
Yiwei Li,
Yisong Wang,
Jiaqi Yao,
Jiaqi Wang,
Ying Zeng,
Lei He,
Chao Zheng,
Zhixue Zhang,
Ming Li,
Zhengliang Liu
, et al. (17 additional authors not shown)
Abstract:
Radiology report generation, as a key step in medical image analysis, is critical to the quantitative analysis of clinically informed decision-making levels. However, complex and diverse radiology reports with cross-source heterogeneity pose a huge generalizability challenge to the current methods under massive data volume, mainly because the style and normativity of radiology reports differ markedly among institutions, body regions inspected and radiologists. Recently, the advent of large language models (LLMs) offers great potential for recognizing signs of health conditions. To resolve the above problem, we collaborate with the Second Xiangya Hospital in China and propose ChatRadio-Valuer based on the LLM, a tailored model for automatic radiology report generation that learns generalizable representations and provides a basis pattern for model adaptation in sophisticated analysts' cases. Specifically, ChatRadio-Valuer is trained based on the radiology reports from a single institution by means of supervised fine-tuning, and then adapted to disease diagnosis tasks for human multi-system evaluation (i.e., chest, abdomen, muscle-skeleton, head, and maxillofacial & neck) from six different institutions in clinical-level events. The clinical dataset utilized in this study encompasses a total of 332,673 observations. The comprehensive results on engineering indicators, clinical efficacy and deployment cost metrics show that ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4, among others, in terms of disease diagnosis from radiology reports. ChatRadio-Valuer provides an effective avenue to boost model generalization performance and alleviate the annotation workload of experts to enable the promotion of clinical AI applications in radiology reports.
Submitted 9 October, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Non-Smooth Weakly-Convex Finite-sum Coupled Compositional Optimization
Authors:
Quanqi Hu,
Dixian Zhu,
Tianbao Yang
Abstract:
This paper investigates new families of compositional optimization problems, called non-smooth weakly-convex finite-sum coupled compositional optimization (NSWC FCCO). There has been a growing interest in FCCO due to its wide-ranging applications in machine learning and AI, as well as its ability to address the shortcomings of stochastic algorithms based on empirical risk minimization. However, current research on FCCO presumes that both the inner and outer functions are smooth, limiting their potential to tackle a more diverse set of problems. Our research expands on this area by examining non-smooth weakly-convex FCCO, where the outer function is weakly convex and non-decreasing, and the inner function is weakly convex. We analyze a single-loop algorithm and establish its complexity for finding an $\epsilon$-stationary point of the Moreau envelope of the objective function. We also extend the algorithm to solving novel non-smooth weakly-convex tri-level finite-sum coupled compositional optimization problems, which feature a nested arrangement of three functions. Lastly, we explore the applications of our algorithms in deep learning for two-way partial AUC maximization and multi-instance two-way partial AUC maximization, using empirical studies to showcase the effectiveness of the proposed algorithms.
Submitted 3 March, 2024; v1 submitted 4 October, 2023;
originally announced October 2023.
-
LawBench: Benchmarking Legal Knowledge of Large Language Models
Authors:
Zhiwei Fei,
Xiaoyu Shen,
Dawei Zhu,
Fengzhe Zhou,
Zhuo Han,
Songyang Zhang,
Kai Chen,
Zongwen Shen,
Jidong Ge
Abstract:
Large language models (LLMs) have demonstrated strong capabilities in various aspects. However, when applying them to the highly specialized, safety-critical legal domain, it is unclear how much legal knowledge they possess and whether they can reliably perform legal-related tasks. To address this gap, we propose a comprehensive evaluation benchmark, LawBench. LawBench has been meticulously crafted to provide a precise assessment of the LLMs' legal capabilities at three cognitive levels: (1) Legal knowledge memorization: whether LLMs can memorize needed legal concepts, articles and facts; (2) Legal knowledge understanding: whether LLMs can comprehend entities, events and relationships within legal text; (3) Legal knowledge applying: whether LLMs can properly utilize their legal knowledge and make necessary reasoning steps to solve realistic legal tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label classification (SLC), multi-label classification (MLC), regression, extraction and generation. We perform extensive evaluations of 51 LLMs on LawBench, including 20 multilingual LLMs, 22 Chinese-oriented LLMs and 9 legal-specific LLMs. The results show that GPT-4 remains the best-performing LLM in the legal domain, surpassing the others by a significant margin. While fine-tuning LLMs on legal-specific text brings certain improvements, we are still a long way from obtaining usable and reliable LLMs for legal tasks. All data, model predictions and evaluation code are released at https://github.com/open-compass/LawBench/. We hope this benchmark provides an in-depth understanding of the LLMs' domain-specific capabilities and speeds up the development of LLMs in the legal domain.
Submitted 28 September, 2023;
originally announced September 2023.
-
NoisyNN: Exploring the Influence of Information Entropy Change in Learning Systems
Authors:
Xiaowei Yu,
Zhe Huang,
Yao Xue,
Lu Zhang,
Li Wang,
Tianming Liu,
Dajiang Zhu
Abstract:
We explore the impact of entropy change in deep learning systems via noise injection at different levels, i.e., the latent space and input image. The series of models that employ our methodology are collectively known as Noisy Neural Networks (NoisyNN), with examples such as NoisyViT and NoisyCNN. Noise is conventionally viewed as a harmful perturbation in various deep learning architectures, such as convolutional neural networks (CNNs) and vision transformers (ViTs), as well as different learning tasks like image classification and transfer learning. However, this work shows noise can be an effective way to change the entropy of the learning system. We demonstrate that specific noise can boost the performance of various deep architectures under certain conditions. We theoretically prove the enhancement gained from positive noise by reducing the task complexity defined by information entropy and experimentally show the significant performance gain on large image datasets, such as ImageNet. Herein, we use the information entropy to define the complexity of the task. We categorize the noise into two types, positive noise (PN) and harmful noise (HN), based on whether the noise can help reduce the complexity of the task. Extensive experiments on CNNs and ViTs have shown performance improvements by proactively injecting positive noise, where we achieved an unprecedented top-1 accuracy of over 95% on ImageNet. Both theoretical analysis and empirical evidence have confirmed that the presence of positive noise can benefit the learning process, while the traditionally perceived harmful noise indeed impairs deep learning models. The different roles of noise offer new explanations for deep models on specific tasks and provide a new paradigm for improving model performance. Moreover, it reminds us that we can influence the performance of learning systems via information entropy change.
Submitted 2 February, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
Authors:
Dawei Zhu,
Nan Yang,
Liang Wang,
Yifan Song,
Wenhao Wu,
Furu Wei,
Sujian Li
Abstract:
Large Language Models (LLMs) are trained with a pre-defined context length, restricting their use in scenarios requiring long inputs. Previous efforts to adapt LLMs to a longer length usually require fine-tuning with this target length (Full-length fine-tuning), which incurs intensive training cost. To decouple train length from target length for efficient context window extension, we propose Positional Skip-wisE (PoSE) training, which smartly simulates long inputs using a fixed context window. This is achieved by first dividing the original context window into several chunks, then designing distinct skipping bias terms to manipulate the position indices of each chunk. These bias terms and the length of each chunk are altered for every training example, allowing the model to adapt to all positions within the target length. Experimental results show that PoSE greatly reduces memory and time overhead compared with Full-length fine-tuning, with minimal impact on performance. Leveraging this advantage, we have successfully extended the LLaMA model to 128k tokens using a 2k training context window. Furthermore, we empirically confirm that PoSE is compatible with all RoPE-based LLMs and position interpolation strategies. Notably, our method can potentially support infinite length, limited only by memory usage in inference. With ongoing progress in efficient inference, we believe PoSE can further scale the context window beyond 128k.
Submitted 21 February, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
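The chunk-and-skip idea described in the PoSE abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pose_position_ids`, the uniform sampling of chunk boundaries and skip biases, and the choice to leave the first chunk unshifted are all assumptions made for illustration. The sketch only shows how position indices for a short training window might be spread across a much longer target range.

```python
import random


def pose_position_ids(train_len, target_len, num_chunks=2, rng=None):
    """Hypothetical sketch: position ids for one training example.

    The training window of `train_len` tokens is split into `num_chunks`
    contiguous chunks, and each chunk is shifted by a skip bias so that
    the ids fall somewhere inside [0, target_len).
    """
    rng = rng or random.Random()

    # Assumption: chunk boundaries are drawn uniformly at random.
    cuts = sorted(rng.sample(range(1, train_len), num_chunks - 1))
    bounds = [0] + cuts + [train_len]
    lengths = [bounds[i + 1] - bounds[i] for i in range(num_chunks)]

    # Assumption: skip biases are drawn uniformly from the unused range
    # and sorted so that later chunks never move before earlier ones.
    budget = target_len - train_len
    biases = sorted(rng.randint(0, budget) for _ in range(num_chunks))
    biases[0] = 0  # assumption: the first chunk keeps its original positions

    position_ids = []
    start = 0
    for length, bias in zip(lengths, biases):
        # Positions stay contiguous inside a chunk; only the offset changes.
        position_ids.extend(range(start + bias, start + bias + length))
        start += length
    return position_ids


# Usage: a 2k training window mapped into a 128k target range.
ids = pose_position_ids(2048, 131072, num_chunks=2, rng=random.Random(0))
```

Because the boundaries and biases are resampled for every example, repeated calls cover different parts of the target range while each example still contains only `train_len` tokens.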
-
PolicyGPT: Automated Analysis of Privacy Policies with Large Language Models
Authors:
Chenhao Tang,
Zhengliang Liu,
Chong Ma,
Zihao Wu,
Yiwei Li,
Wei Liu,
Dajiang Zhu,
Quanzheng Li,
Xiang Li,
Tianming Liu,
Lei Fan
Abstract:
Privacy policies serve as the primary conduit through which online service providers inform users about their data collection and usage procedures. However, in a bid to be comprehensive and mitigate legal risks, these policy documents are often quite verbose. In practice, users tend to click the Agree button directly rather than reading them carefully. This practice exposes users to risks of privacy leakage and legal issues. Recently, the advent of Large Language Models (LLMs) such as ChatGPT and GPT-4 has opened new possibilities for text analysis, especially for lengthy documents like privacy policies. In this study, we investigate PolicyGPT, an LLM-based framework for privacy policy text analysis. This framework was tested using two datasets. The first dataset comprises privacy policies from 115 websites, which were meticulously annotated by legal experts, categorizing each segment into one of 10 classes. The second dataset consists of privacy policies from 304 popular mobile applications, with each sentence manually annotated and classified into one of a different set of 10 categories. Under zero-shot learning conditions, PolicyGPT demonstrated robust performance. For the first dataset, it achieved an accuracy rate of 97%, while for the second dataset, it attained an 87% accuracy rate, surpassing that of the baseline machine learning and neural network models.
Submitted 18 September, 2023;
originally announced September 2023.