-
Search for $B_{(s)}^{*0}\toμ^+μ^-$ in $B_c^+\toπ^+μ^+μ^-$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1113 additional authors not shown)
Abstract:
A search for the very rare $B^{*0}\toμ^+μ^-$ and $B_{s}^{*0}\toμ^+μ^-$ decays is conducted by analysing the $B_c^+\to π^+μ^+μ^-$ process. The analysis uses proton-proton collision data collected with the LHCb detector between 2011 and 2018, corresponding to an integrated luminosity of 9$\text{\,fb}^{-1}$. The signal signatures correspond to simultaneous peaks in the $μ^+μ^-$ and $π^+μ^+μ^-$ invariant masses. No evidence for an excess of events over background is observed for either signal decay mode. Upper limits at the $90\%$ confidence level are set on the branching fractions relative to that for $B_c^+\to J\mskip -3mu/\mskip -2muψπ^+$ decays, \begin{align*}
{\cal R}_{B^{*0}(μ^+μ^-)π^+/J\mskip -3mu/\mskip -2muψπ^+} &< 3.8\times 10^{-5}\ \text{ and }
{\cal R}_{B_{s}^{*0}(μ^+μ^-)π^+/J\mskip -3mu/\mskip -2muψπ^+} &< 5.0\times 10^{-5}\,. \end{align*}
Submitted 25 September, 2024;
originally announced September 2024.
-
Exploring Text-Queried Sound Event Detection with Audio Source Separation
Authors:
Han Yin,
Jisheng Bai,
Yang Xiao,
Hui Wang,
Siqi Zheng,
Yafeng Chen,
Rohan Kumar Das,
Chong Deng,
Jianfeng Chen
Abstract:
In sound event detection (SED), overlapping sound events pose a significant challenge, as certain events can be easily masked by background noise or other events, resulting in poor detection performance. To address this issue, we propose the text-queried SED (TQ-SED) framework. Specifically, we first pre-train a language-queried audio source separation (LASS) model to separate the audio tracks corresponding to different events from the input audio. Then, multiple target SED branches are employed to detect individual events. AudioSep is a state-of-the-art LASS model, but has limitations in extracting dynamic audio information because of its purely convolutional structure for separation. To address this, we integrate a dual-path recurrent neural network block into the model. We refer to this structure as AudioSep-DP, which achieved first place in DCASE 2024 Task 9 on language-queried audio source separation (objective single model track). Experimental results show that TQ-SED can significantly improve SED performance, with an improvement of 7.22\% in F1 score over the conventional framework. Additionally, we set up comprehensive experiments to explore the impact of model complexity. The source code and pre-trained model are released at https://github.com/apple-yinhan/TQ-SED.
Submitted 20 September, 2024;
originally announced September 2024.
-
Analysis of $\itΛ^\mathrm{0}_b \rightarrow pK^-μ^+μ^-$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1114 additional authors not shown)
Abstract:
The differential branching fraction and angular coefficients of $\itΛ^\mathrm{0}_b \rightarrow pK^-μ^+μ^-$ decays are measured in bins of the dimuon mass squared and dihadron mass. The analysis is performed using a data set corresponding to 9 $\text{fb}^{-1}$ of integrated luminosity collected with the LHCb detector between 2011 and 2018. The data are consistent with receiving contributions from a mixture of $\itΛ$ resonances with different spin-parity quantum numbers. The angular coefficients show a pattern of vector--axial vector interference that is a characteristic of the type of flavour-changing neutral-current transition relevant for these decays.
Submitted 19 September, 2024;
originally announced September 2024.
-
A semi-analytical method using auxiliary sine series for vibration and sound radiation of a rectangular plate with elastic edges
Authors:
Guoming Deng,
Xian Wu,
Changxiao Shao,
Songlin Zheng,
Jianwang Shao
Abstract:
This paper proposes an efficient semi-analytical method using auxiliary sine series for transverse vibration and sound radiation of a thin rectangular plate with edges elastically restrained against translation and rotation. The formulation, constructed by two-dimensional sine and/or cosine series, can approximately express the bending displacement, and calculate vibration and sound radiation under excitation of point force, arbitrary-angle plane wave, or diffuse acoustic field with acceptable accuracy. It is also applied for baffled or unbaffled conditions. A post-process program is developed to predict vibrating frequencies and modes, mean square velocity spectrum, and sound transmission loss via reduced-order integrals of radiation impedances. The method is validated by experiment and simulation results, demonstrating accurate and efficient computation using a single program for transverse vibration and sound radiation of a plate under different elastic boundary conditions and different excitations. Formulas given in this paper provide a basis for the code development on transverse vibration and sound radiation analysis of thin plates.
Submitted 15 September, 2024;
originally announced September 2024.
-
Range-SLAM: Ultra-Wideband-Based Smoke-Resistant Real-Time Localization and Mapping
Authors:
Yi Liu,
Zhuozhu Jian,
Shengtao Zheng,
Houde Liu,
Xueqian Wang,
Xinlei Chen,
Bin Liang
Abstract:
This paper presents Range-SLAM, a real-time, lightweight SLAM system designed to address the challenges of localization and mapping in environments with smoke and other harsh conditions using Ultra-Wideband (UWB) signals. While optical sensors like LiDAR and cameras struggle in low-visibility environments, UWB signals provide a robust alternative for real-time positioning. The proposed system uses general UWB devices to achieve accurate mapping and localization without relying on expensive LiDAR or other dedicated hardware. By utilizing only the distance and Received Signal Strength Indicator (RSSI) provided by UWB sensors in relation to anchors, we combine the motion of the tag-carrying agent with a raycasting algorithm to construct a 2D occupancy grid map in real time. To enhance localization in challenging conditions, a Weighted Least Squares (WLS) method is employed. Extensive real-world experiments, including smoke-filled environments and simulated
Submitted 15 September, 2024;
originally announced September 2024.
-
Contactless Fingerprint Recognition Using 3D Graph Matching
Authors:
Zhe Cui,
Yuwei Jia,
Siyang Zheng,
Fei Su
Abstract:
Contactless fingerprints are a newly developed type of fingerprint and have gained much attention in recent fingerprint studies. However, most existing contactless fingerprint algorithms treat contactless fingerprints as 2D plain fingerprints, and use recognition methods similar to those for traditional contact-based 2D fingerprints. This recognition approach does not consider the modality difference between contactless and contact fingerprints, especially the intrinsic 3D characteristic of contactless fingerprints. This paper proposes a novel contactless fingerprint recognition algorithm that captures the revealed 3D feature of contactless fingerprints rather than the plain 2D feature. The proposed method first recovers 3D features from the input contactless fingerprint, including the 3D shape model and 3D fingerprint feature (minutiae, orientation, etc.). Then, a novel 3D graph matching is conducted in 3D space according to the extracted 3D feature. Our method captures the real 3D nature of contactless fingerprints, as the whole feature extraction and matching algorithms are completed in real 3D space. Experimental results on contactless fingerprint databases show that the proposed method successfully improves the matching accuracy of contactless fingerprints. Notably, our method performs stably across multiple poses of contactless fingerprints due to 3D graph matching, which is a great advantage compared to previous contactless fingerprint recognition algorithms.
Submitted 13 September, 2024;
originally announced September 2024.
-
Legal Fact Prediction: Task Definition and Dataset Construction
Authors:
Junkai Liu,
Yujie Tong,
Hui Huang,
Shuyuan Zheng,
Muyun Yang,
Peicheng Wu,
Makoto Onizuka,
Chuan Xiao
Abstract:
Legal facts refer to the facts that can be proven by acknowledged evidence in a trial. They form the basis for the determination of court judgments. This paper introduces a novel NLP task: legal fact prediction, which aims to predict the legal fact based on a list of evidence. The predicted facts can instruct the parties and their lawyers involved in a trial to strengthen their submissions and optimize their strategies during the trial. Moreover, since real legal facts are difficult to obtain before the final judgment, the predicted facts also serve as an important basis for legal judgment prediction. We construct a benchmark dataset consisting of evidence lists and ground-truth legal facts for real civil loan cases, LFPLoan. Our experiments on this dataset show that this task is non-trivial and requires further considerable research efforts.
Submitted 11 September, 2024;
originally announced September 2024.
-
First determination of the spin-parity of $Ξ_{c}(3055)^{+,0}$ baryons
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1109 additional authors not shown)
Abstract:
The ${Ξ_{b}^{0(-)}\toΞ_{c}(3055)^{+(0)}(\to D^{+(0)}Λ)π^{-}}$ decay chains are observed, and the spin-parity of $Ξ_{c}(3055)^{+(0)}$ baryons is determined for the first time. The measurement is performed using proton-proton collision data at a center-of-mass energy of $\sqrt{s}=13\,\text{TeV}$, corresponding to an integrated luminosity of $5.4\,\text{fb}^{-1}$, recorded by the~$\text{LHCb}$ experiment between 2016 and 2018. The spin-parity of the $Ξ_{c}(3055)^{+(0)}$ baryons is determined to be $3/2^{+}$ with a significance of more than $6.5σ$ ($3.5σ$) compared to all other tested hypotheses. The up-down asymmetries of the ${Ξ_{b}^{0(-)}\toΞ_{c}(3055)^{+(0)}π^{-}}$ transitions are measured to be $-0.92\pm0.10\pm0.05$ ($-0.92\pm0.16\pm0.22$), consistent with maximal parity violation, where the first uncertainty is statistical and the second is systematic. These results support the hypothesis that the $Ξ_{c}(3055)^{+(0)}$ baryons correspond to the first $D$-wave $λ$-mode excitation of the $Ξ_{c}$ flavor triplet.
Submitted 9 September, 2024;
originally announced September 2024.
-
Nonlinear Cooperative Output Regulation with Input Delay Compensation
Authors:
Shiqi Zheng,
Choon Ki Ahn,
Xiaowei Jiang,
Huaicheng Yan,
Peng Shi
Abstract:
This paper investigates the cooperative output regulation (COR) of nonlinear multi-agent systems (MASs) with long input delay based on a periodic event-triggered mechanism. Compared with other mechanisms, periodic event-triggered control can automatically guarantee Zeno-free behavior and avoid the continuous monitoring of triggering conditions. First, a new periodic event-triggered distributed observer, which is based on fully asynchronous communication data, is proposed to estimate the leader information. Second, a new distributed predictor feedback control method is proposed for the considered nonlinear MASs with input delay. By coordinate transformation, the MASs are mapped into new coupled ODE-PDE target systems with some disturbance-like terms. Then, we show that the COR problem is solvable. Finally, to further save communication resources, a periodic event-triggered mechanism is considered in the sensor-to-controller transmission in every agent. A new periodic event-triggered filter is proposed to deal with the periodic event-triggered feedback data. The MASs with input delay are mapped into coupled ODE-PDE target systems with sampled data information. Then, Lyapunov-Krasovskii functions are constructed to demonstrate the exponential stability of the MASs. Simulations verify the validity of the proposed results.
Submitted 8 September, 2024;
originally announced September 2024.
-
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Authors:
Yunze Man,
Shuhong Zheng,
Zhipeng Bao,
Martial Hebert,
Liang-Yan Gui,
Yu-Xiong Wang
Abstract:
Complex 3D scene understanding has gained increasing attention, with scene encoding strategies playing a crucial role in this success. However, the optimal scene encoding strategies for various scenarios remain unclear, particularly compared to their image-based counterparts. To address this issue, we present a comprehensive study that probes various visual encoding models for 3D scene understanding, identifying the strengths and limitations of each model across different scenarios. Our evaluation spans seven vision foundation encoders, including image-based, video-based, and 3D foundation models. We evaluate these models in four tasks: Vision-Language Scene Reasoning, Visual Grounding, Segmentation, and Registration, each focusing on different aspects of scene understanding. Our evaluations yield key findings: DINOv2 demonstrates superior performance, video models excel in object-level tasks, diffusion models benefit geometric tasks, and language-pretrained models show unexpected limitations in language-related tasks. These insights challenge some conventional understandings, provide novel perspectives on leveraging visual foundation models, and highlight the need for more flexible encoder selection in future vision-language and scene-understanding tasks.
Submitted 5 September, 2024;
originally announced September 2024.
-
Measurement of exclusive $J/ψ$ and $ψ(2S)$ production at $\sqrt{s}=13$ TeV
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1072 additional authors not shown)
Abstract:
Measurements are presented of the cross-section for the central exclusive production of $J/ψ\toμ^+μ^-$ and $ψ(2S)\toμ^+μ^-$ processes in proton-proton collisions at $\sqrt{s} = 13 $ TeV with 2016-2018 data. They are performed by requiring both muons to be in the LHCb acceptance (with pseudorapidity $2<η_{μ^\pm} < 4.5$) and mesons in the rapidity range $2.0 < y < 4.5$. The integrated cross-section results are \begin{equation*}
σ_{J/ψ\toμ^+μ^-}(2.0<y_{J/ψ}<4.5,2.0<η_{μ^\pm} < 4.5) = 400 \pm 2 \pm 5 \pm 12 \,{\rm pb}\,,
\end{equation*} \begin{equation*}
σ_{ψ(2S)\toμ^+μ^-}(2.0<y_{ψ(2S)}<4.5,2.0<η_{μ^\pm} < 4.5) = 9.40 \pm 0.15 \pm 0.13 \pm 0.27 \,{\rm pb}\,, \end{equation*} where the uncertainties are statistical, systematic and due to the luminosity determination. In addition, a measurement of the ratio of $ψ(2S)$ and $J/ψ$ cross-sections, at an average photon-proton centre-of-mass energy of 1 TeV, is performed, giving \begin{equation*}
\frac{σ_{ψ(2S)}}{σ_{J/ψ}} = 0.1763 \pm 0.0029 \pm 0.0008 \pm 0.0039 \,, \end{equation*} where the first uncertainty is statistical, the second systematic and the third due to the knowledge of the involved branching fractions. For the first time, the dependence of the $J/ψ$ and $ψ(2S)$ cross-sections on the total transverse momentum transfer is determined in $pp$ collisions and is found consistent with the behaviour observed in electron-proton collisions.
Submitted 11 September, 2024; v1 submitted 5 September, 2024;
originally announced September 2024.
-
Measurement of $CP$ violation in ${B^0}\rightarrow{D^{+}D^{-}}$ and ${B^{0}_{s}}\rightarrow{D^{+}_{s}D^{-}_{s}}$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1115 additional authors not shown)
Abstract:
A time-dependent, flavour-tagged measurement of $CP$ violation is performed with ${B^0}\rightarrow{D^{+}D^{-}}$ and ${B^{0}_{s}}\rightarrow{D^{+}_{s}D^{-}_{s}}$ decays, using data collected by the LHCb detector in proton-proton collisions at a centre-of-mass energy of 13 TeV corresponding to an integrated luminosity of 6 fb$^{-1}$. In ${B^0}\rightarrow{D^{+}D^{-}}$ decays the $CP$-violation parameters are measured to be \begin{align}
S_{D^{+}D^{-}} & = -0.552 \pm 0.100\,\text{(stat)} \pm 0.010\,\text{(syst)}, \nonumber \newline
C_{D^{+}D^{-}} & = \phantom{-}0.128 \pm0.103\,\text{(stat)} \pm 0.010\,\text{(syst)}. \nonumber \end{align} In $B^{0}_{s} \rightarrow D^{+}_{s}D^{-}_{s}$ decays the $CP$-violating parameter formulation in terms of $φ_{s}$ and $|λ|$ results in \begin{align}
φ_{s} & = -0.086 \pm 0.106 \,\text{(stat)} \pm 0.028\,\text{(syst)} \,\text{rad}, \nonumber \newline
|λ_{D^{+}_{s}D^{-}_{s}}| & = \phantom{-}1.145 \pm 0.126\,\text{(stat)} \pm 0.031\,\text{(syst)}. \nonumber \end{align} These results represent the most precise single measurement of the $CP$-violation parameters in their respective channels. For the first time in a single measurement, $CP$ symmetry is observed to be violated in ${B^0}\rightarrow{D^{+}D^{-}}$ decays with a significance exceeding six standard deviations.
Submitted 4 September, 2024;
originally announced September 2024.
-
Physics-informed neural networks incorporating energy dissipation for the phase-field model of ferroelectric microstructure evolution
Authors:
Lan Shang,
Sizheng Zheng,
Jin Wang,
Jie Wang
Abstract:
Physics-informed neural networks (PINNs) are an emerging technique to solve partial differential equations (PDEs). In this work, we propose a simple but effective PINN approach for the phase-field model of ferroelectric microstructure evolution. This model is a time-dependent, nonlinear, and high-order PDE system of multi-physics, which is challenging to solve using a baseline PINN. Considering that the acquisition of steady microstructures is one of the primary focuses in simulations of ferroelectric microstructure evolution, we simplify the time-dependent PDE system to a static problem. This static problem, however, is ill-posed. To overcome this issue, a term originating from the law of energy dissipation is embedded into the loss function as an extra constraint for the PINN. With this modification, the PINN successfully predicts the steady ferroelectric microstructure without tracking the evolution process. In addition, although the proposed PINN approach cannot tackle the dynamic problem in a straightforward fashion, it benefits the PINN prediction of the evolution process by providing labeled data. These data are crucial because they help the PINN avoid propagation failure, a common failure mode of PINNs when predicting dynamic behaviors. The above-mentioned advantages of the proposed PINN approach are demonstrated through a number of examples.
Submitted 3 September, 2024;
originally announced September 2024.
-
Measurement of $\itΛ_\it{b}^0$, $\itΛ_\it{c}^+$ and $\itΛ$ decay parameters using $\itΛ_\it{b}^0 \to \itΛ_\it{c}^+ h^-$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1103 additional authors not shown)
Abstract:
A comprehensive study of the angular distributions in the bottom-baryon decays $\itΛ^\mathrm{0}_b\to\itΛ_c^+ h^-(h=π, K)$, followed by $\itΛ_c^+\to\itΛ h^+$ with $\itΛ\to \it{p} π^-$ or $\itΛ_c^+\to\it{p}\it{K}^0_\mathrm{S}$ decays, is performed using a data sample of proton-proton collisions corresponding to an integrated luminosity of $9~\mathrm{fb}^{-1}$ collected by the LHCb experiment at center-of-mass energies of 7, 8 and 13 $\mathrm{Te\kern -0.1em V}$. The decay parameters and the associated charge-parity ($C\!P$) asymmetries are measured, with no significant $C\!P$ violation observed. For the first time, the $\itΛ^\mathrm{0}_b \to \itΛ_c^+ h^-$ decay parameters are measured. The most precise measurements of the decay parameters $α, β$ and $γ$ are obtained for $\itΛ_c^+$ decays and an independent measurement of the decay parameters for the strange-baryon $\itΛ$ decay is provided. The results deepen our understanding of weak decay dynamics in baryon decays.
Submitted 4 September, 2024;
originally announced September 2024.
-
LLM-GAN: Construct Generative Adversarial Network Through Large Language Models For Explainable Fake News Detection
Authors:
Yifeng Wang,
Zhouhong Gu,
Siwei Zhang,
Suhang Zheng,
Tao Wang,
Tianyu Li,
Hongwei Feng,
Yanghua Xiao
Abstract:
Explainable fake news detection predicts the authenticity of news items with annotated explanations. Today, Large Language Models (LLMs) are known for their powerful natural language understanding and explanation generation abilities. However, applying LLMs to explainable fake news detection still faces two main challenges. Firstly, fake news appears reasonable and could easily mislead LLMs, leaving them unable to understand the complex news-faking process. Secondly, utilizing LLMs for this task would generate both correct and incorrect explanations, which necessitates abundant labor in the loop. In this paper, we propose LLM-GAN, a novel framework that utilizes prompting mechanisms to enable an LLM to act as both Generator and Detector for realistic fake news generation and detection. Our results demonstrate LLM-GAN's effectiveness in both prediction performance and explanation quality. We further showcase the integration of LLM-GAN into a cloud-native AI platform to provide better fake news detection service in the cloud.
Submitted 3 September, 2024;
originally announced September 2024.
-
Measurement of $C\!P$ violation observables in $D^+\rightarrow K^-K^+π^+$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1109 additional authors not shown)
Abstract:
A search for violation of the charge-parity $C\!P$ symmetry in the $D^+\rightarrow K^-K^+π^+$ decay is presented, with proton-proton collision data corresponding to an integrated luminosity of 5.4 fb$^{-1}$, collected at a center-of-mass energy of $13$ TeV with the LHCb detector. A novel model-independent technique is used to compare the $D^+$ and $D^-$ phase-space distributions, with instrumental asymmetries subtracted using the $D^+_{s}\rightarrow K^-K^+π^+$ decay as a control channel. The $p$-value for the hypothesis of $C\!P$ conservation is $8.1\%$. The $C\!P$ asymmetry observables $A_{C\!P|S}^{φπ^+} = (0.95 \pm 0.43_{stat} \pm 0.26_{syst})\times 10^{-3}$ and $A_{C\!P|S}^{\overline{K}^{*0}K^+} = (-0.26 \pm 0.56_{ stat} \pm 0.18_{syst})\times 10^{-3}$ are also measured. These results show no evidence of $C\!P$ violation and represent the most sensitive search performed through the phase space of a multibody decay.
Submitted 2 September, 2024;
originally announced September 2024.
-
Misaligned Over-The-Air Computation of Multi-Sensor Data with Wiener-Denoiser Network
Authors:
Mingjun Du,
Sihui Zheng,
Xiao-Ping Zhang,
Yuhan Dong
Abstract:
In data-driven deep learning, distributed sensing and joint computing impose a heavy load on computing and communication. To address this challenge, over-the-air computation (OAC) has been proposed for multi-sensor data aggregation, which enables the server to receive a desired function of massive sensing data during communication. However, the strict synchronization and accurate channel estimation constraints in OAC are hard to satisfy in practice, leading to time and channel-gain misalignment. The paper formulates the misalignment problem as a non-blind image deblurring problem. At the receiver side, we first use the Wiener filter to deblur, followed by a U-Net network designed for further denoising. Our method is able to exploit the inherent correlations in the signal data via learning, and thus outperforms traditional methods in terms of accuracy. Our code is available at https://github.com/auto-Dog/MOAC_deep
Submitted 1 September, 2024;
originally announced September 2024.
-
Study of the rare decay $J/ψ\to μ^+μ^-μ^+μ^-$
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1096 additional authors not shown)
Abstract:
The rare electromagnetic $J/ψ\to μ^+μ^-μ^+μ^-$ decay is observed with a significance greatly exceeding the discovery threshold, using proton-proton collision data collected by the LHCb experiment during 2016-2018 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of $5.4\,\text{fb}^{-1}$. The rate of this decay is measured relative to that of the $J/ψ\to μ^+μ^-$ mode. Using the QED model for the four-muon decay in the efficiency estimation, its branching fraction is determined to be \begin{equation*}
{\mathcal{B}}(J/ψ\to μ^+μ^-μ^+μ^-) = (1.13\pm0.10\pm0.05\pm0.01)\times 10^{-6}, \end{equation*} where the uncertainties are statistical, systematic and due to the uncertainty on the branching fraction of the $J/ψ\to μ^+μ^-$ decay.
Submitted 29 August, 2024;
originally announced August 2024.
-
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Authors:
Shengpeng Ji,
Ziyue Jiang,
Xize Cheng,
Yifu Chen,
Minghui Fang,
Jialong Zuo,
Qian Yang,
Ruiqi Li,
Ziang Zhang,
Xiaoda Yang,
Rongjie Huang,
Yidi Jiang,
Qian Chen,
Siqi Zheng,
Wen Wang,
Zhou Zhao
Abstract:
Language models have been effectively applied to modeling natural signals, such as images, video, speech, and audio. A crucial component of these models is the codec tokenizer, which compresses high-dimensional natural signals into lower-dimensional discrete tokens. In this paper, we introduce WavTokenizer, which offers several advantages over previous SOTA acoustic codec models in the audio domain: 1) Extreme compression. By compressing the layers of quantizers and the temporal dimension of the discrete codec, one-second audio of 24kHz sampling rate requires only a single quantizer with 40 or 75 tokens. 2) Improved subjective quality. Despite the reduced number of tokens, WavTokenizer achieves state-of-the-art reconstruction quality with outstanding UTMOS scores and inherently contains richer semantic information. Specifically, we achieve these results by designing a broader VQ space, extended contextual windows, and improved attention networks, as well as introducing a powerful multi-scale discriminator and an inverse Fourier transform structure. We conducted extensive reconstruction experiments in the domains of speech, audio, and music. WavTokenizer exhibited strong performance across various objective and subjective metrics compared to state-of-the-art models. We also tested semantic information, VQ utilization, and adaptability to generative models. Comprehensive ablation studies confirm the necessity of each module in WavTokenizer. The related code, demos, and pre-trained models are available at https://github.com/jishengpeng/WavTokenizer.
Submitted 29 August, 2024;
originally announced August 2024.
-
Passenger hazard perception based on EEG signals for highly automated driving vehicles
Authors:
Ashton Yu Xuan Tan,
Yingkai Yang,
Xiaofei Zhang,
Bowen Li,
Xiaorong Gao,
Sifa Zheng,
Jianqiang Wang,
Xinyu Gu,
Jun Li,
Yang Zhao,
Yuxin Zhang,
Tania Stathaki
Abstract:
Enhancing the safety of autonomous vehicles is crucial, especially given recent accidents involving automated systems. As passengers in these vehicles, humans' sensory perception and decision-making can be integrated with autonomous systems to improve safety. This study explores neural mechanisms in passenger-vehicle interactions, leading to the development of a Passenger Cognitive Model (PCM) and the Passenger EEG Decoding Strategy (PEDS). Central to PEDS is a novel Convolutional Recurrent Neural Network (CRNN) that captures spatial and temporal EEG data patterns. The CRNN, combined with stacking algorithms, achieves an accuracy of $85.0\% \pm 3.18\%$. Our findings highlight the predictive power of pre-event EEG data, enhancing the detection of hazardous scenarios and offering a network-driven framework for safer autonomous vehicles.
Submitted 29 August, 2024;
originally announced August 2024.
-
Reactzyme: A Benchmark for Enzyme-Reaction Prediction
Authors:
Chenqing Hua,
Bozitao Zhong,
Sitao Luan,
Liang Hong,
Guy Wolf,
Doina Precup,
Shuangjia Zheng
Abstract:
Enzymes, with their specific catalyzed reactions, are necessary for all aspects of life, enabling diverse biological processes and adaptations. Predicting enzyme functions is essential for understanding biological pathways, guiding drug development, enhancing bioproduct yields, and facilitating evolutionary studies. Addressing the inherent complexities, we introduce a new approach to annotating enzymes based on their catalyzed reactions. This method provides detailed insights into specific reactions and is adaptable to newly discovered reactions, diverging from traditional classifications by protein family or expert-derived reaction classes. We employ machine learning algorithms to analyze enzyme reaction datasets, delivering a much more refined view on the functionality of enzymes. Our evaluation leverages the largest enzyme-reaction dataset to date, derived from the SwissProt and Rhea databases with entries up to January 8, 2024. We frame the enzyme-reaction prediction as a retrieval problem, aiming to rank enzymes by their catalytic ability for specific reactions. With our model, we can recruit proteins for novel reactions and predict reactions in novel proteins, facilitating enzyme discovery and function annotation.
Submitted 24 August, 2024;
originally announced August 2024.
-
Multi-Style Facial Sketch Synthesis through Masked Generative Modeling
Authors:
Bowen Sun,
Guo Lu,
Shibao Zheng
Abstract:
The facial sketch synthesis (FSS) model, capable of generating sketch portraits from given facial photographs, holds profound implications across multiple domains, encompassing cross-modal face recognition, entertainment, art, media, among others. However, the production of high-quality sketches remains a formidable task, primarily due to the challenges and flaws associated with three key factors: (1) the scarcity of artist-drawn data, (2) the constraints imposed by limited style types, and (3) the deficiencies of processing input information in existing models. To address these difficulties, we propose a lightweight end-to-end synthesis model that efficiently converts images to corresponding multi-stylized sketches, obviating the necessity for any supplementary inputs (e.g., 3D geometry). In this study, we overcome the issue of data insufficiency by incorporating semi-supervised learning into the training process. Additionally, we employ a feature extraction module and style embeddings to proficiently steer the generative transformer during the iterative prediction of masked image tokens, thus achieving a continuous stylized output that retains facial features accurately in sketches. The extensive experiments demonstrate that our method consistently outperforms previous algorithms across multiple benchmarks, exhibiting a discernible disparity.
Submitted 22 August, 2024;
originally announced August 2024.
-
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization
Authors:
Luyao Cheng,
Hui Wang,
Siqi Zheng,
Yafeng Chen,
Rongjie Huang,
Qinglin Zhang,
Qian Chen,
Xihao Li
Abstract:
Speaker diarization, the process of segmenting an audio stream or transcribed speech content into homogenous partitions based on speaker identity, plays a crucial role in the interpretation and analysis of human speech. Most existing speaker diarization systems rely exclusively on unimodal acoustic information, making the task particularly challenging due to the innate ambiguities of audio signals. Recent studies have made tremendous efforts towards audio-visual or audio-semantic modeling to enhance performance. However, even the incorporation of up to two modalities often falls short in addressing the complexities of spontaneous and unstructured conversations. To exploit more meaningful dialogue patterns, we propose a novel multimodal approach that jointly utilizes audio, visual, and semantic cues to enhance speaker diarization. Our method elegantly formulates the multimodal modeling as a constrained optimization problem. First, we build insights into the visual connections among active speakers and the semantic interactions within spoken content, thereby establishing abundant pairwise constraints. Then we introduce a joint pairwise constraint propagation algorithm to cluster speakers based on these visual and semantic constraints. This integration effectively leverages the complementary strengths of different modalities, refining the affinity estimation between individual speaker embeddings. Extensive experiments conducted on multiple multimodal datasets demonstrate that our approach consistently outperforms state-of-the-art speaker diarization methods.
Submitted 21 August, 2024;
originally announced August 2024.
-
SZU-AFS Antispoofing System for the ASVspoof 5 Challenge
Authors:
Yuxiong Xu,
Jiafeng Zhong,
Sengui Zheng,
Zefeng Liu,
Bin Li
Abstract:
This paper presents the SZU-AFS anti-spoofing system, designed for Track 1 of the ASVspoof 5 Challenge under open conditions. The system is built with four stages: selecting a baseline model, exploring effective data augmentation (DA) methods for fine-tuning, applying a co-enhancement strategy based on gradient norm aware minimization (GAM) for secondary fine-tuning, and fusing logits scores from the two best-performing fine-tuned models. The system utilizes the Wav2Vec2 front-end feature extractor and the AASIST back-end classifier as the baseline model. During model fine-tuning, three distinct DA policies have been investigated: single-DA, random-DA, and cascade-DA. Moreover, the employed GAM-based co-enhancement strategy, designed to fine-tune the augmented model at both data and optimizer levels, helps the Adam optimizer find flatter minima, thereby boosting model generalization. Overall, the final fusion system achieves a minDCF of 0.115 and an EER of 4.04% on the evaluation set.
Submitted 19 August, 2024;
originally announced August 2024.
-
A Series of (Net) Spin-down Glitches in PSR J1522-5735: Insights from the Vortex Creep and Vortex Bending Models
Authors:
S. Q. Zhou,
W. T. Ye,
M. Y. Ge,
E. Gügercinoğlu,
S. J. Zheng,
C. Yu,
J. P. Yuan,
J. Zhang
Abstract:
Through a detailed timing analysis of $\textit{Fermi}$-LAT data, the rotational behavior of the $γ$-ray pulsar PSR J1522$-$5735 was tracked from August 2008 (MJD 54692) to January 2024 (MJD 60320). During this 15.4-year period, two over-recovery glitches and four anti-glitches were identified, marking a rare occurrence in rotation-powered pulsars (RPPs). The magnitudes of these (net) spin-down glitches were determined to be $|Δν_{\rm g}/ν| \sim 10^{-8}$, well above the estimated detectability limit. For the two over-recovery glitches, the respective recovery fractions $Q$ are $2.1(7)$ and $1.4(2)$. Further analysis showed no substantial variations in either the flux or pulse profile shape in any of these events, suggesting that small (net) spin-down glitches, unlike large events observed in magnetars and magnetar-like RPPs, may occur without leaving an impact on the magnetosphere. Within the framework of the vortex creep and vortex bending models, anti-glitches and over-recoveries indicate the recoupling of vortex lines that moved inward as a result of a crustquake; meanwhile, the apparent fluctuations in the spin-down rate after the glitches occur as a result of the coupling of the oscillations of bent vortex lines to the magnetosphere.
Submitted 17 August, 2024;
originally announced August 2024.
-
Spectral properties of high dimensional rescaled sample correlation matrices
Authors:
Weijiang Chen,
Shurong Zheng,
Tingting Zou
Abstract:
High-dimensional sample correlation matrices are a crucial class of random matrices in multivariate statistical analysis. The central limit theorem (CLT) provides a theoretical foundation for statistical inference. In this paper, assuming that the data dimension increases proportionally with the sample size, we derive the limiting spectral distribution of the matrix $\widehat{\mathbf{R}}_n\mathbf{M}$ and establish the CLTs for the linear spectral statistics (LSS) of $\widehat{\mathbf{R}}_n\mathbf{M}$ in two structures: linear independent component structure and elliptical structure. In contrast to existing literature, our proposed spectral properties do not require $\mathbf{M}$ to be an identity matrix. Moreover, we also derive the joint limiting distribution of LSSs of $\widehat{\mathbf{R}}_n \mathbf{M}_1,\ldots,\widehat{\mathbf{R}}_n \mathbf{M}_K$. As an illustration, an application is given for the CLT.
Submitted 29 August, 2024; v1 submitted 17 August, 2024;
originally announced August 2024.
-
Interference detection in radio astronomy applying Shapiro-Wilks normality test, spectral entropy, and spectral relative entropy
Authors:
Zhicheng Cao,
Natalia A. Schmid,
Kevin Bandura,
Duncan R. Lorimer,
Morgan Dameron,
Katelyn Crockett,
Clayton Grubick,
Andreas Schmid,
Shaonan Zheng
Abstract:
Radio-frequency interference (RFI) is becoming an increasingly significant problem for most radio telescopes. Working with Green Bank Telescope data from PSR J1730+0747 in the form of complex-valued channelized voltages and their respective high-resolution power spectral densities, we evaluate a variety of statistical measures to characterize RFI. As a baseline for performance comparison, we use median absolute deviation (MAD) in complex channelized voltage data and spectral kurtosis (SK) in power spectral density data to characterize and filter out RFI. From a new perspective, we implement the Shapiro-Wilks (SW) test for normality and two information theoretical measures, spectral entropy (SE) and spectral relative entropy (SRE), and apply them to mitigate RFI. The baseline RFI mitigation algorithms are compared against our novel RFI detection algorithms to determine how effective and robust the performance is. Except for MAD, we find significant improvements in signal-to-noise ratio through the application of SE, symmetrical SRE, asymmetrical SRE, SK, and SW. These algorithms also do a good job of characterizing broadband RFI. Time- and frequency-variable RFI signals are best detected by SK and SW tests.
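Two of the measures named above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's pipeline: spectral entropy is taken here as the Shannon entropy of the normalized power spectral density, MAD flagging is applied to real-valued samples rather than complex voltages, and the function names, the threshold of 3, and the Gaussian scale factor 1.4826 are choices made for this example.

```python
import numpy as np

def spectral_entropy(psd):
    """Shannon entropy (in bits) of a power spectral density normalized to sum to 1."""
    p = np.asarray(psd, dtype=float)
    p = p / p.sum()
    nz = p[p > 0]  # skip empty bins to avoid log2(0)
    return float(-(nz * np.log2(nz)).sum())

def mad_flags(x, threshold=3.0):
    """Flag samples deviating from the median by more than threshold robust sigmas."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    robust_sigma = 1.4826 * mad  # MAD-to-sigma factor for Gaussian data
    return np.abs(x - med) > threshold * robust_sigma
```

A flat, noise-like spectrum gives the maximum entropy (log2 of the number of bins), while power concentrated in a few bins, as with narrowband interference, lowers it.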
Submitted 12 August, 2024;
originally announced August 2024.
-
CAR: Contrast-Agnostic Deformable Medical Image Registration with Contrast-Invariant Latent Regularization
Authors:
Yinsong Wang,
Siyi Du,
Shaoming Zheng,
Xinzhe Luo,
Chen Qin
Abstract:
Multi-contrast image registration is a challenging task due to the complex intensity relationships between different imaging contrasts. Conventional image registration methods are typically based on iterative optimizations for each input image pair, which is time-consuming and sensitive to contrast variations. While learning-based approaches are much faster during the inference stage, due to generalizability issues, they typically can only be applied to the fixed contrasts observed during the training stage. In this work, we propose a novel contrast-agnostic deformable image registration framework that can be generalized to arbitrary contrast images, without observing them during training. Particularly, we propose a random convolution-based contrast augmentation scheme, which simulates arbitrary contrasts of images over a single image contrast while preserving their inherent structural information. To ensure that the network can learn contrast-invariant representations for facilitating contrast-agnostic registration, we further introduce contrast-invariant latent regularization (CLR) that regularizes representation in latent space through a contrast invariance loss. Experiments show that CAR outperforms the baseline approaches regarding registration accuracy and also possesses better generalization ability to unseen imaging contrasts. Code is available at https://github.com/Yinsong0510/CAR.
Submitted 3 August, 2024;
originally announced August 2024.
-
Observation of muonic Dalitz decays of $χ_{b}$ mesons and precise spectroscopy of hidden-beauty states
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1114 additional authors not shown)
Abstract:
The decays of the $χ_{b1}(1P)$, $χ_{b2}(1P)$, $χ_{b1}(2P)$ and $χ_{b2}(2P)$ mesons into the $Υ(1S)μ^+μ^-$ final state are observed with a high significance using proton-proton collision data collected with the LHCb detector and corresponding to an integrated luminosity of 9 fb$^{-1}$. The newly observed decays together with the $Υ(2S)\rightarrow Υ(1S)π^+π^-$ and $Υ(3S)\rightarrow Υ(2S)π^+π^-$ decay modes are used for precision measurements of the mass and mass splittings for the hidden-beauty states.
Submitted 9 August, 2024;
originally announced August 2024.
-
SGSR: Structure-Guided Multi-Contrast MRI Super-Resolution via Spatio-Frequency Co-Query Attention
Authors:
Shaoming Zheng,
Yinsong Wang,
Siyi Du,
Chen Qin
Abstract:
Magnetic Resonance Imaging (MRI) is a leading diagnostic modality for a wide range of exams, where multiple contrast images are often acquired for characterizing different tissues. However, acquiring high-resolution MRI typically extends scan time, which can introduce motion artifacts. Super-resolution of MRI therefore emerges as a promising approach to mitigate these challenges. Earlier studies have investigated the use of multiple contrasts for MRI super-resolution (MCSR), but the majority of them did not fully exploit the rich contrast-invariant structural information. To fully utilize such crucial prior knowledge of multi-contrast MRI, in this work, we propose a novel structure-guided MCSR (SGSR) framework based on a new spatio-frequency co-query attention (CQA) mechanism. Specifically, CQA performs attention on features of multiple contrasts with a shared structural query, which is particularly designed to extract, fuse, and refine the common structures from different contrasts. We further propose a novel frequency-domain CQA module in addition to the spatial domain, to enable more fine-grained structural refinement. Extensive experiments on fastMRI knee data and low-field brain MRI show that SGSR outperforms state-of-the-art MCSR methods with statistical significance.
Submitted 6 August, 2024;
originally announced August 2024.
-
CADRL: Category-aware Dual-agent Reinforcement Learning for Explainable Recommendations over Knowledge Graphs
Authors:
Shangfei Zheng,
Hongzhi Yin,
Tong Chen,
Xiangjie Kong,
Jian Hou,
Pengpeng Zhao
Abstract:
Knowledge graphs (KGs) have been widely adopted to mitigate data sparsity and address cold-start issues in recommender systems. While existing KGs-based recommendation methods can predict user preferences and demands, they fall short in generating explicit recommendation paths and lack explainability. As a step beyond the above methods, recent advancements utilize reinforcement learning (RL) to find suitable items for a given user via explainable recommendation paths. However, the performance of these solutions is still limited by the following two points. (1) Lack of ability to capture contextual dependencies from neighboring information. (2) The excessive reliance on short recommendation paths due to efficiency concerns. To surmount these challenges, we propose a category-aware dual-agent reinforcement learning (CADRL) model for explainable recommendations over KGs. Specifically, our model comprises two components: (1) a category-aware gated graph neural network that jointly captures context-aware item representations from neighboring entities and categories, and (2) a dual-agent RL framework where two agents efficiently traverse long paths to search for suitable items. Finally, experimental results show that CADRL outperforms state-of-the-art models in terms of both effectiveness and efficiency on large-scale datasets.
Submitted 6 August, 2024;
originally announced August 2024.
-
Lattice and magnetic structure in the van der Waals antiferromagnet VBr3
Authors:
Yimeng Gu,
Yiqing Hao,
Zeyu Kao,
Yiqing Gu,
Feiyang Liu,
Shiyi Zheng,
Huibo Cao,
Lunhua He,
Jun Zhao
Abstract:
We report a comprehensive investigation of the lattice and magnetic structure in van der Waals antiferromagnet VBr3, characterized by a BiI3-type structure at room temperature. Neutron diffraction experiments were performed on both polycrystalline and single-crystalline VBr3 samples, revealing clear magnetic Bragg peaks emerging below the Néel temperature of TN = 26.5 K. These magnetic Bragg peaks can be indexed by k = (0, 0.5, 1) in hexagonal notation. Our refinement analysis suggests that the antiferromagnetic order in VBr3 manifests as a zigzag structure. Moreover, we observed peak splitting for nuclear Bragg peaks in the HK-plane below the structure transition temperature of Ts = 94 K, indicating the breaking of 3-fold symmetry within the ab-plane.
Submitted 5 August, 2024;
originally announced August 2024.
-
Multiple sliding ferroelectricity of rhombohedral-stacked InSe for reconfigurable photovoltaics and imaging applications
Authors:
Qingrong Liang,
Guozhong Zheng,
Liu Yang,
Shoujun Zheng
Abstract:
Through stacking engineering of two-dimensional (2D) materials, a switchable interface polarization can be generated through interlayer sliding, so-called sliding ferroelectricity, which is advantageous over traditional ferroelectricity due to its ultra-thin thickness, high switching speed and low fatigue. However, 2D materials with intrinsic sliding ferroelectricity are still rare, with the exception of rhombohedral-stacked MoS2, which limits sliding ferroelectricity for practical applications such as high-speed storage, photovoltaics, and neuromorphic computing. Here, we report the observation of sliding ferroelectricity with multiple states in undoped rhombohedral-stacked InSe (γ-InSe) via dual-frequency resonance tracking piezoresponse force microscopy, scanning Kelvin probe microscopy and conductive atomic force microscopy. The tunable bulk photovoltaic effect via the electric field is achieved in the graphene/γ-InSe/graphene tunneling device with a photovoltaic current density of ~15 mA/cm$^2$, which is attributed to the multiple sliding steps in γ-InSe according to our theoretical calculations. The vdW tunneling device also features a high photoresponsivity of ~255 A/W and a fast response time for real-time imaging. Our work not only enriches rhombohedral-stacked 2D materials for sliding ferroelectricity, but also sheds light on their potential for tunable photovoltaics and imaging applications.
Submitted 30 July, 2024;
originally announced July 2024.
-
Large-scale cervical precancerous screening via AI-assisted cytology whole slide image analysis
Authors:
Honglin Li,
Yusuan Sun,
Chenglu Zhu,
Yunlong Zhang,
Shichuan Zhang,
Zhongyi Shui,
Pingyi Chen,
Jingxiong Li,
Sunyi Zheng,
Can Cui,
Lin Yang
Abstract:
Cervical cancer continues to be the leading gynecological malignancy, posing a persistent threat to women's health on a global scale. Early screening via cytology Whole Slide Image (WSI) diagnosis is critical to prevent the progression of this cancer and improve the survival rate, but a single test by a pathologist inevitably suffers from false negatives due to the immense number of cells that need to be reviewed within a WSI. Though computer-aided automated diagnostic models can serve as a strong complement for pathologists, their effectiveness is hampered by the paucity of extensive and detailed annotations, coupled with limited interpretability and robustness. These factors significantly hinder their practical applicability and reliability in clinical settings. To tackle these challenges, we develop an AI approach, which is a Scalable Technology for Robust and Interpretable Diagnosis built on Extensive data (STRIDE) of cervical cytology. STRIDE addresses the bottleneck of limited annotations by integrating patient-level labels with a small portion of cell-level labels through an end-to-end training strategy, facilitating scalable learning across extensive datasets. To further improve the robustness to real-world domain shifts of cytology slide-making and imaging, STRIDE employs training on color adversarial samples that mimic staining and imaging variations. Lastly, to achieve pathologist-level interpretability for trustworthiness in clinical settings, STRIDE can generate explanatory textual descriptions that simulate pathologists' diagnostic processes by aligning cell image features with textual descriptions. In extensive experiments and evaluations across 183 medical centers, with a dataset of 341,889 WSIs and 0.1 billion cells from cervical cytology patients, STRIDE demonstrates remarkable superiority over previous state-of-the-art techniques.
Submitted 28 July, 2024;
originally announced July 2024.
-
Measurement of $D^0-\overline{D}^0$ mixing and search for $CP$ violation with $D^0\rightarrow K^+π^-$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1065 additional authors not shown)
Abstract:
A measurement of the time-dependent ratio of the $D^0\rightarrow K^+π^-$ to $\overline{D}^0\rightarrow K^+π^-$ decay rates is reported. The analysis uses a sample of proton-proton collisions corresponding to an integrated luminosity of 6 fb$^{-1}$ recorded by the LHCb experiment from 2015 through 2018 at a center-of-mass energy of 13 TeV. The $D^0$ meson is required to originate from a $D^{*+}\rightarrow D^0π^+$ decay, such that its flavor at production is inferred from the charge of the accompanying pion. The measurement is performed simultaneously for the $K^+π^-$ and $K^-π^+$ final states, allowing both mixing and $CP$-violation parameters to be determined. The value of the ratio of the decay rates at production is determined to be $R_{Kπ} = (343.1 \pm 2.0) \times 10^{-5}$. The mixing parameters are measured to be $c_{Kπ} = (51.4 \pm 3.5) \times 10^{-4}$ and $c_{Kπ}^{\prime} = (13 \pm 4) \times 10^{-6}$, where $\sqrt{R_{Kπ}}c_{Kπ}$ is the linear coefficient of the expansion of the ratio as a function of decay time in units of the $D^0$ lifetime, and $c_{Kπ}^{\prime}$ is the quadratic coefficient, both averaged between the $K^+π^-$ and $K^-π^+$ final states. The precision is improved relative to the previous best measurement by approximately 60%. No evidence for $CP$ violation is found.
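As a rough illustration of the expansion described in the abstract, the sketch below evaluates the decay-rate ratio to second order in decay time using the quoted central values. It assumes the expansion is truncated at the quadratic term and averaged between the two final states, and it ignores all uncertainties; the names `R_KPI`, `C_KPI`, `C_PRIME_KPI` and `ratio` are hypothetical and do not come from the analysis.

```python
import math

# Central values quoted in the abstract (uncertainties are ignored in this sketch).
R_KPI = 343.1e-5       # ratio of the decay rates at production
C_KPI = 51.4e-4        # enters the linear coefficient as sqrt(R_KPI) * C_KPI
C_PRIME_KPI = 13e-6    # quadratic coefficient


def ratio(t: float) -> float:
    """Second-order expansion of the ratio; t is in units of the D0 lifetime."""
    linear_coefficient = math.sqrt(R_KPI) * C_KPI
    return R_KPI + linear_coefficient * t + C_PRIME_KPI * t * t


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0, 5.0):
        print(f"t = {t:3.1f} lifetimes -> ratio = {ratio(t):.6e}")
```

At zero decay time the function returns the ratio at production, and the growth with decay time comes entirely from the two mixing coefficients.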
Submitted 25 July, 2024;
originally announced July 2024.
-
Experimental demonstration of spontaneous symmetry breaking with emergent multi-qubit entanglement
Authors:
Ri-Hua Zheng,
Wen Ning,
Jia-Hao Lü,
Xue-Jia Yu,
Fang Wu,
Cheng-Lin Deng,
Zhen-Biao Yang,
Kai Xu,
Dongning Zheng,
Heng Fan,
Shi-Biao Zheng
Abstract:
Spontaneous symmetry breaking (SSB) is crucial to the occurrence of phase transitions. Once a phase transition occurs, a quantum system presents degenerate eigenstates that lack the symmetry of the Hamiltonian. After crossing the critical point, the system essentially evolves into a quantum superposition of these eigenstates until decoherence sets in. Despite their fundamental importance and potential applications in quantum technologies, such quantum-mechanical SSB phenomena have not been experimentally explored in many-body systems. Here we present an experimental demonstration of the SSB process in the Lipkin-Meshkov-Glick model, governed by the competition between the individual driving and intra-qubit interaction. The model is realized in a circuit quantum electrodynamics system, where 6 Xmon qubits are coupled in an all-to-all manner through virtual photon exchange mediated by a resonator. The observed nonclassical correlations among these qubits in the symmetry-breaking region go beyond the conventional description of SSB, shedding new light on phase transitions for quantum many-body systems.
Submitted 29 July, 2024; v1 submitted 17 July, 2024;
originally announced July 2024.
-
Amplitude analysis of $B^+ \to ψ(2S) K^+ π^+ π^-$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1092 additional authors not shown)
Abstract:
The first full amplitude analysis of $B^+ \to ψ(2S) K^+ π^+ π^-$ decays is performed using proton-proton collision data corresponding to an integrated luminosity of $9\,\text{fb}^{-1}$ recorded with the LHCb detector. The rich $K^+ π^+ π^-$ spectrum is studied and the branching fractions of the resonant substructure associated with the prominent $K_1(1270)^+$ contribution are measured. The data cannot be described by conventional strange and charmonium resonances only. An amplitude model with 53 components is developed comprising 11 hidden-charm exotic hadrons. New production mechanisms for charged charmonium-like states are observed. Significant resonant activity with spin-parity $J^P = 1^+$ in the $ψ(2S) π^+$ system is confirmed and a multi-pole structure is demonstrated. The spectral decomposition of the $ψ(2S) π^+ π^-$ invariant-mass structure, dominated by $X^0 \to ψ(2S) ρ(770)^0$ decays, broadly resembles the $J/ψφ$ spectrum observed in $B^+ \to J/ψφK^+$ decays. Exotic $ψ(2S) K^+ π^-$ resonances are observed for the first time.
Submitted 17 July, 2024;
originally announced July 2024.
-
LVCP: LiDAR-Vision Tightly Coupled Collaborative Real-time Relative Positioning
Authors:
Zhuozhu Jian,
Qixuan Li,
Shengtao Zheng,
Xueqian Wang,
Xinlei Chen
Abstract:
In air-ground collaboration scenarios without GPS and prior maps, the relative positioning of drones and unmanned ground vehicles (UGVs) has always been a challenge. For a drone equipped with a monocular camera and a UGV equipped with LiDAR as an external sensor, we propose a robust and real-time relative pose estimation method (LVCP) based on the tight coupling of vision and LiDAR point cloud information, which does not require prior information such as maps or precise initial poses. Given that large-scale point clouds generated by 3D sensors have more accurate spatial geometric information than the feature point clouds generated from images, we utilize LiDAR point clouds to correct the drift in visual-inertial odometry (VIO) when the camera undergoes significant shaking or the IMU has a low signal-to-noise ratio. To achieve this, we propose a novel coarse-to-fine framework for LiDAR-vision collaborative localization. In this framework, we construct point-plane association based on spatial geometric information, and innovatively construct a point-aided Bundle Adjustment (BA) problem as the backend to simultaneously estimate the relative pose of the camera and LiDAR and correct the VIO drift. Within this process, we propose a particle swarm optimization (PSO) based sampling algorithm to complete the coarse estimation of the current camera-LiDAR pose: the initial pose of the camera used for sampling is obtained from VIO propagation, and the valid feature-plane association number (VFPN) is used to trigger the PSO-sampling process. Additionally, we propose a method that combines Structure from Motion (SFM) and multi-level sampling to initialize the algorithm, addressing the challenge of lacking initial values.
Submitted 15 July, 2024;
originally announced July 2024.
-
ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context
Authors:
Sixiao Zheng,
Yanwei Fu
Abstract:
Visual storytelling involves generating a sequence of coherent frames from a textual storyline while maintaining consistency in characters and scenes. Existing autoregressive methods, which rely on previous frame-sentence pairs, struggle with high memory usage, slow generation speeds, and limited context integration. To address these issues, we propose ContextualStory, a novel framework designed to generate coherent story frames and extend frames for story continuation. ContextualStory utilizes Spatially-Enhanced Temporal Attention to capture spatial and temporal dependencies, handling significant character movements effectively. Additionally, we introduce a Storyline Contextualizer to enrich context in the storyline embedding and a StoryFlow Adapter to measure scene changes between frames to guide the model. Extensive experiments on PororoSV and FlintstonesSV benchmarks demonstrate that ContextualStory significantly outperforms existing methods in both story visualization and story continuation.
Submitted 21 August, 2024; v1 submitted 13 July, 2024;
originally announced July 2024.
-
Scheme for measuring topological transitions in a continuous variable system
Authors:
Bi-Yao Wang,
Hao-Long Zhang,
Shou-Bang Yang,
Fan Wu,
Zhen-Biao Yang,
Shi-Biao Zheng
Abstract:
We propose a scheme for measuring topological properties in a two-photon-driven Kerr-nonlinear resonator (KNR) subjected to a single-photon modulation. The topological properties are revealed through the observation of the Berry curvature and hence the first Chern number, as a nonadiabatic response of the physical observable to the change rate of the control parameter of the modulated drive. The parameter manifold, constructed from the system's Hamiltonian that determines its dynamics constrained in the state space spanned by the even and odd cat states as two basis states, is adjusted so that the degeneracy crossing the manifold indicates a topological transition. The scheme, with such continuous variable states in mesoscopic systems, provides a new perspective for exploration of the geometry and the related topology with complex systems.
Submitted 13 July, 2024;
originally announced July 2024.
-
Model Predictive Control For Mobile Manipulators Based On Neural Dynamics (Extended version)
Authors:
Tao Su,
Shiqi Zheng
Abstract:
This article focuses on the trajectory tracking problem of mobile manipulators (MMs). Firstly, we construct a position and orientation model predictive tracking control (POMPTC) scheme for mobile manipulators. The proposed POMPTC scheme can simultaneously minimize the tracking error, joint velocity, and joint acceleration. Moreover, it can achieve synchronous control for the position and orientation of the end-effector. Secondly, a finite-time convergent neural dynamics (FTCND) model is constructed to find the optimal solution of the POMPTC scheme. Then, based on the proposed POMPTC scheme, a non-singular fast terminal sliding mode (NFTSM) control method is presented, which considers the disturbances caused by the base motion on the manipulator at the dynamic level. It can achieve finite-time tracking performance and improve the anti-disturbance ability. Finally, simulations and experiments show that the proposed control method has the advantages of strong robustness, fast convergence, and high control accuracy.
Submitted 11 July, 2024;
originally announced July 2024.
-
Constraints on freeze-in dark matter from Lyman-$α$ forest and 21-cm signal: single-field models
Authors:
Zixuan Xu,
Quan Zhou,
Sibo Zheng
Abstract:
We report new Lyman-$α$ and 21-cm constraints on freeze-in dark matter (FIDM), which injects energy into the intergalactic medium through either annihilation or decay to photon(s) or an electron-positron pair. With respect to Lyman-$α$, we fix the baseline ionization history using low-redshift data on astrophysical reionization, whereas for the 21-cm signal we adopt the baseline values of the 21-cm power spectrum through a standard modeling of star formation developed so far. Using the latest numerical tools, we show that (i) for sterile neutrino FIDM, current Lyman-$α$ data and the future sensitivity of SKA-low (1000 hrs) on the 21-cm power spectra exclude the FIDM mass up to $1.8\times 10^{-3}$ GeV at 95$\%$ CL and $5.46\times 10^{-4}$ GeV, respectively, and (ii) for millicharged FIDM, current Lyman-$α$ data only exclude the millicharge down to $10^{-8}$ within the FIDM mass range of $10^{-3}-1$ GeV at 95$\%$ CL, suggesting that the surviving parameter space of millicharged FIDM is still intact.
Submitted 20 September, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Authors:
Zilong Wang,
Zifeng Wang,
Long Le,
Huaixiu Steven Zheng,
Swaroop Mishra,
Vincent Perot,
Yuwei Zhang,
Anush Mattapalli,
Ankur Taly,
Jingbo Shang,
Chen-Yu Lee,
Tomas Pfister
Abstract:
Retrieval augmented generation (RAG) combines the generative abilities of large language models (LLMs) with external knowledge sources to provide more accurate and up-to-date responses. Recent RAG advancements focus on improving retrieval outcomes through iterative LLM refinement or self-critique capabilities acquired through additional instruction tuning of LLMs. In this work, we introduce Speculative RAG - a framework that leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, distilled specialist LM. Each draft is generated from a distinct subset of retrieved documents, offering diverse perspectives on the evidence while reducing input token counts per draft. This approach enhances comprehension of each subset and mitigates potential position bias over long context. Our method accelerates RAG by delegating drafting to the smaller specialist LM, with the larger generalist LM performing a single verification pass over the drafts. Extensive experiments demonstrate that Speculative RAG achieves state-of-the-art performance with reduced latency on TriviaQA, MuSiQue, PubHealth, and ARC-Challenge benchmarks. It notably enhances accuracy by up to 12.97% while reducing latency by 51% compared to conventional RAG systems on PubHealth.
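The draft-then-verify flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the round-robin split stands in for whatever subset selection the method actually uses, the drafts are produced sequentially rather than in parallel, and `draft_fn`, `score_fn`, `toy_draft` and `toy_score` are hypothetical placeholders for the specialist drafter and generalist verifier.

```python
from typing import Callable, List, Sequence


def split_into_subsets(docs: Sequence[str], n_subsets: int) -> List[List[str]]:
    """Round-robin split so each draft sees a distinct subset (placeholder strategy)."""
    subsets: List[List[str]] = [[] for _ in range(n_subsets)]
    for i, doc in enumerate(docs):
        subsets[i % n_subsets].append(doc)
    return [s for s in subsets if s]


def speculative_rag(
    question: str,
    docs: Sequence[str],
    draft_fn: Callable[[str, List[str]], str],   # hypothetical small drafter
    score_fn: Callable[[str, str], float],       # hypothetical large verifier
    n_subsets: int = 3,
) -> str:
    """Draft one answer per document subset, then keep the best-scored draft."""
    if not docs:
        raise ValueError("at least one retrieved document is required")
    subsets = split_into_subsets(docs, n_subsets)
    drafts = [draft_fn(question, subset) for subset in subsets]
    # One verification pass over the drafts: one score per draft.
    scores = [score_fn(question, draft) for draft in drafts]
    best = max(range(len(drafts)), key=scores.__getitem__)
    return drafts[best]


def toy_draft(question: str, subset: List[str]) -> str:
    """Toy stand-in for the drafter: concatenates its documents."""
    return " ".join(subset)


def toy_score(question: str, draft: str) -> float:
    """Toy stand-in for the verifier: counts mentions of a keyword."""
    return float(draft.count("Paris"))


if __name__ == "__main__":
    documents = [
        "Paris is the capital of France.",
        "Berlin is in Germany.",
        "The Seine flows through Paris.",
        "Rome is in Italy.",
    ]
    print(speculative_rag("What is the capital of France?", documents,
                          toy_draft, toy_score, n_subsets=2))
```

Because each draft only sees its own subset, the input passed to the drafter stays short, which is the property the abstract credits for the reduced token counts per draft.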
Submitted 11 July, 2024;
originally announced July 2024.
-
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data
Authors:
Siyi Du,
Shaoming Zheng,
Yinsong Wang,
Wenjia Bai,
Declan P. O'Regan,
Chen Qin
Abstract:
Images and structured tables are essential parts of real-world databases. Though tabular-image representation learning is promising to create new insights, it remains a challenging task, as tabular data is typically heterogeneous and incomplete, presenting significant modality disparities with images. Earlier works have mainly focused on simple modality fusion strategies in complete data scenarios, without considering the missing data issue, and thus are limited in practice. In this paper, we propose TIP, a novel tabular-image pre-training framework for learning multimodal representations robust to incomplete tabular data. Specifically, TIP investigates a novel self-supervised learning (SSL) strategy, including a masked tabular reconstruction task for tackling data missingness, and image-tabular matching and contrastive learning objectives to capture multimodal information. Moreover, TIP proposes a versatile tabular encoder tailored for incomplete, heterogeneous tabular data and a multimodal interaction module for inter-modality representation learning. Experiments are performed on downstream multimodal classification tasks using both natural and medical image datasets. The results show that TIP outperforms state-of-the-art supervised/SSL image/multimodal algorithms in both complete and incomplete data scenarios. Our code is available at https://github.com/siyi-wind/TIP.
Submitted 10 July, 2024;
originally announced July 2024.
-
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering
Authors:
Pingyi Chen,
Chenglu Zhu,
Sunyi Zheng,
Honglin Li,
Lin Yang
Abstract:
Whole slide imaging is routinely adopted for carcinoma diagnosis and prognosis. Abundant experience is required for pathologists to achieve accurate and reliable diagnostic results of whole slide images (WSI). The huge size and heterogeneous features of WSIs make the workflow of pathological reading extremely time-consuming. In this paper, we propose a novel framework (WSI-VQA) to interpret WSIs by generative visual question answering. WSI-VQA shows universality by reframing various kinds of slide-level tasks in a question-answering pattern, in which pathologists can achieve immunohistochemical grading, survival prediction, and tumor subtyping following human-machine interaction. Furthermore, we establish a WSI-VQA dataset which contains 8672 slide-level question-answering pairs with 977 WSIs. Besides the ability to deal with different slide-level tasks, our generative model, named Wsi2Text Transformer (W2T), outperforms existing discriminative models in medical correctness, which reveals the potential of our model to be applied in clinical scenarios. We also visualize the co-attention mapping between word embeddings and WSIs as an intuitive explanation for diagnostic results. The dataset and related code are available at https://github.com/cpystan/WSI-VQA.
Submitted 8 July, 2024;
originally announced July 2024.
-
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens
Authors:
Zhihao Du,
Qian Chen,
Shiliang Zhang,
Kai Hu,
Heng Lu,
Yexin Yang,
Hangrui Hu,
Siqi Zheng,
Yue Gu,
Ziyang Ma,
Zhifu Gao,
Zhijie Yan
Abstract:
Recent years have seen large language model (LLM) based text-to-speech (TTS) emerge into the mainstream due to its high naturalness and zero-shot capacity. In this paradigm, speech signals are discretized into token sequences, which are modeled by an LLM with text as prompts and reconstructed into waveforms by a token-based vocoder. Speech tokens therefore play a critical role in LLM-based TTS models. Current speech tokens are learned in an unsupervised manner and lack explicit semantic information and alignment to the text. In this paper, we propose to represent speech with supervised semantic tokens, which are derived from a multilingual speech recognition model by inserting vector quantization into the encoder. Based on the tokens, we further propose a scalable zero-shot TTS synthesizer, CosyVoice, which consists of an LLM for text-to-token generation and a conditional flow matching model for token-to-speech synthesis. Experimental results show that supervised semantic tokens significantly outperform existing unsupervised tokens in terms of content consistency and speaker similarity for zero-shot voice cloning. Moreover, we find that utilizing large-scale data further improves the synthesis performance, indicating the scalable capacity of CosyVoice. To the best of our knowledge, this is the first attempt to incorporate supervised speech tokens into TTS models.
Submitted 9 July, 2024; v1 submitted 7 July, 2024;
originally announced July 2024.
-
PTaRL: Prototype-based Tabular Representation Learning via Space Calibration
Authors:
Hangting Ye,
Wei Fan,
Xiaozhuang Song,
Shun Zheng,
He Zhao,
Dandan Guo,
Yi Chang
Abstract:
Tabular data play an important role in diverse real-world fields, such as healthcare, engineering, and finance. With the recent success of deep learning, many tabular machine learning (ML) methods based on deep networks (e.g., Transformer, ResNet) have achieved competitive performance on tabular benchmarks. However, existing deep tabular ML methods suffer from representation entanglement and localization, which largely hinder their prediction performance and lead to performance inconsistency on tabular tasks. To overcome these problems, we explore a novel direction of applying prototype learning for tabular ML and propose a prototype-based tabular representation learning framework, PTaRL, for tabular prediction tasks. The core idea of PTaRL is to construct a prototype-based projection space (P-Space) and learn the disentangled representation around global data prototypes. Specifically, PTaRL mainly involves two stages: (i) Prototype Generation, which constructs global prototypes as the basis vectors of P-Space for representation, and (ii) Prototype Projection, which projects the data samples into P-Space and keeps the core global data information via Optimal Transport. Then, to further acquire the disentangled representations, we constrain PTaRL with two strategies: (i) to diversify the coordinates towards global prototypes of different representations within P-Space, we introduce a diversification constraint for representation calibration; (ii) to avoid prototype entanglement in P-Space, we introduce a matrix orthogonalization constraint to ensure the independence of global prototypes. Finally, we conduct extensive experiments with PTaRL coupled with state-of-the-art deep tabular ML models on various tabular benchmarks, and the results show its consistent superiority.
Submitted 15 July, 2024; v1 submitted 7 July, 2024;
originally announced July 2024.
-
A Mapping Strategy for Interacting with Latent Audio Synthesis Using Artistic Materials
Authors:
Shuoyang Zheng,
Anna Xambó Sedó,
Nick Bryan-Kinns
Abstract:
This paper presents a mapping strategy for interacting with the latent spaces of generative AI models. Our approach involves using unsupervised feature learning to encode a human control space and mapping it to an audio synthesis model's latent space. To demonstrate how this mapping strategy can turn high-dimensional sensor data into control mechanisms of a deep generative model, we present a proof-of-concept system that uses visual sketches to control an audio synthesis model. We draw on emerging discourses in XAIxArts to discuss how this approach can contribute to XAI in artistic and creative contexts. We also discuss its current limitations and propose future research directions.
Submitted 5 July, 2024;
originally announced July 2024.
-
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Authors:
Keyu An,
Qian Chen,
Chong Deng,
Zhihao Du,
Changfeng Gao,
Zhifu Gao,
Yue Gu,
Ting He,
Hangrui Hu,
Kai Hu,
Shengpeng Ji,
Yabin Li,
Zerui Li,
Heng Lu,
Haoneng Luo,
Xiang Lv,
Bin Ma,
Ziyang Ma,
Chongjia Ni,
Changhe Song,
Jiaqi Shi,
Xian Shi,
Hao Wang,
Wen Wang,
Yuxuan Wang
, et al. (8 additional authors not shown)
Abstract:
This report introduces FunAudioLLM, a model family designed to enhance natural voice interactions between humans and large language models (LLMs). At its core are two innovative models: SenseVoice, which handles multilingual speech recognition, emotion recognition, and audio event detection; and CosyVoice, which facilitates natural speech generation with control over multiple languages, timbre, speaking style, and speaker identity. SenseVoice-Small delivers exceptionally low-latency ASR for 5 languages, and SenseVoice-Large supports high-precision ASR for over 50 languages, while CosyVoice excels in multi-lingual voice generation, zero-shot in-context learning, cross-lingual voice cloning, and instruction-following capabilities. The models related to SenseVoice and CosyVoice have been open-sourced on Modelscope and Huggingface, along with the corresponding training, inference, and fine-tuning codes released on GitHub. By integrating these models with LLMs, FunAudioLLM enables applications such as speech-to-speech translation, emotional voice chat, interactive podcasts, and expressive audiobook narration, thereby pushing the boundaries of voice interaction technology. Demos are available at https://fun-audio-llm.github.io, and the code can be accessed at https://github.com/FunAudioLLM.
Submitted 10 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation
Authors:
Hongke Zhao,
Songming Zheng,
Likang Wu,
Bowen Yu,
Jing Wang
Abstract:
The explainability of recommendation systems is crucial for enhancing user trust and satisfaction. Leveraging large language models (LLMs) offers new opportunities for comprehensive recommendation logic generation. However, in existing related studies, fine-tuning LLMs for recommendation tasks incurs high computational costs and alignment issues with existing systems, limiting the application potential of proven proprietary/closed-source LLMs, such as GPT-4. In this work, our proposed strategy, LANE, aligns LLMs with online recommendation systems without additional LLM tuning, reducing costs and improving explainability. This approach addresses key challenges in integrating language models with recommendation systems while fully utilizing the capabilities of powerful proprietary models. Specifically, our strategy operates through several key components: semantic embedding, user multi-preference extraction using zero-shot prompting, semantic alignment, and explainable recommendation generation using Chain of Thought (CoT) prompting. By embedding item titles instead of IDs and utilizing multi-head attention mechanisms, our approach aligns the semantic features of user preferences with those of candidate items, ensuring coherent and user-aligned recommendations. Extensive experimental results, including performance comparison, questionnaire voting, and visualization cases, show that our method can not only ensure recommendation performance but also provide easy-to-understand and reasonable recommendation logic.
Submitted 3 July, 2024;
originally announced July 2024.