-
Probing quarkonium gravitational form factors using contact interaction
Authors:
M. Atif Sultan,
Zanbin Xing,
Khépani Raya,
Adnan Bashir,
Lei Chang
Abstract:
Given the unique role of gravitational form factors in unveiling the internal mechanics of hadrons, we examine the gravitational form factors (GFFs) of quarkonium systems within the Dyson-Schwinger and Bethe-Salpeter equations framework. A contact interaction model is employed, along with a novel approach to the dressed amputated meson-meson scattering amplitude that makes contact with the energy-momentum tensor and so the GFFs. The resulting form factors fulfill the anticipated symmetry constraints. The corresponding charge and mass radii are also analyzed, as well as the so-called $D-$term. For pseudoscalar mesons, we show that the $D-$term is bounded within the $(-1, -1/3)$ range; the bounds corresponding to the massless and infinitely massive cases, respectively. Considering the current diversity of opinions in the field, we anticipate that lattice QCD simulations can provide a refined analysis, as resolving the gravitational form factors and \textit{D}-term of quarkonium holds significant physical implications.
Submitted 15 July, 2024;
originally announced July 2024.
-
SU(3) symmetry analysis in charmed baryon two body decays with penguin diagram contribution
Authors:
Zhi-Peng Xing,
Yu-Ji Shi,
Jin Sun,
Ye Xing
Abstract:
An increasing number of experimental measurements from the BESIII, Belle, and Belle-II collaborations encourage investigations into charmed baryon two-body decay processes. By including contributions from the penguin diagrams that are ignored in previous studies, we perform a global analysis with SU(3) flavor symmetry. Assuming all form factors are real, we achieve a remarkable minimal $χ^{2}/d.o.f = 0.788$ and find that the contribution of the amplitude proportional to $V_{cb}^*V_{ub}$ is of the order $\sim 0.01$, comparable with the contribution of the tree-level diagram. Additionally, by using the KPW theorem to reduce the number of amplitudes from 13 to 7 in the leading contribution, it becomes possible to consider the complex form factor case for the leading IRA amplitude in the global analysis. However, the analysis of complex form factors significantly conflicts with the experimental data $Br(Ξ_c^0\toΞ^-π^+)$, and by excluding this data, $χ^2/d.o.f$ is reduced from 5.95 to 1.19. Although the analysis of complex form factors shows a significant central value of the penguin diagram contribution, the large errors from the corresponding form factors make it a challenge to precisely determine its true contribution. Consequently, the direct CP violation in decay processes is predicted to be approximately zero. With more data in future experiments, the penguin diagram contribution with the amplitude proportional to $V_{cb}^*V_{ub}$ will be precisely determined, allowing for a more accurate prediction of CP violation. Our analysis necessitates further theoretical investigations and experimental measurements in the future.
Submitted 12 July, 2024;
originally announced July 2024.
-
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models
Authors:
Zhening Xing,
Gereon Fox,
Yanhong Zeng,
Xingang Pan,
Mohamed Elgharib,
Christian Theobalt,
Kai Chen
Abstract:
Large Language Models have shown remarkable efficacy in generating streaming data such as text and audio, thanks to their temporally uni-directional attention mechanism, which models correlations between the current token and previous tokens. However, video streaming remains much less explored, despite a growing need for live video processing. State-of-the-art video diffusion models leverage bi-directional temporal attention to model the correlations between the current frame and all the surrounding (i.e. including future) frames, which hinders them from processing streaming videos. To address this problem, we present Live2Diff, the first attempt at designing a video diffusion model with uni-directional temporal attention, specifically targeting live streaming video translation. Compared to previous works, our approach ensures temporal consistency and smoothness by correlating the current frame with its predecessors and a few initial warmup frames, without any future frames. Additionally, we use a highly efficient denoising scheme featuring a KV-cache mechanism and pipelining, to facilitate streaming video translation at interactive framerates. Extensive experiments demonstrate the effectiveness of the proposed attention mechanism and pipeline, outperforming previous methods in terms of temporal smoothness and/or efficiency.
Submitted 11 July, 2024;
originally announced July 2024.
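The uni-directional temporal attention described in the Live2Diff abstract can be pictured as a boolean mask over frames. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `unidirectional_mask`, the `warmup` count, and the sliding `window` over recent predecessors are all hypothetical choices made for illustration. The abstract only states that the current frame is correlated with its predecessors and a few initial warmup frames, and never with future frames.

```python
def unidirectional_mask(num_frames, warmup, window):
    """Return mask[i][j] == True when frame i may attend to frame j.

    Assumptions (illustrative only): a frame sees itself, the first
    `warmup` frames that are not in its future, and its `window` most
    recent frames (itself included). No frame ever sees a future frame.
    """
    mask = []
    for i in range(num_frames):
        row = []
        for j in range(num_frames):
            is_past_or_self = j <= i      # never look at future frames
            is_warmup = j < warmup        # initial warmup frames stay visible
            is_recent = i - j < window    # recent predecessors (and self)
            row.append(is_past_or_self and (is_warmup or is_recent))
        mask.append(row)
    return mask


mask = unidirectional_mask(num_frames=6, warmup=2, window=2)
for row in mask:
    # 'x' marks an allowed attention edge, '.' a masked one
    print("".join("x" if allowed else "." for allowed in row))
```

Because every row depends only on earlier columns, keys and values of past frames could in principle be cached and reused as new frames arrive, which is the property a KV-cache relies on.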
-
Automatically Analyzing Performance Issues in Android Apps: How Far Are We?
Authors:
Dianshu Liao,
Shidong Pan,
Siyuan Yang,
Yitong Wang,
Yanjie Zhao,
Zhenchang Xing,
Xiaoyu Sun
Abstract:
Performance plays a critical role in ensuring the smooth operation of any mobile application, directly influencing user engagement and retention. Android applications are no exception. However, unlike functionality issues, performance issues are more challenging to discover as their root causes are sophisticated and typically emerge under specific payloads. To tackle this problem, researchers have dedicated substantial efforts to proposing automatic approaches for understanding, detecting, and resolving performance issues. Despite these endeavors, it still remains unknown what the status quo of Android performance analysis is, and whether existing approaches can indeed accurately reflect real performance issues. To fill this research gap, we conducted a systematic literature review followed by an explanatory study to explore relevant studies and real-world challenges. Our findings reveal that current tools have limited capabilities, covering only 17.50% of the performance issues. Additionally, existing datasets encompass only 27.50% of the issues and are very limited in size. We also show real-world issue patterns, underscoring the huge gap between the identified techniques and practical concerns. Furthermore, possible solutions are provided to guide future research towards achieving effective performance issue detection and resolution.
Submitted 6 July, 2024;
originally announced July 2024.
-
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds
Authors:
Yiming Zhang,
Yicheng Gu,
Yanhong Zeng,
Zhening Xing,
Yuancheng Wang,
Zhizheng Wu,
Kai Chen
Abstract:
We study Neural Foley, the automatic generation of high-quality sound effects synchronized with videos, enabling an immersive audio-visual experience. Despite its wide range of applications, existing approaches encounter limitations when it comes to simultaneously synthesizing high-quality and video-aligned (i.e., semantically relevant and temporally synchronized) sounds. To overcome these limitations, we propose FoleyCrafter, a novel framework that leverages a pre-trained text-to-audio model to ensure high-quality audio generation. FoleyCrafter comprises two key components: the semantic adapter for semantic alignment and the temporal controller for precise audio-video synchronization. The semantic adapter utilizes parallel cross-attention layers to condition audio generation on video features, producing realistic sound effects that are semantically relevant to the visual content. Meanwhile, the temporal controller incorporates an onset detector and a timestamp-based adapter to achieve precise audio-video alignment. One notable advantage of FoleyCrafter is its compatibility with text prompts, enabling the use of text descriptions to achieve controllable and diverse video-to-audio generation according to user intents. We conduct extensive quantitative and qualitative experiments on standard benchmarks to verify the effectiveness of FoleyCrafter. Models and code are available at https://github.com/open-mmlab/FoleyCrafter.
Submitted 1 July, 2024;
originally announced July 2024.
-
Non-Hermitian skin effect in arbitrary dimensions: non-Bloch band theory and classification
Authors:
Yuncheng Xiong,
Ze-Yu Xing,
Haiping Hu
Abstract:
Non-Hermitian skin effect (NHSE) is a distinctive phenomenon in non-Hermitian systems, characterized by a significant accumulation of eigenstates at system boundaries. While well-understood in one dimension via non-Bloch band theory, unraveling the NHSE in higher dimensions faces formidable challenges due to the diversity of open boundary conditions or lattice geometries and inevitable numerical errors. Key issues, including higher-dimensional non-Bloch band theory, geometric dependency, spectral convergence and stability, and a complete classification of NHSE, remain elusive. In this work, we address these challenges by presenting a geometry-adaptive non-Bloch band theory in arbitrary dimensions, through the lens of spectral potential. Our formulation accurately determines the energy spectra, density of states, and generalized Brillouin zone for a given geometry in the thermodynamic limit (TDL), revealing their geometric dependencies. Furthermore, we systematically classify the NHSE into critical and non-reciprocal types using net winding numbers. In the critical case, we identify novel scale-free skin modes residing on the boundary. In the nonreciprocal case, the skin modes manifest in various forms, including normal or anomalous corner modes, boundary modes or scale-free modes. We reveal the non-convergence and instability of the non-Bloch spectra in the presence of scale-free modes and attribute it to the non-exchangeability of the zero-perturbation limit and the TDL. The instability drives the energy spectra towards the Amoeba spectra in the critical case. Our findings provide a unified non-Bloch band theory governing the energy spectra, density of states, and generalized Brillouin zone in the TDL, offering a comprehensive understanding of NHSE in arbitrary dimensions.
Submitted 1 July, 2024;
originally announced July 2024.
-
Observable CP-violation in charmed baryons decays with SU(3) symmetry analysis
Authors:
Jin Sun,
Ruilin Zhu,
Zhi-Peng Xing
Abstract:
CP violation in baryon decays has recently garnered significant attention. Traditionally, CP violation in charmed baryon decays is predicted to be at the order of $O(10^{-4})$. Utilizing recent measurements from BESIII, Belle and Belle~II experiments, we conduct a comprehensive global analysis to estimate the contribution of topological diagrams in these processes. Our study reveals that topological diagrams alone cannot fully account for the experimental data. By incorporating new effects that are not included in traditional topological diagrams and exploring the possible source of the new effect, our analysis reveals the potential for significant CP violation in $Λ_c^+\to n π^+$ and $Ξ_c^+\to Ξ^0 K^+$. This suggests a promising opportunity to observe CP violation for the first time in charmed baryon decays.
Submitted 11 July, 2024; v1 submitted 29 June, 2024;
originally announced July 2024.
-
Formation of Wind-Fed Black Hole High-mass X-ray Binaries: The Role of Roche-lobe-Overflow Post Black-Hole Formation
Authors:
Zepei Xing,
Tassos Fragos,
Emmanouil Zapartas,
Tom M. Kwan,
Lixin Dai,
Ilya Mandel,
Matthias U. Kruckow,
Max Briel,
Jeff J. Andrews,
Simone S. Bavera,
Seth Gossage,
Konstantinos Kovlakas,
Kyle A. Rocha,
Meng Sun,
Philipp M. Srivastava
Abstract:
The three dynamically confirmed wind-fed black hole high-mass X-ray binaries (BH-HMXBs) are suggested to all contain a highly spinning black hole (BH). However, based on the theories of efficient angular momentum transport inside the stars, we expect that the first-born BHs in binary systems should have low spins, which is consistent with gravitational-wave observations. As a result, the origin of the high BH spins measured in wind-fed BH-HMXBs remains a mystery. In this paper, we conduct a binary population synthesis study on wind-fed BH-HMXBs at solar metallicity with the use of the newly developed code POSYDON, considering three scenarios for BH accretion: Eddington-limited, moderately super-Eddington, and fully conservative accretion. Taking into account the conditions for accretion-disk formation, we find that regardless of the accretion model, these systems are more likely to have already experienced a phase of Roche-lobe overflow after the BH formation. To account for the extreme BH spins, highly conservative accretion onto BHs is required, when assuming the accreted material carries the specific angular momentum at the innermost stable orbit. Besides, in our simulations we found that the systems with donor stars within the mass range of $10-20\,M_{\odot}$ are prevalent, posing a challenge in explaining simultaneously all observed properties of the BH-HMXB in our Galaxy, Cygnus X-1, and potentially hinting that the accretion efficiency onto non-degenerate stars, before the formation of the BH, is also more conservative than assumed in our simulations.
Submitted 28 June, 2024;
originally announced July 2024.
-
A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps
Authors:
Shidong Pan,
Tianchen Guo,
Lihong Zhang,
Pei Liu,
Zhenchang Xing,
Xiaoyu Sun
Abstract:
Application Programming Interface (API) incompatibility is a long-standing issue in Android application development. The rapid evolution of Android APIs results in a significant number of API additions, removals, and changes between adjacent versions. Unfortunately, this high frequency of alterations may lead to compatibility issues, often without adequate notification to developers regarding these changes. Although researchers have proposed some work on detecting compatibility issues caused by changes in API signatures, they often overlook compatibility issues stemming from sophisticated semantic changes. In response to this challenge, we conducted a large-scale discovery of incompatible APIs in the Android Open Source Project (AOSP) by leveraging static analysis and pre-trained Large Language Models (LLMs) across adjacent versions. We systematically formulate the problem and propose a unified framework to detect incompatible APIs, especially for semantic changes. It's worth highlighting that our approach achieves a 0.83 F1-score in identifying semantically incompatible APIs in the Android framework. Ultimately, our approach detects 5,481 incompatible APIs spanning from version 4 to version 33. We further demonstrate its effectiveness in supplementing the state-of-the-art methods in detecting a broader spectrum of compatibility issues (+92.3%) that have been previously overlooked.
Submitted 26 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Bridging Electromagnetic and Gravitational Form Factors: Insights from LFHQCD
Authors:
Xiaobin Wang,
Zanbin Xing,
Minghui Ding,
Khépani Raya,
Lei Chang
Abstract:
We propose an efficacious approach to derive the generalized parton distributions for the pion and proton, based upon prior knowledge of their respective parton distribution functions (PDFs). Our method leverages integral representations of the electromagnetic form factors derived from the light-front holographic QCD (LFHQCD) formalism, coupled with PDFs computed from continuum Schwinger functional methods at the hadronic scale. Using these techniques, we calculate gravitational form factors and associated mass distributions for each hadron. Remarkably, our calculations yield results that closely match recent lattice QCD simulations conducted near the physical pion mass. This work not only deepens our understanding of hadronic structure but also highlights the efficacy of the LFHQCD approach in modeling fundamental properties of hadrons.
Submitted 13 June, 2024;
originally announced June 2024.
-
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
Authors:
Miaosen Zhang,
Yixuan Wei,
Zhen Xing,
Yifei Ma,
Zuxuan Wu,
Ji Li,
Zheng Zhang,
Qi Dai,
Chong Luo,
Xin Geng,
Baining Guo
Abstract:
Modern vision models are trained on very large noisy datasets. While these models acquire strong capabilities, they may not follow the user's intent to output the desired results in certain aspects, e.g., visual aesthetic, preferred style, and responsibility. In this paper, we target the realm of visual aesthetics and aim to align vision models with human aesthetic standards in a retrieval system. Advanced retrieval systems usually adopt a cascade of aesthetic models as re-rankers or filters, which are limited to low-level features like saturation and perform poorly when stylistic, cultural or knowledge contexts are involved. We find that utilizing the reasoning ability of large language models (LLMs) to rephrase the search query and extend the aesthetic expectations can make up for this shortcoming. Based on the above findings, we propose a preference-based reinforcement learning method that fine-tunes the vision models to distill the knowledge from both LLM reasoning and the aesthetic models to better align the vision models with human aesthetics. Meanwhile, since few benchmarks are designed for evaluating retrieval systems, we leverage large multi-modality models (LMMs) and their strong abilities to evaluate aesthetic performance. As aesthetic assessment is one of the most subjective tasks, to validate the robustness of LMMs, we further propose a novel dataset named HPIR to benchmark the alignment with human aesthetics. Experiments demonstrate that our method significantly enhances the aesthetic behaviors of the vision models, under several metrics. We believe the proposed algorithm can be a general practice for aligning vision models with human values.
Submitted 13 June, 2024;
originally announced June 2024.
-
VersiCode: Towards Version-controllable Code Generation
Authors:
Tongtong Wu,
Weigang Wu,
Xingyu Wang,
Kang Xu,
Suyu Ma,
Bo Jiang,
Ping Yang,
Zhenchang Xing,
Yuan-Fang Li,
Gholamreza Haffari
Abstract:
Significant research has focused on improving the performance of large language models on code-related tasks due to their practical importance. Although performance is typically evaluated using public benchmark datasets, the existing datasets do not account for the concept of \emph{version}, which is crucial in professional software development. In this paper, we introduce VersiCode, the first comprehensive dataset designed to assess the ability of large language models to generate verifiable code for specific library versions. VersiCode encompasses 300 libraries across more than 2,000 versions spanning 9 years. We design two dedicated evaluation tasks: version-specific code completion (VSCC) and version-aware code editing (VACE). Comprehensive experiments are conducted to benchmark the performance of LLMs, revealing the challenging nature of these tasks and of VersiCode: even state-of-the-art LLMs struggle to generate version-correct code. This dataset, together with the proposed tasks, sheds light on LLMs' capabilities and limitations in handling version-specific code generation, and opens up an important new area of research for further investigation. The resources can be found at https://github.com/wutong8023/VersiCode.
Submitted 11 June, 2024;
originally announced June 2024.
-
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Authors:
Zhen Xing,
Qi Dai,
Zejia Weng,
Zuxuan Wu,
Yu-Gang Jiang
Abstract:
Text-guided video prediction (TVP) involves predicting the motion of future frames from the initial frame according to an instruction, which has wide applications in virtual reality, robotics, and content creation. Previous TVP methods make significant breakthroughs by adapting Stable Diffusion for this task. However, they struggle with frame consistency and temporal stability primarily due to the limited scale of video datasets. We observe that pretrained Image2Video diffusion models possess good priors for video dynamics but they lack textual control. Hence, transferring Image2Video models to leverage their video dynamic priors while injecting instruction control to generate controllable videos is both a meaningful and challenging task. To achieve this, we introduce the Multi-Modal Large Language Model (MLLM) to predict future video states based on initial frames and text instructions. More specifically, we design a dual query transformer (DQFormer) architecture, which integrates the instructions and frames into the conditional embeddings for future frame prediction. Additionally, we develop Long-Short Term Temporal Adapters and Spatial Adapters that can quickly transfer general video diffusion models to specific scenarios with minimal training costs. Experimental results show that our method significantly outperforms state-of-the-art techniques on four datasets: Something Something V2, Epic Kitchen-100, Bridge Data, and UCF-101. Notably, AID achieves 91.2% and 55.5% FVD improvements on Bridge and SSv2 respectively, demonstrating its effectiveness in various domains. More examples can be found at our website https://chenhsing.github.io/AID.
Submitted 10 June, 2024;
originally announced June 2024.
-
Refactoring to Pythonic Idioms: A Hybrid Knowledge-Driven Approach Leveraging Large Language Models
Authors:
Zejun Zhang,
Zhenchang Xing,
Xiaoxue Ren,
Qinghua Lu,
Xiwei Xu
Abstract:
Pythonic idioms are highly valued and widely used in the Python programming community. However, many Python users find it challenging to use Pythonic idioms. Adopting a rule-based approach or LLM-only approach is not sufficient to overcome three persistent challenges of code idiomatization including code miss, wrong detection and wrong refactoring. Motivated by the determinism of rules and adaptability of LLMs, we propose a hybrid approach consisting of three modules. We not only write prompts to instruct LLMs to complete tasks, but we also invoke Analytic Rule Interfaces (ARIs) to accomplish tasks. The ARIs are Python code generated by prompting LLMs to generate code. We first construct a knowledge module with three elements including ASTscenario, ASTcomponent and Condition, and prompt LLMs to generate Python code for incorporation into an ARI library for subsequent use. After that, for any syntax-error-free Python code, we invoke ARIs from the ARI library to extract ASTcomponent from the ASTscenario, and then filter out ASTcomponent that does not meet the condition. Finally, we design prompts to instruct LLMs to abstract and idiomatize code, and then invoke ARIs from the ARI library to rewrite non-idiomatic code into the idiomatic code. Next, we conduct a comprehensive evaluation of our approach, RIdiom, and Prompt-LLM on nine established Pythonic idioms in RIdiom. Our approach exhibits superior accuracy, F1-score, and recall, while maintaining precision levels comparable to RIdiom, all of which consistently exceed or come close to 90% for each metric of each idiom. Lastly, we extend our evaluation to encompass four new Pythonic idioms. Our approach consistently outperforms Prompt-LLM, achieving metrics with values consistently exceeding 90% for accuracy, F1-score, precision, and recall.
Submitted 5 June, 2024;
originally announced June 2024.
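The extract-then-filter step described in the abstract above (pull candidate AST components out of a scenario, then drop those that fail a condition) can be illustrated with a small hand-written rule. This is a sketch under stated assumptions, not one of the paper's LLM-generated ARIs: the function name `find_append_loops` and the specific condition (a `for` loop whose only statement is a single-argument `.append` call on a plain name) are hypothetical and chosen for illustration. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast


def find_append_loops(source):
    """Return (line number, suggested comprehension) for simple append loops.

    The suggestion is only a candidate rewrite: it is equivalent to the loop
    only when the target list is empty just before the loop, which this
    sketch does not check.
    """
    suggestions = []
    for node in ast.walk(ast.parse(source)):
        # Scenario: a for loop with exactly one body statement and no else.
        if not isinstance(node, ast.For) or node.orelse or len(node.body) != 1:
            continue
        stmt = node.body[0]
        if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
            continue
        call = stmt.value
        func = call.func
        # Condition: the statement is `name.append(single_argument)`.
        if (isinstance(func, ast.Attribute) and func.attr == "append"
                and isinstance(func.value, ast.Name)
                and len(call.args) == 1 and not call.keywords):
            comprehension = (
                f"[{ast.unparse(call.args[0])} "
                f"for {ast.unparse(node.target)} in {ast.unparse(node.iter)}]"
            )
            suggestions.append((node.lineno, f"{func.value.id} = {comprehension}"))
    return suggestions


code = "squares = []\nfor n in numbers:\n    squares.append(n * n)\n"
print(find_append_loops(code))
```

A purely syntactic rule like this is deterministic but narrow; loops with guards, multiple statements, or side effects fall outside it, which is the kind of gap the abstract attributes to rule-only approaches.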
-
PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning
Authors:
Yupeng Zheng,
Zebin Xing,
Qichao Zhang,
Bu Jin,
Pengfei Li,
Yuhang Zheng,
Zhongpu Xia,
Kun Zhan,
Xianpeng Lang,
Yaran Chen,
Dongbin Zhao
Abstract:
Vehicle motion planning is an essential component of autonomous driving technology. Current rule-based vehicle motion planning methods perform satisfactorily in common scenarios but struggle to generalize to long-tailed situations. Meanwhile, learning-based methods have yet to achieve superior performance over rule-based approaches in large-scale closed-loop scenarios. To address these issues, we propose PlanAgent, the first mid-to-mid planning system based on a Multi-modal Large Language Model (MLLM). MLLM is used as a cognitive agent to introduce human-like knowledge, interpretability, and common-sense reasoning into the closed-loop planning. Specifically, PlanAgent leverages the power of MLLM through three core modules. First, an Environment Transformation module constructs a Bird's Eye View (BEV) map and a lane-graph-based textual description from the environment as inputs. Second, a Reasoning Engine module introduces a hierarchical chain-of-thought from scene understanding to lateral and longitudinal motion instructions, culminating in planner code generation. Last, a Reflection module is integrated to simulate and evaluate the generated planner for reducing MLLM's uncertainty. PlanAgent is endowed with the common-sense reasoning and generalization capability of MLLM, which empowers it to effectively tackle both common and complex long-tailed scenarios. Our proposed PlanAgent is evaluated on the large-scale and challenging nuPlan benchmarks. A comprehensive set of experiments convincingly demonstrates that PlanAgent outperforms the existing state-of-the-art in the closed-loop motion planning task. Codes will be soon released.
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Mapping the sources of CP violation in neutrino oscillations from the seesaw mechanism
Authors:
Zhi-zhong Xing
Abstract:
We present the first complete calculation of the Jarlskog invariant, a working measure of the strength of CP violation in the flavor oscillations of three light neutrino species, with the help of a full Euler-like block parametrization of the flavor structure in the canonical seesaw mechanism. We find that this invariant depends on 240 linear combinations of the 6 original phase parameters that are responsible for CP violation in the decays of three heavy Majorana neutrinos in 27 linear combinations as a whole, and thus provides the first model-independent connection between the microscopic and macroscopic matter-antimatter asymmetries.
Submitted 3 June, 2024;
originally announced June 2024.
-
No Vandalism: Privacy-Preserving and Byzantine-Robust Federated Learning
Authors:
Zhibo Xing,
Zijian Zhang,
Zi'ang Zhang,
Jiamou Liu,
Liehuang Zhu,
Giovanni Russello
Abstract:
Federated learning allows several clients to train one machine learning model jointly without sharing private data, providing privacy protection. However, traditional federated learning is vulnerable to poisoning attacks, which can not only decrease the model performance, but also implant malicious backdoors. In addition, direct submission of local model parameters can also lead to privacy leakage of the training dataset. In this paper, we aim to build a privacy-preserving and Byzantine-robust federated learning scheme to provide an environment with no vandalism (NoV) against attacks from malicious participants. Specifically, we construct a model filter for poisoned local models, protecting the global model from data and model poisoning attacks. This model filter combines zero-knowledge proofs to provide further privacy protection. Then, we adopt secret sharing to provide verifiable secure aggregation, removing malicious clients that disrupt the aggregation process. Our formal analysis proves that NoV can protect data privacy and weed out Byzantine attackers. Our experiments illustrate that NoV can effectively address data and model poisoning attacks, including PGD, and outperforms other related schemes.
Submitted 3 June, 2024;
originally announced June 2024.
-
Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay
Authors:
Daya Bay collaboration,
F. P. An,
W. D. Bai,
A. B. Balantekin,
M. Bishai,
S. Blyth,
G. F. Cao,
J. Cao,
J. F. Chang,
Y. Chang,
H. S. Chen,
H. Y. Chen,
S. M. Chen,
Y. Chen,
Y. X. Chen,
Z. Y. Chen,
J. Cheng,
J. Cheng,
Y. -C. Cheng,
Z. K. Cheng,
J. J. Cherwinka,
M. C. Chu,
J. P. Cummings,
O. Dalager,
F. S. Deng
, et al. (177 additional authors not shown)
Abstract:
This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive region, the relative $\overline{ν}_{e}$ rates and energy spectra variation among the near and far detectors give $\mathrm{sin}^22θ_{13} = 0.0759_{-0.0049}^{+0.0050}$ and $Δm^2_{32} = (2.72^{+0.14}_{-0.15})\times10^{-3}$ eV$^2$ assuming the normal neutrino mass ordering, and $Δm^2_{32} = (-2.83^{+0.15}_{-0.14})\times10^{-3}$ eV$^2$ for the inverted neutrino mass ordering. This estimate of $\sin^2 2θ_{13}$ is consistent with and essentially independent from the one obtained using the capture-on-gadolinium sample at Daya Bay. The combination of these two results yields $\mathrm{sin}^22θ_{13}= 0.0833\pm0.0022$, which represents an 8% relative improvement in precision compared with the Daya Bay full 3158-day capture-on-gadolinium result.
Submitted 3 June, 2024;
originally announced June 2024.
-
Infinite class field tower with small root discriminant
Authors:
Qi Liu,
Zugan Xing
Abstract:
We generalize Schoof's 1986 theorem and apply it to construct a class of Kummer extensions of the cyclotomic fields with infinite class field tower. As an application, we give some number fields with a small root discriminant, which have an infinite $p$-class field tower when $p=3, 5, 7$.
Submitted 6 June, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
VBIM-Net: Variational Born Iterative Network for Inverse Scattering Problems
Authors:
Ziqing Xing,
Zhaoyang Zhang,
Zirui Chen,
Yusong Wang,
Haoran Ma,
Zhun Wei,
Gang Bao
Abstract:
Recently, studies have shown the potential of integrating field-type iterative methods with deep learning (DL) techniques in solving inverse scattering problems (ISPs). In this article, we propose a novel Variational Born Iterative Network, namely, VBIM-Net, to solve the full-wave ISPs with significantly improved flexibility and inversion quality. The proposed VBIM-Net emulates the alternating updates of the total electric field and the contrast in the variational Born iterative method (VBIM) by multiple layers of subnetworks. We embed the calculation of the contrast variation into each of the subnetworks, converting the scattered field residual into an approximate contrast variation and then enhancing it by a U-Net, thus avoiding the requirement of matched measurement dimension and grid resolution as in existing approaches. The total field and contrast of each layer's output are supervised in the loss function of VBIM-Net, which guarantees the physical interpretability of variables of the subnetworks. In addition, we design a training scheme with extra noise to enhance the model's stability. Extensive numerical results on synthetic and experimental data both verify the inversion quality, generalization ability, and robustness of the proposed VBIM-Net. This work may provide some new inspiration for the design of efficient field-type DL schemes.
Submitted 28 May, 2024;
originally announced May 2024.
-
JUNO Sensitivity to Invisible Decay Modes of Neutrons
Authors:
JUNO Collaboration,
Angel Abusleme,
Thomas Adam,
Kai Adamowicz,
Shakeel Ahmad,
Rizwan Ahmed,
Sebastiano Aiello,
Fengpeng An,
Qi An,
Giuseppe Andronico,
Nikolay Anfimov,
Vito Antonelli,
Tatiana Antoshkina,
João Pedro Athayde Marcondes de André,
Didier Auguste,
Weidong Bai,
Nikita Balashov,
Wander Baldini,
Andrea Barresi,
Davide Basilico,
Eric Baussan,
Marco Bellato,
Marco Beretta,
Antonio Bergnoli,
Daniel Bick
, et al. (635 additional authors not shown)
Abstract:
We explore the decay of bound neutrons into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation modes of the excited residual nuclei can produce a time- and space-correlated triple coincidence signal in the JUNO detector. Based on a full Monte Carlo simulation informed with the latest available data, we estimate all backgrounds, including inverse beta decay events of the reactor antineutrino $\bar{ν}_e$, natural radioactivity, cosmogenic isotopes and neutral current interactions of atmospheric neutrinos. Pulse shape discrimination and multivariate analysis techniques are employed to further suppress backgrounds. With two years of exposure, JUNO is expected to give an order of magnitude improvement compared to the current best limits. After 10 years of data taking, the JUNO expected sensitivities at a 90% confidence level are $τ/B( n \rightarrow { inv} ) > 5.0 \times 10^{31} \, {\rm yr}$ and $τ/B( nn \rightarrow { inv} ) > 1.4 \times 10^{32} \, {\rm yr}$.
Submitted 27 May, 2024;
originally announced May 2024.
-
Don't Chase Your Tail! Missing Key Aspects Augmentation in Textual Vulnerability Descriptions of Long-tail Software through Feature Inference
Authors:
Linyi Han,
Shidong Pan,
Zhenchang Xing,
Jiamou Sun,
Sofonias Yitagesu,
Xiaowang Zhang,
Zhiyong Feng
Abstract:
Augmenting missing key aspects in Textual Vulnerability Descriptions (TVDs) for software with a large user base (referred to as non-long-tail software) has greatly advanced vulnerability analysis and software security research. However, these methods often overlook software instances that have a limited user base (referred to as long-tail software) due to limited TVDs, variations in software features, and domain-specific jargon, which hinders vulnerability analysis and software repairs. In this paper, we introduce a novel software feature inference framework designed to augment the missing key aspects of TVDs for long-tail software. Firstly, we tackle the issue of non-standard software names found in community-maintained vulnerability databases by cross-referencing government databases with Common Vulnerabilities and Exposures (CVEs). Next, we employ Large Language Models (LLMs) to generate the missing key aspects. However, the limited availability of historical TVDs restricts the variety of examples. To overcome this limitation, we utilize the Common Weakness Enumeration (CWE) to classify all TVDs and select cluster centers as representative examples. To ensure accuracy, we present Natural Language Inference (NLI) models specifically designed for long-tail software. These models identify and eliminate incorrect responses. Additionally, we use a wiki repository to provide explanations for proprietary terms. Our evaluations demonstrate that our approach significantly improves the accuracy of augmenting missing key aspects of TVDs for long-tail software from 0.27 to 0.56 (+107%). Interestingly, the accuracy of non-long-tail software also increases from 64% to 71%. As a result, our approach can be useful in various downstream tasks that require complete TVD information.
Submitted 12 May, 2024;
originally announced May 2024.
-
Recursive stochastic differential games with non-Lipschitzian generators and viscosity solutions of Hamilton-Jacobi-Bellman-Isaacs equation
Authors:
Guangchen Wang,
Zhuangzhuang Xing
Abstract:
This investigation is dedicated to a two-player zero-sum stochastic differential game (SDG), where a cost function is characterized by a backward stochastic differential equation (BSDE) with a continuous and monotonic generator regarding the first unknown variable, which possesses immense applicability in financial engineering. A verification theorem is given by virtue of the classical solution of the derived Hamilton-Jacobi-Bellman-Isaacs (HJBI) equation. The dynamic programming principle (DPP) and unique weak (viscosity) solvability of the HJBI equation are formulated through the comparison theorem for BSDEs with monotonic generators and the stability of viscosity solutions. Some new regularity properties of the value function are presented. Finally, we propose three concrete examples, which concern, respectively, the classical solution and the viscosity solution of the HJBI equation, and a financial application in which an investor with a non-Lipschitzian Epstein-Zin utility deals with market friction to maximize her utility preference.
Submitted 18 April, 2024;
originally announced April 2024.
-
An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping
Authors:
Boming Xia,
Qinghua Lu,
Liming Zhu,
Zhenchang Xing
Abstract:
The advent of advanced AI underscores the urgent need for comprehensive safety evaluations, necessitating collaboration across communities (i.e., AI, software engineering, and governance). However, divergent practices and terminologies across these communities, combined with the complexity of AI systems (of which models are only a part) and environmental affordances (e.g., access to tools), obstruct effective communication and comprehensive evaluation. This paper proposes a framework for AI system evaluation comprising three components: 1) harmonised terminology to facilitate communication across communities involved in AI safety evaluation; 2) a taxonomy identifying essential elements for AI system evaluation; 3) a mapping between AI lifecycle, stakeholders, and requisite evaluations for an accountable AI supply chain. This framework catalyses a deeper discourse on AI system evaluation beyond model-centric approaches.
Submitted 15 May, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
Search for a sub-eV sterile neutrino using Daya Bay's full dataset
Authors:
F. P. An,
W. D. Bai,
A. B. Balantekin,
M. Bishai,
S. Blyth,
G. F. Cao,
J. Cao,
J. F. Chang,
Y. Chang,
H. S. Chen,
H. Y. Chen,
S. M. Chen,
Y. Chen,
Y. X. Chen,
Z. Y. Chen,
J. Cheng,
Y. C. Cheng,
Z. K. Cheng,
J. J. Cherwinka,
M. C. Chu,
J. P. Cummings,
O. Dalager,
F. S. Deng,
X. Y. Ding,
Y. Y. Ding
, et al. (176 additional authors not shown)
Abstract:
This Letter presents results of a search for the mixing of a sub-eV sterile neutrino with three active neutrinos based on the full data sample of the Daya Bay Reactor Neutrino Experiment, collected during 3158 days of detector operation, which contains $5.55 \times 10^{6}$ reactor $\overline{ν}_e$ candidates identified as inverse beta-decay interactions followed by neutron-capture on gadolinium. The analysis benefits from a doubling of the statistics of our previous result and from improvements of several important systematic uncertainties. No significant oscillation due to mixing of a sub-eV sterile neutrino with active neutrinos was found. Exclusion limits are set by both Feldman-Cousins and CLs methods. Light sterile neutrino mixing with $\sin^2 2θ_{14} \gtrsim 0.01$ can be excluded at 95\% confidence level in the region of $0.01$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.1 $ eV$^2$. This result represents the world-leading constraints in the region of $2 \times 10^{-4}$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.2 $ eV$^2$.
Submitted 15 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
Authors:
Qijun Feng,
Zhen Xing,
Zuxuan Wu,
Yu-Gang Jiang
Abstract:
Reconstructing detailed 3D objects from single-view images remains a challenging task due to the limited information available. In this paper, we introduce FDGaussian, a novel two-stage framework for single-image 3D reconstruction. Recent methods typically utilize pre-trained 2D diffusion models to generate plausible novel views from the input image, yet they encounter issues with either multi-view inconsistency or lack of geometric fidelity. To overcome these challenges, we propose an orthogonal plane decomposition mechanism to extract 3D geometric features from the 2D input, enabling the generation of consistent multi-view images. Moreover, we further accelerate state-of-the-art Gaussian Splatting by incorporating epipolar attention to fuse images from different viewpoints. We demonstrate that FDGaussian generates images with high consistency across different views and reconstructs high-quality 3D objects, both qualitatively and quantitatively. More examples can be found at our website https://qjfeng.net/FDGaussian/.
Submitted 15 March, 2024;
originally announced March 2024.
-
To Be or not to Be: the role of rotation in modeling Galactic Be X-ray Binaries
Authors:
Kyle Akira Rocha,
Vicky Kalogera,
Zoheyr Doctor,
Jeff J. Andrews,
Meng Sun,
Seth Gossage,
Simone S. Bavera,
Tassos Fragos,
Konstantinos Kovlakas,
Matthias U. Kruckow,
Devina Misra,
Philipp M. Srivastava,
Zepei Xing,
Emmanouil Zapartas
Abstract:
Be X-ray binaries (Be-XRBs) are crucial in understanding high-mass X-ray binaries, featuring a rapidly rotating Be star and a neutron star companion in an eccentric orbit, intermittently accreting material from the Be star's decretion disk. Originating from binary stellar evolution, Be-XRBs are of significant interest to binary population synthesis (BPS) studies, encapsulating the physics of supernovae, common envelope, and mass transfer (MT). Using the POSYDON BPS code, employing pre-computed grids of detailed binary stellar evolution models, we investigate the Galactic Be-XRB population. POSYDON incorporates stellar rotation self-consistently during MT phases, enabling a detailed examination of the rotational distribution of Be stars. Our fiducial BPS and Be-XRB model align well with the orbital properties of Galactic Be-XRBs, emphasizing the role of rotational constraints. Our modeling reveals a bimodal rotational distribution of Be-XRB-like systems, in excellent agreement with literature values. All Be-XRBs undergo an MT phase before the first compact object forms, with over half experiencing a second MT phase from a stripped helium companion (Case BB). Computing rotationally-limited MT efficiencies and applying them to our population, we find that the majority of Be-XRBs have undergone highly non-conservative MT (beta ~ 0.15). Our study underscores the importance of detailed angular momentum modeling during MT in interpreting Be-XRB populations, emphasizing this population as a key probe for the stability and efficiency of MT in interacting binaries.
Submitted 11 March, 2024;
originally announced March 2024.
-
On the origin of topotactic reduction effect for superconductivity in infinite-layer nickelates
Authors:
Shengwei Zeng,
Chi Sin Tang,
Zhaoyang Luo,
Lin Er Chow,
Zhi Shiuh Lim,
Saurav Prakash,
Ping Yang,
Caozheng Diao,
Xiaojiang Yu,
Zhenxiang Xing,
Rong Ji,
Xinmao Yin,
Changjian Li,
X. Renshaw Wang,
Qian He,
Mark B. H. Breese,
A. Ariando,
Huajun Liu
Abstract:
Topotactic reduction utilizing metal hydrides as reagents emerges as an effective approach to achieve exceptionally low oxidation states of metal ions and unconventional coordination networks. This method opens avenues to the development of entirely new functional materials, with one notable example being the infinite-layer nickelate superconductors. However, the reduction effect on the atomic reconstruction and electronic structures -- crucial for superconductivity -- remains largely unresolved. We design two sets of control Nd$_{0.8}$Sr$_{0.2}$NiO$_2$ thin films and implement secondary ion mass spectroscopy to highlight the absence of reduction-induced hydrogen intercalation. X-ray absorption spectroscopy shows a significant linear dichroism with dominant Ni 3d$_{x^2-y^2}$ orbitals on superconducting samples, indicating a Ni single-band nature of infinite-layer nickelates. Consistent with the superconducting $T_c$, the Ni 3d orbital asymmetry manifests a dome-like reduction duration dependence. Our results unveil the critical role of reduction in modulating the Ni-3d orbital polarization and its impact on the superconducting properties.
Submitted 1 March, 2024;
originally announced March 2024.
-
A new Wolfenstein-like expansion of lepton flavor mixing towards understanding its fine structure
Authors:
Zhi-zhong Xing
Abstract:
Taking the tri-bimaximal flavor mixing pattern as a particular basis, we propose a new way to expand the $3\times 3$ unitary Pontecorvo-Maki-Nakagawa-Sakata (PMNS) lepton flavor mixing matrix $U$ in powers of the magnitude of its smallest element $ξ\equiv \left|U^{}_{e 3}\right| \simeq 0.149$. Such a Wolfenstein-like parametrization of $U$ allows us to easily describe the salient features and fine structures of flavor mixing and CP violation, both in vacuum and in matter.
Submitted 27 April, 2024; v1 submitted 1 March, 2024;
originally announced March 2024.
-
A New Hope: Contextual Privacy Policies for Mobile Applications and An Approach Toward Automated Generation
Authors:
Shidong Pan,
Zhen Tao,
Thong Hoang,
Dawen Zhang,
Tianshi Li,
Zhenchang Xing,
Sherry Xu,
Mark Staples,
Thierry Rakotoarivelo,
David Lo
Abstract:
Privacy policies have emerged as the predominant approach to conveying privacy notices to mobile application users. In an effort to enhance both readability and user engagement, the concept of contextual privacy policies (CPPs) has been proposed by researchers. The aim of CPPs is to fragment privacy policies into concise snippets, displaying them only within the corresponding contexts within the application's graphical user interfaces (GUIs). In this paper, we first formulate CPPs in the mobile application scenario, and then present a novel multimodal framework, named SeePrivacy, specifically designed to automatically generate CPPs for mobile applications. This method uniquely integrates vision-based GUI understanding with privacy policy analysis, achieving 0.88 precision and 0.90 recall in detecting contexts, as well as 0.98 precision and 0.96 recall in extracting corresponding policy segments. A human evaluation shows that 77% of the extracted privacy policy segments were perceived as well-aligned with the detected contexts. These findings suggest that SeePrivacy could serve as a significant tool for bolstering user interaction with, and understanding of, privacy policies. Furthermore, our solution has the potential to make privacy notices more accessible and inclusive, thus appealing to a broader demographic. A demonstration of our work can be accessed at https://cpp4app.github.io/SeePrivacy/
Submitted 10 March, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
First measurement of the yield of $^8$He isotopes produced in liquid scintillator by cosmic-ray muons at Daya Bay
Authors:
Daya Bay Collaboration,
F. P. An,
W. D. Bai,
A. B. Balantekin,
M. Bishai,
S. Blyth,
G. F. Cao,
J. Cao,
J. F. Chang,
Y. Chang,
H. S. Chen,
H. Y. Chen,
S. M. Chen,
Y. Chen,
Y. X. Chen,
Z. Y. Chen,
J. Cheng,
Y. C. Cheng,
Z. K. Cheng,
J. J. Cherwinka,
M. C. Chu,
J. P. Cummings,
O. Dalager,
F. S. Deng,
X. Y. Ding
, et al. (177 additional authors not shown)
Abstract:
Daya Bay presents the first measurement of cosmogenic $^8$He isotope production in liquid scintillator, using an innovative method for identifying cascade decays of $^8$He and its child isotope, $^8$Li. We also measure the production yield of $^9$Li isotopes using well-established methodology. The results, in units of 10$^{-8}μ^{-1}$g$^{-1}$cm$^{2}$, are 0.307$\pm$0.042, 0.341$\pm$0.040, and 0.546$\pm$0.076 for $^8$He, and 6.73$\pm$0.73, 6.75$\pm$0.70, and 13.74$\pm$0.82 for $^9$Li at average muon energies of 63.9~GeV, 64.7~GeV, and 143.0~GeV, respectively. The measured production rate of $^8$He isotopes is more than an order of magnitude lower than any other measurement of cosmogenic isotope production. It replaces the results of previous attempts to determine the ratio of $^8$He to $^9$Li production that yielded a wide range of limits from 0 to 30\%. The results provide future liquid-scintillator-based experiments with improved ability to predict cosmogenic backgrounds.
Submitted 7 February, 2024;
originally announced February 2024.
-
Dilatonic Geometrodynamics of a Two-Dimensional Curved Surface due to a Quantum Mechanically Confined Particle
Authors:
Leo Rodriguez,
Shanshan Rodriguez,
Zhenzhong Xing,
L. R. Ram-Mohan
Abstract:
We provide a unique and novel extension of da Costa's calculation of a quantum mechanically constrained particle by analyzing the perturbative back reaction of the quantum confined particle's eigenstates and spectra upon the geometry of the curved surface itself. We do this by first formulating a two dimensional action principle of the quantum constrained particle, which upon wave function variation reproduces Schrödinger's equation including da Costa's surface curvature induced potentials. Given this action principle, we vary its functional with respect to the embedded two dimensional inverse-metric to obtain the respective geometrodynamical Einstein equation. We solve this resulting Einstein equation perturbatively by first solving da Costa's Schrödinger equation to obtain an initial eigensystem, which is used as initial-input data for a perturbed metric inserted into the derived Einstein equation. As a proof of concept, we perform this calculation on a two-sphere and show its first iterative perturbed shape. We also include the back reaction of a constant external magnetic field in a separate calculation. The geometrodynamical analysis is performed within a two dimensional dilaton gravity analog, due to several computational advantages.
Submitted 4 February, 2024;
originally announced February 2024.
-
An Enhanced Modelling Approach for Warehouse Sharing Platform System Designing Problem
Authors:
Zeren Xing,
Yuehui Wu,
Shuangyuan Yu
Abstract:
With the increasing importance of sustainability, warehouse sharing arises as a possible way to improve the efficiency of the existing logistics system. This paper studied the warehouse sharing platform systems (WSPS) and investigated its supply chain network, including factories, warehouses, and customers. We proposed an enhanced modelling approach for the warehouse sharing platform system design…
▽ More
With the increasing importance of sustainability, warehouse sharing arises as a possible way to improve the efficiency of the existing logistics system. This paper studied the warehouse sharing platform systems (WSPS) and investigated its supply chain network, including factories, warehouses, and customers. We proposed an enhanced modelling approach for the warehouse sharing platform system design problem (WSPSDP) using the multi-allocation hub location routing problem framework. New elements such as inter-warehouse transportation and multiple-allocation scheme were added compared to the existing WSPS model. Then an adaptive large neighbourhood decomposition search heuristic was applied to solve our problem. Computational experiments were conducted on different-sized instances for comparison with the WSPS model without inter-warehouse transportation and the WSPS model with single-allocation scheme. The results suggested that our proposed WSPSDP model is more cost-efficient than the existing WSPS models, and it has the potential to promote the utilisation of existing cheap idle warehouses.
Submitted 27 January, 2024;
originally announced January 2024.
-
Moving beyond Deletions: Program Simplification via Diverse Program Transformations
Authors:
Haibo Wang,
Zezhong Xing,
Zheng Wang,
Chengnian Sun,
Shin Hwei Tan
Abstract:
To reduce the complexity of software, developers manually simplify programs (known as developer-induced program simplification in this paper) to reduce code size while preserving functionality, but manual simplification is time-consuming and error-prone. To reduce manual effort, rule-based approaches (e.g., refactoring) and deletion-based approaches (e.g., delta debugging) can potentially be applied to automate developer-induced program simplification. However, as there is little study on how developers simplify programs in Open-source Software (OSS) projects, it is unclear whether these approaches can be effectively used for developer-induced program simplification. Hence, we present the first study of developer-induced program simplification in OSS projects, focusing on the types of program transformations used, the motivations behind simplifications, and the set of program transformations covered by existing refactoring types. Our study of 382 pull requests from 296 projects reveals that there exist gaps in applying existing approaches for automating developer-induced program simplification and outlines the criteria for designing automatic program simplification techniques. Inspired by our study and to reduce the manual effort in developer-induced program simplification, we propose SimpT5, a tool that can automatically produce simplified programs (semantically-equivalent programs with reduced source lines of code). SimpT5 is trained based on our collected dataset of 92,485 simplified programs with two heuristics: (1) simplified line localization that encodes lines changed in simplified programs, and (2) checkers that measure the quality of generated programs. Our evaluation shows that SimpT5 is more effective than prior approaches in automating developer-induced program simplification.
Submitted 26 January, 2024;
originally announced January 2024.
-
GPTVoiceTasker: LLM-Powered Virtual Assistant for Smartphone
Authors:
Minh Duc Vu,
Han Wang,
Zhuang Li,
Jieshan Chen,
Shengdong Zhao,
Zhenchang Xing,
Chunyang Chen
Abstract:
Virtual assistants have the potential to play an important role in helping users achieve different tasks. However, these systems face challenges in their real-world usability, characterized by inefficiency and struggles in grasping user intentions. Leveraging recent advances in Large Language Models (LLMs), we introduce GptVoiceTasker, a virtual assistant poised to enhance user experiences and task efficiency on mobile devices. GptVoiceTasker excels at intelligently deciphering user commands and executing relevant device interactions to streamline task completion. The system continually learns from historical user commands to automate subsequent usages, further enhancing execution efficiency. Our experiments affirm GptVoiceTasker's exceptional command interpretation abilities and the precision of its task automation module. In our user study, GptVoiceTasker boosted task efficiency in real-world scenarios by 34.85%, accompanied by positive participant feedback. We made GptVoiceTasker open-source, inviting further research into LLM utilization for diverse tasks through prompt engineering and leveraging user usage data to improve efficiency.
Submitted 25 January, 2024;
originally announced January 2024.
-
Vivim: a Video Vision Mamba for Medical Video Object Segmentation
Authors:
Yijun Yang,
Zhaohu Xing,
Chunwang Huang,
Lei Zhu
Abstract:
Traditional convolutional neural networks have a limited receptive field while transformer-based networks are mediocre in constructing long-term dependency from the perspective of computational complexity. Such a bottleneck poses a significant challenge when processing long sequences in video analysis tasks. Very recently, state space models (SSMs) with efficient hardware-aware designs, made famous by Mamba, have exhibited impressive achievements in long sequence modeling, which facilitates the development of deep neural networks on many vision tasks. To better capture available dynamic cues in video frames, this paper presents a generic Video Vision Mamba-based framework, dubbed Vivim, for medical video object segmentation tasks. Our Vivim can effectively compress the long-term spatiotemporal representation into sequences at varying scales by our designed Temporal Mamba Block. We also introduce a boundary-aware constraint to enhance the discriminative ability of Vivim on ambiguous lesions in medical images. Extensive experiments on thyroid segmentation in ultrasound videos and polyp segmentation in colonoscopy videos demonstrate the effectiveness and efficiency of our Vivim, superior to existing methods. The code is available at: https://github.com/scott-yjyang/Vivim.
Submitted 12 March, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Heavy baryon decays into light meson and dark baryon within LCSR
Authors:
Yu-Ji Shi,
Ye Xing,
Zhi-Peng Xing
Abstract:
We studied the decays of heavy baryons into a pseudoscalar meson and a dark baryon in the recently developed $B$-Mesogenesis scenario, where the two types of effective Lagrangians proposed by the scenario are both considered. The decay amplitudes of $Λ_b^0$ are calculated by light-cone sum rules using its light-cone distribution amplitudes. The decay amplitudes of $Ξ_b^{0,\pm}$ are related to those of $Λ_b^0$ through a flavor SU(3) analysis. The uncertainties of the threshold parameter and the Borel parameter are both considered in the numerical calculation. The values of effective coupling constants in the $B$-Mesogenesis are taken as their upper limits obtained from our previous study on the inclusive decay. The upper limits of the decay branching fractions are presented as functions of the dark baryon mass.
Submitted 25 January, 2024;
originally announced January 2024.
-
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
Authors:
Zhaohu Xing,
Tian Ye,
Yijun Yang,
Guang Liu,
Lei Zhu
Abstract:
The Transformer architecture has shown a remarkable ability in modeling global relationships. However, it poses a significant computational challenge when processing high-dimensional medical images. This hinders its development and widespread adoption in this task. Mamba, as a State Space Model (SSM), recently emerged as a notable approach for modeling long-range dependencies in sequential data, excelling in the natural language processing field with its remarkable memory efficiency and computational speed. Inspired by its success, we introduce SegMamba, a novel 3D medical image Segmentation Mamba model, designed to effectively capture long-range dependencies within whole volume features at every scale. Our SegMamba, in contrast to Transformer-based methods, excels in whole volume feature modeling from a state space model standpoint, maintaining superior processing speed, even with volume features at a resolution of $64\times 64\times 64$. Comprehensive experiments on the BraTS2023 dataset demonstrate the effectiveness and efficiency of our SegMamba. The code for SegMamba is available at: https://github.com/ge-xing/SegMamba
Submitted 25 February, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
-
QCD anomalies in electromagnetic processes: A solution to the $γ\to3π$ puzzle
Authors:
Zanbin Xing,
Hao Dang,
M. Atif Sultan,
Khépani Raya,
Lei Chang
Abstract:
In this work, the $γ\to3π$ form factor is calculated within the Dyson-Schwinger equations framework using a contact interaction model within the so-called modified rainbow ladder truncation. The present calculation takes into account the pseudovector component in the pion Bethe-Salpeter amplitude (BSA) and $π-π$ scattering effects, producing a $γ\to3π$ anomaly which is $1+6\mathcal{R}_π^2$ larger than the low energy prediction. Here $\mathcal{R}_π$ is the relative ratio of the pseudovector and pseudoscalar components in the pion BSA; with our input parameters, this correction raises the $γ\to3π$ anomaly by around $10\%$. The main outcome of this work is the unveiling of the origin of such correction, which could be a possible explanation of the discrepancy between the existing experimental data and the low energy prediction. Moreover, it is highlighted how the magnitude of the anomaly is affected in effective theories that require an irremovable ultraviolet cutoff. We find that for both the anomalous processes $π\to2γ$ and $γ\to 3π$, the missing contribution to the anomaly can be compensated by the additional structures related to the quark anomalous magnetic moment.
Submitted 11 January, 2024; v1 submitted 6 January, 2024;
originally announced January 2024.
-
Charged-current non-standard neutrino interactions at Daya Bay
Authors:
Daya Bay collaboration,
F. P. An,
W. D. Bai,
A. B. Balantekin,
M. Bishai,
S. Blyth,
G. F. Cao,
J. Cao,
J. F. Chang,
Y. Chang,
H. S. Chen,
H. Y. Chen,
S. M. Chen,
Y. Chen,
Y. X. Chen,
Z. Y. Chen,
J. Cheng,
Y. C. Cheng,
Z. K. Cheng,
J. J. Cherwinka,
M. C. Chu,
J. P. Cummings,
O. Dalager,
F. S. Deng,
X. Y. Ding
, et al. (177 additional authors not shown)
Abstract:
The full data set of the Daya Bay reactor neutrino experiment is used to probe the effect of the charged current non-standard interactions (CC-NSI) on neutrino oscillation experiments. Two different approaches are applied and constraints on the corresponding CC-NSI parameters are obtained with the neutrino flux taken from the Huber-Mueller model with a $5\%$ uncertainty. For the quantum mechanics-based approach (QM-NSI), the constraints on the CC-NSI parameters $ε_{eα}$ and $ε_{eα}^{s}$ are extracted with and without the assumption that the effects of the new physics are the same in the production and detection processes, respectively. The approach based on the weak effective field theory (WEFT-NSI) deals with four types of CC-NSI represented by the parameters $[\varepsilon_{X}]_{eα}$. For both approaches, the results for the CC-NSI parameters are shown for cases with various fixed values of the CC-NSI and the Dirac CP-violating phases, and when they are allowed to vary freely. We find that constraints on the QM-NSI parameters $ε_{eα}$ and $ε_{eα}^{s}$ from the Daya Bay experiment alone can reach the order $\mathcal{O}(0.01)$ for the former and $\mathcal{O}(0.1)$ for the latter, while for WEFT-NSI parameters $[\varepsilon_{X}]_{eα}$, we obtain $\mathcal{O}(0.1)$ for both cases.
Submitted 19 March, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
Light baryon in three quark picture light front approach and its application: hyperon weak radiative decays
Authors:
Zhi-Peng Xing,
Yu Ji Shi,
Jin Sun,
Zhen-Xing Zhao
Abstract:
Motivated by recent experimental data on $Σ^+\to pγ$ at BESIII, we investigate a class of hyperon weak radiative decays. To estimate these processes, in our research, we employ a new type of light-front quark model with a three-quark picture for octet baryons. In the three-quark picture, with the use of $SU(3)_f$ and spin symmetries, we present a general form of the light front wave function for each octet baryon. By including contributions from the penguin diagram and W exchange diagram, we perform a complete calculation on the branching ratios ($Br$) and the asymmetry parameter ($α$) for hyperon weak radiative decay processes. Our results are helpful for discovering additional hyperon weak radiative decay processes in experimental facilities, and our research will promote the theoretical study of baryons.
Submitted 8 January, 2024; v1 submitted 29 December, 2023;
originally announced December 2023.
-
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
Authors:
Yiming Zhang,
Zhening Xing,
Yanhong Zeng,
Youqing Fang,
Kai Chen
Abstract:
Recent advancements in personalized text-to-image (T2I) models have revolutionized content creation, empowering non-experts to generate stunning images with unique styles. While promising, adding realistic motions into these personalized images by text poses significant challenges in preserving distinct styles, high-fidelity details, and achieving motion controllability by text. In this paper, we present PIA, a Personalized Image Animator that excels in aligning with condition images, achieving motion controllability by text, and compatibility with various personalized T2I models without specific tuning. To achieve these goals, PIA builds upon a base T2I model with well-trained temporal alignment layers, allowing for the seamless transformation of any personalized T2I model into an image animation model. A key component of PIA is the introduction of the condition module, which utilizes the condition frame and inter-frame affinity as input to transfer appearance information guided by the affinity hint for individual frame synthesis in the latent space. This design mitigates the challenges of appearance-related image alignment within and allows for a stronger focus on aligning with motion-related guidance.
Submitted 25 March, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge
Authors:
Yang Nan,
Xiaodan Xing,
Shiyi Wang,
Zeyu Tang,
Federico N Felder,
Sheng Zhang,
Roberta Eufrasia Ledda,
Xiaoliu Ding,
Ruiqi Yu,
Weiping Liu,
Feng Shi,
Tianyang Sun,
Zehong Cao,
Minghui Zhang,
Yun Gu,
Hanxiao Zhang,
Jian Gao,
Pingyu Wang,
Wen Tang,
Pengxin Yu,
Han Kang,
Junqiang Chen,
Xing Lu,
Boyu Zhang,
Michail Mamalakis
, et al. (16 additional authors not shown)
Abstract:
Airway-related quantitative imaging biomarkers are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current publicly available datasets concentrate on lung diseases with moderate morphological variations. The intricate honeycombing patterns present in the lung tissues of fibrotic lung disease patients exacerbate the challenges, often leading to various prediction errors. To address this issue, the 'Airway-Informed Quantitative CT Imaging Biomarker for Fibrotic Lung Disease 2023' (AIIB23) competition was organized in conjunction with the official 2023 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). The airway structures were meticulously annotated by three experienced radiologists. Competitors were encouraged to develop automatic airway segmentation models with high robustness and generalization abilities, followed by exploring the most correlated QIB of mortality prediction. A training set of 120 high-resolution computerised tomography (HRCT) scans was publicly released with expert annotations and mortality status. The online validation set incorporated 52 HRCT scans from patients with fibrotic lung disease and the offline test set included 140 cases from fibrosis and COVID-19 patients. The results have shown that the capacity of extracting airway trees from patients with fibrotic lung disease could be enhanced by introducing voxel-wise weighted general union loss and continuity loss. In addition to the competitive image biomarkers for prognosis, a strong airway-derived biomarker (Hazard ratio>1.5, p<0.0001) was revealed for survival prognostication compared with existing clinical measurements, clinician assessment and AI-based biomarkers.
Submitted 16 April, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Near-Field Localization and Phase Shift Optimization for RIS-Assisted Non-Ideal OFDM Systems
Authors:
Hanfu Zhang,
Erwu Liu,
Rui Wang,
Zhe Xing,
Yan Liu
Abstract:
By incorporating reconfigurable intelligent surface (RIS) into communication-assisted localization systems, the issue of signal blockage caused by obstacles can be addressed, and passive beamforming can be employed to enhance localization accuracy. However, existing works mainly consider ideal channels and do not account for the effects of realistic impairments like carrier frequency offset (CFO) and phase noise (PN) on localization. This paper proposes an iterative joint estimation algorithm for CFO, PN, and user position based on maximum a posteriori (MAP) criterion and gradient descent (GD) algorithm. Closed-form expressions for CFO and PN updates are provided. The hybrid Cramér-Rao lower bound (HCRLB) for the estimation parameters is derived, and the ambiguity in CFO and PN estimation is analyzed. To minimize the HCRLB, a non-convex RIS shift optimization problem is formulated and is transformed into a convex semidefinite programming (SDP) problem using the technique of semidefinite relaxation (SDR) and Schur complement. After optimizing the RIS phase shift, the theoretical positioning accuracy within the area of interest (AOI) can be improved by two orders of magnitude, with a maximum positioning root mean square error (RMSE) lower than $10^{-2}\,\mathrm{m}$.
Submitted 19 December, 2023;
originally announced December 2023.
-
SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma
Authors:
Xiangde Luo,
Jia Fu,
Yunxin Zhong,
Shuolin Liu,
Bing Han,
Mehdi Astaraki,
Simone Bendazzoli,
Iuliana Toma-Dasu,
Yiwen Ye,
Ziyang Chen,
Yong Xia,
Yanzhou Su,
Jin Ye,
Junjun He,
Zhaohu Xing,
Hongqiu Wang,
Lei Zhu,
Kaixiang Yang,
Xin Fang,
Zhiwei Wang,
Chan Woong Lee,
Sang Joon Park,
Jaehee Chun,
Constantin Ulrich,
Klaus H. Maier-Hein
, et al. (17 additional authors not shown)
Abstract:
Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results in many medical image segmentation tasks. However, for NPC OARs and GTVs segmentation, few public datasets are available for model development and evaluation. To alleviate this problem, the SegRap2023 challenge was organized in conjunction with MICCAI2023 and presented a large-scale benchmark for OAR and GTV segmentation with 400 Computed Tomography (CT) scans from 200 NPC patients, each with a pair of pre-aligned non-contrast and contrast-enhanced CT scans. The challenge's goal was to segment 45 OARs and 2 GTVs from the paired CT scans. In this paper, we detail the challenge and analyze the solutions of all participants. The average Dice similarity coefficient scores for all submissions ranged from 76.68\% to 86.70\%, and 70.42\% to 73.44\% for OARs and GTVs, respectively. We conclude that the segmentation of large-size OARs is well-addressed, and more efforts are needed for GTVs and small-size or thin-structure OARs. The benchmark will remain publicly available here: https://segrap2023.grand-challenge.org
Submitted 15 December, 2023;
originally announced December 2023.
-
Designing with Language: Wireframing UI Design Intent with Generative Large Language Models
Authors:
Sidong Feng,
Mingyue Yuan,
Jieshan Chen,
Zhenchang Xing,
Chunyang Chen
Abstract:
Wireframing is a critical step in the UI design process. Mid-fidelity wireframes offer more impactful and engaging visuals compared to low-fidelity versions. However, their creation can be time-consuming and labor-intensive, requiring the addition of actual content and semantic icons. In this paper, we introduce a novel solution, WireGen, to automatically generate mid-fidelity wireframes with just a brief design intent description using generative Large Language Models (LLMs). Our experiments demonstrate the effectiveness of WireGen in producing 77.5% significantly better wireframes, outperforming two widely-used in-context learning baselines. A user study with 5 designers further validates its real-world usefulness, highlighting its potential value to enhance the UI design process.
Submitted 12 December, 2023;
originally announced December 2023.
-
A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware
Authors:
Dianshu Liao,
Shidong Pan,
Xiaoyu Sun,
Xiaoxue Ren,
Qing Huang,
Zhenchang Xing,
Huan Jin,
Qinying Li
Abstract:
Code generation tools are essential to help developers in the software development process. Existing tools are often disconnected from the working context, i.e., the code repository, causing the generated code to differ from code written by human developers. In this paper, we propose a novel code generation framework, dubbed A^3-CodGen, to harness information within the code repository to generate code with fewer potential logical errors, code redundancy, and library-induced compatibility issues. We identify three categories of representative information for the code repository: local-aware information from the current code file, global-aware information from other code files, and third-party-library information. Results demonstrate that by adopting the A^3-CodGen framework, we successfully extract, fuse, and feed code repository information into the LLM, generating more accurate, efficient, and highly reusable code. The effectiveness of our framework is further underscored by generating code with a higher reuse rate, compared to human developers. This research contributes significantly to the field of code generation, providing developers with a more powerful tool to address the evolving demands in software development in practice.
Submitted 5 March, 2024; v1 submitted 10 December, 2023;
originally announced December 2023.
-
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Authors:
Zhen Xing,
Qi Dai,
Zihao Zhang,
Hui Zhang,
Han Hu,
Zuxuan Wu,
Yu-Gang Jiang
Abstract:
Diffusion models have achieved significant success in image and video generation. This motivates a growing interest in video editing tasks, where videos are edited according to provided text descriptions. However, most existing approaches only focus on video editing for short clips and rely on time-consuming tuning or inference. We are the first to propose Video Instruction Diffusion (VIDiff), a unified foundation model designed for a wide range of video tasks. These tasks encompass both understanding tasks (such as language-guided video object segmentation) and generative tasks (video editing and enhancement). Our model can edit and translate the desired results within seconds based on user instructions. Moreover, we design an iterative auto-regressive method to ensure consistency in editing and enhancing long videos. We provide convincing generative results for diverse input videos and written instructions, both qualitatively and quantitatively. More examples can be found at our website https://ChenHsing.github.io/VIDiff.
Submitted 30 November, 2023;
originally announced November 2023.
-
Navigating Privacy and Copyright Challenges Across the Data Lifecycle of Generative AI
Authors:
Dawen Zhang,
Boming Xia,
Yue Liu,
Xiwei Xu,
Thong Hoang,
Zhenchang Xing,
Mark Staples,
Qinghua Lu,
Liming Zhu
Abstract:
The advent of Generative AI has marked a significant milestone in artificial intelligence, demonstrating remarkable capabilities in generating realistic images, texts, and data patterns. However, these advancements come with heightened concerns over data privacy and copyright infringement, primarily due to the reliance on vast datasets for model training. Traditional approaches like differential privacy, machine unlearning, and data poisoning only offer fragmented solutions to these complex issues. Our paper delves into the multifaceted challenges of privacy and copyright protection within the data lifecycle. We advocate for integrated approaches that combine technical innovation with ethical foresight, holistically addressing these concerns by investigating and devising solutions that are informed by the lifecycle perspective. This work aims to catalyze a broader discussion and inspire concerted efforts towards data privacy and copyright integrity in Generative AI.
Submitted 10 January, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
AdaDiff: Adaptive Step Selection for Fast Diffusion
Authors:
Hui Zhang,
Zuxuan Wu,
Zhen Xing,
Jie Shao,
Yu-Gang Jiang
Abstract:
Diffusion models, as a type of generative model, have achieved impressive results in generating images and videos conditioned on textual conditions. However, the generation process of diffusion models involves denoising for dozens of steps to produce photorealistic images/videos, which is computationally expensive. Unlike previous methods that design "one-size-fits-all" approaches for speed-up, we argue denoising steps should be sample-specific, conditioned on the richness of input texts. To this end, we introduce AdaDiff, a lightweight framework designed to learn instance-specific step usage policies, which are then used by the diffusion model for generation. AdaDiff is optimized using a policy gradient method to maximize a carefully designed reward function, balancing inference time and generation quality. We conduct experiments on three image generation and two video generation benchmarks and demonstrate that our approach achieves similar results in terms of visual quality compared to the baseline using a fixed 50 denoising steps while reducing inference time by at least 33%, going as high as 40%. Furthermore, our qualitative analysis shows that our method allocates more steps to more informative text conditions and fewer steps to simpler text conditions.
Submitted 24 November, 2023;
originally announced November 2023.