Showing 1–50 of 1,655 results for author: Hu, Z

  1. arXiv:2409.09420  [pdf, ps, other]

    math.PR

    A study on the $F$-distribution motivated by Chvátal's theorem

    Authors: Qianqian Zhou, Peng Lu, Zechun Hu

    Abstract: Let $X_{d_1, d_2}$ be an $F$-random variable with parameters $d_1$ and $d_2,$ and expectation $E[X_{d_1, d_2}]$. In this paper, for any $κ>0,$ we investigate the infimum value of the probability $P(X_{d_1, d_2}\leq κE[X_{d_1, d_2}])$. Our motivation comes from Chvátal's theorem on the binomial distribution.

    Submitted 14 September, 2024; originally announced September 2024.

  2. arXiv:2409.09300  [pdf, other]

    cs.CV

    ManiDext: Hand-Object Manipulation Synthesis via Continuous Correspondence Embeddings and Residual-Guided Diffusion

    Authors: Jiajun Zhang, Yuxiang Zhang, Liang An, Mengcheng Li, Hongwen Zhang, Zonghai Hu, Yebin Liu

    Abstract: Dynamic and dexterous manipulation of objects presents a complex challenge, requiring the synchronization of hand motions with the trajectories of objects to achieve seamless and physically plausible interactions. In this work, we introduce ManiDext, a unified hierarchical diffusion-based framework for generating hand manipulation and grasp poses based on 3D object trajectories. Our key insight is…

    Submitted 14 September, 2024; originally announced September 2024.

  3. arXiv:2409.09292  [pdf, other]

    cs.CV

    StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads

    Authors: Suzhen Wang, Yifeng Ma, Yu Ding, Zhipeng Hu, Changjie Fan, Tangjie Lv, Zhidong Deng, Xin Yu

    Abstract: Individuals have unique facial expression and head pose styles that reflect their personalized speaking styles. Existing one-shot talking head methods cannot capture such personalized characteristics and therefore fail to produce diverse speaking styles in the final videos. To address this challenge, we propose a one-shot style-controllable talking face generation method that can obtain speaking s…

    Submitted 13 September, 2024; originally announced September 2024.

    Comments: TPAMI 2024. arXiv admin note: text overlap with arXiv:2301.01081

  4. arXiv:2409.07911  [pdf, other]

    cs.LG

    Tera-SpaceCom: GNN-based Deep Reinforcement Learning for Joint Resource Allocation and Task Offloading in TeraHertz Band Space Networks

    Authors: Zhifeng Hu, Chong Han, Wolfgang Gerstacker, Ian F. Akyildiz

    Abstract: Terahertz (THz) space communications (Tera-SpaceCom) is envisioned as a promising technology to enable various space science and communication applications. Mainly, the realm of Tera-SpaceCom consists of THz sensing for space exploration, data centers in space providing cloud services for space exploration tasks, and a low earth orbit (LEO) mega-constellation relaying these tasks to ground station…

    Submitted 12 September, 2024; originally announced September 2024.

  5. Fully-Optimized Quantum Metrology: Framework, Tools, and Applications

    Authors: Qiushi Liu, Zihao Hu, Haidong Yuan, Yuxiang Yang

    Abstract: This tutorial introduces a systematic approach for addressing the key question of quantum metrology: For a generic task of sensing an unknown parameter, what is the ultimate precision given a constrained set of admissible strategies. The approach outputs the maximal attainable precision (in terms of the maximum of quantum Fisher information) as a semidefinite program and optimal strategies as feas…

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: Tutorial. 38 pages, 18 figures

    Journal ref: Adv. Quantum Technol. 2024, 2400094 (2024)

  6. World-Grounded Human Motion Recovery via Gravity-View Coordinates

    Authors: Zehong Shen, Huaijin Pi, Yan Xia, Zhi Cen, Sida Peng, Zechen Hu, Hujun Bao, Ruizhen Hu, Xiaowei Zhou

    Abstract: We present a novel method for recovering world-grounded human motion from monocular video. The main challenge lies in the ambiguity of defining the world coordinate system, which varies between sequences. Previous approaches attempt to alleviate this issue by predicting relative motion in an autoregressive manner, but are prone to accumulating errors. Instead, we propose estimating human poses in…

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: Accepted at SIGGRAPH Asia 2024 (Conference Track). Project page: https://zju3dv.github.io/gvhmr/

  7. arXiv:2409.06155  [pdf, other]

    cond-mat.mes-hall cond-mat.str-el

    Exciton crystal melting and destruction by disorder in bilayer quantum hall system with total filling factor one

    Authors: Zhengfei Hu, Kun Yang

    Abstract: Bilayer quantum hall system with total filling factor 1 was studied in the regime of heavy layer imbalance in a recent transport experiment (Ref. 1), with intriguing new findings. We demonstrate in this paper that 1) the exciton Wigner crystal in this regime can melt into a superfluid phase, giving rise to re-entrant superfluid behavior; 2) in the presence of disorder, electron and hole Wigner cry…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: 12 pages, 3 figures

  8. arXiv:2409.06154  [pdf, other]

    cs.CV

    UniLearn: Enhancing Dynamic Facial Expression Recognition through Unified Pre-Training and Fine-Tuning on Images and Videos

    Authors: Yin Chen, Jia Li, Yu Zhang, Zhenzhen Hu, Shiguang Shan, Meng Wang, Richang Hong

    Abstract: Dynamic facial expression recognition (DFER) is essential for understanding human emotions and behavior. However, conventional DFER methods, which primarily use dynamic facial data, often underutilize static expression images and their labels, limiting their performance and robustness. To overcome this, we introduce UniLearn, a novel unified learning paradigm that integrates static facial expressi…

    Submitted 9 September, 2024; originally announced September 2024.

  9. arXiv:2409.05865  [pdf, other]

    cs.RO cs.LG

    Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments

    Authors: Haritheja Etukuru, Norihito Naka, Zijin Hu, Seungjae Lee, Julian Mehu, Aaron Edsinger, Chris Paxton, Soumith Chintala, Lerrel Pinto, Nur Muhammad Mahi Shafiullah

    Abstract: Robot models, particularly those trained with large amounts of data, have recently shown a plethora of real-world manipulation and navigation capabilities. Several independent efforts have shown that given sufficient training data in an environment, robot policies can generalize to demonstrated variations in that environment. However, needing to finetune robot models to every new environment stand…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: Project website https://robotutilitymodels.com

  10. arXiv:2409.05785  [pdf, other]

    cs.DC cs.AI

    NeurLZ: On Enhancing Lossy Compression Performance based on Error-Controlled Neural Learning for Scientific Data

    Authors: Wenqi Jia, Youyuan Liu, Zhewen Hu, Jinzhen Wang, Boyuan Zhang, Wei Niu, Junzhou Huang, Stavros Kalafatis, Sian Jin, Miao Yin

    Abstract: Large-scale scientific simulations generate massive datasets that pose significant challenges for storage and I/O. While traditional lossy compression techniques can improve performance, balancing compression ratio, data quality, and throughput remains difficult. To address this, we propose NeurLZ, a novel cross-field learning-based and error-controlled compression framework for scientific data. B…

    Submitted 9 September, 2024; v1 submitted 9 September, 2024; originally announced September 2024.

  11. arXiv:2409.05734  [pdf, other]

    physics.optics

    Structured Random Model for Fast and Robust Phase Retrieval

    Authors: Zhiyuan Hu, Julián Tachella, Michael Unser, Jonathan Dong

    Abstract: Phase retrieval, a nonlinear problem prevalent in imaging applications, has been extensively studied using random models, some of which with i.i.d. sensing matrix components. While these models offer robust reconstruction guarantees, they are computationally expensive and impractical for real-world scenarios. In contrast, Fourier-based models, common in applications such as ptychography and coded…

    Submitted 9 September, 2024; originally announced September 2024.

  12. arXiv:2409.05552  [pdf, other]

    cs.CV

    Seeing is Believing? Enhancing Vision-Language Navigation using Visual Perturbations

    Authors: Xuesong Zhang, Jia Li, Yunbo Xu, Zhenzhen Hu, Richang Hong

    Abstract: Autonomous navigation for an embodied agent guided by natural language instructions remains a formidable challenge in vision-and-language navigation (VLN). Despite remarkable recent progress in learning fine-grained and multifarious visual representations, the tendency to overfit to the training environments leads to unsatisfactory generalization performance. In this work, we present a versatile M…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: 5 pages, 2 figures, submitted to ICASSP 2025

  13. arXiv:2409.05073  [pdf, ps, other]

    math.AG

    Parahoric reduction theory of formal connections (or Higgs fields)

    Authors: Zhi Hu, Pengfei Huang, Ruiran Sun, Runhong Zong

    Abstract: In this paper, we establish the parahoric reduction theory of formal connections (or Higgs fields) on a formal principal bundle with parahoric structures, which generalizes Babbitt-Varadarajan's result for the case without parahoric structures [5] and Boalch's result for the case of regular singularity [9]. As applications, we prove the equivalence between extrinsic definition and intrinsic defini…

    Submitted 8 September, 2024; originally announced September 2024.

    Comments: 24 pages, comments are welcome!

  14. arXiv:2409.04805  [pdf, other]

    gr-qc astro-ph.HE hep-ph

    Neutron stars in the bumblebee theory of gravity

    Authors: Peixiang Ji, Zhuhai Li, Lirui Yang, Rui Xu, Zexin Hu, Lijing Shao

    Abstract: Recently, theoretical studies on the bumblebee gravity model, a nonminimally-coupled vector-tensor theory that violates the Lorentz symmetry, have flourished, with a simultaneous increase in the utilization of observations to impose constraints. The static spherical solutions of neutron stars (NSs) in the bumblebee theory are calculated comprehensively in this work. These solutions with different…

    Submitted 7 September, 2024; originally announced September 2024.

    Comments: 15 pages, 9 figures

  15. arXiv:2409.03605  [pdf, other]

    cs.CV cs.MM

    SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing

    Authors: Lingyu Xiong, Xize Cheng, Jintao Tan, Xianjia Wu, Xiandong Li, Lei Zhu, Fei Ma, Minglei Li, Huang Xu, Zhihu Hu

    Abstract: Audio-driven talking face generation aims to synthesize video with lip movements synchronized to input audio. However, current generative techniques face challenges in preserving intricate regional textures (skin, teeth). To address the aforementioned challenges, we propose a novel framework called SegTalker to decouple lip movements and image textures by introducing segmentation as intermediate r…

    Submitted 5 September, 2024; originally announced September 2024.

    Comments: 10 pages, 7 figures, 3 tables

  16. Direct Readout of Nitrogen-Vacancy Hybrid-Spin Quantum Register in Diamond by Photon Arrival Time Analysis

    Authors: Jingyan He, Yu Tian, Zhiyi Hu, Runchuan Ye, Xiangyu Wang, Dawei Lu, Nanyang Xu

    Abstract: Quantum state readout plays a pivotal role in quantum technologies, spanning applications in sensing, computation, and secure communication. In this work, we introduce a new approach for efficiently reading populations of hybrid-spin states in the nitrogen-vacancy center of diamond using a single laser pulse, which utilizes the excited state level anti-crossing mechanism at around 500 Gs. Reading…

    Submitted 5 September, 2024; originally announced September 2024.

  17. Four-order power reduction in nanoscale electron-nuclear double resonance with a nitrogen-vacancy center in diamond

    Authors: Zhiyi Hu, Fengjian Jiang, Jingyan He, Yulin Dai, Ya Wang, Nanyang Xu, Jiangfeng Du

    Abstract: Detecting nuclear spins using single Nitrogen-Vacancy (NV) centers is of particular importance in nano-scale science and engineering, but often suffers from the heating effect of microwave fields for spin manipulation, especially under high magnetic fields. Here, we realize an energy-efficient nano-scale nuclear-spin detection using a phase-modulation electron-nuclear double resonance scheme. The…

    Submitted 5 September, 2024; originally announced September 2024.

  18. arXiv:2409.03231  [pdf, other]

    cs.LG math.DS math.NA stat.ML

    State-space models are accurate and efficient neural operators for dynamical systems

    Authors: Zheyuan Hu, Nazanin Ahmadi Daryakenari, Qianli Shen, Kenji Kawaguchi, George Em Karniadakis

    Abstract: Physics-informed machine learning (PIML) has emerged as a promising alternative to classical methods for predicting dynamical systems, offering faster and more generalizable solutions. However, existing models, including recurrent neural networks (RNNs), transformers, and neural operators, face challenges such as long-time integration, long-range dependencies, chaotic dynamics, and extrapolation,…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 34 pages

    ACM Class: F.2.2; I.2.7

  19. arXiv:2409.02577  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Interlayer coupling rotatable magnetic easy-axis in MnSe2 mono- and bi-layers

    Authors: Zhongqin Zhang, Cong Wang, PengJie Guo, Linwei Zhou, Yuhao Pan, Zhixin Hu, Wei Ji

    Abstract: Interlayer coupling plays a critical role in tuning the electronic structures and magnetic ground states of two-dimensional materials, influenced by the number of layers, interlayer distance, and stacking order. However, its effect on the orientation of the magnetic easy axis remains underexplored. In this study, we demonstrate that interlayer coupling can significantly alter the magnetic easy-axi…

    Submitted 4 September, 2024; originally announced September 2024.

  20. arXiv:2409.02076  [pdf, other]

    cs.CL

    LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs

    Authors: Yuhao Wu, Ming Shan Hee, Zhiqing Hu, Roy Ka-Wei Lee

    Abstract: In evaluating the long-context capabilities of large language models (LLMs), benchmarks such as "Needle-in-a-Haystack" (NIAH), Ruler, and Needlebench are commonly used. While these benchmarks measure how well models understand long-context input sequences, they do not effectively gauge the quality of long-form text generation--a critical aspect for applications such as design proposals and creativ…

    Submitted 15 September, 2024; v1 submitted 3 September, 2024; originally announced September 2024.

    Comments: work in progress; Github: https://github.com/mozhu621/LongGenBench/

  21. arXiv:2409.01557  [pdf, other]

    cs.CV

    TASL-Net: Tri-Attention Selective Learning Network for Intelligent Diagnosis of Bimodal Ultrasound Video

    Authors: Chengqian Zhao, Zhao Yao, Zhaoyu Hu, Yuanxin Xie, Yafang Zhang, Yuanyuan Wang, Shuo Li, Jianhua Zhou, Jianqiao Zhou, Yin Wang, Jinhua Yu

    Abstract: In the intelligent diagnosis of bimodal (gray-scale and contrast-enhanced) ultrasound videos, medical domain knowledge such as the way sonographers browse videos, the particular areas they emphasize, and the features they pay special attention to, plays a decisive role in facilitating precise diagnosis. Embedding medical knowledge into the deep learning network can not only enhance performance but…

    Submitted 2 September, 2024; originally announced September 2024.

  22. arXiv:2409.01240  [pdf, other]

    cs.HC

    DiffEyeSyn: Diffusion-based User-specific Eye Movement Synthesis

    Authors: Chuhan Jiao, Guanhua Zhang, Zhiming Hu, Andreas Bulling

    Abstract: High-frequency components in eye gaze data contain user-specific information promising for various applications, but existing gaze modelling methods focus on low frequencies of typically not more than 30 Hz. We present DiffEyeSyn -- the first computational method to synthesise high-frequency gaze data, including eye movement characteristics specific to individual users. The key idea is to consider…

    Submitted 2 September, 2024; originally announced September 2024.

  23. arXiv:2409.00519  [pdf, ps, other]

    math.AP

    Blow-up solutions for the steady state of the Keller-Segel system on Riemann surfaces

    Authors: Zhengni Hu, Thomas Bartsch, Mohameden Ahmedou

    Abstract: We study the following Neumann boundary problem related to the stationary solutions of the Keller-Segel system, a basic model of chemotaxis phenomena: \[ -Δ_g u +βu =λ\left(\frac{Ve^u}{\int_Σ Ve^u d v_g}-\frac{1}{|Σ|_g}\right) \text { in } \mathringΣ\] with $\partial_{ ν_g} u=0, \text { on } \partial Σ$, where $(Σ, g)$ is a compact Riemann surface with the interior $\mathringΣ$ and the smooth boun…

    Submitted 31 August, 2024; originally announced September 2024.

    MSC Class: 35J57; 58J05

  24. arXiv:2409.00509  [pdf, other]

    cs.CL

    LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

    Authors: Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi

    Abstract: Large language models (LLMs) face significant challenges in handling long-context tasks because of their limited effective context window size during pretraining, which restricts their ability to generalize over extended sequences. Meanwhile, extending the context window in LLMs through post-pretraining is highly resource-intensive. To address this, we introduce LongRecipe, an efficient training s…

    Submitted 4 September, 2024; v1 submitted 31 August, 2024; originally announced September 2024.

    Comments: Work in Progress

  25. arXiv:2409.00402  [pdf, ps, other]

    cs.IT eess.SP

    Generalized Orthogonal Chirp Division Multiplexing in Doubly Selective Channels

    Authors: Yun Liu, Hao Zhao, Huazhen Yao, Zeng Hu, Yinming Cui, Dehuan Wan

    Abstract: In recent years, orthogonal chirp division modulation (OCDM) has gained attention as a robust communication waveform due to its strong resistance to both time-domain and frequency-domain interference. However, similar to orthogonal frequency division multiplexing (OFDM), OCDM suffers from a high peak-to-average power ratio (PAPR), resulting in increased hardware costs and reduced energy efficiency…

    Submitted 31 August, 2024; originally announced September 2024.

  26. arXiv:2408.17372  [pdf, ps, other]

    math.AP

    Partial Blow-up Phenomena in the $SU(3)$ Toda System on Riemann Surfaces

    Authors: Zhengni Hu, Mohameden Ahmedou, Thomas Bartsch

    Abstract: This work studies the partial blow-up phenomena for the $SU(3)$ Toda system on compact Riemann surfaces with smooth boundary. We consider the following coupled Liouville system with Neumann boundary conditions:…

    Submitted 30 August, 2024; originally announced August 2024.

    MSC Class: 35J57; 58J05

  27. arXiv:2408.16917  [pdf, ps, other]

    math.AP

    Blow-up solutions for mean field equations with Neumann boundary conditions on Riemann surfaces

    Authors: Zhengni Hu, Thomas Bartsch, Mohameden Ahmedou

    Abstract: On a compact Riemann surface $(Σ, g)$ with a smooth boundary $\partial Σ$, we consider the following mean field equations with Neumann boundary conditions: $$ -Δ_g u = λ\left(\frac{Ve^u}{\int_Σ Ve^u \, dv_g} - \frac{1}{|Σ|_g}\right) \text{ in } Σ\text{ with } \partial_{ν_g} u = 0 \text{ on } \partial Σ, $$ We find conditions on the potential function $V: Σ\to \mathbb{R}^+$ such that solutions exis…

    Submitted 29 August, 2024; originally announced August 2024.

    MSC Class: 35B33 (Primary) 35J61; 35R01 (Secondary)

  28. arXiv:2408.14979  [pdf, ps, other]

    math.AP

    On Solutions for Singular Toda System on Riemann Surfaces with Boundary

    Authors: Zhengni Hu

    Abstract: This paper studies solutions to a singular $SU(3)$ Toda system with linear source terms on a compact Riemann surface $Σ$ with smooth boundaries $\partialΣ$. We establish the existence of solutions when the parameters are not critical, assuming that Euler characteristic $χ(Σ)<1$ via analyzing the sublevels. Furthermore, we find a sufficient condition that ensures multiple solutions for generic pote…

    Submitted 27 August, 2024; originally announced August 2024.

    MSC Class: 35J50 (Primary) 35J61; 35R01; 58J32 (Secondary)

  29. arXiv:2408.14843  [pdf, other]

    cs.LG cs.NE eess.SP

    Correntropy-Based Improper Likelihood Model for Robust Electrophysiological Source Imaging

    Authors: Yuanhao Li, Badong Chen, Zhongxu Hu, Keita Suzuki, Wenjun Bai, Yasuharu Koike, Okito Yamashita

    Abstract: Bayesian learning provides a unified skeleton to solve the electrophysiological source imaging task. From this perspective, existing source imaging algorithms utilize the Gaussian assumption for the observation noise to build the likelihood function for Bayesian inference. However, the electromagnetic measurements of brain activity are usually affected by miscellaneous artifacts, leading to a pote…

    Submitted 27 August, 2024; originally announced August 2024.

  30. arXiv:2408.14492  [pdf, other]

    cs.LG

    Evolvable Psychology Informed Neural Network for Memory Behavior Modeling

    Authors: Xiaoxuan Shen, Zhihai Hu, Qirong Chen, Shengyingjie Liu, Ruxia Liang, Jianwen Sun

    Abstract: Memory behavior modeling is a core issue in cognitive psychology and education. Classical psychological theories typically use memory equations to describe memory behavior, which exhibits insufficient accuracy and controversy, while data-driven memory modeling methods often require large amounts of training data and lack interpretability. Knowledge-informed neural network models have shown excelle…

    Submitted 22 August, 2024; originally announced August 2024.

  31. arXiv:2408.13914  [pdf, ps, other]

    eess.SY

    Data-driven approximate output regulation of nonlinear systems

    Authors: Zhongjie Hu, Claudio De Persis, Pietro Tesi

    Abstract: The paper deals with the data-based design of controllers that solve the output regulation problem for nonlinear systems. Inspired by recent developments in model-based output regulation design techniques and in data-driven control design for nonlinear systems, we derive a data-dependent semidefinite program that, when solved, directly returns a controller that approximately regulates the tracking…

    Submitted 25 August, 2024; originally announced August 2024.

  32. arXiv:2408.12615  [pdf, other]

    eess.IV cs.CV cs.LG

    Pediatric TSC-Related Epilepsy Classification from Clinical MR Images Using Quantum Neural Network

    Authors: Ling Lin, Yihang Zhou, Zhanqi Hu, Dian Jiang, Congcong Liu, Shuo Zhou, Yanjie Zhu, Jianxiang Liao, Dong Liang, Hairong Zheng, Haifeng Wang

    Abstract: Tuberous sclerosis complex (TSC) manifests as a multisystem disorder with significant neurological implications. This study addresses the critical need for robust classification models tailored to TSC in pediatric patients, introducing QResNet, a novel deep learning model seamlessly integrating conventional convolutional neural networks with quantum neural networks. The model incorporates a two-lay…

    Submitted 26 August, 2024; v1 submitted 8 August, 2024; originally announced August 2024.

    Comments: 5 pages, 4 figures, 2 tables, presented at ISBI 2024

  33. arXiv:2408.12320  [pdf, other]

    cs.AI cs.LG

    PolyRouter: A Multi-LLM Querying System

    Authors: Dimitris Stripelis, Zijian Hu, Jipeng Zhang, Zhaozhuo Xu, Alay Dilipbhai Shah, Han Jin, Yuhang Yao, Salman Avestimehr, Chaoyang He

    Abstract: With the rapid growth of Large Language Models (LLMs) across various domains, numerous new LLMs have emerged, each possessing domain-specific expertise. This proliferation has highlighted the need for quick, high-quality, and cost-effective LLM query response methods. Yet, no single LLM exists to efficiently balance this trilemma. Some models are powerful but extremely costly, while others are fas…

    Submitted 26 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: 14 pages, 7 figures, 2 tables

    ACM Class: I.2; I.5

  34. arXiv:2408.10635  [pdf, other]

    cs.AI cs.CL

    Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search

    Authors: Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu

    Abstract: In this paper, we propose a new method Strategist that utilizes LLMs to acquire new skills for playing multi-agent games through a self-improvement process. Our method gathers quality feedback through self-play simulations with Monte Carlo tree search and LLM-based reflection, which can then be used to learn high-level strategic skills such as how to evaluate states that guide the low-level execut…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: website: https://llm-strategist.github.io

  35. arXiv:2408.09988  [pdf, other]

    hep-ph

    Chiral-odd gluon generalized parton distributions in the proton: A light-front quantization approach

    Authors: Bolang Lin, Sreeraj Nair, Chandan Mondal, Siqi Xu, Zhi Hu, Pengxiang Zhang, Xingbo Zhao, James P. Vary

    Abstract: Within the basis light-front quantization (BLFQ) framework, we evaluate the gluon chiral-odd generalized parton distributions (GPDs) inside the proton at zero skewness. We employ the light-front wave functions of the proton obtained from a light-front quantized Hamiltonian with quantum chromodynamics input using BLFQ. Our investigation encompasses both the valence Fock sector with three constituen…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 11 pages, 12 figures

  36. arXiv:2408.09088  [pdf]

    quant-ph

    Quantum encryption design overcomes Shannon's theorem to achieve perfect secrecy with reusable keys

    Authors: Zixuan Hu, Zhenyu Li

    Abstract: Shannon's perfect-secrecy theorem states that a perfect encryption system that yields zero information to the adversary must be a one-time pad (OTP) with the keys randomly generated and never reused. However, recently discovered exotic properties of quantum entanglement have motivated us to reconsider Shannon's theorem in the quantum regime. In this work we design a quantum encryption method that…

    Submitted 26 August, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

    Comments: This revision added a worked example and a quantum circuit figure

  37. arXiv:2408.07611  [pdf, other]

    cs.CL cs.IR

    WeKnow-RAG: An Adaptive Approach for Retrieval-Augmented Generation Integrating Web Search and Knowledge Graphs

    Authors: Weijian Xie, Xuefeng Liang, Yuhui Liu, Kaihua Ni, Hong Cheng, Zetian Hu

    Abstract: Large Language Models (LLMs) have greatly contributed to the development of adaptive intelligent agents and are positioned as an important way to achieve Artificial General Intelligence (AGI). However, LLMs are prone to produce factually incorrect information and often produce "phantom" content that undermines their reliability, which poses a serious challenge for their deployment in real-world sc…

    Submitted 27 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

    Comments: 8 pages, 2 figures, technical report for 3rd place in Task 3 of Meta KDD Cup 2024 CRAG Challenge

  38. arXiv:2408.07482  [pdf, other]

    cs.DC cs.AI

    Training Overhead Ratio: A Practical Reliability Metric for Large Language Model Training Systems

    Authors: Ning Lu, Qian Xie, Hao Zhang, Wenyi Fang, Yang Zheng, Zheng Hu, Jiantao Ma

    Abstract: Large Language Models (LLMs) are revolutionizing the AI industry with their superior capabilities. Training these models requires large-scale GPU clusters and significant computing time, leading to frequent failures that significantly increase training costs. Despite its significance, this field lacks a metric for evaluating reliability. In this work, we introduce a novel reliability metric called…

    Submitted 5 September, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

    Comments: To be published in: IEEE International Symposium on Software Reliability Engineering (ISSRE2024) workshop

  39. arXiv:2408.07444  [pdf, other]

    eess.IV cs.CV

    Costal Cartilage Segmentation with Topology Guided Deformable Mamba: Method and Benchmark

    Authors: Senmao Wang, Haifan Gong, Runmeng Cui, Boyao Wan, Yicheng Liu, Zhonglin Hu, Haiqing Yang, Jingyang Zhou, Bo Pan, Lin Lin, Haiyue Jiang

    Abstract: Costal cartilage segmentation is crucial to various medical applications, necessitating precise and reliable techniques due to its complex anatomy and the importance of accurate diagnosis and surgical planning. We propose a novel deep learning-based approach called topology-guided deformable Mamba (TGDM) for costal cartilage segmentation. The TGDM is tailored to capture the intricate long-range co…

    Submitted 14 August, 2024; originally announced August 2024.

  40. arXiv:2408.06000  [pdf, other]

    cs.CV eess.IV

    An Analysis for Image-to-Image Translation and Style Transfer

    Authors: Xiaoming Yu, Jie Tian, Zhenhua Hu

    Abstract: With the development of generative technologies in deep learning, a large number of image-to-image translation and style transfer models have emerged at an explosive rate in recent years. These two technologies have made significant progress and can generate realistic images. However, many communities tend to confuse the two, because both generate the desired image based on the input image and bot…

    Submitted 12 August, 2024; originally announced August 2024.

  41. arXiv:2408.05099  [pdf]

    physics.optics physics.app-ph

    Lithography-free patterning of chalcogenide materials for integrated photonic devices

    Authors: Zhen Hu, Yuru Li, Yan Li, Shunyu Yao, Hongfei Chen, Tao Zhang, Zhaohuan Ao, Zhaohui Li

    Abstract: Chalcogenide material-based integrated photonic devices have garnered widespread attention due to their unique wideband transparency. Despite their recognized CMOS compatibility, the fabrication of these devices relies predominantly on lithography techniques. However, chalcogenide thin films are highly susceptible to oxidation, necessitating customized process flows and complex protective measures…

    Submitted 9 August, 2024; originally announced August 2024.

  42. arXiv:2408.04425  [pdf, other]

    astro-ph.HE

    Effects from Dark Matter Halos on X-ray Pulsar Pulse Profiles

    Authors: Yukun Liu, Hong-Bo Li, Yong Gao, Lijing Shao, Zexin Hu

    Abstract: Neutron stars (NSs) can capture dark matter (DM) particles because of their deep gravitational potential and high density. The accumulated DM can affect the properties of NSs. In this work we use a general relativistic two-fluid formalism to solve the structure of DM-admixed NSs (DANSs) and the surrounding spacetime. Specifically, we pay attention to the situation where those DANSs possess DM halo…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 9 pages, 11 figures

  43. Digital Avatars: Framework Development and Their Evaluation

    Authors: Timothy Rupprecht, Sung-En Chang, Yushu Wu, Lei Lu, Enfu Nan, Chih-hsiang Li, Caiyue Lai, Zhimin Li, Zhijun Hu, Yumei He, David Kaeli, Yanzhi Wang

    Abstract: We present a novel prompting strategy for artificial intelligence driven digital avatars. To better quantify how our prompting strategy affects anthropomorphic features like humor, authenticity, and favorability we present Crowd Vote - an adaptation of Crowd Score that allows for judges to elect a large language model (LLM) candidate over competitors answering the same or similar prompts. To visua…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: This work was presented during the IJCAI 2024 conference proceedings for demonstrations

    MSC Class: 68 ACM Class: D.2.2; C.3

    Journal ref: 2024 Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence Demo Track. Pages 8780-8783

  44. arXiv:2408.03910  [pdf, other]

    cs.SE cs.AI cs.CL

    CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

    Authors: Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Shieh, Wenmeng Zhou

    Abstract: Large Language Models (LLMs) excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories. This challenge has prompted research on enhancing LLM-codebase interaction at a repository scale. Current solutions rely on similarity-based retrieval or manual tools and APIs, each with notable drawbacks. Similarity-based retrieval often has low recall in comp…

    Submitted 11 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: work in progress

  45. arXiv:2408.01696  [pdf, other]

    cs.SD cs.AI eess.AS

    Generating High-quality Symbolic Music Using Fine-grained Discriminators

    Authors: Zhedong Zhang, Liang Li, Jiehua Zhang, Zhenghui Hu, Hongkui Wang, Chenggang Yan, Jian Yang, Yuankai Qi

    Abstract: Existing symbolic music generation methods usually utilize a discriminator to improve the quality of generated music via global perception of music. However, considering the complexity of information in music, such as rhythm and melody, a single discriminator cannot fully reflect the differences in these two primary dimensions of music. In this work, we propose to decouple the melody and rhythm from…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: Accepted by ICPR2024

  46. arXiv:2408.00491  [pdf, other]

    cs.CL cs.CV cs.MM

    GalleryGPT: Analyzing Paintings with Large Multimodal Models

    Authors: Yi Bin, Wenhao Shi, Yujuan Ding, Zhiqiang Hu, Zheng Wang, Yang Yang, See-Kiong Ng, Heng Tao Shen

    Abstract: Artwork analysis is an important and fundamental skill for art appreciation, which could enrich personal aesthetic sensibility and facilitate critical thinking ability. Understanding artworks is challenging due to its subjective nature, diverse interpretations, and complex visual elements, requiring expertise in art history, cultural background, and aesthetic theory. However, limited by the data…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: Accepted as Oral Presentation at ACM Multimedia 2024

  47. arXiv:2408.00245  [pdf, other]

    astro-ph.HE gr-qc

    Measuring the Spin of the Galactic Center Supermassive Black Hole with Two Pulsars

    Authors: Zexin Hu, Lijing Shao

    Abstract: As a key science project of the Square Kilometre Array (SKA), the discovery and timing observations of radio pulsars in the Galactic Center would provide high-precision measurements of the spacetime around the supermassive black hole, Sagittarius A* (Sgr A*), and initiate novel tests of general relativity. The spin of Sgr A* could be measured with a relative error of $\lesssim 1\%$ by timing one p…

    Submitted 31 July, 2024; originally announced August 2024.

    Comments: 5 pages, 3 figures

  48. arXiv:2408.00008  [pdf, other]

    cs.DC cs.LG

    ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency

    Authors: Yuhang Yao, Han Jin, Alay Dilipbhai Shah, Shanshan Han, Zijian Hu, Yide Ran, Dimitris Stripelis, Zhaozhuo Xu, Salman Avestimehr, Chaoyang He

    Abstract: Large language models (LLMs) have surged in popularity and are extensively used in commercial applications, where the efficiency of model serving is crucial for the user experience. Most current research focuses on optimizing individual sub-procedures, e.g. local inference and communication, however, there is no comprehensive framework that provides a holistic system view for optimizing LLM servin…

    Submitted 10 September, 2024; v1 submitted 23 July, 2024; originally announced August 2024.

  49. arXiv:2408.00001  [pdf, other]

    cs.CV cs.AI cs.CY

    Replication in Visual Diffusion Models: A Survey and Outlook

    Authors: Wenhao Wang, Yifan Sun, Zongxin Yang, Zhengdong Hu, Zhentao Tan, Yi Yang

    Abstract: Visual diffusion models have revolutionized the field of creative AI, producing high-quality and diverse content. However, they inevitably memorize training images or videos, subsequently replicating their concepts, content, or styles during inference. This phenomenon raises significant concerns about privacy, security, and copyright within generated outputs. In this survey, we provide the first c…

    Submitted 7 July, 2024; originally announced August 2024.

    Comments: The first survey focuses on replication in visual diffusion models. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  50. arXiv:2407.20299  [pdf, other]

    cs.LG cs.AI

    Dataset Distillation for Offline Reinforcement Learning

    Authors: Jonathan Light, Yuanzhe Liu, Ziniu Hu

    Abstract: Offline reinforcement learning often requires a quality dataset that we can train a policy on. However, in many situations, it is not possible to get such a dataset, nor is it easy to train a policy to perform well in the actual environment given the offline data. We propose using data distillation to train and distill a better dataset which can then be used for training a better policy model. We…

    Submitted 31 July, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: ICML 2024 DMLR Workshop