Showing 1–50 of 839 results for author: Ye, Y

  1. arXiv:2409.01935  [pdf, other]

    cs.CV

    Map-Assisted Remote-Sensing Image Compression at Extremely Low Bitrates

    Authors: Yixuan Ye, Ce Wang, Wanjie Sun, Zhenzhong Chen

    Abstract: Remote-sensing (RS) image compression at extremely low bitrates has always been a challenging task in practical scenarios like edge device storage and narrow bandwidth transmission. Generative models including VAEs and GANs have been explored to compress RS images into extremely low-bitrate streams. However, these generative models struggle to reconstruct visually plausible images due to the highl…

    Submitted 3 September, 2024; originally announced September 2024.

  2. arXiv:2408.15011  [pdf, other]

    cs.CV

    Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training

    Authors: Xingliang Lei, Yiwen Ye, Ziyang Chen, Minglei Shu, Yong Xia

    Abstract: Parameter-efficient fine-tuning (PEFT) techniques have emerged to address issues of overfitting and high computational costs associated with fully fine-tuning in the paradigm of self-supervised learning. Mainstream methods based on PEFT involve adding a few trainable parameters while keeping the pre-trained parameters of the backbone fixed. These methods achieve comparative, and often superior, pe…

    Submitted 27 August, 2024; originally announced August 2024.

  3. arXiv:2408.12814  [pdf, other]

    cs.CV

    From Few to More: Scribble-based Medical Image Segmentation via Masked Context Modeling and Continuous Pseudo Labels

    Authors: Zhisong Wang, Yiwen Ye, Ziyang Chen, Minglei Shu, Yong Xia

    Abstract: Scribble-based weakly supervised segmentation techniques offer comparable performance to fully supervised methods while significantly reducing annotation costs, making them an appealing alternative. Existing methods often rely on auxiliary tasks to enforce semantic consistency and use hard pseudo labels for supervision. However, these methods often overlook the unique requirements of models traine…

    Submitted 22 August, 2024; originally announced August 2024.

  4. arXiv:2408.10318  [pdf, other]

    hep-ph hep-th

    Positivity Bounds in Scalar Effective Field Theories at One-loop Level

    Authors: Yunxiao Ye, Bin He, Jiayin Gu

    Abstract: Parameters in an effective field theory can be subject to certain positivity bounds if one requires a UV completion that obeys the fundamental principles of quantum field theory. These bounds are relatively straightforward at the tree level, but would become more obscure when loop effects are important. Using scalar theories as examples, we carefully examine the positivity bounds in a case where the…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 22 pages, 4 figures

  5. arXiv:2408.09698  [pdf, other]

    cs.IR cs.AI

    Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

    Authors: Yuyang Ye, Zhi Zheng, Yishan Shen, Tianshu Wang, Hengruo Zhang, Peijun Zhu, Runlong Yu, Kai Zhang, Hui Xiong

    Abstract: Recent advances in Large Language Models (LLMs) have demonstrated significant potential in the field of Recommendation Systems (RSs). Most existing studies have focused on converting user behavior logs into textual prompts and leveraging techniques such as prompt tuning to enable LLMs for recommendation tasks. Meanwhile, research interest has recently grown in multimodal recommendation systems tha…

    Submitted 20 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  6. arXiv:2408.08682  [pdf, other]

    cs.AI cs.CL cs.CV

    LLM-PCGC: Large Language Model-based Point Cloud Geometry Compression

    Authors: Yuqi Ye, Wei Gao

    Abstract: The key to effective point cloud compression is to obtain a robust context model consistent with complex 3D data structures. Recently, the advancement of large language models (LLMs) has highlighted their capabilities not only as powerful generators for in-context learning and generation but also as effective compressors. These dual attributes of LLMs make them particularly well-suited to meet the…

    Submitted 16 August, 2024; originally announced August 2024.

  7. arXiv:2408.08147  [pdf, other]

    cs.DC cs.CL cs.LG

    P/D-Serve: Serving Disaggregated Large Language Model at Scale

    Authors: Yibo Jin, Tao Wang, Huimin Lin, Mingyang Song, Peiyang Li, Yipeng Ma, Yicheng Shan, Zhengfan Yuan, Cailong Li, Yajing Sun, Tiandeng Wu, Xing Chu, Ruizhi Huan, Li Ma, Xiao You, Wenting Zhou, Yunpeng Ye, Wen Liu, Xiangkun Xu, Yongsheng Zhang, Tiantian Dong, Jiawei Zhu, Zhe Wang, Xijian Ju, Jianxun Song , et al. (5 additional authors not shown)

    Abstract: Serving disaggregated large language models (LLMs) over tens of thousands of xPU devices (GPUs or NPUs) with reliable performance faces multiple challenges. 1) Ignoring the diversity (various prefixes and tidal requests), treating all the prompts in a mixed pool is inadequate. To facilitate the similarity per scenario and minimize the inner mismatch on P/D (prefill and decoding) processing, fine-g…

    Submitted 15 August, 2024; originally announced August 2024.

  8. arXiv:2408.07343  [pdf, other]

    cs.CV

    Gradient Alignment Improves Test-Time Adaptation for Medical Image Segmentation

    Authors: Ziyang Chen, Yiwen Ye, Yongsheng Pan, Yong Xia

    Abstract: Although recent years have witnessed significant advancements in medical image segmentation, the pervasive issue of domain shift among medical images from diverse centres hinders the effective deployment of pre-trained models. Many Test-time Adaptation (TTA) methods have been proposed to address this issue by fine-tuning pre-trained models with test data during inference. These methods, however, o…

    Submitted 16 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

  9. arXiv:2408.06174  [pdf]

    cond-mat.supr-con

    Emergent superconductivity and pair density wave at antiphase boundaries of charge density wave order in kagome metals

    Authors: Xianghe Han, Hui Chen, Hengxin Tan, Zhongyi Cao, Zihao Huang, Yuhan Ye, Zhen Zhao, Chengmin Shen, Haitao Yang, Binghai Yan, Ziqiang Wang, Hong-Jun Gao

    Abstract: Central to the layered kagome lattice superconductors AV3Sb5 (A = K, Cs, Rb) is a cascade of novel quantum states triggered by an unconventional charge density wave (CDW) order. The three-dimensional (3D) order involves a 2x2x2 phase coherent stacking of 2x2 charge density modulations in the kagome plane at low temperatures, exhibiting a CDW energy gap and evidence for time-reversal symmetry break…

    Submitted 12 August, 2024; originally announced August 2024.

  10. arXiv:2408.03680  [pdf, other]

    cs.SE

    Iterative Knowledge Distillation through Feedback-Driven Learning Cycles

    Authors: Yujia Chen, Yang Ye, Zhongqi Li, Yuchi Ma, Cuiyun Gao

    Abstract: Large code models (LCMs) have remarkably advanced the field of code intelligence. Despite their impressive capabilities, they still face practical employment challenges, such as high costs, limited accessibility of proprietary LCMs, and adaptability issues of ultra-large LCMs. These challenges highlight the critical need for more accessible, lightweight yet effective LCMs. In this paper, we propos…

    Submitted 7 August, 2024; originally announced August 2024.

  11. arXiv:2408.01091  [pdf, other]

    cs.AI

    Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions

    Authors: Jin Gao, Lei Gan, Yuankai Li, Yixin Ye, Dequan Wang

    Abstract: Large multimodal models (LMMs) excel in adhering to human instructions. However, self-contradictory instructions may arise due to the increasing trend of multimodal interaction and context length, which is challenging for language beginners and vulnerable populations. We introduce the Self-Contradictory Instructions benchmark to evaluate the capability of LMMs in recognizing conflicting commands.…

    Submitted 5 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

    Comments: Accepted by the 18th European Conference on Computer Vision ECCV 2024

  12. arXiv:2408.00310  [pdf, other]

    cs.LG math.OC

    Online Linear Programming with Batching

    Authors: Haoran Xu, Peter W. Glynn, Yinyu Ye

    Abstract: We study Online Linear Programming (OLP) with batching. The planning horizon is cut into $K$ batches, and the decisions on customers arriving within a batch can be delayed to the end of their associated batch. Compared with OLP without batching, the ability to delay decisions brings better operational performance, as measured by regret. Two research questions of interest are: (1) What is a lower b…

    Submitted 1 August, 2024; originally announced August 2024.

  13. arXiv:2408.00230  [pdf, other]

    cs.AI cs.CL

    Lost in Translation: Latent Concept Misalignment in Text-to-Image Diffusion Models

    Authors: Juntu Zhao, Junyu Deng, Yixin Ye, Chongxuan Li, Zhijie Deng, Dequan Wang

    Abstract: Advancements in text-to-image diffusion models have broadened extensive downstream practical applications, but such models often encounter misalignment issues between text and image. Taking the generation of a combination of two disentangled concepts as an example, say given the prompt "a tea cup of iced coke", existing models usually generate a glass cup of iced coke because the iced coke usually…

    Submitted 5 August, 2024; v1 submitted 31 July, 2024; originally announced August 2024.

    Comments: Accepted by the 18th European Conference on Computer Vision ECCV 2024

  14. arXiv:2407.21415  [pdf, other]

    quant-ph

    In situ Qubit Frequency Tuning Circuit for Scalable Superconducting Quantum Computing: Scheme and Experiment

    Authors: Lei Jiang, Yu Xu, Shaowei Li, Zhiguang Yan, Ming Gong, Tao Rong, Chenyin Sun, Tianzuo Sun, Tao Jiang, Hui Deng, Chen Zha, Jin Lin, Fusheng Chen, Qingling Zhu, Yangsen Ye, Hao Rong, Kai Yan, Sirui Cao, Yuan Li, Shaojun Guo, Haoran Qian, Yisen Hu, Yulin Wu, Yuhuai Li, Gang Wu , et al. (8 additional authors not shown)

    Abstract: Frequency tunable qubit plays a significant role for scalable superconducting quantum processors. The state-of-the-art room-temperature electronics for tuning qubit frequency suffers from unscalable limit, such as heating problem, linear growth of control cables, etc. Here we propose a scalable scheme to tune the qubit frequency by using in situ superconducting circuit, which is based on radio fre…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: 9 pages, 6 figures

  15. arXiv:2407.21075  [pdf, other]

    cs.AI cs.CL cs.LG

    Apple Intelligence Foundation Language Models

    Authors: Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek , et al. (130 additional authors not shown)

    Abstract: We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used…

    Submitted 29 July, 2024; originally announced July 2024.

  16. arXiv:2407.20174  [pdf, other]

    cs.CV cs.AI

    Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning

    Authors: Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng

    Abstract: Emerging multimodal large language models (MLLMs) exhibit great potential for chart question answering (CQA). Recent efforts primarily focus on scaling up training datasets (i.e., charts, data tables, and question-answer (QA) pairs) through data collection and synthesis. However, our empirical study on existing MLLMs and CQA datasets reveals notable gaps. First, current data collection and synthes…

    Submitted 11 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: 11 pages, 7 figures

  17. WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds

    Authors: Peizhuo Li, Sebastian Starke, Yuting Ye, Olga Sorkine-Hornung

    Abstract: We present a new approach for understanding the periodicity structure and semantics of motion datasets, independently of the morphology and skeletal structure of characters. Unlike existing methods using an overly sparse high-dimensional latent, we propose a phase manifold consisting of multiple closed curves, each corresponding to a latent amplitude. With our proposed vector quantized periodic au…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: SIGGRAPH 2024. Project page: https://peizhuoli.github.io/walkthedog Video: https://www.youtube.com/watch?v=tNVO2jqeTNw

  18. arXiv:2407.18172  [pdf]

    physics.optics

    Chip-scale sensor for spectroscopic metrology

    Authors: Chunhui Yao, Wanlu Zhang, Peng Bao, Jie Ma, Wei Zhuo, Minjia Chen, Zhitian Shi, Jingwen Zhou, Yuxiao Ye, Liang Ming, Ting Yan, Richard Penty, Qixiang Cheng

    Abstract: Miniaturized spectrometers hold great promise for in situ, in vitro, and even in vivo sensing applications. However, their size reduction imposes vital performance constraints in meeting the rigorous demands of spectroscopy, including fine resolution, high accuracy, and ultra-wide observation window. The prevailing view in the community holds that miniaturized spectrometers are most suitable for t…

    Submitted 12 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  19. arXiv:2407.17060  [pdf, other]

    cs.CV cs.AI cs.CL eess.IV

    High Efficiency Image Compression for Large Visual-Language Models

    Authors: Binzhe Li, Shurun Wang, Shiqi Wang, Yan Ye

    Abstract: In recent years, large visual language models (LVLMs) have shown impressive performance and promising generalization capability in multi-modal tasks, thus replacing humans as receivers of visual information in various application scenarios. In this paper, we pioneer to propose a variable bitrate image compression framework consisting of a pre-editing module and an end-to-end codec to achieve promi…

    Submitted 24 July, 2024; originally announced July 2024.

  20. arXiv:2407.16660  [pdf, other]

    cs.DB

    Dynamic Subgraph Matching via Cost-Model-based Vertex Dominance Embeddings (Technical Report)

    Authors: Yutong Ye, Xiang Lian, Nan Zhang, Mingsong Chen

    Abstract: In many real-world applications such as social network analysis, knowledge graph discovery, biological network analytics, and so on, graph data management has become increasingly important and has drawn much attention from the database community. While many graphs (e.g., Twitter, Wikipedia, etc.) are usually evolving over time, it is of great importance to study the dynamic subgraph matching (DSM…

    Submitted 31 July, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

  21. arXiv:2407.16532  [pdf, other]

    cond-mat.soft physics.flu-dyn

    Propulsion Contribution from Individual Filament in Flagellar Bundle

    Authors: Jin Zhu, Yateng Qiao, Lingchun Yan, Yan Zeng, Yibo Wu, Hongyi Bian, Yidi Huang, Yuxin Ye, Yingyue Huang, Russell Hii Ching Wei, Yinuo Teng, Yunlong Guo, Gaojin Li, Zijie Qu

    Abstract: Flagellated microorganisms overcome the low-Reynolds-number time reversibility by rotating helical flagella. For peritrichous bacteria, such as Escherichia coli, the randomly distributed flagellar filaments align along the same direction to form a bundle, facilitating complex locomotive strategies. To understand the process of flagella bundling, especially the propulsion force, we develop a multi-…

    Submitted 23 July, 2024; originally announced July 2024.

  22. arXiv:2407.16404  [pdf]

    eess.SY

    Evaluating Uncertainties in Electricity Markets via Machine Learning and Quantum Computing

    Authors: Shuyang Zhu, Ziqing Zhu, Linghua Zhu, Yujian Ye, Siqi Bu, Sasa Z. Djokic

    Abstract: The analysis of decision-making process in electricity markets is crucial for understanding and resolving issues related to market manipulation and reduced social welfare. Traditional Multi-Agent Reinforcement Learning (MARL) method can model decision-making of generation companies (GENCOs), but faces challenges due to uncertainties in policy functions, reward functions, and inter-agent interactio…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 3 pages, 3 figures, plan for submitting to IEEE Power Engineering Letters

  23. arXiv:2407.15098  [pdf, other]

    cs.CR cs.LG

    SeqMIA: Sequential-Metric Based Membership Inference Attack

    Authors: Hao Li, Zheng Li, Siyuan Wu, Chengrui Hu, Yutong Ye, Min Zhang, Dengguo Feng, Yang Zhang

    Abstract: Most existing membership inference attacks (MIAs) utilize metrics (e.g., loss) calculated on the model's final state, while recent advanced attacks leverage metrics computed at various stages, including both intermediate and final stages, throughout the model training. Nevertheless, these attacks often process multiple intermediate states of the metric independently, ignoring their time-dependent…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM CCS 2024

  24. arXiv:2407.15049  [pdf, other]

    math.OC

    Accelerating Low-Rank Factorization-Based Semidefinite Programming Algorithms on GPU

    Authors: Qiushi Han, Zhenwei Lin, Hanwen Liu, Caihua Chen, Qi Deng, Dongdong Ge, Yinyu Ye

    Abstract: In this paper, we address a long-standing challenge: how to achieve both efficiency and scalability in solving semidefinite programming problems. We propose breakthrough acceleration techniques for a wide range of low-rank factorization-based first-order methods using GPUs, making the computation much more efficient and scalable. To illustrate the idea and effectiveness of our approach, we use the…

    Submitted 23 August, 2024; v1 submitted 21 July, 2024; originally announced July 2024.

  25. arXiv:2407.14982  [pdf, other]

    cs.CV cs.AI

    GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation

    Authors: Jingzhi Gong, Sisi Li, Giordano d'Aloisio, Zishuo Ding, Yulong Ye, William B. Langdon, Federica Sarro

    Abstract: Tuning the parameters and prompts for improving AI-based text-to-image generation has remained a substantial yet unaddressed challenge. Hence we introduce GreenStableYolo, which improves the parameters and prompts for Stable Diffusion to both reduce GPU inference time and increase image generation quality using NSGA-II and Yolo. Our experiments show that despite a relatively slight trade-off (18…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: This paper is published in the SSBSE Challenge Track 2024

  26. arXiv:2407.13230  [pdf, other]

    cond-mat.mes-hall

    Probing spin textures in atomically thin CrSBr through tunneling magnetoresistance

    Authors: Ziqi Liu, Chengfeng Zhu, Yuchen Gao, Zuxin Chen, Pingfan Gu, Yu Ye

    Abstract: The exploration of spin configurations and magnetoresistance in van der Waals magnetic semiconductors, particularly in the realm of thin-layer structures, holds paramount significance for the development of two-dimensional spintronic nanodevices. In this Letter, we conducted comprehensive magnetotransport measurements on a few-layer CrSBr using a vertical tunneling device configuration. Notably, o…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures

  27. arXiv:2407.12315  [pdf, other]

    cs.CV cs.AI cs.HC cs.IR

    ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map

    Authors: Yilin Ye, Shishi Xiao, Xingchen Zeng, Wei Zeng

    Abstract: Multi-modal embeddings form the foundation for vision-language models, such as CLIP embeddings, the most widely used text-image embeddings. However, these embeddings are vulnerable to subtle misalignment of cross-modal features, resulting in decreased model performance and diminished generalization. To address this problem, we design ModalChorus, an interactive system for visual probing and alignm…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by VIS 2024

  28. arXiv:2407.09371  [pdf, other]

    stat.ME econ.EM stat.CO

    Computationally Efficient Estimation of Large Probit Models

    Authors: Patrick Ding, Guido Imbens, Zhaonan Qu, Yinyu Ye

    Abstract: Probit models are useful for modeling correlated discrete responses in many disciplines, including discrete choice data in economics. However, the Gaussian latent variable feature of probit models coupled with identification constraints pose significant computational challenges for its estimation and inference, especially when the dimension of the discrete response variable is large. In this paper…

    Submitted 12 July, 2024; originally announced July 2024.

  29. PAIL: Performance based Adversarial Imitation Learning Engine for Carbon Neutral Optimization

    Authors: Yuyang Ye, Lu-An Tang, Haoyu Wang, Runlong Yu, Wenchao Yu, Erhu He, Haifeng Chen, Hui Xiong

    Abstract: Achieving carbon neutrality within industrial operations has become increasingly imperative for sustainable development. It is both a significant challenge and a key opportunity for operational optimization in industry 4.0. In recent years, Deep Reinforcement Learning (DRL) based methods offer promising enhancements for sequential optimization processes and can be used for reducing carbon emission…

    Submitted 11 July, 2024; originally announced July 2024.

  30. arXiv:2407.07320  [pdf]

    cs.LG cs.RO

    Flow to Rare Events: An Application of Normalizing Flow in Temporal Importance Sampling for Automated Vehicle Validation

    Authors: Yichun Ye, He Zhang, Ye Tian, Jian Sun

    Abstract: Automated Vehicle (AV) validation based on simulated testing requires unbiased evaluation and high efficiency. One effective solution is to increase the exposure to risky rare events while reweighting the probability measure. However, characterizing the distribution of risky events is particularly challenging due to the paucity of samples and the temporality of continuous scenario variables. To so…

    Submitted 9 July, 2024; originally announced July 2024.

  31. arXiv:2407.06061  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci

    Superconductivity up to 14.2 K in MnB$_4$ under pressure

    Authors: Zhe-Ning Xiang, Ying-Jie Zhang, Qing Lu, Qing Li, Yiwen Li, Tianheng Huang, Yijie Zhu, Yongze Ye, Jian Sun, Hai-Hu Wen

    Abstract: The discovery of superconductivity in 3$d$-transition metal compounds with strong magnetism is interesting but rare. Especially for Mn-based compounds, there exist only very limited materials that show superconductivity. Here, we report the discovery of superconductivity up to 14.2 K in a Mn-based material MnB$_4$. By applying high pressures, we found the continuous suppression of a weak insulatin…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 36 pages total; 24 pages of main text with 5 figures, 12 pages of supplement with 1 table and 8 figures

  32. arXiv:2407.05798  [pdf]

    cond-mat.supr-con cond-mat.mes-hall cond-mat.mtrl-sci

    Visualization of Unconventional Rashba Band and Vortex Zero Mode in Topological Superconductor Candidate AuSn$_{4}$

    Authors: Yuhan Ye, Rui Song, Hongqin Xiao, Guoyu Xian, Hui Guo, Haitao Yang, Hui Chen, Hong-Jun Gao

    Abstract: Topological superconductivity (TSC) is a promising platform to host Majorana zero mode (MZM) for topological quantum computing. Recently, the noble metal alloy AuSn$_{4}$ has been identified as an intrinsic surface TSC. However, the atomic visualization of its nontrivial surface states and MZM remains elusive. Here, we report the direct observation of unconventional surface states and vortex zero…

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 17 pages, 4 figures

  33. arXiv:2407.04984  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Prolonged Phase Segregation of Mixed-Halide Perovskite Nanocrystals in the Dark

    Authors: Xueying Ma, Yuhui Ye, Yang Xiao, Shengnan Feng, Chunfeng Zhang, Keyu Xia, Fengrui Hu, Min Xiao, Xiaoyong Wang

    Abstract: A critical issue hindering the potential applications of semiconductor mixed-halide perovskites is the phase segregation effect, wherein localized regions enriched with one type of halide anions would be formed upon continuous photogeneration of the excited-state charge carriers. These unexpected phases are capable of remixing again in the dark under the entropic driving force, the process of whic…

    Submitted 6 July, 2024; originally announced July 2024.

  34. arXiv:2407.01976  [pdf, other]

    cs.CL cs.AI cs.MM

    A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding

    Authors: Jinghui Lu, Haiyang Yu, Yanjie Wang, Yongjie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao Liu, Can Huang

    Abstract: Recently, many studies have demonstrated that exclusively incorporating OCR-derived text and spatial layouts with large language models (LLMs) can be highly effective for document understanding tasks. However, existing methods that integrate spatial layouts with text have limitations, such as producing overly long text sequences or failing to fully leverage the autoregressive traits of LLMs. In th…

    Submitted 24 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  35. arXiv:2406.18572  [pdf, other]

    cs.CV cs.LG

    GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model

    Authors: Ling Li, Yu Ye, Bingchuan Jiang, Wei Zeng

    Abstract: This work tackles the problem of geo-localization with a new paradigm using a large vision-language model (LVLM) augmented with human inference knowledge. A primary challenge here is the scarcity of data for training the LVLM - existing street-view datasets often contain numerous low-quality images lacking visual clues, and lack any reasoning inference. To address the data-quality issue, we devise…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  36. arXiv:2406.18051  [pdf, other]

    cs.CV

    ViT-1.58b: Mobile Vision Transformers in the 1-bit Era

    Authors: Zhengqing Yuan, Rong Zhou, Hongyi Wang, Lifang He, Yanfang Ye, Lichao Sun

    Abstract: Vision Transformers (ViTs) have achieved remarkable performance in various image classification tasks by leveraging the attention mechanism to process image patches as tokens. However, the high computational and memory demands of ViTs pose significant challenges for deployment in resource-constrained environments. This paper introduces ViT-1.58b, a novel 1.58-bit quantized ViT model designed to dr…

    Submitted 26 June, 2024; originally announced June 2024.

  37. arXiv:2406.17642  [pdf, other]

    cs.CL cs.AI

    Banishing LLM Hallucinations Requires Rethinking Generalization

    Authors: Johnny Li, Saksham Consul, Eda Zhou, James Wong, Naila Farooqui, Yuxin Ye, Nithyashree Manohar, Zhuxiaona Wei, Tian Wu, Ben Echols, Sharon Zhou, Gregory Diamos

    Abstract: Despite their powerful chat, coding, and reasoning abilities, Large Language Models (LLMs) frequently hallucinate. Conventional wisdom suggests that hallucinations are a consequence of a balance between creativity and factuality, which can be mitigated, but not eliminated, by grounding the LLM in external knowledge sources. Through extensive systematic experiments, we show that these traditional a…

    Submitted 25 June, 2024; originally announced June 2024.

  38. arXiv:2406.16793  [pdf, other]

    cs.LG cs.AI

    Adam-mini: Use Fewer Learning Rates To Gain More

    Authors: Yushun Zhang, Congliang Chen, Ziniu Li, Tian Ding, Chenwei Wu, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun

    Abstract: We propose Adam-mini, an optimizer that achieves on-par or better performance than AdamW with 45% to 50% less memory footprint. Adam-mini reduces memory by cutting down the learning rate resources in Adam (i.e., $1/\sqrt{v}$). We find that $\geq$ 90% of these learning rates in $v$ could be harmlessly removed if we (1) carefully partition the parameters into blocks following our proposed principle…

    Submitted 3 July, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  39. arXiv:2406.15796  [pdf, other]

    cs.CL

    Rethinking Entity-level Unlearning for Large Language Models

    Authors: Weitao Ma, Xiaocheng Feng, Weihong Zhong, Lei Huang, Yangfan Ye, Bing Qin

    Abstract: Large language model unlearning has gained increasing attention due to its potential to mitigate security and privacy concerns. Current research predominantly focuses on Instance-level unlearning, specifically aiming at forgetting predefined instances of sensitive content. However, a notable gap still exists in exploring the deletion of complete entity-related information, which is crucial in many…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Work in progress

  40. arXiv:2406.12753  [pdf, other]

    cs.CL cs.AI

    OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang , et al. (3 additional authors not shown)

    Abstract: The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoni…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 44 pages

  41. arXiv:2406.12433  [pdf, other]

    cs.IR

    LLM-enhanced Reranking in Recommender Systems

    Authors: Jingtong Gao, Bo Chen, Xiangyu Zhao, Weiwen Liu, Xiangyang Li, Yichao Wang, Zijian Zhang, Wanyu Wang, Yuyang Ye, Shanru Lin, Huifeng Guo, Ruiming Tang

    Abstract: Reranking is a critical component in recommender systems, playing an essential role in refining the output of recommendation algorithms. Traditional reranking models have focused predominantly on accuracy, but modern applications demand consideration of additional criteria such as diversity and fairness. Existing reranking approaches often fail to harmonize these diverse criteria effectively at th…

    Submitted 20 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  42. arXiv:2406.11937  [pdf, other]

    physics.ins-det hep-ex physics.data-an

    Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter

    Authors: M. Aamir, B. Acar, G. Adamov, T. Adams, C. Adloff, S. Afanasiev, C. Agrawal, C. Agrawal, A. Ahmad, H. A. Ahmed, S. Akbar, N. Akchurin, B. Akgul, B. Akgun, R. O. Akpinar, E. Aktas, A. AlKadhim, V. Alexakhin, J. Alimena, J. Alison, A. Alpana, W. Alshehri, P. Alvarez Dominguez, M. Alyari, C. Amendola , et al. (550 additional authors not shown)

    Abstract: A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles readout by SiPMs and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadr…

    Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Prepared for submission to JINST

  43. arXiv:2406.09905  [pdf, other]

    cs.CV cs.GR

    Nymeria: A Massive Collection of Multimodal Egocentric Daily Motion in the Wild

    Authors: Lingni Ma, Yuting Ye, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Rowan Postyeni, Luis Pesqueira, Alexander Gamino, Vijay Baiyya, Hyo Jin Kim, Kevin Bailey, David Soriano Fosas, C. Karen Liu, Ziwei Liu, Jakob Engel, Renzo De Nardi, Richard Newcombe

    Abstract: We introduce Nymeria - a large-scale, diverse, richly annotated human motion dataset collected in the wild with multiple multimodal egocentric devices. The dataset comes with a) full-body 3D motion ground truth; b) egocentric multimodal recordings from Project Aria devices with RGB, grayscale, eye-tracking cameras, IMUs, magnetometer, barometer, and microphones; and c) an additional "observer" dev…

    Submitted 14 June, 2024; originally announced June 2024.

  44. arXiv:2406.08698  [pdf, other]

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  45. arXiv:2406.01326  [pdf, other]

    cs.CV

    TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

    Authors: Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Shu Wei, Binghong Wu, Lei Liao, Yongjie Ye, Hao Liu, Houqiang Li, Can Huang

    Abstract: Tables contain factual and quantitative data accompanied by various structures and contents that pose challenges for machine comprehension. Previous methods generally design task-specific architectures and objectives for individual tasks, resulting in modal isolation and intricate workflows. In this paper, we present a novel large vision-language model, TabPedia, equipped with a concept synergy me…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 20 pages, 8 figures

  46. arXiv:2406.00274  [pdf, other]

    math.OC

    A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes

    Authors: Zhenwei Lin, Chenyu Xue, Qi Deng, Yinyu Ye

    Abstract: Robust Markov Decision Processes (RMDPs) have recently been recognized as a valuable and promising approach to discovering a policy with creditable performance, particularly in the presence of a dynamic environment and estimation errors in the transition matrix due to limited data. Despite extensive exploration of dynamic programming algorithms for solving RMDPs, there has been a notable upswing i…

    Submitted 31 May, 2024; originally announced June 2024.

  47. arXiv:2405.20001  [pdf, other]

    hep-ph

    Prospect of measuring the top quark mass through energy correlators

    Authors: Meng Xiao, Yulei Ye, Xinyu Zhu

    Abstract: Reaching a high precision of the top quark mass is an important task of the Large Hadron Collider. We perform a feasibility study of measuring the top quark mass through the three-point energy correlator. The expected sensitivity of the top quark mass in the boosted regime is presented. We further introduce its application to the low top $p_\text{T}$ regime and demonstrate that both the W boson an…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  48. arXiv:2405.18666  [pdf]

    physics.optics

    On-Chip Vectorial Structured Light Manipulation via Inverse Design

    Authors: Xiaobin Lin, Maoliang Wei, Kunhao Lei, Zijia Wang, Chi Wang, Hui Ma, Yuting Ye, Qiwei Zhan, Da Li, Shixun Dai, Baile Zhang, Xiaoyong Hu, Lan Li, Erping Li, Hongtao Lin

    Abstract: On-chip structured light, with potentially infinite complexity, has emerged as a linchpin in the realm of integrated photonics. However, the realization of arbitrarily tailoring a multitude of light field dimensions in complex media remains a challenge. Through associating physical light fields and mathematical function spaces by introducing a mapping operator, we proposed a data-driven inverse d…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 50 pages, 18 figures

  49. arXiv:2405.16160  [pdf, other]

    math.OC

    Restarted Primal-Dual Hybrid Conjugate Gradient Method for Large-Scale Quadratic Programming

    Authors: Yicheng Huang, Wanyu Zhang, Hongpei Li, Weihan Xue, Dongdong Ge, Huikang Liu, Yinyu Ye

    Abstract: Convex quadratic programming (QP) is an essential class of optimization problems with broad applications across various fields. Traditional QP solvers, typically based on simplex or barrier methods, face significant scalability challenges. In response to these limitations, recent research has shifted towards matrix-free first-order methods to enhance scalability in QP. Among these, the restarted a…

    Submitted 25 May, 2024; originally announced May 2024.

  50. arXiv:2405.16079  [pdf, other]

    cond-mat.mes-hall

    Intrinsic localized excitons in MoSe$_2$/CrSBr heterostructures

    Authors: Xinyue Huang, Zhigang Song, Yuchen Gao, Pingfan Gu, Kenji Watanabe, Takashi Taniguchi, Shiqi Yang, Zuxin Chen, Yu Ye

    Abstract: We present a comprehensive investigation of optical properties in MoSe$_2$/CrSBr heterostructures, unveiling the presence of localized excitons represented by a new emission feature, X$^*$. We demonstrate through temperature- and power-dependent photoluminescence spectroscopy that X$^*$ originates from excitons confined by intrinsic defects within the CrSBr layer. The valley polarization of X$^*$…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures