
Showing 1–50 of 391 results for author: Zhou, F

Searching in archive cs.
  1. arXiv:2408.15601  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    Grand canonical generative diffusion model for crystalline phases and grain boundaries

    Authors: Bo Lei, Enze Chen, Hyuna Kwon, Tim Hsu, Babak Sadigh, Vincenzo Lordi, Timofey Frolov, Fei Zhou

    Abstract: The diffusion model has emerged as a powerful tool for generating atomic structures for materials science. This work calls attention to the deficiency of current particle-based diffusion models, which represent atoms as a point cloud, in generating even the simplest ordered crystalline structures. The problem is attributed to particles being trapped in local minima during the score-driven simulate…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.03616  [pdf, other]

    eess.IV cs.CV

    Distillation Learning Guided by Image Reconstruction for One-Shot Medical Image Segmentation

    Authors: Feng Zhou, Yanjie Zhou, Longjie Wang, Yun Peng, David E. Carlson, Liyun Tu

    Abstract: Traditional one-shot medical image segmentation (MIS) methods use registration networks to propagate labels from a reference atlas or rely on comprehensive sampling strategies to generate synthetic labeled data for training. However, these methods often struggle with registration errors and low-quality synthetic images, leading to poor performance and generalization. To overcome this, we introduce…

    Submitted 7 August, 2024; originally announced August 2024.

  3. arXiv:2408.02265  [pdf, other]

    cs.CV

    Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts

    Authors: Andong Tan, Fengtao Zhou, Hao Chen

    Abstract: The concept bottleneck model (CBM) is an interpretable-by-design framework that makes decisions by first predicting a set of interpretable concepts, and then predicting the class label based on the given concepts. Existing CBMs are trained with a fixed set of concepts (concepts are either annotated by the dataset or queried from language models). However, this closed-world assumption is unrealisti…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: ECCV2024

  4. arXiv:2407.21465  [pdf, other]

    cs.CV

    MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection

    Authors: Kuo Wang, Lechao Cheng, Weikai Chen, Pingping Zhang, Liang Lin, Fan Zhou, Guanbin Li

    Abstract: Learning from pseudo-labels generated with Vision Language Models (VLMs) has been shown in recent studies to be a promising way to assist open-vocabulary detection (OVD). However, due to the domain gap between VLM and vision-detection tasks, pseudo-labels produced by the VLMs tend to be noisy, while the training design of the detector further amplifies the bias. In this work, we invest…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: Codes are available at https://github.com/wkfdb/MarvelOVD

  5. arXiv:2407.21298  [pdf, other]

    cs.LG cs.AI q-bio.BM

    A Vectorization Method Induced By Maximal Margin Classification For Persistent Diagrams

    Authors: An Wu, Yu Pan, Fuqi Zhou, Jinghui Yan, Chuanlu Liu

    Abstract: Persistent homology is an effective method for extracting topological information, represented as persistent diagrams, of spatial structure data. Hence it is well-suited for the study of protein structures. Attempts to incorporate persistent homology in machine learning methods for protein function prediction have resulted in several techniques for vectorizing persistent diagrams. However, current…

    Submitted 30 July, 2024; originally announced July 2024.

  6. arXiv:2407.20772  [pdf, other]

    eess.SP cs.NI

    Edge Learning Based Collaborative Automatic Modulation Classification for Hierarchical Cognitive Radio Networks

    Authors: Peihao Dong, Chaowei He, Shen Gao, Fuhui Zhou, Qihui Wu

    Abstract: In hierarchical cognitive radio networks, edge or cloud servers utilize the data collected by edge devices for modulation classification, which, however, raises problems of computation load, transmission overhead, and data privacy. In this article, an edge learning (EL) based framework jointly mobilizing the edge device and the edge server for intelligent co-inference is proposed to rea…

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Internet of Things Journal

  7. arXiv:2407.19820  [pdf, other]

    cs.CV

    ActivityCLIP: Enhancing Group Activity Recognition by Mining Complementary Information from Text to Supplement Image Modality

    Authors: Guoliang Xu, Jianqin Yin, Feng Zhou, Yonghao Dang

    Abstract: Previous methods usually only extract the image modality's information to recognize group activity. However, mining image information is approaching saturation, making it difficult to extract richer information. Therefore, extracting complementary information from other modalities to supplement image information has become increasingly important. In fact, action labels provide clear text informati…

    Submitted 29 July, 2024; originally announced July 2024.

  8. arXiv:2407.18449  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation

    Authors: Jiabo Ma, Zhengrui Guo, Fengtao Zhou, Yihui Wang, Yingxue Xu, Yu Cai, Zhengjie Zhu, Cheng Jin, Yi Lin, Xinrui Jiang, Anjia Han, Li Liang, Ronald Cheong Kin Chan, Jiguang Wang, Kwang-Ting Cheng, Hao Chen

    Abstract: Foundation models pretrained on large-scale datasets are revolutionizing the field of computational pathology (CPath). The generalization ability of foundation models is crucial for success in various downstream clinical tasks. However, current foundation models have only been evaluated on a limited number and variety of tasks, leaving their generalization ability and overall performance unclear.…

    Submitted 3 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

    Report number: I.2.10

  9. arXiv:2407.17910  [pdf, other]

    stat.ML cs.AI cs.LG

    Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences

    Authors: Runpeng Dai, Jianing Wang, Fan Zhou, Shikai Luo, Zhiwei Qin, Chengchun Shi, Hongtu Zhu

    Abstract: Off-policy evaluation (OPE) is widely applied in sectors such as pharmaceuticals and e-commerce to evaluate the efficacy of novel products or policies from offline datasets. This paper introduces a causal deepset framework that relaxes several key structural assumptions, primarily the mean-field assumption, prevalent in existing OPE methodologies that handle spatio-temporal interference. These tra…

    Submitted 25 July, 2024; originally announced July 2024.

  10. arXiv:2407.17619  [pdf, ps, other]

    cs.DS cs.CR

    The Power of Graph Sparsification in the Continual Release Model

    Authors: Alessandro Epasto, Quanquan C. Liu, Tamalika Mukherjee, Felix Zhou

    Abstract: The graph continual release model of differential privacy seeks to produce differentially private solutions to graph problems under a stream of updates where new private solutions are released after each update. Streaming graph algorithms in the non-private literature also produce (approximately) accurate solutions when provided updates in a stream, but they additionally try to achieve two other g…

    Submitted 24 July, 2024; originally announced July 2024.

  11. arXiv:2407.16724  [pdf, other]

    cs.CL

    Educating LLMs like Human Students: Structure-aware Injection of Domain Knowledge

    Authors: Kai Liu, Ze Chen, Zhihang Fu, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye

    Abstract: This paper presents a pioneering methodology, termed StructTuning, to efficiently transform foundation Large Language Models (LLMs) into domain specialists. It significantly minimizes the training corpus requirement to a mere 0.3% while achieving an impressive 50% of traditional knowledge injection performance. Our method is inspired by the educational processes for human students, particularly ho…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: N/A

  12. arXiv:2407.16434  [pdf, other]

    cs.CL

    Enhancing LLM's Cognition via Structurization

    Authors: Kai Liu, Zhihang Fu, Chao Chen, Wei Zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye

    Abstract: When reading long-form text, human cognition is complex and structurized. While large language models (LLMs) process input contexts through a causal and sequential perspective, this approach can potentially limit their ability to handle intricate and complex inputs effectively. To enhance LLM's cognition capability, this paper presents a novel concept of context structurization. Specifically, we t…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: N/A

  13. arXiv:2407.16430  [pdf, other]

    cs.CV

    Rethinking Out-of-Distribution Detection on Imbalanced Data Distribution

    Authors: Kai Liu, Zhihang Fu, Sheng Jin, Chao Chen, Ze Chen, Rongxin Jiang, Fan Zhou, Yaowu Chen, Jieping Ye

    Abstract: Detecting and rejecting unknown out-of-distribution (OOD) samples is critical for deployed neural networks to avoid unreliable predictions. In real-world scenarios, however, the efficacy of existing OOD detection methods is often impeded by the inherent imbalance of in-distribution (ID) data, which causes significant performance decline. Through statistical observations, we have identified two comm…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: N/A

  14. arXiv:2407.16424  [pdf, other]

    cs.CV

    ESOD: Efficient Small Object Detection on High-Resolution Images

    Authors: Kai Liu, Zhihang Fu, Sheng Jin, Ze Chen, Fan Zhou, Rongxin Jiang, Yaowu Chen, Jieping Ye

    Abstract: Enlarging input images is a straightforward and effective approach to promote small object detection. However, simple image enlargement is expensive in both computation and GPU memory. In fact, small objects are usually sparsely distributed and locally clustered. Therefore, massive feature extraction computations are wasted on the non-target background area of images. Recent works h…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: N/A

  15. arXiv:2407.16269  [pdf, other]

    cs.CV

    HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis

    Authors: Fangqin Zhou, Mert Kilickaya, Joaquin Vanschoren, Ran Piao

    Abstract: Hyperspectral Imaging (HSI) plays an increasingly critical role in precise vision tasks within remote sensing, capturing a wide spectrum of visual data. Transformer architectures have significantly enhanced HSI task performance, while advancements in Transformer Architecture Search (TAS) have improved model discovery. To harness these advancements for HSI classification, we make the following cont…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: The paper is accepted at ECCV2024

  16. arXiv:2407.16161  [pdf, other]

    cs.LG

    TransFeat-TPP: An Interpretable Deep Covariate Temporal Point Processes

    Authors: Zizhuo Meng, Boyu Li, Xuhui Fan, Zhidong Li, Yang Wang, Fang Chen, Feng Zhou

    Abstract: The classical temporal point process (TPP) constructs an intensity function by taking the occurrence times into account. Nevertheless, occurrence time may not be the only relevant factor; other contextual data, termed covariates, may also impact the event evolution. Incorporating such covariates into the model is beneficial, while distinguishing their relevance to the event dynamics is of great pr…

    Submitted 23 July, 2024; originally announced July 2024.

  17. arXiv:2407.15369  [pdf, other]

    cs.CV

    Sparse Prior Is Not All You Need: When Differential Directionality Meets Saliency Coherence for Infrared Small Target Detection

    Authors: Fei Zhou, Maixia Fu, Yulei Qian, Jian Yang, Yimian Dai

    Abstract: Infrared small target detection is crucial for the efficacy of infrared search and tracking systems. Current tensor decomposition methods emphasize representing small targets with sparsity but struggle to separate targets from complex backgrounds due to insufficient use of intrinsic directional information and reduced target visibility during decomposition. To address these challenges, this study…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Submitted to IEEE TIM, Minor Revision

  18. arXiv:2407.15362  [pdf, other]

    cs.CV cs.AI

    A Multimodal Knowledge-enhanced Whole-slide Pathology Foundation Model

    Authors: Yingxue Xu, Yihui Wang, Fengtao Zhou, Jiabo Ma, Shu Yang, Huangjing Lin, Xin Wang, Jiguang Wang, Li Liang, Anjia Han, Ronald Cheong Kin Chan, Hao Chen

    Abstract: Remarkable strides in computational pathology have been made in the task-agnostic foundation model that advances the performance of a wide array of downstream clinical tasks. Despite the promising performance, there are still several challenges. First, prior works have resorted to either vision-only or vision-captions data, disregarding invaluable pathology reports and gene expression profiles whi…

    Submitted 5 August, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: 45 pages, 9 figures

  19. arXiv:2407.14032  [pdf, other]

    cs.CV

    Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance

    Authors: Yongshuo Zhu, Lu Li, Keyan Chen, Chenyang Liu, Fugen Zhou, Zhenwei Shi

    Abstract: Remote sensing image change captioning (RSICC) aims to articulate the changes in objects of interest within bi-temporal remote sensing images using natural language. Given the limitations of current RSICC methods in expressing general features across multi-temporal and spatial scenarios, and their deficiency in providing granular, robust, and precise change descriptions, we introduce a novel chang…

    Submitted 19 July, 2024; originally announced July 2024.

  20. Motif-Consistent Counterfactuals with Adversarial Refinement for Graph-Level Anomaly Detection

    Authors: Chunjing Xiao, Shikang Pang, Wenxin Tai, Yanlong Huang, Goce Trajcevski, Fan Zhou

    Abstract: Graph-level anomaly detection is significant in diverse domains. To improve detection performance, counterfactual graphs have been exploited to benefit the generalization capacity by learning causal relations. Most existing studies directly introduce perturbations (e.g., flipping edges) to generate counterfactual graphs, which are prone to alter the semantics of generated examples and make them of…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024

  21. arXiv:2407.10737  [pdf, other]

    cs.CV cs.AI

    Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models

    Authors: Rining Wu, Feixiang Zhou, Ziwei Yin, Jian K. Liu

    Abstract: Our brains represent the ever-changing environment with neurons in a highly dynamic fashion. The temporal features of visual pixels in dynamic natural scenes are entrapped in the neuronal responses of the retina. It is crucial to establish the intrinsic temporal relationship between visual pixels and neuronal responses. Recent foundation vision models have paved an advanced way of understanding im…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: This article was accepted by ECCV 2024 (paper ID 12149). Accepted paper IDs can be found at: https://eccv2024.ecva.net/Conferences/2024/AcceptedPapers

  22. arXiv:2407.10499  [pdf, other]

    cs.CL

    CIBench: Evaluating Your LLMs with a Code Interpreter Plugin

    Authors: Songyang Zhang, Chuyu Zhang, Yingfan Hu, Haowen Shen, Kuikun Liu, Zerun Ma, Fengzhe Zhou, Wenwei Zhang, Xuming He, Dahua Lin, Kai Chen

    Abstract: While LLM-based agents, which use external tools to solve complex problems, have made significant progress, benchmarking their ability is challenging, thereby hindering a clear understanding of their limitations. In this paper, we propose an interactive evaluation framework, named CIBench, to comprehensively assess LLMs' ability to utilize code interpreters for data science tasks. Our evaluation f…

    Submitted 25 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: Under review. The first three authors contribute equally, and Songyang Zhang is the project leader

  23. arXiv:2407.07673  [pdf, other]

    cs.CV

    Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization

    Authors: Feixiang Zhou, Bryan Williams, Hossein Rahmani

    Abstract: Alleviating noisy pseudo labels remains a key challenge in Semi-Supervised Temporal Action Localization (SS-TAL). Existing methods often filter pseudo labels based on strict conditions, but they typically assess classification and localization quality separately, leading to suboptimal pseudo-label ranking and selection. In particular, there might be inaccurate pseudo labels within selected positiv…

    Submitted 24 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  24. arXiv:2407.02143  [pdf, other]

    cs.LG cs.SI

    Counterfactual Data Augmentation with Denoising Diffusion for Graph Anomaly Detection

    Authors: Chunjing Xiao, Shikang Pang, Xovee Xu, Xuan Li, Goce Trajcevski, Fan Zhou

    Abstract: A critical aspect of Graph Neural Networks (GNNs) is to enhance the node representations by aggregating node neighborhood information. However, when detecting anomalies, the representations of abnormal nodes are prone to be averaged by normal neighbors, making the learned anomaly representations less distinguishable. To tackle this issue, we propose CAGAD -- an unsupervised Counterfactual data Aug…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Transactions on Computational Social Systems (TCSS). DOI: https://doi.org/10.1109/TCSS.2024.3403503

  25. arXiv:2406.19611  [pdf, other]

    q-bio.QM cs.AI

    Multimodal Data Integration for Precision Oncology: Challenges and Future Directions

    Authors: Huajun Zhou, Fengtao Zhou, Chenyu Zhao, Yingxue Xu, Luyang Luo, Hao Chen

    Abstract: The essence of precision oncology lies in its commitment to tailor targeted treatments and care measures to each patient based on the individual characteristics of the tumor. The inherent heterogeneity of tumors necessitates gathering information from diverse data sources to provide valuable insights from various perspectives, fostering a holistic comprehension of the tumor. Over the past decade,…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 15 pages, 4 figures

  26. arXiv:2406.18364  [pdf]

    cs.CL cs.AI

    Research on Information Extraction of LCSTS Dataset Based on an Improved BERTSum-LSTM Model

    Authors: Yiming Chen, Haobin Chen, Simin Liu, Yunyun Liu, Fanhao Zhou, Bing Wei

    Abstract: With the continuous advancement of artificial intelligence, natural language processing technology has become widely utilized in various fields. At the same time, there are many challenges in creating Chinese news summaries. First of all, the semantics of Chinese news is complex, and the amount of information is enormous. Extracting critical information from Chinese news presents a significant cha…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: submitted to ICMIII 2024

  27. arXiv:2406.17797  [pdf, other]

    physics.chem-ph cs.AI cs.LG

    MoleculeCLA: Rethinking Molecular Benchmark via Computational Ligand-Target Binding Analysis

    Authors: Shikun Feng, Jiaxin Zheng, Yinjun Jia, Yanwen Huang, Fengfeng Zhou, Wei-Ying Ma, Yanyan Lan

    Abstract: Molecular representation learning is pivotal for various molecular property prediction tasks related to drug discovery. Robust and accurate benchmarks are essential for refining and validating current methods. Existing molecular property benchmarks derived from wet experiments, however, face limitations such as data volume constraints, unbalanced label distribution, and noisy labels. To address th…

    Submitted 12 June, 2024; originally announced June 2024.

  28. arXiv:2406.14887  [pdf, other]

    cs.CL

    InternLM-Law: An Open Source Chinese Legal Large Language Model

    Authors: Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge

    Abstract: While large language models (LLMs) have showcased impressive capabilities, they struggle with addressing legal queries due to the intricate complexities and specialized expertise required in the legal field. In this paper, we introduce InternLM-Law, a specialized LLM tailored for addressing diverse legal queries related to Chinese laws, spanning from responding to standard legal questions (e.g., l…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Our dataset, code and models will be released at https://github.com/InternLM/InternLM-Law

  29. arXiv:2406.13555  [pdf, other]

    cs.CL cs.AI

    BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation

    Authors: Minchong Li, Feng Zhou, Xiaohui Song

    Abstract: In recent years, large language models (LLMs) have shown exceptional capabilities across various natural language processing (NLP) tasks. However, such impressive performance often comes with the trade-off of an increased parameter size, posing significant challenges for widespread deployment. Knowledge distillation (KD) provides a solution by transferring knowledge from a large teacher model to a…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Submitted to ARR June (for EMNLP 2024)

  30. arXiv:2406.12753  [pdf, other]

    cs.CL cs.AI

    OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, et al. (3 additional authors not shown)

    Abstract: The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoni…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 44 pages

  31. GMP-AR: Granularity Message Passing and Adaptive Reconciliation for Temporal Hierarchy Forecasting

    Authors: Fan Zhou, Chen Pan, Lintao Ma, Yu Liu, James Zhang, Jun Zhou, Hongyuan Mei, Weitao Lin, Zi Zhuang, Wenxin Ning, Yunhua Hu, Siqiao Xue

    Abstract: Time series forecasts of different temporal granularity are widely used in real-world applications, e.g., sales prediction in days and weeks for making different inventory plans. However, these tasks are usually solved separately without ensuring coherence, which is crucial for aligning downstream decisions. Previous works mainly focus on ensuring coherence with some straightforward methods, e.g.,…

    Submitted 17 June, 2024; originally announced June 2024.

  32. arXiv:2406.11434  [pdf, other]

    cs.DB

    DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models

    Authors: Fan Zhou, Siqiao Xue, Danrui Qi, Wenhui Shi, Wang Zhao, Ganglin Wei, Hongyang Zhang, Caigai Jiang, Gangwei Jiang, Zhixuan Chu, Faqiang Chen

    Abstract: Large language models (LLMs) have become the dominant paradigm for the challenging task of text-to-SQL. LLM-empowered text-to-SQL methods are typically categorized into prompting-based and tuning approaches. Compared to prompting-based methods, benchmarking fine-tuned LLMs for text-to-SQL is important yet under-explored, partially attributed to the prohibitively high computational cost. In this paper,…

    Submitted 17 June, 2024; originally announced June 2024.

  33. arXiv:2406.10869  [pdf, other]

    eess.IV cs.CV

    Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution

    Authors: Cuixin Yang, Rongkang Dong, Jun Xiao, Cong Zhang, Kin-Man Lam, Fei Zhou, Guoping Qiu

    Abstract: As virtual and augmented reality applications gain popularity, omnidirectional image (ODI) super-resolution has become increasingly important. Unlike 2D plain images that are formed on a plane, ODIs are projected onto spherical surfaces. Applying established image super-resolution methods to ODIs, therefore, requires performing equirectangular projection (ERP) to map the ODIs onto a plane. ODI sup…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 13 pages, 12 figures, journal

  34. arXiv:2406.09858  [pdf, other]

    cs.CV

    Vision Language Modeling of Content, Distortion and Appearance for Image Quality Assessment

    Authors: Fei Zhou, Zhicong Huang, Tianhao Gu, Guoping Qiu

    Abstract: The visual quality of an image is confounded by a number of intertwined factors including its semantic content, distortion characteristics and appearance properties such as brightness, contrast, sharpness, and colourfulness. Distilling high-level knowledge about all these quality-bearing attributes is crucial for developing objective Image Quality Assessment (IQA). While existing solutions have mod…

    Submitted 21 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  35. arXiv:2406.05036  [pdf, other]

    cs.LG cs.AI

    TimeSieve: Extracting Temporal Dynamics through Information Bottlenecks

    Authors: Ninghui Feng, Songning Lai, Jiayu Yang, Fobao Zhou, Zhenxiao Yin, Hang Zhao

    Abstract: Time series forecasting has become an increasingly popular research area due to its critical applications in various real-world domains such as traffic management, weather prediction, and financial analysis. Despite significant advancements, existing models face notable challenges, including the necessity of manual hyperparameter tuning for different datasets, and difficulty in effectively disting…

    Submitted 21 August, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  36. arXiv:2406.03421  [pdf, other]

    cs.CV

    Post-hoc Part-prototype Networks

    Authors: Andong Tan, Fengtao Zhou, Hao Chen

    Abstract: Post-hoc explainability methods such as Grad-CAM are popular because they do not influence the performance of a trained model. However, they mainly reveal "where" a model looks for a given input and fail to explain "what" the model looks for (e.g., what is important to classify a bird image as a Scott Oriole?). Existing part-prototype networks leverage part-prototypes (e.g., characteristic Scott O…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  37. arXiv:2405.16940  [pdf, other]

    cs.CV

    Adversarial Attacks on Both Face Recognition and Face Anti-spoofing Models

    Authors: Fengfan Zhou, Qianyu Zhou, Xiangtai Li, Xuequan Lu, Lizhuang Ma, Hefei Ling

    Abstract: Adversarial attacks on Face Recognition (FR) systems have proven highly effective in compromising pure FR models, yet adversarial examples may be ineffective against complete FR systems, as Face Anti-Spoofing (FAS) models are often incorporated and can detect a significant number of them. To address this under-explored and essential problem, we propose a novel setting of adversarially attacking both…

    Submitted 27 May, 2024; originally announced May 2024.

  38. arXiv:2405.16197  [pdf, other]

    cs.CV eess.IV

    A 7K Parameter Model for Underwater Image Enhancement based on Transmission Map Prior

    Authors: Fuheng Zhou, Dikai Wei, Ye Fan, Yulong Huang, Yonggang Zhang

    Abstract: Although deep learning based models for underwater image enhancement have achieved good performance, they face limitations in both model size and effectiveness, which prevent their deployment and application on resource-constrained platforms. Moreover, most existing deep learning based models use data compression to get high-level semantic information in latent space instead of using the origina…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 10 pages

  39. arXiv:2405.16059  [pdf, other]

    cs.SI

    Interpretable Transformer Hawkes Processes: Unveiling Complex Interactions in Social Networks

    Authors: Zizhuo Meng, Ke Wan, Yadong Huang, Zhidong Li, Yang Wang, Feng Zhou

    Abstract: Social networks represent complex ecosystems where the interactions between users or groups play a pivotal role in information dissemination, opinion formation, and social interactions. Effectively harnessing event sequence data within social networks to unearth interactions among users or groups has persistently posed a challenging frontier within the realm of point processes. Current deep point…

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  40. arXiv:2405.15599  [pdf, ps, other]

    cs.LG stat.ML

    On the Computational Landscape of Replicable Learning

    Authors: Alkis Kalavasis, Amin Karbasi, Grigoris Velegkas, Felix Zhou

    Abstract: We study computational aspects of algorithmic replicability, a notion of stability introduced by Impagliazzo, Lei, Pitassi, and Sorrell [2022]. Motivated by a recent line of work that established strong statistical connections between replicability and other notions of learnability such as online learning, private learning, and SQ learning, we aim to understand better the computational connections…

    Submitted 24 May, 2024; originally announced May 2024.

  41. arXiv:2405.12209  [pdf, other]

    cs.CL

    MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark

    Authors: Hongwei Liu, Zilong Zheng, Yuxuan Qiao, Haodong Duan, Zhiwei Fei, Fengzhe Zhou, Wenwei Zhang, Songyang Zhang, Dahua Lin, Kai Chen

    Abstract: Recent advancements in large language models (LLMs) have showcased significant improvements in mathematics. However, traditional math benchmarks like GSM8k offer a unidimensional perspective, falling short in providing a holistic assessment of the LLMs' math capabilities. To address this gap, we introduce MathBench, a new benchmark that rigorously assesses the mathematical capabilities of large la…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Project: https://github.com/open-compass/MathBench

  42. arXiv:2405.11281  [pdf, other]

    cs.DC cs.AI

    Cooperative Cognitive Dynamic System in UAV Swarms: Reconfigurable Mechanism and Framework

    Authors: Ziye Jia, Jiahao You, Chao Dong, Qihui Wu, Fuhui Zhou, Dusit Niyato, Zhu Han

    Abstract: As the demands for immediate and effective responses increase in both civilian and military domains, unmanned aerial vehicle (UAV) swarms have emerged as effective solutions, in which multiple cooperative UAVs can work together to achieve specific goals. However, how to manage such complex systems to ensure real-time adaptability has not been sufficiently researched. Hence, in this paper, we propose the coope…

    Submitted 18 May, 2024; originally announced May 2024.

  43. arXiv:2405.08938  [pdf, ps, other]

    cs.DS

    Pointwise Lipschitz Continuous Graph Algorithms via Proximal Gradient Analysis

    Authors: Quanquan C. Liu, Grigoris Velegkas, Yuichi Yoshida, Felix Zhou

    Abstract: In many real-world applications, it is prohibitively expensive to drastically change the solution to a problem after a small perturbation in the environment. Therefore, the stability of an algorithm is a very desirable property. In this paper, we study the class of pointwise Lipschitz continuous algorithms as introduced in the recent work of Kumabe and Yoshida [FOCS'23]. The Lipschitz constant of…

    Submitted 9 July, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

  44. arXiv:2405.08603  [pdf, other]

    cs.CL

    A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine

    Authors: Hanguang Xiao, Feizhong Zhou, Xingyue Liu, Tianqi Liu, Zhipeng Li, Xin Liu, Xiaoxuan Huang

    Abstract: Since the release of ChatGPT and GPT-4, large language models (LLMs) and multimodal large language models (MLLMs) have garnered significant attention due to their powerful and general capabilities in understanding, reasoning, and generation, thereby offering new paradigms for the integration of artificial intelligence with medicine. This survey comprehensively overviews the development background…

    Submitted 14 May, 2024; originally announced May 2024.

  45. arXiv:2405.08005  [pdf, other]

    math.OC cs.AI cs.GT cs.LG stat.ML

    Graphon Mean Field Games with a Representative Player: Analysis and Learning Algorithm

    Authors: Fuzhong Zhou, Chenyu Zhang, Xu Chen, Xuan Di

    Abstract: We propose a discrete time graphon game formulation on continuous state and action spaces using a representative player to study stochastic games with heterogeneous interaction among agents. This formulation admits both philosophical and mathematical advantages, compared to a widely adopted formulation using a continuum of players. We prove the existence and uniqueness of the graphon equilibrium w…

    Submitted 4 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ICML 2024

  46. arXiv:2405.07088  [pdf, other]

    cs.HC

    Towards Context-Aware Modeling of Situation Awareness in Conditionally Automated Driving

    Authors: Lilit Avetisyan, X. Jessie Yang, Feng Zhou

    Abstract: Maintaining adequate situation awareness (SA) is crucial for the safe operation of conditionally automated vehicles (AVs), which requires drivers to regain control during takeover (TOR) events. This study developed a predictive model for real-time assessment of driver SA using multimodal data (e.g., galvanic skin response, heart rate and eye tracking data, and driver characteristics) collected in…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 37 Pages, 8 figures

  47. arXiv:2405.01507  [pdf, other]

    cs.LG stat.ML

    Accelerating Convergence in Bayesian Few-Shot Classification

    Authors: Tianjun Ke, Haoqun Cao, Feng Zhou

    Abstract: Bayesian few-shot classification has been a focal point in the field of few-shot learning. This paper seamlessly integrates mirror descent-based variational inference into Gaussian process-based few-shot classification, addressing the challenge of non-conjugate inference. By leveraging non-Euclidean geometry, mirror descent achieves accelerated convergence by providing the steepest descent directi…

    Submitted 7 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  48. arXiv:2404.17771  [pdf, ps, other]

    cs.CV

    Characterization of dim light response in DVS pixel: Discontinuity of event triggering time

    Authors: Xiao Jiang, Fei Zhou

    Abstract: Dynamic Vision Sensors (DVS) have recently generated great interest because of the advantages of wide dynamic range and low latency compared with conventional frame-based cameras. However, the complicated behaviors in dim light conditions are still not clear, restricting the applications of DVS. In this paper, we analyze the typical DVS circuit, and find that there exists discontinuity of event tr…

    Submitted 30 April, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: 6 pages, 4 figures

  49. arXiv:2404.15891  [pdf, other]

    cs.CV

    OMEGAS: Object Mesh Extraction from Large Scenes Guided by Gaussian Segmentation

    Authors: Lizhi Wang, Feng Zhou, Bo yu, Pu Cao, Jianqin Yin

    Abstract: Recent advancements in 3D reconstruction technologies have paved the way for high-quality and real-time rendering of complex 3D scenes. Despite these achievements, a notable challenge persists: it is difficult to precisely reconstruct specific objects from large scenes. Current scene reconstruction techniques frequently result in the loss of object detail textures and are unable to reconstruct obj…

    Submitted 27 August, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  50. arXiv:2404.12367  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.chem-ph

    Information theory unifies atomistic machine learning, uncertainty quantification, and materials thermodynamics

    Authors: Daniel Schwalbe-Koda, Sebastien Hamel, Babak Sadigh, Fei Zhou, Vincenzo Lordi

    Abstract: An accurate description of information is relevant for a range of problems in atomistic modeling, such as sampling methods, detecting rare events, analyzing datasets, or performing uncertainty quantification (UQ) in machine learning (ML)-driven simulations. Although individual methods have been proposed for each of these tasks, they lack a common theoretical background integrating their solutions.…

    Submitted 18 April, 2024; originally announced April 2024.

    Report number: LLNL-JRNL-862887-DRAFT