
Showing 1–50 of 109 results for author: Zong, Z

  1. arXiv:2408.15797  [pdf]

    physics.comp-ph

    Deep potential for interaction between hydrated Cs+ and graphene

    Authors: Yangjun Qin, Xiao Wan, Liuhua Mu, Zhicheng Zong, Tianhao Li, Nuo Yang

    Abstract: The influence of hydrated cation-π interaction forces on the adsorption and filtration capabilities of graphene-based membrane materials is significant. However, the lack of an interaction potential between hydrated Cs+ and graphene limits the scope of adsorption studies. Here, a deep neural network potential function model is developed to predict the interaction force between hydrated Cs+ an…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2406.19421  [pdf, other]

    hep-ex physics.ins-det

    The Belle II Detector Upgrades Framework Conceptual Design Report

    Authors: H. Aihara, A. Aloisio, D. P. Auguste, M. Aversano, M. Babeluk, S. Bahinipati, Sw. Banerjee, M. Barbero, J. Baudot, A. Beaubien, F. Becherer, T. Bergauer, F. U. Bernlochner, V. Bertacchi, G. Bertolone, C. Bespin, M. Bessner, S. Bettarini, A. J. Bevan, B. Bhuyan, M. Bona, J. F. Bonis, J. Borah, F. Bosi, R. Boudagga, et al. (186 additional authors not shown)

    Abstract: We describe the planned near-term and potential longer-term upgrades of the Belle II detector at the SuperKEKB electron-positron collider operating at the KEK laboratory in Tsukuba, Japan. These upgrades will allow increasingly sensitive searches for possible new physics beyond the Standard Model in flavor, tau, electroweak and dark sector physics that are both complementary to and competitive wit…

    Submitted 4 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: Editor: F. Forti; 170 pages

    Report number: KEK-REPORT-2024-1, BELLE2-REPORT-2024-042

  3. arXiv:2406.11831  [pdf, other]

    cs.CV

    Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

    Authors: Bingqi Ma, Zhuofan Zong, Guanglu Song, Hongsheng Li, Yu Liu

    Abstract: Large language models (LLMs) based on decoder-only transformers have demonstrated superior text understanding capabilities compared to CLIP and T5-series models. However, the paradigm for utilizing current advanced LLMs in text-to-image diffusion models remains to be explored. We observed an unusual phenomenon: directly using a large language model as the prompt encoder significantly degrades the…

    Submitted 21 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2406.03520  [pdf, other]

    cs.CV cs.AI cs.LG

    VideoPhy: Evaluating Physical Commonsense for Video Generation

    Authors: Hritik Bansal, Zongyu Lin, Tianyi Xie, Zeshun Zong, Michal Yarom, Yonatan Bitton, Chenfanfu Jiang, Yizhou Sun, Kai-Wei Chang, Aditya Grover

    Abstract: Recent advances in internet-scale video data pretraining have led to the development of text-to-video generative models that can create high-quality videos across a broad range of visual concepts and styles. Due to their ability to synthesize realistic motions and render complex objects, these generative models have the potential to become general-purpose simulators of the physical world. However,…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 36 pages, 26 figures, 8 tables

  5. arXiv:2405.18515  [pdf, other]

    cs.LG

    Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication

    Authors: Yunuo Chen, Tianyi Xie, Zeshun Zong, Xuan Li, Feng Gao, Yin Yang, Ying Nian Wu, Chenfanfu Jiang

    Abstract: Existing diffusion-based text-to-3D generation methods primarily focus on producing visually realistic shapes and appearances, often neglecting the physical constraints necessary for downstream tasks. Generated models frequently fail to maintain balance when placed in physics-based simulations or 3D printed. This balance is crucial for satisfying user design intentions in interactive gaming, embod…

    Submitted 28 May, 2024; originally announced May 2024.

  6. arXiv:2404.13046  [pdf, other]

    cs.CV

    MoVA: Adapting Mixture of Vision Experts to Multimodal Context

    Authors: Zhuofan Zong, Bingqi Ma, Dazhong Shen, Guanglu Song, Hao Shao, Dongzhi Jiang, Hongsheng Li, Yu Liu

    Abstract: As the key component in multimodal large language models (MLLMs), the ability of the visual encoder greatly affects MLLM's understanding of diverse image content. Although some large-scale pretrained vision encoders such as vision encoders in CLIP and DINOv2 have brought promising performance, we found that there is still no single vision encoder that can dominate various image content understandi…

    Submitted 19 April, 2024; originally announced April 2024.

  7. arXiv:2404.03653  [pdf, other]

    cs.CV cs.AI cs.CL

    CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

    Authors: Dongzhi Jiang, Guanglu Song, Xiaoshi Wu, Renrui Zhang, Dazhong Shen, Zhuofan Zong, Yu Liu, Hongsheng Li

    Abstract: Diffusion models have demonstrated great success in the field of text-to-image generation. However, alleviating the misalignment between the text prompts and images is still challenging. The root reason behind the misalignment has not been extensively investigated. We observe that the misalignment is caused by inadequate token attention activation. We further attribute this phenomenon to the diffu…

    Submitted 3 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Project Page: https://caraj7.github.io/comat

  8. arXiv:2403.16999  [pdf, other]

    cs.CV

    Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

    Authors: Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu, Hongsheng Li

    Abstract: Multi-Modal Large Language Models (MLLMs) have demonstrated impressive performance in various VQA tasks. However, they often lack interpretability and struggle with complex visual inputs, especially when the resolution of the input image is high or when the region of interest that could provide key information for answering the question is small. To address these challenges, we collect and introduc…

    Submitted 7 July, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Code: https://github.com/deepcs233/Visual-CoT

  9. arXiv:2403.13783  [pdf, other]

    cs.RO

    A Convex Formulation of Frictional Contact for the Material Point Method and Rigid Bodies

    Authors: Zeshun Zong, Chenfanfu Jiang, Xuchen Han

    Abstract: In this paper, we introduce a novel convex formulation that seamlessly integrates the Material Point Method (MPM) with articulated rigid body dynamics in frictional contact scenarios. We extend the linear corotational hyperelastic model into the realm of elastoplasticity and include an efficient return mapping algorithm. This approach is particularly effective for MPM simulations involving signifi…

    Submitted 22 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: The supplemental video is available at https://youtu.be/5jrQtF5D0DA

  10. arXiv:2401.15318  [pdf, other]

    cs.GR cs.AI cs.CV cs.LG

    Gaussian Splashing: Unified Particles for Versatile Motion Synthesis and Rendering

    Authors: Yutao Feng, Xiang Feng, Yintong Shang, Ying Jiang, Chang Yu, Zeshun Zong, Tianjia Shao, Hongzhi Wu, Kun Zhou, Chenfanfu Jiang, Yin Yang

    Abstract: We demonstrate the feasibility of integrating physics-based animations of solids and fluids with 3D Gaussian Splatting (3DGS) to create novel effects in virtual scenes reconstructed using 3DGS. Leveraging the coherence of the Gaussian Splatting and Position-Based Dynamics (PBD) in the underlying representation, we manage rendering, view synthesis, and the dynamics of solids and fluids in a cohesiv…

    Submitted 23 July, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  11. arXiv:2312.06160  [pdf, ps, other]

    math.AG math.SG

    Open WDVV Equations and Frobenius Structures for Toric Calabi-Yau 3-Folds

    Authors: Song Yu, Zhengyu Zong

    Abstract: Let $X$ be a toric Calabi-Yau 3-fold and let $L\subset X$ be an Aganagic-Vafa outer brane. We prove two versions of open WDVV equations for the open Gromov-Witten theory of $(X,L)$. The first version of the open WDVV equation leads to the construction of a semi-simple (formal) Frobenius manifold and the second version leads to the construction of a flat (formal) $F$-manifold.

    Submitted 2 June, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: 26 pages

    MSC Class: 14N35; 53D45

  12. arXiv:2311.12198  [pdf, other]

    cs.GR cs.AI cs.CV cs.LG

    PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics

    Authors: Tianyi Xie, Zeshun Zong, Yuxing Qiu, Xuan Li, Yutao Feng, Yin Yang, Chenfanfu Jiang

    Abstract: We introduce PhysGaussian, a new method that seamlessly integrates physically grounded Newtonian dynamics within 3D Gaussians to achieve high-quality novel motion synthesis. Employing a custom Material Point Method (MPM), our approach enriches 3D Gaussian kernels with physically meaningful kinematic deformation and mechanical stress attributes, all evolved in line with continuum mechanics principl…

    Submitted 15 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR 2024

  13. arXiv:2310.17790  [pdf, other]

    cs.GR cs.CE cs.LG math.NA

    Neural Stress Fields for Reduced-order Elastoplasticity and Fracture

    Authors: Zeshun Zong, Xuan Li, Minchen Li, Maurizio M. Chiaramonte, Wojciech Matusik, Eitan Grinspun, Kevin Carlberg, Chenfanfu Jiang, Peter Yichen Chen

    Abstract: We propose a hybrid neural network and physics framework for reduced-order modeling of elastoplasticity and fracture. State-of-the-art scientific computing models like the Material Point Method (MPM) faithfully simulate large-deformation elastoplasticity and fracture mechanics. However, their long runtime and large memory consumption render them unsuitable for applications constrained by computati…

    Submitted 26 October, 2023; originally announced October 2023.

  14. arXiv:2310.02638  [pdf, other]

    cs.CV

    P2CADNet: An End-to-End Reconstruction Network for Parametric 3D CAD Model from Point Clouds

    Authors: Zhihao Zong, Fazhi He, Rubin Fan, Yuxin Liu

    Abstract: Computer Aided Design (CAD), especially feature-based parametric CAD, plays an important role in modern industry and society. However, the reconstruction of a featured CAD model is more challenging than the reconstruction of other CAD models. To this end, this paper proposes an end-to-end network to reconstruct a featured CAD model from a point cloud (P2CADNet). Initially, the proposed P2CADNet arch…

    Submitted 4 October, 2023; originally announced October 2023.

  15. arXiv:2309.01473  [pdf, ps, other]

    math.AG math-ph

    Twisted Equivariant Gromov-Witten Theory of the Classifying Space of a Finite Group

    Authors: Zhuoming Lan, Zhengyu Zong

    Abstract: For any finite group $G$, the equivariant Gromov-Witten invariants of $[\mathbb{C}^r/G]$ can be viewed as certain twisted Gromov-Witten invariants of the classifying stack $\mathcal{B} G$. In this paper, we use Tseng's orbifold quantum Riemann-Roch theorem to express the equivariant Gromov-Witten invariants of $[\mathbb{C}^r/G]$ as a sum over Feynman graphs, where the weight of each graph is exp…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: This paper is a non-abelian generalization of arXiv:1310.4812

  16. arXiv:2308.02164  [pdf]

    cond-mat.mtrl-sci

    Using Targeted Phonon Excitation to Modulate Thermal Conductivity of Boron Nitride

    Authors: Dongkai Pan, Xiao Wan, Zhicheng Zong, Yangjun Qin, Nuo Yang

    Abstract: Modulation of thermal conductivity has become a hotspot in the field of heat conduction. A novel strategy based on targeted phonon excitation has been recently proposed for efficient and reversible modulation of thermal conductivity. In this article, the effectiveness of that strategy is further evaluated on hexagonal boron nitride through ab initio methods. Results indicate that thermal conductiv…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: 12 pages, 3 figures

  17. arXiv:2306.05326  [pdf, ps, other]

    math.AG math-ph math.GT

    Torus knots in Lens spaces, open Gromov-Witten invariants, and topological recursion

    Authors: Jinghao Yu, Zhengyu Zong

    Abstract: Starting from a torus knot $\mathcal{K}$ in the lens space $L(p,-1)$, we construct a Lagrangian sub-manifold $L_{\mathcal{K}}$ in $\mathcal{X}=\big(\mathcal{O}_{\mathbb{P}^1}(-1)\oplus \mathcal{O}_{\mathbb{P}^1}(-1)\big)/\mathbb{Z}_p$ under the conifold transition. We prove a mirror theorem which relates the all genus open-closed Gromov-Witten invariants of $(\mathcal{X},L_{\mathcal{K}})$ to the t…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 43 pages, 6 figures

  18. arXiv:2305.19877  [pdf]

    cond-mat.mtrl-sci

    Enhancing interfacial thermal conductance of Si/PVDF by strengthening atomic couplings

    Authors: Zhicheng Zong, Shichen Deng, Yangjun Qin, Xiao Wan, Jiahong Zhan, Dengke Ma, Nuo Yang

    Abstract: Thermal transport across inorganic/organic interfaces attracts interest from both academia and industry due to its wide applications in flexible electronics, etc. Here, the interfacial thermal conductance of inorganic/organic interfaces consisting of silicon and polyvinylidene fluoride is systematically investigated by molecular dynamics simulations. Interestingly, it is demonstrated that a mo…

    Submitted 10 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

  19. arXiv:2305.18295  [pdf, other]

    cs.CV

    RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

    Authors: Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu, Ping Luo

    Abstract: Text-to-image generation has recently witnessed remarkable achievements. We introduce a text-conditional image diffusion model, termed RAPHAEL, to generate highly artistic images, which accurately portray the text prompts, encompassing multiple nouns, adjectives, and verbs. This is achieved by stacking tens of mixture-of-experts (MoEs) layers, i.e., space-MoE and time-MoE layers, enabling billions…

    Submitted 9 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  20. arXiv:2305.16143  [pdf, other]

    cs.LG

    Condensed Prototype Replay for Class Incremental Learning

    Authors: Jiangtao Kong, Zhenyu Zong, Tianyi Zhou, Huajie Shao

    Abstract: Incremental learning (IL) suffers from catastrophic forgetting of old tasks when learning new tasks. This can be addressed by replaying previous tasks' data stored in a memory, which however is usually prone to size limits and privacy leakage. Recent studies store only class centroids as prototypes and augment them with Gaussian noises to create synthetic data for replay. However, they cannot effe…

    Submitted 25 May, 2023; originally announced May 2023.

  21. arXiv:2304.00967  [pdf, other]

    cs.CV

    Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction

    Authors: Zhuofan Zong, Dongzhi Jiang, Guanglu Song, Zeyue Xue, Jingyong Su, Hongsheng Li, Yu Liu

    Abstract: In this paper, we propose a new paradigm, named Historical Object Prediction (HoP) for multi-view 3D detection to leverage temporal information more effectively. The HoP approach is straightforward: given the current timestamp t, we generate a pseudo Bird's-Eye View (BEV) feature of timestamp t-k from its adjacent frames and utilize this feature to predict the object set at timestamp t-k. Our appr…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: Tech report. Codes will be available at https://github.com/Sense-X/HoP

  22. Determination of Molecular Energies via Quantum Imaginary Time Evolution in a Superconducting Qubit System

    Authors: Zhiwen Zong, Sainan Huai, Tianqi Cai, Wenyan Jin, Ze Zhan, Zhenxing Zhang, Kunliang Bu, Liyang Sui, Ying Fei, Yicong Zheng, Shengyu Zhang, Jianlan Wu, Yi Yin

    Abstract: As a valid tool for solving ground state problems, imaginary time evolution (ITE) is widely used in physical and chemical simulations. Different ITE-based algorithms in their quantum counterpart have recently been proposed and applied to some real systems. We experimentally realize the variational-based quantum imaginary time evolution (QITE) algorithm to simulate the ground state energy of hydrog…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 11 pages, 5 figures

  23. arXiv:2211.12860  [pdf, other]

    cs.CV

    DETRs with Collaborative Hybrid Assignments Training

    Authors: Zhuofan Zong, Guanglu Song, Yu Liu

    Abstract: In this paper, we provide the observation that too few queries assigned as positive samples in DETR with one-to-one set matching leads to sparse supervision on the encoder's output, which considerably hurts the discriminative feature learning of the encoder, and vice versa for attention learning in the decoder. To alleviate this, we present a novel collaborative hybrid assignments training scheme, nam…

    Submitted 10 August, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: ICCV 2023. Codes are available at https://github.com/Sense-X/Co-DETR

  24. arXiv:2211.05910  [pdf, other]

    eess.IV cs.CV

    Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, et al. (71 additional authors not shown)

    Abstract: Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.07825, arXiv:2105.08826, arXiv:2211.04470, arXiv:2211.03885, arXiv:2211.05256

  25. arXiv:2211.00683  [pdf, other]

    cs.LG cs.AI

    Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation

    Authors: Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt

    Abstract: Methods for improving the efficiency of deep network training (i.e. the resources required to achieve a given level of model quality) are of immediate benefit to deep learning practitioners. Distillation is typically used to compress models or improve model quality, but it's unclear if distillation actually improves training efficiency. Can the quality improvements of distillation be converted int…

    Submitted 1 November, 2022; originally announced November 2022.

  26. arXiv:2210.11153  [pdf, other]

    eess.IV cs.CV

    Reversed Image Signal Processing and RAW Reconstruction. AIM 2022 Challenge Report

    Authors: Marcos V. Conde, Radu Timofte, Yibin Huang, Jingyang Peng, Chang Chen, Cheng Li, Eduardo Pérez-Pellitero, Fenglong Song, Furui Bai, Shuai Liu, Chaoyu Feng, Xiaotao Wang, Lei Lei, Yu Zhu, Chenghua Li, Yingying Jiang, Yong A, Peisong Wang, Cong Leng, Jian Cheng, Xiaoyu Liu, Zhicun Yin, Zhilu Zhang, Junyi Li, Ming Liu, et al. (18 additional authors not shown)

    Abstract: Cameras capture sensor RAW images and transform them into pleasant RGB images, suitable for the human eyes, using their integrated Image Signal Processor (ISP). Numerous low-level vision tasks operate in the RAW domain (e.g. image denoising, white balance) due to its linear relationship with the scene irradiance, wide range of information at 12 bits, and sensor designs. Despite this, RAW image data…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: ECCV 2022 Advances in Image Manipulation (AIM) workshop

  27. arXiv:2210.11078  [pdf, other]

    cs.CV

    Large-batch Optimization for Dense Visual Predictions

    Authors: Zeyue Xue, Jianming Liang, Guanglu Song, Zhuofan Zong, Liang Chen, Yu Liu, Ping Luo

    Abstract: Training a large-scale deep neural network in a large-scale dataset is challenging and time-consuming. The recent breakthrough of large-batch optimization is a promising way to tackle this challenge. However, although the current advanced algorithms such as LARS and LAMB succeed in classification models, the complicated pipelines of dense visual predictions such as object detection and segmentatio…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: 23 pages, 6 figures

    Journal ref: NeurIPS 2022

  28. arXiv:2209.09694  [pdf]

    cond-mat.mtrl-sci

    Modulating Thermal Conductivity via Targeted Phonon Excitation

    Authors: Xiao Wan, Dongkai Pan, Jing-Tao Lü, Sebastian Volz, Lifa Zhang, Qing Hao, Yangjun Qin, Zhicheng Zong, Nuo Yang

    Abstract: Thermal conductivity is a critical material property in numerous applications, such as those related to thermoelectric devices and heat dissipation. Effectively modulating thermal conductivity has become a great concern in the field of heat conduction. In this study, a quantum strategy is proposed to modulate thermal conductivity by exciting targeted phonons. The results show that the thermal cond…

    Submitted 5 April, 2023; v1 submitted 20 September, 2022; originally announced September 2022.

  29. arXiv:2208.04844  [pdf, other]

    math.OC

    Topology Optimization with Frictional Self-Contact

    Authors: Zeshun Zong, Xuan Li, Jianping Ye, Sian Wen, Yin Yang, Danny M. Kaufman, Minchen Li, Chenfanfu Jiang

    Abstract: Contact-aware topology optimization faces challenges in robustness, accuracy, and applicability to internal structural surfaces under self-contact. This work builds on the recently proposed barrier-based Incremental Potential Contact (IPC) model and presents a new self-contact-aware topology optimization framework. A combination of SIMP, adjoint sensitivity analysis, and the IPC frictional-contact…

    Submitted 24 August, 2022; v1 submitted 6 August, 2022; originally announced August 2022.

  30. arXiv:2208.03620  [pdf, other]

    cs.CV

    Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

    Authors: Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan Yan

    Abstract: Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature. This paper proposes the first perceptually natural-synthetic omnidirectional benchmark dataset with a 360-degree field of view, FLOW360, with 40 different videos and 4,000 video frames…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: Accepted to ECCV22

  31. arXiv:2207.06540  [pdf, other]

    cs.LG cs.CV

    Lipschitz Continuity Retained Binary Neural Network

    Authors: Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

    Abstract: Relying on the premise that the performance of a binary neural network can be largely restored with eliminated quantization error between full-precision weight vectors and their corresponding binary vectors, existing works of network binarization frequently adopt the idea of model robustness to reach the aforementioned objective. However, robustness remains to be an ill-defined concept without sol…

    Submitted 16 July, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: Paper accepted to ECCV 2022

  32. arXiv:2207.05785  [pdf, other]

    cs.CV

    Domain Gap Estimation for Source Free Unsupervised Domain Adaptation with Many Classifiers

    Authors: Ziyang Zong, Jun He, Lei Zhang, Hai Huan

    Abstract: In theory, the success of unsupervised domain adaptation (UDA) largely relies on domain gap estimation. However, for source free UDA, the source domain data cannot be accessed during adaptation, which poses a great challenge for measuring the domain gap. In this paper, we propose to use many classifiers to learn the source domain decision boundaries, which provides a tighter upper bound of the domai…

    Submitted 2 October, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: 31 pages

  33. arXiv:2207.02970  [pdf, other]

    cs.CV cs.LG

    Network Binarization via Contrastive Learning

    Authors: Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

    Abstract: Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit. However, there is still a huge performance gap between Binary Neural Networks (BNNs) and their full-precision (FP) counterparts. As the quantization error caused by weights binarization has been reduced in earlier works, the activations binarization becomes the major obstacle for further imp…

    Submitted 16 July, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022

  34. arXiv:2205.05675  [pdf, other]

    cs.CV eess.IV

    NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Yawei Li, Kai Zhang, Radu Timofte, Luc Van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, et al. (86 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of e…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Validation code of the baseline model is available at https://github.com/ofsoundof/IMDN. Validation of all submitted models is available at https://github.com/ofsoundof/NTIRE2022_ESR

  35. arXiv:2201.12712  [pdf, other]

    cs.CV cs.AI

    Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning

    Authors: Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

    Abstract: With the remarkable success of deep learning recently, efficient network compression algorithms are urgently demanded for releasing the potential computational power of edge devices, such as smartphones or tablets. However, optimal network pruning is a non-trivial task which mathematically is an NP-hard problem. Previous researchers explain training a pruned network as buying a lottery ticket. In…

    Submitted 29 January, 2022; originally announced January 2022.

    Comments: accepted to ICASSP 2022

  36. arXiv:2111.12624  [pdf, other]

    cs.CV

    Self-slimmed Vision Transformer

    Authors: Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu

    Abstract: Vision transformers (ViTs) have become the popular structures and outperformed convolutional neural networks (CNNs) on various vision tasks. However, such powerful transformers bring a huge computation burden, because of the exhausting token-to-token comparison. The previous works focus on dropping insignificant tokens to reduce the computational cost of ViTs. But when the dropping ratio increases…

    Submitted 12 September, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: Accepted by ECCV 2022. Code is available at https://github.com/Sense-X/SiT

  37. RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection

    Authors: Zhuofan Zong, Qianggang Cao, Biao Leng

    Abstract: Feature pyramid networks (FPN) are widely exploited for multi-scale feature fusion in existing advanced object detection frameworks. Numerous previous works have developed various structures for bidirectional feature fusion, all of which are shown to improve the detection performance effectively. We observe that these complicated network structures require feature pyramids to be stacked in a fixed…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: Accepted by ACM MM2021

  38. arXiv:2110.04397  [pdf, other]

    cs.LG cs.AI cs.CY

    Measure Twice, Cut Once: Quantifying Bias and Fairness in Deep Neural Networks

    Authors: Cody Blakeney, Gentry Atkinson, Nathaniel Huish, Yan Yan, Vangelis Metris, Ziliang Zong

    Abstract: Algorithmic bias is of increasing concern, both to the research community, and society at large. Bias in AI is more abstract and unintuitive than traditional forms of discrimination and can be more difficult to detect and mitigate. A clear gap exists in the current literature on evaluating the relative bias in the performance of multi-class classifiers. In this work, we propose two simple yet effe…

    Submitted 8 October, 2021; originally announced October 2021.

  39. arXiv:2110.00941  [pdf, ps, other]

    quant-ph

    Experimental Determination of Multi-Qubit Ground State via a Cluster Mean-Field Algorithm

    Authors: Ze Zhan, Chongxin Run, Zhiwen Zong, Liang Xiang, Ying Fei, Wenyan Jin, Zhilong Jia, Peng Duan, Jianlan Wu, Yi Yin, Guoping Guo

    Abstract: A quantum eigensolver is designed under a multi-layer cluster mean-field (CMF) algorithm by partitioning a quantum system into spatially-separated clusters. For each cluster, a reduced Hamiltonian is obtained after a partial average over its environment cluster. The products of eigenstates from different clusters construct a compressed Hilbert space, in which an effective Hamiltonian is diagonaliz…

    Submitted 3 October, 2021; originally announced October 2021.

  40. arXiv:2108.12905  [pdf, other]

    cs.LG cs.AI cs.CV

    Lipschitz Continuity Guided Knowledge Distillation

    Authors: Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

    Abstract: Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones. Although great success has been achieved by prior distillation methods via delicately designing various types of knowledge, they overlook the functional properties of neural networks, which makes the process of applying those techniq…

    Submitted 29 August, 2021; originally announced August 2021.

    Comments: This work has been accepted by ICCV 2021

  41. arXiv:2108.04462  [pdf, other]

    cs.LG cs.AI

    Deep Reinforcement Learning for Demand Driven Services in Logistics and Transportation Systems: A Survey

    Authors: Zefang Zong, Tao Feng, Tong Xia, Depeng Jin, Yong Li

    Abstract: Recent technology development brings the booming of numerous new Demand-Driven Services (DDS) into urban lives, including ridesharing, on-demand delivery, express systems and warehousing. In DDS, a service loop is an elemental structure, including its service worker, the service providers and corresponding service targets. The service workers should transport either humans or parcels from the prov…

    Submitted 23 March, 2022; v1 submitted 10 August, 2021; originally announced August 2021.

    Comments: 21 pages. survey preprint

  42. arXiv:2106.07849  [pdf, other]

    cs.LG cs.AI cs.CV

    Simon Says: Evaluating and Mitigating Bias in Pruned Neural Networks with Knowledge Distillation

    Authors: Cody Blakeney, Nathaniel Huish, Yan Yan, Ziliang Zong

    Abstract: In recent years, the ubiquitous deployment of AI has posed great concerns with regard to algorithmic bias, discrimination, and fairness. Compared to traditional forms of bias or discrimination caused by humans, algorithmic bias generated by AI is more abstract and unintuitive, and therefore more difficult to explain and mitigate. A clear gap exists in the current literature on evaluating and mitigating b…

    Submitted 14 June, 2021; originally announced June 2021.

  43. arXiv:2106.01532  [pdf, other]

    cs.CV

    Noise Doesn't Lie: Towards Universal Detection of Deep Inpainting

    Authors: Ang Li, Qiuhong Ke, Xingjun Ma, Haiqin Weng, Zhiyuan Zong, Feng Xue, Rui Zhang

    Abstract: Deep image inpainting aims to restore damaged or missing regions in an image with realistic contents. While having a wide range of applications such as object removal and image recovery, deep inpainting techniques also have the risk of being manipulated for image forgery. A promising countermeasure against such forgeries is deep inpainting detection, which aims to locate the inpainted regions in a…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted by IJCAI 2021

  44. arXiv:2105.03333  [pdf, ps, other]

    quant-ph

    Quantify the Non-Markovian Process with Intervening Projections in a Superconducting Processor

    Authors: Liang Xiang, Zhiwen Zong, Ze Zhan, Ying Fei, Chongxin Run, Yaozu Wu, Wenyan Jin, Zhilong Jia, Peng Duan, Jianlan Wu, Yi Yin, Guoping Guo

    Abstract: A Markov assumption considers a physical system memoryless to simplify its dynamics, whereas the memory effect, or non-Markovian phenomenon, is more general in nature. In the quantum regime, it is challenging to define or quantify the non-Markovianity because the measurement of a quantum system often interferes with it. We simulate the open quantum dynamics in a superconducting processor, then chara…

    Submitted 18 June, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

  45. Optimization of Controlled-Z Gate with Data-Driven Gradient Ascent Pulse Engineering in a Superconducting Qubit System

    Authors: Zhiwen Zong, Zhenhai Sun, Zhangjingzi Dong, Chongxin Run, Liang Xiang, Ze Zhan, Qianlong Wang, Ying Fei, Yaozu Wu, Wenyan Jin, Cong Xiao, Zhilong Jia, Peng Duan, Jianlan Wu, Yi Yin, Guoping Guo

    Abstract: The experimental optimization of a two-qubit controlled-Z (CZ) gate is realized following two different data-driven gradient ascent pulse engineering (GRAPE) protocols with the aim of optimizing the gate operator and the output quantum state, respectively. For both GRAPE protocols, the key computation of gradients utilizes mixed information of the input Z-control pulse and the experimental measureme…

    Submitted 24 April, 2021; originally announced April 2021.

    Journal ref: Phys. Rev. Applied 15, 064005 (2021)

  46. Experimental Determination of Electronic States via Digitized Shortcut-to-Adiabaticity and Sequential Digitized Adiabaticity

    Authors: Ze Zhan, Chongxin Run, Zhiwen Zong, Liang Xiang, Ying Fei, Zhenhai Sun, Yaozu Wu, Zhilong Jia, Peng Duan, Jianlan Wu, Yi Yin, Guoping Guo

    Abstract: A combination of the digitized shortcut-to-adiabaticity (STA) and the sequential digitized adiabaticity is implemented in a superconducting quantum device to determine electronic states in two example systems, the H2 molecule and the topological Bernevig-Hughes-Zhang (BHZ) model. For H2, a short internuclear distance is chosen as a starting point, at which the ground and excited states are obtaine…

    Submitted 18 June, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

    Journal ref: Phys. Rev. Applied 16, 034050 (2021)

  47. arXiv:2012.03096  [pdf, other]

    cs.LG

    Parallel Blockwise Knowledge Distillation for Deep Neural Network Compression

    Authors: Cody Blakeney, Xiaomin Li, Yan Yan, Ziliang Zong

    Abstract: Deep neural networks (DNNs) have been extremely successful in solving many challenging AI tasks in natural language processing, speech recognition, and computer vision. However, DNNs are typically computation intensive, memory demanding, and power hungry, which significantly limits their usage on platforms with constrained resources. Therefore, a variety of compression techniques (e.g. qu…

    Submitted 5 December, 2020; originally announced December 2020.

  48. arXiv:2011.09290  [pdf, other]

    cs.CR

    Practical Privacy Attacks on Vertical Federated Learning

    Authors: Haiqin Weng, Juntao Zhang, Xingjun Ma, Feng Xue, Tao Wei, Shouling Ji, Zhiyuan Zong

    Abstract: Federated learning (FL) is a privacy-preserving learning paradigm that allows multiple parties to jointly train a powerful machine learning model without sharing their private data. According to the form of collaboration, FL can be further divided into horizontal federated learning (HFL) and vertical federated learning (VFL). In HFL, participants share the same feature space and collaborate on da…

    Submitted 22 July, 2022; v1 submitted 18 November, 2020; originally announced November 2020.

  49. arXiv:2010.08055  [pdf, other]

    cs.CV

    Egok360: A 360 Egocentric Kinetic Human Activity Video Dataset

    Authors: Keshav Bhandari, Mario A. DeLaGarza, Ziliang Zong, Hugo Latapie, Yan Yan

    Abstract: Recently, there has been a growing interest in wearable sensors which provides new research perspectives for 360° video analysis. However, the lack of 360° datasets in literature hinders the research in this field. To bridge this gap, in this paper we propose a novel Egocentric (first-person) 360° Kinetic human activity video dataset (EgoK360). The EgoK360 dataset contains annotations of human a…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 5 pages, 5 figures, 1 table, 2020 IEEE International Conference on Image Processing (ICIP)

  50. arXiv:2010.08045  [pdf, other]

    cs.CV

    Revisiting Optical Flow Estimation in 360 Videos

    Authors: Keshav Bhandari, Ziliang Zong, Yan Yan

    Abstract: Nowadays 360 video analysis has become a significant research topic in the field since the appearance of high-quality and low-cost 360 wearable devices. In this paper, we propose a novel LiteFlowNet360 architecture for 360 videos optical flow estimation. We design LiteFlowNet360 as a domain adaptation framework from perspective video domain to 360 video domain. We adapt it from simple kernel trans…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 8 Pages, 7 figures, 1 Table, 5 Equations, 25th International Conference on Pattern Recognition Milan, Italy