
Showing 1–18 of 18 results for author: Mi, Z

Searching in archive cs.
  1. arXiv:2408.09248  [pdf, other]

    cs.CV

    MagicID: Flexible ID Fidelity Generation System

    Authors: Zhaoli Deng, Wen Liu, Fanyi Wang, Junkang Zhang, Fan Chen, Meng Zhang, Wendong Zhang, Zhenpeng Mi

    Abstract: Portrait Fidelity Generation is a prominent research area in generative models, with a primary focus on enhancing both controllability and fidelity. Current methods face challenges in generating high-fidelity portrait results when faces occupy a small portion of the image with a low resolution, especially in multi-person group photo settings. To tackle these issues, we propose a systematic solutio…

    Submitted 20 August, 2024; v1 submitted 17 August, 2024; originally announced August 2024.

  2. arXiv:2408.09240  [pdf, other]

    cs.CV

    RepControlNet: ControlNet Reparameterization

    Authors: Zhaoli Deng, Kaibin Zhou, Fanyi Wang, Zhenpeng Mi

    Abstract: With the wide application of diffusion models, the high cost of inference resources has become an important bottleneck for their universal application. Controllable generation, such as ControlNet, is one of the key research directions for diffusion models, and the research related to inference acceleration and model compression is more important. In order to solve this problem, this paper proposes a mo…

    Submitted 17 August, 2024; originally announced August 2024.

  3. arXiv:2406.06282  [pdf, other]

    cs.LG

    PowerInfer-2: Fast Large Language Model Inference on a Smartphone

    Authors: Zhenliang Xue, Yixin Song, Zeyu Mi, Le Chen, Yubin Xia, Haibo Chen

    Abstract: This paper introduces PowerInfer-2, a framework designed for high-speed inference of Large Language Models (LLMs) on smartphones, particularly effective for models whose sizes exceed the device's memory capacity. The key insight of PowerInfer-2 is to utilize the heterogeneous computation, memory, and I/O resources in smartphones by decomposing traditional matrix computations into fine-grained neur…

    Submitted 12 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 14 pages, 11 figures

  4. arXiv:2406.05955  [pdf, other]

    cs.LG cs.CL

    Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters

    Authors: Yixin Song, Haotong Xie, Zhengyan Zhang, Bo Wen, Li Ma, Zeyu Mi, Haibo Chen

    Abstract: Exploiting activation sparsity is a promising approach to significantly accelerating the inference process of large language models (LLMs) without compromising performance. However, activation sparsity is determined by activation functions, and commonly used ones like SwiGLU and GeGLU exhibit limited sparsity. Simply replacing these functions with ReLU fails to achieve sufficient sparsity. Moreove…

    Submitted 10 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  5. arXiv:2404.10445  [pdf, other]

    cs.LG cs.AI

    SparseDM: Toward Sparse Efficient Diffusion Models

    Authors: Kafeng Wang, Jianfei Chen, He Li, Zhenpeng Mi, Jun Zhu

    Abstract: Diffusion models have been extensively used in data generation tasks and are recognized as one of the best generative models. However, their time-consuming deployment, long inference time, and requirements on large memory limit their application on mobile devices. In this paper, we propose a method based on the improved Straight-Through Estimator to improve the deployment efficiency of diffusion m…

    Submitted 30 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  6. arXiv:2402.03804  [pdf, other]

    cs.LG cs.AI

    ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs

    Authors: Zhengyan Zhang, Yixin Song, Guanghui Yu, Xu Han, Yankai Lin, Chaojun Xiao, Chenyang Song, Zhiyuan Liu, Zeyu Mi, Maosong Sun

    Abstract: Sparse computation offers a compelling solution for the inference of Large Language Models (LLMs) in low-resource scenarios by dynamically skipping the computation of inactive neurons. While traditional approaches focus on ReLU-based LLMs, leveraging zeros in activation values, we broaden the scope of sparse LLMs beyond zero activation values. We introduce a general method that defines neuron acti…

    Submitted 6 February, 2024; originally announced February 2024.

  7. arXiv:2312.12456  [pdf, ps, other]

    cs.LG cs.OS

    PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU

    Authors: Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen

    Abstract: This paper introduces PowerInfer, a high-speed Large Language Model (LLM) inference engine on a personal computer (PC) equipped with a single consumer-grade GPU. The key underlying the design of PowerInfer is exploiting the high locality inherent in LLM inference, characterized by a power-law distribution in neuron activation. This distribution indicates that a small subset of neurons, termed hot…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 15 pages, 18 figures

  8. arXiv:2303.11565  [pdf, other]

    physics.optics cond-mat.stat-mech cs.ET physics.app-ph

    Wavelength-division multiplexing optical Ising simulator enabling fully programmable spin couplings and external magnetic fields

    Authors: Li Luo, Zhiyi Mi, Junyi Huang, Zhichao Ruan

    Abstract: Recently, spatial photonic Ising machines (SPIMs) have demonstrated the abilities to compute the Ising Hamiltonian of large-scale spin systems, with the advantages of ultrafast speed and high power efficiency. However, such optical computations have been limited to specific Ising models with fully connected couplings. Here we develop a wavelength-division multiplexing SPIM to enable programmable s…

    Submitted 24 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: 6 pages, 4 figures

    Journal ref: Science Advances 9, eadg623 (2023)

  9. arXiv:2301.02484  [pdf, other]

    cs.CV

    Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering

    Authors: Huibing Wang, Mingze Yao, Guangqi Jiang, Zetian Mi, Xianping Fu

    Abstract: Unsupervised hashing methods have attracted widespread attention with the explosive growth of large-scale data, which can greatly reduce storage and computation by learning compact binary codes. Existing unsupervised hashing methods attempt to exploit the valuable information from samples, which fails to take the local geometric structure of unlabeled samples into consideration. Moreover, hashing…

    Submitted 6 January, 2023; originally announced January 2023.

  10. arXiv:2201.09652  [pdf, ps, other]

    cs.OS cs.AR cs.CR

    DuVisor: a User-level Hypervisor Through Delegated Virtualization

    Authors: Jiahao Chen, Dingji Li, Zeyu Mi, Yuxuan Liu, Binyu Zang, Haibing Guan, Haibo Chen

    Abstract: Today's mainstream virtualization systems comprise two cooperative components: a kernel-resident driver that accesses virtualization hardware and a user-level helper process that provides VM management and I/O virtualization. However, this virtualization architecture has intrinsic issues in both security (a large attack surface) and performance. While there is a long thread of work trying to mi…

    Submitted 24 January, 2022; originally announced January 2022.

    Comments: 17 pages, 9 figures

  11. arXiv:2112.02338  [pdf, other]

    cs.CV

    Generalized Binary Search Network for Highly-Efficient Multi-View Stereo

    Authors: Zhenxing Mi, Di Chang, Dan Xu

    Abstract: Multi-view Stereo (MVS) with known camera parameters is essentially a 1D search problem within a valid depth range. Recent deep learning-based MVS methods typically densely sample depth hypotheses in the depth range, and then construct prohibitively memory-consuming 3D cost volumes for depth prediction. Although coarse-to-fine sampling strategies alleviate this overhead issue to a certain extent,…

    Submitted 4 December, 2021; originally announced December 2021.

    Comments: 16 pages

  12. Stereo CenterNet based 3D Object Detection for Autonomous Driving

    Authors: Yuguang Shi, Yu Guo, Zhenqiang Mi, Xinjie Li

    Abstract: Recently, three-dimensional (3D) detection based on stereo images has progressed remarkably; however, most advanced methods adopt anchor-based two-dimensional (2D) detection or depth estimation to address this problem. Nevertheless, high computational cost inhibits these methods from achieving real-time performance. In this study, we propose a 3D object detection method, Stereo CenterNet (SC), usi…

    Submitted 23 September, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

    Journal ref: Published by Neurocomputing, Volume 471, 30 January 2022, Pages 219-229

  13. arXiv:2101.10353  [pdf, other]

    cs.CV

    DeepDT: Learning Geometry From Delaunay Triangulation for Surface Reconstruction

    Authors: Yiming Luo, Zhenxing Mi, Wenbing Tao

    Abstract: In this paper, a novel learning-based network, named DeepDT, is proposed to reconstruct the surface from Delaunay triangulation of point cloud. DeepDT learns to predict inside/outside labels of Delaunay tetrahedrons directly from a point cloud and corresponding Delaunay triangulation. The local geometry features are first extracted from the input point cloud and aggregated into a graph deriving fr…

    Submitted 1 April, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

    Comments: Accepted by AAAI 2021

  14. arXiv:1911.07401  [pdf, other]

    cs.CV

    SSRNet: Scalable 3D Surface Reconstruction Network

    Authors: Zhenxing Mi, Yiming Luo, Wenbing Tao

    Abstract: Existing learning-based surface reconstruction methods from point clouds are still facing challenges in terms of scalability and preservation of details on large-scale point clouds. In this paper, we propose the SSRNet, a novel scalable learning-based method for surface reconstruction. The proposed SSRNet constructs local geometry-aware features for octree vertices and designs a scalable reconstru…

    Submitted 13 April, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: Accepted by CVPR2020, typos corrected, references added, images revised

  15. arXiv:1907.05595  [pdf, other]

    eess.IV cs.CV

    Jointly Adversarial Network to Wavelength Compensation and Dehazing of Underwater Images

    Authors: Xueyan Ding, Yafei Wang, Yang Yan, Zheng Liang, Zetian Mi, Xianping Fu

    Abstract: Severe color casts, low contrast and blurriness of underwater images caused by light absorption and scattering result in a difficult task for exploring underwater environments. Different from most of previous underwater image enhancement methods that compute light attenuation along object-camera path through hazy image formation model, we propose a novel jointly wavelength compensation and dehazin…

    Submitted 12 July, 2019; originally announced July 2019.

  16. arXiv:1905.12973  [pdf, other]

    cs.RO

    Partial Computing Offloading Assisted Cloud Point Registration in Multi-robot SLAM

    Authors: Biwei Li, Zhenqiang Mi, Yu Guo, Yang Yang, Mohammad S. Obaidat

    Abstract: A multi-robot visual simultaneous localization and mapping (SLAM) system normally consists of multiple mobile robots equipped with cameras and/or other visual sensors. The networked robots work independently or cooperatively in an unknown scene in order to solve the autonomous localization and mapping problem. One of the most critical issues in multi-robot visual SLAM is the intensive computation tha…

    Submitted 30 May, 2019; originally announced May 2019.

  17. Removing Stripes, Scratches, and Curtaining with Non-Recoverable Compressed Sensing

    Authors: Jonathan Schwartz, Yi Jiang, Yongjie Wang, Anthony Aiello, Pallab Bhattacharya, Hui Yuan, Zetian Mi, Nabil Bassim, Robert Hovden

    Abstract: Highly-directional image artifacts such as ion mill curtaining, mechanical scratches, or image striping from beam instability degrade the interpretability of micrographs. These unwanted, aperiodic features extend the image along a primary direction and occupy a small wedge of information in Fourier space. Deleting this wedge of data replaces stripes, scratches, or curtaining, with more complex str…

    Submitted 23 January, 2019; originally announced January 2019.

    Comments: 15 pages, 5 figures

  18. arXiv:1810.03286  [pdf, other]

    cs.CV

    Guiding Intelligent Surveillance System by learning-by-synthesis gaze estimation

    Authors: Tongtong Zhao, Yuxiao Yan, Jinjia Peng, Zetian Mi, Xianping Fu

    Abstract: We describe a novel learning-by-synthesis method for estimating gaze direction of an automated intelligent surveillance system. Recently, progress in learning-by-synthesis has proposed training models on synthetic images, which can effectively reduce the cost of manpower and material resources. However, learning from synthetic images still fails to achieve the desired performance compared to natur…

    Submitted 8 October, 2018; originally announced October 2018.

    Comments: Submitted to the journal Pattern Recognition Letters