Showing 1–50 of 52 results for author: Shu, Z

Searching in archive cs. Search in all archives.
  1. arXiv:2407.03900  [pdf, other]

    cs.CV

    Oracle Bone Inscriptions Multi-modal Dataset

    Authors: Bang Li, Donghao Luo, Yujie Liang, Jing Yang, Zengmao Ding, Xu Peng, Boyuan Jiang, Shengwei Han, Dan Sui, Peichao Qin, Pian Wu, Chaoyang Wang, Yun Qi, Taisong Jin, Chengjie Wang, Xiaoming Huang, Zhan Shu, Rongrong Ji, Yongge Liu, Yunsheng Wu

    Abstract: Oracle bone inscriptions (OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can prove extremely challenging. Out of the 4,500 oracle bone characters excavated, only a third have been successfully identified. Therefore, leveraging…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2407.03568  [pdf, other]

    cs.SI cs.IR

    When LLM Meets Hypergraph: A Sociological Analysis on Personality via Online Social Networks

    Authors: Zhiyao Shu, Xiangguo Sun, Hong Cheng

    Abstract: Individual personalities significantly influence our perceptions, decisions, and social interactions, which is particularly crucial for gaining insights into human behavior patterns in online social network analysis. Many psychological studies have observed that personalities are strongly reflected in their social behaviors and social environments. In light of these problems, this paper proposes a…

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2407.02813  [pdf, other]

    cs.CV cs.AI cs.LG

    Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design

    Authors: Gen Li, Zhihao Shu, Jie Ji, Minghai Qin, Fatemeh Afghah, Wei Niu, Xiaolong Ma

    Abstract: Deep neural networks (DNNs) are frequently employed in a variety of computer vision applications. Nowadays, an emerging trend in the current video distribution system is to take advantage of DNN's overfitting properties to perform video resolution upscaling. By splitting videos into chunks and applying a super-resolution (SR) model to overfit each chunk, this scheme of SR models plus video chunks…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: ECCV2024

  4. arXiv:2406.09371  [pdf, other]

    cs.CV cs.LG

    LRM-Zero: Training Large Reconstruction Models with Synthesized Data

    Authors: Desai Xie, Sai Bi, Zhixin Shu, Kai Zhang, Zexiang Xu, Yi Zhou, Sören Pirk, Arie Kaufman, Xin Sun, Hao Tan

    Abstract: We present LRM-Zero, a Large Reconstruction Model (LRM) trained entirely on synthesized 3D data, achieving high-quality sparse-view 3D reconstruction. The core of LRM-Zero is our procedural 3D dataset, Zeroverse, which is automatically synthesized from simple primitive shapes with random texturing and augmentations (e.g., height fields, boolean differences, and wireframes). Unlike previous 3D data…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 23 pages, 8 figures. Our code and interactive visualization are available at: https://desaixie.github.io/lrm-zero/

  5. arXiv:2406.03912  [pdf, other]

    cs.AI cs.LG cs.RO eess.SY

    GenSafe: A Generalizable Safety Enhancer for Safe Reinforcement Learning Algorithms Based on Reduced Order Markov Decision Process Model

    Authors: Zhehua Zhou, Xuan Xie, Jiayang Song, Zhan Shu, Lei Ma

    Abstract: Although deep reinforcement learning has demonstrated impressive achievements in controlling various autonomous systems, e.g., autonomous vehicles or humanoid robots, its inherent reliance on random exploration raises safety concerns in their real-world applications. To improve system safety during the learning process, a variety of Safe Reinforcement Learning (SRL) algorithms have been proposed,…

    Submitted 6 June, 2024; originally announced June 2024.

  6. A Spatiotemporal Hand-Eye Calibration for Trajectory Alignment in Visual(-Inertial) Odometry Evaluation

    Authors: Zichao Shu, Lijun Li, Rui Wang, Zetao Chen

    Abstract: A common prerequisite for evaluating a visual(-inertial) odometry (VO/VIO) algorithm is to align the timestamps and the reference frame of its estimated trajectory with a reference ground-truth derived from a system of superior precision, such as a motion capture system. The trajectory-based alignment, typically modeled as a classic hand-eye calibration, significantly influences the accuracy of ev…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 8 pages, 9 figures, 2 tables

    Journal ref: IEEE Robotics and Automation Letters, vol. 9, no. 6, pp. 5134-5141, June 2024

  7. arXiv:2404.13528  [pdf, other]

    cs.LG cs.AI cs.DC

    SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile

    Authors: Wei Niu, Md Musfiqur Rahman Sanim, Zhihao Shu, Jiexiong Guan, Xipeng Shen, Miao Yin, Gagan Agrawal, Bin Ren

    Abstract: This work is motivated by recent developments in Deep Neural Networks, particularly the Transformer architectures underlying applications such as ChatGPT, and the need for performing inference on mobile devices. Focusing on emerging transformers (specifically the ones with computationally efficient Swin-like architectures) and large models (e.g., Stable Diffusion and LLMs) based on transformers, w…

    Submitted 21 April, 2024; originally announced April 2024.

  8. arXiv:2404.12142  [pdf, other]

    cs.CV cs.LG eess.IV

    SDIP: Self-Reinforcement Deep Image Prior Framework for Image Processing

    Authors: Ziyu Shu, Zhixin Pan

    Abstract: Deep image prior (DIP) proposed in recent research has revealed the inherent trait of convolutional neural networks (CNN) for capturing substantial low-level image statistics priors. This framework efficiently addresses the inverse problems in image processing and has induced extensive applications in various domains. However, as the whole algorithm is initialized randomly, the DIP algorithm often…

    Submitted 17 April, 2024; originally announced April 2024.

  9. arXiv:2404.03066  [pdf, other]

    cs.MA cs.NI math.DS

    Traffic Divergence Theory: An Analysis Formalism for Dynamic Networks

    Authors: Matin Macktoobian, Zhan Shu, Qing Zhao

    Abstract: Traffic dynamics is universally crucial in analyzing and designing almost any network. This article introduces a novel theoretical approach to analyzing network traffic dynamics. This theory's machinery is based on the notion of traffic divergence, which captures the flow (im)balance of network nodes and links. It features various analytical probes to investigate both spatial and temporal traffic…

    Submitted 3 April, 2024; originally announced April 2024.

    Journal ref: IEEE Access, 2024

  10. arXiv:2403.09632  [pdf, other]

    cs.CV

    Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image

    Authors: Yiqun Mei, Yu Zeng, He Zhang, Zhixin Shu, Xuaner Zhang, Sai Bi, Jianming Zhang, HyunJoon Jung, Vishal M. Patel

    Abstract: At the core of portrait photography is the search for ideal lighting and viewpoint. The process often requires advanced knowledge in photography and an elaborate studio setup. In this work, we propose Holo-Relighting, a volumetric relighting method that is capable of synthesizing novel viewpoints, and novel lighting from a single image. Holo-Relighting leverages the pretrained 3D GAN (EG3D) to rec…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: CVPR2024

  11. arXiv:2402.10787  [pdf, other]

    cs.LG cs.AI cs.CL

    EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge

    Authors: Xuan Shen, Zhenglun Kong, Changdi Yang, Zhaoyang Han, Lei Lu, Peiyan Dong, Cheng Lyu, Chih-hsiang Li, Xuehang Guo, Zhihao Shu, Wei Niu, Miriam Leeser, Pu Zhao, Yanzhi Wang

    Abstract: Despite the remarkable strides of Large Language Models (LLMs) in various fields, the wide applications of LLMs on edge devices are limited due to their massive parameters and computations. To address this, quantization is commonly adopted to generate lightweight LLMs with efficient computations and fast inference. However, Post-Training Quantization (PTQ) methods dramatically degrade in quality w…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Preprint

  12. arXiv:2402.03723  [pdf, other]

    cs.CV

    Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos

    Authors: Alfredo Rivero, ShahRukh Athar, Zhixin Shu, Dimitris Samaras

    Abstract: Creating controllable 3D human portraits from casual smartphone videos is highly desirable due to their immense value in AR/VR applications. The recent development of 3D Gaussian Splatting (3DGS) has shown improvements in rendering quality and training efficiency. However, it still remains a challenge to accurately model and disentangle head movements and facial expressions from a single-view capt…

    Submitted 6 February, 2024; originally announced February 2024.

  13. arXiv:2312.13980  [pdf, other]

    cs.CV cs.LG

    Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning

    Authors: Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, Sören Pirk, Arie E. Kaufman

    Abstract: Multi-view diffusion models, obtained by applying Supervised Finetuning (SFT) to text-to-image diffusion models, have driven recent breakthroughs in text-to-3D research. However, due to the limited size and quality of existing 3D datasets, they still suffer from multi-view inconsistencies and Neural Radiance Field (NeRF) reconstruction artifacts. We argue that multi-view diffusion models can benef…

    Submitted 9 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: 22 pages, 16 figures. Our code, training and testing data, and video results are available at: https://desaixie.github.io/carve-3d. This paper has been accepted to CVPR 2024. v2: incorporated changes from the CVPR 2024 camera-ready version

  14. arXiv:2312.06886  [pdf, other]

    cs.CV

    Relightful Harmonization: Lighting-aware Portrait Background Replacement

    Authors: Mengwei Ren, Wei Xiong, Jae Shin Yoon, Zhixin Shu, Jianming Zhang, HyunJoon Jung, Guido Gerig, He Zhang

    Abstract: Portrait harmonization aims to composite a subject into a new background, adjusting its lighting and color to ensure harmony with the background scene. Existing harmonization techniques often only focus on adjusting the global color and brightness of the foreground and ignore crucial illumination cues from the background such as apparent lighting direction, leading to unrealistic compositions. We…

    Submitted 7 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 camera ready

  15. arXiv:2311.09544  [pdf, other]

    cs.IR cs.AI cs.LG

    Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta

    Authors: Wei Zhang, Dai Li, Chen Liang, Fang Zhou, Zhongke Zhang, Xuewei Wang, Ru Li, Yi Zhou, Yaning Huang, Dong Liang, Kai Wang, Zhangyuan Wang, Zhengxing Chen, Fenggang Wu, Minghai Chen, Huayu Li, Yunnan Wu, Zhan Shu, Mindi Yuan, Sri Reddy

    Abstract: Effective user representations are pivotal in personalized advertising. However, stringent constraints on training throughput, serving latency, and memory, often limit the complexity and input feature set of online ads ranking models. This challenge is magnified in extensive systems like Meta's, which encompass hundreds of models with diverse specifications, rendering the tailoring of user represe…

    Submitted 22 May, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 8 pages, 3 figures

    MSC Class: 68T05; 68T30 ACM Class: I.2.1; H.3.5; H.3.3

    Journal ref: Companion Proceedings of the ACM Web Conference 2024 (WWW '24 Companion), May 13--17, 2024, Singapore, Singapore

  16. arXiv:2310.20024  [pdf, other]

    cs.RO cs.AI cs.LG cs.MA

    Topology Recoverability Prediction for Ad-Hoc Robot Networks: A Data-Driven Fault-Tolerant Approach

    Authors: Matin Macktoobian, Zhan Shu, Qing Zhao

    Abstract: Faults occurring in ad-hoc robot networks may fatally perturb their topologies leading to disconnection of subsets of those networks. Optimal topology synthesis is generally resource-intensive and time-consuming to be done in real time for large ad-hoc robot networks. One should only perform topology re-computations if the probability of topology recoverability after the occurrence of any fault su…

    Submitted 30 October, 2023; originally announced October 2023.

    Journal ref: IEEE Transactions on Signal and Information Processing over Networks, 2023

  17. arXiv:2309.11009  [pdf, other]

    cs.CV

    Controllable Dynamic Appearance for Neural 3D Portraits

    Authors: ShahRukh Athar, Zhixin Shu, Zexiang Xu, Fujun Luan, Sai Bi, Kalyan Sunkavalli, Dimitris Samaras

    Abstract: Recent advances in Neural Radiance Fields (NeRFs) have made it possible to reconstruct and reanimate dynamic portrait scenes with control over head-pose, facial expressions and viewing direction. However, training such models assumes photometric consistency over the deformed region e.g. the face must be evenly lit as it deforms with changing head-pose and facial expression. Such photometric consis…

    Submitted 21 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

  18. arXiv:2309.06687  [pdf, other]

    cs.RO cs.AI

    Self-Refined Large Language Model as Automated Reward Function Designer for Deep Reinforcement Learning in Robotics

    Authors: Jiayang Song, Zhehua Zhou, Jiawei Liu, Chunrong Fang, Zhan Shu, Lei Ma

    Abstract: Although Deep Reinforcement Learning (DRL) has achieved notable success in numerous robotic applications, designing a high-performing reward function remains a challenging task that often requires substantial manual input. Recently, Large Language Models (LLMs) have been extensively adopted to address tasks demanding in-depth common-sense knowledge, such as reasoning and planning. Recognizing that…

    Submitted 2 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

  19. arXiv:2308.13724  [pdf, other]

    cs.RO cs.AI

    ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning

    Authors: Zhehua Zhou, Jiayang Song, Kunpeng Yao, Zhan Shu, Lei Ma

    Abstract: Motivated by the substantial achievements observed in Large Language Models (LLMs) in the field of natural language processing, recent research has commenced investigations into the application of LLMs for complex, long-horizon sequential task planning challenges in robotics. LLMs are advantageous in offering the potential to enhance the generalizability as task-agnostic planners and facilitate fl…

    Submitted 25 August, 2023; originally announced August 2023.

  20. Towards Building AI-CPS with NVIDIA Isaac Sim: An Industrial Benchmark and Case Study for Robotics Manipulation

    Authors: Zhehua Zhou, Jiayang Song, Xuan Xie, Zhan Shu, Lei Ma, Dikai Liu, Jianxiong Yin, Simon See

    Abstract: As a representative cyber-physical system (CPS), robotic manipulator has been widely adopted in various academic research and industrial processes, indicating its potential to act as a universal interface between the cyber and the physical worlds. Recent studies in robotics manipulation have started employing artificial intelligence (AI) approaches as controllers to achieve better adaptability and…

    Submitted 31 July, 2023; originally announced August 2023.

  21. arXiv:2307.01425  [pdf, other]

    cs.CV

    Consistent Multimodal Generation via A Unified GAN Framework

    Authors: Zhen Zhu, Yijun Li, Weijie Lyu, Krishna Kumar Singh, Zhixin Shu, Soeren Pirk, Derek Hoiem

    Abstract: We investigate how to generate multimodal image outputs, such as RGB, depth, and surface normals, with a single generative model. The challenge is to produce outputs that are realistic, and also consistent with each other. Our solution builds on the StyleGAN3 architecture, with a shared backbone and modality-specific branches in the last layers of the synthesis network, and we propose per-modality…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: In review

  22. arXiv:2303.12950  [pdf, other]

    cs.CV cs.GR

    LightPainter: Interactive Portrait Relighting with Freehand Scribble

    Authors: Yiqun Mei, He Zhang, Xuaner Zhang, Jianming Zhang, Zhixin Shu, Yilin Wang, Zijun Wei, Shi Yan, HyunJoon Jung, Vishal M. Patel

    Abstract: Recent portrait relighting methods have achieved realistic results of portrait lighting effects given a desired lighting representation such as an environment map. However, these methods are not intuitive for user interaction and lack precise lighting control. We introduce LightPainter, a scribble-based relighting system that allows users to interactively manipulate portrait lighting effect with e…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: CVPR2023

  23. arXiv:2303.12823  [pdf, other]

    eess.SY cs.AI

    Data-Driven Leader-following Consensus for Nonlinear Multi-Agent Systems against Composite Attacks: A Twins Layer Approach

    Authors: Xin Gong, Jintao Peng, Dong Yang, Zhan Shu, Tingwen Huang, Yukang Cui

    Abstract: This paper studies the leader-following consensuses of uncertain and nonlinear multi-agent systems against composite attacks (CAs), including Denial of Service (DoS) attacks and actuation attacks (AAs). A double-layer control framework is formulated, where a digital twin layer (TL) is added beside the traditional cyber-physical layer (CPL), inspired by the recent Digital Twin technology. Consequen…

    Submitted 22 March, 2023; originally announced March 2023.

  24. arXiv:2302.09335  [pdf, other]

    cs.AI

    Knowledge Graph Completion based on Tensor Decomposition for Disease Gene Prediction

    Authors: Xinyan Wang, Ting Jia, Chongyu Wang, Kuan Xu, Zixin Shu, Jian Yu, Kuo Yang, Xuezhong Zhou

    Abstract: Accurate identification of disease genes has consistently been one of the keys to decoding a disease's molecular mechanism. Most current approaches focus on constructing biological networks and utilizing machine learning, especially, deep learning to identify disease genes, but ignore the complex relations between entities in the biological knowledge graph. In this paper, we construct a biological…

    Submitted 16 March, 2023; v1 submitted 18 February, 2023; originally announced February 2023.

  25. arXiv:2302.04871  [pdf, other]

    cs.CV

    In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing

    Authors: Yiran Xu, Zhixin Shu, Cameron Smith, Seoung Wug Oh, Jia-Bin Huang

    Abstract: 3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts. GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code. However, a model pre-trained on a particular dataset (e.g., FFHQ) often has difficulty re…

    Submitted 14 April, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Project page: https://in-n-out-3d.github.io/

  26. arXiv:2212.10699  [pdf, other]

    cs.CV cs.GR

    PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields

    Authors: Zhengfei Kuang, Fujun Luan, Sai Bi, Zhixin Shu, Gordon Wetzstein, Kalyan Sunkavalli

    Abstract: Recent advances in neural radiance fields have enabled the high-fidelity 3D reconstruction of complex scenes for novel view synthesis. However, it remains underexplored how the appearance of such representations can be efficiently edited while maintaining photorealism. In this work, we present PaletteNeRF, a novel method for photorealistic appearance editing of neural radiance fields (NeRF) base…

    Submitted 24 January, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  27. arXiv:2210.14416  [pdf, other]

    eess.IV cs.CV cs.LG

    Residual Back Projection With Untrained Neural Networks

    Authors: Ziyu Shu, Alireza Entezari

    Abstract: Background and Objective: The success of neural networks in a number of image processing tasks has motivated their application in image reconstruction problems in computed tomography (CT). While progress has been made in this area, the lack of stability and theoretical guarantees for accuracy, together with the scarcity of high-quality training data for specific imaging domains pose challenges for…

    Submitted 29 December, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

  28. arXiv:2208.11257  [pdf, other]

    cs.CV

    3D-FM GAN: Towards 3D-Controllable Face Manipulation

    Authors: Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Richard Zhang, S. Y. Kung

    Abstract: 3D-controllable portrait synthesis has significantly advanced, thanks to breakthroughs in generative adversarial networks (GANs). However, it is still challenging to manipulate existing face images with precise 3D control. While concatenating GAN inversion and a 3D-aware, noise-to-image GAN is a straight-forward solution, it is inefficient and may lead to noticeable drop in editing quality. To fil…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: Accepted to ECCV2022. Project webpage: https://lychenyoko.github.io/3D-FM-GAN-Webpage/

  29. arXiv:2206.06481  [pdf, other]

    cs.CV

    RigNeRF: Fully Controllable Neural 3D Portraits

    Authors: ShahRukh Athar, Zexiang Xu, Kalyan Sunkavalli, Eli Shechtman, Zhixin Shu

    Abstract: Volumetric neural rendering methods, such as neural radiance fields (NeRFs), have enabled photo-realistic novel view synthesis. However, in their standard form, NeRFs do not support the editing of objects, such as a human head, within a scene. In this work, we propose RigNeRF, a system that goes beyond just novel view synthesis and enables full control of head pose and facial expressions learned f…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: The project page can be found here: http://shahrukhathar.github.io/2022/06/06/RigNeRF.html

  30. arXiv:2203.12780  [pdf, other]

    cs.CV

    Learning Motion-Dependent Appearance for High-Fidelity Rendering of Dynamic Humans from a Single Camera

    Authors: Jae Shin Yoon, Duygu Ceylan, Tuanfeng Y. Wang, Jingwan Lu, Jimei Yang, Zhixin Shu, Hyun Soo Park

    Abstract: Appearance of dressed humans undergoes a complex geometric transformation induced not only by the static pose but also by its dynamics, i.e., there exists a number of cloth geometric configurations given a pose depending on the way it has moved. Such appearance modeling conditioned on motion has been largely neglected in existing human rendering methods, resulting in rendering of physically implau…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: CVPR accepted. 15 pages. 17 figures, 5 tables

    Journal ref: IEEE Computer Vision and Pattern Recognition (CVPR) 2022

  31. Laplacian2Mesh: Laplacian-Based Mesh Understanding

    Authors: Qiujie Dong, Zixiong Wang, Manyi Li, Junjie Gao, Shuangmin Chen, Zhenyu Shu, Shiqing Xin, Changhe Tu, Wenping Wang

    Abstract: Geometric deep learning has sparked a rising interest in computer graphics to perform shape understanding tasks, such as shape classification and semantic segmentation. When the input is a polygonal surface, one has to suffer from the irregular mesh structure. Motivated by the geometric spectral theory, we introduce Laplacian2Mesh, a novel and flexible convolutional neural network (CNN) framework…

    Submitted 16 March, 2023; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Accepted by IEEE Transactions on Visualization and Computer Graphics (TVCG)

  32. Learning Optimal Topology for Ad-hoc Robot Networks

    Authors: Matin Macktoobian, Zhan Shu, Qing Zhao

    Abstract: In this paper, we synthesize a data-driven method to predict the optimal topology of an ad-hoc robot network. This problem is technically a multi-task classification problem. However, we divide it into a class of multi-class classification problems that can be more efficiently solved. For this purpose, we first compose an algorithm to create ground-truth optimal topologies associated with various…

    Submitted 4 April, 2024; v1 submitted 30 January, 2022; originally announced January 2022.

    Comments: This version is the one published in IEEE Robotics and Automation Letters

    Journal ref: IEEE Robotics and Automation Letters, 2023

  33. arXiv:2201.08845  [pdf, other]

    cs.CV

    Point-NeRF: Point-based Neural Radiance Fields

    Authors: Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, Ulrich Neumann

    Abstract: Volumetric neural rendering methods like NeRF generate high-quality view synthesis results but are optimized per-scene leading to prohibitive reconstruction time. On the other hand, deep multi-view stereo methods can quickly reconstruct scene geometry via direct network inference. Point-NeRF combines the advantages of these two approaches by using neural 3D point clouds, with associated neural fea…

    Submitted 15 March, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

    Comments: Accepted to CVPR 2022 (Oral)

    Journal ref: In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5438-5448) (2022)

  34. arXiv:2109.06166  [pdf, other]

    cs.CV

    Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

    Authors: Badour AlBahar, Jingwan Lu, Jimei Yang, Zhixin Shu, Eli Shechtman, Jia-Bin Huang

    Abstract: We present an algorithm for re-rendering a person from a single image under arbitrary poses. Existing methods often have difficulties in hallucinating occluded contents photo-realistically while preserving the identity and fine details in the source image. We first learn to inpaint the correspondence field between the body surface texture and the source image with a human body symmetry prior. The…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: SIGGRAPH Asia 2021. Project page: https://pose-with-style.github.io/

  35. arXiv:2108.04913  [pdf, other]

    cs.CV

    FLAME-in-NeRF : Neural control of Radiance Fields for Free View Face Animation

    Authors: ShahRukh Athar, Zhixin Shu, Dimitris Samaras

    Abstract: This paper presents a neural rendering method for controllable portrait video synthesis. Recent advances in volumetric neural rendering, such as neural radiance fields (NeRF), have enabled the photorealistic novel view synthesis of static scenes with impressive results. However, modeling dynamic and controllable objects as part of a scene with such scene representations is still challenging. In thi…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: version 1.0.0

  36. Single-image Full-body Human Relighting

    Authors: Manuel Lagunas, Xin Sun, Jimei Yang, Ruben Villegas, Jianming Zhang, Zhixin Shu, Belen Masia, Diego Gutierrez

    Abstract: We present a single-image data-driven method to automatically relight images with full-body humans in them. Our framework is based on a realistic scene decomposition leveraging precomputed radiance transfer (PRT) and spherical harmonics (SH) lighting. In contrast to previous work, we lift the assumptions on Lambertian materials and explicitly model diffuse and specular reflectance in our data. Mor…

    Submitted 15 July, 2021; originally announced July 2021.

    Comments: 11 pages, 12 figures

    Journal ref: Eurographics Symposium on Rendering (EGSR), 2021

  37. arXiv:2105.06517  [pdf]

    cs.AI cs.LG cs.RO

    Reinforcement Learning Based Safe Decision Making for Highway Autonomous Driving

    Authors: Arash Mohammadhasani, Hamed Mehrivash, Alan Lynch, Zhan Shu

    Abstract: In this paper, we develop a safe decision-making method for self-driving cars in a multi-lane, single-agent setting. The proposed approach utilizes deep reinforcement learning (RL) to achieve a high-level policy for safe tactical decision-making. We address two major challenges that arise solely in autonomous navigation. First, the proposed algorithm ensures that collisions never happen, and there…

    Submitted 13 May, 2021; originally announced May 2021.

  38. arXiv:2104.02244  [pdf, other]

    cs.CV

    Content-Aware GAN Compression

    Authors: Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, S. Y. Kung

    Abstract: Generative adversarial networks (GANs), e.g., StyleGAN2, play a vital role in various image generation and synthesis tasks, yet their notoriously high computational cost hinders their efficient deployment on edge devices. Directly applying generic compression approaches yields poor results on GANs, which motivates a number of recent GAN compression works. While prior works mainly accelerate condit…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: Published in CVPR2021

    ACM Class: I.4.0; I.2.6

  39. arXiv:2102.13453  [pdf, other]

    cs.LG

    Private and Utility Enhanced Recommendations with Local Differential Privacy and Gaussian Mixture Model

    Authors: Jeyamohan Neera, Xiaomin Chen, Nauman Aslam, Kezhi Wang, Zhan Shu

    Abstract: Recommendation systems rely heavily on users behavioural and preferential data (e.g. ratings, likes) to produce accurate recommendations. However, users experience privacy concerns due to unethical data aggregation and analytical practices carried out by the Service Providers (SP). Local differential privacy (LDP) based perturbation mechanisms add noise to users data at user side before sending it…

    Submitted 6 March, 2021; v1 submitted 26 February, 2021; originally announced February 2021.

    Comments: 12 pages

  40. arXiv:2010.03245  [pdf, other]

    cs.CV

    Learning Clusterable Visual Features for Zero-Shot Recognition

    Authors: Jingyi Xu, Zhixin Shu, Dimitris Samaras

    Abstract: In zero-shot learning (ZSL), conditional generators have been widely used to generate additional training features. These features can then be used to train the classifiers for testing data. However, some testing data are considered "hard" as they lie close to the decision boundaries and are prone to misclassification, leading to performance degradation for ZSL. In this paper, we propose to learn…

    Submitted 14 October, 2020; v1 submitted 7 October, 2020; originally announced October 2020.

  41. arXiv:2009.12148  [pdf, other]

    cs.MM

    Adaptive Multi-modal Fusion Hashing via Hadamard Matrix

    Authors: Jun Yu, Donglin Zhang, Zhenqiu Shu, Feng Chen

    Abstract: Hashing plays an important role in information retrieval, due to its low storage and high speed of processing. Among the techniques available in the literature, multi-modal hashing, which can encode heterogeneous multi-modal features into compact hash codes, has received particular attention. Most of the existing multi-modal hashing methods adopt the fixed weighting factors to fuse multiple modali…

    Submitted 4 August, 2021; v1 submitted 25 September, 2020; originally announced September 2020.

    Comments: 10 pages, 8 figures, 3 tables. The paper is under consideration at Applied Intelligence

  42. arXiv:2008.08734  [pdf, ps, other]

    eess.SY cs.LG math.OC stat.ML

    Model-free optimal control of discrete-time systems with additive and multiplicative noises

    Authors: Jing Lai, Junlin Xiong, Zhan Shu

    Abstract: This paper investigates the optimal control problem for a class of discrete-time stochastic systems subject to additive and multiplicative noises. A stochastic Lyapunov equation and a stochastic algebraic Riccati equation are established for the existence of the optimal admissible control policy. A model-free reinforcement learning algorithm is proposed to learn the optimal admissible control policy…

    Submitted 19 August, 2020; originally announced August 2020.

    Comments: 8 pages, 3 figures

  43. Gram filtering and sinogram interpolation for pixel-basis in parallel-beam X-ray CT reconstruction

    Authors: Ziyu Shu, Alireza Entezari

    Abstract: The key aspect of parallel-beam X-ray CT is forward and back projection, but its computational burden continues to be an obstacle for applications. We propose a method to improve the performance of related algorithms by calculating the Gram filter exactly and interpolating the sinogram signal optimally. In addition, the detector blur effect can be included in our model efficiently. The improvement…

    Submitted 27 May, 2020; originally announced May 2020.

    Journal ref: ISBI 2020

  44. arXiv:1911.00735  [pdf, other]

    cs.CV

    Self-supervised Deformation Modeling for Facial Expression Editing

    Authors: ShahRukh Athar, Zhixin Shu, Dimitris Samaras

    Abstract: Recent advances in deep generative models have demonstrated impressive results in photo-realistic facial image synthesis and editing. Facial expressions are inherently the result of muscle movement. However, existing neural network-based approaches usually only rely on texture generation to edit expressions and largely neglect the motion information. In this work, we propose a novel end-to-end net…

    Submitted 5 November, 2019; v1 submitted 2 November, 2019; originally announced November 2019.

  45. arXiv:1904.11960  [pdf, other]

    cs.CV

    Lifting AutoEncoders: Unsupervised Learning of a Fully-Disentangled 3D Morphable Model using Deep Non-Rigid Structure from Motion

    Authors: Mihir Sahasrabudhe, Zhixin Shu, Edward Bartrum, Riza Alp Guler, Dimitris Samaras, Iasonas Kokkinos

    Abstract: In this work we introduce Lifting Autoencoders, a generative 3D surface-based model of object categories. We bring together ideas from non-rigid structure from motion, image formation, and morphable models to learn a controllable, geometric model of 3D categories in an entirely unsupervised manner from an unstructured set of images. We exploit the 3D geometric nature of our model and use normal in…

    Submitted 26 April, 2019; originally announced April 2019.

    Comments: 19 pages; 12 figures; code will be released; Project page: https://msahasrabudhe.github.io/projects/lae/

  46. arXiv:1810.10458  [pdf, other]

    cs.NI

    Proportional fairness in wireless powered CSMA/CA based IoT networks

    Authors: Xiaomin Chen, Zhan Shu, Kezhi Wang, Fangmin Xu, Yue Cao

    Abstract: This paper considers the deployment of a hybrid wireless data/power access point in an 802.11-based wireless powered IoT network. The proportionally fair allocation of throughputs across IoT nodes is considered under the constraints of energy neutrality and CPU capability for each device. The joint optimization of wireless powering and data communication resources takes the CSMA/CA random channel…

    Submitted 24 October, 2018; originally announced October 2018.

    Comments: Accepted by Globecom 2018

  47. arXiv:1809.05964  [pdf, other]

    cs.LG stat.ML

    Latent Space Optimal Transport for Generative Models

    Authors: Huidong Liu, Yang Guo, Na Lei, Zhixin Shu, Shing-Tung Yau, Dimitris Samaras, Xianfeng Gu

    Abstract: Variational Auto-Encoders enforce their learned intermediate latent-space data distribution to be a simple distribution, such as an isotropic Gaussian. However, this causes the posterior collapse problem and loses manifold structure which can be important for datasets such as facial images. A GAN can transform a simple distribution to a latent-space data distribution and thus preserve the manifold…

    Submitted 16 September, 2018; originally announced September 2018.

  48. arXiv:1806.06503  [pdf, other]

    cs.CV

    Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance

    Authors: Zhixin Shu, Mihir Sahasrabudhe, Alp Guler, Dimitris Samaras, Nikos Paragios, Iasonas Kokkinos

    Abstract: In this work we introduce Deforming Autoencoders, a generative model for images that disentangles shape from appearance in an unsupervised manner. As in the deformable template paradigm, shape is represented as a deformation between a canonical coordinate system ('template') and an observed image, while appearance is modeled in 'canonical', template, coordinates, thus discarding variability due to…

    Submitted 18 June, 2018; originally announced June 2018.

    Comments: 17 pages including references, plus 12 pages appendix. Video available at: https://youtu.be/Oi7pyxKkF1g. Code will be made available soon.

  49. An Adversarial Neuro-Tensorial Approach For Learning Disentangled Representations

    Authors: Mengjiao Wang, Zhixin Shu, Shiyang Cheng, Yannis Panagakis, Dimitris Samaras, Stefanos Zafeiriou

    Abstract: Several factors contribute to the appearance of an object in a visual scene, including pose, illumination, and deformation, among others. Each factor accounts for a source of variability in the data, while the multiplicative interactions of these factors emulate the entangled variability, giving rise to the rich structure of visual object appearance. Disentangling such unobserved factors from visu…

    Submitted 24 February, 2018; v1 submitted 28 November, 2017; originally announced November 2017.

    Journal ref: International Journal of Computer Vision, 2019

  50. arXiv:1709.02848  [pdf, other]

    cs.CV

    Improving Heterogeneous Face Recognition with Conditional Adversarial Networks

    Authors: Wuming Zhang, Zhixin Shu, Dimitris Samaras, Liming Chen

    Abstract: Heterogeneous face recognition between color image and depth image is a much desired capacity for real world applications where shape information is looked upon as merely involved in gallery. In this paper, we propose a cross-modal deep learning method as an effective and efficient workaround for this challenge. Specifically, we begin with learning two convolutional neural networks (CNNs) to extra…

    Submitted 13 September, 2017; v1 submitted 8 September, 2017; originally announced September 2017.