
Showing 1–10 of 10 results for author: Hang, T

Searching in archive cs.
  1. arXiv:2407.03297  [pdf, other]

    cs.CV cs.AI

    Improved Noise Schedule for Diffusion Training

    Authors: Tiankai Hang, Shuyang Gu

    Abstract: Diffusion models have emerged as the de facto choice for generating visual signals. However, training a single model to predict noise across various levels poses significant challenges, necessitating numerous iterations and incurring significant computational costs. Various approaches, such as loss weighting strategy design and architectural refinements, have been introduced to expedite convergenc…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2406.04314  [pdf, other]

    cs.CV

    Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step

    Authors: Zhanhao Liang, Yuhui Yuan, Shuyang Gu, Bohan Chen, Tiankai Hang, Ji Li, Liang Zheng

    Abstract: Recently, Direct Preference Optimization (DPO) has extended its success from aligning large language models (LLMs) to aligning text-to-image diffusion models with human preferences. Unlike most existing DPO methods that assume all diffusion steps share a consistent preference order with the final generated images, we argue that this assumption neglects step-specific denoising performance and that…

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2403.14623  [pdf, other]

    cs.LG cs.CV

    Simplified Diffusion Schrödinger Bridge

    Authors: Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo

    Abstract: This paper introduces a novel theoretical simplification of the Diffusion Schrödinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling faster convergence and enhanced performance. By employing SGMs as an initial solution for DSB, our approach capitalizes on the strengths of both framew…

    Submitted 13 August, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  4. arXiv:2401.13011  [pdf, other]

    cs.CV

    CCA: Collaborative Competitive Agents for Image Editing

    Authors: Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo

    Abstract: This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Models (LLMs) based agents to execute complex tasks. Drawing inspiration from Generative Adversarial Networks (GANs), the CCA system employs two equal-status generator agents and a discriminator agent. The generators independently process user instructio…

    Submitted 23 January, 2024; originally announced January 2024.

  5. arXiv:2309.03895  [pdf, other]

    cs.CV

    InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

    Authors: Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo

    Abstract: We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions. Unlike existing approaches that integrate prior knowledge and pre-define the output space (e.g., categories and coordinates) for each vision task, we cast diverse vision tasks into a human-intuitive image-manipulating process whose output space is a flexible and interactive pi…

    Submitted 7 September, 2023; originally announced September 2023.

  6. arXiv:2303.09556  [pdf, other]

    cs.CV

    Efficient Diffusion Training via Min-SNR Weighting Strategy

    Authors: Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

    Abstract: Denoising diffusion models have been a mainstream approach for image generation, however, training these models often suffers from slow convergence. In this paper, we discovered that the slow convergence is partly due to conflicting optimization directions between timesteps. To address this issue, we treat the diffusion training as a multi-task learning problem, and introduce a simple yet effectiv…

    Submitted 11 March, 2024; v1 submitted 16 March, 2023; originally announced March 2023.

  7. arXiv:2208.05617  [pdf, other]

    cs.CV

    Language-Guided Face Animation by Recurrent StyleGAN-based Generator

    Authors: Tiankai Hang, Huan Yang, Bei Liu, Jianlong Fu, Xin Geng, Baining Guo

    Abstract: Recent works on language-guided image manipulation have shown great power of language in providing rich semantics, especially for face images. However, the other natural information, motions, in language is less explored. In this paper, we leverage the motion information and study a novel task, language-guided face animation, that aims to animate a static face image with the help of languages. To…

    Submitted 3 July, 2024; v1 submitted 10 August, 2022; originally announced August 2022.

  8. arXiv:2111.10337  [pdf, other]

    cs.CV

    Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions

    Authors: Hongwei Xue, Tiankai Hang, Yanhong Zeng, Yuchong Sun, Bei Liu, Huan Yang, Jianlong Fu, Baining Guo

    Abstract: We study joint video and language (VL) pre-training to enable cross-modality learning and benefit plentiful downstream VL tasks. Existing works either extract low-quality video features or learn limited text embedding, while neglecting that high-resolution videos and diversified semantics can significantly improve cross-modality learning. In this paper, we propose a novel High-resolution and Diver…

    Submitted 8 July, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

    Journal ref: published in CVPR 2022

  9. Epidemic Dynamics via Wavelet Theory and Machine Learning, with Applications to Covid-19

    Authors: Tô Tat Dat, Protin Frédéric, Nguyen T. T. Hang, Martel Jules, Nguyen Duc Thang, Charles Piffault, Rodríguez Willy, Figueroa Susely, Hông Vân Lê, Wilderich Tuschmann, Nguyen Tien Zung

    Abstract: We introduce the concept of epidemic-fitted wavelets which comprise, in particular, as special cases the number $I(t)$ of infectious individuals at time $t$ in classical SIR models and their derivatives. We present a novel method for modelling epidemic dynamics by a model selection method using wavelet theory and, for its applications, machine learning based curve fitting techniques. Our universal…

    Submitted 13 November, 2020; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: References added, typos fixed, projections updated, minor mistakes corrected

    Journal ref: Biology 2020, 9(12), 477

  10. arXiv:2001.05681  [pdf, other]

    cs.LG

    Stream-Flow Forecasting of Small Rivers Based on LSTM

    Authors: Youchuan Hu, Le Yan, Tingting Hang, Jun Feng

    Abstract: Stream-flow forecasting for small rivers has always been of great importance, yet comparatively challenging due to the special features of rivers with smaller volume. Artificial Intelligence (AI) methods have been employed in this area for long, but improvement of forecast quality is still on the way. In this paper, we tried to provide a new method to do the forecast using the Long-Short Term Memo…

    Submitted 16 January, 2020; originally announced January 2020.

    Comments: 7 pages