Showing 1–4 of 4 results for author: Yeon, I

Searching in archive eess.
  1. arXiv:2401.10453  [pdf]

    eess.AS cs.SD

    3D Room Geometry Inference from Multichannel Room Impulse Response using Deep Neural Network

    Authors: Inmo Yeon, Jung-Woo Choi

    Abstract: Room geometry inference (RGI) aims at estimating room shapes from measured room impulse responses (RIRs) and has received lots of attention for its importance in environment-aware audio rendering and virtual acoustic representation of a real venue. A lot of estimation models utilizing time difference of arrival (TDoA) or time of arrival (ToA) information in RIRs have been proposed. However, an est…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: 5 pages, 2 figures, Proceedings of the 24th International Congress on Acoustics

    Journal ref: Proceedings of the 24th International Congress on Acoustics, ICA 2022

  2. arXiv:2310.11728  [pdf, other]

    cs.SD eess.AS eess.SP

    EchoScan: Scanning Complex Indoor Geometries via Acoustic Echoes

    Authors: Inmo Yeon, Iljoo Jeong, Seungchul Lee, Jung-Woo Choi

    Abstract: Accurate estimation of indoor space geometries is vital for constructing precise digital twins, whose broad industrial applications include navigation in unfamiliar environments and efficient evacuation planning, particularly in low-light conditions. This study introduces EchoScan, a deep neural network model that utilizes acoustic echoes to perform room geometry inference. Conventional sound-base…

    Submitted 16 April, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: 9 pages, 8 figures, 2 tables

  3. arXiv:2309.13664  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    VoiceLDM: Text-to-Speech with Environmental Context

    Authors: Yeonghyeon Lee, Inmo Yeon, Juhan Nam, Joon Son Chung

    Abstract: This paper presents VoiceLDM, a model designed to produce audio that accurately follows two distinct natural language text prompts: the description prompt and the content prompt. The former provides information about the overall environmental context of the audio, while the latter conveys the linguistic content. To achieve this, we adopt a text-to-audio (TTA) model based on latent diffusion models…

    Submitted 24 September, 2023; originally announced September 2023.

    Comments: Demos and code are available at https://voiceldm.github.io

  4. arXiv:2309.01513  [pdf, other]

    eess.AS cs.AI cs.SD

    RGI-Net: 3D Room Geometry Inference from Room Impulse Responses With Hidden First-Order Reflections

    Authors: Inmo Yeon, Jung-Woo Choi

    Abstract: Room geometry is important prior information for implementing realistic 3D audio rendering. For this reason, various room geometry inference (RGI) methods have been developed by utilizing the time-of-arrival (TOA) or time-difference-of-arrival (TDOA) information in room impulse responses (RIRs). However, the conventional RGI technique poses several assumptions, such as convex room shapes, the numb…

    Submitted 27 July, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: 5 pages, 3 figures, 3 tables