
Showing 1–15 of 15 results for author: Kwak, M

Searching in archive cs.
  1. arXiv:2406.16695

    cs.CV

    Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling

    Authors: Min-Seop Kwak, Donghoon Ahn, Ines Hyeonsu Kim, Jin-wha Kim, Seungryong Kim

    Abstract: Score distillation sampling (SDS), the methodology in which the score from pretrained 2D diffusion models is distilled into 3D representation, has recently brought significant advancements in text-to-3D generation task. However, this approach is still confronted with critical geometric inconsistency problems such as the Janus problem. Starting from a hypothesis that such inconsistency problems may…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.16042

    cs.CV

    Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification

    Authors: Inès Hyeonsu Kim, JoungBin Lee, Soowon Son, Woojeong Jin, Kyusun Cho, Junyoung Seo, Min-Seop Kwak, Seokju Cho, JeongYeol Baek, Byeongwon Lee, Seungryong Kim

    Abstract: Person re-identification (Re-ID) often faces challenges due to variations in human poses and camera viewpoints, which significantly affect the appearance of individuals across images. Existing datasets frequently lack diversity and scalability in these aspects, hindering the generalization of Re-ID models to new camera systems. Previous methods have attempted to address these issues through data a…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: The project page is available at https://ku-cvlab.github.io/Diff-ID/

  3. Making a prototype of Seoul historical sites chatbot using Langchain

    Authors: Jae Young Suh, Minsoo Kwak, Soo Yong Kim, Hyoungseo Cho

    Abstract: In this paper, we are going to share a draft of the development of a conversational agent created to disseminate information about historical sites located in the Seoul. The primary objective of the agent is to increase awareness among visitors who are not familiar with Seoul, about the presence and precise locations of valuable cultural heritage sites. It aims to promote a basic understanding of…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 4 pages, 4 figures, draft

  4. arXiv:2402.02972

    cs.CV cs.LG

    Retrieval-Augmented Score Distillation for Text-to-3D Generation

    Authors: Junyoung Seo, Susung Hong, Wooseok Jang, Inès Hyeonsu Kim, Minseop Kwak, Doyup Lee, Seungryong Kim

    Abstract: Text-to-3D generation has achieved significant success by incorporating powerful 2D diffusion models, but insufficient 3D prior knowledge also leads to the inconsistency of 3D geometry. Recently, since large-scale multi-view datasets have been released, fine-tuning the diffusion model on the multi-view datasets becomes a mainstream to solve the 3D inconsistency problem. However, it has confronted…

    Submitted 2 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted to ICML 2024 / Project Page: https://ku-cvlab.github.io/ReDream/

  5. arXiv:2312.01003

    cs.CV

    Self-Evolving Neural Radiance Fields

    Authors: Jaewoo Jung, Jisang Han, Jiwon Kang, Seongchan Kim, Min-Seop Kwak, Seungryong Kim

    Abstract: Recently, neural radiance field (NeRF) has shown remarkable performance in novel view synthesis and 3D reconstruction. However, it still requires abundant high-quality images, limiting its applicability in real-world scenarios. To overcome this limitation, recent works have focused on training NeRF only with sparse viewpoints by giving additional regularizations, often called few-shot NeRF. We obs…

    Submitted 5 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: 34 pages, 21 figures. Our project page can be found at: https://ku-cvlab.github.io/SE-NeRF/

  6. arXiv:2311.07006

    cs.CL cs.AI

    Context-dependent Instruction Tuning for Dialogue Response Generation

    Authors: Jin Myung Kwak, Minseon Kim, Sung Ju Hwang

    Abstract: Recent language models have achieved impressive performance in natural language tasks by incorporating instructions with task input during fine-tuning. Since all samples in the same natural language task can be explained with the same task instructions, many instruction datasets only provide a few instructions for the entire task, without considering the input of each example in the task. However,…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: Work in Progress

  7. arXiv:2305.19201

    cs.CV

    DaRF: Boosting Radiance Fields from Sparse Inputs with Monocular Depth Adaptation

    Authors: Jiuhn Song, Seonghoon Park, Honggyu An, Seokju Cho, Min-Seop Kwak, Sungjin Cho, Seungryong Kim

    Abstract: Neural radiance fields (NeRF) shows powerful performance in novel view synthesis and 3D geometry reconstruction, but it suffers from critical performance degradation when the number of known viewpoints is drastically reduced. Existing works attempt to overcome this problem by employing external priors, but their success is limited to certain types of scenes or datasets. Employing monocular depth e…

    Submitted 25 September, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: To appear at NeurIPS 2023. Project Page: https://ku-cvlab.github.io/DaRF/

  8. arXiv:2305.18846

    cs.CL cs.AI cs.LG

    Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation

    Authors: Minki Kang, Jin Myung Kwak, Jinheon Baek, Sung Ju Hwang

    Abstract: Language models have achieved impressive performances on dialogue generation tasks. However, when generating responses for a conversation that requires factual knowledge, they are far from perfect, due to an absence of mechanisms to retrieve, encode, and reflect the knowledge in the generated responses. Some knowledge-grounded dialogue generation methods tackle this problem by leveraging facts fro…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Preprint. Under review

  9. arXiv:2303.07937

    cs.CV

    Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation

    Authors: Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Hyeonsu Kim, Jaehoon Ko, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim

    Abstract: Text-to-3D generation has shown rapid progress in recent days with the advent of score distillation, a methodology of using pretrained text-to-2D diffusion models to optimize neural radiance field (NeRF) in the zero-shot setting. However, the lack of 3D awareness in the 2D diffusion models destabilizes score distillation-based methods from reconstructing a plausible 3D scene. To address this issue…

    Submitted 6 February, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: Project page: https://ku-cvlab.github.io/3DFuse/

  10. arXiv:2301.10941

    cs.CV

    GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency

    Authors: Min-seop Kwak, Jiuhn Song, Seungryong Kim

    Abstract: We present a novel framework to regularize Neural Radiance Field (NeRF) in a few-shot setting with a geometry-aware consistency regularization. The proposed approach leverages a rendered depth map at unobserved viewpoint to warp sparse input images to the unobserved viewpoint and impose them as pseudo ground truths to facilitate learning of NeRF. By encouraging such geometry-aware consistency at a…

    Submitted 27 April, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: ICML 2023

  11. arXiv:2210.10329

    cs.CL cs.AI

    Language Detoxification with Attribute-Discriminative Latent Space

    Authors: Jin Myung Kwak, Minseon Kim, Sung Ju Hwang

    Abstract: Transformer-based Language Models (LMs) have achieved impressive results on natural language understanding tasks, but they can also generate toxic text such as insults, threats, and profanity, limiting their real-world applications. To overcome this issue, a few text generation approaches aim to detoxify toxic texts using additional LMs or perturbations. However, previous methods require excessive…

    Submitted 5 July, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: ACL 2023; *Equal contribution. Author ordering determined by coin flip

  12. arXiv:2204.10511

    cs.CV

    Keypoint based Sign Language Translation without Glosses

    Authors: Youngmin Kim, Minji Kwak, Dain Lee, Yeongeun Kim, Hyeongboo Baek

    Abstract: Sign Language Translation (SLT) is a task that has not been studied relatively much compared to the study of Sign Language Recognition (SLR). However, the SLR is a study that recognizes the unique grammar of sign language, which is different from the spoken language and has a problem that non-disabled people cannot easily interpret. So, we're going to solve the problem of translating directly spok…

    Submitted 13 June, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: 14 pages, 5 figures

  13. arXiv:2112.09379

    cs.CV

    Enhanced Frame and Event-Based Simulator and Event-Based Video Interpolation Network

    Authors: Adam Radomski, Andreas Georgiou, Thomas Debrunner, Chenghan Li, Luca Longinotti, Minwon Seo, Moosung Kwak, Chang-Woo Shin, Paul K. J. Park, Hyunsurk Eric Ryu, Kynan Eng

    Abstract: Fast neuromorphic event-based vision sensors (Dynamic Vision Sensor, DVS) can be combined with slower conventional frame-based sensors to enable higher-quality inter-frame interpolation than traditional methods relying on fixed motion approximations using e.g. optical flow. In this work we present a new, advanced event simulator that can produce realistic scenes recorded by a camera rig with an ar…

    Submitted 17 December, 2021; originally announced December 2021.

    Comments: 10 pages, 19 figures

  14. arXiv:2109.14935

    physics.app-ph cs.ET

    Ionic Sieving Through One-Atom-Thick 2D Material Enables Analog Nonvolatile Memory for Neuromorphic Computing

    Authors: Revannath Dnyandeo Nikam, Jongwon Lee, Wooseok Choi, Writam Banerjee, Myonghoon Kwak, Manoj Yadav, Hyunsang Hwang

    Abstract: The first report on ion transport through atomic sieves of atomically-thin 2D material is provided to solve critical limitations of electrochemical random-access memory (ECRAM) devices.

    Submitted 30 September, 2021; originally announced September 2021.

    Journal ref: Small 2021, 2103543

  15. arXiv:2010.11547

    cs.CV cs.LG

    TLGAN: document Text Localization using Generative Adversarial Nets

    Authors: Dongyoung Kim, Myungsung Kwak, Eunji Won, Sejung Shin, Jeongyeon Nam

    Abstract: Text localization from the digital image is the first step for the optical character recognition task. Conventional image processing based text localization performs adequately for specific examples. Yet, a general text localization are only archived by recent deep-learning based modalities. Here we present document Text Localization Generative Adversarial Nets (TLGAN) which are deep neural networ…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: 17 pages, three figures, 4 tables, methods for IEEE ICDAR RRC SROIE task1 leader board

    MSC Class: 68T10 (Primary); 68T07 (Secondary)

    ACM Class: I.5.1; I.4.5