Showing 1–50 of 442 results for author: Lee, G

Searching in archive cs.
  1. From Prediction to Application: Language Model-based Code Knowledge Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with Pedagogical Prompting for Comprehensive Programming Education

    Authors: Unggi Lee, Jiyeong Bae, Yeonji Jung, Minji Kang, Gyuri Byun, Yeonseo Lee, Dohee Kim, Sookbun Lee, Jaekwon Park, Taekyung Ahn, Gunho Lee, Hyeoncheol Kim

    Abstract: Knowledge Tracing (KT) is a critical component in online learning, but traditional approaches face limitations in interpretability and cross-domain adaptability. This paper introduces Language Model-based Code Knowledge Tracing (CodeLKT), an innovative application of Language model-based Knowledge Tracing (LKT) to programming education. CodeLKT leverages pre-trained language models to process lear…

    Submitted 30 August, 2024; originally announced September 2024.

    Comments: 9 pages, 2 figures

  2. arXiv:2408.13850  [pdf, other]

    cs.LG cs.AI

    Condensed Sample-Guided Model Inversion for Knowledge Distillation

    Authors: Kuluhan Binici, Shivam Aggarwal, Cihan Acar, Nam Trung Pham, Karianto Leman, Gim Hee Lee, Tulika Mitra

    Abstract: Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, gener…

    Submitted 25 August, 2024; originally announced August 2024.

  3. arXiv:2408.06065  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    An Investigation Into Explainable Audio Hate Speech Detection

    Authors: Jinmyeong An, Wonjun Lee, Yejin Jeon, Jungseul Ok, Yunsu Kim, Gary Geunbae Lee

    Abstract: Research on hate speech has predominantly revolved around detection and interpretation from textual inputs, leaving verbal content largely unexplored. While there has been limited exploration into hate speech detection within verbal acoustic speech inputs, the aspect of interpretability has been overlooked. Therefore, we introduce a new task of explainable audio hate speech detection. Specifically…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: Accepted to SIGDIAL 2024

  4. arXiv:2408.06044  [pdf, other]

    cs.CL

    DiagESC: Dialogue Synthesis for Integrating Depression Diagnosis into Emotional Support Conversation

    Authors: Seungyeon Seo, Gary Geunbae Lee

    Abstract: Dialogue systems for mental health care aim to provide appropriate support to individuals experiencing mental distress. While extensive research has been conducted to deliver adequate emotional support, existing studies cannot identify individuals who require professional medical intervention and cannot offer suitable guidance. We introduce the Diagnostic Emotional Support Conversation task for an…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: Accepted by SIGDIAL 2024

  5. arXiv:2408.06043  [pdf, other]

    cs.CL cs.SD eess.AS

    Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning

    Authors: Wonjun Lee, San Kim, Gary Geunbae Lee

    Abstract: Recent dialogue systems rely on turn-based spoken interactions, requiring accurate Automatic Speech Recognition (ASR). Errors in ASR can significantly impact downstream dialogue tasks. To address this, using dialogue context from user and agent interactions for transcribing subsequent utterances has been proposed. This method incorporates the transcription of the user's speech and the agent's resp…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 11 pages, 2 figures, Accepted to SIGDIAL2024

  6. arXiv:2408.05927  [pdf, other]

    cs.CV

    A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models

    Authors: Taehong Moon, Moonseok Choi, EungGu Yun, Jongmin Yoon, Gayoung Lee, Jaewoong Cho, Juho Lee

    Abstract: Diffusion models have shown remarkable performance in generation problems over various domains including images, videos, text, and audio. A practical bottleneck of diffusion models is their sampling speed, due to the repeated evaluation of score estimation networks during the inference. In this work, we propose a novel framework capable of adaptively allocating compute required for the score estim…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: ICML 2024

  7. arXiv:2408.05074  [pdf]

    cs.CL cs.AI

    RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records

    Authors: Sangjoon Park, Chan Woo Wee, Seo Hee Choi, Kyung Hwan Kim, Jee Suk Chang, Hong In Yoon, Ik Jae Lee, Yong Bae Kim, Jaeho Cho, Ki Chang Keum, Chang Geol Lee, Hwa Kyung Byun, Woong Sub Koom

    Abstract: Accurate patient selection is critical in radiotherapy (RT) to prevent ineffective treatments. Traditional survival prediction models, relying on structured data, often lack precision. This study explores the potential of large language models (LLMs) to structure unstructured electronic health record (EHR) data, thereby improving survival prediction accuracy through comprehensive clinical informat…

    Submitted 4 September, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

    Comments: 23 pages, 2 tables, 4 figures

  8. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (510 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  9. arXiv:2407.21035  [pdf, other]

    cs.CV

    Direct Unlearning Optimization for Robust and Safe Text-to-Image Models

    Authors: Yong-Hyun Park, Sangdoo Yun, Jin-Hwa Kim, Junho Kim, Geonhui Jang, Yonghyun Jeong, Junghyo Jo, Gayoung Lee

    Abstract: Recent advancements in text-to-image (T2I) models have greatly benefited from large-scale datasets, but they also pose significant risks due to the potential generation of unsafe content. To mitigate this issue, researchers have developed unlearning techniques to remove the model's ability to generate potentially harmful content. However, these methods are easily bypassed by adversarial attacks, m…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Extended abstract accepted in GenLaw 2024 workshop @ ICML2024

  10. arXiv:2407.18328  [pdf, ps, other]

    cs.CL cs.CY

    Unveiling Scoring Processes: Dissecting the Differences between LLMs and Human Graders in Automatic Scoring

    Authors: Xuansheng Wu, Padmaja Pravin Saraf, Gyeong-Geon Lee, Ehsan Latif, Ninghao Liu, Xiaoming Zhai

    Abstract: Large language models (LLMs) have demonstrated strong potential in performing automatic scoring for constructed response assessments. While constructed responses graded by humans are usually based on given grading rubrics, the methods by which LLMs assign scores remain largely unclear. It is also uncertain how closely AI's scoring process mirrors that of humans, or if it adheres to the same gradin…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Non-archival Presenting at EDM 2024 Workshop on Large Language Models

  11. arXiv:2407.15648  [pdf, other]

    cs.CV

    TreeSBA: Tree-Transformer for Self-Supervised Sequential Brick Assembly

    Authors: Mengqi Guo, Chen Li, Yuyang Zhao, Gim Hee Lee

    Abstract: Inferring step-wise actions to assemble 3D objects with primitive bricks from images is a challenging task due to complex constraints and the vast number of possible combinations. Recent studies have demonstrated promising results on sequential LEGO brick assembly through the utilization of LEGO-Graph modeling to predict sequential actions. However, existing approaches are class-specific and requi…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  12. arXiv:2407.13808  [pdf, other]

    cs.CV

    CoAPT: Context Attribute words for Prompt Tuning

    Authors: Gun Lee, Subin An, Sungyong Baik, Soochahn Lee

    Abstract: We propose a novel prompt tuning method called CoAPT(Context Attribute words in Prompt Tuning) for few/zero-shot image classification. The core motivation is that attributes are descriptive words with rich information about a given concept. Thus, we aim to enrich text queries of existing prompt tuning methods, improving alignment between text and image embeddings in CLIP embedding space. To do so,…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 14 pages, 4 figures

  13. arXiv:2407.12508  [pdf, other]

    cs.CL cs.AI cs.CV

    MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank Pipeline

    Authors: Donghoon Han, Eunhwan Park, Gisang Lee, Adam Lee, Nojun Kwak

    Abstract: The rapid expansion of multimedia content has made accurately retrieving relevant videos from large collections increasingly challenging. Recent advancements in text-video retrieval have focused on cross-modal interactions, large-scale foundation model training, and probabilistic modeling, yet often neglect the crucial user perspective, leading to discrepancies between user queries and the content…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Work in progress

  14. arXiv:2407.11439  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Repurformer: Transformers for Repurposing-Aware Molecule Generation

    Authors: Changhun Lee, Gyumin Lee

    Abstract: Generating as diverse molecules as possible with desired properties is crucial for drug discovery research, which invokes many approaches based on deep generative models today. Despite recent advancements in these models, particularly in variational autoencoders (VAEs), generative adversarial networks (GANs), Transformers, and diffusion models, a significant challenge known as \textit{the sample b…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 12 pages, 8 figures, conference

  15. arXiv:2407.09514  [pdf]

    cond-mat.mtrl-sci cs.LG physics.app-ph

    Machine Learning Based Prediction of Proton Conductivity in Metal-Organic Frameworks

    Authors: Seunghee Han, Byeong Gwan Lee, Dae Woon Lim, Jihan Kim

    Abstract: Recently, metal-organic frameworks (MOFs) have demonstrated their potential as solid-state electrolytes in proton exchange membrane fuel cells. However, the number of MOFs reported to exhibit proton conductivity remains limited, and the mechanisms underlying this phenomenon are not fully elucidated, complicating the design of proton-conductive MOFs. In response, we developed a comprehensive databa…

    Submitted 17 July, 2024; v1 submitted 18 June, 2024; originally announced July 2024.

  16. arXiv:2407.06682  [pdf, other]

    cs.LG cs.AI

    A Predictive Model Based on Transformer with Statistical Feature Embedding in Manufacturing Sensor Dataset

    Authors: Gyeong Taek Lee, Oh-Ran Kwon

    Abstract: In the manufacturing process, sensor data collected from equipment is crucial for building predictive models to manage processes and improve productivity. However, in the field, it is challenging to gather sufficient data to build robust models. This study proposes a novel predictive model based on the Transformer, utilizing statistical feature embedding and window positional encoding. Statistical…

    Submitted 9 July, 2024; originally announced July 2024.

  17. arXiv:2407.03741  [pdf, other]

    cs.IT

    A Unified Expression for Upper Bounds on the BLER of Spinal Codes over Fading Channels

    Authors: Aimin Li, Xiaomeng Chen, Shaohua Wu, Gary C. F. Lee, Sumei Sun

    Abstract: Performance evaluation of particular channel coding has been a significant topic in coding theory, often involving the use of bounding techniques. This paper focuses on the new family of capacity-achieving codes, Spinal codes, to provide a comprehensive analysis framework to tightly upper bound the block error rate (BLER) of Spinal codes in the finite block length (FBL) regime. First, we resort to…

    Submitted 4 July, 2024; originally announced July 2024.

  18. arXiv:2407.02681  [pdf, other]

    cs.LG eess.IV math.OC stat.ML

    Uniform Transformation: Refining Latent Representation in Variational Autoencoders

    Authors: Ye Shi, C. S. George Lee

    Abstract: Irregular distribution in latent space causes posterior collapse, misalignment between posterior and prior, and ill-sampling problem in Variational Autoencoders (VAEs). In this paper, we introduce a novel adaptable three-stage Uniform Transformation (UT) module -- Gaussian Kernel Density Estimation (G-KDE) clustering, non-parametric Gaussian Mixture (GM) Modeling, and Probability Integral Transfor…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted by 2024 IEEE 20th International Conference on Automation Science and Engineering

  19. arXiv:2407.02245  [pdf, other]

    cs.RO cs.AI

    Safe CoR: A Dual-Expert Approach to Integrating Imitation Learning and Safe Reinforcement Learning Using Constraint Rewards

    Authors: Hyeokjin Kwon, Gunmin Lee, Junseo Lee, Songhwai Oh

    Abstract: In the realm of autonomous agents, ensuring safety and reliability in complex and dynamic environments remains a paramount challenge. Safe reinforcement learning addresses these concerns by introducing safety constraints, but still faces challenges in navigating intricate environments such as complex driving situations. To overcome these challenges, we present the safe constraint reward (Safe CoR)…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to the Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024

  20. arXiv:2407.00693  [pdf, other]

    cs.AI cs.CL cs.LG

    BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models

    Authors: Gihun Lee, Minchan Jeong, Yujin Kim, Hojung Jung, Jaehoon Oh, Sangmook Kim, Se-Young Yun

    Abstract: While learning to align Large Language Models (LLMs) with human preferences has shown remarkable success, aligning these models to meet the diverse user preferences presents further challenges in preserving previous knowledge. This paper examines the impact of personalized preference optimization on LLMs, revealing that the extent of knowledge loss varies significantly with preference heterogeneit…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: under review

  21. arXiv:2406.16341  [pdf, other]

    cs.CL

    EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

    Authors: Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi

    Abstract: Electronic Health Records (EHRs) are integral for storing comprehensive patient medical records, combining structured data (e.g., medications) with detailed clinical notes (e.g., physician notes). These elements are essential for straightforward data retrieval and provide deep, contextual insights into patient care. However, they often suffer from discrepancies due to unintuitive EHR system design…

    Submitted 24 June, 2024; originally announced June 2024.

  22. arXiv:2406.15723  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Acoustic Feature Mixup for Balanced Multi-aspect Pronunciation Assessment

    Authors: Heejin Do, Wonjun Lee, Gary Geunbae Lee

    Abstract: In automated pronunciation assessment, recent emphasis progressively lies on evaluating multiple aspects to provide enriched feedback. However, acquiring multi-aspect-score labeled data for non-native language learners' speech poses challenges; moreover, it often leads to score-imbalanced distributions. In this paper, we propose two Acoustic Feature Mixup strategies, linearly and non-linearly inte…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  23. arXiv:2406.13935  [pdf, other]

    eess.AS cs.AI cs.SD

    CONMOD: Controllable Neural Frame-based Modulation Effects

    Authors: Gyubin Lee, Hounsu Kim, Junwon Lee, Juhan Nam

    Abstract: Deep learning models have seen widespread use in modelling LFO-driven audio effects, such as phaser and flanger. Although existing neural architectures exhibit high-quality emulation of individual effects, they do not possess the capability to manipulate the output via control parameters. To address this issue, we introduce Controllable Neural Frame-based Modulation Effects (CONMOD), a single blac…

    Submitted 19 June, 2024; originally announced June 2024.

  24. arXiv:2406.11311  [pdf, other]

    cs.CV

    Syn-to-Real Unsupervised Domain Adaptation for Indoor 3D Object Detection

    Authors: Yunsong Wang, Na Zhao, Gim Hee Lee

    Abstract: The use of synthetic data in indoor 3D object detection offers the potential of greatly reducing the manual labor involved in 3D annotations and training effective zero-shot detectors. However, the complicated domain shifts across syn-to-real indoor datasets remains underexplored. In this paper, we propose a novel Object-wise Hierarchical Domain Alignment (OHDA) framework for syn-to-real unsupervi…

    Submitted 26 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  25. arXiv:2406.11283  [pdf, other]

    cs.CV

    Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding

    Authors: Yunsong Wang, Na Zhao, Gim Hee Lee

    Abstract: The field of self-supervised 3D representation learning has emerged as a promising solution to alleviate the challenge presented by the scarcity of extensive, well-annotated datasets. However, it continues to be hindered by the lack of diverse, large-scale, real-world 3D scene datasets for source data. To address this shortfall, we propose Generalizable Representation Learning (GRL), where we devi…

    Submitted 17 June, 2024; originally announced June 2024.

  26. arXiv:2406.10867  [pdf, other]

    cs.LG q-bio.BM

    Geometric-informed GFlowNets for Structure-Based Drug Design

    Authors: Grayson Lee, Tony Shen, Martin Ester

    Abstract: The rise of cost involved with drug discovery and current speed of which they are discover, underscore the need for more efficient structure-based drug design (SBDD) methods. We employ Generative Flow Networks (GFlowNets), to effectively explore the vast combinatorial space of drug-like molecules, which traditional virtual screening methods fail to cover. We introduce a novel modification to the G…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Accepted at MoML 2024 as Spotlight

  27. arXiv:2406.07800  [pdf, other]

    cs.LG cs.DC

    Regularizing and Aggregating Clients with Class Distribution for Personalized Federated Learning

    Authors: Gyuejeong Lee, Daeyoung Choi

    Abstract: Personalized federated learning (PFL) enables customized models for clients with varying data distributions. However, existing PFL methods often incur high computational and communication costs, limiting their practical application. This paper proposes a novel PFL method, Class-wise Federated Averaging (cwFedAVG), that performs Federated Averaging (FedAVG) class-wise, creating multiple global mode…

    Submitted 11 June, 2024; originally announced June 2024.

  28. arXiv:2406.07231  [pdf, other]

    cs.CL

    Decipherment-Aware Multilingual Learning in Jointly Trained Language Models

    Authors: Grandee Lee

    Abstract: The principle that governs unsupervised multilingual learning (UCL) in jointly trained language models (mBERT as a popular example) is still being debated. Many find it surprising that one can achieve UCL with multiple monolingual corpora. In this work, we anchor UCL in the context of language decipherment and show that the joint training methodology is a decipherment process pivotal for UCL. In a…

    Submitted 11 June, 2024; originally announced June 2024.

  29. arXiv:2406.06650  [pdf, other]

    eess.IV cs.CV

    Predicting the risk of early-stage breast cancer recurrence using H&E-stained tissue images

    Authors: Geongyu Lee, Joonho Lee, Tae-Yeong Kwak, Sun Woo Kim, Youngmee Kwon, Chungyeul Kim, Hyeyoon Chang

    Abstract: Accurate prediction of the likelihood of recurrence is important in the selection of postoperative treatment for patients with early-stage breast cancer. In this study, we investigated whether deep learning algorithms can predict patients' risk of recurrence by analyzing the pathology images of their cancer histology. A total of 125 hematoxylin and eosin stained breast cancer whole slide images la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

  30. arXiv:2406.06050  [pdf, other]

    cs.CV

    Generalizable Human Gaussians from Single-View Image

    Authors: Jinnan Chen, Chen Li, Jianfeng Zhang, Hanlin Chen, Buzhen Huang, Gim Hee Lee

    Abstract: In this work, we tackle the task of learning generalizable 3D human Gaussians from a single image. The main challenge for this task is to recover detailed geometry and appearance, especially for the unobserved regions. To this end, we propose single-view generalizable Human Gaussian model (HGM), a diffusion-guided framework for 3D human modeling from a single image. We design a diffusion-based coa…

    Submitted 10 June, 2024; originally announced June 2024.

  31. arXiv:2406.05774  [pdf, other]

    cs.CV

    VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction

    Authors: Hanlin Chen, Fangyin Wei, Chen Li, Tianxin Huang, Yunsong Wang, Gim Hee Lee

    Abstract: Although 3D Gaussian Splatting has been widely studied because of its realistic and efficient novel-view synthesis, it is still challenging to extract a high-quality surface from the point-based representation. Previous works improve the surface by incorporating geometric priors from the off-the-shelf normal estimator. However, there are two main limitations: 1) Supervising normal rendered from 3D…

    Submitted 9 June, 2024; originally announced June 2024.

  32. arXiv:2406.04625  [pdf, other]

    cs.CL cs.AI

    Key-Element-Informed sLLM Tuning for Document Summarization

    Authors: Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok

    Abstract: Remarkable advances in large language models (LLMs) have enabled high-quality text summarization. However, this capability is currently accessible only through LLMs of substantial size or proprietary LLMs with usage fees. In response, smaller-scale LLMs (sLLMs) of easy accessibility and low costs have been extensively studied, yet they often suffer from missing key information and entities, i.e.,…

    Submitted 25 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  33. arXiv:2406.02893  [pdf, other]

    cs.CL

    Language Model Can Do Knowledge Tracing: Simple but Effective Method to Integrate Language Model and Knowledge Tracing Task

    Authors: Unggi Lee, Jiyeong Bae, Dohee Kim, Sookbun Lee, Jaekwon Park, Taekyung Ahn, Gunho Lee, Damji Stratton, Hyeoncheol Kim

    Abstract: Knowledge Tracing (KT) is a critical task in online learning for modeling student knowledge over time. Despite the success of deep learning-based KT models, which rely on sequences of numbers as data, most existing approaches fail to leverage the rich semantic information in the text of questions and concepts. This paper proposes Language model-based Knowledge Tracing (LKT), a novel framework that…

    Submitted 9 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures, 3 tables

  34. arXiv:2406.00303  [pdf, other]

    cs.CL cs.AI

    Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning

    Authors: Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok

    Abstract: The evaluation of summary quality encompasses diverse dimensions such as consistency, coherence, relevance, and fluency. However, existing summarization methods often target a specific dimension, facing challenges in generating well-balanced summaries across multiple dimensions. In this paper, we propose multi-objective reinforcement learning tailored to generate balanced summaries across all four…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: ACL 2024

  35. arXiv:2406.00019  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR

    EHR-SeqSQL : A Sequential Text-to-SQL Dataset For Interactively Exploring Electronic Health Records

    Authors: Jaehee Ryu, Seonhee Cho, Gyubok Lee, Edward Choi

    Abstract: In this paper, we introduce EHR-SeqSQL, a novel sequential text-to-SQL dataset for Electronic Health Record (EHR) databases. EHR-SeqSQL is designed to address critical yet underexplored aspects in text-to-SQL parsing: interactivity, compositionality, and efficiency. To the best of our knowledge, EHR-SeqSQL is not only the largest but also the first medical text-to-SQL dataset benchmark to include…

    Submitted 30 July, 2024; v1 submitted 23 May, 2024; originally announced June 2024.

    Comments: ACL 2024 (Findings)

  36. arXiv:2405.19046  [pdf, other]

    cs.IR

    Continual Collaborative Distillation for Recommender System

    Authors: Gyuseok Lee, SeongKu Kang, Wonbin Kweon, Hwanjo Yu

    Abstract: Knowledge distillation (KD) has emerged as a promising technique for addressing the computational challenges associated with deploying large-scale recommender systems. KD transfers the knowledge of a massive teacher system to a compact student model, to reduce the huge computational burdens for inference while retaining high accuracy. The existing KD studies primarily focus on one-time distillatio…

    Submitted 25 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024 research track. 9 main pages + 1 appendix page, 5 figures

  37. arXiv:2405.17958  [pdf, other]

    cs.CV

    FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes

    Authors: Yunsong Wang, Tianxin Huang, Hanlin Chen, Gim Hee Lee

    Abstract: Empowering 3D Gaussian Splatting with generalization ability is appealing. However, existing generalizable 3D Gaussian Splatting methods are largely confined to narrow-range interpolation between stereo images due to their heavy backbones, thus lacking the ability to accurately localize 3D Gaussian and support free-view synthesis across wide view range. In this paper, we present a novel framework…

    Submitted 9 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  38. arXiv:2405.13943  [pdf, other]

    cs.CV

    DoGaussian: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus

    Authors: Yu Chen, Gim Hee Lee

    Abstract: The recent advances in 3D Gaussian Splatting (3DGS) show promising results on the novel view synthesis (NVS) task. With its superior rendering performance and high-fidelity rendering quality, 3DGS is excelling at its previous NeRF counterparts. The most recent 3DGS method focuses either on improving the instability of rendering efficiency or reducing the model size. On the other hand, the training…

    Submitted 22 May, 2024; originally announced May 2024.

  39. arXiv:2405.12900  [pdf, other]

    cs.CL cs.AI

    Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents

    Authors: San Kim, Gary Geunbae Lee

    Abstract: Recent advancements in open-domain dialogue systems have been propelled by the emergence of high-quality large language models (LLMs) and various effective training methodologies. Nevertheless, the presence of toxicity within these models presents a significant challenge that can potentially diminish the user experience. In this study, we introduce an innovative training algorithm, an improvement…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 15 pages, 7 figures, accepted to NAACL findings 2024

    ACM Class: I.2.7

  40. arXiv:2405.07520  [pdf, ps, other]

    cs.CV

    Dehazing Remote Sensing and UAV Imagery: A Review of Deep Learning, Prior-based, and Hybrid Approaches

    Authors: Gao Yu Lee, Jinkuan Chen, Tanmoy Dam, Md Meftahul Ferdaus, Daniel Puiu Poenar, Vu N Duong

    Abstract: High-quality images are crucial in remote sensing and UAV applications, but atmospheric haze can severely degrade image quality, making image dehazing a critical research area. Since the introduction of deep convolutional neural networks, numerous approaches have been proposed, and even more have emerged with the development of vision transformers and contrastive/few-shot learning. Simultaneously,…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Submitted to journal and under review, once the paper is accepted, the copyright will be transferred to the corresponding journal

  41. arXiv:2405.07163  [pdf, other]

    physics.ed-ph cs.AI

    Realizing Visual Question Answering for Education: GPT-4V as a Multimodal AI

    Authors: Gyeong-Geon Lee, Xiaoming Zhai

    Abstract: Educational scholars have analyzed various image data acquired from teaching and learning situations, such as photos that shows classroom dynamics, students' drawings with regard to the learning content, textbook illustrations, etc. Unquestioningly, most qualitative analysis of and explanation on image data have been conducted by human researchers, without machine-based automation. It was partiall…

    Submitted 12 May, 2024; originally announced May 2024.

  42. arXiv:2405.06673  [pdf, other]

    cs.CL cs.AI

    Overview of the EHRSQL 2024 Shared Task on Reliable Text-to-SQL Modeling on Electronic Health Records

    Authors: Gyubok Lee, Sunjun Kweon, Seongsu Bae, Edward Choi

    Abstract: Electronic Health Records (EHRs) are relational databases that store the entire medical histories of patients within hospitals. They record numerous aspects of patients' medical care, from hospital admission and diagnosis to treatment and discharge. While EHRs are vital sources of clinical data, exploring them beyond a predefined set of queries requires skills in query languages like SQL. To make…

    Submitted 23 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: The 6th Clinical Natural Language Processing Workshop at NAACL 2024; Minor Change from Camera-Ready

  43. arXiv:2405.02042  [pdf, other]

    cs.IT

    Sampling to Achieve the Goal: An Age-aware Remote Markov Decision Process

    Authors: Aimin Li, Shaohua Wu, Gary C. F. Lee, Xiaomeng Cheng, Sumei Sun

    Abstract: Age of Information (AoI) has been recognized as an important metric to measure the freshness of information. Central to this consensus is that minimizing AoI can enhance the freshness of information, thereby facilitating the accuracy of subsequent decision-making processes. However, to date the direct causal relationship that links AoI to the utility of the decision-making process is unexplored. T…

    Submitted 8 August, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: 12 pages, 4 figures

  44. arXiv:2405.01884  [pdf, other]

    cs.CL

    Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction

    Authors: Wanlong Liu, Li Zhou, Dingyi Zeng, Yichen Xiao, Shaohuan Cheng, Chen Zhang, Grandee Lee, Malu Zhang, Wenyu Chen

    Abstract: Recent mainstream event argument extraction methods process each event in isolation, resulting in inefficient inference and ignoring the correlations among multiple events. To address these limitations, here we propose a multiple-event argument extraction model DEEIA (Dependency-guided Encoding and Event-specific Information Aggregation), capable of extracting arguments from all events within a do…

    Submitted 16 June, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted to Findings of ACL 2024

  45. arXiv:2405.01588  [pdf, other]

    cs.CL cs.AI

    Towards Unbiased Evaluation of Detecting Unanswerable Questions in EHRSQL

    Authors: Yongjin Yang, Sihyeon Kim, SangMook Kim, Gyubok Lee, Se-Young Yun, Edward Choi

    Abstract: Incorporating unanswerable questions into EHR QA systems is crucial for testing the trustworthiness of a system, as providing non-existent responses can mislead doctors in their diagnoses. The EHRSQL dataset stands out as a promising benchmark because it is the only dataset that incorporates unanswerable questions in the EHR QA system alongside practical questions. However, in this work, we identi…

    Submitted 28 April, 2024; originally announced May 2024.

    Comments: DPFM Workshop, ICLR 2024

  46. arXiv:2404.14329  [pdf, other]

    cs.CV

    X-Ray: A Sequential 3D Representation For Generation

    Authors: Tao Hu, Wenhang Ge, Yuyang Zhao, Gim Hee Lee

    Abstract: We introduce X-Ray, a novel 3D sequential representation inspired by the penetrability of x-ray scans. X-Ray transforms a 3D object into a series of surface frames at different layers, making it suitable for generating 3D models from images. Our method utilizes ray casting from the camera center to capture geometric and textured details, including depth, normal, and color, across all intersected s…

    Submitted 1 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  47. arXiv:2404.11291  [pdf, other]

    cs.CV

    Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

    Authors: Buzhen Huang, Chen Li, Chongyang Xu, Liang Pan, Yangang Wang, Gim Hee Lee

    Abstract: Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration, but overlook the modeling of close interactions. In this work, we tackle the task of reconstructing closely interactive humans from a monocular video. The main challenge of this task comes from insufficient visual information caused by depth ambiguity and severe inter-person occ…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: CVPR2024

  48. arXiv:2404.06814  [pdf, other]

    cs.CV

    Zero-shot Point Cloud Completion Via 2D Priors

    Authors: Tianxin Huang, Zhiwen Yan, Yuyang Zhao, Gim Hee Lee

    Abstract: 3D point cloud completion is designed to recover complete shapes from partially observed point clouds. Conventional completion methods typically depend on extensive point cloud data for training, with their effectiveness often constrained to object categories similar to those seen during training. In contrast, we propose a zero-shot framework aimed at completing partially observed point clouds a…

    Submitted 10 April, 2024; originally announced April 2024.

  49. arXiv:2404.02592  [pdf]

    cs.CL cs.SD eess.AS

    Leveraging the Interplay Between Syntactic and Acoustic Cues for Optimizing Korean TTS Pause Formation

    Authors: Yejin Jeon, Yunsu Kim, Gary Geunbae Lee

    Abstract: Contemporary neural speech synthesis models have indeed demonstrated remarkable proficiency in synthetic speech generation as they have attained a level of quality comparable to that of human-produced speech. Nevertheless, it is important to note that these achievements have predominantly been verified within the context of high-resource languages such as English. Furthermore, the Tacotron and Fas…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted to LREC-COLING 2024

  50. arXiv:2404.02157  [pdf, other]

    cs.CV cs.AI

    Segment Any 3D Object with Language

    Authors: Seungjun Lee, Yuyang Zhao, Gim Hee Lee

    Abstract: In this paper, we investigate Open-Vocabulary 3D Instance Segmentation (OV-3DIS) with free-form language instructions. Earlier works that rely on only annotated base categories for training suffer from limited generalization to unseen novel categories. Recent works mitigate poor generalizability to novel categories by generating class-agnostic masks or projecting generalized masks from 2D to 3D, b…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Project Page: https://cvrp-sole.github.io