Showing 1–50 of 356 results for author: Park, C

Searching in archive cs.
  1. arXiv:2409.03257  [pdf, other]

    cs.CL cs.AI

    Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard

    Authors: Chanjun Park, Hyeonwoo Kim

    Abstract: This paper conducts a longitudinal study over eleven months to address the limitations of prior research on the Open Ko-LLM Leaderboard, which have relied on empirical studies with restricted observation periods of only five months. By extending the analysis duration, we aim to provide a more comprehensive understanding of the progression in developing Korean large language models (LLMs). Our stud…

    Submitted 5 September, 2024; originally announced September 2024.

  2. arXiv:2409.02361  [pdf, other]

    cs.CL

    Diversify-verify-adapt: Efficient and Robust Retrieval-Augmented Ambiguous Question Answering

    Authors: Yeonjun In, Sungchul Kim, Ryan A. Rossi, Md Mehrab Tanjim, Tong Yu, Ritwik Sinha, Chanyoung Park

    Abstract: The retrieval augmented generation (RAG) framework addresses an ambiguity in user queries in QA systems by retrieving passages that cover all plausible interpretations and generating comprehensive responses based on the passages. However, our preliminary studies reveal that a single retrieval process often suffers from low quality results, as the retrieved passages frequently fail to capture all p…

    Submitted 3 September, 2024; originally announced September 2024.

  3. arXiv:2409.00913  [pdf, other]

    math.OC cs.LG

    Generalized Continuous-Time Models for Nesterov's Accelerated Gradient Methods

    Authors: Chanwoong Park, Youngchae Cho, Insoon Yang

    Abstract: Recent research has indicated a substantial rise in interest in understanding Nesterov's accelerated gradient methods via their continuous-time models. However, most existing studies focus on specific classes of Nesterov's methods, which hinders the attainment of an in-depth understanding and a unified perspective. To address this deficit, we present generalized continuous-time models that cover a…

    Submitted 1 September, 2024; originally announced September 2024.

  4. arXiv:2409.00329  [pdf, other]

    cs.CE

    Convolutional Hierarchical Deep Learning Neural Networks-Tensor Decomposition (C-HiDeNN-TD): a scalable surrogate modeling approach for large-scale physical systems

    Authors: Jiachen Guo, Chanwook Park, Xiaoyu Xie, Zhongsheng Sang, Gregory J. Wagner, Wing Kam Liu

    Abstract: A common trend in simulation-driven engineering applications is the ever-increasing size and complexity of the problem, where classical numerical methods typically suffer from significant computational time and huge memory cost. Methods based on artificial intelligence have been extensively investigated to accelerate partial differential equations (PDE) solvers using data-driven surrogates. Howeve…

    Submitted 30 August, 2024; originally announced September 2024.

  5. AVIN-Chat: An Audio-Visual Interactive Chatbot System with Emotional State Tuning

    Authors: Chanhyuk Park, Jungbin Cho, Junwan Kim, Seongmin Lee, Jungsu Kim, Sanghoon Lee

    Abstract: This work presents an audio-visual interactive chatbot (AVIN-Chat) system that allows users to have face-to-face conversations with 3D avatars in real-time. Compared to the previous chatbot services, which provide text-only or speech-only communications, the proposed AVIN-Chat can offer audio-visual communications providing users with a superior experience quality. In addition, the proposed AVIN-C…

    Submitted 15 August, 2024; originally announced September 2024.

    Journal ref: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI) Demo Track. 8763-8766 (2024)

  6. arXiv:2408.13482  [pdf, other]

    cs.LG cs.AI

    MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning

    Authors: Seungbeom Hu, ChanJun Park, Andrew Ferraiuolo, Sang-Ki Ko, Jinwoo Kim, Haein Song, Jieung Kim

    Abstract: Determining the optimal size of a neural network is critical, as it directly impacts runtime performance and memory usage. Pruning is a well-established model compression technique that reduces the size of neural networks while mathematically guaranteeing accuracy preservation. However, many recent pruning methods overlook the global contributions of individual model components, making it difficul…

    Submitted 2 September, 2024; v1 submitted 24 August, 2024; originally announced August 2024.

  7. arXiv:2408.13467  [pdf, other]

    cs.LG cs.AI cs.DC

    LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs

    Authors: Chansung Park, Juyong Jiang, Fan Wang, Sayak Paul, Jing Tang

    Abstract: The widespread adoption of cloud-based proprietary large language models (LLMs) has introduced significant challenges, including operational dependencies, privacy concerns, and the necessity of continuous internet connectivity. In this work, we introduce an LLMOps pipeline, "LlamaDuo", for the seamless migration of knowledge and abilities from service-oriented LLMs to smaller, locally manageable m…

    Submitted 28 August, 2024; v1 submitted 24 August, 2024; originally announced August 2024.

    Comments: 28 pages, 18 figures, 6 tables

  8. arXiv:2408.12772  [pdf, other]

    cs.CV cs.AI

    Symmetric masking strategy enhances the performance of Masked Image Modeling

    Authors: Khanh-Binh Nguyen, Chae Jung Park

    Abstract: Masked Image Modeling (MIM) is a technique in self-supervised learning that focuses on acquiring detailed visual representations from unlabeled images by estimating the missing pixels in randomly masked sections. It has proven to be a powerful tool for the preliminary training of Vision Transformers (ViTs), yielding impressive results across various tasks. Nevertheless, most MIM methods heavily de…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Accepted at ICPR 2024

  9. arXiv:2408.10593  [pdf, other]

    cs.CL cs.CV

    An Efficient Sign Language Translation Using Spatial Configuration and Motion Dynamics with LLMs

    Authors: Eui Jun Hwang, Sukmin Cho, Junmyeong Lee, Jong C. Park

    Abstract: Gloss-free Sign Language Translation (SLT) converts sign videos directly into spoken language sentences without relying on glosses. Recently, Large Language Models (LLMs) have shown remarkable translation performance in gloss-free methods by harnessing their powerful natural language generation capabilities. However, these methods often rely on domain-specific fine-tuning of visual encoders to ach…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Under Review

  10. arXiv:2408.08270  [pdf, other]

    cs.CV

    HeightLane: BEV Heightmap guided 3D Lane Detection

    Authors: Chaesong Park, Eunbin Seo, Jongwoo Lim

    Abstract: Accurate 3D lane detection from monocular images presents significant challenges due to depth ambiguity and imperfect ground modeling. Previous attempts to model the ground have often used a planar ground assumption with limited degrees of freedom, making them unsuitable for complex road environments with varying slopes. Our study introduces HeightLane, an innovative method that predicts a height…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: 10 pages, 6 figures, 5 tables

  11. arXiv:2408.01084  [pdf, other]

    cs.CL

    Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts

    Authors: Youna Kim, Hyuhng Joon Kim, Cheonbok Park, Choonghyun Park, Hyunsoo Cho, Junyeob Kim, Kang Min Yoo, Sang-goo Lee, Taeuk Kim

    Abstract: When using large language models (LLMs) in knowledge-intensive tasks, such as open-domain question answering, external context can bridge a gap between external knowledge and LLM's parametric knowledge. Recent research has been developed to amplify contextual knowledge over the parametric knowledge of LLM with contrastive decoding approaches. While these approaches could yield truthful responses w…

    Submitted 2 August, 2024; originally announced August 2024.

  12. arXiv:2408.00973  [pdf, other]

    stat.ML cs.LG math.ST

    META-ANOVA: Screening interactions for interpretable machine learning

    Authors: Yongchan Choi, Seokhun Park, Chanmoo Park, Dongha Kim, Yongdai Kim

    Abstract: There are two things to be considered when we evaluate predictive models. One is prediction accuracy, and the other is interpretability. Over the recent decades, many prediction models of high performance, such as ensemble-based models and deep neural networks, have been developed. However, these models are often too complex, making it difficult to intuitively interpret their predictions. This comp…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 26 pages

  13. arXiv:2407.19681  [pdf, other]

    cs.RO cs.AI

    Motion Manifold Flow Primitives for Language-Guided Trajectory Generation

    Authors: Yonghyeon Lee, Byeongho Lee, Seungyeon Kim, Frank C. Park

    Abstract: Developing text-based robot trajectory generation models is made particularly difficult by the small dataset size, high dimensionality of the trajectory space, and the inherent complexity of the text-conditional motion distribution. Recent manifold learning-based methods have partially addressed the dimensionality and dataset size issues, but struggle with the complex text-conditional distribution…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: 12 pages, 10 figures, under review

  14. arXiv:2407.19193  [pdf]

    cs.LG cs.AI cs.CR cs.DC

    A collaborative ensemble construction method for federated random forest

    Authors: Penjan Antonio Eng Lim, Cheong Hee Park

    Abstract: Random forests are considered a cornerstone in machine learning for their robustness and versatility. Despite these strengths, their conventional centralized training is ill-suited for the modern landscape of data that is often distributed, sensitive, and subject to privacy concerns. Federated learning (FL) provides a compelling solution to this problem, enabling models to be trained across a grou…

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: This is the authors' accepted manuscript of an article published in the journal Expert Systems With Applications. Published version available at: https://www.sciencedirect.com/science/article/pii/S0957417424016099. 22 pages, 3 figures

    MSC Class: 68T05 (Primary); 68W40; 62H30 (Secondary) ACM Class: I.2.6; I.2.11; K.4.1

    Journal ref: Expert Systems with Applications, Volume 255, 2024, Article 124742

  15. Debiased Graph Poisoning Attack via Contrastive Surrogate Objective

    Authors: Kanghoon Yoon, Yeonjun In, Namkyeong Lee, Kibum Kim, Chanyoung Park

    Abstract: Graph neural networks (GNN) are vulnerable to adversarial attacks, which aim to degrade the performance of GNNs through imperceptible changes on the graph. However, we find that in fact the prevalent meta-gradient-based attacks, which utilizes the gradient of the loss w.r.t the adjacency matrix, are biased towards training nodes. That is, their meta-gradient is determined by a training procedure o…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 9 pages. Proceeding ACM International Conference on Information and Knowledge Management (CIKM 2024) Proceeding

  16. arXiv:2407.18180  [pdf]

    physics.bio-ph cs.RO

    Passive wing deployment and retraction in beetles and flapping microrobots

    Authors: Hoang-Vu Phan, Hoon Cheol Park, Dario Floreano

    Abstract: Birds, bats and many insects can tuck their wings against their bodies at rest and deploy them to power flight. Whereas birds and bats use well-developed pectoral and wing muscles and tendons, how insects control these movements remains unclear, as mechanisms of wing deployment and retraction vary among insect species. Beetles (Coleoptera) display one of the most complex wing mechanisms. For examp…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 20 pages, 10 figures

    Journal ref: Nature 632 (2024) 1-6

  17. arXiv:2407.17857  [pdf, other]

    cs.CV cs.AI

    Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network

    Authors: Sukwon Yun, Jie Peng, Alexandro E. Trevino, Chanyoung Park, Tianlong Chen

    Abstract: Recent advancements in graph-based approaches for multiplexed immunofluorescence (mIF) images have significantly propelled the field forward, offering deeper insights into patient-level phenotyping. However, current graph-based methodologies encounter two primary challenges: (1) Cellular Heterogeneity, where existing approaches fail to adequately address the inductive biases inherent in graphs, pa…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  18. arXiv:2407.16427  [pdf, other]

    cond-mat.dis-nn cs.LG hep-lat

    Stochastic weight matrix dynamics during learning and Dyson Brownian motion

    Authors: Gert Aarts, Biagio Lucini, Chanju Park

    Abstract: We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the mini-batch size, providing more robust evidence to a previously conjectured scaling relationship. We discuss universal and non-uni…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 17 pages, 16 figures

  19. arXiv:2407.15396  [pdf, other]

    cs.CV cs.AI

    Semantic Diversity-aware Prototype-based Learning for Unbiased Scene Graph Generation

    Authors: Jaehyeong Jeon, Kibum Kim, Kanghoon Yoon, Chanyoung Park

    Abstract: The scene graph generation (SGG) task involves detecting objects within an image and predicting predicates that represent the relationships between the objects. However, in SGG benchmark datasets, each subject-object pair is annotated with a single predicate even though a single predicate may exhibit diverse semantics (i.e., semantic diversity), existing SGG models are trained to predict the one a…

    Submitted 25 July, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  20. arXiv:2407.13166  [pdf, other]

    cs.HC cs.IR

    Using LLMs to Investigate Correlations of Conversational Follow-up Queries with User Satisfaction

    Authors: Hyunwoo Kim, Yoonseo Choi, Taehyun Yang, Honggu Lee, Chaneon Park, Yongju Lee, Jin Young Kim, Juho Kim

    Abstract: With large language models (LLMs), conversational search engines shift how users retrieve information from the web by enabling natural conversations to express their search intents over multiple turns. Users' natural conversation embodies rich but implicit signals of users' search intents and evaluation of search results to understand user experience with the system. However, it is underexplored h…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted to LLM4Eval @ SIGIR 2024 - The First Workshop on Large Language Models (LLMs) for Evaluation in Information Retrieval

  21. arXiv:2407.11245  [pdf, other]

    cs.IR cs.AI

    Pacer and Runner: Cooperative Learning Framework between Single- and Cross-Domain Sequential Recommendation

    Authors: Chung Park, Taesan Kim, Hyungjun Yoon, Junui Hong, Yelim Yu, Mincheol Cho, Minsung Choi, Jaegul Choo

    Abstract: Cross-Domain Sequential Recommendation (CDSR) improves recommendation performance by utilizing information from multiple domains, which contrasts with Single-Domain Sequential Recommendation (SDSR) that relies on a historical interaction within a specific domain. However, CDSR may underperform compared to the SDSR approach in certain domains due to negative transfer, which occurs when there is a l…

    Submitted 24 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted at SIGIR'24 (Best Paper Honorable Mention)

  22. arXiv:2407.10733  [pdf, other]

    cs.CV

    Joint-Embedding Predictive Architecture for Self-Supervised Learning of Mask Classification Architecture

    Authors: Dong-Hee Kim, Sungduk Cho, Hyeonwoo Cho, Chanmin Park, Jinyoung Kim, Won Hwa Kim

    Abstract: In this work, we introduce Mask-JEPA, a self-supervised learning framework tailored for mask classification architectures (MCA), to overcome the traditional constraints associated with training segmentation models. Mask-JEPA combines a Joint Embedding Predictive Architecture with MCA to adeptly capture intricate semantics and precise object boundaries. Our approach addresses two critical challenge…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 27 pages, 5 figures

  23. arXiv:2407.09043  [pdf, other]

    cs.AI

    Vision Language Model is NOT All You Need: Augmentation Strategies for Molecule Language Models

    Authors: Namkyeong Lee, Siddhartha Laghuvarapu, Chanyoung Park, Jimeng Sun

    Abstract: Recently, there has been a growing interest among researchers in understanding molecules and their textual descriptions through molecule language models (MoLM). However, despite some early promising developments, the advancement of MoLM still trails significantly behind that of vision language models (VLM). This is because unique challenges exist apart from VLM in the field of MoLM due to 1) a lim…

    Submitted 23 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: CIKM 2024 / ACL 2024 Workshop on Languages and Molecule

  24. arXiv:2407.03627  [pdf, other]

    cs.CL

    DSLR: Document Refinement with Sentence-Level Re-ranking and Reconstruction to Enhance Retrieval-Augmented Generation

    Authors: Taeho Hwang, Soyeong Jeong, Sukmin Cho, SeungYoon Han, Jong C. Park

    Abstract: Recent advancements in Large Language Models (LLMs) have significantly improved their performance across various Natural Language Processing (NLP) tasks. However, LLMs still struggle with generating non-factual responses due to limitations in their parametric memory. Retrieval-Augmented Generation (RAG) systems address this issue by incorporating external knowledge with a retrieval module. Despite…

    Submitted 20 August, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: 20 pages

    Journal ref: KnowledgeNLP@ACL 2024

  25. arXiv:2407.02854  [pdf, other]

    cs.CL cs.CV

    Universal Gloss-level Representation for Gloss-free Sign Language Translation and Production

    Authors: Eui Jun Hwang, Sukmin Cho, Huije Lee, Youngwoo Yoon, Jong C. Park

    Abstract: Sign language, essential for the deaf and hard-of-hearing, presents unique challenges in translation and production due to its multimodal nature and the inherent ambiguity in mapping sign language motion to spoken language words. Previous methods often rely on gloss annotations, requiring time-intensive labor and specialized expertise in sign language. Gloss-free methods have emerged to address th…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 14 pages, 5 figures

  26. arXiv:2407.02472  [pdf, other]

    cs.CL

    ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions

    Authors: Chan Young Park, Shuyue Stella Li, Hayoung Jung, Svitlana Volkova, Tanushree Mitra, David Jurgens, Yulia Tsvetkov

    Abstract: This study introduces ValueScope, a framework leveraging language models to quantify social norms and values within online communities, grounded in social science perspectives on normative structures. We employ ValueScope to dissect and analyze linguistic and stylistic expressions across 13 Reddit communities categorized under gender, politics, science, and finance. Our analysis provides a quantit…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: First three authors contributed equally. 33 pages. In submission

  27. arXiv:2407.02004  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    SAVE: Segment Audio-Visual Easy way using Segment Anything Model

    Authors: Khanh-Binh Nguyen, Chae Jung Park

    Abstract: The primary aim of Audio-Visual Segmentation (AVS) is to precisely identify and locate auditory elements within visual scenes by accurately predicting segmentation masks at the pixel level. Achieving this involves comprehensively considering data and model aspects to address this task effectively. This study presents a lightweight approach, SAVE, which efficiently adapts the pre-trained segment an…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  28. arXiv:2407.00626  [pdf, other]

    cs.LG cs.AI

    Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

    Authors: Sangwoong Yoon, Himchan Hwang, Dohyun Kwon, Yung-Kyun Noh, Frank C. Park

    Abstract: We present a maximum entropy inverse reinforcement learning (IRL) approach for improving the sample quality of diffusion generative models, especially when the number of generation time steps is small. Similar to how IRL trains a policy based on the reward function learned from expert demonstrations, we train (or fine-tune) a diffusion model using the log probability density estimated from trainin…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Code is released at https://github.com/swyoon/Diffusion-by-MaxEntIRL

  29. arXiv:2406.19370  [pdf, other]

    cs.LG cs.AI

    Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space

    Authors: Core Francisco Park, Maya Okawa, Andrew Lee, Ekdeep Singh Lubana, Hidenori Tanaka

    Abstract: Modern generative models demonstrate impressive capabilities, likely stemming from an ability to identify and manipulate abstract concepts underlying their training data. However, fundamental questions remain: what determines the concepts a model learns, the order in which it learns them, and its ability to manipulate those concepts? To address these questions, we propose analyzing a model's learn…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Preprint

  30. arXiv:2406.16469  [pdf, other]

    cs.CL cs.CV

    Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration

    Authors: Yujin Baek, ChaeHun Park, Jaeseok Kim, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

    Abstract: To create culturally inclusive vision-language models (VLMs), the foremost requirement is developing a test benchmark that can diagnose the models' ability to respond to questions reflecting cultural elements. This paper addresses the necessity for such benchmarks, noting that existing research has relied on human annotators' manual efforts, which impedes diversity and efficiency. We propose a sem…

    Submitted 24 June, 2024; originally announced June 2024.

  31. arXiv:2406.16275  [pdf, other]

    cs.CL

    Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection

    Authors: Choonghyun Park, Hyuhng Joon Kim, Junyeob Kim, Youna Kim, Taeuk Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-goo Lee, Kang Min Yoo

    Abstract: AI Generated Text (AIGT) detectors are developed with texts from humans and LLMs of common tasks. Despite the diversity of plausible prompt choices, these datasets are generally constructed with a limited number of prompts. The lack of prompt variation can introduce prompt-specific shortcut features that exist in data collected with the chosen prompt, but do not generalize to others. In this paper…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 13 tables, under review

  32. arXiv:2406.16013  [pdf, other]

    cs.CL cs.AI cs.IR

    Database-Augmented Query Representation for Information Retrieval

    Authors: Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

    Abstract: Information retrieval models that aim to search for the documents relevant to the given query have shown many successes, which have been applied to diverse tasks. However, the query provided by the user is oftentimes very short, which challenges the retrievers to correctly fetch relevant documents. To tackle this, existing studies have proposed expanding the query with a couple of additional (user…

    Submitted 23 June, 2024; originally announced June 2024.

  33. arXiv:2406.15951  [pdf, other]

    cs.CL

    Modular Pluralism: Pluralistic Alignment via Multi-LLM Collaboration

    Authors: Shangbin Feng, Taylor Sorensen, Yuhan Liu, Jillian Fisher, Chan Young Park, Yejin Choi, Yulia Tsvetkov

    Abstract: While existing alignment paradigms have been integral in developing large language models (LLMs), LLMs often learn an averaged human preference and struggle to model diverse preferences across cultures, demographics, and communities. We propose Modular Pluralism, a modular framework based on multi-LLM collaboration for pluralistic alignment: it "plugs into" a base LLM a pool of smaller but special…

    Submitted 22 June, 2024; originally announced June 2024.

  34. arXiv:2406.14277  [pdf, other]

    cs.CL cs.AI

    Augmenting Query and Passage for Retrieval-Augmented Generation using LLMs for Open-Domain Question Answering

    Authors: Minsang Kim, Cheoneum Park, Seungjun Baek

    Abstract: Retrieval-augmented generation (RAG) has received much attention for Open-domain question-answering (ODQA) tasks as a means to compensate for the parametric knowledge of large language models (LLMs). While previous approaches focused on processing retrieved passages to remove irrelevant context, they still rely heavily on the quality of retrieved passages which can degrade if the question is ambig…

    Submitted 20 June, 2024; originally announced June 2024.

  35. arXiv:2406.13214  [pdf, other]

    cs.LG

    Self-Explainable Temporal Graph Networks based on Graph Information Bottleneck

    Authors: Sangwoo Seo, Sungwon Kim, Jihyeong Jung, Yoonho Lee, Chanyoung Park

    Abstract: Temporal Graph Neural Networks (TGNN) have the ability to capture both the graph topology and dynamic dependencies of interactions within a graph over time. There has been a growing need to explain the predictions of TGNN models due to the difficulty in identifying how past events influence their predictions. Since the explanation model for a static graph cannot be readily applied to temporal grap…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: KDD 2024

  36. arXiv:2406.12904  [pdf, other]

    cs.LG physics.comp-ph physics.optics

    Meent: Differentiable Electromagnetic Simulator for Machine Learning

    Authors: Yongha Kim, Anthony W. Jung, Sanmun Kim, Kevin Octavian, Doyoung Heo, Chaejin Park, Jeongmin Shin, Sunghyun Nam, Chanhyung Park, Juho Park, Sangjun Han, Jinmyoung Lee, Seolho Kim, Min Seok Jang, Chan Y. Park

    Abstract: Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures such as solar cells, semiconductor devices, image sensors, future displays and integrated photonic devices. Specifically, optics problems such as estimating semiconductor device structures and designing nanophotonic devices provide intriguing research topics with far-reachin…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: under review

  37. arXiv:2406.12319  [pdf, other]

    cs.CL

    PRePair: Pointwise Reasoning Enhance Pairwise Evaluating for Robust Instruction-Following Assessments

    Authors: Hawon Jeong, ChaeHun Park, Jimin Hong, Jaegul Choo

    Abstract: Pairwise evaluation using large language models (LLMs) is widely used for evaluating natural language generation (NLG) tasks. However, the reliability of LLMs is often compromised by biases, such as favoring verbosity and authoritative tone. In the study, we focus on the comparison of two LLM-based evaluation approaches, pointwise and pairwise. Our findings demonstrate that pointwise evaluators ex…

    Submitted 18 June, 2024; originally announced June 2024.

  38. arXiv:2406.12307  [pdf, other]

    cs.CL

    Can Tool-augmented Large Language Models be Aware of Incomplete Conditions?

    Authors: Seungbin Yang, ChaeHun Park, Taehee Kim, Jaegul Choo

    Abstract: Recent advancements in integrating large language models (LLMs) with tools have allowed the models to interact with real-world environments. However, these tool-augmented LLMs often encounter incomplete scenarios when users provide partial information or the necessary tools are unavailable. Recognizing and managing such scenarios is crucial for LLMs to ensure their reliability, but this exploratio…

    Submitted 18 June, 2024; originally announced June 2024.

  39. arXiv:2406.11313  [pdf, other]

    cs.CV

    Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection

    Authors: Yecheol Kim, Junho Lee, Changsoo Park, Hyoung won Kim, Inho Lim, Christopher Chang, Jun Won Choi

    Abstract: 3D object detection is crucial for applications like autonomous driving and robotics. However, in real-world environments, variations in sensor data distribution due to sensor upgrades, weather changes, and geographic differences can adversely affect detection performance. Semi-Supervised Domain Adaptation (SSDA) aims to mitigate these challenges by transferring knowledge from a source domain, abu…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to IEEE Transactions on Intelligent Vehicles (T-IV). The code is available at: https://github.com/rasd3/TODA

  40. arXiv:2406.09719  [pdf, other]

    cs.CL cs.AI

    Self-Knowledge Distillation for Learning Ambiguity

    Authors: Hancheol Park, Soyeong Jeong, Sukmin Cho, Jong C. Park

    Abstract: Recent language models have shown remarkable performance on natural language understanding (NLU) tasks. However, they are often sub-optimal when faced with ambiguous samples that can be interpreted in multiple ways, over-confidently predicting a single label without consideration for its correctness. To address this issue, we propose a novel self-knowledge distillation method that enables models t…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures

  41. arXiv:2406.04064  [pdf, other]

    cs.CL cs.AI cs.CY

    Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models

    Authors: Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, Jong C. Park

    Abstract: Social bias is shaped by the accumulation of social perceptions towards targets across various demographic identities. To fully understand such social bias in large language models (LLMs), it is essential to consider the composite of social perceptions from diverse perspectives among identities. Previous studies have either evaluated biases in LLMs by indirectly assessing the presence of sentiment…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Findings of ACL 2024

  42. arXiv:2406.03307  [pdf]

    math.NA cs.CE

    Multi-Patch Isogeometric Convolution Hierarchical Deep-learning Neural Network

    Authors: Lei Zhang, Chanwook Park, T. J. R. Hughes, Wing Kam Liu

    Abstract: A seamless integration of neural networks with Isogeometric Analysis (IGA) was first introduced in [1] under the name of Hierarchical Deep-learning Neural Network (HiDeNN) and has systematically evolved into Isogeometric Convolution HiDeNN (in short, C-IGA) [2]. C-IGA achieves higher order approximations without increasing the degree of freedom. Due to the Kronecker delta property of C-IGA shape f…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 30 pages, 15 figures in main text, additional 10 pages for appendix

  43. arXiv:2406.03202  [pdf, other]

    cs.CL cs.AI

    ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction

    Authors: Jeiyoon Park, Chanjun Park, Heuiseok Lim

    Abstract: We explore and improve the capabilities of LLMs to generate data for grammatical error correction (GEC). When merely producing parallel sentences, their patterns are too simplistic to be valuable as a corpus. To address this issue, we propose an automated framework that includes a Subject Selector, Grammar Selector, Prompt Manager, and Evaluator. Additionally, we introduce a new dataset for GEC ta…

    Submitted 11 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: preprint

  44. arXiv:2406.03140  [pdf, other]

    cs.LG

    Continual Traffic Forecasting via Mixture of Experts

    Authors: Sanghyun Lee, Chanyoung Park

    Abstract: The real-world traffic networks undergo expansion through the installation of new sensors, implying that the traffic patterns continually evolve over time. Incrementally training a model on the newly added sensors would make the model forget the past knowledge, i.e., catastrophic forgetting, while retraining the model on the entire network to capture these changes is highly inefficient. To address…

    Submitted 5 June, 2024; originally announced June 2024.

  45. arXiv:2406.02726  [pdf, other]

    cs.LG

    Temporal Graph Learning Recurrent Neural Network for Traffic Forecasting

    Authors: Sanghyun Lee, Chanyoung Park

    Abstract: Accurate traffic flow forecasting is a crucial research topic in transportation management. However, it is a challenging problem due to rapidly changing traffic conditions, high nonlinearity of traffic flow, and complex spatial and temporal correlations of road networks. Most existing studies either try to capture the spatial dependencies between roads using the same semantic graph over different…

    Submitted 4 June, 2024; originally announced June 2024.

  46. arXiv:2406.02331  [pdf, other]

    cs.CL

    Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering

    Authors: ChaeHun Park, Koanho Lee, Hyesu Lim, Jaeseok Kim, Junmo Park, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

    Abstract: Building a reliable visual question answering (VQA) system across different languages is a challenging problem, primarily due to the lack of abundant samples for training. To address this challenge, recent studies have employed machine translation systems for the cross-lingual VQA task. This involves translating the evaluation samples into a source language (usually English) and using monolingual…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to Findings of ACL 2024

  47. arXiv:2406.01339  [pdf, other]

    cs.HC cs.OS cs.SE

    Recover as It is Designed to Be: Recovering from Compatibility Mobile App Crashes by Reusing User Flows

    Authors: Donghwi Kim, Hyungjun Yoon, Chang Min Park, Sujin Han, Youngjin Kwon, Steven Y. Ko, Sung-Ju Lee

    Abstract: Android OS is severely fragmented by API updates and device vendors' OS customization, creating a market condition where vastly different OS versions coexist. This gives rise to compatibility crash problems where Android apps crash on certain Android versions but not on others. Although well-known, this problem is extremely challenging for app developers to overcome due to the sheer number of Andr…

    Submitted 3 June, 2024; originally announced June 2024.

  48. arXiv:2405.20574  [pdf, other]

    cs.CL cs.AI

    Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark

    Authors: Chanjun Park, Hyeonwoo Kim, Dahyun Kim, Seonghwan Cho, Sanghoon Kim, Sukyung Lee, Yungi Kim, Hwalsuk Lee

    Abstract: This paper introduces the Open Ko-LLM Leaderboard and the Ko-H5 Benchmark as vital tools for evaluating Large Language Models (LLMs) in Korean. Incorporating private test sets while mirroring the English Open LLM Leaderboard, we establish a robust evaluation framework that has been well integrated in the Korean LLM community. We perform data leakage analysis that shows the benefit of private test…

    Submitted 16 August, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted at ACL 2024 Main

  49. arXiv:2405.19778  [pdf, other]

    cs.CL cs.AI

    Enhancing Consistency and Role-Specific Knowledge Capturing by Rebuilding Fictional Character's Persona

    Authors: Jeiyoon Park, Chanjun Park, Heuiseok Lim

    Abstract: With the recent introduction of Assistants API, it is expected that document-based language models will be actively used in various domains, especially Role-playing. However, a key challenge lies in utilizing protagonist's persona: Assistants API often fails to achieve with its search because the information extraction part is different each time and it often omits important information such as pr…

    Submitted 4 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: preprint

  50. arXiv:2405.15311  [pdf, other]

    cs.CV cs.AI

    Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning

    Authors: Khanh-Binh Nguyen, Chae Jung Park

    Abstract: Self-supervised learning (SSL) is gaining attention for its ability to learn effective representations with large amounts of unlabeled data. Lightweight models can be distilled from larger self-supervised pre-trained models using contrastive and consistency constraints. Still, the different sizes of the projection heads make it challenging for students to mimic the teacher's embedding accurately.…

    Submitted 24 August, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted at BMVC 2024