Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–50 of 418 results for author: Choi, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2409.09570  [pdf, other

    cs.HC cs.AI

    MindScape Study: Integrating LLM and Behavioral Sensing for Personalized AI-Driven Journaling Experiences

    Authors: Subigya Nepal, Arvind Pillai, William Campbell, Talie Massachi, Michael V. Heinz, Ashmita Kunwar, Eunsol Soul Choi, Orson Xu, Joanna Kuc, Jeremy Huckins, Jason Holden, Sarah M. Preum, Colin Depp, Nicholas Jacobson, Mary Czerwinski, Eric Granholm, Andrew T. Campbell

    Abstract: Mental health concerns are prevalent among college students, highlighting the need for effective interventions that promote self-awareness and holistic well-being. MindScape pioneers a novel approach to AI-powered journaling by integrating passively collected behavioral patterns such as conversational engagement, sleep, and location with Large Language Models (LLMs). This integration creates a hig… ▽ More

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: arXiv admin note: text overlap with arXiv:2404.00487

    ACM Class: H.5.0; H.5.3; H.5.m; J.0

  2. arXiv:2409.06603  [pdf, other

    cs.CV eess.IV

    A Practical Gated Recurrent Transformer Network Incorporating Multiple Fusions for Video Denoising

    Authors: Kai Guo, Seungwon Choi, Jongseong Choi, Lae-Hoon Kim

    Abstract: State-of-the-art (SOTA) video denoising methods employ multi-frame simultaneous denoising mechanisms, resulting in significant delays (e.g., 16 frames), making them impractical for real-time cameras. To overcome this limitation, we propose a multi-fusion gated recurrent Transformer network (GRTN) that achieves SOTA denoising performance with only a single-frame delay. Specifically, the spatial den… ▽ More

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: 5 pages, 5 figures

  3. arXiv:2409.03143  [pdf, other

    cs.GR eess.IV physics.optics

    Large Étendue 3D Holographic Display with Content-adpative Dynamic Fourier Modulation

    Authors: Brian Chao, Manu Gopakumar, Suyeon Choi, Jonghyun Kim, Liang Shi, Gordon Wetzstein

    Abstract: Emerging holographic display technology offers unique capabilities for next-generation virtual reality systems. Current holographic near-eye displays, however, only support a small étendue, which results in a direct tradeoff between achievable field of view and eyebox size. Étendue expansion has recently been explored, but existing approaches are either fundamentally limited in the image quality t… ▽ More

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 12 pages, 7 figures, to be published in SIGGRAPH Asia 2024. Project website: https://bchao1.github.io/holo_dfm/

  4. arXiv:2408.14841  [pdf, other

    cs.CV cs.AI

    Diffusion based Semantic Outlier Generation via Nuisance Awareness for Out-of-Distribution Detection

    Authors: Suhee Yoon, Sanghyu Yoon, Hankook Lee, Ye Seul Sim, Sungik Choi, Kyungeun Lee, Hye-Seung Cho, Woohyung Lim

    Abstract: Out-of-distribution (OOD) detection, which determines whether a given sample is part of the in-distribution (ID), has recently shown promising results through training with synthetic OOD datasets. Nonetheless, existing methods often produce outliers that are considerably distant from the ID, showing limited efficacy for capturing subtle distinctions between ID and OOD. To address these issues, we… ▽ More

    Submitted 27 August, 2024; originally announced August 2024.

  5. arXiv:2408.12894  [pdf, other

    cs.CV

    FLoD: Integrating Flexible Level of Detail into 3D Gaussian Splatting for Customizable Rendering

    Authors: Yunji Seo, Young Sun Choi, Hyun Seung Son, Youngjung Uh

    Abstract: 3D Gaussian Splatting (3DGS) achieves fast and high-quality renderings by using numerous small Gaussians, which leads to significant memory consumption. This reliance on a large number of Gaussians restricts the application of 3DGS-based models on low-cost devices due to memory limitations. However, simply reducing the number of Gaussians to accommodate devices with less memory capacity leads to i… ▽ More

    Submitted 23 August, 2024; originally announced August 2024.

    Comments: Project page: https://3dgs-flod.github.io/flod.github.io/

  6. arXiv:2408.10937  [pdf, other

    cs.HC

    Proxona: Leveraging LLM-Driven Personas to Enhance Creators' Understanding of Their Audience

    Authors: Yoonseo Choi, Eun Jeong Kang, Seulgi Choi, Min Kyung Lee, Juho Kim

    Abstract: Creators are nothing without their audience, and thereby understanding their audience is the cornerstone of their professional achievement. Yet many creators feel lost while comprehending audiences with existing tools, which offer insufficient insights for tailoring content to audience needs. To address the challenges creators face in understanding their audience, we present Proxona, a system for… ▽ More

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 32 pages (including 14 pages of Appendix)

  7. arXiv:2408.09703  [pdf, other

    cs.AI

    Partial-Multivariate Model for Forecasting

    Authors: Jaehoon Lee, Hankook Lee, Sungik Choi, Sungjun Cho, Moontae Lee

    Abstract: When solving forecasting problems including multiple time-series features, existing approaches often fall into two extreme categories, depending on whether to utilize inter-feature information: univariate and complete-multivariate models. Unlike univariate cases which ignore the information, complete-multivariate models compute relationships among a complete set of features. However, despite the p… ▽ More

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 25 pages

  8. arXiv:2408.07944  [pdf, other

    cs.CV

    Training Spatial-Frequency Visual Prompts and Probabilistic Clusters for Accurate Black-Box Transfer Learning

    Authors: Wonwoo Cho, Kangyeol Kim, Saemee Choi, Jaegul Choo

    Abstract: Despite the growing prevalence of black-box pre-trained models (PTMs) such as prediction API services, there remains a significant challenge in directly applying general models to real-world scenarios due to the data distribution gap. Considering a data deficiency and constrained computational resource scenario, this paper proposes a novel parameter-efficient transfer learning framework for vision… ▽ More

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: ACM Multimedia 2024

  9. arXiv:2408.07326  [pdf, other

    cs.AR

    LPU: A Latency-Optimized and Highly Scalable Processor for Large Language Model Inference

    Authors: Seungjae Moon, Jung-Hoon Kim, Junsoo Kim, Seongmin Hong, Junseo Cha, Minsu Kim, Sukbin Lim, Gyubin Choi, Dongjin Seo, Jongho Kim, Hunjong Lee, Hyunjun Park, Ryeowook Ko, Soongyu Choi, Jongse Park, Jinwon Lee, Joo-Young Kim

    Abstract: The explosive arrival of OpenAI's ChatGPT has fueled the globalization of large language model (LLM), which consists of billions of pretrained parameters that embodies the aspects of syntax and semantics. HyperAccel introduces latency processing unit (LPU), a latency-optimized and highly scalable processor architecture for the acceleration of LLM inference. LPU perfectly balances the memory bandwi… ▽ More

    Submitted 14 August, 2024; originally announced August 2024.

  10. arXiv:2408.05074  [pdf

    cs.CL cs.AI

    RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records

    Authors: Sangjoon Park, Chan Woo Wee, Seo Hee Choi, Kyung Hwan Kim, Jee Suk Chang, Hong In Yoon, Ik Jae Lee, Yong Bae Kim, Jaeho Cho, Ki Chang Keum, Chang Geol Lee, Hwa Kyung Byun, Woong Sub Koom

    Abstract: Accurate patient selection is critical in radiotherapy (RT) to prevent ineffective treatments. Traditional survival prediction models, relying on structured data, often lack precision. This study explores the potential of large language models (LLMs) to structure unstructured electronic health record (EHR) data, thereby improving survival prediction accuracy through comprehensive clinical informat… ▽ More

    Submitted 13 September, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

    Comments: 23 pages, 2 tables, 4 figures

  11. arXiv:2408.04640  [pdf, ps, other

    cs.CL

    LLMs for Enhanced Agricultural Meteorological Recommendations

    Authors: Ji-jun Park, Soo-joon Choi

    Abstract: Agricultural meteorological recommendations are crucial for enhancing crop productivity and sustainability by providing farmers with actionable insights based on weather forecasts, soil conditions, and crop-specific data. This paper presents a novel approach that leverages large language models (LLMs) and prompt engineering to improve the accuracy and relevance of these recommendations. We designe… ▽ More

    Submitted 30 July, 2024; originally announced August 2024.

    Comments: 10 pages

  12. arXiv:2408.04414  [pdf, other

    cs.CL cs.AI

    Enhancing Robustness of Retrieval-Augmented Language Models with In-Context Learning

    Authors: Seong-Il Park, Seung-Woo Choi, Na-Hyun Kim, Jay-Yoon Lee

    Abstract: Retrieval-Augmented Language Models (RALMs) have significantly improved performance in open-domain question answering (QA) by leveraging external knowledge. However, RALMs still struggle with unanswerable queries, where the retrieved contexts do not contain the correct answer, and with conflicting information, where different sources provide contradictory answers due to imperfect retrieval. This s… ▽ More

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 10 pages, 2 figures

    Journal ref: KnowledgeNLP@ACL 2024

  13. arXiv:2408.04297  [pdf, other

    cs.ET cs.HC

    Spatial Affordance-aware Interactable Subspace Allocation for Mixed Reality Telepresence

    Authors: Dooyoung Kim, Seonji Kim, Selin Choi, Woontack Woo

    Abstract: To enable remote Virtual Reality (VR) and Augmented Reality (AR) clients to collaborate as if they were in the same space during Mixed Reality (MR) telepresence, it is essential to overcome spatial heterogeneity and generate a unified shared collaborative environment by integrating remote spaces into a target host space. Especially when multiple remote users connect, a large shared space is necess… ▽ More

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted at the 2024 IEEE ISMAR Conference. 10 pages, 6 figures

  14. arXiv:2408.03541  [pdf, ps, other

    cs.CL cs.AI

    EXAONE 3.0 7.8B Instruction Tuned Language Model

    Authors: LG AI Research, :, Soyoung An, Kyunghoon Bae, Eunbi Choi, Stanley Jungkyu Choi, Yemuk Choi, Seokhee Hong, Yeonjung Hong, Junwon Hwang, Hyojin Jeon, Gerrard Jeongwon Jo, Hyunjik Jo, Jiyeon Jung, Yountae Jung, Euisoon Kim, Hyosang Kim, Joonkee Kim, Seonghwan Kim, Soyeon Kim, Sunkyoung Kim, Yireun Kim, Youchul Kim, Edward Hwayoung Lee, Haeju Lee , et al. (14 additional authors not shown)

    Abstract: We introduce EXAONE 3.0 instruction-tuned language model, the first open model in the family of Large Language Models (LLMs) developed by LG AI Research. Among different model sizes, we publicly release the 7.8B instruction-tuned model to promote open research and innovations. Through extensive evaluations across a wide range of public and in-house benchmarks, EXAONE 3.0 demonstrates highly compet… ▽ More

    Submitted 13 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

  15. arXiv:2408.02798  [pdf, other

    cs.CL cs.AI cs.LG

    Examining Gender and Power on Wikipedia Through Face and Politeness

    Authors: Adil Soubki, Shyne Choi, Owen Rambow

    Abstract: We propose a framework for analyzing discourse by combining two interdependent concepts from sociolinguistic theory: face acts and politeness. While politeness has robust existing tools and data, face acts are less resourced. We introduce a new corpus created by annotating Wikipedia talk pages with face acts and we use this to train a face act tagger. We then employ our framework to study how face… ▽ More

    Submitted 5 August, 2024; originally announced August 2024.

    Journal ref: SIGDIAL 2024

  16. arXiv:2407.20643  [pdf

    cs.CV

    Generalizing AI-driven Assessment of Immunohistochemistry across Immunostains and Cancer Types: A Universal Immunohistochemistry Analyzer

    Authors: Biagio Brattoli, Mohammad Mostafavi, Taebum Lee, Wonkyung Jung, Jeongun Ryu, Seonwook Park, Jongchan Park, Sergio Pereira, Seunghwan Shin, Sangjoon Choi, Hyojin Kim, Donggeun Yoo, Siraj M. Ali, Kyunghyun Paeng, Chan-Young Ock, Soo Ick Cho, Seokhwi Kim

    Abstract: Despite advancements in methodologies, immunohistochemistry (IHC) remains the most utilized ancillary test for histopathologic and companion diagnostics in targeted therapies. However, objective IHC assessment poses challenges. Artificial intelligence (AI) has emerged as a potential solution, yet its development requires extensive training for each cancer and IHC type, limiting versatility. We dev… ▽ More

    Submitted 30 July, 2024; originally announced July 2024.

  17. arXiv:2407.13517  [pdf, other

    cs.CV

    Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

    Authors: Sehwan Choi, Jungho Kim, Hongjae Shin, Jun Won Choi

    Abstract: In this paper, we introduce Mask2Map, a novel end-to-end online HD map construction method designed for autonomous driving applications. Our approach focuses on predicting the class and ordered point set of map instances within a scene, represented in the bird's eye view (BEV). Mask2Map consists of two primary components: the Instance-Level Mask Prediction Network (IMPNet) and the Mask-Driven Map… ▽ More

    Submitted 19 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: 20 pages, 9 figures

  18. arXiv:2407.11793  [pdf, other

    cs.CV cs.AI cs.GR

    Click-Gaussian: Interactive Segmentation to Any 3D Gaussians

    Authors: Seokhun Choi, Hyeonseop Song, Jaechul Kim, Taehyeong Kim, Hoseok Do

    Abstract: Interactive segmentation of 3D Gaussians opens a great opportunity for real-time manipulation of 3D scenes thanks to the real-time rendering capability of 3D Gaussian Splatting. However, the current methods suffer from time-consuming post-processing to deal with noisy segmentation output. Also, they struggle to provide detailed segmentation, which is important for fine-grained manipulation of 3D s… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024. The first two authors contributed equally to this work

  19. arXiv:2407.11781  [pdf, other

    cs.CV

    SlingBAG: Sliding ball adaptive growth algorithm with differentiable radiation enables super-efficient iterative 3D photoacoustic image reconstruction

    Authors: Shuang Li, Yibing Wang, Jian Gao, Chulhong Kim, Seongwook Choi, Yu Zhang, Qian Chen, Yao Yao, Changhui Li

    Abstract: High-quality 3D photoacoustic imaging (PAI) reconstruction under sparse view or limited view has long been challenging. Traditional 3D iterative-based reconstruction methods suffer from both slow speed and high memory consumption. Recently, in computer graphics, the differentiable rendering has made significant progress, particularly with the rise of 3D Gaussian Splatting. Inspired by these, we in… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

  20. arXiv:2407.11057  [pdf, other

    cs.LG cs.AI q-bio.BM

    SPIN: SE(3)-Invariant Physics Informed Network for Binding Affinity Prediction

    Authors: Seungyeon Choi, Sangmin Seo, Sanghyun Park

    Abstract: Accurate prediction of protein-ligand binding affinity is crucial for rapid and efficient drug development. Recently, the importance of predicting binding affinity has led to increased attention on research that models the three-dimensional structure of protein-ligand complexes using graph neural networks to predict binding affinity. However, traditional methods often fail to accurately model the… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted to ECAI 2024

  21. arXiv:2407.10206  [pdf

    cs.CE cs.AI cs.NE cs.SI

    Dominant Design Prediction with Phylogenetic Networks

    Authors: Youwei He, Jeong-Dong Lee, Dawoon Jeong, Sungjun Choi, Jiyong Kim

    Abstract: This study proposes an effective method to predict technology development from an evolutionary perspective. Product evolution is the result of technological evolution and market selection. A phylogenetic network is the main method to study product evolution. The formation of the dominant design determines the trajectory of technology development. How to predict future dominant design has become a… ▽ More

    Submitted 14 July, 2024; originally announced July 2024.

  22. arXiv:2407.09434  [pdf, other

    cs.LG cs.AI cs.CE eess.SY

    A Perspective on Foundation Models for the Electric Power Grid

    Authors: Hendrik F. Hamann, Thomas Brunschwiler, Blazhe Gjorgiev, Leonardo S. A. Martins, Alban Puech, Anna Varbella, Jonas Weiss, Juan Bernabe-Moreno, Alexandre Blondin Massé, Seong Choi, Ian Foster, Bri-Mathias Hodge, Rishabh Jain, Kibaek Kim, Vincent Mai, François Mirallès, Martin De Montigny, Octavio Ramos-Leaños, Hussein Suprême, Le Xie, El-Nasser S. Youssef, Arnaud Zinflou, Alexander J. Belvi, Ricardo J. Bessa, Bishnu Prasad Bhattari , et al. (2 additional authors not shown)

    Abstract: Foundation models (FMs) currently dominate news headlines. They employ advanced deep learning architectures to extract structural information autonomously from vast datasets through self-supervision. The resulting rich representations of complex systems and dynamics can be applied to many downstream applications. Therefore, FMs can find uses in electric power grids, challenged by the energy transi… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Lead contact: H.F.H.; Major equal contributors: H.F.H., T.B., B.G., L.S.A.M., A.P., A.V., J.W.; Significant equal contributors: J.B., A.B.M., S.C., I.F., B.H., R.J., K.K., V.M., F.M., M.D.M., O.R., H.S., L.X., E.S.Y., A.Z.; Other equal contributors: A.J.B., R.J.B., B.P.B., J.S., S.S

  23. arXiv:2407.08964  [pdf, other

    cs.LG cs.RO

    Communication-Aware Reinforcement Learning for Cooperative Adaptive Cruise Control

    Authors: Sicong Jiang, Seongjin Choi, Lijun Sun

    Abstract: Cooperative Adaptive Cruise Control (CACC) plays a pivotal role in enhancing traffic efficiency and safety in Connected and Autonomous Vehicles (CAVs). Reinforcement Learning (RL) has proven effective in optimizing complex decision-making processes in CACC, leading to improved system performance and adaptability. Among RL approaches, Multi-Agent Reinforcement Learning (MARL) has shown remarkable p… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  24. arXiv:2407.08245  [pdf, other

    cs.LG cs.CV

    Feature Diversification and Adaptation for Federated Domain Generalization

    Authors: Seunghan Yang, Seokeon Choi, Hyunsin Park, Sungha Choi, Simyung Chang, Sungrack Yun

    Abstract: Federated learning, a distributed learning paradigm, utilizes multiple clients to build a robust global model. In real-world applications, local clients often operate within their limited domains, leading to a `domain shift' across clients. Privacy concerns limit each client's learning to its own domain data, which increase the risk of overfitting. Moreover, the process of aggregating models train… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  25. arXiv:2407.06551  [pdf, other

    cs.CL

    OffsetBias: Leveraging Debiased Data for Tuning Evaluators

    Authors: Junsoo Park, Seungyeon Jwa, Meiying Ren, Daeyoung Kim, Sanghyuk Choi

    Abstract: Employing Large Language Models (LLMs) to assess the quality of generated responses, such as prompting instruct-tuned models or fine-tuning judge models, has become a widely adopted evaluation method. It is also known that such evaluators are vulnerable to biases, such as favoring longer responses. While it is important to overcome this problem, the specifics of these biases remain under-explored.… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Work in Progress

  26. arXiv:2407.05664  [pdf, other

    stat.ML cs.AI cs.LG

    How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning

    Authors: Arthur Jacot, Seok Hoan Choi, Yuxiao Wen

    Abstract: We show that deep neural networks (DNNs) can efficiently learn any composition of functions with bounded $F_{1}$-norm, which allows DNNs to break the curse of dimensionality in ways that shallow networks cannot. More specifically, we derive a generalization bound that combines a covering number argument for compositionality, and the $F_{1}$-norm (or the related Barron norm) for large width adaptiv… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  27. arXiv:2406.15725  [pdf, other

    eess.AS cs.SD

    Self Training and Ensembling Frequency Dependent Networks with Coarse Prediction Pooling and Sound Event Bounding Boxes

    Authors: Hyeonuk Nam, Deokki Min, Seungdeok Choi, Inhan Choi, Yong-Hwa Park

    Abstract: To tackle sound event detection (SED) task, we propose frequency dependent networks (FreDNets), which heavily leverage frequency-dependent methods. We apply frequency warping and FilterAugment, which are frequency-dependent data augmentation methods. The model architecture consists of 3 branches: audio teacher-student transformer (ATST) branch, BEATs branch and CNN branch including either partial… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: DCASE 2024 Challenge Task 4 technical report

  28. arXiv:2406.08796  [pdf, other

    cs.CL

    Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning

    Authors: Janghoon Han, Changho Lee, Joongbo Shin, Stanley Jungkyu Choi, Honglak Lee, Kynghoon Bae

    Abstract: Instruction tuning has emerged as a powerful technique, significantly boosting zero-shot performance on unseen tasks. While recent work has explored cross-lingual generalization by applying instruction tuning to multilingual models, previous studies have primarily focused on English, with a limited exploration of non-English tasks. For an in-depth exploration of cross-lingual generalization in ins… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Findings of ACL 2024 (Camera-ready), by Janghoon Han and Changho Lee, with equal contribution

  29. arXiv:2406.06246  [pdf, other

    cs.LG

    Data-Efficient Learning with Neural Programs

    Authors: Alaia Solko-Breslin, Seewon Choi, Ziyang Li, Neelay Velingker, Rajeev Alur, Mayur Naik, Eric Wong

    Abstract: Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiabl… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  30. arXiv:2406.05472  [pdf, other

    cs.CR eess.SY

    A Novel Generative AI-Based Framework for Anomaly Detection in Multicast Messages in Smart Grid Communications

    Authors: Aydin Zaboli, Seong Lok Choi, Tai-Jin Song, Junho Hong

    Abstract: Cybersecurity breaches in digital substations can pose significant challenges to the stability and reliability of power system operations. To address these challenges, defense and mitigation techniques are required. Identifying and detecting anomalies in information and communication technology (ICT) is crucial to ensure secure device interactions within digital substations. This paper proposes a… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages, 10 figures, Submitted to IEEE Transactions on Information Forensics and Security

  31. arXiv:2406.04000  [pdf, other

    physics.optics cs.ET

    Stochastic logic in biased coupled photonic probabilistic bits

    Authors: Michael Horodynski, Charles Roques-Carmes, Yannick Salamin, Seou Choi, Jamison Sloan, Di Luo, Marin Soljačić

    Abstract: Optical computing often employs tailor-made hardware to implement specific algorithms, trading generality for improved performance in key aspects like speed and power efficiency. An important computing approach that is still missing its corresponding optical hardware is probabilistic computing, used e.g. for solving difficult combinatorial optimization problems. In this study, we propose an experi… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  32. arXiv:2406.03234  [pdf, other

    cs.LG cs.AI

    Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning

    Authors: Inwoo Hwang, Yunhyeok Kwak, Suhyung Choi, Byoung-Tak Zhang, Sanghack Lee

    Abstract: Causal dynamics learning has recently emerged as a promising approach to enhancing robustness in reinforcement learning (RL). Typically, the goal is to build a dynamics model that makes predictions based on the causal relationships among the entities. Despite the fact that causal connections often manifest only under certain contexts, existing approaches overlook such fine-grained relationships an… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  33. arXiv:2405.16155  [pdf, other

    cs.CL

    Improving Multi-lingual Alignment Through Soft Contrastive Learning

    Authors: Minsu Park, Seyeon Choi, Chanyeol Choi, Jun-Seong Kim, Jy-yong Sohn

    Abstract: Making decent multi-lingual sentence representations is critical to achieve high performances in cross-lingual downstream tasks. In this work, we propose a novel method to align multi-lingual embeddings based on the similarity of sentences measured by a pre-trained mono-lingual embedding model. Given translation sentence pairs, we train a multi-lingual model in a way that the similarity between cr… ▽ More

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 8 pages, 1 figures, Accepted at NAACL SRW 2024

  34. arXiv:2405.01591  [pdf, other

    cs.CL cs.AI eess.IV

    Simplifying Multimodality: Unimodal Approach to Multimodal Challenges in Radiology with General-Domain Large Language Model

    Authors: Seonhee Cho, Choonghan Kim, Jiho Lee, Chetan Chilkunda, Sujin Choi, Joo Heung Yoon

    Abstract: Recent advancements in Large Multimodal Models (LMMs) have attracted interest in their generalization capability with only a few samples in the prompt. This progress is particularly relevant to the medical domain, where the quality and sensitivity of data pose unique challenges for model training and application. However, the dependency on high-quality data for effective in-context learning raises… ▽ More

    Submitted 29 April, 2024; originally announced May 2024.

    Comments: Under review

  35. arXiv:2405.01016  [pdf, other

    cs.CV cs.AI

    Addressing Diverging Training Costs using BEVRestore for High-resolution Bird's Eye View Map Construction

    Authors: Minsu Kim, Giseop Kim, Sunwook Choi

    Abstract: Recent advancements in Bird's Eye View (BEV) fusion for map construction have demonstrated remarkable mapping of urban environments. However, their deep and bulky architecture incurs substantial amounts of backpropagation memory and computing latency. Consequently, the problem poses an unavoidable bottleneck in constructing high-resolution (HR) BEV maps, as their large-sized features cause signifi… ▽ More

    Submitted 22 August, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  36. arXiv:2404.18423  [pdf, other

    cs.CV cs.AI

    Unsupervised Dynamics Prediction with Object-Centric Kinematics

    Authors: Yeon-Ji Song, Suhyung Choi, Jaein Kim, Jin-Hwa Kim, Byoung-Tak Zhang

    Abstract: Human perception involves discerning complex multi-object scenes into time-static object appearance (ie, size, shape, color) and time-varying object motion (ie, location, velocity, acceleration). This innate ability to unconsciously understand the environment is the motivation behind the success of dynamics modeling. Object-centric representations have emerged as a promising tool for dynamics pred… ▽ More

    Submitted 6 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 15 pages, 6 figures, 4 tables

  37. arXiv:2404.17179  [pdf, other

    cs.HC cs.ET

    Meta-Object: Interactive and Multisensory Virtual Object Learned from the Real World for the Post-Metaverse

    Authors: Dooyoung Kim, Taewook Ha, Jinseok Hong, Seonji Kim, Selin Choi, Heejeong Ko, Woontack Woo

    Abstract: With the proliferation of wearable Augmented Reality/Virtual Reality (AR/VR) devices, ubiquitous virtual experiences seamlessly integrate into daily life through metaverse platforms. To support immersive metaverse experiences akin to reality, we propose a next-generation virtual object, a meta-object, a property-embedded virtual object that contains interactive and multisensory characteristics lea… ▽ More

    Submitted 28 April, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4 figures, under review in the IEEE CG&A magazine

  38. arXiv:2404.16418  [pdf, other

    cs.CL

    Instruction Matters, a Simple yet Effective Task Selection Approach in Instruction Tuning for Specific Tasks

    Authors: Changho Lee, Janghoon Han, Seonghyeon Ye, Stanley Jungkyu Choi, Honglak Lee, Kyunghoon Bae

    Abstract: Instruction tuning has shown its ability to not only enhance zero-shot generalization across various tasks but also its effectiveness in improving the performance of specific tasks. A crucial aspect in instruction tuning for a particular task is a strategic selection of related tasks that offer meaningful supervision, thereby enhancing efficiency and preventing performance degradation from irrelev… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 21 pages, 6 figures, 16 tables

  39. arXiv:2404.15635  [pdf, other

    cs.CV cs.LG

    A Real-time Evaluation Framework for Pedestrian's Potential Risk at Non-Signalized Intersections Based on Predicted Post-Encroachment Time

    Authors: Tengfeng Lin, Zhixiong Jin, Seongjin Choi, Hwasoo Yeo

    Abstract: Addressing pedestrian safety at intersections is one of the paramount concerns in the field of transportation research, driven by the urgency of reducing traffic-related injuries and fatalities. With advances in computer vision technologies and predictive models, the pursuit of developing real-time proactive protection systems is increasingly recognized as vital to improving pedestrian safety at i… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

  40. arXiv:2404.13272  [pdf, other

    cs.HC

    DinAR: Augmenting Reality for Sustainable Dining

    Authors: MJ Johns, Eunsol Sol Choi, Derusha Baskaran

    Abstract: Sustainable food is among the many challenges associated with climate change. The resources required to grow or gather the food and the distance it travels to reach the consumer are two key factors of an ingredient's sustainability. Food that is grown locally and is currently "in-season" will have a lower carbon footprint, but when dining out these details unfortunately may not affect one's orderi… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Presented at CHI 2024 (arXiv:2404.05889), 5 pages, and 4 figures

    Report number: ARSJ/2024/10

  41. arXiv:2404.13028  [pdf

    cs.CE cs.AI

    When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering

    Authors: Stephen Choi, William Gazeley

    Abstract: This paper presents the LLM-ADE framework, a novel methodology for continued pre-training of large language models (LLMs) that addresses the challenges of catastrophic forgetting and double descent. LLM-ADE employs dynamic architectural adjustments, including selective block freezing and expansion, tailored to specific datasets. This strategy enhances model adaptability to new data while preservin… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 6 pages, 3 tables and 3 figures

  42. arXiv:2404.12168  [pdf, other

    cs.CV cs.AI

    Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization

    Authors: Insoo Kim, Jae Seok Choi, Geonseok Seo, Kinam Kwon, Jinwoo Shin, Hyong-Euk Lee

    Abstract: As recent advances in mobile camera technology have enabled the capability to capture high-resolution images, such as 4K images, the demand for an efficient deblurring model handling large motion has increased. In this paper, we discover that the image residual errors, i.e., blur-sharp pixel differences, can be grouped into some categories according to their motion blur type and how complex their… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: CVPR2024 Camera-Ready

  43. arXiv:2404.11936  [pdf, other

    cs.LG cs.AI cs.CV

    LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights

    Authors: Thibault Castells, Hyoung-Kyu Song, Bo-Kyeong Kim, Shinkook Choi

    Abstract: Latent Diffusion Models (LDMs) have emerged as powerful generative models, known for delivering remarkable results under constrained computational resources. However, deploying LDMs on resource-limited devices remains a complex issue, presenting challenges such as memory consumption and inference speed. To address this issue, we introduce LD-Pruner, a novel performance-preserving structured prunin… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 8 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)

  44. arXiv:2404.11925  [pdf, other

    cs.LG cs.AI cs.CV

    EdgeFusion: On-Device Text-to-Image Generation

    Authors: Thibault Castells, Hyoung-Kyu Song, Tairen Piao, Shinkook Choi, Bo-Kyeong Kim, Hanyoung Yim, Changgwun Lee, Jae Gon Kim, Tae-Ho Kim

    Abstract: The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application. To tackle this challenge, recent research focuses on methods to reduce sampling steps, such as Latent Consistency Model (LCM), and on employing architectural optimizations, including pruning and knowledge distillation. Diverging from existing approaches… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 4 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)

  45. arXiv:2404.11810  [pdf, other

    cs.GR

    Holographic Parallax Improves 3D Perceptual Realism

    Authors: Dongyeon Kim, Seung-Woo Nam, Suyeon Choi, Jong-Mo Seo, Gordon Wetzstein, Yoonchan Jeong

    Abstract: Holographic near-eye displays are a promising technology to solve long-standing challenges in virtual and augmented reality display systems. Over the last few years, many different computer-generated holography (CGH) algorithms have been proposed that are supervised by different types of target content, such as 2.5D RGB-depth maps, 3D focal stacks, and 4D light fields. It is unclear, however, what… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 33 pages, 34 figures

  46. arXiv:2404.11630  [pdf, other

    cs.CV cs.AI

    SNP: Structured Neuron-level Pruning to Preserve Attention Scores

    Authors: Kyunghwan Shim, Jaewoong Yun, Shinkook Choi

    Abstract: Multi-head self-attention (MSA) is a key component of Vision Transformers (ViTs), which have achieved great success in various vision tasks. However, their high computational cost and memory footprint hinder their deployment on resource-constrained devices. Conventional pruning approaches can only compress and accelerate the MSA module using head pruning, although the head is not an atomic unit. T… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

  47. arXiv:2404.11557  [pdf, other

    cs.RO

    Spatio-Temporal Motion Retargeting for Quadruped Robots

    Authors: Taerim Yoon, Dongho Kang, Seungmin Kim, Minsung Ahn, Stelian Coros, Sungjoon Choi

    Abstract: This work introduces a motion retargeting approach for legged robots, which aims to create motion controllers that imitate the fine behavior of animals. Our approach, namely spatio-temporal motion retargeting (STMR), guides imitation learning procedures by transferring motion from source to target, effectively bridging the morphological disparities by ensuring the feasibility of imitation on the t… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 34 pages, 7 figures, videos/code available at https://terry97-guel.github.io/STMR-RL.github.io/

  48. arXiv:2404.11343  [pdf, other

    cs.IR cs.AI

    Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System

    Authors: Sein Kim, Hongseok Kang, Seungyoon Choi, Donghyun Kim, Minchul Yang, Chanyoung Park

    Abstract: Collaborative filtering recommender systems (CF-RecSys) have shown successive results in enhancing the user experience on social media and e-commerce platforms. However, as CF-RecSys struggles under cold scenarios with sparse user-item interactions, recent strategies have focused on leveraging modality information of user/items (e.g., text or images) based on pre-trained modality encoders and Larg… ▽ More

    Submitted 1 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: KDD 2024

  49. arXiv:2404.09451  [pdf, other

    cs.CV

    Contrastive Mean-Shift Learning for Generalized Category Discovery

    Authors: Sua Choi, Dahyun Kang, Minsu Cho

    Abstract: We address the problem of generalized category discovery (GCD) that aims to partition a partially labeled collection of images; only a small part of the collection is labeled and the total number of target classes is unknown. To address this generalized image clustering problem, we revisit the mean-shift algorithm, i.e., a classic, powerful technique for mode seeking, and incorporate it into a con… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024

  50. arXiv:2404.02684  [pdf, other

    cs.CL cs.AI cs.LG

    Cross-Architecture Transfer Learning for Linear-Cost Inference Transformers

    Authors: Sehyun Choi

    Abstract: Recently, multiple architectures has been proposed to improve the efficiency of the Transformer Language Models through changing the design of the self-attention block to have a linear-cost inference (LCI). A notable approach in this realm is the State-Space Machines (SSMs) architecture, which showed on-par performance on language modeling tasks with the self-attention transformers. However, such… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Preprint