Showing 1–50 of 285 results for author: Bao, S

Searching in archive cs. Search in all archives.
  1. arXiv:2409.01012  [pdf, other]

    cs.IR cs.LG

    Improved Diversity-Promoting Collaborative Metric Learning for Recommendation

    Authors: Shilong Bao, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang

    Abstract: Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems (RS), closing the gap between metric learning and collaborative filtering. Following the convention of RS, existing practices exploit unique user representation in their model design. This paper focuses on a challenging scenario where a user has multiple categories of interests. Under this settin…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: arXiv admin note: text overlap with arXiv:2209.15292

  2. arXiv:2409.00843  [pdf, other]

    econ.GN cs.CE cs.CY q-fin.CP stat.ML

    Global Public Sentiment on Decentralized Finance: A Spatiotemporal Analysis of Geo-tagged Tweets from 150 Countries

    Authors: Yuqi Chen, Yifan Li, Kyrie Zhixuan Zhou, Xiaokang Fu, Lingbo Liu, Shuming Bao, Daniel Sui, Luyao Zhang

    Abstract: In the digital era, blockchain technology, cryptocurrencies, and non-fungible tokens (NFTs) have transformed financial and decentralized systems. However, existing research often neglects the spatiotemporal variations in public sentiment toward these technologies, limiting macro-level insights into their global impact. This study leverages Twitter data to explore public attention and sentiment acr…

    Submitted 1 September, 2024; originally announced September 2024.

  3. arXiv:2408.14611  [pdf]

    cs.DC cs.DB

    Scalable, reproducible, and cost-effective processing of large-scale medical imaging datasets

    Authors: Michael E. Kim, Karthik Ramadass, Chenyu Gao, Praitayini Kanakaraj, Nancy R. Newlin, Gaurav Rudravaram, Kurt G. Schilling, Blake E. Dewey, Derek Archer, Timothy J. Hohman, Zhiyuan Li, Shunxing Bao, Bennett A. Landman, Nazirah Mohd Khairi

    Abstract: Curating, processing, and combining large-scale medical imaging datasets from national studies is a non-trivial task due to the intense computation and data throughput required, variability of acquired data, and associated financial overhead. Existing platforms or tools for large-scale data curation, processing, and storage have difficulty achieving a viable cost-to-scale ratio of computation spee…

    Submitted 26 August, 2024; originally announced August 2024.

  4. arXiv:2408.10167  [pdf, other]

    cs.RO

    Don't Get Stuck: A Deadlock Recovery Approach

    Authors: Francesca Baldini, Faizan M. Tariq, Sangjae Bae, David Isele

    Abstract: When multiple agents share space, interactions can lead to deadlocks, where no agent can advance towards its goal. This paper addresses this challenge with a deadlock recovery strategy. In particular, the proposed algorithm integrates hybrid-A$^\star$, STL, and MPPI frameworks. Specifically, hybrid-A$^\star$ generates a reference path, STL defines a goal (deadlock avoidance) and associated constra…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Presented at the 27th IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024, Edmonton, Alberta, Canada

  5. arXiv:2408.07905  [pdf, other]

    cs.CV

    Persistence Image from 3D Medical Image: Superpixel and Optimized Gaussian Coefficient

    Authors: Yanfan Zhu, Yash Singh, Khaled Younis, Shunxing Bao, Yuankai Huo

    Abstract: Topological data analysis (TDA) uncovers crucial properties of objects in medical imaging. Methods based on persistent homology have demonstrated their advantages in capturing topological features that traditional deep learning methods cannot detect in both radiology and pathology. However, previous research primarily focused on 2D image analysis, neglecting the comprehensive 3D context. In this p…

    Submitted 14 August, 2024; originally announced August 2024.

  6. arXiv:2408.05337  [pdf, other]

    cs.CV cs.AI

    VACoDe: Visual Augmented Contrastive Decoding

    Authors: Sihyeon Kim, Boryeong Cho, Sangmin Bae, Sumyeong Ahn, Se-Young Yun

    Abstract: Despite the astonishing performance of recent Large Vision-Language Models (LVLMs), these models often generate inaccurate responses. To address this issue, previous studies have focused on mitigating hallucinations by employing contrastive decoding (CD) with augmented images, which amplifies the contrast with the original image. However, these methods have limitations, including reliance on a sin…

    Submitted 26 July, 2024; originally announced August 2024.

    Comments: 10 pages, 7 figures

    MSC Class: 68T01 ACM Class: I.2.0

  7. arXiv:2408.04574  [pdf, other]

    cs.HC

    Integrating Annotations into the Design Process for Sonifications and Physicalizations

    Authors: Rhys Sorenson-Graff, S. Sandra Bae, Jordan Wirfs-Brock

    Abstract: Annotations are a critical component of visualizations, helping viewers interpret the visual representation and highlighting critical data insights. Despite their significant role, we lack an understanding of how annotations can be incorporated into other data representations, such as physicalizations and sonifications. Given the emergent nature of these representations, sonifications, and physica…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 4 pages, 3 figures, to be published in Proceedings of IEEE VIS 2024

  8. arXiv:2408.01855  [pdf, other]

    cs.HC

    MoodPupilar: Predicting Mood Through Smartphone Detected Pupillary Responses in Naturalistic Settings

    Authors: Rahul Islam, Tongze Zhang, Priyanshu Singh Bisen, Sang Won Bae

    Abstract: MoodPupilar introduces a novel method for mood evaluation using pupillary response captured by a smartphone's front-facing camera during daily use. Over a four-week period, data was gathered from 25 participants to develop models capable of predicting daily mood averages. Utilizing the GLOBEM behavior modeling platform, we benchmarked the utility of pupillary response as a predictor for mood. Our…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: Accepted to IEEE International Conference on Wearable and Implantable Body Sensor Networks (BSN 2024)

  9. arXiv:2407.18451  [pdf, other]

    cs.RO

    Gaussian Lane Keeping: A Robust Prediction Baseline

    Authors: David Isele, Piyush Gupta, Xinyi Liu, Sangjae Bae

    Abstract: Predicting agents' behavior for vehicles and pedestrians is challenging due to a myriad of factors including the uncertainty attached to different intentions, inter-agent interactions, traffic (environment) rules, individual inclinations, and agent dynamics. Consequently, a plethora of neural network-driven prediction models have been introduced in the literature to encompass these intricacies to…

    Submitted 25 July, 2024; originally announced July 2024.

  10. arXiv:2407.14733  [pdf, other]

    cs.LG cs.AI cs.CL

    Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL

    Authors: Yunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao, Jiang Bian, Kee-Eung Kim

    Abstract: With the advent of foundation models, prompt tuning has positioned itself as an important technique for directing model behaviors and eliciting desired responses. Prompt tuning regards selecting appropriate keywords included into the input, thereby adapting to the downstream task without adjusting or fine-tuning the model parameters. There is a wide range of work in prompt tuning, from approaches…

    Submitted 19 July, 2024; originally announced July 2024.

  11. arXiv:2407.13926  [pdf]

    cs.CY cs.AI

    Report on the Conference on Ethical and Responsible Design in the National AI Institutes: A Summary of Challenges

    Authors: Sherri Lynn Conklin, Sue Bae, Gaurav Sett, Michael Hoffmann, Justin B. Biddle

    Abstract: In May 2023, the Georgia Tech Ethics, Technology, and Human Interaction Center organized the Conference on Ethical and Responsible Design in the National AI Institutes. Representatives from the National AI Research Institutes that had been established as of January 2023 were invited to attend; researchers representing 14 Institutes attended and participated. The conference focused on three questio…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 9 pages

  12. arXiv:2407.09475  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Adaptive Prediction Ensemble: Improving Out-of-Distribution Generalization of Motion Forecasting

    Authors: Jinning Li, Jiachen Li, Sangjae Bae, David Isele

    Abstract: Deep learning-based trajectory prediction models for autonomous driving often struggle with generalization to out-of-distribution (OOD) scenarios, sometimes performing worse than simple rule-based models. To address this limitation, we propose a novel framework, Adaptive Prediction Ensemble (APE), which integrates deep learning and rule-based prediction experts. A learned routing function, trained…

    Submitted 12 July, 2024; originally announced July 2024.

  13. arXiv:2407.06116  [pdf]

    eess.IV cs.CV cs.LG

    Data-driven Nucleus Subclassification on Colon H&E using Style-transferred Digital Pathology

    Authors: Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Nancy R. Newlin, Adam M. Saunders, Can Cui, Jia Li, Qi Liu, Ken S. Lau, Joseph T. Roland, Mary K Washington, Lori A. Coburn, Keith T. Wilson, Yuankai Huo, Bennett A. Landman

    Abstract: Understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions. H&E is widely available, however, cell subtyping often requires expert knowledge and the use of specialized stains. To reduce the annotation burden, AI has been proposed for the classification of cells on H&E. For example, the recent Colon Nucleus Identificati…

    Submitted 15 May, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.05602

  14. arXiv:2407.05621  [pdf, other]

    cs.AR

    EA4RCA:Efficient AIE accelerator design framework for Regular Communication-Avoiding Algorithm

    Authors: W. B. Zhang, Y. Q. Liu, T. H. Zang, Z. S. Bao

    Abstract: With the introduction of the Adaptive Intelligence Engine (AIE), the Versal Adaptive Compute Acceleration Platform (Versal ACAP) has garnered great attention. However, the current focus of Vitis Libraries and limited research has mainly been on how to invoke AIE modules, without delving into a thorough discussion on effectively utilizing AIE in its typical use cases. As a result, the widespread ad…

    Submitted 8 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  15. arXiv:2407.03563  [pdf, other]

    eess.AS cs.CL cs.LG eess.IV

    Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition

    Authors: Sungnyun Kim, Kangwook Jang, Sangmin Bae, Hoirin Kim, Se-Young Yun

    Abstract: Audio-visual speech recognition (AVSR) aims to transcribe human speech using both audio and video modalities. In practical environments with noise-corrupted audio, the role of video information becomes crucial. However, prior works have primarily focused on enhancing audio features in AVSR, overlooking the importance of video features. In this study, we strengthen the video features by learning th…

    Submitted 3 July, 2024; originally announced July 2024.

  16. arXiv:2407.00596  [pdf, other]

    eess.IV cs.CV

    HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis

    Authors: Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Juming Xiong, Shunxing Bao, Hao Li, Mengmeng Yin, Yu Wang, Shilin Zhao, Yucheng Tang, Haichun Yang, Yuankai Huo

    Abstract: Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy. For instance, the intricate organization in kidney pathology spans multiple layers, from regions like the cortex and medulla to functional units such as glomeruli, tubules, and vessels, down to various cell types. In this paper, we propose a novel…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.19286

  17. arXiv:2406.18214  [pdf, other]

    cs.CV

    Trimming the Fat: Efficient Compression of 3D Gaussian Splats through Pruning

    Authors: Muhammad Salman Ali, Maryam Qamar, Sung-Ho Bae, Enzo Tartaglione

    Abstract: In recent times, the utilization of 3D models has gained traction, owing to the capacity for end-to-end training initially offered by Neural Radiance Fields and more recently by 3D Gaussian Splatting (3DGS) models. The latter holds a significant advantage by inherently easing rapid convergence during training and offering extensive editability. However, despite rapid advancements, the literature s…

    Submitted 29 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted at BMVC 2024

  18. arXiv:2406.17181  [pdf, other]

    cs.HC

    FacePsy: An Open-Source Affective Mobile Sensing System -- Analyzing Facial Behavior and Head Gesture for Depression Detection in Naturalistic Settings

    Authors: Rahul Islam, Sang Won Bae

    Abstract: Depression, a prevalent and complex mental health issue affecting millions worldwide, presents significant challenges for detection and monitoring. While facial expressions have shown promise in laboratory settings for identifying depression, their potential in real-world applications remains largely unexplored due to the difficulties in developing efficient mobile systems. In this study, we aim t…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted to ACM International Conference on Mobile Human-Computer Interaction (MobileHCI 2024)

  19. arXiv:2406.16341  [pdf, other]

    cs.CL

    EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

    Authors: Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi

    Abstract: Electronic Health Records (EHRs) are integral for storing comprehensive patient medical records, combining structured data (e.g., medications) with detailed clinical notes (e.g., physician notes). These elements are essential for straightforward data retrieval and provide deep, contextual insights into patient care. However, they often suffer from discrepancies due to unintuitive EHR system design…

    Submitted 24 June, 2024; originally announced June 2024.

  20. arXiv:2406.15942  [pdf, other]

    cs.HC

    Revolutionizing Mental Health Support: An Innovative Affective Mobile Framework for Dynamic, Proactive, and Context-Adaptive Conversational Agents

    Authors: Rahul Islam, Sang Won Bae

    Abstract: As we build towards developing interactive systems that can recognize human emotional states and respond to individual needs more intuitively and empathetically in more personalized and context-aware computing time. This is especially important regarding mental health support, with a rising need for immediate, non-intrusive help tailored to each individual. Individual mental health and the complex…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted to Ubicomp '23, GenAI4PC Symposium

  21. arXiv:2406.12254  [pdf, other]

    eess.IV cs.CV

    Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation

    Authors: Xin Yu, Qi Yang, Han Liu, Ho Hin Lee, Yucheng Tang, Lucas W. Remedios, Michael E. Kim, Rendong Zhang, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman

    Abstract: 2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, but these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmenta…

    Submitted 12 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  22. arXiv:2406.10996  [pdf, other]

    cs.CL

    THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation

    Authors: Seo Hyun Kim, Kai Tzu-iunn Ong, Taeyoon Kwon, Namyoung Kim, Keummin Ka, SeongHyeon Bae, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo

    Abstract: Large language models (LLMs) are capable of processing lengthy dialogue histories during prolonged interaction with users without additional memory modules; however, their responses tend to overlook or incorrectly recall information from the past. In this paper, we revisit memory-augmented response generation in the era of LLMs. While prior work focuses on getting rid of outdated memories, we argu…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Under Review

  23. arXiv:2406.02657  [pdf, other]

    cs.CL cs.AI cs.LG

    Block Transformer: Global-to-Local Language Modeling for Fast Inference

    Authors: Namgyu Ho, Sangmin Bae, Taehyeon Kim, Hyunjik Jo, Yireun Kim, Tal Schuster, Adam Fisch, James Thorne, Se-Young Yun

    Abstract: This paper presents the Block Transformer architecture which adopts hierarchical global-to-local modeling to autoregressive transformers to mitigate the inference bottlenecks of self-attention. To apply self-attention, the key-value (KV) cache of all previous sequences must be retrieved from memory at every decoding step. Thereby, this KV cache IO becomes a significant bottleneck in batch inferenc…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 30 pages, 21 figures, 5 tables

  24. arXiv:2405.13396  [pdf, other]

    cs.LG stat.ML

    Why In-Context Learning Transformers are Tabular Data Classifiers

    Authors: Felix den Breejen, Sangmin Bae, Stephen Cha, Se-Young Yun

    Abstract: The recently introduced TabPFN pretrains an In-Context Learning (ICL) transformer on synthetic data to perform tabular data classification. As synthetic data does not share features or labels with real-world data, the underlying mechanism that contributes to the success of this method remains unclear. This study provides an explanation by demonstrating that ICL-transformers acquire the ability to…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 9 pages main body, 22 pages total. Preprint under review

  25. arXiv:2405.09782  [pdf, other]

    cs.CV

    Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection

    Authors: Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Runmin Cong, Xiaochun Cao, Qingming Huang

    Abstract: This paper explores the size-invariance of evaluation metrics in Salient Object Detection (SOD), especially when multiple targets of diverse sizes co-exist in the same image. We observe that current metrics are size-sensitive, where larger objects are focused, and smaller ones tend to be ignored. We argue that the evaluation should be size-invariant because bias based on size is unjustified withou…

    Submitted 27 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by ICML2024

  26. arXiv:2405.09321  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    ReconBoost: Boosting Can Achieve Modality Reconcilement

    Authors: Cong Hua, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang

    Abstract: This paper explores a novel multi-modal alternating learning paradigm pursuing a reconciliation between the exploitation of uni-modal features and the exploration of cross-modal interactions. This is motivated by the fact that current paradigms of multi-modal learning tend to explore multi-modal features simultaneously. The resulting gradient prohibits further exploitation of the features in the w…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by ICML2024

  27. arXiv:2405.07780  [pdf, other]

    cs.LG cs.AI cs.CV

    Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition

    Authors: Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, Qingming Huang

    Abstract: This paper explores test-agnostic long-tail recognition, a challenging long-tail task where the test label distributions are unknown and arbitrarily imbalanced. We argue that the variation in these distributions can be broken down hierarchically into global and local levels. The global ones reflect a broad range of diversity, while the local ones typically arise from milder changes, often focused…

    Submitted 13 May, 2024; originally announced May 2024.

  28. arXiv:2405.06673  [pdf, other]

    cs.CL cs.AI

    Overview of the EHRSQL 2024 Shared Task on Reliable Text-to-SQL Modeling on Electronic Health Records

    Authors: Gyubok Lee, Sunjun Kweon, Seongsu Bae, Edward Choi

    Abstract: Electronic Health Records (EHRs) are relational databases that store the entire medical histories of patients within hospitals. They record numerous aspects of patients' medical care, from hospital admission and diagnosis to treatment and discharge. While EHRs are vital sources of clinical data, exploring them beyond a predefined set of queries requires skills in query languages like SQL. To make…

    Submitted 23 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: The 6th Clinical Natural Language Processing Workshop at NAACL 2024; Minor Change from Camera-Ready

  29. Field-of-View Extension for Brain Diffusion MRI via Deep Generative Models

    Authors: Chenyu Gao, Shunxing Bao, Michael Kim, Nancy Newlin, Praitayini Kanakaraj, Tianyuan Yao, Gaurav Rudravaram, Yuankai Huo, Daniel Moyer, Kurt Schilling, Walter Kukull, Arthur Toga, Derek Archer, Timothy Hohman, Bennett Landman, Zhiyuan Li

    Abstract: Purpose: In diffusion MRI (dMRI), the volumetric and bundle analyses of whole-brain tissue microstructure and connectivity can be severely impeded by an incomplete field-of-view (FOV). This work aims to develop a method for imputing the missing slices directly from existing dMRI scans with an incomplete FOV. We hypothesize that the imputed image with complete FOV can improve the whole-brain tracto…

    Submitted 28 August, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 20 pages, 11 figures

    Journal ref: Journal of Medical Imaging, Vol. 11, Issue 4, 044008 (August 2024)

  30. arXiv:2405.02996  [pdf, other]

    cs.SD cs.AI eess.AS

    RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification

    Authors: June-Woo Kim, Miika Toikkanen, Sangmin Bae, Minseok Kim, Ho-Young Jung

    Abstract: Recent advancements in AI have democratized its deployment as a healthcare assistant. While pretrained models from large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, which, as human-originated sounds, intuitively would share closer resemblance to lung sounds. This paper explores the efficacy of pretrain…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted EMBC 2024

  31. arXiv:2404.14590  [pdf, other]

    cs.HC

    PupilSense: Detection of Depressive Episodes Through Pupillary Response in the Wild

    Authors: Rahul Islam, Sang Won Bae

    Abstract: Early detection of depressive episodes is crucial in managing mental health disorders such as Major Depressive Disorder (MDD) and Bipolar Disorder. However, existing methods often necessitate active participation or are confined to clinical settings. Addressing this gap, we introduce PupilSense, a novel, deep learning-driven mobile system designed to discreetly track pupillary responses as users i…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 2024 International Conference on Activity and Behavior Computing

  32. arXiv:2404.14563  [pdf, other]

    cs.HC

    Exploring Algorithmic Explainability: Generating Explainable AI Insights for Personalized Clinical Decision Support Focused on Cannabis Intoxication in Young Adults

    Authors: Tongze Zhang, Tammy Chung, Anind Dey, Sang Won Bae

    Abstract: This study explores the possibility of facilitating algorithmic decision-making by combining interpretable artificial intelligence (XAI) techniques with sensor data, with the aim of providing researchers and clinicians with personalized analyses of cannabis intoxication behavior. SHAP analyzes the importance and quantifies the impact of specific factors such as environmental noise or heart rate, e…

    Submitted 29 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 2024 International Conference on Activity and Behavior Computing

  33. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  34. arXiv:2404.01746  [pdf, other]

    cs.RO cs.AI cs.LG

    Towards Scalable & Efficient Interaction-Aware Planning in Autonomous Vehicles using Knowledge Distillation

    Authors: Piyush Gupta, David Isele, Sangjae Bae

    Abstract: Real-world driving involves intricate interactions among vehicles navigating through dense traffic scenarios. Recent research focuses on enhancing the interaction awareness of autonomous vehicles to leverage these interactions in decision-making. These interaction-aware planners rely on neural-network-based prediction models to capture inter-vehicle interactions, aiming to integrate these predicti…

    Submitted 2 April, 2024; originally announced April 2024.

  35. arXiv:2404.00520  [pdf, other]

    cs.RO math.OC

    Competition-Aware Decision-Making Approach for Mobile Robots in Racing Scenarios

    Authors: Kyoungtae Ji, Sangjae Bae, Nan Li, Kyoungseok Han

    Abstract: This paper presents a game-theoretic strategy for racing, where the autonomous ego agent seeks to block a racing opponent that aims to overtake the ego agent. After a library of trajectory candidates and an associated reward matrix are constructed, the optimal trajectory in terms of maximizing the cumulative reward over the planning horizon is determined based on the level-K reasoning framework. I…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 7 pages, 8 figures

  36. arXiv:2404.00300  [pdf, other]

    cs.HC

    Enhancing Empathy in Virtual Reality: An Embodied Approach to Mindset Modulation

    Authors: Seoyeon Bae, Yoon Kyung Lee, Jungcheol Lee, Jaeheon Kim, Haeseong Jeon, Seung-Hwan Lim, Byung-Cheol Kim, Sowon Hahn

    Abstract: A growth mindset has shown promising outcomes for increasing empathy ability. However, stimulating a growth mindset in VR-based empathy interventions is under-explored. In the present study, we implemented prosocial VR content, Our Neighbor Hero, focusing on embodying a virtual character to modulate players' mindsets. The virtual body served as a stepping stone, enabling players to identify with t…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 9 pages, 2 figures, 1 table

  37. arXiv:2403.15511  [pdf, other]

    cs.LG cs.AI cs.CR

    Multiple-Input Auto-Encoder Guided Feature Selection for IoT Intrusion Detection Systems

    Authors: Phai Vu Dinh, Diep N. Nguyen, Dinh Thai Hoang, Quang Uy Nguyen, Eryk Dutkiewicz, Son Pham Bao

    Abstract: While intrusion detection systems (IDSs) benefit from the diversity and generalization of IoT data features, the data diversity (e.g., the heterogeneity and high dimensions of data) also makes it difficult to train effective machine learning models in IoT IDSs. This also leads to potentially redundant/noisy features that may decrease the accuracy of the detection engine in IDSs. This paper first i…

    Submitted 21 March, 2024; originally announced March 2024.

  38. arXiv:2403.14963  [pdf, other]

    cs.CR

    Enabling Physical Localization of Uncooperative Cellular Devices

    Authors: Taekkyung Oh, Sangwook Bae, Junho Ahn, Yonghwa Lee, Dinh-Tuan Hoang, Min Suk Kang, Nils Ole Tippenhauer, Yongdae Kim

    Abstract: In cellular networks, it can become necessary for authorities to physically locate user devices for tracking criminals or illegal devices. While cellular operators can provide authorities with cell information the device is camping on, fine-grained localization is still required. Therefore, the authorized agents trace the device by monitoring its uplink signals. However, tracking the uplink signal…

    Submitted 25 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  39. arXiv:2403.03230  [pdf, other]

    q-bio.NC cs.AI

    Large language models surpass human experts in predicting neuroscience results

    Authors: Xiaoliang Luo, Akilles Rechardt, Guangzhi Sun, Kevin K. Nejad, Felipe Yáñez, Bati Yilmaz, Kangjoo Lee, Alexandra O. Cohen, Valentina Borghesani, Anton Pashkov, Daniele Marinazzo, Jonathan Nicholas, Alessandro Salatiello, Ilia Sucholutsky, Pasquale Minervini, Sepehr Razavi, Roberta Rocca, Elkhan Yusifov, Tereza Okalova, Nianlong Gu, Martin Ferianc, Mikail Khona, Kaustubh R. Patil, Pui-Shee Lee, Rui Mata, et al. (14 additional authors not shown)

    Abstract: Scientific discoveries often hinge on synthesizing decades of research, a task that potentially outstrips human information processing capacities. Large language models (LLMs) offer a solution. LLMs trained on the vast scientific literature could potentially integrate noisy yet interrelated findings to forecast novel results better than human experts. To evaluate this possibility, we created Brain…

    Submitted 21 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  40. arXiv:2403.01898  [pdf, other]

    cs.CV eess.IV

    Revisiting Learning-based Video Motion Magnification for Real-time Processing

    Authors: Hyunwoo Ha, Oh Hyun-Bin, Kim Jun-Seong, Kwon Byung-Ki, Kim Sung-Bin, Linh-Tam Tran, Ji-Yun Kim, Sung-Ho Bae, Tae-Hyun Oh

    Abstract: Video motion magnification is a technique to capture and amplify subtle motion in a video that is invisible to the naked eye. The deep learning-based prior work successfully demonstrates the modelling of the motion magnification problem with outstanding quality compared to conventional signal processing-based ones. However, it still lags behind real-time performance, which prevents it from being e…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 19 pages

  41. Hysteresis Compensation of Flexible Continuum Manipulator using RGBD Sensing and Temporal Convolutional Network

    Authors: Junhyun Park, Seonghyeok Jang, Hyojae Park, Seongjun Bae, Minho Hwang

    Abstract: Flexible continuum manipulators are valued for minimally invasive surgery, offering access to confined spaces through nonlinear paths. However, cable-driven manipulators face control difficulties due to hysteresis from cabling effects such as friction, elongation, and coupling. These effects are difficult to model due to nonlinearity and the difficulties become even more evident when dealing with… ▽ More

    Submitted 3 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: 8 pages, 11 figures, 5 tables

    Journal ref: IEEE Robotics and Automation Letters, Volume 9, Issue 7, 6091 - 6098, 2024

  42. arXiv:2402.05350  [pdf, other]

    cs.CV eess.IV

    Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model

    Authors: Junghun Cha, Ali Haider, Seoyun Yang, Hoeyeong Jin, Subin Yang, A. F. M. Shahab Uddin, Jaehyoung Kim, Soo Ye Kim, Sung-Ho Bae

    Abstract: A significant volume of analog information, i.e., documents and images, has been digitized in the form of scanned copies for storing, sharing, and/or analyzing in the digital world. However, the quality of such contents is severely degraded by various distortions caused by printing, storing, and scanning processes in the physical world. Although restoring high-quality content from scanned copies… ▽ More

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted to AAAI 2024

  43. arXiv:2402.01575  [pdf, other]

    cs.RO

    Efficient and Interaction-Aware Trajectory Planning for Autonomous Vehicles with Particle Swarm Optimization

    Authors: Lin Song, David Isele, Naira Hovakimyan, Sangjae Bae

    Abstract: This paper introduces a novel numerical approach to achieving smooth lane-change trajectories in autonomous driving scenarios. Our trajectory generation approach leverages particle swarm optimization (PSO) techniques, incorporating Neural Network (NN) predictions for trajectory refinement. The generation of smooth and dynamically feasible trajectories for the lane change maneuver is facilitated by… ▽ More

    Submitted 2 February, 2024; originally announced February 2024.

  44. arXiv:2401.07464  [pdf, other]

    quant-ph cs.CR cs.LG

    Quantum Privacy Aggregation of Teacher Ensembles (QPATE) for Privacy-preserving Quantum Machine Learning

    Authors: William Watkins, Heehwan Wang, Sangyoon Bae, Huan-Hsin Tseng, Jiook Cha, Samuel Yen-Chi Chen, Shinjae Yoo

    Abstract: The utility of machine learning has rapidly expanded in the last two decades and presents an ethical challenge. Papernot et al. developed a technique known as Private Aggregation of Teacher Ensembles (PATE) to enable federated learning in which multiple teacher models are trained on disjoint datasets. This study is the first to apply PATE to an ensemble of quantum neural networks (QNN) to pave a… ▽ More

    Submitted 14 January, 2024; originally announced January 2024.

  45. arXiv:2401.06305  [pdf, other]

    cs.RO eess.SY

    Multi-Profile Quadratic Programming (MPQP) for Optimal Gap Selection and Speed Planning of Autonomous Driving

    Authors: Alexandre Miranda Anon, Sangjae Bae, Manish Saroya, David Isele

    Abstract: Smooth and safe speed planning is imperative for the successful deployment of autonomous vehicles. This paper presents a mathematical formulation for the optimal speed planning of autonomous driving, which has been validated in high-fidelity simulations and real-road demonstrations with practical constraints. The algorithm explores the inter-traffic gaps in the time and space domain using a breadt… ▽ More

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Submitted to ICRA 2024

  46. arXiv:2401.05602  [pdf]

    cs.CV

    Nucleus subtype classification using inter-modality learning

    Authors: Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Can Cui, Jia Li, Qi Liu, Ken S. Lau, Joseph T. Roland, Mary K. Washington, Lori A. Coburn, Keith T. Wilson, Yuankai Huo, Bennett A. Landman

    Abstract: Understanding the way cells communicate, co-locate, and interrelate is essential to understanding human physiology. Hematoxylin and eosin (H&E) staining is ubiquitously available both for clinical studies and research. The Colon Nucleus Identification and Classification (CoNIC) Challenge has recently innovated on robust artificial intelligence labeling of six cell types on H&E stains of the colon.… ▽ More

    Submitted 28 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

  47. arXiv:2401.03060  [pdf]

    eess.IV cs.CV

    Super-resolution multi-contrast unbiased eye atlases with deep probabilistic refinement

    Authors: Ho Hin Lee, Adam M. Saunders, Michael E. Kim, Samuel W. Remedios, Lucas W. Remedios, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Chloe Cho, Louise A. Mawn, Tonia S. Rex, Kevin L. Schey, Blake E. Dewey, Jeffrey M. Spraggins, Jerry L. Prince, Yuankai Huo, Bennett A. Landman

    Abstract: Purpose: Eye morphology varies significantly across the population, especially for the orbit and optic nerve. These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference. Approach: To tackle these limitations, we propose a process for creating high-resolution unbiased eye atlases. First, to restore spatial details… ▽ More

    Submitted 14 June, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: Revised for submission to SPIE Journal of Medical Imaging. 26 pages, 6 figures

  48. arXiv:2401.00661  [pdf, ps, other]

    eess.SY cs.GT

    Personalized Dynamic Pricing Policy for Electric Vehicles: Reinforcement learning approach

    Authors: Sangjun Bae, Balazs Kulcsar, Sebastien Gros

    Abstract: With the increasing number of fast-electric vehicle charging stations (fast-EVCSs) and the popularization of information technology, electricity price competition between fast-EVCSs is highly expected, in which the utilization of public and/or privacy-preserved information will play a crucial role. Self-interested electric vehicle (EV) users, on the other hand, try to select a fast-EVCS for charging… ▽ More

    Submitted 31 December, 2023; originally announced January 2024.

  49. arXiv:2312.09603  [pdf, other]

    cs.SD cs.LG eess.AS

    Stethoscope-guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory Sound Classification

    Authors: June-Woo Kim, Sangmin Bae, Won-Yang Cho, Byungjo Lee, Ho-Young Jung

    Abstract: Despite the remarkable advances in deep learning technology, achieving satisfactory performance in lung sound classification remains a challenge due to the scarcity of available data. Moreover, the respiratory sound samples are collected from a variety of electronic stethoscopes, which could potentially introduce biases into the trained models. When a significant distribution shift occurs within t… ▽ More

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: accepted to ICASSP 2024

  50. arXiv:2312.02490  [pdf, other]

    cs.LG cs.CR

    Constrained Twin Variational Auto-Encoder for Intrusion Detection in IoT Systems

    Authors: Phai Vu Dinh, Quang Uy Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Son Pham Bao, Eryk Dutkiewicz

    Abstract: Intrusion detection systems (IDSs) play a critical role in protecting billions of IoT devices from malicious attacks. However, the IDSs for IoT devices face inherent challenges of IoT systems, including the heterogeneity of IoT data/devices, the high dimensionality of training data, and the imbalanced data. Moreover, the deployment of IDSs on IoT systems is challenging, and sometimes impossible, d… ▽ More

    Submitted 4 December, 2023; originally announced December 2023.