Showing 1–50 of 1,268 results for author: Lee, K

Searching in archive cs.
  1. arXiv:2409.02883  [pdf]

    cs.CV cs.AI

    Multi-stream deep learning framework to predict mild cognitive impairment with Rey Complex Figure Test

    Authors: Junyoung Park, Eun Hyun Seo, Sunjun Kim, SangHak Yi, Kun Ho Lee, Sungho Won

    Abstract: Drawing tests like the Rey Complex Figure Test (RCFT) are widely used to assess cognitive functions such as visuospatial skills and memory, making them valuable tools for detecting mild cognitive impairment (MCI). Despite their utility, existing predictive models based on these tests often suffer from limitations like small sample sizes and lack of external validation, which undermine their reliab…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 20 pages, 3 figures, 2 tables

  2. arXiv:2409.02076  [pdf, other]

    cs.CL

    Spinning the Golden Thread: Benchmarking Long-Form Generation in Language Models

    Authors: Yuhao Wu, Ming Shan Hee, Zhiqing Hu, Roy Ka-Wei Lee

    Abstract: The abilities of long-context language models (LMs) are often evaluated using the "Needle-in-a-Haystack" (NIAH) test, which comprises tasks designed to assess a model's ability to identify specific information ("needle") within large text sequences ("haystack"). While these benchmarks measure how well models understand long-context input sequences, they do not effectively gauge the quality of long…

    Submitted 3 September, 2024; originally announced September 2024.

  3. arXiv:2409.01585  [pdf, other]

    cs.LG cs.DC

    Buffer-based Gradient Projection for Continual Federated Learning

    Authors: Shenghong Dai, Jy-yong Sohn, Yicong Chen, S M Iftekharul Alam, Ravikumar Balakrishnan, Suman Banerjee, Nageen Himayat, Kangwook Lee

    Abstract: Continual Federated Learning (CFL) is essential for enabling real-world applications where multiple decentralized clients adaptively learn from continuous data streams. A significant challenge in CFL is mitigating catastrophic forgetting, where models lose previously acquired knowledge when learning new information. Existing approaches often face difficulties due to the constraints of device stora…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: A preliminary version of this work was presented at the Federated Learning Systems (FLSys) Workshop @ Sixth Conference on Machine Learning and Systems, June 2023

  4. Data Collectives as a means to Improve Accountability, Combat Surveillance and Reduce Inequalities

    Authors: Jane Hsieh, Angie Zhang, Seyun Kim, Varun Nagaraj Rao, Samantha Dalal, Alexandra Mateescu, Rafael Do Nascimento Grohmann, Motahhare Eslami, Min Kyung Lee, Haiyi Zhu

    Abstract: Platform-based laborers face unprecedented challenges and working conditions that result from algorithmic opacity, insufficient data transparency, and unclear policies and regulations. The CSCW and HCI communities increasingly turn to worker data collectives as a means to advance related policy and regulation, hold platforms accountable for data transparency and disclosure, and empower the collect…

    Submitted 1 September, 2024; originally announced September 2024.

  5. arXiv:2408.17063  [pdf, ps, other]

    cs.CR

    SIMD-Aware Homomorphic Compression and Application to Private Database Query

    Authors: Jung Hee Cheon, Keewoo Lee, Jai Hyun Park, Yongdong Yeo

    Abstract: In a private database query scheme (PDQ), a server maintains a database, and users send queries to retrieve records of interest from the server while keeping their queries private. A crucial step in PDQ protocols based on homomorphic encryption is homomorphic compression, which compresses encrypted sparse vectors consisting of query results. In this work, we propose a new homomorphic compression s…

    Submitted 30 August, 2024; originally announced August 2024.

  6. arXiv:2408.14841  [pdf, other]

    cs.CV cs.AI

    Diffusion based Semantic Outlier Generation via Nuisance Awareness for Out-of-Distribution Detection

    Authors: Suhee Yoon, Sanghyu Yoon, Hankook Lee, Ye Seul Sim, Sungik Choi, Kyungeun Lee, Hye-Seung Cho, Woohyung Lim

    Abstract: Out-of-distribution (OOD) detection, which determines whether a given sample is part of the in-distribution (ID), has recently shown promising results through training with synthetic OOD datasets. Nonetheless, existing methods often produce outliers that are considerably distant from the ID, showing limited efficacy for capturing subtle distinctions between ID and OOD. To address these issues, we…

    Submitted 27 August, 2024; originally announced August 2024.

  7. arXiv:2408.13779  [pdf, other]

    cs.PL cs.DC

    Concurrent Data Structures Made Easy (Extended Version)

    Authors: Callista Le, Kiran Gopinathan, Koon Wen Lee, Seth Gilbert, Ilya Sergey

    Abstract: Design of an efficient thread-safe concurrent data structure is a balancing act between its implementation complexity and performance. Lock-based concurrent data structures, which are relatively easy to derive from their sequential counterparts and to prove thread-safe, suffer from poor throughput under even light multi-threaded workload. At the same time, lock-free concurrent structures allow for…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: Extended version of the OOPSLA'24 paper

  8. arXiv:2408.13377  [pdf, other]

    cs.RO

    Safe Bubble Cover for Motion Planning on Distance Fields

    Authors: Ki Myung Brian Lee, Zhirui Dai, Cedric Le Gentil, Lan Wu, Nikolay Atanasov, Teresa Vidal-Calleja

    Abstract: We consider the problem of planning collision-free trajectories on distance fields. Our key observation is that querying a distance field at one configuration reveals a region of safe space whose radius is given by the distance value, obviating the need for additional collision checking within the safe region. We refer to such regions as safe bubbles, and show that safe bubbles can be obtained fro…

    Submitted 23 August, 2024; originally announced August 2024.

    Comments: 16 pages, 11 figures. Submitted to International Symposium on Robotics Research 2024

  9. arXiv:2408.11180  [pdf, other]

    math.AT cs.CG

    Any Graph is a Mapper Graph

    Authors: Enrique G Alvarado, Robin Belton, Kang-Ju Lee, Sourabh Palande, Sarah Percival, Emilie Purvine, Sarah Tymochko

    Abstract: The Mapper algorithm is a popular tool for visualization and data exploration in topological data analysis. We investigate an inverse problem for the Mapper algorithm: Given a dataset $X$ and a graph $G$, does there exist a set of Mapper parameters such that the output Mapper graph of $X$ is isomorphic to $G$? We provide constructions that affirmatively answer this question. Our results demonstrat…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 13 pages, 4 figures

  10. arXiv:2408.10937  [pdf, other]

    cs.HC

    Proxona: Leveraging LLM-Driven Personas to Enhance Creators' Understanding of Their Audience

    Authors: Yoonseo Choi, Eun Jeong Kang, Seulgi Choi, Min Kyung Lee, Juho Kim

    Abstract: Creators are nothing without their audience, and thereby understanding their audience is the cornerstone of their professional achievement. Yet many creators feel lost while comprehending audiences with existing tools, which offer insufficient insights for tailoring content to audience needs. To address the challenges creators face in understanding their audience, we present Proxona, a system for…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 32 pages (including 14 pages of Appendix)

  11. arXiv:2408.09802  [pdf, other]

    cs.SD cs.CV eess.AS

    Hear Your Face: Face-based voice conversion with F0 estimation

    Authors: Jaejun Lee, Yoori Oh, Injune Hwang, Kyogu Lee

    Abstract: This paper delves into the emerging field of face-based voice conversion, leveraging the unique relationship between an individual's facial features and their vocal characteristics. We present a novel face-based voice conversion framework that particularly utilizes the average fundamental frequency of the target speaker, derived solely from their facial images. Through extensive analysis, our fram…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Interspeech 2024

  12. arXiv:2408.09446  [pdf, other]

    cs.LG math.NA physics.comp-ph

    Parameterized Physics-informed Neural Networks for Parameterized PDEs

    Authors: Woojin Cho, Minju Jo, Haksoo Lim, Kookjin Lee, Dongeun Lee, Sanghyun Hong, Noseong Park

    Abstract: Complex physical systems are often described by partial differential equations (PDEs) that depend on parameters such as the Reynolds number in fluid mechanics. In applications such as design optimization or uncertainty quantification, solutions of those PDEs need to be evaluated at numerous points in the parameter space. While physics-informed neural networks (PINNs) have emerged as a new strong c…

    Submitted 18 August, 2024; originally announced August 2024.

  13. arXiv:2408.09300  [pdf, other]

    eess.AS cs.CR cs.LG cs.SD

    Malacopula: adversarial automatic speaker verification attacks using a neural-based generalised Hammerstein model

    Authors: Massimiliano Todisco, Michele Panariello, Xin Wang, Héctor Delgado, Kong Aik Lee, Nicholas Evans

    Abstract: We present Malacopula, a neural-based generalised Hammerstein model designed to introduce adversarial perturbations to spoofed speech utterances so that they better deceive automatic speaker verification (ASV) systems. Using non-linear processes to modify speech utterances, Malacopula enhances the effectiveness of spoofing attacks. The model comprises parallel branches of polynomial functions foll…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: Accepted at ASVspoof Workshop 2024

  14. arXiv:2408.08739  [pdf, other]

    eess.AS cs.AI cs.SD

    ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale

    Authors: Xin Wang, Hector Delgado, Hemlata Tak, Jee-weon Jung, Hye-jin Shim, Massimiliano Todisco, Ivan Kukanov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen, Nicholas Evans, Kong Aik Lee, Junichi Yamagishi

    Abstract: ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech spoofing and deepfake attacks, and the design of detection solutions. Compared to previous challenges, the ASVspoof 5 database is built from crowdsourced data collected from a vastly greater number of speakers in diverse acoustic conditions. Attacks, also crowdsourced, are generated and tested using surrogat…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 8 pages, ASVspoof 5 Workshop (Interspeech2024 Satellite)

  15. arXiv:2408.08616  [pdf, other]

    eess.IV cs.CV

    Reference-free Axial Super-resolution of 3D Microscopy Images using Implicit Neural Representation with a 2D Diffusion Prior

    Authors: Kyungryun Lee, Won-Ki Jeong

    Abstract: Analysis and visualization of 3D microscopy images pose challenges due to anisotropic axial resolution, demanding volumetric super-resolution along the axial direction. While training a learning-based 3D super-resolution model seems to be a straightforward solution, it requires ground truth isotropic volumes and suffers from the curse of dimensionality. Therefore, existing methods utilize 2D neura…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: MICCAI2024 accepted

  16. arXiv:2408.08461  [pdf, other]

    cs.CV

    TEXTOC: Text-driven Object-Centric Style Transfer

    Authors: Jihun Park, Jongmin Gim, Kyoungmin Lee, Seunghun Lee, Sunghoon Im

    Abstract: We present Text-driven Object-Centric Style Transfer (TEXTOC), a novel method that guides style transfer at an object-centric level using textual inputs. The core of TEXTOC is our Patch-wise Co-Directional (PCD) loss, meticulously designed for precise object-centric transformations that are closely aligned with the input text. This loss combines a patch directional loss for text-guided style direc…

    Submitted 22 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

    Comments: 18 pages, 16 figures

  17. arXiv:2408.07900  [pdf, other]

    cs.SI physics.soc-ph

    Network analysis reveals news press landscape and asymmetric user polarization

    Authors: Byunghwee Lee, Hyo-sun Ryu, Jae Kook Lee, Hawoong Jeong, Beom Jun Kim

    Abstract: Unlike traditional media, online news platforms allow users to consume content that suits their tastes and to facilitate interactions with other people. However, as more personalized consumption of information and interaction with like-minded users increase, ideological bias can inadvertently increase and contribute to the formation of echo chambers, reinforcing the polarization of opinions. Altho…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 21 pages, 6 figures

  18. arXiv:2408.07327  [pdf, other]

    cs.LG cs.AI

    An Offline Meta Black-box Optimization Framework for Adaptive Design of Urban Traffic Light Management Systems

    Authors: Taeyoung Yun, Kanghoon Lee, Sujin Yun, Ilmyung Kim, Won-Woo Jung, Min-Cheol Kwon, Kyujin Choi, Yoohyeon Lee, Jinkyoo Park

    Abstract: Complex urban road networks with high vehicle occupancy frequently face severe traffic congestion. Designing an effective strategy for managing multiple traffic lights plays a crucial role in managing congestion. However, most current traffic light management systems rely on human-crafted decisions, which may not adapt well to diverse traffic patterns. In this paper, we delve into two pivotal desi…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 12 pages, 7 figures, 10 tables

  19. arXiv:2408.05940  [pdf, other]

    cs.CV cs.AI cs.RO

    Spb3DTracker: A Robust LiDAR-Based Person Tracker for Noisy Environment

    Authors: Eunsoo Im, Changhyun Jee, Jung Kwon Lee

    Abstract: Person detection and tracking (PDT) has seen significant advancements with 2D camera-based systems in the autonomous vehicle field, leading to widespread adoption of these algorithms. However, growing privacy concerns have recently emerged as a major issue, prompting a shift towards LiDAR-based PDT as a viable alternative. Within this domain, "Tracking-by-Detection" (TBD) has become a prominent me…

    Submitted 13 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

    Comments: 17 pages, 5 figures

  20. arXiv:2408.04744  [pdf, other]

    cond-mat.dis-nn cs.DC cs.ET

    Noise-augmented Chaotic Ising Machines for Combinatorial Optimization and Sampling

    Authors: Kyle Lee, Shuvro Chowdhury, Kerem Y. Camsari

    Abstract: The rise of domain-specific computing has led to great interest in Ising machines, dedicated hardware accelerators tailored to solve combinatorial optimization and probabilistic sampling problems. A key element of Ising machines is stochasticity, which enables a wide exploration of configurations, thereby helping avoid local minima. Here, we evaluate and improve the previously proposed concept of…

    Submitted 8 August, 2024; originally announced August 2024.

  21. arXiv:2408.03541  [pdf, ps, other]

    cs.CL cs.AI

    EXAONE 3.0 7.8B Instruction Tuned Language Model

    Authors: LG AI Research: Soyoung An, Kyunghoon Bae, Eunbi Choi, Stanley Jungkyu Choi, Yemuk Choi, Seokhee Hong, Yeonjung Hong, Junwon Hwang, Hyojin Jeon, Gerrard Jeongwon Jo, Hyunjik Jo, Jiyeon Jung, Yountae Jung, Euisoon Kim, Hyosang Kim, Joonkee Kim, Seonghwan Kim, Soyeon Kim, Sunkyoung Kim, Yireun Kim, Youchul Kim, Edward Hwayoung Lee, Haeju Lee, et al. (14 additional authors not shown)

    Abstract: We introduce EXAONE 3.0 instruction-tuned language model, the first open model in the family of Large Language Models (LLMs) developed by LG AI Research. Among different model sizes, we publicly release the 7.8B instruction-tuned model to promote open research and innovations. Through extensive evaluations across a wide range of public and in-house benchmarks, EXAONE 3.0 demonstrates highly compet…

    Submitted 13 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

  22. arXiv:2408.03468  [pdf, other]

    cs.MM cs.AI cs.CV

    MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube and Bilibili

    Authors: Han Wang, Tan Rui Yang, Usman Naseem, Roy Ka-Wei Lee

    Abstract: Hate speech is a pressing issue in modern society, with significant effects both online and offline. Recent research in hate speech detection has primarily centered on text-based media, largely overlooking multimodal content such as videos. Existing studies on hateful video datasets have predominantly focused on English content within a Western context and have been limited to binary labels (hatef…

    Submitted 12 August, 2024; v1 submitted 28 July, 2024; originally announced August 2024.

    Comments: 10 pages, 3 figures, ACM Multimedia 2024

    ACM Class: I.2.0

  23. arXiv:2408.03204  [pdf, other]

    cs.SD eess.AS

    GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch

    Authors: Sungho Lee, Marco Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Giorgio Fabbro, Kyogu Lee, Yuki Mitsufuji

    Abstract: We present GRAFX, an open-source library designed for handling audio processing graphs in PyTorch. Along with various library functionalities, we describe technical details on the efficient parallel computation of input graphs, signals, and processor parameters in GPU. Then, we show its example use under a music mixing scenario, where parameters of every differentiable processor in a large graph a…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: Accepted to DAFx 2024 demo

  24. arXiv:2408.00359  [pdf, other]

    cs.LG stat.ML

    Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks

    Authors: Jy-yong Sohn, Dohyun Kwon, Seoyeon An, Kangwook Lee

    Abstract: Fine-tuning large pre-trained models is a common practice in machine learning applications, yet its mathematical analysis remains largely unexplored. In this paper, we study fine-tuning through the lens of memorization capacity. Our new measure, the Fine-Tuning Capacity (FTC), is defined as the maximum number of samples a neural network can fine-tune, or equivalently, as the minimum number of neur…

    Submitted 19 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

    Comments: 10 pages, 9 figures, UAI 2024

  25. arXiv:2408.00156  [pdf, other]

    cs.CY

    Measuring Falseness in News Articles based on Concealment and Overstatement

    Authors: Jiyoung Lee, Keeheon Lee

    Abstract: This research investigates the extent of misinformation in certain journalistic articles by introducing a novel measurement tool to assess the degrees of falsity. It aims to measure misinformation using two metrics (concealment and overstatement) to explore how information is interpreted as false. This should help examine how articles containing partly true and partly false information can potenti…

    Submitted 31 July, 2024; originally announced August 2024.

  26. arXiv:2407.21635  [pdf, other]

    cs.LG

    MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction

    Authors: Seongju Lee, Junseok Lee, Yeonguk Yu, Taeri Kim, Kyoobin Lee

    Abstract: Multi-agent trajectory prediction is crucial to autonomous driving and understanding the surrounding environment. Learning-based approaches for multi-agent trajectory prediction, such as primarily relying on graph neural networks, graph transformers, and hypergraph neural networks, have demonstrated outstanding performance on real-world datasets in recent years. However, the hypergraph transformer…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: 19 pages, 12 figures, 7 tables, 8 pages of supplementary material. Paper accepted at ECCV 2024

  27. arXiv:2407.21260  [pdf, other]

    cs.LG cs.AI stat.ML

    Tractable and Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation

    Authors: Taehyun Cho, Seungyub Han, Kyungjae Lee, Seokhun Ju, Dohyeong Kim, Jungwoo Lee

    Abstract: Distributional reinforcement learning improves performance by effectively capturing environmental stochasticity, but a comprehensive theoretical understanding of its effectiveness remains elusive. In this paper, we present a regret analysis for distributional reinforcement learning with general value function approximation in a finite episodic Markov decision process setting. We first introduce a…

    Submitted 30 July, 2024; originally announced July 2024.

  28. arXiv:2407.19900  [pdf, other]

    cs.SD cs.AI eess.AS

    Practical and Reproducible Symbolic Music Generation by Large Language Models with Structural Embeddings

    Authors: Seungyeon Rhyu, Kichang Yang, Sungjun Cho, Jaehyeon Kim, Kyogu Lee, Moontae Lee

    Abstract: Music generation introduces challenging complexities to large language models. Symbolic structures of music often include vertical harmonization as well as horizontal counterpoint, urging various adaptations and enhancements for large-scale Transformers. However, existing works share three major drawbacks: 1) their tokenization requires domain-specific annotations, such as bars and beats, that are…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: 9 pages, 6 figures, 4 tables

  29. arXiv:2407.19871  [pdf, ps, other]

    cs.CR cs.NI

    Fast Private Location-based Information Retrieval Over the Torus

    Authors: Joon Soo Yoo, Mi Yeon Hong, Ji Won Heo, Kang Hoon Lee, Ji Won Yoon

    Abstract: Location-based services offer immense utility, but also pose significant privacy risks. In response, we propose LocPIR, a novel framework using homomorphic encryption (HE), specifically the TFHE scheme, to preserve user location privacy when retrieving data from public clouds. Our system employs TFHE's expertise in non-polynomial evaluations, crucial for comparison operations. LocPIR showcases min…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted at the IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS) 2024

  30. arXiv:2407.19862  [pdf, other]

    cs.SD eess.AS

    Wavespace: A Highly Explorable Wavetable Generator

    Authors: Hazounne Lee, Kihong Kim, Sungho Lee, Kyogu Lee

    Abstract: Wavetable synthesis generates quasi-periodic waveforms of musical tones by interpolating a list of waveforms called wavetable. As generative models that utilize latent representations offer various methods in waveform generation for musical applications, studies in wavetable generation with invertible architecture have also arisen recently. While they are promising, it is still challenging to gene…

    Submitted 29 July, 2024; originally announced July 2024.

  31. AccessShare: Co-designing Data Access and Sharing with Blind People

    Authors: Rie Kamikubo, Farnaz Zamiri Zeraati, Kyungjun Lee, Hernisa Kacorri

    Abstract: Blind people are often called to contribute image data to datasets for AI innovation with the hope for future accessibility and inclusion. Yet, the visual inspection of the contributed images is inaccessible. To this day, we lack mechanisms for data inspection and control that are accessible to the blind community. To address this gap, we engage 10 blind participants in a scenario where they wear…

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: Preprint, The 26th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2024)

  32. arXiv:2407.19092  [pdf, other]

    cs.LG stat.ME stat.ML

    Boosted generalized normal distributions: Integrating machine learning with operations knowledge

    Authors: Ragip Gurlek, Francis de Vericourt, Donald K. K. Lee

    Abstract: Applications of machine learning (ML) techniques to operational settings often face two challenges: i) ML methods mostly provide point predictions whereas many operational problems require distributional information; and ii) They typically do not incorporate the extensive body of knowledge in the operations literature, particularly the theoretical and empirical findings that characterize specific…

    Submitted 1 August, 2024; v1 submitted 26 July, 2024; originally announced July 2024.

    Comments: 28 pages, 3 figures

    MSC Class: 60E05; 62G07; 62F99; 68T01; 90B22; 90B50

  33. arXiv:2407.17909  [pdf, other]

    cs.CV cs.LG

    Separating Novel Features for Logical Anomaly Detection: A Straightforward yet Effective Approach

    Authors: Kangil Lee, Geonuk Kim

    Abstract: Vision-based inspection algorithms have significantly contributed to quality control in industrial settings, particularly in addressing structural defects like dent and contamination which are prevalent in mass production. Extensive research efforts have led to the development of related benchmarks such as MVTec AD (Bergmann et al., 2019). However, in industrial settings, there can be instances of…

    Submitted 25 July, 2024; originally announced July 2024.

  34. arXiv:2407.17688  [pdf, other]

    cs.CL cs.AI

    Examining the Influence of Political Bias on Large Language Model Performance in Stance Classification

    Authors: Lynnette Hui Xian Ng, Iain Cruickshank, Roy Ka-Wei Lee

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in executing tasks based on natural language queries. However, these models, trained on curated datasets, inherently embody biases ranging from racial to national and gender biases. It remains uncertain whether these biases impact the performance of LLMs for certain tasks. In this study, we investigate the political biases of L…

    Submitted 26 July, 2024; v1 submitted 24 July, 2024; originally announced July 2024.

    Comments: Accepted at ICWSM 2025

  35. arXiv:2407.17423  [pdf, ps, other]

    cs.CV

    On selection of centroids of fuzzy clusters for color classification

    Authors: Dae-Won Kim, Kwang H. Lee

    Abstract: A novel initialization method in the fuzzy c-means (FCM) algorithm is proposed for the color clustering problem. Given a set of color points, the proposed initialization extracts dominant colors that are the most vivid and distinguishable colors. Color points closest to the dominant colors are selected as initial centroids in the FCM. To obtain the dominant colors and their closest color points, w…

    Submitted 9 July, 2024; originally announced July 2024.

  36. arXiv:2407.16822  [pdf, other]

    cs.CV cs.AI

    AI-Enhanced 7-Point Checklist for Melanoma Detection Using Clinical Knowledge Graphs and Data-Driven Quantification

    Authors: Yuheng Wang, Tianze Yu, Jiayue Cai, Sunil Kalia, Harvey Lui, Z. Jane Wang, Tim K. Lee

    Abstract: The 7-point checklist (7PCL) is widely used in dermoscopy to identify malignant melanoma lesions needing urgent medical attention. It assigns point values to seven attributes: major attributes are worth two points each, and minor ones are worth one point each. A total score of three or higher prompts further evaluation, often including a biopsy. However, a significant limitation of current methods…

    Submitted 23 July, 2024; originally announced July 2024.

  37. arXiv:2407.16329  [pdf, other]

    cs.HC cs.AI

    PhenoFlow: A Human-LLM Driven Visual Analytics System for Exploring Large and Complex Stroke Datasets

    Authors: Jaeyoung Kim, Sihyeon Lee, Hyeon Jeon, Keon-Joo Lee, Hee-Joon Bae, Bohyoung Kim, Jinwook Seo

    Abstract: Acute stroke demands prompt diagnosis and treatment to achieve optimal patient outcomes. However, the intricate and irregular nature of clinical data associated with acute stroke, particularly blood pressure (BP) measurements, presents substantial obstacles to effective visual analytics and decision-making. Through a year-long collaboration with experienced neurologists, we developed PhenoFlow, a…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 11 pages, 5 figures, paper to appear in IEEE Transactions on Visualization and Computer Graphics (TVCG) (Proc. IEEE VIS 2024)

  38. arXiv:2407.15779  [pdf, other]

    cs.HC

    Analyzing the Impact of the Automatic Ball-Strike System in Professional Baseball: A Case Study on KBO League Data

    Authors: Kichang Lee, Kyungsik Han, JeongGil Ko

    Abstract: Recent advancements in professional baseball have led to the introduction of the Automated Ball-Strike (ABS) system, or "robot umpires," which utilize machine learning, computer vision, and precise tracking technologies to automate ball-strike calls. The Korean Baseball Organization (KBO) league became the first professional baseball league to implement ABS during the 2024 season. This study ana…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 12 pages, 14 figures

    MSC Class: 68U99

    ACM Class: J.4

  39. arXiv:2407.15188  [pdf, other]

    eess.AS cs.SD

    Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

    Authors: Shuai Wang, Zhengyang Chen, Kong Aik Lee, Yanmin Qian, Haizhou Li

    Abstract: Speaker individuality information is among the most critical elements within speech signals. By thoroughly and accurately modeling this information, it can be utilized in various intelligent speech applications, such as speaker recognition, speaker diarization, speech synthesis, and target speaker extraction. In this article, we aim to present, from a unique perspective, the developmental history,…

    Submitted 21 July, 2024; originally announced July 2024.

  40. arXiv:2407.14502  [pdf, other]

    cs.CV

    M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models

    Authors: Seunggeun Chi, Hyung-gun Chi, Hengbo Ma, Nakul Agarwal, Faizan Siddiqui, Karthik Ramani, Kwonjoon Lee

    Abstract: We introduce the Multi-Motion Discrete Diffusion Models (M2D2M), a novel approach for human motion generation from textual descriptions of multiple actions, utilizing the strengths of discrete diffusion models. This approach adeptly addresses the challenge of generating multi-motion sequences, ensuring seamless transitions of motions and coherence across a series of actions. The strength of M2D2M…

    Submitted 19 July, 2024; originally announced July 2024.

  41. arXiv:2407.13242  [pdf, other]

    eess.AS cs.SD

    Fade-in Reverberation in Multi-room Environments Using the Common-Slope Model

    Authors: Kyung Yun Lee, Nils Meyer-Kahlen, Georg Götz, U. Peter Svensson, Sebastian J. Schlecht, Vesa Välimäki

    Abstract: In multi-room environments, modelling the sound propagation is complex due to the coupling of rooms and diverse source-receiver positions. A common scenario is when the source and the receiver are in different rooms without a clear line of sight. For such source-receiver configurations, an initial increase in energy is observed, referred to as the "fade-in" of reverberation. Based on recent work o…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 2024 AES 5th International Conference on Audio for Virtual and Augmented Reality

  42. arXiv:2407.13218  [pdf, other]

    cs.LG cs.AI

    LiNR: Model Based Neural Retrieval on GPUs at LinkedIn

    Authors: Fedor Borisyuk, Qingquan Song, Mingzhou Zhou, Ganesh Parameswaran, Madhu Arun, Siva Popuri, Tugrul Bingol, Zhuotao Pei, Kuang-Hsuan Lee, Lu Zheng, Qizhan Shao, Ali Naqvi, Sen Zhou, Aman Gupta

    Abstract: This paper introduces LiNR, LinkedIn's large-scale, GPU-based retrieval system. LiNR supports a billion-sized index on GPU models. We discuss our experiences and challenges in creating scalable, differentiable search indexes using TensorFlow and PyTorch at production scale. In LiNR, both items and model weights are integrated into the model binary. Viewing index construction as a form of model tra…

    Submitted 7 August, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  43. arXiv:2407.13146  [pdf, other]

    cs.LG cs.AI

    PG-Rainbow: Using Distributional Reinforcement Learning in Policy Gradient Methods

    Authors: WooJae Jeon, KangJun Lee, Jeewoo Lee

    Abstract: This paper introduces PG-Rainbow, a novel algorithm that incorporates a distributional reinforcement learning framework with a policy gradient algorithm. Existing policy gradient methods are sample inefficient and rely on the mean of returns when calculating the state-action value function, neglecting the distributional nature of returns in reinforcement learning tasks. To address this issue, we u…

    Submitted 18 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  44. arXiv:2407.12882  [pdf, other]

    cs.CL cs.AI cs.LG

    InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification

    Authors: Yujia Hu, Zhiqiang Hu, Chun-Wei Seah, Roy Ka-Wei Lee

    Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency in a wide range of NLP tasks. However, when it comes to authorship verification (AV) tasks, which involve determining whether two given texts share the same authorship, even advanced models like ChatGPT exhibit notable limitations. This paper introduces a novel approach, termed InstructAV, for authorship verification. This appro…

    Submitted 16 July, 2024; originally announced July 2024.

  45. arXiv:2407.12329  [pdf, other]

    cs.CV

    Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views

    Authors: Jihoon Cho, Suhyun Ahn, Beomju Kim, Hyungjoon Bae, Xiaofeng Liu, Fangxu Xing, Kyungeun Lee, Georges Elfakhri, Van Wedeen, Jonghye Woo, Jinah Park

    Abstract: Deep learning-based segmentation techniques have shown remarkable performance in brain segmentation, yet their success hinges on the availability of extensive labeled training data. Acquiring such vast datasets, however, poses a significant challenge in many clinical applications. To address this issue, in this work, we propose a novel 3D brain segmentation approach using complementary 2D diffusio…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Extended version of "3D Segmentation of Subcortical Brain Structure with Few Labeled Data using 2D Diffusion Models" (ISMRM 2024 oral)

  46. arXiv:2407.10542  [pdf, other]

    cs.CV cs.AI

    3D Geometric Shape Assembly via Efficient Point Cloud Matching

    Authors: Nahyuk Lee, Juhong Min, Junha Lee, Seungwook Kim, Kanghee Lee, Jaesik Park, Minsu Cho

    Abstract: Learning to assemble geometric shapes into a larger target structure is a pivotal task in various practical applications. In this work, we tackle this problem by establishing local correspondences between point clouds of part shapes in both coarse- and fine-levels. To this end, we introduce Proxy Match Transform (PMT), an approximate high-order feature transform layer that enables reliable matchin…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted to ICML 2024

  47. arXiv:2407.10385  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    By My Eyes: Grounding Multimodal Large Language Models with Sensor Data via Visual Prompting

    Authors: Hyungjun Yoon, Biniyam Aschalew Tolera, Taesik Gong, Kimin Lee, Sung-Ju Lee

    Abstract: Large language models (LLMs) have demonstrated exceptional abilities across various domains. However, utilizing LLMs for ubiquitous sensing applications remains challenging as existing text-prompt methods show significant performance degradation when handling long sensor data sequences. We propose a visual prompting approach for sensor data using multimodal LLMs (MLLMs). We design a visual prompt…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 21 pages, 16 figures

  48. arXiv:2407.10299  [pdf, other]

    cs.CV

    Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models

    Authors: Yuchen Yang, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao, Shao-Yuan Lo

    Abstract: Video Anomaly Detection (VAD) is crucial for applications such as security surveillance and autonomous driving. However, existing VAD methods provide little rationale behind detection, hindering public trust in real-world deployments. In this paper, we approach VAD with a reasoning framework. Although Large Language Models (LLMs) have shown revolutionary reasoning ability, we find that their direc…

    Submitted 20 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

    Comments: Accepted at European Conference on Computer Vision (ECCV) 2024

  49. arXiv:2407.06782  [pdf, ps, other]

    cs.CV

    Fuzzy color model and clustering algorithm for color clustering problem

    Authors: Dae-Won Kim, Kwang H. Lee

    Abstract: The research interest of this paper is focused on the efficient clustering task for arbitrary color data. In order to tackle this problem, we have tried to model the inherent uncertainty and vagueness of color data using a fuzzy color model. By taking a fuzzy approach to color modeling, we could make a soft decision for the vague regions between neighboring colors. The proposed fuzzy color model de…

    Submitted 9 July, 2024; originally announced July 2024.

  50. arXiv:2407.06774  [pdf, ps, other]

    cs.AI

    A new validity measure for fuzzy c-means clustering

    Authors: Dae-Won Kim, Kwang H. Lee

    Abstract: A new cluster validity index is proposed for fuzzy clusters obtained from the fuzzy c-means algorithm. The proposed validity index exploits inter-cluster proximity between fuzzy clusters. Inter-cluster proximity is used to measure the degree of overlap between clusters. A low proximity value refers to well-partitioned clusters. The best fuzzy c-partition is obtained by minimizing inter-cluster proximi…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted at FIP-2002