Showing 1–50 of 1,675 results for author: Nguyen, T

Searching in archive cs.
  1. arXiv:2407.04489  [pdf, other]

    cs.CV

    Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model

    Authors: Duy M. H. Nguyen, An T. Le, Trung Q. Nguyen, Nghiem T. Diep, Tai Nguyen, Duy Duong-Tran, Jan Peters, Li Shen, Mathias Niepert, Daniel Sonntag

    Abstract: Prompt learning methods are gaining increasing attention due to their ability to customize large vision-language models to new domains using pre-trained contextual knowledge and minimal training data. However, existing works typically rely on optimizing unified prompt inputs, often struggling with fine-grained classification tasks due to insufficient discriminative attributes. To tackle this, we c…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Version 1

  2. arXiv:2407.03788  [pdf, other]

    cs.CV cs.CL

    Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning

    Authors: Thong Nguyen, Yi Bin, Xiaobao Wu, Xinshuai Dong, Zhiyuan Hu, Khoi Le, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Data quality stands at the forefront of deciding the effectiveness of video-language representation learning. However, video-text pairs in previous data typically do not align perfectly with each other, which might lead to video-language representations that do not accurately reflect cross-modal semantics. Moreover, previous data also possess an uneven distribution of concepts, thereby hampering t…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  3. arXiv:2407.03665  [pdf, other]

    cs.IR cs.AI cs.LG cs.SI stat.ML

    Heterogeneous Hypergraph Embedding for Recommendation Systems

    Authors: Darnbi Sakong, Viet Hung Vu, Thanh Trung Huynh, Phi Le Nguyen, Hongzhi Yin, Quoc Viet Hung Nguyen, Thanh Tam Nguyen

    Abstract: Recent advancements in recommender systems have focused on integrating knowledge graphs (KGs) to leverage their auxiliary information. The core idea of KG-enhanced recommenders is to incorporate rich semantic information for more accurate recommendations. However, two main challenges persist: i) Neglecting complex higher-order interactions in the KG-based user-item network, potentially leading to…

    Submitted 4 July, 2024; originally announced July 2024.

  4. arXiv:2407.03611  [pdf, other]

    cs.SE cs.AI

    An Empirical Study on Capability of Large Language Models in Understanding Code Semantics

    Authors: Thu-Trang Nguyen, Thanh Trong Vu, Hieu Dinh Vo, Son Nguyen

    Abstract: Large Language Models for Code (code LLMs) have demonstrated remarkable performance across various software engineering (SE) tasks, increasing the application of code LLMs in software development. Despite the success of code LLMs, there remain significant concerns about the actual capabilities and reliability of these models, "whether these models really learn the semantics of code from the traini…

    Submitted 3 July, 2024; originally announced July 2024.

  5. arXiv:2407.03144  [pdf, other]

    cs.CV

    Venomancer: Towards Imperceptible and Target-on-Demand Backdoor Attacks in Federated Learning

    Authors: Son Nguyen, Thinh Nguyen, Khoa Doan, Kok-Seng Wong

    Abstract: Federated Learning (FL) is a distributed machine learning approach that maintains data privacy by training on decentralized data sources. Similar to centralized machine learning, FL is also susceptible to backdoor attacks. Most backdoor attacks in FL assume a predefined target class and require control over a large number of clients or knowledge of benign clients' information. Furthermore, they ar…

    Submitted 3 July, 2024; originally announced July 2024.

  6. arXiv:2407.03110  [pdf, other]

    cs.SD cs.AI eess.AS

    A Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach (A use case of riot or violent context detection)

    Authors: Lam Pham, Phat Lam, Tin Nguyen, Hieu Tang, Alexander Schindler

    Abstract: In this paper, we present a toolchain for a comprehensive audio/video analysis by leveraging deep learning based multimodal approach. To this end, different specific tasks of Speech to Text (S2T), Acoustic Scene Classification (ASC), Acoustic Event Detection (AED), Visual Object Detection (VOD), Image Captioning (IC), and Video Captioning (VC) are conducted and integrated into the toolchain. By co…

    Submitted 2 May, 2024; originally announced July 2024.

  7. arXiv:2407.02828  [pdf]

    cs.ET quant-ph

    Quantum Serverless Paradigm and Application Development using the QFaaS Framework

    Authors: Hoa T. Nguyen, Bui Binh An Pham, Muhammad Usman, Rajkumar Buyya

    Abstract: Quantum computing has the potential to solve complex problems beyond the capabilities of classical computers. However, its practical use is currently limited due to early-stage quantum software engineering and the constraints of Noisy Intermediate-Scale Quantum (NISQ) devices. To address this issue, this chapter introduces the concept of serverless quantum computing with examples using QFaaS, a pr…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Guidelines for deploying and using the QFaaS Framework (for the original paper, see https://doi.org/10.1016/j.future.2024.01.018)

  8. arXiv:2407.02748  [pdf, other]

    cs.DC cs.ET

    DRLQ: A Deep Reinforcement Learning-based Task Placement for Quantum Cloud Computing

    Authors: Hoa T. Nguyen, Muhammad Usman, Rajkumar Buyya

    Abstract: The quantum cloud computing paradigm presents unique challenges in task placement due to the dynamic and heterogeneous nature of quantum computation resources. Traditional heuristic approaches fall short in adapting to the rapidly evolving landscape of quantum computing. This paper proposes DRLQ, a novel Deep Reinforcement Learning (DRL)-based technique for task placement in quantum cloud computin…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted paper at IEEE CLOUD 2024 conference

  9. arXiv:2407.02190  [pdf, other]

    cs.RO

    I2EKF-LO: A Dual-Iteration Extended Kalman Filter Based LiDAR Odometry

    Authors: Wenlu Yu, Jie Xu, Chengwei Zhao, Lijun Zhao, Thien-Minh Nguyen, Shenghai Yuan, Mingming Bai, Lihua Xie

    Abstract: LiDAR odometry is a pivotal technology in the fields of autonomous driving and autonomous mobile robotics. However, most of the current works focus on nonlinear optimization methods, and there are still many challenges in using the traditional Iterative Extended Kalman Filter (IEKF) framework to tackle the problem: IEKF only iterates over the observation equation, relying on a rough estimate of the…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted by IROS 2024

  10. arXiv:2407.01987  [pdf, other]

    cs.CV

    AHMsys: An Automated HVAC Modeling System for BIM Project

    Authors: Long Hoang Dang, Duy-Hung Nguyen, Thai Quang Le, Thinh Truong Nguyen, Clark Mei, Vu Hoang

    Abstract: This paper presents a novel system, named AHMsys, designed to automate the process of generating 3D Heating, Ventilation, and Air Conditioning (HVAC) models from 2D Computer-Aided Design (CAD) drawings, a key component of Building Information Modeling (BIM). By automatically preprocessing and extracting essential HVAC object information then creating detailed 3D models, our proposed AHMsys signifi…

    Submitted 2 July, 2024; originally announced July 2024.

  11. arXiv:2407.01777  [pdf, other]

    cs.SD cs.AI eess.AS

    Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models

    Authors: Lam Pham, Phat Lam, Truong Nguyen, Huyen Nguyen, Alexander Schindler

    Abstract: In this paper, we propose a deep learning based system for the task of deepfake audio detection. In particular, the raw input audio is first transformed into various spectrograms using three transformation methods of Short-time Fourier Transform (STFT), Constant-Q Transform (CQT), Wavelet Transform (WT) combined with different auditory-based filters of Mel, Gammatone, linear filters (LF), and dis…

    Submitted 1 July, 2024; originally announced July 2024.

  12. arXiv:2407.01110  [pdf]

    cs.CR cs.AI cs.CY cs.LG

    SecGenAI: Enhancing Security of Cloud-based Generative AI Applications within Australian Critical Technologies of National Interest

    Authors: Christoforus Yoga Haryanto, Minh Hieu Vu, Trung Duc Nguyen, Emily Lomempow, Yulia Nurliana, Sona Taheri

    Abstract: The rapid advancement of Generative AI (GenAI) technologies offers transformative opportunities within Australia's critical technologies of national interest while introducing unique security challenges. This paper presents SecGenAI, a comprehensive security framework for cloud-based GenAI applications, with a focus on Retrieval-Augmented Generation (RAG) systems. SecGenAI addresses functional, in…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 10 pages, 4 figures, 9 tables, submitted to the 2024 11th International Conference on Soft Computing & Machine Intelligence (ISCMI 2024)

  13. arXiv:2407.00710  [pdf, other]

    cs.LG stat.ML

    Weighted Missing Linear Discriminant Analysis: An Explainable Approach for Classification with Missing Data

    Authors: Tuan L. Vo, Uyen Dang, Thu Nguyen

    Abstract: As Artificial Intelligence (AI) models are gradually being adopted in real-life applications, the explainability of the model used is critical, especially in high-stakes areas such as medicine, finance, etc. Among the commonly used models, Linear Discriminant Analysis (LDA) is a widely used classification tool that is also explainable thanks to its ability to model class distributions and maximize…

    Submitted 30 June, 2024; originally announced July 2024.

  14. arXiv:2407.00609  [pdf, other]

    cs.CV cs.LG

    ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding

    Authors: Quang P. M. Pham, Khoi T. N. Nguyen, Lan C. Ngo, Truong Do, Truong Son Hy

    Abstract: Scene graphs have been proven to be useful for various scene understanding tasks due to their compact and explicit nature. However, existing approaches often neglect the importance of maintaining the symmetry-preserving property when generating scene graphs from 3D point clouds. This oversight can diminish the accuracy and robustness of the resulting scene graphs, especially when handling noisy, m…

    Submitted 30 June, 2024; originally announced July 2024.

  15. arXiv:2407.00535  [pdf, other]

    cs.CE cs.CV

    AI-powered multimodal modeling of personalized hemodynamics in aortic stenosis

    Authors: Caglar Ozturk, Daniel H. Pak, Luca Rosalia, Debkalpa Goswami, Mary E. Robakowski, Raymond McKay, Christopher T. Nguyen, James S. Duncan, Ellen T. Roche

    Abstract: Aortic stenosis (AS) is the most common valvular heart disease in developed countries. High-fidelity preclinical models can improve AS management by enabling therapeutic innovation, early diagnosis, and tailored treatment planning. However, their use is currently limited by complex workflows necessitating lengthy expert-driven manual operations. Here, we propose an AI-powered computational framewo…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: CO and DHP contributed equally to this work. JSD and ETR are corresponding authors

  16. arXiv:2407.00411  [pdf, other]

    cs.LG cs.AI

    Explainability of Machine Learning Models under Missing Data

    Authors: Tuan L. Vo, Thu Nguyen, Hugo L. Hammer, Michael A. Riegler, Pal Halvorsen

    Abstract: Missing data is a prevalent issue that can significantly impair model performance and interpretability. This paper briefly summarizes the development of the field of missing data with respect to Explainable Artificial Intelligence and experimentally investigates the effects of various imputation methods on the calculation of Shapley values, a popular technique for interpreting complex machine lear…

    Submitted 29 June, 2024; originally announced July 2024.

  17. arXiv:2406.20077  [pdf, other]

    cs.CV

    HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model

    Authors: Hieu T. Nguyen, Yiwen Chen, Vikram Voleti, Varun Jampani, Huaizu Jiang

    Abstract: We introduce HouseCrafter, a novel approach that can lift a floorplan into a complete large 3D indoor scene (e.g., a house). Our key insight is to adapt a 2D diffusion model, which is trained on web-scale images, to generate consistent multi-view color (RGB) and depth (D) images across different locations of the scene. Specifically, the RGB-D images are generated autoregressively in a batch-wise m…

    Submitted 28 June, 2024; originally announced June 2024.

  18. arXiv:2406.19753  [pdf, other]

    cs.LG

    Backdoor Attack in Prompt-Based Continual Learning

    Authors: Trang Nguyen, Anh Tran, Nhat Ho

    Abstract: Prompt-based approaches offer a cutting-edge solution to data privacy issues in continual learning, particularly in scenarios involving multiple data suppliers where long-term storage of private user data is prohibited. Despite delivering state-of-the-art performance, its impressive remembering capability can become a double-edged sword, raising security concerns as it might inadvertently retain p…

    Submitted 28 June, 2024; originally announced June 2024.

  19. arXiv:2406.18851  [pdf, other]

    cs.LG cs.AI physics.chem-ph q-bio.BM q-bio.QM

    LICO: Large Language Models for In-Context Molecular Optimization

    Authors: Tung Nguyen, Aditya Grover

    Abstract: Optimizing black-box functions is a fundamental problem in science and engineering. To solve this problem, many approaches learn a surrogate function that estimates the underlying objective from limited historical evaluations. Large Language Models (LLMs), with their strong pattern-matching capabilities via pretraining on vast amounts of data, stand out as a potential candidate for surrogate model…

    Submitted 26 June, 2024; originally announced June 2024.

  20. arXiv:2406.17381  [pdf, other]

    cs.LG cs.CV

    Forget but Recall: Incremental Latent Rectification in Continual Learning

    Authors: Nghia D. Nguyen, Hieu Trung Nguyen, Ang Li, Hoang Pham, Viet Anh Nguyen, Khoa D. Doan

    Abstract: Intrinsic capability to continuously learn a changing data stream is a desideratum of deep neural networks (DNNs). However, current DNNs suffer from catastrophic forgetting, which hinders remembering past knowledge. To mitigate this issue, existing Continual Learning (CL) approaches either retain exemplars for replay, regularize learning, or allocate dedicated capacity for new tasks. This paper in…

    Submitted 25 June, 2024; originally announced June 2024.

  21. arXiv:2406.17376  [pdf, other]

    cs.SD cs.AI eess.AS

    Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection

    Authors: Duc-Tuan Truong, Ruijie Tao, Tuan Nguyen, Hieu-Thi Luong, Kong Aik Lee, Eng Siong Chng

    Abstract: Recent synthetic speech detectors leveraging the Transformer model have superior performance compared to the convolutional neural network counterparts. This improvement could be due to the powerful modeling ability of the multi-head self-attention (MHSA) in the Transformer model, which learns the temporal relationship of each input token. However, artifacts of synthetic speech can be located in sp…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  22. arXiv:2406.16777  [pdf, other]

    cs.CL cs.AI

    Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024

    Authors: Sai Koneru, Thai-Binh Nguyen, Ngoc-Quan Pham, Danni Liu, Zhaolin Li, Alexander Waibel, Jan Niehues

    Abstract: Large Language Models (LLMs) are currently under exploration for various tasks, including Automatic Speech Recognition (ASR), Machine Translation (MT), and even End-to-End Speech Translation (ST). In this paper, we present KIT's offline submission in the constrained + LLM track by incorporating recently proposed techniques that can be added to any cascaded speech translation. Specifically, we inte…

    Submitted 24 June, 2024; originally announced June 2024.

  23. arXiv:2406.16685  [pdf, other]

    cs.CE

    A locking-free isogeometric thin shell formulation based on higher order accurate local strain projection via approximate dual splines

    Authors: Thi-Hoa Nguyen, René R. Hiemstra, Dominik Schillinger

    Abstract: We present a novel isogeometric discretization approach for the Kirchhoff-Love shell formulation based on the Hellinger-Reissner variational principle. For mitigating membrane locking, we discretize the independent strains with spline basis functions that are one degree lower than those used for the displacements. To enable computationally efficient condensation of the independent strains, we firs…

    Submitted 24 June, 2024; originally announced June 2024.

  24. arXiv:2406.16656  [pdf, ps, other]

    cs.IT cs.DM math.CO

    Permutation Codes Correcting Multiple Deletions

    Authors: Shuche Wang, The Nguyen, Yeow Meng Chee, Van Khu Vu

    Abstract: Permutation codes in the Ulam metric, which can correct multiple deletions, have been investigated extensively recently owing to their applications. In this work, we are interested in the maximum size of the permutation codes in the Ulam metric and aim to design permutation codes that can correct multiple deletions with efficient decoding algorithms. We first present an improvement on the Gilbert-…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 9 pages

  25. arXiv:2406.15633  [pdf, other]

    cs.SE

    Good things come in three: Generating SO Post Titles with Pre-Trained Models, Self Improvement and Post Ranking

    Authors: Duc Anh Le, Anh M. T. Bui, Phuong T. Nguyen, Davide Di Ruscio

    Abstract: Stack Overflow is a prominent Q and A forum, supporting developers in seeking suitable resources on programming-related matters. Having high-quality question titles is an effective means to attract developers' attention. Unfortunately, this is often underestimated, leaving room for improvement. Research has been conducted, predominantly leveraging pre-trained models to generate titles from code sn…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: The paper has been peer-reviewed and accepted for publication to the International Symposium on Empirical Software Engineering and Measurement (ESEM 2024)

  26. arXiv:2406.15119  [pdf, other]

    cs.SD cs.AI eess.AS

    Speech Emotion Recognition under Resource Constraints with Data Distillation

    Authors: Yi Chang, Zhao Ren, Zhonghao Zhao, Thanh Tam Nguyen, Kun Qian, Tanja Schultz, Björn W. Schuller

    Abstract: Speech emotion recognition (SER) plays a crucial role in human-computer interaction. The emergence of edge devices in the Internet of Things (IoT) presents challenges in constructing intricate deep learning models due to constraints in memory and computational resources. Moreover, emotional speech data often contains private information, raising concerns about privacy leakage during the deployment…

    Submitted 21 June, 2024; originally announced June 2024.

  27. arXiv:2406.14835  [pdf, other]

    cs.CL cs.LG

    ToVo: Toxicity Taxonomy via Voting

    Authors: Tinh Son Luong, Thanh-Thien Le, Thang Viet Doan, Linh Ngo Van, Thien Huu Nguyen, Diep Thi-Ngoc Nguyen

    Abstract: Existing toxic detection models face significant limitations, such as lack of transparency, customization, and reproducibility. These challenges stem from the closed-source nature of their training data and the paucity of explanations for their evaluation mechanism. To address these issues, we propose a dataset creation mechanism that integrates voting and chain-of-thought processes, producing a h…

    Submitted 20 June, 2024; originally announced June 2024.

  28. arXiv:2406.14784  [pdf, other]

    cs.LG stat.OT

    Active Learning for Fair and Stable Online Allocations

    Authors: Riddhiman Bhattacharya, Thanh Nguyen, Will Wei Sun, Mohit Tawarmalani

    Abstract: We explore an active learning approach for dynamic fair resource allocation problems. Unlike previous work that assumes full feedback from all agents on their allocations, we consider feedback from a select subset of agents at each epoch of the online resource allocation process. Despite this restriction, our proposed algorithms provide regret bounds that are sub-linear in number of time-periods f…

    Submitted 20 June, 2024; originally announced June 2024.

  29. arXiv:2406.14572  [pdf, other]

    q-bio.QM cs.AI cs.IR

    Bioptic -- A Target-Agnostic Potency-Based Small Molecules Search Engine

    Authors: Vlad Vinogradov, Ivan Izmailov, Simon Steshin, Kong T. Nguyen

    Abstract: Recent successes in virtual screening have been made possible by large models and extensive chemical libraries. However, combining these elements is challenging: the larger the model, the more expensive it is to run, making ultra-large libraries unfeasible. To address this, we developed a target-agnostic, efficacy-based molecule search model, which allows us to find structurally dissimilar molecul…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  30. arXiv:2406.13781  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    A Primal-Dual Framework for Transformers and Neural Networks

    Authors: Tan M. Nguyen, Tam Nguyen, Nhat Ho, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Self-attention is key to the remarkable success of transformers in sequence modeling tasks including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics and experience. To provide a principled framework for constructing attention layers in transformers, we show that the self-attention corresp…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2023, 26 pages, 4 figures, 14 tables

  31. arXiv:2406.13770  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    Elliptical Attention

    Authors: Stefan K. Nielsen, Laziz U. Abdullaev, Rachel Teo, Tan M. Nguyen

    Abstract: Pairwise dot-product self-attention is key to the success of transformers that achieve state-of-the-art performance across a variety of applications in language and vision. This dot-product self-attention computes attention weights among the input tokens using Euclidean distance, which makes the model prone to representation collapse and vulnerable to contaminated samples. In this paper, we propos…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 38 pages, 7 figures, 12 tables

  32. arXiv:2406.13762  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis

    Authors: Rachel S. Y. Teo, Tan M. Nguyen

    Abstract: The remarkable success of transformers in sequence modeling tasks, spanning various applications in natural language processing and computer vision, is attributed to the critical role of self-attention. Similar to the development of most deep learning models, the construction of these attention mechanisms relies on heuristics and experience. In our work, we derive self-attention from kernel principa…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 33 pages, 5 figures, 12 tables

  33. arXiv:2406.13725  [pdf, other]

    cs.LG cs.AI stat.ML

    Tree-Sliced Wasserstein Distance on a System of Lines

    Authors: Viet-Hoang Tran, Trang Pham, Tho Tran, Tam Le, Tan M. Nguyen

    Abstract: Sliced Wasserstein (SW) distance in Optimal Transport (OT) is widely used in various applications thanks to its statistical effectiveness and computational efficiency. On the other hand, Tree Wasserstein (TW) and Tree-sliced Wasserstein (TSW) are instances of OT for probability measures where its ground cost is a tree metric. TSW also has a low computational complexity, i.e. linear to the number o…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 33 pages, 6 figures, 2 tables, 4 algorithms

  34. Machine Learning Applications of Quantum Computing: A Review

    Authors: Thien Nguyen, Tuomo Sipola, Jari Hautamäki

    Abstract: At the intersection of quantum computing and machine learning, this review paper explores the transformative impact these technologies are having on the capabilities of data processing and analysis, far surpassing the bounds of traditional computational methods. Drawing upon an in-depth analysis of 32 seminal papers, this review delves into the interplay between quantum computing and machine learn…

    Submitted 24 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Proceedings of the 23rd European Conference on Cyber Warfare and Security (ECCWS 2024)

  35. arXiv:2406.12507  [pdf, other]

    cs.LG

    Improving the Evaluation and Actionability of Explanation Methods for Multivariate Time Series Classification

    Authors: Davide Italo Serramazza, Thach Le Nguyen, Georgiana Ifrim

    Abstract: Explanation for Multivariate Time Series Classification (MTSC) is an important topic that is under explored. There are very few quantitative evaluation methodologies and even fewer examples of actionable explanation, where the explanation methods are shown to objectively improve specific computational tasks on time series data. In this paper we focus on analyzing InterpretTime, a recent evaluation…

    Submitted 18 June, 2024; originally announced June 2024.

  36. arXiv:2406.11794  [pdf, other]

    cs.LG cs.CL

    DataComp-LM: In search of the next generation of training sets for language models

    Authors: Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, et al. (34 additional authors not shown)

    Abstract: We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with dat…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.datacomp.ai/dclm/

  37. arXiv:2406.11233  [pdf, other]

    cs.LG cs.AI cs.CL

    Probing the Decision Boundaries of In-context Learning in Large Language Models

    Authors: Siyan Zhao, Tung Nguyen, Aditya Grover

    Abstract: In-context learning is a key paradigm in large language models (LLMs) that enables them to generalize to new tasks and domains by simply prompting these models with a few exemplars without explicit parameter updates. Many attempts have been made to understand in-context learning in LLMs as a function of model scale, pretraining data, and other factors. In this work, we propose a new mechanism to p…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 18 pages, 18 figures

  38. arXiv:2406.09732  [pdf, ps, other]

    math.PR cs.GT econ.TH

    Finding pure Nash equilibria in large random games

    Authors: Andrea Collevecchio, Tuan-Minh Nguyen, Ziwen Zhong

    Abstract: Best Response Dynamics (BRD) is a class of strategy updating rules to find Pure Nash Equilibria (PNE) in a game. At each step, a player is randomly picked and they switch to a "best response" strategy based on the strategies chosen by others, so that the new strategy profile maximises their payoff. If no such strategy exists, a different player will be chosen randomly. When no player wants to ch…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 19 pages, 5 figures, 1 table

    MSC Class: 91A10; 91A06; 60K35; 60K37

  39. arXiv:2406.09400  [pdf, other]

    cs.CV cs.LG

    Yo'LLaVA: Your Personalized Language and Vision Assistant

    Authors: Thao Nguyen, Haotian Liu, Yuheng Li, Mu Cai, Utkarsh Ojha, Yong Jae Lee

    Abstract: Large Multimodal Models (LMMs) have shown remarkable capabilities across a variety of tasks (e.g., image captioning, visual question answering). While broad, their knowledge remains generic (e.g., recognizing a dog), and they are unable to handle personalized subjects (e.g., recognizing a user's pet dog). Human reasoning, in contrast, typically operates within the context of specific subjects in o…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page: https://thaoshibe.github.io/YoLLaVA

  40. A Tangible Multi-Display Toolkit to Support the Collaborative Design Exploration of AV-Pedestrian Interfaces

    Authors: Marius Hoggenmuller, Martin Tomitsch, Callum Parker, Trung Thanh Nguyen, Dawei Zhou, Stewart Worrall, Eduardo Nebot

    Abstract: The advent of cyber-physical systems, such as robots and autonomous vehicles (AVs), brings new opportunities and challenges for the domain of interaction design. Though there is consensus about the value of human-centred development, there is a lack of documented tailored methods and tools for involving multiple stakeholders in design exploration processes. In this paper we present a novel approac…

    Submitted 12 June, 2024; originally announced June 2024.

  41. arXiv:2406.06863  [pdf, other]

    cs.CR cs.AI cs.HC

    Ollabench: Evaluating LLMs' Reasoning for Human-centric Interdependent Cybersecurity

    Authors: Tam n. Nguyen

    Abstract: Large Language Models (LLMs) have the potential to enhance Agent-Based Modeling by better representing complex interdependent cybersecurity systems, improving cybersecurity threat modeling and risk management. However, evaluating LLMs in this context is crucial for legal compliance and effective application development. Existing LLM evaluation frameworks often overlook the human factor and cogniti…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures, 2 tables. The final conference/journal version may have significantly more content updates

    ACM Class: I.2.0; J.4

  42. arXiv:2406.06239  [pdf, other]

    cs.CV

    I-MPN: Inductive Message Passing Network for Effective and Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data

    Authors: Hoang H. Le, Duy M. H. Nguyen, Omair Shahzad Bhatti, Laszlo Kopacsi, Thinh P. Ngo, Binh T. Nguyen, Michael Barz, Daniel Sonntag

    Abstract: Understanding human visual processing in dynamic environments is essential for psychology and human-centered interaction design. Mobile eye-tracking systems, combining egocentric video and gaze signals, offer valuable insights. However, manual analysis of these recordings is time-intensive. In this work, we present a novel human-centered learning algorithm designed for automated object recognition…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: First version

  43. arXiv:2406.05615  [pdf, other]

    cs.CL

    Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

    Authors: Thong Nguyen, Yi Bin, Junbin Xiao, Leigang Qu, Yicong Li, Jay Zhangjie Wu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Humans use multiple senses to comprehend the environment. Vision and language are two of the most vital senses since they allow us to easily communicate our thoughts and perceive the world around us. There has been a lot of interest in creating video-language understanding systems with human-like senses since a video-language pair can mimic both our linguistic medium and visual environment with te…

    Submitted 1 July, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 (Findings)

  44. arXiv:2406.05349  [pdf, other]

    cs.CV

    Blurry-Consistency Segmentation Framework with Selective Stacking on Differential Interference Contrast 3D Breast Cancer Spheroid

    Authors: Thanh-Huy Nguyen, Thi Kim Ngan Ngo, Mai Anh Vu, Ting-Yuan Tu

    Abstract: The ability of three-dimensional (3D) spheroid modeling to study the invasive behavior of breast cancer cells has drawn increased attention. The deep learning-based image processing framework is very effective at speeding up the cell morphological analysis process. Out-of-focus photos taken while capturing 3D cells under several z-slices, however, could negatively impact the deep learning model. I…

    Submitted 8 June, 2024; originally announced June 2024.

  45. arXiv:2406.03699  [pdf, other]

    cs.CL

    M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering

    Authors: Anand Subramanian, Viktor Schlegel, Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Vijay Prakash Dwivedi, Stefan Winkler

    Abstract: There is vivid research on adapting Large Language Models (LLMs) to perform a variety of tasks in high-stakes domains such as healthcare. Despite their popularity, there is a lack of understanding of the extent and contributing factors that allow LLMs to recall relevant knowledge and combine it with presented information in the clinical and biomedical domain: a fundamental pre-requisite for succes…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 (Findings)

  46. arXiv:2406.02555  [pdf, ps, other]

    eess.AS cs.CL

    PhoWhisper: Automatic Speech Recognition for Vietnamese

    Authors: Thanh-Thien Le, Linh The Nguyen, Dat Quoc Nguyen

    Abstract: We introduce PhoWhisper in five versions for Vietnamese automatic speech recognition. PhoWhisper's robustness is achieved through fine-tuning the Whisper model on an 844-hour dataset that encompasses diverse Vietnamese accents. Our experimental study demonstrates state-of-the-art performances of PhoWhisper on benchmark Vietnamese ASR datasets. We have open-sourced PhoWhisper at: https://github.com…

    Submitted 27 March, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2024 Tiny Papers Track

  47. arXiv:2406.02317  [pdf, other]

    cs.LG cs.AI stat.ML

    Generative Conditional Distributions by Neural (Entropic) Optimal Transport

    Authors: Bao Nguyen, Binh Nguyen, Hieu Trung Nguyen, Viet Anh Nguyen

    Abstract: Learning conditional distributions is challenging because the desired outcome is not a single distribution but multiple distributions that correspond to multiple instances of the covariates. We introduce a novel neural entropic optimal transport method designed to effectively learn generative models of conditional distributions, particularly in scenarios characterized by limited sample sizes. Our…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 15 pages, 8 figures

  48. arXiv:2406.01029  [pdf, other]

    cs.CV

    CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos

    Authors: Trong-Thuan Nguyen, Pha Nguyen, Xin Li, Jackson Cothren, Alper Yilmaz, Khoa Luu

    Abstract: Video scene graph generation (VidSGG) has emerged as a transformative approach to capturing and interpreting the intricate relationships among objects and their temporal dynamics in video sequences. In this paper, we introduce the new AeroEye dataset that focuses on multi-object relationship modeling in aerial videos. Our AeroEye dataset features various drone scenes and includes a visually compre…

    Submitted 3 June, 2024; originally announced June 2024.

  49. arXiv:2406.00973  [pdf, other]

    cs.IR cs.LG

    Cold-start Recommendation by Personalized Embedding Region Elicitation

    Authors: Hieu Trung Nguyen, Duy Nguyen, Khoa Doan, Viet Anh Nguyen

    Abstract: Rating elicitation is a success element for recommender systems to perform well at cold-starting, in which the systems need to recommend items to a newly arrived user with no prior knowledge about the user's preference. Existing elicitation methods employ a fixed set of items to learn the user's preference and then infer the users' preferences on the remaining items. Using a fixed seed set can lim…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted at UAI 2024

  50. arXiv:2406.00181  [pdf, other]

    cs.DC

    Wait or Not to Wait: Evaluating Trade-Offs between Speed and Precision in Blockchain-based Federated Aggregation

    Authors: Huong Nguyen, Tri Nguyen, Lauri Lovén, Susanna Pirttikangas

    Abstract: This paper presents a fully coupled blockchain-assisted federated learning architecture that effectively eliminates single points of failure by decentralizing both the training and aggregation tasks across all participants. Our proposed system offers a high degree of flexibility, allowing participants to select shared models and customize the aggregation for local needs, thereby optimizing system…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Accepted at Workshop on Engineering techniques for Distributed Computing Continuum Systems 2024