Showing 1–50 of 1,118 results for author: Nguyen, H

Searching in archive cs.
  1. arXiv:2406.17381  [pdf, other]

    cs.LG cs.CV

    Forget but Recall: Incremental Latent Rectification in Continual Learning

    Authors: Nghia D. Nguyen, Hieu Trung Nguyen, Ang Li, Hoang Pham, Viet Anh Nguyen, Khoa D. Doan

    Abstract: Intrinsic capability to continuously learn a changing data stream is a desideratum of deep neural networks (DNNs). However, current DNNs suffer from catastrophic forgetting, which hinders remembering past knowledge. To mitigate this issue, existing Continual Learning (CL) approaches either retain exemplars for replay, regularize learning, or allocate dedicated capacity for new tasks. This paper in…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.17335  [pdf, other]

    cs.IR cs.LG

    A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems

    Authors: Hung Vinh Tran, Tong Chen, Quoc Viet Hung Nguyen, Zi Huang, Lizhen Cui, Hongzhi Yin

    Abstract: Since the creation of the Web, recommender systems (RSs) have been an indispensable mechanism in information filtering. State-of-the-art RSs primarily depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables. To prevent over-parameterized embedding tables from harming scalability, both academia and industry have seen increasing efforts in c…

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2406.17235  [pdf, other]

    cs.CV cs.AI cs.DC

    Task-Agnostic Federated Learning

    Authors: Zhengtao Yao, Hong Nguyen, Ajitesh Srivastava, Jose Luis Ambite

    Abstract: In the realm of medical imaging, leveraging large-scale datasets from various institutions is crucial for developing precise deep learning models, yet privacy concerns frequently impede data sharing. Federated learning (FL) emerges as a prominent solution for preserving privacy while facilitating collaborative learning. However, its application in real-world scenarios faces several obstacles, such…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.16783  [pdf, other]

    cs.CL cs.AI cs.LG

    M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models

    Authors: Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan

    Abstract: Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. Numerous effective IFT datasets have been proposed in the recent past, but most focus on high resource languages such as English. In this work, we propose a fully synthetic, novel taxonomy (Evol) guided Multilingual, Multi-turn instruction finetuning dataset, called M2Lingual, to better align…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 39 pages

  5. arXiv:2406.14835  [pdf, other]

    cs.CL cs.LG

    ToVo: Toxicity Taxonomy via Voting

    Authors: Tinh Son Luong, Thanh-Thien Le, Thang Viet Doan, Linh Ngo Van, Thien Huu Nguyen, Diep Thi-Ngoc Nguyen

    Abstract: Existing toxic detection models face significant limitations, such as lack of transparency, customization, and reproducibility. These challenges stem from the closed-source nature of their training data and the paucity of explanations for their evaluation mechanism. To address these issues, we propose a dataset creation mechanism that integrates voting and chain-of-thought processes, producing a h…

    Submitted 20 June, 2024; originally announced June 2024.

  6. arXiv:2406.14819  [pdf, other]

    cs.CV

    SAM-EG: Segment Anything Model with Egde Guidance framework for efficient Polyp Segmentation

    Authors: Quoc-Huy Trinh, Hai-Dang Nguyen, Bao-Tram Nguyen Ngoc, Debesh Jha, Ulas Bagci, Minh-Triet Tran

    Abstract: Polyp segmentation, a critical concern in medical imaging, has prompted numerous proposed methods aimed at enhancing the quality of segmented masks. While current state-of-the-art techniques produce impressive results, the size and computational cost of these models pose challenges for practical industry applications. Recently, the Segment Anything Model (SAM) has been proposed as a robust foundat…

    Submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.13997  [pdf, other]

    cs.CL cs.CE

    "Global is Good, Local is Bad?": Understanding Brand Bias in LLMs

    Authors: Mahammed Kamruzzaman, Hieu Minh Nguyen, Gene Louis Kim

    Abstract: Many recent studies have investigated social biases in LLMs but brand bias has received little attention. This research examines the biases exhibited by LLMs towards different brands, a significant concern given the widespread use of LLMs in affected use cases such as product recommendation and market analysis. Biased models may perpetuate societal inequalities, unfairly favoring established globa…

    Submitted 20 June, 2024; originally announced June 2024.

  8. EMO-KNOW: A Large Scale Dataset on Emotion and Emotion-cause

    Authors: Mia Huong Nguyen, Yasith Samaradivakara, Prasanth Sasikumar, Chitralekha Gupta, Suranga Nanayakkara

    Abstract: Emotion-Cause analysis has attracted the attention of researchers in recent years. However, most existing datasets are limited in size and number of emotion categories. They often focus on extracting parts of the document that contain the emotion cause and fail to provide more abstractive, generalizable root cause. To bridge this gap, we introduce a large-scale dataset of emotion causes, derived f…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted to Findings of EMNLP 2023

    Journal ref: Findings of EMNLP 2023

  9. arXiv:2406.11912  [pdf, other]

    cs.SE cs.AI

    AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology

    Authors: Minh Huynh Nguyen, Thang Phan Chau, Phong X. Nguyen, Nghi D. Q. Bui

    Abstract: Software agents have emerged as promising tools for addressing complex software engineering tasks. However, existing works oversimplify software development workflows by following the waterfall model. Thus, we propose AgileCoder, a multi-agent system that integrates Agile Methodology (AM) into the framework. This system assigns specific AM roles such as Product Manager, Developer, and Tester to di…

    Submitted 16 June, 2024; originally announced June 2024.

  10. arXiv:2406.10853  [pdf, other]

    cs.CV

    MV2Cyl: Reconstructing 3D Extrusion Cylinders from Multi-View Images

    Authors: Eunji Hong, Minh Hieu Nguyen, Mikaela Angelina Uy, Minhyuk Sung

    Abstract: We present MV2Cyl, a novel method for reconstructing 3D from 2D multi-view images, not merely as a field or raw geometry but as a sketch-extrude CAD model. Extracting extrusion cylinders from raw 3D geometry has been extensively researched in computer vision, while the processing of 3D data through neural networks has remained a bottleneck. Since 3D scans are generally accompanied by multi-view im…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 24 pages

  11. arXiv:2406.09039  [pdf, other]

    cs.RO

    Language-Driven Closed-Loop Grasping with Model-Predictive Trajectory Replanning

    Authors: Huy Hoang Nguyen, Minh Nhat Vu, Florian Beck, Gerald Ebmer, Anh Nguyen, Andreas Kugi

    Abstract: Combining a vision module inside a closed-loop control system for a seamless movement of a robot in a manipulation task is challenging due to the inconsistent update rates between utilized modules. This task is even more difficult in a dynamic environment, e.g., objects are moving. This paper presents a modular zero-shot framework for language-driven manipulation of (dynamic) objects…

    Submitted 19 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 9 pages, 6 figures

  12. arXiv:2406.06239  [pdf, other]

    cs.CV

    I-MPN: Inductive Message Passing Network for Effective and Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data

    Authors: Hoang H. Le, Duy M. H. Nguyen, Omair Shahzad Bhatti, Laszlo Kopacsi, Thinh P. Ngo, Binh T. Nguyen, Michael Barz, Daniel Sonntag

    Abstract: Understanding human visual processing in dynamic environments is essential for psychology and human-centered interaction design. Mobile eye-tracking systems, combining egocentric video and gaze signals, offer valuable insights. However, manual analysis of these recordings is time-intensive. In this work, we present a novel human-centered learning algorithm designed for automated object recognition…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: First version

  13. arXiv:2406.05387  [pdf, other]

    cs.IR

    PTF-FSR: A Parameter Transmission-Free Federated Sequential Recommender System

    Authors: Wei Yuan, Chaoqun Yang, Liang Qu, Quoc Viet Hung Nguyen, Guanhua Ye, Hongzhi Yin

    Abstract: Sequential recommender systems have made significant progress. Recently, due to increasing concerns about user data privacy, some researchers have implemented federated learning for sequential recommendation, a.k.a., Federated Sequential Recommender Systems (FedSeqRecs), in which a public sequential recommender model is shared and frequently transmitted between a central server and clients to achi…

    Submitted 8 June, 2024; originally announced June 2024.

  14. arXiv:2406.03713  [pdf]

    cs.RO

    Gait-Adaptive Navigation and Human Searching in field with Cyborg Insect

    Authors: Phuoc Thanh Tran-Ngoc, Huu Duoc Nguyen, Duc Long Le, Rui Li, Bing Sheng Chong, Hirotaka Sato

    Abstract: This study focuses on improving the ability of cyborg insects to navigate autonomously during search and rescue missions in outdoor environments. We propose an algorithm that leverages data from an IMU to calculate orientation and position based on the insect's walking gait. These computed factors serve as essential feedback channels across 3 phases of our exploration. Our method functions without…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 35 pages, 9 figures

  15. arXiv:2406.02317  [pdf, other]

    cs.LG cs.AI stat.ML

    Generative Conditional Distributions by Neural (Entropic) Optimal Transport

    Authors: Bao Nguyen, Binh Nguyen, Hieu Trung Nguyen, Viet Anh Nguyen

    Abstract: Learning conditional distributions is challenging because the desired outcome is not a single distribution but multiple distributions that correspond to multiple instances of the covariates. We introduce a novel neural entropic optimal transport method designed to effectively learn generative models of conditional distributions, particularly in scenarios characterized by limited sample sizes. Our…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 15 pages, 8 figures

  16. arXiv:2406.00973  [pdf, other]

    cs.IR cs.LG

    Cold-start Recommendation by Personalized Embedding Region Elicitation

    Authors: Hieu Trung Nguyen, Duy Nguyen, Khoa Doan, Viet Anh Nguyen

    Abstract: Rating elicitation is a success element for recommender systems to perform well at cold-starting, in which the systems need to recommend items to a newly arrived user with no prior knowledge about the user's preference. Existing elicitation methods employ a fixed set of items to learn the user's preference and then infer the users' preferences on the remaining items. Using a fixed seed set can lim…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted at UAI 2024

  17. arXiv:2406.00843  [pdf, other]

    quant-ph cs.LG

    Diffusion-Inspired Quantum Noise Mitigation in Parameterized Quantum Circuits

    Authors: Hoang-Quan Nguyen, Xuan Bac Nguyen, Samuel Yen-Chi Chen, Hugh Churchill, Nicholas Borys, Samee U. Khan, Khoa Luu

    Abstract: Parameterized Quantum Circuits (PQCs) have been acknowledged as a leading strategy to utilize near-term quantum advantages in multiple problems, including machine learning and combinatorial optimization. When applied to specific tasks, the parameters in the quantum circuits are trained to minimize the target function. Although there have been comprehensive studies to improve the performance of the…

    Submitted 2 June, 2024; originally announced June 2024.

  18. arXiv:2406.00181  [pdf, other]

    cs.DC

    Wait or Not to Wait: Evaluating Trade-Offs between Speed and Precision in Blockchain-based Federated Aggregation

    Authors: Huong Nguyen, Tri Nguyen, Lauri Lovén, Susanna Pirttikangas

    Abstract: This paper presents a fully coupled blockchain-assisted federated learning architecture that effectively eliminates single points of failure by decentralizing both the training and aggregation tasks across all participants. Our proposed system offers a high degree of flexibility, allowing participants to select shared models and customize the aggregation for local needs, thereby optimizing system…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Accepted at Workshop on Engineering techniques for Distributed Computing Continuum Systems 2024

  19. arXiv:2405.20529  [pdf]

    cs.AI cs.CL

    An Automatic Question Usability Evaluation Toolkit

    Authors: Steven Moore, Eamon Costello, Huy A. Nguyen, John Stamper

    Abstract: Evaluating multiple-choice questions (MCQs) involves either labor intensive human assessments or automated methods that prioritize readability, often overlooking deeper question design flaws. To address this issue, we introduce the Scalable Automatic Question Usability Evaluation Toolkit (SAQUET), an open-source tool that leverages the Item-Writing Flaws (IWF) rubric for a comprehensive and automa…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Artificial Intelligence in Education 2024

  20. arXiv:2405.19725  [pdf, other]

    quant-ph cs.CV

    Quantum Visual Feature Encoding Revisited

    Authors: Xuan-Bac Nguyen, Hoang-Quan Nguyen, Hugh Churchill, Samee U. Khan, Khoa Luu

    Abstract: Although quantum machine learning has been introduced for a while, its applications in computer vision are still limited. This paper, therefore, revisits the quantum visual encoding strategies, the initial step in quantum machine learning. Investigating the root cause, we uncover that the existing quantum encoding design fails to ensure information preservation of the visual features after the enc…

    Submitted 30 May, 2024; originally announced May 2024.

  21. arXiv:2405.19722  [pdf, other]

    cs.CV

    QClusformer: A Quantum Transformer-based Framework for Unsupervised Visual Clustering

    Authors: Xuan-Bac Nguyen, Hoang-Quan Nguyen, Samuel Yen-Chi Chen, Samee U. Khan, Hugh Churchill, Khoa Luu

    Abstract: Unsupervised vision clustering, a cornerstone in computer vision, has been studied for decades, yielding significant outcomes across numerous vision tasks. However, these algorithms involve substantial computational demands when confronted with vast amounts of unlabeled data. Conversely, quantum computing holds promise in expediting unsupervised algorithms when handling large-scale databases. In t…

    Submitted 30 May, 2024; originally announced May 2024.

  22. arXiv:2405.18605  [pdf, ps, other]

    cs.CL cs.IR q-bio.MN

    BioBERT-based Deep Learning and Merged ChemProt-DrugProt for Enhanced Biomedical Relation Extraction

    Authors: Bridget T. McInnes, Jiawei Tang, Darshini Mahendran, Mai H. Nguyen

    Abstract: This paper presents a methodology for enhancing relation extraction from biomedical texts, focusing specifically on chemical-gene interactions. Leveraging the BioBERT model and a multi-layer fully connected network architecture, our approach integrates the ChemProt and DrugProt datasets using a novel merging strategy. Through extensive experimentation, we demonstrate significant performance improv…

    Submitted 28 May, 2024; originally announced May 2024.

  23. arXiv:2405.18499  [pdf, other]

    stat.ML cs.LG

    Large Margin Discriminative Loss for Classification

    Authors: Hai-Vy Nguyen, Fabrice Gamboa, Sixin Zhang, Reda Chhaibi, Serge Gratton, Thierry Giaccone

    Abstract: In this paper, we introduce a novel discriminative loss function with large margin in the context of Deep Learning. This loss boosts the discriminative power of neural nets, represented by intra-class compactness and inter-class separability. On the one hand, the class compactness is ensured by close distance of samples of the same class to each other. On the other hand, the inter-class separabili…

    Submitted 28 May, 2024; originally announced May 2024.

  24. arXiv:2405.18040  [pdf, other]

    cs.LG cs.AI cs.DC cs.ET

    Fast-FedUL: A Training-Free Federated Unlearning with Provable Skew Resilience

    Authors: Thanh Trung Huynh, Trong Bang Nguyen, Phi Le Nguyen, Thanh Tam Nguyen, Matthias Weidlich, Quoc Viet Hung Nguyen, Karl Aberer

    Abstract: Federated learning (FL) has recently emerged as a compelling machine learning paradigm, prioritizing the protection of privacy for training data. The increasing demand to address issues such as "the right to be forgotten" and combat data poisoning attacks highlights the importance of techniques, known as unlearning, which facilitate the removal of specific training data from trained FL…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted in ECML PKDD 2024

  25. arXiv:2405.16815  [pdf, other]

    cs.CV

    Image-level Regression for Uncertainty-aware Retinal Image Segmentation

    Authors: Trung Dang, Huy Hoang Nguyen, Aleksei Tiulpin

    Abstract: Accurate retinal vessel segmentation is a crucial step in the quantitative assessment of retinal vasculature, which is needed for the early detection of retinal diseases and other conditions. Numerous studies have been conducted to tackle the problem of segmenting vessels automatically using a pixel-wise classification approach. The common practice of creating ground truth labels is to categorize…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 13 pages

  26. arXiv:2405.16813  [pdf, other]

    cs.CV

    SiNGR: Brain Tumor Segmentation via Signed Normalized Geodesic Transform Regression

    Authors: Trung Dang, Huy Hoang Nguyen, Aleksei Tiulpin

    Abstract: One of the primary challenges in brain tumor segmentation arises from the uncertainty of voxels close to tumor boundaries. However, the conventional process of generating ground truth segmentation masks fails to treat such uncertainties properly. Those "hard labels" with 0s and 1s conceptually influenced the majority of prior studies on brain image segmentation. As a result, tumor segmentation i…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted as a conference paper at MICCAI 2024

  27. arXiv:2405.16623  [pdf, other]

    cs.LG cs.AR cs.PF

    Graph neural networks with configuration cross-attention for tensor compilers

    Authors: Dmitrii Khizbullin, Eduardo Rocha de Andrade, Thanh Hau Nguyen, Matheus Pedroza Ferreira, David R. Pugh

    Abstract: With the recent popularity of neural networks comes the need for efficient serving of inference workloads. A neural network inference workload can be represented as a computational graph with nodes as operators transforming multidimensional tensors. The tensors can be transposed and/or tiled in a combinatorially large number of ways, some configurations leading to accelerated inference. We propose…

    Submitted 26 May, 2024; originally announced May 2024.

  28. arXiv:2405.16148  [pdf, other]

    cs.LG

    Accelerating Transformers with Spectrum-Preserving Token Merging

    Authors: Hoai-Chau Tran, Duy M. H. Nguyen, Duy M. Nguyen, Trung-Tin Nguyen, Ngan Le, Pengtao Xie, Daniel Sonntag, James Y. Zou, Binh T. Nguyen, Mathias Niepert

    Abstract: Increasing the throughput of the Transformer architecture, a foundational component used in numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVa), is an important problem in machine learning. One recent and effective strategy is to merge token representations within Transformer models, aiming to reduce computational and memory requirements while maintaining accuracy. Pr…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: Version 1

  29. arXiv:2405.15997  [pdf, other]

    cs.RO

    UniSaT: Unified-Objective Belief Model and Planner to Search for and Track Multiple Objects

    Authors: Leonardo Santos, Brady Moon, Sebastian Scherer, Hoa Van Nguyen

    Abstract: The problem of path planning for autonomously searching and tracking multiple objects is important to reconnaissance, surveillance, and many other data-gathering applications. Due to the inherent competing objectives of searching for new objects while maintaining tracks for found objects, most current approaches rely on multi-objective planning methods, leaving it up to the user to tune parameters…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 13 pages, 5 figures, 1 table

  30. arXiv:2405.15437  [pdf]

    cs.CY

    Learning about Data, Algorithms, and Algorithmic Justice on TikTok in Personally Meaningful Ways

    Authors: Luis Morales-Navarro, Yasmin B. Kafai, Ha Nguyen, Kayla DesPortes, Ralph Vacca, Camillia Matuk, Megan Silander, Anna Amato, Peter Woods, Francisco Castro, Mia Shaw, Selin Akgun, Christine Greenhow, Antero Garcia

    Abstract: TikTok, a popular short video sharing application, emerged as the dominant social media platform for young people, with a pronounced influence on how young women and people of color interact online. The application has become a global space for youth to connect with each other, offering not only entertainment but also opportunities to engage with artificial intelligence/machine learning (AI/ML)-dr…

    Submitted 24 May, 2024; originally announced May 2024.

    ACM Class: K.3; K.4

  31. arXiv:2405.15113  [pdf, other]

    cs.RO

    A Wearable Resistance Devices Motor Learning Effects in Exercise

    Authors: Eugenio Frias-Miranda, Hong-Anh Nguyen, Jeremy Hampton, Trenner Jones, Benjamin Spotts, Matthew Cochran, Deva Chan, Laura H Blumenschein

    Abstract: The integration of technology into exercise regimens has emerged as a strategy to enhance normal human capabilities and return human motor function after injury or illness by enhancing motor learning and retention. Much research has focused on how active devices, whether confined to a lab or made into a wearable format, can apply forces at set times and conditions to optimize the process of learni…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 8 pages, 9 figures, To be published in IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob) 2024

  32. arXiv:2405.14352  [pdf, other]

    cs.LG

    Explaining Graph Neural Networks via Structure-aware Interaction Index

    Authors: Ngoc Bui, Hieu Trung Nguyen, Viet Anh Nguyen, Rex Ying

    Abstract: The Shapley value is a prominent tool for interpreting black-box machine learning models thanks to its strong theoretical foundation. However, for models with structured inputs, such as graph neural networks, existing Shapley-based explainability approaches either focus solely on node-wise importance or neglect the graph structure when perturbing the input instance. This paper introduces the Myers…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 30 pages, ICML'24

  33. arXiv:2405.14131  [pdf, other]

    stat.ML cs.LG

    Statistical Advantages of Perturbing Cosine Router in Sparse Mixture of Experts

    Authors: Huy Nguyen, Pedram Akbarian, Trang Pham, Trang Nguyen, Shujian Zhang, Nhat Ho

    Abstract: The cosine router in sparse Mixture of Experts (MoE) has recently emerged as an attractive alternative to the conventional linear router. Indeed, the cosine router demonstrates favorable performance in image and language tasks and exhibits better ability to mitigate the representation collapse issue, which often leads to parameter redundancy and limited representation potentials. Despite its empir…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 44 pages, 2 figures

  34. arXiv:2405.14124  [pdf, ps, other]

    cs.LG

    Mixture of Experts Meets Prompt-Based Continual Learning

    Authors: Minh Le, An Nguyen, Huy Nguyen, Trang Nguyen, Trang Pham, Linh Van Ngo, Nhat Ho

    Abstract: Exploiting the power of pre-trained models, prompt-based approaches stand out compared to other continual learning solutions in effectively preventing catastrophic forgetting, even with very few learnable parameters and without the need for a memory buffer. While existing prompt-based continual learning methods excel in leveraging prompts for state-of-the-art performance, they often lack a theoret…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 34 pages

  35. arXiv:2405.13997  [pdf, other]

    stat.ML cs.LG

    Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts

    Authors: Huy Nguyen, Nhat Ho, Alessandro Rinaldo

    Abstract: The softmax gating function is arguably the most popular choice in mixture of experts modeling. Despite its widespread use in practice, softmax gating may lead to unnecessary competition among experts, potentially causing the undesirable phenomenon of representation collapse due to its inherent structure. In response, the sigmoid gating function has been recently proposed as an alternative and has…

    Submitted 1 June, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: 31 pages, 2 figures

  36. arXiv:2405.13867  [pdf, other]

    cs.LG cs.AI

    Scaling-laws for Large Time-series Models

    Authors: Thomas D. P. Edwards, James Alvey, Justin Alsing, Nam H. Nguyen, Benjamin D. Wandelt

    Abstract: Scaling laws for large language models (LLMs) have provided useful guidance on how to train ever larger models for predictable performance gains. Time series forecasting shares a similar sequential structure to language, and is amenable to large-scale transformer architectures. Here we show that foundational decoder-only time series transformer models exhibit analogous scaling-behavior to LLMs, wh…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 8 pages, 3 figures

  37. arXiv:2405.13707  [pdf, other]

    cs.LG cs.AI

    Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition

    Authors: Xinyi Gao, Tong Chen, Wentao Zhang, Junliang Yu, Guanhua Ye, Quoc Viet Hung Nguyen, Hongzhi Yin

    Abstract: The increasing prevalence of large-scale graphs poses a significant challenge for graph neural network training, attributed to their substantial computational requirements. In response, graph condensation (GC) emerges as a promising data-centric solution aiming to substitute the large graph with a small yet informative condensed graph to facilitate data-efficient GNN training. However, existing GC…

    Submitted 22 May, 2024; originally announced May 2024.

  38. arXiv:2405.13010  [pdf, other]

    cs.CL cs.AI

    UCCIX: Irish-eXcellence Large Language Model

    Authors: Khanh-Tung Tran, Barry O'Sullivan, Hoang D. Nguyen

    Abstract: The development of Large Language Models (LLMs) has predominantly focused on high-resource languages, leaving extremely low-resource languages like Irish with limited representation. This work presents UCCIX, a pioneering effort on the development of an open-source Irish-based LLM. We propose a novel framework for continued pre-training of LLMs specifically adapted for extremely low-resource langu…

    Submitted 13 May, 2024; originally announced May 2024.

  39. arXiv:2405.10659  [pdf, other]

    cs.CL cs.AI

    Realistic Evaluation of Toxicity in Large Language Models

    Authors: Tinh Son Luong, Thanh-Thien Le, Linh Ngo Van, Thien Huu Nguyen

    Abstract: Large language models (LLMs) have become integral to our professional workflows and daily lives. Nevertheless, these machine companions of ours have a critical flaw: the huge amount of data which endows them with vast and diverse knowledge also exposes them to the inevitable toxicity and bias. While most LLMs incorporate defense mechanisms to prevent the generation of harmful content, these safeg…

    Submitted 20 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Findings of ACL 2024

  40. arXiv:2405.10043  [pdf, other]

    cs.RO

    Crash Landing onto "you": Untethered Soft Aerial Robots for Safe Environmental Interaction, Sensing, and Perching

    Authors: Pham Huy Nguyen

    Abstract: There are various desired capabilities to create aerial forest-traversing robots capable of monitoring both biological and abiotic data. The features range from multi-functionality, robustness, and adaptability. These robots have to weather turbulent winds and various obstacles such as forest flora and wildlife thus amplifying the complexity of operating in such uncertain environments. The key for…

    Submitted 24 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: RSS Pioneers 2024 Research Statement

  41. arXiv:2405.07482  [pdf, other]

    stat.ML cs.GR cs.LG

    Marginal Fairness Sliced Wasserstein Barycenter

    Authors: Khai Nguyen, Hai Nguyen, Nhat Ho

    Abstract: The sliced Wasserstein barycenter (SWB) is a widely acknowledged method for efficiently generalizing the averaging operation within probability measure spaces. However, achieving marginal fairness SWB, ensuring approximately equal distances from the barycenter to marginals, remains unexplored. The uniform weighted SWB is not necessarily the optimal choice to obtain the desired marginal fairness ba…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 33 pages, 14 figures, 6 tables

  42. Q-learning-based Opportunistic Communication for Real-time Mobile Air Quality Monitoring Systems

    Authors: Trung Thanh Nguyen, Truong Thao Nguyen, Dinh Tuan Anh Nguyen, Thanh Hung Nguyen, Phi Le Nguyen

    Abstract: We focus on real-time air quality monitoring systems that rely on devices installed on automobiles in this research. We investigate an opportunistic communication model in which devices can send the measured data directly to the air quality server through a 4G communication channel or via Wi-Fi to adjacent devices or the so-called Road Side Units deployed along the road. We aim to reduce 4G costs…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 2021 IEEE International Conference on Performance, Computing and Communications (IPCCC). arXiv admin note: substantial text overlap with arXiv:2405.01057

  43. arXiv:2405.01337  [pdf, other]

    cs.CV

    Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy

    Authors: Hoang-Quan Nguyen, Thanh-Dat Truong, Khoa Luu

    Abstract: Action recognition has become one of the popular research topics in computer vision. There are various methods based on Convolutional Networks and self-attention mechanisms as Transformers to solve both spatial and temporal dimensions problems of action recognition tasks that achieve competitive performances. However, these methods lack a guarantee of the correctness of the action subject that the…

    Submitted 2 May, 2024; originally announced May 2024.

  44. Fuzzy Q-Learning-Based Opportunistic Communication for MEC-Enhanced Vehicular Crowdsensing

    Authors: Trung Thanh Nguyen, Truong Thao Nguyen, Thanh Hung Nguyen, Phi Le Nguyen

    Abstract: This study focuses on MEC-enhanced, vehicle-based crowdsensing systems that rely on devices installed on automobiles. We investigate an opportunistic communication paradigm in which devices can transmit measured data directly to a crowdsensing server over a 4G communication channel or to nearby devices or so-called Road Side Units positioned along the road via Wi-Fi. We tackle a new problem that i…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: IEEE Transactions on Network and Service Management

  45. arXiv:2405.01021  [pdf, other]

    cs.ET quant-ph

    QSimPy: A Learning-centric Simulation Framework for Quantum Cloud Resource Management

    Authors: Hoa T. Nguyen, Muhammad Usman, Rajkumar Buyya

    Abstract: Quantum cloud computing is an emerging computing paradigm that allows seamless access to quantum hardware as cloud-based services. However, effective use of quantum resources is challenging and necessitates robust simulation frameworks for resource management design and evaluation. To address this need, we propose QSimPy, a novel discrete-event simulation framework designed with the mai…

    Submitted 2 May, 2024; originally announced May 2024.

  46. arXiv:2405.00543  [pdf, other]

    cs.CL cs.AI

    New Benchmark Dataset and Fine-Grained Cross-Modal Fusion Framework for Vietnamese Multimodal Aspect-Category Sentiment Analysis

    Authors: Quy Hoang Nguyen, Minh-Van Truong Nguyen, Kiet Van Nguyen

    Abstract: The emergence of multimodal data on social media platforms presents new opportunities to better understand user sentiments toward a given aspect. However, existing multimodal datasets for Aspect-Category Sentiment Analysis (ACSA) often focus on textual annotations, neglecting fine-grained information in images. Consequently, these datasets fail to fully exploit the richness inherent in multimodal data.…

    Submitted 1 May, 2024; originally announced May 2024.

  47. arXiv:2405.00418  [pdf, other]

    cs.CR cs.AI

    Detection of ransomware attacks using federated learning based on the CNN model

    Authors: Hong-Nhung Nguyen, Ha-Thanh Nguyen, Damien Lescos

    Abstract: Computing is still under a significant threat from ransomware, which necessitates prompt action to prevent it. Ransomware attacks can have a negative impact on smart grids, particularly digital substations. In addition to examining a ransomware detection method using artificial intelligence (AI), this paper offers a ransomware attack modeling technique that targets the disrupted operation of a…

    Submitted 1 May, 2024; originally announced May 2024.

  48. arXiv:2405.00355  [pdf, other]

    cs.CV

    Exploring Self-Supervised Vision Transformers for Deepfake Detection: A Comparative Analysis

    Authors: Huy H. Nguyen, Junichi Yamagishi, Isao Echizen

    Abstract: This paper investigates the effectiveness of self-supervised pre-trained transformers compared to supervised pre-trained transformers and convolutional neural networks (ConvNets) for detecting various types of deepfakes. We focus on their potential for improved generalization, particularly when training data is limited. Despite the notable success of large vision-language models utilizing transform…

    Submitted 1 May, 2024; originally announced May 2024.

  49. arXiv:2404.18981  [pdf, other]

    eess.IV cs.AI

    Decoding Radiologists' Intentions: A Novel System for Accurate Region Identification in Chest X-ray Image Analysis

    Authors: Akash Awasthi, Safwan Ahmad, Bryant Le, Hien Van Nguyen

    Abstract: In the realm of chest X-ray (CXR) image analysis, radiologists meticulously examine various regions, documenting their observations in reports. The prevalence of errors in CXR diagnoses, particularly among inexperienced radiologists and hospital residents, underscores the importance of understanding radiologists' intentions and the corresponding regions of interest. This understanding is crucial f…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted in ISBI 2024

  50. arXiv:2404.18831  [pdf, other]

    cs.CV cs.AI

    ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization

    Authors: Hong Nguyen, Hoang Nguyen, Melinda Chang, Hieu Pham, Shrikanth Narayanan, Michael Pazzani

    Abstract: Understanding the severity of conditions shown in medical images is crucial in diagnosis, serving as a key guide for clinical assessment and treatment, as well as for evaluating longitudinal progression. This paper proposes ConPrO: a novel representation learning method for severity assessment in medical images using Contrastive learning-integrated Preference Optimization. Different from conventional co…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 8 pages