
Showing 1–50 of 107 results for author: Tran, K

Searching in archive cs.
  1. arXiv:2406.09737  [pdf, other]

    cs.SE

    A Multivocal Review of MLOps Practices, Challenges and Open Issues

    Authors: Beyza Eken, Samodha Pallewatta, Nguyen Khoi Tran, Ayse Tosun, Muhammad Ali Babar

    Abstract: With the increasing trend of Machine Learning (ML) enabled software applications, the paradigm of ML Operations (MLOps) has gained tremendous attention of researchers and practitioners. MLOps encompasses the practices and technologies for streamlining the resources and monitoring needs of operationalizing ML models. Software development practitioners need access to the detailed and easily understa…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 45 pages, 4 figures

  2. arXiv:2405.20089  [pdf, other]

    cs.CL

    The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities

    Authors: David Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran

    Abstract: Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality. However, it is unclear what is the impact of fine-tuning on desirable LLM behaviors that are not present in neural machine translation models, such as steerability, inherent document-level translation abilities, and the ability to produce less literal translations. We perform an…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024 (long, main)

  3. arXiv:2405.13010  [pdf, other]

    cs.CL cs.AI

    UCCIX: Irish-eXcellence Large Language Model

    Authors: Khanh-Tung Tran, Barry O'Sullivan, Hoang D. Nguyen

    Abstract: The development of Large Language Models (LLMs) has predominantly focused on high-resource languages, leaving extremely low-resource languages like Irish with limited representation. This work presents UCCIX, a pioneering effort on the development of an open-source Irish-based LLM. We propose a novel framework for continued pre-training of LLMs specifically adapted for extremely low-resource langu…

    Submitted 13 May, 2024; originally announced May 2024.

  4. arXiv:2404.09951  [pdf, other]

    cs.CV

    Unifying Global and Local Scene Entities Modelling for Precise Action Spotting

    Authors: Kim Hoang Tran, Phuc Vuong Do, Ngoc Quoc Ly, Ngan Le

    Abstract: Sports videos pose complex challenges, including cluttered backgrounds, camera angle changes, small action-representing objects, and imbalanced action class distribution. Existing methods for detecting actions in sports videos heavily rely on global features, utilizing a backbone network as a black box that encompasses the entire spatial frame. However, these approaches tend to overlook the nuance…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted to IJCNN 2024

  5. arXiv:2403.01339  [pdf, ps, other]

    cs.LG math.RT

    Uniform $\mathcal{C}^k$ Approximation of $G$-Invariant and Antisymmetric Functions, Embedding Dimensions, and Polynomial Representations

    Authors: Soumya Ganguly, Khoa Tran, Rahul Sarkar

    Abstract: For any subgroup $G$ of the symmetric group $\mathcal{S}_n$ on $n$ symbols, we present results for the uniform $\mathcal{C}^k$ approximation of $G$-invariant functions by $G$-invariant polynomials. For the case of totally symmetric functions ($G = \mathcal{S}_n$), we show that this gives rise to the sum-decomposition Deep Sets ansatz of Zaheer et al. (2018), where both the inner and outer function…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 38 pages

    MSC Class: 05E10

    ACM Class: I.2.4; I.2.6; I.2.0

  6. How good are my search strings? Reflections on using an existing review as a quasi-gold standard

    Authors: Huynh Khanh Vi Tran, Jürgen Börstler, Nauman Bin Ali, Michael Unterkalmsteiner

    Abstract: Background: Systematic literature studies (SLS) have become a core research methodology in Evidence-based Software Engineering (EBSE). Search completeness, i.e., finding all relevant papers on the topic of interest, has been recognized as one of the most commonly discussed validity issues of SLSs. Aim: This study aims at raising awareness on the issues related to search string construction and on se…

    Submitted 16 February, 2024; originally announced February 2024.

    Journal ref: e-Informatica Softw. Eng. J. 16(1) (2022)

  7. Assessing test artifact quality -- A tertiary study

    Authors: Huynh Khanh Vi Tran, Michael Unterkalmsteiner, Jürgen Börstler, Nauman bin Ali

    Abstract: Context: Modern software development increasingly relies on software testing for an ever more frequent delivery of high quality software. This puts high demands on the quality of the central artifacts in software testing, test suites and test cases. Objective: We aim to develop a comprehensive model for capturing the dimensions of test case/suite quality, which are relevant for a variety of perspe…

    Submitted 14 February, 2024; originally announced February 2024.

    Journal ref: Information and Software Technology 139 (2021): 106620

  8. arXiv:2312.15576  [pdf, other]

    cs.CL

    Reducing LLM Hallucinations using Epistemic Neural Networks

    Authors: Shreyas Verma, Kien Tran, Yusuf Ali, Guangyu Min

    Abstract: Reducing and detecting hallucinations in large language models is an open research problem. In this project, we attempt to leverage recent advances in the field of uncertainty estimation to reduce hallucinations in frozen large language models. Epistemic neural networks have recently been proposed to improve output joint distributions for large pre-trained models. ENNs are small networks attached…

    Submitted 24 December, 2023; originally announced December 2023.

    Comments: 12 pages, 9 figures, 4 tables

  9. arXiv:2310.18046  [pdf, other]

    cs.CL cs.CV

    ViCLEVR: A Visual Reasoning Dataset and Hybrid Multimodal Fusion Model for Visual Question Answering in Vietnamese

    Authors: Khiem Vinh Tran, Hao Phu Phan, Kiet Van Nguyen, Ngan Luu Thuy Nguyen

    Abstract: In recent years, Visual Question Answering (VQA) has gained significant attention for its diverse applications, including intelligent car assistance, aiding visually impaired individuals, and document image information retrieval using natural language queries. VQA requires effective integration of information from questions and images to generate accurate answers. Neural models for VQA have made r…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: A pre-print version and submitted to journal

  10. arXiv:2310.14602  [pdf, ps, other]

    cs.CL

    Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering

    Authors: Tam Minh Vo, Khiem Vinh Tran

    Abstract: Recent studies have provided empirical evidence of the wide-ranging potential of Generative Pre-trained Transformer (GPT), a pretrained language model, in the field of natural language processing. GPT has been effectively employed as a decoder within state-of-the-art (SOTA) question answering systems, yielding exceptional performance across various tasks. However, the current research landscape co…

    Submitted 31 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

  11. arXiv:2310.14549  [pdf, other]

    cs.LG

    Multimodal Graph Learning for Modeling Emerging Pandemics with Big Data

    Authors: Khanh-Tung Tran, Truong Son Hy, Lili Jiang, Xuan-Son Vu

    Abstract: Accurate forecasting and analysis of emerging pandemics play a crucial role in effective public health management and decision-making. Traditional approaches primarily rely on epidemiological data, overlooking other valuable sources of information that could act as sensors or indicators of pandemic patterns. In this paper, we propose a novel framework called MGL4MEP that integrates temporal graph…

    Submitted 23 October, 2023; originally announced October 2023.

  12. arXiv:2310.11477  [pdf, other]

    cs.LG cs.AI

    Robust-MBFD: A Robust Deep Learning System for Motor Bearing Faults Detection Using Multiple Deep Learning Training Strategies and A Novel Double Loss Function

    Authors: Khoa Tran, Lam Pham, Hai-Canh Vu

    Abstract: This paper presents a comprehensive analysis of motor bearing fault detection (MBFD), which involves the task of identifying faults in a motor bearing based on its vibration. To this end, we first propose and evaluate various machine learning based systems for the MBFD task. Furthermore, we propose three deep learning based systems for the MBFD task, each of which explores one of the following tra…

    Submitted 17 October, 2023; originally announced October 2023.

  13. arXiv:2310.10875  [pdf, other]

    cs.CV cs.CG

    Filling the Holes on 3D Heritage Object Surface based on Automatic Segmentation Algorithm

    Authors: Sinh Van Nguyen, Son Thanh Le, Minh Khai Tran, Le Thanh Sach

    Abstract: Reconstructing and processing the 3D objects are popular activities in the research field of computer graphics, image processing and computer vision. The 3D objects are processed based on the methods like geometric modeling, a branch of applied mathematics and computational geometry, or the machine learning algorithms based on image processing. The computation of geometrical objects includes proce…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 20 pages, 11 figures, 37 references

  14. arXiv:2310.06300  [pdf, other]

    cs.CR cs.SE

    An Empirically Grounded Reference Architecture for Software Supply Chain Metadata Management

    Authors: Nguyen Khoi Tran, Samodha Pallewatta, M. Ali Babar

    Abstract: With the rapid rise in Software Supply Chain (SSC) attacks, organisations need thorough and trustworthy visibility over the entire SSC of their software inventory to detect risks early and identify compromised assets rapidly in the event of an SSC attack. One way to achieve such visibility is through SSC metadata, machine-readable and authenticated documents describing an artefact's lifecycle. Ado…

    Submitted 8 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted for full paper presentation at EASE 2024 conference

  15. arXiv:2310.00273  [pdf, other]

    cs.RO math.OC

    Safe Stabilizing Control for Polygonal Robots in Dynamic Elliptical Environments

    Authors: Kehan Long, Khoa Tran, Melvin Leok, Nikolay Atanasov

    Abstract: This paper addresses the challenge of safe navigation for rigid-body mobile robots in dynamic environments. We introduce an analytic approach to compute the distance between a polygon and an ellipse, and employ it to construct a control barrier function (CBF) for safe control synthesis. Existing CBF design methods for mobile robot obstacle avoidance usually assume point or circular robots, prevent…

    Submitted 30 April, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: 2024 American Control Conference

  16. Test-Case Quality -- Understanding Practitioners' Perspectives

    Authors: Huynh Khanh Vi Tran, Nauman Bin Ali, Jürgen Börstler, Michael Unterkalmsteiner

    Abstract: Background: Test-case quality has always been one of the major concerns in software testing. To improve test-case quality, it is important to better understand how practitioners perceive the quality of test-cases. Objective: Motivated by that need, we investigated how practitioners define test-case quality and which aspects of test-cases are important for quality assessment. Method: We conducted s…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: PROFES 2019: 37-52

  17. arXiv:2309.12972  [pdf, other]

    cs.CV

    License Plate Recognition Based On Multi-Angle View Model

    Authors: Dat Tran-Anh, Khanh Linh Tran, Hoai-Nam Vu

    Abstract: In the realm of research, the detection/recognition of text within images/videos captured by cameras constitutes a highly challenging problem for researchers. Despite certain advancements achieving high accuracy, current methods still require substantial improvements to be applicable in practical scenarios. Diverging from text detection in images/videos, this paper addresses the issue of text dete…

    Submitted 22 September, 2023; originally announced September 2023.

  18. arXiv:2309.10550  [pdf, other]

    cs.DC

    Addressing the Scalability Bottleneck of Semantic Technologies at Bosch

    Authors: Diego Rincon-Yanez, Mohamed H. Gad-Elrab, Daria Stepanova, Kien Trung Tran, Cuong Chu Xuan, Baifan Zhou, Evgeny Karlamov

    Abstract: At the heart of smart manufacturing is real-time semi-automatic decision-making. Such decisions are vital for optimizing production lines, e.g., reducing resource consumption, improving the quality of discrete manufacturing operations, and optimizing the actual products, e.g., optimizing the sampling rate for measuring product dimensions during production. Such decision-making relies on massive in…

    Submitted 19 September, 2023; originally announced September 2023.

    Journal ref: Industry Track - Extended Semantic Web Conference (ESWC2023)

  19. arXiv:2309.06157  [pdf, other]

    cs.LG cs.AI

    Robust-MBDL: A Robust Multi-branch Deep Learning Based Model for Remaining Useful Life Prediction and Operational Condition Identification of Rotating Machines

    Authors: Khoa Tran, Hai-Canh Vu, Lam Pham, Nassim Boudaoud

    Abstract: In this paper, a Robust Multi-branch Deep learning-based system for remaining useful life (RUL) prediction and condition operations (CO) identification of rotating machines is proposed. In particular, the proposed system comprises main components: (1) an LSTM-Autoencoder to denoise the vibration data; (2) a feature extraction to generate time-domain, frequency-domain, and time-frequency based feat…

    Submitted 14 December, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

  20. arXiv:2308.11596  [pdf, other]

    cs.CL

    SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

    Authors: Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Pengwei Li, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Ethan Ye, Bapi Akula, Peng-Jen Chen, Naji El Hachem, Brian Ellis, Gabriel Mejia Gonzalez, Justin Haaheim, et al. (43 additional authors not shown)

    Abstract: What does it take to create the Babel Fish, a tool that can help individuals translate speech between any two languages? While recent breakthroughs in text-based models have pushed machine translation coverage beyond 200 languages, unified speech-to-speech translation models have yet to achieve similar strides. More specifically, conventional speech-to-speech translation systems rely on cascaded s…

    Submitted 24 October, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    ACM Class: I.2.7

  21. arXiv:2308.07601  [pdf, ps, other]

    cs.CL

    VBD-MT Chinese-Vietnamese Translation Systems for VLSP 2022

    Authors: Hai Long Trieu, Song Kiet Bui, Tan Minh Tran, Van Khanh Tran, Hai An Nguyen

    Abstract: We present our systems participated in the VLSP 2022 machine translation shared task. In the shared task this year, we participated in both translation tasks, i.e., Chinese-Vietnamese and Vietnamese-Chinese translations. We build our systems based on the neural-based Transformer model with the powerful multilingual denoising pre-trained model mBART. The systems are enhanced by a sampling method fo…

    Submitted 15 August, 2023; originally announced August 2023.

  22. arXiv:2307.15335  [pdf, other]

    cs.CL cs.CV

    BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering

    Authors: Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu Thuy Nguyen

    Abstract: Visual Question Answering (VQA) is an intricate and demanding task that integrates natural language processing (NLP) and computer vision (CV), capturing the interest of researchers. The English language, renowned for its wealth of resources, has witnessed notable advancements in both datasets and models designed for VQA. However, there is a lack of models that target specific countries such as Vie…

    Submitted 28 July, 2023; originally announced July 2023.

  23. arXiv:2306.06620  [pdf, other]

    cs.SE cs.AI

    ARIST: An Effective API Argument Recommendation Approach

    Authors: Son Nguyen, Cuong Tran Manh, Kien T. Tran, Tan M. Nguyen, Thu-Trang Nguyen, Kien-Tuan Ngo, Hieu Dinh Vo

    Abstract: Learning and remembering to use APIs are difficult. Several techniques have been proposed to assist developers in using APIs. Most existing techniques focus on recommending the right API methods to call, but very few techniques focus on recommending API arguments. In this paper, we propose ARIST, a novel automated argument recommendation approach which suggests arguments by predicting developers'…

    Submitted 11 June, 2023; originally announced June 2023.

  24. arXiv:2305.17648  [pdf, other]

    cs.CV

    Z-GMOT: Zero-shot Generic Multiple Object Tracking

    Authors: Kim Hoang Tran, Anh Duy Le Dinh, Tien Phat Nguyen, Thinh Phan, Pha Nguyen, Khoa Luu, Donald Adjeroh, Gianfranco Doretto, Ngan Hoang Le

    Abstract: Despite recent significant progress, Multi-Object Tracking (MOT) faces limitations such as reliance on prior knowledge and predefined categories and struggles with unseen objects. To address these issues, Generic Multiple Object Tracking (GMOT) has emerged as an alternative approach, requiring less prior information. However, current GMOT methods often rely on initial bounding boxes and struggle t…

    Submitted 13 June, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

  25. arXiv:2305.16474  [pdf, other]

    cs.LG cs.CR cs.CY

    FairDP: Certified Fairness with Differential Privacy

    Authors: Khang Tran, Ferdinando Fioretto, Issa Khalil, My T. Thai, NhatHai Phan

    Abstract: This paper introduces FairDP, a novel mechanism designed to achieve certified fairness with differential privacy (DP). FairDP independently trains models for distinct individual groups, using group-specific clipping terms to assess and bound the disparate impacts of DP. Throughout the training process, the mechanism progressively integrates knowledge from group models to formulate a comprehensive…

    Submitted 21 August, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

  26. arXiv:2304.06053  [pdf, other]

    cs.CV

    TextANIMAR: Text-based 3D Animal Fine-Grained Retrieval

    Authors: Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran, Tuan-Anh Yang, Kim-Phat Tran, Nhu-Vinh Hoang, Minh-Quang Nguyen, E-Ro Nguyen, Minh-Khoi Nguyen-Nhat, Tuan-An To, Trung-Truc Huynh-Le, Nham-Tan Nguyen, Hoang-Chau Luong, et al. (8 additional authors not shown)

    Abstract: 3D object retrieval is an important yet challenging task that has drawn more and more attention in recent years. While existing approaches have made strides in addressing this issue, they are often limited to restricted settings such as image and sketch queries, which are often unfriendly interactions for common users. In order to overcome these limitations, this paper presents a novel SHREC chall…

    Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to Computers and Graphics (3DOR, Journal Track)

  27. arXiv:2304.05731  [pdf, other]

    cs.CV

    SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval

    Authors: Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Nhat-Quynh Le-Pham, Huu-Phuc Pham, Trong-Vu Hoang, Quang-Binh Nguyen, Trong-Hieu Nguyen-Mau, Tuan-Luc Huynh, Thanh-Danh Le, Ngoc-Linh Nguyen-Ha, Tuong-Vy Truong-Thuy, Truong Hoai Phong, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran, et al. (9 additional authors not shown)

    Abstract: The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this…

    Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to Computers & Graphics (3DOR 2023, Journal track)

  28. arXiv:2302.12685  [pdf, other]

    cs.LG cs.AI cs.CR

    Active Membership Inference Attack under Local Differential Privacy in Federated Learning

    Authors: Truc Nguyen, Phung Lai, Khang Tran, NhatHai Phan, My T. Thai

    Abstract: Federated learning (FL) was originally regarded as a framework for collaborative learning among clients with data privacy protection through a coordinating server. In this paper, we propose a new active membership inference (AMI) attack carried out by a dishonest server in FL. In AMI attacks, the server crafts and embeds malicious parameters into global models to effectively infer whether a target…

    Submitted 24 July, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: Published at AISTATS 2023

    Journal ref: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:5714-5730, 2023

  29. EVJVQA Challenge: Multilingual Visual Question Answering

    Authors: Ngan Luu-Thuy Nguyen, Nghia Hieu Nguyen, Duong T. D Vo, Khanh Quoc Tran, Kiet Van Nguyen

    Abstract: Visual Question Answering (VQA) is a challenging task of natural language processing (NLP) and computer vision (CV), attracting significant attention from researchers. English is a resource-rich language that has witnessed various developments in datasets and models for visual question answering. Visual question answering in other languages also would be developed for resources and models. In addi…

    Submitted 17 April, 2024; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: VLSP2022 EVJVQA challenge

  30. arXiv:2301.10186  [pdf, other]

    cs.CL

    ViHOS: Hate Speech Spans Detection for Vietnamese

    Authors: Phu Gia Hoang, Canh Duc Luu, Khanh Quoc Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms. This could make it difficult for human moderators to review tagged comments filtered by classification systems. To help address this issue, we present the ViHOS (Vietnamese Hate and Offensive Spans) dataset, the first human-annotated corpus cont…

    Submitted 26 January, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

    Comments: EACL 2023

  31. arXiv:2212.14050  [pdf, other]

    cs.LG cs.AI eess.SP

    Proof of Swarm Based Ensemble Learning for Federated Learning Applications

    Authors: Ali Raza, Kim Phuc Tran, Ludovic Koehl, Shujun Li

    Abstract: Ensemble learning combines results from multiple machine learning models in order to provide a better and optimised predictive model with reduced bias, variance and improved predictions. However, in federated learning it is not feasible to apply centralised ensemble learning directly due to privacy concerns. Hence, a mechanism is required to combine results of local models to produce a global mode…

    Submitted 2 January, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

    Comments: This is the full edition of a 4-page poster paper published at the Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing (SAC '23) which can be accessed via the following DOI link: https://doi.org/10.1145/3555776.3578601

  32. arXiv:2211.11502  [pdf, other]

    cs.LG physics.app-ph

    Differentiable Physics-based Greenhouse Simulation

    Authors: Nhat M. Nguyen, Hieu T. Tran, Minh V. Duong, Hanh Bui, Kenneth Tran

    Abstract: We present a differentiable greenhouse simulation model based on physical processes whose parameters can be obtained by training from real data. The physics-based simulation model is fully interpretable and is able to do state prediction for both climate and crop dynamics in the greenhouse over a very long time horizon. The model works by constructing a system of linear differential equations and…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted at the Machine Learning and the Physical Sciences workshop, NeurIPS 2022. 7 pages, 2 figures

  33. arXiv:2211.08170  [pdf, other]

    cs.CL cs.DB cs.IR cs.LG

    A Comparative Study of Question Answering over Knowledge Bases

    Authors: Khiem Vinh Tran, Hao Phu Phan, Khang Nguyen Duc Quach, Ngan Luu-Thuy Nguyen, Jun Jo, Thanh Tam Nguyen

    Abstract: Question answering over knowledge bases (KBQA) has become a popular approach to help users extract information from knowledge bases. Although several systems exist, choosing one suitable for a particular application scenario is difficult. In this article, we provide a comparative study of six representative KBQA systems on eight benchmark datasets. In that, we study various question types, propert…

    Submitted 15 November, 2022; originally announced November 2022.

  34. arXiv:2211.06474  [pdf, other]

    cs.CL cs.SD eess.AS

    Speech-to-Speech Translation For A Real-world Unwritten Language

    Authors: Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee

    Abstract: We study speech-to-speech translation (S2ST) that translates speech from one language into another language and focuses on building systems to support languages without standard text writing systems. We use English-Taiwanese Hokkien as a case study, and present an end-to-end solution from training data collection, modeling choices to benchmark dataset release. First, we present efforts on creating…

    Submitted 11 November, 2022; originally announced November 2022.

  35. arXiv:2211.05766  [pdf, other]

    cs.LG cs.CR

    Heterogeneous Randomized Response for Differential Privacy in Graph Neural Networks

    Authors: Khang Tran, Phung Lai, NhatHai Phan, Issa Khalil, Yao Ma, Abdallah Khreishah, My Thai, Xintao Wu

    Abstract: Graph neural networks (GNNs) are susceptible to privacy inference attacks (PIAs), given their ability to learn joint representation from features and edges among nodes in graph data. To prevent privacy leakages in GNNs, we propose a novel heterogeneous randomized response (HeteroRR) mechanism to protect nodes' features and edges against PIAs under differential privacy (DP) guarantees without an un…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: Accepted in IEEE BigData 2022 (short paper)

  36. arXiv:2210.13700  [pdf, other]

    eess.AS cs.CL cs.LG

    Does Joint Training Really Help Cascaded Speech Translation?

    Authors: Viet Anh Khoa Tran, David Thulke, Yingbo Gao, Christian Herold, Hermann Ney

    Abstract: Currently, in speech translation, the straightforward approach - cascading a recognition system with a translation system - delivers state-of-the-art results. However, fundamental challenges such as error propagation from the automatic speech recognition system still remain. To mitigate these problems, recently, people turn their attention to direct data and propose various joint training methods.…

    Submitted 24 November, 2022; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022

  37. arXiv:2210.08610  [pdf, other]

    cs.SD cs.AI eess.AS

    Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context

    Authors: Lam Pham, Dusan Salovic, Anahid Jalali, Alexander Schindler, Khoa Tran, Canh Vu, Phu X. Nguyen

    Abstract: In this paper, we present a comprehensive analysis of Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. In particular, we firstly propose an inception-based and low footprint ASC model, referred to as the ASC baseline. The proposed ASC baseline is then compared with benchmark and high-complexity network architectures of Mobile…

    Submitted 16 October, 2022; originally announced October 2022.

  38. arXiv:2210.06408  [pdf, other]

    cs.CL cs.AI

    PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation

    Authors: Ishan Jindal, Alexandre Rademaker, Khoi-Nguyen Tran, Huaiyu Zhu, Hiroshi Kanayama, Marina Danilevsky, Yunyao Li

    Abstract: Semantic role labeling (SRL) identifies the predicate-argument structure in a sentence. This task is usually accomplished in four steps: predicate identification, predicate sense disambiguation, argument identification, and argument classification. Errors introduced at one step propagate to later steps. Unfortunately, the existing SRL evaluation scripts do not consider the full effect of this erro…

    Submitted 12 October, 2022; originally announced October 2022.

  39. arXiv:2207.08486  [pdf, other]

    cs.LG cs.CR

    Using Anomaly Detection to Detect Poisoning Attacks in Federated Learning Applications

    Authors: Ali Raza, Shujun Li, Kim-Phuc Tran, Ludovic Koehl

    Abstract: Adversarial attacks such as poisoning attacks have attracted the attention of many machine learning researchers. Traditionally, poisoning attacks attempt to inject adversarial training data in order to manipulate the trained model. In federated learning (FL), data poisoning attacks can be generalized to model poisoning attacks, which cannot be detected by simpler methods due to the lack of access…

    Submitted 9 May, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: We will update this article soon

  40. arXiv:2207.05851  [pdf, ps, other]

    cs.CL

    Sockeye 3: Fast Neural Machine Translation with PyTorch

    Authors: Felix Hieber, Michael Denkowski, Tobias Domhan, Barbara Darques Barros, Celina Dong Ye, Xing Niu, Cuong Hoang, Ke Tran, Benjamin Hsu, Maria Nadejde, Surafel Lakew, Prashant Mathur, Anna Currey, Marcello Federico

    Abstract: Sockeye 3 is the latest version of the Sockeye toolkit for Neural Machine Translation (NMT). Now based on PyTorch, Sockeye 3 provides faster model implementations and more advanced features with a further streamlined codebase. This enables broader experimentation with faster iteration, efficient training of stronger and faster models, and the flexibility to move new ideas quickly from research to…

    Submitted 2 August, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

  41. arXiv:2206.13392  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    Remote Sensing Image Classification using Transfer Learning and Attention Based Deep Neural Network

    Authors: Lam Pham, Khoa Tran, Dat Ngo, Jasmin Lampert, Alexander Schindler

    Abstract: The task of remote sensing image scene classification (RSISC), which aims at classifying remote sensing images into groups of semantic categories based on their contents, has taken an important role in a wide range of applications such as urban planning, natural hazards detection, environment monitoring, vegetation mapping, or geospatial object detection. During the past years, research community…

    Submitted 20 June, 2022; originally announced June 2022.

  42. arXiv:2206.10110  [pdf, other]

    cs.SE

    ProML: A Decentralised Platform for Provenance Management of Machine Learning Software Systems

    Authors: Nguyen Khoi Tran, Bushra Sabir, M. Ali Babar, Nini Cui, Mehran Abolhasan, Justin Lipman

    Abstract: Large-scale Machine Learning (ML) based Software Systems are increasingly developed by distributed teams situated in different trust domains. Insider threats can launch attacks from any domain to compromise ML assets (models and datasets). Therefore, practitioners require information about how and by whom ML assets were developed to assess their quality attributes such as security, safety, and fai…

    Submitted 21 June, 2022; originally announced June 2022.

    Comments: Accepted as full paper in ECSA 2022 conference. To be presented

  43. arXiv:2206.00524  [pdf, other]

    cs.CL cs.AI cs.LG

    Vietnamese Hate and Offensive Detection using PhoBERT-CNN and Social Media Streaming Data

    Authors: Khanh Q. Tran, An T. Nguyen, Phu Gia Hoang, Canh Duc Luu, Trong-Hop Do, Kiet Van Nguyen

    Abstract: Society needs to develop a system to detect hate and offense to build a healthy and safe environment. However, current research in this field still faces four major shortcomings, including deficient pre-processing techniques, indifference to data imbalance issues, modest performance models, and lacking practical applications. This paper focused on developing an intelligent system capable of addres…

    Submitted 1 June, 2022; originally announced June 2022.

  44. Mod2Dash: A Framework for Model-Driven Dashboards Generation

    Authors: Liuyue Jiang, Nguyen Khoi Tran, M. Ali Babar

    Abstract: The construction of an interactive dashboard involves deciding on what information to present and how to display it and implementing those design decisions to create an operational dashboard. Traditionally, a dashboard's design is implied in the deployed dashboard rather than captured explicitly as a digital artifact, preventing it from being backed up, version-controlled, and shared. Moreover, pr…

    Submitted 15 May, 2022; originally announced May 2022.

  45. arXiv:2205.06618  [pdf, other]

    cs.CL cs.AI cs.LG

    The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation

    Authors: Tobias Domhan, Eva Hasler, Ke Tran, Sony Trenous, Bill Byrne, Felix Hieber

    Abstract: Vocabulary selection, or lexical shortlisting, is a well-known technique to improve latency of Neural Machine Translation models by constraining the set of allowed output words during inference. The chosen set is typically determined by separately trained alignment model parameters, independent of the source-sentence context at inference time. While vocabulary selection appears competitive with re…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: NAACL 2022

  46. A Framework for Automating Deployment and Evaluation of Blockchain Network

    Authors: Nguyen Khoi Tran, M. Ali Babar, Andrew Walters

    Abstract: Blockchain network deployment and evaluation have become prevalent due to the demand for private blockchains by enterprises, governments, and edge computing systems. Whilst a blockchain network's deployment and evaluation are driven by its architecture, practitioners still need to learn and carry out many repetitive and error-prone activities to transform architecture into an operational blockchai…

    Submitted 24 July, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: Published in the Journal of Network and Computer Applications

  47. arXiv:2110.00244  [pdf, other]

    cs.CV cs.AI

    Lightweight Transformer in Federated Setting for Human Activity Recognition

    Authors: Ali Raza, Kim Phuc Tran, Ludovic Koehl, Shujun Li, Xianyi Zeng, Khaled Benzaidi

    Abstract: Human activity recognition (HAR) is a machine learning task with important applications in healthcare especially in the context of home care of patients and older adults. HAR is often based on data collected from smart sensors, particularly smart home IoT devices such as smartphones, wearables and other body sensors. Deep learning techniques like convolutional neural networks (CNNs) and recurrent…

    Submitted 4 November, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

    Comments: Submitted to Journal of Biomedical Informatics

  48. arXiv:2109.06449  [pdf, other]

    cs.AI cs.CR cs.LG

    Deep hierarchical reinforcement agents for automated penetration testing

    Authors: Khuong Tran, Ashlesha Akella, Maxwell Standen, Junae Kim, David Bowman, Toby Richer, Chin-Teng Lin

    Abstract: Penetration testing, the organised attack of a computer system in order to test existing defences, has been used extensively to evaluate network security. This is a time-consuming process and requires in-depth knowledge for the establishment of a strategy that resembles a real cyber-attack. This paper presents a novel deep reinforcement learning architecture with hierarchically structured agents cal…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Presented at 1st International Workshop on Adaptive Cyber Defense, 2021 (arXiv:2108.08476)

    Report number: IJCAI-ACD/2021/114

  49. arXiv:2108.06239  [pdf, ps, other]

    cs.DS cs.DM math.CO math.OC

    A Faster Algorithm for Quickest Transshipments via an Extended Discrete Newton Method

    Authors: Miriam Schlöter, Martin Skutella, Khai Van Tran

    Abstract: The Quickest Transshipment Problem is to route flow as quickly as possible from sources with supplies to sinks with demands in a network with capacities and transit times on the arcs. It is of fundamental importance for numerous applications in areas such as logistics, production, traffic, evacuation, and finance. More than 25 years ago, Hoppe and Tardos presented the first (strongly) polynomial-t…

    Submitted 13 August, 2021; originally announced August 2021.

  50. arXiv:2106.11466  [pdf, other]

    cs.CV

    Gait analysis with curvature maps: A simulation study

    Authors: Khac Chinh Tran, Marc Daniel, Jean Meunier

    Abstract: Gait analysis is an important aspect of clinical investigation for detecting neurological and musculoskeletal disorders and assessing the global health of a patient. In this paper we propose to focus our attention on extracting relevant curvature information from the body surface provided by a depth camera. We assumed that the 3D mesh was made available in a previous step and demonstrated how curv…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: 4 pages, 5 figures