Search | arXiv e-print repository

doi 10.1145/3689775

Concurrent Data Structures Made Easy (Extended Version)

Authors: Callista Le, Kiran Gopinathan, Koon Wen Lee, Seth Gilbert, Ilya Sergey

Abstract: Design of an efficient thread-safe concurrent data structure is a balancing act between its implementation complexity and performance. Lock-based concurrent data structures, which are relatively easy to derive from their sequential counterparts and to prove thread-safe, suffer from poor throughput under even light multi-threaded workload. At the same time, lock-free concurrent structures allow for high throughput, but are notoriously difficult to get right and require careful reasoning to formally establish their correctness. We explore a solution to this conundrum based on batch parallelism, an approach for designing concurrent data structures via a simple insight: efficiently processing a batch of a priori known operations in parallel is easier than optimising performance for a stream of arbitrary asynchronous requests. Alas, batch-parallel structures have not seen wide practical adoption due to (i) the inconvenience of having to structure multi-threaded programs to explicitly group operations and (ii) the lack of a systematic methodology to implement batch-parallel structures as simply as lock-based ones. We present OBatcher, an OCaml library that streamlines the design, implementation, and usage of batch-parallel structures. It solves the first challenge (how to use) by suggesting a new lightweight implicit batching design that is built on top of generic asynchronous programming mechanisms. The second challenge (how to implement) is addressed by identifying a family of strategies for converting common sequential structures into efficient batch-parallel ones. We showcase OBatcher with a diverse set of benchmarks. Our evaluation of all the implementations on large asynchronous workloads shows that (a) they consistently outperform the corresponding coarse-grained lock-based implementations and that (b) their throughput scales reasonably with the number of processors.

Submitted 25 August, 2024; originally announced August 2024.

Comments: Extended version of the OOPSLA'24 paper
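
The implicit batching idea in the abstract above can be illustrated with a minimal sketch. It is written in Python with asyncio rather than OCaml, and it is not OBatcher's API: the class `BatchedCounter`, its `add` method, and the `batch_sizes` field are hypothetical names chosen for illustration. Callers await individual operations, and the structure groups every request that arrives before the next event-loop turn into one batch. The sketch applies the batch sequentially; it only shows where a batch-parallel implementation would receive the full set of operations up front.

```python
import asyncio


class BatchedCounter:
    """Hypothetical counter: callers await single operations, the counter groups them."""

    def __init__(self):
        self.value = 0
        self.pending = []        # (delta, future) pairs waiting for the next batch
        self.scheduled = False   # True once a batch run has been queued on the loop
        self.batch_sizes = []    # kept only so the grouping can be inspected

    async def add(self, delta):
        loop = asyncio.get_running_loop()
        done = loop.create_future()
        self.pending.append((delta, done))
        if not self.scheduled:
            # The first caller schedules one batch run; later callers just join it.
            self.scheduled = True
            loop.call_soon(self._run_batch)
        return await done

    def _run_batch(self):
        batch, self.pending = self.pending, []
        self.scheduled = False
        self.batch_sizes.append(len(batch))
        # The whole batch is known at this point, so a real batch-parallel
        # structure could process it in parallel. This sketch applies it in order.
        for delta, done in batch:
            self.value += delta
            done.set_result(self.value)


async def demo():
    counter = BatchedCounter()
    # Eight independent callers, none of which groups operations explicitly.
    results = await asyncio.gather(*(counter.add(1) for _ in range(8)))
    return list(results), counter


results, counter = asyncio.run(demo())
```

After the run, `counter.batch_sizes` records how the eight individual requests were grouped, while each caller still receives its own result.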

arXiv:2408.12627

Machine-Learning-Based Construction of Molecular Potential and Its Application in Exploring the Deep-Lying-Orbital Effect in High-Order Harmonic Generation

Authors: Duong D. Hoang-Trong, Khang Tran, Doan-An Trieu, Quan-Hao Truong, Kim-Ngan H. Nguyen, Cam-Tu Le, DinhDuy Vu, Ngoc-Hung Phan, Ngoc-Ty Nguyen, Van-Hoang Le, Ngoc-Loan Phan

Abstract: Creating soft-Coulomb-type (SC) molecular potential within single-active-electron approximation (SAE) is essential since it allows solving time-dependent Schrödinger equations with fewer computational resources compared to other multielectron methods. The currently available SC potentials can accurately reproduce the energy of the highest occupied molecular orbital (HOMO), which is sufficient for analyzing nonlinear effects in laser-molecule interactions like high-order harmonic generation (HHG). However, recent discoveries of significant effects of deep-lying molecular orbitals call for more precise potentials to analyze them. In this study, we present a fast and accurate method based on machine learning to construct SC potentials that simultaneously reproduce various molecular features, including energies, symmetries, and dipole moments of HOMO, HOMO-1, and HOMO-2. We use this ML model to create SC SAE potentials of the HCN molecule and then comprehensively analyze the fingerprints of lower-lying orbitals in HHG spectra emitted during the H-CN stretching. Our findings reveal that HOMO-1 plays a role in forming the second HHG plateau. Additionally, as the H-C distance increases, the plateau structure and the smoothness of HHG spectra are altered due to the redistribution of orbital electron density. These results are in line with other experimental and theoretical studies. Lastly, the machine learning approach using deconvolution and convolution neural networks in the present study is so general that it can be applied to construct molecular potential for other molecules and molecular dynamic processes.

Submitted 4 September, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

arXiv:2408.12593

Automating Deformable Gasket Assembly

Authors: Simeon Adebola, Tara Sadjadpour, Karim El-Refai, Will Panitch, Zehan Ma, Roy Lin, Tianshuang Qiu, Shreya Ganti, Charlotte Le, Jaimyn Drake, Ken Goldberg

Abstract: In Gasket Assembly, a deformable gasket must be aligned and pressed into a narrow channel. This task is common for sealing surfaces in the manufacturing of automobiles, appliances, electronics, and other products. Gasket Assembly is a long-horizon, high-precision task and the gasket must align with the channel and be fully pressed in to achieve a secure fit. To compare approaches, we present 4 methods for Gasket Assembly: one policy from deep imitation learning and three procedural algorithms. We evaluate these methods with 100 physical trials. Results suggest that the Binary+ algorithm succeeds in 10/10 on the straight channel whereas the learned policy based on 250 human teleoperated demonstrations succeeds in 8/10 trials and is significantly slower. Code, CAD models, videos, and data can be found at https://berkeleyautomation.github.io/robot-gasket/

Submitted 22 August, 2024; originally announced August 2024.

Comments: Content without Appendix accepted for IEEE CASE 2024

arXiv:2408.02816

Learning to Predict Program Execution by Modeling Dynamic Dependency on Code Graphs

Authors: Cuong Chi Le, Hoang Nhat Phan, Huy Nhat Phan, Tien N. Nguyen, Nghi D. Q. Bui

Abstract: Predicting program behavior without execution is a crucial and challenging task in software engineering. Traditional models often struggle to capture the dynamic dependencies and interactions within code. This paper introduces a novel machine learning-based framework called CodeFlow, designed to predict code coverage and detect runtime errors through Dynamic Dependencies Learning. By utilizing control flow graphs (CFGs), CodeFlow represents all possible execution paths and the relationships between different statements, providing a comprehensive understanding of program behavior. CodeFlow constructs CFGs to depict execution paths and learns vector representations for CFG nodes, capturing static control-flow dependencies. Additionally, it learns dynamic dependencies through execution traces, which reflect the impacts among statements during execution. This approach enables accurate prediction of code coverage and effective identification of runtime errors. Empirical evaluations demonstrate significant improvements in code coverage prediction accuracy and effective localization of runtime errors, outperforming existing models.

Submitted 9 August, 2024; v1 submitted 5 August, 2024; originally announced August 2024.

arXiv:2407.19203

Towards Clean-Label Backdoor Attacks in the Physical World

Authors: Thinh Dao, Cuong Chi Le, Khoa D Doan, Kok-Seng Wong

Abstract: Deep Neural Networks (DNNs) are vulnerable to backdoor poisoning attacks, with most research focusing on digital triggers, special patterns digitally added to test-time inputs to induce targeted misclassification. In contrast, physical triggers, which are natural objects within a physical scene, have emerged as a desirable alternative since they enable real-time backdoor activations without digital manipulation. However, current physical attacks require that poisoned inputs have incorrect labels, making them easily detectable upon human inspection. In this paper, we collect a facial dataset of 21,238 images with 7 common accessories as triggers and use it to study the threat of clean-label backdoor attacks in the physical world. Our study reveals two findings. First, the success of physical attacks depends on the poisoning algorithm, physical trigger, and the pair of source-target classes. Second, although clean-label poisoned samples preserve ground-truth labels, their perceptual quality could be seriously degraded due to conspicuous artifacts in the images. Such samples are also vulnerable to statistical filtering methods because they deviate from the distribution of clean samples in the feature space. To address these issues, we propose replacing the standard $\ell_\infty$ regularization with a novel pixel regularization and feature regularization that could enhance the imperceptibility of poisoned samples without compromising attack performance. Our study highlights accidental backdoor activations as a key limitation of clean-label physical backdoor attacks. This happens when unintended objects or classes accidentally cause the model to misclassify as the target class.

Submitted 27 July, 2024; originally announced July 2024.

Comments: 36 pages, 18 figures, 18 papers, submitted to NeurIPS 2024

arXiv:2407.08439

A fitted space-time finite element method for an advection-diffusion problem with moving interfaces

Authors: Quang Huy Nguyen, Van Chien Le, Phuong Cuc Hoang, Thi Thanh Mai Ta

Abstract: This paper presents a fitted space-time finite element method for solving a parabolic advection-diffusion problem with a nonstationary interface. The jumping diffusion coefficient gives rise to the discontinuity of the spatial gradient of solution across the interface. We use the Banach-Necas-Babuska theorem to show the well-posedness of the continuous variational problem. A fully discrete finite-element based scheme is analyzed using the Galerkin method and unstructured fitted meshes. An optimal error estimate is established in a discrete energy norm under appropriate globally low but locally high regularity conditions. Some numerical results corroborate our theoretical results.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 19 pages

arXiv:2406.04423

Determining the Number of Communities in Sparse and Imbalanced Settings

Authors: Zhixuan Shao, Can M. Le

Abstract: Community structures represent a crucial aspect of network analysis, and various methods have been developed to identify these communities. However, a common hurdle lies in determining the number of communities K, a parameter that often requires estimation in practice. Existing approaches for estimating K face two notable challenges: the weak community signal present in sparse networks and the imbalance in community sizes or edge densities that result in unequal per-community expected degree. We propose a spectral method based on a novel network operator whose spectral properties effectively overcome both challenges. This operator is a refined version of the non-backtracking operator, adapted from a "centered" adjacency matrix. Its leading eigenvalues are more concentrated than those of the adjacency matrix for sparse networks, while they also demonstrate enhanced signal under imbalance scenarios, a benefit attributed to the centering step. This is justified, either theoretically or numerically, under the null model K = 1, in both dense and ultra-sparse settings. A goodness-of-fit test based on the leading eigenvalue can be applied to determine the number of communities K.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.17809

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

Authors: Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng

Abstract: There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeline framework by concatenating speech recognition, machine translation and text-to-speech models. The primary challenges stem from the inherent complexities involved in direct translation tasks and the scarcity of data. In this study, we introduce a novel model framework TransVIP that leverages diverse datasets in a cascade fashion yet facilitates end-to-end inference through joint probability. Furthermore, we propose two separated encoders to preserve the speaker's voice characteristics and isochrony from the source speech during the translation process, making it highly suitable for scenarios such as video dubbing. Our experiments on the French-English language pair demonstrate that our model outperforms the current state-of-the-art speech-to-speech translation model.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Work in progress

arXiv:2405.11071

Boundary element methods for the magnetic field integral equation on polyhedra

Authors: Van Chien Le, Kristof Cools

Abstract: This paper provides a rigorous analysis on boundary element methods for the magnetic field integral equation on Lipschitz polyhedra. The magnetic field integral equation is widely used in practical applications to model electromagnetic scattering by a perfectly conducting body. The governing operator is shown to be coercive by means of the electric field integral operator with a purely imaginary wave number. Consequently, the continuous variational problem is uniquely solvable, given that the wave number does not belong to the spectrum of the interior Maxwell's problem. A Galerkin discretization scheme is then introduced, employing Raviart-Thomas basis functions for the solution space and Buffa-Christiansen functions for the test space. A discrete inf-sup condition is proven, implying the unique solvability of the discrete variational problem. An asymptotically quasi-optimal error estimate for the numerical solutions is established, and the convergence rate of the numerical scheme is examined. In addition, the resulting matrix system is shown to be well-conditioned regardless of the mesh refinement. Some numerical results are presented to support the theoretical analysis.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 18 pages, 4 figures

arXiv:2404.11792

Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study

Authors: Zooey Nguyen, Anthony Annunziata, Vinh Luong, Sang Dinh, Quynh Le, Anh Hai Ha, Chanh Le, Hong An Phan, Shruti Raghavan, Christopher Nguyen

Abstract: This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG). Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accuracy than generic models, with relatively greater gains attributable to fine-tuned embedding models. Additionally, employing reasoning iterations on top of RAG delivers an even bigger jump in performance, enabling the Q&A systems to get closer to human-expert quality. We discuss the implications of such findings, propose a structured technical design space capturing major technical components of Q&A AI, and provide recommendations for making high-impact technical choices for such components. We plan to follow up on this work with actionable guides for AI teams and further investigations into the impact of domain-specific augmentation in RAG and into agentic AI capabilities such as advanced planning and reasoning.

Submitted 19 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: Fixed typo of OODA's score on harder-question set in Table 2

arXiv:2404.10279

EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion

Authors: Cindy Le, Congrui Hetang, Chendi Lin, Ang Cao, Yihui He

Abstract: We present EucliDreamer, a simple and effective method to generate textures for 3D models given text prompts and meshes. The texture is parametrized as an implicit function on the 3D surface, which is optimized with the Score Distillation Sampling (SDS) process and differentiable rendering. To generate high-quality textures, we leverage a depth-conditioned Stable Diffusion model guided by the depth image rendered from the mesh. We test our approach on 3D models in Objaverse and conduct a user study, which shows its superior quality compared to existing texturing methods like Text2Tex. In addition, our method converges 2 times faster than DreamFusion. Through text prompting, textures of diverse art styles can be produced. We hope EucliDreamer provides a viable solution to automate a labor-intensive stage in 3D content creation.

Submitted 16 April, 2024; originally announced April 2024.

Comments: Short version of arXiv:2311.15573

arXiv:2403.16051

Segment Anything Model for Road Network Graph Extraction

Authors: Congrui Hetang, Haoru Xue, Cindy Le, Tianwei Yue, Wenping Wang, Yihui He

Abstract: We propose SAM-Road, an adaptation of the Segment Anything Model (SAM) for extracting large-scale, vectorized road network graphs from satellite imagery. To predict graph geometry, we formulate it as a dense semantic segmentation task, leveraging the inherent strengths of SAM. The image encoder of SAM is fine-tuned to produce probability masks for roads and intersections, from which the graph vertices are extracted via simple non-maximum suppression. To predict graph topology, we designed a lightweight transformer-based graph neural network, which leverages the SAM image embeddings to estimate the edge existence probabilities between vertices. Our approach directly predicts the graph vertices and edges for large regions without expensive and complex post-processing heuristics, and is capable of building complete road network graphs spanning multiple square kilometers in a matter of seconds. With its simple, straightforward, and minimalist design, SAM-Road achieves comparable accuracy with the state-of-the-art method RNGDet++, while being 40 times faster on the City-scale dataset. We thus demonstrate the power of a foundational vision model when applied to a graph learning task. The code is available at https://github.com/htcr/sam_road.

Submitted 12 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

Comments: Accepted by IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024, 2nd Workshop on Scene Graphs and Graph Representation Learning
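
The vertex extraction step mentioned in the SAM-Road abstract (non-maximum suppression on a probability mask) can be sketched as follows. This is a generic greedy variant, not the authors' code: the function name `extract_vertices`, the threshold of 0.5, the suppression radius of 1 pixel, and the use of Chebyshev distance are all assumptions made for illustration, since the abstract gives no such details.

```python
def extract_vertices(prob, threshold=0.5, radius=1):
    """Greedy non-maximum suppression on a 2D probability mask (list of rows).

    Keeps the strongest pixels and drops any candidate that lies within
    `radius` (Chebyshev distance) of a pixel that was already kept.
    """
    candidates = [
        (p, r, c)
        for r, row in enumerate(prob)
        for c, p in enumerate(row)
        if p >= threshold
    ]
    # Strongest first; ties are broken by position so the result is deterministic.
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))

    kept = []
    for _, r, c in candidates:
        if all(max(abs(r - kr), abs(c - kc)) > radius for kr, kc in kept):
            kept.append((r, c))
    return kept


# Toy mask with two blobs: one peaking at (1, 1) and one peaking at (3, 4).
mask = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.6, 0.0, 0.0],
    [0.1, 0.6, 0.3, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.0, 0.8],
]
vertices = extract_vertices(mask)  # [(1, 1), (3, 4)]
```

Each blob collapses to its strongest pixel, which is the property that turns a dense mask into a sparse set of graph vertices.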

arXiv:2403.12945

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Authors: Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, Peter David Fagan, Joey Hejna, Masha Itkina, Marion Lepert, Yecheng Jason Ma, Patrick Tree Miller, Jimmy Wu, Suneel Belkhale, Shivin Dass, Huy Ha, Arhan Jain, Abraham Lee, Youngwoon Lee, Marius Memmel, Sungjae Park, et al. (74 additional authors not shown)

Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.

Submitted 19 March, 2024; originally announced March 2024.

Comments: Project website: https://droid-dataset.github.io/

arXiv:2402.01866

Parametric Bootstrap on Networks with Non-Exchangeable Nodes

Authors: Zhixuan Shao, Can M. Le

Abstract: This paper studies the parametric bootstrap method for networks to quantify the uncertainty of statistics of interest. While existing network resampling methods primarily focus on count statistics under node-exchangeable (graphon) models, we consider more general network statistics (including local statistics) under the Chung-Lu model without node-exchangeability. We show that the natural network parametric bootstrap that first estimates the network generating model and then draws bootstrap samples from the estimated model generally suffers from bootstrap bias. As a general recipe for addressing this problem, we show that a two-level bootstrap procedure provably reduces the bias. This essentially extends the classical idea of iterative bootstrap to the network setting with a growing number of parameters. Moreover, the second-level bootstrap provides a way to construct higher-accuracy confidence intervals for many network statistics.

Submitted 25 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

MSC Class: 62H12 (Primary) 62H30 (Secondary)

arXiv:2401.03790

Inferring Properties of Graph Neural Networks

Authors: Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, Bach Le, David Lo, ThanhVu Nguyen, Corina Pasareanu

Abstract: We propose GNNInfer, the first automatic property inference technique for GNNs. To tackle the challenge of varying input structures in GNNs, GNNInfer first identifies a set of representative influential structures that contribute significantly towards the prediction of a GNN. Using these structures, GNNInfer converts each pair of an influential structure and the GNN to their equivalent FNN and then leverages existing property inference techniques to effectively capture properties of the GNN that are specific to the influential structures. GNNInfer then generalizes the captured properties to any input graphs that contain the influential structures. Finally, GNNInfer improves the correctness of the inferred properties by building a model (either a decision tree or linear regression) that estimates the deviation of GNN output from the inferred properties given full input graphs. The learned model helps GNNInfer extend the inferred properties with constraints to the input and output of the GNN, obtaining stronger properties that hold on full input graphs. Our experiments show that GNNInfer is effective in inferring likely properties of popular real-world GNNs, and more importantly, these inferred properties help effectively defend against GNNs' backdoor attacks. In particular, out of the 13 ground truth properties, GNNInfer re-discovered 8 correct properties and discovered likely correct properties that approximate the remaining 5 ground truth properties. Using properties inferred by GNNInfer to defend against the state-of-the-art backdoor attack technique on GNNs, namely UGBA, experiments show that GNNInfer's defense success rate is up to 30 times better than existing baselines.

Submitted 2 March, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: 20 pages main paper, 10 pages for appendix

arXiv:2312.16717

Landslide Detection and Segmentation Using Remote Sensing Images and Deep Neural Network

Authors: Cam Le, Lam Pham, Jasmin Lampert, Matthias Schlögl, Alexander Schindler

Abstract: Knowledge about historic landslide event occurrence is important for supporting disaster risk reduction strategies. Building upon findings from 2022 Landslide4Sense Competition, we propose a deep neural network based system for landslide detection and segmentation from multisource remote sensing image input. We use a U-Net trained with Cross Entropy loss as baseline model. We then improve the U-Net baseline model by leveraging a wide range of deep learning techniques. In particular, we conduct feature engineering by generating new band data from the original bands, which helps to enhance the quality of remote sensing image input. Regarding the network architecture, we replace traditional convolutional layers in the U-Net baseline by a residual-convolutional layer. We also propose an attention layer which leverages the multi-head attention scheme. Additionally, we generate multiple output masks with three different resolutions, which creates an ensemble of three outputs in the inference process to enhance the performance. Finally, we propose a combined loss function which leverages Focal loss and IoU loss to train the network. Our experiments on the development set of the Landslide4Sense challenge achieve an F1 score and an mIoU score of 84.07 and 76.07, respectively. Our best model setup outperforms the challenge baseline and the proposed U-Net baseline, improving the F1 score/mIoU score by 6.8/7.4 and 10.5/8.8, respectively.

Submitted 27 December, 2023; originally announced December 2023.
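
The combined Focal and IoU loss named in the landslide abstract can be sketched for the binary, per-pixel case. The abstract does not give the weighting or hyperparameters, so the equal weights, the focusing parameter `gamma=2.0`, the soft (differentiable) form of the IoU term, and all function names below are assumptions for illustration, not the authors' formulation.

```python
import math

EPS = 1e-7  # numerical guard against log(0) and division by zero


def focal_loss(probs, labels, gamma=2.0):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t) per pixel."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, EPS), 1.0 - EPS)
        p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def soft_iou_loss(probs, labels):
    """1 - soft IoU, with predicted probabilities used in place of hard masks."""
    inter = sum(p * y for p, y in zip(probs, labels))
    union = sum(probs) + sum(labels) - inter
    return 1.0 - (inter + EPS) / (union + EPS)


def combined_loss(probs, labels, w_focal=1.0, w_iou=1.0):
    """Weighted sum of the two terms; equal weights are an assumption."""
    return w_focal * focal_loss(probs, labels) + w_iou * soft_iou_loss(probs, labels)


labels = [1, 1, 0, 0]
good = combined_loss([0.9, 0.8, 0.1, 0.2], labels)  # confident and correct
bad = combined_loss([0.2, 0.3, 0.8, 0.7], labels)   # confident and wrong
```

The focal term down-weights pixels that are already classified well, while the IoU term scores the overlap of the whole mask, so `good` comes out lower than `bad`.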

arXiv:2312.06367 [pdf, other]

A stabilized time-domain combined field integral equation using the quasi-Helmholtz projectors

Authors: Van Chien Le, Pierrick Cordel, Francesco P. Andriulli, Kristof Cools

Abstract: This paper introduces a time-domain combined field integral equation for electromagnetic scattering by a perfect electric conductor. The new equation is obtained by leveraging the quasi-Helmholtz projectors, which separate both the unknown and the source fields into solenoidal and irrotational components. These two components are then appropriately rescaled to cure the solution from a loss of accuracy occurring when the time step is large. Yukawa-type integral operators of a purely imaginary wave number are also used as a Calderon preconditioner to eliminate the ill-conditioning of matrix systems. The stabilized time-domain electric and magnetic field integral equations are linearly combined in a Calderon-like fashion, then temporally discretized using an appropriate pair of trial functions, resulting in a marching-on-in-time linear system. The novel formulation is immune to spurious resonances, dense discretization breakdown, large-time step breakdown and dc instabilities stemming from non-trivial kernels. Numerical results for both simply-connected and multiply-connected scatterers corroborate the theoretical analysis.

Submitted 19 July, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: 13 pages

arXiv:2311.15834 [pdf, other]

doi 10.1103/PhysRevB.109.L041113

Charge-density wave transition in magnetic topological semimetal EuAl$_4$

Authors: R. Yang, C. C. Le, P. Zhu, Z. W. Wang, T. Shang, Y. M. Dai, J. P. Hu, M. Dressel

Abstract: The interplay among topology, charge-density wave (CDW), and magnetism can give rise to a plethora of exotic quantum phenomena. Recently, a group of magnetic topological semimetals with tetragonal lattices and CDW order were found to exhibit anomalous magnetic instability, helical spin ordering, and the presence of skyrmions. However, the underlying mechanism responsible for these observations remains unclear. Here, we conducted a comprehensive investigation into the impact of CDW on the topological and magnetic properties of EuAl$_4$ using optical spectroscopy and the first-principles calculations. Through optical spectroscopy, we observed a partial gap (60 meV) on the Fermi surface and an enhanced mid-infrared absorption around 0.4 eV after the CDW transition. Magneto-optical spectroscopy and the first-principles calculations proved that, by affecting the band structure, the CDW order frustrates the antiferromagnetic interactions but strengthens the ferromagnetic ones, which can destabilize the magnetism. With lower symmetry in the CDW ordered state, carriers from the Weyl bands will mediate the anisotropic magnetic interactions promoting the formation of chiral spin textures. Conversely, without the CDW order, the counterpart EuGa$_4$ shows robust collinear antiferromagnetic order. Our findings uncover the pivotal role played by CDW order in arousing intricate magnetism in topological materials and provide valuable insights into controlling topological and magnetic properties through the manipulation of CDW orders.

Submitted 27 November, 2023; originally announced November 2023.

Comments: 8 pages, 4 figures

Report number: RIKEN-iTHEMS-Report-24

arXiv:2311.15573 [pdf, other]

EucliDreamer: Fast and High-Quality Texturing for 3D Models with Stable Diffusion Depth

Authors: Cindy Le, Congrui Hetang, Chendi Lin, Ang Cao, Yihui He

Abstract: This paper presents a novel method to generate textures for 3D models given text prompts and 3D meshes. Additional depth information is taken into account to perform the Score Distillation Sampling (SDS) process with depth conditional Stable Diffusion. We ran our model over the open-source dataset Objaverse and conducted a user study to compare the results with those of various 3D texturing methods. We have shown that our model can generate more satisfactory results and produce various art styles for the same object. In addition, we achieved faster time when generating textures of comparable quality. We also conduct thorough ablation studies of how different factors may affect generation quality, including sampling steps, guidance scale, negative prompts, data augmentation, elevation range, and alternatives to SDS.

Submitted 13 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.07747 [pdf]

Magnetic-coupled electronic landscape in bilayer-distorted titanium-based kagome metals

Authors: Yong Hu, Congcong Le, Long Chen, Hanbin Deng, Ying Zhou, Nicholas C. Plumb, Milan Radovic, Ronny Thomale, Andreas P. Schnyder, Jia-Xin Yin, Gang Wang, Xianxin Wu, Ming Shi

Abstract: Quantum materials whose atoms are arranged on a lattice of corner-sharing triangles, $\textit{i.e.}$, the kagome lattice, have recently emerged as a captivating platform for investigating exotic correlated and topological electronic phenomena. Here, we combine ultra-low temperature angle-resolved photoemission spectroscopy (ARPES) with scanning tunneling microscopy and density functional theory calculations to reveal the fascinating electronic structure of the bilayer-distorted kagome material $\textit{Ln}$Ti${_3}$Bi${_4}$, where $\textit{Ln}$ stands for Nd and Yb. Distinct from other kagome materials, $\textit{Ln}$Ti${_3}$Bi${_4}$ exhibits two-fold, rather than six-fold, symmetries, stemming from the distorted kagome lattice, which leads to a unique electronic structure. Combining experiment and theory we map out the electronic structure and discover double flat bands as well as multiple van Hove singularities (VHSs), with one VHS exhibiting higher-order characteristics near the Fermi level. Notably, in the magnetic version NdTi${_3}$Bi${_4}$, the ultra-low base temperature ARPES measurements unveil an unconventional band splitting in the band dispersions which is induced by the ferromagnetic ordering. These findings reveal the potential of bilayer-distorted kagome metals $\textit{Ln}$Ti${_3}$Bi${_4}$ as a promising platform for exploring novel emergent phases of matter at the intersection of strong correlation and magnetism.

Submitted 13 November, 2023; originally announced November 2023.

Report number: RIKEN-iTHEMS-Report-23

arXiv:2311.05600 [pdf, other]

FogROS2-Config: Optimizing Latency and Cost for Multi-Cloud Robot Applications

Authors: Kaiyuan Chen, Kush Hari, Rohil Khare, Charlotte Le, Trinity Chung, Jaimyn Drake, Jeffrey Ichnowski, John Kubiatowicz, Ken Goldberg

Abstract: Cloud service providers provide over 50,000 distinct and dynamically changing cloud server options. To help roboticists make cost-effective decisions, we present FogROS2-Config, an open toolkit that takes ROS2 nodes as input and automatically runs relevant benchmarks to quickly return a menu of cloud compute services that trade off latency and cost. Because it is infeasible to try every hardware configuration, FogROS2-Config quickly samples and tests a small set of edge case servers. We evaluate FogROS2-Config on three robotics application tasks: visual SLAM, grasp planning, and motion planning. FogROS2-Config can reduce the cost by up to 20x. By comparing with a Pareto frontier for cost and latency by running the application task on feasible server configurations, we evaluate cost and latency models and confirm that FogROS2-Config selects efficient hardware configurations to balance cost and latency.

Submitted 13 May, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

Comments: Published 2024 IEEE International Conference on Robotics and Automation (ICRA), Former name: FogROS2-Sky

arXiv:2311.03630 [pdf, other]

Counterfactual Data Augmentation with Contrastive Learning

Authors: Ahmed Aloui, Juncheng Dong, Cat P. Le, Vahid Tarokh

Abstract: Statistical disparity between distinct treatment groups is one of the most significant challenges for estimating Conditional Average Treatment Effects (CATE). To address this, we introduce a model-agnostic data augmentation method that imputes the counterfactual outcomes for a selected subset of individuals. Specifically, we utilize contrastive learning to learn a representation space and a similarity measure such that in the learned representation space close individuals identified by the learned similarity measure have similar potential outcomes. This property ensures reliable imputation of counterfactual outcomes for the individuals with close neighbors from the alternative treatment group. By augmenting the original dataset with these reliable imputations, we can effectively reduce the discrepancy between different treatment groups, while inducing minimal imputation error. The augmented dataset is subsequently employed to train CATE estimation models. Theoretical analysis and experimental studies on synthetic and semi-synthetic benchmarks demonstrate that our method achieves significant improvements in both performance and robustness to overfitting across state-of-the-art models.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2311.02803 [pdf, other]

Fast and Interpretable Face Identification for Out-Of-Distribution Data Using Vision Transformers

Authors: Hai Phan, Cindy Le, Vu Le, Yihui He, Anh Totti Nguyen

Abstract: Most face identification approaches employ a Siamese neural network to compare two images at the image embedding level. Yet, this technique can be subject to occlusion (e.g. faces with masks or sunglasses) and out-of-distribution data. DeepFace-EMD (Phan et al. 2022) reaches state-of-the-art accuracy on out-of-distribution data by first comparing two images at the image level, and then at the patch level. Yet, its later patch-wise re-ranking stage admits a large $O(n^3 \log n)$ time complexity (for $n$ patches in an image) due to the optimal transport optimization. In this paper, we propose a novel, 2-image Vision Transformers (ViTs) that compares two images at the patch level using cross-attention. After training on 2M pairs of images on CASIA Webface (Yi et al. 2014), our model performs at a comparable accuracy as DeepFace-EMD on out-of-distribution data, yet at an inference speed more than twice as fast as DeepFace-EMD (Phan et al. 2022). In addition, via a human study, our model shows promising explainability through the visualization of cross-attention. We believe our work can inspire more explorations in using ViTs for face identification.

Submitted 5 November, 2023; originally announced November 2023.

Comments: 20 pages, 15 Figures

arXiv:2311.02377 [pdf, other]

Broken scale invariant unparticle physics and its prospective effects on the MuonE experiment

Authors: Van Dung Le, Duc Ninh Le, Duc Truyen Le, Van Cuong Le

Abstract: We investigate the effects of broken scale invariant unparticle at the MUonE experiment. The choice of the broken model is because the original scale-invariant model is severely suppressed by constraints from cosmology and low-energy experiments. Broken scale invariant unparticle model is categorized into four types: pseudoscalar, scalar, axial-vector, and vector unparticle. Each unparticle type is characterized by three free parameters: coupling constant $λ$, scaling dimension $d$, and energy scale $μ$ at which the scale-invariance is broken. After considering all of the available constraints on the model, we find that the MUonE experiment is sensitive to (axial-)vector unparticle with $1 < d < 1.4$ and $1\le μ\le 12$ GeV.

Submitted 4 November, 2023; originally announced November 2023.

Comments: 5 pages, 3 figs, contribution to the proceedings of Windows on the Universe conference, August 2023, Quy Nhon, Vietnam

arXiv:2310.19443 [pdf, other]

Asymptotically accurate and locking-free finite element implementation of first order shear deformation theory for plates

Authors: Khanh Chau Le, Hoang Giang Bui

Abstract: A formulation of the asymptotically exact first-order shear deformation theory for linear-elastic homogeneous plates in the rescaled coordinates and rotation angles is considered. This allows the development of its asymptotically accurate and shear-locking-free finite element implementation. As applications, numerical simulations are performed for circular and rectangular plates, showing complete agreement between the analytical solution and the numerical solutions based on two-dimensional theory and three-dimensional elasticity theory.

Submitted 16 April, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: 32 pages, 11 figures

arXiv:2310.08864 [pdf, other]

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie , et al. (267 additional authors not shown)

Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.

Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: Project website: https://robotics-transformer-x.github.io

arXiv:2310.01720 [pdf, other]

Perceiver-based CDF Modeling for Time Series Forecasting

Authors: Cat P. Le, Chris Cannella, Ali Hasan, Yuting Ng, Vahid Tarokh

Abstract: Transformers have demonstrated remarkable efficacy in forecasting time series data. However, their extensive dependence on self-attention mechanisms demands significant computational resources, thereby limiting their practical applicability across diverse tasks, especially in multimodal problems. In this work, we propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data. Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction. By leveraging the perceiver, our model efficiently transforms high-dimensional and multimodal data into a compact latent space, thereby significantly reducing computational demands. Subsequently, we implement a copula-based attention mechanism to construct the joint distribution of missing data for prediction. Further, we propose an output variance testing mechanism to effectively mitigate error propagation during prediction. To enhance efficiency and reduce complexity, we introduce midpoint inference for the local attention mechanism. This enables the model to efficiently capture dependencies within nearby imputed samples without considering all previous samples. The experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods while utilizing less than half of the computational resources.

Submitted 24 June, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: Accepted in Winter Simulation Conference 2024

arXiv:2310.00662 [pdf, other]

Classification of High-Ordered Topological Nodes towards Moiré Flat Bands in Twisted Bilayers

Authors: Fan Cui, Congcong Le, Qiang Zhang, Xianxin Wu, Jiangping Hu, Ching-Kai Chiu

Abstract: At magic twisted angles, Dirac cones in twisted bilayer graphene (TBG) can evolve into flat bands, serving as a critical playground for the study of strongly correlated physics. When chiral symmetry is introduced, rigorous mathematical proof confirms that the flat bands are locked at zero energy in the entire Moiré Brillouin zone (BZ). Yet, TBG is not the sole platform that exhibits this absolute band flatness. Central to this flatness phenomenon are topological nodes and their specific locations in the BZ. In this study, considering twisted bilayer systems that preserve chiral symmetry, we classify various ordered topological nodes in base layers and all possible node locations across different BZs. Specifically, we constrain the node locations to rotational centers, such as Γ and M points, to ensure the interlayer coupling retains equal strength in all directions. Using this classification as a foundation, we systematically identify the conditions under which Moiré flat bands emerge. Additionally, through the extension of holomorphic functions, we provide proof that flat bands are locked at zero energy, shedding light on the origin of the band flatness. Remarkably, beyond Dirac cones, numerous twisted bilayer nodal platforms can host flat bands with a degeneracy number of more than two, such as four-fold, six-fold, and eight-fold. This multiplicity of degeneracy in flat bands might unveil more complex and enriched correlation physics.

Submitted 10 October, 2023; v1 submitted 1 October, 2023; originally announced October 2023.

Comments: 13 pages, 10 figures, 2 tables

Report number: RIKEN-iTHEMS-Report-23

arXiv:2309.02289 [pdf, other]

An operator preconditioned combined field integral equation for electromagnetic scattering

Authors: Van Chien Le, Kristof Cools

Abstract: This paper aims to address two issues of integral equations for the scattering of time-harmonic electromagnetic waves by a perfect electric conductor with Lipschitz continuous boundary: ill-conditioned boundary element Galerkin matrices on fine meshes and instability at spurious resonant frequencies. The remedy to ill-conditioned matrices is operator preconditioning, and resonant instability is eliminated by means of a combined field integral equation. Exterior traces of single and double layer potentials are complemented by their interior counterparts for a purely imaginary wave number. We derive the corresponding variational formulation in the natural trace space for electromagnetic fields and establish its well-posedness for all wave numbers. A Galerkin discretization scheme is employed using conforming edge boundary elements on dual meshes, which produces well-conditioned discrete linear systems of the variational formulation. Some numerical results are also provided to support the numerical analysis.

Submitted 10 June, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

Comments: 6 figures

arXiv:2308.06539 [pdf, other]

Phase Shift Design for RIS-Aided Cell-Free Massive MIMO with Improved Differential Evolution

Authors: Trinh Van Chien, Cuong V. Le, Huynh Thi Thanh Binh, Hien Quoc Ngo, Symeon Chatzinotas

Abstract: This paper proposes a novel phase shift design for cell-free massive multiple-input and multiple-output (MIMO) systems assisted by reconfigurable intelligent surface (RIS), which only utilizes channel statistics to achieve the uplink sum ergodic throughput maximization under spatial channel correlations. Due to the non-convexity and the scale of the derived optimization problem, we develop an improved version of the differential evolution (DE) algorithm. The proposed scheme is capable of providing high-quality solutions within reasonable computing time. Numerical results demonstrate superior improvements of the proposed phase shift designs over the other benchmarks, particularly in scenarios where direct links are highly probable.

Submitted 12 August, 2023; originally announced August 2023.

Comments: 5 pages, 2 figures. Accepted by IEEE WCL

arXiv:2307.16834 [pdf]

doi 10.1007/978-3-031-53963-3_25

Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System

Authors: Hoang Viet Pham, Thinh Gia Tran, Chuong Dinh Le, An Dinh Le, Hien Bich Vo

Abstract: Innovative enhancement in embedded system platforms, specifically hardware accelerations, significantly influence the application of deep learning in real-world scenarios. These innovations translate human labor efforts into automated intelligent systems employed in various areas such as autonomous driving, robotics, Internet-of-Things (IoT), and numerous other impactful applications. NVIDIA's Jetson platform is one of the pioneers in offering optimal performance regarding energy efficiency and throughput in the execution of deep learning algorithms. Previously, most benchmarking analysis was based on 2D images with a single deep learning model for each comparison result. In this paper, we implement an end-to-end video-based crime-scene anomaly detection system inputting from surveillance videos and the system is deployed and completely operates on multiple Jetson edge devices (Nano, AGX Xavier, Orin Nano). The comparison analysis includes the integration of Torch-TensorRT as a software developer kit from NVIDIA for the model performance optimisation. The system is built based on the PySlowfast open-source project from Facebook as the coding template. The end-to-end system process comprises the videos from camera, data preprocessing pipeline, feature extractor and the anomaly detection. We provide the experience of an AI-based system deployment on various Jetson Edge devices with Docker technology.
Regarding anomaly detectors, a weakly supervised video-based deep learning model called Robust Temporal Feature Magnitude Learning (RTFM) is applied in the system. The approach system reaches 47.56 frames per second (FPS) inference speed on a Jetson edge device with only 3.11 GB RAM usage total. We also discover the promising Jetson device that the AI system achieves 15% better performance than the previous version of Jetson devices while consuming 50% less energy power.

Submitted 12 September, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

Comments: Accepted in Future of Information and Communication Conference (FICC) 2024

arXiv:2306.16678 [pdf, other]

BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models

Authors: Phuoc-Hoan Charles Le, Xinlin Li

Abstract: With the increasing popularity and the increasing size of vision transformers (ViTs), there has been an increasing interest in making them more efficient and less computationally costly for deployment on edge devices with limited computing resources. Binarization can be used to help reduce the size of ViT models and their computational cost significantly, using popcount operations when the weights and the activations are in binary. However, ViTs suffer a larger performance drop when directly applying convolutional neural network (CNN) binarization methods or existing binarization methods to binarize ViTs compared to CNNs on datasets with a large number of classes such as ImageNet-1k. With extensive analysis, we find that binary vanilla ViTs such as DeiT miss out on a lot of key architectural properties that CNNs have that allow binary CNNs to have much higher representational capability than binary vanilla ViT. Therefore, we propose BinaryViT, in which inspired by the CNN architecture, we include operations from the CNN architecture into a pure ViT architecture to enrich the representational capability of a binary ViT without introducing convolutions. These include an average pooling layer instead of a token pooling layer, a block that contains multiple average pooling branches, an affine transformation right before the addition of each main residual connection, and a pyramid structure.
Experimental results on the ImageNet-1k dataset show the effectiveness of these operations that allow a binary pure ViT model to be competitive with previous state-of-the-art (SOTA) binary CNN models.

Submitted 29 June, 2023; originally announced June 2023.

Comments: Accepted in CVPR 2023 Workshop on Efficient Deep Learning for Computer Vision (ECV)

arXiv:2306.07275 [pdf, other]

Effective model and pairing tendency in bilayer Ni-based superconductor La$_3$Ni$_2$O$_7$

Authors: Yuhao Gu, Congcong Le, Zhesen Yang, Xianxin Wu, Jiangping Hu

Abstract: Since the discovery of cuprate, the origin of high-T$_c$ superconductivity has been an outstanding puzzle. Recently, high-T$_c$ superconductivity was observed in a bilayer nickelate La$_3$Ni$_2$O$_7$ under pressure, whose structure hosts the apical oxygen between two layers, distinct from multi-layer cuprates. Motivated by this discovery, we investigate its electronic structure using first-principle calculations and superconducting instabilities from both weak-coupling and strong-coupling perspective. Based on the first-principle band structures, we construct a bilayer two-orbital model on a square lattice, consisting of $d_{x^2-y^2}$ and $d_{z^2}$ orbitals, which accurately captures the low-energy electronic properties. Within this model, we study pairing instability using both functional renormalization group approach and multi-orbital t-J model. An $s_{\pm}$-wave pairing with sign-reversal gaps on different Fermi surfaces is revealed, reminiscent of iron based superconductors. The Ni-$d_{z^2}$ orbital and its associated interlayer and intralayer exchange couplings are found to be crucial for the high-T$_c$ superconductivity. Our study provides valuable insights into unique nature of electronic structure and superconductivity in La$_3$Ni$_2$O$_7$ and contributes to the understanding of unconventional superconductors.

Submitted 31 August, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

Comments: 5 pages, 4 figures + SM

Report number: RIKEN-iTHEMS-Report-23

arXiv:2305.15613 [pdf, other]

O$n$ Learning Deep O($n$)-Equivariant Hyperspheres

Authors: Pavlo Melnyk, Michael Felsberg, Mårten Wadenbäck, Andreas Robinson, Cuong Le

Abstract: In this paper, we utilize hyperspheres and regular $n$-simplexes and propose an approach to learning deep features equivariant under the transformations of $n$D reflections and rotations, encompassed by the powerful group of O$(n)$. Namely, we propose O$(n)$-equivariant neurons with spherical decision surfaces that generalize to any dimension $n$, which we call Deep Equivariant Hyperspheres. We demonstrate how to combine them in a network that directly operates on the basis of the input points and propose an invariant operator based on the relation between two points and a sphere, which as we show, turns out to be a Gram matrix. Using synthetic and real-world data in $n$D, we experimentally verify our theoretical contributions and find that our approach is superior to the competing methods for O$(n)$-equivariant benchmark datasets (classification and regression), demonstrating a favorable speed/performance trade-off. The code is available at https://github.com/pavlo-melnyk/equivariant-hyperspheres.

Submitted 27 May, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

arXiv:2305.14838 [pdf, other]

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

Authors: Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng, Xuedong Huang

Abstract: Joint speech-language training is challenging due to the large demand for training data and GPU consumption, as well as the modality gap between speech and language. We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models and optimized data-efficiently for spoken language tasks. Particularly, we propose to incorporate cross-modality learning into transfer learning and conduct them simultaneously for downstream tasks in a multi-task learning manner. Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks, achieving a new state-of-the-art average BLEU score of 31.5 on the multilingual speech to English text translation task for 21 languages, as measured on the public CoVoST2 evaluation set.

Submitted 14 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: NeurIPS 2023, Poster

arXiv:2305.11400 [pdf, other]

Mode-Aware Continual Learning for Conditional Generative Adversarial Networks

Authors: Cat P. Le, Juncheng Dong, Ahmed Aloui, Vahid Tarokh

Abstract: The main challenge in continual learning for generative models is to effectively learn new target modes with limited samples while preserving previously learned ones. To this end, we introduce a new continual learning approach for conditional generative adversarial networks by leveraging a mode-affinity score specifically designed for generative modeling. First, the generator produces samples of existing modes for subsequent replay. The discriminator is then used to compute the mode similarity measure, which identifies a set of closest existing modes to the target. Subsequently, a label for the target mode is generated and given as a weighted average of the labels within this set. We extend the continual learning model by training it on the target data with the newly-generated label, while performing memory replay to mitigate the risk of catastrophic forgetting. Experimental results on benchmark datasets demonstrate the gains of our continual learning approach over the state-of-the-art methods, even when using fewer training samples.

Submitted 23 September, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.09463 [pdf, other]

Low-complexity deep learning frameworks for acoustic scene classification using teacher-student scheme and multiple spectrograms

Authors: Lam Pham, Dat Ngo, Cam Le, Anahid Jalali, Alexander Schindler

Abstract: In this technical report, a low-complexity deep learning system for acoustic scene classification (ASC) is presented. The proposed system comprises two main phases: (Phase I) Training a teacher network; and (Phase II) training a student network using distilled knowledge from the teacher. In the first phase, the teacher, which presents a large footprint model, is trained. After training the teacher, the embeddings, which are the feature map of the second last layer of the teacher, are extracted. In the second phase, the student network, which presents a low complexity model, is trained with the embeddings extracted from the teacher. Our experiments conducted on DCASE 2023 Task 1 Development dataset have fulfilled the requirement of low-complexity and achieved the best classification accuracy of 57.4%, improving DCASE baseline by 14.5%.

Submitted 16 May, 2023; originally announced May 2023.

Comments: arXiv admin note: text overlap with arXiv:2206.06057

arXiv:2305.01476 [pdf, other]

Deep Learning Based Multimodal with Two-phase Training Strategy for Daily Life Video Classification

Authors: Lam Pham, Trang Le, Cam Le, Dat Ngo, Weissenfeld Axel, Alexander Schindler

Abstract: In this paper, we present a deep learning based multimodal system for classifying daily life videos. To train the system, we propose a two-phase training strategy. In the first training phase (Phase I), we extract the audio and visual (image) data from the original video. We then train the audio data and the visual data with independent deep learning based models. After the training processes, we obtain audio embeddings and visual embeddings by extracting feature maps from the pre-trained deep learning models. In the second training phase (Phase II), we train a fusion layer to combine the audio/visual embeddings and a dense layer to classify the combined embedding into target daily scenes. Our extensive experiments, which were conducted on the benchmark dataset of DCASE (IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events) 2021 Task 1B Development, achieved the best classification accuracy of 80.5%, 91.8%, and 95.3% with only audio data, with only visual data, and with both audio and visual data, respectively. The highest classification accuracy of 95.3% presents an improvement of 17.9% compared with the DCASE baseline and is very competitive with the state-of-the-art systems.

Submitted 30 April, 2023; originally announced May 2023.

arXiv:2304.11349 [pdf, ps, other]

Energy-minimizing torus-valued maps with prescribed singularities, Plateau's problem, and BV-lifting

Authors: Giacomo Canevari, Van Phu Cuong Le

Abstract: In this paper, we investigate the relation between energy-minimizing torus-valued maps with prescribed singularities, the lifting problem for torus-valued maps in the space BV, and Plateau's problem for vectorial currents, in codimension one. First, we show that the infimum of the $W^{1,1}$-seminorm among all maps with values in the $k$-dimensional flat torus and prescribed topological singularities $S$ is equal to the minimum of the mass among all $\textit{normal}$ $\mathbb{R}^k$-currents, of codimension one, bounded by $S$. Then, we show that the minimum of the $BV$-energy among all liftings of a given torus-valued $W^{1,1}$-map $\textbf{u}$ can be expressed in terms of the minimum mass among all $\textit{integral}$ $\mathbb{Z}^k$-currents, of codimension one, bounded by the singularities of $\textbf{u}$. As a byproduct of our analysis, we provide a bound for the solution of the integral Plateau problem, in codimension one, in terms of Plateau's problem for normal currents.

Submitted 27 February, 2024; v1 submitted 22 April, 2023; originally announced April 2023.

Comments: 42 pages

MSC Class: 49Q10; 49Q15; 49Q20; 58E20

arXiv:2304.08790 [pdf, other]

Constrained Assortment Optimization under the Cross-Nested Logit Model

Authors: Cuong Le, Tien Mai

Abstract: We study the assortment optimization problem under general linear constraints, where the customer choice behavior is captured by the Cross-Nested Logit model. In this problem, there is a set of products organized into multiple subsets (or nests), where each product can belong to more than one nest. The aim is to find an assortment to offer to customers so that the expected revenue is maximized. We show that, under the Cross-Nested Logit model, the assortment problem is NP-hard, even without any constraints. To tackle the assortment optimization problem, we develop a new discretization mechanism to approximate the problem by a linear fractional program with a performance guarantee of $\frac{1 - ε}{1+ε}$, for any accuracy level $ε>0$. We then show that optimal solutions to the approximate problem can be obtained by solving mixed-integer linear programs. We further show that our discretization approach can also be applied to solve a joint assortment optimization and pricing problem, as well as an assortment problem under a mixture of Cross-Nested Logit models to account for multiple classes of customers. Our empirical results on a large number of randomly generated test instances demonstrate that, under a performance guarantee of 90%, the percentage gaps between the objective values obtained from our approximation methods and the optimal expected revenues are no larger than 1.2%.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2304.08274 [pdf, other]

An asymptotically exact first-order shear deformation theory for functionally graded plates

Authors: Khanh Chau Le

Abstract: An asymptotically exact first-order shear deformation theory for functionally graded elastic plates is derived using the variational-asymptotic method. As an application, an analytical solution to the problem of wave propagation in a sandwich plate is found in accordance with this refined theory. Comparison between the dispersion curves obtained by 2-D plate theory and 3-D elasticity theory reveals that the former is accurate up to the order of h^2/l^2, where h is the plate thickness and l the wavelength.

Submitted 5 May, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

Comments: 27 pages, 4 figures

arXiv:2304.04439 [pdf, other]

Unparticle effects at the MUonE experiment

Authors: Duc Ninh Le, Van Dung Le, Duc Truyen Le, Van Cuong Le

Abstract: We investigate possible effects of unparticles at the MUonE experiment by considering a general model for unparticle with broken scale invariance, characterized by the scaling dimension $d$ and the energy scale $μ$ at which the scale invariance is broken. Taking into account available relevant constraints on the couplings of the unparticles with the Standard Model (SM) leptons, we found that the MUonE experiment at the level of 10 ppm systematic accuracy is sensitive to such effects if $1<d\lesssim 1.4$ and $1\le μ\lesssim 12$ GeV for vector unparticles. The effects of scalar unparticles are too feeble to be detected. The vector unparticles can induce a significant shift on the best-fit value of $a_μ^\text{had}$ at the MUonE, thereby providing an opportunity to detect unparticles or to obtain a new bound on the unparticle-SM couplings in the case of no anomaly.

Submitted 21 November, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

Comments: 16 pages, 6 figures, 3 tables; matches journal version

arXiv:2303.08478 [pdf, other]

Theory of transition from brittle to ductile fracture

Authors: Khanh Chau Le, Hyeonyeong Jeong, Tuan Minh Tran

Abstract: In this paper, two improvements to the theory of transition from brittle to ductile fracture developed by Langer are proposed. First, considering the drastic temperature rise near the crack tip, the temperature dependence of the shear modulus is included to better quantify the thermally sensitive dislocation entanglement. Second, the parameters of the improved theory are identified by the large scale least squares method. The comparison between the fracture toughness predicted by the theory and the values obtained in Gumbsch's experiments for tungsten at different temperatures shows good agreement.

Submitted 5 May, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 7 pages, 4 figures

arXiv:2302.13028 [pdf, other]

A Light-weight Deep Learning Model for Remote Sensing Image Classification

Authors: Lam Pham, Cam Le, Dat Ngo, Anh Nguyen, Jasmin Lampert, Alexander Schindler, Ian McLoughlin

Abstract: In this paper, we present a high-performance and light-weight deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the aerial scene of a remote sensing image. To this end, we first evaluate various benchmark convolutional neural network (CNN) architectures: MobileNet V1/V2, ResNet 50/151V2, InceptionV3/InceptionResNetV2, EfficientNet B0/B7, DenseNet 121/201, ConvNeXt Tiny/Large. Then, the best performing models are selected to train a compact model in a teacher-student arrangement. The knowledge distillation from the teacher aims to achieve high performance with significantly reduced complexity. By conducting extensive experiments on the NWPU-RESISC45 benchmark, our proposed teacher-student models outperform the state-of-the-art systems, and have the potential to be applied on a wide range of edge devices.

Submitted 25 February, 2023; originally announced February 2023.

arXiv:2301.13372 [pdf, other]

Improving Open-Domain Dialogue Evaluation with a Causal Inference Model

Authors: Cat P. Le, Luke Dai, Michael Johnston, Yang Liu, Marilyn Walker, Reza Ghanadan

Abstract: Effective evaluation methods remain a significant challenge for research on open-domain conversational dialogue systems. Explicit satisfaction ratings can be elicited from users, but users often do not provide ratings when asked, and those they give can be highly subjective. Post-hoc ratings by experts are an alternative, but these can be both expensive and complex to collect. Here, we explore the creation of automated methods for predicting both expert and user ratings of open-domain dialogues. We compare four different approaches. First, we train a baseline model using an end-to-end transformer to predict ratings directly from the raw dialogue text. The other three methods are variants of a two-stage approach in which we first extract interpretable features at the turn level that capture, among other aspects, user dialogue behaviors indicating contradiction, repetition, disinterest, compliments, or criticism. We project these features to the dialogue level and train a dialogue-level MLP regression model, a dialogue-level LSTM, and a novel causal inference model called counterfactual-LSTM (CF-LSTM) to predict ratings. The proposed CF-LSTM is a sequential model over turn-level features which predicts ratings using multiple regressors depending on hypotheses derived from the turn-level features. As a causal inference model, CF-LSTM aims to learn the underlying causes of a specific event, such as a low rating. We also bin the user ratings and perform classification experiments with all four models. In evaluation experiments on conversational data from the Alexa Prize SocialBot, we show that the CF-LSTM achieves the best performance for predicting dialogue ratings and classification.

Submitted 30 January, 2023; originally announced January 2023.

Comments: Accepted as a conference paper at IWSDS 2023

arXiv:2301.11744 [pdf, other]

A numerical scheme for solving an induction heating problem with moving non-magnetic conductor

Authors: Van Chien Le, Marián Slodička, Karel Van Bockstal

Abstract: This paper investigates an induction heating problem in a multi-component system containing a moving non-magnetic conductor. The electromagnetic process is described by the eddy current model, and the heat transfer process is governed by the convection-diffusion equation. Both processes are coupled by a restrained Joule heat source. A temporal discretization scheme is introduced to solve the corresponding variational system numerically. With the aid of the Reynolds transport theorem, we prove the convergence of the proposed scheme as well as the well-posedness of the variational problem. Some numerical experiments are also performed to assess the performance of the numerical scheme.

Submitted 18 July, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

MSC Class: 35Q61; 35Q79; 65M12

arXiv:2301.08530 [pdf, other]

Self-Organization Towards $1/f$ Noise in Deep Neural Networks

Authors: Nicholas Chong Jia Le, Ling Feng

Abstract: The presence of $1/f$ noise, also known as pink noise, is a well-established phenomenon in biological neural networks, and is thought to play an important role in information processing in the brain. In this study, we find that such $1/f$ noise is also found in deep neural networks trained on natural language, resembling that of their biological counterparts. Specifically, we trained Long Short-Term Memory (LSTM) networks on the `IMDb' AI benchmark dataset, then measured the neuron activations. The detrended fluctuation analysis (DFA) on the time series of the different neurons demonstrates clear $1/f$ patterns, which are absent in the time series of the inputs to the LSTM. Interestingly, when the neural network is at overcapacity, having more than enough neurons to achieve the learning task, the activation patterns deviate from $1/f$ noise and shift towards white noise. This is because many of the neurons are not effectively used, showing little fluctuation when fed with input data. We further examine the exponent values in the $1/f$ noise in ``internal" and ``external" activations in the LSTM cell, finding some resemblance in the variations of the exponents in fMRI signals of the human brain. Our findings further support the hypothesis that $1/f$ noise is a signature of optimal learning. With deep learning models approaching or surpassing humans in certain tasks, and being more ``experimentable'' than their biological counterparts, our study suggests that they are good candidates to understand the fundamental origins of $1/f$ noise.

Submitted 1 April, 2024; v1 submitted 20 January, 2023; originally announced January 2023.

arXiv:2301.04771 [pdf, other]

Variational Inference: Posterior Threshold Improves Network Clustering Accuracy in Sparse Regimes

Authors: Xuezhen Li, Can M. Le

Abstract: Variational inference has been widely used in machine learning literature to fit various Bayesian models. In network analysis, this method has been successfully applied to solve the community detection problems. Although these results are promising, their theoretical support is only for relatively dense networks, an assumption that may not hold for real networks. In addition, it has been shown recently that the variational loss surface has many saddle points, which may severely affect its performance, especially when applied to sparse networks. This paper proposes a simple way to improve the variational inference method by hard thresholding the posterior of the community assignment after each iteration. Using a random initialization that correlates with the true community assignment, we show that the proposed method converges and can accurately recover the true community labels, even when the average node degree of the network is bounded. Extensive numerical study further confirms the advantage of the proposed method over the classical variational inference and another state-of-the-art algorithm.

Submitted 21 May, 2024; v1 submitted 11 January, 2023; originally announced January 2023.

arXiv:2301.02910 [pdf, other]

doi 10.1103/PhysRevA.108.023109

Universality in odd-even harmonic generation and application in terahertz waveform sampling

Authors: Doan-An Trieu, Ngoc-Loan Phan, Quan-Hao Truong, Hien T. Nguyen, Cam-Tu Le, DinhDuy Vu, Van-Hoang Le

Abstract: Odd-even harmonics emitted from a laser-target system imprint rich, subtle information characterizing the system's dynamical asymmetry, which is desirable to decipher. In this Letter, we discover a simple universal relation between the odd-even harmonics and the asymmetry of the THz-assisted laser-atomic system -- atoms in a fundamental mid-IR laser pulse combined with a THz laser. First, we demonstrate numerically and then analytically formulate the harmonic even-to-odd ratio as a function of the THz electric field, the source of the system's asymmetry. Notably, we suggest a scaling that makes the obtained rule universal, independent of the parameters of both the fundamental pulse and atomic target. This universality allows us to propose a general pump-probe scheme for THz waveform sampling from the even-to-odd ratio, measurable within a conventional compact setup.

Submitted 16 January, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

arXiv:2212.14353 [pdf, other]

Sheaf-theoretic self-filtering network of low-cost sensors for local air quality monitoring: A causal approach

Authors: Anh-Duy Pham, Chuong Dinh Le, Hoang Viet Pham, Thinh Gia Tran, Dat Thanh Vo, Chau Long Tran, An Dinh Le, Hien Bich Vo

Abstract: Sheaf theory, which is a complex but powerful tool supported by topological theory, offers more flexibility and precision than traditional graph theory when it comes to modeling relationships between multiple features. In the realm of air quality monitoring, this can be incredibly useful in detecting sudden changes in local dust particle density, which can be difficult to accurately measure using commercial instruments. Traditional methods for air quality measurement often rely on calibrating the measurement with public standard instruments or calculating the measurement's moving average over a constant period. However, this can lead to an incorrect index at the measurement location, as well as an oversmoothing effect on the signal. In this study, we propose a compact device that uses sheaf theory to detect and count vehicles as a local air quality change-causing factor. By inferring the number of vehicles into the PM2.5 index and propagating it into the recorded PM2.5 index from low-cost air monitoring sensors such as PMS7003 and BME280, we can achieve self-correction in real-time. Plus, the sheaf-theoretic method allows for easy scaling to multiple nodes for further filtering effects. By implementing sheaf theory in air quality monitoring, we can overcome the limitations of traditional methods and provide more accurate and reliable results.

Submitted 29 December, 2022; originally announced December 2022.

Showing 1–50 of 201 results for author: Le, C