Search | arXiv e-print repository

When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective

Authors: Hsi-Ai Tsao, Lei Hsiung, Pin-Yu Chen, Tsung-Yi Ho

Abstract: Adapting pre-trained models to new tasks can exhibit varying effectiveness across datasets. Visual prompting, a state-of-the-art parameter-efficient transfer learning method, can significantly improve the performance of out-of-distribution tasks. On the other hand, linear probing, a standard transfer learning method, can sometimes become the best approach. We propose a log-likelihood ratio (LLR) approach to analyze the comparative benefits of visual prompting and linear probing. By employing the LLR score alongside resource-efficient visual prompts approximations, our cost-effective measure attains up to a 100-fold reduction in run time compared to full training, while achieving prediction accuracies up to 91%. The source code is available at https://github.com/IBM/VP-LLR.

Submitted 4 September, 2024; v1 submitted 3 September, 2024; originally announced September 2024.

arXiv:2408.05493

Stream-based Active Learning for Anomalous Sound Detection in Machine Condition Monitoring

Authors: Tuan Vu Ho, Kota Dohi, Yohei Kawaguchi

Abstract: This paper introduces an active learning (AL) framework for anomalous sound detection (ASD) in machine condition monitoring systems. Typically, ASD models are trained solely on normal samples due to the scarcity of anomalous data, leading to decreased accuracy for unseen samples during inference. AL is a promising solution to this problem, enabling the model to learn new concepts more effectively with fewer labeled examples and thus reducing manual annotation effort. However, its effectiveness in ASD remains unexplored. To minimize update costs and time, our proposed method focuses on updating the scoring backend of the ASD system without retraining the neural network model. Experimental results on the DCASE 2023 Challenge Task 2 dataset confirm that our AL framework significantly improves ASD performance even with low labeling budgets. Moreover, our proposed sampling strategy outperforms other baselines in terms of the partial area under the receiver operating characteristic score.

Submitted 10 August, 2024; originally announced August 2024.

Comments: Accepted as a conference paper in INTERSPEECH 2024

arXiv:2407.16296

Quantum Computing for Climate Resilience and Sustainability Challenges

Authors: Kin Tung Michael Ho, Kuan-Cheng Chen, Lily Lee, Felix Burt, Shang Yu, Po-Heng Lee

Abstract: The escalating impacts of climate change and the increasing demand for sustainable development and natural resource management necessitate innovative technological solutions. Quantum computing (QC) has emerged as a promising tool with the potential to revolutionize these critical areas. This review explores the application of quantum machine learning and optimization techniques for climate change prediction and enhancing sustainable development. Traditional computational methods often fall short in handling the scale and complexity of climate models and natural resource management. Quantum advancements, however, offer significant improvements in computational efficiency and problem-solving capabilities. By synthesizing the latest research and developments, this paper highlights how QC and quantum machine learning can optimize multi-infrastructure systems towards climate neutrality. The paper also evaluates the performance of current quantum algorithms and hardware in practical applications and presents realistic cases, i.e., waste-to-energy in anaerobic digestion, disaster prevention in flooding prediction, and new material development for carbon capture. The integration of these quantum technologies promises to drive significant advancements in achieving climate resilience and sustainable development.

Submitted 23 July, 2024; originally announced July 2024.

arXiv:2407.08757

Convergence rate of the $Q$-curvature flow

Authors: Pak Tung Ho, Sanghoon Lee

Abstract: Carlotto, Chodosh and Rubinstein have studied the convergence rate of the Yamabe flow. Inspired by their result, we study the convergence rate of the $Q$-curvature flow in this paper. In particular, we provide an example of a slowly converging $Q_6$-curvature flow in dimension 6, in contrast to the dimension 2 case, where the $Q$-curvature flow always converges exponentially.

Submitted 4 July, 2024; originally announced July 2024.

Comments: All comments welcome! arXiv admin note: substantial text overlap with arXiv:2107.09616, arXiv:2212.04367; substantial text overlap with arXiv:1401.3738 by other authors

MSC Class: 53E99; 53C18; 35R01

arXiv:2406.10130

The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models

Authors: Yan Liu, Yu Liu, Xiaokang Chen, Pin-Yu Chen, Daoguang Zan, Min-Yen Kan, Tsung-Yi Ho

Abstract: Pre-trained Language models (PLMs) have been acknowledged to contain harmful information, such as social biases, which may cause negative social impacts or even bring catastrophic results in application. Previous works on this problem mainly focused on using black-box methods such as probing to detect and quantify social biases in PLMs by observing model outputs. As a result, previous debiasing methods mainly finetune or even pre-train language models on newly constructed anti-stereotypical datasets, which are high-cost. In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}. Specifically, we propose {\sc Integrated Gap Gradients (IG$^2$)} to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias. By formalizing undesirable behavior as a distributional property of language, we employ sentiment-bearing prompts to elicit classes of sensitive words (demographics) correlated with such sentiments. Our IG$^2$ thus attributes the uneven distribution for different demographics to specific Social Bias Neurons, which track the trail of unwanted behavior inside PLM units to achieve interoperability. Moreover, derived from our interpretable technique, {\sc Bias Neuron Suppression (BNS)} is further proposed to mitigate social biases. By studying BERT, RoBERTa, and their attributable differences from debiased FairBERTa, IG$^2$ allows us to locate and suppress identified neurons, and further mitigate undesired behaviors. As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.05585

Efficient Hamiltonian encoding algorithms for extracting quantum control mechanism as interfering pathway amplitudes in the Dyson series

Authors: Erez Abrams, Michael Kasprzak, Gaurav Bhole, Tak-San Ho, Herschel Rabitz

Abstract: Hamiltonian encoding is a methodology for revealing the mechanism behind the dynamics governing controlled quantum systems. In this paper, following Mitra and Rabitz [Phys. Rev. A 67, 033407 (2003)], we define mechanism via pathways of eigenstates that describe the evolution of the system, where each pathway is associated with a complex-valued amplitude corresponding to a term in the Dyson series. The evolution of the system is determined by the constructive and destructive interference of these pathway amplitudes. Pathways with similar attributes can be grouped together into pathway classes. The amplitudes of pathway classes are computed by modulating the Hamiltonian matrix elements and decoding the subsequent evolution of the system rather than by direct computation of the individual terms in the Dyson series. The original implementation of Hamiltonian encoding was computationally intensive and became prohibitively expensive in large quantum systems. This paper presents two new encoding algorithms that calculate the amplitudes of pathway classes by using techniques from graph theory and algebraic topology to exploit patterns in the set of allowed transitions, greatly reducing the number of matrix elements that need to be modulated. These new algorithms provide an exponential decrease in both computation time and memory utilization with respect to the Hilbert space dimension of the system. To demonstrate the use of these techniques, they are applied to two illustrative state-to-state transition problems.

Submitted 8 June, 2024; originally announced June 2024.

Comments: 22 pages, 16 images across 13 figures

arXiv:2405.20112

RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection

Authors: Zhiyuan He, Pin-Yu Chen, Tsung-Yi Ho

Abstract: The rapid advances in generative AI models have empowered the creation of highly realistic images with arbitrary content, raising concerns about potential misuse and harm, such as Deepfakes. Current research focuses on training detectors using large datasets of generated images. However, these training-based solutions are often computationally expensive and show limited generalization to unseen generated images. In this paper, we propose a training-free method to distinguish between real and AI-generated images. We first observe that real images are more robust to tiny noise perturbations than AI-generated images in the representation space of vision foundation models. Based on this observation, we propose RIGID, a training-free and model-agnostic method for robust AI-generated image detection. RIGID is a simple yet effective approach that identifies whether an image is AI-generated by comparing the representation similarity between the original and the noise-perturbed counterpart. Our evaluation on a diverse set of AI-generated images and benchmarks shows that RIGID significantly outperforms existing training-based and training-free detectors. In particular, the average performance of RIGID exceeds the current best training-free method by more than 25%. Importantly, RIGID exhibits strong generalization across different image generation methods and robustness to image corruptions.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.20099

Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks

Authors: Chen Xiong, Xiangyu Qi, Pin-Yu Chen, Tsung-Yi Ho

Abstract: Safety, security, and compliance are essential requirements when aligning large language models (LLMs). However, many seemingly aligned LLMs are soon shown to be susceptible to jailbreak attacks. These attacks aim to circumvent the models' safety guardrails and security mechanisms by introducing jailbreak prompts into malicious queries. In response to these challenges, this paper introduces Defensive Prompt Patch (DPP), a novel prompt-based defense mechanism specifically designed to protect LLMs against such sophisticated jailbreak strategies. Unlike previous approaches, which have often compromised the utility of the model for the sake of safety, DPP is designed to achieve a minimal Attack Success Rate (ASR) while preserving the high utility of LLMs. Our method uses strategically designed interpretable suffix prompts that effectively thwart a wide range of standard and adaptive jailbreak techniques. Empirical results conducted on LLAMA-2-7B-Chat and Mistral-7B-Instruct-v0.2 models demonstrate the robustness and adaptability of DPP, showing significant reductions in ASR with negligible impact on utility. Our approach not only outperforms existing defense strategies in balancing safety and functionality, but also provides a scalable and interpretable solution applicable to various LLM platforms.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.08681

Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis

Authors: Qingpeng Kong, Ching-Hao Chiu, Dewen Zeng, Yu-Jen Chen, Tsung-Yi Ho, Jingtong Hu, Yiyu Shi

Abstract: Numerous studies have revealed that deep learning-based medical image classification models may exhibit bias towards specific demographic attributes, such as race, gender, and age. Existing bias mitigation methods often achieve a high level of fairness at the cost of significant accuracy degradation. In response to this challenge, we propose an innovative and adaptable Soft Nearest Neighbor Loss-based channel pruning framework, which achieves fairness through channel pruning. Traditionally, channel pruning is utilized to accelerate neural network inference. However, our work demonstrates that pruning can also be a potent tool for achieving fairness. Our key insight is that different channels in a layer contribute differently to the accuracy of different groups. By selectively pruning critical channels that lead to the accuracy difference between the privileged and unprivileged groups, we can effectively improve fairness without sacrificing accuracy significantly. Experiments conducted on two skin lesion diagnosis datasets across multiple sensitive attributes validate the effectiveness of our method in achieving a state-of-the-art trade-off between accuracy and fairness. Our code is available at https://github.com/Kqp1227/Sensitive-Channel-Pruning.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 13 pages, 3 figures, early accepted by International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024

arXiv:2405.05590

TroLLoc: Logic Locking and Layout Hardening for IC Security Closure against Hardware Trojans

Authors: Fangzhou Wang, Qijing Wang, Lilas Alrahis, Bangqi Fu, Shui Jiang, Xiaopeng Zhang, Ozgur Sinanoglu, Tsung-Yi Ho, Evangeline F. Y. Young, Johann Knechtel

Abstract: Due to cost benefits, supply chains of integrated circuits (ICs) are largely outsourced nowadays. However, passing ICs through various third-party providers gives rise to many security threats, like piracy of IC intellectual property or insertion of hardware Trojans, i.e., malicious circuit modifications. In this work, we proactively and systematically protect the physical layouts of ICs against post-design insertion of Trojans. Toward that end, we propose TroLLoc, a novel scheme for IC security closure that employs, for the first time, logic locking and layout hardening in unison. TroLLoc is fully integrated into a commercial-grade design flow, and TroLLoc is shown to be effective, efficient, and robust. Our work provides in-depth layout and security analysis considering the challenging benchmarks of the ISPD'22/23 contests for security closure. We show that TroLLoc successfully renders layouts resilient, with reasonable overheads, against (i) general prospects for Trojan insertion as in the ISPD'22 contest, (ii) actual Trojan insertion as in the ISPD'23 contest, and (iii) potential second-order attacks where adversaries would first (i.e., before Trojan insertion) try to bypass the locking defense, e.g., using advanced machine learning attacks. Finally, we release all our artifacts for independent verification [2].

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2404.07002

Deformations of the scalar curvature of a partially integrable pseudohermitian manifold

Authors: Jeffrey S. Case, Pak Tung Ho

Abstract: We consider deformations of the scalar curvature of a partially integrable pseudohermitian manifold, in analogy with the work of Fischer and Marsden on Riemannian manifolds. In particular, we introduce and discuss $R$-singular spaces, give sufficient conditions for the stability of the scalar curvature, and give a partial infinitesimal rigidity result for the scalar curvature of a compact, torsion-free, scalar-flat, integrable pseudohermitian manifold.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 16 pages

arXiv:2403.14736

NaNa and MiGu: Semantic Data Augmentation Techniques to Enhance Protein Classification in Graph Neural Networks

Authors: Yi-Shan Lan, Pin-Yu Chen, Tsung-Yi Ho

Abstract: Protein classification tasks are essential in drug discovery. Real-world protein structures are dynamic, which will determine the properties of proteins. However, the existing machine learning methods, like ProNet (Wang et al., 2022a), only access limited conformational characteristics and protein side-chain features, leading to impractical protein structure and inaccuracy of protein classes in their predictions. In this paper, we propose novel semantic data augmentation methods, Novel Augmentation of New Node Attributes (NaNa), and Molecular Interactions and Geometric Upgrading (MiGu) to incorporate backbone chemical and side-chain biophysical information into protein classification tasks and a co-embedding residual learning framework. Specifically, we leverage molecular biophysical, secondary structure, chemical bonds, and ionic features of proteins to facilitate protein classification tasks. Furthermore, our semantic augmentation methods and the co-embedding residual learning framework can improve the performance of GIN (Xu et al., 2019) on EC and Fold datasets (Bairoch, 2000; Andreeva et al., 2007) by 16.41% and 11.33% respectively. Our code is available at https://github.com/r08b46009/Code_for_MIGU_NANA/tree/main.

Submitted 26 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.12172

Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection

Authors: Ali Karami, Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Skeleton-based video anomaly detection (SVAD) is a crucial task in computer vision. Accurately identifying abnormal patterns or events enables operators to promptly detect suspicious activities, thereby enhancing safety. Achieving this demands a comprehensive understanding of human motions, both at body and region levels, while also accounting for the wide variations of performing a single action. However, existing studies fail to simultaneously address these crucial properties. This paper introduces a novel, practical and lightweight framework, namely Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection (GiCiSAD) to overcome the challenges associated with SVAD. GiCiSAD consists of three novel modules: the Graph Attention-based Forecasting module to capture the spatio-temporal dependencies inherent in the data, the Graph-level Jigsaw Puzzle Maker module to distinguish subtle region-level discrepancies between normal and abnormal motions, and the Graph-based Conditional Diffusion model to generate a wide spectrum of human motions. Extensive experiments on four widely used skeleton-based video datasets show that GiCiSAD outperforms existing methods with significantly fewer training parameters, establishing it as the new state-of-the-art.

Submitted 30 August, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted at the Winter Conference on Applications of Computer Vision (WACV). 17 pages, 6 figures, 6 tables

arXiv:2403.07257

The Dawn of AI-Native EDA: Opportunities and Challenges of Large Circuit Models

Authors: Lei Chen, Yiqi Chen, Zhufei Chu, Wenji Fang, Tsung-Yi Ho, Ru Huang, Yu Huang, Sadaf Khan, Min Li, Xingquan Li, Yu Li, Yun Liang, Jinwei Liu, Yi Liu, Yibo Lin, Guojie Luo, Zhengyuan Shi, Guangyu Sun, Dimitrios Tsaras, Runsheng Wang, Ziyi Wang, Xinming Wei, Zhiyao Xie, Qiang Xu, Chenhao Xue, et al. (14 additional authors not shown)

Abstract: Within the Electronic Design Automation (EDA) domain, AI-driven solutions have emerged as formidable tools, yet they typically augment rather than redefine existing methodologies. These solutions often repurpose deep learning models from other domains, such as vision, text, and graph analytics, applying them to circuit design without tailoring to the unique complexities of electronic circuits. Such an AI4EDA approach falls short of achieving a holistic design synthesis and understanding, overlooking the intricate interplay of electrical, logical, and physical facets of circuit data. This paper argues for a paradigm shift from AI4EDA towards AI-native EDA, integrating AI at the core of the design process. Pivotal to this vision is the development of a multimodal circuit representation learning technique, poised to provide a comprehensive understanding by harmonizing and extracting insights from varied data sources, such as functional specifications, RTL designs, circuit netlists, and physical layouts. We champion the creation of large circuit models (LCMs) that are inherently multimodal, crafted to decode and express the rich semantics and structures of circuit data, thus fostering more resilient, efficient, and inventive design methodologies. Embracing this AI-native philosophy, we foresee a trajectory that transcends the current innovation plateau in EDA, igniting a profound shift-left in electronic design methodology. The envisioned advancements herald not just an evolution of existing EDA tools but a revolution, giving rise to novel instruments of design tools that promise to radically enhance design productivity and inaugurate a new epoch where the optimization of circuit performance, power, and area (PPA) is achieved not incrementally, but through leaps that redefine the benchmarks of electronic systems' capabilities.

Submitted 1 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: The authors are ordered alphabetically. Contact: qxu@cse[dot]cuhk[dot]edu[dot]hk, gluo@pku[dot]edu[dot]cn, yuan.mingxuan@huawei[dot]com

arXiv:2403.05125

Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis

Authors: Muxi Chen, Yi Liu, Jian Yi, Changran Xu, Qiuxia Lai, Hongliang Wang, Tsung-Yi Ho, Qiang Xu

Abstract: In this paper, we present an empirical study introducing a nuanced evaluation framework for text-to-image (T2I) generative models, applied to human image synthesis. Our framework categorizes evaluations into two distinct groups: first, focusing on image qualities such as aesthetics and realism, and second, examining text conditions through concept coverage and fairness. We introduce an innovative aesthetic score prediction model that assesses the visual appeal of generated images and unveils the first dataset marked with low-quality regions in generated human images to facilitate automatic defect detection. Our exploration into concept coverage probes the model's effectiveness in interpreting and rendering text-based concepts accurately, while our analysis of fairness reveals biases in model outputs, with an emphasis on gender, race, and age. While our study is grounded in human imagery, this dual-faceted approach is designed with the flexibility to be applicable to other forms of image generation, enhancing our understanding of generative models and paving the way to the next generation of more sophisticated, contextually aware, and ethically attuned generative models. We will release our code, the data used for evaluating generative models and the dataset annotated with defective areas soon.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.03437

Dark Dragon Breaks Magnetic Chain: Dynamical Substructures of IRDC G28.34 Form in Supported Environments

Authors: Junhao Liu, Qizhou Zhang, Yuxin Lin, Keping Qiu, Patrick M. Koch, Hauyu Baobab Liu, Zhi-Yun Li, Josep Miquel Girart, Thushara G. S. Pillai, Shanghuo Li, Huei-Ru Vivien Chen, Tao-Chung Ching, Paul T. P. Ho, Shih-Ping Lai, Ramprasad Rao, Ya-Wen Tang, Ke Wang

Abstract: We have comprehensively studied the multi-scale physical properties of the infrared dark cloud (IRDC) G28.34 (the Dragon cloud) with dust polarization and molecular line data from Planck, FCRAO-14m, JCMT, and ALMA. We find that the averaged magnetic fields of clumps tend to be either parallel with or perpendicular to the cloud-scale magnetic fields, while the cores in clump MM4 tend to have magnetic fields aligned with the clump fields. Implementing the relative orientation analysis (for magnetic fields, column density gradients, and local gravity), Velocity Gradient Technique (VGT), and modified Davis-Chandrasekhar-Fermi (DCF) analysis, we find that: G28.34 is located in a trans-to-sub-Alfvénic environment ($\mathcal{M}_{A}=0.74$ within $r=15$ pc); the magnetic field is effectively resisting gravitational collapse in large-scale diffuse gas, but is distorted by gravity within the cloud and affected by star formation activities in high-density regions; and the normalized mass-to-flux ratio tends to increase with increasing density and decreasing radius. Considering the thermal, turbulent, and magnetic supports, we find that the environmental gas of G28.34 is in a super-virial (supported) state, the infrared dark clumps may be in a near-equilibrium state, and core MM4-core4 is in a sub-virial (gravity-dominant) state. In summary, we suggest that magnetic fields dominate gravity and turbulence in the cloud environment at large scales, resulting in relatively slow cloud formation and evolution processes. Within the cloud, gravity could overwhelm magnetic fields and turbulence, allowing local dynamical star formation to happen.

Submitted 18 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

Comments: 35 pages, 24 figures. Accepted by ApJ

arXiv:2403.00867 [pdf, other]

Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes

Authors: Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho

Abstract: Large Language Models (LLMs) are becoming a prominent generative AI tool, where the user enters a query and the LLM generates an answer. To reduce harm and misuse, efforts have been made to align these LLMs to human values using advanced training techniques such as Reinforcement Learning from Human Feedback (RLHF). However, recent studies have highlighted the vulnerability of LLMs to adversarial jailbreak attempts aiming at subverting the embedded safety guardrails. To address this challenge, this paper defines and investigates the Refusal Loss of LLMs and then proposes a method called Gradient Cuff to detect jailbreak attempts. Gradient Cuff exploits the unique properties observed in the refusal loss landscape, including functional values and its smoothness, to design an effective two-step detection strategy. Experimental results on two aligned LLMs (LLaMA-2-7B-Chat and Vicuna-7B-V1.5) and six types of jailbreak attacks (GCG, AutoDAN, PAIR, TAP, Base64, and LRL) show that Gradient Cuff can significantly improve the LLM's rejection capability for malicious jailbreak queries, while maintaining the model's performance for benign user queries by adjusting the detection threshold.

Submitted 5 March, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

Comments: Project page: https://huggingface.co/spaces/TrustSafeAI/GradientCuff-Jailbreak-Defense

arXiv:2402.13061 [pdf, other]

Toward Fairness via Maximum Mean Discrepancy Regularization on Logits Space

Authors: Hao-Wei Chung, Ching-Hao Chiu, Yu-Jen Chen, Yiyu Shi, Tsung-Yi Ho

Abstract: Fairness has become increasingly pivotal in machine learning for high-risk applications such as machine learning in healthcare and facial recognition. However, we see the deficiency in the previous logits space constraint methods. Therefore, we propose a novel framework, Logits-MMD, that achieves the fairness condition by imposing constraints on output logits with Maximum Mean Discrepancy. Moreover, quantitative analysis and experimental results show that our framework has a better property that outperforms previous methods and achieves state-of-the-art on two facial recognition datasets and one animal dataset. Finally, we show experimental results and demonstrate that our debias approach achieves the fairness condition effectively.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.12179 [pdf, other]

Examining Monitoring System: Detecting Abnormal Behavior In Online Examinations

Authors: Dinh An Ngo, Thanh Dat Nguyen, Thi Le Chi Dang, Huy Hoan Le, Ton Bao Ho, Vo Thanh Khang Nguyen, Truong Thanh Hung Nguyen

Abstract: Cheating in online exams has become a prevalent issue over the past decade, especially during the COVID-19 pandemic. To address this issue of academic dishonesty, our "Exam Monitoring System: Detecting Abnormal Behavior in Online Examinations" is designed to assist proctors in identifying unusual student behavior. Our system demonstrates high accuracy and speed in detecting cheating in real-time scenarios, providing valuable information, and aiding proctors in decision-making. This article outlines our methodology and the effectiveness of our system in mitigating the widespread problem of cheating in online exams.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.09457 [pdf]

Self-Healing Effects in OAM Beams Observed on a 28 GHz Experimental Link

Authors: Marek Klemes, Lan Hu, Greg Bowles, Mohammad Akbari, Soulideth Thirakoune, Michael Schwartzman, Kevin Zhang, Tan Huy Ho, David Wessel, Wen Tong

Abstract: In this paper we document for the first time some of the effects of self-healing, a property of orbital-angular-momentum (OAM) or vortex beams, as observed on a millimeter-wave experimental communications link in an outdoors line-of-sight (LOS) scenario. The OAM beams have a helical phase and polarization structure and have conical amplitude shape in the far field. The Poynting vectors of the OAM beams also possess helical structures, orthogonal to the corresponding helical phase-fronts. Due to such non-planar structure in the direction orthogonal to the beam axis, OAM beams are a subset of structured light beams. Such structured beams are known to possess self-healing properties when partially obstructed along their propagation axis, especially in their near fields, resulting in partial reconstruction of their structures at larger distances along their beam axis. Various theoretical rationales have been proposed to explain, model and experimentally verify the self-healing physical effects in structured optical beams, using various types of obstructions and experimental techniques. Based on these models, we hypothesize that any self-healing observed will be greater as the OAM order increases. Here we observe the self-healing effects for the first time in structured OAM radio beams, in terms of communication signals and channel parameters rather than beam structures. We capture the effects of partial near-field obstructions of OAM beams of different orders on the communications signals and provide a physical rationale to substantiate that the self-healing effect was observed to increase with the order of OAM, agreeing with our hypothesis.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 9 pages, 10 figures, pending submission to IEEE Access journal

arXiv:2401.13204 [pdf, other]

An Extremely Young Protostellar Core, MMS 1/ OMC-3: Episodic Mass Ejection History Traced by the Micro SiO Jet

Authors: Satoko Takahashi, Masahiro N. Machida, Mitsuki Omura, Doug Johnstone, Kazuya Saigo, Naoto Harada, Kohji Tomisaka, Paul T. P. Ho, Luis A. Zapata, Steve Mairs, Gregory J. Herczeg, Kotomi Taniguchi, Yuhua Liu, Asako Sato

Abstract: We present ${\sim}0.2$ arcsec ($\sim$80 au) resolution observations of the CO (2-1) and SiO (5-4) lines made with the Atacama large millimeter/submillimeter array toward an extremely young intermediate-mass protostellar source (t$_{\rm dyn}<$1000 years), MMS 1 located in the Orion Molecular Cloud-3 region. We have successfully imaged a very compact CO molecular outflow associated with MMS 1, having deprojected lobe sizes of $\sim$18000 au (red-shifted lobe) and $\sim$35000 au (blue-shifted lobe). We have also detected an extremely compact ($\lesssim$1000 au) and collimated SiO protostellar jet within the CO outflow. The maximum deprojected jet speed is measured to be as high as 93 km s$^{-1}$. The SiO jet wiggles and displays a chain of knots. Our detection of the molecular outflow and jet is the first direct evidence that MMS 1 already hosts a protostar. The position-velocity diagram obtained from the SiO emission shows two distinct structures: (i) bow-shocks associated with the tips of the outflow, and (ii) a collimated jet, showing the jet velocities linearly increasing with the distance from the driving source. Comparisons between the observations and numerical simulations quantitatively share similarities such as multiple-mass ejection events within the jet and Hubble-like flow associated with each mass ejection event. Finally, while there is a weak flux decline seen in the 850 $μ$m light curve obtained with JCMT/SCUBA 2 toward MMS 1, no dramatic flux change events are detected. This suggests that there has not been a clear burst event within the last 8 years.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 19 pages, 9 figures, Accepted for publication in ApJ

arXiv:2401.08066 [pdf, other]

doi 10.1016/j.media.2024.103188

Achieve Fairness without Demographics for Dermatological Disease Diagnosis

Authors: Ching-Hao Chiu, Yu-Jen Chen, Yawen Wu, Yiyu Shi, Tsung-Yi Ho

Abstract: In medical image diagnosis, fairness has become increasingly crucial. Without bias mitigation, deploying unfair AI would harm the interests of the underprivileged population and potentially tear society apart. Recent research addresses prediction biases in deep learning models concerning demographic groups (e.g., gender, age, and race) by utilizing demographic (sensitive attribute) information during training. However, many sensitive attributes naturally exist in dermatological disease images. If the trained model only targets fairness for a specific attribute, it remains unfair for other attributes. Moreover, training a model that can accommodate multiple sensitive attributes is impractical due to privacy concerns. To overcome this, we propose a method enabling fair predictions for sensitive attributes during the testing phase without using such information during training. Inspired by prior work highlighting the impact of feature entanglement on fairness, we enhance the model features by capturing the features related to the sensitive and target attributes and regularizing the feature entanglement between corresponding classes. This ensures that the model can only classify based on the features related to the target attribute without relying on features associated with sensitive attributes, thereby improving fairness and accuracy. Additionally, we use disease masks from the Segment Anything Model (SAM) to enhance the quality of the learned feature. Experimental results demonstrate that the proposed method can improve fairness in classification compared to state-of-the-art methods in two dermatological disease datasets.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2312.17480 [pdf, other]

Detection of evolutionary shifts in variance under an Ornsten-Uhlenbeck model

Authors: Wensha Zhang, Lam Si Tung Ho, Toby Kenney

Abstract: 1. Abrupt environmental changes can lead to evolutionary shifts in not only mean (optimal value), but also variance of descendants in trait evolution. There are some methods to detect shifts in optimal value but few studies consider shifts in variance. 2. We use a multi-optima and multi-variance OU process model to describe the trait evolution process with shifts in both optimal value and variance and provide analysis of how the covariance between species changes when shifts in variance occur along the path. 3. We propose a new method to detect the shifts in both variance and optimal values based on minimizing the loss function with L1 penalty. We implement our method in a new R package, ShiVa (Detection of evolutionary shifts in variance). 4. We conduct simulations to compare our method with the two methods considering only shifts in optimal values (l1ou; PhylogeneticEM). Our method shows strength in predictive ability and includes far fewer false positive shifts in optimal value compared to other methods when shifts in variance actually exist. When there are only shifts in optimal value, our method performs similarly to other methods. When applied to the cordylid data, ShiVa outperformed l1ou and phyloEM, exhibiting the highest log-likelihood and lowest BIC.

Submitted 29 December, 2023; originally announced December 2023.

arXiv:2312.15846 [pdf, other]

Shell-shaped quantum droplet in a three-component ultracold Bose gas

Authors: Yinfeng Ma, Tin-Lun Ho, Xiaoling Cui

Abstract: We present a scheme to generate shell-shaped droplet in a three-component (1,2,3) ultracold Bose gas. Here binary mixtures (1,2) and (2,3) form quantum droplets due to inter-species attractions, and the two droplets are mutually immiscible due to strong 1-3 repulsion. Importantly, the shared component-2 serves as a glue to link the two droplets together as a globally self-bound object. In this system, the outer droplet naturally develops a shell structure, and its radius and width can be conveniently tuned through the size of core droplet. Moreover, to reach an equilibrium with the shell, the core droplet displays very different spin densities as compared to the vacuum case. These results have been demonstrated in a realistic $^{23}$Na-$^{39}$K-$^{41}$K mixture. Our scheme liberates the shell-shaped Bose gas from stringent conditions with microgravity or fine-tuned traps, and can be readily implemented in cold atoms laboratories on Earth. This paves the way for future exploration of quantum droplets in curved space with non-trivial real-space topologies.

Submitted 25 December, 2023; originally announced December 2023.

Comments: 6 pages, 3 figures

arXiv:2312.13615 [pdf, other]

Self-supervised Complex Network for Machine Sound Anomaly Detection

Authors: Miseul Kim, Minh Tri Ho, Hong-Goo Kang

Abstract: In this paper, we propose an anomaly detection algorithm for machine sounds with a deep complex network trained by self-supervision. Using the fact that phase continuity information is crucial for detecting abnormalities in time-series signals, our proposed algorithm utilizes the complex spectrum as an input and performs complex number arithmetic throughout the entire process. Since the usefulness of phase information can vary depending on the type of machine sound, we also apply an attention mechanism to control the weights of the complex and magnitude spectrum bottleneck features depending on the machine type. We train our network to perform a self-supervised task that classifies the machine identifier (id) of normal input sounds among multiple classes. At test time, an input signal is detected as anomalous if the trained model is unable to correctly classify the id. In other words, we determine the presence of an anomaly when the output cross-entropy score of the multiclass identification task is lower than a pre-defined threshold. Experiments with the MIMII dataset show that the proposed algorithm has a much higher area under the curve (AUC) score than conventional magnitude spectrum-based algorithms.

Submitted 21 December, 2023; originally announced December 2023.

Comments: Published in EUSIPCO 2021

arXiv:2312.13573 [pdf, other]

Dust Polarization of Prestellar and Protostellar Sources in OMC-3

Authors: Yuhua Liu, Satoko Takahashi, Masahiro Machida, Kohji Tomisaka, Josep Miquel Girart, Paul T. P. Ho, Kouichiro Nakanishi, Asako Sato

Abstract: We present the Atacama Large Millimeter/submillimeter Array (ALMA) observations of linearly polarized 1.1 mm continuum emission at $\sim$0.14" (55 au) resolution and CO ($J$=2$-$1) emission at $\sim$1.5" (590 au) resolution towards one prestellar (MMS 4), four Class 0 (MMS$\,$1, MMS$\,$3, MMS$\,$5, and MMS$\,$6), one Class I (MMS$\,$7), and one flat-spectrum (MMS$\,$2) sources in the Orion Molecular Cloud$\,$3 region. The dust disk-like structures and clear CO outflows are detected towards all sources except for MMS$\,$4. The diameters of these disk-like structures, ranging from 16 au to 97 au, are estimated based on the deconvolved full width half maximum (FWHM) values obtained from the multi-Gaussian fitting. Polarized emissions are detected towards MMS$\,$2, MMS$\,$5, MMS$\,$6, and MMS$\,$7, while no polarized emission is detected towards MMS$\,$1, MMS$\,$3, and MMS$\,$4. MMS$\,$2, MMS$\,$5, and MMS$\,$7 show organized polarization vectors aligned with the minor axes of the disk-like structures, with mean polarization fractions ranging from 0.6$\%$ to 1.2$\%$. The strongest millimeter source, MMS$\,$6, exhibits complex polarization orientations and a remarkably high polarization fraction of $\sim$10$\%$ around the Stokes $I$ peak, and 15$-$20$\%$ on the arm-like structure, as reported by Takahashi et al. (2019). The origins of the polarized emission, such as self-scattering and dust alignment due to the magnetic field or radiative torque, are discussed for individual sources. Some disk-like sources exhibit a polarized intensity peak shift towards the nearside of the disk, which supports that the polarized emission originates from self-scattering.

Submitted 20 December, 2023; originally announced December 2023.

Comments: 46 pages, 19 figures, accepted for publication in Astrophysical Journal

arXiv:2312.10408 [pdf]

Intra-Family Transformation of The Bi-Te Family via in-situ Chemical Interactions

Authors: Zhihao He, Tin Seng Manfred Ho, Rolf Lortz, Iam Keong Sou

Abstract: The Bi-Te binary system, characterized by the homologous series of the (Bi2)m(Bi2Te3)n, has always attracted research interest for its layered structures and potential in advanced materials applications. Although Bi2Te3 has been extensively studied, exploration of other compounds has been constrained by synthesis challenges. This study reports the molecular beam epitaxy (MBE) growth of FeTe on Bi2Te3, demonstrating that varying growth conditions can turn the Bi2Te3 layer into different Bi-Te phases and form corresponding FeTe/Bi-Te heterostructures. Our combined analysis using reflection high-energy electron diffraction (RHEED), high-resolution X-ray diffraction (HRXRD), and high-resolution scanning transmission electron microscopy (HR-STEM) indicates that specific growth conditions used for the growth of the FeTe layer can facilitate the extraction of Te from Bi2Te3, leading to the formation of Bi4Te3 and Bi6Te3. Additionally, by lowering the FeTe growth temperature to 230 °C, Te extraction from the Bi2Te3 layer could be avoided, preserving the Bi2Te3 structure. Notably, all three FeTe/Bi-Te structures exhibit superconductivity with the FeTe/Bi2Te3 heterostructure enjoying the highest superconductivity quality. These findings introduce a novel method for realizing Bi4Te3 and Bi6Te3 through Te extraction by growing FeTe on Bi2Te3, driven by the high reactivity between Fe and Te. This approach holds promise for synthesizing other members of the Bi-Te series, expanding the functional potential of these materials.

Submitted 7 June, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

Comments: 25 pages, 10 figures

arXiv:2312.05849 [pdf, other]

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Authors: Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan, Yap-Peng Tan, Weipeng Hu

Abstract: Large-scale text-to-image (T2I) diffusion models have showcased incredible capabilities in generating coherent images based on textual descriptions, enabling vast applications in content generation. While recent advancements have introduced control over factors such as object localization, posture, and image contours, a crucial gap remains in our ability to control the interactions between objects in the generated content. Well-controlling interactions in generated images could yield meaningful applications, such as creating realistic scenes with interacting characters. In this work, we study the problems of conditioning T2I diffusion models with Human-Object Interaction (HOI) information, consisting of a triplet label (person, action, object) and corresponding bounding boxes. We propose a pluggable interaction control model, called InteractDiffusion, that extends existing pre-trained T2I diffusion models to enable them to be better conditioned on interactions. Specifically, we tokenize the HOI information and learn their relationships via interaction embeddings. A conditioning self-attention layer is trained to map HOI tokens to visual tokens, thereby conditioning the visual tokens better in existing T2I diffusion models. Our model attains the ability to control the interaction and location on existing T2I diffusion models, which outperforms existing baselines by a large margin in HOI detection score, as well as fidelity in FID and KID. Project page: https://jiuntian.github.io/interactdiffusion.

Submitted 26 February, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

Comments: Website: https://jiuntian.github.io/interactdiffusion. Accepted at CVPR2024

arXiv:2312.02759 [pdf, other]

Absolute Flux Density Calibration of the Greenland Telescope Data for Event Horizon Telescope Observations

Authors: J. Y. Koay, K. Asada, S. Matsushita, C. -Y. Kuo, C. -W. L. Huang, C. Romero-Cañizales, S. Koyama, J. Park, W. -P. Lo, G. Bower, M. -T. Chen, S. -H. Chang, C. -C. Chen, R. Chilson, C. C. Han, P. T. P. Ho, Y. -D. Huang, M. Inoue, B. Jeter, H. Jiang, P. M. Koch, D. Kubo, C. -T. Li, C. -T. Liu, K. -Y. Liu, et al. (13 additional authors not shown)

Abstract: Starting from the observing campaign in April 2018, the Greenland Telescope (GLT) has been added as a new station of the Event Horizon Telescope (EHT) array. Visibilities on baselines to the GLT, particularly in the North-South direction, potentially provide valuable new constraints for the modeling and imaging of sources such as M87*. The GLT's location at high Northern latitudes adds unique challenges to its calibration strategies. Additionally, the performance of the GLT was not optimal during the 2018 observations due to it being only partially commissioned at the time. This document describes the steps taken to estimate the various parameters (and their uncertainties) required for the absolute flux calibration of the GLT data as part of the EHT. In particular, we consider the non-optimized status of the GLT in 2018, as well as its improved performance during the 2021 EHT campaign.

Submitted 5 December, 2023; originally announced December 2023.

Comments: 17 pages, 4 figures, EHT Memo Series 2023-L1-02

arXiv:2312.00656 [pdf, other]

Simple Transferability Estimation for Regression Tasks

Authors: Cuong N. Nguyen, Phong Tran, Lam Si Tung Ho, Vu Dinh, Anh T. Tran, Tal Hassner, Cuong V. Nguyen

Abstract: We consider transferability estimation, the problem of estimating how well deep learning models transfer from a source to a target task. We focus on regression tasks, which received little previous attention, and propose two simple and computationally efficient approaches that estimate transferability based on the negative regularized mean squared error of a linear regression model. We prove novel theoretical results connecting our approaches to the actual transferability of the optimal target models obtained from the transfer learning process. Despite their simplicity, our approaches significantly outperform existing state-of-the-art regression transferability estimators in both accuracy and efficiency. On two large-scale keypoint regression benchmarks, our approaches yield 12% to 36% better results on average while being at least 27% faster than previous state-of-the-art methods.

Submitted 3 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

Comments: Paper published at The 39th Conference on Uncertainty in Artificial Intelligence (UAI) 2023

arXiv:2312.00050 [pdf, other]

Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift

Authors: Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang, Qiuling Xu, Guanhong Tao, Guangyu Shen, Siyuan Cheng, Shiqing Ma, Pin-Yu Chen, Tsung-Yi Ho, Xiangyu Zhang

Abstract: Diffusion models (DM) have become state-of-the-art generative models because of their capability to generate high-quality images from noises without adversarial training. However, they are vulnerable to backdoor attacks as reported by recent studies. When a data input (e.g., some Gaussian noise) is stamped with a trigger (e.g., a white patch), the backdoored model always generates the target image (e.g., an improper photo). However, effective defense strategies to mitigate backdoors from DMs are underexplored. To bridge this gap, we propose the first backdoor detection and removal framework for DMs. We evaluate our framework Elijah on hundreds of DMs of 3 types including DDPM, NCSN and LDM, with 13 samplers against 3 existing backdoor attacks. Extensive experiments show that our approach can have close to 100% detection accuracy and reduce the backdoor effects to close to zero without significantly sacrificing the model utility.

Submitted 4 February, 2024; v1 submitted 27 November, 2023; originally announced December 2023.

Comments: AAAI 2024

arXiv:2311.17516 [pdf, other]

MMA-Diffusion: MultiModal Attack on Diffusion Models

Authors: Yijun Yang, Ruiyuan Gao, Xiaosen Wang, Tsung-Yi Ho, Nan Xu, Qiang Xu

Abstract: In recent years, Text-to-Image (T2I) models have seen remarkable advancements, gaining widespread adoption. However, this progress has inadvertently opened avenues for potential misuse, particularly in generating inappropriate or Not-Safe-For-Work (NSFW) content. Our work introduces MMA-Diffusion, a framework that presents a significant and realistic threat to the security of T2I models by effectively circumventing current defensive measures in both open-source models and commercial online services. Unlike previous approaches, MMA-Diffusion leverages both textual and visual modalities to bypass safeguards like prompt filters and post-hoc safety checkers, thus exposing and highlighting the vulnerabilities in existing defense mechanisms.

Submitted 30 March, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: CVPR 2024. Our codes and benchmarks are available at https://github.com/cure-lab/MMA-Diffusion

arXiv:2311.16646 [pdf, other]

Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective

Authors: Ming-Yu Chung, Sheng-Yen Chou, Chia-Mu Yu, Pin-Yu Chen, Sy-Yen Kuo, Tsung-Yi Ho

Abstract: Dataset distillation offers a potential means to enhance data efficiency in deep learning. Recent studies have shown its ability to counteract backdoor risks present in original training samples. In this study, we delve into the theoretical aspects of backdoor attacks and dataset distillation based on kernel methods. We introduce two new theory-driven trigger pattern generation methods specialized for dataset distillation. Following a comprehensive set of analyses and experiments, we show that our optimization-based trigger design framework informs effective backdoor attacks on dataset distillation. Notably, datasets poisoned by our designed trigger prove resilient against conventional backdoor attack detection and mitigation methods. Our empirical results validate that the triggers developed using our approaches are proficient at executing resilient backdoor attacks.

Submitted 28 November, 2023; originally announced November 2023.

Comments: 19 pages, 4 figures

arXiv:2311.11046 [pdf]

DenseNet and Support Vector Machine classifications of major depressive disorder using vertex-wise cortical features

Authors: Vladimir Belov, Tracy Erwin-Grabner, Ling-Li Zeng, Christopher R. K. Ching, Andre Aleman, Alyssa R. Amod, Zeynep Basgoze, Francesco Benedetti, Bianca Besteher, Katharina Brosch, Robin Bülow, Romain Colle, Colm G. Connolly, Emmanuelle Corruble, Baptiste Couvy-Duchesne, Kathryn Cullen, Udo Dannlowski, Christopher G. Davey, Annemiek Dols, Jan Ernsting, Jennifer W. Evans, Lukas Fisch, Paola Fuentes-Claramonte, Ali Saffet Gonul, Ian H. Gotlib , et al. (63 additional authors not shown)

Abstract: Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate if morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, has the potential to provide diagnostic and predictive biomarkers for MDD. However, previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies. In this study, we used globally representative data from the ENIGMA-MDD working group containing an extensive sample of people with MDD (N=2,772) and HC (N=4,240), which allows a comprehensive analysis with generalizable results. Based on the hypothesis that integration of vertex-wise cortical features can improve classification performance, we evaluated the classification of a DenseNet and a Support Vector Machine (SVM), with the expectation that the former would outperform the latter. As we analyzed a multi-site sample, we additionally applied the ComBat harmonization tool to remove potential nuisance effects of site. We found that both classifiers exhibited close to chance performance (balanced accuracy DenseNet: 51%; SVM: 53%), when estimated on unseen sites. Slightly higher classification performance (balanced accuracy DenseNet: 58%; SVM: 55%) was found when the cross-validation folds contained subjects from all sites, indicating site effect. In conclusion, the integration of vertex-wise morphometric features and the use of the non-linear classifier did not lead to the differentiability between MDD and HC. Our results support the notion that MDD classification on this combination of features and classifiers is unfeasible.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2311.06851 [pdf, other]

doi 10.1007/978-3-031-64779-6_1

Automatic Textual Normalization for Hate Speech Detection

Authors: Anh Thi-Hoang Nguyen, Dung Ha Nguyen, Nguyet Thi Nguyen, Khanh Thanh-Duy Ho, Kiet Van Nguyen

Abstract: Social media data is a valuable resource for research, yet it contains a wide range of non-standard words (NSW). These irregularities hinder the effective operation of NLP tools. Current state-of-the-art methods for the Vietnamese language address this issue as a problem of lexical normalization, involving the creation of manual rules or the implementation of multi-staged deep learning frameworks, which necessitate extensive efforts to craft intricate rules. In contrast, our approach is straightforward, employing solely a sequence-to-sequence (Seq2Seq) model. In this research, we provide a dataset for textual normalization, comprising 2,181 human-annotated comments with an inter-annotator agreement of 0.9014. By leveraging the Seq2Seq model for textual normalization, our results reveal that the accuracy achieved falls slightly short of 70%. Nevertheless, textual normalization enhances the accuracy of the Hate Speech Detection (HSD) task by approximately 2%, demonstrating its potential to improve the performance of complex NLP tasks. Our dataset is accessible for research purposes.

Submitted 25 July, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: 2023 International Conference on Intelligent Systems Design and Applications (ISDA2023)

Journal ref: Intelligent Systems Design and Applications. Lecture Notes in Networks and Systems, vol 1049 (ISDA 2023) 1-12

arXiv:2311.02359 [pdf, other]

doi 10.3842/SIGMA.2023.087

Deformation of the Weighted Scalar Curvature

Authors: Pak Tung Ho, Jinwoo Shin

Abstract: Inspired by the work of Fischer-Marsden [Duke Math. J. 42 (1975), 519-547], we study in this paper the deformation of the weighted scalar curvature. By studying the kernel of the formal $L_φ^2$-adjoint for the linearization of the weighted scalar curvature, we prove several geometric results. In particular, we define a weighted vacuum static space, and study locally conformally flat weighted vacuum static spaces. We then prove some stability results of the weighted scalar curvature on flat spaces. Finally, we consider the prescribed weighted scalar curvature problem on closed smooth metric measure spaces.

Submitted 4 November, 2023; originally announced November 2023.

Journal ref: SIGMA 19 (2023), 087, 15 pages

arXiv:2310.17970 [pdf, other]

The role of turbulence in high-mass star formation: Subsonic and transonic turbulence are ubiquitously found at early stages

Authors: Chao Wang, Ke Wang, Feng-Wei Xu, Patricio Sanhueza, Hauyu Baobab Liu, Qizhou Zhang, Xing Lu, F. Fontani, Paola Caselli, Gemma Busquet, Jonathan C. Tan, Di Li, J. M. Jackson, Thushara Pillai, Paul T. P. Ho, Andrés E. Guzmán, Nannan Yue

Abstract: Context. Traditionally, supersonic turbulence is considered to be one of the most likely mechanisms to slow down the gravitational collapse in dense clumps, thereby enabling the formation of massive stars. However, several recent studies have raised differing points of view based on observations carried out with sufficiently high spatial and spectral resolution. These studies call for a re-evaluation of the role turbulence plays in massive star-forming regions. Aims. Our aim is to study the gas properties, especially the turbulence, in a sample of massive star-forming regions with sufficient spatial and spectral resolution, which can both resolve the core fragmentation and the thermal line width. Methods. We observed NH3 metastable lines with the Very Large Array (VLA) to assess the intrinsic turbulence. Results. Analysis of the turbulence distribution histogram for 32 identified NH3 cores reveals the presence of three distinct components. Furthermore, our results suggest that (1) sub- and transonic turbulence is a prevalent (21 of 32) feature of massive star-forming regions and those cold regions are at early evolutionary stage. This investigation indicates that turbulence alone is insufficient to provide the necessary internal pressure required for massive star formation, necessitating further exploration of alternative candidates; and (2) studies of seven multi-core systems indicate that the cores within each system mainly share similar gas properties and masses. However, two of the systems are characterized by the presence of exceptionally cold and dense cores that are situated at the spatial center of each system. Our findings support the hub-filament model as an explanation for this observed distribution.

Submitted 7 February, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: 34 pages, 15 figures, 4 tables. Accepted for publication in A&A

arXiv:2310.12294 [pdf, other]

Open-Set Multivariate Time-Series Anomaly Detection

Authors: Thomas Lai, Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Numerous methods for time-series anomaly detection (TSAD) have emerged in recent years, most of which are unsupervised and assume that only normal samples are available during the training phase, due to the challenge of obtaining abnormal data in real-world scenarios. Still, limited samples of abnormal data are often available, albeit they are far from representative of all possible anomalies. Supervised methods can be utilized to classify normal and seen anomalies, but they tend to overfit to the seen anomalies present during training, hence, they fail to generalize to unseen anomalies. We propose the first algorithm to address the open-set TSAD problem, called Multivariate Open-Set Time-Series Anomaly Detector (MOSAD), that leverages only a few shots of labeled anomalies during the training phase in order to achieve superior anomaly detection performance compared to both supervised and unsupervised TSAD algorithms. MOSAD is a novel multi-head TSAD framework with a shared representation space and specialized heads, including the Generative head, the Discriminative head, and the Anomaly-Aware Contrastive head. The latter produces a superior representation space for anomaly detection compared to conventional supervised contrastive learning. Extensive experiments on three real-world datasets establish MOSAD as a new state-of-the-art in the TSAD field.

Submitted 7 August, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: Accepted to ECAI-2024

arXiv:2310.08523 [pdf, other]

LLM-augmented Preference Learning from Natural Language

Authors: Inwon Kang, Sikai Ruan, Tyler Ho, Jui-Chien Lin, Farhad Mohsin, Oshani Seneviratne, Lirong Xia

Abstract: Finding preferences expressed in natural language is an important but challenging task. State-of-the-art (SotA) methods leverage transformer-based models such as BERT, RoBERTa, etc. and graph neural architectures such as graph attention networks. Since Large Language Models (LLMs) are equipped to deal with larger context lengths and have much larger model sizes than the transformer-based model, we investigate their ability to classify comparative text directly. This work aims to serve as a first step towards using LLMs for the CPC task. We design and conduct a set of experiments that format the classification task into an input prompt for the LLM and a methodology to get a fixed-format response that can be automatically evaluated. Comparing performances with existing methods, we see that pre-trained LLMs are able to outperform the previous SotA models with no fine-tuning involved. Our results show that the LLMs can consistently outperform the SotA when the target text is large (i.e., composed of multiple sentences), and are still comparable to the SotA performance in shorter text. We also find that few-shot learning yields better performance than zero-shot learning.

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.08381 [pdf, other]

AutoVP: An Automated Visual Prompting Framework and Benchmark

Authors: Hsi-Ai Tsao, Lei Hsiung, Pin-Yu Chen, Sijia Liu, Tsung-Yi Ho

Abstract: Visual prompting (VP) is an emerging parameter-efficient fine-tuning approach to adapting pre-trained vision models to solve various downstream image-classification tasks. However, there has hitherto been little systematic study of the design space of VP and no clear benchmark for evaluating its performance. To bridge this gap, we propose AutoVP, an end-to-end expandable framework for automating VP design choices, along with 12 downstream image-classification tasks that can serve as a holistic VP-performance benchmark. Our design space covers 1) the joint optimization of the prompts; 2) the selection of pre-trained models, including image classifiers and text-image encoders; and 3) model output mapping strategies, including nonparametric and trainable label mapping. Our extensive experimental results show that AutoVP outperforms the best-known current VP methods by a substantial margin, having up to 6.7% improvement in accuracy; and attains a maximum performance increase of 27.5% compared to linear-probing (LP) baseline. AutoVP thus makes a two-fold contribution: serving both as an efficient tool for hyperparameter tuning on VP design choices, and as a comprehensive benchmark that can reasonably be expected to accelerate VP's development. The source code is available at https://github.com/IBM/AutoVP.

Submitted 10 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: ICLR 2024

arXiv:2310.05892 [pdf, ps, other]

A Generalization Bound of Deep Neural Networks for Dependent Data

Authors: Quan Huu Do, Binh T. Nguyen, Lam Si Tung Ho

Abstract: Existing generalization bounds for deep neural networks require data to be independent and identically distributed (iid). This assumption may not hold in real-life applications such as evolutionary biology, infectious disease epidemiology, and stock price prediction. This work establishes a generalization bound of feed-forward neural networks for non-stationary $φ$-mixing data.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2308.13666 [pdf, other]

A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko , et al. (1674 additional authors not shown)

Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses, the Targeted Search and the Untargeted Search, we investigate whether there are any coincident GRBs associated with the GWs. We also search the Swift-BAT rate data around the GW times to determine whether a GRB counterpart is present. No counterparts are found. Using both the Fermi-GBM Targeted Search and the Swift-BAT search, we calculate flux upper limits and present joint upper limits on the gamma-ray luminosity of each GW. Given these limits, we constrain theoretical models for the emission of gamma-rays from binary black hole mergers.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.12563 [pdf, other]

Multivariate Time-Series Anomaly Detection with Contaminated Data

Authors: Thi Kieu Khanh Ho, Narges Armanfard

Abstract: Mainstream unsupervised anomaly detection algorithms often excel in academic datasets, yet their real-world performance is restricted due to the controlled experimental conditions involving clean training data. Addressing the challenge of training with noise, a prevalent issue in practical anomaly detection, is frequently overlooked. In a pioneering endeavor, this study delves into the realm of label-level noise within sensory time-series anomaly detection (TSAD). This paper presents a novel and practical end-to-end unsupervised TSAD when the training data are contaminated with anomalies. The introduced approach, called TSAD-C, is devoid of access to abnormality labels during the training phase. TSAD-C encompasses three modules: a Decontaminator to rectify the abnormalities (aka noise) present in the training data, a Long-range Variable Dependency Modeling module to capture both long-term intra- and inter-variable dependencies within the decontaminated data that can be considered as a surrogate of the pure normal data, and an Anomaly Scoring module to detect anomalies from all types. Our extensive experiments conducted on three reliable datasets conclusively demonstrate that our approach surpasses existing methodologies, thus establishing a new state-of-the-art performance in the field.

Submitted 16 February, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: 9 pages, 4 tables, 4 figures

arXiv:2308.01672 [pdf, other]

Floorplet: Performance-aware Floorplan Framework for Chiplet Integration

Authors: Shixin Chen, Shanyi Li, Zhen Zhuang, Su Zheng, Zheng Liang, Tsung-Yi Ho, Bei Yu, Alberto L. Sangiovanni-Vincentelli

Abstract: A chiplet is an integrated circuit that encompasses a well-defined subset of an overall system's functionality. In contrast to traditional monolithic system-on-chips (SoCs), chiplet-based architecture can reduce costs and increase reusability, representing a promising avenue for continuing Moore's Law. Despite the advantages of multi-chiplet architectures, floorplan design in a chiplet-based architecture has received limited attention. Conflicts between cost and performance necessitate a trade-off in chiplet floorplan design since additional latency introduced by advanced packaging can decrease performance. Consequently, balancing power, performance, cost, area, and reliability is of paramount importance. To address this challenge, we propose Floorplet, a framework comprising simulation tools for performance reporting and comprehensive models for cost and reliability optimization. Our framework employs the open-source Gem5 simulator to establish the relationship between performance and floorplan for the first time, guiding the floorplan optimization of multi-chiplet architecture. The experimental results show that our framework decreases inter-chiplet communication costs by 24.81%.

Submitted 11 December, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

Comments: accepted by TCAD, 12 pages, 10 figures

arXiv:2307.10468 [pdf]

The Greenland Telescope: Construction, Commissioning, and Operations in Pituffik

Authors: Ming-Tang Chen, Keiichi Asada, Satoki Matsushita, Philippe Raffin, Makoto Inoue, Paul T. P. Ho, Chih-Chiang Han, Derek Kubo, Timothy Norton, Nimesh A. Patel, George Nystrom, Chih-Wei L. Huang, Pierre Martin-Cocher, Jun Yi Koay, Cristina Romero-Cañizales, Ching-Tang Liu, Teddy Huang, Kuan-Yu Liu, Tashun Wei, Shu-Hao Chang, Ryan Chilson, Peter Oshiro, Homin Jiang, Chao-Te Li, Geoffrey Bower , et al. (29 additional authors not shown)

Abstract: In 2018, the Greenland Telescope (GLT) started scientific observation in Greenland. Since then, we have completed several significant improvements and added new capabilities to the telescope system. This paper presents a full review of the GLT system, a summary of our observation activities since 2018, the lessons learned from the operations in the Arctic regions, and the prospect of the telescope.

Submitted 19 July, 2023; originally announced July 2023.

Comments: 26 pages, 11 figures, and 8 tables. This is the version of the article before publication editing, as submitted by an author to Publications of the Astronomical Society of the Pacific. IOP Publishing Ltd is not responsible for any errors or omissions in this version of the manuscript or any version derived from it. The Version of Record will be added when it becomes available

arXiv:2307.03838 [pdf, other]

RADAR: Robust AI-Text Detection via Adversarial Learning

Authors: Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho

Abstract: Recent advances in large language models (LLMs) and the intensifying popularity of ChatGPT-like applications have blurred the boundary of high-quality text generation between humans and machines. However, in addition to the anticipated revolutionary changes to our technology and society, the difficulty of distinguishing LLM-generated texts (AI-text) from human-generated texts poses new challenges of misuse and fairness, such as fake content generation, plagiarism, and false accusations of innocent writers. While existing works show that current AI-text detectors are not robust to LLM-based paraphrasing, this paper aims to bridge this gap by proposing a new framework called RADAR, which jointly trains a robust AI-text detector via adversarial learning. RADAR is based on adversarial training of a paraphraser and a detector. The paraphraser's goal is to generate realistic content to evade AI-text detection. RADAR uses the feedback from the detector to update the paraphraser, and vice versa. Evaluated with 8 different LLMs (Pythia, Dolly 2.0, Palmyra, Camel, GPT-J, Dolly 1.0, LLaMA, and Vicuna) across 4 datasets, experimental results show that RADAR significantly outperforms existing AI-text detection methods, especially when paraphrasing is in place. We also identify the strong transferability of RADAR from instruction-tuned LLMs to other LLMs, and evaluate the improved capability of RADAR via GPT-3.5-Turbo.

Submitted 24 October, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

Comments: Accepted by NeurIPS 2023. Project page and demos: https://radar.vizhub.ai

arXiv:2306.16869 [pdf, other]

NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes

Authors: Hao-Lun Sun, Lei Hsiung, Nandhini Chandramoorthy, Pin-Yu Chen, Tsung-Yi Ho

Abstract: Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue. Lowering the supply voltage is an effective strategy for reducing energy consumption. However, aggressively scaling down the supply voltage can lead to accuracy degradation due to random bit flips in static random access memory (SRAM) where model parameters are stored. To address this challenge, we introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes by learning input transformations to generate error-resistant data representations. NeuralFuse protects DNN accuracy in both nominal and low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be readily applied to DNNs with limited access, such as non-configurable hardware or remote access to cloud-based APIs. Experimental results demonstrate that, at a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to 24% while recovering accuracy by up to 57%. To the best of our knowledge, this is the first model-agnostic approach (i.e., no model retraining) to address low-voltage-induced bit errors. The source code is available at https://github.com/IBM/NeuralFuse.

Submitted 21 February, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

arXiv:2306.14518 [pdf, other]

doi 10.1007/978-3-031-43898-1_10

Toward Fairness Through Fair Multi-Exit Framework for Dermatological Disease Diagnosis

Authors: Ching-Hao Chiu, Hao-Wei Chung, Yu-Jen Chen, Yiyu Shi, Tsung-Yi Ho

Abstract: Fairness has become increasingly pivotal in medical image recognition. However, without mitigating bias, deploying unfair medical AI systems could harm the interests of underprivileged populations. In this paper, we observe that while features extracted from the deeper layers of neural networks generally offer higher accuracy, fairness conditions deteriorate as we extract features from deeper layers. This phenomenon motivates us to extend the concept of multi-exit frameworks. Unlike existing works mainly focusing on accuracy, our multi-exit framework is fairness-oriented; the internal classifiers are trained to be more accurate and fairer, with high extensibility to apply to most existing fairness-aware frameworks. During inference, any instance with high confidence from an internal classifier is allowed to exit early. Experimental results show that the proposed framework can improve the fairness condition over the state-of-the-art in two dermatological disease datasets.

Submitted 1 July, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: MICCAI2023

arXiv:2306.14505 [pdf, other]

AME-CAM: Attentive Multiple-Exit CAM for Weakly Supervised Segmentation on MRI Brain Tumor

Authors: Yu-Jen Chen, Xinrong Hu, Yiyu Shi, Tsung-Yi Ho

Abstract: Magnetic resonance imaging (MRI) is commonly used for brain tumor segmentation, which is critical for patient evaluation and treatment planning. To reduce the labor and expertise required for labeling, weakly-supervised semantic segmentation (WSSS) methods with class activation mapping (CAM) have been proposed. However, existing CAM methods suffer from low resolution due to strided convolution and pooling layers, resulting in inaccurate predictions. In this study, we propose a novel CAM method, Attentive Multiple-Exit CAM (AME-CAM), that extracts activation maps from multiple resolutions to hierarchically aggregate and improve prediction accuracy. We evaluate our method on the BraTS 2021 dataset and show that it outperforms state-of-the-art methods.

Submitted 1 December, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: arXiv admin note: text overlap with arXiv:2306.05476

arXiv:2306.09154 [pdf, other]

doi 10.1088/1361-6544/ace3a0

Simple two-layer dispersive models in the Hamiltonian reduction formalism

Authors: R. Camassa, G. Falqui, G. Ortenzi, M. Pedroni, T. T. Vu Ho

Abstract: A Hamiltonian reduction approach is defined, studied, and finally used to derive asymptotic models of internal wave propagation in density stratified fluids in two-dimensional domains. Beginning with the general Hamiltonian formalism of Benjamin [1] for an ideal, stably stratified Euler fluid, the corresponding structure is systematically reduced to the setup of two homogeneous fluids under gravity, separated by an interface and confined between two infinite horizontal plates. A long-wave, small-amplitude asymptotics is then used to obtain a simplified model that encapsulates most of the known properties of the dynamics of such systems, such as bidirectional wave propagation and maximal amplitude travelling waves in the form of fronts. Further reductions, and in particular devising an asymptotic extension of Dirac's theory of Hamiltonian constraints, lead to the completely integrable evolution equations previously considered in the literature for limiting forms of the dynamics of stratified fluids. To assess the performance of the asymptotic models, special solutions are studied and compared with those of the parent equations.

Submitted 15 June, 2023; originally announced June 2023.

Comments: 29 pages, 4 figures

Showing 1–50 of 626 results for author: Ho, T