Search | arXiv e-print repository

arXiv:2407.09342 [pdf, other]

MIXED-SENSE: A Mixed Reality Sensor Emulation Framework for Test and Evaluation of UAVs Against False Data Injection Attacks

Authors: Kartik A. Pant, Li-Yu Lin, Jaehyeok Kim, Worawis Sribunma, James M. Goppert, Inseok Hwang

Abstract: We present a high-fidelity Mixed Reality sensor emulation framework for testing and evaluating the resilience of Unmanned Aerial Vehicles (UAVs) against false data injection (FDI) attacks. The proposed approach can be utilized to assess the impact of FDI attacks, benchmark attack detector performance, and validate the effectiveness of mitigation/reconfiguration strategies in single-UAV and UAV swarm operations. Our Mixed Reality framework leverages high-fidelity simulations of Gazebo and a Motion Capture system to emulate proprioceptive (e.g., GNSS) and exteroceptive (e.g., camera) sensor measurements in real-time. We propose an empirical approach to faithfully recreate signal characteristics such as latency and noise in these measurements. Finally, we illustrate the efficacy of our proposed framework through a Mixed Reality experiment consisting of an emulated GNSS attack on an actual UAV, which (i) demonstrates the impact of false data injection attacks on GNSS measurements and (ii) validates a mitigation strategy utilizing a distributed camera network developed in our previous work. Our open-source implementation is available at https://github.com/CogniPilot/mixed_sense

Submitted 12 July, 2024; originally announced July 2024.

Comments: 6 pages, 5 figures, IROS 2024

arXiv:2407.07142 [pdf, other]

LANSCE-mQ: Dedicated search for milli/fractionally charged particles at LANL

Authors: Yu-Dai Tsai, Insung Hwang, Ryan Schmitz, Matthew Citron, Kranti Gunthoti, Jacob Steenis, Hoyong Jeong, Hyunki Moon, Jae Hyeok Yoo, Ming Xiong Liu

Abstract: In this paper, we propose an experiment, LANSCE-mQ, aiming to detect fractionally charged and millicharged particles (mCP) using an 800 MeV proton beam fixed target at the Los Alamos Neutron Science Center (LANSCE) facility. This search can shed new light on numerous fundamental questions, including charge quantization, the predictions of string theories and grand unification theories, the gauge symmetry of the Standard Model, dark sector models, and the tests of cosmic reheating. We propose to install two-layer scintillation detectors made of plastic (such as EJ-200) or CeBr3 to search for mCPs. Dedicated Geant4 detector simulations and in situ measurements have been conducted to obtain a preliminary determination of the background rate. The dominant backgrounds are beam-induced neutrons and coincident dark current signals from the photomultiplier tubes, while beam-induced gammas and cosmic muons are subdominant. We determined that LANSCE-mQ, the dedicated mCP experiment, has the leading mCP sensitivity for masses between ~1 MeV and 300 MeV.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 8 pages, 8 figures

Report number: FERMILAB-PUB-24-0357-T-V; LA-UR-24-27441

arXiv:2406.09117 [pdf, other]

PC-LoRA: Low-Rank Adaptation for Progressive Model Compression with Knowledge Distillation

Authors: Injoon Hwang, Haewon Park, Youngwan Lee, Jooyoung Yang, SunJae Maeng

Abstract: Low-rank adaption (LoRA) is a prominent method that adds a small number of learnable parameters to the frozen pre-trained weights for parameter-efficient fine-tuning. Prompted by the question, "Can we make its representation enough with LoRA weights solely at the final phase of finetuning without the pre-trained weights?" In this work, we introduce Progressive Compression LoRA (PC-LoRA), which utilizes low-rank adaptation (LoRA) to simultaneously perform model compression and fine-tuning. The PC-LoRA method gradually removes the pre-trained weights during the training process, eventually leaving only the low-rank adapters in the end. Thus, these low-rank adapters replace the whole pre-trained weights, achieving the goals of compression and fine-tuning at the same time. Empirical analysis across various models demonstrates that PC-LoRA achieves parameter and FLOPs compression rates of 94.36%/89.1% for vision models, e.g., ViT-B, and 93.42%/84.2% parameters and FLOPs compressions for language models, e.g., BERT.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted at T4V@CVPR

arXiv:2406.03234 [pdf, other]

Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning

Authors: Inwoo Hwang, Yunhyeok Kwak, Suhyung Choi, Byoung-Tak Zhang, Sanghack Lee

Abstract: Causal dynamics learning has recently emerged as a promising approach to enhancing robustness in reinforcement learning (RL). Typically, the goal is to build a dynamics model that makes predictions based on the causal relationships among the entities. Despite the fact that causal connections often manifest only under certain contexts, existing approaches overlook such fine-grained relationships and lack a detailed understanding of the dynamics. In this work, we propose a novel dynamics model that infers fine-grained causal structures and employs them for prediction, leading to improved robustness in RL. The key idea is to jointly learn the dynamics model with a discrete latent variable that quantizes the state-action space into subgroups. This leads to recognizing meaningful context that displays sparse dependencies, where causal structures are learned for each subgroup throughout the training. Experimental results demonstrate the robustness of our method to unseen states and locally spurious correlations in downstream tasks where fine-grained causal reasoning is crucial. We further illustrate the effectiveness of our subgroup-based approach with quantization in discovering fine-grained causal relationships compared to prior methods.

Submitted 5 June, 2024; originally announced June 2024.

Comments: ICML 2024

arXiv:2406.00614 [pdf, other]

Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction

Authors: Yunhyeok Kwak, Inwoo Hwang, Dooyoung Kim, Sanghack Lee, Byoung-Tak Zhang

Abstract: Monte Carlo Tree Search (MCTS) has showcased its efficacy across a broad spectrum of decision-making problems. However, its performance often degrades under vast combinatorial action space, especially where an action is composed of multiple sub-actions. In this work, we propose an action abstraction based on the compositional structure between a state and sub-actions for improving the efficiency of MCTS under a factored action space. Our method learns a latent dynamics model with an auxiliary network that captures sub-actions relevant to the transition on the current state, which we call state-conditioned action abstraction. Notably, it infers such compositional relationships from high-dimensional observations without the known environment model. During the tree traversal, our method constructs the state-conditioned action abstraction for each node on-the-fly, reducing the search space by discarding the exploration of redundant sub-actions. Experimental results demonstrate the superior sample efficiency of our method compared to vanilla MuZero, which suffers from expansive action space.

Submitted 2 June, 2024; originally announced June 2024.

Comments: UAI 2024 (Oral). The first two authors contributed equally

arXiv:2405.07220 [pdf, other]

On Discovery of Local Independence over Continuous Variables via Neural Contextual Decomposition

Authors: Inwoo Hwang, Yunhyeok Kwak, Yeon-Ji Song, Byoung-Tak Zhang, Sanghack Lee

Abstract: Conditional independence provides a way to understand causal relationships among the variables of interest. An underlying system may exhibit more fine-grained causal relationships especially between a variable and its parents, which will be called the local independence relationships. One of the most widely studied local relationships is Context-Specific Independence (CSI), which holds in a specific assignment of conditioned variables. However, its applicability is often limited since it does not allow continuous variables: data conditioned to the specific value of a continuous variable contains few instances, if not none, making it infeasible to test independence. In this work, we define and characterize the local independence relationship that holds in a specific set of joint assignments of parental variables, which we call context-set specific independence (CSSI). We then provide a canonical representation of CSSI and prove its fundamental properties. Based on our theoretical findings, we cast the problem of discovering multiple CSSI relationships in a system as finding a partition of the joint outcome space. Finally, we propose a novel method, coined neural contextual decomposition (NCD), which learns such partition by imposing each set to induce CSSI via modeling a conditional distribution. We empirically demonstrate that the proposed method successfully discovers the ground truth local independence relationships in both synthetic dataset and complex system reflecting the real-world physical dynamics.

Submitted 12 May, 2024; originally announced May 2024.

Comments: Conference on Causal Learning and Reasoning (CLeaR), 2023

arXiv:2404.14647 [pdf, other]

Human Behavior Modeling via Identification of Task Objective and Variability

Authors: Sooyung Byeon, Dawei Sun, Inseok Hwang

Abstract: Human behavior modeling is important for the design and implementation of human-automation interactive control systems. In this context, human behavior refers to a human's control input to systems. We propose a novel method for human behavior modeling that uses human demonstrations for a given task to infer the unknown task objective and the variability. The task objective represents the human's intent or desire. It can be inferred by the inverse optimal control and improve the understanding of human behavior by providing an explainable objective function behind the given human behavior. Meanwhile, the variability denotes the intrinsic uncertainty in human behavior. It can be described by a Gaussian mixture model and capture the uncertainty in human behavior which cannot be encoded by the task objective. The proposed method can improve the prediction accuracy of human behavior by leveraging both task objective and variability. The proposed method is demonstrated through human-subject experiments using an illustrative quadrotor remote control example.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 10 pages

arXiv:2404.01805 [pdf, other]

Improved Text Emotion Prediction Using Combined Valence and Arousal Ordinal Classification

Authors: Michael Mitsios, Georgios Vamvoukakis, Georgia Maniati, Nikolaos Ellinas, Georgios Dimitriou, Konstantinos Markopoulos, Panos Kakoulidis, Alexandra Vioni, Myrsini Christidou, Junkwang Oh, Gunu Jho, Inchul Hwang, Georgios Vardaxoglou, Aimilios Chalamandaris, Pirros Tsiakoulis, Spyros Raptis

Abstract: Emotion detection in textual data has received growing interest in recent years, as it is pivotal for developing empathetic human-computer interaction systems. This paper introduces a method for categorizing emotions from text, which acknowledges and differentiates between the diversified similarities and distinctions of various emotions. Initially, we establish a baseline by training a transformer-based model for standard emotion classification, achieving state-of-the-art performance. We argue that not all misclassifications are of the same importance, as there are perceptual similarities among emotional classes. We thus redefine the emotion labeling problem by shifting it from a traditional classification model to an ordinal classification one, where discrete emotions are arranged in a sequential order according to their valence levels. Finally, we propose a method that performs ordinal classification in the two-dimensional emotion space, considering both valence and arousal scales. The results show that our approach not only preserves high accuracy in emotion prediction but also significantly reduces the magnitude of errors in cases of misclassification.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.00856 [pdf, other]

Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling

Authors: Injune Hwang, Kyogu Lee

Abstract: Recently, there have been efforts to encode the linguistic information of speech using a self-supervised framework for speech synthesis. However, predicting representations from surrounding representations can inadvertently entangle speaker information in the speech representation. This paper aims to remove speaker information by exploiting the structured nature of speech, composed of discrete units like phonemes with clear boundaries. A neural network predicts these boundaries, enabling variable-length pooling for event-based representation extraction instead of fixed-rate methods. The boundary predictor outputs a probability for the boundary between 0 and 1, making pooling soft. The model is trained to minimize the difference with the pooled representation of the data augmented by time-stretch and pitch-shift. To confirm that the learned representation includes contents information but is independent of speaker information, the model was evaluated with libri-light's phonetic ABX task and SUPERB's speaker identification task.

Submitted 31 March, 2024; originally announced April 2024.

arXiv:2402.01520 [pdf, ps, other]

Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations

Authors: Panos Kakoulidis, Nikolaos Ellinas, Georgios Vamvoukakis, Myrsini Christidou, Alexandra Vioni, Georgia Maniati, Junkwang Oh, Gunu Jho, Inchul Hwang, Pirros Tsiakoulis, Aimilios Chalamandaris

Abstract: In this paper, we propose a singing voice synthesis model, Karaoker-SSL, that is trained only on text and speech data as a typical multi-speaker acoustic model. It is a low-resource pipeline that does not utilize any singing data end-to-end, since its vocoder is also trained on speech data. Karaoker-SSL is conditioned by self-supervised speech representations in an unsupervised manner. We preprocess these representations by selecting only a subset of their task-correlated dimensions. The conditioning module is indirectly guided to capture style information during training by multi-tasking. This is achieved with a Conformer-based module, which predicts the pitch from the acoustic model's output. Thus, Karaoker-SSL allows singing voice synthesis without reliance on hand-crafted and domain-specific features. There are also no requirements for text alignments or lyrics timestamps. To refine the voice quality, we employ a U-Net discriminator that is conditioned on the target speaker and follows a Diffusion GAN training scheme.

Submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to IEEE ICASSP SASB 2024

arXiv:2402.01298 [pdf, other]

Learning Semantic Information from Raw Audio Signal Using Both Contextual and Phonetic Representations

Authors: Jaeyeon Kim, Injune Hwang, Kyogu Lee

Abstract: We propose a framework to learn semantics from raw audio signals using two types of representations, encoding contextual and phonetic information respectively. Specifically, we introduce a speech-to-unit processing pipeline that captures two types of representations with different time resolutions. For the language model, we adopt a dual-channel architecture to incorporate both types of representation. We also present new training objectives, masked context reconstruction and masked context prediction, that push models to learn semantics effectively. Experiments on the sSIMI metric of Zero Resource Speech Benchmark 2021 and Fluent Speech Command dataset show our framework learns semantics better than models trained with only one type of representation.

Submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to ICASSP 2024

arXiv:2401.16737 [pdf]

Formation of highly stable interfacial nitrogen gas hydrate overlayers under ambient conditions

Authors: Chung-Kai Fang, Cheng-Hao Chuang, Chih-Wen Yang, Zheng-Rong Guo, Wei-Hao Hsu, Chia-Hsin Wang, Ing-Shouh Hwang

Abstract: Surfaces (interfaces) dictate many physical and chemical properties of solid materials and adsorbates considerably affect these properties. Nitrogen molecules, which are the most abundant constituent in ambient air, are considered to be inert. Our study combining atomic force microscopy (AFM), X-ray photoemission spectroscopy (XPS), and thermal desorption spectroscopy (TDS) revealed that nitrogen and water molecules can self-assemble into two-dimensional domains, forming ordered stripe structures on graphitic surfaces in both water and ambient air. The stripe structures of this study were composed of approximately 90% and 10% water and nitrogen molecules, respectively, and survived in ultra-high vacuum (UHV) conditions at temperatures up to approximately 350 K. Because pure water molecules completely desorb from graphitic surfaces in a UHV at temperatures lower than 200 K, our results indicate that the incorporation of nitrogen molecules substantially enhanced the stability of the crystalline water hydrogen bonding network. Additional studies on interfacial gas hydrates can provide deeper insight into the mechanisms underlying formation of gas hydrates.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.14421 [pdf, other]

Multi-Agent Based Transfer Learning for Data-Driven Air Traffic Applications

Authors: Chuhao Deng, Hong-Cheol Choi, Hyunsang Park, Inseok Hwang

Abstract: Research in developing data-driven models for Air Traffic Management (ATM) has gained a tremendous interest in recent years. However, data-driven models are known to have long training time and require large datasets to achieve good performance. To address the two issues, this paper proposes a Multi-Agent Bidirectional Encoder Representations from Transformers (MA-BERT) model that fully considers the multi-agent characteristic of the ATM system and learns air traffic controllers' decisions, and a pre-training and fine-tuning transfer learning framework. By pre-training the MA-BERT on a large dataset from a major airport and then fine-tuning it to other airports and specific air traffic applications, a large amount of the total training time can be saved. In addition, for newly adopted procedures and constructed airports where no historical data is available, this paper shows that the pre-trained MA-BERT can achieve high performance by updating regularly with little data. The proposed transfer learning framework and MA-BERT are tested with the automatic dependent surveillance-broadcast data recorded in 3 airports in South Korea in 2019.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 12 pages, 8 figures, submitted for IEEE Transactions on Intelligent Transportation System

arXiv:2310.19348 [pdf]

Rapid suppression of quantum many-body magnetic exciton in doped van der Waals antiferromagnet (Ni,Cd)PS3

Authors: Junghyun Kim, Woongki Na, Jonghyeon Kim, Pyeongjae Park, Kaixuan Zhang, Inho Hwang, Young-Woo Son, Jae Hoon Kim, Hyeonsik Cheong, Je-Geun Park

Abstract: The unique discovery of magnetic exciton in van der Waals antiferromagnet NiPS3 arises between two quantum many-body states of a Zhang-Rice singlet excited state and a Zhang-Rice triplet ground state. Simultaneously, the spectral width of photoluminescence originating from this exciton is exceedingly narrow as 0.4 meV. These extraordinary properties, including the extreme coherence of the magnetic exciton in NiPS3, beg many questions. We studied doping effects using Ni1-xCdxPS3 using two experimental techniques and theoretical studies. Our experimental results show that the magnetic exciton is drastically suppressed upon a few % Cd doping. All these happen while the width of the exciton only gradually increases, and the antiferromagnetic ground state is robust. These results highlight the lattice uniformity's hidden importance as a prerequisite for coherent magnetic exciton. Finally, an exciting scenario emerges: the broken charge transfer forbids the otherwise uniform formation of the coherent magnetic exciton in (Ni,Cd)PS3.

Submitted 30 October, 2023; originally announced October 2023.

Comments: 40 pages, 4 main figures, 13 supporting figures, accepted by Nano Letters

arXiv:2310.16191 [pdf, other]

Can Virtual Reality Protect Users from Keystroke Inference Attacks?

Authors: Zhuolin Yang, Zain Sarwar, Iris Hwang, Ronik Bhaskar, Ben Y. Zhao, Haitao Zheng

Abstract: Virtual Reality (VR) has gained popularity by providing immersive and interactive experiences without geographical limitations. It also provides a sense of personal privacy through physical separation. In this paper, we show that despite assumptions of enhanced privacy, VR is unable to shield its users from side-channel attacks that steal private information. Ironically, this vulnerability arises from VR's greatest strength, its immersive and interactive nature. We demonstrate this by designing and implementing a new set of keystroke inference attacks in shared virtual environments, where an attacker (VR user) can recover the content typed by another VR user by observing their avatar. While the avatar displays noisy telemetry of the user's hand motion, an intelligent attacker can use that data to recognize typed keys and reconstruct typed content, without knowing the keyboard layout or gathering labeled data. We evaluate the proposed attacks using IRB-approved user studies across multiple VR scenarios. For 13 out of 15 tested users, our attacks accurately recognize 86%-98% of typed keys, and the recovered content retains up to 98% of the meaning of the original typed content. We also discuss potential defenses.

Submitted 24 October, 2023; originally announced October 2023.

Comments: Accepted by USENIX 2024

arXiv:2310.07049 [pdf, other]

Robust Machine Learning Inference from X-ray Absorption Near Edge Spectra through Featurization

Authors: Yiming Chen, Chi Chen, Inhui Hwang, Michael J. Davis, Wanli Yang, Chengjun Sun, Shyue Ping Ong, Maria K. Y. Chan

Abstract: X-ray absorption spectroscopy (XAS) is a commonly-employed technique for characterizing functional materials. In particular, x-ray absorption near edge spectra (XANES) encodes local coordination and electronic information and machine learning approaches to extract this information is of significant interest. To date, most ML approaches for XANES have primarily focused on using the raw spectral intensities as input, overlooking the potential benefits of incorporating spectral transformations and dimensionality reduction techniques into ML predictions. In this work, we focused on systematically comparing the impact of different featurization methods on the performance of ML models for XAS analysis. We evaluated the classification and regression capabilities of these models on computed datasets and validated their performance on previously unseen experimental datasets. Our analysis revealed an intriguing discovery: the cumulative distribution function (CDF) feature achieves both high prediction accuracy and exceptional transferability. This remarkably robust performance can be attributed to its tolerance to horizontal shifts in spectra, which is crucial when validating models using experimental data. While this work exclusively focuses on XANES analysis, we anticipate that the methodology presented here will hold promise as a versatile asset to the broader spectroscopy community.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.05299 [pdf]

Image Compression and Decompression Framework Based on Latent Diffusion Model for Breast Mammography

Authors: InChan Hwang, MinJae Woo

Abstract: This research presents a novel framework for the compression and decompression of medical images utilizing the Latent Diffusion Model (LDM). The LDM represents advancement over the denoising diffusion probabilistic model (DDPM) with a potential to yield superior image quality while requiring fewer computational resources in the image decompression process. A possible application of LDM and Torchvision for image upscaling has been explored using medical image data, serving as an alternative to traditional image compression and decompression algorithms. The experimental outcomes demonstrate that this approach surpasses a conventional file compression algorithm, and convolutional neural network (CNN) models trained with decompressed files perform comparably to those trained with original image files. This approach also significantly reduces dataset size so that it can be distributed with a smaller size, and medical images take up much less space in medical devices. The research implications extend to noise reduction in lossy compression algorithms and substitute for complex wavelet-based lossless algorithms.

Submitted 8 October, 2023; originally announced October 2023.

Comments: 6 pages IEEE conference

arXiv:2309.11784 [pdf, other]

Collaborative Fault-Identification & Reconstruction in Multi-Agent Systems

Authors: Shiraz Khan, Inseok Hwang

Abstract: The conventional solutions for fault-detection, identification, and reconstruction (FDIR) require centralized decision-making mechanisms which are typically combinatorial in their nature, necessitating the design of an efficient distributed FDIR mechanism that is suitable for multi-agent applications. To this end, we develop a general framework for efficiently reconstructing a sparse vector being observed over a sensor network via nonlinear measurements. The proposed framework is used to design a distributed multi-agent FDIR algorithm based on a combination of the sequential convex programming (SCP) and the alternating direction method of multipliers (ADMM) optimization approaches. The proposed distributed FDIR algorithm can process a variety of inter-agent measurements (including distances, bearings, relative velocities, and subtended angles between agents) to identify the faulty agents and recover their true states. The effectiveness of the proposed distributed multi-agent FDIR approach is demonstrated by considering a numerical example in which the inter-agent distances are used to identify the faulty agents in a multi-agent configuration, as well as reconstruct their error vectors.

Submitted 22 September, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

arXiv:2308.16880 [pdf, other]

Text2Scene: Text-driven Indoor Scene Stylization with Part-aware Details

Authors: Inwoo Hwang, Hyeonwoo Kim, Young Min Kim

Abstract: We propose Text2Scene, a method to automatically create realistic textures for virtual scenes composed of multiple objects. Guided by a reference image and text descriptions, our pipeline adds detailed texture on labeled 3D geometries in the room such that the generated colors respect the hierarchical structure or semantic parts that are often composed of similar materials. Instead of applying flat stylization on the entire scene at a single step, we obtain weak semantic cues from geometric segmentation, which are further clarified by assigning initial colors to segmented parts. Then we add texture details for individual objects such that their projections on image space exhibit feature embedding aligned with the embedding of the input. The decomposition makes the entire pipeline tractable to a moderate amount of computation resources and memory. As our framework utilizes the existing resources of image and text embedding, it does not require dedicated datasets with high-quality textures designed by skillful artists. To the best of our knowledge, it is the first practical and scalable approach that can create detailed and realistic textures of the desired style that maintain structural context for scenes with multiple objects.

Submitted 31 August, 2023; originally announced August 2023.

Comments: Accepted to CVPR 2023

arXiv:2308.00274 [pdf, other]

Exploiting Sparsity for Localization of Large-Scale Wireless Sensor Networks

Authors: Shiraz Khan, Inseok Hwang, James Goppert

Abstract: Wireless Sensor Network (WSN) localization refers to the problem of determining the position of each of the agents in a WSN using noisy measurement information. In many cases, such as in distance and bearing-based localization, the measurement model is a nonlinear function of the agents' positions, leading to pairwise interconnections between the agents. As the optimal solution for the WSN localization problem is known to be computationally expensive in these cases, an efficient approximation is desired. In this paper, we show that the inherent sparsity in this problem can be exploited to greatly reduce the computational effort of using an Extended Kalman Filter (EKF) for large-scale WSN localization. In the proposed method, which we call the Low-Bandwidth Extended Kalman Filter (LB-EKF), the measurement information matrix is converted into a banded matrix by relabeling (permuting the order of) the vertices of the graph. Using a combination of theoretical analysis and numerical simulations, it is shown that typical WSN configurations (which can be modeled as random geometric graphs) can be localized in a scalable manner using the proposed LB-EKF approach.

Submitted 1 August, 2023; originally announced August 2023.

arXiv:2308.00268 [pdf, other]

Distributed Gaussian Mixture PHD Filtering under Communication Constraints

Authors: Shiraz Khan, Yi-Chieh Sun, Inseok Hwang

Abstract: The Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter is an almost exact closed-form approximation to the Bayes-optimal multi-target tracking algorithm. Due to its optimality guarantees and ease of implementation, it has been studied extensively in the literature. However, the challenges involved in implementing the GM-PHD filter efficiently in a distributed (multi-sensor) setting have received little attention. The existing solutions for distributed PHD filtering either have a high computational and communication cost, making them infeasible for resource-constrained applications, or are unable to guarantee the asymptotic convergence of the distributed PHD algorithm to an optimal solution. In this paper, we develop a distributed GM-PHD filtering recursion that uses a probabilistic communication rule to limit the communication bandwidth of the algorithm, while ensuring asymptotic optimality of the algorithm. We derive the convergence properties of this recursion, which uses weighted average consensus of Gaussian mixtures (GMs) to lower (and asymptotically minimize) the Cauchy-Schwarz divergence between the sensors' local estimates. In addition, the proposed method is able to avoid the issue of false positives, which has previously been noted to impact the filtering performance of distributed multi-target tracking. Through numerical simulations, it is demonstrated that our proposed method is an effective solution for distributed multi-target tracking in resource-constrained sensor networks.

Submitted 1 August, 2023; originally announced August 2023.

arXiv:2307.12078 [pdf, other]

Recovery of Localization Errors in Sensor Networks using Inter-Agent Measurements

Authors: Shiraz Khan, Inseok Hwang

Abstract: A practical challenge which arises in the operation of sensor networks is the presence of sensor faults, biases, or adversarial attacks, which can lead to significant errors in the localization of the agents, thereby undermining the security and performance of the network. We consider the problem of identifying and correcting the localization errors using inter-agent measurements, such as the distances or bearings from one agent to another, which can serve as a redundant source of information about the sensor network's configuration. The problem is solved by searching for a block sparse solution to an underdetermined system of equations, where the sparsity is introduced via the fact that the number of localization errors is typically much smaller than the total number of agents. Unlike the existing works, our proposed method does not require the knowledge of the identities of the anchors, i.e., the agents that do not have localization errors. We characterize the necessary and sufficient conditions on the sensor network configuration under which a given number of localization errors can be uniquely identified and corrected using the proposed method. The applicability of our results is demonstrated numerically by processing inter-agent distance measurements using a sequential convex programming (SCP) algorithm to identify the localization errors in a sensor network.

Submitted 22 July, 2023; originally announced July 2023.

arXiv:2305.04422 [pdf]

Multivariate Analysis on Performance Gaps of Artificial Intelligence Models in Screening Mammography

Authors: Linglin Zhang, Beatrice Brown-Mulry, Vineela Nalla, InChan Hwang, Judy Wawira Gichoya, Aimilia Gastounioti, Imon Banerjee, Laleh Seyyed-Kalantari, MinJae Woo, Hari Trivedi

Abstract: Although deep learning models for abnormality classification can perform well in screening mammography, the demographic, imaging, and clinical characteristics associated with increased risk of model failure remain unclear. This retrospective study uses the Emory BrEast Imaging Dataset (EMBED) containing mammograms from 115931 patients imaged at Emory Healthcare between 2013-2020, with BI-RADS assessment, region of interest coordinates for abnormalities, imaging features, pathologic outcomes, and patient demographics. Multiple deep learning models were trained to distinguish between abnormal tissue patches and randomly selected normal tissue patches from screening mammograms. We assessed model performance by subgroups defined by age, race, pathologic outcome, tissue density, and imaging characteristics and investigated their associations with false negatives (FN) and false positives (FP). We also performed multivariate logistic regression to control for confounding between subgroups. The top-performing model, ResNet152V2, achieved accuracy of 92.6% (95% CI=92.0-93.2%), and AUC 0.975 (95% CI=0.972-0.978). Before controlling for confounding, nearly all subgroups showed statistically significant differences in model performance. However, after controlling for confounding, we found lower FN risk associates with Other race (RR=0.828; p=.050), biopsy-proven benign lesions (RR=0.927; p=.011), and mass (RR=0.921; p=.010) or asymmetry (RR=0.854; p=.040); higher FN risk associates with architectural distortion (RR=1.037; p<.001). Higher FP risk associates with BI-RADS density C (RR=1.891; p<.001) and D (RR=2.486; p<.001). Our results demonstrate subgroup analysis is important in mammogram classifier performance evaluation, and controlling for confounding between subgroups elucidates the true associations between variables and model failure. These results can help guide developing future breast cancer detection models.

Submitted 19 October, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

Comments: 29 pages, 6 tables, 7 figures, 2 supplemental tables

arXiv:2304.08204 [pdf, other]

Learning Geometry-aware Representations by Sketching

Authors: Hyundo Lee, Inwoo Hwang, Hyunsung Go, Won-Seok Choi, Kibeom Kim, Byoung-Tak Zhang

Abstract: Understanding geometric concepts, such as distance and shape, is essential for understanding the real world and also for many vision tasks. To incorporate such information into a visual representation of a scene, we propose learning to represent the scene by sketching, inspired by human behavior. Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strokes that explicitly incorporate the geometric information of the scene in a single inference step without requiring a sketch dataset. A sketch is then generated from the strokes where CLIP-based perceptual loss maintains a semantic similarity between the sketch and the image. We show theoretically that sketching is equivariant with respect to arbitrary affine transformations and thus provably preserves geometric information. Experimental results show that LBS substantially improves the performance of object attribute classification on the unlabeled CLEVR dataset, domain transfer between CLEVR and STL-10 datasets, and for diverse downstream tasks, confirming that LBS provides rich geometric information.

Submitted 17 April, 2023; originally announced April 2023.

Comments: CVPR 2023

arXiv:2302.00671 [pdf, other]

Efficient Multi-Task Reinforcement Learning via Selective Behavior Sharing

Authors: Grace Zhang, Ayush Jain, Injune Hwang, Shao-Hua Sun, Joseph J. Lim

Abstract: The ability to leverage shared behaviors between tasks is critical for sample-efficient multi-task reinforcement learning (MTRL). While prior methods have primarily explored parameter and data sharing, direct behavior-sharing has been limited to task families requiring similar behaviors. Our goal is to extend the efficacy of behavior-sharing to more general task families that could require a mix of shareable and conflicting behaviors. Our key insight is an agent's behavior across tasks can be used for mutually beneficial exploration. To this end, we propose a simple MTRL framework for identifying shareable behaviors over tasks and incorporating them to guide exploration. We empirically demonstrate how behavior sharing improves sample efficiency and final performance on manipulation and navigation MTRL tasks and is even complementary to parameter sharing. Result videos are available at https://sites.google.com/view/qmp-mtrl.

Submitted 1 February, 2023; originally announced February 2023.

arXiv:2212.04396 [pdf, other]

On Attack Detection and Identification for the Cyber-Physical System using Lifted System Model

Authors: Dawei Sun, Minhyun Cho, Inseok Hwang

Abstract: Motivated by the safety and security issues related to cyber-physical systems with potentially multi-rate, delayed, and nonuniformly sampled measurements, we investigate the attack detection and identification using the lifted system model in this paper. Attack detectability and identifiability based on the lifted system model are formally defined and rigorously characterized in a novel approach. The method of checking detectability is discussed, and a residual design problem for attack detection is formulated in a general way. For attack identification, we define and characterize it by generalizing the concept of mode discernibility for switched systems, and a method for identifying the attack is discussed based on the theoretical analysis. An illustrative example of an unmanned aircraft system (UAS) is provided to validate the main results.

Submitted 8 December, 2022; originally announced December 2022.

Comments: It is the preprint of a paper submitted to Automatica

arXiv:2212.04018 [pdf, other]

An Open-Source Gazebo Plugin for GNSS Multipath Signal Emulation in Virtual Urban Canyons

Authors: Kartik Anand Pant, Zhanpeng Yang, James M Goppert, Inseok Hwang

Abstract: One of the major errors affecting GNSS signals in urban canyons is GNSS multipath error. In this work, we develop a Gazebo plugin which utilizes a ray tracing technique to account for multipath effects in a virtual urban canyon environment using virtual satellites. This software plugin balances accuracy and computational complexity to run the simulation in real-time for both software-in-the-loop (SITL) and hardware-in-the-loop (HITL) testing. We also construct a 3D virtual environment of Hong Kong and compare the results from our plugin with the GNSS data in the publicly available Urban-Nav dataset, to validate the efficacy of the proposed Gazebo Plugin. The plugin is openly available to all the researchers in the robotics community. https://github.com/kpant14/multipath_sim

Submitted 7 December, 2022; originally announced December 2022.

Comments: 13 pages, 8 figures

arXiv:2211.07734 [pdf, other]

Superconducting Niobium Tip Electron Beam Source

Authors: Cameron W. Johnson, Andreas K. Schmid, Marian Mankos, Robin Röpke, Nicole Kerker, Ing-Shouh Hwang, Ed K. Wong, D. Frank Ogletree, Andrew M. Minor, Alexander Stibor

Abstract: Modern electron microscopy and spectroscopy is a key technology for studying the structure and composition of quantum and biological materials in fundamental and applied sciences. High-resolution spectroscopic techniques and aberration-corrected microscopes are often limited by the relatively large energy distribution of currently available beam sources. This can be improved by a monochromator, with the significant drawback of losing most of the beam current. Here, we study the field emission properties of a monocrystalline niobium tip electron field emitter at 5.2 K, well below the superconducting transition temperature. The emitter fabrication process can generate two tip configurations, with or without a nano-protrusion at the apex, strongly influencing the field-emission energy distribution. The geometry without the nano-protrusion has a high beam current, long-term stability, and an energy width of around 100 meV. The beam current can be increased by two orders of magnitude by xenon gas adsorption. We also studied the emitter performance up to 82 K and demonstrated the beam's energy width can be below 40 meV even at liquid nitrogen cooling temperatures when an apex nano-protrusion is present. Furthermore, the spatial and temporal electron-electron correlations of the field emission are studied at normal and superconducting temperatures and the influence of Nottingham heating is discussed. This new monochromatic source will allow unprecedented accuracy and resolution in electron microscopy, spectroscopy, and high-coherence quantum applications.

Submitted 14 November, 2022; originally announced November 2022.

arXiv:2211.05203 [pdf, other]

Data-driven Cyberattack Synthesis against Network Control Systems

Authors: Omanshu Thapliyal, Inseok Hwang

Abstract: Network Control Systems (NCSs) pose unique vulnerabilities to cyberattacks due to a heavy reliance on communication channels. These channels can be susceptible to eavesdropping, false data injection (FDI), and denial of service (DoS). As a result, smarter cyberattacks can employ a combination of techniques to cause degradation of the considered NCS performance. We consider a white-box cyberattack synthesis technique in which the attacker initially eavesdrops to gather system data, and constructs an equivalent system model. We utilize the equivalent model to synthesize hybrid cyberattacks, a combination of FDI and DoS attacks against the NCS. Reachable sets for the equivalent NCS model provide rapid, real-time directives towards selecting NCS agents to be attacked. The devised method provides a significantly more realistic approach toward cyberattack synthesis against NCSs with unknown parameters. We demonstrate the proposed method using a multi-aerial vehicle formation control scenario.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 10 pages, 5 figures

arXiv:2211.03732 [pdf, other]

Approximating Reachable Sets for Neural Network based Models in Real-Time via Optimal Control

Authors: Omanshu Thapliyal, Inseok Hwang

Abstract: In this paper, we present a data-driven framework for real-time estimation of reachable sets for control systems where the plant is modeled using neural networks (NNs). We utilize a running example of a quadrotor model that is learned using trajectory data via NNs. The NN, learned offline, can be excited online to obtain linear approximations for reachability analysis. We use a dynamic mode decomposition based approach to obtain linear liftings of the NN model. The linear models thus obtained can utilize optimal control theory to obtain polytopic approximations to the reachable sets in real-time. The polytopic approximations can be tuned to arbitrary degrees of accuracy. The proposed framework can be extended to other nonlinear models that utilize NNs to estimate plant dynamics. We demonstrate the effectiveness of the proposed framework using an illustrative simulation of quadrotor dynamics.

Submitted 7 November, 2022; originally announced November 2022.

Comments: 14 pages, 11 figures, journal paper that has been conditionally accepted

arXiv:2211.03310 [pdf, other]

Log-linear Dynamic Inversion Control with Provable Safety Guarantees in Lie Groups

Authors: Li-Yu Lin, James Goppert, Inseok Hwang

Abstract: In this paper, we use the derivative of the exponential map to derive the exact evolution of the logarithm of the tracking error for mixed-invariant systems, a class of systems capable of describing rigid body tracking problems in Lie groups. Additionally, we design a log-linear dynamic inversion-based control law to remove the nonlinearities due to spatial curvature and enhance the robustness of the controller. We apply Linear Matrix Inequalities (LMIs) to bound the tracking error given a bounded disturbance amplified by the distortion matrix and leverage the tracking error bound to create flow pipes. To demonstrate the usefulness of our method, we show its application with Urban Air Mobility (UAM) scenarios using a simplified kinematic aircraft model and polynomial-based path planning methods.

Submitted 13 August, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

Comments: 7 pages, 5 figures. Revision is submitted to IEEE TAC

arXiv:2211.02291 [pdf, other]

SelecMix: Debiased Learning by Contradicting-pair Sampling

Authors: Inwoo Hwang, Sangjun Lee, Yunhyeok Kwak, Seong Joon Oh, Damien Teney, Jin-Hwa Kim, Byoung-Tak Zhang

Abstract: Neural networks trained with ERM (empirical risk minimization) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when training labels are strongly correlated with undesirable features. To prevent a network from learning such features, recent methods augment training data such that examples displaying spurious correlations (i.e., bias-aligned examples) become a minority, whereas the other, bias-conflicting examples become prevalent. However, these approaches are sometimes difficult to train and scale to real-world data because they rely on generative models or disentangled representations. We propose an alternative based on mixup, a popular augmentation that creates convex combinations of training examples. Our method, coined SelecMix, applies mixup to contradicting pairs of examples, defined as showing either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features. Identifying such pairs requires comparing examples with respect to unknown biased features. For this, we utilize an auxiliary contrastive model with the popular heuristic that biased features are learned preferentially during training. Experiments on standard benchmarks demonstrate the effectiveness of the method, in particular when label noise complicates the identification of bias-conflicting examples.

Submitted 4 November, 2022; originally announced November 2022.

Comments: NeurIPS 2022

arXiv:2211.01327 [pdf, other]

Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis

Authors: Konstantinos Klapsas, Karolos Nikitaras, Nikolaos Ellinas, June Sig Sung, Inchul Hwang, Spyros Raptis, Aimilios Chalamandaris, Pirros Tsiakoulis

Abstract: A large part of the expressive speech synthesis literature focuses on learning prosodic representations of the speech signal which are then modeled by a prior distribution during inference. In this paper, we compare different prior architectures at the task of predicting phoneme level prosodic representations extracted with an unsupervised FVAE model. We use both subjective and objective metrics to show that normalizing flow based prior networks can result in more expressive speech at the cost of a slight drop in quality. Furthermore, we show that the synthesized speech has higher variability, for a given text, due to the nature of normalizing flows. We also propose a Dynamical VAE model, that can generate higher quality speech although with decreased expressiveness and variability compared to the flow based models.

Submitted 2 November, 2022; originally announced November 2022.

Comments: Submitted to ICASSP 2023

arXiv:2211.00523 [pdf, other]

Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis

Authors: Karolos Nikitaras, Konstantinos Klapsas, Nikolaos Ellinas, Georgia Maniati, June Sig Sung, Inchul Hwang, Spyros Raptis, Aimilios Chalamandaris, Pirros Tsiakoulis

Abstract: This paper proposes an Expressive Speech Synthesis model that utilizes token-level latent prosodic variables in order to capture and control utterance-level attributes, such as character acting voice and speaking style. Current works aim to explicitly factorize such fine-grained and utterance-level speech attributes into different representations extracted by modules that operate in the corresponding level. We show that the fine-grained latent space also captures coarse-grained information, which is more evident as the dimension of latent space increases in order to capture diverse prosodic representations. Therefore, a trade-off arises between the diversity of the token-level and utterance-level representations and their disentanglement. We alleviate this issue by first capturing rich speech attributes into a token-level latent space and then separately training a prior network that, given the input text, learns utterance-level representations in order to predict the phoneme-level, posterior latents extracted during the previous step. Both qualitative and quantitative evaluations are used to demonstrate the effectiveness of the proposed approach. Audio samples are available on our demo page.

Submitted 1 November, 2022; originally announced November 2022.

Comments: Submitted to ICASSP 2023

arXiv:2211.00375 [pdf, other]

Generating Multilingual Gender-Ambiguous Text-to-Speech Voices

Authors: Konstantinos Markopoulos, Georgia Maniati, Georgios Vamvoukakis, Nikolaos Ellinas, Georgios Vardaxoglou, Panos Kakoulidis, Junkwang Oh, Gunu Jho, Inchul Hwang, Aimilios Chalamandaris, Pirros Tsiakoulis, Spyros Raptis

Abstract: The gender of any voice user interface is a key element of its perceived identity. Recently, there has been increasing interest in interfaces where the gender is ambiguous rather than clearly identifying as female or male. This work addresses the task of generating novel gender-ambiguous TTS voices in a multi-speaker, multilingual setting. This is accomplished by efficiently sampling from a latent speaker embedding space using a proposed gender-aware method. Extensive objective and subjective evaluations clearly indicate that this method is able to efficiently generate a range of novel, diverse voices that are consistent and perceived as more gender-ambiguous than a baseline voice across all the languages examined. Interestingly, the gender perception is found to be robust across two demographic factors of the listeners: native language and gender. To our knowledge, this is the first systematic and validated approach that can reliably generate a variety of gender-ambiguous voices.

Submitted 11 June, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

Comments: Accepted to INTERSPEECH 2023

arXiv:2211.00342 [pdf, other]

doi 10.1109/ICASSP49357.2023.10096255

Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features

Authors: Alexandra Vioni, Georgia Maniati, Nikolaos Ellinas, June Sig Sung, Inchul Hwang, Aimilios Chalamandaris, Pirros Tsiakoulis

Abstract: Current state-of-the-art methods for automatic synthetic speech evaluation are based on MOS prediction neural models. Such MOS prediction models include MOSNet and LDNet that use spectral features as input, and SSL-MOS that relies on a pretrained self-supervised learning model that directly uses the speech signal as input. In modern high-quality neural TTS systems, prosodic appropriateness with regard to the spoken content is a decisive factor for speech naturalness. For this reason, we propose to include prosodic and linguistic features as additional inputs in MOS prediction systems, and evaluate their impact on the prediction outcome. We consider phoneme level F0 and duration features as prosodic inputs, as well as Tacotron encoder outputs, POS tags and BERT embeddings as higher-level linguistic inputs. All MOS prediction systems are trained on SOMOS, a neural TTS-only dataset with crowdsourced naturalness MOS evaluations. Results show that the proposed additional features are beneficial in the MOS prediction task, by improving the predicted MOS scores' correlation with the ground truths, both at utterance-level and system-level predictions.

Submitted 7 May, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

Comments: Proceedings of ICASSP 2023

arXiv:2210.17264

Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation

Authors: Nikolaos Ellinas, Georgios Vamvoukakis, Konstantinos Markopoulos, Georgia Maniati, Panos Kakoulidis, June Sig Sung, Inchul Hwang, Spyros Raptis, Aimilios Chalamandaris, Pirros Tsiakoulis

Abstract: This paper presents a method for end-to-end cross-lingual text-to-speech (TTS) which aims to preserve the target language's pronunciation regardless of the original speaker's language. The model used is based on a non-attentive Tacotron architecture, where the decoder has been replaced with a normalizing flow network conditioned on the speaker identity, allowing both TTS and voice conversion (VC) to be performed by the same model due to the inherent linguistic content and speaker identity disentanglement. When used in a cross-lingual setting, acoustic features are initially produced with a native speaker of the target language and then voice conversion is applied by the same model in order to convert these features to the target speaker's voice. We verify through objective and subjective evaluations that our method can have benefits compared to baseline cross-lingual synthesis. By including speakers averaging 7.5 minutes of speech, we also present positive results on low-resource scenarios.

Submitted 27 February, 2024; v1 submitted 31 October, 2022; originally announced October 2022.

Comments: Fundamental changes to the model described and experimental procedure

arXiv:2210.10927 [pdf, other]

A Novel Approach to Set-Membership Observer for Systems with Unknown Exogenous Inputs

Authors: Marvin Jesse, Dawei Sun, Inseok Hwang

Abstract: Motivated by the increasing need to monitor safety-critical systems subject to uncertainties, a novel set-membership approach is proposed to estimate the state of a dynamical system with unknown-but-bounded exogenous inputs. The proposed method decomposes the system into the strongly observable and weakly unobservable subsystem in which an unknown input observer and an ellipsoidal set-membership observer are designed for each subsystem, respectively. The conditions for the boundedness of the proposed set estimate are discussed, and the proposed set-membership observer is also tested numerically using illustrative examples.

Submitted 19 October, 2022; originally announced October 2022.

arXiv:2208.13843 [pdf, ps, other]

Provably Stabilizing Model-Free Q-Learning for Unknown Bilinear Systems

Authors: Shanelle G. Clarke, Omanshu Thapliyal, Inseok Hwang

Abstract: In this paper, we present a provably convergent Model-Free $Q$-Learning algorithm that learns a stabilizing control policy for an unknown Bilinear System from a single online run. Given an unknown bilinear system, we study the interplay between its equivalent control-affine linear time-varying and linear time-invariant representations to derive i) from Pontryagin's Minimum Principle, a pair of point-to-point model-free policy improvement and evaluation laws that iteratively solves for an optimal state-dependent control policy; and ii) the properties under which the state-input data is sufficient to characterize system behavior in a model-free manner. We demonstrate the performance of the proposed algorithm via illustrative numerical examples and compare it to the model-based case.

Submitted 29 August, 2022; originally announced August 2022.

Comments: 7 pages, 1 figure, Submitted to IEEE Control Systems Letters (L-CSS)

arXiv:2208.04832 [pdf, other]

On the Importance of Critical Period in Multi-stage Reinforcement Learning

Authors: Junseok Park, Inwoo Hwang, Min Whoo Lee, Hyunseok Oh, Minsu Lee, Youngki Lee, Byoung-Tak Zhang

Abstract: The initial years of an infant's life are known as the critical period, during which the overall development of learning performance is significantly impacted due to neural plasticity. In recent studies, an AI agent, with a deep neural network mimicking mechanisms of actual neurons, exhibited a learning period similar to human's critical period. Especially during this initial period, the appropriate stimuli play a vital role in developing learning ability. However, transforming human cognitive bias into an appropriate shaping reward is quite challenging, and prior works on critical period do not focus on finding the appropriate stimulus. To take a step further, we propose multi-stage reinforcement learning to emphasize finding "appropriate stimulus" around the critical period. Inspired by humans' early cognitive-developmental stage, we use multi-stage guidance near the critical period, and demonstrate the appropriate shaping reward (stage-2 guidance) in terms of the AI agent's performance, efficiency, and stability.

Submitted 9 August, 2022; originally announced August 2022.

Comments: Accepted by the ICML Complex Feedback in Online Learning Workshop (Open Problems) 2022

arXiv:2207.04236 [pdf, other]

doi 10.1145/3528223.3530075

Sparse Ellipsometry: Portable Acquisition of Polarimetric SVBRDF and Shape with Unstructured Flash Photography

Authors: Inseung Hwang, Daniel S. Jeon, Adolfo Muñoz, Diego Gutierrez, Xin Tong, Min H. Kim

Abstract: Ellipsometry techniques allow to measure polarization information of materials, requiring precise rotations of optical components with different configurations of lights and sensors. This results in cumbersome capture devices, carefully calibrated in lab conditions, and in very long acquisition times, usually in the order of a few days per object. Recent techniques allow to capture polarimetric spatially-varying reflectance information, but limited to a single view, or to cover all view directions, but limited to spherical objects made of a single homogeneous material. We present sparse ellipsometry, a portable polarimetric acquisition method that captures both polarimetric SVBRDF and 3D shape simultaneously. Our handheld device consists of off-the-shelf, fixed optical components. Instead of days, the total acquisition time varies between twenty and thirty minutes per object. We develop a complete polarimetric SVBRDF model that includes diffuse and specular components, as well as single scattering, and devise a novel polarimetric inverse rendering algorithm with data augmentation of specular reflection samples via generative modeling. Our results show a strong agreement with a recent ground-truth dataset of captured polarimetric BRDFs of real-world objects.

Submitted 8 February, 2023; v1 submitted 9 July, 2022; originally announced July 2022.

Journal ref: ACM Transactions on Graphics 41, 4, Article 133 (July 2022)

arXiv:2207.03318 [pdf, other]

State Prediction of Human-in-the-Loop Multi-rotor System with Stochastic Human Behavior Model

Authors: Joonwon Choi, Sooyung Byeon, Inseok Hwang

Abstract: Reachability analysis is a widely used method to analyze the safety of a Human-in-the-Loop Cyber Physical System (HiLCPS). This strategy allows the HiLCPS to respond against an imminent threat in advance by predicting reachable states of the system. However, it could lead to an unnecessarily conservative reachable set if the prediction only relies on the system dynamics without explicitly considering human behavior, and thus the risk might be overestimated. To reduce the conservativeness of the reachability analysis, we present a state prediction method which takes into account a stochastic human behavior model represented as a Gaussian Mixture Model (GMM). In this paper, we focus on the multi-rotor in a near-collision situation. The stochastic human behavior model is trained using experimental data to represent human operators' evasive maneuver. Then, we can retrieve a human control input probability distribution from the trained stochastic human behavior model using the Gaussian Mixture Regression (GMR). The proposed algorithm predicts the probability distribution of the multi-rotor's future state based on the given dynamics and the retrieved human control input probability distribution. Besides, the proposed state prediction method considers the uncertainty of the initial state modeled as a GMM, which yields more robust performance. Human subject experiment results are provided to demonstrate the effectiveness of the proposed algorithm.

Submitted 7 July, 2022; originally announced July 2022.

Comments: This work has been submitted to IFAC for possible publication

arXiv:2206.12455 [pdf, other]

Ev-NeRF: Event Based Neural Radiance Field

Authors: Inwoo Hwang, Junho Kim, Young Min Kim

Abstract: We present Ev-NeRF, a Neural Radiance Field derived from event data. While event cameras can measure subtle brightness changes in high frame rates, the measurements in low lighting or extreme motion suffer from significant domain discrepancy with complex noise. As a result, the performance of event-based vision tasks does not transfer to challenging environments, where the event cameras are expected to thrive over normal cameras. We find that the multi-view consistency of NeRF provides a powerful self-supervision signal for eliminating the spurious measurements and extracting the consistent underlying structure despite highly noisy input. Instead of posed images of the original NeRF, the input to Ev-NeRF is the event measurements accompanied by the movements of the sensors. Using the loss function that reflects the measurement model of the sensor, Ev-NeRF creates an integrated neural volume that summarizes the unstructured and sparse data points captured for about 2-4 seconds. The generated neural volume can also produce intensity images from novel views with reasonable depth estimates, which can serve as a high-quality input to various vision-based tasks. Our results show that Ev-NeRF achieves competitive performance for intensity image reconstruction under extreme noise conditions and high-dynamic-range imaging.

Submitted 5 March, 2023; v1 submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted to WACV 2023

arXiv:2204.03811 [pdf]

Observation of Mesoscopic Clathrate Structures in Ethanol-Water Mixtures

Authors: Wei-Hao Hsu, Tzu-Chieh Yen, Chien-Chun Chen, Chih-Wen Yang, Chung-Kai Fang, Ing-Shouh Hwang

Abstract: Water-alcohol mixtures exhibit many abnormal physicochemical properties, the origins of which remain controversial. Here we use transmission electron microscopy (TEM), nanoparticle tracking analysis (NTA), and atomic force microscopy (AFM) to study ethanol-water mixtures. TEM reveals mesoscopic clathrate structures with water molecules forming a crystalline matrix hosting a high density of tiny cells. The presence of these mesoscopic clathrate structures is further supported by a refractive index of 1.27 ± 0.02 at 405 nm measured via NTA and the hydrophilic nature of the mesoscopic structures implied by AFM observations, explaining many long-standing puzzles related to water-alcohol mixtures.

Submitted 7 April, 2022; originally announced April 2022.

arXiv:2203.12247 [pdf, other]

Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition

Authors: Junho Kim, Inwoo Hwang, Young Min Kim

Abstract: We introduce Ev-TTA, a simple, effective test-time adaptation algorithm for event-based object recognition. While event cameras are proposed to provide measurements of scenes with fast motions or drastic illumination changes, many existing event-based recognition algorithms suffer from performance deterioration under extreme conditions due to significant domain shifts. Ev-TTA mitigates the severe domain gaps by fine-tuning the pre-trained classifiers during the test phase using loss functions inspired by the spatio-temporal characteristics of events. Since the event data is a temporal stream of measurements, our loss function enforces similar predictions for adjacent events to quickly adapt to the changed environment online. Also, we utilize the spatial correlations between two polarities of events to handle noise under extreme illumination, where different polarities of events exhibit distinctive noise distributions. Ev-TTA demonstrates a large amount of performance gain on a wide range of event-based object recognition tasks without extensive additional training. Our formulation can be successfully applied regardless of input representations and further extended into regression tasks. We expect Ev-TTA to provide the key technique to deploy event-based vision algorithms in challenging real-world applications where significant domain shift is inevitable.

Submitted 28 March, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

Comments: Accepted to CVPR 2022

arXiv:2110.11863 [pdf, ps, other]

Operator-valued rational functions

Authors: Raul E. Curto, In Sung Hwang, Woo Young Lee

Abstract: In this paper we show that every inner divisor of the operator-valued coordinate function, $zI_E$, is a Blaschke-Potapov factor. We also introduce a notion of operator-valued "rational" function and then show that $Δ$ is two-sided inner and rational if and only if it can be represented as a finite Blaschke-Potapov product; this extends to operator-valued functions the well-known result proved by V.P. Potapov for matrix-valued functions.

Submitted 22 October, 2021; originally announced October 2021.

MSC Class: 47

arXiv:2110.02509 [pdf, other]

Design and Implementation of 5.8GHz RF Wireless PowerTransfer System

Authors: Je Hyeon Park, Nguyen Minh Tran, Sa Il Hwang, Dong In Kim, Kae Won Choi

Abstract: In this paper, we present a 5.8 GHz radio-frequency (RF) wireless power transfer (WPT) system that consists of 64 transmit antennas and 16 receive antennas. Unlike the inductive or resonant coupling-based near-field WPT, RF WPT has a great advantage in powering low-power internet of things (IoT) devices with its capability of long-range wireless power transfer. We also propose a beam scanning algorithm that can effectively transfer the power no matter whether the receiver is located in the radiative near-field zone or far-field zone. The proposed beam scanning algorithm is verified with a real-life WPT testbed implemented by ourselves. By experiments, we confirm that the implemented 5.8 GHz RF WPT system is able to transfer 3.67 mW at a distance of 25 meters with the proposed beam scanning algorithm. Moreover, the results show that the proposed algorithm can effectively cover radiative near-field region differently from the conventional scanning schemes which are designed under the assumption of the far-field WPT.

Submitted 6 October, 2021; originally announced October 2021.

arXiv:2108.13022 [pdf]

doi 10.1002/adfm.202105992

Highly efficient nonvolatile magnetization switching and multi-level states by current in single van der Waals topological ferromagnet Fe3GeTe2

Authors: Kaixuan Zhang, Youjin Lee, Matthew J. Coak, Junghyun Kim, Suhan Son, Inho Hwang, Dong-Su Ko, Youngtek Oh, Insu Jeon, Dohun Kim, Changgan Zeng, Hyun-Woo Lee, Je-Geun Park

Abstract: Robust multi-level spin memory with the ability to write information electrically is a long-sought capability in spintronics, with great promise for applications. Here we achieve nonvolatile and highly energy-efficient magnetization switching in a single-material device formed of van-der-Waals topological ferromagnet Fe3GeTe2, whose magnetic information can be readily controlled by a tiny current. Furthermore, the switching current density and power dissipation are about 400 and 4000 times smaller than those of the existing spin-orbit-torque magnetic random access memory based on conventional magnet/heavy-metal systems. Most importantly, we also demonstrate multi-level states, switched by electrical current, which can dramatically enhance the information capacity density and reduce computing costs. Thus, our observations combine both high energy efficiency and large information capacity density in one device, showcasing the potential applications of the emerging field of van-der-Waals magnets in the field of spin memory and spintronics.

Submitted 30 August, 2021; originally announced August 2021.

Comments: Accepted by Advanced Functional Materials; 28 pages, 5 main figures, 4 supporting figures

Journal ref: Advanced Functional Materials 31, 2105992 (2021)

arXiv:2108.12111 [pdf]

doi 10.1002/adma.202004110

Gigantic current control of coercive field and magnetic memory based on nm-thin ferromagnetic van der Waals Fe3GeTe2

Authors: Kaixuan Zhang, Seungyun Han, Youjin Lee, Matthew J. Coak, Junghyun Kim, Inho Hwang, Suhan Son, Jeacheol Shin, Mijin Lim, Daegeun Jo, Kyoo Kim, Dohun Kim, Hyun-Woo Lee, Je-Geun Park

Abstract: Controlling magnetic states by a small current is essential for the next-generation of energy-efficient spintronic devices. However, it invariably requires considerable energy to change a magnetic ground state of intrinsically quantum nature governed by fundamental Hamiltonian, once stabilized below a phase transition temperature. We report that surprisingly an in-plane current can tune the magnetic state of nm-thin van der Waals ferromagnet Fe3GeTe2 from a hard magnetic state to a soft magnetic state. It is the direct demonstration of the current-induced substantial reduction of the coercive field. This surprising finding is possible because the in-plane current produces a highly unusual type of gigantic spin-orbit torque for Fe3GeTe2. And we further demonstrate a working model of a new nonvolatile magnetic memory based on the principle of our discovery in Fe3GeTe2, controlled by a tiny current. Our findings open up a new window of exciting opportunities for magnetic van der Waals materials with potentially huge impacts on the future development of spintronic and magnetic memory.

Submitted 1 September, 2021; v1 submitted 27 August, 2021; originally announced August 2021.

Comments: 61 pages, 4 main figures, 14 supporting figures

Journal ref: Advanced Materials 33, 2004110 (2021)

arXiv:2106.09774 [pdf, other]

doi 10.1103/PhysRevLett.128.036401

Unconventional hysteretic transition in a charge density wave

Authors: B. Q. Lv, Alfred Zong, D. Wu, A. V. Rozhkov, Boris V. Fine, Su-Di Chen, Makoto Hashimoto, Dong-Hui Lu, M. Li, Y. -B. Huang, Jacob P. C. Ruff, Donald A. Walko, Z. H. Chen, Inhui Hwang, Yifan Su, Xiaozhe Shen, Xirui Wang, Fei Han, Hoi Chun Po, Yao Wang, Pablo Jarillo-Herrero, Xijie Wang, Hua Zhou, Cheng-Jun Sun, Haidan Wen, et al. (3 additional authors not shown)

Abstract: Hysteresis underlies a large number of phase transitions in solids, giving rise to exotic metastable states that are otherwise inaccessible. Here, we report an unconventional hysteretic transition in a quasi-2D material, EuTe4. By combining transport, photoemission, diffraction, and x-ray absorption measurements, we observed that the hysteresis loop has a temperature width of more than 400 K, setting a record among crystalline solids. The transition has an origin distinct from known mechanisms, lying entirely within the incommensurate charge-density-wave (CDW) phase of EuTe4 with no change in the CDW modulation periodicity. We interpret the hysteresis as an unusual switching of the relative CDW phases in different layers, a phenomenon unique to quasi-2D compounds that is not present in either purely 2D or strongly-coupled 3D systems. Our findings challenge the established theories on metastable states in density wave systems, pushing the boundary of understanding hysteretic transitions in a broken-symmetry state.

Submitted 17 June, 2021; originally announced June 2021.

Journal ref: Phys. Rev. Lett. 128, 036401 (2022)

Showing 1–50 of 127 results for author: Hwang, I