-
Meent: Differentiable Electromagnetic Simulator for Machine Learning
Authors:
Yongha Kim,
Anthony W. Jung,
Sanmun Kim,
Kevin Octavian,
Doyoung Heo,
Chaejin Park,
Jeongmin Shin,
Sunghyun Nam,
Chanhyung Park,
Juho Park,
Sangjun Han,
Jinmyoung Lee,
Seolho Kim,
Min Seok Jang,
Chan Y. Park
Abstract:
Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures such as solar cells, semiconductor devices, image sensors, future displays and integrated photonic devices. Specifically, optics problems such as estimating semiconductor device structures and designing nanophotonic devices provide intriguing research topics with far-reaching real world impact. Traditional algorithms for such tasks require iteratively refining parameters through simulations, which often yield sub-optimal results due to the high computational cost of both the algorithms and EM simulations. Machine learning (ML) emerged as a promising candidate to mitigate these challenges, and the optics research community has increasingly adopted ML algorithms to obtain results surpassing classical methods across various tasks. To foster a synergistic collaboration between the optics and ML communities, it is essential to have an EM simulation software that is user-friendly for both research communities. To this end, we present Meent, an EM simulation software that employs rigorous coupled-wave analysis (RCWA). Developed in Python and equipped with automatic differentiation (AD) capabilities, Meent serves as a versatile platform for integrating ML into optics research and vice versa. To demonstrate its utility as a research platform, we present three applications of Meent: 1) generating a dataset for training a neural operator, 2) serving as an environment for the reinforcement learning of nanophotonic device optimization, and 3) providing a solution for inverse problems with gradient-based optimizers. These applications highlight Meent's potential to advance both EM simulation and ML methodologies. The code is available at https://github.com/kc-ml2/meent with the MIT license to promote the cross-pollination of ideas among academic researchers and industry practitioners.
Submitted 11 June, 2024;
originally announced June 2024.
-
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis
Authors:
Hyunjae Cho,
Junhyeok Lee,
Wonbin Jung
Abstract:
Non-autoregressive GAN-based neural vocoders are widely used due to their fast inference speed and high perceptual quality. However, they often suffer from audible artifacts such as tonal artifacts in their generated results. Therefore, we propose JenGAN, a new training strategy that involves stacking shifted low-pass filters to ensure the shift-equivariant property. This method helps prevent aliasing and reduce artifacts while preserving the model structure used during inference. In our experimental evaluation, JenGAN consistently enhances the performance of vocoder models, yielding significantly superior scores across the majority of evaluation metrics.
Submitted 10 June, 2024;
originally announced June 2024.
-
Learning to Compose: Improving Object Centric Learning by Injecting Compositionality
Authors:
Whie Jung,
Jaehoon Yoo,
Sungjin Ahn,
Seunghoon Hong
Abstract:
Learning compositional representation is a key aspect of object-centric learning as it enables flexible systematic generalization and supports complex visual reasoning. However, most of the existing approaches rely on an auto-encoding objective, while the compositionality is implicitly imposed by the architectural or algorithmic bias in the encoder. This misalignment between the auto-encoding objective and learning compositionality often results in failure to capture meaningful object representations. In this study, we propose a novel objective that explicitly encourages compositionality of the representations. Built upon the existing object-centric learning framework (e.g., slot attention), our method incorporates additional constraints that an arbitrary mixture of object representations from two images should be valid by maximizing the likelihood of the composite data. We demonstrate that incorporating our objective to the existing framework consistently improves the object-centric learning and enhances the robustness to the architectural choices.
Submitted 1 May, 2024;
originally announced May 2024.
-
HyperCLOVA X Technical Report
Authors:
Kang Min Yoo,
Jaegeun Han,
Sookyo In,
Heewon Jeon,
Jisu Jeong,
Jaewook Kang,
Hyunwook Kim,
Kyung-Min Kim,
Munhyong Kim,
Sungju Kim,
Donghyun Kwak,
Hanock Kwak,
Se Jung Kwon,
Bado Lee,
Dongsoo Lee,
Gichang Lee,
Jooho Lee,
Baeseong Park,
Seongjin Shin,
Joonsang Yu,
Seolki Baek,
Sumin Byeon,
Eungsup Cho,
Dooseok Choe,
Jeesung Han
, et al. (371 additional authors not shown)
Abstract:
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
Submitted 13 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Goal-Reaching Trajectory Design Near Danger with Piecewise Affine Reach-avoid Computation
Authors:
Long Kiu Chung,
Wonsuhk Jung,
Chuizheng Kong,
Shreyas Kousik
Abstract:
Autonomous mobile robots must maintain safety, but should not sacrifice performance, leading to the classical reach-avoid problem: find a trajectory that is guaranteed to reach a goal and avoid obstacles. This paper addresses the near danger case, also known as a narrow gap, where the agent starts near the goal, but must navigate through tight obstacles that block its path. The proposed method builds off the common approach of using a simplified planning model to generate plans, which are then tracked using a high-fidelity tracking model and controller. Existing approaches use reachability analysis to overapproximate the error between these models and ensure safety, but doing so introduces numerical approximation error conservativeness that prevents goal-reaching. The present work instead proposes a Piecewise Affine Reach-avoid Computation (PARC) method to tightly approximate the reachable set of the planning model. PARC significantly reduces conservativeness through a careful choice of the planning model and set representation, along with an effective approach to handling time-varying tracking errors. The utility of this method is demonstrated through extensive numerical experiments in which PARC outperforms state-of-the-art reach avoid methods in near-danger goal reaching. Furthermore, in a simulated demonstration, PARC enables the generation of provably-safe extreme vehicle dynamics drift parking maneuvers. A preliminary hardware demo on a TurtleBot3 also validates the method.
Submitted 28 May, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Domain Adaptive Imitation Learning with Visual Observation
Authors:
Sungho Choi,
Seungyul Han,
Woojun Kim,
Jongseong Chae,
Whiyoung Jung,
Youngchul Sung
Abstract:
In this paper, we consider domain-adaptive imitation learning with visual observation, where an agent in a target domain learns to perform a task by observing expert demonstrations in a source domain. Domain adaptive imitation learning arises in practical scenarios where a robot, receiving visual sensory data, needs to mimic movements by visually observing other robots from different angles or observing robots of different shapes. To overcome the domain shift in cross-domain imitation learning with visual observation, we propose a novel framework for extracting domain-independent behavioral features from input observations that can be used to train the learner, based on dual feature extraction and image reconstruction. Empirical results demonstrate that our approach outperforms previous algorithms for imitation learning from visual observation with domain shift.
Submitted 1 December, 2023;
originally announced December 2023.
-
Constraint-Guided Online Data Selection for Scalable Data-Driven Safety Filters in Uncertain Robotic Systems
Authors:
Jason J. Choi,
Fernando Castañeda,
Wonsuhk Jung,
Bike Zhang,
Claire J. Tomlin,
Koushil Sreenath
Abstract:
As the use of autonomous robotic systems expands in tasks that are complex and challenging to model, the demand for robust data-driven control methods that can certify safety and stability in uncertain conditions is increasing. However, the practical implementation of these methods often faces scalability issues due to the growing amount of data points with system complexity, and a significant reliance on high-quality training data. In response to these challenges, this study presents a scalable data-driven controller that efficiently identifies and infers from the most informative data points for implementing data-driven safety filters. Our approach is grounded in the integration of a model-based certificate function-based method and Gaussian Process (GP) regression, reinforced by a novel online data selection algorithm that reduces time complexity from quadratic to linear relative to dataset size. Empirical evidence, gathered from successful real-world cart-pole swing-up experiments and simulated locomotion of a five-link bipedal robot, demonstrates the efficacy of our approach. Our findings reveal that our efficient online data selection algorithm, which strategically selects key data points, enhances the practicality and efficiency of data-driven certifying filters in complex robotic systems, significantly mitigating scalability concerns inherent in nonparametric learning-based control methods.
Submitted 23 November, 2023;
originally announced November 2023.
-
Hierarchical Joint Graph Learning and Multivariate Time Series Forecasting
Authors:
Juhyeon Kim,
Hyungeun Lee,
Seungwon Yu,
Ung Hwang,
Wooyul Jung,
Miseon Park,
Kijung Yoon
Abstract:
Multivariate time series is prevalent in many scientific and industrial domains. Modeling multivariate signals is challenging due to their long-range temporal dependencies and intricate interactions--both direct and indirect. To confront these complexities, we introduce a method of representing multivariate signals as nodes in a graph with edges indicating interdependency between them. Specifically, we leverage graph neural networks (GNN) and attention mechanisms to efficiently learn the underlying relationships within the time series data. Moreover, we suggest employing hierarchical signal decompositions running over the graphs to capture multiple spatial dependencies. The effectiveness of our proposed model is evaluated across various real-world benchmark datasets designed for long-term forecasting tasks. The results consistently showcase the superiority of our model, achieving an average 23% reduction in mean squared error (MSE) compared to existing models.
Submitted 30 November, 2023; v1 submitted 21 November, 2023;
originally announced November 2023.
-
Thermal-Infrared Remote Target Detection System for Maritime Rescue based on Data Augmentation with 3D Synthetic Data
Authors:
Sungjin Cheong,
Wonho Jung,
Yoon Seop Lim,
Yong-Hwa Park
Abstract:
This paper proposes a thermal-infrared (TIR) remote target detection system for maritime rescue using deep learning and data augmentation. We established a self-collected TIR dataset consisting of multiple scenes imitating human rescue situations using a TIR camera (FLIR). Additionally, to address dataset scarcity and improve model robustness, a synthetic dataset from a 3D game (ARMA3) is further collected to augment the data. However, a significant domain gap exists between synthetic TIR and real TIR images. Hence, a proper domain adaptation algorithm is essential to overcome the gap. Therefore, we suggest a domain adaptation algorithm in a target-background separated manner from 3D game-to-real, based on a generative model, to address this issue. Furthermore, a segmentation network with fixed-weight kernels at the head is proposed to improve the signal-to-noise ratio (SNR) and provide weak attention, as remote TIR targets inherently suffer from unclear boundaries. Experiment results reveal that the network trained on augmented data consisting of translated synthetic and real TIR data outperforms that trained on only real TIR data by a large margin. Furthermore, the proposed segmentation model surpasses the performance of state-of-the-art segmentation methods.
Submitted 31 October, 2023;
originally announced October 2023.
-
Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption
Authors:
Jaiyoung Park,
Donghwan Kim,
Jongmin Kim,
Sangpyo Kim,
Wonkyung Jung,
Jung Hee Cheon,
Jung Ho Ahn
Abstract:
Incorporating fully homomorphic encryption (FHE) into the inference process of a convolutional neural network (CNN) draws enormous attention as a viable approach for achieving private inference (PI). FHE allows delegating the entire computation process to the server while ensuring the confidentiality of sensitive client-side data. However, practical FHE implementation of a CNN faces significant hurdles, primarily due to FHE's substantial computational and memory overhead. To address these challenges, we propose a set of optimizations, which includes GPU/ASIC acceleration, an efficient activation function, and an optimized packing scheme. We evaluate our method using the ResNet models on the CIFAR-10 and ImageNet datasets, achieving several orders of magnitude improvement compared to prior work and reducing the latency of the encrypted CNN inference to 1.4 seconds on an NVIDIA A100 GPU. We also show that the latency drops to a mere 0.03 seconds with a custom hardware design.
Submitted 25 October, 2023;
originally announced October 2023.
-
Exploring the Impact of Corpus Diversity on Financial Pretrained Language Models
Authors:
Jaeyoung Choe,
Keonwoong Noh,
Nayeon Kim,
Seyun Ahn,
Woohwan Jung
Abstract:
Over the past few years, various domain-specific pretrained language models (PLMs) have been proposed and have outperformed general-domain PLMs in specialized areas such as biomedical, scientific, and clinical domains. In addition, financial PLMs have been studied because of the high economic impact of financial data analysis. However, we found that financial PLMs were not pretrained on sufficiently diverse financial data. This lack of diverse training data leads to a subpar generalization performance, resulting in general-purpose PLMs, including BERT, often outperforming financial PLMs on many downstream tasks. To address this issue, we collected a broad range of financial corpus and trained the Financial Language Model (FiLM) on these diverse datasets. Our experimental results confirm that FiLM outperforms not only existing financial PLMs but also general domain PLMs. Furthermore, we provide empirical evidence that this improvement can be achieved even for unseen corpus groups.
Submitted 20 October, 2023;
originally announced October 2023.
-
Enhancing Low-resource Fine-grained Named Entity Recognition by Leveraging Coarse-grained Datasets
Authors:
Su Ah Lee,
Seokjin Oh,
Woohwan Jung
Abstract:
Named Entity Recognition (NER) frequently suffers from the problem of insufficient labeled data, particularly in fine-grained NER scenarios. Although $K$-shot learning techniques can be applied, their performance tends to saturate when the number of annotations exceeds several tens of labels. To overcome this problem, we utilize existing coarse-grained datasets that offer a large number of annotations. A straightforward approach to address this problem is pre-finetuning, which employs coarse-grained data for representation learning. However, it cannot directly utilize the relationships between fine-grained and coarse-grained entities, although a fine-grained entity type is likely to be a subcategory of a coarse-grained entity type. We propose a fine-grained NER model with a Fine-to-Coarse (F2C) mapping matrix to leverage the hierarchical structure explicitly. In addition, we present an inconsistency filtering method to eliminate coarse-grained entities that are inconsistent with fine-grained entity types to avoid performance degradation. Our experimental results show that our method outperforms both $K$-shot learning and supervised learning methods when dealing with a small number of fine-grained annotations.
Submitted 13 November, 2023; v1 submitted 18 October, 2023;
originally announced October 2023.
-
A Quantitatively Interpretable Model for Alzheimer's Disease Prediction Using Deep Counterfactuals
Authors:
Kwanseok Oh,
Da-Woon Heo,
Ahmad Wisnu Mulyadi,
Wonsik Jung,
Eunsong Kang,
Kun Ho Lee,
Heung-Il Suk
Abstract:
Deep learning (DL) for predicting Alzheimer's disease (AD) has provided timely intervention in disease progression yet still demands attentive interpretability to explain how their DL models make definitive decisions. Recently, counterfactual reasoning has gained increasing attention in medical research because of its ability to provide a refined visual explanatory map. However, such visual explanatory maps based on visual inspection alone are insufficient unless we intuitively demonstrate their medical or neuroscientific validity via quantitative features. In this study, we synthesize the counterfactual-labeled structural MRIs using our proposed framework and transform it into a gray matter density map to measure its volumetric changes over the parcellated region of interest (ROI). We also devised a lightweight linear classifier to boost the effectiveness of constructed ROIs, promoted quantitative interpretation, and achieved comparable predictive performance to DL methods. Throughout this, our framework produces an "AD-relatedness index" for each ROI and offers an intuitive understanding of brain status for an individual patient and across patient groups with respect to AD progression.
Submitted 5 October, 2023;
originally announced October 2023.
-
EAG-RS: A Novel Explainability-guided ROI-Selection Framework for ASD Diagnosis via Inter-regional Relation Learning
Authors:
Wonsik Jung,
Eunjin Jeon,
Eunsong Kang,
Heung-Il Suk
Abstract:
Deep learning models based on resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used to diagnose brain diseases, particularly autism spectrum disorder (ASD). Existing studies have leveraged the functional connectivity (FC) of rs-fMRI, achieving notable classification performance. However, they have significant limitations, including the lack of adequate information while using linear low-order FC as inputs to the model, not considering individual characteristics (i.e., different symptoms or varying stages of severity) among patients with ASD, and the non-explainability of the decision process. To cover these limitations, we propose a novel explainability-guided region of interest (ROI) selection (EAG-RS) framework that identifies non-linear high-order functional associations among brain regions by leveraging an explainable artificial intelligence technique and selects class-discriminative regions for brain disease identification. The proposed framework includes three steps: (i) inter-regional relation learning to estimate non-linear relations through random seed-based network masking, (ii) explainable connection-wise relevance score estimation to explore high-order relations between functional connections, and (iii) non-linear high-order FC-based diagnosis-informative ROI selection and classifier learning to identify ASD. We validated the effectiveness of our proposed method by conducting experiments using the Autism Brain Imaging Database Exchange (ABIDE) dataset, demonstrating that the proposed method outperforms other comparative methods in terms of various evaluation metrics. Furthermore, we qualitatively analyzed the selected ROIs and identified ASD subtypes linked to previous neuroscientific studies.
Submitted 5 October, 2023;
originally announced October 2023.
-
Deep Geometric Learning with Monotonicity Constraints for Alzheimer's Disease Progression
Authors:
Seungwoo Jeong,
Wonsik Jung,
Junghyo Sohn,
Heung-Il Suk
Abstract:
Alzheimer's disease (AD) is a devastating neurodegenerative condition that precedes progressive and irreversible dementia; thus, predicting its progression over time is vital for clinical diagnosis and treatment. Numerous studies have implemented structural magnetic resonance imaging (MRI) to model AD progression, focusing on three integral aspects: (i) temporal variability, (ii) incomplete observations, and (iii) temporal geometric characteristics. However, deep learning-based approaches regarding data variability and sparsity have yet to consider inherent geometrical properties sufficiently. The ordinary differential equation-based geometric modeling method (ODE-RGRU) has recently emerged as a promising strategy for modeling time-series data by intertwining a recurrent neural network and an ODE in Riemannian space. Despite its achievements, ODE-RGRU encounters limitations when extrapolating positive definite symmetric metrics from incomplete samples, leading to feature reverse occurrences that are particularly problematic, especially within the clinical facet. Therefore, this study proposes a novel geometric learning approach that models longitudinal MRI biomarkers and cognitive scores by combining three modules: topological space shift, ODE-RGRU, and trajectory estimation. We have also developed a training algorithm that integrates manifold mapping with monotonicity constraints to reflect measurement transition irreversibility. We verify our proposed method's efficacy by predicting clinical labels and cognitive scores over time in regular and irregular settings. Furthermore, we thoroughly analyze our proposed framework through an ablation study.
Submitted 5 October, 2023;
originally announced October 2023.
-
Data Augmentation for Neural Machine Translation using Generative Language Model
Authors:
Seokjin Oh,
Su Ah Lee,
Woohwan Jung
Abstract:
Despite the rapid growth in model architecture, the scarcity of large parallel corpora remains the main bottleneck in Neural Machine Translation. Data augmentation is a technique that enhances the performance of data-hungry models by generating synthetic data instead of collecting new ones. We explore prompt-based data augmentation approaches that leverage large-scale language models such as ChatGPT. To create a synthetic parallel corpus, we compare 3 methods using different prompts. We employ two assessment metrics to measure the diversity of the generated synthetic data. This approach requires no further model training cost, which is mandatory in other augmentation methods like back-translation. The proposed method improves the unaugmented baseline by 0.68 BLEU score.
Submitted 13 November, 2023; v1 submitted 25 July, 2023;
originally announced July 2023.
-
OCELOT: Overlapped Cell on Tissue Dataset for Histopathology
Authors:
Jeongun Ryu,
Aaron Valero Puche,
JaeWoong Shin,
Seonwook Park,
Biagio Brattoli,
Jinhee Lee,
Wonkyung Jung,
Soo Ick Cho,
Kyunghyun Paeng,
Chan-Young Ock,
Donggeun Yoo,
Sérgio Pereira
Abstract:
Cell detection is a fundamental task in computational pathology that can be used for extracting high-level medical information from whole-slide images. For accurate cell detection, pathologists often zoom out to understand the tissue-level structures and zoom in to classify cells based on their morphology and the surrounding context. However, there is a lack of efforts to reflect such behaviors by pathologists in the cell detection models, mainly due to the lack of datasets containing both cell and tissue annotations with overlapping regions. To overcome this limitation, we propose and publicly release OCELOT, a dataset purposely dedicated to the study of cell-tissue relationships for cell detection in histopathology. OCELOT provides overlapping cell and tissue annotations on images acquired from multiple organs. Within this setting, we also propose multi-task learning approaches that benefit from learning both cell and tissue tasks simultaneously. When compared against a model trained only for the cell detection task, our proposed approaches improve cell detection performance on 3 datasets: proposed OCELOT, public TIGER, and internal CARP datasets. On the OCELOT test set in particular, we show up to 6.79 improvement in F1-score. We believe the contributions of this paper, including the release of the OCELOT dataset at https://lunit-io.github.io/research/publications/ocelot are a crucial starting point toward the important research direction of incorporating cell-tissue relationships in computational pathology.
Submitted 23 March, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning
Authors:
Woojun Kim,
Whiyoung Jung,
Myungsik Cho,
Youngchul Sung
Abstract:
In this paper, we propose a new mutual information framework for multi-agent reinforcement learning to enable multiple agents to learn coordinated behaviors by regularizing the accumulated return with the simultaneous mutual information between multi-agent actions. By introducing a latent variable to induce nonzero mutual information between multi-agent actions and applying a variational bound, we derive a tractable lower bound on the considered maximum mutual information (MMI)-regularized objective function. The derived tractable objective can be interpreted as maximum entropy reinforcement learning combined with uncertainty reduction of other agents' actions. Applying policy iteration to maximize the derived lower bound, we propose a practical algorithm named variational maximum mutual information multi-agent actor-critic (VM3-AC), which follows centralized learning with decentralized execution. We evaluated VM3-AC on several games requiring coordination, and numerical results show that VM3-AC outperforms other MARL algorithms in multi-agent tasks requiring high-quality coordination.
Submitted 1 March, 2023;
originally announced March 2023.
-
PITS: Variational Pitch Inference without Fundamental Frequency for End-to-End Pitch-controllable TTS
Authors:
Junhyeok Lee,
Wonbin Jung,
Hyunjae Cho,
Jaeyeon Kim,
Jaehwan Kim
Abstract:
Previous pitch-controllable text-to-speech (TTS) models rely on directly modeling fundamental frequency, leading to low variance in synthesized speech. To address this issue, we propose PITS, an end-to-end pitch-controllable TTS model that utilizes variational inference to model pitch. Based on VITS, PITS incorporates the Yingram encoder, the Yingram decoder, and adversarial training of pitch-shifted synthesis to achieve pitch-controllability. Experiments demonstrate that PITS generates high-quality speech that is indistinguishable from ground truth speech and has high pitch-controllability without quality degradation. Code, audio samples, and demo are available at https://github.com/anonymous-pits/pits.
Submitted 6 June, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?
Authors:
Sang-Woo Lee,
Sungdong Kim,
Donghyeon Ko,
Donghoon Ham,
Youngki Hong,
Shin Ah Oh,
Hyunhoon Jung,
Wangkyo Jung,
Kyunghyun Cho,
Donghyun Kwak,
Hyungsuk Noh,
Woomyoung Park
Abstract:
Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task. A series of approaches based on this framework achieved remarkable success on various TOD benchmarks. However, we argue that the current TOD benchmarks are limited to serving as surrogates for real-world scenarios and that the current TOD models are still a long way from covering these scenarios. In this position paper, we first identify the current status and limitations of SF-TOD systems. After that, we explore the WebTOD framework, an alternative direction for building a scalable TOD system when a web/mobile interface is available. In WebTOD, the dialogue system learns how to understand the web/mobile interface that the human agent interacts with, powered by a large-scale language model.
Submitted 24 May, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Quantile Constrained Reinforcement Learning: A Reinforcement Learning Framework Constraining Outage Probability
Authors:
Whiyoung Jung,
Myungsik Cho,
Jongeui Park,
Youngchul Sung
Abstract:
Constrained reinforcement learning (RL) is an area of RL whose objective is to find an optimal policy that maximizes expected cumulative return while satisfying a given constraint. Most previous constrained RL works consider the expected cumulative sum cost as the constraint. However, optimization with this constraint cannot guarantee a target probability for the outage event in which the cumulative sum cost exceeds a given threshold. This paper proposes a framework, named Quantile Constrained RL (QCRL), to constrain the quantile of the distribution of the cumulative sum cost, which is a necessary and sufficient condition for satisfying the outage constraint. This is the first work that tackles the issue of applying the policy gradient theorem to the quantile and provides theoretical results for approximating the gradient of the quantile. Based on the derived theoretical results and the technique of the Lagrange multiplier, we construct a constrained RL algorithm named Quantile Constrained Policy Optimization (QCPO). We use distributional RL with the Large Deviation Principle (LDP) to estimate quantiles and tail probability of the cumulative sum cost for the implementation of QCPO. The implemented algorithm satisfies the outage probability constraint after the training period.
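The equivalence between an outage-probability constraint and a quantile constraint mentioned in the abstract above can be illustrated on a finite sample of cumulative costs. The following is only a minimal sketch, not the QCPO algorithm: the function names, the empirical (lower) quantile definition, and the sample data are all assumptions made for illustration.

```python
def empirical_quantile(costs, level):
    """Smallest sample whose empirical CDF value is at least `level`
    (the lower-quantile convention, assumed here for illustration)."""
    ordered = sorted(costs)
    n = len(ordered)
    for i, x in enumerate(ordered):
        if (i + 1) / n >= level:
            return x
    return ordered[-1]


def outage_probability(costs, threshold):
    """Fraction of episodes whose cumulative cost exceeds `threshold`."""
    return sum(c > threshold for c in costs) / len(costs)


def outage_constraint_ok(costs, threshold, eps):
    # Outage form: P(cost > threshold) <= eps
    return outage_probability(costs, threshold) <= eps


def quantile_constraint_ok(costs, threshold, eps):
    # Quantile form: the (1 - eps)-quantile of the cost is <= threshold
    return empirical_quantile(costs, 1.0 - eps) <= threshold


# Hypothetical cumulative costs from ten episodes.
costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for threshold in (7, 8):
    print(threshold,
          outage_constraint_ok(costs, threshold, 0.25),
          quantile_constraint_ok(costs, threshold, 0.25))
```

On this sample both forms agree for each threshold, which is the sense in which constraining the quantile stands in for constraining the outage probability.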
Submitted 27 November, 2022;
originally announced November 2022.
-
PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping
Authors:
Junhyeok Lee,
Seungu Han,
Hyunjae Cho,
Wonbin Jung
Abstract:
Previous generative adversarial network (GAN)-based neural vocoders are trained to reconstruct the exact ground truth waveform from the paired mel-spectrogram and do not consider the one-to-many relationship of speech synthesis. This conventional training causes overfitting for both the discriminators and the generator, leading to the periodicity artifacts in the generated audio signal. In this work, we present PhaseAug, the first differentiable augmentation for speech synthesis that rotates the phase of each frequency bin to simulate one-to-many mapping. With our proposed method, we outperform baselines without any architecture modification. Code and audio samples will be available at https://github.com/mindslab-ai/phaseaug.
Submitted 13 March, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions
Authors:
Fernando Castañeda,
Jason J. Choi,
Wonsuhk Jung,
Bike Zhang,
Claire J. Tomlin,
Koushil Sreenath
Abstract:
Learning-based control schemes have recently shown great efficacy performing complex tasks for a wide variety of applications. However, in order to deploy them in real systems, it is of vital importance to guarantee that the system will remain safe during online training and execution. Among the currently most popular methods to tackle this challenge, Control Barrier Functions (CBFs) serve as mathematical tools that provide a formal safety-preserving control synthesis procedure for systems with known dynamics. In this paper, we first introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers using Gaussian Process (GP) regression to bridge the gap between an approximate mathematical model and the real system. Compared to previous approaches, we study the feasibility of the resulting robust safety-critical controller. This feasibility analysis results in a set of richness conditions that the available information about the system should satisfy to guarantee that a safe control action can be found at all times. We then use these conditions to devise an event-triggered online data collection strategy that ensures the recursive feasibility of the learned safety-critical controller. Our proposed methodology endows the system with the ability to reason at all times about whether the current information at its disposal is enough to ensure safety or if new measurements are required. This, in turn, allows us to provide formal results of forward invariance of a safe set with high probability, even in a priori unexplored regions. Finally, we validate the proposed framework in numerical simulations of an adaptive cruise control system and a kinematic vehicle.
Submitted 26 September, 2023; v1 submitted 23 August, 2022;
originally announced August 2022.
-
XADLiME: eXplainable Alzheimer's Disease Likelihood Map Estimation via Clinically-guided Prototype Learning
Authors:
Ahmad Wisnu Mulyadi,
Wonsik Jung,
Kwanseok Oh,
Jee Seok Yoon,
Heung-Il Suk
Abstract:
Diagnosing Alzheimer's disease (AD) involves a deliberate diagnostic process owing to its innate traits of irreversibility with subtle and gradual progression. These characteristics make AD biomarker identification from structural brain imaging (e.g., structural MRI) scans quite challenging. Furthermore, there is a high possibility of getting entangled with normal aging. We propose a novel deep-learning approach through eXplainable AD Likelihood Map Estimation (XADLiME) for AD progression modeling over 3D sMRIs using clinically-guided prototype learning. Specifically, we establish a set of topologically-aware prototypes onto the clusters of latent clinical features, uncovering an AD spectrum manifold. We then measure the similarities between latent clinical features and well-established prototypes, estimating a "pseudo" likelihood map. By considering this pseudo map as an enriched reference, we employ an estimating network to estimate the AD likelihood map over a 3D sMRI scan. Additionally, we promote the explainability of such a likelihood map by revealing a comprehensible overview from two perspectives: clinical and morphological. During the inference, this estimated likelihood map served as a substitute over unseen sMRI scans for effectively conducting the downstream task while providing thorough explainable states.
Submitted 26 July, 2022;
originally announced July 2022.
-
MAC-DO: An Efficient Output-Stationary GEMM Accelerator for CNNs Using DRAM Technology
Authors:
Minki Jeong,
Wanyeong Jung
Abstract:
DRAM-based in-situ accelerators have shown their potential in addressing the memory wall challenge of the traditional von Neumann architecture. Such accelerators exploit charge sharing or logic circuits for simple logic operations at the DRAM subarray level. However, their throughput is limited due to low array utilization, as only a few row cells in a DRAM array participate in operations while most rows remain deactivated. Moreover, they require many cycles for more complex operations such as a multi-bit multiply-accumulate (MAC) operation, resulting in significant data access and movement and potentially worsening power efficiency. To overcome these limitations, this paper presents MAC-DO, an efficient and low-power DRAM-based in-situ accelerator. Compared to previous DRAM-based in-situ accelerators, a MAC-DO cell, consisting of two 1T1C DRAM cells (two transistors and two capacitors), innately supports a multi-bit MAC operation within a single cycle, ensuring good linearity and compatibility with existing 1T1C DRAM cells and array structures. This achievement is facilitated by a novel analog computation method utilizing charge steering. Additionally, MAC-DO enables concurrent individual MAC operations in each MAC-DO cell without idle cells, significantly improving throughput and energy efficiency. As a result, a MAC-DO array can efficiently accelerate matrix multiplications based on output stationary mapping, supporting the majority of computations performed in deep neural networks (DNNs). Furthermore, a MAC-DO array efficiently reuses three types of data (input, weight and output), minimizing data movement.
Submitted 7 February, 2024; v1 submitted 16 July, 2022;
originally announced July 2022.
-
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
Authors:
Hyunjae Cho,
Wonbin Jung,
Junhyeok Lee,
Sang Hoon Woo
Abstract:
In this paper, we present SANE-TTS, a stable and natural end-to-end multilingual TTS model. Because it is difficult to obtain a multilingual corpus for a given speaker, training a multilingual TTS model with monolingual corpora is unavoidable. We introduce a speaker regularization loss that improves speech naturalness during cross-lingual synthesis as well as domain adversarial training, which is applied in other multilingual TTS models. Furthermore, with the speaker regularization loss added, replacing the speaker embedding with a zero vector in the duration predictor stabilizes cross-lingual inference. With this replacement, our model generates speech with moderate rhythm regardless of the source speaker in cross-lingual synthesis. In MOS evaluation, SANE-TTS achieves a naturalness score above 3.80 in both cross-lingual and intralingual synthesis, where the ground truth score is 3.99. Also, SANE-TTS maintains speaker similarity close to that of ground truth even in cross-lingual inference. Audio samples are available on our web page.
Submitted 24 June, 2022;
originally announced June 2022.
-
MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
Authors:
Jeewon Jeon,
Woojun Kim,
Whiyoung Jung,
Youngchul Sung
Abstract:
In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse reward. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from experience replay buffer. Under the widely used assumption of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both individual Q-value and total Q-value. Then, MASER designs an individual intrinsic reward for each agent based on actionable representation relevant to Q-learning so that the agents reach their subgoals while maximizing the joint action value. Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.
Submitted 20 June, 2022;
originally announced June 2022.
-
Robust Imitation Learning against Variations in Environment Dynamics
Authors:
Jongseong Chae,
Seungyul Han,
Whiyoung Jung,
Myungsik Cho,
Sungho Choi,
Youngchul Sung
Abstract:
In this paper, we propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed. The existing IL framework trained in a single environment can catastrophically fail with perturbations in environment dynamics because it does not capture the situation in which the underlying environment dynamics can change. Our framework effectively deals with environments with varying dynamics by imitating multiple experts in sampled environment dynamics to enhance the robustness to general variations in environment dynamics. In order to robustly imitate the multiple sample experts, we minimize the risk with respect to the Jensen-Shannon divergence between the agent's policy and each of the sample experts. Numerical results show that our algorithm significantly improves robustness against dynamics perturbations compared to conventional IL baselines.
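For readers unfamiliar with the Jensen-Shannon divergence named in the abstract above, the following is a minimal sketch of that divergence for discrete distributions. It is not the paper's training objective: the helper names are hypothetical, and aggregating over experts with a worst-case `max` is an assumption made purely for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def worst_case_js(agent, experts):
    """Largest divergence between the agent and any expert.
    The max aggregation is an illustrative assumption."""
    return max(js_divergence(agent, e) for e in experts)


# Hypothetical action distributions in a single state.
agent = [0.5, 0.5]
experts = [[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]
print(worst_case_js(agent, experts))
```

The divergence is symmetric, zero for identical distributions, and bounded above by log 2 (natural logarithm) when the two distributions have disjoint support.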
Submitted 18 June, 2022;
originally announced June 2022.
-
Quantifying the topic disparity of scientific articles
Authors:
Munjung Kim,
Jisung Yoon,
Woo-Sung Jung,
Hyunuk Kim
Abstract:
Citation count is a popular index for assessing scientific papers. However, it depends on not only the quality of a paper but also various factors, such as conventionality, team size, and gender. Here, we examine the extent to which the conventionality of a paper is related to its citation percentile in a discipline by using our measure, topic disparity. The topic disparity is the cosine distance between a paper and its discipline on a neural embedding space. Using this measure, we show that the topic disparity is negatively associated with the citation percentile in many disciplines, even after controlling for team size and the genders of the first and last authors. This result indicates that less conventional research tends to receive fewer citations than conventional research. Our proposed method can be used to complement the raw citation counts and to recommend papers at the periphery of a discipline because of their less conventional topics.
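The cosine-distance definition of topic disparity in the abstract above can be sketched in a few lines. This is a hedged illustration only: the abstract does not say how a discipline is represented in the embedding space, so taking the centroid of its papers' vectors is an assumption, and the function names and toy vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def topic_disparity(paper_vec, discipline_vecs):
    """Cosine distance between a paper and its discipline, with the
    discipline represented by a centroid (an assumption for illustration)."""
    return cosine_distance(paper_vec, centroid(discipline_vecs))


# Toy 2-D embeddings: the discipline clusters along the first axis.
discipline = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.2]]
print(topic_disparity([1.0, 0.1], discipline))  # conventional paper, small value
print(topic_disparity([0.1, 1.0], discipline))  # peripheral paper, larger value
```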
Submitted 8 February, 2022; v1 submitted 8 February, 2022;
originally announced February 2022.
-
Quantifying knowledge synchronisation in the 21st century
Authors:
Jisung Yoon,
Jinseo Park,
Jinhyuk Yun,
Woo-Sung Jung
Abstract:
Humans acquire and accumulate knowledge through language usage and eagerly exchange their knowledge for advancement. Although geographical barriers had previously limited communication, the emergence of information technology has opened new avenues for knowledge exchange. However, it is unclear which communication pathway is dominant in the 21st century. Here, we explore the dominant path of knowledge diffusion in the 21st century using Wikipedia, the largest communal dataset. We evaluate the similarity of shared knowledge between population groups, distinguished based on their language usage. When population groups are more engaged with each other, their knowledge structure is more similar, where engagement is indicated by socioeconomic connections, such as cultural, linguistic, and historical features. Moreover, geographical proximity is no longer a critical requirement for knowledge dissemination. Furthermore, we integrate our data into a mechanistic model to better understand the underlying mechanism and suggest that the knowledge "Silk Road" of the 21st century is based online.
Submitted 3 February, 2022;
originally announced February 2022.
-
AESPA: Accuracy Preserving Low-degree Polynomial Activation for Fast Private Inference
Authors:
Jaiyoung Park,
Michael Jaemin Kim,
Wonkyung Jung,
Jung Ho Ahn
Abstract:
Hybrid private inference (PI) protocol, which synergistically utilizes both multi-party computation (MPC) and homomorphic encryption, is one of the most prominent techniques for PI. However, even the state-of-the-art PI protocols are bottlenecked by the non-linear layers, especially the activation functions. Although a standard non-linear activation function can generate higher model accuracy, it must be processed via a costly garbled-circuit MPC primitive. A polynomial activation can be processed via Beaver's multiplication triples MPC primitive but has been incurring severe accuracy drops so far.
In this paper, we propose an accuracy-preserving low-degree polynomial activation function (AESPA) that exploits the Hermite expansion of the ReLU and basis-wise normalization. We apply AESPA to popular ML models, such as VGGNet, ResNet, and pre-activation ResNet, to show an inference accuracy comparable to those of the standard models with ReLU activation, achieving superior accuracy over prior low-degree polynomial studies. When applied to the all-ReLU baseline on the state-of-the-art Delphi PI protocol, AESPA shows up to 42.1x and 28.3x lower online latency and communication cost, respectively.
Submitted 18 February, 2022; v1 submitted 17 January, 2022;
originally announced January 2022.
-
BTS: An Accelerator for Bootstrappable Fully Homomorphic Encryption
Authors:
Sangpyo Kim,
Jongmin Kim,
Michael Jaemin Kim,
Wonkyung Jung,
Minsoo Rhu,
John Kim,
Jung Ho Ahn
Abstract:
Homomorphic encryption (HE) enables the secure offloading of computations to the cloud by providing computation on encrypted data (ciphertexts). HE is based on noisy encryption schemes in which noise accumulates as more computations are applied to the data. The limited number of operations applicable to the data prevents practical applications from exploiting HE. Bootstrapping enables an unlimited number of operations, or fully HE (FHE), by refreshing the ciphertext. Unfortunately, bootstrapping requires a significant amount of additional computation and memory bandwidth as well. Prior works have proposed hardware accelerators for computation primitives of FHE. However, to the best of our knowledge, this is the first work to propose a hardware FHE accelerator that supports bootstrapping as a first-class citizen.
In particular, we propose BTS - Bootstrappable, Technology-driven, Secure accelerator architecture for FHE. We identify the challenges of supporting bootstrapping in the accelerator and analyze the off-chip memory bandwidth and computation required. In particular, given the limitations of modern memory technology, we identify the HE parameter sets that are efficient for FHE acceleration. Based on the insights gained from our analysis, we propose BTS, which effectively exploits the parallelism innate in HE operations by arranging a massive number of processing elements in a grid. We present the design and microarchitecture of BTS, including a network-on-chip design that exploits a deterministic communication pattern. BTS shows 5,556x and 1,306x improved execution time on ResNet-20 and logistic regression over a CPU, with a chip area of 373.6mm^2 and up to 163.2W of power.
Submitted 28 April, 2022; v1 submitted 31 December, 2021;
originally announced December 2021.
-
From augmented microscopy to the topological transformer: a new approach in cell image analysis for Alzheimer's research
Authors:
Wooseok Jung
Abstract:
Cell image analysis is crucial in Alzheimer's research to detect the presence of A$β$ protein inhibiting cell function. Deep learning speeds up the process by making only low-level data sufficient for fruitful inspection. We first found Unet is most suitable in augmented microscopy by comparing performance in multi-class semantic segmentation. We develop the augmented microscopy method to capture nuclei in a brightfield image and the transformer using the Unet model to convert an input image into a sequence of topological information. The performance regarding Intersection-over-Union is consistent concerning the choice of image preprocessing and ground-truth generation. Training the model with data of a specific cell type demonstrates that transfer learning applies to some extent.
The topological transformer aims to extract persistence silhouettes or landscape signatures containing geometric information of a given image of cells. This feature extraction facilitates studying an image as a collection of one-dimensional data, substantially reducing computational costs. Using the transformer, we attempt grouping cell images by their cell type relying solely on topological features. Performances of the transformers followed by SVM, XGBoost, LGBM, and simple convolutional neural network classifiers are inferior to the conventional image classification. However, since this research initiates a new perspective in biomedical research by combining deep learning and topology for image analysis, we speculate follow-up investigation will reinforce our genuine regime.
Submitted 3 August, 2021;
originally announced August 2021.
-
Heavily Augmented Sound Event Detection utilizing Weak Predictions
Authors:
Hyeonuk Nam,
Byeong-Yun Ko,
Gyeong-Tae Lee,
Seong-Hu Kim,
Won-Ho Jung,
Sang-Min Choi,
Yong-Hwa Park
Abstract:
The performance of Sound Event Detection (SED) systems is greatly limited by the difficulty of generating large strongly labeled datasets. In this work, we used two main approaches to overcome the lack of strongly labeled data. First, we applied heavy data augmentation on input features. Data augmentation methods used include not only conventional methods used in speech/audio domains but also our proposed method named FilterAugment. Second, we propose two methods to utilize weak predictions to enhance weakly supervised SED performance. As a result, we obtained the best PSDS1 of 0.4336 and best PSDS2 of 0.8161 on the DESED real validation dataset. This work is submitted to DCASE 2021 Task4 and is ranked in 3rd place. Code available: https://github.com/frednam93/FilterAugSED.
Submitted 14 September, 2021; v1 submitted 8 July, 2021;
originally announced July 2021.
-
Disturbance of questionable publishing to academia
Authors:
Taekho You,
Jinseo Park,
June Young Lee,
Jinhyuk Yun,
Woo-Sung Jung
Abstract:
Questionable publications have been accused of "greedy" practices; however, their influence on academia has not been gauged. Here, we probe the impact of questionable publications through a systematic and comprehensive analysis with various participants from academia and compare the results with those of their unaccused counterparts using billions of citation records, including liaisons, i.e., journals and publishers, and prosumers, i.e., authors. Questionable publications attribute publisher-level self-citations to their journals while limiting journal-level self-citations; yet, conventional journal-level metrics are unable to detect these publisher-level self-citations. We propose a hybrid journal-publisher metric for detecting self-favouring citations among QJs from publishers. Additionally, we demonstrate that the questionable publications were less disruptive and influential than their counterparts. Our findings indicate an inflated citation impact of suspicious academic publishers. The findings provide a basis for actionable policy-making against questionable publications.
Submitted 19 April, 2022; v1 submitted 29 June, 2021;
originally announced June 2021.
-
DeepAuditor: Distributed Online Intrusion Detection System for IoT devices via Power Side-channel Auditing
Authors:
Woosub Jung,
Yizhou Feng,
Sabbir Ahmed Khan,
Chunsheng Xin,
Danella Zhao,
Gang Zhou
Abstract:
As the number of IoT devices has increased rapidly, IoT botnets have exploited the vulnerabilities of IoT devices. However, it is still challenging to detect the initial intrusion on IoT devices prior to massive attacks. Recent studies have utilized power side-channel information to identify this intrusion behavior on IoT devices but still lack accurate models in real-time for ubiquitous botnet detection.
We proposed the first online intrusion detection system, called DeepAuditor, for IoT devices via power auditing. To develop the real-time system, we proposed a lightweight power auditing device called Power Auditor. We also designed a distributed CNN classifier for online inference in a laboratory setting. To prevent data leakage and reduce networking redundancy, we then proposed a privacy-preserved inference protocol via Packed Homomorphic Encryption and a sliding window protocol in our system. The classification accuracy and processing time were measured, and the proposed classifier outperformed a baseline classifier, especially against unseen patterns. We also demonstrated that the distributed CNN design is secure against any distributed components. Overall, the measurements demonstrated the feasibility of our real-time distributed system for intrusion detection on IoT devices.
Submitted 9 May, 2022; v1 submitted 23 June, 2021;
originally announced June 2021.
-
The latent structure of global scientific development
Authors:
Lili Miao,
Dakota Murray,
Woo-Sung Jung,
Vincent Larivière,
Cassidy R. Sugimoto,
Yong-Yeol Ahn
Abstract:
Science is essential to innovation and economic prosperity. Although studies have shown that national scientific development is affected by geographic, historic, and economic factors, it remains unclear whether there are universal structures and trajectories of national scientific development that can inform forecasting and policymaking. Here, by examining countries' scientific 'exports' (publications that are indexed in international databases), we reveal a three-cluster structure in the relatedness network of disciplines that underpin national scientific development and the organization of global science. Tracing the evolution of national research portfolios reveals that while nations are individually moving toward more diverse research profiles, scientific production has become increasingly specialized in global science over the past decades. By uncovering the underlying structure of scientific development and connecting it with economic development, our results may offer a new perspective on the evolution of global science.
Submitted 30 March, 2022; v1 submitted 21 April, 2021;
originally announced April 2021.
-
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction
Authors:
Wonkwang Lee,
Whie Jung,
Han Zhang,
Ting Chen,
Jing Yu Koh,
Thomas Huang,
Hyungsuk Yoon,
Honglak Lee,
Seunghoon Hong
Abstract:
Learning to predict the long-term future of video frames is notoriously challenging due to inherent ambiguities in the distant future and dramatic amplifications of prediction error through time. Despite the recent advances in the literature, existing approaches are limited to moderately short-term prediction (less than a few seconds), while extrapolating it to a longer future quickly leads to destruction in structure and content. In this work, we revisit hierarchical models in video prediction. Our method predicts future frames by first estimating a sequence of semantic structures and subsequently translating the structures to pixels by video-to-video translation. Despite the simplicity, we show that modeling structures and their dynamics in the discrete semantic structure space with a stochastic recurrent estimator leads to surprisingly successful long-term prediction. We evaluate our method on three challenging datasets involving car driving and human dancing, and demonstrate that it can generate complicated scene structures and motions over a very long time horizon (i.e., thousands of frames), setting a new standard of video prediction with orders of magnitude longer prediction time than existing approaches. Full videos and code are available at https://1konny.github.io/HVP/.
Submitted 14 April, 2021;
originally announced April 2021.
-
Fine-Grained Attention for Weakly Supervised Object Localization
Authors:
Junghyo Sohn,
Eunjin Jeon,
Wonsik Jung,
Eunsong Kang,
Heung-Il Suk
Abstract:
Although recent advances in deep learning have accelerated improvement in the weakly supervised object localization (WSOL) task, it remains challenging to identify the entire body of an object, rather than only discriminative parts. In this paper, we propose a novel residual fine-grained attention (RFGA) module that autonomously excites the less activated regions of an object by utilizing information distributed over channels and locations within feature maps in combination with a residual operation. To be specific, we devise a series of mechanisms of triple-view attention representation, attention expansion, and feature calibration. Unlike other attention-based WSOL methods that learn a coarse attention map, having the same values across elements in feature maps, our proposed RFGA learns fine-grained values in an attention map by assigning different attention values for each of the elements. We validated the superiority of our proposed RFGA module by comparing it with the recent methods in the literature over three datasets. Further, we analyzed the effect of each mechanism in our RFGA and visualized attention maps to get insights.
Submitted 11 April, 2021;
originally announced April 2021.
-
DIFFnet: Diffusion parameter mapping network generalized for input diffusion gradient schemes and bvalues
Authors:
Juhung Park,
Woojin Jung,
Eun-Jung Choi,
Se-Hong Oh,
Dongmyung Shin,
Hongjun An,
Jongho Lee
Abstract:
In MRI, deep neural networks have been proposed to reconstruct diffusion model parameters. However, the inputs of the networks were designed for a specific diffusion gradient scheme (i.e., diffusion gradient directions and numbers) and a specific b-value that are the same as the training data. In this study, a new deep neural network, referred to as DIFFnet, is developed to function as a generalized reconstruction tool of the diffusion-weighted signals for various gradient schemes and b-values. For generalization, diffusion signals are normalized in a q-space and then projected and quantized, producing a matrix (Qmatrix) as an input for the network. To demonstrate the validity of this approach, DIFFnet is evaluated for diffusion tensor imaging (DIFFnetDTI) and for neurite orientation dispersion and density imaging (DIFFnetNODDI). In each model, two datasets with different gradient schemes and b-values are tested. The results demonstrate accurate reconstruction of the diffusion parameters at substantially reduced processing time (approximately 8.7 times and 2240 times faster processing time than conventional methods in DTI and NODDI, respectively; less than 4% mean normalized root-mean-square errors (NRMSE) in DTI and less than 8% in NODDI). The generalization capability of the networks was further validated using reduced numbers of diffusion signals from the datasets. Different from previously proposed deep neural networks, DIFFnet does not require any specific gradient scheme and b-value for its input. As a result, it can be adopted as an online reconstruction tool for various complex diffusion imaging.
Submitted 4 February, 2021;
originally announced February 2021.
-
Dynamical prediction of two meteorological factors using the deep neural network and the long short term memory $(1)$
Authors:
Ki Hong Shin,
Jae Won Jung,
Sung Kyu Seo,
Cheol Hwan You,
Dong In Lee,
Jisun Lee,
Ki Ho Chang,
Woon Seon Jung,
Kyungsik Kim
Abstract:
It is important to calculate and analyze temperature and humidity prediction accuracies in quantitative meteorological forecasting. This study adapts existing neural network methods to improve predictive accuracy. To this end, we analyze and explore the predictive accuracy and performance of neural networks using two combined meteorological factors (temperature and humidity). Simulated studies are performed by applying the artificial neural network (ANN), deep neural network (DNN), extreme learning machine (ELM), long short-term memory (LSTM), and long short-term memory with peephole connections (LSTM-PC) machine learning methods, and the prediction accuracy of each method is compared with that of the others. Data are extracted from low-frequency time series of ten metropolitan cities of South Korea from March 2014 to February 2020 to validate our observations. In testing the robustness of the methods, LSTM is found to outperform the other four methods in predictive accuracy. In particular, in the testing results, the temperature prediction of LSTM in summer in Tongyeong has a root mean squared error (RMSE) value of 0.866, lower than that of the other neural network methods, while the mean absolute percentage error (MAPE) value of LSTM for humidity prediction is 5.525 in summer in Mokpo, significantly better than in other metropolitan cities.
Submitted 16 January, 2021;
originally announced January 2021.
-
Unsupervised embedding of trajectories captures the latent structure of scientific migration
Authors:
Dakota Murray,
Jisung Yoon,
Sadamori Kojaku,
Rodrigo Costas,
Woo-Sung Jung,
Staša Milojević,
Yong-Yeol Ahn
Abstract:
Human migration and mobility drive major societal phenomena including epidemics, economies, innovation, and the diffusion of ideas. Although human mobility and migration have been heavily constrained by geographic distance throughout history, advances and globalization are making other factors such as language and culture increasingly important. Advances in neural embedding models, originally designed for natural language, provide an opportunity to tame this complexity and open new avenues for the study of migration. Here, we demonstrate the ability of the model word2vec to encode nuanced relationships between discrete locations from migration trajectories, producing an accurate, dense, continuous, and meaningful vector-space representation. The resulting representation provides a functional distance between locations, as well as a digital double that can be distributed, re-used, and itself interrogated to understand the many dimensions of migration. We show that the unique power of word2vec to encode migration patterns stems from its mathematical equivalence with the gravity model of mobility. Focusing on the case of scientific migration, we apply word2vec to a database of three million migration trajectories of scientists derived from the affiliations listed on their publication records. Using techniques that leverage its semantic structure, we demonstrate that embeddings can learn the rich structure that underpins scientific migration, such as cultural, linguistic, and prestige relationships at multiple levels of granularity. Our results provide a theoretical foundation and methodological framework for using neural embeddings to represent and understand migration both within and beyond science.
Submitted 17 November, 2023; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Accelerating Number Theoretic Transformations for Bootstrappable Homomorphic Encryption on GPUs
Authors:
Sangpyo Kim,
Wonkyung Jung,
Jaiyoung Park,
Jung Ho Ahn
Abstract:
Homomorphic encryption (HE) draws huge attention as it provides a way of privacy-preserving computations on encrypted messages. Number Theoretic Transform (NTT), a specialized form of Discrete Fourier Transform (DFT) in the finite field of integers, is the key algorithm that enables fast computation on encrypted ciphertexts in HE. Prior works have accelerated NTT and its inverse transformation on a popular parallel processing platform, GPU, by leveraging DFT optimization techniques. However, these GPU-based studies lack a comprehensive analysis of the primary differences between NTT and DFT or only consider small HE parameters that have tight constraints in the number of arithmetic operations that can be performed without decryption. In this paper, we analyze the algorithmic characteristics of NTT and DFT and assess the performance of NTT when we apply the optimizations that are commonly applicable to both DFT and NTT on modern GPUs. From the analysis, we identify that NTT suffers from severe main-memory bandwidth bottleneck on large HE parameter sets. To tackle the main-memory bandwidth issue, we propose a novel NTT-specific on-the-fly root generation scheme dubbed on-the-fly twiddling (OT). Compared to the baseline radix-2 NTT implementation, after applying all the optimizations, including OT, we achieve 4.2x speedup on a modern GPU.
Submitted 3 December, 2020;
originally announced December 2020.
-
Dual Supervision Framework for Relation Extraction with Distant Supervision and Human Annotation
Authors:
Woohwan Jung,
Kyuseok Shim
Abstract:
Relation extraction (RE) has been extensively studied due to its importance in real-world applications such as knowledge base construction and question answering. Most existing works train models on either distantly supervised data or human-annotated data. To take advantage of the high accuracy of human annotation and the low cost of distant supervision, we propose the dual supervision framework, which effectively utilizes both types of data. However, simply combining the two types of data to train an RE model may decrease the prediction accuracy since distant supervision has labeling bias. We employ two separate prediction networks, HA-Net and DS-Net, to predict the labels by human annotation and distant supervision, respectively, to prevent the degradation of accuracy by the incorrect labeling of distant supervision. Furthermore, we propose an additional loss term called disagreement penalty to enable HA-Net to learn from distantly supervised labels. In addition, we exploit additional networks to adaptively assess the labeling bias by considering contextual information. Our performance study on sentence-level and document-level REs confirms the effectiveness of the dual supervision framework.
Submitted 23 November, 2020;
originally announced November 2020.
-
CyCNN: A Rotation Invariant CNN using Polar Mapping and Cylindrical Convolution Layers
Authors:
Jinpyo Kim,
Wooekun Jung,
Hyungmo Kim,
Jaejin Lee
Abstract:
Deep Convolutional Neural Networks (CNNs) are empirically known to be invariant to moderate translation but not to rotation in image classification. This paper proposes a deep CNN model, called CyCNN, which exploits polar mapping of input images to convert rotation to translation. To deal with the cylindrical property of the polar coordinates, we replace convolution layers in conventional CNNs with cylindrical convolutional (CyConv) layers. A CyConv layer exploits the cylindrically sliding windows (CSW) mechanism that vertically extends the input-image receptive fields of boundary units in a convolutional layer. We evaluate CyCNN and conventional CNN models for classification tasks on rotated MNIST, CIFAR-10, and SVHN datasets. We show that if there is no data augmentation during training, CyCNN significantly improves classification accuracies when compared to conventional CNN models. Our implementation of CyCNN is publicly available at https://github.com/mcrl/CyCNN.
Submitted 21 July, 2020;
originally announced July 2020.
-
Persona2vec: A Flexible Multi-role Representations Learning Framework for Graphs
Authors:
Jisung Yoon,
Kai-Cheng Yang,
Woo-Sung Jung,
Yong-Yeol Ahn
Abstract:
Graph embedding techniques, which learn low-dimensional representations of a graph, are achieving state-of-the-art performance in many graph mining tasks. Most existing embedding algorithms assign a single vector to each node, implicitly assuming that a single representation is enough to capture all characteristics of the node. However, across many domains, it is common to observe pervasively overlapping community structure, where most nodes belong to multiple communities, playing different roles depending on the contexts. Here, we propose persona2vec, a graph embedding framework that efficiently learns multiple representations of nodes based on their structural contexts. Using link prediction-based evaluation, we show that our framework is significantly faster than the existing state-of-the-art model while achieving better performance.
Submitted 21 October, 2020; v1 submitted 4 June, 2020;
originally announced June 2020.
-
A Maximum Mutual Information Framework for Multi-Agent Reinforcement Learning
Authors:
Woojun Kim,
Whiyoung Jung,
Myungsik Cho,
Youngchul Sung
Abstract:
In this paper, we propose a maximum mutual information (MMI) framework for multi-agent reinforcement learning (MARL) to enable multiple agents to learn coordinated behaviors by regularizing the accumulated return with the mutual information between actions. By introducing a latent variable to induce nonzero mutual information between actions and applying a variational bound, we derive a tractable lower bound on the considered MMI-regularized objective function. Applying policy iteration to maximize the derived lower bound, we propose a practical algorithm named variational maximum mutual information multi-agent actor-critic (VM3-AC), which follows centralized learning with decentralized execution (CTDE). We evaluated VM3-AC for several games requiring coordination, and numerical results show that VM3-AC outperforms MADDPG and other MARL algorithms in multi-agent tasks requiring coordination.
Submitted 4 June, 2020;
originally announced June 2020.
-
Deep Recurrent Model for Individualized Prediction of Alzheimer's Disease Progression
Authors:
Wonsik Jung,
Eunji Jun,
Heung-Il Suk
Abstract:
Alzheimer's disease (AD) is known as one of the major causes of dementia and is characterized by slow progression over several years, with no treatments or available medicines. In this regard, there have been efforts to identify the risk of developing AD at its earliest stage. While many previous works considered cross-sectional analysis, more recent studies have focused on the diagnosis and prognosis of AD with longitudinal or time series data by way of disease progression modeling (DPM). Under the same problem settings, in this work, we propose a novel computational framework that can predict the phenotypic measurements of MRI biomarkers and trajectories of clinical status along with cognitive scores at multiple future time points. However, in handling time series data, one generally faces many unexpected missing observations. To address this unfavorable situation, we define a secondary problem of estimating those missing values and tackle it in a systematic way by taking account of temporal and multivariate relations inherent in time series data. Concretely, we propose a deep recurrent network that jointly tackles the four problems of (i) missing value imputation, (ii) phenotypic measurements forecasting, (iii) trajectory estimation of the cognitive score, and (iv) clinical status prediction of a subject based on his/her longitudinal imaging biomarkers. Notably, the learnable model parameters of our network are trained in an end-to-end manner with our circumspectly defined loss function. In our experiments over the TADPOLE challenge cohort, we measured performance for various metrics and compared our method to competing methods in the literature. Exhaustive analyses and ablation studies were also conducted to better confirm the effectiveness of our method.
Submitted 27 August, 2020; v1 submitted 6 May, 2020;
originally announced May 2020.
-
HEAAN Demystified: Accelerating Fully Homomorphic Encryption Through Architecture-centric Analysis and Optimization
Authors:
Wonkyung Jung,
Eojin Lee,
Sangpyo Kim,
Keewoo Lee,
Namhoon Kim,
Chohong Min,
Jung Hee Cheon,
Jung Ho Ahn
Abstract:
Homomorphic Encryption (HE) draws significant attention as a privacy-preserving approach to cloud computing because it allows computation on encrypted messages called ciphertexts. Among the numerous HE schemes proposed, HE for Arithmetic of Approximate Numbers (HEAAN) is rapidly gaining popularity across a wide range of applications because it supports messages that can tolerate approximate computation with no limit on the number of arithmetic operations applicable to the corresponding ciphertexts. A critical shortcoming of HE is the high computational complexity of ciphertext arithmetic; in particular, HE multiplication (HE Mul) is more than 10,000 times slower than the corresponding multiplication between unencrypted messages. This has led to a large body of HE acceleration studies, including ones exploiting FPGAs; however, those did not conduct a rigorous analysis of the computational complexity and data access patterns of HE Mul. Moreover, the proposals mostly focused on designs with small parameter sizes, making it difficult to accurately estimate their performance in conducting a series of complex arithmetic operations. In this paper, we first describe how HE Mul of HEAAN is performed in a manner friendly to computer architects. Then we conduct a disciplined analysis of its computational and memory access characteristics, through which we (1) extract parallelism in the key functions composing HE Mul and (2) demonstrate how to effectively map the parallelism to the popular parallel processing platforms, multicore CPUs and GPUs, by applying a series of optimization techniques such as transposing matrices and pinning data to threads. This leads to a performance improvement of HE Mul on a CPU and a GPU by 42.9x and 134.1x, respectively, over the single-thread reference HEAAN running on a CPU. The conducted analysis and optimization would set a new foundation for future HE acceleration research.
Submitted 9 March, 2020;
originally announced March 2020.
-
Population-Guided Parallel Policy Search for Reinforcement Learning
Authors:
Whiyoung Jung,
Giseung Park,
Youngchul Sung
Abstract:
In this paper, a new population-guided parallel learning scheme is proposed to enhance the performance of off-policy reinforcement learning (RL). In the proposed scheme, multiple identical learners with their own value functions and policies share a common experience replay buffer and search for a good policy in collaboration, with the guidance of the best policy information. The key point is that the information of the best policy is fused in a soft manner by constructing an augmented loss function for policy update to enlarge the overall search region by the multiple learners. The guidance by the previous best policy and the enlarged range enable faster and better policy search. Monotone improvement of the expected cumulative return by the proposed scheme is proved theoretically. Working algorithms are constructed by applying the proposed scheme to the twin delayed deep deterministic (TD3) policy gradient algorithm. Numerical results show that the constructed algorithm outperforms most of the current state-of-the-art RL algorithms, and the gain is significant in the case of sparse reward environments.
Submitted 9 January, 2020;
originally announced January 2020.