-
AI-driven, Model-Free Current Control: A Deep Symbolic Approach for Optimal Induction Machine Performance
Authors:
Muhammad Usama,
Yunkyung Hwang,
Jaehong Kim
Abstract:
This paper proposes a straightforward and efficient current control solution for induction machines employing deep symbolic regression (DSR). The proposed DSR-based control design offers a simple yet highly effective approach by creating an optimal control model through training and fitting, resulting in an analytical dynamic numerical expression that characterizes the data. Notably, this approach not only produces an understandable model but also demonstrates the capacity to extrapolate and estimate data points outside its training dataset, showcasing its adaptability and resilience. In contrast to conventional state-of-the-art proportional-integral (PI) current controllers, which rely heavily on specific system models, the proposed DSR-based approach stands out for its model independence. Simulation and experimental tests validate its effectiveness, highlighting its superior extrapolation capabilities compared to conventional methods. These findings pave the way for the integration of deep learning methods in power conversion applications, promising improved performance and adaptability in the control of induction machines. Simulation and experimental test results obtained with a 3.7 kW induction machine are provided to verify the efficacy of the proposed control solution.
Submitted 13 May, 2024;
originally announced May 2024.
-
Deep Learning-based Synthetic High-Resolution In-Depth Imaging Using an Attachable Dual-element Endoscopic Ultrasound Probe
Authors:
Hah Min Lew,
Jae Seong Kim,
Moon Hwan Lee,
Jaegeun Park,
Sangyeon Youn,
Hee Man Kim,
Jihun Kim,
Jae Youn Hwang
Abstract:
Endoscopic ultrasound (EUS) imaging has a trade-off between resolution and penetration depth. By considering the in-vivo characteristics of human organs, it is necessary to provide clinicians with appropriate hardware specifications for precise diagnosis. Recently, super-resolution (SR) ultrasound imaging studies, including the SR task in deep learning fields, have been reported for enhancing ultrasound images. However, most of those studies did not consider the nature of ultrasound imaging; rather, they applied conventional SR techniques based on downsampling of ultrasound images. In this study, we propose a novel deep learning-based high-resolution in-depth imaging probe capable of offering low- and high-frequency ultrasound image pairs. We developed an attachable dual-element EUS probe with customized low- and high-frequency ultrasound transducers under small hardware constraints. We also designed a special geared structure so that both transducers image the same plane. The proposed system was evaluated with a wire phantom and a tissue-mimicking phantom. After the evaluation, 442 ultrasound image pairs from the tissue-mimicking phantom were acquired. We then applied several deep learning models to obtain synthetic high-resolution in-depth images, thus demonstrating the feasibility of our approach for unmet clinical needs. Furthermore, we quantitatively and qualitatively analyzed the results to find a suitable deep-learning model for our task. The obtained results demonstrate that our proposed dual-element EUS probe with an image-to-image translation network has the potential to provide synthetic high-frequency ultrasound images deep inside tissues.
Submitted 13 September, 2023;
originally announced September 2023.
-
Unsupervised detection of small hyperreflective features in ultrahigh resolution optical coherence tomography
Authors:
Marcel Reimann,
Jungeun Won,
Hiroyuki Takahashi,
Antonio Yaghy,
Yunchan Hwang,
Stefan Ploner,
Junhong Lin,
Jessica Girgis,
Kenneth Lam,
Siyu Chen,
Nadia K. Waheed,
Andreas Maier,
James G. Fujimoto
Abstract:
Recent advances in optical coherence tomography, such as the development of high-speed, ultrahigh-resolution scanners and corresponding signal processing techniques, may reveal new potential biomarkers in retinal diseases. Newly visible features are, for example, small hyperreflective specks in age-related macular degeneration. Identifying these new markers is crucial for investigating potential associations with disease progression and treatment outcomes. Therefore, it is necessary to reliably detect these features in 3D volumetric scans. Because manual labeling of entire volumes is infeasible, a need for automatic detection arises. Labeled datasets are often not publicly available, and there are usually large variations in scan protocols and scanner types. Thus, this work focuses on an unsupervised approach that is based on local peak detection and random walker segmentation to detect small features on each B-scan of the volume.
Submitted 26 March, 2023;
originally announced March 2023.
-
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents
Authors:
Suhee Jo,
Younggun Lee,
Yookyung Shin,
Yeongtae Hwang,
Taesu Kim
Abstract:
In recent years, emotional text-to-speech has shown considerable progress. However, it requires a large amount of labeled data, which is not easily accessible. Even if it is possible to acquire an emotional speech dataset, there is still a limitation in controlling emotion intensity. In this work, we propose a novel method for cross-speaker emotion transfer and manipulation using vector arithmetic in latent style space. By leveraging only a few labeled samples, we generate emotional speech from reading-style speech without losing the speaker identity. Furthermore, emotion strength is readily controllable using a scalar value, providing an intuitive way for users to manipulate speech. Experimental results show the proposed method affords superior performance in terms of expressiveness, naturalness, and controllability, preserving speaker identity.
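The idea of vector arithmetic in a latent style space can be illustrated with a minimal Python sketch. This is not the authors' model: the function names, the use of a mean-difference vector as the emotion direction, and the toy two-dimensional latents are all assumptions made only for illustration. The scalar `strength` plays the role of the emotion-strength control mentioned in the abstract.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def emotion_direction(emotional_latents, neutral_latents):
    """Hypothetical emotion direction: mean emotional latent minus mean neutral latent."""
    mu_emotional = mean_vector(emotional_latents)
    mu_neutral = mean_vector(neutral_latents)
    return [e - n for e, n in zip(mu_emotional, mu_neutral)]


def apply_emotion(style_latent, direction, strength):
    """Shift a speaker's style latent along the emotion direction by a scalar strength."""
    return [s + strength * d for s, d in zip(style_latent, direction)]


# Toy 2-D latents (assumed values, not data from the paper).
emotional = [[1.0, 2.0], [3.0, 2.0]]   # a few labeled emotional samples
neutral = [[0.0, 1.0], [2.0, 1.0]]     # reading-style samples
direction = emotion_direction(emotional, neutral)

target_style = [0.5, -0.5]             # latent of a speaker with no emotional data
shifted = apply_emotion(target_style, direction, strength=2.0)
```

Setting `strength` to 0 leaves the target speaker's latent unchanged, and larger values move it further along the assumed emotion direction.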
Submitted 14 March, 2023;
originally announced March 2023.
-
Retinal blood flow speed quantification at the capillary level using temporal autocorrelation fitting OCTA
Authors:
Yunchan Hwang,
Jungeun Won,
Antonio Yaghy,
Hiroyuki Takahashi,
Jessica M. Girgis,
Kenneth Lam,
Siyu Chen,
Eric M. Moult,
Stefan B. Ploner,
Andreas Maier,
Nadia K. Waheed,
James G. Fujimoto
Abstract:
Optical coherence tomography angiography (OCTA) can visualize vasculature structures, but provides limited information about blood flow speeds. Here, we present a second-generation variable interscan time analysis (VISTA) OCTA, which evaluates a quantitative surrogate marker for blood flow speed in vasculature. At the capillary level, spatially compiled OCTA and a simple temporal autocorrelation model, ρ(τ) = exp(-ατ), were used to evaluate a temporal autocorrelation decay constant, α, as the blood flow speed marker. A 600 kHz A-scan rate swept-source provides short interscan time OCTA and fine A-scan spacing acquisition, while maintaining multi-mm² fields of view for human retinal imaging. We demonstrate the cardiac pulsatility and repeatability of α measured with VISTA. We show different α for different retinal capillary plexuses in healthy eyes and present representative VISTA OCTA of eyes with diabetic retinopathy.
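As a rough illustration of how a decay constant α could be estimated from the model ρ(τ) = exp(-ατ), the Python sketch below uses a log-linear least-squares fit through the origin. The abstract does not state the fitting procedure, so this estimator, the function name, and the synthetic interscan times and values are assumptions, not the authors' method.

```python
import math


def fit_decay_constant(taus, rhos):
    """Least-squares estimate of alpha in rho(tau) = exp(-alpha * tau).

    Taking logs gives ln(rho) = -alpha * tau, a line through the origin,
    so alpha = -sum(tau * ln(rho)) / sum(tau ** 2).
    """
    if len(taus) != len(rhos) or not taus:
        raise ValueError("taus and rhos must be non-empty and of equal length")
    if any(r <= 0 for r in rhos):
        raise ValueError("autocorrelation values must be positive to take logs")
    numerator = sum(t * math.log(r) for t, r in zip(taus, rhos))
    denominator = sum(t * t for t in taus)
    if denominator == 0:
        raise ValueError("at least one interscan time must be non-zero")
    return -numerator / denominator


# Synthetic example: assumed interscan times (ms) and a noiseless decay
# generated with alpha = 0.8 per ms. These are not measured values.
taus = [1.0, 2.0, 3.0, 4.0]
rhos = [math.exp(-0.8 * t) for t in taus]
alpha_hat = fit_decay_constant(taus, rhos)
```

With noisy measurements, a nonlinear fit on ρ itself may behave differently from this log-domain fit, because taking logs reweights the small autocorrelation values.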
Submitted 22 February, 2023;
originally announced February 2023.
-
Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS
Authors:
Yookyung Shin,
Younggun Lee,
Suhee Jo,
Yeongtae Hwang,
Taesu Kim
Abstract:
Expressive text-to-speech has shown improved performance in recent years. However, the style control of synthetic speech is often restricted to discrete emotion categories and requires training data recorded by the target speaker in the target style. In many practical situations, users may not have reference speech recorded in the target emotion but may still be interested in controlling speech style just by typing a text description of the desired emotional style. In this paper, we propose a text-based interface for emotional style control and cross-speaker style transfer in multi-speaker TTS. We propose a bi-modal style encoder which models the semantic relationship between text description embedding and speech style embedding with a pretrained language model. To further improve cross-speaker style transfer on disjoint, multi-style datasets, we propose a novel style loss. The experimental results show that our model can generate high-quality expressive speech even in unseen styles.
Submitted 13 July, 2022;
originally announced July 2022.
-
Seamless Accurate Positioning in Deep Urban Area based on Mode Switching Between DGNSS and Multipath Mitigation Positioning
Authors:
Yongjun Lee,
Yoola Hwang,
Jae Young Ahn,
Jiwon Seo,
Byungwoon Park
Abstract:
Multipath and non-line-of-sight (NLOS) signals are the major causes of poor accuracy of a global navigation satellite system (GNSS) in urban areas. Despite the wide usage of the GNSS in populated urban areas, it is difficult to suggest a generalized method because multipath errors are user-specific errors that cannot be eliminated by the DGNSS or a real-time kinematic technique. This paper introduces a real-time multipath estimation and mitigation technique, which considers compensation for the time offset between constellations. It also presents a mode-switching algorithm between the DGNSS and multipath mitigating mode and shows that this technique can be effectively utilized for automobiles in a deep urban environment without any help from sensors other than GNSS. The availability is improved from 64% to 100% and the error RMS is reduced from 11.1 m to 1.2 m on Teheran-ro, Seoul, Korea. Because this method does not require prior information or additional sensor implementation for high-positioning performance in deep urban areas, it is expected to gain wide usage in not only the automotive industry but also future intelligent transportation systems.
Submitted 9 June, 2022;
originally announced June 2022.
-
Explainable Deep Learning Algorithm for Distinguishing Incomplete Kawasaki Disease by Coronary Artery Lesions on Echocardiographic Imaging
Authors:
Haeyun Lee,
Yongsoon Eun,
Jae Youn Hwang,
Lucy Youngmin Eun
Abstract:
Background and Objective: Incomplete Kawasaki disease (KD) has often been misdiagnosed due to a lack of the clinical manifestations of classic KD. However, it is associated with a markedly higher prevalence of coronary artery lesions. Identifying coronary artery lesions by echocardiography is important for the timely diagnosis of and favorable outcomes in KD. Moreover, similar to KD, coronavirus disease 2019, currently causing a worldwide pandemic, also manifests with fever; therefore, it is crucial at this moment that KD should be distinguished clearly among the febrile diseases in children. In this study, we aimed to validate a deep learning algorithm for classification of KD and other acute febrile diseases.
Methods: We obtained coronary artery images by echocardiography of children (n = 88 for KD; n = 65 for pneumonia). We trained six deep learning networks (VGG19, Xception, ResNet50, ResNext50, SE-ResNet50, and SE-ResNext50) using the collected data.
Results: SE-ResNext50 showed the best performance in terms of accuracy, specificity, and precision in the classification. SE-ResNext50 offered a precision of 76.35%, a sensitivity of 82.64%, and a specificity of 58.12%.
Conclusions: The results of our study suggested that deep learning algorithms have similar performance to an experienced cardiologist in detecting coronary artery lesions to facilitate the diagnosis of KD.
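For readers less familiar with the metrics reported in the Results, the following Python sketch shows how precision, sensitivity, and specificity follow from a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, sensitivity, and specificity from binary confusion-matrix counts.

    tp/fn count positive cases (e.g. KD) classified correctly/incorrectly;
    tn/fp count negative cases (e.g. other febrile disease) classified
    correctly/incorrectly.
    """
    if tp + fp == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("each metric needs a non-zero denominator")
    return {
        "precision": tp / (tp + fp),      # of predicted positives, fraction correct
        "sensitivity": tp / (tp + fn),    # of actual positives, fraction found
        "specificity": tn / (tn + fp),    # of actual negatives, fraction rejected
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=8, fp=4, tn=6, fn=2)
```

A classifier can combine high sensitivity with low specificity, as in the figures quoted above, when it labels many negative cases as positive.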
Submitted 5 April, 2022;
originally announced April 2022.
-
Best Practices and Scoring System on Reviewing A.I. based Medical Imaging Papers: Part 1 Classification
Authors:
Timothy L. Kline,
Felipe Kitamura,
Ian Pan,
Amine M. Korchi,
Neil Tenenholtz,
Linda Moy,
Judy Wawira Gichoya,
Igor Santos,
Steven Blumer,
Misha Ysabel Hwang,
Kim-Ann Git,
Abishek Shroff,
Elad Walach,
George Shih,
Steve Langer
Abstract:
With the recent advances in A.I. methodologies and their application to medical imaging, there has been an explosion of related research programs utilizing these techniques to produce state-of-the-art classification performance. Ultimately, these research programs culminate in submission of their work for consideration in peer-reviewed journals. To date, the criteria for acceptance vs. rejection are often subjective; however, reproducible science requires reproducible review. The Machine Learning Education Sub-Committee of SIIM has identified a knowledge gap and a serious need to establish guidelines for reviewing these studies. Although there have been several recent papers with this goal, this present work is written from the machine learning practitioner's standpoint. In this series, the committee will address the best practices to be followed in an A.I.-based study and present the required sections in terms of examples and discussion of what should be included to make the studies cohesive, reproducible, accurate, and self-contained. This first entry in the series focuses on the task of image classification. Elements such as dataset curation, data pre-processing steps, defining an appropriate reference standard, data partitioning, model architecture and training are discussed. The sections are presented as they would be detailed in a typical manuscript, with content describing the necessary information that should be included to make sure the study is of sufficient quality to be considered for publication. The goal of this series is to provide resources to not only help improve the review process for A.I.-based medical imaging papers, but to facilitate a standard for the information that is presented within all components of the research study. We hope to provide quantitative metrics in what otherwise may be a qualitative review process.
Submitted 3 February, 2022;
originally announced February 2022.
-
Domain Adaptive Transfer Attack (DATA)-based Segmentation Networks for Building Extraction from Aerial Images
Authors:
Younghwan Na,
Jun Hee Kim,
Kyungsu Lee,
Juhum Park,
Jae Youn Hwang,
Jihwan P. Choi
Abstract:
Semantic segmentation models based on convolutional neural networks (CNNs) have gained much attention in relation to remote sensing and have achieved remarkable performance for the extraction of buildings from high-resolution aerial images. However, the issue of limited generalization for unseen images remains. When there is a domain gap between the training and test datasets, CNN-based segmentation models trained on the training dataset fail to segment buildings in the test dataset. In this paper, we propose segmentation networks based on a domain adaptive transfer attack (DATA) scheme for building extraction from aerial images. The proposed system combines the domain transfer and adversarial attack concepts. Based on the DATA scheme, the distribution of the input images can be shifted to that of the target images while turning images into adversarial examples against a target network. Defending against adversarial examples adapted to the target domain can overcome the performance degradation due to the domain gap and increase the robustness of the segmentation model. Cross-dataset experiments and an ablation study are conducted for three different datasets: the Inria aerial image labeling dataset, the Massachusetts building dataset, and the WHU East Asia dataset. Compared to the performance of the segmentation network without the DATA scheme, the proposed method shows improvements in the overall IoU. Moreover, the proposed method is verified to outperform feature adaptation (FA) and output space adaptation (OSA).
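The IoU metric used to report the improvements can be computed as in the short Python sketch below. It assumes binary building masks stored as nested lists of 0/1 values; the function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
def iou(pred_mask, target_mask):
    """Intersection over union of two binary masks of the same shape.

    Masks are nested lists of 0/1 values (1 = building pixel).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred_mask, target_mask):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x2 masks (hypothetical): one overlapping pixel, three pixels in the union.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [1, 0]]
score = iou(pred, target)
```

In a cross-dataset evaluation, such a score would be accumulated over all test images of the target dataset rather than computed on a single mask pair.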
Submitted 29 April, 2020; v1 submitted 11 April, 2020;
originally announced April 2020.