-
Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes
Authors:
Sunjun Kweon,
Junu Kim,
Jiyoun Kim,
Sujeong Im,
Eunbyeol Cho,
Seongsu Bae,
Jungwoo Oh,
Gyubok Lee,
Jong Hak Moon,
Seng Chan You,
Seungjin Baek,
Chang Hoon Han,
Yoon Bin Jung,
Yohan Jo,
Edward Choi
Abstract:
The development of large language models tailored for handling patients' clinical notes is often hindered by the limited accessibility and usability of these notes due to strict privacy regulations. To address these challenges, we first create synthetic large-scale clinical notes using publicly available case reports extracted from biomedical literature. We then use these synthetic notes to train our specialized clinical large language model, Asclepius. While Asclepius is trained on synthetic data, we assess its potential performance in real-world applications by evaluating it using real clinical notes. We benchmark Asclepius against several other large language models, including GPT-3.5-turbo and other open-source alternatives. To further validate our approach using synthetic notes, we also compare Asclepius with its variants trained on real clinical notes. Our findings convincingly demonstrate that synthetic clinical notes can serve as viable substitutes for real ones when constructing high-performing clinical language models. This conclusion is supported by detailed evaluations conducted by both GPT-4 and medical professionals. All resources including weights, codes, and data used in the development of Asclepius are made publicly accessible for future research. (https://github.com/starmpcc/Asclepius)
Submitted 13 June, 2024; v1 submitted 1 September, 2023;
originally announced September 2023.
-
Light-field-driven non-Ohmic current and Keldysh crossover in a Weyl semimetal
Authors:
R. Ikeda,
H. Watanabe,
J. H. Moon,
M. H. Jung,
K. Takasan,
S. Kimura
Abstract:
In recent years, coherent electrons driven by light fields have attracted significant interest in exploring novel material phases and functionalities. However, observing coherent light-field-driven electron dynamics in solids is challenging because the electrons are scattered within several tens of femtoseconds in ordinary materials, and the coherence between light and electrons is disturbed. In Weyl semimetals, by contrast, the electron scattering time becomes relatively long (several hundred femtoseconds to several picoseconds), owing to the suppression of the back-scattering process. This study presents the light-field-driven dynamics induced by a THz pulse in the Weyl semimetal Co3Sn2S2, where the intense THz pulse of a monocycle electric field nonlinearly generates direct current (DC) via coherent acceleration without scattering and non-adiabatic excitation (Landau-Zener transition). In other words, the non-Ohmic current appears in the Weyl semimetal through the combination of a long relaxation time and an intense THz pulse. This nonlinear DC generation also demonstrates a Keldysh crossover from a photon picture to a light-field picture with increasing electric field strength.
Submitted 15 June, 2023;
originally announced June 2023.
-
Exploring the magnetic properties of individual barcode nanowires using wide-field diamond microscopy
Authors:
Jungbae Yoon,
Jun Hwan Moon,
Jugyeong Jeong,
Yu Jin Kim,
Kihwan Kim,
Hee Seong Kang,
Yoo Sang Jeon,
Eunsoo Oh,
Sun Hwa Lee,
Kihoon Han,
Dongmin Lee,
Chul-Ho Lee,
Young Keun Kim,
Donghun Lee
Abstract:
Barcode magnetic nanowires typically comprise a multilayer magnetic structure in a single body with more than one segment type. Owing to selective functionalization and novel interactions between the layers, barcode magnetic nanowires have attracted significant attention, particularly in the field of bioengineering. However, an analysis of their magnetic properties at the individual nanowire level remains challenging. Here, we characterized magnetic nanowires at room temperature under ambient conditions using magnetic images obtained via wide-field quantum microscopy with nitrogen-vacancy centers in diamond. By comparing the experimental results with those of micromagnetic simulations, we could extract critical magnetic properties of single nanowires, such as the saturation magnetization and coercivity. This study opens up the possibility for a versatile characterization method suited to individual magnetic nanowires.
Submitted 21 February, 2023;
originally announced February 2023.
-
Correlation between Alignment-Uniformity and Performance of Dense Contrastive Representations
Authors:
Jong Hak Moon,
Wonjae Kim,
Edward Choi
Abstract:
Recently, dense contrastive learning has shown superior performance on dense prediction tasks compared to instance-level contrastive learning. Despite its supremacy, the properties of dense contrastive representations have not yet been carefully studied. Therefore, we analyze the theoretical ideas of dense contrastive learning using a standard CNN and a straightforward feature matching scheme rather than propose a new complex method. Inspired by the analysis of the properties of instance-level contrastive representations through the lens of alignment and uniformity on the hypersphere, we employ and extend the same lens for the dense contrastive representations to analyze their underexplored properties. We discover the core principle in constructing a positive pair of dense features and empirically prove its validity. We also introduce a new scalar metric that summarizes the correlation between alignment-and-uniformity and downstream performance. Using this metric, we study various facets of densely learned contrastive representations, such as how the correlation changes over single- and multi-object datasets or linear evaluation and dense prediction tasks. The source code is publicly available at: https://github.com/SuperSupermoon/DenseCL-analysis
Submitted 17 October, 2022;
originally announced October 2022.
-
Unified Simultaneous Wireless Information and Power Transfer for IoT: Signaling and Architecture with Deep Learning Adaptive Control
Authors:
Jong Jin Park,
Jong Ho Moon,
Hyeon Ho Jang,
Dong In Kim
Abstract:
In this paper, we propose a unified SWIPT signal and its architecture design in order to take advantage of both single-tone and multi-tone signaling by adjusting only the power allocation ratio of a unified signal. For this, we design a novel unified and integrated receiver architecture for the proposed unified SWIPT signaling, which consumes low power with envelope detection. To relieve the computational complexity of the receiver, we propose an adaptive control algorithm by which the transmitter adjusts the communication mode through temporal convolutional network (TCN) based asymmetric processing. To this end, the transmitter optimizes the modulation index and power allocation ratio on a short-term scale while updating the mode switching threshold on a long-term scale. We demonstrate that the proposed unified SWIPT system improves the achievable rate under the self-powering condition of low-power IoT devices. Consequently, the proposed unified SWIPT and the adaptive control algorithm at the transmitter side are expected to enable effective deployment of low-power IoT networks that concurrently supply both information and energy wirelessly to the devices.
Submitted 25 June, 2021;
originally announced June 2021.
-
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Authors:
Jong Hak Moon,
Hyungyung Lee,
Woncheol Shin,
Young-Hak Kim,
Edward Choi
Abstract:
Recently, a number of studies demonstrated impressive performance on diverse vision-language multi-modal tasks such as image captioning and visual question answering by extending the BERT architecture with multi-modal pre-training objectives. In this work, we explore a broad set of multi-modal representation learning tasks in the medical domain, specifically using radiology images and the unstructured report. We propose Medical Vision Language Learner (MedViLL), which adopts a BERT-based architecture combined with a novel multi-modal attention masking scheme to maximize generalization performance for both vision-language understanding tasks (diagnosis classification, medical image-report retrieval, medical visual question answering) and a vision-language generation task (radiology report generation). By statistically and rigorously evaluating the proposed model on four downstream tasks with three radiographic image-report datasets (MIMIC-CXR, Open-I, and VQA-RAD), we empirically demonstrate the superior downstream task performance of MedViLL against various baselines, including task-specific architectures. The source code is publicly available at: https://github.com/SuperSupermoon/MedViLL
Submitted 21 September, 2022; v1 submitted 24 May, 2021;
originally announced May 2021.
-
Few-shot Image Recognition with Manifolds
Authors:
Debasmit Das,
J. H. Moon,
C. S. George Lee
Abstract:
In this paper, we extend the traditional few-shot learning (FSL) problem to the situation when the source-domain data is not accessible but only high-level information in the form of class prototypes is available. This limited information setup for the FSL problem deserves much attention due to its implication of privacy-preserving inaccessibility to the source-domain data but it has rarely been addressed before. Because of limited training data, we propose a non-parametric approach to this FSL problem by assuming that all the class prototypes are structurally arranged on a manifold. Accordingly, we estimate the novel-class prototype locations by projecting the few-shot samples onto the average of the subspaces on which the surrounding classes lie. During classification, we again exploit the structural arrangement of the categories by inducing a Markov chain on the graph constructed with the class prototypes. This manifold distance obtained using the Markov chain is expected to produce better results compared to a traditional nearest-neighbor-based Euclidean distance. To evaluate our proposed framework, we have tested it on two image datasets - the large-scale ImageNet and the small-scale but fine-grained CUB-200. We have also studied parameter sensitivity to better understand our framework.
Submitted 22 October, 2020;
originally announced October 2020.
-
Multi-step Online Unsupervised Domain Adaptation
Authors:
J. H. Moon,
Debasmit Das,
C. S. George Lee
Abstract:
In this paper, we address the Online Unsupervised Domain Adaptation (OUDA) problem, where the target data are unlabelled and arriving sequentially. The traditional methods on the OUDA problem mainly focus on transforming each arriving target data to the source domain, and they do not sufficiently consider the temporal coherency and accumulative statistics among the arriving target data. We propose a multi-step framework for the OUDA problem, which institutes a novel method to compute the mean-target subspace inspired by the geometrical interpretation on the Euclidean space. This mean-target subspace contains accumulative temporal information among the arrived target data. Moreover, the transformation matrix computed from the mean-target subspace is applied to the next target data as a preprocessing step, aligning the target data closer to the source domain. Experiments on four datasets demonstrated the contribution of each step in our proposed multi-step OUDA framework and its performance over previous approaches.
Submitted 20 February, 2020;
originally announced February 2020.