-
Solar Power Prediction Using Satellite Data in Different Parts of Nepal
Authors:
Raj Krishna Nepal,
Bibek Khanal,
Vibek Ghimire,
Kismat Neupane,
Atul Pokharel,
Kshitij Niraula,
Baburam Tiwari,
Nawaraj Bhattarai,
Khem N. Poudyal,
Nawaraj Karki,
Mohan B Dangi,
John Biden
Abstract:
Due to the unavailability of solar irradiance data for many potential sites in Nepal, the paper proposes predicting solar irradiance based on alternative meteorological parameters. The study focuses on five distinct regions in Nepal and utilizes a dataset spanning almost ten years, obtained from CERES SYN1deg and MERRA-2. Machine learning models such as Random Forest, XGBoost, and K-Nearest Neighbors (KNN), and deep learning models like LSTM and ANN-MLP, are employed and evaluated for their performance. The results indicate high accuracy in predicting solar irradiance, with R-squared (R2) scores close to unity for both train and test datasets. The impact of parameter integration on model performance is analyzed, revealing the significance of various parameters in enhancing predictive accuracy. Each model demonstrates strong performance when all parameters are included, consistently achieving MAE values below 6, RMSE values under 10, MBE within |2|, and nearly unity R2 values. Upon removal of various solar parameters such as "Solar_Irradiance_Clear_Sky", "UVA", etc. from the datasets, the models' performance is significantly affected. This exclusion leads to considerable increases in MAE, reaching up to 82, RMSE up to 135, and MBE up to |7|. Among the models, KNN displays the weakest performance, with an R2 of 0.7582546. Conversely, ANN exhibits the strongest performance, with an R2 value of 0.9245877. Hence, the study concludes that Artificial Neural Network (ANN) performs exceptionally well, showcasing its versatility even under sparse data parameter conditions.
Submitted 8 June, 2024;
originally announced June 2024.
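The abstract above reports MAE, RMSE, MBE and R2 without defining them. A minimal sketch of the textbook definitions follows. The function name `regression_metrics` and the sign convention for MBE (prediction minus observation) are assumptions for illustration and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MBE and R2 for two equal-length numeric sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Error per sample; the sign convention (prediction - observation) is an assumption.
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)                   # root mean squared error
    mbe = sum(errors) / n                          # mean bias error (signed)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination

    return {"MAE": mae, "RMSE": rmse, "MBE": mbe, "R2": r2}
```

MAE, RMSE and MBE carry the units of the target variable, while R2 is dimensionless, which is why only R2 can be compared against unity.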
-
Compensation for reactive power and harmonic current drawn by a non-linear load in a pv-micro hydro grid
Authors:
Raj Krishna Nepal,
Bibek Khanal,
Sanket Khatiwada,
Nirajan Bhandari,
Bishal Rijal,
Raisha Karmacharya,
Ajay Thapa
Abstract:
This paper presents a simulation approach to enhance the power quality of a PV-micro hydro grid supplying both linear consumer load and non-linear industrial load by integrating Shunt Active Power Filter (SAPF), utilizing instantaneous PQ theory and hysteresis current control band logic. The non-linear load draws reactive power and harmonic current from the source thereby affecting the power quality. The integration of the SAPF at the point of common coupling (PCC) offers reactive power and harmonic current compensation, ensuring that the current supply to the grid remains nearly sinusoidal and proportional to the active power. By injecting equal and opposite harmonic components, the SAPF effectively reduces Total Harmonic Distortion (THD) from 7% to 2.96%, thereby enhancing the overall power quality of the PV-micro hydro grid system.
Submitted 8 June, 2024;
originally announced June 2024.
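As a reminder of what the THD figure in the abstract above measures, here is a minimal sketch of the common definition relative to the fundamental component. The function name and the sample amplitudes in the usage note are hypothetical. The paper's 7% and 2.96% values come from its own simulation and are not reproduced here.

```python
import math


def total_harmonic_distortion(fundamental_rms, harmonic_rms):
    """THD relative to the fundamental: sqrt(sum of squared harmonic RMS values) / fundamental RMS."""
    if fundamental_rms <= 0:
        raise ValueError("fundamental RMS amplitude must be positive")
    return math.sqrt(sum(h * h for h in harmonic_rms)) / fundamental_rms
```

For example, with a hypothetical fundamental of 100 A and harmonics of 3 A and 4 A, `total_harmonic_distortion(100.0, [3.0, 4.0])` evaluates to 0.05, i.e. 5%.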
-
Impact of medium temperature heat treatment on flux trapping sensitivity in SRF cavities
Authors:
Pashupati Dhakal,
Bashu Dev Khanal,
Eric Lechner,
Gianluigi Ciovati
Abstract:
The effect of mid-T heat treatment on flux trapping sensitivity was measured on several 1.3 GHz single-cell cavities subjected to vacuum annealing at temperatures of 150 - 400 $^\circ$C for a duration of 3 hours. The cavities were cooled down with residual magnetic fields of $\sim$0 and $\sim$20 mG in the Dewar under cooldown conditions of full flux trapping. The quality factor as a function of accelerating gradient was measured. The results show a correlation between the treatment temperature, quality factor, and sensitivity to flux trapping. Sensitivity increases with increasing heat treatment temperature within the range of 200 - 325 $^\circ$C/3h. Moreover, variations in the effective penetration depth of the magnetic field and the density of quasi-particles can occur, altering the cavity's electromagnetic response and resonance frequency.
Submitted 16 May, 2024;
originally announced May 2024.
-
VLSM-Adapter: Finetuning Vision-Language Segmentation Efficiently with Lightweight Blocks
Authors:
Manish Dhakal,
Rabin Adhikari,
Safal Thapaliya,
Bishesh Khanal
Abstract:
Foundation Vision-Language Models (VLMs) trained using large-scale open-domain images and text pairs have recently been adapted to develop Vision-Language Segmentation Models (VLSMs) that allow providing text prompts during inference to guide image segmentation. If robust and powerful VLSMs can be built for medical images, it could aid medical professionals in many clinical tasks where they must spend substantial time delineating the target structure of interest. VLSMs for medical images resort to fine-tuning base VLM or VLSM pretrained on open-domain natural image datasets due to fewer annotated medical image datasets; this fine-tuning is resource-consuming and expensive as it usually requires updating all or a significant fraction of the pretrained parameters. Recently, lightweight blocks called adapters have been proposed in VLMs that keep the pretrained model frozen and only train adapters during fine-tuning, substantially reducing the computing resources required. We introduce a novel adapter, VLSM-Adapter, that can fine-tune pretrained vision-language segmentation models using transformer encoders. Our experiments in widely used CLIP-based segmentation models show that with only 3 million trainable parameters, the VLSM-Adapter outperforms state-of-the-art and is comparable to the upper bound end-to-end fine-tuning. The source code is available at: https://github.com/naamiinepal/vlsm-adapter.
Submitted 27 June, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
On Adversarial Examples for Text Classification by Perturbing Latent Representations
Authors:
Korn Sooksatra,
Bikram Khanal,
Pablo Rivas
Abstract:
Recently, with the advancement of deep learning, several applications in text classification have advanced significantly. However, this improvement comes at a cost because deep learning is vulnerable to adversarial examples. This weakness indicates that deep learning is not very robust. Fortunately, the input of a text classifier is discrete. Hence, it can protect the classifier from state-of-the-art attacks. Nonetheless, previous works have generated black-box attacks that successfully manipulate the discrete values of the input to find adversarial examples. Therefore, instead of changing the discrete values, we transform the input into its embedding vector containing real values to perform the state-of-the-art white-box attacks. Then, we convert the perturbed embedding vector back into text and call the result an adversarial example. In summary, we create a framework that measures the robustness of a text classifier by using the gradients of the classifier.
Submitted 6 May, 2024;
originally announced May 2024.
-
A Modified Depolarization Approach for Efficient Quantum Machine Learning
Authors:
Bikram Khanal,
Pablo Rivas
Abstract:
Quantum Computing in the Noisy Intermediate-Scale Quantum (NISQ) era has shown promising applications in machine learning, optimization, and cryptography. Despite the progress, challenges persist due to system noise, errors, and decoherence that complicate the simulation of quantum systems. The depolarization channel is a standard tool for simulating a quantum system's noise. However, modeling such noise for practical applications is computationally expensive when we have limited hardware resources, as is the case in the NISQ era. We propose a modified representation for a single-qubit depolarization channel with two Kraus operators based only on X and Z Pauli matrices. Our approach reduces the computational complexity from six to four matrix multiplications per execution of a channel. Experiments on a Quantum Machine Learning (QML) model on the Iris dataset across various circuit depths and depolarization rates validate that our approach maintains the model's accuracy while improving efficiency. This simplified noise model enables more scalable simulations of quantum circuits under depolarization, advancing capabilities in the NISQ era.
Submitted 10 April, 2024;
originally announced April 2024.
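For context on the abstract above, the following is a minimal sketch of the standard textbook single-qubit depolarizing channel, rho -> (1 - p) rho + (p/3)(X rho X + Y rho Y + Z rho Z). It does not reproduce the paper's modified two-operator representation, whose exact form is not given in the abstract. The helper names (`matmul`, `dagger`, `depolarize`) are hypothetical, and plain nested lists are used so the block needs no third-party library.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


# Identity and Pauli matrices.
I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
X = [[0j, 1 + 0j], [1 + 0j, 0j]]
Y = [[0j, -1j], [1j, 0j]]
Z = [[1 + 0j, 0j], [0j, -1 + 0j]]


def depolarize(rho, p):
    """Apply the standard depolarizing channel with error probability p to a 2x2 density matrix."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    out = [[0j, 0j], [0j, 0j]]
    # Each term is weight * (op @ rho @ op^dagger); the weights sum to 1.
    for weight, op in [(1.0 - p, I2), (p / 3.0, X), (p / 3.0, Y), (p / 3.0, Z)]:
        term = matmul(matmul(op, rho), dagger(op))
        for i in range(2):
            for j in range(2):
                out[i][j] += weight * term[i][j]
    return out
```

In this parameterization, p = 3/4 maps any input state to the maximally mixed state I/2, which is a convenient sanity check for an implementation.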
-
AI-Assisted Cervical Cancer Screening
Authors:
Kanchan Poudel,
Lisasha Poudel,
Prabin Raj Shakya,
Atit Poudel,
Archana Shrestha,
Bishesh Khanal
Abstract:
Visual Inspection with Acetic Acid (VIA) remains the most feasible cervical cancer screening test in resource-constrained settings of low- and middle-income countries (LMICs), where it is often performed in screening camps or primary/community health centers by nurses instead of the preferred but unavailable expert gynecologists. To address the highly subjective nature of the test, various handheld devices integrating cameras or smartphones have been recently explored to capture cervical images during VIA and aid decision-making via telemedicine or AI models. Most studies proposing AI models retrospectively use a relatively small number of already collected images from specific devices, digital cameras, or smartphones; the challenges and protocol for quality image acquisition during VIA in resource-constrained camp settings, challenges in getting a gold standard, data imbalance, etc. are often overlooked. We present a novel approach and describe the end-to-end design process to build a robust smartphone-based AI-assisted system that does not require buying a separate integrated device: the proposed protocol for quality image acquisition in resource-constrained settings, a dataset collected from 1,430 women during VIA performed by nurses in screening camps, a preprocessing pipeline, and training and evaluation of a deep-learning-based classification model aimed at identifying (pre)cancerous lesions. Our work shows that readily available smartphones and a suitable protocol can capture cervix images with the details required for the VIA test; that the deep-learning-based classification model provides promising results to assist nurses in VIA screening; and it provides a direction for large-scale data collection and validation in resource-constrained settings.
Submitted 18 March, 2024;
originally announced March 2024.
-
Field, frequency and temperature dependence of the surface resistance of nitrogen diffused niobium superconducting radio frequency cavities
Authors:
P. Dhakal,
B. D. Khanal,
A. Gurevich,
G. Ciovati
Abstract:
We report the RF performance of several single-cell superconducting radio-frequency cavities subjected to low temperature heat treatment in a nitrogen environment. The cavities were treated at temperatures of 120 - 165 $^{\circ}$C for an extended period of time (24 - 48 hours) either in high vacuum or in a low partial pressure of ultra-pure nitrogen. An improvement in $Q_0$ with a Q-rise was observed when nitrogen gas was injected at $\sim$300 $^{\circ}$C during the cavity cooldown from 800 $^{\circ}$C and held at 165 $^{\circ}$C, without any degradation in accelerating gradient over the baseline performance. The treatment was applied to several elliptical cavities with frequencies ranging from 0.75 GHz to 3.0 GHz, showing an improved quality factor as a result of low temperature nitrogen treatments. The Q-rise feature is similar to that achieved by nitrogen alloying Nb cavities at higher temperature, followed by material removal by electropolishing. The surface modification was confirmed by the change in electronic mean free path and tuned with the temperature and duration of heat treatment. The decrease of the temperature-dependent surface resistance with increasing RF field, resulting in a Q-rise, becomes stronger with increasing frequency and decreasing temperature. The data suggest a crossover frequency of $\sim 0.95$~GHz above which the Q-rise phenomenon occurs at 2~K. Some of these results can be explained qualitatively with an existing model of intrinsic field-dependence of the surface resistance with both equilibrium and nonequilibrium quasiparticle distribution functions. The change in the Q-slope below 0.95 GHz may result from a masking contribution of trapped magnetic flux to the residual surface resistance.
Submitted 16 May, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Investigating the Robustness of Vision Transformers against Label Noise in Medical Image Classification
Authors:
Bidur Khanal,
Prashant Shrestha,
Sanskar Amgain,
Bishesh Khanal,
Binod Bhattarai,
Cristian A. Linte
Abstract:
Label noise in medical image classification datasets significantly hampers the training of supervised deep learning methods, undermining their generalizability. The test performance of a model tends to decrease as the label noise rate increases. Over recent years, several methods have been proposed to mitigate the impact of label noise in medical image classification and enhance the robustness of the model. Predominantly, these works have employed CNN-based architectures as the backbone of their classifiers for feature extraction. However, in recent years, Vision Transformer (ViT)-based backbones have replaced CNNs, demonstrating improved performance and a greater ability to learn more generalizable features, especially when the dataset is large. Nevertheless, no prior work has rigorously investigated how transformer-based backbones handle the impact of label noise in medical image classification. In this paper, we investigate the architectural robustness of ViT against label noise and compare it to that of CNNs. We use two medical image classification datasets -- COVID-DU-Ex, and NCT-CRC-HE-100K -- both corrupted by injecting label noise at various rates. Additionally, we show that pretraining is crucial for ensuring ViT's improved robustness against label noise in supervised training.
Submitted 26 February, 2024;
originally announced February 2024.
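The abstract above mentions corrupting datasets by injecting label noise at various rates but does not state the noise model. A minimal sketch of one common choice, symmetric (uniform) label noise, is shown below. The function name, the integer class encoding 0..num_classes-1, and the choice to flip an exact fraction of labels are all assumptions for illustration, not the authors' procedure.

```python
import random


def inject_symmetric_label_noise(labels, num_classes, noise_rate, seed=0):
    """Return a copy of labels in which a fraction noise_rate is replaced by a different class."""
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must lie in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to flip a label")

    rng = random.Random(seed)            # local RNG so results are reproducible
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))

    # Pick distinct positions, then replace each label with a uniformly chosen other class.
    for idx in rng.sample(range(len(noisy)), n_flip):
        alternatives = [c for c in range(num_classes) if c != noisy[idx]]
        noisy[idx] = rng.choice(alternatives)
    return noisy
```

Because each selected label is forced to change class, the realized noise rate equals the requested rate up to rounding, which makes comparisons across rates easier to interpret.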
-
How does self-supervised pretraining improve robustness against noisy labels across various medical image classification datasets?
Authors:
Bidur Khanal,
Binod Bhattarai,
Bishesh Khanal,
Cristian Linte
Abstract:
Noisy labels can significantly impact medical image classification, particularly in deep learning, by corrupting learned features. Self-supervised pretraining, which doesn't rely on labeled data, can enhance robustness against noisy labels. However, this robustness varies based on factors like the number of classes, dataset complexity, and training size. In medical images, subtle inter-class differences and modality-specific characteristics add complexity. Previous research hasn't comprehensively explored the interplay between self-supervised learning and robustness against noisy labels in medical image classification, considering all these factors. In this study, we address three key questions: i) How does label noise impact various medical image classification datasets? ii) Which types of medical image datasets are more challenging to learn and more affected by label noise? iii) How do different self-supervised pretraining methods enhance robustness across various medical image datasets? Our results show that DermNet, among five datasets (Fetal plane, DermNet, COVID-DU-Ex, MURA, NCT-CRC-HE-100K), is the most challenging but exhibits greater robustness against noisy labels. Additionally, contrastive learning stands out among the eight self-supervised methods as the most effective approach to enhance robustness against noisy labels.
Submitted 15 January, 2024;
originally announced January 2024.
-
Medical Vision Language Pretraining: A survey
Authors:
Prashant Shrestha,
Sanskar Amgain,
Bidur Khanal,
Cristian A. Linte,
Binod Bhattarai
Abstract:
Medical Vision Language Pretraining (VLP) has recently emerged as a promising solution to the scarcity of labeled data in the medical domain. By leveraging paired/unpaired vision and text datasets through self-supervised learning, models can be trained to acquire vast knowledge and learn robust feature representations. Such pretrained models have the potential to enhance multiple downstream medical tasks simultaneously, reducing the dependency on labeled data. However, despite recent progress and its potential, there is no such comprehensive survey paper that has explored the various aspects and advancements in medical VLP. In this paper, we specifically review existing works through the lens of different pretraining objectives, architectures, downstream evaluation tasks, and datasets utilized for pretraining and downstream tasks. Subsequently, we delve into current challenges in medical VLP, discussing existing and potential solutions, and conclude by highlighting future directions. To the best of our knowledge, this is the first survey focused on medical VLP.
Submitted 11 December, 2023;
originally announced December 2023.
-
Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Shape Reconstruction
Authors:
Mahesh Shakya,
Bishesh Khanal
Abstract:
Various deep learning models have been proposed for 3D bone shape reconstruction from two orthogonal (biplanar) X-ray images. However, it is unclear how these models compare against each other since they are evaluated on different anatomy, cohort and (often privately held) datasets. Moreover, the impact of the commonly optimized image-based segmentation metrics such as dice score on the estimation of clinical parameters relevant in 2D-3D bone shape reconstruction is not well known. To move closer toward clinical translation, we propose a benchmarking framework that evaluates tasks relevant to real-world clinical scenarios, including reconstruction of fractured bones, bones with implants, robustness to population shift, and error in estimating clinical parameters. Our open-source platform provides reference implementations of 8 models (many of whose implementations were not publicly available), APIs to easily collect and preprocess 6 public datasets, and the implementation of automatic clinical parameter and landmark extraction methods. We present an extensive evaluation of 8 2D-3D models on equal footing using 6 public datasets comprising images for four different anatomies. Our results show that attention-based methods that capture global spatial relationships tend to perform better across all anatomies and datasets; performance on clinically relevant subgroups may be overestimated without disaggregated reporting; ribs are substantially more difficult to reconstruct compared to femur, hip and spine; and the dice score improvement does not always bring a corresponding improvement in the automatic estimation of clinically relevant parameters.
Submitted 26 September, 2023; v1 submitted 24 September, 2023;
originally announced September 2023.
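The Dice score referred to in the abstract above is the usual overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch on flattened binary masks follows. The function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        return 1.0  # both masks empty; treating this as perfect agreement is a convention
    return 2.0 * intersection / total
```

Since the score only counts overlapping voxels, two reconstructions with the same Dice value can still differ in the landmarks or lengths derived from them, which is consistent with the abstract's observation that Dice improvements do not always carry over to clinical parameters.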
-
Synthetic Boost: Leveraging Synthetic Data for Enhanced Vision-Language Segmentation in Echocardiography
Authors:
Rabin Adhikari,
Manish Dhakal,
Safal Thapaliya,
Kanchan Poudel,
Prasiddha Bhandari,
Bishesh Khanal
Abstract:
Accurate segmentation is essential for echocardiography-based assessment of cardiovascular diseases (CVDs). However, the variability among sonographers and the inherent challenges of ultrasound images hinder precise segmentation. By leveraging the joint representation of image and text modalities, Vision-Language Segmentation Models (VLSMs) can incorporate rich contextual information, potentially aiding in accurate and explainable segmentation. However, the lack of readily available data in echocardiography hampers the training of VLSMs. In this study, we explore using synthetic datasets from Semantic Diffusion Models (SDMs) to enhance VLSMs for echocardiography segmentation. We evaluate results for two popular VLSMs (CLIPSeg and CRIS) using seven different kinds of language prompts derived from several attributes, automatically extracted from echocardiography images, segmentation masks, and their metadata. Our results show improved metrics and faster convergence when pretraining VLSMs on SDM-generated synthetic images before finetuning on real images. The code, configs, and prompts are available at https://github.com/naamiinepal/synthetic-boost.
Submitted 22 September, 2023;
originally announced September 2023.
-
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
Authors:
Karim Lekadir,
Aasa Feragen,
Abdul Joseph Fofanah,
Alejandro F Frangi,
Alena Buyx,
Anais Emelie,
Andrea Lara,
Antonio R Porras,
An-Wen Chan,
Arcadi Navarro,
Ben Glocker,
Benard O Botwe,
Bishesh Khanal,
Brigit Beger,
Carol C Wu,
Celia Cintas,
Curtis P Langlotz,
Daniel Rueckert,
Deogratias Mzurikwao,
Dimitrios I Fotiadis,
Doszhan Zhussupov,
Enzo Ferrante,
Erik Meijering,
Eva Weicken,
Fabio A González
, et al. (93 additional authors not shown)
Abstract:
Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 inter-disciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices were defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate future translation towards clinical practice of medical AI.
Submitted 11 August, 2023;
originally announced September 2023.
-
Temperature, RF Field, and Frequency Dependence Performance Evaluation of Superconducting Niobium Half-Wave Cavity
Authors:
N. K. Raut,
B. D. Khanal,
J. K. Tiskumara,
S. De Silva,
P. Dhakal,
G. Ciovati,
J. R. Delayen
Abstract:
Recent advances in superconducting radio-frequency cavity processing techniques, with diffusion of impurities within the RF penetration depth, have resulted in high quality factors that increase with increasing accelerating gradient. The increase in quality factor results from a decrease in the surface resistance caused by doping with nonmagnetic impurities and a change in the electronic density of states. The fundamental understanding of the dependence of surface resistance on frequency and surface preparation is still an active area of research. Here, we present the results of RF measurements of the TEM modes in a coaxial half-wave niobium cavity resonating at frequencies between 0.3 and 1.3 GHz. The temperature dependence of the surface resistance was measured between 4.2 K and 1.6 K. The field dependence of the surface resistance was measured at 2.0 K. The baseline measurements were made after standard surface preparation by buffered chemical polishing.
Submitted 18 August, 2023;
originally announced August 2023.
-
Quench Detection in a Superconducting Radio Frequency Cavity with Combine Temperature and Magnetic Field Mapping
Authors:
B. D. Khanal,
P. Dhakal,
G. Ciovati
Abstract:
Local dissipation of RF power in superconducting radio-frequency cavities creates so-called hot spots, the primary precursors of cavity quench driven by either thermal or magnetic instability. These hot spots are detected by a temperature mapping system, and a large increase in temperature on the outer surface is detected during cavity quench events. Here, we have used combined magnetic and temperature mapping systems using anisotropic magnetoresistance (AMR) sensors and carbon resistors to locate the hot spots and areas with high trapped flux on a 3.0 GHz single-cell Nb cavity during RF tests at 2.0 K. The quench location and hot spots were detected near the equator when the residual magnetic field in the Dewar was kept < 1 mG. The hot spots and quench locations moved when the magnetic field was trapped locally, as detected by the T-mapping system. No significant dynamics of trapped flux were detected by the AMR sensors; however, a change in magnetic flux during cavity quench was detected by a fluxgate magnetometer close to the quench location. The results provide direct evidence of hot spots and quench events due to localized trapped vortices.
Submitted 16 August, 2023;
originally announced August 2023.
-
Evaluation of flux expulsion and flux trapping sensitivity of srf cavities fabricated from cold work Nb sheet with successive heat treatment
Authors:
B. D. Khanal,
P. Dhakal
Abstract:
The main source of RF losses leading to a lower quality factor in superconducting radio-frequency cavities is the residual magnetic flux trapped during cool-down. The loss due to flux trapping is more pronounced for cavities subjected to impurity doping. The flux trapping and its sensitivity to RF losses are related to several intrinsic and extrinsic phenomena. To elucidate the effect of re-crystallization by high-temperature heat treatment on the flux trapping sensitivity, we have fabricated two 1.3 GHz single-cell cavities from cold-worked Nb sheets and compared them with cavities made from standard fine-grain Nb. Flux expulsion ratio and flux trapping sensitivity were measured after successive high-temperature heat treatments. The cavity made from cold-worked Nb showed better flux expulsion after the 800 °C/3 h heat treatment and similar behavior when heat treated with additional 900 °C/3 h and 1000 °C/3 h steps. In this contribution, we present a summary of the flux expulsion, trapping sensitivity, and RF results.
Submitted 16 August, 2023;
originally announced August 2023.
-
Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models
Authors:
Kanchan Poudel,
Manish Dhakal,
Prasiddha Bhandari,
Rabin Adhikari,
Safal Thapaliya,
Bishesh Khanal
Abstract:
Medical image segmentation allows quantifying target structure size and shape, aiding in disease diagnosis, prognosis, surgery planning, and comprehension. Building upon recent advancements in foundation Vision-Language Models (VLMs) from natural image-text pairs, several studies have proposed adapting them to Vision-Language Segmentation Models (VLSMs) that allow using language text as an additional input to segmentation models. Introducing auxiliary information via text with human-in-the-loop prompting during inference opens up unique opportunities, such as open vocabulary segmentation and potentially more robust segmentation models against out-of-distribution data. Although transfer learning from natural to medical images has been explored for image-only segmentation models, the joint representation of vision-language in segmentation problems remains underexplored. This work presents the first systematic study on transferring VLSMs to 2D medical images, using 11 carefully curated datasets encompassing diverse modalities, together with insightful language prompts and experiments. Our findings demonstrate that although VLSMs show competitive performance compared to image-only models for segmentation after finetuning on limited medical image datasets, not all VLSMs utilize the additional information from language prompts, with image features playing a dominant role. While VLSMs exhibit enhanced performance in handling pooled datasets with diverse modalities and show potential robustness to domain shifts compared to conventional segmentation models, our results suggest that novel approaches are required to enable VLSMs to leverage the various auxiliary information available through language prompts. The code and datasets are available at https://github.com/naamiinepal/medvlsm.
Submitted 20 June, 2024; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Improving Medical Image Classification in Noisy Labels Using Only Self-supervised Pretraining
Authors:
Bidur Khanal,
Binod Bhattarai,
Bishesh Khanal,
Cristian A. Linte
Abstract:
Noisy labels hurt deep learning-based supervised image classification performance as the models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have explored: i) how other self-supervised approaches, such as pretext task-based pretraining, impact learning with noisy labels, and ii) any self-supervised pretraining methods alone for medical images in noisy label settings. Medical images often feature smaller datasets and subtle inter-class variations, requiring human expertise to ensure correct classification. Thus, it is not clear if the methods improving learning with noisy labels in natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels -- NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
Submitted 8 August, 2023;
originally announced August 2023.
-
M-VAAL: Multimodal Variational Adversarial Active Learning for Downstream Medical Image Analysis Tasks
Authors:
Bidur Khanal,
Binod Bhattarai,
Bishesh Khanal,
Danail Stoyanov,
Cristian A. Linte
Abstract:
Acquiring properly annotated data is expensive in the medical field as it requires experts, time-consuming protocols, and rigorous validation. Active learning attempts to minimize the need for large annotated samples by actively sampling the most informative examples for annotation. These examples contribute significantly to improving the performance of supervised machine learning models, and thus, active learning can play an essential role in selecting the most appropriate information in deep learning-based diagnosis, clinical assessments, and treatment planning. Although some existing works have proposed methods for sampling the best examples for annotation in medical image analysis, they are not task-agnostic and do not use multimodal auxiliary information in the sampler, which has the potential to increase robustness. Therefore, in this work, we propose a Multimodal Variational Adversarial Active Learning (M-VAAL) method that uses auxiliary information from additional modalities to enhance the active sampling. We applied our method to two datasets: i) brain tumor segmentation and multi-label classification using the BraTS2018 dataset, and ii) chest X-ray image classification using the COVID-QU-Ex dataset. Our results show a promising direction toward data-efficient learning under limited annotations.
Submitted 21 June, 2023;
originally announced June 2023.
-
Deep-learning assisted detection and quantification of (oo)cysts of Giardia and Cryptosporidium on smartphone microscopy images
Authors:
Suprim Nakarmi,
Sanam Pudasaini,
Safal Thapaliya,
Pratima Upretee,
Retina Shrestha,
Basant Giri,
Bhanu Bhakta Neupane,
Bishesh Khanal
Abstract:
The consumption of microbial-contaminated food and water is responsible for the deaths of millions of people annually. Smartphone-based microscopy systems are portable, low-cost, and more accessible alternatives for the detection of Giardia and Cryptosporidium than traditional brightfield microscopes. However, the images from smartphone microscopes are noisier and require manual cyst identification by trained technicians, usually unavailable in resource-limited settings. Automatic detection of (oo)cysts using deep-learning-based object detection could offer a solution for this limitation. We evaluate the performance of three state-of-the-art object detectors to detect (oo)cysts of Giardia and Cryptosporidium on a custom dataset that includes both smartphone and brightfield microscopic images from vegetable samples. Faster RCNN, RetinaNet, and you only look once (YOLOv8s) deep-learning models were employed to explore their efficacy and limitations. Our results show that while the deep-learning models perform better with the brightfield microscopy image dataset than the smartphone microscopy image dataset, the smartphone microscopy predictions are still comparable to the prediction performance of non-experts.
Submitted 11 April, 2023;
originally announced April 2023.
-
Determination of energy-dependent neutron backgrounds using shadow bars
Authors:
S. N. Paneru,
K. W. Brown,
F. C. E. Teh,
K. Zhu,
M. B. Tsang,
D. Dell'Aquila,
Z. Chajecki,
W. G. Lynch,
S. Sweany,
C. Y. Tsang,
A. K. Anthony,
J. Barney,
J. Estee,
I. Gasparic,
G. Jhang,
O. B. Khanal,
J. Manfredi,
C. Y. Niu,
R. S. Wang,
J. C. Zamora
Abstract:
Understanding the neutron background is essential for determining the neutron yield from nuclear reactions. In the analysis presented here, shadow bars are placed in front of neutron detectors to determine the energy-dependent neutron background fractions. The measurement of neutron spectra with and without shadow bars is important to determine the neutron background more accurately. The neutron background, along with its sources and systematic uncertainties, is explored with a focus on the impact of background models and their dependence on neutron energy.
Submitted 19 December, 2022;
originally announced December 2022.
-
COVID-19-related Nepali Tweets Classification in a Low Resource Setting
Authors:
Rabin Adhikari,
Safal Thapaliya,
Nirajan Basnet,
Samip Poudel,
Aman Shakya,
Bishesh Khanal
Abstract:
Billions of people across the globe have been using social media platforms in their local languages to voice their opinions about the various topics related to the COVID-19 pandemic. Several organizations, including the World Health Organization, have developed automated social media analysis tools that classify COVID-19-related tweets into various topics. However, these tools that help combat the pandemic are limited to very few languages, leaving several countries unable to benefit from them. While multi-lingual or low-resource language-specific tools are being developed, they still need to expand their coverage, such as for the Nepali language. In this paper, we identify the eight most common COVID-19 discussion topics among the Twitter community using the Nepali language, set up an online platform to automatically gather Nepali tweets containing the COVID-19-related keywords, classify the tweets into the eight topics, and visualize the results across the period in a web-based dashboard. We compare the performance of two state-of-the-art multi-lingual language models for Nepali tweet classification, one generic (mBERT) and the other a Nepali language family-specific model (MuRIL). Our results show that the models' relative performance depends on the data size, with MuRIL doing better for a larger dataset. The annotated data, models, and the web-based dashboard are open-sourced at https://github.com/naamiinepal/covid-tweet-classification.
Submitted 11 October, 2022;
originally announced October 2022.
-
FixMatchSeg: Fixing FixMatch for Semi-Supervised Semantic Segmentation
Authors:
Pratima Upretee,
Bishesh Khanal
Abstract:
Supervised deep learning methods for semantic medical image segmentation have become increasingly popular in the past few years. However, in resource-constrained settings, obtaining a large number of annotated images is very difficult, as annotation mostly requires experts and is expensive and time-consuming. Semi-supervised segmentation can be an attractive solution in which very few labeled images are used along with a large number of unlabeled ones. While the gap between supervised and semi-supervised methods has been dramatically reduced for classification problems in the past couple of years, a larger gap still remains in segmentation methods. In this work, we adapt a state-of-the-art semi-supervised classification method, FixMatch, to the semantic segmentation task, introducing FixMatchSeg. FixMatchSeg is evaluated on four publicly available datasets of different anatomies and modalities: cardiac ultrasound, chest X-ray, retinal fundus image, and skin images. When there are few labels, we show that FixMatchSeg performs on par with strong supervised baselines.
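Illustrative note: the abstract does not give the loss used by FixMatchSeg, so the following is only a minimal sketch of the general FixMatch idea (confidence-thresholded pseudo-labels from a weakly augmented view, used as targets for a strongly augmented view) applied per pixel. The function name `fixmatch_pixel_loss`, the threshold value, and the assumption that both views are already pixel-aligned are hypothetical choices made for this example.

```python
import math


def fixmatch_pixel_loss(weak_probs, strong_probs, threshold=0.95):
    """Per-pixel pseudo-label loss in the spirit of FixMatch (illustrative only).

    weak_probs / strong_probs: one list of class probabilities per pixel,
    predicted on the weakly and strongly augmented view of the same
    unlabeled image. The two views are assumed to be pixel-aligned.
    Returns (mean loss over all pixels, number of pixels that passed the threshold).
    """
    total, kept = 0.0, 0
    for p_weak, p_strong in zip(weak_probs, strong_probs):
        confidence = max(p_weak)
        if confidence >= threshold:
            # The confident weak prediction becomes a hard pseudo-label.
            pseudo_label = p_weak.index(confidence)
            # Cross-entropy of the strong-view prediction against that label.
            total += -math.log(max(p_strong[pseudo_label], 1e-12))
            kept += 1
    # Low-confidence pixels contribute zero but still count in the mean.
    mean_loss = total / len(weak_probs) if weak_probs else 0.0
    return mean_loss, kept


# Two pixels, two classes: only the first pixel is confident enough.
weak = [[0.97, 0.03], [0.60, 0.40]]
strong = [[0.50, 0.50], [0.90, 0.10]]
loss, kept = fixmatch_pixel_loss(weak, strong)
```

In a segmentation setting, strong augmentations that move pixels would require applying the same geometric transform to the pseudo-labels; that step is omitted here.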
Submitted 2 August, 2022; v1 submitted 31 July, 2022;
originally announced August 2022.
-
Medium-Grain Niobium SRF Cavity Production Technology for Science Frontiers and Accelerator Applications
Authors:
G. Myneni,
Hani E. Elsayed-Ali,
Md Obidul Islam,
Md Nizam Sayeed,
G. Ciovati,
P. Dhakal,
R. A. Rimmer,
M. Carl,
A. Fajardo,
N. Lannoy,
B. Khanal,
T. Dohmae,
A. Kumar,
T. Saeki,
K. Umemori,
M. Yamanaka,
S. Michizono,
A. Yamamoto
Abstract:
We propose cost-effective production of medium grain (MG) niobium (Nb) discs directly sliced from forged and annealed billet. This production method provides clean surface conditions and reliable mechanical characteristics with sub-millimeter average grain size resulting in stable SRF cavity production. We propose to apply this material to particle accelerator applications in the science and industrial frontiers. The science applications require high field gradients (>~40 MV/m) particularly in pulse mode. The industrial applications require high Q0 values with moderate gradients (~30 MV/m) in CW mode operation. This report describes the MG Nb disc production recently demonstrated and discusses future prospects for application in advanced particle accelerators in the science and industrial frontiers.
Submitted 11 March, 2022;
originally announced March 2022.
-
How Does Heterogeneous Label Noise Impact Generalization in Neural Nets?
Authors:
Bidur Khanal,
Christopher Kanan
Abstract:
Incorrectly labeled examples, or label noise, are common in real-world computer vision datasets. While the impact of label noise on learning in deep neural networks has been studied in prior work, these studies have exclusively focused on homogeneous label noise, i.e., the degree of label noise is the same across all categories. However, in the real world, label noise is often heterogeneous, with some categories being affected to a greater extent than others. Here, we address this gap in the literature. We hypothesized that heterogeneous label noise would only affect the classes that had label noise unless there was transfer from those classes to the classes without label noise. To test this hypothesis, we designed a series of computer vision studies using MNIST, CIFAR-10, CIFAR-100, and MS-COCO where we imposed heterogeneous label noise during the training of multi-class, multi-task, and multi-label systems. Our results provide evidence in support of our hypothesis: label noise only affects the class affected by it unless there is transfer.
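Illustrative note: the distinction between homogeneous and heterogeneous label noise can be made concrete with a small sketch. The abstract does not say how the noise was injected, so the uniform flipping to another class, the helper name `add_heterogeneous_label_noise`, and the noise rates below are assumptions for illustration only.

```python
import random


def add_heterogeneous_label_noise(labels, noise_rates, num_classes, seed=0):
    """Flip each label with a probability that depends on its true class.

    noise_rates maps a class index to its flip probability; classes that
    are missing from the mapping are left clean. A flipped label is drawn
    uniformly from the other classes (an assumption of this sketch).
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rates.get(y, 0.0):
            other_classes = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(other_classes))
        else:
            noisy.append(y)
    return noisy


# Only class 0 is corrupted (40% of its labels); classes 1 and 2 stay clean.
labels = [0] * 1000 + [1] * 1000 + [2] * 1000
noisy = add_heterogeneous_label_noise(labels, {0: 0.4}, num_classes=3)
flipped_per_class = [
    sum(1 for y, n in zip(labels, noisy) if y == c and n != c)
    for c in range(3)
]
```

Setting the same rate for every class in `noise_rates` recovers the homogeneous case that the abstract contrasts against.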
Submitted 26 September, 2021; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Label Geometry Aware Discriminator for Conditional Generative Networks
Authors:
Suman Sapkota,
Bidur Khanal,
Binod Bhattarai,
Bishesh Khanal,
Tae-Kyun Kim
Abstract:
Multi-domain image-to-image translation with conditional Generative Adversarial Networks (GANs) can generate highly photorealistic images with desired target classes, yet these synthetic images have not always been helpful in improving downstream supervised tasks such as image classification. Improving downstream tasks with synthetic examples requires generating images with high fidelity to the unknown conditional distribution of the target class, which many labeled conditional GANs attempt to achieve by adding a softmax cross-entropy loss-based auxiliary classifier in the discriminator. As recent studies suggest that the softmax loss in the Euclidean space of deep features does not leverage their intrinsic angular distribution, we propose to replace this loss in the auxiliary classifier with an additive angular margin (AAM) loss that takes advantage of the intrinsic angular distribution and promotes intra-class compactness and inter-class separation to help the generator synthesize high-fidelity images.
We validate our method on RaFD and CIFAR-100, two challenging facial expression and natural image classification datasets. Our method outperforms state-of-the-art methods on several different evaluation criteria, including the recently proposed GAN-train and GAN-test metrics designed to assess the impact of synthetic data on downstream classification tasks, an assessment of the usefulness in data augmentation for supervised tasks with prediction accuracy score and average confidence score, and the well-known FID metric.
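Illustrative note: an additive angular margin loss, in its commonly used form, normalizes the feature and the class weight vectors, adds a margin to the angle of the target class, and applies a scaled softmax cross-entropy. The sketch below shows that general form for a single sample. The scale and margin values are assumed defaults, and nothing here is taken from the paper's discriminator.

```python
import math


def aam_loss(feature, class_weights, target, scale=30.0, margin=0.5):
    """Additive angular margin loss for one sample (illustrative sketch).

    feature: list of floats; class_weights: one weight vector per class.
    The target-class logit is scale * cos(theta + margin); all other
    logits are scale * cos(theta). The case theta + margin > pi, which
    practical implementations treat specially, is not handled here.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    f = normalize(feature)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(f, normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if j == target:
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)

    # Numerically stable softmax cross-entropy.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


weights = [[1.0, 0.0], [0.0, 1.0]]
with_margin = aam_loss([1.0, 0.2], weights, target=0, margin=0.5)
without_margin = aam_loss([1.0, 0.2], weights, target=0, margin=0.0)
```

With the margin switched on, the same feature incurs a larger loss, which is what pushes features of one class closer together in angle.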
Submitted 12 May, 2021;
originally announced May 2021.
-
Uncertainty Estimation in Deep 2D Echocardiography Segmentation
Authors:
Lavsen Dahal,
Aayush Kafle,
Bishesh Khanal
Abstract:
2D echocardiography is the most common imaging modality for cardiovascular diseases. The portability and relatively low-cost nature of Ultrasound (US) enable the US devices needed for performing echocardiography to be made widely available. However, acquiring and interpreting cardiac US images is operator dependent, limiting its use to only places where experts are present. Recently, Deep Learning (DL) has been used in 2D echocardiography for automated view classification, and structure and function assessment. Although these recent works show promise in developing computer-guided acquisition and automated interpretation of echocardiograms, most of these methods do not model and estimate uncertainty, which can be important when testing on data coming from a distribution further away from that of the training data. Uncertainty estimates can be beneficial both during the image acquisition phase (by providing real-time feedback to the operator on the quality of the acquired image) and during automated measurement and interpretation. The performance of uncertainty models and quantification metrics may depend on the prediction task and the models being compared. Hence, to gain insight into uncertainty modelling for left ventricular segmentation from US images, we compare three ensembling-based uncertainty models quantified using four different metrics (one newly proposed) on state-of-the-art baseline networks using two publicly available echocardiogram datasets. We further demonstrate how uncertainty estimation can be used to automatically reject poor quality images and improve state-of-the-art segmentation results.
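Illustrative note: the abstract does not name its three ensembling models or four metrics, so the sketch below uses one generic choice, the entropy of the ensemble-averaged foreground probability, to show how disagreement between ensemble members can be turned into an image-level score for rejecting poor-quality images. The function name and the example values are hypothetical.

```python
import math


def ensemble_uncertainty(prob_maps):
    """Mean foreground probability and mean predictive entropy of an ensemble.

    prob_maps: one flattened foreground-probability map per ensemble
    member (lists of floats of equal length).
    Returns (per-pixel mean probabilities, image-level mean entropy in nats).
    """
    n_models = len(prob_maps)
    n_pixels = len(prob_maps[0])
    mean_probs = [
        sum(pm[i] for pm in prob_maps) / n_models for i in range(n_pixels)
    ]

    def binary_entropy(p):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

    image_score = sum(binary_entropy(p) for p in mean_probs) / n_pixels
    return mean_probs, image_score


# Three members that agree versus three members that disagree.
agreeing = [[0.99, 0.01, 0.98]] * 3
disagreeing = [[0.9, 0.1, 0.9], [0.1, 0.9, 0.2], [0.5, 0.5, 0.6]]
_, low_score = ensemble_uncertainty(agreeing)
_, high_score = ensemble_uncertainty(disagreeing)
```

An image would be rejected when its score exceeds a threshold chosen on validation data; the threshold itself is not something the abstract specifies.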
Submitted 19 May, 2020;
originally announced May 2020.
-
Value-assigned pulse shape discrimination for neutron detectors
Authors:
F. C. E. Teh,
J. -W. Lee,
K. Zhu,
K. W. Brown,
Z. Chajecki,
W. G. Lynch,
M. B. Tsang,
A. Anthony,
J. Barney,
D. Dell'Aquila,
J. Estee,
B. Hong,
G. Jhang,
O. B. Khanal,
Y. J. Kim,
H. S. Lee,
J. W. Lee,
J. Manfredi,
S. H. Nam,
C. Y. Niu,
J. H. Park,
S. Sweany,
C. Y. Tsang,
R. Wang,
H. Wu
Abstract:
Using the waveforms from a digital electronic system, an offline analysis technique for pulse shape discrimination (PSD) has been developed to improve the neutron-gamma separation in a bar-shaped NE-213 scintillator that couples to a photomultiplier tube (PMT) at each end. The new improved method, called the "value-assigned PSD" (VPSD), assigns a normalized fitting residual to every waveform as the PSD value. This procedure then facilitates the incorporation of longitudinal position dependence of the scintillator, which further enhances the PSD capability of the detector system. In this paper, we use radiation emitted from an AmBe neutron source to demonstrate that the resulting neutron-gamma identification is much improved when compared to the traditional technique that uses the geometric mean of light outputs from both PMTs. The new method has also been modified and applied to a recent experiment at the National Superconducting Cyclotron Laboratory (NSCL) that uses an analog electronic system.
Submitted 17 June, 2021; v1 submitted 15 January, 2020;
originally announced January 2020.
-
Automatic Cobb Angle Detection using Vertebra Detector and Vertebra Corners Regression
Authors:
Bidur Khanal,
Lavsen Dahal,
Prashant Adhikari,
Bishesh Khanal
Abstract:
Correct evaluation and treatment of scoliosis require accurate estimation of spinal curvature. The current gold standard is to manually estimate Cobb angles in spinal X-ray images, which is time-consuming and has high inter-rater variability. We propose an automatic method with a novel framework that first detects vertebrae as objects, followed by a landmark detector that estimates the 4 landmark corners of each vertebra separately. Cobb angles are calculated using the slope of each vertebra obtained from the predicted landmarks. For inference on test data, we perform pre- and post-processing that includes cropping, outlier rejection, and smoothing of the predicted landmarks. The results were assessed in the AASCE MICCAI challenge 2019, showing promise with a SMAPE score of 25.69 on the challenge test set.
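Illustrative note: the step from four corner landmarks to a Cobb angle can be sketched as follows. The abstract only says that the angle comes from the slope of each vertebra, so the midline-based tilt, the corner ordering, and the use of the largest tilt difference as the Cobb angle are assumptions of this example rather than the authors' exact procedure.

```python
import math


def vertebra_tilt_deg(corners):
    """Tilt of one vertebra in degrees from its four corner landmarks.

    corners: (top_left, top_right, bottom_left, bottom_right), each an
    (x, y) pair in image coordinates. The tilt is the angle of the line
    joining the midpoint of the left corners to the midpoint of the
    right corners.
    """
    tl, tr, bl, br = corners
    left_mid = ((tl[0] + bl[0]) / 2.0, (tl[1] + bl[1]) / 2.0)
    right_mid = ((tr[0] + br[0]) / 2.0, (tr[1] + br[1]) / 2.0)
    return math.degrees(
        math.atan2(right_mid[1] - left_mid[1], right_mid[0] - left_mid[0])
    )


def cobb_angle_deg(vertebrae):
    """Largest difference in tilt between any two vertebrae."""
    tilts = [vertebra_tilt_deg(v) for v in vertebrae]
    return max(tilts) - min(tilts)


def make_vertebra(tilt_deg, width=10.0, height=5.0):
    """Hypothetical helper: a sheared box whose midline has the given tilt."""
    rise = width * math.tan(math.radians(tilt_deg))
    return ((0.0, 0.0), (width, rise), (0.0, height), (width, height + rise))


spine = [make_vertebra(a) for a in (0.0, 10.0, -15.0, 5.0)]
angle = cobb_angle_deg(spine)
```

Outlier rejection and smoothing of the landmarks, which the abstract mentions, would be applied before this computation.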
Submitted 30 October, 2019;
originally announced October 2019.
-
Confident Head Circumference Measurement from Ultrasound with Real-time Feedback for Sonographers
Authors:
Samuel Budd,
Matthew Sinclair,
Bishesh Khanal,
Jacqueline Matthew,
David Lloyd,
Alberto Gomez,
Nicolas Toussaint,
Emma Robinson,
Bernhard Kainz
Abstract:
Manual estimation of fetal Head Circumference (HC) from Ultrasound (US) is a key biometric for monitoring the healthy development of fetuses. Unfortunately, such measurements are subject to large inter-observer variability, resulting in low early-detection rates of fetal abnormalities. To address this issue, we propose a novel probabilistic Deep Learning approach for real-time automated estimation of fetal HC. This system feeds back statistics on measurement robustness to inform users how confident a deep neural network is in evaluating suitable views acquired during free-hand ultrasound examination. In real-time scenarios, this approach may be exploited to guide operators to scan planes that are as close as possible to the underlying distribution of training images, for the purpose of improving inter-operator consistency. We train on free-hand ultrasound data from over 2000 subjects (2848 training/540 test) and show that our method is able to predict HC measurements within 1.81 ± 1.65 mm deviation from the ground truth, with 50% of the test images fully contained within the predicted confidence margins, and an average of 1.82 ± 1.78 mm deviation from the margin for the remaining cases that are not fully contained.
Submitted 7 August, 2019;
originally announced August 2019.
-
Controlling Meshes via Curvature: Spin Transformations for Pose-Invariant Shape Processing
Authors:
Loic Le Folgoc,
Daniel C. Castro,
Jeremy Tan,
Bishesh Khanal,
Konstantinos Kamnitsas,
Ian Walker,
Amir Alansary,
Ben Glocker
Abstract:
We investigate discrete spin transformations, a geometric framework to manipulate surface meshes by controlling mean curvature. Applications include surface fairing -- flowing a mesh onto, say, a reference sphere -- and mesh extrusion -- e.g., rebuilding a complex shape from a reference sphere and curvature specification. Because they operate in curvature space, these operations can be conducted very stably across large deformations with no need for remeshing. Spin transformations add to the algorithmic toolbox for pose-invariant shape analysis. Mathematically speaking, mean curvature is a shape invariant and in general fully characterizes closed shapes (together with the metric). Computationally speaking, spin transformations make that relationship explicit. Our work expands on a discrete formulation of spin transformations. Like their smooth counterpart, discrete spin transformations are naturally close to conformal (angle-preserving). This quasi-conformality can nevertheless be relaxed to satisfy the desired trade-off between area distortion and angle preservation. We derive such constraints and propose a formulation in which they can be efficiently incorporated. The approach is showcased on subcortical structures.
Submitted 6 March, 2019;
originally announced March 2019.
-
FastReg: Fast Non-Rigid Registration via Accelerated Optimisation on the Manifold of Diffeomorphisms
Authors:
Daniel Grzech,
Loïc le Folgoc,
Mattias P. Heinrich,
Bishesh Khanal,
Jakub Moll,
Julia A. Schnabel,
Ben Glocker,
Bernhard Kainz
Abstract:
We present an implementation of a new approach to diffeomorphic non-rigid registration of medical images. The method is based on optical flow and warps images via gradient flow with the standard $L^2$ inner product. To compute the transformation, we rely on accelerated optimisation on the manifold of diffeomorphisms. We achieve regularity properties of Sobolev gradient flows, which are expensive to compute, owing to a novel method of averaging the gradients in time rather than space. We successfully register brain MRI and challenging abdominal CT scans at speeds orders of magnitude faster than previous approaches. We make our code available in a public repository: https://github.com/dgrzech/fastreg
Submitted 24 April, 2019; v1 submitted 5 March, 2019;
originally announced March 2019.
-
Weakly Supervised Localisation for Fetal Ultrasound Images
Authors:
Nicolas Toussaint,
Bishesh Khanal,
Matthew Sinclair,
Alberto Gomez,
Emily Skelton,
Jacqueline Matthew,
Julia A. Schnabel
Abstract:
This paper addresses the task of detecting and localising fetal anatomical regions in 2D ultrasound images, where only image-level labels are present at training, i.e. without any localisation or segmentation information. We examine the use of convolutional neural network architectures coupled with soft proposal layers. The resulting network simultaneously performs anatomical region detection (classification) and localisation tasks. We generate a proposal map describing the attention of the network for a particular class. The network is trained on 85,500 2D fetal ultrasound images and their associated labels. Labels correspond to six anatomical regions: head, spine, thorax, abdomen, limbs, and placenta. Detection achieves an average accuracy of 90% on individual regions, and we show that the proposal maps correlate well with relevant anatomical structures. This work presents itself as a powerful and essential step towards subsequent tasks such as fetal position and pose estimation, organ-specific segmentation, or image-guided navigation. Code and additional material are available at https://ntoussaint.github.io/fetalnav
Submitted 2 August, 2018;
originally announced August 2018.
-
EchoFusion: Tracking and Reconstruction of Objects in 4D Freehand Ultrasound Imaging without External Trackers
Authors:
Bishesh Khanal,
Alberto Gomez,
Nicolas Toussaint,
Steven McDonagh,
Veronika Zimmer,
Emily Skelton,
Jacqueline Matthew,
Daniel Grzech,
Robert Wright,
Chandni Gupta,
Benjamin Hou,
Daniel Rueckert,
Julia A. Schnabel,
Bernhard Kainz
Abstract:
Ultrasound (US) is the most widely used fetal imaging technique. However, US images have limited capture range, and suffer from view dependent artefacts such as acoustic shadows. Compounding of overlapping 3D US acquisitions into a high-resolution volume can extend the field of view and remove image artefacts, which is useful for retrospective analysis including population based studies. However, such volume reconstructions require information about relative transformations between probe positions from which the individual volumes were acquired. In prenatal US scans, the fetus can move independently from the mother, making external trackers such as electromagnetic or optical tracking unable to track the motion between probe position and the moving fetus. We provide a novel methodology for image-based tracking and volume reconstruction by combining recent advances in deep learning and simultaneous localisation and mapping (SLAM). Tracking semantics are established through the use of a Residual 3D U-Net and the output is fed to the SLAM algorithm. As a proof of concept, experiments are conducted on US volumes taken from a whole body fetal phantom, and from the heads of real fetuses. For the fetal head segmentation, we also introduce a novel weak annotation approach to minimise the required manual effort for ground truth annotation. We evaluate our method qualitatively, and quantitatively with respect to tissue discrimination accuracy and tracking robustness.
Submitted 19 July, 2018;
originally announced July 2018.
-
Standard Plane Detection in 3D Fetal Ultrasound Using an Iterative Transformation Network
Authors:
Yuanwei Li,
Bishesh Khanal,
Benjamin Hou,
Amir Alansary,
Juan J. Cerrolaza,
Matthew Sinclair,
Jacqueline Matthew,
Chandni Gupta,
Caroline Knight,
Bernhard Kainz,
Daniel Rueckert
Abstract:
Standard scan plane detection in fetal brain ultrasound (US) forms a crucial step in the assessment of fetal development. In clinical settings, this is done by manually manoeuvring a 2D probe to the desired scan plane. With the advent of 3D US, the entire fetal brain volume containing these standard planes can be easily acquired. However, manual standard plane identification in 3D volume is labour-intensive and requires expert knowledge of fetal anatomy. We propose a new Iterative Transformation Network (ITN) for the automatic detection of standard planes in 3D volumes. ITN uses a convolutional neural network to learn the relationship between a 2D plane image and the transformation parameters required to move that plane towards the location/orientation of the standard plane in the 3D volume. During inference, the current plane image is passed iteratively to the network until it converges to the standard plane location. We explore the effect of using different transformation representations as regression outputs of ITN. Under a multi-task learning framework, we introduce additional classification probability outputs to the network to act as confidence measures for the regressed transformation parameters in order to further improve the localisation accuracy. When evaluated on 72 US volumes of fetal brain, our method achieves an error of 3.83mm/12.7 degrees and 3.80mm/12.6 degrees for the transventricular and transcerebellar planes respectively and takes 0.46s per plane. Source code is publicly available at https://github.com/yuanwei1989/plane-detection.
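The iterative inference described in this abstract can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the authors' implementation: `mock_regressor` is a hypothetical stand-in for the trained CNN, `TARGET` is an invented plane position, and the sketch tracks a 3D position only, whereas ITN also regresses orientation.

```python
import math

# Hypothetical standard-plane position in mm (assumption, for illustration only).
TARGET = (12.0, -4.0, 7.5)


def mock_regressor(position):
    """Stand-in for the trained CNN: returns a damped step toward TARGET."""
    return tuple(0.5 * (t - p) for t, p in zip(TARGET, position))


def iterate_to_plane(start, regressor, tol=0.01, max_iters=50):
    """Apply predicted updates repeatedly until the update norm drops below tol."""
    position = start
    for step in range(1, max_iters + 1):
        update = regressor(position)
        position = tuple(p + u for p, u in zip(position, update))
        update_norm = math.sqrt(sum(u * u for u in update))
        if update_norm < tol:
            return position, step
    return position, max_iters


final_position, steps_taken = iterate_to_plane((0.0, 0.0, 0.0), mock_regressor)
print(final_position, steps_taken)
```

The stopping rule here (update norm below a tolerance) is one plausible convergence test; the abstract does not specify which criterion the authors use.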
Submitted 6 October, 2018; v1 submitted 19 June, 2018;
originally announced June 2018.
-
Fast Multiple Landmark Localisation Using a Patch-based Iterative Network
Authors:
Yuanwei Li,
Amir Alansary,
Juan J. Cerrolaza,
Bishesh Khanal,
Matthew Sinclair,
Jacqueline Matthew,
Chandni Gupta,
Caroline Knight,
Bernhard Kainz,
Daniel Rueckert
Abstract:
We propose a new Patch-based Iterative Network (PIN) for fast and accurate landmark localisation in 3D medical volumes. PIN utilises a Convolutional Neural Network (CNN) to learn the spatial relationship between an image patch and anatomical landmark positions. During inference, patches are repeatedly passed to the CNN until the estimated landmark position converges to the true landmark location. PIN is computationally efficient since the inference stage only selectively samples a small number of patches in an iterative fashion rather than a dense sampling at every location in the volume. Our approach adopts a multi-task learning framework that combines regression and classification to improve localisation accuracy. We extend PIN to localise multiple landmarks by using principal component analysis, which models the global anatomical relationships between landmarks. We have evaluated PIN using 72 3D ultrasound images from fetal screening examinations. PIN achieves quantitatively an average landmark localisation error of 5.59mm and a runtime of 0.44s to predict 10 landmarks per volume. Qualitatively, anatomical 2D standard scan planes derived from the predicted landmark locations are visually similar to the clinical ground truth. Source code is publicly available at https://github.com/yuanwei1989/landmark-detection.
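For readers unfamiliar with the metric, an average landmark localisation error of the kind reported in this abstract is typically the mean Euclidean distance between predicted and ground-truth landmark positions. The sketch below assumes that definition; the function name and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def mean_localisation_error(predicted, ground_truth):
    """Mean Euclidean distance between paired landmark coordinates."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need two non-empty landmark lists of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Invented coordinates in mm, for illustration only.
predicted = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
ground_truth = [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]
error_mm = mean_localisation_error(predicted, ground_truth)
print(error_mm)
```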
Submitted 6 October, 2018; v1 submitted 18 June, 2018;
originally announced June 2018.
-
Adapted and Oversegmenting Graphs: Application to Geometric Deep Learning
Authors:
Alberto Gomez,
Veronika A. Zimmer,
Bishesh Khanal,
Nicolas Toussaint,
Julia A. Schnabel
Abstract:
We propose a novel iterative method to adapt a graph to d-dimensional image data. The method drives the nodes of the graph towards image features. The adaptation process naturally lends itself to a measure of feature saliency which can then be used to retain meaningful nodes and edges in the graph. From the adapted graph, we also propose the computation of a dual graph, which inherits the saliency measure from the adapted graph, and whose edges run along image features, hence producing an oversegmenting graph. The proposed method is computationally efficient and fully parallelisable. We propose two distance measures to find image saliency along graph edges, and evaluate the performance on synthetic images and on natural images from publicly available databases. In both cases, the most salient nodes of the graph achieve average boundary recall over 90%. We also apply our method to image classification on the MNIST hand-written digit dataset, using a recently proposed Deep Geometric Learning architecture, achieving state-of-the-art classification accuracy, for a graph-based method, of 97.86%.
Submitted 5 September, 2019; v1 submitted 1 June, 2018;
originally announced June 2018.
-
Computing CNN Loss and Gradients for Pose Estimation with Riemannian Geometry
Authors:
Benjamin Hou,
Nina Miolane,
Bishesh Khanal,
Matthew C. H. Lee,
Amir Alansary,
Steven McDonagh,
Jo V. Hajnal,
Daniel Rueckert,
Ben Glocker,
Bernhard Kainz
Abstract:
Pose estimation, i.e. predicting a 3D rigid transformation with respect to a fixed co-ordinate frame in SE(3), is an omnipresent problem in medical image analysis with applications such as image rigid registration, anatomical standard plane detection, tracking and device/camera pose estimation. Deep learning methods often parameterise a pose with a representation that separates rotation and translation. As commonly available frameworks do not provide means to calculate loss on a manifold, regression is usually performed using the L2-norm independently on the rotation's and the translation's parameterisations, which is a metric for linear spaces that does not take into account the Lie group structure of SE(3). In this paper, we propose a general Riemannian formulation of the pose estimation problem. We propose to train the CNN directly on SE(3) equipped with a left-invariant Riemannian metric, coupling the prediction of the translation and rotation defining the pose. At each training step, the ground truth and predicted pose are elements of the manifold, where the loss is calculated as the Riemannian geodesic distance. We then compute the optimisation direction by back-propagating the gradient with respect to the predicted pose on the tangent space of the manifold SE(3) and update the network weights. We thoroughly evaluate the effectiveness of our loss function by comparing its performance with popular and most commonly used existing methods, on tasks such as image-based localisation and intensity-based 2D/3D registration. We also show that hyper-parameters, used in our loss function to weight the contribution between rotations and translations, can be intrinsically calculated from the dataset to achieve greater performance margins.
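To see why a linear-space L2 norm on rotation parameters can mislead, the sketch below compares it with the geodesic angle on SO(3). It is a simplified illustration and not the paper's loss: `pose_distance` combines the rotation geodesic with a Euclidean translation term weighted by a hypothetical `beta`, which is an assumption standing in for the left-invariant metric on SE(3) that the authors use.

```python
import math


def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_geodesic(r1, r2):
    """Geodesic angle in radians between two rotation matrices on SO(3)."""
    # trace(r1^T r2) equals the sum of elementwise products of r1 and r2.
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding noise
    return math.acos(cos_theta)


def pose_distance(pose_a, pose_b, beta=1.0):
    """Illustrative coupled distance: rotation geodesic plus weighted translation."""
    (r_a, t_a), (r_b, t_b) = pose_a, pose_b
    theta = rotation_geodesic(r_a, r_b)
    translation_gap = math.dist(t_a, t_b)
    return math.sqrt(theta ** 2 + beta * translation_gap ** 2)


# Two rotations that are close on the manifold but far apart as raw angles.
angle_a, angle_b = 3.0, -3.0
naive_gap = abs(angle_a - angle_b)                                # linear-space view
geodesic_gap = rotation_geodesic(rot_z(angle_a), rot_z(angle_b))  # manifold view
print(naive_gap, geodesic_gap)
```

The raw angle difference is 6 radians, while the geodesic angle is 2π − 6 radians, since the two rotations nearly coincide after wrapping around.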
Submitted 17 July, 2018; v1 submitted 2 May, 2018;
originally announced May 2018.
-
3D Reconstruction in Canonical Co-ordinate Space from Arbitrarily Oriented 2D Images
Authors:
Benjamin Hou,
Bishesh Khanal,
Amir Alansary,
Steven McDonagh,
Alice Davidson,
Mary Rutherford,
Jo V. Hajnal,
Daniel Rueckert,
Ben Glocker,
Bernhard Kainz
Abstract:
Limited capture range, and the requirement to provide high quality initialization for optimization-based 2D/3D image registration methods, can significantly degrade the performance of 3D image reconstruction and motion compensation pipelines. Challenging clinical imaging scenarios, which contain significant subject motion such as fetal in-utero imaging, complicate the 3D image and volume reconstruction process. In this paper we present a learning based image registration method capable of predicting 3D rigid transformations of arbitrarily oriented 2D image slices, with respect to a learned canonical atlas co-ordinate system. Only image slice intensity information is used to perform registration and canonical alignment; no spatial transform initialization is required. To find image transformations we utilize a Convolutional Neural Network (CNN) architecture to learn the regression function capable of mapping 2D image slices to a 3D canonical atlas space. We extensively evaluate the effectiveness of our approach quantitatively on simulated Magnetic Resonance Imaging (MRI), fetal brain imagery with synthetic motion and further demonstrate qualitative results on real fetal MRI data where our method is integrated into a full reconstruction and motion compensation pipeline. Our learning based registration achieves an average spatial prediction error of 7 mm on simulated data and produces qualitatively improved reconstructions for heavily moving fetuses with gestational ages of approximately 20 weeks. Our model provides a general and computationally efficient solution to the 2D/3D registration initialization problem and is suitable for real-time scenarios.
Submitted 23 January, 2018; v1 submitted 19 September, 2017;
originally announced September 2017.