
Resource and Mobility Management in Hybrid LiFi and WiFi Networks: A User-Centric Learning Approach

Abstract: Hybrid light fidelity (LiFi) and wireless fidelity (WiFi) networks (HLWNets) are an emerging indoor wireless communication paradigm, which combines the advantages of the capacious optical spectra of LiFi and ubiquitous coverage of WiFi. Meanwhile, load balancing (LB) becomes a key challenge in resource management for such hybrid networks. The existing LB methods are mostly network-centric, relying on a central unit to make a solution for the users all at once. Consequently, the solution needs to be updated for all users at the same pace, regardless of their moving status. This would affect the network performance in two aspects: i) when the update frequency is low, it would compromise the connectivity of fast-moving users; ii) when the update frequency is high, it would cause unnecessary handovers as well as hefty feedback costs for slow-moving users. Motivated by this, we investigate user-centric LB which allows users to update their solutions at different paces. The research is developed upon our previous work on adaptive target-condition neural network (ATCNN), which can conduct LB for individual users in quasi-static channels. In this paper, a deep neural network (DNN) model is designed to enable an adaptive update interval for each individual user. This new model is termed mobility-supporting neural network (MSNN). Associating MSNN with ATCNN, a user-centric LB framework named mobility-supporting ATCNN (MS-ATCNN) is proposed to handle resource management and mobility management simultaneously. Results show that at the same level of average update interval, MS-ATCNN can achieve a network throughput up to 215% higher than conventional LB methods such as game theory, especially for a larger number of users. In addition, MS-ATCNN has an ultra-low runtime on the order of hundreds of μs, which is two to three orders of magnitude lower than game theory.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 12 pages, 12 figures, 3 tables, submitted to IEEE TWC

arXiv:2402.19275 [pdf, other]

Adaptive Testing Environment Generation for Connected and Automated Vehicles with Dense Reinforcement Learning

Authors: Jingxuan Yang, Ruoxuan Bai, Haoyuan Ji, Yi Zhang, Jianming Hu, Shuo Feng

Abstract: The assessment of safety performance plays a pivotal role in the development and deployment of connected and automated vehicles (CAVs). A common approach involves designing testing scenarios based on prior knowledge of CAVs (e.g., surrogate models), conducting tests in these scenarios, and subsequently evaluating CAVs' safety performances. However, substantial differences between CAVs and the prior knowledge can significantly diminish the evaluation efficiency. In response to this issue, existing studies predominantly concentrate on the adaptive design of testing scenarios during the CAV testing process. Yet, these methods have limitations in their applicability to high-dimensional scenarios. To overcome this challenge, we develop an adaptive testing environment that bolsters evaluation robustness by incorporating multiple surrogate models and optimizing the combination coefficients of these surrogate models to enhance evaluation efficiency. We formulate the optimization problem as a regression task utilizing quadratic programming. To efficiently obtain the regression target via reinforcement learning, we propose the dense reinforcement learning method and devise a new adaptive policy with high sample efficiency. Essentially, our approach centers on learning the values of critical scenes displaying substantial surrogate-to-real gaps. The effectiveness of our method is validated in high-dimensional overtaking scenarios, demonstrating that our approach achieves notable evaluation efficiency.

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.09463 [pdf]

Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results

Authors: Kelly Payette, Céline Steger, Roxane Licandro, Priscille de Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolò McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, Jin Ye, Mireia Alenyà, Valentin Comte, Oscar Camara, et al. (42 additional authors not shown)

Abstract: Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard of fetal brain segmentation. However, FeTA 2021 was a single center study, and the generalizability of algorithms across different imaging centers remains unsolved, limiting real-world clinical applicability. The multi-center FeTA Challenge 2022 focuses on advancing the generalizability of fetal brain segmentation algorithms for magnetic resonance imaging (MRI). In FeTA 2022, the training dataset contained images and corresponding manually annotated multi-class labels from two imaging centers, and the testing data contained images from these two imaging centers as well as two additional unseen centers. The data from different centers varied in many aspects, including scanners used, imaging parameters, and fetal brain super-resolution algorithms applied. 16 teams participated in the challenge, and 17 algorithms were evaluated. Here, a detailed overview and analysis of the challenge results are provided, focusing on the generalizability of the submissions. Both in- and out of domain, the white matter and ventricles were segmented with the highest accuracy, while the most challenging structure remains the cerebral cortex due to anatomical complexity. The FeTA Challenge 2022 was able to successfully evaluate and advance generalizability of multi-class fetal brain tissue segmentation algorithms for MRI and it continues to benchmark new algorithms. The resulting new methods contribute to improving the analysis of brain development in utero.

Submitted 8 February, 2024; originally announced February 2024.

Comments: Results from FeTA Challenge 2022, held at MICCAI; Manuscript submitted. Supplementary Info (including submission methods descriptions) available here: https://zenodo.org/records/10628648

arXiv:2212.05715 [pdf, other]

Integrated optimization of train timetables rescheduling and response vehicles on a disrupted metro line

Authors: Hui Wang, Jialin Liu, Feng Li, Hao Ji, Bin Jia, Ziyou Gao

Abstract: When an unexpected metro disruption occurs, metro managers need to reschedule timetables to avoid trains going into the disruption area, and transport passengers stranded at disruption stations as quickly as possible. This paper proposes a two-stage optimization model to jointly make decisions for two tasks. In the first stage, the timetable rescheduling problem with cancellation and short-turning strategies is formulated as a mixed integer linear programming (MILP) problem. In particular, the instantaneous parameters and variables are used to describe the accumulation of time-varying passenger flow. In the second stage, a system-optimal dynamic traffic assignment (SODTA) model is employed to dynamically schedule response vehicles, which is able to capture the dynamic traffic and congestion. Numerical cases of Beijing Metro Line 9 verify the efficiency and effectiveness of our proposed model, and results show that: (1) when a disruption event occurs during peak hours, the impact on the normal timetable is greater, and passengers in the direction with fewer train services are more affected; (2) if passengers stranded at the terminal stations of the disruption area are not transported in time, their number will rapidly increase at a rate of more than 300 passengers per minute; (3) compared with the fixed shortest path, using the response vehicles reduces the total travel time by about 7%. However, it results in increased travel time for some passengers.

Submitted 12 December, 2022; originally announced December 2022.

Comments: 32 pages, 21 figures

arXiv:2209.09696 [pdf]

Synthesis of realistic fetal MRI with conditional Generative Adversarial Networks

Authors: Marina Fernandez Garcia, Rodrigo Gonzalez Laiz, Hui Ji, Kelly Payette, Andras Jakab

Abstract: Fetal brain magnetic resonance imaging serves as an emerging modality for prenatal counseling and diagnosis in disorders affecting the brain. Machine learning based segmentation plays an important role in the quantification of brain development. However, a limiting factor is the lack of sufficiently large, labeled training data. Our study explored the application of SPADE, a conditional generative adversarial network (cGAN), which learns the mapping from the label to the image space. The input to the network was super-resolution T2-weighted cerebral MRI data of 120 fetuses (gestational age range: 20-35 weeks, normal and pathological), which were annotated for 7 different tissue categories. SPADE networks were trained on 256×256 2D slices of the reconstructed volumes (image and label pairs) in each orthogonal orientation. To combine the generated volumes from each orientation into one image, a simple mean of the outputs of the three networks was taken. Based on the label maps only, we synthesized highly realistic images. However, some finer details, like small vessels, were not synthesized. A structural similarity index (SSIM) of 0.972 ± 0.016 and correlation coefficient of 0.974 ± 0.008 were achieved. To demonstrate the capacity of the cGAN to create new anatomical variants, we artificially dilated the ventricles in the segmentation map and created synthetic MRI of different degrees of fetal hydrocephalus. cGANs, such as the SPADE algorithm, allow the generation of hypothetically unseen scenarios and anatomical configurations in the label space, and these data can in turn be utilized for training various machine learning algorithms. In the future, this algorithm would be used for generating large, synthetic datasets representing fetal brain development. These datasets would potentially improve the performance of currently available segmentation networks.

Submitted 20 September, 2022; originally announced September 2022.

arXiv:2208.05035 [pdf, ps, other]

Adaptive Target-Condition Neural Network: DNN-Aided Load Balancing for Hybrid LiFi and WiFi Networks

Authors: Han Ji, Qiang Wang, Stephen J. Redmond, Iman Tavakkolnia, Xiping Wu

Abstract: Load balancing (LB) is a challenging issue in the hybrid light fidelity (LiFi) and wireless fidelity (WiFi) networks (HLWNets), due to the nature of heterogeneous access points (APs). Machine learning has the potential to provide a complexity-friendly LB solution with near-optimal network performance, at the cost of a training process. The state-of-the-art (SOTA) learning-aided LB methods, however, need retraining when the network environment (especially the number of users) changes, significantly limiting their practicability. In this paper, a novel deep neural network (DNN) structure named adaptive target-condition neural network (A-TCNN) is proposed, which conducts AP selection for one target user upon the condition of other users. Also, an adaptive mechanism is developed to map a smaller number of users to a larger number through splitting their data rate requirements, without affecting the AP selection result for the target user. This enables the proposed method to handle different numbers of users without the need for retraining. Results show that A-TCNN achieves a network throughput very close to that of the testing dataset, with a gap less than 3%. It is also proven that A-TCNN can obtain a network throughput comparable to two SOTA benchmarks, while reducing the runtime by up to three orders of magnitude.

Submitted 9 August, 2022; originally announced August 2022.

Comments: 13 pages, 9 figures, and 4 tables, submitted to IEEE JSAC SI-BeyondShannon

arXiv:2206.12489 [pdf, other]

Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models

Authors: Hang Ji, Tanvina Patel, Odette Scharenborg

Abstract: In this work, we analyzed and compared speech representations extracted from different frozen self-supervised learning (SSL) speech pre-trained models on their ability to capture articulatory features (AF) information and their subsequent prediction of phone recognition performance for within and across language scenarios. Specifically, we compared CPC, wav2vec 2.0, and HuBert. First, frame-level AF probing tasks were implemented. Subsequently, phone-level end-to-end ASR systems for phoneme recognition tasks were implemented, and the performance on the frame-level AF probing task and the phone accuracy were correlated. Compared to the conventional speech representation MFCC, all SSL pre-trained speech representations captured more AF information, and achieved better phoneme recognition performance within and across languages, with HuBert performing best. The frame-level AF probing task is a good predictor of phoneme recognition performance, showing the importance of capturing AF information in the speech representations. Compared with MFCC, in the within-language scenario, the performance of these SSL speech pre-trained models on AF probing tasks achieved a maximum relative increase of 34.4%, and it resulted in the lowest PER of 10.2%. In the cross-language scenario, the maximum relative increase of 26.7% also resulted in the lowest PER of 23.0%.

Submitted 24 June, 2022; originally announced June 2022.

Comments: Submitted to INTERSPEECH 2022

arXiv:2205.13294 [pdf, other]

Analytical Interpretation of Latent Codes in InfoGAN with SAR Images

Authors: Zhenpeng Feng, Milos Dakovic, Hongbing Ji, Mingzhe Zhu, Ljubisa Stankovic

Abstract: Generative Adversarial Networks (GANs) can synthesize abundant photo-realistic synthetic aperture radar (SAR) images. Some recent GANs (e.g., InfoGAN) are even able to edit specific properties of the synthesized images by introducing latent codes. This is crucial for SAR image synthesis, since the targets in real SAR images have different properties due to the imaging mechanism. Despite the success of InfoGAN in manipulating properties, a clear explanation of how these latent codes affect synthesized properties is still lacking; thus, editing specific properties usually relies on empirical trials, which are unreliable and time-consuming. In this paper, we show that latent codes are disentangled to affect the properties of SAR images in a non-linear manner. By introducing some property estimators for latent codes, we are able to provide a completely analytical nonlinear model to decompose the entangled causality between latent codes and different properties. The qualitative and quantitative experimental results further reveal that the properties can be calculated from latent codes and, inversely, that suitable latent codes can be estimated given desired properties. In this case, properties can be manipulated by latent codes as we expect.

Submitted 26 May, 2022; originally announced May 2022.

Comments: 13 pages, 14 figures

arXiv:2205.00463 [pdf, other]

doi 10.1088/1361-6420/ac8ac6

A Dataset-free Deep learning Method for Low-Dose CT Image Reconstruction

Authors: Qiaoqiao Ding, Hui Ji, Yuhui Quan, Xiaoqun Zhang

Abstract: Low-dose CT (LDCT) imaging has attracted considerable interest for the reduction of the object's exposure to X-ray radiation. In recent years, supervised deep learning (DL) has been extensively studied for LDCT image reconstruction, which trains a network over a dataset containing many pairs of normal-dose and low-dose images. However, the challenge of collecting many such pairs in the clinical setup limits the application of such supervised-learning-based methods for LDCT image reconstruction in practice. Aiming at addressing the challenges raised by the collection of training datasets, this paper proposes an unsupervised deep learning method for LDCT image reconstruction, which does not require any external training data. The proposed method is built on a re-parametrization technique for Bayesian inference via deep network with random weights, combined with additional total variation (TV) regularization. The experiments show that the proposed method noticeably outperforms existing dataset-free image reconstruction methods on the test data.

Submitted 5 October, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

arXiv:2204.09573 [pdf]

doi 10.1016/j.media.2023.102833

Fetal Brain Tissue Annotation and Segmentation Challenge Results

Authors: Kelly Payette, Hongwei Li, Priscille de Dumast, Roxane Licandro, Hui Ji, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Hao Liu, Yuchen Pei, Lisheng Wang, Ying Peng, Juanying Xie, Huiquan Zhang, Guiming Dong, Hao Fu, Guotai Wang, ZunHyan Rieu, Donghyeon Kim, Hyun Gi Kim, Davood Karimi, Ali Gholipour, Helena R. Torres, Bruno Oliveira, João L. Vilaça, et al. (33 additional authors not shown)

Abstract: In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 in order to encourage the development of automatic segmentation algorithms on an international level. The challenge utilized FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven different tissues (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, brainstem, deep grey matter). 20 international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability present in the network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine tuning done during training, and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm performed significantly better than the other submissions, and consisted of an asymmetrical U-Net network architecture. This paper provides a first of its kind benchmark for future automatic multi-tissue segmentation algorithms for the developing human brain in utero.

Submitted 20 April, 2022; originally announced April 2022.

Comments: Results from FeTA Challenge 2021, held at MICCAI; Manuscript submitted

arXiv:2203.00628 [pdf]

A Neural Ordinary Differential Equation Model for Visualizing Deep Neural Network Behaviors in Multi-Parametric MRI based Glioma Segmentation

Authors: Zhenyu Yang, Zongsheng Hu, Hangjie Ji, Kyle Lafata, Scott Floyd, Fang-Fang Yin, Chunhao Wang

Abstract: Purpose: To develop a neural ordinary differential equation (ODE) model for visualizing deep neural network (DNN) behavior during multi-parametric MRI (mp-MRI) based glioma segmentation as a method to enhance deep learning explainability. Methods: By hypothesizing that deep feature extraction can be modeled as a spatiotemporally continuous process, we designed a novel deep learning model, neural ODE, in which deep feature extraction was governed by an ODE without explicit expression. The dynamics of 1) MR images after interactions with DNN and 2) segmentation formation can be visualized after solving ODE. An accumulative contribution curve (ACC) was designed to quantitatively evaluate the utilization of each MRI by DNN towards the final segmentation results. The proposed neural ODE model was demonstrated using 369 glioma patients with a 4-modality mp-MRI protocol: T1, contrast-enhanced T1 (T1-Ce), T2, and FLAIR. Three neural ODE models were trained to segment enhancing tumor (ET), tumor core (TC), and whole tumor (WT). The key MR modalities with significant utilization by DNN were identified based on ACC analysis. Segmentation results by DNN using only the key MR modalities were compared to the ones using all 4 MR modalities. Results: All neural ODE models successfully illustrated image dynamics as expected. ACC analysis identified T1-Ce as the only key modality in ET and TC segmentations, while both FLAIR and T2 were key modalities in WT segmentation. Compared to the U-Net results using all 4 MR modalities, Dice coefficient of ET (0.784->0.775), TC (0.760->0.758), and WT (0.841->0.837) using the key modalities only had minimal differences without significance. Conclusion: The neural ODE model offers a new tool for optimizing the deep learning model inputs with enhanced explainability. The presented methodology can be generalized to other medical image-related deep learning applications.

Submitted 23 March, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

Comments: 30 pages, 7 figures, 2 tables

arXiv:2202.03433 [pdf, other]

A Coarse-to-fine Morphological Approach With Knowledge-based Rules and Self-adapting Correction for Lung Nodules Segmentation

Authors: Xinliang Fu, Jiayin Zheng, Juanyun Mai, Yanbo Shao, Minghao Wang, Linyu Li, Zhaoqi Diao, Yulong Chen, Jianyu Xiao, Jian You, Airu Yin, Yang Yang, Xiangcheng Qiu, Jinsheng Tao, Bo Wang, Hua Ji

Abstract: The segmentation module which precisely outlines the nodules is a crucial step in a computer-aided diagnosis (CAD) system. The most challenging part of such a module is how to achieve high accuracy of the segmentation, especially for the juxtapleural, non-solid and small nodules. In this research, we present a coarse-to-fine methodology that greatly improves the thresholding method performance with a novel self-adapting correction algorithm and effectively removes noisy pixels with well-defined knowledge-based principles. Compared with recent strong morphological baselines, our algorithm, by combining dataset features, achieves state-of-the-art performance on both the public LIDC-IDRI dataset (DSC 0.699) and our private LC015 dataset (DSC 0.760) which closely approaches the SOTA deep learning-based models' performances. Furthermore, unlike most available morphological methods that can only segment the isolated and well-circumscribed nodules accurately, the precision of our method is totally independent of the nodule type or diameter, proving its applicability and generality.

Submitted 7 February, 2022; originally announced February 2022.

arXiv:2201.13392

MHSnet: Multi-head and Spatial Attention Network with False-Positive Reduction for Pulmonary Nodules Detection

Authors: Juanyun Mai, Minghao Wang, Jiayin Zheng, Yanbo Shao, Zhaoqi Diao, Xinliang Fu, Yulong Chen, Jianyu Xiao, Jian You, Airu Yin, Yang Yang, Xiangcheng Qiu, Jinsheng Tao, Bo Wang, Hua Ji

Abstract: The mortality of lung cancer has ranked high among cancers for many years. Early detection of lung cancer is critical for disease prevention, cure, and mortality rate reduction. However, existing detection methods on pulmonary nodules introduce an excessive number of false positive proposals in order to achieve high sensitivity, which is not practical in clinical situations. In this paper, we propose the multi-head detection and spatial squeeze-and-attention network, MHSnet, to detect pulmonary nodules, in order to aid doctors in the early diagnosis of lung cancers. Specifically, we first introduce multi-head detectors and skip connections to customize for the variety of nodules in sizes, shapes and types and capture multi-scale features. Then, we implement a spatial attention module to enable the network to focus on different regions differently, inspired by how experienced clinicians screen CT images, which results in fewer false positive proposals. Lastly, we present a lightweight but effective false positive reduction module with the Linear Regression model to cut down the number of false positive proposals, without any constraints on the front network. Extensive experimental results compared with the state-of-the-art models have shown the superiority of the MHSnet in terms of the average FROC, sensitivity and especially false discovery rate (2.98% and 2.18% improvement in terms of average FROC and sensitivity, 5.62% and 28.33% decrease in terms of false discovery rate and average candidates per scan). The false positive reduction module significantly decreases the average number of candidates generated per scan by 68.11% and the false discovery rate by 13.48%, which is promising to reduce distracted proposals for the downstream tasks based on the detection results.

Submitted 12 May, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

Comments: We have to revise the experiment results and conclusions

arXiv:2111.08482 [pdf, other]

Distributed Optimal Output Consensus of Uncertain Nonlinear Multi-Agent Systems over Unbalanced Directed Networks via Output Feedback

Authors: Jin Zhang, Lu Liu, Xinghu Wang, Haibo Ji

Abstract: In this note, a novel observer-based output feedback control approach is proposed to address the distributed optimal output consensus problem of uncertain nonlinear multi-agent systems in the normal form over unbalanced directed graphs. The main challenges of the concerned problem lie in unbalanced directed graphs and nonlinearities of multi-agent systems with their agent states not available for feedback control. Based on a two-layer controller structure, a distributed optimal coordinator is first designed to convert the considered problem into a reference-tracking problem. Then a decentralized output feedback controller is developed to stabilize the resulting augmented system. A high-gain observer is exploited in controller design to estimate the agent states in the presence of uncertainties and disturbances so that the proposed controller relies only on agent outputs. The semi-global convergence of the agent outputs toward the optimal solution that minimizes the sum of all local cost functions is proved under standard assumptions. A key feature of the obtained results is that the nonlinear agents under consideration are only required to be locally Lipschitz and possess globally asymptotically stable and locally exponentially stable zero dynamics.

Submitted 16 November, 2021; originally announced November 2021.

Comments: 8 pages, 2 figures. arXiv admin note: text overlap with arXiv:2107.04056

arXiv:2109.04202 [pdf, other]

IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System

Authors: Daniel Campos, Heng Ji

Abstract: Like many scientific fields, new chemistry literature has grown at a staggering pace, with thousands of papers released every month. A large portion of chemistry literature focuses on new molecules and reactions between molecules. Most vital information is conveyed through 2-D images of molecules, representing the underlying molecules or reactions described. In order to ensure reproducible and machine-readable molecule representations, text-based molecule descriptors like SMILES and SELFIES were created. These text-based molecule representations provide molecule generation but are unfortunately rarely present in published literature. In the absence of molecule descriptors, the generation of molecule descriptors from the 2-D images present in the literature is necessary to understand chemistry literature at scale. Successful methods such as Optical Structure Recognition Application (OSRA) and ChemSchematicResolver are able to extract the locations of molecule structures in chemistry papers and infer molecular descriptions and reactions. While effective, existing systems expect chemists to correct outputs, making them unsuitable for unsupervised large-scale data mining. Leveraging the task formulation of image captioning introduced by DECIMER, we introduce IMG2SMI, a model which leverages Deep Residual Networks for image feature extraction and encoder-decoder Transformer layers for molecule description generation.
Unlike previous Neural Network-based systems, IMG2SMI builds around the task of molecule description generation, which enables IMG2SMI to outperform OSRA-based systems by 163% in molecule similarity prediction as measured by the molecular MACCS Fingerprint Tanimoto Similarity. Additionally, to facilitate further research on this task, we release a new molecule prediction dataset including 81 million molecules for molecule description generation.

Submitted 3 September, 2021; originally announced September 2021.

arXiv:2107.04056 [pdf, other]

doi 10.1002/rnc.6059

Optimal Output Consensus of Second-Order Uncertain Nonlinear Systems on Weight-Unbalanced Directed Networks

Authors: Jin Zhang, Lu Liu, Haibo Ji

Abstract: This paper investigates the distributed optimal output consensus problem of second-order uncertain nonlinear multi-agent systems over weight-unbalanced directed networks. Under the standard assumption that local cost functions are strongly convex with globally Lipschitz gradients, a novel distributed dynamic state feedback controller is developed such that the outputs of all the agents reach the optimal solution to minimize the global cost function which is the sum of all the local cost functions. The controller design is based on a two-layer strategy, where a distributed optimal coordinator and a reference-tracking controller are proposed to address the challenges arising from unbalanced directed networks and uncertain nonlinear functions respectively. A key feature of the proposed controller is that the nonlinear functions containing the uncertainties and disturbances are not required to be globally Lipschitz. Furthermore, by exploiting adaptive control technique, no prior knowledge of the uncertainties or disturbances is required either. Two simulation examples are finally provided to illustrate the effectiveness of the proposed control scheme.

Submitted 8 July, 2021; originally announced July 2021.

Comments: 13 pages, 5 figures

Journal ref: Int J Robust Nonlinear Control. 2022;1-21

arXiv:2105.10650 [pdf]

Post-Radiotherapy PET Image Outcome Prediction by Deep Learning under Biological Model Guidance: A Feasibility Study of Oropharyngeal Cancer Application

Authors: Hangjie Ji, Kyle Lafata, Yvonne Mowery, David Brizel, Andrea L. Bertozzi, Fang-Fang Yin, Chunhao Wang

Abstract: This paper develops a method of biologically guided deep learning for post-radiation FDG-PET image outcome prediction based on pre-radiation images and radiotherapy dose information. Based on the classic reaction-diffusion mechanism, a novel biological model was proposed using a partial differential equation that incorporates spatial radiation dose distribution as a patient-specific treatment information variable. A 7-layer encoder-decoder-based convolutional neural network (CNN) was designed and trained to learn the proposed biological model. As such, the model could generate post-radiation FDG-PET image outcome predictions with possible time-series transition from pre-radiotherapy image states to post-radiotherapy states. The proposed method was developed using 64 oropharyngeal patients with paired FDG-PET studies before and after 20Gy delivery (2Gy/daily fraction) by IMRT. In a two-branch deep learning execution, the proposed CNN learns specific terms in the biological model from paired FDG-PET images and spatial dose distribution in one branch, and the biological model generates post-20Gy FDG-PET image prediction in the other branch. The proposed method successfully generated post-20Gy FDG-PET image outcome prediction with breakdown illustrations of biological model components. Time-series FDG-PET image predictions were generated to demonstrate the feasibility of disease response rendering. The developed biologically guided deep learning method achieved post-20Gy FDG-PET image outcome predictions in good agreement with ground-truth results.
With breakdown biological modeling components, the outcome image predictions could be used in adaptive radiotherapy decision-making to optimize personalized plans for the best outcome in the future.

Submitted 22 May, 2021; originally announced May 2021.

Comments: 26 pages, 5 figures

arXiv:2010.15526 [pdf]

doi 10.1038/s41597-021-00946-3

An automatic multi-tissue human fetal brain segmentation benchmark using the Fetal Tissue Annotation Dataset

Authors: Kelly Payette, Priscille de Dumast, Hamza Kebiri, Ivan Ezhov, Johannes C. Paetzold, Suprosanna Shit, Asim Iqbal, Romesa Khan, Raimund Kottke, Patrice Grehten, Hui Ji, Levente Lanczi, Marianna Nagy, Monika Beresova, Thi Dao Nguyen, Giancarlo Natalucci, Theofanis Karayannis, Bjoern Menze, Meritxell Bach Cuadra, Andras Jakab

Abstract: It is critical to quantitatively analyse the developing human fetal brain in order to fully understand neurodevelopment in both normal fetuses and those with congenital disorders. To facilitate this analysis, automatic multi-tissue fetal brain segmentation algorithms are needed, which in turn requires open databases of segmented fetal brains. Here we introduce a publicly available database of 50 manually segmented pathological and non-pathological fetal magnetic resonance brain volume reconstructions across a range of gestational ages (20 to 33 weeks) into 7 different tissue categories (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, deep grey matter, brainstem/spinal cord). In addition, we quantitatively evaluate the accuracy of several automatic multi-tissue segmentation algorithms of the developing human fetal brain. Four research groups participated, submitting a total of 10 algorithms, demonstrating the benefits of the database for the development of automatic algorithms.

Submitted 7 July, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

Comments: This is a preprint of an article published in Nature Scientific Data. The final authenticated version is available online at: https://doi.org/10.1038/s41597-021-00946-3

Journal ref: Sci Data 8, 167 (2021)

arXiv:2008.04656 [pdf, ps, other]

AHP-Net: adaptive-hyper-parameter deep learning based image reconstruction method for multilevel low-dose CT

Authors: Qiaoqiao Ding, Yuesong Nan, Hao Gao, Hui Ji

Abstract: Low-dose CT (LDCT) imaging is desirable in many clinical applications to reduce X-ray radiation dose to patients. Inspired by deep learning (DL), a recent promising direction of model-based iterative reconstruction (MBIR) methods for LDCT is via optimization-unrolling DL-regularized image reconstruction, where pre-defined image prior is replaced by learnable data-adaptive prior. However, LDCT is clinically multilevel, since clinical scans have different noise levels that depend on scanning site, patient size, and clinical task. Therefore, this work aims to develop an adaptive-hyper-parameter DL-based image reconstruction method (AHP-Net) that can handle multilevel LDCT of different noise levels. AHP-Net unrolls a half-quadratic splitting scheme with learnable image prior built on framelet filter bank, and learns a network that automatically adjusts the hyper-parameters for various noise levels. As a result, AHP-Net provides a single universal training model that can handle multilevel LDCT. Extensive experimental evaluations using clinical scans suggest that AHP-Net outperformed conventional MBIR techniques and state-of-the-art deep-learning-based methods for multilevel LDCT of different noise levels.

Submitted 17 February, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

Comments: 7 figures, 5 tables

arXiv:2007.02018 [pdf, other]

Deep Bilateral Retinex for Low-Light Image Enhancement

Authors: Jinxiu Liang, Yong Xu, Yuhui Quan, Jingwen Wang, Haibin Ling, Hui Ji

Abstract: Low-light images, i.e. the images captured in low-light conditions, suffer from very poor visibility caused by low contrast, color distortion and significant measurement noise. Low-light image enhancement is about improving the visibility of low-light images. As the measurement noise in low-light images is usually significant yet complex with spatially-varying characteristics, how to handle the noise effectively is an important yet challenging problem in low-light image enhancement. Based on the Retinex decomposition of natural images, this paper proposes a deep learning method for low-light image enhancement with a particular focus on handling the measurement noise. The basic idea is to train a neural network to generate a set of pixel-wise operators for simultaneously predicting the noise and the illumination layer, where the operators are defined in the bilateral space. Such an integrated approach allows us to have an accurate prediction of the reflectance layer in the presence of significant spatially-varying measurement noise. Extensive experiments on several benchmark datasets have shown that the proposed method is very competitive with the state-of-the-art methods, and has a significant advantage over others when processing images captured in extremely low lighting conditions.

Submitted 4 July, 2020; originally announced July 2020.

Comments: 15 pages

arXiv:2005.06832 [pdf, ps, other]

doi 10.1016/j.automatica.2020.109298

Detection of Intermittent Faults Based on an Optimally Weighted Moving Average T^2 Control Chart with Stationary Observations

Authors: Yinghong Zhao, Xiao He, Junfeng Zhang, Hongquan Ji, Donghua Zhou, Michael G. Pecht

Abstract: The moving average (MA)-type scheme, also known as the smoothing method, has been well established within the multivariate statistical process monitoring (MSPM) framework since the 1990s. However, its theoretical basis is still limited to smoothing independent data, and the optimality of its equally or exponentially weighted scheme remains unproven. This paper aims to weaken the independence assumption in the existing MA method, and then extend it to a broader area of dealing with autocorrelated weakly stationary processes. With the discovery of the non-optimality of the equally and exponentially weighted schemes used for fault detection when data have autocorrelation, the essence that they do not effectively utilize the correlation information of samples is revealed, giving birth to an optimally weighted moving average (OWMA) theory. The OWMA method is combined with the Hotelling's $T^2$ statistic to form an OWMA $T^2$ control chart (OWMA-TCC), in order to detect a more challenging type of fault, i.e., intermittent fault (IF). Different from the MA scheme that puts an equal weight on samples within a time window, OWMA-TCC uses correlation (autocorrelation and cross-correlation) information to find an optimal weight vector (OWV) for the purpose of IF detection (IFD). In order to achieve the best IFD performance, the concept of IF detectability is defined and corresponding detectability conditions are provided, which further serve as selection criteria of the OWV.
Then, the OWV is given in the form of a solution to nonlinear equations, whose existence is proven with the aid of the Brouwer fixed-point theory. Moreover, symmetrical structure of the OWV is revealed, and the optimality of the MA scheme for any IF directions when data exhibit no autocorrelation is proven.

Submitted 9 November, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

Comments: Automatica, Published online, 2020

arXiv:2001.03366 [pdf, other]

Sparse Vector Transmission: An Idea Whose Time Has Come

Authors: Wonjun Kim, Hyoungju Ji, Hyojin Lee, Younsun Kim, Juho Lee, Byonghyo Shim

Abstract: In recent years, we are witnessing a bewildering variety of automated services and applications of vehicles, robots, sensors, and machines powered by the artificial intelligence technologies. Communication mechanism associated with these services is clearly distinct from human-centric communications. One important feature for the machine-centric communications is that the amount of information to be transmitted is tiny. In view of the short packet transmission, relying on today's transmission mechanism would not be efficient due to the waste of resources, large decoding latency, and expensive operational cost. In this article, we present an overview of the sparse vector transmission (SVT), a scheme to transmit short-sized information after the sparse transformation. We discuss basics of SVT, two distinct SVT strategies, viz., frequency-domain sparse transmission and sparse vector coding with detailed operations, and also demonstrate the effectiveness in realistic wireless environments.

Submitted 10 January, 2020; originally announced January 2020.

Comments: submitted to IEEE Vehicular Technology Magazine (VTM)

arXiv:1912.07648 [pdf, other]

Rethinking Medical Image Reconstruction via Shape Prior, Going Deeper and Faster: Deep Joint Indirect Registration and Reconstruction

Authors: Jiulong Liu, Angelica I. Aviles-Rivero, Hui Ji, Carola-Bibiane Schönlieb

Abstract: Indirect image registration is a promising technique to improve image reconstruction quality by providing a shape prior for the reconstruction task. In this paper, we propose a novel hybrid method that seeks to reconstruct high quality images from few measurements whilst requiring low computational cost. With this purpose, our framework intertwines indirect registration and reconstruction tasks in a single functional. It is based on two major novelties. Firstly, we introduce a model based on deep nets to solve the indirect registration problem, in which the inversion and registration mappings are recurrently connected through a fixed-point interaction based sparse optimisation. Secondly, we introduce specific inversion blocks, that use the explicit physical forward operator, to map the acquired measurements to the image reconstruction. We also introduce registration blocks based on deep nets to predict the registration parameters and warp transformation accurately and efficiently. We demonstrate, through extensive numerical and visual experiments, that our framework significantly outperforms classic reconstruction schemes and other bi-task methods in terms of both image quality and computational time. Finally, we show generalisation capabilities of our approach by demonstrating its performance on fast Magnetic Resonance Imaging (MRI), sparse view computed tomography (CT) and low dose CT with measurements much below the Nyquist limit.

Submitted 16 December, 2019; originally announced December 2019.

arXiv:1910.03255 [pdf, other]

Channel aware sparse transmission for ultra low-latency communications in TDD systems

Authors: Wonjun Kim, Hyoungju Ji, Byonghyo Shim

Abstract: A major goal of ultra reliable and low latency communication (URLLC) is to reduce the latency down to a millisecond (ms) level while ensuring reliability of the transmission. Since the current uplink transmission scheme requires a complicated handshaking procedure to initiate the transmission, meeting this stringent latency requirement is a challenge in wireless system design. In particular, in the time division duplexing (TDD) systems, supporting the URLLC is difficult since the mobile device has to wait until the transmit direction is switched to the uplink. In this paper, we propose a new approach to support a low latency access in TDD systems, called channel aware sparse transmission (CAST). The key idea of the proposed scheme is to encode a grant signal in the form of a sparse vector. This together with the fact that the sensing mechanism preserves the energy of the sparse vector allows us to use the compressed sensing (CS) technique in CAST decoding. From the performance analysis and numerical evaluations, we demonstrate that the proposed CAST scheme achieves a significant reduction in access latency over the 4G LTE-TDD and 5G NR-TDD systems.

Submitted 8 October, 2019; originally announced October 2019.

arXiv:1909.04779 [pdf, other]

Localized Adversarial Training for Increased Accuracy and Robustness in Image Classification

Authors: Eitan Rothberg, Tingting Chen, Luo Jie, Hao Ji

Abstract: Today's state-of-the-art image classifiers fail to correctly classify carefully manipulated adversarial images. In this work, we develop a new, localized adversarial attack that generates adversarial examples by imperceptibly altering the backgrounds of normal images. We first use this attack to highlight the unnecessary sensitivity of neural networks to changes in the background of an image, then use it as part of a new training technique: localized adversarial training. By including locally adversarial images in the training set, we are able to create a classifier that suffers less loss than a non-adversarially trained counterpart model on both natural and adversarial inputs. The evaluation of our localized adversarial training algorithm on MNIST and CIFAR-10 datasets shows decreased accuracy loss on natural images, and increased robustness against adversarial inputs.

Submitted 10 September, 2019; originally announced September 2019.

Comments: 4 pages (excluding references). Presented at AdvML: 1st Workshop on Adversarial Learning Methods for Machine Learning and Data Mining at KDD '19

arXiv:1812.11292 [pdf, other]

Adaptive Short-time Fourier Transform and Synchrosqueezing Transform for Non-stationary Signal Separation

Authors: Lin Li, Haiyan Cai, Hongxia Han, Qingtang Jiang, Hongbing Ji

Abstract: The synchrosqueezing transform, a kind of reassignment method, aims to sharpen the time-frequency representation and to separate the components of a multicomponent non-stationary signal. In this paper, we consider the short-time Fourier transform (STFT) with a time-varying parameter, called the adaptive STFT. Based on the local approximation of linear frequency modulation mode, we analyze the well-separated condition of non-stationary multicomponent signals using the adaptive STFT with the Gaussian window function. We propose the STFT-based synchrosqueezing transform (FSST) with a time-varying parameter, named the adaptive FSST, to enhance the time-frequency concentration and resolution of a multicomponent signal, and to separate its components more accurately. In addition, we also propose the 2nd-order adaptive FSST to further improve the adaptive FSST for the non-stationary signals with fast-varying frequencies. Furthermore, we present a localized optimization algorithm based on our well-separated condition to estimate the time-varying parameter adaptively and automatically. Simulation results on synthetic signals and the bat echolocation signal are provided to demonstrate the effectiveness and robustness of the proposed method.

Submitted 26 September, 2019; v1 submitted 29 December, 2018; originally announced December 2018.

Showing 1–26 of 26 results for author: Ji, H