Transform- and multi-domain deep learning for single-frame rapid autofocusing in whole slide imaging

S Jiang, J Liao, Z Bian, K Guo, Y Zhang, G Zheng
Biomedical Optics Express, 2018
A whole slide imaging (WSI) system has recently been approved for primary diagnostic use in the US. The image quality and system throughput of WSI are largely determined by the autofocusing process. Traditional approaches acquire multiple images along the optical axis and maximize a figure of merit for autofocusing. Here we explore the use of deep convolutional neural networks (CNNs) to predict the focal position of the acquired image without axial scanning. We investigate the autofocusing performance with three illumination settings: incoherent Köhler illumination, partially coherent illumination with two plane waves, and one-plane-wave illumination. We acquire ~130,000 images with different defocus distances as the training data set. Different defocus distances lead to different spatial features of the captured images. However, relying solely on this spatial information yields relatively poor autofocusing performance; it is better to extract defocus features from transform domains of the acquired image. For incoherent illumination, the Fourier cutoff frequency is directly related to the defocus distance. Similarly, for two-plane-wave illumination, the positions of the autocorrelation peaks are directly related to the defocus distance. In our implementation, we use the spatial image, the Fourier spectrum, the autocorrelation of the spatial image, and combinations thereof as inputs for the CNNs. We show that the information from the transform domains can improve the performance and robustness of the autofocusing process. The resulting focusing error is ~0.5 µm, which is within the 0.8-µm depth-of-field range. The reported approach requires little hardware modification for conventional WSI systems, and images can be captured on the fly without focus-map surveying. It may find applications in WSI and time-lapse microscopy. The transform- and multi-domain approaches may also provide new insights for developing microscopy-related deep-learning networks. We have made our training and testing data set (~12 GB) open source for the broad research community.
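The abstract describes feeding a CNN with the spatial image plus two transform-domain channels: the Fourier spectrum (whose cutoff frequency tracks defocus under incoherent illumination) and the autocorrelation (whose peak positions track defocus under two-plane-wave illumination). Below is a minimal sketch, not the authors' released code, of how such multi-domain inputs could be computed with NumPy for a single grayscale frame; the function name and normalization choices are illustrative assumptions.

```python
import numpy as np

def multi_domain_inputs(img):
    """Stack spatial, Fourier-magnitude, and autocorrelation channels
    for a CNN that regresses the defocus distance (illustrative sketch)."""
    img = img.astype(np.float64)
    img = (img - img.mean()) / (img.std() + 1e-8)  # zero-mean, unit-variance

    # Fourier domain: under incoherent illumination the effective cutoff
    # frequency of the spectrum shifts with defocus, so the log-magnitude
    # spectrum carries defocus information.
    F = np.fft.fftshift(np.fft.fft2(img))
    spectrum = np.log1p(np.abs(F))

    # Autocorrelation via the Wiener-Khinchin theorem (inverse FFT of the
    # power spectrum): under two-plane-wave illumination the separation of
    # the autocorrelation peaks scales with the defocus distance.
    autocorr = np.fft.fftshift(np.real(np.fft.ifft2(np.abs(np.fft.fft2(img)) ** 2)))
    autocorr /= autocorr.max() + 1e-8

    # Channels-first stack, ready for a CNN input tensor.
    return np.stack([img, spectrum, autocorr], axis=0)
```

A standard CNN regressor taking this three-channel stack (or any subset of the channels, matching the single- and multi-domain configurations compared in the paper) would then be trained to output the defocus distance in micrometers.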