Abstract
Image segmentation technology has had a remarkable impact on medical image analysis and processing, where it helps physicians reach more accurate diagnoses. Manual segmentation of medical images requires considerable effort from professionals and is also a subjective task. Developing an advanced segmentation method is therefore essential. We propose an end-to-end segmentation method for medical images that mimics how physicians delineate a region of interest (ROI) in a multi-step manner. This multi-step operation progressively refines the segmentation from a coarse result to a fine one. In this paper, the segmentation process is formulated as a Markov decision process and solved by a deep reinforcement learning (DRL) algorithm, which trains an agent to segment the ROI in images. The agent performs a series of actions to delineate the ROI. We define each action as a set of continuous parameters and adopt a DRL algorithm called deep deterministic policy gradient to learn the segmentation model in a continuous action space. The experimental results show that the proposed method improves on the state-of-the-art method by 7.24% on three prostate MR data sets and by 3.52% on one retinal fundus image data set.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant no. 61876148. This work was also supported in part by the Fundamental Research Funds for the Central Universities no. XJJ2018254, and China Postdoctoral Science Foundation no. 2018M631164.
Cite this article
Tian, Z., Si, X., Zheng, Y. et al. Multi-step medical image segmentation based on reinforcement learning. J Ambient Intell Human Comput 13, 5011–5022 (2022). https://doi.org/10.1007/s12652-020-01905-3
DOI: https://doi.org/10.1007/s12652-020-01905-3