Search | arXiv e-print repository

PitVis-2023 Challenge: Workflow Recognition in videos of Endoscopic Pituitary Surgery

Authors: Adrito Das, Danyal Z. Khan, Dimitrios Psychogyios, Yitong Zhang, John G. Hanrahan, Francisco Vasconcelos, You Pang, Zhen Chen, Jinlin Wu, Xiaoyang Zou, Guoyan Zheng, Abdul Qayyum, Moona Mazher, Imran Razzak, Tianbin Li, Jin Ye, Junjun He, Szymon Płotka, Joanna Kaleta, Amine Yamlahi, Antoine Jund, Patrick Godau, Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa, et al. (7 additional authors not shown)

Abstract: The field of computer vision applied to videos of minimally invasive surgery is ever-growing. Workflow recognition pertains to the automated recognition of various aspects of a surgery, including which surgical steps are performed and which surgical instruments are used. This information can later be used to assist clinicians when learning the surgery, during live surgery, and when writing operation notes. The Pituitary Vision (PitVis) 2023 Challenge tasks the community with step and instrument recognition in videos of endoscopic pituitary surgery. This is a unique task when compared to other minimally invasive surgeries due to the smaller working space, which limits and distorts vision, and the higher frequency of instrument and step switching, which requires more precise model predictions. Participants were provided with 25 videos, with results presented at the MICCAI-2023 conference as part of the Endoscopic Vision 2023 Challenge in Vancouver, Canada, on 08-Oct-2023. There were 18 submissions from 9 teams across 6 countries, using a variety of deep learning models. A commonality between the top performing models was incorporating spatio-temporal and multi-task methods, with greater than 50% and 10% macro-F1-score improvement over purely spatial single-task models in step and instrument recognition respectively. The PitVis-2023 Challenge therefore demonstrates that state-of-the-art computer vision models in minimally invasive surgery are transferable to a new dataset, with surgery-specific techniques used to enhance performance, progressing the field further. Benchmark results are provided in the paper, and the dataset is publicly available at: https://doi.org/10.5522/04/26531686.

Submitted 2 September, 2024; originally announced September 2024.
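
The macro-F1-score quoted in the PitVis-2023 abstract averages the per-class F1 without weighting by class frequency, so rare steps or instruments count as much as common ones. The following is a minimal sketch only, assuming one integer class label per frame; the function name `macro_f1` and the toy labels are hypothetical and are not taken from the challenge's evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (hypothetical labels): one frame of class 0 is misread as class 1.
print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

Because every class contributes equally, a model that ignores a rarely used instrument is penalised as heavily as one that misses a frequent one.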

arXiv:2408.16445 [pdf, other]

Mismatched: Evaluating the Limits of Image Matching Approaches and Benchmarks

Authors: Sierra Bonilla, Chiara Di Vece, Rema Daher, Xinwei Ju, Danail Stoyanov, Francisco Vasconcelos, Sophia Bano

Abstract: Three-dimensional (3D) reconstruction from two-dimensional images is an active research field in computer vision, with applications ranging from navigation and object tracking to segmentation and three-dimensional modeling. Traditionally, parametric techniques have been employed for this task. However, recent advancements have seen a shift towards learning-based methods. Given the rapid pace of research and the frequent introduction of new image matching methods, it is essential to evaluate them. In this paper, we present a comprehensive evaluation of various image matching methods using a structure-from-motion pipeline. We assess the performance of these methods on both in-domain and out-of-domain datasets, identifying key limitations in both the methods and benchmarks. We also investigate the impact of edge detection as a pre-processing step. Our analysis reveals that image matching for 3D reconstruction remains an open challenge, necessitating careful selection and tuning of models for specific scenarios, while also highlighting mismatches in how metrics currently represent method performance.

Submitted 15 September, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

Comments: 19 pages, 5 figures

arXiv:2408.13126 [pdf, other]

CathAction: A Benchmark for Endovascular Intervention Understanding

Authors: Baoru Huang, Tuan Vo, Chayun Kongtongvattana, Giulio Dagnino, Dennis Kundrat, Wenqiang Chi, Mohamed Abdelaziz, Trevor Kwok, Tudor Jianu, Tuong Do, Hieu Le, Minh Nguyen, Hoan Nguyen, Erman Tjiputra, Quang Tran, Jianyang Xie, Yanda Meng, Binod Bhattarai, Zhaorui Tan, Hongbin Liu, Hong Seng Gan, Wei Wang, Xi Yang, Qiufeng Wang, Jionglong Su, et al. (13 additional authors not shown)

Abstract: Real-time visual feedback from catheterization analysis is crucial for enhancing surgical safety and efficiency during endovascular interventions. However, existing datasets are often limited to specific tasks, are small in scale, and lack the comprehensive annotations necessary for broader endovascular intervention understanding. To tackle these limitations, we introduce CathAction, a large-scale dataset for catheterization understanding. Our CathAction dataset encompasses approximately 500,000 annotated frames for catheterization action understanding and collision detection, and 25,000 ground truth masks for catheter and guidewire segmentation. For each task, we benchmark recent related works in the field. We further discuss the challenges of endovascular interventions compared to traditional computer vision tasks and point out open research questions. We hope that CathAction will facilitate the development of endovascular intervention understanding methods that can be applied to real-world applications. The dataset is available at https://airvlab.github.io/cathaction/.

Submitted 30 August, 2024; v1 submitted 23 August, 2024; originally announced August 2024.

Comments: 10 pages. Webpage: https://airvlab.github.io/cathaction/

arXiv:2406.11732 [pdf, other]

Correspondence Free Multivector Cloud Registration using Conformal Geometric Algebra

Authors: Francisco Xavier Vasconcelos, Jacinto C. Nascimento

Abstract: We present, for the first time, a novel theoretical approach to address the problem of correspondence free multivector cloud registration in conformal geometric algebra. Such formalism achieves several favorable properties. Primarily, it forms an orthogonal automorphism that extends beyond the typical vector space to the entire conformal geometric algebra while respecting the multivector grading. Concretely, the registration can be viewed as an orthogonal transformation (i.e., scale, translation, rotation) belonging to $SO(4,1)$, the group of special orthogonal transformations in conformal geometric algebra. We will show that such formalism is able to: $(i)$ perform the registration without directly accessing the input multivectors, using instead primitives or geometric objects provided by the conformal model (the multivectors); and $(ii)$ obtain these geometric objects by solving a multilinear eigenvalue problem to find sets of eigenmultivectors. In this way, we can explicitly avoid solving the correspondences in the registration process. Most importantly, this offers rotation and translation equivariant properties between the input multivectors and the eigenmultivectors. Experimental evaluation is conducted on datasets commonly used in point cloud registration, to demonstrate the usefulness of the approach with emphasis on ambiguities arising from high levels of noise. The code is available at https://github.com/Numerical-Geometric-Algebra/RegistrationGA. This work was submitted to the International Journal of Computer Vision and is currently under review.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2404.13437 [pdf, other]

High-fidelity Endoscopic Image Synthesis by Utilizing Depth-guided Neural Surfaces

Authors: Baoru Huang, Yida Wang, Anh Nguyen, Daniel Elson, Francisco Vasconcelos, Danail Stoyanov

Abstract: In surgical oncology, screening colonoscopy plays a pivotal role in providing diagnostic assistance, such as biopsy, and facilitating surgical navigation, particularly in polyp detection. Computer-assisted endoscopic surgery has recently gained attention and amalgamated various 3D computer vision techniques, including camera localization, depth estimation, surface reconstruction, etc. Neural Radiance Fields (NeRFs) and Neural Implicit Surfaces (NeuS) have emerged as promising methodologies for deriving accurate 3D surface models from sets of registered images, addressing the limitations of existing colon reconstruction approaches stemming from constrained camera movement. However, inadequate tissue texture representation and scale ambiguity in monocular colonoscopic image reconstruction still degrade the final rendering results. In this paper, we introduce a novel method for colon section reconstruction by leveraging NeuS applied to endoscopic images, supplemented by a single frame of depth map. Notably, we pioneer the use of only a single-frame depth map in photorealistic reconstruction and neural rendering applications, where this single depth map can be easily obtained from other monocular depth estimation networks with an object scale. Through rigorous experimentation and validation on phantom imagery, our approach demonstrates exceptional accuracy in completely rendering colon sections, even capturing unseen portions of the surface. This breakthrough opens avenues for achieving stable and consistently scaled reconstructions, promising enhanced quality in cancer screening procedures and treatment interventions.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.07124 [pdf, other]

Measuring proximity to standard planes during fetal brain ultrasound scanning

Authors: Chiara Di Vece, Antonio Cirigliano, Meala Le Lous, Raffaele Napolitano, Anna L. David, Donald Peebles, Pierre Jannin, Francisco Vasconcelos, Danail Stoyanov

Abstract: This paper introduces a novel pipeline designed to bring ultrasound (US) plane pose estimation closer to clinical use for more effective navigation to the standard planes (SPs) in the fetal brain. We propose a semi-supervised segmentation model utilizing both labeled SPs and unlabeled 3D US volume slices. Our model enables reliable segmentation across a diverse set of fetal brain images. Furthermore, the model incorporates a classification mechanism to identify the fetal brain precisely. Our model not only filters out frames lacking the brain but also generates masks for those containing it, enhancing the relevance of plane pose regression in clinical settings. We focus on fetal brain navigation from 2D US video analysis and combine this model with a US plane pose regression network to provide sensorless proximity detection to SP and non-SP planes; we emphasize the importance of proximity detection to SPs for guiding sonographers, offering a substantial advantage over traditional methods by allowing earlier and more precise adjustments during scanning. We demonstrate the practical applicability of our approach through validation on real fetal scan videos obtained from sonographers of varying expertise levels. Our findings demonstrate the potential of our approach to complement existing fetal US technologies and advance prenatal diagnostic practices.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 11 pages, 5 figures

ACM Class: I.2.0; I.4.0; J.2.0; J.3.0

arXiv:2404.06128 [pdf, other]

Gaussian Pancakes: Geometrically-Regularized 3D Gaussian Splatting for Realistic Endoscopic Reconstruction

Authors: Sierra Bonilla, Shuai Zhang, Dimitrios Psychogyios, Danail Stoyanov, Francisco Vasconcelos, Sophia Bano

Abstract: Within colorectal cancer diagnostics, conventional colonoscopy techniques face critical limitations, including a limited field of view and a lack of depth information, which can impede the detection of precancerous lesions. Current methods struggle to provide comprehensive and accurate 3D reconstructions of the colonic surface, which can help minimize missing regions and reinspection for precancerous polyps. Addressing this, we introduce 'Gaussian Pancakes', a method that leverages 3D Gaussian Splatting (3D GS) combined with a Recurrent Neural Network-based Simultaneous Localization and Mapping (RNNSLAM) system. By introducing geometric and depth regularization into the 3D GS framework, our approach ensures more accurate alignment of Gaussians with the colon surface, resulting in smoother 3D reconstructions with novel viewing of detailed textures and structures. Evaluations across three diverse datasets show that Gaussian Pancakes enhances novel view synthesis quality, surpassing current leading methods with an 18% boost in PSNR and a 16% improvement in SSIM. It also delivers over 100X faster rendering and more than 10X shorter training times, making it a practical tool for real-time applications. Hence, this holds promise for achieving clinical translation for better detection and diagnosis of colorectal cancer.

Submitted 16 August, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 12 pages, 5 figures

arXiv:2403.08156 [pdf, other]

NeRF-Supervised Feature Point Detection and Description

Authors: Ali Youssef, Francisco Vasconcelos

Abstract: Feature point detection and description is the backbone for various computer vision applications, such as Structure-from-Motion, visual SLAM, and visual place recognition. While learning-based methods have surpassed traditional handcrafted techniques, their training often relies on simplistic homography-based simulations of multi-view perspectives, limiting model generalisability. This paper presents a novel approach leveraging Neural Radiance Fields (NeRFs) to generate a diverse and realistic dataset consisting of indoor and outdoor scenes. Our proposed methodology adapts state-of-the-art feature detectors and descriptors for training on multi-view NeRF-synthesised data, with supervision achieved through perspective projective geometry. Experiments demonstrate that the proposed methodology achieves competitive or superior performance on standard benchmarks for relative pose estimation, point cloud registration, and homography estimation while requiring significantly less training data and time compared to existing approaches.

Submitted 30 July, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2311.10859 [pdf, other]

A Quadratic Speedup in Finding Nash Equilibria of Quantum Zero-Sum Games

Authors: Francisca Vasconcelos, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos, Georgios Piliouras, Michael I. Jordan

Abstract: Recent developments in domains such as non-local games, quantum interactive proofs, and quantum generative adversarial networks have renewed interest in quantum game theory and, specifically, quantum zero-sum games. Central to classical game theory is the efficient algorithmic computation of Nash equilibria, which represent optimal strategies for both players. In 2008, Jain and Watrous proposed the first classical algorithm for computing equilibria in quantum zero-sum games using the Matrix Multiplicative Weight Updates (MMWU) method to achieve a convergence rate of $\mathcal{O}(d/\epsilon^2)$ iterations to $\epsilon$-Nash equilibria in the $4^d$-dimensional spectraplex. In this work, we propose a hierarchy of quantum optimization algorithms that generalize MMWU via an extra-gradient mechanism. Notably, within this proposed hierarchy, we introduce the Optimistic Matrix Multiplicative Weights Update (OMMWU) algorithm and establish its average-iterate convergence complexity as $\mathcal{O}(d/\epsilon)$ iterations to $\epsilon$-Nash equilibria. This quadratic speed-up relative to Jain and Watrous' original algorithm sets a new benchmark for computing $\epsilon$-Nash equilibria in quantum zero-sum games.

Submitted 17 November, 2023; originally announced November 2023.

Comments: 53 pages, 7 figures, QTML 2023 (Accepted (Long Talk))

MSC Class: primary 91A05; 81Q93; secondary 68Q32; 91A26; 37N40;
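
To make the quadratic speed-up stated in the abstract above concrete, the two iteration bounds can be set side by side. The symbols $T_{\mathrm{MMWU}}$ and $T_{\mathrm{OMMWU}}$ are notation introduced here for illustration only, and the constants hidden by $\mathcal{O}(\cdot)$ are ignored.

```latex
\[
T_{\mathrm{MMWU}}(\epsilon) = \mathcal{O}\!\left(\frac{d}{\epsilon^{2}}\right),
\qquad
T_{\mathrm{OMMWU}}(\epsilon) = \mathcal{O}\!\left(\frac{d}{\epsilon}\right),
\qquad
\frac{d/\epsilon^{2}}{d/\epsilon} = \frac{1}{\epsilon}.
\]
```

For instance, at $\epsilon = 10^{-2}$ the bounds scale as $10^{4}\,d$ and $10^{2}\,d$ iterations respectively, up to constant factors.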

arXiv:2311.09631 [pdf, other]

doi 10.1145/3618260.3649662

On the Pauli Spectrum of QAC0

Authors: Shivam Nadimpalli, Natalie Parham, Francisca Vasconcelos, Henry Yuen

Abstract: The circuit class $\mathsf{QAC}^0$ was introduced by Moore (1999) as a model for constant depth quantum circuits where the gate set includes many-qubit Toffoli gates. Proving lower bounds against such circuits is a longstanding challenge in quantum circuit complexity; in particular, showing that polynomial-size $\mathsf{QAC}^0$ cannot compute the parity function has remained an open question for over 20 years. In this work, we identify a notion of the Pauli spectrum of $\mathsf{QAC}^0$ circuits, which can be viewed as the quantum analogue of the Fourier spectrum of classical $\mathsf{AC}^0$ circuits. We conjecture that the Pauli spectrum of $\mathsf{QAC}^0$ circuits satisfies low-degree concentration, in analogy to the famous Linial, Nisan, Mansour theorem on the low-degree Fourier concentration of $\mathsf{AC}^0$ circuits. If true, this conjecture immediately implies that polynomial-size $\mathsf{QAC}^0$ circuits cannot compute parity. We prove this conjecture for the class of depth-$d$, polynomial-size $\mathsf{QAC}^0$ circuits with at most $n^{O(1/d)}$ auxiliary qubits. We obtain new circuit lower bounds and learning results as applications: this class of circuits cannot correctly compute (i) the $n$-bit parity function on more than a $(\frac{1}{2} + 2^{-\Omega(n^{1/d})})$-fraction of inputs, and (ii) the $n$-bit majority function on more than a $(1 - \Omega(n^{-1/2}))$-fraction of inputs. Additionally we show that this class of $\mathsf{QAC}^0$ circuits with limited auxiliary qubits can be learned with quasipolynomial sample complexity, giving the first learning result for $\mathsf{QAC}^0$ circuits. More broadly, our results add evidence that "Pauli-analytic" techniques can be a powerful tool in studying quantum circuits.

Submitted 17 July, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: 46 pages, 7 figures, new version fixed bugs, updated majority bound and Cor. 36, added context on interpreting normalized Frobenius distance

Journal ref: STOC 2024: Proceedings of the 56th Annual ACM Symposium on Theory of Computing, 1498-1506

arXiv:2306.02261 [pdf, other]

Online estimation of the hand-eye transformation from surgical scenes

Authors: Krittin Pachtrachai, Francisco Vasconcelos, Danail Stoyanov

Abstract: Hand-eye calibration algorithms are mature and provide accurate transformation estimations for an effective camera-robot link but rely on a sufficiently wide range of calibration data to avoid errors and degenerate configurations. To solve the hand-eye problem in robotic-assisted minimally invasive surgery and simplify the calibration procedure, we present a neural network-based solution, trained with a new objective function, that estimates the transformation from a sequence of images and kinematic data. The network utilises the long short-term memory architecture to extract temporal information from the data and solve the hand-eye problem. The objective function is derived from the linear combination of the remote centre of motion constraint, the re-projection error and its derivative to induce a small change in the hand-eye transformation. The method is validated with data from the da Vinci Si, and the results show that the estimated hand-eye matrix is able to re-project the end-effector from the robot coordinate frame to the camera coordinate frame within 10 to 20 pixels of accuracy in both testing datasets. The calibration performance is also superior to the previous neural network-based hand-eye method. The proposed algorithm shows that the calibration procedure can be simplified by using deep learning techniques and that the performance is improved by the assumption of non-static hand-eye transformations.

Submitted 4 June, 2023; originally announced June 2023.

Comments: 6 pages, 4 main figures

arXiv:2302.03022 [pdf, other]

SurgT challenge: Benchmark of Soft-Tissue Trackers for Robotic Surgery

Authors: Joao Cartucho, Alistair Weld, Samyakh Tukra, Haozheng Xu, Hiroki Matsuzaki, Taiyo Ishikawa, Minjun Kwon, Yong Eun Jang, Kwang-Ju Kim, Gwang Lee, Bizhe Bai, Lueder Kahrs, Lars Boecking, Simeon Allmendinger, Leopold Muller, Yitong Zhang, Yueming Jin, Sophia Bano, Francisco Vasconcelos, Wolfgang Reiter, Jonas Hajek, Bruno Silva, Estevao Lima, Joao L. Vilaca, Sandro Queiros, et al. (1 additional author not shown)

Abstract: This paper introduces the "SurgT: Surgical Tracking" challenge which was organised in conjunction with MICCAI 2022. There were two purposes for the creation of this challenge: (1) the establishment of the first standardised benchmark for the research community to assess soft-tissue trackers; and (2) to encourage the development of unsupervised deep learning methods, given the lack of annotated data in surgery. A dataset of 157 stereo endoscopic videos from 20 clinical cases, along with stereo camera calibration parameters, was provided. Participants were assigned the task of developing algorithms to track the movement of soft tissues, represented by bounding boxes, in stereo endoscopic videos. At the end of the challenge, the developed methods were assessed on a previously hidden test subset. This assessment uses benchmarking metrics that were purposely developed for this challenge, to verify the efficacy of unsupervised deep learning algorithms in tracking soft tissue. The metric used for ranking the methods was the Expected Average Overlap (EAO) score, which measures the average overlap between a tracker's and the ground truth bounding boxes. Coming first in the challenge was the deep learning submission by ICVS-2Ai with a superior EAO score of 0.617. This method employs ARFlow to estimate unsupervised dense optical flow from cropped images, using photometric and regularization losses. Second, Jmees, with an EAO of 0.583, uses deep learning for surgical tool segmentation on top of a non-deep learning baseline method: CSRT. CSRT by itself scores a similar EAO of 0.563. The results from this challenge show that currently, non-deep learning methods are still competitive. The dataset and benchmarking tool created for this challenge have been made publicly available at https://surgt.grand-challenge.org/.

Submitted 30 August, 2023; v1 submitted 6 February, 2023; originally announced February 2023.
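
The SurgT abstract describes EAO as an average overlap between the tracker's and the ground truth bounding boxes. The sketch below illustrates only that per-frame overlap, assuming it is measured as intersection-over-union on axis-aligned boxes given as `(x1, y1, x2, y2)`; it is not the challenge's benchmarking tool, it omits the sequence-length averaging a full EAO computation would involve, and the names `box_iou` and `mean_overlap` are hypothetical.

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the intersection rectangle, clamped at zero
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mean_overlap(predicted, ground_truth):
    """Average per-frame IoU over a sequence of paired boxes."""
    overlaps = [box_iou(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(overlaps) / len(overlaps)


# Toy sequence (hypothetical boxes): a perfect frame, then a drifted frame.
pred = [(0, 0, 2, 2), (1, 1, 3, 3)]
gt = [(0, 0, 2, 2), (0, 0, 2, 2)]
print(mean_overlap(pred, gt))
```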

arXiv:2301.08317 [pdf, other]

doi 10.1109/TMRB.2023.3328638

Ultrasound Plane Pose Regression: Assessing Generalized Pose Coordinates in the Fetal Brain

Authors: Chiara Di Vece, Maela Le Lous, Brian Dromey, Francisco Vasconcelos, Anna L David, Donald Peebles, Danail Stoyanov

Abstract: In obstetric ultrasound (US) scanning, the learner's ability to mentally build a three-dimensional (3D) map of the fetus from a two-dimensional (2D) US image represents a significant challenge in skill acquisition. We aim to build a US plane localization system for 3D visualization, training, and guidance without integrating additional sensors. This work builds on top of our previous work, which predicts the six-dimensional (6D) pose of arbitrarily oriented US planes slicing the fetal brain with respect to a normalized reference frame using a convolutional neural network (CNN) regression network. Here, we analyze in detail the assumptions of the normalized fetal brain reference frame and quantify its accuracy with respect to the acquisition of transventricular (TV) standard plane (SP) for fetal biometry. We investigate the impact of registration quality in the training and testing data and its subsequent effect on trained models. Finally, we introduce data augmentations and larger training sets that improve the results of our previous work, achieving median errors of 2.97 mm and 6.63 degrees for translation and rotation, respectively.

Submitted 2 November, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

Comments: 13 pages, 9 figures, 2 tables. This article has been accepted for publication in IEEE Transactions on Medical Robotics and Bionics. This is the author's version which has not been fully edited and content may change prior to final publication. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

MSC Class: 68T07

ACM Class: I.2.0; I.4.0; J.2; J.3

arXiv:2208.00902 [pdf, ps, other]

Retrieval of surgical phase transitions using reinforcement learning

Authors: Yitong Zhang, Sophia Bano, Ann-Sophie Page, Jan Deprest, Danail Stoyanov, Francisco Vasconcelos

Abstract: In minimally invasive surgery, surgical workflow segmentation from video analysis is a well-studied topic. The conventional approach defines it as a multi-class classification problem, where individual video frames are attributed a surgical phase label. We introduce a novel reinforcement learning formulation for offline phase transition retrieval. Instead of attempting to classify every video frame, we identify the timestamp of each phase transition. By construction, our model does not produce spurious and noisy phase transitions, but contiguous phase blocks. We investigate two different configurations of this model. The first does not require processing all frames in a video (only <60% and <20% of frames in 2 different applications), while producing results slightly under the state-of-the-art accuracy. The second configuration processes all video frames, and outperforms the state-of-the-art at a comparable computational cost. We compare our method against the recent top-performing frame-based approaches TeCNO and Trans-SVNet on the public dataset Cholec80 and also on an in-house dataset of laparoscopic sacrocolpopexy. We perform both a frame-based (accuracy, precision, recall and F1-score) and an event-based (event ratio) evaluation of our algorithms.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Accepted by MICCAI 2022
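
The abstract above notes that predicting transition timestamps, rather than per-frame labels, yields contiguous phase blocks by construction. The sketch below illustrates only that final expansion step, under the assumption that phases occur once each in a fixed order; it does not reproduce the reinforcement learning agent, and the name `transitions_to_labels` is hypothetical.

```python
def transitions_to_labels(transitions, num_frames):
    """Expand transition frame indices into one phase label per frame.

    transitions[k] is the first frame of phase k + 1; phase 0 starts at frame 0.
    """
    bounds = sorted(transitions)
    labels = []
    phase = 0
    for frame in range(num_frames):
        # Advance to the next phase once its transition frame is reached
        while phase < len(bounds) and frame >= bounds[phase]:
            phase += 1
        labels.append(phase)
    return labels


# Two transitions in a 7-frame clip give three contiguous blocks.
print(transitions_to_labels([3, 5], 7))  # [0, 0, 0, 1, 1, 2, 2]
```

Since labels only ever increase at a transition index, the output cannot contain isolated single-frame phase flips, which is the property the abstract attributes to the formulation.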

arXiv:2207.13185 [pdf, other]

Learning-Based Keypoint Registration for Fetoscopic Mosaicking

Authors: Alessandro Casella, Sophia Bano, Francisco Vasconcelos, Anna L. David, Dario Paladini, Jan Deprest, Elena De Momi, Leonardo S. Mattos, Sara Moccia, Danail Stoyanov

Abstract: In Twin-to-Twin Transfusion Syndrome (TTTS), abnormal vascular anastomoses in the monochorionic placenta can produce uneven blood flow between the two fetuses. In the current practice, TTTS is treated surgically by closing abnormal anastomoses using laser ablation. This surgery is minimally invasive and relies on fetoscopy. Limited field of view makes anastomosis identification a challenging task for the surgeon. To tackle this challenge, we propose a learning-based framework for in-vivo fetoscopy frame registration for field-of-view expansion. The novelty of this framework relies on a learning-based keypoint proposal network and an encoding strategy to filter (i) irrelevant keypoints based on fetoscopic image segmentation and (ii) inconsistent homographies. We validate our framework on a dataset of 6 intraoperative sequences from 6 TTTS surgeries from 6 different women against the most recent state-of-the-art algorithm, which relies on the segmentation of placenta vessels. The proposed framework achieves higher performance compared to the state of the art, paving the way for robust mosaicking to provide surgeons with context awareness during TTTS surgery.

Submitted 26 July, 2022; originally announced July 2022.

arXiv:2206.12512 [pdf, other]

Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings

Authors: Sophia Bano, Alessandro Casella, Francisco Vasconcelos, Abdul Qayyum, Abdesslam Benzinou, Moona Mazher, Fabrice Meriaudeau, Chiara Lena, Ilaria Anita Cintorrino, Gaia Romana De Paolis, Jessica Biagioli, Daria Grechishnikova, Jing Jiao, Bizhe Bai, Yanyan Qiao, Binod Bhattarai, Rebati Raman Gaire, Ronast Subedi, Eduard Vazquez, Szymon Płotka, Aneta Lisowska, Arkadiusz Sitek, George Attilakos, Ruwan Wimalasundera, Anna L David , et al. (6 additional authors not shown)

Abstract: Fetoscopy laser photocoagulation is a widely adopted procedure for treating Twin-to-Twin Transfusion Syndrome (TTTS). The procedure involves photocoagulating pathological anastomoses to regulate blood exchange among twins. The procedure is particularly challenging due to the limited field of view, poor manoeuvrability of the fetoscope, poor visibility, and variability in illumination. These challenges may lead to increased surgery time and incomplete ablation. Computer-assisted intervention (CAI) can provide surgeons with decision support and context awareness by identifying key structures in the scene and expanding the fetoscopic field of view through video mosaicking. Research in this domain has been hampered by the lack of high-quality data to design, develop and test CAI algorithms. Through the Fetoscopic Placental Vessel Segmentation and Registration (FetReg2021) challenge, which was organized as part of the MICCAI2021 Endoscopic Vision challenge, we released the first large-scale multicentre TTTS dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms. For this challenge, we released a dataset of 2060 images, pixel-annotated for vessels, tool, fetus and background classes, from 18 in-vivo TTTS fetoscopy procedures and 18 short video clips. Seven teams participated in this challenge and their model performance was assessed on an unseen test dataset of 658 pixel-annotated images from 6 fetoscopic procedures and 6 short clips. The challenge provided an opportunity for creating generalized solutions for fetoscopic scene understanding and mosaicking. In this paper, we present the findings of the FetReg2021 challenge alongside reporting a detailed literature review for CAI in TTTS fetoscopy. Through this challenge, its analysis and the release of multi-centre fetoscopic data, we provide a benchmark for future research in this field.

Submitted 26 February, 2023; v1 submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted at MedIA (Medical Image Analysis)

arXiv:2203.17013 [pdf, other]

A Temporal Learning Approach to Inpainting Endoscopic Specularities and Its effect on Image Correspondence

Authors: Rema Daher, Francisco Vasconcelos, Danail Stoyanov

Abstract: Video streams are utilised to guide minimally-invasive surgery and diagnostic procedures in a wide range of procedures, and many computer assisted techniques have been developed to automatically analyse them. These approaches can provide additional information to the surgeon such as lesion detection, instrument navigation, or anatomy 3D shape modeling. However, the necessary image features to recognise these patterns are not always reliably detected due to the presence of irregular light patterns such as specular highlight reflections. In this paper, we aim at removing specular highlights from endoscopic videos using machine learning. We propose using a temporal generative adversarial network (GAN) to inpaint the hidden anatomy under specularities, inferring its appearance spatially and from neighbouring frames where they are not present in the same location. This is achieved using in-vivo data of gastric endoscopy (Hyper-Kvasir) in a fully unsupervised manner that relies on automatic detection of specular highlights. System evaluations show significant improvements over traditional methods through direct comparison as well as other machine learning techniques through an ablation study that depicts the importance of the network's temporal and transfer learning components. The generalizability of our system to different surgical setups and procedures was also evaluated qualitatively on in-vivo data of gastric endoscopy and ex-vivo porcine data (SERV-CT, SCARED). We also assess the effect of our method in computer vision tasks that underpin 3D reconstruction and camera motion estimation, namely stereo disparity, optical flow, and sparse point feature matching. These are evaluated quantitatively and qualitatively and results show a positive effect of specular highlight inpainting on these tasks in a novel comprehensive analysis.

Submitted 31 March, 2022; originally announced March 2022.

arXiv:2202.10847 [pdf, other]

UncertaINR: Uncertainty Quantification of End-to-End Implicit Neural Representations for Computed Tomography

Authors: Francisca Vasconcelos, Bobby He, Nalini Singh, Yee Whye Teh

Abstract: Implicit neural representations (INRs) have achieved impressive results for scene reconstruction and computer graphics, where their performance has primarily been assessed on reconstruction accuracy. As INRs make their way into other domains, where model predictions inform high-stakes decision-making, uncertainty quantification of INR inference is becoming critical. To that end, we study a Bayesian reformulation of INRs, UncertaINR, in the context of computed tomography, and evaluate several Bayesian deep learning implementations in terms of accuracy and calibration. We find that they achieve well-calibrated uncertainty, while retaining accuracy competitive with other classical, INR-based, and CNN-based reconstruction techniques. Contrary to common intuition in the Bayesian deep learning literature, we find that INRs obtain the best calibration with computationally efficient Monte Carlo dropout, outperforming Hamiltonian Monte Carlo and deep ensembles. Moreover, in contrast to the best-performing prior approaches, UncertaINR does not require a large training dataset, but only a handful of validation images.

Submitted 2 May, 2023; v1 submitted 22 February, 2022; originally announced February 2022.

Comments: Published in the Transactions on Machine Learning Research (TMLR) April 2023 [https://openreview.net/forum?id=jdGMBgYvfX]

arXiv:2202.04218 [pdf, ps, other]

Managers versus Machines: Do Algorithms Replicate Human Intuition in Credit Ratings?

Authors: Matthew Harding, Gabriel F. R. Vasconcelos

Abstract: We use machine learning techniques to investigate whether it is possible to replicate the behavior of bank managers who assess the risk of commercial loans made by a large commercial US bank. Even though a typical bank already relies on an algorithmic scorecard process to evaluate risk, bank managers are given significant latitude in adjusting the risk score in order to account for other holistic factors based on their intuition and experience. We show that it is possible to find machine learning algorithms that can replicate the behavior of the bank managers. The input to the algorithms consists of a combination of standard financials and soft information available to bank managers as part of the typical loan review process. We also document the presence of significant heterogeneity in the adjustment process that can be traced to differences across managers and industries. Our results highlight the effectiveness of machine learning based analytic approaches to banking and the potential challenges to high-skill jobs in the financial sector.

Submitted 8 February, 2022; originally announced February 2022.

arXiv:2107.05255 [pdf, other]

AutoFB: Automating Fetal Biometry Estimation from Standard Ultrasound Planes

Authors: Sophia Bano, Brian Dromey, Francisco Vasconcelos, Raffaele Napolitano, Anna L. David, Donald M. Peebles, Danail Stoyanov

Abstract: During pregnancy, ultrasound examination in the second trimester can assess fetal size according to standardized charts. To achieve a reproducible and accurate measurement, a sonographer needs to identify three standard 2D planes of the fetal anatomy (head, abdomen, femur) and manually mark the key anatomical landmarks on the image for accurate biometry and fetal weight estimation. This can be a time-consuming operator-dependent task, especially for a trainee sonographer. Computer-assisted techniques can help in automating the fetal biometry computation process. In this paper, we present a unified automated framework for estimating all measurements needed for the fetal weight assessment. The proposed framework semantically segments the key fetal anatomies using state-of-the-art segmentation models, followed by region fitting and scale recovery for the biometry estimation. We present an ablation study of segmentation algorithms to show their robustness through 4-fold cross-validation on a dataset of 349 ultrasound standard plane images from 42 pregnancies. Moreover, we show that the network with the best segmentation performance tends to be more accurate for biometry estimation. Furthermore, we demonstrate that the error between clinically measured and predicted fetal biometry is lower than the permissible error during routine clinical measurements.

Submitted 12 July, 2021; originally announced July 2021.

Comments: Accepted at MICCAI 2021

arXiv:2106.05923 [pdf, other]

FetReg: Placental Vessel Segmentation and Registration in Fetoscopy Challenge Dataset

Authors: Sophia Bano, Alessandro Casella, Francisco Vasconcelos, Sara Moccia, George Attilakos, Ruwan Wimalasundera, Anna L. David, Dario Paladini, Jan Deprest, Elena De Momi, Leonardo S. Mattos, Danail Stoyanov

Abstract: Fetoscopy laser photocoagulation is a widely used procedure for the treatment of Twin-to-Twin Transfusion Syndrome (TTTS), which occurs in mono-chorionic multiple pregnancies due to placental vascular anastomoses. This procedure is particularly challenging due to limited field of view, poor manoeuvrability of the fetoscope, poor visibility due to fluid turbidity, variability in light source, and unusual position of the placenta. This may lead to increased procedural time and incomplete ablation, resulting in persistent TTTS. Computer-assisted intervention may help overcome these challenges by expanding the fetoscopic field of view through video mosaicking and providing better visualization of the vessel network. However, the research and development in this domain remain limited due to unavailability of high-quality data to encode the intra- and inter-procedure variability. Through the Fetoscopic Placental Vessel Segmentation and Registration (FetReg) challenge, we present a large-scale multi-centre dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms for the fetal environment with a focus on creating drift-free mosaics from long duration fetoscopy videos. In this paper, we provide an overview of the FetReg dataset, challenge tasks, evaluation metrics and baseline methods for both segmentation and registration. Baseline method results on the FetReg dataset show that our dataset poses interesting challenges, offering a large opportunity for the creation of novel methods and models through a community effort initiative guided by the FetReg challenge.

Submitted 16 June, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

arXiv:2007.04349 [pdf, other]

doi 10.1007/978-3-030-59716-0_73

Deep Placental Vessel Segmentation for Fetoscopic Mosaicking

Authors: Sophia Bano, Francisco Vasconcelos, Luke M. Shepherd, Emmanuel Vander Poorten, Tom Vercauteren, Sebastien Ourselin, Anna L. David, Jan Deprest, Danail Stoyanov

Abstract: During fetoscopic laser photocoagulation, a treatment for twin-to-twin transfusion syndrome (TTTS), the clinician first identifies abnormal placental vascular connections and laser ablates them to regulate blood flow in both fetuses. The procedure is challenging due to the mobility of the environment, poor visibility in amniotic fluid, occasional bleeding, and limitations in the fetoscopic field-of-view and image quality. Ideally, anastomotic placental vessels would be automatically identified, segmented and registered to create expanded vessel maps to guide laser ablation, however, such methods have yet to be clinically adopted. We propose a solution utilising the U-Net architecture for performing placental vessel segmentation in fetoscopic videos. The obtained vessel probability maps provide sufficient cues for mosaicking alignment by registering consecutive vessel maps using the direct intensity-based technique. Experiments on 6 different in vivo fetoscopic videos demonstrate that the vessel intensity-based registration outperformed image intensity-based registration approaches showing better robustness in qualitative and quantitative comparison. We additionally reduce drift accumulation to negligible even for sequences with up to 400 frames and we incorporate a scheme for quantifying drift error in the absence of the ground-truth. Our paper provides a benchmark for fetoscopy placental vessel segmentation and registration by contributing the first in vivo vessel segmentation and fetoscopic videos dataset.

Submitted 8 July, 2020; originally announced July 2020.

Comments: Accepted at MICCAI 2020

arXiv:1907.06543 [pdf, other]

doi 10.1007/978-3-030-32239-7_35

Deep Sequential Mosaicking of Fetoscopic Videos

Authors: Sophia Bano, Francisco Vasconcelos, Marcel Tella Amo, George Dwyer, Caspar Gruijthuijsen, Jan Deprest, Sebastien Ourselin, Emmanuel Vander Poorten, Tom Vercauteren, Danail Stoyanov

Abstract: Twin-to-twin transfusion syndrome treatment requires fetoscopic laser photocoagulation of placental vascular anastomoses to regulate blood flow to both fetuses. Limited field-of-view (FoV) and low visual quality during fetoscopy make it challenging to identify all vascular connections. Mosaicking can align multiple overlapping images to generate an image with increased FoV, however, existing techniques apply poorly to fetoscopy due to the low visual quality, texture paucity, and hence fail in longer sequences due to the drift accumulated over time. Deep learning techniques can help overcome these challenges. Therefore, we present a new generalized Deep Sequential Mosaicking (DSM) framework for fetoscopic videos captured from different settings such as simulation, phantom, and real environments. DSM extends an existing deep image-based homography model to sequential data by proposing controlled data augmentation and outlier rejection methods. Unlike existing methods, DSM can handle visual variations due to specular highlights and reflection across adjacent frames, hence reducing the accumulated drift. We perform experimental validation and comparison using 5 diverse fetoscopic videos to demonstrate the robustness of our framework.

Submitted 15 July, 2019; originally announced July 2019.

Comments: Accepted at MICCAI 2019

arXiv:1901.05377 [pdf]

doi 10.1016/j.media.2019.01.003

Nonrigid reconstruction of 3D breast surfaces with a low-cost RGBD camera for surgical planning and aesthetic evaluation

Authors: Rene Lacher, Francisco Vasconcelos, Norman Williams, Gerrit Rindermann, John Hipwell, David Hawkes, Danail Stoyanov

Abstract: Accounting for 26% of all new cancer cases worldwide, breast cancer remains the most common form of cancer in women. Although early breast cancer has a favourable long-term prognosis, roughly a third of patients suffer from a suboptimal aesthetic outcome despite breast conserving cancer treatment. Clinical-quality 3D modelling of the breast surface therefore assumes an increasingly important role in advancing treatment planning, prediction and evaluation of breast cosmesis. Yet, existing 3D torso scanners are expensive and either infrastructure-heavy or subject to motion artefacts. In this paper we employ a single consumer-grade RGBD camera with an ICP-based registration approach to jointly align all points from a sequence of depth images non-rigidly. Subtle body deformation due to postural sway and respiration is successfully mitigated leading to a higher geometric accuracy through regularised locally affine transformations. We present results from 6 clinical cases where our method compares well with the gold standard and outperforms a previous approach. We show that our method produces better reconstructions qualitatively by visual assessment and quantitatively by consistently obtaining lower landmark error scores and yielding more accurate breast volume estimates.

Submitted 16 January, 2019; originally announced January 2019.

Journal ref: Medical Image Analysis, Volume 53, April 2019, pp. 11-25

arXiv:1804.03141 [pdf, other]

Automated pick-up of suturing needles for robotic surgical assistance

Authors: Claudia D'Ettorre, George Dwyer, Xiaofei Du, Francois Chadebecq, Francisco Vasconcelos, Elena De Momi, Danail Stoyanov

Abstract: Robot-assisted laparoscopic prostatectomy (RALP) is a treatment for prostate cancer that involves complete or nerve-sparing removal of prostate tissue that contains cancer. After removal, the bladder neck is subsequently sutured directly with the urethra. The procedure is called urethrovesical anastomosis and is one of the most dexterity-demanding tasks during RALP. Two suturing instruments and a pair of needles are used in combination to perform a running stitch during urethrovesical anastomosis. While robotic instruments provide enhanced dexterity to perform the anastomosis, it is still highly challenging and difficult to learn. In this paper, we present a vision-guided needle grasping method for automatically grasping the needle that has been inserted into the patient prior to anastomosis. We aim to automatically grasp the suturing needle in a position that avoids hand-offs and immediately enables the start of suturing. The full grasping process can be broken down into: a needle detection algorithm; an approach phase where the surgical tool moves closer to the needle based on visual feedback; and a grasping phase through path planning based on observed surgical practice. Our experimental results show examples of successful autonomous grasping that has the potential to simplify and decrease the operational time in RALP by assisting a small component of urethrovesical anastomosis.

Submitted 9 April, 2018; originally announced April 2018.

arXiv:1802.03274 [pdf]

doi 10.1117/12.2293671

Augmented Reality needle ablation guidance tool for Irreversible Electroporation in the pancreas

Authors: Timur Kuzhagaliyev, Neil T. Clancy, Mirek Janatka, Kevin Tchaka, Francisco Vasconcelos, Matthew J. Clarkson, Kurinchi Gurusamy, David J. Hawkes, Brian Davidson, Danail Stoyanov

Abstract: Irreversible electroporation (IRE) is a soft tissue ablation technique suitable for treatment of inoperable tumours in the pancreas. The process involves applying a high voltage electric field to the tissue containing the mass using needle electrodes, leaving cancerous cells irreversibly damaged and vulnerable to apoptosis. Efficacy of the treatment depends heavily on the accuracy of needle placement and requires a high degree of skill from the operator. In this paper, we describe an Augmented Reality (AR) system designed to overcome the challenges associated with planning and guiding the needle insertion process. Our solution, based on the HoloLens (Microsoft, USA) platform, tracks the position of the headset, needle electrodes and ultrasound (US) probe in space. The proof of concept implementation of the system uses this tracking data to render real-time holographic guides on the HoloLens, giving the user insight into the current progress of needle insertion and an indication of the target needle trajectory. The operator's field of view is augmented using visual guides and real-time US feed rendered on a holographic plane, eliminating the need to consult external monitors. Based on these early prototypes, we are aiming to develop a system that will lower the skill level required for IRE while increasing overall accuracy of needle insertion and, hence, the likelihood of successful treatment.

Submitted 9 February, 2018; originally announced February 2018.

Comments: 6 pages, 5 figures. Proc. SPIE 10576 (2018) Copyright 2018 Society of Photo Optical Instrumentation Engineers (SPIE). One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this publication for a fee or for commercial purposes, or modification of the contents of the publication are prohibited

arXiv:1706.06531 [pdf, other]

A comparative study of breast surface reconstruction for aesthetic outcome assessment

Authors: Rene Lacher, Francisco Vasconcelos, David Bishop, Norman Williams, Mohammed Keshtgar, David Hawkes, John Hipwell, Danail Stoyanov

Abstract: Breast cancer is the most prevalent cancer type in women, and while its survival rate is generally high the aesthetic outcome is an increasingly important factor when evaluating different treatment alternatives. 3D scanning and reconstruction techniques offer a flexible tool for building detailed and accurate 3D breast models that can be used both pre-operatively for surgical planning and post-operatively for aesthetic evaluation. This paper aims at comparing the accuracy of low-cost 3D scanning technologies with the significantly more expensive state-of-the-art 3D commercial scanners in the context of breast 3D reconstruction. We present results from 28 synthetic and clinical RGBD sequences, including 12 unique patients and an anthropomorphic phantom demonstrating the applicability of low-cost RGBD sensors to real clinical cases. Body deformation and homogeneous skin texture pose challenges to the studied reconstruction systems. Although these should be addressed appropriately if higher model quality is warranted, we observe that low-cost sensors are able to obtain valuable reconstructions comparable to the state-of-the-art within an error margin of 3 mm.

Submitted 20 June, 2017; originally announced June 2017.

Comments: This paper has been accepted to MICCAI2017

arXiv:1608.00247 [pdf, other]

Similarity Registration Problems for 2D/3D Ultrasound Calibration

Authors: Francisco Vasconcelos, Donald Peebles, Sebastien Ourselin, Danail Stoyanov

Abstract: We propose a minimal solution for the similarity registration (rigid pose and scale) between two sets of 3D lines, and also between a set of co-planar points and a set of 3D lines. The first problem is solved up to 8 discrete solutions with a minimum of 2 line-line correspondences, while the second is solved up to 4 discrete solutions using 4 point-line correspondences. We use these algorithms to perform the extrinsic calibration between a pose tracking sensor and a 2D/3D ultrasound (US) curvilinear probe using a tracked needle as calibration target. The needle is tracked as a 3D line, and is scanned by the ultrasound as either a 3D line (3D US) or as a 2D point (2D US). Since the scale factor that converts US scan units to metric coordinates is unknown, the calibration is formulated as a similarity registration problem. We present results with both synthetic and real data and show that the minimal solutions outperform the corresponding non-minimal linear formulations.

Submitted 31 July, 2016; originally announced August 2016.

Showing 1–28 of 28 results for author: Vasconcelos, F