Keywords

1 Introduction

Ultra High Frequency Ultrasound (UHFUS) is a new advancement in non-invasive imaging, capable of operating above 50 MHz and resolving structures less than 0.03 mm. Potential clinical applications include vascular measurements for surgical procedures and disease diagnosis, with such measures including intimal wall thickness and variations in atherosclerotic plaque buildup [1]. It can be used to monitor hand transplant recipients [1], for whom the gold standard diagnosis using invasive histopathology is not practical due to suppressed immune systems [1]. However, UHFUS can only image through \({\sim }1\,\text {cm}\) of tissue. Vessels can be visualized at this depth (see Fig. 1(a)), in contrast to skeletal structures, which are too deep to be imaged. When compared against traditional high frequency ultrasound (HFUS) (see Fig. 1(b)), substantially increased speckle noise is encountered with UHFUS at such shallow depths. The vessel measurements have naturally occurring sub-millimeter(mm) variations along their length, and sub-mm displacements of the probe confound comparisons across time. Our motivation is that vessel tracking across B-scans with sub-mm precision should enable consistent comparisons. In this work, the primary medical image computing (MIC) goal is the fast sub-mm 2D vessel contour localization.

Traditional ultrasound based real-time vessel tracking has been researched before [2,3,4,5]. However, when tested on UHFUS images, these gradient-based edge detection approaches failed to detect and track the vessel boundaries in the presence of higher speckle noise. Furthermore, precise delineation of the deforming vessel is required for vessel-based measurements, whereas prior approaches [2,3,4,5] modeled the vessel as an ellipse without accounting for the deforming vessel contour. A recent approach in [5] was designed for a specific imaging setting of 55% maximum gain, but when applied to UHFUS sequences, it completely failed to track vessels regardless of gain settings (see Fig. 1(c)). A recent level-set based approach [6] designed for HFUS images ran slowly at 0.5 s per image.

In this paper, a fast GPU-based approach is presented to segment and track the deforming vessel contour in UHFUS images. It combines the robust edge detection capability of local phase analysis, with a distance regularized level set to accurately capture the vessel contour, and an efficient Extended Kalman Filter (EKF) to track the vessel. Validation on 35 UHFUS sequences showed that it successfully segmented and tracked vessels undergoing dynamic compression. Our algorithm achieved a maximum Hausdorff distance error of 0.135 mm, which was \(6{\times }\) smaller than the smallest vessel diameter of 0.81 mm. It also generalized to datasets acquired with different imaging settings and from a HFUS imaging system, with errors \({\sim }2{\times }\) smaller than the state-of-the-art for HFUS [6].

Contribution. (1) We present the first system capable of rapidly segmenting and tracking a vessel contour in UHFUS images, and we demonstrate its high speed performance (\({\ge }52\) FPS). (2) We demonstrate the generality of our approach by applying it to datasets acquired from a traditional HFUS machine, and show that it is faster than the state-of-the-art approach for HFUS.

2 Methods

2.1 Data Acquisition

The Visualsonics Vevo 2100 UHFUS machine (Fujifilm, Canada) and a 50 MHz transducer (bandwidth extendable to 70 MHz) was used to acquire freehand ultrasound volumes. This UHFUS system has a physical resolution of \(30\,\upmu \mathrm{m}\), and the pixel pitch is \(11.6\,\upmu \mathrm{m}\) between pixel centers. 35 deidentified UHFUS sequences were acquired over a wide range of gain values (40–70 dB), with the maximum gain value setting being 70 dB. The sequences contained a wide range of motions with the probe, such as longitudinal scanning, out-of-plane tissue deformation, beating vessel visualization, etc. Figure 1(a) shows an example ultrasound image of the proper palmar digital artery acquired with the UHFUS system. Each sequence consisted of 100 2D B-scans with dimensions of \(832\times 512\) pixels. To show the generality of our approach, 5 additional sequences were acquired from a traditional HFUS machine (Diasus, Dynamic Imaging, UK) using a 10–22 MHz transducer. The pixel resolution for the HFUS machine was \(92.5\,\upmu \mathrm{m}\), and each sequence consisted of 250 2D B-scans of dimensions \(280\times 534\) pixels.

2.2 Noise Reduction and Clustering

Noise Reduction. In contrast to traditional HFUS, speckle noise is greater in UHFUS as seen in Figs. 1(a) and (b). To mitigate the effects of speckle during segmentation and speed up computation, the UHFUS B-scans were first downsampled by a factor of 4 in each dimension (see Fig. 1(d)). Next, a bilateral filter [7] of size \(5\times 5\) pixels was applied to the downsampled image to smooth the small amplitude noise (see Fig. 1(e)), while preserving vessel boundaries that are crucial to our segmentation. The bilateral filtered image is represented by \({I}_{\hbox {B}}\).

Fig. 1.
figure 1

Vessel imaged using (a) UHFUS, (b) HFUS; (c) Failed vessel detection result (red ellipse) of algorithm in [5] on an UHFUS image; (d) Downsampled image; (e) Bilateral filtered image (\({I}_{\hbox {B}}\)); (f) With a kernel size \(3 \times 3\), pixels in \({I}_{\hbox {B}}\) are clustered into homogeneous patches in \({I}_{\hbox {C}}\), each with its own root (orange points); (g) \({I}_{\hbox {C}}\) generated with \(7 \times 7\) kernel; (h) Feature Asymmetry map (\({I}_{\hbox {FA}}\)); (i) Initial boundary locations (green points) estimated from \({I}_{\hbox {FA}}\) using the tracked point \(\mathbf {s}^{t}\) (magenta); (j) Ellipse (green) fitted to green points in (i), and then shrunk (brown ellipse) to initialize the level set evolution.

Clustering. The approach published in [8], which has also shown applicability to MRI images, was used to produce an image \({I}_{\hbox {C}}\), where the pixels in \({I}_{\hbox {B}}\) were clustered into homogeneous patches (see Figs. 1(f) and (g)). Each pixel in \({I}_{\hbox {C}}\) can be represented by two elements: the mean intensity of the patch that it belongs to, and a cluster/patch center (root). For each pixel in \({I}_{\hbox {B}}\), the mean intensity and variance is found in a circular neighborhood, whose size varies depending on the size of the vessel. For small vessels in UHFUS images (\(\le \)70 pixel diameter or 0.81 mm), the neighborhood size was \(3\times 3\) pixels, while it was \(7\times 7\) pixels for larger vessels (\({>}70\) pixels). Each patch root in \({I}_{\hbox {C}}\) has the lowest local variance amongst all the members of the same patch [8]. Roots in \({I}_{\hbox {C}}\) were used solely as seeds to track vessels over sequential B-scans. As seen in Figs. 1(f) and (g), increasing the neighborhood size reduces the number of roots that can be tracked, which can cause tracking failure when large motion occurs.

2.3 Local Phase Analysis

Vessel boundaries in \({I}_{\hbox {B}}\) were highlighted using a Cauchy filter, which has been shown to be better than a Log-Gabor filter at detecting edges in ultrasound [9]. We denote the spatial intensity value at a location \(\mathbf {x}=[{x}{y}]^{T}\) in the image \({I}_{\hbox {B}}\) by \({I}_{\hbox {B}}(\mathbf {x})\). After applying a 2D Fourier transform, the corresponding 2D frequency domain value is \({F(\mathbf {w})}\), where \(\mathbf {w} = [{w}_{1}{w}_{2}]^{T}\). The Cauchy filter \({C}(\mathbf {w})\) applied to \({F(\mathbf {w})}\) is represented as:

$$\begin{aligned} \begin{aligned} C(\mathbf {w}) = {\Vert \mathbf {w} \Vert _2^{u}} \exp \left( -{w}_{o} \Vert \mathbf {w} \Vert _2 \right) ,\qquad {u} \ge {1} \end{aligned} \end{aligned}$$
(1)

where u is a scaling parameter, and \({w}_{o}\) is the center frequency. We chose the same optimal parameter values suggested in [9]: \({w}_{o}=10\), and \(u=1\). Filtering \({F(\mathbf {w})}\) with \({C}(\mathbf {w})\) yielded the monogenic signal, from which the feature asymmetry map (\({I}_{\hbox {FA}}\)) [9] was obtained (see Fig. 1(h)). Pixel values in \({I}_{\hbox {FA}}\) range between [0, 1].

2.4 Vessel Segmentation and Tracking

Initialization. As in [3, 4], we manually initialize our system by clicking a point inside the vessel lumen in the first B-scan of a sequence. This pixel location is stored as a seed, denoted by \({\mathbf {s}}^{0}\) at time \(t=0\), to segment the vessel boundary in the first B-scan, and initialize the vessel lumen tracking in subsequent B-scans.

Initial Boundary Segmentation. \(N = 360\) radial lines of maximum search length \(M = 100\), which corresponds to the largest observed vessel diameter, stem out from \({\mathbf {s}}^{0}\) to find the vessel boundaries in \({I}_{\hbox {FA}}\). The first local maximum on each radial line is included in a set \(\mathcal {I}\) as an initial boundary point (see Fig. 1(i)).

Segmentation Refinement. A rough estimate of the semi-major and semi-minor vessel axes was determined by fitting an ellipse [10] to the initial boundary locations in \(\mathcal {I}\). Next, the estimated values were shrunk by \(75\%\), and used to initialize an elliptical binary level set function (LSF) \({\phi }_{o}\) (see Fig. 1(j)) in a narrowband distance regularized level set evolution (DRLSE) [11] framework. As the LSF initialization is close to the true boundaries, the DRLSE formulation allows quick propagation of LSF to the desired vessel locations \(\mathcal {D}\) (see Fig. 2(a)) with a large timestep \(\varDelta {\tau }\) [11]. The DRLSE framework minimizes an energy functional \(\mathcal {E}(\phi )\) [11] using the gradient defined in Eq. (2). \(\mu , \lambda , \epsilon \), and \(\alpha \) are constants, g is the same edge indicator function used in [11], and \({\delta }_{\epsilon }\) and \({d}_{p}\) are first order derivatives of the Heaviside function and the double-well potential respectively. The parameters used in all datasets were: \(\varDelta {\tau }=10, \mu =0.2, \lambda =1, \alpha =-1, \epsilon =1\) for a total of 15 iterations.

$$\begin{aligned} \begin{aligned} \frac{\partial \phi }{\partial \tau } = \mu \text {div}({d}_{p} (|\nabla \phi |) \nabla \phi ) \, + \, \lambda {\delta }_{\epsilon }(\phi ) \text {div}\bigg (g \frac{\nabla \phi }{|\nabla \phi |}\bigg ) \, + \, \alpha {g} {\delta }_{\epsilon }(\phi ) \end{aligned} \end{aligned}$$
(2)
Fig. 2.
figure 2

(a) Refined segmentation (yellow contour) evolved from initial LSF (brown ellipse); Tracking under large motion - (b) In frame 87, \({\mathbf {s}}_{\hbox {ekf}}^{87}\) (blue) chosen over \({\mathbf {s}}_{\hbox {c}}^{87}\) (orange) to segment vessel (yellow contour), which is then fitted with an ellipse (green); (c) In frame 88, the EKF prediction \({\mathbf {s}}_{\hbox {ekf}}^{88}\) (red) is ignored as Eq. (7) is not satisfied. Instead, \({\mathbf {s}}_{\hbox {c}}^{88}\) (magenta) is chosen as it falls under the elliptical neighborhood (brown) of \({\mathbf {s}}_{\hbox {c}}^{87}\) (orange); (d) Successful contour segmentation (Adventitia) of UHFUS image in Fig. 1(c); (e) Successful segmentation of vessel in HFUS image shown in Fig. 1(b).

Vessel Tracking. To update the vessel lumen position \({\mathbf {s}}^{t}\) at time t to \({\mathbf {s}}^{t+1}\) at time \(t+1\), two new potential seeds are found, from which one is chosen. The first seed is found using an EKF [5, 12]. The second seed is found using \({I}_{\hbox {C}}\), and it is needed in case the EKF fails to track the vessel lumen due to abrupt motion. The EKF tracks a state vector defined by: \(\mathbf {x}^{t} = [c_x^{t}, c_y^{t}, a^{t}, b^{t}]\), where \({\mathbf {s}}_{\hbox {ekf}}^{t}=[c_x^{t}, c_y^{t}]\) is the EKF-tracked vessel lumen location and \([a^{t}, b^{t}]\) are the tracked semi-major and semi-minor vessel axes respectively. Instead of tracking all locations in \(\mathcal {D}\), it is computationally efficient to track \(\mathbf {x}^{t}\), whose elements are estimated by fitting an ellipse once again to the locations in \(\mathcal {D}\) (see Fig. 2(b)). The EKF projects the current state \(\mathbf {x}^{t}\) at time t to the next state \(\mathbf {x}^{t+1}\) at time t+1 using the motion model in [5], which uses two state transition matrices \(A_1, A_2\), the covariance error matrix P, and the process-noise covariance matrix Q. These matrices are initialized in Eqs. (3)–(6).

$$\begin{aligned} A_1&= diag([1.5, 1.5, 1.5, 1.5]) \end{aligned}$$
(3)
$$\begin{aligned} A_2&= diag([-0.5, -0.5, -0.5, -0.5]) \end{aligned}$$
(4)
$$\begin{aligned} P&= diag([1000, 1000, 1000, 1000]) \end{aligned}$$
(5)
$$\begin{aligned} Q&= diag([0.001, 0.001, 0.001, 0.001]) \end{aligned}$$
(6)

The second seed was found using the clustering result. At \({\mathbf {s}}_{\hbox {c}}^{t}\) in the clustered image \({I}_{\hbox {C}}^{t+1}\) at time \(t+1\), the EKF tracked axes \([a^{t+1}, b^{t+1}]\) were used to find the neighboring roots of \({\mathbf {s}_{\hbox {c}}^{t}}\) in an elliptical region of size \([{1.5{a}^{t+1}}, {b}^{t+1}]\) pixels. Amongst these roots, the root \({\mathbf {s}}_{\hbox {c}}^{t+1}\), which has the lowest mean pixel intensity representing a patch in the vessel lumen, is chosen. By using the elliptical neighborhood derived from the EKF state, \({\mathbf {s}}_{\hbox {c}}^{t}\) is tracked in subsequent frames (see Fig. 2(c)). The elliptical region is robust to vessel compression, which enlarges the vessel horizontally.

The EKF prediction is sufficient for tracking during slow longitudinal scanning or still imaging as \({\mathbf {s}}_{\hbox {ekf}}^{t+1}\) and \({\mathbf {s}}_{\hbox {c}}^{t+1}\) lie close to each other. However, when large motion was encountered, the EKF incorrectly predicted the vessel location (see Fig. 2(c)) as it corrected motion, thereby leading to tracking failure. To mitigate tracking failure during large vessel motion, \({\mathbf {s}}_{\hbox {ekf}}^{t+1}\) was ignored, and \({\mathbf {s}}_{\hbox {c}}^{t+1}\) was updated as the new tracking seed according to the rule in Eq. (7):

$$\begin{aligned} {\mathbf {s}}^{t+1}&= {\left\{ \begin{array}{ll} {\mathbf {s}}_{\hbox {c}}^{t+1} &{} if \quad \Vert {\mathbf {s}}_{\hbox {ekf}}^{t+1} - {\mathbf {s}}_{\hbox {c}}^{t+1} \Vert _2 > a^{t+1}\\ {\mathbf {s}}_{\hbox {ekf}}^{t+1} &{} \quad otherwise \end{array}\right. } \end{aligned}$$
(7)

3 Results and Discussion

Metrics. Segmentation accuracy of the proposed approach was evaluated by comparing the contour segmentations against the annotations of two graders. All images in all datasets were annotated by two graders. Tracking was deemed successful if the vessel was segmented in all B-scans of a sequence. Considering the set of ground truth contour points as G and the segmented contour points as S, the following metrics were calculated as defined in Eqs. (8)–(11): (1) Dice Similarity Coefficient (DSC), (2) Hausdorff Distance (H) in millimeters, (3) Definite False Positive and Negative Distances (DFPD, DFND). The latter represent weighted distances of false positives and negatives to the true annotation. Let \({I}_{\hbox {G}}\) and \({I}_{\hbox {S}}\) be binary images containing 1 on and inside the area covered by G and S respectively, and 0 elsewhere. The Euclidean Distance Transform (EDT) is computed for \({I}_{\hbox {G}}\) and its inverse \({I}_{\hbox {G}}^{\hbox {Inv}}\) [13]. DFPD and DFND are estimated from the element-wise product of \({I}_{\hbox {S}}\) with EDT(\({I}_{\hbox {G}}\)) and EDT(\({I}_{\hbox {G}}^{\hbox {Inv}}\)) respectively (10)–(11). d(iGS) is the distance from contour point i in G to the closest point in S. Inter-grader annotation variability was also measured.

$$\begin{aligned} \hbox {DSC}&= \frac{2|G \cap S |}{|G |+ |S |} \end{aligned}$$
(8)
$$\begin{aligned} \hbox {H}&= \max \Big (\underset{i \in [1,|G|]}{\max } d(i,G,S), \underset{j \in [1,|S|]}{\max } d(j,S,G) \Big ) \end{aligned}$$
(9)
$$\begin{aligned} \hbox {DFPD}&= \log \Big (\Vert \hbox {EDT}({I}_{\hbox {G}}) \circ {I}_{\hbox {S}} \Vert _1\Big ) \end{aligned}$$
(10)
$$\begin{aligned} \text {DFND}&= \log \Big (\Vert \hbox {EDT}({I}_{\hbox {G}}^{\hbox {Inv}}) \circ {I}_{\hbox {S}} \Vert _1\Big ) \end{aligned}$$
(11)
Fig. 3.
figure 3

Quantitative segmentation and tracking accuracy metrics for 35 UHFUS (top row) and 5 HFUS (bottom row) sequences respectively. The black * in each box plot represents the mean value of the metric. The terms ‘G1vG2’ and ‘G2vG1’ in figure represent the inter-grader annotation variability when grader 2 annotation was considered the ground truth, and vice versa.

UHFUS Results. We ran our algorithm on 35 UHFUS sequences (100 images each), and the corresponding results are shown in Figs. 3(a)–(d). The two graders varied in their estimation of the vessel boundary locations in UHFUS images due to the speckle noise obscuring the precise location of the vessel edges, as shown in the inter-grader Dice score in Fig. 3(a), inter-grader Hausdorff distance in Fig. 3(b), and inter-grader variation between Figs. 3(c) and (d). Grader 2 tended to under-segment the vessel (G1vG2, low DFPD and high DFND scores), while grader 1 tended to over-segment (G2vG1, high DFPD and low DFND scores). As desired, our segmentation tended to be within the region of uncertainty between the two graders (see Figs. 3(c) and (d)). Accordingly, the mean Dice score and mean Hausdorff distance of our algorithm against grader 1 (\(0.917\pm 0.019, 0.097\pm 0.019\,\text {mm}\)) and grader 2 (\(0.905\pm 0.018, 0.091\pm 0.019\,\text {mm}\)) were better than the inter-grader scores of (\(0.892\pm 0.019, 0.105\pm 0.02\,\text {mm}\)). The largest observed Hausdorff distance error of \(0.135\,\text {mm}\) is 6 times smaller than the smallest observed vessel diameter of \(0.81\,\text {mm}\). Similarly, the mean Hausdorff distance error of \(0.094 \pm 0.019\,\text {mm}\) is \({\sim }7\) times smaller than smallest observed vessel diameter. This satisfies our goal of sub-mm vessel contour localization. Tracking was successful as the vessel contours in all sequences were segmented.

HFUS Results. To show the generality of our approach to HFUS, we ran our algorithm on 5 HFUS sequences (250 images each), and the corresponding results are shown in Figs. 2(e) and 3(e)–(h). As opposed to UHFUS, lower DFPD and DFND scores were seen with HFUS, meaning a greater consensus in grader annotations (see Figs. 3(g) and (h)). Notably, our algorithm still demonstrated the desirable property of final segmentations that lay in the uncertain region of annotation between the two graders. This is supported by comparing the mean Dice score and mean Hausdorff distance of our algorithm against grader 1 (\(0.915\pm 0.008, 0.292\pm 0.023\,\text {mm}\)) and grader 2 (\(0.912\pm 0.021, 0.281\pm 0.065\,\text {mm}\)), with the inter-grader scores (\(0.915\pm 0.02, 0.273\pm 0.04\,\text {mm}\)). To compare against the 0.1 mm Mean Absolute Deviation (MAD) error in [6], we also computed the MAD error for HFUS sequences (not shown in Fig. 3). The MAD error of our algorithm against grader 1 was \(0.059\pm 0.021\,\text {mm}, 0.057\pm 0.024\,\text {mm}\) against grader 2, and \(0.011\pm 0.003\,\text {mm}\) between the graders. Despite the lower pixel resolution (\(92.5\,\upmu \mathrm{m}\)) of the HFUS machine used in this work, our MAD errors were \({\sim }2{\times }\) lower than the state-of-the-art \(0.1\,\text {mm}\) MAD error in [6]. Furthermore, only minor changes in the parameters of the algorithm were required to transfer the methodology to HFUS sequences; namely, the bilateral filter size was \(3\times 3\) pixels, \({w}_{o}=5\), and \(\varDelta {\tau }=8\). No other changes were made to the level set parameters.

Performance. The average run-time on an entry-level NVIDIA GeForce GTX 760 GPU was 19.15 ms per B-scan and 1.915 s per sequence, thus achieving a potential real-time frame rate of 52 frames per second. The proposed approach is significantly faster than the regular CPU- [6], and real-time CPU- [2,3,4] and GPU-based approaches in [5] respectively. Efficient use of CUDA unified memory and CUDA programming contributed to the performance speed-up.

4 Conclusion and Future Work

In this paper, a robust system combining the advantages of local phase analysis [9], a distance-regularized level set [11], and an Extended Kalman Filter (EKF) [12] was presented to segment and track vessel contours in UHFUS sequences. The approach, which has also shown applicability to traditional HFUS sequences, was validated by two graders, and it produced similar results as the expert annotations. To the best of our knowledge, this is the first system capable of rapid deformable vessel segmentation and tracking in UHFUS images. Future work is directed towards multi-vessel tracking capabilities.