1. Introduction
Forest inventories are critical to assess timber volume and economic value, whether for wood products or standing carbon, but are costly, time-consuming, and labor-intensive when collected via traditional ground-based methods. Large-scale airborne laser scanning (ALS), or LiDAR, has the potential to provide estimates of forest metrics, such as standing volume, biomass, and carbon across large spatial extents [
1,
2,
3]. While individual tree height estimates have a high level of accuracy [
4], direct measurements of individual tree stem diameter from ALS have not reliably been achieved. This is due to the inability of most sensors to consistently penetrate forest canopies and return from the tree stem, or the stem being occluded (e.g., [
5,
6]). Laser power, scan angle, and point density are key parameters for this task, and all of these will influence the number of returns from the sub-canopy. Both the sensor and the characteristics of the platform on which it is carried (e.g., manned fixed-wing aircraft, uncrewed aerial vehicles, etc.) greatly influence the spatial scale and accuracy of the acquisition.
ALS sensors are biased to recording returns from the upper portions of the forest canopy, particularly when canopy vegetation is dense [
7,
8]. Larger-scan angles (>20°; where 0° is nadir) cover a larger area (footprint) than smaller-scan angles and are more likely to encounter gaps and penetrate deeper into the forest canopy [
9,
10]. In addition, canopy architecture, such as leaf orientation and species type, will influence the detection of returns as scan angle changes [
11]. Returns from features beneath the top canopy surface are often composed of intermediate and last return types, and thus there is less energy available to detect. Brede et al. [
12] evaluated ALS sensors onboard uncrewed aerial vehicles (UAV-LS) and stated that dense forest canopies require high-power laser pulses and a high-maximum sensor detection range to obtain returns from beneath the canopy. Hopkinson et al. [
13] demonstrated that beam divergence can influence canopy penetration depending on the type of canopy vegetation encountered and that high pulse repetition frequencies can result in higher rates of sub-canopy returns due to increased probability of pulses encountering canopy gaps. Weight and power requirements in UAVs are also of critical concern. UAV-LS laser detection range is often limited and highly variable by sensor; for example, the DJI Livox laser scanner (Livox Technology, Shenzhen, China) and Velodyne HDL-32E (Velodyne LiDAR, Inc., San Jose, CA, USA) have maximum detection ranges of 260 m and 100 m, respectively. These limitations must be factored into flight specifications. Occlusion from canopy components also represents a significant problem in detecting returns from lower forest canopy height strata. These issues present several problems in trying to classify individual tree features beneath the forest canopy. UAV-LS traditionally has acquired high pulse densities (e.g., [
14]) but are often spatially limited. However, developments in sensor technology, such as full-waveform recording ALS sensors (e.g., [
15]), have opened new possibilities to acquire high-density ALS point clouds.
Previous studies have presented methods to classify portions of individual tree stems in ALS-derived datasets and to estimate stem dimensions, such as diameter at breast height (DBH), directly from the features of the point cloud [
16,
17], or via allometry [
15,
16,
18]. With the increased point density of UAV-LS, several studies have successfully classified some tree stems and have estimated stem size. Kuželka et al. [
19] classified tree stems based on a segmentation of voxels in the sub-canopy and three types of circle fitting methods applied to horizontal cross-sections at multiple heights in order to estimate stem diameter. Dalla Corte et al. [
20] iteratively fitted Hough transforms to the data to identify the location and to estimate the size of stems. Neuville et al. [
21] documented a method to estimate DBH that combined hierarchical density-based clustering, principal components analysis, and a least-squares circle fit to returns within the height range of 1.2 to 1.4 m above ground. The authors noted, however, that point density was often insufficient at these near-ground heights to accurately estimate stem size. Hyyppä et al. [
22] used crewed-helicopter ALS (HALS) to classify individual trees and estimate DBH using clustering and circle fitting methods. Thus, below-canopy conditions (within a point cloud dataset) are likely to be highly variable in terms of point density and vegetation structural complexity, which may adversely affect our ability to identify stem features.
Previous ALS research has focused on the estimation of individual tree level metrics using statistical models. Sumnall et al. [
23], for example, developed a predictive model for DBH using a linear regression approach with individual tree crown (ITC) size (e.g., top height and crown area) and distance-dependent neighborhood index metrics for the local environment around each ITC as input variables. These input variables are almost exclusively derived from the upper canopy and crowns. The model was developed specifically for loblolly pine, and produced adjusted R
2 values of 0.62, and a root mean square error (RMSE) of 3.13 cm for DBH.
Acquisitions from HALS may provide a number of benefits over fixed-wing aircraft in terms of point density and could provide a high pulse density (>500 pulse m−2) point cloud dataset, comparable to UAV-LS, but at a potentially larger scale. Individual stem diameter estimation at multiple heights, and therefore volume, should be feasible with the right combination of laser power, scan angle, point density, flight coverage, and environment.
The overall objective of this study was to assess the potential of HALS-derived point clouds to provide estimates of stem diameter in managed loblolly pine (
Pinus taeda L.) following an automated ITC classification. We propose a method of classifying the tree stem based on shape fitting and density-/model-based clustering. Our first objective was to develop four methods to predict stem diameter at multiple heights along the stem to estimate DBH directly. The second objective was to adapt the four circle fitting methods to incorporate an existing loblolly stem taper equation to estimate DBH. The third objective was to evaluate DBH estimates compared directly to the following: (a) field-measured data, including stem-mapped plots located in an operational plantation; and (b) predictions from a previously developed ALS-ITC metric DBH model [
23]. Given the high density of the HALS point clouds, we hypothesized that the circle fitting would accurately predict individual stem diameter and do so with greater accuracy than the ALS-ITC/Sumnall et al. [
23] DBH model.
2. Materials and Methods
2.1. Study Site and Management Background
Our study spanned four stands, with an approximate total area of 196 hectares, located near Woodville, Texas, USA. All four stands were within 25 km of the following coordinates: 30.674161°N–94.551503°W. All stands were composed of intensively managed loblolly pine plantations. Understory species were primarily yaupon (
Ilex vomitoria A.), blackberry (
Rubus spp.), and devil’s walking stick (
Aralia spinosa L.). Stands varied by total area. All stands shared common features of being at mid-rotation age and had been recently thinned to similar stem densities and basal areas, summarized in
Table 1. Of the 46 plots installed during fieldwork, 24 had herbicide applied in May 2020.
2.2. Field Data Collection
Fieldwork was conducted in 46 plots in September 2022. The average plot area was approximately 300 m2. Field plots were established in the interior of each stand; no plot could be established within 70 m of the plot boundary to ensure that edge effects were minimized. At each plot, DBH (at a height of 1.37 m) was recorded for every loblolly pine tree, and two tree top heights and height to living crown (HTLC) were measured using a Vertex hypsometer. Each of the measured tree stem locations was recorded using a Trimble Geo 7X (Trimble Inc., Colorado, CA, USA), a survey-grade Global Navigation Satellite System (GNSS) device—maintaining an accuracy threshold of 1 m or less. Between 9 and 51 stems were recorded in each plot, with a mean stem count of 17.
2.3. HALS Specification and Data Pre-Processing
Small-footprint scanning waveform HALS data were acquired for all sites from a helicopter, a Bell 206B Jet Ranger, in November 2022. The HALS sensor was a RIEGL VUX 240 (RIEGL Laser Measurement Systems Inc., 2024; Winter Garden, FL, USA). The sensor has a maximum detection range of 1400 m. The acquisition altitude was approximately 100 m above ground. The decomposition of waveform data into discrete returns was performed in RiProcess (version 1.8, [
24]). The initial point density per individual swath/flight line was 239 points m
−2 (at nadir ±5°). When merged, this resulted in a dataset with a point density of approximately 800 points m
−2 with up to 9 returns per pulse. The scan angle had a range of 0° (nadir) to 45°. Flight line overlap was >75%. A ‘cross-hatch’ flight plan was implemented, i.e., overlapping flights perpendicular to one another, with the intention of reducing potential occlusion effects; see
Figure 1 for an example. The on-board helicopter navigation used a real-time kinematic global positioning system with a precision of ≤10 cm.
All HALS data pre-processing and analysis were completed using the R software (version 4.3.1.) [
25]. The specific packages we used were as follows: lidR (version 4.0.3, [
26,
27]), TreeLS (version 2.0.5, [
28]), rTLS (version 0.2.5.6, [
29]), and data.table (version 1.14.8, [
30]) were used to process HALS files. Geographic information system functions utilized sf (version 1.0.13, [
31]) and raster processing used terra (version 1.7.39, [
32]). The dbscan (version 1.1.11, [
33]) and mclust (version 6.0.1, [
34]) packages were used for clustering 3D points based on relative distance to their neighbors and for noise removal. The deldir package (version 1.0.9, [
35]) was used for the creation of Voronoi tessellations. The geometry package (version 0.4.7, [
36]) was used for the calculation of 3D convex hulls. The soilphysics package [
37] was used to assess graph-slope curvature.
We applied a noise filter where HALS returns were classified using an isolated voxels filter, as specified in Sumnall et al. [
23]. In summary, an initial 3D voxel matrix was created (with a cubic side length of 1 m), and a moving search kernel of 3 × 3 × 3 cells was used to classify isolated returns. Ground return classification was implemented using a cloth simulation filter [
38] and used to create a digital terrain model triangular irregular network. Aboveground heights calculated by subtracting the height of the terrain surface from all returns.
2.4. Individual Tree Crown Processing
We implemented the ITC classification method as described in Sumnall et al. [
23] for all four stands. In summary, initial ITC center coordinates were estimated using a local maximum search of a raster layer representing a 2D vertical occupancy index, where each pixel represents a sum of the number of occupied 0.2 m voxels vertically—cell areas of high density were assumed to represent horizontal tree centers. ITC horizontal extents were established using the marker-based approach defined in Silva et al. [
39] using a 0.1 × 0.1 m canopy height model. All returns of less than 0.2 m height aboveground were considered as ground and ignored.
All HALS returns intersecting with the classified horizontal area, or footprint, of the ITC were initially considered as belonging to the individual tree, even if understory vegetation was present. The HALS point cloud was subset by the ITC classification, and metrics were calculated for each ITC. Top height is the largest return height of all returns within the ITC classification. The mean of ITC X and Y coordinates of returns in the top 50% of ITC height was considered as the center position of the crown. To estimate the vertical extent of the ITC, HTLC was computed for each ITC following the method used in Sumnall et al. [
23]. In summary, a cubic-smoothed spline was fit to the vertical distribution of the frequency of returns incident in height bins of 0.2 m in size, from height 0 m to the tallest return height incident in the ITC footprint (i.e., 0.2 to 0.4, 0.4 to 0.6…
b1 to
b2 m). We assumed local maxima along this line to correspond to the presence of vegetation layers. Local maxima and minima were identified along this fitted line and were used to estimate the vertical extent of the tree crown.
2.5. Individual Tree Stem Classification
All HALS returns intersecting with the ITC horizontal footprint formed a subset of analysis. We classified stem returns for each ITC class by initially searching the HALS point cloud for exposed areas of the tree stem (see
Figure 2 for an example). Not all trees had unobstructed stems for estimating stem size (caused by foliage, branches, or understory in proximity). Therefore, our approach employed two stages: (i) a linear feature search and (ii) a density-/model-based clustering of the point cloud data to remove potential understory (or non-stem features).
We assumed loblolly pine tree stems to be cylindrical in shape, orientated vertically, and to have no defects. We used an approach to segment linear features within the point cloud based on the approach outlined in Limberger and Oliveira [
40] and implemented in the LiDR R package [
26,
27]. In summary, the approach utilized a Hough transform with 3D neighborhoods defined by a tri-variate Gaussian kernel. Returns were classified as linear if the returns identified by the algorithm were approximately linear and vertical. We utilized the following settings: the number of neighbors used to estimate the neighborhood was set to 50 returns, and the distance threshold value was set to 2.5 m. These settings were determined using random samples across all four stands, which intersected with field plot locations, accounting for 10% of total stems. These settings were not altered for the remaining areas.
We assumed the stem would be composed of many HALS returns, within a narrow horizontal extent (<1 m
2); thus, non-vertical stems would be ignored. To remove any non-stem returns or false-positive linear features, density-based clustering (DBSCAN, [
33]) was used to identify large contiguous groups of linear features within the footprint of the ITCs. An initial 3D search kernel with an ellipsoid of 0.3 m and a minimum cluster size of 15 points was used for the DBSCAN process. The clusters identified were assessed using the following criteria: (1) the minimum and maximum return heights and (2) a convex hull of the horizontal area occupied by returns. Clusters were removed if any of their heights existed above the estimated HTLC value. As we assumed the tree stem to be straight and vertical, therefore clusters that horizontally overlapped (via convex-hull polygons) were merged (horizontal coordinate values only, ignoring height values). Finally, a cubic-smoothed spline was fitted to each cluster using the 3D coordinates of all returns. The spline is composed of multiple nodes (3D coordinates) summarizing the distribution. The vertical angle was calculated between each node, where the average mean vertical angle (where 0° is vertical) and standard deviation of angle were calculated sequentially from bottom to top. Any cluster with a mean or standard deviation with angle values exceeding 20° was considered non-straight and excluded; thus, any branches or leaning stems were not included. The presence of understory near to the tree stem prevented this method from classifying the desired feature completely and required the following modification, outlined below, when encountered.
When understory vegetation was in proximity to stem returns (≤0.5 m), we used a different stem classification method. First, we subset the ITC point cloud below the HTLC estimate, leaving all potential returns from the stem, non-crown branches, and understory. Model-based clustering, based on finite normal mixture modeling, was applied to the ITC subset data (as in the mclust package, [
34]), and specifically to the horizontal coordinates of all returns (XY). Here, we assumed the stem cluster had a small horizontal area and had a different return distribution than any understory or horizontally projecting branch clusters. The optimal number of clusters (
n, which could be between one and eight) was automatically determined using the ‘clustCombi’ and ‘clustCombiOptim’ functions in the mclust package [
34]. A convex hull was applied to the resultant 2D clusters, and the horizontal area was calculated. If all the individual horizontal areas of the clusters exceeded 0.5 m
2, the number of potential clusters was multiplied by two times the original value (
n × 2, to ensure smaller cluster sizes), and the clustering was recalculated until cluster sizes were all small enough. The clusters containing any returns that belonged to a linear stem feature (as in the previous paragraph) were all reclassified as stem returns. If multiple clusters were found, clusters with over 20% of their points classified as linear were reclassified as stem returns. All other clusters were classified as non-stem. A final, model-based cluster step was implemented if the horizontal area of the stem-classified returns from below the HTLC had a horizontal area (calculated via convex hull) of more than 2 m
2. The cluster with the smallest convex-hull horizontal area maintained the stem classification, with the remaining returns reclassified as non-stem. An example of the classification result is illustrated in
Figure 3.
2.6. Estimating the Size of the Tree Stem
We assumed that the stem may not be a perfect horizontally oriented conical shape; we did, however, assume the amount of tilt in the study site to be minimal. To account for any potential tilt in the tree, a 3D principal component vector line was fit to the classified stem returns and the tree top coordinates. The direction of this vector was used to transform the stem coordinates using the methods documented in the rTLS package [
29] to be vertical.
For each ITC with a classified stem, four methods were implemented to estimate stem radii at multiple heights along the classified stem when sufficient returns were available. The following four methods were as follows: (i) Fitting logistic regression circles to individual HALS scans-lines (ARC); (ii) random sample consensus circle (RANSAC); (iii) iterative reweighted least squares (IRLS); and (iv) non-linear logistic regression circle fitting (LScircle). The first method, ARC (
Section 2.6.1), or individual HALS scan line circle fitting, is predicated on returns from the tree stem being from a range of heights composed of multiple separate scan lines (from separate flight lines with different look angles) and will be oblique relative to the tree stem. When the returns for a single scan line are viewed horizontally, they will resemble a curve, from which we can fit a circle, and thus we can estimate diameter. The RANSAC, IRLS, and LScircle methods did not distinguish individual HALS scan lines, instead relying on subsets of all returns within specific height ranges; thus, a regular set of heights was specified. The second method (
Section 2.6.2), RANSAC fitting, employs methods developed for TLS datasets to provide stem diameter estimates. The IRLS and LScircle methods relied on the returns within binned 1 m height segments and fit shapes to the horizontal distribution of points within each. The third and fourth approaches, (
Section 2.6.3) IRLS circle fitting and binned non-linear LScircle (
Section 2.6.4), work on the horizontal distribution of binned points. Thus, functions were fitted to the vertical distribution of HALS returns to estimate circle radius at multiple heights along the stem, illustrated in
Figure 4.
2.6.1. Individual HALS Scan Line Circle Fitting (ARC)
Vertical scan angles within the HALS dataset ranged from 0° to 45°, and a full 360° horizontal sampling of the stem was not possible; therefore, there was no true horizontal cross-section measure of stem circumference available. The laser pulses in the flight line intersected with a portion of the stem at an angle, if at all. Assuming the tree stem is straight and vertical, the scan line provided the data for an ‘arc’ of surface returns across the surface of a cylinder, hereafter referred to as ARC. These ARCs are illustrated in
Figure 5.
For the classified stem returns, we first identified individual ARCs by identifying differences in sequential individual pulse GPS times. We assumed time difference values larger than the minimum difference between two sequential laser pulses emitted from the sensor indicated different scan lines. This minimum time difference value was multiplied by 10 to account for small breaks. This difference value was used to determine any temporal breaks in the subset. If any of the resultant ARCs had a population of less than five returns and/or all returns covered a small horizontal spatial extent (<0.01 m2), we considered there being too few points to fit a circle to, and the points were discarded. Only returns between the heights of 0.5 m to the estimate HLTC were considered here.
A function was written in R to calculate a best-fit circle to 2D points using non-linear logistic regression based upon the R script presented in Fischetti [
41] and Spacedman [
42] using X and Y coordinate values with optimization using the method documented in Nelder and Mead [
43]. The initial estimate of circle radius (
rmean) or the average distance of the coordinates to the average position of all coordinates—a surrogate for circle radius—is as follows:
where
Xi and
Yi are the coordinates of the return
i,
Xmean and
Ymean are the mean average of all coordinates for the given point dataset. The optimization function was applied to
Craw to provide an estimate of circle size and center position and is expressed as follows:
We therefore calculated a best-fit circle for each set of ARC horizontal coordinates. The median return height per ARC was used to represent the height of the stem radius estimate.
2.6.2. Random Sample Consensus Circle Fitting (RANSAC)
This approach adapted the existing methods available through the TreeLS R package [
28]. The package provided a means of applying iteratively fit Hough transforms to identify stem points and fit a RANSAC circle-fitting algorithm at a user-defined height. The method is based upon Olofsson et al. [
44]. The following settings were used as input to the function: a minimum of 15 points were selected for every RANSAC iteration, and 20 points were set as the optimal number; a median of estimates with the lowest error was used to estimate radius. These settings were consistent throughout. We attempted to estimate stem radius for every 25 cm of stem height beginning at a height of 0.5 m up to the HTLC (i.e., 0.50, 0.75, 1.00, 1.25, 1.50, …, etc.). For each of the heights, a circle fit was attempted, and if successful, circle center coordinates, radius, and error were stored.
2.6.3. Iterative Reweighted Least Squares Circle Fitting (IRLS)
This approach combined sub-setting returns in vertical height bins and the shape fitting functions of TreeLS, specifically the IRLS circle fitting algorithm [
45]. Vertically overlapping 1 m height bins were created for each tree stem from 0.5 up to the HTLC, with 75% overlap (i.e., 1.00–2.00; 1.25–2.25; …, etc.). A circle was fitted to the horizontal distribution of all HALS returns within the 1 m height bin. If the number of returns within the bin was below five, that height was skipped. As before, height, circle center coordinates, radius, and error were stored for each of the circles fitted.
2.6.4. Binned Non-Linear Logistic Regression Circle Fitting (LScircle)
The final circle fitting approach, LScircle, combined sub-setting returns by height bin and circle fitting (as in
Section 2.6.1). Vertically overlapping 1 m height bins were created for each tree stem from 0.5 m up to the HTLC, with 75% overlap (i.e., 1.00–2.00; 1.25–2.25; …, etc.). The returns intersecting with each height bin were subset, and a circle was fit to the horizontal distribution of returns. Bins containing less than 5 points were considered too few points and ignored. For each successful circle fit, the height, coordinates of the circle center, radius, and fit error were recorded.
2.7. Estimating Stem Diameter at Breast Height
Ultimately, there were eight estimates of stem diameter from methods applied to classified tree stems to estimate DBH, in addition to those from statistical methods. Four, as summarized above, are derived by circle fitting at different stem heights, applying a cubic spline function to the distribution of stem radii estimates, and estimating values at DBH (
Section 2.7). The estimates from the cubic spline were then used as input to a pre-existing stem taper equation, yielding four more estimates (
Section 2.8). Finally, additional estimates of stem DBH were produced from existing statistical methods (
Section 2.9).
Each of the four methods generated estimates of stem radius at different heights. To estimate stem diameter at a height of 1.37 m, we must account for missing or erroneous values of estimated radii along the stem. We assumed all stems were vertical and roughly conical in shape. We therefore implemented the following to estimate stem diameter at any height along the stem below the canopy. We first removed any unreasonably large stem radii estimates, as determined by half of the largest field-measured DBH value multiplied by 25%—where radii values over 0.4 m were removed. In addition, we removed any radii estimates from any circle where the center coordinates were over 1 m from the horizontal mean of all stem-classified returns. Each outlying radius value was identified and excluded based on the interquartile range, where values lower than the 25th percentile or above the 75th percentile were removed.
The following was used to approximate the horizontal radii of the stem, from the top to the ground via a spline. A single artificial point was created from the maximum return height in the crown footprint with an arbitrary stem radius value of 0 m to represent the tree’s top. This point was used along with radii estimates derived from circle fitting to account for the assumed general taper or the conical shape of the stem. We expected noise in the estimates across the height of the stem due to the variability in returns and the effectiveness of each method. To account for potential noise, a weighted cubic-smoothed spline was then fit to the distribution of height and radii values. Weights were calculated from the radii values using the following:
where
Wi is the new weight value for sample
i and
Ri is the radius value for sample
i. The largest radii value for the current stem is
Rmax, and
Rmin is the lowest. The values
SRmax and
SRmin denote the maximum and minimum values for rescaling, respectively. An exploratory analysis based on a random 10% sample of all ITCs with a stem classification was implemented, where different weights were tested with regard to individual tree DBH estimate accuracy; specifically,
SRmin values were varied in increments of 0.05 from 0.00 to 0.95, where
SRmax was always 1.00. Those weight values that provided the most accurate DBH estimates were 0.65 to 1 (ARC), 0.85 to 1 (RANSAC), 0.65 to 1 (IRLS), and 0.65 to 1 (LScircle). The spline was applied using the following settings for all stems: the smoothing parameter was set as 0.99 to be straight and monotonic, and the tolerance value was set to 0.001 m. The radius was estimated by evaluating the spline value at a height of 1.37 m and multiplying the value by two for the purposes of DBH estimation.
2.8. Implementation of an Existing Stem Taper Model
As noted previously, we initially classified portions of the stem below the HLTC and above any understory. We assumed there to be more returns from this section and potentially less noise. Thus, estimating stem size at these heights may be more accurate than at lower heights. Applying a relevant taper equation to account for typical loblolly pine shape could therefore be applied to estimate stem DBH.
In addition to estimating radii from fit cubic splines at the height of DBH (1.37 m), the value of the spline was extracted for all existing radii heights in each of the four methods (i.e., the value of the spline at the height where the circles were fit). We multiplied the spline-radii value by two to estimate diameter and converted the units to inches and height to feet for compatibility with an existing loblolly pine stem taper equation. Specifically, the overbark taper model presented in Coble and Hilpp [
46] for semi-intensive plantation-grown loblolly pine forests in East Texas was used. This model was modified to estimate DBH from a diameter measurement from any height point above 1.37 m and is expressed as follows:
where
;
−3.7178;
1.7805;
−1.4659;
74.1804;
0.7986;
0.0895.
All value units, post-calculation, were converted to centimeters. Thus, multiple estimates of DBH were produced for all spline radii values at the various heights available per classified stem. A mean of these values was calculated for each ITC. This approach was applied to all four circle-fitting methods, providing an additional four estimates of DBH per ITC object, referred to as Taper+ARC, Taper+RANSAC, Taper+IRLS, and Taper+LScircle.
2.9. Calculation of Stem Diameter Using Statistical Models
Estimates of individual tree DBH were also calculated using models generated by Sumnall et al. [
23]. These models required ITC input metrics relating to ITC size (estimates of top height, HTLC, 3D crown volume, 3D crown surface area, and horizontal area) and local competitive neighborhood indices (based on Hegyi [
47], with an index for each ITC size metric). In summary, the training datasets consisted of 29,700 individual tree samples of DBH and were developed for managed research loblolly pine sites located in Virginia and North Carolina, USA. When the model predictions were assessed against an independent dataset, they were correlated with field values. The Sumnall et al. [
23] model was used to estimate DBH and compared to the field-measured values of DBH in this study. Hereafter we will refer to this approach as PREV.
2.10. Linking Field- and HALS-Derived Individual Tree Stem Metrics
We assumed the horizontal center of gravity of return coordinates calculated from the returns of each ITC classification to be representative of the stem location. Field stem GNSS coordinates were compared directly with HALS-delineated ITC objects using the following approach: a 1 m radius buffer was applied to the field stem map location. This value was smaller than the expected stem minimum spacing within the row (~2 m). We then examined ITC coordinates that intersected with this buffer layer. If multiple intersections were present, the closest ITC to the field GNSS coordinate was kept. If stem coordinates could not be linked to any ITC coordinate, it was considered an omission error. We considered commission error as more ITCs being delineated within a plot extent than could be reasonably accounted for via field validation.
We compared matched field-ITC DBH values at an individual tree level. The predicted values from the stem classification and four circle fitting methods, the four circle fitting method with taper estimates, and PREV predictions were evaluated using the absolute and normalized root mean square error (RMSE and RMSE%, respectively) and absolute and normalized bias (Bias and Bias%, respectively).
4. Discussion
The eight circle fitting methods applied to HALS stem classified returns implemented here estimated individual tree stem diameter (RMSE 7.8–8.2 cm; and 6.9–8.5 cm when using the taper model) had accuracies approaching that from the corrected version of the existing method PREV corr (RMSE 6.2 cm). The differences in accuracy were small when comparing circle-based estimates of DBH and those derived from incorporating a taper equation, where no obvious improvements were observed except for the IRLS method. The stem classification methods(s) developed in this research ultimately provided a similar level of accuracy to the statistical approach, albeit for fewer stems (77% of all field-measured stems), and required no statistical model training in the traditional sense.
4.1. Stem Classification and DBH Estimates
Tree detection, or more specifically ITC, successfully detected all of the field reference trees. This result is consistent with previous work by Sumnall et al. [
23], where the same method detected 99% of the trees in a sparse plot with 618 trees/ha. However, the detection rate decreased significantly for denser plots and with lower pulse density [
48]. Our method to classify stem returns resulted in 80.1% of field-measured stems being classified. Yao et al. [
15] delineated ITCs using 25 points m
−2 waveform ALS and reported the successful classification of 93 to 95% of stems in field plots. Kuželka et al. [
19] classified 99–100% of stems in mixed deciduous and coniferous forest from high pulse density UAV-LS data. In a more structurally complex forest, Rudge et al. [
18] classified 47–68% of ITCs using UAV-LS data.
The dataset and context of the current study are important in understanding the stem detection method developed from it. Various parts of the approach, such as the specification of radii that were too large and the development of spline weights for each of the circle-fitting approaches, may only be applicable to the current study and within thinned managed loblolly pine forest stands (aged 15–23). The requirement of our method to use the cubic spline to smooth variation in stem diameter estimates may imply either insufficient returns for accurate circle fitting or the presence of additional noise in the classified returns, caused by the presence of knots or the remnants of dead branches, for example, causing a circle to be fit including these features. A filtering of points used in circle fitting may be warranted in future work. In addition, given the potential range of ALS acquisition specifications and sensor parameters critical to providing returns below the main canopy, transferring this approach to another context should be thoroughly tested.
When visually comparing the distribution of DBH values from all field-measured trees against those with a corresponding HALS estimate, there did not appear to be any sampling bias or skewness in the distributions. This implies that for the ability to classify the stem returns and estimate DBH from HALS, there was no apparent bias towards larger or smaller stems, in terms of DBH. However, a smaller proportion of stems were classified in areas with understory compared to areas without understory. This implies that point cloud characteristics, such as occlusion or the additional complexity in the arrangement of vegetation features (anything interfering with classifying linearly arranged returns), are of concern. Point density at different heights (e.g.,
Figure 6), in addition to non-circular shapes of stems (in cross-section), will influence accuracy. Noise within the points used to calculate a circle would also influence accuracy. Similar to that observed in Neuville et al. [
21], the probability of classifying a tree stem was reduced due to reduced return density at lower vertical strata. Hyyppä et al. [
49] stated that the point data from UAV-LS was not sufficient to predict stem attributes.
Two of our methods, however, were able to estimate stem diameter for 77% and 76% of stems in the current study, and no commission error was observed. Excluding the results from ARC, the RMSE values of 7–10 cm (RMSE% < 20%) were produced for DBH, which compares favorably with the existing research. While the RANSAC approach was often the most accurate, it produced fewer estimates (45% of stems). IRLS produced similar results in terms of RMSE, but for more stems. Averaged plot-level estimates also showed IRLS as the most accurate. Our methods were dependent on HALS returns existing on the classified stem on or near to the height where DBH is measured (i.e., 1.37 m), and our trees were in pine plantations with low stem density following thinning. Using stem returns to estimate DBH, Neuville et al. [
21] were able to classify up to 82% of stems from UAV-LS data and estimate DBH for up to 60% and reported RMSE values of 15 to 25 cm. The authors also highlighted the benefits of including scan angles up to 75° to increase the potential of stem returns, similarly to that found here where stem returns were predominantly from angles over 20°. Kuželka et al. [
19] were able to classify up to 99% of tree stems and estimate DBH with RMSE values of 6 cm (RMSE% 19%). Dalla Corte et al. [
18] reported a RMSE% of 11.3% for 63 georeferenced trees. Brede et al. [
50] reported the results of fitting cylinders to classified stems, where diameter estimates had RMSE values between 2 and 8 cm. Hyyppä et al. [
22], which also used HALS, were able to classify 42–71% of the trees and estimate DBH with a RMSE of 1.7–2.6 cm for a mature Norway spruce (
Picea abies) forest.
The adaptation of the taper model to calculate DBH, from spline values at heights that were not at the DBH (mean values), improved estimate accuracy by a small amount (RMSE 6.9 cm). This suggests some noise, or circle fitting uncertainty, was still present when fitting the cubic spline, thus predicting radii or diameter values directly from the fit cubic spline (such as at a height of 1.37 m) would be problematic. Given our limited stem measurements in the current study (i.e., DBH only), we cannot state if circles could be fitted with higher or lower accuracy at different heights. Given the slightly higher accuracies achieved when incorporating stem taper values (averaged from multiple positions along the spline), this may imply that higher accuracies of stem radii may exist at heights not at DBH. Given the variability in the location of stem returns, future work could attempt to prioritize estimates from stem sections with the highest number of returns and subject to the least occlusion.
4.2. Comparison of Stem and Statistical Methods
Estimates of DBH using the canopy-centric (i.e., no stem-classified returns) PREV corr method [
23] were produced for all field-measured tree stems in this study, demonstrating the reliability of generating metrics from the upper portions of the forest for analysis rather than classifying stem returns. There was a notable issue with the PREV statistical model in the current research context, in that it required a correction for systematic error (overestimation) and required locally specific information to validate. We assume the source of this error is that the models were developed for unthinned forest locations in Virginia and North Carolina, USA. The transferability of statistical models to different study sites, such as ours in Texas, will introduce uncertainty due to different ALS and site characteristics [
51]. The study site in Texas was subject to thinning, and the PREV model was developed using unthinned data as model training. The transferability of predictive models among different ALS acquisitions is a topic of research (e.g., [
51]). Different ALS acquisitions are expected to have fundamental differences. For example, canopy penetration characteristics differ with sensor systems even when applied to the same location [
10,
52] in addition to seasonal characteristics of vegetation [
53,
54,
55]. The PREV model was created using datasets for an unthinned forest, data from an ALS acquisition with different specifications, and in a different location, all of which will influence estimate accuracy (Sumnall et al. [
23,
56]).
The error in prediction was corrected, and the PREV corr model generated results with a RMSE of 6.2 cm (RMSE% 12.25%). This was slightly more accurate, in terms of RMSE, than calculated for the stem classification and circle fitting methods. The Taper+IRLS method had the most similar RMSE of 6.9 cm (a 0.7 cm difference).
4.3. Implications and Future Work
The stem classification method(s) developed in this research were based on observations made from this specific dataset; such an approach might preclude transferability to other contexts. Other ALS datasets will likely differ in terms of pulse density, canopy penetration, and occlusion due to sensor capabilities and different vegetation structures; thus, care should be taken when transferring these methods in different contexts. Furthermore, the use of ALS to detect individual trees and estimate their size using high pulse densities is seeing increased attention in the research literature [
57]. Given the unique dataset specifications used in this current research, future work would need to evaluate their applicability to other locations and datasets generated with different acquisition specifications. If successful, such an approach would represent a significant saving in terms of cost and time, where the requirement for model creation and training would be reduced.
A key element of the developed methods was the estimation of stem diameter at multiple heights along the tree stem and fitting a spline to the distribution to reduce noise and extrapolate in gaps, allowing the estimation of stem diameter at any height below the canopy. Validation data were only available for DBH and no other heights in the current research. Future work could evaluate the accuracy of the diameter estimate at different heights. If determined to be accurate, the direct evaluation of stem taper could be implemented. Taper equations for specific tree species, such as loblolly pine, have existed for some time (e.g., [
58,
59]), represented by non-linear equations estimating volume from DBH and height, but are often developed for a specific geographic area. Given the non-continuous nature of the return distributions from the tree stem, those sections with higher point density may yield a more accurate estimate of stem radius at that height. These estimates could be input into an existing taper model used to estimate tree size metrics for the missing sections. This simplified approach could be an alternative to estimating tree volume by fitting enclosed 3D mesh surfaces, such as in Brede et al. [
50], when dealing with potentially incomplete stem information.
The potential of increased laser power used in ALS acquisitions, flight parameters, such as altitude, variable flight speed, scan angle (e.g., Neuville et al., [
21]), and spatial extent covered, was not explicitly evaluated in the current study relative to canopy penetration of the laser pulse [
14] or stem detection. Stem detection would require high return densities; future work would therefore need to evaluate sensors and platforms to optimize these specifications with regards to the cost of acquisition.
There were several assumptions used in the current study with regards to stem detection, specifically that the stem was straight, vertical, and free of defects. It should be noted that the approach is also only applicable to tree stems free of branches. When stem density is high, Loblolly pine will naturally shed dead branches. Trees with defects were not encountered in the current research, and thus the method cannot account for it. Thus, the method would require modification to account for leaning stems and complex or aberrant structures. Assuming the number of returns from the stem is high and stem classification accuracy sufficiently encompasses the merchantable sections of the stem, methods could be applied to evaluate aspects of stem quality, such as the straightness of the stem or the presence of defects (e.g., [
60]). Classifying these attributes prior to harvesting could further enhance forecasting of timber value and facilitate the prioritization of the removal of poorly formed stems during thinning operations or more accurately value timber prior to harvest.