Article

Enhanced STag Marker System: Materials and Methods for Flexible Robot Localisation

by James R. Heselden 1,2, Dimitris Paparas 3, Robert L. Stevenson 2,4 and Gautham P. Das 1,2,*

1 Lincoln Institute for Agri-Food Technology (LIAT), University of Lincoln, Riseholme Campus, Lincoln LN2 2LG, UK
2 Lincoln Centre for Autonomous Systems (L-CAS), University of Lincoln, Brayford Pool Campus, Lincoln LN6 7TS, UK
3 Department of Engineering, University of Cambridge, Cambridge CB2 1TN, UK
4 School of Engineering and Physical Sciences, University of Lincoln, Brayford Pool Campus, Lincoln LN6 7TS, UK
* Author to whom correspondence should be addressed.
Machines 2025, 13(1), 2; https://doi.org/10.3390/machines13010002
Submission received: 9 November 2024 / Revised: 16 December 2024 / Accepted: 17 December 2024 / Published: 24 December 2024
(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)

Abstract

Accurate localisation is key for the autonomy of mobile robots. Fiducial localisation utilises relative positions of markers physically deployed across an environment to determine a localisation estimate for a robot. Fiducial markers are strictly designed, with very limited flexibility in appearance. This often results in a “trade-off” between visual customisation, library size, and occlusion resilience. Many fiducial localisation approaches vary in their position estimation over time, leading to instability. The Stable Fiducial Marker System (STag) was designed to address this limitation with the use of a two-stage homography detection. Through its combined square and circle detection phases, it can refine detection stability. In this work, we explore the utility of STag as a basis for a stable mobile robot localisation system. Key marker restrictions are addressed in this work through contributions of three new chromatic STag marker types. The hue/greyscale STag marker set addresses constraints in customisability, the high-capacity STag marker set addresses limitations in library size, and the high-occlusion STag marker set improves resilience to occlusions. These are designed for compatibility with the STag detection system, requiring only preprocessing steps for enhanced detection. They are assessed against the existing STag markers and each shows clear improvements. Further, we explore the viability of various materials for marker fabrication, for use in outdoor and low-light conditions. This includes the exploration of “active” materials which induce effects such as retro-reflectance and photo-luminescence. Detection rates are experimentally assessed across lighting conditions, with “active” markers assessed on the practicality of their effects. To encapsulate this work, we have developed a full end-to-end deployment for fiducial localisation under the STag system. It is shown to function for both on-board and off-board localisation, with deployment in practical robot trials. As a part of this contribution, the associated software for marker set generation/detection, physical marker fabrication, and end-to-end localisation has been released as an open source distribution.

1. Introduction

For the deployment of robotic systems, reliable and stable localisation is a necessity. Most approaches, especially in indoor environments, utilise full on-board localisation with built-in LiDAR (light detection and ranging) for exact position estimation. This is often supplemented with the use of camera-based visual localisation to perform long-term corrections. A common approach in this field is fiducial localisation, in which markers are generated and placed in the environment [1,2,3]. To enable localisation, the locations of each marker are mapped to a reference file. This can then be expanded to fiducial SLAM (simultaneous localisation and mapping), wherein the reference file is generated dynamically. Fiducial markers can encode information in various ways based on the structure and contents of the marker [4,5,6].
Many fiducial localisation systems have been proposed over the years, but many are unstable in their six-DoF (degree of freedom) pose estimation [7]. The Stable Fiducial Marker System (STag) [8] was designed to provide stable localisation without temporal filtering. The STag marker system addressed the stability of position estimation by introducing a two-stage homography detection wherein monochromatic markers, constructed from both squares and circles, were developed. It includes libraries with varying Hamming distances (HDs) to improve occlusion resistance and detection reliability, as a trade-off to library size. The authors notably reject chromatic markers due to the necessity for photometric assumptions or a calibration step.
The first core contribution of our work is the extension of the original STag system with a reliable six-DoF multi-robot pose estimation system. We developed this to function with both on-board and off-board localisation. On-board pose estimation is with a robot-mounted camera and static markers, whilst off-board pose estimation utilises ceiling-mounted cameras and robot-mounted markers. This enhanced STag marker system was designed with a collection of subsystems providing extended features to fully facilitate reliable robot deployment. Each subsystem was validated through experimental evaluation. The efficacy of the localisation in its entirety was evaluated with the use of marker-mounted AgileX LIMO robots (Figure 1) in an indoor environment.
The second core contribution is the development of three new marker types compatible with the STag marker system, each one designed to address limitations found in the original STag marker set. The new markers were designed to utilise the baseline infrastructure, with the detection adaptations functioning as preprocessing steps. The resulting sets were empirically assessed against the original implementation, with statistically significant results for each. Figure 2 shows the marker variants developed within this work compared to the baseline version.
The first set of marker variants, named hue/greyscale (HG) STag markers, utilises inverted colour spaces to improve low-contrast detection by over 50%. This set further enables the full two-tone customisation of markers for more discreet deployment. The second set, named high-capacity (HC) STag markers, makes use of chromatic variation in marker ID to increase library sizes cubically. This enables up to 11 trillion unique markers when utilising the HD11 base set. The third set, named high-occlusion (HO) STag markers, utilises chromatic rotation to offset data used for decoding. This approach enables over 75% occlusion of the data within markers without degradation in decoding.
We further strengthen the contributions of this work through our exploration into marker fabrication. These were explored to address limitations in the outdoor deployment of fiducial markers and to enhance detection in low-light environments. The marker fabrication explored the viability of economical and reliable material options, with the assessment of the practicalities of “active” materials for use in dimmed and dark conditions. To this end, we utilised 3D-printed markers made with reflective, transparent, and glow-in-the-dark filaments. Marker detection rates were then assessed in experimental trials across various lighting conditions and distances.
We performed a collection of experiments to validate the approaches. Specifically, we explored the validity and utility of the localisation system and its subcomponents. Further, we assessed the quality of the improved preprocessing detection systems compared to the baseline system, highlighting the efficacy of chromatic STag markers over the traditional monochromatic baseline marker sets. Finally, we explored the detection reliability of the fabricated markers to varying lighting, highlighting the benefits of active materials for improved detection in darkened conditions.
The remainder of this paper is as follows: Section 2 contains a literature analysis of fiducial marker approaches and existing fiducial localisation systems. In this, we detail the rationale for using STag as the baseline marker system in this work. Section 3 details our approach, broken down into the three developmental areas of marker sets in Section 3.1, fabrication in Section 3.2, and the localisation system in Section 3.3. Section 4 outlines our experimental assessment. Finally, the concluding remarks are summarised in Section 5.

2. Related Works

The focus of this work is the use of fiducial markers for stable robot localisation applications, and this section provides a detailed overview of the existing work in this area. The existing fiducial markers can be broadly classified as monochromatic and chromatic. The monochromatic markers can be subdivided into groups based on the geometric patterns used for encoding information, such as circular, square, and hybrid patterns. Although these subclasses could be seen in both monochromatic and chromatic markers, we will limit the detailed discussion on these subclasses to monochromatic markers in this section. The STag marker system we have selected here is a hybrid monochromatic marker with the advantages of both circular and square patterns and will be discussed in detail. Although the original STag markers are monochromatic, some of the enhancements we propose in this paper involve their chromatic variations, and hence a detailed discussion on chromatic markers and their critique will also be presented.

2.1. Existing Markers

2.1.1. Monochromatic with Circular Shapes

Circular markers featuring single or multiple concentric circles are very common in the literature and excel in photogrammetry applications [8,9]. Circular markers can provide a single highly accurate correspondence to the centre of the marker [8] drawing from the definition of the locus of a circle. Approaches to circular markers are, as a result, quite unique, as can be seen in Figure 3.
WhyCon [5] and TRIP [10] are highly similar fiducial marker systems that both detect black-and-white circular ring patterns, achieving precise and fast detection. WhyCon uses a black-and-white roundel with two concentric annuli and a white central disc, starting detection by locating a black segment and then a white inner disc. WhyCode [11] extends WhyCon with a Binary Necklace-based encoding system, allowing for up to 4080 unique IDs assuming a code length of 16. TRIP [10] is a vision-based sensor system that uses 2D circular barcode tags (also known as ringcodes) featuring 19,683 unique IDs. Detection involves adaptive thresholding, edge extraction, ellipse fitting, and concentricity checking.
RUNE-Tag [12] features occlusion-resilient cyclic codes, and various marker sets each with varying Hamming distances. The RUNE-43 markers use a Hamming distance of 13, making use of a single ring divided into 43 sectors to encode binary symbols. The marker set of RUNE-129 extends this to three layers, encoding symbols from an alphabet of 8 ($2^3$) elements, with a Hamming distance of 30.
FourierTag [13] addresses resolution-based issues by preserving high-order bits of the tag’s ID while low-order bits (i.e., high-frequency features) degrade with distance. Essentially, the visual features are low-pass-filtered. The authors considered utilising colour to increase payload density and detectability but disregarded it due to the impact of illumination and printing variation.
LARICS [14] uses two circles with coding lines for a unique ID, detected via Hough transformation. CCTag [9] employs concentric black rings on a white background for encoding. The detection system of LARICS markers utilises flow conservation to handle motion blur and homography estimation to handle perspective distortion. SyRoTek [15] employs a ring-shaped binary tag for localising up to fourteen robots in a planar arena.
Figure 3. Fiducial marker approaches that use a circular base for detection. From left to right: WhyCode [11], Rune-Tag [12], CCTag [9], and TRIP [10].
While circular markers provide fast and precise detection, six-DoF pose estimation from them is not feasible, and hence they are not suitable for robot localisation applications.

2.1.2. Monochromatic with a Matrix Base

Square markers are the dominant design when pose estimation from a single marker is desired. The four corners of a square marker provide the necessary correspondences needed for pose estimation. The designs of square markers are highly similar, as can be seen in Figure 4.
Many matrix-based markers have been investigated for use in AR (augmented reality), with Matrix [16] and Cybercode [17] as clear examples of early research. Other uses include camera calibration, with a prominent example being CALTag [18] featuring a checkerboard-like marker area where every corner is used as a calibration point. LFTag [19] is a matrix marker that encodes using subtle topological positioning of marker elements to minimise high spatial frequency content. The marker consists of a black border, a white background, and black regions. The black regions, for each ID, are shifted accordingly to the corner of its designated space.
ARToolKit [20] is a popular fiducial implementation that led to the development of derivative works such as ARTag [4] and ARToolKit+ [21]. These two are widely used to obtain localisation information within robotic applications. These markers use a black quadrilateral pattern outline, with ARToolKit encoding a feature vector of usually 256 or 1024 elements, and ARTag applying digital techniques to encode and match patterns of 10 bits in a 36-bit binary sequence supplemented with forward error correction and cyclic redundancy checks. AprilTag [22] is a similar implementation designed to be more resilient to occlusion.
A highly influential fiducial marker within robotics is ArUco [23], which features markers visually similar to ARTag and ARToolKit. It has 1024 markers at a Hamming distance of 3 and, unlike BinARyID [24], has a robust binary ID system with integrated error correction.
Figure 4. Fiducial marker approaches that use a matrix base for detection. From left to right: AprilTag [22], ARToolkit [20], ArUco [23], and ARTag [4].
While the square-shaped fiducial markers provide higher-DoF pose estimates, a major challenge in their applicability is their limited occlusion resistance.

2.1.3. Monochromatic with Hybrid Shapes

Whilst markers using circular or matrix bases have clear utility and benefit, they also have limitations relative to one another. Hybrid markers aim to utilise the strengths of both and resolve their weaknesses. A common feature of hybrid designs is using points or lines to generate correspondences. In this, the points receive some of the same benefits as the circles, whilst the lines leverage the detection and occlusion resilience of the matrix detection approaches. Benefits of such designs include over-defined correspondence relationships that lead to occlusion-resistant markers. Some examples of such hybrid markers are shown in Figure 5.
Intersense (circular data matrix) [25] combines the DataMatrix concept with the benefits of the contrasting concentric circle (CCC), achieving $2^{15} = 32{,}768$ possible codes. Occlusion-resistant optical tracking strategies using novel shapes include LinePencil [26] and PiTag [27]. LinePencil employs four lines intersecting at one point; even under partial occlusion, the geometric property of the intersecting lines allows for robust detection. Based on RUNE-Tag, PiTag is also designed for occlusion resistance. It consists of 12 dots placed on an imaginary square frame and can handle occlusion of nearly 50%. ReacTIVision [28] presents compact and highly efficient amoeba fiducials, generated using a genetic algorithm that ensures each marker is distinct and easily trackable, whilst Yamaarashi [29] furthers this with the inclusion of a bit code alongside the amoeba design. Both CCTag [9] and Prasad [30] utilise a blur-resistant fiducial design that appears as a set of circles of varying radius on a square background. SIFTTag [31] is a single marker implementation for SIFT and SURF feature detectors, characterised by a circular gradient.
Figure 5. Fiducial marker approaches that use a unique base for detection. From left to right: PiTag [27], Yamaarashi [29], LinePencil [26], and ReacTIVision [28].

2.1.4. Chromatic

Historically, most fiducial markers were designed to be monochromatic. During the initial development of marker systems, this was done to address a technical limitation of the digital camera technology of the time and to limit the cost of the systems. Coloured markers also present some technical difficulties when it comes to implementation as they are susceptible to lighting conditions (light intensity, hue, etc.). However, adding colour to current fiducial markers can potentially increase detection speed, data capacity, and detection stability (especially under motion) at a low cost. More advanced interactions of light (blur) can also be exploited. Some approaches to chromatic markers are detailed below and presented in Figure 6.
ChromaTag [6] uses the LAB colour space and makes use of opponent colours to optimise detection, localisation, and encoding, offering considerable speed-up compared with AprilTag, RUNE-Tag, and CCTag. More specifically, according to [6], the green outer ring and the red centre feature a large gradient in the A channel, which does not routinely exist in natural scenes, aiding in fast and accurate detection. In addition, information is coded within the B channel, which has little effect on the A channel intensity.
Monospectrum markers [32] address blurred and out-of-focus images with high-frequency components in the marker’s two-dimensional chromatic sinusoidal intensity pattern in the HSV colour space. In contrast, Bokode markers [33] lean into out-of-focus images by leveraging the bokeh effect, where out-of-focus points appear as a blurred disk, to encode information that can be read from several metres away, despite the marker being only 3 mm in diameter.
Triangles are utilised in a number of approaches such as Košt’ák and Slaby [34], who propose a YOLO-based marker detection pipeline, and Liu et al. [35], who utilise adaptive colour matching. Dell’Acqua et al. [36] also utilise triangles with markers of a 5 × 5 grid divided diagonally. This results in 50 triangles, each encoded in binary as either green or blue. Three blocks are used to encode orientation, twelve for parity, leaving 32 bits for information. The high-capacity colour barcode [37] is a Microsoft technology similar to [36] for encoding data using clusters of CMYK coloured triangles. Neumann [38] proposes proportional-width ring fiducials to target different detection ranges.
In Farkas et al. [39], the developed markers are designed for aesthetic consideration, utilising combined shape and colour detection processes. As a part of this consideration, the authors utilise only an 8-bit capacity, prioritising environmental incorporation with simpler designs. ARTTag [40] is an approach that utilises specifically sized circles within an image to identify its ID, allowing for markers that are hidden within images, while JuMarker [41] takes this further, incorporating bit data into an image directly. In this, bits are encoded discretely into components such as open/closed windows or banner patterns.
Figure 6. Fiducial marker approaches that incorporate hue variation into the markers. From left to right: (top) high-capacity colour barcode [37], ChromaTag [6], monospectrum marker [32], Farkas et al. [39], (bottom) Liu et al. [35], Cho et al. [38], and JuMarker [41].

2.1.5. STag

STag [8], developed by Benligiray et al. and introduced in 2019, is a monochromatic fiducial marker system that provides stable pose estimation. STag markers are a type of hybrid marker featuring an outer square border for detection and homography estimation and an inner circle for homography refinement, thereby enhancing localisation stability. The hybrid approach combines the benefits of circular and square markers discussed in Section 2.1.1 and Section 2.1.2. It also features an encoding area inside the inner circular border filled with 48 disk-shaped bit representations.
The code pattern (see Figure 7) is morphologically dilated and eroded repeatedly, filling the gaps between neighbouring bit representations, and allowing the code to be read correctly in the case of slight localisation errors. It further reduces high-frequency elements in the pattern, resulting in fewer edge detections.
The STag lexicode generation algorithm performs an exhaustive search ($O(2^n)$ complexity), generating codewords and ensuring that they are a certain Hamming distance (HD) apart, ranging from 11 to 23, and not cyclically redundant. As the search space is prohibitively large, the authors of [8] first generate 12-bit codewords and combine them to generate unique 48-bit codewords. Compared to ArUco and RUNE-Tag, STag markers feature a wider range of error correction and library size, as shown in Table 1 and Figure 8.
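As a rough illustration of the lexicode idea, the sketch below greedily scans all words of a given bit length in increasing order and keeps each one that is at least a minimum Hamming distance from every word already kept. It is a simplified toy rather than the generator used in [8]: the function names are hypothetical, the rotation (cyclic redundancy) check is omitted, and a 7-bit word length is used so that the exhaustive scan stays small.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


def greedy_lexicode(n_bits: int, min_distance: int) -> list:
    """Scan all 2**n_bits words in order, keeping those far enough from all kept words."""
    codewords = []
    for word in range(1 << n_bits):  # exhaustive scan, O(2^n) candidates
        if all(hamming_distance(word, kept) >= min_distance for kept in codewords):
            codewords.append(word)
    return codewords


# Toy example: 7-bit words with a minimum Hamming distance of 3.
toy_code = greedy_lexicode(7, 3)
print(len(toy_code), [format(w, "07b") for w in toy_code[:4]])
```

Raising the minimum distance shrinks the resulting library, which is the same trade-off between error correction and library size described above.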
The multi-step marker detection process from [8] can be summarised as a highly optimised pipeline.
The first step of the algorithm performs edge detection and segmentation using the Edge-Drawing Parameter-Free (EDPF) algorithm [42]. Following the detection of edge segments, the algorithm fits lines to them, effectively connecting the first pixel of an edge segment to the last. Line segment intersections are used to perform corner detection, and three-corner groups are used to detect quads in the image. This approach is highly occlusion-resistant.
The second step of the process performs candidate validation by ensuring that the quads in the image are mathematically consistent with an ideal square under perspective distortion. The authors of [8] argue that this simple initial step can “eliminate as much false candidates as possible, which will decrease the false positive rate”. Following that initial triage step, the remaining quads have their codewords extracted and compared to the codeword library. If a codeword cannot be detected, then the marker is discarded.
The final step of the detection pipeline, that of homography refinement, is what makes the STag approach special compared with other similar approaches in the field. It utilises the conic information of the inner circular border to refine the estimated homography. Edge detection is applied again to the inner part of the marker, and the edge segment most similar to the predicted inner circle is used to fit an ellipse. The final step is skipped if a suitable ellipse cannot be detected. Otherwise, the Nelder–Mead method is used to refine the current homography matrix. The authors of [8] note that the ability of the last step to be omitted “is a deliberate design choice” as “homography refinement with a partially occluded ellipse does not benefit stability, and in some cases even degrades it”.

2.2. Existing Fiducial Localisation

To obtain the absolute localisation of robots in the environment, there are two approaches to utilising fiducial localisation: (i) on-board, with static markers at known locations in the environment and the camera on the robot; and (ii) off-board, with static camera(s) in the environment and markers on the robots. In the first approach (when static fiducial markers are used for aiding robot localisation), a dictionary storing the poses for each marker within the environment is often used. Through this, when a marker is detected, the mapping of its pose relative to the robot-mounted camera can be determined. This map can be generated through a process called fiducial SLAM [43], which combines mapping with localisation, or it can be embedded explicitly into the environment description.
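The on-board case can be sketched as a chain of rigid transforms: given a marker's stored pose in the map, the marker's detected pose in the camera frame, and the camera's mounting pose on the robot, the robot's pose in the map follows by composition. The sketch below assumes 4 × 4 homogeneous transforms; the frame names, the `marker_map` dictionary, and the helper functions are illustrative assumptions, not the interface of the released software.

```python
import numpy as np


def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    transform = np.eye(4)
    transform[:3, :3] = rotation
    transform[:3, 3] = translation
    return transform


def robot_pose_in_map(map_T_marker, cam_T_marker, robot_T_cam):
    """map_T_robot = map_T_marker * inverse(cam_T_marker) * inverse(robot_T_cam)."""
    return map_T_marker @ np.linalg.inv(cam_T_marker) @ np.linalg.inv(robot_T_cam)


# Hypothetical reference dictionary: marker ID -> pose of the marker in the map frame.
marker_map = {7: make_transform(np.eye(3), [5.0, 0.0, 0.0])}

# Hypothetical detection: marker 7 seen 2 m along the camera x-axis, no rotation.
cam_T_marker = make_transform(np.eye(3), [2.0, 0.0, 0.0])
robot_T_cam = np.eye(4)  # assumes the camera sits at the robot origin

map_T_robot = robot_pose_in_map(marker_map[7], cam_T_marker, robot_T_cam)
print(map_T_robot[:3, 3])
```

In the off-board case the same composition applies with the roles swapped: the camera pose in the map is the fixed quantity and the marker-to-robot mounting transform replaces the camera extrinsics.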
If the camera is mounted to the environment, and the markers are mounted to the robots, only the position of the cameras needs to be defined. This allows the computation to determine the robot’s position to be carried out external to the robots. This is the case in SwarmCon [44] where each small robot can be given a small marker and the markers are tracked externally to the robot. For this approach, and for [3], the camera can be mounted on the ceiling. Whenever robots move beneath the camera, the marker is detected, and the pose estimate is sent back to the robot. This functions similarly to a scenario in which one robot is able to see markers mounted on other robots.
Whether to choose an on-board or off-board fiducial localisation system may be decided based on the robot density within the environment and associated economic and computational costs. For environments with a high robot density, a few cameras may provide sufficient support, but for low robot density, it may be preferable to utilise on-board cameras due to the cost of supporting infrastructure. Furthermore, the computational capability of the robots is a factor influencing this decision. For example, robots used in swarm systems usually have low computational capabilities, and an off-board approach is ideal in such cases.

3. Approach

In this work, three key areas have been explored: (i) the development of chromatic STag markers, (ii) the exploration of physical and “active” fabrication materials, and (iii) the development of the localisation system for on-board and off-board compatibility. These developments are detailed here. The source code for this work has been released publicly and can be accessed at https://github.com/LCAS/STag_ROS2 (accessed on 16 December 2024).

3.1. New Marker Sets

This section outlines the new marker sets developed under this approach. Each one is designed to address limitations with the existing approach through the use of chromatic markers. For each marker set, scripts have been included in the work to generate the markers, and the detection utilises the basic STag pipeline with only preprocessing steps required for compatibility. Due to the potential impact on liveness with the included preprocessing steps, their processing times are quantified in Table 2. Please note that these may be further optimised by porting the code from Python 3.10 to C or C++.
The variation between the baseline STag marker set and the marker sets developed in this section is summarised pictorially in Figure 9.

3.1.1. Customisable Hue/Greyscale (HG) Markers

Greyscale Markers

Many approaches to fiducial markers were explored in Section 2. Fiducial markers are generally limited to highly constrained colour palettes, with most markers restricted to binary or greyscale marker types as a consideration of reliability under atmospheric variation, and chromatic markers having strict colouring required for encoding. This unfortunately results in many constraints for users and limited options for incorporation of such markers into their environments.
As detailed in Section 2.1.5, STag markers are no exception: they are designed with a white border around a black outer box, with a white inner circle containing the code in black. This colouring provides high contrast for more reliable detection in many scenarios, especially those with reduced lighting. However, as deployments for autonomous systems are most often in well-lit environments, this constraint has the potential to be relaxed.
The contrast of the marker in this work is quantified based on the normalised relative luminance of its two tones. This is calculated with (1), where R, G, and B are the pixel colour values of the tone, each from 0 to 255. For the marker as a whole, the relative luminances are combined through the signed contrast ratio (2), where $L_1$ and $L_2$ are the luminances of the two tones. This value can range from 0.008 at a low contrast up to 5100 for a binary contrast. The value being signed means that the inverted form of the original image is represented by a value of −5100.
$$\text{Luminance} = (0.2126 \times R) + (0.7152 \times G) + (0.0722 \times B) \tag{1}$$
$$\text{Signed Contrast Ratio} = \frac{L_1 - L_2}{\min(L_1, L_2) + 0.05} \tag{2}$$
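A minimal sketch of (1) and (2) follows, using 0 to 255 channel values directly. The function names are hypothetical, and which tone is passed as the first argument is an assumption (here the lighter tone of a standard black-on-white marker), chosen so that the standard marker gives a positive value.

```python
def luminance(rgb):
    """Relative luminance of one tone from its (R, G, B) values in 0..255, as in (1)."""
    r, g, b = rgb
    return (0.2126 * r) + (0.7152 * g) + (0.0722 * b)


def signed_contrast_ratio(tone_1, tone_2):
    """Signed contrast ratio of two tones, as in (2); the sign flips if the tones swap."""
    l_1, l_2 = luminance(tone_1), luminance(tone_2)
    return (l_1 - l_2) / (min(l_1, l_2) + 0.05)


white, black = (255, 255, 255), (0, 0, 0)
print(round(signed_contrast_ratio(white, black)))  # binary contrast
print(round(signed_contrast_ratio(black, white)))  # same marker, inverted
```

With pure white and pure black this reproduces the ±5100 limits quoted above, since the darker luminance is 0 and the denominator reduces to 0.05.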
Simply inverting the image enables the detection of twice the number of markers. However, through consideration of varying greyscale, the opportunities for integration expand much further. For the detection of such markers, this approach deploys a second detection phase with an inverted image. This enables the detection of markers with varying positive and negative contrasts, as shown in Figure 10, allowing the use of markers that can be more subtly integrated into a deployment.
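The second detection phase can be sketched as a thin wrapper that runs a detector on the frame and again on its inverse. The wrapper and the stand-in `toy_detect` function below are hypothetical and only illustrate the control flow; in practice the callable would be the base STag detector, and an 8-bit image is assumed.

```python
import numpy as np


def detect_both_polarities(image, detect):
    """Run `detect` on the frame and on its inverse, tagging each hit with its polarity."""
    results = [(hit, "normal") for hit in detect(image)]
    inverted = 255 - image  # assumes an 8-bit (uint8) image
    results += [(hit, "inverted") for hit in detect(inverted)]
    return results


def toy_detect(image):
    """Stand-in detector: reports one hit only when the frame is mostly bright."""
    return ["marker"] if image.mean() > 127 else []


dark_frame = np.zeros((4, 4, 3), dtype=np.uint8)
print(detect_both_polarities(dark_frame, toy_detect))
```

Tagging each hit with its polarity keeps the sign of the contrast available to later stages, at the cost of roughly doubling the detection time per frame.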
The markers under this approach are simply two-tone versions of the existing STag markers. However, to facilitate their use with the other markers in the system, and to ensure clarity with their discussion, we have included them in this work as their own marker set, aptly named hue/greyscale (HG) STag markers.

Hue Markers

Under the adaptations to support greyscale markers, chromatic markers have also become viable. Any colouring can be utilised for markers so long as they reach a minimum contrast for detection. This allows markers to be deployed with colour schemes aligning with corporate identity, as illustrated in Figure 11, in which three markers are presented as examples with corporate colour branding (used in the logo) of John Deere, PayPal, and Coca-Cola, respectively.
With the original STag marker approach, this was not reliable, given that the detector is unable to detect markers with a negative signed contrast ratio. However, with the additions to the detection pipeline, markers can now be identified regardless of whether the signed contrast ratio is positive or negative.
Further, this HG marker detection approach enables more discretion in the deployment of markers within the environment, an important factor for retaining the aesthetic considerations of interior design without reducing capabilities for deployment of autonomous systems. It is also hoped that enabling customisation will encourage further commercial deployment of fiducial localisation systems.

3.1.2. Higher-Capacity (HC) Markers

High-Capacity Markers

With consideration of the basic marker sets available, there is a steep trade-off between the desired bit error ratio and the number of markers available in the set. The most impactful trade-off is with the HD23 marker set, in which there are only 6 markers available, with the next most resilient marker set, HD19, having only 16. Only when utilising the HD17 base set does the capacity of 157 become reasonable, yet still limiting, for deployment.
Whilst in Section 3.1.1, the use of chromatics is for the benefit of discreet integration, in this particular marker adaptation, it is utilised to enhance the number of available markers for deployment. These new marker sets are termed high-capacity (HC) STag markers, following [37]. In these markers, RGB colour channels are used with each holding an independent marker from the given base set, and these are combined together in the decoding to cube the number of available markers ($n^3$). An example of this is shown in Figure 12, wherein three different STag markers, each in one of the RGB channels, are combined to form a single HC marker. Further examples can be seen in Figure 13.

High-Capacity Detection

To detect HC markers, a multi-step algorithm is deployed utilising the existing STag system. The process starts by splitting the input RGB image into its red, green, and blue channels, with each channel $c \in \{R, G, B\}$ reconstructed into a three-channel image, as shown in (3). Each constructed image is then passed into the base detection system to identify the marker IDs in each channel.
$$I = \begin{bmatrix} I_R & I_G & I_B \end{bmatrix}, \qquad I_c = \begin{bmatrix} I_c & I_c & I_c \end{bmatrix} \tag{3}$$
As there may be multiple markers within the frame, simply using the identified marker IDs is not sufficient. Thus, following the initial detection stage, the system next determines where there are overlapping bounding boxes across the three colour channels. In this, the approach first identifies the midpoint $(\bar{x}, \bar{y})$ of each polygon from the corner points $\{(x_i, y_i)\}_{i=1}^{n}$ on the bounding box, as in (4).
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \tag{4}$$
For each bounding box identified in the primary (red) channel, the algorithm compares the displacement to each bounding box in the secondary (green and blue) channels. Only when the bounding box centroids across all three channels lie within the pixel-based proximity $p$ of one another, satisfying (5), are the bounding boxes assumed to belong to the same marker.
$$|\bar{x}_R - \bar{x}_G| \le p, \quad |\bar{y}_R - \bar{y}_G| \le p, \quad |\bar{x}_R - \bar{x}_B| \le p, \quad |\bar{y}_R - \bar{y}_B| \le p \tag{5}$$
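The centroid test in (4) and (5) can be sketched in Python as follows. This is a minimal illustration rather than the package's implementation: the function names, the `(marker_id, corners)` detection format, and the default proximity of 10 px are assumptions made for the example.

```python
from statistics import fmean


def centroid(corners):
    """Mean of the corner points of a bounding polygon, as in (4)."""
    xs, ys = zip(*corners)
    return fmean(xs), fmean(ys)


def within(a, b, p):
    """True when two centroids differ by at most p pixels on each axis, as in (5)."""
    return abs(a[0] - b[0]) <= p and abs(a[1] - b[1]) <= p


def match_channels(red, green, blue, p=10.0):
    """Group detections whose centroids coincide across the R, G and B channels.

    Each argument is a list of (marker_id, corners) tuples for one channel
    (a hypothetical format). Returns (id_red, id_green, id_blue) triples.
    """
    triples = []
    for id_r, corners_r in red:
        c_r = centroid(corners_r)
        id_g = next((i for i, c in green if within(c_r, centroid(c), p)), None)
        id_b = next((i for i, c in blue if within(c_r, centroid(c), p)), None)
        # Compare against None explicitly, since marker ID 0 is valid.
        if id_g is not None and id_b is not None:
            triples.append((id_r, id_g, id_b))
    return triples
```

A red detection is kept only when both a green and a blue detection fall within `p` pixels of its centroid, so stray single-channel detections are discarded.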
The HC markers are encoded where each channel is representative of a separate digit, with the base for each unit equal to the total size of the base set, $|HD|$. Under this, the three-channel representation can be parsed according to a standard base conversion as in (6). For a marker containing markers 5, 2 and 10 from HD23 in the R, G, and B channels, respectively, the HC ID would be 377.
$$HC = ID_{red} + (ID_{green} \times |HD|) + (ID_{blue} \times |HD|^2) \tag{6}$$
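As a worked illustration of the base conversion in (6), the sketch below encodes three channel IDs into one HC ID and decodes it again. The function names are hypothetical, and the decoder is an assumption added for the example; it is only unambiguous when every channel ID is smaller than the size of the base set.

```python
def hc_encode(id_red, id_green, id_blue, base_size):
    """Combine three per-channel marker IDs into one HC ID, following (6)."""
    return id_red + id_green * base_size + id_blue * base_size ** 2


def hc_decode(hc_id, base_size):
    """Recover (id_red, id_green, id_blue) from an HC ID.

    Assumes each channel ID is below base_size, as for digits in any base.
    """
    id_red = hc_id % base_size
    id_green = (hc_id // base_size) % base_size
    id_blue = hc_id // base_size ** 2
    return id_red, id_green, id_blue


# The example from the text, with a base set of 6 markers:
# 5 + 2 * 6 + 10 * 36 = 377
print(hc_encode(5, 2, 10, 6))
```

With a base set of 6 markers, cubing the capacity gives 6 × 6 × 6 = 216 distinct HC IDs.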
Various STag marker libraries exist (shown in Table 3) with capacities inversely proportional to detection reliability. The HC marker sets enhance the capacity of the existing STag libraries cubically, leading to an outlandish capacity for HC11 markers.

3.1.3. Occlusion-Resilient (HO) Markers

High-Occlusion Markers

To mitigate occlusions, the standard STag markers have a built-in bit error ratio (BER), which accounts for the variation in Hamming distance, and in the number of markers, between the base sets. However, the STag markers are still susceptible to occlusion, as evidenced in our experimental results presented later in Section 4.1.3 and Section 4.1.4. We have added a new collection of marker sets, with the markers modified to utilise chromatic variation to enhance the resistance to occlusion. These markers have as such been termed high-occlusion (HO) STag Markers.
The HO markers have been designed to offset the impact of occlusion through the rotation of each colour channel. In this, the red, green, and blue colour channels are each rotated by 90° relative to one another, with the red channel left as is, the green channel rotated by 90°, and the blue channel rotated by 180°. This rotation enables the information contained within the inner ring of the marker to be moved and duplicated to other regions within the image. This is illustrated in Figure 14.
The rationale behind this approach is that when a region of the marker image is occluded, preventing the system from capturing the information encoded in that region, that information can still be visible in the other channels due to the rotations applied to them. With this, the occlusion that the marker is able to endure is increased up to threefold. Without consideration of the BER, and with only the blue channel rotated by 180°, the marker is able to withstand up to 50% occlusion with the information duplicated. With the green channel also rotated by 90°, this increases to up to 75% occlusion resilience. When the BER is also factored in, this increases further depending on the base marker set used. Further examples can be seen in Figure 15.
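The duplication argument can be illustrated with a toy grid in place of a marker image. The sketch below is a simplification under stated assumptions: the marker is a small square grid of values, green is rotated by 90° and blue by 180° (counter-clockwise is assumed, as the direction is not specified), occluded cells are modelled as `None`, and all function names are hypothetical.

```python
def rot90(grid, k=1):
    """Rotate a square grid counter-clockwise by k quarter turns (k may be negative)."""
    out = [list(row) for row in grid]
    for _ in range(k % 4):
        out = [list(row) for row in zip(*out)][::-1]
    return out


def make_ho_channels(base):
    """Red unrotated, green at 90 degrees, blue at 180 degrees (assumed assignment)."""
    return {"R": rot90(base, 0), "G": rot90(base, 1), "B": rot90(base, 2)}


def occlude_rows(grid, rows):
    """Mark whole image rows as unreadable (None), mimicking an occluder."""
    return [[None] * len(row) if r in rows else list(row)
            for r, row in enumerate(grid)]


def recover(channels):
    """Undo each channel's rotation and keep the first readable value per cell."""
    undone = [rot90(channels["R"], 0),
              rot90(channels["G"], -1),
              rot90(channels["B"], -2)]
    size = len(undone[0])
    return [[next((g[r][c] for g in undone if g[r][c] is not None), None)
             for c in range(size)] for r in range(size)]
```

Occluding the top half of the image removes a different half of the underlying data in each channel, so the union of the three de-rotated channels still covers the whole grid.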

High-Occlusion Detection

The detection system for the HO markers begins similarly to the HC markers detailed in Section 3.1.2. In this approach, the image is first split into the three colour channels, and each is parsed into the STag system. Notably, in the HO detection, only the bounding boxes are of interest at this stage, not the identified marker IDs.
This is then followed by the same overlap detection detailed and presented in (4) and (5). Given the list of overlapping detected bounding boxes, the system then begins image correction on each marker. In this, the region defined by the bounding box in the red channel is first extracted into a mask. This mask is then used to extract the same region from the blue and green channels.
To work with the region, it is first corrected for perspective in each colour channel. In this, the perspective transformation is computed from the bounding box $p_i^c = (x_i, y_i)$ to a square using the homography matrix $H_c$. Here, $H_c$ is calculated using (7), and the new perspective $p_i'$ is calculated using (8), utilising the OpenCV library.
$$H_c = \texttt{cv2.getPerspectiveTransform}\left(\{p_i^c\}, \{p_i'\}\right) \tag{7}$$
$$p_i'^{\,c} = H \, p_i^c \tag{8}$$
The corrected region is then rotated by −90° for the green channel or −180° for the blue channel. The perspective transforms for both the blue and green regions are then inverted back to the perspective of the red channel, $p_i^R$, using (9).
$$p_i^R = H^{-1} \, p_i'^{\,c} \tag{9}$$
Once the image has the corrected region in place, the channels are combined within the polygon to enhance the region for detection and compensate for any data loss across the channels. In this process, we utilise a two-phase processing approach denoted under (10), where $f_1$ and $f_2$ are pixel-intensity joining methods.
$$I(x, y) = f_1\left(f_2\left(I(x, y)\right)\right) \tag{10}$$
In both phases, one of six join methods can be selected: none, sum, binary, modal, average, or median. The chosen method is applied to each individual pixel within the polygon. For example, with sum as the first phase and binary as the second, the channel values of each pixel are first summed, with a cap of 255, and a binary threshold based on the mean intensity of the region is then applied. When the modal method is used, it is key to perform binary in the first phase.
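A minimal sketch of the two-phase joining is given below for the sum, average, and median joins followed by the binary threshold. It is illustrative only: the channels are assumed to be aligned greyscale grids held as nested lists, the function names are hypothetical, and the modal and none options are omitted.

```python
from statistics import fmean, median

# Per-pixel joins across the aligned channel values of one pixel.
JOINS = {
    "sum": lambda values: min(sum(values), 255),  # summed with a cap of 255
    "average": fmean,
    "median": median,
}


def join_channels(channels, method):
    """Apply a per-pixel join across equally sized single-channel grids."""
    join = JOINS[method]
    return [[join(pixel) for pixel in zip(*rows)] for rows in zip(*channels)]


def binarise(grid):
    """Threshold each pixel against the mean intensity of the region."""
    threshold = fmean(value for row in grid for value in row)
    return [[255 if value > threshold else 0 for value in row] for row in grid]


# Sum followed by binary, as in the example given in the text.
red = [[0, 200], [100, 0]]
green = [[0, 100], [100, 0]]
blue = [[10, 0], [100, 0]]
result = binarise(join_channels([red, green, blue], "sum"))
```

Here the summed grid is `[[10, 255], [255, 0]]`, its mean intensity is 130, and the thresholded result is therefore `[[0, 255], [255, 0]]`.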
With the image corrected and processed to handle occluded data, the image is then passed back into the standard STag system for marker detection again.

3.2. Marker Fabrication

In the deployment of fiducial localisation systems, the most often used material is paper printing; however, this has clear limits in its practicality outdoors and in low-light environments. This section details material options available for physical fabrication along with their physical resistances. Further, this section outlines the use of “active” materials to fabricate markers that are visible even in the dark.

3.2.1. Outdoor-Capable and Standard Physical Markers

Paper-Based Markers

A standard approach to creating markers for fiducial tracking is to use paper printing. This is the simplest and most cost-effective approach for indoor use, utilising card backing as a stiffening material to support markers during deployment. It is important to note that when utilising coloured markers for HC and HO, the colour printing needs to be accurately calibrated to have the individual colour channels distinct from each other.
In rougher environments, lamination can provide basic resistance to scratches, tears, oils, and UV while stiffening the thinner paper-based marker material. However, their reflective qualities increase the risk of detection failure, especially on sunny days or under torchlight. Some markers fabricated with these approaches are shown in Figure 16.

Laser-Cut Markers

Laser-cutting can be used to construct markers from materials such as acrylic or wood. Acrylic is highly UV-resistant, has good moisture resistance, and is highly stable, but it can be reflective depending on viewing angle, colours, surface finish, and surrounding lighting. Wood, on the other hand, offers minimal reflectance; however, it also has a lower weather resistance without additional treatments. For both materials, fabrication can be performed by etching the entire region for the darker tone into the material. For acrylic, this can also be completed by covering the material with a film and etching along the edges of the darker tone. As a more general drawback, the equipment required for this approach comes at a high premium. Markers generated from these approaches can be seen in Figure 17.
With both approaches for acrylic, the film can be removed partially and paint applied to further enhance the contrast. Etching only the edges like this works as a much tidier and faster approach, only taking a few minutes, though additional painting does add to the time. Further, the use of full-etching on acrylic can result in bubbling and warping of the material due to the high temperature. With the use of wood, the choice of a specific material is paramount. Using a darker wood as a base will result in a lower contrast and harder detection; however, brighter woods do come at a higher cost.

3D-Printed Markers

Materials traditionally used in 3D printing have a range of resistances against moisture, sunlight degradation, physical damage, and reflectance. With entry-level printers available from as low as GBP 300, 3D printing is far more accessible than laser cutting for producing durable outdoor markers. This is furthered by the flexibility gained from the wide range of available colours. Compared with laser-cutting, fabrication costs are also much lower, with 0.145 m markers costing as little as GBP 0.28. For fabrication on a basic printer, colour swaps can be manually executed mid-print, to enable two-tone colour prints to be created. This approach was used for the fabrication of the HD and HG markers shown in Figure 18.
In this work, we have developed a tool to enable basic STag markers to be 3D-printed. This script is based on the Trace2SCAD tool [45], which transforms PNG images into OpenSCAD [46] files. The OpenSCAD designs can be further parameterised and exported in a 3D geometry format like STL, ready for 3D printing. The STL generation script supports batch processing of a directory containing multiple PNG files. Additionally, users can input parameters such as the model’s height (in mm), the percentage of the height used for detailing, and the size of the outer square (in mm). The models can then be loaded into slicer software, such as PrusaSlicer [47], where users can add pauses at specific layers to switch filament colours.
Through manual swapping of filament, it is also possible to 3D print chromatic markers such as the HC and HO markers discussed in Section 3.1.2 and Section 3.1.3, respectively; however, slightly more expensive 3D printers can facilitate this automatically with automated colour swapping. This approach was used for the fabrication of the HO23 marker 0 in Figure 19.
A complementary script was also made to perform colour separation of the HC and HO markers in a given directory. This script generates eight separate images for black, white, blue, red, green, yellow, cyan, and magenta. These separated images are then sent to the STL generation script, which allows us to print independent components of the chromatic marker variants. This can be seen in Figure 19.

3.2.2. Active-Material Physical Markers

Filaments

Most filaments for use in 3D printing offer basic resistance to scratches, UV light, water, and oil for extended periods of time. Because of this, basic filaments are ideal for the fabrication of outdoor markers. Some filaments further contain additives that enable unique behaviours. Examples of such additives include metal, carbon, and glass fibres, which are used to stiffen the result, or photoluminescent, thermochromic, and photochromic additives, which can induce colour-changing effects on the material’s surface. As part of this work, we explore the potential of these materials to facilitate “active” STag markers, leveraging the unique material properties alongside their catalysts, to enable advantageous behaviours.

General PLA

Polylactic acid (PLA) is one of the cheapest and most accessible 3D printing materials. As a plant-based thermoplastic, it is biodegradable under industrial composting conditions, making it a sustainable option. With the use of added copolymers or fibres, it can be made to last longer for use in outdoor deployments or be incorporated with “active” behaviours. As such, PLA is the basis for many “active” materials.

Transparent PETG

There are several materials similar to PLA that provide additional benefits. One such material is PETG (polyethylene terephthalate glycol), which offers much better UV resistance than PLA. Additionally, PETG is much easier to produce in transparent form. This makes it ideal for use in markers incorporating backlight illumination with varying active lighting. A red backlight version of this is presented in Figure 20a and non-activated version in Figure 20d.

Glow-in-the-Dark PLA

Glow-in-the-dark filament, specifically Overture Glow PLA, Glow White (green in dark) [48], is a PLA additive material that contains strontium aluminate. The strontium aluminate produces a glow effect once exposed to UV light, with the length of time it glows proportional to the amount of energy received in the charging phase. An example of an Overture Glow PLA marker activated under darkness is shown in Figure 20b, with the same under illumination in Figure 20e. The strontium aluminate particulates added to the PLA can have an abrasive effect on softer nozzle materials, like the commonly used brass nozzle material. As such, unlike many additive filaments that require very specialised hardware, this material only requires the printer to be equipped with a more wear-resistant nozzle.

Reflect-o-Lay PLA

In this work, we also explored the potential of reflective filaments. In this, we utilised the material called Reflect-o-Lay [49] for the background of the 3D-printed marker. When this material has a light shone directly towards it, the light is reflected back regardless of the viewing angle. This retro-reflective effect is achieved by adding glass microsphere particulates within the filament. An example of this reflective effect on the markers can be observed in Figure 20c, where the marker is directly lit from an external light source in a dark environment. Due to the flexible properties of the filament, printing requires a printer equipped with a direct-drive extruder.

3.3. System Design

Finally, in this section, we detail the algorithms and calculations behind the localisation system. This system supports all stages of localisation from image processing and pose projection to off-board localisation and communication with the robot.

3.3.1. Six-DoF Marker Estimation

Camera FOV Calibration

To accurately project the pose of an object within the image, the horizontal field of view ($FOV_H$) of the camera must be defined. For many camera providers, this may not be listed in the specification. For this reason, the approach includes a process for calculating $FOV_H$ from the diagonal field of view ($FOV_D$), which may be listed instead. This is calculated according to (11) as twice the arctan of the ratio of the tangent of half the diagonal field of view to the scaling factor given by the aspect ratio of the image. Here, $I_{height}$ and $I_{width}$ are the height and width of the detected image.
$$FOV_H = 2 \cdot \arctan\left( \frac{\tan\left(\frac{FOV_D}{2}\right)}{\sqrt{1 + \left(\frac{I_{height}}{I_{width}}\right)^2}} \right) \tag{11}$$
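The conversion in (11) can be checked numerically with a short sketch. The function name is hypothetical, and angles are assumed to be in radians.

```python
import math


def horizontal_fov(fov_d, image_width, image_height):
    """Horizontal field of view from the diagonal field of view, following (11).

    Angles are in radians; image dimensions are in pixels.
    """
    aspect_scale = math.sqrt(1.0 + (image_height / image_width) ** 2)
    return 2.0 * math.atan(math.tan(fov_d / 2.0) / aspect_scale)


# A 4:3 image with a 90 degree diagonal field of view.
fov_h = horizontal_fov(math.radians(90.0), 640, 480)
print(round(math.degrees(fov_h), 2))
```

For a 4:3 image the scaling factor is 5/4, so a 90° diagonal field of view corresponds to a horizontal field of view of 2·arctan(0.8), roughly 77.3°.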

Orientation Estimation

The orientation is the first aspect of the pose to be determined. This is calculated from the orientation of the first edge of the bounding box. Here, the bounding box is structured as a set of four corners, each containing a set of coordinates within the image. This proceeds clockwise from the corner containing the ID number as in (12). The first edge is the connection from the first corner $(x_0, y_0)$ to the second corner $(x_1, y_1)$.
$$corners = \{(x_0, y_0), (x_1, y_1), (x_2, y_2), (x_3, y_3)\} \tag{12}$$
The calculation begins with determining the vector of the first edge of the bounding box. The arctan is then taken of the gradient of the line and given a 90° offset so as to align to the marker direction as in (13). The cosine and sine of half the rotation are then taken, respectively, for the $w$ and $z$ components of the quaternion $Q$.
$$Q = \arctan\left(\frac{\Delta y}{\Delta x}\right) + \frac{\pi}{2} \tag{13}$$
where
$$\Delta x = x_1 - x_0 \tag{14}$$
$$\Delta y = y_1 - y_0 \tag{15}$$
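The orientation step can be sketched as below. Note one deliberate substitution: the sketch uses `math.atan2` rather than the arctan of the gradient in (13), which avoids a division by zero for vertical edges but also resolves the full quadrant, so it is not identical to (13) for every edge direction. The function name and the `(w, x, y, z)` ordering of the returned tuple are assumptions.

```python
import math


def edge_quaternion(corners):
    """Quaternion (w, x, y, z) for a rotation about the image normal.

    The angle is taken from the first edge of the bounding box, offset by
    90 degrees. atan2 is used here in place of arctan(dy / dx).
    """
    (x0, y0), (x1, y1) = corners[0], corners[1]
    angle = math.atan2(y1 - y0, x1 - x0) + math.pi / 2.0
    # Half-angle terms give the w and z components; x and y stay zero.
    return math.cos(angle / 2.0), 0.0, 0.0, math.sin(angle / 2.0)
```

For an axis-aligned marker whose first edge runs along the positive image x axis, the angle is 90° and both the `w` and `z` components equal cos(45°).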

Depth Estimation

To determine the position $P$ of the object, it is first required to estimate the depth of the object from the camera, denoted as $P_z$. To this end, we first determine the focal length $FL$ of the camera as the ratio of the image width in pixels ($I_{width}$) to twice the tangent of half the horizontal FOV ($FOV_H$) as in (16).
$$FL = \frac{I_{width}}{2 \cdot \tan\left(\frac{FOV_H}{2}\right)} \tag{16}$$
This is then factored into (17), wherein the physical dimensions of the marker $M$ are utilised. In this, the focal length is multiplied by the ratio of the real width of the marker ($M_{width}$) to the edge length of the marker in pixels.
$$P_z = FL \cdot \frac{M_{width}}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \tag{17}$$
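A short sketch of (16) and (17) follows; the function names are hypothetical, angles are in radians, and the marker width is in metres.

```python
import math


def focal_length_px(image_width, fov_h):
    """Focal length in pixels from the image width and horizontal FOV, as in (16)."""
    return image_width / (2.0 * math.tan(fov_h / 2.0))


def marker_depth(corners, marker_width, focal_length):
    """Depth of the marker from the pixel length of its first edge, as in (17)."""
    (x0, y0), (x1, y1) = corners[0], corners[1]
    edge_px = math.hypot(x1 - x0, y1 - y0)
    return focal_length * marker_width / edge_px


# A 0.15 m marker whose first edge spans 48 px in a 640 px wide image
# with a 90 degree horizontal field of view.
fl = focal_length_px(640, math.radians(90.0))
depth = marker_depth([(100, 100), (148, 100), (148, 148), (100, 148)], 0.15, fl)
```

With these values the focal length is 320 px and the estimated depth is 320 × 0.15 / 48 = 1.0 m. A one-pixel change in the edge length shifts this estimate by roughly 2 cm.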

Pose Estimation

Finally, the pose estimation is performed. To accurately determine this, firstly, the centre point $C$ of the marker within the image is identified as the pixel midpoint between the first and third corners as in (18). Here, $C_x$ and $C_y$ are the X and Y coordinate values of the centre point.
$$C = \begin{bmatrix} C_x \\ C_y \end{bmatrix} = \begin{bmatrix} \frac{x_0 + x_2}{2} \\ \frac{y_0 + y_2}{2} \end{bmatrix} \tag{18}$$
This is then used in (19) to determine the pose estimate $P$, in which it is centred with respect to the image dimensions before being multiplied by a scaling factor. Here, $P_x$ and $P_y$ are the X and Y coordinate values of $P$, and the scaling factor is calculated as the ratio of the marker depth ($P_z$) to the focal length ($FL$).
$$P = \begin{bmatrix} P_x \\ P_y \end{bmatrix} = \begin{bmatrix} \left(C_x - \frac{I_{width}}{2}\right) \cdot \frac{P_z}{FL} \\ \left(C_y - \frac{I_{height}}{2}\right) \cdot \frac{P_z}{FL} \end{bmatrix} \tag{19}$$
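Equations (18) and (19) can be sketched as a single function. The function name and argument order are assumptions; the depth and focal length are taken as inputs in metres and pixels, respectively.

```python
def marker_position(corners, image_width, image_height, depth, focal_length):
    """Lateral position (P_x, P_y) of the marker centre, following (18) and (19)."""
    (x0, y0), (x2, y2) = corners[0], corners[2]
    # Centre point: pixel midpoint between the first and third corners.
    centre_x = (x0 + x2) / 2.0
    centre_y = (y0 + y2) / 2.0
    # Shift the origin to the image centre, then scale pixels to metres.
    scale = depth / focal_length
    return ((centre_x - image_width / 2.0) * scale,
            (centre_y - image_height / 2.0) * scale)
```

For a marker centred at pixel (400, 240) in a 640 × 480 image, at a depth of 1.0 m with a focal length of 320 px, this gives an offset of 0.25 m along X and 0 m along Y.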

Stable Position Filtering

In detection approaches such as this, the accuracy is likely to have slight misalignments due to lighting variations detected by the camera. The STag Marker system is also prone to this despite the developments in stability. When considering the depth estimation included with this approach, this can be further exacerbated. In this, a few pixels wider or smaller can impact the depth estimation by up to a few centimetres.
To improve the accuracy of the detection under temporal variation, we implemented a filtering system for pose determination. This is utilised twice within the calculation: once the rotation $Q$ of the marker has been calculated, and then when the final pose $P$ is calculated. In both instances, the calculated value is added to the respective buffer, $B_q$ for rotation as in (20), and $B_p$ for position as in (21), where $[Q]$ and $[P]$ are the most recent orientation and position estimates, respectively. This new list is then trimmed to maintain the length $L$, removing the oldest items added to the list.
$$B_q = \left[ B_q, [Q] \right]_{L} \tag{20}$$
$$B_p = \left[ B_p, [P] \right]_{L} \tag{21}$$
This buffer is then smoothed over time with the strategy chosen, as shown in (22) and (23), to obtain the smoothed orientation estimate ($\hat{Q}$) and smoothed position estimate ($\hat{P}$). Here, $f(\cdot)$ and $g(\cdot)$ are the smoothing functions applied.
$$\hat{Q} = f(B_q) \tag{22}$$
$$\hat{P} = g(B_p) \tag{23}$$
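The buffering and smoothing in (20) to (23) can be sketched with a fixed-length queue. The class name is hypothetical, and the choice of mean or median as the smoothing function is an assumption, since the smoothing strategies are not enumerated here.

```python
from collections import deque
from statistics import fmean, median


class PoseFilter:
    """Fixed-length buffer of recent estimates with a per-axis smoothing function."""

    def __init__(self, length=5, smooth=median):
        # deque(maxlen=...) drops the oldest item once the buffer is full.
        self.buffer = deque(maxlen=length)
        self.smooth = smooth

    def update(self, estimate):
        """Add the newest estimate and return the smoothed value."""
        self.buffer.append(tuple(estimate))
        return tuple(self.smooth(axis) for axis in zip(*self.buffer))


position_filter = PoseFilter(length=3, smooth=fmean)
for sample in [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]:
    smoothed = position_filter.update(sample)
```

After the three samples above, `smoothed` is `(2.0, 0.0)`. A median is more robust to single-frame outliers than a mean, while a mean over angles would need care near the wrap-around point.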

3.3.2. Off-Board/On-Board Tracking Compatibility

Marker Configuration File

To utilise the poses of detected markers, it is first important to determine the location of the camera. For this purpose, we utilise a collection of “calibration” markers. Each of these markers has its actual pose with respect to the map, stored in a configuration file. To simplify the utility of this configuration file, the system was made compatible with the L-CAS unified environment description template [50]. In this configuration, markers are defined with their ID, Hamming distance, marker width, and normal vector, with a pose containing position and orientation as a quaternion.
Given each marker is perpendicular to the plane it rests on, the rotations of the marker within the configuration file must be set accurately. To improve usability, the normal vector can be defined as either facing up or down. Upon launching the system, the downward-facing markers are rotated according to (24), where their original quaternion $r$ is rotated by the rotation quaternion $q$ to correct the orientation. This results in a mapping of the initial $(w, x, y, z)$ to $(-x, w, -z, y)$. Please note that the variables $x$, $y$, $z$ and $w$ in (24) are the standard variables used to represent rotations as a quaternion and not to be confused with the positions discussed in Section 3.3.1.
$$q \times r = \begin{bmatrix} 0 \cdot w - 1 \cdot x - 0 \cdot y - 0 \cdot z \\ 0 \cdot x + 1 \cdot w + 0 \cdot z - 0 \cdot y \\ 0 \cdot y - 1 \cdot z + 0 \cdot w + 0 \cdot x \\ 0 \cdot z + 1 \cdot y - 0 \cdot x + 0 \cdot w \end{bmatrix} = \begin{bmatrix} -x \\ w \\ -z \\ y \end{bmatrix} \tag{24}$$
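The coefficients in (24) correspond to the Hamilton product with $q = (0, 1, 0, 0)$, a half-turn about the x axis. The sketch below writes out the general product so that this special case can be checked; the function name and the `(w, x, y, z)` tuple ordering are assumptions.

```python
def quat_multiply(q, r):
    """Hamilton product q * r for quaternions given as (w, x, y, z) tuples."""
    qw, qx, qy, qz = q
    rw, rx, ry, rz = r
    return (
        qw * rw - qx * rx - qy * ry - qz * rz,
        qw * rx + qx * rw + qy * rz - qz * ry,
        qw * ry - qx * rz + qy * rw + qz * rx,
        qw * rz + qx * ry - qy * rx + qz * rw,
    )


# Half-turn about the x axis applied to r = (w, x, y, z) = (1, 2, 3, 4).
flipped = quat_multiply((0, 1, 0, 0), (1, 2, 3, 4))
```

Here `flipped` is `(-2, 1, -4, 3)`, that is, $(-x, w, -z, y)$ for the input quaternion.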

Camera Localisation

For each image processed, the camera localisation system receives a list of poses, corresponding to the detected markers in the image. These poses are estimated relative to the camera, which is set as the origin. In the localisation of the camera, each detected marker that appears in the configuration file is identified, followed by converting both the detected pose and absolute pose of each marker into rotation and translation matrices. The camera pose relative to the world frame, $T_{camera,i}$, is then estimated as the matrix product of the transformation matrix of the absolute pose $T_{abs,i}$ and the inverse of the transformation matrix of the relative pose $T_{rel,i}$, using (25). This is completed for each calibration marker $i$ detected.
$$T_{camera,i} = T_{abs,i} \cdot T_{rel,i}^{-1} \tag{25}$$
The estimation of the camera’s transformation from the world frame, $\bar{T}_{camera}$, is then determined as in (26). In this, the average pose estimate for the camera is computed across all $N$ markers detected. This is then converted from a transformation matrix to a pose (with both position and rotation) and identified as the camera’s pose with respect to the world coordinate frame. This estimated camera frame is then used to transform new marker pose estimates directly to the world frame.
$$\bar{T}_{camera} = \frac{1}{N} \sum_{i=1}^{N} T_{camera,i} \tag{26}$$
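The camera localisation in (25) and (26) can be sketched with 4 × 4 homogeneous transforms. This is an illustrative sketch in plain Python: the helper names are hypothetical, rotations are limited to yaw for brevity, and the inverse exploits the rigid-transform structure rather than a general matrix inverse.

```python
import math


def make_transform(yaw, tx, ty, tz):
    """4x4 homogeneous transform: rotation about z by yaw, plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t):
    """Inverse of a rigid transform: rotation R^T and translation -R^T t."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rt[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rt[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def camera_from_marker(t_abs, t_rel):
    """Camera pose in the world frame from one calibration marker, as in (25)."""
    return matmul(t_abs, invert_rigid(t_rel))


def average_transforms(transforms):
    """Element-wise mean of the per-marker camera estimates, as in (26)."""
    n = len(transforms)
    return [[sum(t[i][j] for t in transforms) / n for j in range(4)]
            for i in range(4)]
```

An element-wise mean of rotation matrices is only a reasonable approximation when the per-marker estimates are close to one another, since the result is not guaranteed to be a valid rotation.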
To enhance the flexibility of the approach, the minimum number of calibration markers to perform this camera position calibration can be specified, with a higher number of simultaneous markers resulting in more accurate camera estimation. The frequency of calibration attempts can also be adjusted between the options of either only calibrating once, or attempting to calibrate on each detection.

Robot Localisation

Once the camera’s position in the world frame is known, markers not listed in the configuration files can be detected and their pose in the world coordinate frame can be estimated. The configuration options used for the camera pose estimation enable the approach to be deployed either as an on-board approach for mobile localisation or as an off-board approach for a remote localisation system. For on-board localisation, utilising continuous camera localisation and a high minimum number of calibration markers encourages an infrequent but high-reliability localisation. In contrast, for a fixed-position camera, utilising a low marker count and single calibration point can result in high-frequency and low-latency detection.
With the use of ceiling-mounted visual tracking, if the camera’s field of view covers the full unobstructed environment, the robot can rely on the tracking as its sole localisation. In an obstructed environment, or in a scenario where the field of view does not fully cover the environment, the robot can utilise the occasional pose updates as a supplementary system, when it is within the camera’s field of view, to improve its estimates of its position.

3.3.3. STag ROS2 Package

For effective robotic deployment of this approach, the enhanced features proposed here have been combined into a software package with direct ROS2 support. In this, files to launch the enhanced STag detection system with configurable parameters, and full dependency management through the ROS dependency management system are provided. As described in the earlier subsections, the developed package provides the essential preprocessing and postprocessing for the enhanced features and uses the upstream STag marker detection system for the core detections. In our developed software package, ROS2 Humble was utilised for the ROS integration, with version 1.0.2 of the upstream stag-python module. The system also supports full compatibility with the unified environment description template [50] offering a wide range of operational scenarios.
To facilitate live pose estimation, the system was designed with the capability to function with a variety of cameras. In this, it utilises a standard ROS2 interface based on the sensor messages Image and CameraInfo. The developed approach considers and integrates compatibility from low-cost (GBP 10–60) USB cameras to expensive RealSense D435i cameras (GBP 200–300). Although not reported in Section 4, the multi-camera configuration was assessed on an Intel Next Unit of Computing (NUC) device (model: NUC8i5BEK) running an i5 quad-core CPU. The device, running four live camera feeds simultaneously, had no significant reduction in processing compared to a single-camera configuration.

4. Experiments

To fully assess the potential of this work, we conducted a collection of experiments to quantify the performance improvements from the proposed features. We grouped the experiments into three areas. These focus on the efficacy of the custom marker sets (HG, HC, and HO) in Section 4.1, detection rates of fabrication materials in Section 4.2, and localisation accuracy in Section 4.3.

4.1. Assessments of Custom Markers

4.1.1. Impact of Contrast Level on Greyscale Marker Detection

To assess detection reliability for the HG (hue/greyscale) marker sets, reliability at different contrast levels was quantified experimentally. In this, we utilised HD19’s marker 0 printed with varying contrasts with an outer width of 0.15 m. The Microsoft L2 LifeCam HD-3000 USB Camera was used to take a total of 10 images at heights from 0.3 m to 1.2 m. Detection rates were recorded under both the original system and our adapted version. In Figure 21, heatmaps show the detection rates for various signed contrast ratios (SCRs, given by (12)) across multiple distances for the two approaches. In this, 1 corresponds to 100% detection.
The results show that both the original and inverted approaches can achieve consistent detection with SCR at ±0.12 at 0.4 m distance between the marker and camera. The results further show inverted markers can be detected further away with higher reliability compared to the baseline. Interestingly, the data show that the inverted binary image (SCR of −5100) has a higher detection potential than the original image (SCR of 5100); this marker shows consistent detection rates up to 30% further from the camera compared to the baseline. It is also interesting to note that the markers with an SCR of −2.99 and −8.2 had a higher detection rate than the baseline equivalents.
The inverted approach was expected to enable a mirror of the heatmap. However, the results show further detection than that. For the baseline HD STag markers, detection was not possible for markers with an SCR less than 1.35 and over 0.4 m. In our HG markers, we observed reliable detection for this space in both the mirrored region and the baseline region.

4.1.2. High-Capacity Efficacy Across Varying Baseline Marker Sets

Although the purpose of designing high-capacity (HC) markers was to increase the number of unique markers, in this experiment, we aimed to validate the functionality of the HC markers by assessing their robustness to occlusion, a key trait in the STag system. Tests were conducted using the Microsoft L2 LifeCam HD-3000 USB Camera, positioned at a fixed depth of 0.35 m from markers rendered on a screen with an outer width of 0.09 m.
In the execution of the tests, 20 markers were selected randomly from each of the HC19, HD19, and HD11 marker sets. They were occluded on screen by a circle with a diameter of 0.032 m with the circle’s centre at 10 different positions relative to the marker. The occlusion locations are shown in Figure 22 with dotted circles. The detection of the markers for each instance of the mask was recorded, and detection likelihood, despite occlusions, was calculated. The results are detailed in Figure 22.
The results showed that markers went undetected in 2% of occluded instances for HC19, 1% for HD19, and 9% for HD11. The difference in total detections per set was significant between HC19 and HD11 (p-value of 0.035), but not between HD19 and HC19 (p-value of 0.556).
From these results, it can be reasoned that a decision between HD19 and HC19 has no significant difference in detection under occlusion. This means the deployment is no worse off with the use of our new marker set. Further, it can be argued that using a high-capacity marker set is preferable over one with a reduced bit error ratio, as it allows for greater capacity. This means that when looking to change a marker set to one with higher capacity, it is better to utilise HC markers than HD sets with a lower Hamming distance.

4.1.3. Occlusion Resilience for High-Occlusion Markers

The HO marker sets were proposed here to improve the occlusion resistance of the marker system. To assess the resilience of the HO marker sets to occlusion, an experiment was conducted comparing the HO23 marker set to the HD23 marker set. To this end, we utilised a script to generate occlusion masks to apply to both markers.
To cover the 1000 px by 1000 px marker, we generated 1000 masks each with an orange circle with a 150 px radius positioned at a random location within the marker. The masks were applied to both the original HD23 marker set and the high-occlusion marker set of HO23. These were then passed directly into the detection system. The detection results are shown in Table 4.
To better observe and understand how the overlap between successes and failures occurs between the groups, the results were aligned into a contingency table, shown in Table 5.
The results show the success or failure for each mask under both systems, making them paired binary data. The statistical significance was calculated using McNemar’s test, giving a p-value of 0.0. As a result, we can say that the high-occlusion markers are more resilient to occlusion than the HD23 markers.
Further, we sought to categorise the failure causes, for which we determined two possible causes of failure: (i) decoding failure and (ii) bounding box detection failure. Decoding failure is caused by occlusions within the data region of the marker, whilst bounding box failure occurs when the occlusion covers enough of the outer square or inner circle. Notably, when bounding box detection fails, it impacts both the baseline and adapted system equally; however, both systems are also affected equally in some instances of data occlusion.
In our above-mentioned experiment, we recorded the number of markers detected across each stage of detection, along with the locations of the occlusion masks. Observing these data, we could determine whether the failure resulted from the boundary occlusion or data occlusion. The locations of the occlusions from the set in which both markers failed are plotted in Figure 23.
Figure 23 shows the centroid of each occlusion mask used in the experiment. The masks are split across subfigures according to their impact on the detection of the baseline and developed marker sets. In the subfigures where the occlusions caused the high-occlusion markers to fail (Figure 23 (left-middle) and Figure 23 (right)), the occlusions are clearly absent from the inner circle of the marker that contains the data. In the subfigures where detection of the baseline failed (Figure 23 (right-middle) and Figure 23 (right)), the masks that caused the failures are scattered over the entire marker. From this, we can see that failed detections for the high-occlusion marker set are predominantly caused by bounding box occlusion rather than data occlusion.

4.1.4. Variation in Detection Rates Across Base Marker Sets

To further assess the HO marker sets, we employed a supplementary set of experiments. In this, we assessed a sample from each of the baseline marker sets, HD23, HD21, HD19, HD17, HD15, HD13, and HD11. For each set, 10 markers were selected at random, with their counterparts from the HO sets. In the case of HD23, where there were only six markers, all six were selected. For each marker set, 100 occlusion masks were generated, each with a randomly selected centroid, with a radius of 100 px. Each centroid was applied to each marker within the HD and HO sets, with the composites totalling either 1200 or 2000 total images.
Each composite was passed through the system, with the success or failure of the detection recorded, for each occlusion mask. The total failed detections are presented in Table 6, subdivided based on which sample images were affected: both, HD only, HO only, or neither. Also included are the average detection rates of the HD composites and the HO composite.
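The tallying described above can be sketched as follows, assuming the detection outcomes are stored as two equally long lists of booleans, one entry per composite image pair. The function names and the sample data are hypothetical.

```python
from collections import Counter


def failure_breakdown(hd_detected, ho_detected):
    """Count which marker set failed for each paired occlusion mask."""
    if len(hd_detected) != len(ho_detected):
        raise ValueError("outcome lists must be paired")
    counts = Counter({"both": 0, "HD only": 0, "HO only": 0, "neither": 0})
    for hd_ok, ho_ok in zip(hd_detected, ho_detected):
        if not hd_ok and not ho_ok:
            counts["both"] += 1
        elif not hd_ok:
            counts["HD only"] += 1
        elif not ho_ok:
            counts["HO only"] += 1
        else:
            counts["neither"] += 1
    return dict(counts)


def detection_rate(detected):
    """Fraction of composites in which the marker was detected."""
    return sum(detected) / len(detected)


# Hypothetical outcomes for four masks.
hd = [True, False, False, True]
ho = [True, True, False, False]
print(failure_breakdown(hd, ho), detection_rate(hd), detection_rate(ho))
```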
When comparing the HD23 and HD11 detection rates, it can be seen how the occlusions impact the detection potential. The detection rate dropped from 49% for HD23 to 0.2% for HD11. For the HO marker sets, there was no such drop, and the average detection rate remained consistently above 75% across every marker set. This is illustrated more clearly in Figure 24.
From the constrained standard deviation, the stability of HO detection across each base set is clear. In contrast, HD detection decreased as the Hamming distance of the base set decreased, with wide deviation for every set in which more than 1% of markers were detected. This shows the improvement in occlusion resilience.

4.2. Assessments of Fabrication Materials

4.2.1. Assessment of Low-Light Detection for Active-Material Markers

This set of experiments aimed to assess the utility of markers constructed using non-conventional materials. In this, we aimed to identify scenarios and conditions wherein various material combinations outperform standard approaches in the literature.
To conduct this experiment, we assessed the detection reliability by trialling the markers in different lighting conditions. These included indoors; outdoors under daylight and cloudy conditions; and inside in the dark, under low-light conditions, and under torchlight from various directions.
The illuminance and correlated colour temperature (CCT) were recorded for each lighting condition, and the markers were placed in the centre of the camera view at a depth of 0.82 m. For each marker, we recorded whether it was detected consistently, inconsistently, or not at all. Figure 25 shows a subset of the results, focusing on detection under darker lighting conditions with an illuminance below 350 lux.

Reflect-o-Lay

In our initial assessment of the Reflect-o-Lay PLA, the methodology was insufficient at highlighting the “active” component of the marker, when compared to the standard black and white PLA. This is attributed to the overall contrast of the marker, wherein the Reflect-o-Lay being grey when inactive has a weaker contrast than the black and white of the PLA marker.
To validate this material further, we explored an additional methodology to assess detection reliability under a reflective scenario based on distance. In this, we placed the Reflect-o-Lay marker indoors under an illuminance of 120 lux under direct light, causing reflection. This was placed alongside markers on paper (laminated mono), colour PLA, and acrylic (etched + painted). At each distance from 1.9 m to 3.0 m, in 10 cm intervals, the marker detection was again recorded as consistent, inconsistent, or none. The results are shown in Figure 26.
The Reflect-o-Lay material was found in this supplementary experiment to be detectable from the furthest distance, with laminated paper being the second best approach. When only considering stable detection, the Reflect-o-Lay marker achieved an increase of over 16% in the total distance when compared to the non-active 3D-printed variant. These results show a clear benefit to the use of reflective materials, in particular for scenarios such as marker detection where high contrast is key.

Backlit Transparent Markers

In the initial experiment, the transparent markers utilised a backlight illuminance of 6000 (100%), 1500 (20%), and 300 (5%). In the results, transparent markers were found to be undetectable in darkened conditions when utilising a high brightness. This was suspected to be caused by a mix of light leakage diminishing the contrast and overexposure, which can be resolved with Adaptive Active Exposure Control [51]. Due to the less vibrant colouring, the blue marker at 100% was detectable in some instances. In the 15% tests, it can be seen that pink backlit markers were also quite reliable, highlighting the importance of colour selection. The 5% backlit markers had a good detection rate in darkness but were undetectable in daylight. This is thought to be caused by the low contrast of the inactive markers.

Glow in the Dark

In the assessment of the glow-in-the-dark marker, it was found that it performed excellently in all low-light conditions. The detection clarity, as seen in Figure 20, was the clearest of any markers visible in the dark. The additive used for this is strontium aluminate-based phosphors. These phosphors are able to glow for up to 24 h under ideal conditions, and 6–12 h under normal conditions.
In this form, once removed from the charging source, the marker was able to emit light for an extended period of time; however, the intensity of the emission diminished rapidly. Whilst the marker could initially be identified very clearly, it remained identifiable for only a brief period. Because of this limitation, the Overture Glow material will need to be reassessed in a future study.

4.3. Assessments of System Performance

4.3.1. Depth Estimation from Bounding Box

In this experiment, the purpose was to assess the accuracy of the depth estimation obtained through bounding box measurements.
In the execution, a Microsoft L2 LifeCam HD-3000 USB camera was mounted to the ceiling at 2.33 m from the ground. A total of 8 distances from the ceiling-mounted camera were identified, with a roughly 30 cm interval between each. The marker was positioned at the centre of the camera frame at each height, and a total of 9000 images were collected over 5 min. Each image was passed through the detection system and the depth estimations were identified and recorded. The distributions of these estimations are shown in Figure 27.
Across each of the distances, the depth estimation error was relatively small with the largest mean error of 14 cm at nearly 2 m from the camera. The standard deviation in distance estimation was relatively small with the largest value at 3.1 cm, emphasising the stability inherent with STag markers. These results, summarised in Table 7, highlight the stability of the proposed approach for depth estimation from the STag markers.
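One common way to obtain depth from a bounding box, a known physical marker size, and the camera field of view is the pinhole camera model. The sketch below assumes that model with an undistorted image and a marker facing the camera; it is not necessarily the formulation used in this work. The function names are illustrative, and the numeric values in the usage line are not the parameters of the camera or marker used here.

```python
import math


def focal_length_px(image_width_px, hfov_deg):
    """Focal length in pixels from the horizontal field of view."""
    return (image_width_px / 2) / math.tan(math.radians(hfov_deg) / 2)


def depth_from_bbox(bbox_width_px, marker_width_m, image_width_px, hfov_deg):
    """Pinhole estimate of the distance to a fronto-parallel marker."""
    if bbox_width_px <= 0:
        raise ValueError("bounding box width must be positive")
    focal = focal_length_px(image_width_px, hfov_deg)
    return focal * marker_width_m / bbox_width_px


# Illustrative values only.
print(depth_from_bbox(bbox_width_px=100, marker_width_m=0.2,
                      image_width_px=1280, hfov_deg=90))
```

Under this model the estimated depth is inversely proportional to the bounding box width, so a fixed pixel-level error in the box produces a larger depth error the further the marker is from the camera.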

4.3.2. Incorporated Stable Position Filtering

Pixel-wise detection errors triggered by lighting conditions can result in temporal fluctuation even for a static marker. The temporal filtering presented in Section 3.3.1 addresses this. However, whilst a longer filter buffer smooths the detection, it can also introduce lag. To assess the efficacy of this temporal filtering and to identify an ideal buffer length, we ran a set of trials focusing on marker tracking on a 2D plane. In this, we used our ceiling-mounted Microsoft L2 LifeCam HD-3000 USB camera at 2.33 m from the ground. We then moved marker 0 from HD19 at 0.9 m from the ground for 5 min, collecting raw image data. In total, this accounted for 9000 images with the 30 fps camera feed.
We processed the collected image data through the pose estimation model using buffer lengths of 1, 2, 5, 10, 15, and 20 image frames. The pose estimation used the median as the function g in (23) over the buffer length considered for each frame to filter the outliers. To assess the efficacy of the filtering, we evaluated the various buffer lengths against the unfiltered input (where the buffer length was 1). In initial observations, the unfiltered trajectory showed considerable noise due to a lack of stability in the estimated position of the marker. To quantify this noise, we utilised a metric termed cumulative path length (CPL) over time (27). This metric quantifies the jitter within the temporal data by taking, for each reading, the cumulative distance travelled by the marker across the next N frames. This is expressed as follows:
$$\mathrm{CPL}(t) = \sum_{i=0}^{N-1} \left\lVert p(t + \Delta t_i) - p(t + \Delta t_{i-1}) \right\rVert \qquad (27)$$
A high value of CPL indicates the effect of jittery measurements within the selected time window (here, 2 s). The temporal variation of the CPL is presented in Figure 28. In this, it can be seen that as the buffer length begins to increase, there is a sharp decline in the most prominent spikes.
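As a minimal sketch of how the CPL could be computed from a recorded trajectory, the function below sums the lengths of the next N frame-to-frame steps for each starting frame. It assumes positions are sampled at a fixed frame rate and stored as an array of 2D points; the function name and the sample trajectories are illustrative. With the 30 fps feed and the 2 s window mentioned above, N would be 60.

```python
import numpy as np


def cumulative_path_length(positions, window):
    """CPL for every start frame: summed length of the next `window` steps."""
    positions = np.asarray(positions, dtype=float)
    # Euclidean distance between each pair of consecutive positions.
    steps = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    # Sliding-window sum via a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(steps)))
    return csum[window:] - csum[:-window]


# A smooth path and a jittery path covering the same net displacement.
smooth = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
jitter = [(0, 0), (1, 1), (2, -1), (3, 1), (4, 0)]
print(cumulative_path_length(smooth, 2))
print(cumulative_path_length(jitter, 2))
```

The jittery path yields larger CPL values than the smooth one even though both start and end at the same points, which is the property the metric is used for here.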
Notably, the CPL does increase partly as a result of the position estimates lagging behind the camera feed. This particular value can be predicted when utilising a median filtering approach, as shown in (28); the lag (in seconds) is estimated at the ratio of the buffer length L to twice the framerate of the camera.
$$\mathrm{lag} = \frac{L}{2 \cdot \mathrm{framerate}} \qquad (28)$$
Figure 29 shows the lag and CPL for different buffer lengths. The CPL decreased exponentially as the buffer length increased, whilst the lag increased linearly. The reduction in the CPL at larger buffer sizes was attributed to larger buffers skipping tight corners, thereby ignoring many intermediate positions. From these observations, a buffer length of 10 presented as a Pareto-optimal value for our test scenario, balancing further reduction in CPL against the increase in CPL due to lag. The ideal filter length, however, should be considered with respect to the speed of the robot and the framerate of the camera.
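The trade-off between smoothing and lag can be illustrated with a small median filter over a fixed-length buffer, together with the lag estimate from (28). This is a sketch under assumptions: the class and function names are hypothetical, and taking the median independently on each axis is one possible choice that the text does not specify.

```python
from collections import deque
from statistics import median


class MedianPositionFilter:
    """Median of the last `buffer_length` positions, taken per axis."""

    def __init__(self, buffer_length):
        self.buffer = deque(maxlen=buffer_length)

    def update(self, x, y):
        self.buffer.append((x, y))
        xs, ys = zip(*self.buffer)
        return median(xs), median(ys)


def expected_lag_seconds(buffer_length, framerate):
    """Lag estimate from (28): L / (2 * framerate)."""
    return buffer_length / (2 * framerate)


filt = MedianPositionFilter(buffer_length=3)
filt.update(0.0, 0.0)
filt.update(0.0, 0.0)
print(filt.update(5.0, 5.0))         # the single outlier is rejected
print(expected_lag_seconds(10, 30))  # buffer of 10 frames at 30 fps
```

For the buffer length of 10 at 30 fps, (28) gives a lag of roughly 0.17 s.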

4.3.3. Integration for Robot Localisation

To assess the efficacy of the system as a whole for off-board robot localisation, additional experimentation was performed. In this, we utilised an AgileX LIMO robot running a ROS2 docker image. The camera setup also utilised our ceiling-mounted Microsoft L2 LifeCam HD-3000 USB camera at 2.33 m from the ground with the marker attached to the back of the LIMO robot, as in Figure 30. The robot was configured to run AMCL with LiDAR input. It was further configured to receive pose updates from the STag localiser at a frequency of 1 update per second. Whilst the detection rate could reach over 30 Hz, this was limited on the robot, with the rate set at 1 Hz to compensate for the limited speed of the robot.
The experiment was conducted with the robot tele-operated. The robot was driven around the environment to incur drift and then periodically driven within the view of the camera to receive corrections. Figure 31 and Figure 32, respectively, show the x and y coordinates received.
The robot entered the camera view six times throughout the experiment. Across these, it corrected a total of 4.7 m of displacement, with an average correction of 0.78 m and a standard deviation of 0.317 m. The corrections compensate for the overall drift, showing that this type of approach is capable of supporting regions of an environment that are homogeneous from the perspective of a LiDAR.

5. Conclusions

Robust localisation of mobile robots is critical to the efficiency of their navigation and task performances. When considering long-term deployments, the performance of the traditional LiDAR-based approaches may be affected due to the variations in the mapped objects in the space. Fiducial markers provide an option for reliable robot localisation. Although different fiducial marker systems exist, they usually perform better only in some of the core requirements: (i) speed of detection, (ii) stability of position estimations, or (iii) resistance to occlusions. The STag has many positives in these functional requirements but lacks the six-DoF pose estimates that are essential for robot localisation. Although our primary aim was the development of a robust fiducial marker-based indoor robot localisation, our contributions here were multi-faceted and can be listed chronologically as they appeared: (i) new marker sets with enhanced resilience of the detection pipeline to occlusion and improved capacity and customisability; (ii) exploration of different fabrication methods and materials for various environmental conditions; and (iii) enhanced stability and six-DoF pose estimates for both on-board and off-board localisation of autonomous mobile robots. The full fiducial localisation system consists of two subsystems: the marker pose estimation and the camera pose estimation. Further, the developments support a collection of features to facilitate and improve the quality of deployment. The extension in marker sets facilitates three new types of marker sets: HG, HC, and HO marker sets. The associated software to support the deployment of these systems, along with marker generation, detection processes, and physical marker fabrication, has been released as an open source distribution, accessible from the LCAS Github at https://github.com/LCAS/STag_ROS2 (accessed on 16 December 2024).
The marker pose estimation utilises the STag bounding boxes, combined with physical marker size and camera field of view, to compose a six-DoF pose estimate for each marker. The real-world experiments showed stable depth estimation over time, with errors increasing at minimal rates compared to the true depth. The camera localisation leverages ground-truth marker locations encoded into the environment to determine the camera pose with respect to the world coordinate frame. Assessing drift correction for a mobile robot entering the view of a ceiling-mounted camera, the results showed effective compensation to LiDAR-based AMCL for homogeneous regions in an indoor environment. Further, motion filtering was included to address camera noise in the detection stability over time, removing the effect of outliers in the estimated pose and reducing cumulative path length.
The developed hue/greyscale (HG) markers and the supporting detection processing were designed to enable further customisation of the STag markers. They utilise image inversion to enhance detection for markers of various levels of contrast and further enable the use of all forms of two-tone chromatic markers. The experimental analysis showed the varying contrasts detectable under this approach extend the original options by over 50%.
The high-capacity (HC) markers utilise a chromatic approach to encoding information in which each colour channel is dedicated to a unique marker. Under the HC marker sets, the total number of possible markers increased to 11 trillion unique markers for the HD11 base. The experimental analysis showed statistical similarity in the detection rate between HD19 and HC19 markers. Further, it was found that increasing the library size with HC markers significantly reduced detection failures compared with increasing the library size with a reduction in Hamming distance.
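Because each colour channel carries one marker from the base set, a base library of n markers yields n³ channel combinations. The sketch below shows one possible mapping between an HC identifier and its three base-marker identifiers, treating the red channel as the least significant digit. This ordering is an inference from the Figure 12 example (HD23 markers 5, 1, and 0 in the red, green, and blue channels giving HC23 ID 11, with six markers in HD23), not a confirmed description of the released implementation, and the function names are hypothetical.

```python
def hc_id(red_id, green_id, blue_id, base_size):
    """Combine three base-marker IDs into one HC ID (red is the lowest digit)."""
    for value in (red_id, green_id, blue_id):
        if not 0 <= value < base_size:
            raise ValueError("channel ID outside the base marker set")
    return red_id + green_id * base_size + blue_id * base_size ** 2


def hc_channels(marker_id, base_size):
    """Split an HC ID back into its (red, green, blue) base-marker IDs."""
    if not 0 <= marker_id < base_size ** 3:
        raise ValueError("HC ID outside the library")
    red = marker_id % base_size
    green = (marker_id // base_size) % base_size
    blue = marker_id // base_size ** 2
    return red, green, blue


print(hc_id(5, 1, 0, base_size=6))   # 11, matching the Figure 12 example
print(hc_channels(11, base_size=6))  # (5, 1, 0)
```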
The new high-occlusion (HO) marker sets were generated through colour-channel rotation, wherein each of the red, green, and blue channels was rotated by 90 degrees to one another. This enabled up to 75% of the marker to be covered with the missing information able to be recovered from the other colour channels, with the bit error ratio increasing this occlusion resistance further. The approach was validated through experimental analysis with 1000 masks randomly generated and placed over both the baseline HD23 and our HO23 marker sets. The results of the McNemar statistical test showed a p-value below 0.05, proving the newly developed markers have a higher resilience to occlusions than the baseline approach. To further assess the efficacy of the approach, supplementary experiments were carried out sampling markers across all seven of the baseline STag marker sets. The average detection rates showed a clear drop in stability for the baseline markers from 50% for HD23 down to 0.2% for HD11, whilst the developed HO marker sets retained consistent detection at over 75% for all marker sets, with less variation in standard deviation. This further reinforced the efficacy of our approach over the baseline.
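The colour-channel rotation can be sketched with NumPy as follows. The sketch assumes that the same square pattern is placed in all three channels at offsets of 0°, 90°, and 180°; the text states only that the channels are rotated by 90 degrees relative to one another, so the exact offsets and the function names are assumptions.

```python
import numpy as np


def make_ho_marker(pattern):
    """Stack one square pattern into R, G, B at 0, 90, and 180 degrees."""
    pattern = np.asarray(pattern)
    if pattern.ndim != 2 or pattern.shape[0] != pattern.shape[1]:
        raise ValueError("pattern must be a square 2-D array")
    return np.dstack([np.rot90(pattern, k) for k in range(3)])


def split_ho_marker(rgb):
    """Undo each channel's rotation, returning three upright patterns."""
    return [np.rot90(rgb[..., k], -k) for k in range(3)]


pattern = np.arange(16, dtype=np.uint8).reshape(4, 4)
rgb = make_ho_marker(pattern)
channels = split_ho_marker(rgb)
print(rgb.shape, all(np.array_equal(c, pattern) for c in channels))
```

Since an occluded region of the image corresponds to a different part of the pattern in each channel, each un-rotated channel can be passed to the detector separately, and a region hidden in one channel may still be readable in another.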
In the assessment of active materials for marker fabrication, detection rates were assessed for various materials commonly used for indoor and outdoor robotics. It was found that the reflective filament, Reflect-o-Lay, enabled the detection of 3D-printed markers at 16% further distances. The glow-in-the-dark additive material was found to have an effective detection rate at all light levels. Finally, the efficacy of the backlit markers was found to work exceptionally in low-light conditions when utilising a low brightness, with blue and pink hues providing a more reliable detection rate.
The localisation system as a whole was found to localise, with high stability, robots navigating through a partly observed global space. The experiments validated the subsystems proposed within the approach and showed effective end-to-end deployment in practice under real-world conditions. The newly developed marker sets were found to have a clear potential for extending the capabilities of STag with the inclusion of chromatics. The adoption of active materials also showed potential for improving detection in low-light conditions.
Further work in this area should, for localisation improvement, primarily focus on forward projection of location filtering, such as with the use of particle filters. For marker set development, exploration would be valuable in detection improvements under atmospheric variation with chromatic markers utilising alternate colour spaces such as HSV or CMYK. For the different materials used, a long-term deployment study would be ideal to understand the challenges specific to individual materials and chromatic detections while being deployed for long durations. For marker material development, there is scope in developing markers with high-vibrancy fabrication methods and further exploration into novel active materials.

Author Contributions

Conceptualisation, J.R.H. and G.P.D.; Methodology, J.R.H.; Software, J.R.H., D.P. and R.L.S.; Writing—Original Draft Preparation, J.R.H.; Writing—Review and Editing, G.P.D.; Supervision, G.P.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Engineering and Physical Sciences Research Council [EP/S023917/1].

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available at https://github.com/LCAS/STag_ROS2 (accessed on 16 December 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lightbody, P.; Krajník, T.; Hanheide, M. An efficient visual fiducial localisation system. ACM SIGAPP Appl. Comput. Rev. 2017, 17, 28–37.
  2. Ulrich, J.; Blaha, J.; Alsayed, A.; Rouček, T.; Arvin, F.; Krajník, T. Real Time Fiducial Marker Localisation System with Full 6 DOF Pose Estimation. ACM SIGAPP Appl. Comput. Rev. 2023, 23, 20–35.
  3. Alam, M.S.; Gullu, A.I.; Gunes, A. Fiducial Markers and Particle Filter Based Localization and Navigation Framework for an Autonomous Mobile Robot. SN Comput. Sci. 2024, 5, 748.
  4. Fiala, M. ARTag, a fiducial marker system using digital techniques. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 2, pp. 590–596.
  5. Krajník, T.; Nitsche, M.; Faigl, J.; Vaněk, P.; Saska, M.; Přeučil, L.; Duckett, T.; Mejail, M. A practical multirobot localization system. J. Intell. Robot. Syst. 2014, 76, 539–562.
  6. DeGol, J.; Bretl, T.; Hoiem, D. Chromatag: A colored marker and fast detection algorithm. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1472–1481.
  7. Kalaitzakis, M.; Cain, B.; Carroll, S.; Ambrosi, A.; Whitehead, C.; Vitzilaios, N. Fiducial Markers for Pose Estimation. J. Intell. Robot. Syst. 2021, 101, 71.
  8. Benligiray, B.; Topal, C.; Akinlar, C. STag: A stable fiducial marker system. Image Vis. Comput. 2019, 89, 158–169.
  9. Calvet, L.; Gurdjos, P.; Griwodz, C.; Gasparini, S. Detection and accurate localization of circular fiducials under highly challenging conditions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 562–570.
  10. López de Ipiña, D.; Mendonça, P.R.; Hopper, A. TRIP: A low-cost vision-based location system for ubiquitous computing. Pers. Ubiquitous Comput. 2002, 6, 206–219.
  11. Lightbody, P.; Krajník, T.; Hanheide, M. A versatile high-performance visual fiducial marker detection system with scalable identity encoding. In Proceedings of the Symposium on Applied Computing, Marrakech, Morocco, 4–6 April 2017; pp. 276–282.
  12. Bergamasco, F.; Albarelli, A.; Cosmo, L.; Rodola, E.; Torsello, A. An accurate and robust artificial marker based on cyclic codes. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 2359–2373.
  13. Sattar, J.; Bourque, E.; Giguere, P.; Dudek, G. Fourier tags: Smoothly degradable fiducial markers for use in human-robot interaction. In Proceedings of the Fourth Canadian Conference on Computer and Robot Vision (CRV’07), Montreal, QC, Canada, 28–30 May 2007; pp. 165–174.
  14. Mutka, A.; Miklic, D.; Draganjac, I.; Bogdan, S. A low cost vision based localization system using fiducial markers. IFAC Proc. Vol. 2008, 41, 9528–9533.
  15. Kulich, M.; Chudoba, J.; Kosnar, K.; Krajnik, T.; Faigl, J.; Preucil, L. SyRoTek—Distance Teaching of Mobile Robotics. IEEE Trans. Educ. 2013, 56, 18–23.
  16. Rekimoto, J. Matrix: A realtime object identification and registration method for augmented reality. In Proceedings of the 3rd Asia Pacific Computer Human Interaction (Cat. No. 98EX110), Kanagawa, Japan, 15–17 July 1998; pp. 63–68.
  17. Rekimoto, J.; Ayatsuka, Y. CyberCode: Designing augmented reality environments with visual tags. In Proceedings of the DARE 2000 on Designing Augmented Reality Environments, Elsinore, Denmark, 12–14 April 2000; pp. 1–10.
  18. Atcheson, B.; Heide, F.; Heidrich, W. Caltag: High precision fiducial markers for camera calibration. In Proceedings of the VMV, Siegen, Germany, 15–17 November 2010; Volume 10, pp. 41–48.
  19. Wang, B. LFTag: A scalable visual fiducial system with low spatial frequency. In Proceedings of the 2020 2nd International Conference on Advances in Computer Technology, Information Science and Communications (CTISC), Suzhou, China, 20–22 March 2020; pp. 140–147.
  20. Kato, H.; Billinghurst, M. Marker tracking and hmd calibration for a video-based augmented reality conferencing system. In Proceedings of the 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR’99), San Francisco, CA, USA, 20–21 October 1999; pp. 85–94.
  21. Wagner, D.; Schmalstieg, D. ARToolKitPlus for Pose Tracking on Mobile Devices. In Proceedings of the Computer Vision Winter Workshop, St. Lambrecht, Austria, 6–8 February 2007; Grabner, M., Grabner, H., Eds.; Graz Technical University: St. Lambrecht, Austria, 2007.
  22. Olson, E. AprilTag: A robust and flexible visual fiducial system. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 3400–3407.
  23. Garrido-Jurado, S.; Muñoz-Salinas, R.; Madrid-Cuevas, F.J.; Marín-Jiménez, M.J. Automatic generation and detection of highly reliable fiducial markers under occlusion. Pattern Recognit. 2014, 47, 2280–2292.
  24. Flohr, D.; Fischer, J. A Lightweight ID-Based Extension for Marker Tracking Systems. In Proceedings of the Eurographics Symposium on Virtual Environments, Short Papers and Posters, Weimar, Germany, 15–18 July 2007; Froehlich, B., Blach, R., van Liere, R., Eds.; The Eurographics Association: Limassol, Cyprus, 2007.
  25. Naimark, L.; Foxlin, E. Circular data matrix fiducial system and robust image processing for a wearable vision-inertial self-tracker. In Proceedings of the International Symposium on Mixed and Augmented Reality, Darmstadt, Germany, 30 September–1 October 2002; pp. 27–36.
  26. van Rhijn, A.; Mulder, J.D. Optical Tracking using Line Pencil Fiducials. In Proceedings of the EGVE, Grenoble, France, 8–9 June 2004; pp. 35–44.
  27. Bergamasco, F.; Albarelli, A.; Torsello, A. Pi-tag: A fast image-space marker design based on projective invariants. Mach. Vis. Appl. 2013, 24, 1295–1310.
  28. Kaltenbrunner, M.; Bencina, R. reacTIVision: A computer-vision framework for table-based tangible interaction. In Proceedings of the 1st International Conference on Tangible and Embedded Interaction, Baton Rouge, LA, USA, 15–17 February 2007; pp. 69–74.
  29. Kaltenbrunner, M. An Abstraction Framework for Tangible Interactive Surfaces. Ph.D. Thesis, Bauhaus-Universität Weimar, Weimar, Germany, 2018.
  30. Prasad, M.G.; Chandran, S.; Brown, M.S. A motion blur resilient fiducial for quadcopter imaging. In Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 5–9 January 2015; pp. 254–261.
  31. Schweiger, F.; Zeisl, B.; Georgel, P.F.; Schroth, G.; Steinbach, E.G.; Navab, N. Maximum Detector Response Markers for SIFT and SURF. In Proceedings of the VMV, Braunschweig, Germany, 16–18 November 2009; Volume 10, pp. 145–154.
  32. Toyoura, M.; Aruga, H.; Turk, M.; Mao, X. Detecting markers in blurred and defocused images. In Proceedings of the 2013 International Conference on Cyberworlds, Yokohama, Japan, 21–23 October 2013; pp. 183–190.
  33. Mohan, A.; Woo, G.; Hiura, S.; Smithwick, Q.; Raskar, R. Bokode: Imperceptible Visual Tags for Camera Based Interaction from a Distance. In Proceedings of the ACM SIGGRAPH 2009 Papers, Yokohama, Japan, 16–19 December 2009; ACM: New York, NY, USA, 2009; pp. 1–8.
  34. Košťák, M.; Slabý, A. Designing a simple fiducial marker for localization in spatial scenes using neural networks. Sensors 2021, 21, 5407.
  35. Liu, J.; Chen, S.; Sun, H.; Qin, Y.; Wang, X. Real time tracking method by using color markers. In Proceedings of the 2013 International Conference on Virtual Reality and Visualization, Xi’an, China, 14–15 September 2013; pp. 106–111.
  36. Dell’Acqua, A.; Ferrari, M.; Marcon, M.; Sarti, A.; Tubaro, S. Colored visual tags: A robust approach for augmented reality. In Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, Como, Italy, 15–16 September 2005; pp. 423–427.
  37. Research, M. High Capacity Color Barcode (HCCB) Technology. 2007. Available online: http://research.microsoft.com/en-us/projects/hccb/ (accessed on 16 December 2024).
  38. Neumann, Y.C.J.L.U. A multi-ring color fiducial system and an intensity-invariant detection method for scalable fiducial-tracking augmented reality. In Proceedings of the Int’l Workshop Augmented Reality, San Francisco, CA, USA, 20–21 October 1999; pp. 147–165.
  39. Farkas, Z.V.; Korondi, P.; Illy, D.; Fodor, L. Aesthetic marker design for home robot localization. In Proceedings of the IECON 2012—38th Annual Conference on IEEE Industrial Electronics Society, Montreal, QC, Canada, 25–28 October 2012; pp. 5510–5515.
  40. Higashino, S.; Nishi, S.; Sakamoto, R. ARTTag: Aesthetic fiducial markers based on circle pairs. In ACM SIGGRAPH 2016 Posters; ACM: New York, NY, USA, 2016; pp. 1–2.
  41. Jurado-Rodríguez, D.; Muñoz-Salinas, R.; Garrido-Jurado, S.; Medina-Carnicer, R. Design, detection, and tracking of customized fiducial markers. IEEE Access 2021, 9, 140066–140078.
  42. Akinlar, C.; Topal, C. EDPF: A real-time parameter-free edge segment detector with a false detection control. Int. J. Pattern Recognit. Artif. Intell. 2012, 26, 1255002.
  43. Vaughan, J. Fiducial SLAM. 2024. Available online: http://wiki.ros.org/fiducial_slam (accessed on 16 December 2024).
  44. Ulrich, J.; Alsayed, A.; Arvin, F.; Krajník, T. Towards fast fiducial marker with full 6 dof pose estimation. In Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, Virtual, 25–29 April 2022; pp. 723–730.
  45. Dietz, H.G. Trace2SCAD: Converting Images Into OpenSCAD Models; Technical Report; University of Kentucky: Lexington, KY, USA, 2015.
  46. Machado, F.; Malpica, N.; Borromeo, S. Parametric CAD modeling for open source scientific hardware: Comparing OpenSCAD and FreeCAD Python scripts. PLoS ONE 2019, 14, e0225795.
  47. Prusa Research. PrusaSlicer. Version 2.9.4. 2024. Available online: https://www.prusa3d.com/prusaslicer/ (accessed on 16 December 2024).
  48. Overture 3D, Overture Glow PLA 3D Printer Filament 1.75 mm—Glow White (Green in Dark). 2024. Available online: https://overture3d.com/products/overture-glow-pla?variant=44279674732798 (accessed on 13 September 2024).
  49. Filament2Print. Reflect-o-Lay 3D Printing Filament. 2024. Available online: https://filament2print.com/gb/special-pla/690-reflect-o-lay.html (accessed on 2 September 2024).
  50. Heselden, J.R.; Das, G.P. Unified Map Handling for Robotic Systems: Enhancing Interoperability and Efficiency Across Diverse Environments. In Proceedings of the Workshop on Field Robotics, ICRA 2024, Yokohama, Japan, 17 May 2024.
  51. Ren, Z.; Lensgraf, S.; Quattrini Li, A. Improving the perception of visual fiducial markers in the field using Adaptive Active Exposure Control. In Proceedings of the International Symposium on Experimental Robotics, Chiang Mai, Thailand, 9–12 November 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 274–284.
Figure 1. Examples of robots equipped with STag markers.
Figure 2. Examples of markers from each of the (a) baseline, (b) hue/greyscale, (c) high-capacity, and (d) high-occlusion marker sets, developed in this work.
Figure 7. Stages of the STag marker-generation pipeline: (a) binary representation of all points, (b) active points for marker 0 from the HD23 set, (c) binary pattern processed for homography refinement, and (d) completed marker.
Figure 8. Marker libraries with different sizes and maximum bit error ratio (BER) correction capabilities (reproduced from [8]).
Figure 9. Developed marker libraries with different library sizes and maximum bit error ratio (BER) correction capabilities.
Figure 10. STag markers of varying contrast, with signed contrast ratios of (a) +5100, (b) +1.34, (c) +0.08, and (d) −5100.
Figure 11. Markers with colour palettes matching the corporate logos of (a) John Deere, (b) PayPal, and (c) Coca-Cola, with signed contrast ratios of −1.06, 0.72, and −1084, respectively.
Figure 12. Construction of (a) HC23 marker with the ID of 11 from HD23 markers (b) 5, (c) 1, and (d) 0 across the red, green, and blue channels, respectively.
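The channel composition in Figure 12 suggests a simple way to index high-capacity markers. The following sketch assumes, consistently with the caption's single example (HD23 markers 5, 1, and 0 giving HC23 ID 11), that the three per-channel IDs act as base-N digits with the red channel least significant, where N is the size of the base library (6 for HD23). The function names are hypothetical, and the encoding is an inference from that one example rather than a confirmed description of the authors' implementation.

```python
def hc_id_from_channels(red_id: int, green_id: int, blue_id: int, base_size: int) -> int:
    """Combine three base-library IDs into one high-capacity ID (assumed encoding)."""
    for channel_id in (red_id, green_id, blue_id):
        if not 0 <= channel_id < base_size:
            raise ValueError("channel ID outside the base library")
    return red_id + green_id * base_size + blue_id * base_size ** 2


def channels_from_hc_id(hc_id: int, base_size: int) -> tuple[int, int, int]:
    """Invert the assumed encoding, returning (red_id, green_id, blue_id)."""
    if not 0 <= hc_id < base_size ** 3:
        raise ValueError("ID outside the high-capacity library")
    red_id = hc_id % base_size
    green_id = (hc_id // base_size) % base_size
    blue_id = hc_id // base_size ** 2
    return red_id, green_id, blue_id


HD23_SIZE = 6  # library size of HD23 from Table 1
print(hc_id_from_channels(5, 1, 0, HD23_SIZE))  # 11, matching Figure 12
print(channels_from_hc_id(11, HD23_SIZE))       # (5, 1, 0)
```

Under this assumption, every ID from 0 to N³ − 1 maps to exactly one triple of base markers, so decoding the three channels separately is enough to recover the high-capacity ID.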
Figure 13. Examples of high-capacity markers from the HC23 marker set. Note how shared channels of the same ID result in either (a) binary images, (b) blue/yellow, (c) red/cyan, or (d) green/purple pairs.
Figure 14. Generation of (a) high-occlusion marker 0 from HD23 marker 0, which is placed in (b) the red channel with no rotation, (c) the green channel with a 90° clockwise rotation, and (d) the blue channel with a 180° rotation.
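The generation process in Figure 14 can be sketched with NumPy. This is a minimal illustration, assuming the base marker is available as a square single-channel `uint8` array; the function names and the toy 4 × 4 pattern are hypothetical, and the actual marker-generation code may differ.

```python
import numpy as np


def make_high_occlusion_marker(base: np.ndarray) -> np.ndarray:
    """Stack three rotated copies of a square greyscale marker into an RGB image."""
    if base.ndim != 2 or base.shape[0] != base.shape[1]:
        raise ValueError("expected a square single-channel image")
    red = base                    # no rotation
    green = np.rot90(base, k=-1)  # 90 degrees clockwise
    blue = np.rot90(base, k=2)    # 180 degrees
    return np.stack([red, green, blue], axis=-1)


def split_and_unrotate(marker: np.ndarray) -> list[np.ndarray]:
    """Recover the three base-orientation copies from an RGB high-occlusion marker."""
    red, green, blue = (marker[..., c] for c in range(3))
    return [red, np.rot90(green, k=1), np.rot90(blue, k=2)]


# Toy asymmetric pattern standing in for a rendered HD23 marker.
toy = np.zeros((4, 4), dtype=np.uint8)
toy[0, :2] = 255
ho = make_high_occlusion_marker(toy)
print(ho.shape)  # (4, 4, 3)
```

Because each channel holds the same pattern at a different rotation, an occluder covering one image region hides a different part of the base pattern in each channel, which is presumably what the high-occlusion set relies on.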
Figure 15. Examples of high-occlusion markers from the HO23 marker set.
Figure 16. Examples of STag markers produced using (a) paper, (b) laminated paper, and (c) photo paper.
Figure 17. Examples of STag markers produced using the laser-cutting method: (a) edge-etched acrylic, (b) fully-etched acrylic and (c) fully-etched wood.
Figure 18. Examples of STag markers produced using the 3D printing method, with a manual colour change at the halfway mark from (a) black to white, (b) white to black, and (c) orange to blue.
Figure 19. Example of (a) a 3D-printed chromatic STag marker produced via manual filament swaps during printing, alongside the detections in each of the (b) red, (c) green, and (d) blue colour channels.
Figure 20. Unique qualities of (a,d) transparent PETG, (b,e) Overture Glow PLA, and (c,f) Reflect-o-Lay PLA under different lighting conditions. Note the higher contrast (top row) when the material properties are activated, by darkness for (a) transparent PETG with a red backlight and (b) Overture Glow PLA, and by direct torchlight for (c) Reflect-o-Lay, compared with their non-activated versions (bottom row).
Figure 21. Heatmaps of detection results for the original (top) and upgraded (bottom) approaches. The value and associated colour of each cell denote the detection rate, where 0 indicates that no detections were made and 1 indicates that detections were stable and reliable.
Figure 22. Failed detection across different marker libraries (a) when occluded in positions denoted by the dotted ring (b). The numbers in the centre of each dotted circle are the mask numbers, used for repeated trials across different markers.
Figure 23. Centroids of the occlusion masks used in the experiment, grouped according to the contingency table in Table 5: (a) detected with both; (b) only detected with HD; (c) only detected with HO; and (d) detected with neither. Different colours are used to differentiate the occlusion mask locations for each set.
Figure 24. Mean and standard deviation of detections across 100 occlusion composites over 10 markers sampled from different baseline (HD) and high-occlusion (HO) marker sets.
Figure 25. Detection rates for various light-emitting active materials, illuminated at different intensities. Results are indicated as stable (S) for reliable detection, unstable (U) for unreliable detection, or absent (-) when no detections were made.
Figure 26. Detection rates for various reflective materials, under direct torchlight, at different distances from the camera. Results are indicated as stable (S) for reliable detection, unstable (U) for unreliable detection, or absent (-) when no detections were made.
Figure 27. Stability of depth estimation for different positions. The orange line denotes the median of the detections. Green crosses indicate the desired estimate.
Figure 28. Cumulative path length (CPL) over 2 s intervals for each filter length, aligned to the first output. The complete output can be seen on YouTube (https://youtu.be/2ApKVdD_6aE (accessed on 16 December 2024)).
Figure 29. Total CPL for various filter lengths, plotted against the lag incurred to achieve stability.
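As a rough illustration of the metric in Figures 28 and 29, cumulative path length can be computed as the summed distance between consecutive position estimates, and a longer smoothing window reduces it at the cost of lag. The sketch below assumes a simple causal moving-average filter and synthetic jitter around a stationary marker; the filter type, function names, and data are all assumptions, not the configuration used in the experiments.

```python
import numpy as np


def cumulative_path_length(positions: np.ndarray) -> float:
    """Sum of Euclidean distances between consecutive (x, y) estimates."""
    steps = np.diff(positions, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())


def moving_average(positions: np.ndarray, length: int) -> np.ndarray:
    """Causal moving average: each output is the mean of the last `length` samples."""
    if length < 1:
        raise ValueError("filter length must be at least 1")
    out = np.empty_like(positions, dtype=float)
    for i in range(len(positions)):
        start = max(0, i - length + 1)
        out[i] = positions[start:i + 1].mean(axis=0)
    return out


# Synthetic stationary marker with measurement jitter (hypothetical data).
rng = np.random.default_rng(0)
raw = np.array([1.0, 2.0]) + rng.normal(scale=0.01, size=(200, 2))
for length in (1, 5, 20):
    cpl = cumulative_path_length(moving_average(raw, length))
    print(f"filter length {length:2d}: CPL = {cpl:.3f} m")
```

For a stationary target, an ideal estimate has a CPL of zero, so a lower CPL indicates a steadier output. The window length sets the number of samples that must arrive before the filter output fully reflects a change.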
Figure 30. Experimental setup with ceiling-mounted USB camera (top) and AgileX LIMO holding marker 0 from HD19 (bottom). The image view (right) shows marker 0 detected and labelled on the image.
Figure 31. Robot X positions over time. Red shading highlights the periods when the robot is receiving location estimates from the STag marker detection. The green line shows the correction on entry into the camera view. The bottom graph presents these correction offset spikes.
Figure 32. Robot Y positions over time. Red shading highlights the periods when the robot is receiving location estimates from the STag marker detection. The green line shows the correction on entry into the camera view. The bottom graph presents these correction offset spikes.
Table 1. Library sizes of the STag HD marker sets with respective minimum Hamming distances.

| HD11 | HD13 | HD15 | HD17 | HD19 | HD21 | HD23 |
|---|---|---|---|---|---|---|
| 22,335 | 2884 | 766 | 157 | 38 | 12 | 6 |
Table 2. Average performance (in milliseconds) on marker detection for each new marker set.

| Marker Type | STag | Hue/Greyscale | High-Capacity | High-Occlusion |
|---|---|---|---|---|
| Image Retrieval | 0.5 | 0.5 | 0.5 | 0.5 |
| Preprocessing | 0 | 0.3 | 4.1 | 0 |
| STag Detection | 14.6 | 30.7 | 49.2 | 53.8 |
| Post-processing | 0 | 0 | 0.8 | 38.4 |
| Localisation | 0.4 | 0.4 | 0.4 | 0.2 |
| Total | 15.5 | 31.9 | 55.0 | 92.9 |
Table 3. STag library sizes of the original HD marker sets and the proposed HC marker sets, with respective minimum Hamming distances.

| Library \ Base | HD11 | HD13 | HD15 | HD17 | HD19 | HD21 | HD23 |
|---|---|---|---|---|---|---|---|
| HD | 22,335 | 2884 | 766 | 157 | 38 | 12 | 6 |
| HC | 11 × 10^12 | 24 × 10^9 | 45 × 10^7 | 39 × 10^5 | 54,872 | 1728 | 216 |
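The HC row of Table 3 is consistent with cubing the corresponding HD library size, which is what one would expect if each of the three colour channels independently carries one base marker. The snippet below checks this arithmetic under that assumption; the variable names are illustrative.

```python
# HD library sizes from Table 1, keyed by minimum Hamming distance.
hd_sizes = {11: 22335, 13: 2884, 15: 766, 17: 157, 19: 38, 21: 12, 23: 6}

# Assumption: one independent base marker per colour channel, so size = HD ** 3.
hc_sizes = {hd: size ** 3 for hd, size in hd_sizes.items()}

for hd, size in hc_sizes.items():
    print(f"HC{hd}: {size:,} markers ({size:.1e})")
```

The three smallest libraries reproduce the tabulated values exactly (54,872, 1728, and 216), and the larger ones agree with the rounded entries.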
Table 4. Comparison of the successful detection of HD and HO marker sets.

| Marker Set | Detection Success | Detection Failure | Total |
|---|---|---|---|
| Baseline (HD23) | 2923 | 3077 | 6000 |
| Developed (HO23) | 5360 | 640 | 6000 |
Table 5. Contingency table of results.

| | Adapted System Success | Adapted System Failure |
|---|---|---|
| Baseline Success | 2887 | 36 |
| Baseline Failure | 2473 | 604 |
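For paired outcomes like those in Table 5, one common way to test whether two systems differ is McNemar's test, which uses only the discordant cells. The sketch below computes that statistic from the table's counts and checks the marginal totals against Table 4. It is offered as an illustration only and makes no claim about which test the authors used to obtain their reported p-values.

```python
# Counts from Table 5 (rows: baseline HD23, columns: adapted HO23).
both_success = 2887
baseline_only = 36     # baseline succeeded, adapted failed
adapted_only = 2473    # baseline failed, adapted succeeded
both_failure = 604

total = both_success + baseline_only + adapted_only + both_failure
baseline_success = both_success + baseline_only  # should match Table 4: 2923
adapted_success = both_success + adapted_only    # should match Table 4: 5360

# McNemar's chi-square statistic (1 degree of freedom, no continuity correction).
mcnemar_stat = (baseline_only - adapted_only) ** 2 / (baseline_only + adapted_only)

print(f"trials: {total}")
print(f"baseline detection rate: {baseline_success / total:.3f}")
print(f"adapted detection rate:  {adapted_success / total:.3f}")
print(f"McNemar statistic: {mcnemar_stat:.1f}")
```

The discordant cells are highly asymmetric (36 against 2473), so the statistic is very large relative to the usual chi-square critical values for one degree of freedom.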
Table 6. Results applying 100 masks with 100 px radii over various 1000 px wide markers.

| Base | Total | Detected by Both | Detected by HD Only | Detected by HO Only | Detected by Neither | HD Detections (Mean ± Std.) | HO Detections (Mean ± Std.) | p-Value |
|---|---|---|---|---|---|---|---|---|
| 23 | 600 | 285 | 9 | 195 | 111 | 49.0 ± 21.99 | 80.0 ± 6.51 | 1 × 10^-46 |
| 21 | 1000 | 465 | 20 | 383 | 132 | 48.5 ± 38.52 | 84.8 ± 8.64 | 6 × 10^-88 |
| 19 | 1000 | 162 | 2 | 620 | 216 | 16.4 ± 25.93 | 78.2 ± 15.5 | 2 × 10^-182 |
| 17 | 1000 | 326 | 12 | 531 | 131 | 33.8 ± 33.69 | 85.7 ± 5.73 | 8 × 10^-140 |
| 15 | 1000 | 83 | 5 | 682 | 230 | 8.8 ± 18.74 | 76.5 ± 19.50 | 3 × 10^-195 |
| 13 | 1000 | 6 | 0 | 809 | 185 | 0.6 ± 0.66 | 81.5 ± 12.78 | 5 × 10^-244 |
| 11 | 1000 | 1 | 1 | 844 | 154 | 0.2 ± 0.40 | 84.5 ± 10.25 | 7 × 10^-252 |
Table 7. Mean, standard deviation, and error across the estimated depths for each of the eight ground truths.

| Ground Truth Depth (m) | Estimated Depth as Mean ± Std (m) | Mean Error (m) |
|---|---|---|
| 0.23 | 0.23 ± 0.000 | 0.00 |
| 0.46 | 0.44 ± 0.003 | 0.02 |
| 0.80 | 0.76 ± 0.004 | 0.04 |
| 1.09 | 1.02 ± 0.005 | 0.07 |
| 1.39 | 1.30 ± 0.015 | 0.09 |
| 1.70 | 1.62 ± 0.031 | 0.08 |
| 1.99 | 1.85 ± 0.004 | 0.14 |
| 2.29 | 2.19 ± 0.029 | 0.10 |
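The error column in Table 7 can be reproduced from the other two columns, and expressing it relative to the ground truth makes the growth with distance easier to compare. The snippet below is a small sketch of that arithmetic; the relative-error figure is added for illustration and does not appear in the source table.

```python
# (ground-truth depth, mean estimated depth) in metres, from Table 7.
depths = [
    (0.23, 0.23), (0.46, 0.44), (0.80, 0.76), (1.09, 1.02),
    (1.39, 1.30), (1.70, 1.62), (1.99, 1.85), (2.29, 2.19),
]

rows = []
for truth, estimate in depths:
    error = round(truth - estimate, 2)        # absolute error in metres
    relative = round(100 * error / truth, 1)  # error as a percentage of truth
    rows.append((truth, estimate, error, relative))
    print(f"{truth:.2f} m -> {estimate:.2f} m: error {error:.2f} m ({relative:.1f}%)")
```

Every mean estimate in the table is at or below its ground truth, so the signed difference used here is never negative.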
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
