1 Introduction

With rising labor costs and the growing demand for automated production-line equipment, industrial robots with stereo vision are increasingly applied to product processing, especially bin picking and workpiece loading, as exemplified by the picking competition held by Amazon for grasping items from boxes or shelves. However, there is also a growing demand for the automatic unloading and placement of objects, the "reversed picking process." Limited by the visual perception technology of industrial robots, the packing process is usually set up manually or based on prior human knowledge. For example, objects are approximated by simplified geometric shapes (cubes, cylinders, etc.) and then loaded into boxes with known shapes and fixed positions. For objects with complex shapes or elastic bodies, robot engineers often program the placement procedure by hand. The Wynright company of the USA (Criswell 2014) built a mobile robot system for the automatic loading of tires into containers, but unexpected changes in the container body still occur, leaving a remaining space (the accommodation space). The key problem of automatic loading in an irregularly changing space is to obtain the pose of a target object in the remaining accommodation space, which can hold at least one object.

Obtaining the effective pose matrix of the object in the accommodation space is the last crucial step in the intelligent robot's pick, move (with obstacle avoidance) and place process. Automatic loading with an industrial robot (Guo et al. 2019) needs the hand–eye matrix (1), the pose matrix (2) between the camera coordinate system and the workpiece coordinate system of the reference model, the pose matrix (3) between the reference model and the target model, and the pose matrix (4) between the target object (or the reference model (5)) and the accommodation space, as shown in Fig. 1a. In recent years, object recognition for robotic bin picking has been a hot topic in both scientific research and robot companies. Many algorithms have been proposed for six-degree-of-freedom (6-DOF) pose estimation, such as Fast Point Feature Histograms (FPFH), Signature of Histograms of Orientations (SHOT), the template-based LineMod algorithm (Hinterstoisser et al. 2012), the Point Pair Feature (PPF) with a voting mechanism (Drost et al. 2010), DenseFusion (Wang et al. 2019), the regression-based PoseCNN (Xiang et al. 2017) and SSD-6D (Kehl et al. 2017).

Fig. 1
figure 1

Structure diagram of the system and the applications. a Structure diagram of automatic loading using industrial robots in the remaining accommodation space (Guo et al. 2019). b The applications of alpha-shape algorithm (Wang and Chen 2019)

End-to-end robot trajectory planning is also a research hotspot, focusing on time-optimal or minimum-energy trajectories. These planning algorithms mostly establish the dynamic and kinematic constraints of the robot and use intelligent optimization algorithms to obtain the trajectory, or employ sensors to generate an Octomap or three-dimensional (3D) environment data from which an obstacle-avoiding trajectory is built. When the robot moves to the place where an object is to be set down, it usually relies on a pre-set pose matrix. Humans, in contrast, show great flexibility when placing objects: they choose a reasonable location according to the size of the object and the surrounding environment on a plane, and when packing in a limited space they adapt the placement to each object.

The alpha-shape algorithm is used to reconstruct an object surface from an unorganized point cloud. The method was proposed by Edelsbrunner for 2D points and was later extended to 3D points (Edelsbrunner et al. 2003; Edelsbrunner and Mücke 1994). Compared with the convex hull, the alpha shape can reconstruct the shape of a nonconvex body, as shown in Fig. 1b. Alpha shapes have been widely used for 3D object shapes. Zhu et al. (2008) proposed a novel approach for tree crown reconstruction based on an improved alpha-shape model, where the data are points distributed unevenly throughout a volume rather than only on a surface. Lou et al. (2013) used alpha shapes to extract topographical features from engineering surfaces and found that the alpha-shape method performs more efficiently for large structuring elements. Santos et al. (2019) proposed an adaptive method that estimates a local parameter for each edge based on local point spacing and used it to extract building roof boundaries from LiDAR data.

Based on our previous study (Guo et al. 2019), this paper still aims to solve the problem of object pose estimation in the accommodation space, assuming that the accommodation space can hold one target object. The alpha-shape algorithm and the improved fruit fly optimization algorithm (FOA) (Pan 2012) are used to determine the object's final state given a pose matrix. First, the alpha-shape algorithm is introduced; then, the variation of the alpha-shape volume of the object together with the measured space is set as the objective function, and the variation of the object pose is encoded as the six variables of the improved FOA. Next, in the simulation experiments we first present the pose estimation process in a cube space, and then obtain the convergence curves of four different spaces (cube, hemisphere, cylinder and triangular prism) and the object's final pose in those spaces. Finally, the reliability of the proposed method is verified experimentally, comparisons with previous work are given, and conclusions are drawn.

2 Proposed method

2.1 Alpha-shape algorithm

The alpha-shape algorithm can reconstruct geometry from a set of discrete points, as shown in Fig. 2. A circle with radius α is rolled around the point set S. After all points have been traversed, the inner and outer contours of S are obtained. When the radius α is large, the circle rolls outside S, and the trace of this external rolling is the boundary contour of the point set. When α is small, the circle rolls into the interior of S, and when α is small enough, every point in S becomes a boundary point. The value of the radius α is closely related to the fineness of the detected contour: a relatively small radius yields a fine contour, while a relatively large radius yields a slightly rougher one.

Fig. 2
figure 2

Alpha shape of the planar point set

2.2 Pose estimation algorithm

When an industrial robot picks up the target object, moves and places it into the accommodation space, the shape of the accommodation space can generally be regarded as convex. A swarm intelligence optimization algorithm usually requires an objective function to determine the direction of iterative optimization. Compared with previous work (Guo et al. 2019), we use only the alpha-shape volume to establish the objective function of the FOA.

In the process of placing the object into the accommodation space, there are two groups of point clouds (the object and the accommodation space), as shown in Fig. 3a–c. The two groups of points are combined into one group, and the change of relative position can be expressed by the whole volume value Vt. The whole volume Vt is minimal at the optimal position, where the object rests at the bottom of the accommodation space and there is no gap between the object and the bottom of the space.

Fig. 3
figure 3

The process of object placing into the measured space

When the object is placed at the bottom of the accommodation space, the combined point cloud is concave. The whole volume value Vt can be obtained by tuning the α value of the alpha-shape algorithm. The fitness function of the FOA is expressed by:

$$ S_{\min } = V_{t} . $$
(1)
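A minimal MATLAB sketch of this objective is given below, using MATLAB's alphaShape and volume functions; the toy point clouds are only placeholders for the measured object O and accommodation space P of Sect. 3.1, and the α value anticipates the setting of Sect. 3.2.

```matlab
% Toy stand-ins for the measured clouds; real O and P come from Sect. 3.1.
O = rand(500, 3) * 50;                    % object point cloud (m-by-3)
P = rand(2000, 3) * 300;                  % accommodation-space point cloud (n-by-3)

alphaRadius = 250;                        % alpha value used in Sect. 3.2
shp = alphaShape([O; P], alphaRadius);    % merge the two clouds (Fig. 3)
Vt  = volume(shp);                        % whole volume V_t; S_min = V_t, Eq. (1)
```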

Based on previous work (Guo et al. 2019), the target rotation is regarded as a separate rotation about each axis, and the posture adjustment of the object is simplified into three parameter variables, whose rotation matrices are expressed as:

$$ {\mathbf{R}}_{x} (\alpha ) = \left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & {\cos (\alpha )} & { - \sin (\alpha )} \\ 0 & {\sin (\alpha )} & {\cos (\alpha )} \\ \end{array} } \right], $$
(2)
$$ {\mathbf{R}}_{y} (\beta ) = \left[ {\begin{array}{*{20}c} {\cos (\beta )} & 0 & {\sin (\beta )} \\ 0 & 1 & 0 \\ { - \sin (\beta )} & 0 & {\cos (\beta )} \\ \end{array} } \right], $$
(3)
$$ {\mathbf{R}}_{z} (\gamma ) = \left[ {\begin{array}{*{20}c} {\cos (\gamma )} & { - \sin (\gamma )} & 0 \\ {\sin (\gamma )} & {\cos (\gamma )} & 0 \\ 0 & 0 & 1 \\ \end{array} } \right], $$
(4)

where α, β and γ are the rotation parameters about the x-, y- and z-axes, respectively, and \({\mathbf{R}}_{x} (\alpha )\), \({\mathbf{R}}_{y} (\beta )\) and \({\mathbf{R}}_{z} (\gamma )\) are the corresponding rotation matrices.
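For reference, the three rotation matrices and the composite rotation later used in Eq. (13) can be written in MATLAB as follows (the example angles are illustrative only):

```matlab
% Rotation matrices of Eqs. (2)-(4) as anonymous functions (angles in radians).
Rx = @(a) [1, 0, 0;  0, cos(a), -sin(a);  0, sin(a), cos(a)];
Ry = @(b) [cos(b), 0, sin(b);  0, 1, 0;  -sin(b), 0, cos(b)];
Rz = @(g) [cos(g), -sin(g), 0;  sin(g), cos(g), 0;  0, 0, 1];

% Composite rotation later used in Eq. (13): R = Rz(gamma)*Ry(beta)*Rx(alpha).
R = Rz(pi/6) * Ry(-pi/8) * Rx(pi/4);      % example angles, illustrative only
```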

To avoid local optima, we still use multiple individuals to search for the global optimum, as shown in Fig. 4. We start by defining the point cloud of the target object as O(x, y, z) and the point cloud of the measured accommodation space as P(x, y, z).

Fig. 4
figure 4

Flowchart of pose estimation in the accommodation space based on improved FOA

(1) Initialize the FOA parameters:

$$ O_{g} (\overline{{x_{m} }} ,\overline{{y_{m} }} ,\overline{{z_{m} }} ) = \sum\limits_{i = 1}^{m} {O_{i} (x_{i} ,y_{i} ,z_{i} )} /m, $$
(5)
$$ P_{g} (\overline{{x_{n} }} ,\overline{{y_{n} }} ,\overline{{z_{n} }} ) = \sum\limits_{i = 1}^{n} {P_{i} (x_{i} ,y_{i} ,z_{i} )} /n, $$
(6)
$$ [O^{1} ,1]^{T} = \left[ {\begin{array}{*{20}c} {\mathbf{E}} & {O_{g} - P_{g} } \\ 0 & 1 \\ \end{array} } \right] \cdot [O,1]^{T} , $$
(7)
$$ V^{t} = alphaShape([O^{1} ;P_{g} ]), $$
(8)

where \(m\) and \(n\) are the numbers of points in the corresponding point clouds, \({\mathbf{E}}\) is the 3 × 3 identity matrix, \(V^{t}\) is the volume of the alpha-shape result for the two point clouds, and alphaShape() represents the alpha-shape algorithm.
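A short MATLAB sketch of this initialization is given below. It assumes that the intent of Eqs. (7)–(8) is to move the object centroid onto the centroid of the accommodation space and to merge the translated object with the full space cloud P (as later in Eq. (16)); the toy clouds again stand in for the measured data.

```matlab
% Sketch of step (1). Toy data as before; real clouds come from Sect. 3.1.
O = rand(500, 3) * 50;  P = rand(2000, 3) * 300;  alphaRadius = 250;

Og = mean(O, 1);                                % object centroid, Eq. (5)
Pg = mean(P, 1);                                % space centroid,  Eq. (6)
O1 = O + repmat(Pg - Og, size(O, 1), 1);        % object moved onto the space centroid, cf. Eq. (7)
Vt0 = volume(alphaShape([O1; P], alphaRadius)); % initial whole volume, cf. Eq. (8)
```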

(2) Initialize the population position parameters of FOA:

$$ \left\{ {\begin{array}{*{20}c} {X_{i}^{o} = \overline{{x_{n} }} } \\ {Y_{i}^{o} = \overline{{y_{n} }} } \\ {Z_{i}^{o} = \overline{{z_{n} }} } \\ \end{array} } \right., $$
(9)
$$ \left\{ {\begin{array}{*{20}c} {\alpha_{i}^{o} = 2\pi \cdot rand - \pi } \\ {\beta_{i}^{o} = 2\pi \cdot rand - \pi } \\ {\gamma_{i}^{o} = 2\pi \cdot rand - \pi } \\ \end{array} } \right., $$
(10)

where \(rand\) denotes a random number within the range (0, 1), \(X_{i}^{o}\), \(Y_{i}^{o}\) and \(Z_{i}^{o}\) denote the coordinates of the ith initial given position, and \(\alpha_{i}^{o}\), \(\beta_{i}^{o}\) and \(\gamma_{i}^{o}\) denote the ith initial random angle parameters.
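The population initialization of Eqs. (9)–(10) may be sketched as follows, one individual per row; the toy space cloud is a placeholder for the measured data, and the population size anticipates Sect. 3.2.

```matlab
% Toy space cloud as before; real data come from Sect. 3.1.
P  = rand(2000, 3) * 300;
Pg = mean(P, 1);

N    = 100;                               % population size (Sect. 3.2)
pos0 = repmat(Pg, N, 1);                  % [X Y Z] per individual, Eq. (9)
ang0 = 2*pi*rand(N, 3) - pi;              % [alpha beta gamma] per individual, Eq. (10)
```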

(3) The fruit flies use olfactory cues to search for food in random directions and over random distances:

$$ \left\{ {\begin{array}{*{20}c} {X_{i} = X_{i}^{o} + a_{1} \cdot rand - b_{1} } \\ {Y_{i} = Y_{i}^{o} + a_{2} \cdot rand - b_{2} } \\ {Z_{i} = Z_{i}^{o} + a_{3} \cdot rand - b_{3} } \\ \end{array} } \right., $$
(11)
$$ \left\{ {\begin{array}{*{20}c} {\alpha_{i} = \alpha_{i}^{o} + a_{4} \cdot rand - b_{4} } \\ {\beta_{i} = \beta_{i}^{o} + a_{5} \cdot rand - b_{5} } \\ {\gamma_{i} = \gamma_{i}^{o} + a_{6} \cdot rand - b_{6} } \\ \end{array} } \right., $$
(12)

where \(a_{1} \sim a_{6}\) and \(b_{1} \sim b_{6}\) are constants used to constrain the range of random numbers.
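With the step ranges reported later in Sect. 3.2, the random search of Eqs. (11)–(12) can be sketched as follows; the initial values are toy stand-ins for the step (2) outputs.

```matlab
% Toy initial values from step (2).
N = 100;  pos0 = repmat([150 150 150], N, 1);  ang0 = 2*pi*rand(N, 3) - pi;

a_pos = 2;       b_pos = 1;                  % a1..a3 = 2, b1..b3 = 1 -> step in (-1, 1)
a_ang = 0.1*pi;  b_ang = 0.05*pi;            % a4..a6, b4..b6 -> step in (-0.05*pi, 0.05*pi)

pos = pos0 + a_pos*rand(N, 3) - b_pos;       % Eq. (11)
ang = ang0 + a_ang*rand(N, 3) - b_ang;       % Eq. (12)
```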

(4) The alpha-shape volume of the object together with the accommodation space is taken as the judgment value of the taste concentration:

$$ {\mathbf{R}}_{i} = {\mathbf{R}}_{z} (\gamma_{i} ) \cdot {\mathbf{R}}_{y} (\beta_{i} ) \cdot {\mathbf{R}}_{x} (\alpha_{i} ), $$
(13)
$$ {\mathbf{T}}_{i} = [X_{i} ,Y_{i} ,Z_{i} ]^{T} , $$
(14)
$$ [O_{i}^{2} ,1]^{T} = \left[ {\begin{array}{*{20}c} {{\mathbf{R}}_{i} } & {{\mathbf{T}}_{i} } \\ 0 & 1 \\ \end{array} } \right] \cdot [O_{i}^{1} ,1]^{T} , $$
(15)
$$ PO_{i} (x,y,z) = [O_{i}^{2} ;P], $$
(16)
$$ V_{i}^{t} = alphaShape(PO_{i} (x,y,z)), $$
(17)
$$ S_{i} = V_{i}^{t} . $$
(18)
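The evaluation of the taste concentration, following Eqs. (13)–(18) as written, may be sketched as below; the toy inputs stand in for the translated object O1, the space cloud P and the candidate poses of steps (2)–(3).

```matlab
% Toy inputs (see the earlier sketches): translated object O1, space P, candidate poses.
O1 = rand(500, 3) * 50 + 125;  P = rand(2000, 3) * 300;  alphaRadius = 250;
pos = repmat(mean(P, 1), 20, 1) + 2*rand(20, 3) - 1;
ang = 2*pi*rand(20, 3) - pi;

Rx = @(a) [1 0 0; 0 cos(a) -sin(a); 0 sin(a) cos(a)];
Ry = @(b) [cos(b) 0 sin(b); 0 1 0; -sin(b) 0 cos(b)];
Rz = @(g) [cos(g) -sin(g) 0; sin(g) cos(g) 0; 0 0 1];

S = zeros(size(pos, 1), 1);
for i = 1:size(pos, 1)
    Ri  = Rz(ang(i,3)) * Ry(ang(i,2)) * Rx(ang(i,1));       % Eq. (13)
    Ti  = pos(i, :)';                                        % Eq. (14)
    Oi2 = (Ri * O1' + repmat(Ti, 1, size(O1, 1)))';          % Eq. (15)
    S(i) = volume(alphaShape([Oi2; P], alphaRadius));        % Eqs. (16)-(18)
end
```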

(5) The groups are sorted in ascending order according to the concentration value:

$$ [S\_s,\;S\_index] = {\text{sort}}(S), $$
(19)

where S_s is the vector of volume values after sorting the groups and S_index is the index of the corresponding individuals.

(6) The half of the individuals with smaller concentration values is selected and relabeled according to the concentration values:

$$ \left\{ {\begin{array}{*{20}c} {X_{j} = X(S\_index)} \\ {Y_{j} = Y(S\_index)} \\ {Z_{j} = Z(S\_index)} \\ \end{array} } \right., $$
(20)
$$ \left\{ {\begin{array}{*{20}c} {\alpha_{j} = \alpha (S\_index)} \\ {\beta_{j} = \beta (S\_index)} \\ {\gamma_{j} = \gamma (S\_index)} \\ \end{array} } \right., $$
(21)

where the range of \(S\_index\) is determined by the number of individuals in the half population, (Xj, Yj, Zj) denotes the new position, (αj, βj, γj) denotes the new posture, and the subscript j is the index of the individual.

(7) After step (6), return to step (3) and then step (4); the random direction and location are updated to obtain the taste concentration of a new half of the individuals. The pre-selected half and the new half are combined, and all individuals are sorted by taste concentration according to step (5). The iterative optimization loop stops when the end condition is satisfied.
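A minimal sketch of one selection pass of steps (5)–(6) is given below; S, pos and ang are the concentration values and candidate poses of step (4) (toy values are used here), and the fixed iteration limit mentioned in the comment is the one reported in Sect. 3.2.

```matlab
% Toy values standing in for the step (4) outputs.
S = rand(20, 1);  pos = rand(20, 3) * 300;  ang = 2*pi*rand(20, 3) - pi;

[S_s, S_index] = sort(S);                   % Eq. (19), ascending whole volume
half = floor(numel(S) / 2);
keep = S_index(1:half);                     % better (smaller-volume) half
pos0 = pos(keep, :);                        % new positions, Eq. (20)
ang0 = ang(keep, :);                        % new postures,  Eq. (21)
% Step (7): repeat steps (3)-(4) around pos0/ang0 to create a new half, merge
% it with the kept half, re-sort by taste concentration, and stop once the
% iteration limit (100 iterations in Sect. 3.2) is reached.
```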

3 Experimental results

3.1 Obtaining the target object

To obtain a sparse point cloud of the target, the object was placed in the common field of view of the left and right cameras, and the object images from both cameras were captured, as shown in Fig. 5a. Following the point cloud acquisition process, Otsu threshold segmentation was performed on the left-camera image; the background was then removed and the object region was retained, as shown in Fig. 5b. Harris feature points were detected on the segmented left image, as shown in Fig. 5c. The corresponding feature points in the right image were tracked from the detected left-image feature points, and the positions of the feature point pairs were obtained using KLT optical flow (Shi and Tomasi 1994), as shown in Fig. 5d. From the corresponding feature point positions in the left and right images and the calibrated camera parameters, the top point cloud was computed in the left-camera coordinate system and finally rotated and translated into the world coordinate system, as shown in Fig. 5e. The bottom point cloud was obtained using the same method, and the top and bottom point clouds were combined into a complete point cloud of the target object in the world coordinate system.
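A hedged MATLAB sketch of this pipeline, using Computer Vision Toolbox functions (graythresh/im2bw, detectHarrisFeatures, vision.PointTracker, triangulate), is given below; imgL, imgR and stereoParams are placeholders for the captured image pair and the calibrated stereo parameters, and the polarity of the Otsu mask may need to be inverted depending on the scene.

```matlab
% Otsu threshold segmentation of the left image and background suppression.
grayL = rgb2gray(imgL);
bw    = im2bw(grayL, graythresh(grayL));           % Otsu threshold mask
grayL(~bw) = 0;                                     % suppress the background region

% Harris features on the left image, tracked into the right image by KLT.
cornersL = detectHarrisFeatures(grayL);
tracker  = vision.PointTracker;                     % KLT optical flow tracker
initialize(tracker, cornersL.Location, grayL);
[pointsR, valid] = step(tracker, rgb2gray(imgR));

% Triangulate matched point pairs into 3-D points in the left-camera frame.
ptsL = cornersL.Location(valid, :);
ptsR = pointsR(valid, :);
topCloud = triangulate(ptsL, ptsR, stereoParams);
```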

Fig. 5
figure 5

The process of point cloud derived from binocular vision

3.2 Parameters setting result analysis

The initial parameters of the improved FOA are the population size, the number of iterations and the random direction and position parameters (\(a_{1} \sim a_{6}\) and \(b_{1} \sim b_{6}\)). In the experiments, the number of iterations is set to 100 and the radius α is set to 250. The range of the random position variation is limited to (− 1, 1) and the range of the random posture variation to (− 0.05\(\pi\), 0.05\(\pi\)); thus, \(a_{1} \sim a_{3}\) and \(b_{1} \sim b_{3}\) are set to 2 and 1, respectively, and \(a_{4} \sim a_{6}\) and \(b_{4} \sim b_{6}\) are set to 0.1\(\pi\) and 0.05\(\pi\), respectively. The software environment is Windows 10 64-bit with MATLAB 2015b, and the hardware is an Intel Core i7-7700 CPU with 8 GB of RAM.
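These settings can be collected in one place as follows (the struct name is illustrative only; the population size is the one given below):

```matlab
% Parameter settings of this section gathered in a single struct.
foa.populationSize = 100;                 % number of fruit flies (see below)
foa.maxIterations  = 100;                 % iteration limit
foa.alphaRadius    = 250;                 % alpha of the alpha-shape algorithm
foa.posStep        = [2, 1];              % a1..a3 = 2, b1..b3 = 1   -> range (-1, 1)
foa.angStep        = [0.1*pi, 0.05*pi];   % a4..a6, b4..b6           -> range (-0.05*pi, 0.05*pi)
```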

Based on the previous study, the point cloud of the target object is obtained by Harris feature point detection and KLT optical flow, as shown in Fig. 6a. The 3D accommodation space is set as a cuboid of 550 mm × 100 mm × 300 mm, as shown in Fig. 6b. The fruit fly population size is 100. Figure 6c shows the initial state; Fig. 6d shows the 10th iteration; Fig. 6e shows the 20th iteration; and Fig. 6f shows that the pose adjustment in the accommodation space is completed at the 40th iteration.

Fig. 6
figure 6

Iteration process and its final results

The space types are designed only just to accommodate the object, and the target object's final posture in these spaces is shown in Fig. 8. The four space types, namely the cube, hemisphere, cylinder and triangular prism, represent different shapes of common accommodation spaces. The proposed method can be applied to different convex spaces. The iterative convergence curves of the four space types are plotted in Fig. 7. The four curves show that the object is essentially placed in all four spaces after about 10 iterations. To test the effect of different values of the parameter α on the estimated pose, α is set to 60, 80, 100 and 150. As shown in Fig. 9, different values of α lead to different results, so it is important to choose a proper α value.

Fig. 7
figure 7

The iterative convergence process

Fig. 8
figure 8

The designed types of accommodation space

Fig. 9
figure 9

The alpha shape with different parameter α

3.3 Method comparison

Our previous work used the convex hull to construct the point clouds of the object and the accommodation space. The final object pose in the hemisphere space computed with the previous method is shown in Fig. 10. Comparing Fig. 8e, f and Fig. 10d with Fig. 10a–c shows that the alpha-shape algorithm is better suited to the placement requirements of real robot scenes than the convex hull algorithm. The previous work only ensures that the accommodation space can accommodate the target object, but does not give the best final pose for practical applications; the new study using the alpha-shape algorithm not only keeps the object inside the accommodation space, but also places it at the bottom of the space, which meets actual robot placement requirements.

Fig. 10
figure 10

The object pose estimation using convex hull

3.4 Hardware test

We used a SIASUN industrial robot (model SR6C), binocular vision (a 3D vision sensor, model Astra Mini) and a vacuum sucker to build the hardware test platform shown in Fig. 11a. An object with a more complex shape and a randomly placed cube box were considered, and their point clouds were acquired with the 3D sensor, as shown in Fig. 11b, c. The final pose of the complex object in the box space was estimated using the proposed method, as shown in Fig. 11d.

Fig. 11
figure 11

Hardware platform test of the complex object

4 Conclusion

In this paper, a method combining the alpha-shape algorithm and an improved fruit fly optimization algorithm is proposed to estimate the object pose in a 3D accommodation space. The proposed method uses the alpha-shape algorithm to establish the objective function, which uses the change of the whole volume to adjust the object pose. To obtain the best individual with six degrees of freedom, the iteration strategy of the improved FOA selects the better half of the individuals to produce the next half. Experiments were performed with the chosen parameters of the improved FOA and the four space types, and the results show that the proposed method can obtain the object pose in common accommodation spaces. Compared with the previous work using the convex hull, the new study using the alpha shape keeps the object pose at the bottom of the accommodation space without a gravity constraint, which meets actual robot stacking requirements. For an actual object placed in an enclosed space, future work will add constraints (such as elasticity and collision) and adjust the parameters (or add other algorithms) to shorten the estimation time. In addition, the objective function depends on the change of the whole alpha-shape volume, and the parameter α is directly related to the point density and the level of boundary detail. Since the proposed method requires choosing an appropriate parameter α, segmenting the point cloud of the accommodation space and building the 3D object, our future work will focus on an adaptive alpha-shape algorithm, acquiring the whole scene point cloud and segmenting the space point cloud.