Advanced Topics on Computer Vision, Control and Robotics in Mechatronics

Osslan Osiris Vergara Villegas • Manuel Nandayapa • Israel Soto
Editors
Editors

Osslan Osiris Vergara Villegas
Industrial and Manufacturing Engineering
Universidad Autónoma de Ciudad Juárez
Ciudad Juárez, Chihuahua, Mexico

Manuel Nandayapa
Industrial and Manufacturing Engineering
Universidad Autónoma de Ciudad Juárez
Ciudad Juárez, Chihuahua, Mexico

Israel Soto
Industrial and Manufacturing Engineering
Universidad Autónoma de Ciudad Juárez
Ciudad Juárez, Chihuahua, Mexico
This Springer imprint is published by the registered company Springer International Publishing AG
part of Springer Nature
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Part II Control
The second part of the book, related to control, focuses mainly on proposing intelligent control strategies for helicopters, manipulators, and robots.
Chapter 6 focuses on the field of cognitive robotics. It presents simulations of the autonomous learning process of an artificial agent controlled by artificial action-potential neural networks during an obstacle avoidance task.
Chapter 7 analyzes and implements hybrid force/position control using fuzzy logic on the Mitsubishi PA10-7CE, a seven-degrees-of-freedom robot arm.
Chapter 8 reports the kinematic and dynamic models of the 6-3-PUS-type Hexapod parallel mechanism and also covers the motion control of the Hexapod. In addition, the chapter describes the implementation of two motion tracking controllers on a real Hexapod robot.
The application of a finite-time nonlinear proportional–integral–derivative (PID) controller to a five-bar mechanism, for set-point control, is presented in Chap. 9. The stability analysis of the closed-loop system shows global finite-time stability of the system.
Finally, Chap. 10 deals with the tracking control problem of a three-degrees-of-freedom helicopter. The control problem is solved using nonlinear H∞ synthesis for time-varying systems. The proposed method considers external perturbations and parametric variations.
Chapter 11 proposes a novel two-degrees-of-freedom ankle rehabilitation parallel robot consisting of two linear guides. A serious game and a facial expression recognition system were also added for entertainment and to improve patient engagement in the rehabilitation process.
Chapter 12 explains the new challenges in the area of cognitive robotics. In
addition, two low-level cognitive tasks are modeled and implemented in an artificial
agent. In the first experiment an agent learns its body map, while in the second
experiment the agent acquires a distance-to-obstacles concept.
Chapter 13 covers a review of applications of two novel technologies known as haptic systems and virtual environments. The applications are divided into two categories: training and assistance. For each category, the fields of education, medicine, and industry are addressed.
The aerodynamic analysis of a bio-inspired three-degrees-of-freedom articulated flat empennage is presented in Chap. 14. The proposal mimics the way the tail of some birds moves.
Finally, the problem of performing different tasks with a group of mobile robots is addressed in Chap. 15. To cope with issues like regulation to a point or trajectory tracking, a consensus scheme is considered. The proposal was validated with a group of three differential mobile robots.
We would also like to thank all our book contributors and the many other participants whose submitted chapters could not be included in the book; we value your effort enormously. Finally, we would like to thank our chapter reviewers, whose effort helped us sustain the high quality of the book.
Part II Control
6 Learning in Biologically Inspired Neural Networks
for Robot Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Diana Valenzo, Dadai Astorga, Alejandra Ciria and Bruno Lara
1.1 Introduction
Medical ultrasound (US) is a low-cost, real-time, and noninvasive technique that requires processing signals at high speed (Adamo et al. 2013). This imaging modality has several advantages over computed tomography (CT), positron emission tomography (PET), and magnetic resonance imaging (MRI), especially in obstetric applications, where radiation or the injection of a radiotracer can be harmful to the fetus. Besides, in medical US, the patient does not have to remain still. However, US images are inherently contaminated with speckle noise because ultrasound is a coherent imaging system.
In the past, several methods to denoise US medical images have been proposed. However, many of them apply strategies designed for additive Gaussian noise: before filtering, the noisy image is transformed into an additive process by taking the logarithm of the image. Then, assuming that the noise is an additive Gaussian process, a Wiener filter (Portilla et al. 2001) or a wavelet shrinkage method (Pizurica et al. 2003; Rizi et al. 2011; Tian and Chen 2011; Premaratne and Premaratne 2012; Fu et al. 2015) is applied to remove the noise component. Nevertheless, in (Oliver and Quegan 2004; Goodman 2007; Huang et al. 2012), the authors study speckle noise and indicate that the suitable distribution for this type of noise is a Gamma or a Rayleigh distribution.
Denoising methods are divided into spatial filtering (Lee 1980; Frost et al. 1982; Kuan et al. 1985), transform methods (Argenti and Alparone 2002; Xie et al. 2002; Pizurica et al. 2003; Rizi et al. 2011; Tian and Chen 2011; Premaratne and Premaratne 2012), and, more recently, regularization methods for image reconstruction and restoration (Aubert and Aujol 2008; Shi and Osher 2008; Huang et al. 2009; Nie et al. 2016a, b). Regularization methods are based on partial differential equations; the first denoising filter for multiplicative noise was the total variation (TV) method proposed in (Rudin et al. 2003). However, the problem of the TV regularization method is that it produces a staircase effect in smooth regions. In other words, the texture features are not restored. Hence, other regularization methods introduce an extra term to the functional, named the prior, that works together with the TV and data fidelity terms (Nie et al. 2016b) to overcome the piecewise-constant behavior in smooth regions.
Despite the results obtained with the transform and variational methods, their computational burden limits real-time implementation: the former must change to a transform domain and, after removing the noise, return to the spatial domain, while variational methods need several iterations to converge and are usually very complicated to implement on a fixed-point processor.
In this chapter, a comparative analysis of the performance of several filters to reduce the speckle effect in US medical images is presented. The filters are especially designed for multiplicative noise, operate in the spatial domain, and were programmed on the DM6437 digital signal processor (DSP) from Texas Instruments™ (TI) to study their performance. This processor is also known as the digital media (DM) 6437.
The chapter is organized as follows. In Sect. 1.2, a literature review is given. In Sect. 1.3, the methods used in this research and the metrics to measure the performance are explained. In Sect. 1.4, the experimental results are presented. The chapter concludes in Sect. 1.5.
1.2 Literature Review
The aim of this section is to provide a brief and useful description of the hardware
and techniques implemented in DSP for different image processing applications.
In Xuange et al. (2009), the authors propose an online hydrological sediment detection system based on image processing. It consists of an image collection subsystem, an image transport subsystem, a network transmission subsystem, and an ARM-based processing subsystem built around the PXA270 processor. The system acquires images of mountain rivers online and performs the hydrological analysis of sediment. The denoising algorithm uses the wavelet transform. However, the overall performance is not reported.
In Bronstein (2011), the design of a bilateral filter for noise removal is carried out for a parallel single instruction, multiple data (SIMD)-type architecture using a sliding window. For each pixel, in raster order, the neighbor pixels within a window around it are taken and used to compute the filter output; then the window is moved right by one pixel, and so on. This implementation is optimized for window sizes between 10 and 20 to keep the complexity low. It approximates the performance of the bilateral filter in terms of root mean square error (RMSE), and the proposed implementation can operate in real time.
In Lin et al. (2011), the authors propose a novel restoration algorithm based on the super-resolution concept using wavelet decomposition, implemented on the OMAP3530 platform, and demonstrate the effectiveness of the image restoration. The architecture used is designed to provide good-quality video, image, and graphics processing. To verify the execution time of the algorithm, they use four different methods: a Cortex-A8-only implementation, a Cortex-A8 + NEON implementation, a DSP-only implementation, and a dual-core implementation. Method 2 shows the best performance. Methods 3 and 4 did not perform best because the proposed algorithm involves heavy floating-point computation, which is not supported by the fixed-point C64x+ DSP. For the well-known Lena, Baboon, Barbara, and Peppers images of size 256 × 256, they report execution times from 1.41 to 2.5 s with PSNRs of 32.78, 24.49, 25.36, and 31.43 dB, respectively, using the dual-core implementation, outperforming the bilinear and bicubic algorithms.
In Zoican (2011), the author develops an algorithm that reduces impulsive noise in still images, removing more than 90% of the noise. The algorithm presented is a modification of the median filter, which is typically applied uniformly across the image. To avoid this and reduce the noise, the author uses a modified median filter, where an impulse detection algorithm is applied before filtering to select the pixels to be modified. The algorithm is non-parametric, in contrast with the progressive median algorithm, which must be predetermined with four parameters. The performance of the new algorithm is evaluated by measuring the mean square error (MSE) and the peak signal-to-noise ratio (PSNR). The results show the efficiency of the new algorithm compared with the progressive median algorithm, with a similar computational burden. However, the proposal targets small images using the BF5xx (Analog Devices Inc.™) DSP family.
In Akdeniz and Tora (2012), the authors present a study of the balanced contrast limited adaptive histogram equalization (BCLAHE) implementation for infrared images on an embedded platform, achieving real-time performance on a target that uses the dual-processor OMAP3530. The debug access port (DAP) and the advanced RISC machine (ARM) are optimized to obtain a significant speed increase. The performance analysis is done over infrared images with different dynamic ranges. The implementation reached real-time processing at 28 FPS with 16-bit images.
In Dallai and Ricci (2014), the authors present a real-time implementation of a bilateral filter for the TMS320DM64x+ DSPs. Real-time capability was achieved through code optimization and exploitation of the DSP architecture. The filter, tested on the ULA-OP scanner, processes 192 × 512 images at 40 FPS. The images are obtained from a phantom and in vivo.
In Zhuang (2014), the author develops a system to enhance images using the dual-core TI DaVinci DM6467T, with the MontaVista Linux operating system running on the ARM subsystem to handle the I/O and the results of the DSP. The results show that the system provides capabilities equivalent to an x86 computer, processing 25 FPS at D1 resolution (704 × 480 pixels).
Finally, in Fan et al. (2016), the authors focus on the optimized implementation of a linear lane detection system based on multiple image preprocessing methods and an efficient Hough transform. To evaluate the performance of the real-time algorithm, the TMS320C6678 DSP was used. Lane detection takes up only a small portion of the processing time and should be implemented with much higher performance than 25 frames per second (FPS) to make room for the rest of the system. The linear detection algorithm presented is faster than real time, achieving over 81 FPS on a multicore DSP with eight cores, each running at 1.25 GHz. The algorithm was programmed in C to achieve compatibility across multiple platforms, especially DSPs. To develop a faster-than-real-time algorithm, they use DSP optimizations such as a restricted search area, an efficient Hough transform, and better memory allocation. Also, to reduce the noise accumulated in the Hough transform and decrease the processing time, Gaussian blur, edge thinning, and edge elimination are used.
1.3 Methods
This section introduces the methods used throughout this research, including US image formation, the model of an image with speckle noise, and the classic filtering strategies to remove it. A brief description of the DSP, as well as the metrics used, is also included.
1.3.1.1 B-Mode
(Kang et al. 2016; Wen et al. 2016; Li et al. 2017; Singh et al. 2017). The multiplicative noise can be expressed as

$g = f\,n + v \qquad (1.1)$

where $g$ and $f$ are the noisy and the noise-free images, respectively, and $n(m, n)$ and $v$ are the multiplicative and additive noise components in the image. The effect of additive noise is considered small compared with that of multiplicative noise (coherent interference), $\|v\|_2 \ll \|n\|_2$; then Eq. (1.1) becomes

$g \approx f\,n \qquad (1.2)$
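As an illustration, the following C sketch applies the multiplicative model of Eq. (1.2) to a noise-free image. Simulating $n$ as a Gaussian variable with unit mean and variance sigma2 is an assumption we make for illustration, not part of the original formulation.

```c
#include <stdlib.h>
#include <math.h>

/* Box-Muller transform: one zero-mean, unit-variance Gaussian sample. */
static float gauss(void)
{
    double u1 = ((double)rand() + 1.0) / ((double)RAND_MAX + 2.0);
    double u2 = ((double)rand() + 1.0) / ((double)RAND_MAX + 2.0);
    return (float)(sqrt(-2.0 * log(u1)) * cos(2.0 * 3.14159265358979 * u2));
}

/* g[i] = f[i] * n[i], with n ~ N(1, sigma2): multiplicative speckle only,
 * the additive term v of Eq. (1.1) being neglected as in Eq. (1.2). */
void add_speckle(const float *f, float *g, int npix, float sigma2)
{
    float sd = sqrtf(sigma2);
    for (int i = 0; i < npix; i++)
        g[i] = f[i] * (1.0f + sd * gauss());
}
```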
Reducing the noise without blurring the edges is the core problem of speckle reduction in US images. Speckle suppression in ultrasound images is usually done by techniques applied directly in the original image domain, like the median (Maini and Aggarwal 2009), Lee (1980), Frost et al. (1982), and Kuan et al. (1985) filters, which achieve very good speckle reduction in homogeneous areas but ignore the speckle noise in areas close to edges and lines. Perona and Malik (1990) developed a method called anisotropic diffusion, based on the heat equation. It works well in homogeneous areas with edge preservation for an image corrupted by additive noise, but its performance is poor for speckle, which is a multiplicative noise. Then, Yu and Acton (2002) introduced a method called speckle reducing anisotropic diffusion (SRAD). In this method, the diffusion coefficient, which defines the amount of smoothing, is based on the ratio of the local standard deviation to the mean, calculated over a nearest-neighbor window, so that homogeneous regions are smoothed while the edges and structural content of the image are preserved. The median, Lee, Kuan, Frost, and SRAD filters were programmed on the DSP.
The median filter preserves the edges and reduces the blur in images. If the window length is $2k+1$, the filtering is given by Eq. (1.3),

$\hat{f}(m, n) = \mathrm{med}\left[\,g(m+i,\, n+j) : -k \le i, j \le k\,\right] \qquad (1.3)$

where $\mathrm{med}[\cdot]$ is the median operator. To find the median value, it is necessary to sort all the intensities in a neighborhood into ascending numerical order. This is a computationally complex process due to the time needed to sort the pixels to find the median value of the window.
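A minimal C sketch of the median filtering just described is shown below; copying the border pixels unfiltered and using insertion sort are implementation choices of ours, not prescribed by the chapter.

```c
#include <string.h>

static void isort(unsigned char *v, int n)       /* insertion sort */
{
    for (int i = 1; i < n; i++) {
        unsigned char key = v[i];
        int j = i - 1;
        while (j >= 0 && v[j] > key) { v[j + 1] = v[j]; j--; }
        v[j + 1] = key;
    }
}

/* (2k+1) x (2k+1) median filter of Eq. (1.3) on a W x H 8-bit image. */
void median_filter(const unsigned char *g, unsigned char *out,
                   int W, int H, int k)
{
    unsigned char win[49];                       /* enough for up to 7 x 7 */
    memcpy(out, g, (size_t)W * H);               /* keep borders as-is */
    for (int y = k; y < H - k; y++)
        for (int x = k; x < W - k; x++) {
            int n = 0;
            for (int dy = -k; dy <= k; dy++)
                for (int dx = -k; dx <= k; dx++)
                    win[n++] = g[(y + dy) * W + (x + dx)];
            isort(win, n);                       /* ascending order */
            out[y * W + x] = win[n / 2];         /* median of the window */
        }
}
```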
The Lee filter is popular in the image processing community for despeckling and enhancing SAR images. The Lee filter and other similar sigma filters reduce multiplicative noise while preserving image sharpness and details. It uses a sliding window that computes a value from the neighbors of the central window pixel and replaces the central pixel with the computed value. The filter calculates the variance of the window: if the variance is low, smoothing is performed; on the other hand, if the variance is high, an edge is assumed and smoothing is not performed.
Therefore, the filtered pixel can be written as

$\hat{f} = \bar{g} + W\,(g - \bar{g}), \qquad (1.4)$

with the weight

$W = \dfrac{\sigma_f^2}{\sigma_f^2 + \bar{g}^2/\mathrm{ENL}}, \qquad (1.5)$

where $\bar{g}$ is the local mean of the window.
The Lee filter is a case of the Kuan filter (Kuan et al. 1985) without the term $\sigma_f^2/\mathrm{ENL}$.
The Kuan filter (Kuan et al. 1985) is an adaptive noise smoothing filter that has a simple structure and does not require any prior information about the image. The filter rewrites the multiplicative noise model of Eq. (1.2) as an additive model of the form

$g = f + (n - 1)f. \qquad (1.6)$
Assuming unit mean noise, the estimated pixel value in the local window is

$\hat{f} = \bar{g} + W\,(g - \bar{g}), \qquad W = \dfrac{\sigma_f^2}{\sigma_f^2 + (\bar{g}^2 + \sigma_f^2)/\mathrm{ENL}}, \qquad (1.7)$

with

$\sigma_f^2 = \dfrac{\mathrm{ENL}\,\sigma_g^2 - \bar{g}^2}{\mathrm{ENL} + 1}, \qquad (1.8)$

and

$\mathrm{ENL} = \left(\dfrac{\mathrm{Mean}}{\mathrm{StDev}}\right)^2 = \dfrac{\bar{g}^2}{\sigma_g^2}. \qquad (1.9)$

The equivalent number of looks (ENL) estimates the noise level and is calculated in a uniform region of the image. One shortcoming of this filter is that the ENL parameter needs to be computed beforehand.
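The following C sketch computes one output pixel for the Lee and Kuan filters using the local statistics and the ENL of Eqs. (1.7)-(1.9). The weight expressions follow the standard formulations and should be read as an assumption consistent with the text, not as the authors' exact DSP code.

```c
#include <math.h>

/* Local mean and variance over the (2k+1) x (2k+1) window at (x, y). */
static void local_stats(const float *g, int W, int x, int y, int k,
                        float *mean, float *var)
{
    float s = 0.0f, s2 = 0.0f;
    int n = (2 * k + 1) * (2 * k + 1);
    for (int dy = -k; dy <= k; dy++)
        for (int dx = -k; dx <= k; dx++) {
            float v = g[(y + dy) * W + (x + dx)];
            s += v; s2 += v * v;
        }
    *mean = s / n;
    *var  = s2 / n - (*mean) * (*mean);
}

/* One filtered pixel; kuan != 0 keeps the sigma_f^2/ENL term of Eq. (1.7),
 * kuan == 0 gives the Lee weight of Eq. (1.5). */
float lee_kuan_pixel(const float *g, int W, int x, int y, int k,
                     float enl, int kuan)
{
    float mean, var;
    local_stats(g, W, x, y, k, &mean, &var);
    float sf2 = (enl * var - mean * mean) / (enl + 1.0f);   /* Eq. (1.8) */
    if (sf2 < 0.0f) sf2 = 0.0f;                 /* clamp noisy estimates */
    float den = sf2 + (mean * mean + (kuan ? sf2 : 0.0f)) / enl;
    float w = (den > 0.0f) ? sf2 / den : 0.0f;  /* ~0 smooths, ~1 keeps edge */
    return mean + w * (g[y * W + x] - mean);
}
```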
The Frost filter (Frost et al. 1982) is an adaptive, exponentially weighted averaging filter that reduces multiplicative noise while preserving edges. It works with a window of size $2k+1$, replacing the central pixel with a sum of exponentially weighted terms. The weighting factors depend on the distance to the central pixel, the damping factor, and the local variance: the farther a pixel is from the central pixel, the smaller its weight, and the weights become more concentrated on the central pixel as the variance in the window increases. The filter convolves the pixel values within the window with the exponential impulse response

$m = e^{-K\,\alpha_g^2\,|i - i_0|}, \qquad (1.10)$

where $K$ is the filter parameter, $i_0$ is the window central pixel, and $|i - i_0|$ is the distance measured from the window central pixel. The coefficient of variation is defined as $\alpha_g = \sigma_g/\bar{g}$, where $\bar{g}$ and $\sigma_g$ are the local mean and standard deviation of the window, respectively.
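A sketch of the Frost estimate follows; the weight $m = e^{-K\alpha_g^2 d}$, with $d$ the distance to the central pixel, matches the impulse response of Eq. (1.10), while the normalization by the sum of weights is a common convention we assume here.

```c
#include <math.h>

/* Frost-filtered value of the pixel at (x, y); K is the damping factor. */
float frost_pixel(const float *g, int W, int x, int y, int k, float K)
{
    float s = 0.0f, s2 = 0.0f;
    int n = (2 * k + 1) * (2 * k + 1);
    for (int dy = -k; dy <= k; dy++)
        for (int dx = -k; dx <= k; dx++) {
            float v = g[(y + dy) * W + (x + dx)];
            s += v; s2 += v * v;
        }
    float mean = s / n;
    float var  = s2 / n - mean * mean;
    float a2   = (mean > 0.0f) ? var / (mean * mean) : 0.0f; /* alpha_g^2 */

    float num = 0.0f, den = 0.0f;
    for (int dy = -k; dy <= k; dy++)
        for (int dx = -k; dx <= k; dx++) {
            float d = sqrtf((float)(dx * dx + dy * dy));   /* |i - i0| */
            float m = expf(-K * a2 * d);        /* more weight near center */
            num += m * g[(y + dy) * W + (x + dx)];
            den += m;
        }
    return num / den;                           /* normalized weighted sum */
}
```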
The speckle reducing anisotropic diffusion (SRAD) filter (Yu and Acton 2002) is obtained by rearranging Eq. (1.6) as

$\hat{f} = g + k_0\,(\bar{g} - g). \qquad (1.11)$

The term $(\bar{g} - g)$ approximates the Laplacian operator (with $c = 1$), so the estimate can be expressed as

$\hat{f} = g + k_0\,\mathrm{div}(\nabla g), \qquad (1.12)$

which leads to the diffusion equation

$\dfrac{\partial g}{\partial t} = \mathrm{div}\left[c(q)\,\nabla g\right], \qquad g(x, y; 0) = g_0(x, y), \qquad \left.\dfrac{\partial g}{\partial \vec{n}}\right|_{\partial\Omega} = 0, \qquad (1.13)$

where $c(q)$ is the diffusion coefficient and $q(x, y; t)$ is an edge detector (the instantaneous coefficient of variation). The last boundary condition states that the derivative of the function along the outer normal, at the image boundary, must vanish. This assures that the average brightness will be preserved.
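A simplified C sketch of one SRAD iteration is given below. It uses the four-neighbor Laplacian and the diffusion coefficient of Yu and Acton (2002); treating $c$ as constant within each pixel update is a simplification of the full half-point scheme, and the noise scale q0 is assumed to be estimated from a homogeneous region.

```c
#include <math.h>
#include <string.h>

/* One explicit SRAD step on a W x H float image g; tmp is scratch space of
 * the same size, dt is the time step, q0 the speckle scale. Borders are
 * left untouched for brevity. */
void srad_step(float *g, float *tmp, int W, int H, float dt, float q0)
{
    memcpy(tmp, g, sizeof(float) * (size_t)W * H);
    float q02 = q0 * q0;
    for (int y = 1; y < H - 1; y++)
        for (int x = 1; x < W - 1; x++) {
            float c0 = tmp[y * W + x];
            if (c0 <= 0.0f) continue;           /* avoid division by zero */
            float dn = tmp[(y - 1) * W + x] - c0;   /* finite differences */
            float ds = tmp[(y + 1) * W + x] - c0;
            float de = tmp[y * W + x + 1] - c0;
            float dw = tmp[y * W + x - 1] - c0;
            float g2  = dn * dn + ds * ds + de * de + dw * dw;
            float lap = dn + ds + de + dw;
            /* instantaneous coefficient of variation q^2 (edge detector) */
            float num = 0.5f * g2 / (c0 * c0)
                      - 0.0625f * (lap / c0) * (lap / c0);
            float den = 1.0f + 0.25f * lap / c0;
            float q2  = num / (den * den);
            /* c(q): close to 1 in homogeneous areas, close to 0 at edges */
            float c = 1.0f / (1.0f + (q2 - q02) / (q02 * (1.0f + q02)));
            if (c < 0.0f) c = 0.0f;
            g[y * W + x] = c0 + dt * c * lap;   /* explicit update */
        }
}
```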
The C64x+ DSP core contains eight functional units (.M1, .L1, .D1, .S1, .M2, .L2, .D2, and .S2); each one can execute one instruction every clock cycle. The .M functional units perform multiply operations. Each .M unit can perform one of the following per clock cycle: one 32 × 32-bit multiply, one 16 × 16-bit multiply, two 16 × 16-bit multiplies, two 16 × 16-bit multiplies with add/subtract capabilities, four 8 × 8-bit multiplies with add operations, or four 16 × 16-bit multiplies with add/subtract capabilities. The .M units also support complex multiply (CMPY) instructions that take four 16-bit inputs and produce a 32-bit packed output containing 16-bit real and 16-bit imaginary values. The 32 × 32-bit multiply instructions provide the extended precision necessary for audio and other high-precision algorithms on a variety of signed and unsigned 32-bit data types.
The .S and .L units perform a general set of arithmetic, logical, and branch functions. The .D units primarily load data from memory to the register file and store results from the register file into memory. The core also has two general-purpose register files (A and B) and two data paths; each file contains 32 registers of 32 bits, for a total of 64 registers. The .L (arithmetic logic) units can perform parallel add/subtract operations on a pair of common inputs. Versions of these instructions exist that work on 32-bit data or on pairs of 16-bit data, performing dual 16-bit add–subtracts in parallel.
The DM6437 evaluation module (EVM) is a platform for evaluating and developing applications for the TI DaVinci processor family. The EVM board includes a TI DM6437 processor operating at up to 600 megahertz (MHz); one video decoder supporting composite or S-video; four video digital-to-analog converter (DAC) outputs (component; red, green, blue (RGB); and composite); 128 megabytes (MB) of double data rate synchronous dynamic random-access memory (DDR2 DRAM); one universal asynchronous receiver-transmitter (UART) and a programmable input/output device for controller area network (CAN I/O); 16 MB of non-volatile flash memory; 64 MB of NAND flash memory; 2 MB of static random-access memory (SRAM); a low-power stereo codec (AIC33); an inter-integrated circuit (I2C) interface with onboard electrically erasable programmable read-only memory (EEPROM) and expanders; a 10/100 megabits per second (Mbps) Ethernet interface; configurable boot load options; an embedded emulation interface known as joint test action group (JTAG); four user light-emitting diodes (LEDs) and a four-position user switch; a single-voltage power supply (5 volts); expansion connectors for daughter-card use; a full-duplex serial bus that performs transmit and receive operations separately for connecting one or more external physical devices, which are mapped to local physical address space and appear as if they were on the internal bus of the DM6437 processor; and one Sony/Philips digital interface format (S/PDIF) output to transmit digital audio.
The EVM is designed to work with Code Composer Studio, which communicates with the board through the embedded emulator or an external JTAG emulator. Figure 1.2 shows the block diagram of the EVM. The US images were loaded into the memory of the EVM using the JTAG emulator; after the process finished, a copy of the clean image was sent back to the computer and to the video port to be displayed on a monitor.
Figure 1.3 shows the memory address space of the DM6437. Portions of memory can be remapped in software; the total amount of memory for data, program code, and video is 128 MB. In this work, the US images were allocated in DDR memory at (unsigned char *)0x80000000. This memory has a dedicated 32-bit bus.
Fig. 1.2 Block Diagram of the EVM DM6437 (Texas Instruments 2006)
For processing purposes, ten memory regions were allocated in DDR memory to store the filtered images. Before being sent to the display, the images are reshaped to 480 × 720.
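To make the memory layout concrete, the following C fragment sketches how the source image and the ten result regions can be addressed from the DDR base used in the text; the back-to-back placement of the regions is our assumption, since the chapter does not give the actual offsets.

```c
#include <stdint.h>

#define DDR_BASE   0x80000000u                  /* DDR2 window of the DM6437 */
#define IMG_W      720
#define IMG_H      480
#define IMG_BYTES  ((uint32_t)IMG_W * IMG_H)    /* one 8-bit 480 x 720 frame */

/* Source image, as in the text: (unsigned char *)0x80000000. */
static unsigned char *const src = (unsigned char *)DDR_BASE;

/* Result region k (k = 0..9); hypothetical back-to-back placement. */
static unsigned char *result_region(int k)
{
    return (unsigned char *)(DDR_BASE + IMG_BYTES * (uint32_t)(k + 1));
}
```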
1.3.4 Metrics
The metrics used to evaluate the denoising performance are the mean square error (MSE), the peak signal-to-noise ratio (PSNR), and the signal-to-noise ratio (SNR), defined as

$\mathrm{MSE}(x, y) = \dfrac{1}{MN}\displaystyle\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\left\|x(i,j) - y(i,j)\right\|^2, \qquad (1.15)$

$\mathrm{PSNR} = 10\log_{10}\dfrac{v_{\max}^2}{\mathrm{MSE}(x,y)}, \qquad (1.16)$

$\mathrm{SNR} = 10\log_{10}\dfrac{\sigma_y^2}{\mathrm{MSE}(x,y)}, \qquad (1.17)$
where $x$ is the original image, $y$ is the recovered image after denoising, and $v_{\max}$ is the maximum possible value in the range of the signals. The SSIM factor (Wang et al. 2004) is calculated as

$\mathrm{SSIM}(x, y) = \dfrac{\left(2\mu_x\mu_y + c_1\right)\left(2\sigma_{xy} + c_2\right)}{\left(\mu_x^2 + \mu_y^2 + c_1\right)\left(\sigma_x^2 + \sigma_y^2 + c_2\right)}, \qquad (1.18)$

where $\mu_x$ and $\mu_y$ are the mean values of $x$ and $y$; $\sigma_x^2$, $\sigma_y^2$, and $\sigma_{xy}$ are the variances and covariance of $x$ and $y$; and $c_1$ and $c_2$ are constant terms. Another metric derived from the SSIM is the MSSIM of Eq. (1.19),

$\mathrm{MSSIM} = \dfrac{1}{M}\displaystyle\sum_{j=1}^{M}\mathrm{SSIM}(x_j, y_j), \qquad (1.19)$
and the contrast metric is

$C_B = \dfrac{\left|\mu_{FR} - \mu_{BR}\right|}{\left|\mu_{FP} + \mu_{BP}\right|}, \qquad (1.20)$

where $\mu_{FR}$ and $\mu_{BR}$ are the mean values of the foreground and background of the recovered image, and $\mu_{FP}$ and $\mu_{BP}$ are the mean values of the foreground and background of the synthetic image, both obtained on homogeneous regions.
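The C sketch below implements the MSE, PSNR, and SNR of Eqs. (1.15)-(1.17) for images stored as flat float arrays; the windowed statistics needed by the SSIM and MSSIM of Eqs. (1.18)-(1.19) are omitted for brevity.

```c
#include <math.h>

/* Eq. (1.15): mean square error between reference x and result y. */
float mse(const float *x, const float *y, int npix)
{
    double acc = 0.0;
    for (int i = 0; i < npix; i++) {
        double d = x[i] - y[i];
        acc += d * d;
    }
    return (float)(acc / npix);
}

/* Eq. (1.16): vmax is the peak value of the range (255 for 8-bit data). */
float psnr(const float *x, const float *y, int npix, float vmax)
{
    return 10.0f * log10f(vmax * vmax / mse(x, y, npix));
}

/* Eq. (1.17): variance of the recovered image over the MSE. */
float snr(const float *x, const float *y, int npix)
{
    double s = 0.0, s2 = 0.0;
    for (int i = 0; i < npix; i++) { s += y[i]; s2 += y[i] * y[i]; }
    double var = s2 / npix - (s / npix) * (s / npix);
    return 10.0f * log10f((float)(var / mse(x, y, npix)));
}
```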
1.4 Results
In this section, the performance of the filters is evaluated on synthetic and on real data. The phantom of a fetus (Center for Fast Ultrasound Imaging 2017) was contaminated with speckle noise of different variances and uploaded to the memory of the board. Then, the filtering process was applied to the noisy image, and the resulting clean image was sent back to the computer. Different metrics were calculated using the clean phantom as a reference.
Fig. 1.4 System configuration: Code Composer software, interface to the DM6437 EVM, and the display unit
Figure 1.4 shows the system configuration used to process the US images. Code Composer is used to program the processor and to upload the image to the DDR memory. The interface connects the computer to the module, and the module sends the image to a display and to the computer for visualization and performance evaluation purposes, respectively. In the next sections, the results on synthetic and real data are presented.
The synthetic images (Center for Fast Ultrasound Imaging 2017) and the speckle noise model are considered for the experiments, and different metrics are used to compare several methods objectively. Figure 1.5 shows the original image and the images affected by different speckle noise levels.
In this experiment, the synthetic image of Fig. 1.5 (Center for Fast Ultrasound Imaging 2017) was corrupted with different levels of noise. The synthetic image (phantom) was modified according to the national television system committee (NTSC) standard to an 8-bit image of 480 × 720 pixels for display purposes. The speckle noise process applied to the synthetic image follows the model of Eq. (1.2). Seven different levels of noise variance were tested by setting $\sigma = \{0.02, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3\}$. To assess the denoising methods, the metrics defined in Sect. 1.3.4 were computed between the synthetic and the reconstructed images. Quantitative results are shown in Tables 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, and 1.7.
Fig. 1.5 Synthetic images, from left to right. First row shows the original image and the images
contaminated with a speckle noise variance of 0.02 and 0.05, respectively. Second row shows the
original image contaminated with a speckle noise variance of 0.1, 0.15, and 0.2, respectively, and
the third row shows the original image contaminated with a speckle noise variance of 0.25 and 0.3,
respectively
Table 1.1 Results with filters applied to the affected image with 0.02 of noise variance
Filter MSE PSNR SNR CB SSIM MSSIM SI FOM
Speckle 0.02 489.099 21.236 10.756 1.116 0.3621 0.4766 296.593 0.7797
Median 3 × 3 647.448 20.018 9.538 1.119 0.3192 0.5623 626.425 0.7257
Median 5 × 5 859.717 18.787 8.307 1.119 0.2751 0.6035 569.047 0.7166
Median 7 × 7 1082.408 17.786 7.306 1.118 0.2503 0.6104 556.798 0.6665
Lee 3 × 3 845.326 18.860 8.380 1.112 0.2755 0.5747 111.047 0.8868
Lee 5 × 5 1300.727 16.988 6.508 1.016 0.1855 0.4729 19.122 0.7730
Lee 7 × 7 1391.133 16.697 6.217 1.107 0.1384 0.5112 18.385 0.6531
Kuan (15 it.) 2106.461 14.895 4.415 1.065 0.1319 0.5213 363.414 0.5547
Frost 5 × 5 812.249 19.033 8.553 1.109 0.2552 0.6142 147.586 0.8072
SRAD (15 it.) 287.326 23.547 13.067 1.113 0.3830 0.7129 1205.296 0.8352
Table 1.2 Results with filters applied to the affected image with 0.2 of noise variance
Filter MSE PSNR SNR CB SSIM MSSIM SI FOM
Speckle 0.2 2622.640 13.943 3.463 0.982 0.2960 0.3126 29.194 0.3072
Median 3 × 3 1428.633 16.58 6.101 0.976 0.2212 0.3293 145.724 0.3524
Median 5 × 5 1410.025 16.638 6.158 0.976 0.1843 0.3864 124.430 0.4327
Median 7 × 7 1578.258 16.149 5.669 0.981 0.1504 0.4269 89.081 0.5420
Lee 3 × 3 1410.097 16.638 6.158 0.985 0.2230 0.3616 32.443 0.4055
Lee 5 × 5 1418.131 16.613 6.133 0.978 0.1724 0.4465 16.415 0.7803
Lee 7 × 7 1627.629 16.015 5.535 0.976 0.1206 0.4561 9.122 0.6280
Kuan (15 it.) 2357.738 14.405 3.925 0.952 0.0923 0.4601 6.631 0.5881
Frost 5 × 5 1082.135 17.787 7.307 0.978 0.2840 0.5181 97.837 0.5566
SRAD (15 it.) 1949.973 15.230 4.750 0.993 0.2809 0.3371 68.303 0.3434
Table 1.3 Results with filters applied to the affected image with 0.05 noise variance
Filter MSE PSNR SNR CB SSIM MSSIM SI FOM
Speckle 0.05 721.608 19.547 9.067 0.9979 0.4229 0.4590 126.939 0.6122
Median 3 × 3 686.310 19.765 9.285 0.9904 0.2997 0.4897 371.883 0.7023
Median 5 × 5 858.621 18.792 8.312 0.9938 0.2629 0.5532 321.724 0.7218
Median 7 × 7 1072.914 17.825 7.345 0.9950 0.2259 0.5690 297.163 0.6402
Lee 3 × 3 819.340 18.996 8.516 0.9953 0.2804 0.5266 75.003 0.7893
Lee 5 × 5 1300.727 16.988 6.508 1.0161 0.1855 0.4729 19.122 0.7730
Lee 7 × 7 1331.404 16.887 6.407 0.9935 0.1359 0.5022 14.161 0.6541
Kuan (15 it.) 1685.911 15.862 5.382 0.9014 0.1148 0.4746 7.368 0.6057
Frost 5 × 5 758.378 19.331 8.851 0.9948 0.3025 0.6060 118.401 0.8178
SRAD (15 it.) 374.178 22.400 11.920 1.003 0.4720 0.6642 712.479 0.7405
Table 1.4 Results with filters applied to the affected image with 0.1 of noise variance
Filter MSE PSNR SNR CB SSIM MSSIM SI FOM
Speckle 0.1 1358.284 16.800 6.320 1.003 0.3552 0.3778 59.144 0.3888
Median 3 × 3 953.5335 18.337 7.857 0.9982 0.2647 0.4089 245.035 0.5166
Median 5 × 5 1071.67 17.830 7.350 1.003 0.2252 0.4734 198.070 0.6694
Median 7 × 7 1280.756 17.056 6.576 1.003 0.1825 0.4992 171.538 0.5598
Lee 3 × 3 1104.697 17.698 7.218 1.018 0.2440 0.4153 44.087 0.4482
Lee 5 × 5 1300.727 16.988 6.508 1.016 0.1855 0.4729 19.122 0.7730
Lee 7 × 7 1426.965 16.586 6.106 0.9962 0.1294 0.4840 11.755 0.6529
Kuan (15 it.) 1872.912 15.405 4.925 0.9081 0.1075 0.4694 7.120 0.5955
Frost 5 × 5 864.236 18.764 8.284 0.9980 0.2974 0.5723 107.896 0.8182
SRAD (15 it.) 880.621 18.682 8.202 1.009 0.3574 0.4489 242.935 0.5477
Table 1.5 Results with filters applied to the affected image with 0.15 of noise variance
Filter MSE PSNR SNR CB SSIM MSSIM SI FOM
Speckle 0.15 1998.636 15.123 4.643 1.024 0.3212 0.3392 37.895 0.3967
Median 3 × 3 1214.233 17.287 6.807 1.018 0.2381 0.3574 182.118 0.3902
Median 5 × 5 1256.460 17.139 6.659 1.021 0.2033 0.4203 158.164 0.5545
Median 7 × 7 1434.514 16.563 6.083 1.020 0.1645 0.4565 125.176 0.5472
Lee 3 × 3 1104.697 17.698 7.218 1.018 0.2440 0.4153 44.087 0.4482
Lee 5 × 5 1300.727 16.988 6.508 1.016 0.1855 0.4729 19.122 0.7730
Lee 7 × 7 1518.448 16.316 5.836 1.013 0.1267 0.4700 10.422 0.6528
Kuan (15 it.) 2123.588 14.860 4.380 0.9359 0.0993 0.4647 7.133 0.5973
Frost 5 × 5 970.406 18.261 7.781 1.016 0.2914 0.5424 106.232 0.7643
SRAD (15 it.) 1417.497 16.615 6.135 1.032 0.3100 0.3762 114.514 0.3921
Table 1.6 Results with filters applied to the affected image with 0.25 of noise variance
Filter MSE PSNR SNR CB SSIM MSSIM SI FOM
Speckle 0.25 3250.787 13.010 2.530 0.9861 0.2771 0.2935 22.871 0.2969
Median 3 × 3 1671.314 15.900 5.420 0.9806 0.2047 0.3030 137.273 0.3312
Median 5 × 5 1579.811 16.148 5.664 0.9878 0.1727 0.3588 129.235 0.4253
Median 7 × 7 1729.038 15.752 5.272 0.9898 0.1445 0.4069 92.049 0.5062
Lee 3 × 3 1410.097 16.638 6.158 0.9859 0.2230 0.3616 32.443 0.4055
Lee 5 × 5 1300.727 16.988 6.508 1.016 0.1855 0.4729 19.122 0.7730
Lee 7 × 7 1743.332 15.717 5.237 0.9833 0.1250 0.4475 8.2734 0.6560
Kuan (15 it.) 2624.992 13.939 3.459 0.9752 0.0854 0.4560 6.736 0.5838
Frost 5 × 5 1208.513 17.308 6.828 0.9830 0.2776 0.4937 99.0494 0.5461
SRAD (15 it.) 2474.478 14.195 3.715 0.9995 0.2602 0.3124 49.905 0.3159
Table 1.7 Results with filters applied to the affected image with 0.3 of noise variance
Filter MSE PSNR SNR CB SSIM MSSIM SI FOM
Speckle 0.3 3871.467 12.252 1.772 0.9866 0.2607 0.2780 22.102 0.2825
Median 3 × 3 1911.848 15.316 4.836 0.9774 0.1944 0.2877 128.782 0.2993
Median 5 × 5 1731.528 15.746 5.266 0.9775 0.1633 0.3392 116.175 0.3503
Median 7 × 7 1881.839 15.384 4.90 0.9749 0.1292 0.3824 76.382 0.4697
Lee 3 × 3 1104.697 17.698 7.218 1.018 0.2440 0.4153 44.087 0.4482
Lee 5 × 5 5605.530 10.644 0.1646 0.9925 0.0782 0.1372 22.024 0.2347
Lee 7 × 7 6554.056 9.965 −0.5142 0.9896 0.0392 0.1009 21.828 0.2022
Kuan (15 it.) 2902.858 13.502 3.022 0.9885 0.0772 0.4484 6.597 0.5759
Frost 5 × 5 1335.729 16.873 6.393 0.9848 0.2722 0.4705 96.415 0.4828
SRAD (15 it.) 2985.113 13.381 2.90 1.001 0.2454 0.2943 42.202 0.2985
Fig. 1.6 Synthetic images after filtering process to remove a noise variance of 0.02, from left to right. First row shows the filtered image using the median filter with window sizes of 3 × 3, 5 × 5, and 7 × 7, respectively. Second row shows the filtered image using the Lee filter with window sizes of 3 × 3, 5 × 5, and 7 × 7, respectively. Third row shows the filtered image using the Kuan, Frost, and SRAD filters, respectively
Figure 1.6 shows the synthetic images after applying the filtering process to remove the noise. The speckle noise variance was 0.02. From left to right, the first row shows the filtered image using a median filter with window sizes of 3 × 3 (median 3 × 3), 5 × 5 (median 5 × 5), and 7 × 7 (median 7 × 7), respectively. The second row shows the filtered image using the Lee filter with window sizes of 3 × 3, 5 × 5, and 7 × 7, respectively, and the third row shows the filtered images using the Kuan, the Frost, and the SRAD (15 iterations) filters, respectively. Notice that the SRAD filter yields the best visual results.
The median 7 × 7 filter yields a clean image; however, the regions of the fingers are mixed, and the same happens in the image processed with the median 5 × 5 filter. Also, the Lee 3 × 3 and SRAD filters yield a better image. The quantitative evaluation is summarized in Table 1.1. The best FOM was obtained using the Lee 3 × 3 filter, followed by the SRAD. However, SRAD yielded the best PSNR, SSIM, MSSIM, and SI.
Fig. 1.7 Synthetic images after filtering process to remove a noise variance of 0.2, from left to right. First row shows the filtered image using the median filter with window sizes of 3 × 3, 5 × 5, and 7 × 7, respectively. Second row shows the filtered image using the Lee filter with window sizes of 3 × 3, 5 × 5, and 7 × 7, respectively. Third row shows the filtered image using the Kuan, Frost, and SRAD filters, respectively
Figure 1.7 shows the synthetic images after applying the filtering process. The speckle noise variance was 0.2. From left to right, the first row shows the filtered image using a median filter with window sizes of 3 × 3, 5 × 5, and 7 × 7, respectively. The second row shows the filtered image using the Lee filter with window sizes of 3 × 3, 5 × 5, and 7 × 7, respectively, and the third row shows the filtered images using the Kuan, the Frost, and the SRAD (15 iterations) filters, respectively. Notice that the Lee 3 × 3, Frost, and SRAD filters preserve most of the image details in spite of the noise.
The quantitative evaluation of Fig. 1.7 is summarized in Table 1.2. The best FOM was obtained using the Lee 5 × 5 filter, in spite of the blurred image, followed by the Lee 7 × 7 and the Kuan after 15 iterations. However, the Frost 5 × 5 yielded the best PSNR, SNR, SSIM, and MSSIM, while the SRAD gives a better CB because it also produces a piecewise effect in smooth areas.
Tables 1.4, 1.5, 1.6, and 1.7 show the performance of the filters for different noise powers. For example, when the noise variance is 0.1, the Frost, SRAD, and median 3 × 3 filters yield the best PSNR results, and Lee 3 × 3 preserves the contrast better. Frost 5 × 5 yields the best MSSIM and FOM, SRAD (15 it.) yields the best SSIM, and median 3 × 3 the best SI.
The PSNR and MSE values in the tables show that the filters achieve good noise smoothing, especially the SRAD and Frost filters, which have the highest PSNR in most of the cases. The SRAD and Lee filters yield the best results in contrast (CB) and FOM, meaning that these filters reduce the noise while preserving the contrast. The results also show the effectiveness of the Frost filter, with the highest MSSIM score in most of the cases; we note that the MSSIM approximates the perceived visual quality of an image better than the PSNR. The median and SRAD filters reached the highest values of the sharpness index, which means that these filters can restore more image details.
The processing times of the filters are shown in Table 1.8; note that the SRAD and Kuan filters run 15 iterations for better quality, while the rest of the algorithms run a single iteration. These results show that the algorithms are suitable for the DM6437 DSP, reaching good results in the metrics with acceptable processing times. The better the restoration performance, the more time is consumed. However, the classical filters are suitable for implementation on fixed-point hardware.
Fig. 1.8 Real images, from left to right. First row: original obstetric image, image restored using a median 3 × 3 and a median 5 × 5 filter. Second row: image restored using a median 7 × 7, a Lee 3 × 3, and a Lee 5 × 5 filter. Third row: image restored using a Lee 7 × 7, a Kuan with 15 iterations, and the Frost filter. The fourth row shows the SRAD filter with 15 iterations
The algorithms showed good visual performance on the synthetic image; now they are tested on real data. For this experiment, an obstetric US image was used. After processing, the image was adjusted to be displayed on a display unit, as shown in Fig. 1.4. Figure 1.8 shows the original and the denoised images using the different algorithms implemented. Speckle noise in US images has very complex statistical properties that depend on several factors. The experimental results show that the edge preservation of the Lee and SRAD filters is visible in the denoised images.
The benefits of using the DM6437 processor (C6000 family) are its instruction-scheduling capabilities, which ensure full utilization of the pipeline, its parallel processing, and its high throughput. These proficiencies make the selected DSP suitable for computation-intensive real-time applications. TI's C6000 core uses the very long instruction word (VLIW) architecture to achieve this performance and affords lower space and power footprints than superscalar architectures. The eight functional units are highly independent and include six 32-/40-bit arithmetic logic units (ALUs) and 64 general-purpose registers of 32 bits (Texas Instruments 2006). In this research, a sample was represented in Q-format as Q9.7, meaning a gap of only 0.0078125 between adjacent non-integer numbers and a maximum fractional value of 0.9921875. As can be seen, the effect of the granular noise introduced by this quantization process is negligible. Nevertheless, the speed gain is high (about 1.67 ns per instruction cycle) (Texas Instruments 2006) compared to a floating-point processor.
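The following C snippet illustrates the Q9.7 arithmetic described above: 7 fractional bits give the 0.0078125 resolution quoted in the text, and a 16 × 16-bit product needs a 7-bit right shift to return to the format. The helper names are ours.

```c
#include <stdint.h>
#include <stdio.h>

#define Q 7                                     /* fractional bits in Q9.7 */

static int16_t to_q(float v)     { return (int16_t)(v * (1 << Q)); }
static float   from_q(int16_t q) { return (float)q / (1 << Q); }

/* Fixed-point multiply: 32-bit intermediate product, then renormalize. */
static int16_t qmul(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * b) >> Q);
}

int main(void)
{
    int16_t a = to_q(0.9921875f);               /* largest fraction: 127/128 */
    int16_t b = to_q(0.5f);
    /* prints 0.492188; the exact value is 0.496094, and the gap is the
     * granular noise of the 2^-7 quantization step mentioned in the text */
    printf("a*b = %f\n", from_q(qmul(a, b)));
    return 0;
}
```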
1.5 Conclusions
The existence of speckle noise in US images is undesirable since it reduces image quality by affecting the edges and details of the regions of interest, which are the most important parts for diagnosis. In this chapter, the performance of different strategies to remove speckle noise using the fixed-point DM6437 digital signal processor was analyzed. The performance of the filters was compared on synthetic images with different noise variances and on images acquired with a real US scanner. Measurements of reconstruction quality and time performance were carried out. It was noted that the median, Lee, and Kuan filters perform very fast, whereas the Frost and SRAD filters provide the best reconstruction quality, even on images severely affected by noise, but their time performance is worse than that of the previous filters.
As future directions, we are working on a framework that includes stages such as filtering, zooming, cropping, and segmentation of regions using active contours (Chan and Vese 2001).
References
Abbott, J., & Thurstone, F. (1979). Acoustic speckle: Theory and experimental analysis.
Ultrasonic Imaging, 1(4), 303–324.
Adamo, F., Andria, G., Attivissimo, F., Lucia, A., & Spadavecchia, M. (2013). A comparative study
on mother wavelet selection in ultrasound image denoising. Measurement, 46(8), 2447–2456.
Akdeniz, N., & Tora, H. (2012). Real time infrared image enhancement. In Proceedings of the
20th Signal Processing and Communications Applications Conference (SIU), Mugla, Turkey
(vol. 1, pp. 1–4).
Argenti, F., & Alparone, L. (2002). Speckle removal from SAR images in the undecimated
wavelet domain. IEEE Transactions on Geoscience and Remote Sensing, 40(11), 2363–2374.
Aubert, G., & Aujol, J. (2008). A variational approach to removing multiplicative noise. SIAM
Journal on Applied Mathematics, 68(4), 925–946.
Blanchet, G., & Moisan, L. (2012). An explicit sharpness index related to global phase coherence.
In Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), Kyoto, Japan (vol. 1, pp. 1065–1068).
Bronstein, M. (2011). Lazy sliding window implementation of the bilateral filter on parallel
architectures. IEEE Transactions on Image Processing, 20(6), 1751–1756.
Center for Fast Ultrasound Imaging. (September, 2017). Field II Simulation Program. [online]
Available at: http://field-ii.dk/?examples/fetus_example/fetus_example.html.
Chan, T., & Vese, L. (2001). Active contours without edges. IEEE Transactions on Image
Processing, 10(2), 266–277.
Dallai, A., & Ricci, S. (2014). Real-time bilateral filtering of ultrasound images through highly
optimized DSP implementation. In Proceedings of 6th European Embedded Design in
Education and Research Conference (EDERC), Milano, Italy (vol. 1, pp. 278–281).
Fan, R., Prokhorov, V., & Dahnoun, N. (2016). Faster-than-real-time linear lane detection
implementation using SoC DSP TMS320C6678. In Proceedings of the IEEE International
Conference on Imaging Systems and Techniques (IST) (Chania, Greece, vol. 1, pp. 306–311).
Frost, V., Abbott, J., Shanmugan, K., & Holtzman, J. (1982). A model for radar images and its
application to adaptive digital filtering of multiplicative noise. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 4(2), 157–166.
Fu, X., Wang, Y., Chen, L., & Dai, Y. (2015). Quantum-inspired hybrid medical ultrasound
images despeckling method. Electronics Letters, 51(4), 321–323.
Goodman, J. (2007). Speckle phenomena in optics: Theory and applications (1st ed.). Englewood,
Colorado, USA: Roberts and Company Publishers.
Huang, Y., Ng, M., & Wen, Y. (2009). A new total variation method for multiplicative noise
removal. SIAM Journal on Imaging Sciences, 2(1), 20–40.
Huang, Y., Moisan, L., Ng, M., & Zeng, T. (2012). Multiplicative noise removal via a learned
dictionary. IEEE Transactions on Image Processing, 21(11), 4534–4543.
Kang, J., Youn, J., & Yoo, Y. (2016). A new feature-enhanced speckle reduction method based on
multiscale analysis for ultrasound b-mode imaging. IEEE Transactions on Biomedical
Engineering, 63(6), 1178–1191.
Koundal, D., Gupta, S., & Singh, S. (2015). Nakagami-based total variation method for speckle
reduction in thyroid ultrasound images. Journal of Engineering in Medicine, 230, 97–110.
Kuan, D., Sawchuk, A., Strand, T., & Chavel, P. (1985). Adaptive noise smoothing filter for
images with signal-dependent noise. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 7(2), 165–177.
Lee, J. (1980). Digital image enhancement and noise filtering by use of local statistics. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 2(2), 165–168.
Li, H., Wu, J., Miao, A., Yu, P., Chen, J., & Zhang, Y. (2017). Rayleigh-maximum-likelihood
bilateral filter for ultrasound image enhancement. Biomedical Engineering Online, 16(46),
1–22.
Lin, R., Su, B., Wu, X., & Xu, F. (2011). Image super resolution technique based on wavelet
decomposition implemented on OMAP3530 platform. In Proceedings of Third International
Conference on Multimedia Information Networking and Security (MINES) (Shanghai, China,
vol. 1, pp. 69–72).
Maini, R., & Aggarwal, H. (2009). Performance evaluation of various speckle noise reduction
filters on medical images. International Journal of Recent Trends in Engineering, 2(4), 22–25.
Nie, X., Zhang, B., Chen, Y., & Qiao, H. (2016a). A new algorithm for optimizing TV-based
Pol-SAR despeckling model. IEEE Signal Processing Letters, 23(10), 1409–1413.
Nie, X., Qiao, H., Zhang, B., & Huang, X. (2016b). A nonlocal TV-based variational method for
PolSAR data speckle reduction. IEEE Transactions on Image Processing, 25(6), 2620–2634.
Oliver, C., & Quegan, S. (2004). Understanding synthetic aperture radar images (1st ed.).
Raleigh, North Carolina, USA: SciTech Publishing, Inc.
Ovireddy, S., & Muthusamy, E. (2014). Speckle suppressing anisotropic diffusion filter for
medical ultrasound images. Ultrasonic Imaging, 36(2), 112–132.
Ozcan, A., Bilenca, A., Desjardins, A., Bouma, B., & Tearney, G. (2007). Speckle reduction in
optical coherence tomography images using digital filtering. Journal of the Optical Society of
America A, 24(7), 1901–1910.
Perona, P., & Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 12(7), 629–639.
Pizurica, A., Philips, W., Lemahieu, I., & Acheroy, M. (2003). A versatile wavelet domain
noise filtration technique for medical imaging. IEEE Transactions on Medical Imaging, 22(3),
323–331.
Portilla, J., Strela, V., Wainwright, M., & Simoncelli, E. (2001). Adaptive Wiener denoising using
a Gaussian scale mixture model in the wavelet domain. In Proceedings of the International
Conference on Image Processing (ICIP) (Thessaloniki, Greece, vol. 2, pp. 37–40).
Pratt, W. (2001). Digital image processing (4th ed.). Hoboken, New Jersey, USA: Wiley.
Premaratne, P., & Premaratne, M. (2012). Image similarity index based on moment invariants of
approximation level of discrete wavelet transform. Electronics Letters, 48(23), 1465–1467.
Rizi, F., Noubari, H., & Setarehdan, S. (2011). Wavelet-based ultrasound image de-noising:
Performance analysis and comparison. In Proceedings of the 2011 Annual International
Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (Boston,
Massachusetts, USA, vol. 1, pp. 3917–3920).
Rudin, L., Lions, P., & Osher, S. (2003). Multiplicative denoising and deblurring: Theory and
algorithms. In Geometric Level Set Methods in Imaging, Vision, and Graphics. New York,
USA: Springer.
Shi, J., & Osher, S. (2008). A nonlinear inverse scale space method for a convex multiplicative
noise model. SIAM Journal on Imaging Sciences, 1(3), 294–321.
Singh, K., Ranade, S., & Singh, C. (2017). A hybrid algorithm for speckle noise reduction of
ultrasound images. Computer Methods and Programs in Biomedicine, 148, 55–69.
Suetens, P. (2002). Fundamentals of medical imaging (2nd ed.). Cambridge, United Kingdom:
Cambridge University Press.
Texas Instruments. (2006). TMS320DM6437 Digital Media Processor, SPRS345D, Rev. D.
Tian, J., & Chen, L. (2011). Image despeckling using a non-parametric statistical model of wavelet
coefficients. Biomedical Signal Processing and Control, 6(4), 432–437.
Wagner, R., Smith, S., Sandrik, J., & Lopez, H. (1983). Statistics of speckle in ultrasound B-Scans.
IEEE Transactions on Sonics and Ultrasonics, 30(3), 156–163.
Wang, Z., Bovik, A., Sheikh, H., & Simoncelli, E. (2004). Image quality assessment: from error
visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612.
Wen, T., Gu, J., Li, L., Qin, W., & Xie, Y. (2016). Nonlocal total-variation-based speckle filtering
for ultrasound images. Ultrasonic Imaging, 38(4), 254–275.
Xie, H., Pierce, L., & Ulaby, F. (2002). SAR speckle reduction using wavelet denoising and
Markov random field modeling. IEEE Transactions on Geoscience and Remote Sensing,
40(10), 2196–2212.
Xuange, P., Ming, L., Bing, Z., Chunying, H., & Xuyan, Z. (2009). The online hydrological
sediment detection system based on image process. In Proceedings of 4th IEEE Conference on
Industrial Electronics and Applications (ICIEA) (Xi’an, China, vol. 1, pp. 3761–3764).
Yu, Y., & Acton, S. (2002). Speckle reducing anisotropic diffusion. IEEE Transactions on Image
Processing, 11(11), 1260–1270.
Zhuang, L. (2014). Realization of a single image haze removal system based on DaVinci
DM6467T processor. In Proceedings of SPIE 9273, Optoelectronic Imaging and Multimedia
Technology III (Beijing, China, vol. 9273, pp. 1–7).
Zoican, S. (2011). Adaptive algorithm for impulse noise suppression from still images and its real
time implementation. In Proceedings of 10th International Conference on Telecommunication
in Modern Satellite Cable and Broadcasting Services (TELSIKS) (Nis, Serbia, vol. 1,
pp. 337–340).
Chapter 2
Morphological Neural Networks
with Dendritic Processing for Pattern
Classification
2.1 Introduction
autonomous way, such that it can avoid hitting those objects, obey orders, and locate and grasp them to perform a given task.
The pattern classification problem can be stated as follows: given a pattern $X$ in vector form composed of $n$ features, $X = [x_1, x_2, \ldots, x_n]^T$, determine its corresponding class $C^k$, $k = 1, 2, \ldots, p$. Several approaches were developed during the last decades to provide different solutions to this problem; among them are the statistical approach, the syntactical or structural approach, and the artificial neural approach.
The artificial neural approach is based on the fact that many small processing units (the neurons) combine their capabilities to determine the class $C^k$, $k = 1, 2, \ldots, p$, given an input pattern $X = [x_1, x_2, \ldots, x_n]^T$. An artificial neural network can be considered a mapping between $X$ and the set of labels $K = \{1, 2, \ldots, p\}$; if this mapping is denoted $M$, then $X \to M \to K$.
Several artificial neural network (ANN) models have been reported in the literature, since the very old threshold logic unit (TLU) model introduced to the world during the 1940s by McCulloch and Pitts (1943): the well-known Perceptron developed by Rosenblatt during the 1950s (Rosenblatt 1958, 1962), the radial basis function neural network (RBFNN) proposed by Broomhead and Lowe (1988a, b), the elegant support vector machine (SVM) introduced to the world by Cortes and Vapnik in the 1990s (Cortes and Vapnik 1995), and the extreme learning machine (ELM) model proposed by Guang et al. (2006) and Huang et al. (2015), among others.
A kind of ANN not very well known by the scientific community, but which has demonstrated very promising and competitive pattern classification results, is the so-called morphological neural network with dendritic processing (MNNDP) model (Ritter et al. 2003; Ritter and Urcid 2007).
Instead of using the standard multiplications (×) and additions (+) to obtain the values used by the activation functions of the computing units in classical models, MNNDPs combine additions (+) with max (∨) or min (∧) operations. As we will see along this chapter, this change modifies the way of separating pattern classes: instead of using decision surfaces built from a combination of separating hyperplanes, MNNDPs combine hyper-boxes to perform the same task, dividing the classes to find the class to which a given input pattern $X = [x_1, x_2, \ldots, x_n]^T$ should be assigned.
The rest of this chapter is organized as follows. Section 2.2 presents the basics of MNNDPs. Section 2.3, on the other hand, explains the operation of the most popular and useful training algorithms; when necessary, a simple numerical example is provided to help the reader easily grasp the operation of the training algorithm. In Sect. 2.4, we compare the performance of the presented models, as well as the training algorithms, with respect to other artificial neural network models. Finally, in Sect. 2.5, we conclude and give some directions for present and future research.
Fig. 2.1 a Typical MPDP and b example of a hyper-box in 2D generated by the kth dendrite
The computation $s_j^k$ performed by the $k$th dendrite for the $j$th class can be expressed as follows:

$s_j^k = \bigwedge_{i=1}^{n}\left[\left(x_i + w_{ik}^{1}\right) \wedge -\left(x_i + w_{ik}^{0}\right)\right] \qquad (2.1)$

The output of the neuron selects the class label associated with the dendrite of maximum value:

$s(X) = \operatorname{argmax}_{k}\left(s_j^k(X)\right) \qquad (2.2)$
From Eq. (2.2), we can see that the argmax function selects only one of the dendrite values, and the result is a scalar. This argmax function permits an MPDP to classify patterns that are outside the hyper-boxes. It also allows building more complex decision boundaries by combining the actions of several hyper-boxes. If Eq. (2.2) produces more than one maximum, the argmax function selects the first maximum argument as the index of the class to which the input pattern is assigned.
In order to explain how a dendrite computation is performed in an MPDP, let us refer to Fig. 2.2a, which displays two hyper-boxes that could be generated by any MPDP trained with any training algorithm covering all the patterns (green crosses and blue dots). In the example, the blue dots belong to class $C^1$ while the green crosses belong to class $C^2$. Figure 2.2b presents the MPDP that generates the two aforementioned boxes. As can be appreciated, the input pattern values $x_1$ and $x_2$ are connected to the output neurons via the dendrites. Geometrically, each dendrite determines a box in two dimensions (a hyper-box in $n$ dimensions), which can be represented by its weight values $w_{ik}^l$.
To verify the correct operation of the MPDP shown in Fig. 2.2b, let us consider the following two noisy patterns: $\tilde{x}_1 = [3, 0]^T$, which is supposed to belong to class $C^1$, and $\tilde{x}_2 = [7, 3]^T$, which is supposed to belong to class $C^2$.
According to Eq. (2.1), the following dendrite computations for both patterns can be obtained:
Fig. 2.2 Simple example of a DMNN. a Two boxes that cover all the patterns and b MPDP based
on the two boxes (black circles denote excitatory connections and white circles inhibitory
connections)
$s_1^1(\tilde{x}_1) = \left[(3-1) \wedge -(3-5)\right] \wedge \left[(0-1) \wedge -(0-5)\right] = \left[2 \wedge -1\right] = -1.$

$s_2^1(\tilde{x}_1) = \left[(3-4) \wedge -(3-8)\right] \wedge \left[(0-4) \wedge -(0-8)\right] = \left[-1 \wedge -4\right] = -4.$

$s_1^1(\tilde{x}_2) = \left[(7-1) \wedge -(7-5)\right] \wedge \left[(3-1) \wedge -(3-5)\right] = \left[-2 \wedge 2\right] = -2.$

$s_2^1(\tilde{x}_2) = \left[(7-4) \wedge -(7-8)\right] \wedge \left[(3-4) \wedge -(3-8)\right] = \left[1 \wedge -1\right] = -1.$

Applying Eq. (2.2), $\tilde{x}_1$ is assigned to class $C^1$ (since $-1 > -4$) and $\tilde{x}_2$ to class $C^2$ (since $-1 > -2$), as expected.
It is well known that, to be useful, any ANN has to be trained. In the case of MNNDPs, several training methods have been reported in the literature. Most of these methods use some sort of heuristic and do not employ an optimization technique to tune the interconnection parameters.
In this section, we describe some of the most useful methods reported in the literature to train an MNNDP. Without loss of generality, let us consider the case of an MNNDP composed of just one neuron, i.e., an MPDP.
According to Ritter et al. (2003), an MPDP can be trained in two different ways: the first is based on iteratively eliminating boxes, the second on merging boxes. The principle of operation of both approaches is described in the following two subsections.
This method was originally designed for one morphological perceptron applied to two-class problems. The method first builds a hyper-box that encloses all the patterns of the first class and possibly patterns of the second class. For an example, refer to Fig. 2.3a: as can be appreciated, the generated hyper-box contains patterns of both classes.
The elimination method then, in an iterative way, generates boxes containing patterns of the second class, carving the first hyper-box and producing a polygonal region containing, at each iteration, more patterns of the first class. The elimination method continues this way until all patterns of the second class are eliminated from the original hyper-box. Figure 2.3b and c illustrates this process.
Fig. 2.3 Illustration of the operation of the elimination method. a A two-class problem, b and
c consecutive steps until the resulting region only encloses patterns of the first class
The second set of methods we are going to describe first takes a set of patterns divided into classes and produces a clustering. The generated clustering is then used to obtain the weights of the corresponding dendrites. We present two methods: one of exponential complexity and an improvement of linear complexity.
In Sossa and Guevara (2014), the authors introduce the so-called divide and conquer method (DCM) for training an MPDP. The main idea behind this training method is to first group the patterns into clusters (one cluster for each class of patterns), and then to use this clustering to obtain the weights of the dendrites of the morphological perceptron.
For the purpose of explaining the functioning of the algorithm, a simple example of three classes with two attributes will be used. Figure 2.5a shows the whole set of patterns.
Fig. 2.4 Illustration of the operation of the merging method. a A two-class problem, b and
c consecutive steps of the merging process until only one region is obtained
Fig. 2.5 Illustration of the operation of the DCM. a Three-class problem and hyper-box enclosing
all patterns of all classes, b first division when step 2 of the DCM is applied, c consequent
divisions generated when step 3(a) is applied, and d simplification step, resulting in five dendrites
with the label of the corresponding class, stop the learning process and proceed
to step 4. For example, the first division of the box is presented in Fig. 2.5b.
(3) This step is divided into two stages as follows:
(a) If at least one of the generated hyper-cubes $H_n$ has patterns of more than one class, divide $H_n$ into $2^n$ smaller hyper-boxes. Iteratively repeat the verification–division process on each smaller hyper-box until the stopping criterion is satisfied. Figure 2.5c shows all the boxes generated by the training algorithm.
(b) Once all the hyper-boxes are generated, if two or more of them of the same class share a common side, group them into one region. Figure 2.5d shows the result of this simplification step.
With these values, by means of Eq. (2.2), the two patterns are classified as follows:
$$s(\tilde{x}^1) = \arg\max_k \{ s_{11}(\tilde{x}^1), s_{12}(\tilde{x}^1), s_{23}(\tilde{x}^1), s_{34}(\tilde{x}^1), s_{35}(\tilde{x}^1) \} = \arg\max_k (1.1, -2.1, -1.1, -1.1, -3.5) = 1.$$
Thus, $\tilde{x}^1$ is put in class $C_1$, as expected. In the same way, for pattern $\tilde{x}^2$:
$$s(\tilde{x}^2) = \arg\max_k \{ s_{11}(\tilde{x}^2), s_{12}(\tilde{x}^2), s_{23}(\tilde{x}^2), s_{34}(\tilde{x}^2), s_{35}(\tilde{x}^2) \} = \arg\max_k (-3.9, -1.6, -1.5, 1.5, -1.5) = 3.$$
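A compact sketch of the splitting step may help fix ideas before discussing its cost. The Python fragment below, a simplification that omits the margin and merging steps, recursively divides a mixed-class hyper-box into its $2^n$ half-interval sub-boxes (step 3(a)); a depth guard stands in for the full stopping criterion.

```python
import itertools
import numpy as np

def split_box(lo, hi):
    """Yield the 2**n sub-boxes obtained by halving every dimension."""
    mid = (lo + hi) / 2.0
    for corner in itertools.product((0, 1), repeat=len(lo)):
        c = np.asarray(corner)
        yield np.where(c == 0, lo, mid), np.where(c == 0, mid, hi)

def dcm_split(X, y, lo, hi, depth=10):
    """Recursively subdivide until each box holds patterns of one class."""
    mask = np.all((X >= lo) & (X <= hi), axis=1)
    labels = set(y[mask].tolist())
    if len(labels) <= 1 or depth == 0:
        return [(lo, hi, labels.pop() if labels else None)]
    boxes = []
    for sub_lo, sub_hi in split_box(lo, hi):  # 2**n children per call
        boxes.extend(dcm_split(X, y, sub_lo, sub_hi, depth - 1))
    return boxes
```

Each recursive call spawns $2^n$ children, which is precisely the exponential cost discussed next.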
The main problem of the DCM introduced in Sossa and Guevara (2014) and explained in the last section is its exponential complexity: each time a hyper-box is divided, $2^n$ computations are required. This can be very restrictive on most sequential platforms.
In this section, we briefly present a substantial improvement of the DCM that operates in linear time; we call this new method the LDCM. Instead of generating all $2^n$ hyper-boxes at each iteration, the new method generates only the necessary hyper-boxes directly from the data by analyzing them in a linear pass. The method operates recursively. The steps used for training with the LDCM are explained as follows.
Given $m$ patterns belonging to $p$ classes, with $n$ the dimensionality of each pattern:
Algorithm LDCM:
(1) Enclose all the $m$ patterns inside a first hyper-box denoted as $H_0$ in $\mathbb{R}^n$. Again, to have better tolerance to noise, add a margin $M$ to each side of $H_0$.
(2) Halve each dimension of $H_k$ (including $H_0$): $\frac{d_i}{2} = \frac{\max\{x_i\} - \min\{x_i\}}{2}$, obtaining two intervals per dimension, $\min\{x_i\} \le h_{i1} \le \min\{x_i\} + \frac{d_i}{2}$ and $\min\{x_i\} + \frac{d_i}{2} \le h_{i2} \le \max\{x_i\}$. Determine inside which intervals the first sample falls and generate the corresponding hyper-box. Let us call this box $H_1$.
(3) Take each sample pattern (in the provided example, from left to right and top to bottom). If a pattern is outside the generated boxes, generate a new box for it. Repeat this step for the whole set of patterns. Let us designate these boxes as $H_2, H_3, \ldots$, one for each first point outside the preceding boxes.
(4) Verify all the boxes generated in step 3, and apply one of the following steps:
(a) If the patterns inside a box belong to the same class, label this box with the label of the corresponding class. If all boxes contain patterns of a single class, stop training and go to step 5.
(b) If at least one box contains patterns of different classes, then iterate between steps 3 and 4(a) until the stopping criterion is satisfied (until each generated box contains patterns of only one class).
(5) If two or more generated boxes of the same class share a common side, merge those regions, as is done with the DCM.
(6) By taking into account the box coordinates, select the weights for each box (dendrite). A Python sketch of this procedure is given below.
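The following hedged Python sketch captures the linear-time idea of steps 2–4: instead of materializing all $2^n$ sub-boxes, one linear pass assigns each pattern a Boolean signature indicating which half-interval it falls into per dimension, and a box is created only for the occupied cells; cells with mixed classes are split recursively. The merging of step 5 and the exact $H_1, H_2, \ldots$ enumeration order of the published algorithm are omitted.

```python
import numpy as np
from collections import defaultdict

def ldcm_boxes(X, y, depth=10):
    """Create boxes only where data actually fall (one linear scan);
    recurse on any box that still contains patterns of mixed classes."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    mid = (lo + hi) / 2.0
    cells = defaultdict(list)
    for i, x in enumerate(X):  # linear pass over the m patterns (step 3)
        cells[tuple(bool(v) for v in (x >= mid))].append(i)
    boxes = []
    for idx in cells.values():
        idx = np.asarray(idx)
        if len(set(y[idx].tolist())) == 1 or depth == 0:
            boxes.append((X[idx].min(axis=0), X[idx].max(axis=0), y[idx][0]))
        else:  # mixed classes: iterate as in step 4(b)
            boxes.extend(ldcm_boxes(X[idx], y[idx], depth - 1))
    return boxes
```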
To illustrate the operation of the LDCM, let us consider the two-class problem shown in Fig. 2.6a. Figure 2.6b shows the corresponding box $H_0$ generated when the first step of the LDCM is applied. Figure 2.6c shows the box generated by the application of the second step. Figure 2.6d–f depicts the three boxes generated when the third step is applied over the sample points; each generated box is labeled with a black dot at its upper left, over the first pattern found outside the previous box. As can be appreciated from Fig. 2.6f, only the first two boxes purely contain patterns of the first class (red dots) and second class (green dots), while the other two boxes contain patterns of both classes; thus, step 4(b) is applied over these two boxes, resulting in the subdivision of boxes shown in Fig. 2.6g–i. Finally, Fig. 2.6j depicts the simplification of the boxes produced by the application of the fifth step. As can be seen, the optimized final neuron will have seven dendrites: four for the first class and three for the second class. The weights of the dendrites are then calculated in terms of the limit coordinates of each box (step 6 of the LDCM). One important result concerning the LDCM is that it produces exactly the same result as if the DCM were applied; the proof and further details can be found in Guevara (2016).
This kind of method makes use of so-called evolutionary techniques to find the weights of the dendrites of an MPDP. Recently, in Arce et al. (2016, 2017), the authors described a method that utilizes evolutionary computation to optimally find the weights of the dendrites of an MPDP. The method employs differential evolution to evolve the weights; let us call this method the differential evolution method (DEM).
Fig. 2.6 Illustration of the operation of the LDCM. a Two-class problem, b first step (box H0 ),
c box generated by the second step, d, e, f boxes generated by third step, g, h, i iterative
subdivision of the boxes by the application of step 4(b), and j simplification of the boxes by the
application of the fifth step
Begin
  Generate an initial population of solutions.
  For d = 1 to q
    Repeat:
      For the entire population, calculate the fitness value.
      For each parent, select two solutions at random and the best parent.
      Create one offspring using DE operators.
      If the offspring is better than the parent:
        Replace the parent by the offspring.
    Until a stop condition is satisfied.
End
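As a concrete counterpart to the pseudocode, here is a hedged Python sketch of the DE loop in the best/1/bin style suggested by the description ("two solutions at random and the best parent"). The encoding and the fitness callable are assumptions: each candidate is taken to be a flat vector holding the corner coordinates of all dendrite hyper-boxes, and fitness is any function returning the classification error to be minimized.

```python
import numpy as np

def differential_evolution(fitness, dim, pop_size=30, F=0.8, CR=0.9,
                           bounds=(-1.0, 1.0), generations=200, seed=0):
    """Hedged DE sketch (best/1/bin) over flattened dendrite weights."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(bounds[0], bounds[1], size=(pop_size, dim))
    fit = np.array([fitness(p) for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            best = pop[np.argmin(fit)]                 # the best parent
            b, c = pop[rng.choice(pop_size, 2, replace=False)]
            mutant = best + F * (b - c)                # DE mutation
            cross = rng.random(dim) < CR               # binomial crossover
            cross[rng.integers(dim)] = True            # keep >= 1 mutant gene
            trial = np.where(cross, mutant, pop[i])
            f = fitness(trial)
            if f <= fit[i]:                            # greedy replacement
                pop[i], fit[i] = trial, f
    return pop[np.argmin(fit)], float(fit.min())
```

In the DEM, the fitness function would decode each vector into hyper-boxes and return the MPDP classification error over the training set.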
In Arce et al. (2016, 2017), the authors present two initialization methods; here, we describe the operation of one of them. In general, the so-called HBd initialization method proceeds in two steps, as follows (a code sketch is given after the list):
(1) For each class $C_j$, open a hyper-box that encloses all its patterns.
(2) Divide each hyper-box into smaller hyper-boxes along the first axis, in equal parts, by a factor $d$, for $d \in \mathbb{Z}^+$, until $q$ divisions have been carried out.
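A minimal sketch of this initialization, assuming per-class axis-aligned bounding boxes and equal slices along the first axis (the function name is ours), is shown below; the resulting boxes become the initial dendrites that DE then refines.

```python
import numpy as np

def hbd_init(X, y, d=2):
    """HBd sketch: one hyper-box per class, then d equal slices of each
    box along the first axis; each slice is an initial dendrite."""
    boxes = []
    for c in np.unique(y):
        lo = X[y == c].min(axis=0)
        hi = X[y == c].max(axis=0)
        step = (hi[0] - lo[0]) / d
        for k in range(d):
            s_lo, s_hi = lo.copy(), hi.copy()
            s_lo[0] = lo[0] + k * step
            s_hi[0] = lo[0] + (k + 1) * step
            boxes.append((s_lo, s_hi, c))
    return boxes
```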
In order to better explain the operation of the HBd initialization algorithm, a straightforward example of four classes with two features is presented next. Figure 2.7a illustrates the problem to be solved: blue dots belong to $C_1$, black crosses to $C_2$, green stars to $C_3$, and red diamonds to $C_4$.
As can be seen from Fig. 2.7b, during the first step the patterns of each class are enclosed in a box: the blue box encloses the patterns from $C_1$, the black box those from $C_2$, the green box those from $C_3$, and the red box those from $C_4$.
During the second step, Fig. 2.7b and c shows how each box is divided by a factor $d$ ($d = 1$ in the first case, $d = 2$ in the second), while Fig. 2.7d shows how DE is applied to the resultant boxes from Fig. 2.7c; this is the best placement of the boxes for $d = 2$ obtained by the application of DE.
Barmpoutis and Ritter (2006) modified the dendritic model by rotating the orthonormal axes of each hyper-box. In this work, the authors create hyper-boxes with a different orientation of their respective coordinate axes.
Fig. 2.7 Illustration of the operation of the HBd initialization method. a Simple example with four classes, b division with d = 1, c division with d = 2, and d optimal placement of the boxes for d = 2 by the application of DE
2.4 Comparison
To compare the training methods for the MPDP, we use four synthetic databases: the two shown in Fig. 2.8a and b, the two-class spiral with 2 laps depicted in Fig. 2.8c, and a spiral with 10 laps (not shown).
Fig. 2.8 Three of the synthetic databases to test the performance of three of the methods for
training MNNDP. a Two-class synthetic problem, b Three-class synthetic problem, and
c two-class spiral synthetic problem
Table 2.2 Comparison between RM, DCM, and DEM methods for four synthetic databases
Dataset RM DCM DEM
ND etest ND etrain etest ND etrain etest
A 194 28.0 419 0.0 25.0 2 21.7 20.5
B 161 50.3 505 0.0 20.3 3 16.6 15.2
Spiral 2 160 8.6 356 0.0 7.2 60 7.3 6.4
Spiral 10 200 26.7 1648 0.0 10.6 1094 1.8 6.3
Bold indicates that the DEM offers a better etest error; for the first three databases (A, B, and Spiral 2) it also requires a smaller number of dendrites ND to produce the result; only in the case of Spiral 10 does the DEM require a greater number of dendrites (1094) compared to the 200 required by the RM
Table 2.2 shows a comparison on these four databases among the methods reported in Ritter et al. (2003), Sossa and Guevara (2014), and Arce et al. (2016), respectively. For all three methods, ND is the number of dendrites generated by the training method, etrain is the training error, and etest is the error during testing.
As can be appreciated, in all four cases the DE-based algorithm provides the best testing errors. Note also that although the DE-based method obtains a nonzero training error, it provides the best testing error; we can say that this method generalizes better than the DCM, which obtains a 0% training error etrain. In the first two problems, we can see that the DEM needs only a reduced number of dendrites to provide a low error. In all four experiments, 80% of the data were used for training and 20% for testing.
In this section, we first compare the three training methods for MPDP with 11
databases taken from the UCI Machine Learning Repository (Asuncion 2007).
Table 2.3 presents the performance results. As we can appreciate from Table 2.3, in
all cases, the DE-based training method provides the smallest number of dendrites
to solve the problem as well as the smallest testing error.
Because the DEM provides the best results among the three training methods for MNNDP, we now compare its performance against three well-known neural network-based classifiers: an MLP, an SVM, and an RBFNN. We do this with the same 11 databases from UCI. Table 2.4 presents the performance results; as can be appreciated from this table, in most of the cases the DEM provides the best testing errors.
Table 2.3 Comparison between RM, DCM, and DEM methods for 11 databases from UCI
Machine Learning Repository
Dataset RM DCM DEM
ND etest ND etrain etest ND etrain etest
Iris 5 6.7 28 0.0 3.3 3 3.3 0.0
Mammographic mass 51 14.4 26 0.0 19.2 8 15.8 10.4
Liver disorders 41 42.0 183 0.0 35.5 12 37.6 31.1
Glass identification 60 36.7 82 0.0 31.8 12 4.7 13.6
Wine quality 120 51.0 841 0.0 42.1 60 42.1 40.0
Mice protein expression 77 18.9 809 0.0 5.0 32 6.6 4.5
Abalone 835 88.2 3026 0.0 80.6 27 77.1 78.2
Dermatology 192 57.8 222 0.0 15.5 12 4.8 4.2
Hepatitis 19 53.3 49 0.0 46.7 9 9.4 33.3
Pima Indians diabetes 180 70.6 380 0.0 31.4 2 23.8 23.5
Ionosphere 238 10.0 203 0.0 35.7 2 2.8 2.8
Bold indicates that the DEM requires a smaller number of dendrites to produce the result; in all cases it also offers a better etest error
Table 2.4 Comparison between the MLP, the SVM, the RBFNN, and the DEM methods for 11
databases from UCI Machine Learning Repository
Dataset MLP SVM RBFNN DEM
etrain etest etrain etest etrain etest etrain etest
Iris 1.7 0.0 4.2 0.0 4.2 0.0 3.3 0.0
Mammographic mass 15.7 11.2 18.4 11.2 17.9 16.0 15.8 10.4
Liver disorders 40.3 40.6 40.0 40.2 29.0 37.8 37.6 31.1
Glass identification 14.1 20.4 12.3 18.2 0.0 20.4 4.7 13.6
Wine quality 34.0 39.0 40.6 43.0 41.5 44.3 42.1 40.0
Mice protein expression 0.0 0.6 0.1 0.5 11.4 13.9 6.6 4.5
Abalone 75.0 75.0 73.1 75.0 72.0 76.0 77.1 78.2
Dermatology 0.0 0.0 1.4 1.4 1.0 2.8 4.8 4.2
Hepatitis 1.6 40.0 15.6 33.3 15.6 33.3 9.4 33.3
Pima Indians diabetes 15.5 29.4 22.3 24.8 22.3 24.8 23.8 23.5
Ionosphere 0.3 7.1 6.8 6.8 6.4 8.6 2.8 2.8
Bold indicates where the DEM wins compared to the other standard classification methods. In some cases it loses; for example, for the Dermatology problem the MLP obtains the best results, with 0 etrain and etest errors
References
Arce, F., Zamora, E., Sossa, H., & Barrón, R. (2016). Dendrite morphological neural networks
trained by differential evolution. In Proceedings of 2016 IEEE Symposium Series on
Computational Intelligence (SSCI), Athens, Greece (vol. 1, pp. 1–8).
Arce, F., Zamora, E., Sossa, H., & Barrón, R. (2017). Differential evolution training algorithm for
dendrite morphological neural networks. Under Review in Applied Soft Computing.
Ardia, D., Boudt, K., Carl, P., Mullen, K., & Peterson, B. (2011). Differential evolution with
DEoptim: An application to non-convex portfolio optimization. The R Journal, 3(1), 27–34.
Asuncion, D. (2007). UCI machine learning repository. [online] Available at: http://archive.ics.uci.
edu/ml/index.php.
Barmpoutis, A., & Ritter, G. (2006). Orthonormal basis lattice neural networks. In Proceedings of
the IEEE International Conference on Fuzzy Systems, Vancouver, British Columbia, Canada
(vol. 1, pp. 331–336).
Broomhead, D., & Lowe, D. (1988a). Radial basis functions, multi-variable functional
interpolation and adaptive networks. (Technical Report). RSRE. 4148.
Broomhead, D., & Lowe, D. (1988b). Multivariable functional interpolation and adaptive
networks. Complex Systems, 2(3), 321–355.
Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning, 20(3), 273–297.
Guevara, E. (2016). Method for training morphological neural networks with dendritic
processing. Ph.D. Thesis. Center for Computing Research. National Polytechnic Institute.
Huang, G., Zhu, Q., & Siew, C. (2006). Extreme learning machine: Theory and applications. Neurocomputing, 70(1–3), 489–501.
Huang, G., Huang, G., Song, S., & You, K. (2015). Trends in extreme learning machines: A
review. Neural Networks, 61, 32–48.
McCulloch, W., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity.
Bulletin of Mathematical Biophysics, 5, 115–133.
Ojeda, L., Vega, R., Falcon, L., Sanchez-Ante, G., Sossa, H., & Antelis, J. (2015). Classification of
hand movements from non-invasive brain signals using lattice neural networks with dendritic
processing. In Proceedings of the 7th Mexican Conference on Pattern Recognition (MCPR)
LNCS 9116, Springer Verlag (pp. 23–32).
Ritter, G., & Beaver, T. (1999). Morphological perceptrons. In Proceedings of the International
Joint Conference on Neural Networks (IJCNN). Washington, DC, USA (vol. 1, pp. 605–610).
Ritter, G., Iancu, L., & Urcid, G. (2003). Morphological perceptrons with dendritic structure. In
Proceedings of the 12th IEEE International Conference in Fuzzy Systems (FUZZ), Saint Louis,
Missouri, USA (vol. 2, pp. 1296–1301).
Ritter, G., & Schmalz, M. (2006). Learning in lattice neural networks that employ dendritic
computing. In Proceedings of the 2006 IEEE International Conference on Fuzzy Systems
(FUZZ), Vancouver, British Columbia, Canada (vol. 1, pp. 7–13).
Ritter, G., & Urcid, G. (2007). Learning in lattice neural networks that employ dendritic
computing. Computational Intelligence Based on Lattice Theory, 67, 25–44.
Ritter, G., Urcid, G., & Valdiviezo, J. (2014). Two lattice metrics dendritic computing for pattern
recognition. In Proceedings of the 2014 IEEE International Conference on Fuzzy Systems
(FUZZ), Beijing, China (pp. 45–52).
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and
organization in the brain. Psychological Review, 65(6), 386–408.
Rosenblatt, F. (1962). Principles of neurodynamics: Perceptron and theory of brain mechanisms
(1st ed.). Washington, DC, USA: Spartan Books.
Sossa, H., & Guevara, E. (2013a). Modified dendrite morphological neural network applied to 3D
object recognition. In Proceedings of the Mexican Conference on Pattern Recognition
(MCPR), LNCS (vol. 7914, pp. 314–324).
Sossa, H., & Guevara, E. (2013b). Modified dendrite morphological neural network applied to 3D
object recognition on RGB-D data. In Proceedings of the 8th International Conference on
Hybrid Artificial Intelligence Systems (HAIS), LNAI (vol. 8073, pp. 304–313).
Sossa, H., & Guevara, E. (2014). Efficient training for dendrite morphological neural networks.
Neurocomputing, 131, 132–142.
Sossa, H., Cortés, G., & Guevara, E. (2014). New radial basis function neural network architecture
for pattern classification: First results. In Proceedings of the 19th Iberoamerican Congress on
Pattern Recognition (CIARP), Puerto Vallarta, México, LNCS (vol. 8827, pp. 706–713).
Sussner, P., & Esmi, E., (2009). An introduction to morphological perceptrons with competitive
learning. In Proceedings of the 2009 International Joint Conference on Neural Networks
(IJCNN), Atlanta, Georgia, USA (pp. 3024–3031).
Sussner, P., & Esmi, E. (2011). Morphological perceptrons with competitive learning:
Lattice-theoretical framework and constructive learning algorithm. Information Sciences, 181
(10), 1929–1950.
Vega, R., Guevara, E., Falcon, L., Sanchez, G., & Sossa, H. (2013). Blood vessel segmentation in
retinal images using lattice neural networks. In Proceedings of the 12th Mexican International
Conference on Artificial Intelligence (MICAI), LNAI (vol. 8265, pp. 529–540).
Vega, R., Sánchez, G., Falcón, L., Sossa, H., & Guevara, E. (2015). Retinal vessel extraction
using lattice neural networks with dendritic processing. Computers in Biology and Medicine,
58, 20–30.
Zamora, E., & Sossa, H. (2016). Dendrite morphological neurons trained by stochastic gradient
descent. In Proceedings of the 2016 IEEE Symposium Series on Computational Intelligence
(SSCI 2016), Athens, Greece (pp. 1–8).
Zamora, E., & Sossa, H. (2017). Dendrite morphological neurons trained by stochastic gradient
descent. Neurocomputing, 260, 420–431.
Chapter 3
Mobile Augmented Reality Prototype
for the Manufacturing of an All-Terrain
Vehicle
Keywords Mobile augmented reality · Automotive manufacturing · All-terrain vehicle · Android OS · Unity 3D · Vuforia
3.1 Introduction
Mechatronics is the science of intelligent machines; since its dissemination, it has been useful for the development of several industries such as manufacturing, robotics, and automotive (Bradley et al. 2015). In particular, the most complex innovations in the automotive industry are highly integrated mechatronic systems that include electronic, mechanical, computer, and control structures (Bradley 2010).
Fig. 3.1 Example of an ATV. a The real ATV and b the 3D model of an ATV
fusion, and no porosity; (2) specific dimensions, for example, the distance from one blast-hole to another, or the distance between two components; and (3) the correct mounting of the accessories to build the complete ATV structure. The measures and weldings are determined by the design plans, and if they are not correct, the corresponding assembly cannot be carried out.
The difficulties encountered in the assembly lines are due to human error, bad welds, or out-of-specification dimensions, which prevent the accessories from being mounted properly. Therefore, it is important to create a system to support the processes of welding inspection, measurement of critical dimensions, and accessory mounting.
The use of AR applications for industrial purposes is increasingly common, as stated in the works of Odenthal et al. (2012), Nee et al. (2012), Elia et al. (2016), Syberfeldt et al. (2017), and Palmarini et al. (2018). However, only a limited number of papers have shown the ability of AR to support processes in the manufacturing industry with promising results. Some of the papers identified in a review of the current literature are briefly discussed below.
A typical problem in operations and maintenance (O&M) practice is the col-
lection of various types of data to locate the target equipment and facilities and to
properly diagnose them at the site. In the paper of Lee and Akin (2011), an
AR-based interface for improving O&M information in terms of time spent and
steps taken to complete work orders was developed. The BACnet protocol was used
to get sensor-derived operation data in real time from building automation system
(BAS). A series of experiments was conducted to quantitatively measure
improvement in equipment O&M fieldwork efficiency by using a software proto-
type of the application. Two research and educational facilities and their heating,
ventilating, and air conditioning (HVAC) systems were used for tests: a ventilation
system and a mullion system in one facility, and an air-handling unit (AHU) in the
other facility. The verification tests consisted of the retrieval of operation data from the HVAC systems in real time and the superimposition of the 3D model of the mullion system. The results obtained show that, with the proposal, the subjects saved, on average, 51% of the time spent at the task when locating target areas, and 8% of the time at task while obtaining sensor-based performance data from the BAS.
The use of robots in the processes of a manufacturing plant is increasingly
common for handling tasks, for example, in assembly operations. The paper of
Fang et al. (2012) developed an AR system (RPAR-II) to facilitate robot pro-
gramming and trajectory planning considering the dynamic constraints of the
robots. The users are able to preview the simulated motion, perceive any possible
overshoot, and resolve discrepancies between the planned and simulated paths prior
to the execution of a task. A virtual robot model, which is a replica of the real robot, was used to perform and simulate the task planning process. A hand-held device, to which a marker-cube is attached, was used for human–robot interaction in the task and path planning processes. By means of a pick-and-place simulation, the performance of the trajectory planning and the fitness of the selected robot controller model/parameters in the robot programming process can be visually evaluated.
Because maintenance and assembly tasks can be very complex, training technicians to efficiently perform new skills is challenging. Therefore, the paper of Webel et al. (2013) presented an AR platform that directly links instructions on how to perform the service tasks to the machine parts that require processing.
platform allows showing in real time the step-by-step instructions to realize a
specific task and, as a result, accelerating the technician’s acquisition of new
maintenance procedures. The experimental task was composed of 25 steps grouped
into six subtasks to assemble an electro-mechanical actuator. Twenty technicians
with at least 2 years of experience in field assembly/disassembly operations served as participants. The sample was divided into two groups of ten participants: the control group executed the task by watching videos, while the second group used AR. The execution time of the task improved by 5%, and the effectiveness rate obtained using AR was 77%.
Maintenance is crucial in prolonging the serviceability and lifespan of the
equipment. The work of Ong and Zhu (2013) presented an AR real-time equipment
maintenance system including: (1) context-aware information to the technicians,
(2) a mobile user interface that allows the technicians to interact with the virtual
information rendered, (3) a remote collaboration mechanism that allows the expert
to create and provide AR-based visual instructions to the technicians, and (4) a bidirectional content creation tool that allows dynamic AR maintenance content to be created offline and on-site. The system was used to assist the machinists and maintenance engineers in conducting preventive and corrective computer maintenance activities. From the studies conducted, it was found that providing context-aware information to the technicians using AR technology can facilitate the maintenance workflow. In addition, allowing the remote expert to create and use AR-based visual interactions effectively enables more efficient and less error-prone remote maintenance.
For decades, machine tools have been widely used to manufacture parts for various industries, including automotive, electronics, and aerospace. Due to the pursuit of mechanical precision and structural rigidity, one of the main drawbacks in the machine tool industry is the use of traditional media, such as video and direct-mail advertising, as instructional materials. To solve this, the machine tools augmented reality (MTAR) system, for viewing machine tools from different angles with 3D demonstrations, was developed by Hsien et al. (2014). Based on markerless AR, the system can integrate real and virtual spaces using different platforms, such as a webcam, smartphone, or tablet device, without extra power or demonstration space. The clients can project the virtual information onto a real field and learn the features of the machine from different angles and aspects. The technology also provides information for area planning.
As can be observed from the literature review, most of the works use markers as the core of the AR experience; only one work implemented markerless AR. None of the papers addressed the measuring of critical dimensions and the mounting of accessories for ATV manufacturing. Welding inspection for automotive purposes was addressed by Doshi et al. (2017); however, that work checks welds only on planar panels, unlike our work, which checks welds even on irregular surfaces. On the other hand, none of the papers reviewed included a usability study such as the one presented in this chapter. Such a study is important to measure whether the system complies with its initial goal and whether it is easy to use.
Most of the reviewed applications focused on maintenance and training operations in different industries, including two works for automotive. It is important to note that all the works reviewed highlight the ability of AR to enhance some task. Motivated by the review above, the following section presents a methodology to create a MAR prototype to support the manufacturing of an ATV.
The methodology for building the MAR prototype, shown in Fig. 3.2, comprises five main stages: (1) selection of development tools, (2) selection and design of 3D models, (3) marker design, (4) development of the MAR application, and (5) graphical user interface (GUI) design. The individual stages of the methodology are explained in detail in the following subsections.
Three software packages were used to build the core of the MAR prototype. The selection was made after an exhaustive analysis of the commercial software for AR development. In the end, the packages selected were Autodesk 3DS Max, Vuforia, and Unity 3D, all of them in their free or educational versions.
3DS Max is software for graphics creation and 3D modeling developed by Autodesk that contains integrated tools for 3D modeling, animation, and rendering. One advantage is that 3DS has an educational version that includes the same functionalities as the professional version. The software was used for the creation of all the 3D models and animations of the MAR prototype (Autodesk 2017).
The Vuforia software development kit (SDK) was selected because it is a powerful platform that contains the necessary libraries to carry out the tasks related to AR, including marker detection, recognition, and tracking, and the computations for object superimposition. Nowadays, Vuforia is the world's most widely deployed AR platform (PTC Inc. 2017).
Unity is a multiplatform game engine created by Unity Technologies that offers the possibility of building 3D environments. It was selected because of the ease of controlling the content of the mobile device locally. In addition, the visual environment of the platform provides transparent integration with Vuforia. The C# language was used for the scripts that implement all the logical operations of the MAR prototype. Unity includes an integrated system that allows the creation of a GUI for execution on different platforms, including the iPhone operating system (iOS), Android, and the universal windows platform (UWP). It is compatible with 3D graphics and animations created by 3DS such as *.max, *.3ds, *.fbx, *.dae, and *.obj, among others (Unity Technologies 2017).
The integration of Unity and Vuforia is explained in Fig. 3.3. The MAR
application is fully designed in Unity including all the programming logic related to
system navigation and 3D model’s behavior. The necessary resources to create AR
are taken from Vuforia that includes administration (local, remote), detection,
recognition, and tracking of all the markers. Finally, the developer defines the 3D
models and animations associated with each marker.
Two different ATV models, known as the short chassis ATV and the large chassis ATV, were selected as the core for 3D modeling purposes. The selection was mainly due to their associated complexity of fabrication and assembly, and because both models are the most sold by the company where the MAR prototype was implemented. The short and large ATV chassis are shown in Fig. 3.4.
In addition, six different accessories, including (a) the arms of the front sus-
pension, (b) the arms of the rear suspension, (c) the tail structure (seat support),
Fig. 3.3 Unity and Vuforia integration scheme to develop the MAR prototype
Fig. 3.4 Two versions of the ATV chassis. a Short, and b large
(d) the front bumper, (e) the rear loading structure, and (f) the steering column, were selected and 3D modeled. In the real process, the accessories are added to the chassis by means of temporary mechanical joints (screws) to shape the final ATV structure. The main idea is to mount the 3D models of the accessories over the physical chassis, to observe the critical dimensions and the weldings that will be inspected and controlled in the ATV manufacturing process. The 3D models of the six selected accessories are shown in Fig. 3.5.
The 3D models of the chassis and the six accessories were originally designed by the manufacturing company in the computer-aided three-dimensional interactive application (CATIA) software, with the file extension *.CATPart. Therefore, the models were converted from CATPart to STEP format. Finally, each STEP file was opened in 3DS Max and saved as a *.max file, which is compatible with Vuforia; this allows model manipulation in the MAR prototype.
Fig. 3.5 3D models of the accessories selected. a The arms of the front suspension, b the arms of the rear suspension, c the tail structure, d the front bumper, e the rear loading structure, and f the steering column
The file in 3DS preserves the original model geometries and creates the necessary meshes with the graphics features to be projected in an AR application. In Fig. 3.6, the model of the short chassis ATV represented in 3DS is shown.
Markers, in conjunction with the programming scripts for detection and tracking, are one of the main parts of the MAR prototype. The design of the markers associated with the 3D models was made with the Brosvision AR marker generator (Brosvision 2017). The generator uses an algorithm for the creation of images with predefined patterns composed of lines, triangles, and rectangles; a unique image is created randomly, in full color or in gray scale.
In this stage, nine markers of different sizes were created for the MAR prototype. The size of each marker was defined in accordance with the physical space where it will be located. The first two markers, named Short and Max, were associated with the two chassis sizes and allow the user to virtually observe the particular size of the chassis (short or large), as shown in Fig. 3.7.
The seven remaining markers were associated with the six selected accessories mentioned in Sect. 3.4.2 and will be mounted on the real chassis to execute the superimposition of the associated 3D models. It is important to mention that, during experimentation, it was detected that for the arms of the front suspension a single marker was not sufficient to observe all the details. Therefore, an additional marker was created; one was used on the left and the other on the right side of the ATV, giving the total of seven. The 3D model of the arms of the front suspension associated with the additional marker is just a mirror of the original one. The set of seven markers associated with accessories is shown in Fig. 3.8.
It should be noted that the markers have a scale of 1:2, an adequate proportion for placing them on strategic parts of the chassis. The markers shown in Fig. 3.8a, b will be located at the left and right lower tubes of the front suspension,
Fig. 3.7 Markers associated with ATV chassis. a Short, and b Max
Fig. 3.9 Location of the markers in a short ATV. a Front section, and b rear section
respectively; the marker shown in Fig. 3.8c will be located at the support tube of the rear suspension; the marker in Fig. 3.8d at the end of the chassis; the marker in Fig. 3.8e at the steering column bracket; the marker in Fig. 3.8f at the upper chassis beam; and the marker in Fig. 3.8g at the front support. The physical location of the markers on a short chassis ATV can be observed in Fig. 3.9.
In this stage, the MAR application was developed; it is important to note that the Android operating system (OS) was selected for deployment. In the first step, it is necessary to import the markers created with Brosvision into Vuforia by means of the target manager. To do this, a database where the markers will be stored was created. The storage location can be remote (cloud) or directly on the mobile device; for this chapter, the latter was selected for convenience. After that, the markers were added to the database in joint photographic experts group (JPEG) format.
Once the database was created, each marker was subjected to an evaluation performed by Vuforia to measure its detection and tracking ability. Vuforia uses a set of algorithms to detect and track the features present in an image (marker), recognizing them by comparing these features against a local database.
A star rating is assigned to each image uploaded to the system. The rating reflects how well the image can be detected and tracked and varies from 0 to 5: the higher the rating of an image target, the stronger its detection and tracking ability. A rating of zero means that a target will not be tracked at all, while an image rated 5 is easily tracked by the AR system. The developers recommend using only image targets rated 3 stars and above. The star ratings obtained for each of the nine markers used in the MAR prototype are shown in Fig. 3.10.
It should be noted from Fig. 3.10 that all the markers used in the MAR prototype obtained a rating of at least three stars, which means that they are appropriate for AR purposes. In addition, every single marker was analyzed in detail to observe its set of traceable points (fingerprints), as shown in Fig. 3.11.
After the marker rating process, the AR scenes were created using the AR camera prefab offered by Vuforia. The camera was included by dragging it into the utilities tree. The configuration of the AR camera includes the license and the definition of the maximum number of markers to track and detect.
In a similar way to the AR camera, the markers must be added to the utilities tree. In this part, the database that contains the markers was selected, and the respective markers to detect were defined. At this point, a Unity scene is ready to detect and track the markers and display the related 3D models. The 3D models must also be imported by dragging them from their location in a local directory into the Unity interface; the models can then be observed in the assets menu. Afterward, each model was associated with a particular marker by a dragging action similar to that previously explained.
Once the main AR functionality was in place, the three experiences related to welding inspection, measuring of critical dimensions, and accessories mounting were developed.
The possibility of reviewing the product dimensions with respect to the manufacturing plans is an important activity for the quality control and safety department. The chassis is the base component on which all the accessories of the ATV will be mounted and assembled; therefore, if the dimensions of the chassis do not conform to the manufacturing plans, it cannot be assembled with the other pieces.
Currently, the review of critical dimensions and their comparison with the nominal values is carried out at the end of the production process using gauges. In addition, specialized machines such as a coordinate measuring machine (CMM) are used. However, it is not easy to see full-scale measures with gauges, and the time taken by the CMM to deliver measurement results is quite long.
The AR tool for measuring critical dimensions offers a guide to check the measures that impact the quality of the product. In the final prototype, the dimensions from one component to another, the dimensions of a manufacturing process, and the dimensions of individual components were included. The tool can also serve for the fast training of the people who work in the manufacturing of the product. In a similar way to welding inspection, 2D components for displaying information and arrows were inserted into the scene. An example of the result obtained with the measuring of critical dimensions tool is shown in Fig. 3.13. It should be noted that the information related to a particular dimension is displayed when the camera of the mobile device is pointed at one of the seven markers.
The main goal of this stage is to create an AR experience showing the real place on the chassis where the accessories will be mounted. The seven accessory markers were placed on the chassis structure at the real location of each particular component. This tool is very important for training the people who will construct the ATV in the future. In this stage, unlike welding inspection and measuring of critical dimensions, where only boxes and arrows were used, the transformation properties of the virtual 3D models inserted in the scene must be adjusted to determine the proper position and scale according to the size of the real accessories. The correct determination of the transformation properties inside a Unity scene improves the final perspective observed by the user when the application is running on the mobile device.
The transformation properties obtained for each 3D model are shown in Table 3.1. The values include the positions on the X-, Y-, and Z-axes, with the respective values for scale and orientation.
The MAR prototype is named "Welding AR" after the welding metalworking process used in ATV chassis manufacturing. The complete GUI structure can be observed in Fig. 3.15; each block corresponds to one individual scene designed in Unity.
The first scene created was the main screen, which is displayed when the Welding AR icon is tapped on the mobile device, as shown in Fig. 3.16. The scene includes buttons to display the prototype help, to close the application, and to start the main menu.
After the main scene, eight additional scenes were created regarding: (1) mode selection, (2) information, (3) the AR experience (observing the short and large chassis ATV), (4) the tools and utilities menu, (5) help for all the scenes, (6) welding inspection, (7) measuring critical dimensions, and (8) accessories mounting. All the scenes include buttons to follow the flow of the application and to return to the previous scene. Figure 3.17 shows the scenes for mode selection and tools and utilities.
Figure 3.18 shows the flow diagram that explains how the prototype delivers the AR experience. The same flow was used in the MAR prototype for welding inspection, measuring critical dimensions, and accessories mounting.
Fig. 3.17 MAR prototype scenes. a Mode selection, and b tools and utilities
The resulting application has the *.apk extension and can be shared in a virtual store such as Google Play. Once downloaded, the application is deployed on the mobile device to be used.
Two different tests were executed in order to measure and demonstrate the performance of the MAR prototype inside a real manufacturing plant. Both experiments are explained in the following subsections.
The first test consists of reviewing the range of marker detection and the behavior of the whole prototype. In this test, the measurements were obtained in the real industrial environment where the ATV is manufactured, with constant illumination of 300 lumens. Using a Bosch GLM 40 laser and a typical tape measure, the minimal and maximal distances (in centimeters) at which the seven accessory markers can be detected were obtained. The specifications of the two mobile devices used for testing are shown in Table 3.2.
The test was carried out by bringing the camera of the mobile device as close to the marker as possible and then moving the device away until the marker could no longer be detected. The results obtained from the test are shown in Table 3.3.
It should be observed from Table 3.3 that the general marker detection range is wide. In addition, even though the Galaxy S6 has better specifications, the detection range is greater with the Tab S2, which was therefore the device with the better performance. The area covered by a marker is also important for good recognition; Table 3.4 shows the information about the area covered by each marker.
It should be noted from Table 3.4 that the detection abilities are influenced by the area covered by the marker. For example, the steering_column marker is the biggest; therefore, its detection range is greater than the others'. In conclusion,
Table 3.2 Technical specifications of the mobile devices used for tests
Brand | Model | Operating system | RAM | Camera (MP)
Samsung | Galaxy Tab S2 8.0 (SM-T713) | Android 6.0.1 (Marshmallow) | 3 GB LPDDR3 | 8
Samsung | Galaxy S6 (SM-G920V) | Android 6.0.1 (Marshmallow) | 3 GB LPDDR4 | 16
the MAR prototype allows working inside a real manufacturing scenario with different devices and at different distances for pointing at the markers, with good detection and tracking.
In the second test, a questionnaire was designed to measure user satisfaction when using the MAR prototype inside the real manufacturing environment. Ten subjects participated in the survey, with ages ranging from 22 to 60 years. Nine subjects were men and one was a woman, all of them employees of the ATV manufacturing company. The sample comprised three technicians, two group chiefs, two welding engineers, one quality engineer, one supervisor, and one welder. The survey is shown in Table 3.5.
In the Likert scale used, 1 means totally disagree and 10 means totally agree. Each participant received an explanation about the purpose of the survey; after that, both devices were used to test the MAR prototype. Each user took around 15 min to test the prototype, after which the survey was filled out.
The results obtained for questions 1–7 are shown in Fig. 3.19, while the results for question 10 are shown in Fig. 3.20. For question 8, 80% of the participants responded yes. The comments included increasing the number of welds inspected, increasing the distance at which the prototype can detect the markers, and increasing the number of critical dimensions measured. Most of the participants commented on the benefits that could be obtained if the prototype were installed on AR glasses such as the Microsoft HoloLens. Finally, for question 9, 100% of the participants expressed that training a new employee using the MAR prototype would be easy and fast, mainly due to its visual and easy-to-use interface.
3.5.3 Discussion
By observing the results obtained in both experiments, it should be noted that the prototype is useful for supporting the ATV manufacturing process, including the training stage. It is important to highlight that the users demonstrated interest in using the application and enthusiasm for including it in their daily work.
Fig. 3.19 Results obtained for questions 1–7. a Question 1, b question 2, c question 3, d question 4, e question 5, f question 6, and g question 7
Effectively, the prototype helps in the programmed tasks that use AR, which include welding inspection, measuring critical dimensions, and mounting accessories.
Regarding the ability to detect the markers, it should be noted that a wide range of distances can be handled, which helps the user observe the superimposed models at different sizes and orientations. When a detailed view is necessary, the user brings the device very close; if a macro view is needed, the user moves away from the markers.
It is important to highlight the ability of the prototype to remain useful inside the real manufacturing environment, where illumination changes, noise, and occasional occlusions happen almost all the time.
With respect to the results obtained from the survey, we confirmed that users are interested in using the application. Moreover, the comments offered about improvement opportunities were very valuable for enhancing the prototype in the future. In the end, the experiments confirm the premise that AR is a valuable technological tool that can be used to support the process of manufacturing an ATV.
3.6 Conclusions
using AR glasses such as ORA Optinvent or Microsoft HoloLens, which will provide the user with total mobility of the hands. It will be important to increase the number of 3D models and include more types of ATV models. It is also necessary to increase the number of welds inspected and the number of critical dimensions to measure. Finally, it will be desirable to change the functionality of the prototype from marker-based AR to a markerless system, which will offer a more natural interface.
References
Aras, M., Shahrieel, M., Zambri, M., Khairi, M., Rashid, A., Zamzuri, M., et al. (2015). Dynamic
mathematical design and modelling of autonomous control of all-terrain vehicles (ATV) using
system identification technique based on pitch and yaw stability. International Review of
Automatic Control (IREACO), 8(2), 140–148.
Autodesk. (2017, September). 3D modeling with Autodesk, [On Line]. Available: https://www.
autodesk.com/solutions/3d-modeling-software.
Azman, M., Tamaldin, N., Redza, F., Nizam, M., & Mohamed, A. (2014). Analysis of the chassis
and components of all-terrain vehicle (ATV). Applied Mechanics and Materials, 660,
753–757.
Benham, E., Ross, S., Mavilia, M., Fescher, P., Britton, A., & Sing, R. (2017). Injuries from
all-terrain vehicles: An opportunity for injury prevention. The American Journal of Surgery,
214(2), 211–216.
Bradley, D. (2010). Mechatronics—More questions than answers. Mechatronics, 20, 827–841.
Bradley, D., Russell, D., Ferguson, I., Isaacs, J., MacLeod, A., & White, R. (2015). The Internet of
Things—The future or the end of mechatronics. Mechatronics, 27, 57–74.
Brosvision. (2017, September). Augmented reality marker generator [On Line]. Available: http://
www.brosvision.com/ar-marker-generator/.
Chatzopoulos, D., Bermejo, C., Huang, Z., & Hui, P. (2017). Mobile augmented reality survey:
From where we are to where we go. IEEE Access, 5, 6917–6950.
Doshi, A., Smith, R., Thomas, B., & Bouras, C. (2017). Use of projector based augmented reality
to improve manual spot-welding precision and accuracy for automotive manufacturing. The
International Journal of Advanced Manufacturing Technology, 89(5–8), 1279–1293.
Elia, V., Grazia, M., & Lanzilotto, A. (2016). Evaluating the application of augmented reality
devices in manufacturing from a process point of view: An AHP based model. Expert Systems
with Applications, 63, 187–197.
Fang, H., Ong, S., & Nee, A. (2012). Interactive robot trajectory planning and simulation using
augmented reality. Robotics and Computer-Integrated Manufacturing, 28(2), 227–237.
Fleming, S. (2010). All-terrain vehicles: How they are used, crashes, and sales of adult-sized
vehicles for children’s use (1st ed.). Washington D.C., USA: Diane Publishing Co.
Gattullo, M., Uva, A., Fiorentino, M., & Gabbard, J. (2015a). Legibility in industrial AR: Text
style, color coding, and illuminance. IEEE Computer Graphics and Applications, 35(2), 52–61.
Gattullo, M., Uva, A., Fiorentino, M., & Monno, G. (2015b). Effect of text outline and contrast
polarity on AR text readability in industrial lighting. IEEE Transactions on Visualization and
Computer Graphics, 21(5), 638–651.
Gavish, N., Gutiérrez, T., Webel, S., Rodríguez, J., Peveri, M., Bockholt, U., et al. (2015).
Evaluating virtual reality and augmented reality training for industrial maintenance and
assembly tasks. Interactive Learning Environments, 23(6), 778–798.
Hsien, Y., Lee, M., Luo, T., & Liao, C. (2014). Toward smart machine tools in Taiwan. IT
Professional, 16(6), 63–65.
Lee, S., & Akin, O. (2011). Augmented reality-based computational fieldwork support for
equipment operations and maintenance. Automation in Construction, 20(4), 338–352.
Lima, J., Robert, R., Simoes, F., Almeida, M., Figueiredo, L., Teixeira, J., et al. (2017). Markerless
tracking system for augmented reality in the automotive industry. Expert Systems with
Applications, 82, 100–114.
Liu, Y., Liu, Y., & Chen, J. (2015). The impact of the Chinese automotive industry: Scenarios
based on the national environmental goals. Journal of Cleaner Production, 96, 102–109.
Mota, J., Ruiz-Rube, I., Dodero, J., & Arnedillo-Sánchez, I. (2017). Augmented reality mobile app
development for all. Computers and Electrical Engineering, article in press.
Nee, A., Ong, S., Chryssolouris, G., & Mourtzis, D. (2012). Augmented reality applications in
design and manufacturing. CIRP Annals—Manufacturing Technology, 61, 657–679.
Odenthal, B., Mayer, M., KabuB, W., & Schlick, C. (2012). A comparative study of head-mounted
and table-mounted augmented vision systems for assembly error detection. Human Factors
and Ergonomics in Manufacturing & Service Industries, 24(1), 105–123.
Ong, S., & Zhu, J., (2013). A novel maintenance system for equipment serviceability
improvement. CIRP Annals—Manufacturing Technology, 62(1), 39–42.
Palmarini, R., Ahmet, J., Roy, R., & Torabmostaedi, H. (2018). A systematic review of augmented
reality applications in maintenance. Robotics and Computer-Integrated Manufacturing, 49,
215–228.
PTC Inc. (2017, September). Vuforia, [On Line]. Available: https://www.vuforia.com/.
Schoner, H. (2004). Automotive mechatronics. Control Engineering Practice, 12(11), 1343–1351.
Syberfeldt, A., Danielsson, O., & Gustavson, P. (2017). Augmented reality smart glasses in the smart factory: Product evaluation guidelines and review of available products. IEEE Access, 5, 9118–9130.
Unity Technologies. (2017, September). Unity-products, [On Line]. Available: https://unity3d.
com/es/unity.
Webel, S., Bockholt, U., Engelke, T., Gavish, N., Olbrich, M., & Preusche, C. (2013). An
augmented reality training platform for assembly and maintenance skills. Robotics and
Autonomous Systems, 61(4), 398–403.
Westerfield, G., Mitrovic, A., & Billinghurst, M. (2015). Intelligent augmented reality training for
motherboard assembly. International Journal of Artificial Intelligence in Education, 25(1),
157–172.
Williams, A., Oesch, S., McCartt, A., Teoh, E., & Sims, L. (2014). On-road all-terrain vehicle
(ATV) fatalities in the United States. Journal of Safety Research, 50, 117–123.
Yew, A., Ong, S., & Nee, A. (2016). Towards a griddable distributed manufacturing system with
augmented reality interfaces. Robotics and Computer-Integrated Manufacturing, 39, 43–55.
Chapter 4
Feature Selection for Pattern
Recognition: Upcoming Challenges
Abstract Pattern recognition is not a new field, but new challenges are arising from the data format. Today's technological devices provide a huge amount of data with extensive detail, forcing classical pattern recognition approaches to evolve in order to deal with them. Given the size of the data and the quantity of descriptors they possess, traditional pattern recognition techniques have to draw on feature selection to handle problems like excessive computational cost and high dimensionality. Feature selection techniques are evolving as well, for data-related reasons: chronologically linked data bring new challenges to the field. In the present chapter, we expose the gap in feature selection research for handling this type of data, and we give suggestions on how to perform or pursue an approach to feature selection for chronologically linked data.
4.1 Introduction
Today, in the big data era, data come in large formats, from high-definition video to interaction posts on social media, making it hard for pattern recognition algorithms to process them and make decisions. Pattern recognition aims to search data for regularities that can automatically classify (among other tasks) the data into different classes.
Let us look at this last statement through a specific problem: one of the latest smartphones tackles the unlocking function with facial recognition; since this task has to differentiate a face from a non-face image, it has to search for facial characteristics, and thus pattern recognition is involved.
Fig. 4.1 Number of publications by year related to feature selection in the last decade
¹ Definition, characteristic, variable, and description are all used as synonyms in this chapter.
Fig. 4.2 Visual representation of a classic punctual data, and b chronologically linked data
Feature selection has seen an increasing amount of research because everyday technological devices need to perform some kind of machine learning/pattern recognition task. The applications are many, and the algorithms to deal with this demanding activity are evolving, from punctual data approaches to chronologically linked data methodologies. A state-of-the-art review will be presented, divided according to the methodologies the studies are based on to perform feature selection. This survey of the state of the art is not intended to be a complete guide to performing feature selection, but rather to search for gaps in order to offer a broad panorama of how the upcoming challenges could be tackled.
Descriptive statistics make use of measures to describe data; these measures are intended to answer, among others, questions like: What kind of data are we dealing with? How spread out are the data? Location measurements intend to give an idea of the kind of data within an attribute, and dispersion measurements describe how spread out the values within an attribute are (Ayala 2015). This information is the baseline of statistical feature selection methods. Statistical feature selection compares the attributes without considering the classification algorithm, so the majority of these methods are considered to be filter feature selection methods (Chandrashekar and Sahin 2014).
There are some concepts to go through in order to facilitate the understanding of the rest of the section. These measurements are the mean, the variance, and the standard deviation.
– Mean: It is one of the most common location measurements used in statistics. It helps us to localize our data. To obtain the mean value of a variable, we use

\tilde{x} = \frac{\sum_{i=1}^{n} x_i}{n}    (4.1)

The term mean is often interchanged with average, which gives a clear intuition for the equation.
– Variance: This measurement is used in order to know the spread of the values within a variable. Thus, the variance measures how much the values of a variable differ from the mean (Myatt and Johnson 2014), and the variance formula is

s^2 = \frac{\sum_{i=1}^{n} (x_i - \tilde{x})^2}{n}    (4.2)

where n represents the total number of observations; the denominator differs (n − 1) when the formula is used on a sample.
– Standard deviation: The standard deviation is the square root of the variance and is given by

s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \tilde{x})^2}{n}}    (4.3)

where x_i is the actual data value, \tilde{x} is the mean of the variable, and n is the number of observations. The higher the value of s, the more broadly scattered the variable's data values are around the mean.
Rather than giving a complete tutorial on how each method works, we intend to dig into the methodology in order to catch any possibility of using the method with chronologically linked data.
An algorithm for feature selection on chronologically linked data (Somuano 2016).
A very ingenious statistical feature selection approach was presented in Somuano (2016), where the measures reviewed in the last section play a key role in discrimination. The author describes the feature selection algorithm with the following steps:
1. Calculate the variation coefficients from the original data with the following equation

CV_n = \frac{\sigma_n}{\mu_n}, \quad n = 1, \ldots, N    (4.4)

where cov(n, n′) is the covariance of two variables (n, n′) and is defined by

\mathrm{cov}(n, n') = \frac{1}{L\,t} \sum_{i=1}^{L\,t} (x_{i,n} - \mu_n)(x_{i,n'} - \mu_{n'})    (4.6)

where L is the total number of objects, t is the total number of observations, x_{i,n} are the data values in row i, column n, and \mu_n is the mean of feature n for the first object; x_{i,n'} are the values in row i, column n′, and \mu_{n'} is the mean of feature n′ for the second object.
Note: this approach was constructed for chronologically linked data, which means there are several observations per object; an example can be found in Table 4.1, where object1 and object2 each have three entries.
4. Find the basic matrix. The basic matrix is Boolean and is made only of basic rows from the matrix of differences (MD). A row t is basic only if there is no row p in the MD that is a sub-row of t. If p and t are two rows of the MD, p is said to be a sub-row of t if and only if:

\forall j\, (a_{p,j} = 1 \Rightarrow a_{t,j} = 1) \;\text{ and }\; \exists k\, (a_{t,k} = 1 \wedge a_{p,k} = 0)    (4.7)
5. Using the bottom-up algorithm, subsets of variables are codified. The algorithm computes the subsets of variables as n-dimensional Boolean vectors, where 0 denotes that the associated variable is not included and 1 indicates that it is included.
We can notice that this method uses chronologically linked data, as we are looking for; however, it seems to lose the sequence given by the time stamps. For that reason, we will continue exploring more possibilities.
A simple low-variance approach (Pedregosa et al. 2011).
Feature selection can be achieved using simpler descriptive statistics. Tools have been developed that perform feature selection by removing variables whose variance does not meet some threshold (Pedregosa et al. 2011). In the mentioned work, the authors treat Boolean variables as Bernoulli random variables, whose variance is given by

\mathrm{Var}(X) = p(1 - p)    (4.8)

where p is the percentage of ones (or zeros); the discrimination can be done by setting a threshold: if the variance of a feature does not meet the threshold, the feature is removed.
We can expect, in this type of approach, that features with zero variance are the first to be removed, because zero variance implies that the variable or feature has the same value across all the samples. As we can see, this kind of discrimination has nothing to do with the classification task.
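A minimal sketch of this idea, using the VarianceThreshold tool of scikit-learn (Pedregosa et al. 2011); the data set and threshold are illustrative assumptions:

from sklearn.feature_selection import VarianceThreshold

# Boolean data set: the first column is almost constant (mostly zeros).
X = [[0, 0, 1],
     [0, 1, 0],
     [1, 0, 0],
     [0, 1, 1],
     [0, 1, 0],
     [0, 1, 1]]

# Remove Bernoulli features with p(1) > 0.8 or p(0) > 0.8,
# i.e., variance below p(1 - p) = 0.8 * (1 - 0.8) = 0.16.
selector = VarianceThreshold(threshold=0.8 * (1 - 0.8))
X_reduced = selector.fit_transform(X)
print(X_reduced)  # the first, low-variance column is dropped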
Consider a discrete random variable (let us say X) and then think about the amount of information that is received when we select a specific value within it; this is called the "degree of surprise" (Bishop 2006). We start this section by presenting some basic concepts of information theory before exploring different developments that make use of it to search for a subset of features that preserves separability (feature selection).
The (linear) correlation between two variables is measured with the correlation coefficient

\rho = \frac{\sum_i (x_i - \tilde{x})(y_i - \tilde{y})}{\sqrt{\sum_i (x_i - \tilde{x})^2 \sum_i (y_i - \tilde{y})^2}}    (4.9)

where X and Y are two variables and \tilde{x} and \tilde{y} are their respective means. If two variables are entirely correlated, \rho = \pm 1, then one is redundant; thus, it could be eliminated.
Mutual information is a nonlinear correlation measure. One of its key concepts is entropy, which is defined by

H(X) = -\sum_{x} p(x) \log_2(p(x))    (4.10)

Here, variable X must have a discrete set of values with an associated probability p(x). The probability of a value is given by the frequency of that value in the sample. The entropy of X given Y is defined as

H(X \mid Y) = -\sum_{y} p(y) \sum_{x} p(x \mid y) \log_2(p(x \mid y))    (4.11)
The information and concepts presented above will give us an idea of how this family of methods works.
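To make Eqs. (4.10) and (4.11) concrete, a small Python sketch follows; the variables x and y are synthetic, and probabilities are estimated by value frequencies, as stated above:

import numpy as np
from collections import Counter

def entropy(values):
    """H(X) of Eq. (4.10): probabilities estimated from value frequencies."""
    n = len(values)
    probs = [count / n for count in Counter(values).values()]
    return -sum(p * np.log2(p) for p in probs)

def conditional_entropy(x, y):
    """H(X|Y) of Eq. (4.11): weighted entropy of x within each value of y."""
    n = len(y)
    h = 0.0
    for y_val, count in Counter(y).items():
        x_given_y = [xi for xi, yi in zip(x, y) if yi == y_val]
        h += (count / n) * entropy(x_given_y)
    return h

x = [0, 0, 1, 1, 1, 0, 1, 0]
y = [0, 0, 0, 0, 1, 1, 1, 1]
# Mutual information I(X;Y) = H(X) - H(X|Y): a simple relevance score.
print(entropy(x) - conditional_entropy(x, y))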
IGR(C, F_i) = \frac{I(C; F_i)}{H(F_i)}    (4.14)

where FS = {x_1, x_2, x_3, …, x_n} is the subset of size n and C represents the class label. H(F_i) is the entropy and can be calculated with Eq. (4.10).
3. Decompose FS using the chain rule of mutual information, given by

I(FS; C) = \sum_{i=1}^{n} I(X_i; C) - \sum_{i=2}^{n} \left[ I(X_i; FS_{1,i-1}) - I(X_i; FS_{1,i-1} \mid C) \right]    (4.15)
4. Remove the features that do not provide any added information about the class.
The significance of estimating the complete mutual information is discussed when it is employed as a feature selection criterion. Although this looks like a simple task, the authors only deploy it on a dataset of 500 samples and conclude that if the number of samples increases, the computational time will increase as well.
Feature selection based on correlation (Sridevi and Murugan 2014).
A proper classification for breast cancer diagnosis is achieved using a joint correlation–rough set feature selection method.
The proposed method combines two methods in a two-step feature selection algorithm. The first step selects a subset of features by applying the rough set feature selection algorithm to discrete data. The resultant set R1 is formed with the attribute with the highest correlation value; the process is then repeated with an attribute of average correlation (R2) and, the third time (R3), with the lowest correlated attribute. The rough set feature selection algorithm relies, for this work, on the QuickReduct algorithm, which can be consulted in Hassanien et al. (2007). The algorithm is visualized in Fig. 4.3.
Finally, the second step consists of a reselection of the R1–R3 subsets using correlation feature selection. As a conclusion, the authors affirm that this joint algorithm achieves 85% classification accuracy.
Pursuing feature selection for chronologically linked data with this family of methods implies the use of probabilistic techniques such as Markov models.
Similarity-based methods select a feature subset where the pairwise similarity can
be preserved (Liu et al. 2014). Within these methods, there are two key concepts:
Fig. 4.3 Visual description of correlation-rough set feature selection joint method proposed by
Sridevi and Murugan (2014)
(1) pairwise sample similarity and (2) local geometric structure of data. These
concepts and their theoretical support are described in the next section.
Similarity between two binary variables. The last three distance definitions are suitable for continuous variables, but for binary variables, we need a different approach. First, we have to agree on the notation: let p and q be the two binary samples, and let Table 4.2 show all the possible combinations of their values that will be used to find the distance between them.
Then, to find the distance between samples p and q, there are two measures (University of Pennsylvania State 2017): the simple matching coefficient and the Jaccard coefficient.
The simple matching coefficient (SMC) is given by

SMC = \frac{n_{1,1} + n_{0,0}}{n_{1,1} + n_{1,0} + n_{0,1} + n_{0,0}}    (4.19)

and the Jaccard coefficient, which ignores the 0–0 matches, is given by

J = \frac{n_{1,1}}{n_{1,1} + n_{1,0} + n_{0,1}}    (4.20)
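A short Python sketch of Eqs. (4.19) and (4.20) follows; the two binary samples are illustrative assumptions:

import numpy as np

def smc_and_jaccard(p, q):
    """Similarity between two binary samples per Eqs. (4.19)-(4.20)."""
    p, q = np.asarray(p), np.asarray(q)
    n11 = np.sum((p == 1) & (q == 1))
    n00 = np.sum((p == 0) & (q == 0))
    n10 = np.sum((p == 1) & (q == 0))
    n01 = np.sum((p == 0) & (q == 1))
    smc = (n11 + n00) / (n11 + n10 + n01 + n00)
    jaccard = n11 / (n11 + n10 + n01)  # ignores the 0-0 matches
    return smc, jaccard

print(smc_and_jaccard([1, 0, 0, 1, 1], [1, 1, 0, 0, 1]))  # (0.6, 0.5)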
Local geometric structure of data. Often in the literature (He et al. 2006; Liu et al. 2014), the geometric structure of data is captured with a graph, where each sample is treated as a node and an edge is placed between two samples if they are neighbors. To find out whether two nodes are neighbors, we can either use the label information (the class feature, for supervised learning) or use the k-nearest neighbor (kNN) algorithm (University of Pennsylvania State 2017). Using the kNN algorithm, we put an edge between nodes i and j if x_i and x_j are "close."
Let us take a look at the example given in Murty and Devi (2011) and its resulting graph; a small sketch of the construction follows Table 4.3. Let the training set consist of two variables, eighteen samples, and three classes, as shown in Table 4.3.
Using the label information of the class attribute in the data set, we obtain a graph that looks like the one in Fig. 4.4, where connected nodes (samples) belong to the same class.
So far, we have covered some basics of the similarity approach without doing any feature selection. Until this point, the similarity concepts that were presented just show how close the observations are to each other. Thus, the research work that follows will give an idea of how they can be used for the intended purpose.
Table 4.3 Training set for practical example (Murty and Devi 2011)
var1 var2 Class var1 var2 Class var1 var2 Class
x1 0.8 0.8 1 x2 1 1 1 x3 1.2 0.8 1
x4 0.8 1.2 1 x5 1.2 1.2 1 x6 4 3 2
x7 3.8 2.8 2 x8 4.2 2.8 2 x9 3.8 3.2 2
x10 4.2 3.2 2 x11 4.4 2.8 2 x12 4.4 3.2 2
x13 3.2 0.4 3 x14 3.2 0.7 3 x15 3.8 0.5 3
x16 3.5 1 3 x17 4 1 3 x18 4 0.7 3
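As a small illustration of the label-based graph construction described above, the following Python sketch builds the adjacency matrix for the first six samples of Table 4.3 (restricting to six samples is a simplifying assumption to keep the output small):

import numpy as np

# Features and class labels from Table 4.3 (first six samples only).
X = np.array([[0.8, 0.8], [1.0, 1.0], [1.2, 0.8],
              [0.8, 1.2], [1.2, 1.2], [4.0, 3.0]])
labels = np.array([1, 1, 1, 1, 1, 2])

# Supervised variant: connect two nodes when they share a class label.
n = len(labels)
adjacency = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(i + 1, n):
        if labels[i] == labels[j]:
            adjacency[i, j] = adjacency[j, i] = 1

print(adjacency)  # x1..x5 form one connected component; x6 is isolated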
The methods in this family assess the importance of features by their ability to preserve the data similarity between two samples (Li et al. 2018). That is, they select a subset of attributes under which the pairwise similarity can be preserved (He et al. 2006). In the face recognition example, samples of images containing a face have values close to one another, whereas samples without a face have different values. Research on the methods belonging to this type is described next.
A binary ABC algorithm based on an advanced similarity scheme for feature selection (Hancer et al. 2015).
The main goal of this research was to propose a variant of the discrete binary ABC (DisABC) algorithm for feature selection. The variant consists of introducing a differential evolution (DE)-based neighborhood mechanism into the similarity-based search of DisABC. The main steps of the research are summarized in the following list:
1. Pick three neighbor samples and call them X_{r1}, X_{r2}, and X_{r3}.
2. Compute \phi \times \mathrm{Dissimilarity}(X_{r2}, X_{r3}), where Dissimilarity(X_i, X_k) = 1 − Similarity(X_i, X_k), Similarity(X_i, X_k) represents the Jaccard coefficient defined in the previous section, and \phi is a positive random scaling factor.
3. Solve the equation

\min \left| 1 - \frac{M_{11}}{M_{11} + M_{10} + M_{01}} - \phi \, \mathrm{Dissimilarity}(X_{r2}, X_{r3}) \right|    (4.22)
where CR is the crossover rate and x_{id} represents the dth dimension of X_i.
6. Pick the better solution between X_i and U_i.
According to the authors and the results shown in this study, integrating the DE-based similarity search mechanism into the DisABC algorithm improves the algorithm's feature selection thanks to its ability to remove redundant features. The study was performed with different datasets from Asuncion (2007), where the data are punctual.
In the Laplacian score approach (He et al. 2006), the pairwise similarity is computed as

S_{ij} = e^{-\frac{\| x_i - x_j \|^2}{t}}    (4.24)

This method is a filter approach and was tested using two datasets formed of punctual data. As said before, it can be used in supervised or unsupervised approaches and, in conclusion, it is based on the observation that the local geometric structure is crucial for discrimination. Similarity-based feature selection methods can be applied to supervised or unsupervised learning.
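A minimal Python sketch of the pairwise similarity of Eq. (4.24); the samples and the bandwidth t are illustrative assumptions:

import numpy as np

def similarity_matrix(X, t=1.0):
    """Pairwise similarities S_ij = exp(-||x_i - x_j||^2 / t) of Eq. (4.24)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / t)

X = np.array([[0.8, 0.8], [1.0, 1.0], [4.0, 3.0]])
print(similarity_matrix(X))  # nearby samples get values close to 1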
Now, it is the turn of neural networks; later, we will discuss whether they can reach the goal of feature selection on chronologically linked data.
This section provides a brief review of the multi-layer perceptron (MLP) and deep belief networks (DBNs) without intending to be an extended guide (see the references for complete information). Artificial neural networks (ANNs) are believed to handle a bigger amount of data without compromising too many resources (Bishop 2006). At the end of this quick review of ANN approaches, a state of the art will be presented.
The MLP is a basic ANN structure that can have any number of layers; its configuration lies in the idea of having the outputs of one layer connected to the inputs of the next layer, with a nonlinear differentiable activation function in between. MLP ANNs are trained using several backpropagation methods (Curtis et al. 2016). The MLP is a supervised learning algorithm that learns a function f(\cdot): R^m \rightarrow R^o by training on a dataset, where m is the number of dimensions of the input and o is the number of dimensions of the output. Given a set of features X = \{x_1, x_2, \ldots, x_m\} and a target y, it can learn a nonlinear function for either classification or regression. Figure 4.5 shows a one-hidden-layer MLP (Pedregosa et al. 2011).
In Fig. 4.5, the features are represented on the left side. The hidden layer (the middle one) transforms the values from the left layer with a weighted linear summation w_1 x_1 + w_2 x_2 + \cdots + w_m x_m followed by a nonlinear activation function g(\cdot): R \rightarrow R (e.g., the hyperbolic tangent function). The output layer receives the values from the last hidden layer and transforms them into output values (Pedregosa et al. 2011).
To model a classification task using an MLP, the ANN will consist of one output neuron for each class, where a successful classification produces a much higher activation level for the corresponding class neuron (Curtis et al. 2016).
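A minimal sketch of such an MLP using scikit-learn (Pedregosa et al. 2011); the toy data set, hidden-layer size, and solver are illustrative assumptions, not choices from the cited works:

from sklearn.neural_network import MLPClassifier

# Toy training set: two features, binary target (XOR-like pattern).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One hidden layer of five units with a tanh activation, as in the
# architecture sketched around Fig. 4.5.
clf = MLPClassifier(hidden_layer_sizes=(5,), activation="tanh",
                    solver="lbfgs", max_iter=2000, random_state=1)
clf.fit(X, y)
print(clf.predict([[0, 1], [1, 1]]))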
Deep learning is a relatively new technique that has attracted wide attention (Zou et al. 2015); among its artificial intelligence techniques, we will refer here in particular to deep belief networks (DBNs). The deep-learning procedure of the DBN consists of two steps: layer-wise feature abstraction and reconstruction weight fine-tuning (Hinton 2006). In the first step, the DBN makes use of a restricted Boltzmann machine (RBM) to calculate the reconstruction weights. During the second step, the DBN performs backpropagation to fine-tune the weights obtained from the first step (Hinton 2006).
To stand on solid ground, let us consider v as the left layer (the visible layer) and h as the middle layer (the hidden one). In the DBN, all nodes are binary variables (to satisfy the Boltzmann distribution). In an RBM, there is a concept called "energy," a function of the joint configuration of the visible and hidden layers, which is defined as follows:
E(v, h; \theta) = -\sum_{i,j} W_{ij} v_i h_j - \sum_{i} b_i v_i - \sum_{j} a_j h_j    (4.28)

where \theta denotes the parameters (i.e., W, a, b); W denotes the weights between visible and hidden nodes; and a and b denote the biases of the hidden and visible layers. The joint probability of the configuration can be defined as

P_\theta(v, h) = \frac{1}{Z(\theta)} \exp(-E(v, h; \theta))    (4.29)

where Z(\theta) = \sum_{v,h} \exp(-E(v, h; \theta)) is the normalization factor. Combining the last two equations (Zou et al. 2015), we have

P_\theta(v, h) = \frac{1}{Z(\theta)} \exp\left( \sum_{i,j} W_{ij} v_i h_j + \sum_{i} b_i v_i + \sum_{j} a_j h_j \right)    (4.30)

In RBMs, the visible and hidden nodes are conditionally independent of each other; that is why the marginal distribution of v with respect to h can be defined as

P_\theta(v) = \frac{1}{Z(\theta)} \sum_{h} \exp\left( v^T W h + a^T h + b^T v \right)    (4.31)
In the second step of the DBN, backpropagation is applied to all the layers to fine-tune the weights obtained from the first step.
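To make Eq. (4.28) concrete, a small Python sketch follows; the layer sizes, random parameters, and binary states are illustrative assumptions:

import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy E(v, h; theta) of Eq. (4.28) for binary v and h."""
    return -(v @ W @ h + b @ v + a @ h)

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 3
W = rng.normal(size=(n_visible, n_hidden))
a = rng.normal(size=n_hidden)   # hidden biases
b = rng.normal(size=n_visible)  # visible biases

v = np.array([1, 0, 1, 1])
h = np.array([0, 1, 1])
print(rbm_energy(v, h, W, a, b))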
Keeping in mind that the survey done in this chapter has to lead us to techniques that can handle feature selection on chronologically linked data, we now present how recent research deals with feature selection using ANNs. In Sect. 4.3, we will present a summary of the methods and our own opinion on which approaches could handle chronologically linked data.
Deep learning-based feature selection for remote sensing scene classification (Zou et al. 2015).
According to the authors, feature selection can be achieved by keeping the most reconstructible features, since these hold the intrinsic structure of the data. They proposed a method based on DBNs with two main steps: iterative feature learning and feature selection. The details of this method are presented next.
1. Iterative feature learning. In this step, the goal is to obtain the reconstruction weights, and that can be done by removing the feature outliers. The feature outliers are those with larger reconstruction errors. They can be identified by analyzing the distribution of the reconstruction errors as the output of the following algorithm:
(1) As inputs, enter

V = \{ v_i \mid i = 1, 2, \ldots, n \}    (4.32)

which is the original input feature vector; \eta, the ratio of feature outliers; \varepsilon, the stopping criterion; and nIt, the maximum number of iterations.
(2) Iterate from j = 1 up to nIt times, stopping early once

\left| \hat{e}_{j-1} - \hat{e}_j \right| < \varepsilon    (4.33)

holds; meanwhile, obtain the weight matrix and the average error, and filter the features.
(3) Finally, obtain as output M, the final reconstruction weight matrix. The average error is computed as

\hat{e} = \frac{1}{n} \sum_{i=1}^{n} e_i    (4.34)
2. Feature selection. In this step, the weight matrix is used to choose the best features, since the outliers were eliminated in the first step. Suppose M is the reconstruction weight matrix obtained in the last step and I is an image (since this method was intended for feature selection on images) in the testing data set,
V_N^I = \{ v_i^I \mid i = 1, 2, \ldots, N \}    (4.35)

where N is the number of features extracted from I. As mentioned before, the purpose of this research is to select the features with smaller reconstruction errors, which are given by

V^I = \{ v_i^I \mid e_i^I < T_I,\; e_i^I \in E_N^I \}    (4.36)

Refer to Zou et al. (2015) for the complete set of equations. It can be seen that the main idea is to get rid of features that, after going through the ANN (in this case a DBN), are considered to have a greater error, giving us a different way to use ANNs: not as classifiers but as selectors.
Artificial neural networks (Murty and Devi 2011).
A less complicated technique is presented in Murty and Devi (2011). Using a multi-layer feed-forward network with a backpropagation learning algorithm, the authors propose the extraction of the most discriminative feature subset. First, it is proposed to set up a larger network and then start the training; as it progresses, some nodes are trimmed, taking care to adjust the remaining weights in such a way that the network performance does not worsen over the training process.
The criterion to remove a node in this approach is the following:
– A node will be removed after analyzing the increase in the error caused by removing that specific node.
The pruning problem is formulated in terms of solving a system of linear equations using an optimization technique. As in the last study (Zou et al. 2015), the data set considered for this specific study is punctual. In this and the previous studies, no chronologically linked data were used or considered during the tests.
Given its nature, sparse learning is very suitable for feature selection. For a sparse statistical model, just a relatively small number of features is important for the manifold of the data (Hastie et al. 2015). Sparse models are said to handle linked data or multi-view data (Wang et al. 2013; Tang and Liu 2014); for that reason, they will be presented in this survey.
Sparse learning aims to find a simpler model from the data; as said in Hastie et al. (2015), simplicity can be a synonym of sparsity. In order to comprehend this theme, we have to introduce some important concepts.
Consider the linear regression model

y_i = \beta_0 + \sum_{j=1}^{p} x_{ij} \beta_j + e_i    (4.37)

Because the estimates of the last equation will typically be nonzero, the interpretation of the model will be hard if p is large. Thus, in lasso or l_1-regularized regression, a regularization is introduced, and the problem is solved as follows

\underset{\beta_0, \beta}{\text{minimize}} \; \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2 \quad \text{subject to} \quad \| \beta \|_1 \le t    (4.39)

where \| \beta \|_1 = \sum_{j=1}^{p} | \beta_j | is the l_1 norm of \beta, and t is a parameter that bounds the norm and thereby controls the sparsity of the fitted parameters.
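A minimal sketch of lasso-based feature selection with scikit-learn (Pedregosa et al. 2011); note that the library solves the equivalent penalized (Lagrangian) form of Eq. (4.39), and the synthetic data and the value of alpha are illustrative assumptions:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only features 0 and 3 actually drive the response.
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=100)

# The l1 penalty drives most coefficients exactly to zero.
model = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(model.coef_)
print(model.coef_.round(2))
print("selected features:", selected)  # typically [0 3]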
So far, we have shown the introductory basis of sparse learning; next, we present a pair of studies that will give an understanding of how this method is being used.
Sparse models aim to push the feature coefficients close to zero, so that those features can be eliminated. Sparse models have been the subject of much research in recent years (Li et al. 2018). We present recent studies that will lead us toward an appropriate method to handle chronologically linked data.
Feature selection for social media data (Tang and Liu 2014).
This is a study of how social media data present new challenges to feature selection approaches. Since data in social media are multi-dimensional, some approaches for feature discrimination will not perform well. Data in social media, e.g., Twitter, have a morphological representation as seen in Fig. 4.6, where the authors explain the user–post and user–user interactions. Users in social media have two behaviors: (1) following other users, represented in Fig. 4.6 as l_i, and (2) generating posts (a post is a generalization of tweets, blogs, or pictures), represented as p_i.
Fig. 4.6 Visual representation of social media data and its matrix illustration as shown in Tang
and Liu (2014)
To model the hypotheses, the authors first introduce feature selection for punctual data based on l_{2,1}-norm regularization, which selects features across data points using

\min_{W} \; \| X^T W - Y \|_F^2 + \alpha \| W \|_{2,1}    (4.40)

where \| \cdot \|_F denotes the Frobenius norm of a matrix and the parameter \alpha controls the sparseness of W in rows. W \in R^{m \times k}, and \| W \|_{2,1} is the l_{2,1}-norm of W, defined by

\| W \|_{2,1} = \sum_{i=1}^{m} \sqrt{ \sum_{j=1}^{k} W^2(i, j) }    (4.41)
Then, the authors propose to add a regularization term to the equation of the first step, enforcing the hypothesis that the class labels of posts by the same user are similar. This is given by

\min_{W} \; \| X^T W - Y \|_F^2 + \alpha \| W \|_{2,1} + \beta \sum_{u \in U} \sum_{f_i, f_j \in F_u} \| T(f_i) - T(f_j) \|_2^2    (4.42)

The above hypothesis assumes that posts by the same user are on similar topics. In other words, posts from the same user are more similar, in terms of topics, than randomly selected posts.
Since the authors in this study work with linked data, it could be assumed that this method could handle chronologically linked data. To reinforce this thought, we present another work related to sparse learning. At this point, we can summarize the methods, the type of data they used, and whether there is a possibility to extend each model to accept and perform well with chronologically linked data (Table 4.4).
Table 4.4 Summary of all the approaches and the possibilities to adapt them to chronologically linked data

Reference | Base method | Data type used | Comments
He et al. (2006) | Similarity | Punctual | Potential to deal with chronologically linked data by introducing measures of distances between groups of objects or distributions
Murty and Devi (2011) | ANN | Punctual | No available literature suggests a possible adjustment to use chronologically linked data
Pedregosa et al. (2011) | Statistical | Punctual | Simple approach that cannot deal with chronologically linked data
Sathya and Aramudhan (2014) | Information theory | Punctual | Potential to deal with chronologically linked data using, e.g., Markov models
Sridevi and Murugan (2014) | Information theory | Punctual | Potential to deal with chronologically linked data using, e.g., Markov models
Tang and Liu (2014) | Sparse learning | Linked | Advanced research on multiple-sample objects, suitable for chronologically linked data
Hancer et al. (2015) | Similarity | Punctual | Potential to deal with chronologically linked data by introducing measures of distances between groups of objects or distributions
Zou et al. (2015) | ANN | Punctual | No available literature suggests a possible adjustment to use chronologically linked data
Somuano (2016) | Statistical | Chronologically linked | Uses chronologically linked data but needs to improve the preservation of sequence
In Sects. 4.2 and 4.3, we presented the possibilities that the different methods have to work with, or be adapted to, chronologically linked data. Here, we present the challenges that this data type represents:
• Multi-sample objects.
• Different cardinality between objects.
• The same number of attributes for all the objects.
• A time stamp available in the set.
Multi-sample objects are one of the principal characteristics, since data in this category contain multiple samples per object, and every object could contain a different number of samples. Refer to Fig. 4.2b for a visual explanation.
Today's data availability and utilization bring new challenges for pattern recognition algorithms. Feature selection aims to facilitate data visualization, improve prediction performance, and reduce storage space and training time while keeping the separability of the classes.
In contrast to static concepts, events have a dynamic character, which is represented by chronologically linked data.
As we presented in Sects. 4.2 and 4.3, work needs to be done to tackle feature selection for chronologically linked data under the following requirements:
• Supervised learning.
• Samples with different cardinality.
• Conservation of the sequence of the data.
Having a supervised learning scenario will help to evaluate the quality of the selected subset. It is very important to preserve the sequence of the data since it might undergo further processing.
4.5 Conclusions
The goal of this study was to find a gap in research that justifies a proposal for investigating feature selection on chronologically linked data. We reviewed the importance of feature selection for pattern recognition algorithms and the work done until now with punctual and linked data. We provided a quick guide of basic concepts to introduce readers to feature selection techniques. Within the introduction of concepts, we showed a representative state of the art using the categorization employed to organize the content and studies in Sect. 4.2 (state of the art). After summarizing and analyzing the state of the art, we found that the chronologically linked data problem remains unsolved. We gave some guidance on the appropriate mathematical methods to handle the type of data mentioned. Finally, we presented the challenges for feature selection techniques given the dynamic nature of events (chronologically linked data).
References
Asuncion, D. (2007). UCI machine learning repository. [online] Available at: http://archive.ics.
uci.edu/ml/index.php.
Ayala, G. (2015). Estadística Básica (1st ed.). Valencia, España: Universidad de Valencia.
Bishop, C. (2006). Pattern recognition and machine learning (1st ed.). New York, USA: Springer.
Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods. Computers &
Electrical Engineering, 40(1), 16–28.
Curtis, P., Harb, M., Abielmona, R., & Petriu, E. (2016). Feature selection and neural network
architecture evaluation for real-time video object classification. In Proceedings of 2016 IEEE
Congress on Evolutionary Computation (CEC) (pp. 1038–1045) Vancouver, British Columbia,
Canada.
Hancer, E., Xue, B., Karaboga, D., & Zhang, M. (2015). A binary ABC algorithm based on
advanced similarity scheme for feature selection. Applied Soft Computing, 36, 334–348.
Hassanien, A., Suraj, Z., Slezak, D., & Lingras, P. (2007). Rough computing: Theories,
technologies and applications (1st ed.). Hershey, Pennsylvania, USA: IGI Global.
Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical learning with sparsity: The lasso
and generalizations (1st ed.). Boca Raton, Florida, USA: CRC Press.
He, X., Cai, D., & Niyogi, P. (2006). Laplacian score for feature selection. In Proceedings of the
18th International Conference on Neural Information Processing Systems (Vol. 1, pp. 507–
514). Vancouver, British Columbia, Canada.
Hinton, G., Osindero, S., & Teh, Y. (2006). A fast learning algorithm for deep belief nets. Neural
Computation, 18(7), 1527–1554.
Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R., Tang, J., et al. (2018). Feature selection: A
data perspective. ACM Computing Surveys, 50(6), 1–45.
Liu, X., Wang, L., Zhang, J., Yin, J., & Liu, H. (2014). Global and local structure preservation for
feature selection. IEEE Transactions on Neural Networks and Learning Systems, 25(6), 1083–1095.
Murty, N., & Devi, S. (2011). Pattern recognition: An algorithmic approach. London, United
Kingdom: Springer Science & Business Media.
Myatt, G., & Johnson, W. (2014). Making sense of data I: A practical guide to exploratory data
analysis and data mining. London, United Kingdom: Wiley.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011).
Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
Sathya, R., & Aramudhan M. (2014). Feature selection based on information theory for pattern
classification. In Proceedings of 2014 International Conference on Control, Instrumentation,
Communication and Computational Technologies (ICCICCT) (Vol. 1, pp. 1233–1236).
Kanyakumari, India.
Somuano, J. (2016). Algoritmo para la selección de variables en descripciones crono-valuadas,
Ms.C. Thesis, Centro Nacional de Investigación y Desarrollo Tecnológico (CENIDET).
Sridevi, T., & Murugan, A. (2014). A novel feature selection method for effective breast cancer
diagnosis and prognosis. International Journal of Computer Applications, 88(11), 28–33.
Tang, J., & Liu, H. (2014). Feature selection for social media data. ACM Transactions on
Knowledge Discovery from Data, 8(4), 1–27.
University of Pennsylvania State. (2017). Applied data mining and statistical learning. [Online].
Available: https://onlinecourses.science.psu.edu/stat857/node/3.
Wang, H., Nie, F., & Huang, H. (2013). Multi-view clustering and feature learning via structured
sparsity. In Proceedings of the 30th International Conference on Machine Learning (ICML)
(Vol. 28, pp. 352–360). Atlanta, Georgia, USA.
Webb, A., & Copsey, K. (2011). Statistical pattern recognition (3rd ed.). London, United
Kingdom: Wiley.
Zou, Q., Ni, L., Zhang, T., & Wang, Q. (2015). Deep learning based feature selection for remote sensing
scene classification. IEEE Geoscience and Remote Sensing Letters, 12(11), 2321–2325.
Chapter 5
Overview of Super-resolution
Techniques
5.1 Introduction
The methods for super-resolution (SR) are addressed, including the definition of each method. We address the big topics of work in super-resolution: pure interpolation with high scales of amplification, the use of dictionaries, variational procedures, and the exploitation of gradient sharpening. Each section in this chapter yields a guide for the technical comprehension of each procedure. The technical procedures of the cited articles are not fully reproduced, but neither is a superficial description given without ideas for a practical realization.
The first main separation between SR methods is determined by the resources to be employed in the process. In the first case, a group of LR images is used; these procedures correspond to the first publications on the topic. In the second case, due to practical situations, the SR is carried out by using only the low-resolution input image. Figures 5.1 and 5.2 show the taxonomies of the most evident classification of the methods: multiple-image SR or single-image SR.
Within the second class of methods, we refer to the domain of application: spatial domain or frequency domain. The next proposed differentiation between SR methods is based on the mathematical models used to reach the high resolution. Transformations, probabilistic prediction, direct projection, learning dictionaries, reduction of dimension, and reconstruction models under minimization procedures and residual priors are discussed. A common goal is the incorporation of the lost high-frequency details. Finally, we propose two new methods for single-image SR: the first is based on gradient control, and the second is a hybrid method based on gradient control and total variation.
The rest of the chapter is organized as follows: In Sect. 5.2, the methods are
explained. In Sect. 5.3, the results of the proposed methods are presented. In
Sect. 5.4, the metrics used to characterize the methods are presented. Finally, the
chapter concludes in Sect. 5.5.
5.2 Methods
Down-sampling and warping are two processes considered for a more realistic representation of the image at low resolution. In the first process, the image is averaged over equal areas of size q × q, as can be seen in Eq. (5.1). In the warping process, the image is shifted along the x and y directions, where the distances a and b are in pixels. Also, a rotation θ of the image is assumed (Irani and Peleg 1990; Schultz and Stevenson 1994), as can be observed in Eq. (5.2).

g(m, n) = \frac{1}{q^2} \sum_{x = qm}^{q(m+1)-1} \; \sum_{y = qn}^{q(n+1)-1} f(x, y)    (5.1)

w \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \left( \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) \begin{bmatrix} m \\ n \\ 1 \end{bmatrix}    (5.2)
Fig. 5.3 Steps to form three LR images g1, g2, and g3 from a HR image f. Each branch represents
a different acquisition process
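A minimal Python sketch of the block averaging of Eq. (5.1); the toy image is an illustrative assumption:

import numpy as np

def downsample(f, q):
    """Block-average an HR image f over q x q areas, as in Eq. (5.1)."""
    rows, cols = f.shape
    rows, cols = rows - rows % q, cols - cols % q  # crop to a multiple of q
    f = f[:rows, :cols]
    return f.reshape(rows // q, q, cols // q, q).mean(axis=(1, 3))

f = np.arange(16, dtype=float).reshape(4, 4)  # toy HR image
print(downsample(f, 2))
# [[ 2.5  4.5]
#  [10.5 12.5]]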
where x_{tk} and y_{tk} are the displacements, q_x and q_y are the sampling rates, and θ is the rotation angle. Two acquisitions g_1 and g_2 with rotation and displacements can be related using Eq. (5.5).
The approximation of these parameters has been solved using the Taylor series representation. In a first step, sin θ and cos θ are expressed in series expansion using the first two terms:

g_2(m, n) = g_1\!\left( m + a - n\theta - \frac{m\theta^2}{2},\; n + b + m\theta - \frac{n\theta^2}{2} \right)    (5.6)

E(a, b, \theta) = \sum \left[ g_1(m, n) + \left( a - n\theta - \frac{m\theta^2}{2} \right) \frac{\partial g_1}{\partial m} + \left( b + m\theta - \frac{n\theta^2}{2} \right) \frac{\partial g_1}{\partial n} - g_2(m, n) \right]^2    (5.7)

Finally, the parameters a, b, and θ of Eq. (5.7) are determined by taking partial derivatives of the final expansion and solving the resulting equation system.
The models in the frequency domain consider sampling theory. There, a 2D array of Dirac deltas (DT) performs the sampler function. The array has the same form in the time and frequency domains (a 2D impulse train). The acquisition process multiplies the array of DT with the image in the spatial domain point by point. This operation becomes a convolution in the frequency domain. The advantage is that the resolution of the convolution kernel (the sampling array in the frequency domain, in the interval [−π, π]) can be increased for optimal scales of amplification, checking the high-frequency content at the output of the process. The Fourier transform of the sampling is shown in Eq. (5.8),

DT(\omega'_x, \omega'_y) = \frac{ \sin\!\left( \omega'_x \frac{M-1}{2} \Delta x \right) }{ \sin\!\left( \omega'_x \frac{\Delta x}{2} \right) } \cdot \frac{ \sin\!\left( \omega'_y \frac{L-1}{2} \Delta y \right) }{ \sin\!\left( \omega'_y \frac{\Delta y}{2} \right) }    (5.8)

and the convolution with the image can be expressed as in Eq. (5.9),

Samp(j_1, j_2) = \sum_{n_x = -L/2}^{L/2} \; \sum_{n_y = -M/2}^{M/2} S(n_x \Delta\omega_x, n_y \Delta\omega_y) \, DT(j_1 - n_x \Delta\omega_x + M c_x,\; j_2 - n_y \Delta\omega_y + L c_y)    (5.9)
The high-frequency content in Samp must be maximized. This strategy has been
used in (Morera 2015). Figure 5.4 shows a 1D sampling array in space and fre-
quency domains.
The wavelet transform introduces the analysis of the image generally into four fields of information. The common decomposition brings directional information about the fluctuations of the image signal. The coefficients of the transformation come in four groups: the low-frequency coefficients, which are a coarse representation of the image, and the horizontal, vertical, and diagonal coefficients, which represent details of the directional variations of the image. The most common strategy for SR using wavelets applies a non-sub-sampled (static) wavelet transform before a wavelet reconstruction; the first step produces a decomposition of four images with the same dimension as the input. Then, the wavelet reconstruction produces an amplified image with a scale factor of 2. This strategy is employed in (Morera 2014).
5.2.6 Multiple-Image SR
The main goal of this group of techniques is the simulation of the image formation process in order to reject the aliasing effects due to down-sampling. A group of acquisitions of the same scene in LR is required for the estimation of the HR image.
Iterative back-projection (IBP) methods were the first methods developed for spatial-based SR. The IBP algorithm yields the desired image by driving the reconstruction error close to zero; in other words, the IBP is convergent. Having defined an imaging model like the one given in Eq. (5.3), the distance \| Af - g \|_2^2 is minimized, where the matrix A includes the blur, down-sampling, and warping operations, f is the original HR image, and g is the observed image. An HR estimated image is generated and afterward refined. Such a guess can be obtained by registering the LR images over an HR grid and then averaging them (Irani and Peleg 1990, 1991, 1992, 1993). The iterative model given in Eq. (5.10) is used to refine the set of available LR observations. The error between the simulated LR images and the observed ones is obtained and back-projected to the coordinates of the HR image to improve the initial estimation (Irani and Peleg 1993). The Richardson iteration is commonly used in these techniques.

f^{(t+1)}(x, y) = f^{(t)}(x, y) + \frac{1}{K} \sum_{k=1}^{K} w_k^{-1} \left( \left( g_k - \hat{g}_k^{(t)} \right) \dot{d} \right) \dot{h}    (5.10)

where w_k^{-1} is the inverse of the warping operator, \dot{d} is the up-sampling operator, \dot{h} is a deblurring kernel, k = 1, …, K indexes the LR acquisitions, f^{(t+1)}(x, y) is the reconstructed SR image at the (t + 1)-th iteration, and f^{(t)}(x, y) is the reconstructed SR image at the previous (t)-th iteration. The shortcoming of this algorithm is that it produces artifacts along salient edges.
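The following Python sketch illustrates one possible reading of the IBP iteration of Eq. (5.10); the Gaussian blur, integer-shift warps, bilinear up-sampling, and the mild sharpening used in place of the deblurring kernel are all simplifying assumptions, not the operators of the cited works:

import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def ibp(lr_images, shifts, q, n_iters=20, sigma=1.0):
    """Iterative back-projection sketch of Eq. (5.10).

    lr_images: list of K registered LR observations (NumPy arrays).
    shifts: list of integer (dy, dx) warps used during acquisition.
    q: amplification factor.
    """
    # Initial HR guess: up-sample the first observation (bilinear).
    f = zoom(lr_images[0], q, order=1)
    for _ in range(n_iters):
        correction = np.zeros_like(f)
        for g, (dy, dx) in zip(lr_images, shifts):
            # Simulate the acquisition: warp, blur, down-sample.
            sim = np.roll(f, (dy, dx), axis=(0, 1))
            sim = gaussian_filter(sim, sigma)[::q, ::q]
            # Back-project the residual: up-sample, sharpen, inverse warp.
            err = zoom(g - sim, q, order=1)
            err = err + 0.5 * (err - gaussian_filter(err, sigma))
            correction += np.roll(err, (-dy, -dx), axis=(0, 1))
        f = f + correction / len(lr_images)
    return f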
The noise term in the imaging model given in Eq. (5.3) is assumed to be additive white Gaussian noise (AWGN) with zero mean and variance σ². Assuming the measurements are independent and the error between images is uncorrelated, the likelihood function of an observed LR image g_k for an estimated HR image \hat{f} (Cheeseman et al. 1994; Capel and Zisserman 1998; Elad and Hel-Or 2001; Farsiu et al. 2004; Pickup et al. 2006; Pickup 2007; Prendergast and Nguyen 2008; Jung et al. 2011a) is

p(g_k \mid \hat{f}) = \prod_{\forall m,n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{ \left( g_k - \hat{g}_k \right)^2 }{ 2\sigma^2 } \right)    (5.11)

L(g_k) = C - \frac{1}{2\sigma^2} \sum_{\forall m,n} \left( g_k - \hat{g}_k \right)^2    (5.12)
The maximum likelihood (ML) solution (Woods and Galatsanos 2005) seeks a super-resolved image \hat{f}_{ML} which maximizes the log-likelihood of all observations. Notice that after maximization, the constant term vanishes. Therefore, the super-resolved image can be obtained by maximizing Eq. (5.12) or, equivalently, by minimizing the distance between g_k and \hat{g}_k as

\hat{f}_{ML} = \arg\max_{f} L(g_k) = \arg\min_{f} \sum_{\forall m,n} \left\| g_k - \hat{g}_k \right\|_2^2    (5.13)
Given the LR images g_k, the maximum a posteriori (MAP) method (Cheeseman et al. 1994) finds an estimate \hat{f}_{MAP} of the HR image by using the Bayes rule in Eq. (5.14),

p(\hat{f} \mid g_1, g_2, \ldots, g_K) = \frac{ p(g_1, g_2, \ldots, g_K \mid f)\, p(f) }{ p(g_1, g_2, \ldots, g_K) } \propto p(g_1, g_2, \ldots, g_K \mid f)\, p(f)    (5.14)

The estimate can be found by maximizing the logarithm of Eq. (5.14). Notice that the denominator is a constant term that normalizes the conditional probability, so it vanishes after maximization; then,

\hat{f}_{MAP} = \arg\max_{f} \left( \log p(g_1, g_2, \ldots, g_K \mid f) + \log p(f) \right)    (5.15)
Applying statistical independence between the images g_k, Eq. (5.15) can be written as

\hat{f}_{MAP} = \arg\max_{f} \left( \sum_{k=1}^{K} \log(p(g_k \mid f)) + \log(p(f)) \right)    (5.16)

where

p(g_k \mid f) \propto \exp\!\left( -\frac{ \left\| g_k - \hat{g}_k \right\|^2 }{ 2\sigma_k^2 } \right)

The prior p(f) is named the regularization term. This term has been modeled in many different forms; some cases are:
1. Natural image prior (Tappen et al. 2003; Kim and Kwon 2008, 2010).
2. Stationary simultaneous autoregression (SAR) (Villena et al. 2004), which
applies uniform smoothness to all the locations in the image.
3. Non-stationary SAR (Woods and Galatsanos 2005) in which the variance of the
SAR prediction can be different from one location in the image to another.
4. Soft edge smoothness a priori, which estimates the average length of all level
lines in an intensity image (Dai et al. 2007, 2009).
5. Double-exponential Markov random field, which is simply the absolute value
of each pixel value (Debes et al. 2007).
6. Potts–Strauss MRF (Martins et al. 2007).
7. Non-local graph-based regularization (Peyre et al. 2008).
8. Corner and edge preservation regularization term (Shao and Wei 2008).
9. Multi-channel smoothness a priori which considers the smoothness between
frames (temporal residual) and within frames (spatial residual) of a video
sequence (Belekos et al. 2010).
10. Non-local self-similarity (Dong et al. 2011).
11. Total subset variation, which is a convex generalization of the total variation
(TV) regularization strategy (Kumar and Nguyen 2010).
12. Mumford–Shah regularization term (Jung et al. 2011b).
13. Morphological-based regularization (Purkait and Chanda 2012).
14. Wavelet-based (Li et al. 2008; Mallat and Yu 2010).
5.2.7 Single-Image SR
The concept of geometric duality is one of the most useful tools in parametric SR with least-squares estimation for interpolation, and one of the most cited algorithms in comparisons of SR methods is new edge-directed interpolation (NEDI) (Li and Orchard 2001).
The idea behind it is that each low-resolution pixel also exists in the HR image, while its neighbor pixels are unknown. Hence, with two orthogonal pairs of directions around the low-resolution pixel in the HR image (horizontal, vertical, and diagonal directions), a least-squares estimation can be used on each pair. The equation system is constructed on the LR image, and then the coefficients are used to estimate pixels in the initial HR image. The first estimation is made by using Eq. (5.17),

\hat{Y}_{2i+1, 2j+1} = \sum_{k=0}^{1} \sum_{l=0}^{1} \alpha_{2k+l} Y_{2(i+k), 2(j+l)}    (5.17)
where the coefficients are obtained in the same configuration as in the LR image. In this case, the unknown pixels between LR pixels that exist in the HR image in the vertical and horizontal directions are estimated. In the next step, the unknown pixels between LR pixels that exist in the HR image in the diagonal directions are estimated. The pixels of each category are shown in Fig. 5.5.
Zhang and Wu (2008) take advantage of NEDI. There, a new restriction is applied that includes the estimated pixels in the second step, and the least-squares estimation is made using the 8-connected pixels around a central pixel in the diamond configuration shown in Fig. 5.6. They define a 2D piecewise autoregressive model,
Fig. 5.5 Array of pixels in the initial HR image for NEDI interpolation. Black pixels are the LR
pixels used to calculate the HR gray pixels. The white pixels are calculated using the white and the
black pixels
where x_i and y_i are the LR and HR pixels, respectively, x^{(8)}_{i \diamond k} are the four 8-connected LR neighbors available for a missing y_i pixel, and y^{(8)}_{i \diamond k} denotes the four missing 8-connected HR pixels of an x_i pixel.
Fig. 5.6 a Spatial configuration for the known and missing pixels and b the parameters used to
characterize the diagonal, horizontal, and the vertical correlations (Zhang and Wu 2008)
Another approach to NEDI algorithms (Ren et al. 2006; Hung and Siu 2012) uses a weighting matrix W to assign different influences to the neighbor pixels on the pixel under estimation. The correlation is affected by the distance between pixels. The diagonal correlation model parameter is estimated by using a weighted least-squares strategy:

A = \left( L_{LA}^{T} W L_{LA} \right)^{-1} L_{LA}^{T} W L    (5.19)
Fig. 5.7 Projection of an input image using two external LR–HR dictionaries
Fig. 5.8 Low-resolution input image and a pair of LR–HR dictionary images
X = W \Lambda W^{T}    (5.22)

The dictionaries at high and low resolution, U_h and U_l, are used to find the minimum distance in a projection over the found eigenspace:

D_h = U_h^{k} W_h    (5.23)

In the dictionary search, the patches represent rows or columns of the data matrix U_h or U_l. The strategy is to find the position of the HR patch with minimum distance with respect to the projection of an LR patch in the HR eigenspace:

ph(pos) = \min_{\hat{v}_{l,j}} \left\| D_h^{T} \hat{U}_h^{k} - D_h^{T} \hat{v}_{l,j} \right\|_2, \quad \hat{v}_{l,j} \in \hat{U}_l^{k}    (5.24)
5.2.7.3 Diffusive SR
Perona and Malik (1990) developed a method that employs a diffusion equation for the reconstruction of the image. The local context of the image is processed using a function that restores the edges:

\frac{\partial I}{\partial t} = \mathrm{div}(c \nabla I) = \frac{\partial}{\partial x}(c I_x) + \frac{\partial}{\partial y}(c I_y)    (5.25)

\nabla_N I_{i,j} \equiv I_{i-1,j} - I_{i,j}, \quad \nabla_S I_{i,j} \equiv I_{i+1,j} - I_{i,j}, \quad \nabla_E I_{i,j} \equiv I_{i,j+1} - I_{i,j}, \quad \nabla_W I_{i,j} \equiv I_{i,j-1} - I_{i,j}    (5.26)

c_{N_{i,j}}^{t} = g\!\left( \left| (\nabla I)_{i+(1/2),j}^{t} \right| \right), \quad c_{S_{i,j}}^{t} = g\!\left( \left| (\nabla I)_{i-(1/2),j}^{t} \right| \right), \quad c_{E_{i,j}}^{t} = g\!\left( \left| (\nabla I)_{i,j+(1/2)}^{t} \right| \right), \quad c_{W_{i,j}}^{t} = g\!\left( \left| (\nabla I)_{i,j-(1/2)}^{t} \right| \right)    (5.27)

I_{i,j}^{(t+1)} = I_{i,j}^{(t)} + \lambda \left[ c_N \nabla_N I + c_S \nabla_S I + c_E \nabla_E I + c_W \nabla_W I \right]_{i,j}^{(t)}    (5.28)
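A minimal Python sketch of the explicit diffusion of Eqs. (5.26)–(5.28); the exponential conduction function g and the parameter values are illustrative assumptions:

import numpy as np

def perona_malik(I, n_iters=20, lam=0.2, kappa=15.0):
    """Anisotropic diffusion following Eqs. (5.26)-(5.28)."""
    I = I.astype(float).copy()
    for _ in range(n_iters):
        # Four directional differences, Eq. (5.26).
        dN = np.roll(I, 1, axis=0) - I
        dS = np.roll(I, -1, axis=0) - I
        dE = np.roll(I, -1, axis=1) - I
        dW = np.roll(I, 1, axis=1) - I
        # Conduction coefficients g(|grad I|), Eq. (5.27); this g preserves edges.
        cN = np.exp(-(dN / kappa) ** 2)
        cS = np.exp(-(dS / kappa) ** 2)
        cE = np.exp(-(dE / kappa) ** 2)
        cW = np.exp(-(dW / kappa) ** 2)
        # Explicit update, Eq. (5.28).
        I += lam * (cN * dN + cS * dS + cE * dE + cW * dW)
    return I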
This principle has served as a guide for locally adaptive processing in image processing algorithms, which adapt to the local context of the image.
5.2.7.4 TFOCS

TFOCS (Becker et al. 2011) solves convex problems of the form

\min \; \phi(x) \triangleq f(A(x) + b) + h(x)    (5.29)

where the function f is smooth and convex, h is convex, A is a linear operator, and b is a bias vector. The function h also must be prox-capable; in other words, it must be inexpensive to compute its proximity operator of Eq. (5.30),

\Phi_h(x, t) = \arg\min_{z} \; h(z) + \frac{1}{2t} \langle z - x, z - x \rangle    (5.30)

A typical instance is the regularized least-squares problem

\min \; \frac{1}{2} \| Ax - b \|_2^2 + h(x)    (5.32)
2
The library was employed in Ren et al. (2017) for the minimization of an estimation function in which two priors are employed: the first, a differential with respect to a new estimation based on the total variation of a central patch with respect to a search window, called adaptive high-dimensional non-local total variation (AHNLTV); and the second, a weighted adaptive geometric duality (AGD). Figure 5.9 shows the visual comparison between bicubic interpolation and the AHNLTV-AGD method after HR image estimation.
Fig. 5.9 Visual comparison of the HR image using a bicubic interpolation and b AHNLTV-AGD
method
where ∇ is the gradient operator. The TV term can be weighted with an adaptive spatial algorithm based on differences in the curvature. For example, the bilateral total variation (BTV) (Farsiu et al. 2003) is used to approximate TV, and it is defined in Eq. (5.34),

\rho(f) = \sum_{k=0}^{P} \sum_{l=0}^{P} \alpha^{|k| + |l|} \left\| f - S_x^k S_y^l f \right\|_1    (5.34)

where S_x^k and S_y^l shift f by k and l pixels in the x and y directions to present several scales of derivatives, 0 < α < 1 imposes a spatial decay on the results (Farsiu et al. 2003), and P is the scale at which the derivatives are calculated, so derivatives are computed at multiple scales of resolution (Farsiu et al. 2006). In (Wang et al. 2008), the authors discuss that this a priori term generates saturated data if it is applied to unmanned aerial vehicle data. Therefore, it has been suggested to combine it with the Huber function, resulting in the BTV-Huber of Eq. (5.35),

\rho(|x|) = \begin{cases} \dfrac{ | \nabla x |^2 }{2}, & \text{if } | \nabla x | < \alpha \\ \alpha | \nabla x | - \dfrac{\alpha^2}{2}, & \text{otherwise} \end{cases}    (5.35)
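A minimal Python sketch of the BTV cost of Eq. (5.34); the non-negative shift range and the skipped k = l = 0 term follow the summation limits as written above, which is an assumption:

import numpy as np

def btv(f, P=2, alpha=0.7):
    """Bilateral total variation of Eq. (5.34) (sketch)."""
    cost = 0.0
    for k in range(P + 1):
        for l in range(P + 1):
            if k == 0 and l == 0:
                continue  # zero shift contributes nothing
            shifted = np.roll(np.roll(f, k, axis=1), l, axis=0)  # S_x^k S_y^l f
            cost += alpha ** (k + l) * np.abs(f - shifted).sum()
    return cost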
Gradients are a topic of interest in SR. The changes in the image are fundamental evidence of its resolution, and high-frequency content brings the maximal changes of values between consecutive pixels in the image. Gradient management has been addressed in two forms: first, by using a dictionary of external HR gradients, and second, by working directly on the LR image and reconstructing the HR gradients from the context of the image and regularization terms.
In these methods (Sun et al. 2008; Wang et al. 2013), a relationship is established in order to sharpen the edges. In the first case, the gradients of an external HR database are analyzed and, with a dictionary technique, the gradients of the LR input image are reconstructed. In the second case, the technique does not require external dictionaries: the procedure is guided by the second derivative of the same LR image amplified using pure interpolation; then, a gradient scale factor extracted from the local characteristics of the image is incorporated.
In this chapter, we propose a new algorithm for gradient management and its application in a novel SR procedure, in which a bidirectional and orthogonal gradient field is employed. In our algorithm, two new procedures are proposed. In the first, the gradient field employed is calculated as

\nabla u_h^T = I_{uh} * \frac{1}{2} \begin{bmatrix} -1 & -1 & -1 \\ -1 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix}    (5.36)
Then, the procedure is integrated as shown in Fig. 5.10; for a deeper understanding, refer to (Wang et al. 2013).
Fig. 5.10 Overview of the proposed SR algorithm. First, two orthogonal and directional HR gradients as well as a displacement field
The second form of our procedure is the application of the gradient fields with independent branches. That is, the gradient fields are calculated by convolving the image with the discrete gradient operators of Eq. (5.37) to obtain the differences along the diagonal directions. The resulting model is shown in Fig. 5.11.
Fig. 5.11 Bidirectional and orthogonal gradient management with independent branches

\frac{1}{2} \begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad \frac{1}{2} \begin{bmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}    (5.37)
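A minimal Python sketch of the two independent diagonal gradient fields of Eq. (5.37); the sign placement of the kernels is an assumption:

import numpy as np
from scipy.ndimage import convolve

# Discrete operators of Eq. (5.37): differences along the two diagonals.
K1 = 0.5 * np.array([[-1, 0, 0],
                     [ 0, 0, 0],
                     [ 0, 0, 1]])
K2 = 0.5 * np.array([[ 0, 0, -1],
                     [ 0, 0,  0],
                     [ 1, 0,  0]])

def diagonal_gradients(I):
    """Two independent, orthogonal diagonal gradient fields."""
    return convolve(I, K1, mode="nearest"), convolve(I, K2, mode="nearest")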
This section proposes the integration of two powerful tools for SR: TV and gradient control. In the proposed case, the gradient regularization is applied first, using the model proposed in Sect. 5.2.7.6. That technique produces some artifacts when the amplification scale is high and the regularization term takes high values. This problem is addressed by TV, also described previously; this algorithm averages similar pixels around the image for the estimation of the high resolution. Here, the two characteristics can collaborate for a better result.
The general procedure of the proposed method is shown in Fig. 5.12, and the visual comparison between the LR image and the HR image is shown in Fig. 5.16. The proposed new algorithm is named orthogonal and directional gradient management with bilateral total variation (ODGM-BTV). It is only an illustration of the multiple possibilities for the creation of SR algorithms.
In this section, the results of the proposed methods are illustrated. Experiments on test and real images are presented with scaling factors of 2×, 3×, and 4×. The objective metrics used were the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM); the results are given in Tables 5.1, 5.2, and 5.3. The subjective performance of our SR schemes is evaluated in Figs. 5.13, 5.14, 5.15, and 5.16.
Table 5.1 SSIM/PSNR comparison for multi-directional, diagonal, horizontal, and vertical gradient management with a scale of 2×
Image | Decoupled total gradient | Diagonal gradient | Horiz–vert gradient | Total gradient
Bike 0.6509/22.399 0.6547/22.321 0.6346/21.988 0.6507/22.026
Butterfly 0.7714/22.360 0.7714/22.580 0.7564/22.177 0.7634/22.237
Comic 0.5515/20.694 0.5540/20.628 0.5366/20.348 0.5633/20.697
Flower 0.7603/23.964 0.7553/26.221 0.7467/25.225 0.7691/24.710
Hat 0.8165/26.025 0.8163/25.888 0.8126/25.964 0.8168/26.284
Parrot 0.8648/27.378 0.8604/26.70 0.8602/26.594 0.8641/25.876
Parthenon 0.6585/20.632 0.6593/20.593 0.6451/20.371 0.6616/21.540
Plants 0.8480/23.298 0.8467/23.521 0.8418/28.645 0.8546/26.458
Table 5.2 SSIM/PSNR comparison for multi-directional, diagonal, horizontal, and vertical gradient management with a scale of 3×
Image | Decoupled total gradient | Diagonal gradient | Horiz–vert gradient | Total gradient
Bike 0.6103/21.83 0.5950/21.229 0.5956/21.405 0.5854/21.068
Butterfly 0.7345/22.23 0.7174/21.121 0.7199/21.124 0.7065/20.714
Comic 0.5074/20.12 0.4812/18.789 0.4985/19.859 0.4999/19.925
Flower 0.7275/24.37 0.7079/25.200 0.7185/25.007 0.7242/24.678
Hat 0.7996/26.43 0.7939/26.757 0.7965/25.967 0.7904/26.129
Parrot 0.8491/26.73 0.8363/25.901 0.8486/26.307 0.8371/24.253
Parthenon 0.6243/21.70 0.6091/21.777 0.6151/20.868 0.6137/22.001
Plants 0.8256/23.19 0.8103/23.372 0.8252/28.664 0.8248/26.432
Table 5.3 SSIM/PSNR comparison for multi-directional, diagonal, horizontal, and vertical gradient management with a scale of 4×
Image | Decoupled total gradient | Diagonal gradient | Horiz–vert gradient | Total gradient
Bike 0.5498/20.9911 0.5197/19.3258 0.5368/20.6564 0.5325/20.7127
Butterfly 0.6806/20.7812 0.6532/19.0710 0.6552/18.0756 0.6613/19.9920
Comic 0.4451/19.4540 0.4156/18.0424 0.4363/19.0990 0.4417/19.3933
Flower 0.6713/24.5320 0.6440/24.0778 0.6644/24.5192 0.6697/24.0886
Hat 0.7723/26.4887 0.7633/26.2439 0.7684/26.2650 0.7673/26.1693
Parrot 0.8214/25.5286 0.8026/24.5686 0.8209/24.8905 0.8115/23.5459
Parthenon 0.5844/22.4052 0.5626/21.9254 0.5745/21.5580 0.5762/21.7595
Plants 0.7837/22.3791 0.7599/22.5480 0.7905/27.7839 0.7901/26.5721
Fig. 5.13 4× amplification factor using a test image with a diagonal, b horiz–vert, c coupled, and d decoupled gradients
In these experiments, the group of images shown in Fig. 5.15, included in the BSDS500 database, was used. The amplification factors were 2×, 3×, and 4×. Tables 5.1, 5.2, and 5.3 show the increment in PSNR and SSIM of the second proposed alternative, with independence of the two gradient fields. Also, the test image was used to observe the sharpening effect around contours, and the results are shown in Fig. 5.13. Figure 5.14 shows the plot of row 60, taken from the test image of Fig. 5.13, to illustrate the edge transitions of the recovered HR image.
Fig. 5.14 Slopes of the estimated HR image (row 60 of the test image in Fig. 5.13). The image was processed using the two proposed algorithms with two orthogonal directions of the slopes independently
Fig. 5.15 Processed images with the decoupled gradient algorithm. The scale factors are 4× for the top row of images, 3× for the second row, and 2× for the bottom row
Fig. 5.16 Application of SR using the hybrid BTV and gradient management strategy with a scale of amplification of q = 4: a low-resolution image and b application of ODGM-BTV
Figure 5.16 shows the result of the proposed ODGM-BTV method using a scale of amplification of 4×.
Algorithm:
Input: LR image, iteration number
For i = 1 to iteration number:
1. Apply the BTV algorithm to the LR input image.
2. Apply the bidirectional orthogonal gradient management.
3. Update the LR input image with the HR output image.
End
Output: HR image
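A skeleton of the loop above; btv_step and gradient_management_step are hypothetical placeholders for the two procedures described earlier in this chapter, not implementations of them:

def odgm_btv(lr_image, n_iters, q, btv_step, gradient_management_step):
    """Skeleton of the hybrid ODGM-BTV loop; the two steps are passed in
    as callables since their full definitions are given elsewhere."""
    image = lr_image
    for _ in range(n_iters):
        image = btv_step(image, q)               # step 1: BTV regularization
        image = gradient_management_step(image)  # step 2: ODGM sharpening
        # step 3: the output becomes the input of the next iteration
    return image  # HR image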
5.4 Metrics
The PSNR in dB of Eq. (5.38) and the SSIM of Eq. (5.39) are the metrics most used to evaluate SR algorithms.

PSNR = 10 \log_{10} \frac{ v_{max}^2 }{ MSE(x, y) }    (5.38)

where x and y are the two signals to compare, MSE(x, y) is the mean square error, and v_{max} is the maximum possible value in the range of the signals. The SSIM factor (Wang et al. 2004) is calculated as

SSIM(x, y) = \frac{ \left( 2 \mu_x \mu_y + c_1 \right) \left( 2 \sigma_{xy} + c_2 \right) }{ \left( \mu_x^2 + \mu_y^2 + c_1 \right) \left( \sigma_x^2 + \sigma_y^2 + c_2 \right) }    (5.39)

where \mu_x and \mu_y are the mean values of x and y; \sigma_x^2, \sigma_y^2, and \sigma_{xy} are the variances and covariance of x and y; and c_1 and c_2 are constant terms. Another metric derived from the SSIM is the mean SSIM (MSSIM) of Eq. (5.40),

MSSIM = \frac{1}{M} \sum_{j=1}^{M} SSIM(x_j, y_j)    (5.40)
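A minimal Python sketch of Eqs. (5.38) and (5.39); the single-window (global) SSIM and the constants c1 and c2, taken from common practice, are illustrative assumptions:

import numpy as np

def psnr(x, y, v_max=255.0):
    """PSNR in dB, Eq. (5.38)."""
    mse = np.mean((x.astype(float) - y.astype(float)) ** 2)
    return 10.0 * np.log10(v_max ** 2 / mse)

def ssim(x, y, v_max=255.0):
    """Global SSIM of Eq. (5.39), computed over the whole image."""
    c1, c2 = (0.01 * v_max) ** 2, (0.03 * v_max) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))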
Tables 5.1, 5.2, and 5.3 show an enhancement of the quality parameters SSIM and
PSNR of our proposed method over the management of a single gradient. Also, the
scales of amplification are greater than 3 with major increments of the quality
factors for high scale factors. Our procedure employs a natural following of the
gradients, and let to give a more precise dimension of the slopes, it is an important
contribution to the state of the art of the algorithms of gradient management.
The goal of our chapter is an overview of complements for super-resolution, not the contribution of a novel algorithm or an improvement of the results in the state of the art. Even so, the overview shows that SR is a very rich field of investigation. In each step of the process, there is a possibility of applying the method with the strongest principle of functioning. An example is the combination of BTV and ODGM: the visual effect in Fig. 5.16 is very interesting, and the higher resolution per area can be observed. The contribution in this case avoids artifacts from gradient management and, at the same time, obtains a less blurred image than the BTV method alone, due to the sharpening procedure applied over the edges.
The review of the literature leads to some conclusions. Research on this topic is extensive, and the contributions to the state of the art are, in most cases, small changes over well-known procedures. Unfortunately, progress is judged by quality measurements, and the benchmarks that guide the results are based on different configurations of Eqs. (5.38), (5.39), and (5.40). The consequence is that the comparison between many reported algorithms and models is difficult and not always possible. At this point, the borders between classifications of the methods become blurred; for this reason, a direct comparison between methods in an overview is more useful than attempts at classification and explanations of the classification. Moreover, the great creativity exhibited in the different methods, and their totally different mathematical solutions, make it difficult to establish mathematical comparisons and objective conclusions beyond empirical results based on measurement metrics.
5.5 Conclusions
SR is an exciting and diverse subject in the digital image processing area, and it takes many forms. Each algorithm has a place in this area of research, which is extremely complex and comprehensive. The study of these techniques should be guided from the beginning, because the development of each of them is broad and difficult to reproduce, and often only a small advance can be made in any one of them. Moreover, the initial conditions are different in each case, so common bases of comparison are required. Some standard measurements are proposed in the literature, but the conditions of application are diverse. A useful strategy for approaching SR research is to understand the rationale behind preexisting algorithms. Their advantages and disadvantages are important factors to consider in order to combine characteristics that produce more convincing effects and better quality of the output image in a system. The example proposed here combines edge sharpening with averaging for the estimation; the first method produces artifacts, while the second fails to produce clear edges. A case was proposed in which these two characteristics can be positively complemented. As future work, we will continue the study of the multiple possibilities in the field of SR estimation using transformations of the image and learning from different characterizations, such as wavelet fluctuations with dictionary learning. Another interesting field is the minimization procedures for multiple residual priors in the estimation, as was done in works such as Ren et al. (2017).
References
Becker, S., Candès, E., & Grant, M. (2011). Templates for convex cone problems with
applications to sparse signal recovery. Mathematical Programming Computation, 3, 165–218.
Belekos, S., Galatsanos, N., & Katsaggelos, A. (2010). Maximum a posteriori video
super-resolution using a new multichannel image prior. IEEE Transactions on Image
Processing, 19(6), 1451–1464.
Capel, D., & Zisserman, A. (1998). Automated mosaicing with super-resolution zoom. In
Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern
Recognition (CVPR), Santa Barbara, California, USA, 1, 885–891.
Cheeseman, P., Kanefsky, B., Kraft, R., Stutz, J., & Hanson, R. (1994). Super-resolved surface
reconstruction from multiple images (1st ed.). London, United Kingdom: Springer Science +
Business Media.
Dai, S., Han, M., Xu, W., Wu, Y., & Gong, Y. (2007). Soft edge smoothness prior for alpha
channel super resolution. In Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), Minneapolis, Minnesota, USA, 1, 1–8.
Dai, S., Han, M., Xu, W., Wu, Y., Gong, Y., & Katsaggelos, A. (2009). SoftCuts: a soft edge
smoothness prior for color image super-resolution. IEEE Transactions on Image Processing,
18(5), 969–981.
Debes, C., Wedi, T., Brown, C., & Zoubir, A. (2007). Motion estimation using a joint optimisation
of the motion vector field and a super-resolution reference image. In Proceedings of IEEE
International Conference on Image Processing (ICIP), San Antonio, Texas, USA, 2, 479–500.
Dong, W., Zhang, L., Shi, G., & Wu, X. (2011). Image deblurring and super-resolution by
adaptive sparse domain selection and adaptive regularization. IEEE Transactions on Image
Processing, 20(7), 1838–1856.
Elad, M., & Hel-Or, Y. (2001). A fast super-resolution reconstruction algorithm for pure
translational motion and common space-invariant blur. IEEE Transactions on Image
Processing, 10(8), 1187–1193.
Farsiu, S., Robinson, D., Elad, M., & Milanfar, P. (2003). Robust shift and add approach to
super-resolution. In Proceedings of SPIE Conference on Applications of Digital Signal and
Image Processing, San Diego, California, USA, 1, 121–130.
Farsiu, S., Robinson, D., Elad, M., & Milanfar, P. (2004). Fast and robust multi-frame
super-resolution. IEEE Transactions on Image Processing, 13(10), 1327–1344.
Farsiu, S., Elad, M., & Milanfar, P. (2006). A practical approach to super-resolution. In
Proceedings of SPIE Conference on Visual Communications and Image Processing, San Jose,
California, USA, 6077, 1–15.
Huang, K., Hu, R., Han, Z., Lu, T., Jiang, J., & Wang, F. (2011). A face super-resolution method
based on illumination invariant feature. In Proceedings of IEEE International Conference on
Multimedia Technology (ICMT), Hangzhou, China, 1, 5215–5218.
Hung, K., & Siu, W. (2012). Robust soft-decision interpolation using weighted least squares. IEEE
Transactions on Image Processing, 21(3), 1061–1069.
Irani, M., & Peleg, S. (1990). Super-resolution from image sequences. In Proceedings of 10th
IEEE International Conference on Pattern Recognition, Atlantic City, New Jersey, USA,
1, 115–120.
Irani, M., & Peleg, S. (1991). Improving resolution by image registration. CVGIP Graphical
Models and Image Processing, 53(3), 231–239.
Irani. M., & Peleg, S. (1992). Image sequence enhancement using multiple motions analysis. In
Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern
Recognition (CVPR), Champaign, Illinois, USA, 1, 216–222.
Irani, M., & Peleg, S. (1993). Motion analysis for image enhancement: Resolution, occlusion, and
transparency. Journal of Visual Communication and Image Representation, 4(4), 324–335.
Jung, C., Jiao, L., Liu, B., & Gong, M. (2011a). Position-patch based face hallucination using
convex optimization. IEEE Signal Processing Letters, 18(6), 367–370.
Jung, M., Bresson, X., Chan, T., & Vese, L. (2011b). Nonlocal Mumford-Shah regularizers for
color image restoration. IEEE Transactions on Image Processing, 20(6), 1583–1598.
Keren, D., Peleg, S., & Brada, R. (1998). Image sequence enhancement using subpixel
displacements. In Proceedings of IEEE Computer Society Conference on Computer Vision and
Pattern Recognition (CVPR), Ann Arbor, Michigan, USA, 1, 742–746.
Kim, K., & Kwon, Y. (2008). Example-based learning for single-image super-resolution. Pattern
Recognition, LNCS, 5096, 456–465.
Kim, K., & Kwon, Y. (2010). Single-image super-resolution using sparse regression and natural
image prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6),
1127–1133.
Kong, D., Han, M., Xu, W., Tao, H., & Gong, Y. (2006). A conditional random field model for
video super-resolution. In Proceedings of 18th International Conference on Pattern
Recognition (ICPR), Hong Kong, China, 1, 619–622.
Kumar, S., & Nguyen, T. (2010). Total subset variation prior. In Proceedings of IEEE
International Conference on Image Processing (ICIP), Hong Kong, China, 1, 77–80.
Li, X., & Orchard, M. (2001). New edge-directed interpolation. IEEE Transactions on Image
Processing, 10(10), 1521–1527.
Li, F., Jia, X., & Fraser, D. (2008). Universal HMT based super resolution for remote sensing
images. In Proceedings of 15th IEEE International Conference on Image Processing (ICIP),
San Diego, California, USA, 1, 333–336.
Li, X., Lam, K., Qiu, G., Shen, L., & Wang, S. (2009). Example-based image super-resolution
with class-specific predictors. Journal of Visual Communication and Image Representation, 20
(5), 312–322.
Li, X., Hu, Y., Gao, X., Tao, D., & Ning, B. (2010). A multi-frame image super-resolution
method. Signal Processing, 90(2), 405–414.
Liu, C., & Sun, D. (2011). A Bayesian approach to adaptive video super resolution. In
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
Colorado Springs, Colorado, USA, 1, 209–216.
Mallat, S., & Yu, G. (2010). Super-resolution with sparse mixing estimators. IEEE Transactions
on Image Processing, 19(11), 2889–2900.
Martins, A., Homem, M., & Mascarenhas, N. (2007). Super-resolution image reconstruction using
the ICM algorithm. In Proceedings of IEEE International Conference on Image Processing
(ICIP), San Antonio, Texas, USA, 4, 205–208.
Mochizuki, Y., Kameda, Y., Imiya, A., Sakai, T., & Imaizumi, T. (2011). Variational method for
super-resolution optical flow. Signal Processing, 91(7), 1535–1567.
Morera, D. (2015). Determining parameters for images amplification by pulses interpolation.
Ingeniería Investigación y Tecnología, 16(1), 71–82.
Morera, D. (2014). Amplification by pulses interpolation with high frequency restrictions for
conservation of the structural similitude of the image. International Journal of Signal
Processing, Image Processing and Pattern Recognition, 7(4), 195–202.
Omer, O., & Tanaka, T. (2010). Image superresolution based on locally adaptive mixed-norm.
Journal of Electrical and Computer Engineering, 2010, 1–4.
Perona, P., & Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 12(7), 629–639.
Peyre, G., Bougleux, S., & Cohen, L. (2008). Non-local regularization of inverse problems. In
Proceedings of European Conference on Computer Vision, Marseille, France, 5304, 57–68.
Pickup, L., Capel, D., & Roberts, S. (2006). Bayesian image super-resolution, continued. Neural
Information Processing Systems, 19, 1089–1096.
Pickup, L. (2007). Machine learning in multi-frame image super-resolution. Ph.D. thesis,
University of Oxford.
Prendergast, R., & Nguyen, T. (2008). A block-based super-resolution for video sequences. In
Proceedings of 15th IEEE International Conference on Image Processing (ICIP), San Diego,
California, USA, 1, 1240–1243.
Purkait, P., & Chanda, B. (2012). Super resolution image reconstruction through Bregman
iteration using morphologic regularization. IEEE Transactions on Image Processing, 21(9),
4029–4039.
Ren, C., He, X., Teng, Q., Wu, Y., & Nguyen, T. (2016). Single image super-resolution using
local geometric duality and non-local similarity. IEEE Transactions on Image Processing, 25
(5), 2168–2183.
Ren, C., He, X., & Nguyen, T. (2017). Single image super-resolution via adaptive
high-dimensional non-local total variation and adaptive geometric feature. IEEE
Transactions on Image Processing, 26(1), 90–106.
Schultz, R., & Stevenson, R. (1994). A Bayesian approach to image expansion for improved
definition. IEEE Transactions on Image Processing, 3(3), 233–242.
Shao, W., & Wei, Z. (2008). Edge-and-corner preserving regularization for image interpolation
and reconstruction. Image and Vision Computing, 26(12), 1591–1606.
Song, H., Zhang, L., Wang, P., Zhang, K., & Li, X. (2010). An adaptive L1–L2 hybrid error model
to super-resolution. In: Proceedings of 17th IEEE International Conference on Image
Processing (ICIP), Hong Kong, China, 1, 2821–2824.
Sun, J., Xu, Z., & Shum, H. (2008). Image super-resolution using gradient profile prior. In
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
Anchorage, Alaska, 1, 1–8.
Tappen, M., Russell, B., & Freeman, W. (2003). Exploiting the sparse derivative prior for
super-resolution and image demosaicing. In Proceedings of IEEE 3rd International Workshop
on Statistical and Computational Theories of Vision (SCTV), Nice, France, 1, 1–24.
Villena, S., Abad, J., Molina, R., & Katsaggelos, A. (2004). Estimation of high resolution images
and registration parameters from low resolution observations. Progress in Pattern Recognition,
Image Analysis and Applications, LNCS, 3287, 509–516.
Wang, Z., Bovik, A., Sheikh, H., & Simoncelli, E. (2004). Image quality assessment: from error
visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612.
Wang, Y., Fevig, R., & Schultz, R. (2008). Super-resolution mosaicking of UAV surveillance
video. In Proceedings of 15th IEEE International Conference on Image Processing (ICIP), San
Diego, California, USA, 1, 345–348.
Wang, L., Xiang, S., Meng, G., Wu, H., & Pan, C. (2013). Edge-directed single-image
super-resolution via adaptive gradient magnitude self-interpolation. IEEE Transactions on
Circuits and Systems for Video Technology, 23(8), 1289–1299.
Woods, N., & Galatsanos, N. (2005). Non-stationary approximate Bayesian super-resolution using
a hierarchical prior model. In Proceedings of IEEE International Conference on Image
Processing (ICIP), Genova, Italy, 1, 37–40.
Zhang, X., & Wu, X. (2008). Image interpolation by adaptive 2-D autoregressive modeling and
soft-decision estimation. IEEE Transactions on Image Processing, 17(6), 887–896.
Part II
Control
Chapter 6
Learning in Biologically Inspired Neural
Networks for Robot Control
Abstract Cognitive robotics has focused its attention on the design and con-
struction of artificial agents that are able to perform some cognitive task autono-
mously through the interaction of the agent with its environment. A central issue in
these fields is the process of learning. In its attempt to imitate cognition in artificial
agents, cognitive robotics has implemented models of cognitive processes proposed
in areas such as biology, psychology, and neurosciences. A novel methodology for
the control of autonomous artificial agents is the paradigm that has been called
neuro-robotics or embedded neural cultures, which aims to embody cultures of
biological neurons in artificial agents. The present work is framed in this paradigm.
In this chapter, simulations of an autonomous learning process of an artificial agent
controlled by artificial action potential neural networks during an obstacle avoid-
ance task were carried out. The implemented neural model was introduced by
Izhikevich (2003); this model is capable of reproducing abrupt changes in the
membrane potential of biological neurons, known as action potentials. The learning
strategy is based on a multimodal association process where the synaptic weights of
the networks are modified using a Hebbian rule. Despite the growing interest
generated by artificial action potential neural networks, there is little research that
implements these models for learning and the control of autonomous agents. The
present work aims to fill this gap in the literature and, at the same time, to serve as a guideline for the design of further in vitro experiments where neural cultures are used for robot control.
6.1 Introduction
Artificial neural networks have been widely used to control artificial agents (Pfeifer
and Scheier 1999; Gaona et al. 2015; He et al. 2016). However, a new paradigm has
emerged by fusing neuroscience and cognitive robotics. This methodology attempts
to study cognitive processes, such as learning and memory, in vitro. The aim is to embed living neurons in artificial agents (DeMarse et al. 2001). By doing this, a
new possibility emerges in the study of the cellular mechanisms underlying
cognition.
This work attempts to give some hints and directions in this field by using
simulated artificial action potential neural networks to control an artificial agent.
The models used have a high biological plausibility and can serve as a guideline in
the design of experiments that use in vitro neural cultures.
The chapter is divided as follows: the remainder of this section gives a short
introduction to the changes of paradigm in the study of cognition in artificial
intelligence. Section 6.2 presents the theoretical framework for artificial neural
networks, focusing on models of action potential neurons. Section 6.3 presents the
materials, methods, and results of two different experiments. Finally, in Sect. 6.4,
the conclusions are presented.
development of cognitive processes (Pfeifer and Scheier 1999). The most natural
platforms to perform this interaction are artificial agents (Moravec 1984; Brooks 1991).
From this perspective, Brooks argues that perception and sensorimotor abilities are the
really hard problems to solve by artificial agents. Once an agent has the basic sensing
and moving abilities to achieve survival and reproduction, higher-order abilities should
come easier to implement. Such higher-order abilities include problem solving, language, and expert knowledge, among other things (Brooks 1991).
Following the ideas and principles of the new AI, cognitive robotics is an emerging
research area postulating that the best way to imitate and study cognition is by
building artificial autonomous agents. An artificial agent is defined as a machine capable of learning an ability through interaction with its environment in order to successfully perform some specific cognitive task (Pfeifer and Scheier 1999). Artificial agents are then robots that have a body, sensors, and a motor
system that allow them to perceive their environment and interact with it. Moreover,
artificial agents can be real physical robots or computer-simulated robots that live
and interact in a simulated environment.
Studies in this field focus on basic cognitive abilities, such as moving suc-
cessfully in an environment. The central hypothesis is that complex behaviors
emerge from the interaction of simpler behaviors, such as obstacle avoidance
(Copeland 2015).
This field aims at simulating cognitive processes in robots through the imple-
mentation of models coming from other branches of cognitive sciences, such as
psychology, neuroscience, and philosophy (Pfeifer and Scheier 1999). A recent area
of research for control of autonomous agents arises from the fusion of robotics and
neuroscience. This paradigm, called neuro-robotics or embedded neural cultures, attempts
to embed and embody biological neural cultures by using in vitro neurons to control
artificial agents. At the same time, these agents are in direct contact with their
environment and from this interaction changes in patterns and strength of con-
nections in the embedded neurons take place.
Research on embedded neural cultures emerges as a field aiming at filling the gap
when having, on the one hand, studies on learning and memory and, on the other,
in vitro studies of the cellular mechanisms of synaptic plasticity involved in these processes.
while the other, the left. In this work, the network learns to control the movements of
the agent based on the interaction with its environment. The system adapts to the
environment through the evolutionary development of a population of individuals. The
implemented evolutionary mechanism allows the adaptation of the neural network in a
short period of time and the network becomes capable of controlling the agent so that it
navigates safely in the environment without colliding with the walls.
Artificial neural networks (ANNs) are mathematical models inspired by the struc-
ture and functioning of the nervous system. These models are formed by single
elements called units or neurons. Each unit is a processing element that receives
some input from the environment or other units. All input signals are processed, and
the unit outputs a single signal based on some mathematical function. The infor-
mation is propagated through the network depending on the global architecture of
the system.
ANNs have been classified into three main generations, according to the mathematical model that their neurons use to transform the incoming information.
The first generation is based on the McCulloch-Pitts model (McCulloch and Pitts
1943). The main characteristic of this model is that outputs are binary. In the second
generation, the output from the units is a continuous value (between 0 and 1 or −1
and 1) typically the result of a sigmoidal activation function. In contrast, the third
generation of ANN uses action potential neurons. These models are thought to
better simulate the output of biological neurons as they try to capture the nature of
electrical impulses of these cells. One of the advantages of these models is that they use time as a computational resource, since the output of the neurons is the change in their membrane potential over time (Maass 1997).
Several models of action potential neurons exist; Izhikevich (2004) provides a review of the most used ones. It is important to highlight that, in using these types of models, a compromise must be made between two important but seemingly mutually exclusive characteristics. On the one hand, the model must be computationally simple so as to be feasible; on the other, it must reproduce the firing patterns of biological networks (Izhikevich 2004). The most biophysically precise models, such as the one proposed by Hodgkin and Huxley, have a very high computational cost given the number of floating-point operations they perform. For this reason, the number of neurons that can be modeled in real time is
limited. On the other hand, models such as integrate-and-fire are very efficient
computationally, but they reproduce poorly the dynamics registered experimentally
in biological networks (Izhikevich 2003). Considering all this, the model proposed
by Izhikevich (2003) presents a reasonable compromise between computational
efficiency and biological plausibility.
The Izhikevich model is capable of reproducing action potentials of a number of
different biological cortical neurons by using only four parameters in a
two-dimensional system of differential equations of the form:
$$v' = 0.04v^2 + 5v + 140 - u + I$$
$$u' = a(bv - u)$$

with the after-spike reset: if $v \geq 30$ mV, then $v \leftarrow c$ and $u \leftarrow u + d$,
where v represents the membrane action potential of the neuron and u is a variable
modeling the recovery of the membrane which gives negative feedback to
v. Variable I is the electrical input to the neuron. The parameters a, b, c, and d of the
model allow the reproduction of different neural behaviors. The effects of these parameters on the dynamics of the model are (Izhikevich 2003):
• The parameter a describes the time scale of the recovery variable u. Smaller
values result in slower recovery. A typical value is a = 0.02.
• The parameter b describes the sensitivity of the recovery variable to the sub-
threshold fluctuations of the membrane potential v. Greater values couple v and
u more strongly resulting in possible subthreshold oscillations and low-threshold
spiking dynamics. A typical value is b = 0.2. The case b < a(b > a) corre-
sponds to saddle-node (Andronov–Hopf) bifurcation of the resting state
(Izhikevich 2000).
• The parameter c describes the after-spike reset value of the membrane potential v caused by the fast high-threshold K⁺ conductances. A typical value is c = −65 mV.
• The parameter d describes the after-spike reset of the recovery variable u caused by slow high-threshold Na⁺ and K⁺ conductances. A typical value is d = 2.
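As an illustration of the model and its reset rule, a minimal Python sketch of a single neuron with regular-spiking parameters follows. It uses the two 0.5 ms half-steps for v from the simulation code in Izhikevich (2003); the constant input current I is an assumption made only for this example.

def simulate_izhikevich(a=0.02, b=0.2, c=-65.0, d=8.0, I=10.0, t_max_ms=1000):
    """Simulate one Izhikevich neuron with a constant input current I.

    Returns the membrane potential trace (mV) and the spike times (ms).
    """
    v = -65.0        # initial membrane potential
    u = b * v        # initial recovery variable (u0 = -13.0 for b = 0.2)
    v_trace, spikes = [], []
    for t in range(t_max_ms):
        if v >= 30.0:            # action potential: record and reset
            v_trace.append(30.0)
            spikes.append(t)
            v, u = c, u + d
        else:
            v_trace.append(v)
        # Euler integration; two 0.5 ms half-steps for v (Izhikevich 2003)
        v += 0.5 * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)
        v += 0.5 * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)
        u += a * (b * v - u)
    return v_trace, spikes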
It is worth noting that, even though the Izhikevich model is a biologically plausible model with low computational cost, there are few implementations of it for the control of artificial agents in cognitive robotics. Moreover, there is very little research on, and few implementations of, learning algorithms for networks that use this type of model.
they are stimulated in a prolonged way. This means that the frequency of action
potentials of these neurons decreases over time despite the fact that the stimulus
persists. Another characteristic of this type of neurons is that their firing frequency
increases when the current they receive increases, although they never fire too fast due
to the long hyperpolarization phase they present. In the model, these parameters cor-
respond to a more negative value of the readjustment voltage (c = −65.0) and a high
value of the readjustment of the recovery variable u(d = 8.0) (Izhikevich 2003).
The initial values of the membrane potential and the recovery variable for each neuron were v₀ = −65.0 and u₀ = −13.0, respectively. These values were established taking as reference the experiment reported in Izhikevich (2003). The resting membrane potential was established at −65.0 or −70.0 mV, depending on the type of neuron (sensory, interneuron, or motor) within the network architecture; this corresponds to a base input current I_base of 0 or 3.5, respectively.
Structure of Processing and Transmission of Information
The processing and transmission of the information used for the implementation of
the systems can be divided into the following phases:
• Normalization of sonar values: Originally, each of the eight sonars of the agent
can register obstacles that are within a range of distance between 0 (near) and
5000 mm (far). These values were normalized to a range of 0–1, such that the
maximum value indicates maximum proximity to an obstacle.
$$S_n = 1 - \frac{s}{5000}$$

where s represents the original value of the sonar and S_n is the normalized value.
• Mapping of sonar information to sensory neurons: The sonar information was
mapped to an input stream of current for the sensory neurons of the networks.
This input stream is proportional to the degree of activation of the sensors, in
such a way that the firing frequency of sensory neurons is higher in the presence
of obstacles.
• Propagation of information: This process depends on the pattern of connections of each of the networks, as well as on the connection strength, or synaptic weight, between the neurons. If the connection strength is high, the current that the presynaptic neuron contributes to the postsynaptic neuron will be enough to trigger an action potential in it; otherwise, the contributed current will not trigger an action potential in the postsynaptic neuron.
• Mapping of motor neuron activity to motor speeds: Motor speed was assigned depending on the rate of action potentials of the motor neurons recorded in a certain time window. Tests were performed with time windows of different durations to choose the appropriate one. A time window (Wt) of 400 ms was chosen, because shorter time windows recorded very few action potentials, while very long time windows required longer simulations or higher learning rates.
$$V_{\text{motor}} = 350\,\frac{AP}{AP_{\max}} + 150$$
where V_motor refers to the speed assigned to the motor, AP is the number of action potentials within the set time window, and AP_max is the maximum number of action potentials that can be generated in the time window. This speed is assigned to the left motor if the action potentials correspond to the left motor neuron, and to the right motor if the right motor neuron is the one that fires. If AP is zero, then a base speed is assigned to the motors, which in the equation corresponds to 150 mm/s. A minimal sketch of these mappings is given below.
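In the sketch, the scaling constant i_max for the sensory input current is hypothetical, since the text states only that the current is proportional to the sensor activation.

def normalize_sonar(s):
    """Map a sonar reading s in [0, 5000] mm to [0, 1]; 1 = closest."""
    return 1.0 - s / 5000.0

def sensory_current(s_n, i_base=3.5, i_max=10.0):
    """Input current for a sensory neuron, proportional to activation.

    i_max is a hypothetical scaling constant; the chapter gives only
    the base current (0 or 3.5, depending on the neuron type).
    """
    return i_base + i_max * s_n

def motor_speed(ap, ap_max):
    """Motor speed in mm/s from the spike count of a 400 ms window."""
    if ap == 0:
        return 150.0  # base speed when the motor neuron is silent
    return 350.0 * ap / ap_max + 150.0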
In these experiments, the agent does not learn; rather, the weights connecting the neurons are fixed. Two architectures are tested.
Neuronal Architectures
The artificial systems were modeled with an artificial action potential neural net-
work coupled to the simulated agent (Fig. 6.3).
The two systems present a neural network architecture with eight sensory neu-
rons and two motor neurons, which are illustrated with blue and green circles,
respectively. However, one of them has two additional interneurons, shown in pink
(Fig. 6.3a). In both systems, the sensory neurons are associated with a sensor of the artificial agent, in such a way that sensor 1 is associated with sensory neuron 1, sensor 2 with sensory neuron 2, and so on. Finally, the sensory neurons are
connected to the interneurons or to the motor neurons, depending on the system.
Specifically, in Architecture 1 (Fig. 6.3a), sensory neurons 1, 2, 3, and 4, which
correspond to the left side of the agent, are connected to interneuron 1. On the other
hand, neurons 5, 6, 7, and 8 are connected to the interneuron 2. In this system, each
interneuron is connected to the motor neuron that is on the same side of the
architecture. In the case of Architecture 2 (Fig. 6.3b), sensory neurons are directly
connected to motor neurons. The sensory neurons 1, 2, 3, and 4 are connected to the
motor neuron 1, while the sensory neurons on the right side are connected to the
motor neuron 2. In both systems, each motor neuron is associated with a motor. The
motor neuron 1 is associated with the left motor, while the motor neuron 2 is
associated with the motor on the right side of the agent. The wheels of the agent are
independent of each other so that if the left wheel accelerates and the right wheel
maintains its constant speed, the agent will turn to the right and vice versa.
It should be mentioned that during the experiments, a base speed was established
for the motors, in such a way that when there is no obstacle near the agent, both
wheels maintain the same speed and the agent advances in a straight line.
Fig. 6.4 Avoidance of two obstacles. Letter a indicates the moment in which the artificial agent
detects one of the obstacles and turns to avoid it. When turning, the agent detects a second obstacle
at time b
simulation. A value of 0.7 is enough to ensure that every time a neuron fires, the neurons connected to it will also fire. In this experiment, the artificial agent must evade two obstacles that are close to it; the path taken by the artificial agent during the simulation is shown in Fig. 6.4.
The agent is able to detect the obstacle in front of it at instant a and turns in time to avoid it. Subsequently, it detects a second obstacle at instant b but does not need to
turn to avoid it. The recording of the sonars and the activation of the corresponding
sensory neurons during the evasion of the obstacles are shown in Figs. 6.5, 6.6, 6.7,
and 6.8. Figure 6.5 shows the activation of sonars 1, 2, 3, and 4, which are located
on the left side of the agent, while the recording of the activity of the sensory neurons associated with each of these sonars is presented in Fig. 6.6. The graphs show that
Fig. 6.5 Activation of sonars during the navigation in the environment shown in Fig. 6.4
Fig. 6.6 Activation of sensory neurons while navigating environment shown in Fig. 6.4
Fig. 6.7 Activation of sonars during navigation of the environment shown in Fig. 6.4
the first sensor that is activated during instant a is sensor 3, around 3000 ms. Its
activation triggers a series of action potentials in the sensory neuron number 3,
information that is transmitted to interneuron 1 and, subsequently, to the motor
Fig. 6.8 Activation of sensory neurons during navigation of the environment shown in Fig. 6.4
Fig. 6.9 Activation of inter- and motor neurons during navigation of the environment in Fig. 6.4
neuron 1 (Fig. 6.9). The action potentials generated by the motor neuron 1 cause the
motor on the left side to increase its speed, so the agent turns to the right to avoid
the obstacle. While turning, the same obstacle is detected by sensors 2 and 1 (shortly after 4000 and 6000 ms, respectively), which generates action potentials in the sensory neurons associated with these sensors. This information is also
transmitted to the motor neurons and influences the speed of the agent’s turn.
Then at instant b, the sensor 8 on the right side of the architecture is activated at
around 6000 ms due to the second obstacle (Fig. 6.7), triggering the activation of
the sensory neuron 8 (Fig. 6.8), as well as interneuron 2 and motor neuron 2
(Fig. 6.9). This results in an increase in the speed of the right motor, which read-
justs the direction of the agent. Changes in agent speeds, associated with the
transmission of information from sonars to motor neurons, result in the successful
evasion of both obstacles. Finally, the sensors do not detect any nearby obstacle, so
the neurons do not fire and the speeds of both motors return to their base speed,
causing the agent to move forward again.
In this experiment, the neural architecture composed of only eight sensory neurons
and two motor neurons was used (Fig. 6.3b). The synaptic weight established
between the connections of the network was set at a value of 0.7, as in the
experiment previously described. The environment and behavior of the artificial
agent during the simulation is shown in Fig. 6.10.
The artificial agent detects a barrier of obstacles with the sensors and neurons
that are on the left side of its architecture. Figure 6.11 shows that sonars 1, 2, 3, and 4 maintain a high activation during the first part of the simulation time. Therefore,
the neurons associated with these sensors generate action potentials, decreasing
their firing rate over time (Fig. 6.12). In turn, the result of these activations is
reflected in the activity of the motor neuron 1 (Fig. 6.15), generating an increase
in the speed of the left wheel. After turning, the activity of the sonars and neurons
on the right side of the architecture increases due to the detection of an obstacle
that is located on the right side of the environment. This happens shortly after 3000 ms, as illustrated in Figs. 6.13 and 6.14. Finally, sensory neuron 1 and motor neuron 1 are activated again at 6000 ms, generating a change in the speeds of the agent.

Fig. 6.11 Activation of sonars during navigation of the environment shown in Fig. 6.10

Fig. 6.12 Activation of sensory neurons during navigation of the environment shown in Fig. 6.10
Fig. 6.13 Activation of sonars during navigation of the environment shown in Fig. 6.10
Fig. 6.14 Activation of sensory neurons during navigation of the environment shown in Fig. 6.10
In this section, the systems are expected to learn the appropriate connections
between the neurons. The two architectures used are shown in Fig. 6.16.
Fig. 6.15 Activation of motor neurons during navigation of the environment shown in Fig. 6.10
where η is the learning rate and φ the forgetting rate, while act(n_pre) and act(n_post) correspond to the activity of the presynaptic and postsynaptic neurons, respectively.
During the experiments, a learning rate of 0.08 and a forgetting rate of 0.000015
were used, both values determined experimentally. The value of the forgetting rate
was established considering that high values caused the synaptic weights, which
were reinforced during a collision, to decay quickly given the time it took the agent
to meet the next obstacle. The degree of activation for the neurons was normalized
in a range of 0–1, depending on the number of action potentials registered in each
time window.
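Since the equation of the rule itself is not reproduced on this page, the following sketch assumes the standard Hebbian form with a forgetting term, w ← w + η·act(n_pre)·act(n_post) − φ·w, using the rates reported in the text; the symbol φ and the exact form of the rule are assumptions.

def hebbian_update(w, act_pre, act_post, eta=0.08, phi=0.000015):
    """One Hebbian step with forgetting (assumed form of the rule).

    act_pre and act_post are activations normalized to [0, 1] over the
    current time window; eta and phi are the rates reported in the text.
    """
    return w + eta * act_pre * act_post - phi * w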
The systems were designed so that each time the agent hits an obstacle, the activation of the corresponding collision sensor causes the motor neuron associated with it to fire. Simultaneously, the sensory neurons and/or corresponding
interneurons will be activated by the proximity to the obstacle. The connections,
between the sensory neurons and/or interneurons and the corresponding motor
neurons, were reinforced by the Hebbian learning rule. The initial synaptic weight
between the neurons involved in the learning process was established in such a way
that, at the beginning of the simulations, the motor neurons are activated only by the
corresponding collision sensor. However, once the system has learned, a multi-
modal association occurs and the motor neurons will be activated by sensory
neurons or interneurons, as the case may be, and not by the collision sensors. The
functioning of the collision sensors was inspired by the experiment reported by
Scheier et al. (1998).
The experiment was carried out with the system whose neuronal architecture pre-
sents eight sensory neurons, two interneurons and two motor neurons. The system
and the values established for the simulation are specified in Fig. 6.17.
The initial synaptic weights are shown in Table 6.2. The sensory neurons are
indicated with an s, the interneurons with an i, and the motor neurons with the letter
m. The subscripts indicate the position of the neuron within the network. Each of
the synaptic weights established between the sensory neurons and the interneurons
has a value of 0.7, which ensures the firing of the interneurons each time one of the
Table 6.2 Initial synaptic weights for the experiment with interneurons

      s1    s2    s3    s4    s5    s6    s7    s8    m1    m2
i1    0.7   0.7   0.7   0.7   –     –     –     –     0.49  0.29
i2    –     –     –     –     0.7   0.7   0.7   0.7   0.13  0.36
sensory neurons to which they are connected fires. The hyphens (-) indicate that
there is no connection between neurons. The connection value between the sensory
neurons and the interneurons remains fixed during the simulation. The synaptic
weights that are modulated during the experiment are shown in boldface.
The navigation of the agent in the environment during the simulation is shown in
Fig. 6.18. There were six collisions during the agent’s path. The place of the
collisions is indicated by an arrow and a number, marking the order in which they happened. The information from the sensors obtained during each collision also indicates which neurons, both sensory neurons and interneurons, are activated.

Fig. 6.18 Path of the artificial agent during the experiment and table showing the activated sonars on each collision
The sensors that registered each of the collisions showed an activation equal to
or greater than 0.8 (Fig. 6.18). The table in Fig. 6.18 shows that
collisions 1, 2, and 5 were registered by sensors located on the left side of the
architecture, activating in these three cases the collision sensor c1 and, therefore,
neuron m1 was activated. In contrast, collisions 3, 4, and 6 activated collision
sensor c2.
Figures 6.19 and 6.20 show the activation patterns of the interneurons and motor neurons registered during the different situations the agent encounters in the environment. The change of the synaptic weights over time is also presented.
These records are shown with the purpose of graphically illustrating the relationship
of the activation patterns obtained with the change in the synaptic weights regis-
tered in the different situations.
One of the activation patterns registered during the experiment is shown in
Fig. 6.19 and corresponds to the neuronal activity registered during collision 1. As
shown, only interneuron 1 and motor neuron 1 generate action potentials. This
pattern corresponds to an increase in the synaptic weight between these neurons
(Fig. 6.22), while the other weights decrease due to the fact that the other two
neurons do not fire during this collision. A similar activation pattern is observed
during collisions 2 and 5, since in these cases the same interneurons and motor
neurons are activated. The increases in synaptic weights during these collisions are
also indicated in Fig. 6.22. It is important to remember that the initial synaptic
weights are not high enough for the interneurons to trigger the motor neurons.
Therefore, in this case, the activation of the motor neuron 1 is due to the activation
of the collision sensor c1 and not to the activation of the interneuron.
Figure 6.20 shows the activation pattern during a collision in which interneuron 2
and motor neuron 2 are activated. This is reflected in an increase in the synaptic weight
of these neurons (Fig. 6.22). The other three synaptic weights do not increase. This
case is similar to what happens in collisions 4 and 6, because the same interneurons and
motor neurons are activated. In the same way as in Fig. 6.19, the activation of the
motor neuron in Fig. 6.20 is due to the activation of the collision sensor and not to the
activity of the interneuron.
Figure 6.21 illustrates an activation pattern where the four neurons are activated. In
this situation, it was found that even when the four synaptic weights increase, they do it
in a different proportion depending on the degree of activity between the neurons. The
activation pattern shown in Fig. 6.21 is more frequent once the synaptic weights
between the neurons are high enough for the interneurons to trigger the motor neurons.
In this example, the activity of the motor neuron 1 is a product of the activity of the
interneuron 1 and not of the collision sensor. In contrast, motor neuron 2 is still
activated by collision sensor c2. It is possible to know this by looking at the values of
the synaptic weights (Fig. 6.22).
Although there are other activation patterns, only those that generate an increase
in the synaptic weight of at least some neural connection are mentioned. That is to
say, activation patterns where at least one interneuron and one motor neuron fire. Patterns that do not comply with this activation condition generate forgetting. For example, when only one of the four neurons is activated, the connections are not reinforced; likewise, when the two interneurons are activated but the motor neurons are not, the connections are not reinforced either.
The change of the synaptic weights during the experiment is shown in Fig. 6.22.
The six collisions registered during the simulation are indicated by numbered
boxes. The box of each of the collisions appears near the line of the synaptic weight
that most varied due to this collision. As can be seen, changes in synaptic weights
were recorded apart from those recorded during collisions. These changes occur
because some connections between the interneurons and the motor neurons are high
enough to trigger the motor neurons without the need for a collision. The values of
the synaptic weights obtained at the beginning and end of the simulation are shown
in Table 6.3.
The highest synaptic weights at the end of the simulation correspond to the
weights between interneuron 1 and motor neuron 1, as well as the connection
between interneuron 2 and motor neuron 2. Both connections are greater than 0.55, so only in these cases will the activity of each motor neuron be triggered by the activity of the corresponding interneuron and not by the collision sensor. The difference between the activation of a motor neuron due to the collision sonar and due to the interneuron associated with it is illustrated in Fig. 6.23, which compares the action potentials of interneuron 1 and motor neuron 1 at the beginning and at the end of the simulation.
Fig. 6.23 Activation when a collision occurs before learning (a) and after learning (b)
Figure 6.24 shows the activity of the interneurons and motor neurons with the
final synaptic weights. In these graphs, it can be seen that interneuron 1 and motor
neuron 1 present the same number of action potentials. This is because the motor
neuron fires when the interneuron fires. However, in the case of the other two
neurons, motor neuron 2 does not fire even when there is activation from
interneuron 2. This is because the synaptic weight is only high enough to reproduce
some of the action potentials generated in the interneuron but not all of them.
The navigation of the artificial agent with the final synaptic weights is shown in
Fig. 6.25. In contrast to Fig. 6.18, the agent does not come close to the obstacles and follows a straight path.
This experiment was carried out with the neural network architecture that presents
eight sensory neurons and two interconnected motor neurons. The system and the
values established for the simulation are specified in Fig. 6.26.
Table 6.4 Initial synaptic weights for the architecture with direct connections

      s1    s2    s3    s4    s5    s6    s7    s8
m1    0.39  0.12  0.17  0.29  0.42  0.17  0.21  0.43
m2    0.28  0.16  0.42  0.43  0.33  0.35  0.41  0.14
The initial values for the synaptic weights are shown in Table 6.4. In contrast to
the previous experiment, in this case, all the synaptic weights of the neural network
are learned during the simulation.
The interaction of the agent with the environment during the simulation is shown
in Fig. 6.27. In this figure, the six collisions registered during the path followed by the agent are indicated by an arrow and a number.
Fig. 6.27 Path of the artificial agent during the experiment and table showing the activated sonars
on collisions
The graphs in Figs. 6.28 and 6.29 show the neuronal activity during collision 1.
Figure 6.28 shows that no action potentials were registered in the neurons on the
left side of the architecture. In contrast, Fig. 6.29 shows the activation of sensory
neurons 5, 6, 7, and 8, which correspond to the right side of the architecture.
Likewise, the same figure shows that the only active motor neuron during collision
1 was motor neuron 2. This activation pattern results in an increase in the synaptic
weight connecting sensory neurons on the right side of the architecture with the
motor neuron 2, which is also on the right side. This increase in synaptic weights is
shown in Fig. 6.35. The synaptic weights of neurons that were not activated during
this collision do not vary. The final configuration of the weights for this architecture
is summarized in Table 6.5.
Figures 6.30 and 6.31 show the activation pattern corresponding to collision 4. In these figures, it is shown that the sensory neurons on the left side are those that are now activated, while sensory neurons 5, 6, 7, and 8 present null activation, or very low activation in the case of neuron 8.
In contrast to collision 1, during collision 4 the synaptic weights of the sensory neurons on the left side of the architecture connected to motor neuron 1 are increased. In this case, motor neuron 2 is also activated, which increases the synaptic weights between the active sensory neurons and motor neuron 2; however, this increase is smaller than for the sensory neurons related to motor neuron 1, because motor neuron 2 produced a lower number of action potentials than motor neuron 1.
The changes in synaptic weights registered during this experiment are presented
in Figs. 6.32, 6.33, 6.34, and 6.35. In these figures, we can see that the highest
synaptic weights correspond to the weights between motor neuron 1 and the sen-
sory neurons 1, 3, 4, and 8 and the synaptic weights of motor neuron 2 with the
sensory neurons 1, 5, 6, and 7.
The path of the artificial agent with the synaptic weights obtained at the end of
the simulation is shown in Fig. 6.36. In this case, the agent continues to approach
the obstacles because some synaptic weights that transmit information on both sides
of the architecture increased during the learning process. However, once the
obstacles are detected by a neuron that has a strong synaptic weight with the motor
neuron that is on the same side, the agent avoids the obstacle.
Table 6.5 Final synaptic weights for the experiment with direct connections

      s1    s2    s3    s4    s5    s6    s7    s8
m1    1.0   0.26  0.60  0.80  0.42  0.16  0.21  0.89
m2    0.32  0.20  0.99  0.47  0.59  0.40  0.87  0.45
Fig. 6.32 Modulation of the synaptic weights between sensory neurons on the left side of the
architecture (1, 2, 3, and 4) and motor neuron 1
Fig. 6.36 Path of the agent with the final synaptic weights
6.4 Conclusions
The first experiments, reported in Sect. 6.3.1, show the plausibility of the action
potential model proposed by Izhikevich (2003). In these experiments, the different parameters of the model were tuned to obtain coherent behavior in the artificial agent. The different network architectures studied show the potential of the models.
Furthermore, in light of the results reported in Sect. 6.3.2, it is concluded that
artificial action potential neural networks are useful models for the study of cog-
nitive processes such as learning. The results of the simulations show that it is
possible to observe a change in the behavior of the artificial agent before and after
the Hebbian learning process for both architectures.
As already mentioned, there is scarce literature about autonomous learning
algorithms for embedded artificial action potential neural networks. We consider
that the main contribution of this work is the simulation of a multimodal process in
an artificial action potential neural network. The learning strategy consisted of
modifying the synaptic weights between the neurons of the networks by means of
an implementation of Hebb’s rule during a multimodal association process. This
architecture was embedded in the artificial agent with eight proximity sensors and
two collision sensors. The proximity sensors had a continuous activation function
and were associated with sensory neurons. On the other hand, the collision sensors
had a threshold of binary activation and were associated with motor neurons.
References
Gaona, W., Escobar, E., Hermosillo, J., & Lara, B. (2015). Anticipation by multi-modal
association through an artificial mental imagery process. Connection Science, 27(1), 68–88.
He, W., Chen, Y., & Yin, Z. (2016). Adaptive neural network control of an uncertain robot with
full-state constraints. IEEE Transactions on Cybernetics, 46(3), 620–629.
Izhikevich, E. (2000). Neural excitability, spiking and bursting. International Journal of
Bifurcation and Chaos, 10(06), 1171–1266.
Izhikevich, E. (2003). Simple model of spiking neurons. IEEE Transactions on Neural Networks,
14(6), 1569–1572.
Izhikevich, E. (2004). Which model to use for cortical spiking neurons? IEEE Transactions on
Neural Networks, 15(5), 1063–1070.
Maass, W. (1997). Networks of spiking neurons: The third generation of neural network models.
Neural Networks, 10(9), 1659–1671.
Manson, N. (2004). Brains, vats, and neurally-controlled animats. Studies in History and
Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical
Sciences, 35(2), 249–268.
McCarthy, J., Minsky, M., Rochester, N., & Shannon, C. (2006). A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955. AI Magazine, 27(4), 12.
McCulloch, W., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity.
The Bulletin of Mathematical Biophysics, 5(4), 115–133.
Mingers, J. (2001). Embodying information systems: The contribution of phenomenology.
Information and Organization, 11(2), 103–128.
Mokhtar, M., Halliday, D., & Tyrrell, A. (2007, August). Autonomous navigational controller
inspired by the hippocampus. In IEEE International Joint Conference on Neural Networks
(pp. 813–818).
Moravec, H. (1984). Locomotion, vision and intelligence. In: M. Brady, & R. Paul (Eds.),
Robotics research (pp. 215–224). Cambridge, MA: MIT Press.
Newell, A., & Simon, H. (1976). Computer science as empirical inquiry: Symbols and search.
Communications of the ACM, 19(3), 113–126.
Novellino, A., D’Angelo, P., Cozzi, L., Chiappalone, M., Sanguineti, V., & Martinoia, S. (2007).
Connecting neurons to a mobile robot: An in vitro bidirectional neural interface.
Computational Intelligence and Neuroscience, 2007.
Pfeifer, R., & Scheier, C. (1999). Understanding intelligence. MIT Press.
Potter, S. (2001). Distributed processing in cultured neuronal networks. Progress in Brain
Research, 130, 49–62.
Potter, S., & DeMarse, T. (2001). A new approach to neural cell culture for long-term studies.
Journal of Neuroscience Methods, 110(1), 17–24.
Scheier, C., Pfeifer, R., & Kuniyoshi, Y. (1998). Embedded neural networks: Exploiting
constraints. Neural Networks, 11(7–8), 1551–1569.
Trhan, P. (2012). The application of spiking neural networks in autonomous robot control.
Computing and Informatics, 29(5), 823–847.
Chapter 7
Force and Position Fuzzy Control:
A Case Study in a Mitsubishi PA10-7CE
Robot Arm
Abstract Many research works have focused on the problem of controlling robot manipulators while executing tasks that do not involve contact forces between the end-effector and the environment. However, many tasks require an interaction
of the manipulator with the objects around it. For the correct performance of these
tasks, the use of a force controller is essential. Generally, the control objective
during the contact is to regulate the force and torque exerted by the manipulator's end-effector on the environment, while simultaneously regulating the position and orientation (i.e., the pose) of the free coordinates of the manipulator's end-effector. Many works have been presented on this topic, proposing various control strategies; one of the most relevant methods is the so-called hybrid force/position control. This scheme has the advantage of being able to independently control the force along the directions constrained by the environment and the pose along the unconstrained directions. This work analyzes and implements the hybrid force/position
control using a fuzzy logic control method, since the fuzzy control provides a
solution for nonlinearities, high coupling, and variations or perturbations. The
system employed is the Mitsubishi PA10-7CE robot manipulator, which is a robot
of 7 degrees of freedom (DOF), but in this work, it is only used as a 6-DOF
manipulator, equipped with a 6-DOF force/torque sensor in the end-effector.
7.1 Introduction
Currently, the ability to handle and manage physical contact between a robot and the environment that surrounds it is a requirement for performing more advanced manipulation tasks. This capacity is known as the interaction of the manipulator with the physical environment in which it works.
The nature of the interaction between the manipulator and its environment allows robotic applications to be classified into two classes: tasks that involve no contact, that is, unrestricted movements in free space, and complex robotic applications that require the manipulator to be mechanically coupled to other objects. Two categories can be distinguished within this last type of task. The first
category is dedicated to essentially force tasks, in which the end-effector is required to stabilize the physical contact with the environment and execute a specific force process. In the second category, the emphasis falls on the movement of the
end-effector, which is performed on restricted surfaces (compliant motion).
We are interested only in the scheme of active compliance and, more specifically, in the hybrid force/position control. Figure 7.1 (Vukobratovic et al. 2009) shows a control scheme that involves active compliance.
This control methodology is based on the force and position control theory pro-
posed by Mason (1981), depending on the mechanical and geometrical character-
istics of the contact problem. This control methodology distinguishes two sets of
constraints between the movement of the robot and the contact forces. The first set
contains the so-called natural constraints, which arise due to the geometry of the
task. The other set of constraints, called artificial constraints, is given by the
characteristics associated with the execution of the specified task, i.e., the con-
straints are specified with respect to a framework, called a constraint framework.
For instance, in a contact task where a sliding motion is performed on a surface, it is common to adopt the Cartesian constraint framework in the way shown in Fig. 7.2 (Vukobratovic et al. 2009). Assuming an ideally rigid and frictionless
contact between the end-effector and the surface, it is obvious that natural con-
straints limit the movement of the end-effector in the direction of the z-axis, as well
as rotations around the x- and y-axes.
The artificial constraints, imposed by the controller, are introduced to specify the task that will be performed by the robot with respect to the constraint frame. These constraints divide the possible degrees of freedom (DOF) of the Cartesian movement into those that must be controlled in position and those that must be controlled in force, in order to carry out the requested task.
In the implementation of a hybrid force/position control, it is essential to introduce two Boolean matrices, $S$ and $\bar{S}$, in the feedback loops in order to filter out the forces and displacements sensed at the end-effector which are inconsistent with the contact model of the task. The first is called the compliance selection matrix, and according
to the artificial constraints specified, the i-th diagonal element of this matrix has the value of 1 if the i-th DOF with respect to the frame of the task has to be controlled in force, and the value of 0 if it is controlled in position. The second matrix, $\bar{S}$, is the selection matrix for the DOF that are controlled in position; the i-th diagonal element of this matrix has the value of 1 if the i-th DOF with respect to the frame of the task has to be controlled in position, and the value of 0 if controlled in force.
To specify a hybrid contact task, the following sets of information have to be
defined:
• Position and orientation of the frame of the task.
• The directions controlled in position and in force with respect to the frame of the task (selection matrices).
• The desired position and force with respect to the frame of the task.
Once the contact task is specified, the next step is to select the appropriate
control algorithm.
The concept of fuzzy logic was introduced for the first time in 1965 by Professor Zadeh (1965) as an alternative for describing sets that involve vagueness or uncertainty and, consequently, cannot be easily defined.
Fuzzy logic or fuzzy set theory is a mathematical tool based on degrees of
membership that allows modeling information which contains ambiguity, imprecision,
and uncertainty, by measuring the degree to which an event occurs, using for this a
knowledge base or human reasoning.
A fuzzy set $A$ in a universe of discourse $U$ is defined as

$A = \{\, (x, \mu_A(x)) \mid x \in U \,\}$
A membership function µA(x) can take different forms according to the system you
want to describe. Among the most common forms are those described by impulsive,
triangular, pseudo-trapezoidal, and Gaussian membership functions (Nguyen et al.
2003). The following describes the membership functions used later in this work.
Singleton membership function
A singleton membership function is shown in Fig. 7.3 and is defined by the fol-
lowing expression:
$\delta(x; \bar{x}) = \begin{cases} 1, & \text{if } x = \bar{x} \\ 0, & \text{if } x \neq \bar{x} \end{cases}$
Gaussian membership function
A Gaussian membership function is defined by the expression

$G(x; \varrho, \sigma) = e^{-\left(\frac{x - \varrho}{\sigma}\right)^{2}}$
Fuzzy controllers are constructed from a set of fuzzy rules based on the knowledge
of the control system and the experience of operators and experts. A fuzzy rule is
expressed by
IF x is A THEN y is B

In general, for a system with inputs $x_1, \ldots, x_n$, each rule has the form

$R^{l_1 \cdots l_n}: \text{IF } x_1 \text{ is } A_1^{l_1} \text{ AND } \cdots \text{ AND } x_n \text{ is } A_n^{l_n} \text{ THEN } y \text{ is } B^{l_1 \cdots l_n} \qquad (7.1)$
The AND connective can be implemented using either the minimum or the product operator between the antecedents of the rules; the latter is faster, which is the reason it is the most used in real-time applications.
Max-product inference
Taking into account the set of rules of the form of Eq. (7.1), with input membership functions $\mu_{A_i^{l_i}}(x_i)$ and output membership functions $\mu_{B^{l_1 \cdots l_n}}(y)$, for all $x^* = (x_1^*, x_2^*, \ldots, x_n^*)^T \in U \subset \mathbb{R}^n$ and $y \in V \subset \mathbb{R}$, the product inference engine is given (for the two-input case) as

$\mu_{B'^{l_1 l_2}}(y; x^*) = \mu_{A_1^{l_1}}(x_1^*)\, \mu_{A_2^{l_2}}(x_2^*)\, \mu_{B^{l_1 l_2}}(y) \qquad (7.2)$
Figure 7.8a describes the process of the max-product inference engine, while
Fig. 7.8b describes the process of combining, by union operation, the output of
several rule conclusions.
Defuzzification
In the defuzzification stage, a scalar value $y^*$ is generated from the output $\mu_{B'}(y)$ produced by the inference engine. This value $y^*$ is the output of the fuzzy controller that will be applied to the system to be controlled.
There are several ways to compute the output of the fuzzy controller; the most
common is the center of average defuzzification which is given as
$y^*(x^*) = \frac{\sum_{l_1=1}^{N_1} \cdots \sum_{l_n=1}^{N_n} \bar{y}^{\,l_1 \cdots l_n}\, \omega_{l_1 \cdots l_n}(x^*)}{\sum_{l_1=1}^{N_1} \cdots \sum_{l_n=1}^{N_n} \omega_{l_1 \cdots l_n}(x^*)} \qquad (7.3)$

where $\bar{y}^{\,l_1 \cdots l_n}$ is the center of the $l_1 \cdots l_n$ output fuzzy set, $\omega_{l_1 \cdots l_n}(x^*)$ is the height given by the input membership functions, and $x^*$ is the vector of real input values.
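To make the above concrete, the following is a minimal numerical sketch (not taken from this chapter; all membership parameters are illustrative placeholders) of a single-input fuzzy system with Gaussian input sets, singleton outputs, and center-of-average defuzzification as in Eq. (7.3).

```python
import numpy as np

def gaussian(x, c, sigma):
    # Gaussian membership function G(x; c, sigma) = exp(-((x - c)/sigma)**2)
    return np.exp(-((x - c) / sigma) ** 2)

# Hypothetical partition of the input universe into N1 = 3 fuzzy sets
centers = [-10.0, 0.0, 10.0]   # placeholder centers
sigmas  = [4.0, 4.0, 4.0]      # placeholder widths
y_bar   = [1.0, 2.0, 5.0]      # singleton positions of the output sets

def fuzzy_output(x_star):
    # Rule activations (heights) for the crisp input value x_star
    w = np.array([gaussian(x_star, c, s) for c, s in zip(centers, sigmas)])
    # Center-of-average defuzzification, Eq. (7.3)
    return float(np.dot(w, y_bar) / np.sum(w))

print(fuzzy_output(-8.0))  # dominated by the first set
print(fuzzy_output(0.0))   # dominated by the middle set
```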
The Mitsubishi industrial robot manipulator PA10-7CE is one of the versions of the
“Portable General-Purpose Intelligent Arm” of open architecture, developed by
Mitsubishi Heavy Industries (MHI). This manipulator is composed of seven joints
connected through links as shown in Fig. 7.9. The servomotors of the PA10 are
three-phase brushless type and are coupled to the links by means of harmonic drives
and electromagnetic brakes. In this work, the Mitsubishi is used as a 6-DOF robot arm; i.e., one of the joints is blocked, in this case joint 3, represented by S3 in Fig. 7.9.
The dynamic equation of motion for a manipulator of n DOF in interaction with the
environment is expressed by Vukobratovic et al. (2009)
$M(q)\ddot{q} + C(q, \dot{q})\dot{q} + g(q) = \tau + J^T(q)\, f_s \qquad (7.4)$

where $M(q)$ is the inertia matrix, $C(q, \dot{q})\dot{q}$ the vector of centrifugal and Coriolis torques, $g(q)$ the vector of gravitational torques, $\tau$ the vector of applied joint torques, and $f_s$ the vector of contact forces and torques at the end-effector.
Table 7.2 D-H parameters of the 6-DOF reduced PA10-7CE robot manipulator

Link | $a_{i-1}$ [m] | $\alpha_{i-1}$ [rad] | $d_i$ [m] | $\theta_i$ [rad]
  1  | 0     | 0        | 0.317 | $q_1$
  2  | 0     | $-\pi/2$ | 0     | $q_2 - \pi/2$
  3  | 0.450 | 0        | 0     | $q_3 + \pi/2$
  4  | 0     | $-\pi/2$ | 0.480 | $q_4$
  5  | 0     | $\pi/2$  | 0     | $q_5$
  6  | 0     | $-\pi/2$ | 0.070 | $q_6$
The position direct kinematic model of a robot manipulator is the relation that allows determining the vector $x \in \mathbb{R}^{d+m}$ of operational coordinates according to its articular configuration $q$:

$x = h(q) \qquad (7.5)$

This model can be computed with the HEMERO toolbox (Maza and Ollero 2001) through the instruction

fkine(dh, q)

where dh is the matrix of Denavit-Hartenberg parameters, in which:
• $a_{i-1}$, $\alpha_{i-1}$, $d_i$, $\theta_i$ are the Denavit-Hartenberg parameters according to Craig (2006).
• $\sigma_i$ indicates the type of joint (it takes the value of 0 for a rotational joint and 1 if it is a prismatic one).
The elements of the homogeneous transformation matrix $^0_nT$ are shown in Castañon (2017).
Taking the time derivative of Eq. (7.5), we obtain

$\dot{x} = J_A(q)\dot{q} \qquad (7.8)$

where $J_A(q) \in \mathbb{R}^{(d+m)\times n}$ is the analytical Jacobian matrix of the robot. This matrix can be found in Salinas (2011). The geometric Jacobian is obtained using the HEMERO tool with the following instruction:

jacob0(dh, q)
The position inverse kinematic model is the inverse function $h^{-1}$ which, if it exists for a given robot, allows obtaining the configuration necessary to locate its end-effector at a given pose $x$.
The expressions of the $h^{-1}$ function of the position inverse kinematic model were calculated with the help of the SYMORO+ robotics software (Khalil et al. 2014), and the results are shown in Castañon (2017).
Finally, from Eq. (7.8), the expression that characterizes the velocity inverse kinematic model is given by

$\dot{q} = J_A(q)^{-1}\dot{x} \qquad (7.10)$
The hybrid force/position controller requires the feedback of the forces and torques
present in the robot’s end-effector or in the contact tool used; to achieve this, the
robot was fitted with an ATI Delta force/torque sensor, shown in Fig. 7.11. This is a 6-DOF sensor, meaning that it is able to acquire the forces and torques along each of the Cartesian axes ($F_x$, $F_y$, $F_z$, $T_x$, $T_y$, $T_z$).
The main characteristics of the ATI Delta sensor are shown in Castañon (2017).
For more technical information, consult (ATI Industrial Automation 2018a, b).
The ATI sensor was paired with an NI PCI-6220 DAQ card fitted in the control computer. Once the voltage signals read by the DAQ are in the MATLAB/Simulink environment, they are converted to force/torque values. This conversion is given by the expression

$f_s = M_T\, v_c + c_o \qquad (7.11)$

where $M_T$ is the sensor calibration matrix, $v_c$ is the vector of voltages read, and $c_o$ is an offset vector.
The ATI sensor was mounted between the last link of the robot and a special contact tool, designed in Salinas (2011) to reduce as much as possible the friction between this tool and the contact surface; this is illustrated in Fig. 7.12.
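A rough numerical sketch of Eq. (7.11) follows; the calibration matrix and offset below are hypothetical stand-ins for the values supplied with the sensor's calibration file.

```python
import numpy as np

# Hypothetical 6x6 calibration matrix (the real one ships with the sensor)
M_T = np.eye(6) * 25.0          # maps volts to N and N*m (placeholder)
c_o = np.zeros(6)               # offset vector, often estimated at startup

def voltages_to_wrench(v_c):
    # Eq. (7.11): f_s = M_T v_c + c_o, with f_s = (Fx, Fy, Fz, Tx, Ty, Tz)
    return M_T @ np.asarray(v_c) + c_o

print(voltages_to_wrench([0.1, 0.0, -2.0, 0.0, 0.0, 0.05]))
```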
In the literature there is a wide variety of hybrid force/position control algorithms; one of the most important approaches is the one proposed by Craig and Raibert (1979), which is shown in Fig. 7.13.
It contains two control loops in parallel with independent control and feedback
laws for each one. The first loop is the position control which makes use of the
information acquired by the position sensors in each robot joint. The second loop is
the force control. This loop uses the information collected by the force sensor
mounted on the end-effector. The matrix $S$ is used to select which directions will be controlled in position and which in force.
In the directions controlled in force, the position errors are set to zero when multiplied by the orthogonal complement of the selection matrix (the position selection matrix), defined as $\bar{S} = I - S$. This means that the position control loop does not interfere with the force control loop; in practice, however, some coupling between both control loops still remains.
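The role of the selection matrices can be sketched numerically. For the sliding task of Fig. 7.2 (force controlled only along z), a minimal sketch with illustrative error vectors is:

```python
import numpy as np

S = np.zeros((6, 6))
S[2, 2] = 1.0                  # force controlled only along the z-axis
S_bar = np.eye(6) - S          # position selection matrix

pose_error  = np.array([0.01, -0.02, 0.30, 0.0, 0.0, 0.05])  # illustrative
force_error = np.array([0.0, 0.0, -12.0, 0.0, 0.0, 0.0])     # illustrative

# Each loop only sees the errors consistent with the contact model
print(S_bar @ pose_error)   # z-component filtered out
print(S @ force_error)      # only the z force error survives
```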
A PD-type position control law is used, with gain matrices $K_p \in \mathbb{R}^{(d+m)\times(d+m)}$ and $K_v \in \mathbb{R}^{(d+m)\times(d+m)}$, while the force control law consists of a proportional-integral (PI) action, with respective gain matrices $K_{pf} \in \mathbb{R}^{(d+m)\times(d+m)}$ and $K_{if} \in \mathbb{R}^{(d+m)\times(d+m)}$, plus a feedforward of the desired force in the force loop; then the control law can be written in operational space as

$\tau_x = \tau_{fx} + \tau_{px} \qquad (7.12)$

where $\tau_x \in \mathbb{R}^6$ is the vector of control torques; $K_p$, $K_v$, $K_{pf}$, and $K_{if}$ are the $6\times 6$ diagonal control gain matrices; $\tilde{x}$ is the difference between the desired operational pose vector $x_d \in \mathbb{R}^{d+m}$ and the actual operational pose vector $x \in \mathbb{R}^{d+m}$; $\dot{\tilde{x}} \in \mathbb{R}^{d+m}$ is the vector of velocity errors in operational space; and $\tilde{f}_s \in \mathbb{R}^{d+m}$ is the difference between the desired contact force vector $f_{sd} \in \mathbb{R}^{d+m}$ and the instantaneous force vector $f_s \in \mathbb{R}^{d+m}$.
A problem that arises in this formulation is dynamic instability in the force control part, due to the high-gain effects of the force sensor signal feedback, which appear when the environment is highly rigid, as well as to unmodeled dynamic effects of the arm and the elasticity of the sensor. To solve this problem, the dynamic model of the manipulator is introduced into the control law. In Shin and Lee (1985), a hybrid force/position control is formulated in which the dynamic model of the robot is used in the control law; the expression is given by
$\tau_x = M_x(x)\ddot{x} + C_x(x, \dot{x})\dot{x} + g_x(x) + S f \qquad (7.15)$

where $M_x(x)$, $C_x(x, \dot{x})$, and $g_x(x)$ are the dynamic model terms expressed in operational space, and $f \in \mathbb{R}^6$ is the vector generated by the control law selected for the force loop part.
To avoid rebounding and minimize overshoots during the transition, an active
damping term is added in the force control part (Khatib 1987).
where the term $K_{vf}$ is a diagonal matrix of Cartesian damping gains. In Bona and Indri (1992), it is proposed to modify the position control law as

$\tau_{px} = M_x(x)\bar{S}\left[\ddot{x} - M_x^{-1}(x)\left(S f - f_s\right)\right] + C_x(x, \dot{x})\dot{x} + g_x(x) \qquad (7.18)$

where $-M_x^{-1}(x)(S f - f_s)$ is a term added to compensate the coupling between the force and position control loops, as well as the disturbances in the position controller due to the reaction force.
So far, the control laws have been handled in operational space; however, in Zhang and Paul (1985), a transformation from Cartesian space to joint space is proposed by transforming the selection matrices $S$ and $\bar{S}$, given in Cartesian space, to joint space as

$S_q = J^{-1} S J \qquad (7.19)$

and

$\bar{S}_q = J^{-1} \bar{S} J \qquad (7.20)$
With this transformation, the control law in joint space takes the form

$\tau = \tau_f + \tau_p \qquad (7.21)$

where the force-loop and position-loop torques $\tau_f$ and $\tau_p$ are defined through Eqs. (7.22)-(7.24), with the force control action given by

$f_c = K_{pf}\, J^T \tilde{f}_s + K_{if}\, J^T \int_0^t \tilde{f}_s\, dt \qquad (7.25)$
$M(q) \in \mathbb{R}^{n\times n}$, $C(q, \dot{q}) \in \mathbb{R}^{n\times n}$, and $g(q) \in \mathbb{R}^{n}$ are the joint-space dynamic components of the manipulator, and $\tilde{q} \in \mathbb{R}^{n}$ is the vector of differences between the desired and the actual joint positions.
Fig. 7.14 Block diagram for hybrid controller with fixed gains in joint space
Based on experimental results with the control law expressed by Eqs. (7.21)-(7.25), performed on the Mitsubishi PA10 robot arm, it was observed that the control performance changes depending on the rigidity of the contact environment; hence, to keep a good performance from one surface to another, it was necessary to retune the control gains. A similar approach was proposed by Shih-Tin and Ang-Kiong (1998), but in a hierarchical way, by tuning the scaling factor of the fuzzy logic controller.
Our experimental results showed that $K_{pf}$ was the most sensitive gain, compared to the $K_{vf}$ and $K_{if}$ gains. For this reason, we proposed that only the gain $K_{pf}$ be supervised in a fuzzy manner, while $K_{vf}$ and $K_{if}$ were configured with constant values.
The proposed fuzzy control design is based on the control laws of Eqs. (7.21)-(7.25), with the difference that a supervisory fuzzy system is used to tune the control gain $\hat{K}_{pf}$ in the force control loop. Since the gain is now given by the function $\hat{K}_{pf}(x)$, Eq. (7.25) in the force control loop becomes
Fig. 7.15 Block diagram for hybrid controller with fuzzy gains in joint space
$f_c = \hat{K}_{pf}(x)\, J^T \tilde{f}_s + K_{if}\, J^T \int_0^t \tilde{f}_s\, dt \qquad (7.26)$
A fuzzy system $\hat{K}_{pf}(x)$ like the one represented in Eq. (7.3), with one input $x_1 = \tilde{f}_{sz}$ and one output $y_1$, is designed. We define $N_1$ fuzzy sets $A_1^{l_1}$ ($l_1 = 1, 2, \ldots, N_1$) for the input $x_1$, each of them described by a Gaussian membership function $\mu_{A_1^{l_1}}(x_1)$. For the output, singleton membership functions are selected.
The fuzzy system can be built from the set of $N_1$ fuzzy IF-THEN rules of the form

IF $x_1$ is $A_1^{l_1}$ THEN $\hat{K}_{pf}(x)$ is $\bar{y}_1^{l_1}$
The block diagram of the hybrid controller with fuzzy gains in joint space is
shown in Fig. 7.15.
Fig. 7.16 Situation of the task frame referred to the frame of the base of the robot
The controller was implemented by using Eqs. (7.21)-(7.24) and (7.26). The selection matrix for force $S$ and the selection matrix for position $\bar{S}$ were chosen as in Eqs. (7.28) and (7.29), respectively. On the other hand, the values of the diagonal gain matrices for the position control loop are given in Table 7.3, and the fixed gains for the force control loop are shown in Table 7.4. $K_{pf}$ was selected as the variable gain, and a fuzzy logic tuner was implemented for tuning such a gain.
$S = \begin{bmatrix} 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&1&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&0&0&0 \end{bmatrix} \qquad (7.28)$

$\bar{S} = \begin{bmatrix} 1&0&0&0&0&0 \\ 0&1&0&0&0&0 \\ 0&0&0&0&0&0 \\ 0&0&0&1&0&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix} \qquad (7.29)$
To approximate the gain through the fuzzy system $\hat{K}_{pf}(x)$, it receives an input $x_1 = \tilde{f}_{sz}$ with a universe of discourse partitioned into $N_1 = 3$ fuzzy sets: $A_1^1 = \text{FES}$ (Force Error Small), $A_1^2 = \text{FEM}$ (Force Error Medium), and $A_1^3 = \text{FEB}$ (Force Error Big). To build the fuzzy system, we propose to use an open-to-the-left Gaussian function, a Gaussian function, and an open-to-the-right Gaussian function, as shown in Fig. 7.18.
The partitions of the universe of discourse, using the notation $\varrho_{A_1} = \{\varrho_1, \varrho_2, \varrho_3\}$, were then selected for the input variable.
As already mentioned, the fuzzy system uses singleton functions for the output variable. The universe of discourse of the output is also partitioned into three impulse functions: KpS (Small $K_{pf}$ gain), KpM (Medium $K_{pf}$ gain), and KpB (Big $K_{pf}$ gain); this is shown in Fig. 7.19, where each parameter $\bar{h}$ corresponds to the position of one of the impulse functions. Using the notation $\bar{h}_{y_1} = \{\bar{h}_1, \bar{h}_2, \bar{h}_3\}$, the partitions of the universe of discourse for the output variable were then selected.
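A minimal sketch of the gain supervisor just described follows; all numeric partitions and singleton positions are placeholders (the actual values are reported in Castañon (2017)). The shapes mimic Figs. 7.18 and 7.19: an open-to-the-left Gaussian, a Gaussian, an open-to-the-right Gaussian, and three output singletons combined by center of average.

```python
import numpy as np

def gauss(x, c, s):
    return np.exp(-((x - c) / s) ** 2)

def mu_FES(e, c=5.0, s=5.0):
    # Open to the left: full membership below the center
    return 1.0 if e <= c else gauss(e, c, s)

def mu_FEM(e, c=25.0, s=8.0):
    return gauss(e, c, s)

def mu_FEB(e, c=45.0, s=8.0):
    # Open to the right: full membership above the center
    return 1.0 if e >= c else gauss(e, c, s)

# Hypothetical singleton positions (KpS, KpM, KpB) of the Kpf gain
theta = np.array([0.05, 0.15, 0.40])

def K_pf(force_error_z):
    e = np.abs(force_error_z)                    # magnitude of the z force error
    w = np.array([mu_FES(e), mu_FEM(e), mu_FEB(e)])
    return float(np.dot(w, theta) / np.sum(w))   # center of average, Eq. (7.3)

for e in (2.0, 25.0, 60.0):
    print(e, K_pf(e))
```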
The experiments carried out on the Mitsubishi PA10 robot arm with the hybrid controller with fuzzy gains were made on different materials (shown in Fig. 7.17) and with a desired force reference $f_{szd} = -50$ N (force applied downward along the z-axis). Figure 7.20 shows the response $f_{sz}$ to the applied force reference $f_{szd}$ and the force error $\tilde{f}_{sz}$ on the z-axis, applied on a sponge surface like the one in Fig. 7.17a.
The results of the hybrid controller with fuzzy gains applied to the PA10 robot manipulator in interaction with an expanded polystyrene surface are shown in Fig. 7.21. This figure shows the response $f_{sz}$ to the applied force reference $f_{szd} = -50$ N and the force error $\tilde{f}_{sz}$ on the z-axis, applied on an expanded polystyrene surface like the one in Fig. 7.17b.

This section shows the results of the hybrid controller with fuzzy gains applied to the PA10 robot manipulator in interaction with a wood board surface like the one in Fig. 7.17c. Figure 7.22 shows the response $f_{sz}$ to the applied force reference $f_{szd} = -50$ N and the force error $\tilde{f}_{sz}$ on the z-axis.
The following figures show the results of the hybrid controller with fuzzy gains applied to the PA10 robot manipulator in interaction with a glass surface like the one in Fig. 7.17d. Figure 7.23 shows the response $f_{sz}$ to the applied force reference $f_{szd} = -50$ N and the force error $\tilde{f}_{sz}$ on the z-axis.
The position and orientation errors are all very small, and they are reported in Castañon (2017).
7.6 Conclusions
The proposed hybrid force/position controller with fuzzy gains has the great advantage over the corresponding fixed-gain controller that it does not require retuning of the gains to exert a desired force on different types of materials with good performance. Conversely, the hybrid force/position controller with fixed gains requires the retuning of its gains for each material; in other words, for the fixed-gain controller, the best gains obtained for soft materials cannot be used on hard materials, because the system becomes unstable and very violent vibrations occur. This problem is not present in the proposed fuzzy version.
References
ATI Industrial Automation, Inc. (2018a). ATI F/T catalogs and manuals. Retrieved January 16, 2018, from http://www.ati-ia.com/products/ft/ft_literature.aspx.
ATI Industrial Automation, Inc. (2018b). ATI Industrial Automation: Multi-axis force/torque sensors. Retrieved January 16, 2018, from http://www.ati-ia.com/products/ft/sensors.aspx.
Bona, B., & Indri, M. (1992). Exact decoupling of the force-position control using the operational
space formulation. In Proceedings of IEEE International Conference on Robotics and
Automation, Nice, France, May.
Castañon, W. Z. (2017). Control difuso de fuerza para el robot manipulador Mitsubishi PA10-7CE,
Master dissertation, Instituto Tecnológico de la Laguna, Torreón, Coahuila, México.
Craig, J. J. (2006). Robotica. Upper Saddle River: Prentice Hall.
Craig, J. J., & Raibert, M. H. (1979). A systematic method of hybrid position/force control of a
manipulator. In Proceedings of the IEEE Computer Software and Applications Conference,
Chicago, IL, USA.
Khalil, W., Vijayalingam, A., Khomutenko B., Mukhanov I., Lemoine P., & Ecorchard, G. (2014).
OpenSYMORO: An open-source software package for Symbolic Modelling of Robots. IEEE/
ASME International Conference on Advanced Intelligent Mechatronics, Besancon, France.
pp. 1206–1211.
Khatib, O. (1987). A unified approach for motion and force control of robot manipulators: The
operational space formulation. IEEE Journal on Robotics and Automation, 3(1), 43–53.
Mason, M. T. (1981). Compliance and force control for computer controlled manipulators. IEEE
Transactions on Systems, Man and Cybernetics, 11(6), 418–432.
Maza, J. I., & Ollero, A. (2001). HEMERO: Herramienta MATLAB/Simulink para el estudio de manipuladores y robots móviles. Marcombo-Boixareu.
Nguyen, H., Prasad, R., & Walker, C. (2003). A first course in fuzzy and neural control. USA:
Chapman & Hall/CRC.
Salinas, A. (2011). Análisis e implementación de esquemas de control de interacción activa para robots manipuladores: Aplicación al robot Mitsubishi PA10. Master dissertation, Instituto Tecnológico de la Laguna, Torreón, Coahuila, México, December 2011.
Sciavicco, L., & Siciliano, B. (1996). Modelling and control of robot manipulators. Berlin: Springer.
Shih-Tin, L., & Ang-Kiong, H. (1998, August). Hierarchical fuzzy force control for industrial
robots. IEEE Transactions on Industrial Electronics, 45(4).
Shin, K. G., & Lee, C. P. (1985). Compliant control of robotic manipulators with resolved
acceleration. In Proceedings of 24th IEEE Conference on Decision and Control, Ft.
Lauderdale, FL, USA, December.
Vukobratovic, M., Surdilovic, D., Ekalo, Y., & Katic, D. (2009). Dynamics and robust control of
robot-environment interaction. Singapore: World Scientific.
Zadeh, L. A. (1965). Fuzzy sets, Information and Control, 8, 338–353.
Zhang, H., & Paul, R. (1985). Hybrid control of robot manipulator. In Proceedings of the IEEE
International Conference on Robotics and Automation.
Chapter 8
Modeling and Motion Control of the
6-3-PUS-Type Hexapod Parallel
Mechanism
Abstract This chapter reports the kinematics and dynamics models of the parallel
mechanism known as Hexapod, which has a structure of the type known as 6-3-
PUS. For computing the dynamics model, we start considering a non-minimal set of
generalized coordinates and employ the Euler–Lagrange formulation; after that, we
apply the so-called projection method to get a minimal model. It is worth noticing
that the modeling approach presented here can be used for similar robotic struc-
tures, and the resulting models are suitable for automatic control applications. The
computed analytical kinematics and dynamics models are validated by comparing
their results with numerical simulations carried out using the SolidWorks Motion
platform. In addition, this chapter describes the implementation of two motion
tracking controllers in a real Hexapod robot. The tested controllers are one with a two-loop structure (a kinematic controller in the outer loop and a PI velocity controller in the inner loop) and the other with an inverse dynamics structure. The experimental results of both controllers show a good performance.
8.1 Introduction
The first robot manipulators were inspired by the human arm; that is the reason why they had open kinematic chains and were later known as serial manipulators. However, with the passage of time, it became necessary to use a different type of structure: the parallel mechanisms. For any robot mechanism, the following models can be distinguished:
• The pose kinematics model gives the relation between the generalized coordi-
nates employed to describe the robot’s configuration, and those used for
describing the platform’s position and orientation (i.e., its pose) in space.
• The velocity kinematics model gives the relation between the first derivatives of
the generalized coordinates (or generalized velocities) and the linear and angular
velocity vectors of the platform.
• The dynamics model establishes the relation between the generalized coordi-
nates, their first and second derivatives (generalized velocities and accelera-
tions), and the generalized forces applied to the robot in order to produce its
motion.
• The statics model is a particular case of the dynamics model, when no motion
occurs; in other words, it gives the relation between the generalized coordinates
and the generalized forces.
It is a well-known fact that in the case of the kinematics of platform-type parallel
robots, a major difficulty arises when computing the platform’s pose from a given
set of generalized coordinates; that is called the forward pose kinematics model (or
FPK model, for simplicity). There exist several methods (both analytical and
numerical) to deal with this problem, but it can be shown that it always has multiple
solutions. On the other hand, the velocity kinematics model is useful for the
analysis of singularities in the robot’s workspace.
In recent years, many research works have been conducted on the dynamics
modeling of parallel manipulators. Several methods or formulations have been
proposed to find the equations of motion governing the dynamics of such
mechanisms, being two of the most important the Newton–Euler formulation and
the Euler–Lagrange formulation.
Despite its widespread use, the Newton–Euler formulation requires the com-
putation of all constraint forces and moments between the links, but these are
usually not necessary for simulation and control purposes. On the other hand, the
Euler–Lagrange formulation has several advantages, such as (a) the possibility of
using generalized coordinates, (b) the use of energy (rather than forces) which is a
scalar quantity, and (c) the possibility of excluding from the analysis the constraint
forces that do not directly produce the motion of the robot.
But independently of the formulation employed to compute the dynamics
equations, it is now a common practice to employ a non-minimal set of generalized
coordinates, and then to apply a method for reducing those equations and getting
the minimal dynamics model. Such a method is in general known as the projection
method (see, e.g., Arczewski and Blajer 1996; Blajer 1997; Ghorbel et al. 2000;
Betsch 2005).
It is worth mentioning here that although the dynamics of the original Gough–
Stewart platform (UPS type) has been subject of numerous studies (see Geng et al.
1992; Liu et al. 2000), little has been reported about the dynamics of platform
mechanisms with different kinematic chains. Narayanan et al. (2010) and Carbonari
et al. (2011) deal with the kinematics of a 6-3-PUS platform such as the one studied
in this paper but, to the best of the authors' knowledge, there is no previous study about the dynamics of such a mechanism.
The literature regarding the control of parallel mechanisms is not as vast as for
serial manipulators. Nevertheless, since the work of Murray and Lovell (1989), it
has become apparent that the possibility of getting a minimal model for a
closed-chained mechanism allows to apply to this kind of systems the same type of
controllers as for serial robots. As pointed out by Ghorbel et al. (2000), the main
issue to take into account when proceeding this way is that the (Lyapunov) stability
conclusions will at best be local due to the structural singularities of parallel
mechanisms.
The aim of this paper is threefold. First, we recall the basics on kinematics and
dynamics modeling of parallel robots; in the case of the dynamics model, we focus
on the Euler–Lagrange formulation and explain the generalities of the projection
method in order to show its application for computing the minimal dynamics
model. Secondly, after describing the Quanser’s Hexapod robot, we compute both
its kinematics and dynamics models, and they are validated by comparing the
results generated numerically by SolidWorks Motion. Moreover, we show the
experimental results of the application of two model-based motion controllers to
the Hexapod robot.
The chapter is organized as follows. Section 8.2 recalls the generalities of the
kinematics and dynamics modeling of parallel robots. Section 8.3 introduces the
Quanser’s Hexapod robot, while Sects. 8.4 and 8.5 describe the derivation of
the kinematics and dynamics models of such mechanism, respectively. The vali-
dation of such models is provided in Sect. 8.6, and the real-time experiments are
described in Sect. 8.7. Finally, Sect. 8.8 gives concluding remarks.
and

$J_\psi(q) = \frac{\partial \psi(q)}{\partial q} = \begin{bmatrix} \frac{\partial \alpha(q)}{\partial q} \\ \frac{\partial \gamma(q)}{\partial q} \end{bmatrix} = \begin{bmatrix} J_\alpha(q) \\ J_\gamma(q) \end{bmatrix} \in \mathbb{R}^{m\times m}. \qquad (8.3)$
$\frac{\partial \bar{\psi}(\bar{q}, q)}{\partial q} = J_\psi(q), \qquad (8.5)$
then we can apply the implicit function theorem (see Dontchev and Rockafellar 2014) to show that, for any $q_0 \in \Omega_q$, there is a neighborhood $N_q$ of $q_0$ and a neighborhood $N_{\bar{q}}$ of $\bar{q}_0 = \alpha(q_0)$ such that, for any $\bar{q} \in N_{\bar{q}}$, there exists a unique $q \in N_q$ and a continuously differentiable function $\sigma : N_{\bar{q}} \to N_q$ such that

$q = \sigma(\bar{q}) \qquad (8.6)$
where $I \in \mathbb{R}^{n\times n}$ and $O \in \mathbb{R}^{r\times n}$ are the identity and null matrices, respectively.
Let $\Omega^*_q \subset \Omega_q$ denote the largest subset of $\Omega_q$ containing $q_0$ for which the unique parameterization of Eq. (8.6) holds, and let $\Omega^*_{\bar{q}}$ be the corresponding domain of $\sigma$. Then we have a diffeomorphism from $\Omega^*_q$ to $\Omega^*_q$ as follows:

$\Omega^*_q \xrightarrow{\ \alpha\ } \Omega^*_{\bar{q}} \xrightarrow{\ \sigma\ } \Omega^*_q. \qquad (8.8)$
Notice that, unlike $\alpha$, which can be easily found, $\sigma$ cannot in general be expressed explicitly in an analytical form (sometimes it can only be computed iteratively by numerical methods), but the previous analysis shows that whenever $\bar{q} \in \Omega^*_{\bar{q}}$, there is always a unique solution $q = \sigma(\bar{q}) \in \Omega^*_q$ for which $\bar{q} = \alpha(q) \in \Omega^*_{\bar{q}}$ holds (Ghorbel et al. 2000). An estimate of the domain $\Omega^*_{\bar{q}}$ is also proposed in Ghorbel et al. (2000).
It is worth noting also that, for a given $\bar{q} \in \Omega_{\bar{q}}$, we can also find other solutions for the mapping $\Omega_{\bar{q}} \to \Omega_q$ different from $\sigma$. Let us denote by $\sigma'$ any of those solutions; then

$q' = \sigma'(\bar{q}).$
Let the pose of the platform be described by the vector of operational coordinates

$\xi = [\, x\ y\ z\ \lambda\ \mu\ \nu \,]^T \in \mathbb{R}^6;$

then the FPK model can be written as

$\xi = h(\bar{q}), \qquad (8.9)$

where $h : \Omega_{\bar{q}} \to \Omega_\xi$ (with $\Omega_\xi$ being the set of all admissible poses of the platform) is known as the FPK function of the robot. But it is a well-known fact that, in the case of parallel robots, the FPK model has multiple solutions, in the sense that a single set of active joints can produce different poses of the platform.
Now, let us define a function $\chi : \Omega_q \to \Omega_\xi$. For $q \in \Omega^*_q \subset \Omega_q$, we can write $\xi = \chi(q) \in \Omega^*_\xi \subset \Omega_\xi$, and using Eq. (8.6) we get

$\xi = \chi(\sigma(\bar{q})); \qquad (8.10)$

comparing Eqs. (8.10) and (8.11) with Eq. (8.9), we conclude that the FPK function $h$ can be either $h = \chi \circ \sigma$ or $h = \chi \circ \sigma'$, with $\circ$ the standard symbol for function composition.
Figure 8.2 shows the diagram of the sets $\Omega_q$, $\Omega_{\bar{q}}$, $\Omega_\xi$ and the functions among them. The relevance of the sets $\Omega^*_q$ and $\Omega^*_{\bar{q}}$ lies in the fact that they can correspond to actual (or real) configurations of the robot. Indeed, if $q_0 \in \Omega_q$ is chosen to be a known configuration of the real robot (e.g., its home configuration), then the definition of a smooth function $\alpha$ and the implicit function theorem guarantee the existence of the sets $\Omega^*_{\bar{q}}$, $\Omega^*_\xi$ and the functions $\sigma$ and $\chi$.
In general, the computation of the FPK model becomes a major problem due to
the complexity of the equations involved and the difficulty to find a closed set of
solutions. The methods for solving the FPK model can be classified into analytical and numerical ones. The analytical methods allow obtaining all the possible solutions of the FPK model (even those that are not physically realizable, due to mechanical constraints); however, we are often interested in knowing only the solution that describes the actual pose of the platform (corresponding to $\xi \in \Omega^*_\xi$), so iterative numerical methods are sufficient. Several analytical methods can be employed for solving the FPK model of a parallel robot (see, e.g., Merlet 2006); among them, the so-called elimination methods are of particular interest (Kapur 1995).
The main idea of an elimination method is to manipulate the equations of the FPK in order to reduce the problem to the solution of a univariate polynomial whose real roots enable determining all the possible poses of the platform. A drawback of this procedure is that it can be performed in several different ways, not all of them leading to the same degree of the resulting polynomial (Merlet 1999). Therefore, it is necessary to find the univariate polynomial with the least degree. Such a degree can be obtained, for example, using Bezout's theorem (Merlet 2006). But once the roots of the polynomial are computed, it is necessary to determine which one gives the actual configuration of the robot. We will consider that such a configuration (given by $q \in \Omega^*_q$) and the corresponding pose of the platform ($\xi \in \Omega^*_\xi$) can be determined by considering the diffeomorphism of Eq. (8.8).
It is worth mentioning here that the FPK of a large number of mechanisms can
be determined by studying equivalent mechanisms for which the univariate poly-
nomial can be easily extracted (Merlet 2006). For example, in the case of the
mechanism under study, its 6-3-PUS structure can be analyzed as a 3-PRPS type
(Carbonari et al. 2011).
Now, let $v \in \mathbb{R}^3$ and $\omega \in \mathbb{R}^3$ be, respectively, the vectors of linear and angular velocities of the center of mass (com) of the platform. Then the forward velocity kinematics (FVK) model can be written as:

$\begin{bmatrix} v \\ \omega \end{bmatrix} = J(\bar{q})\dot{\bar{q}}$

where $\dot{\bar{q}} = \frac{d}{dt}\bar{q} \in \mathbb{R}^n$, and $J(\bar{q}) \in \mathbb{R}^{6\times n}$ is known as the geometric Jacobian matrix of the robot.
Taking the time derivative of Eq. (8.1), we get

$J_\gamma(q)\dot{q} = 0 \in \mathbb{R}^r \qquad (8.12)$

where

$J_\gamma(q) = \frac{\partial \gamma(q)}{\partial q} \in \mathbb{R}^{r\times m}, \qquad (8.13)$
denoted here as the constraint Jacobian, which was already used in Eq. (8.3). Moreover, taking the time derivative of Eq. (8.6) we get the relation between the vectors of minimal and non-minimal generalized velocities, i.e.,

$\dot{q} = A(q)\dot{\bar{q}} \in \mathbb{R}^m \qquad (8.14)$

with

$A(q) = \frac{\partial \sigma(\bar{q})}{\partial \bar{q}} \in \mathbb{R}^{m\times n}. \qquad (8.15)$

Substituting Eq. (8.14) in Eq. (8.12) gives

$J_\gamma(q)A(q)\dot{\bar{q}} = 0 \in \mathbb{R}^r \qquad (8.16)$

and it can be shown (see, e.g., Blajer 1997) that Eq. (8.16) implies:

$J_\gamma(q)A(q) = O \in \mathbb{R}^{r\times n}.$
Now let us assume that the pose of the platform is known and given in terms of the position vector $r_F$ and the rotation matrix $^0R_F$, as functions of $q$; in other words, we know $r_F(q)$ and $^0R_F(q)$ (which is a parameterization of $\chi(q)$). Then the linear velocity vector $v$ is simply computed as

$v = \dot{r}_F(q) = \frac{\partial r_F(q)}{\partial q}\dot{q} = \frac{\partial r_F(q)}{\partial q}A(q)\dot{\bar{q}}, \qquad (8.18)$
where we have employed the chain rule and Eq. (8.14). And if the columns of the matrix $^0R_F(q)$ are the orthonormal vectors $\hat{x}_F(q)$, $\hat{y}_F(q)$, and $\hat{z}_F(q)$, i.e.,

$^0R_F(q) = [\, \hat{x}_F(q)\ \hat{y}_F(q)\ \hat{z}_F(q) \,] \in SO(3),$
then it is possible to show (see Campa and de la Torre 2009) that the angular velocity vector $\omega$ can be obtained using the following:

$\omega = \frac{1}{2}\left[\, S(\hat{x}_F(q))\dot{\hat{x}}_F(q) + S(\hat{y}_F(q))\dot{\hat{y}}_F(q) + S(\hat{z}_F(q))\dot{\hat{z}}_F(q) \,\right] \qquad (8.19)$
Again, using the chain rule and Eq. (8.14), we can rewrite Eq. (8.19) as

$\omega = \frac{1}{2}\left[\, S(\hat{x}_F(q))\frac{\partial \hat{x}_F(q)}{\partial q} + S(\hat{y}_F(q))\frac{\partial \hat{y}_F(q)}{\partial q} + S(\hat{z}_F(q))\frac{\partial \hat{z}_F(q)}{\partial q} \,\right]A(q)\dot{\bar{q}} \qquad (8.20)$
To complete this section, let us consider the time derivative of the FPK model Eq. (8.9), i.e.,

$\dot{\xi} = \frac{\partial h(\bar{q})}{\partial \bar{q}}\dot{\bar{q}} = J_A(\bar{q})\dot{\bar{q}}$
The minimal dynamics model of a robot manipulator is the mapping between the generalized forces exerted on the links by the active joints (named here $\tau_{\bar{q}}$) and the minimal generalized coordinates, velocities, and accelerations (i.e., $\bar{q}$, $\dot{\bar{q}}$, and $\ddot{\bar{q}}$, respectively).
$\mathcal{K} = \sum_{l=1}^{b} \mathcal{K}_l, \qquad \mathcal{U} = \sum_{l=1}^{b} \mathcal{U}_l \qquad (8.21)$

where $\mathcal{K}_l$ and $\mathcal{U}_l$ are, respectively, the kinetic and potential energies of the l-th rigid body.
In order to compute $\mathcal{K}_l$ and $\mathcal{U}_l$, it is customary to consider again the fixed (inertial) coordinate frame $R_o$ and a coordinate frame $R_l$ attached to the l-th body, usually in accordance with the Denavit-Hartenberg convention (see, e.g., Siciliano et al. 2009). We can now write:

$\mathcal{K}_l = \frac{1}{2} m_l\, v_l^T v_l + \frac{1}{2}\, {}^l\omega_l^T I_l\, {}^l\omega_l \qquad (8.22)$

$\mathcal{U}_l = -m_l\, p_l^T g_o \qquad (8.23)$
where
• $m_l$ is the mass of the l-th body;
• $I_l$ is the inertia tensor of the l-th body with respect to a frame with origin at its com, but oriented as $R_l$;
• $p_l$ is the position vector of the com of the l-th body, with respect to frame $R_o$;
• $v_l$ is the linear velocity vector of the com of the l-th body, with respect to frame $R_o$ ($v_l = \dot{p}_l$);
• $^l\omega_l$ is the angular velocity vector of the l-th body, with respect to frame $R_o$, expressed in the coordinates of $R_l$;
• $g_o$ is the constant vector of gravitational acceleration, with respect to frame $R_o$.
Consider the case where the system is described by the minimal set of generalized coordinates $\bar{q} \in \mathbb{R}^n$. If the pose of the l-th rigid body is given by the position vector $p_l(\bar{q}) \in \mathbb{R}^3$ and the rotation matrix
$^0R_l(\bar{q}) = [\, \hat{x}_l(\bar{q})\ \hat{y}_l(\bar{q})\ \hat{z}_l(\bar{q}) \,] \in SO(3),$

then the linear velocity vector is simply $v_l = \dot{p}_l$, and the angular velocity vector $^l\omega_l$ can be computed using an expression like Eq. (8.19), i.e.,

$^l\omega_l = \frac{1}{2}\, {}^0R_l^T\left[\, S(\hat{x}_l(\bar{q}))\dot{\hat{x}}_l(\bar{q}) + S(\hat{y}_l(\bar{q}))\dot{\hat{y}}_l(\bar{q}) + S(\hat{z}_l(\bar{q}))\dot{\hat{z}}_l(\bar{q}) \,\right]$

where the left-multiplying rotation matrix allows expressing the angular velocity vector $\omega_l$ (computed with respect to $R_o$) in the coordinates of $R_l$.
By using Eqs. (8.21)-(8.23), we can now compute the total kinetic and potential energies, $\mathcal{K}(\bar{q}, \dot{\bar{q}})$ and $\mathcal{U}(\bar{q})$.
Then the Lagrangian function of the robot is defined as

$L(\bar{q}, \dot{\bar{q}}) = \mathcal{K}(\bar{q}, \dot{\bar{q}}) - \mathcal{U}(\bar{q})$

and it can be shown that the inverse dynamics model of such a system is given by the so-called Euler-Lagrange equations of motion:

$\frac{d}{dt}\left[\frac{\partial L(\bar{q}, \dot{\bar{q}})}{\partial \dot{\bar{q}}}\right] - \frac{\partial L(\bar{q}, \dot{\bar{q}})}{\partial \bar{q}} = \tau_{\bar{q}}. \qquad (8.24)$
Expanding Eq. (8.24) leads to the model $M_{\bar{q}}(\bar{q})\ddot{\bar{q}} + C_{\bar{q}}(\bar{q}, \dot{\bar{q}})\dot{\bar{q}} + g_{\bar{q}}(\bar{q}) = \tau_{\bar{q}}$, where $M_{\bar{q}}(\bar{q}) \in \mathbb{R}^{n\times n}$ is known as the robot inertia matrix, $C_{\bar{q}}(\bar{q}, \dot{\bar{q}}) \in \mathbb{R}^{n\times n}$ is the matrix of terms arising from the centrifugal and Coriolis forces, and $g_{\bar{q}}(\bar{q}) \in \mathbb{R}^n$ represents the vector of forces due to gravity.
But if we choose a non-minimal set of generalized coordinates, given by $q \in \mathbb{R}^m$, to describe the same system, then the dynamics must also include the holonomic constraints given by Eq. (8.1). The total kinetic and potential energies should now be $\mathcal{K}(q, \dot{q})$ and $\mathcal{U}(q)$, respectively, and for their computation we can also use Eqs. (8.21)-(8.23).
If the pose of the l-th rigid body is given by $p_l(q) \in \mathbb{R}^3$ and $^0R_l(q) = [\, \hat{x}_l(q)\ \hat{y}_l(q)\ \hat{z}_l(q) \,] \in SO(3)$, then $v_l$ and $^l\omega_l$ can be computed using the following expressions:

$v_l = \frac{\partial p_l(q)}{\partial q} A(q)\dot{\bar{q}} \qquad (8.25)$

$^l\omega_l = \frac{1}{2}\, {}^0R_l(q)^T\left[\, S(\hat{x}_l(q))\frac{\partial \hat{x}_l(q)}{\partial q} + S(\hat{y}_l(q))\frac{\partial \hat{y}_l(q)}{\partial q} + S(\hat{z}_l(q))\frac{\partial \hat{z}_l(q)}{\partial q} \,\right] A(q)\dot{\bar{q}} \qquad (8.26)$
where the time derivative of $q$ is not explicitly required (and, as will be shown in Sect. 8.5, that fact is useful when computing the dynamics model of parallel robots).
The Lagrangian function now becomes

$L(q, \dot{q}) = \mathcal{K}(q, \dot{q}) - \mathcal{U}(q),$
and the expansion of the Lagrange equations of motion in this case leads to the following expression:

$M_q(q)\ddot{q} + C_q(q, \dot{q})\dot{q} + g_q(q) = \tau_q + J_\gamma(q)^T\lambda \qquad (8.27)$

where now $M_q(q) \in \mathbb{R}^{m\times m}$ represents the inertia matrix, $C_q(q, \dot{q}) \in \mathbb{R}^{m\times m}$ the matrix of centrifugal and Coriolis forces, and $g_q(q) \in \mathbb{R}^m$ the vector of gravitational forces; $J_\gamma(q)$ is defined in Eq. (8.13), and $\lambda$ is the vector of Lagrange multipliers, which ensures that the constraints in Eq. (8.1) are fulfilled.
It is worth mentioning that an alternative for computing the matrices $M_q(q)$, $C_q(q, \dot{q})$, and $g_q(q)$, without explicitly expanding the Lagrange equations of motion, is employing the following properties (Kelly et al. 2005):

$\mathcal{K}(q, \dot{q}) = \frac{1}{2}\dot{q}^T M_q(q)\dot{q} \qquad (8.28)$

$\frac{\partial \mathcal{K}(q, \dot{q})}{\partial q} = \left[\, \dot{M}_q(q) - C_q(q, \dot{q}) \,\right]\dot{q} \qquad (8.29)$

$\frac{\partial\, \mathcal{U}(q)}{\partial q} = g_q(q) \qquad (8.30)$
where

$\tau_{\bar{q}} = A(q)^T \tau_q. \qquad (8.35)$
This last step is known in general as the projection method and has been
employed by different authors (see Arczewski and Blajer 1996; Blajer 1997;
Ghorbel et al. 2000; Betsch 2005) to reduce the dynamics of constrained
mechanical systems. It should be clear that Eq. (8.31) represents the minimal
inverse dynamics model of the robot.
As mentioned in Ghorbel et al. (2000), however, the reduced dynamics model described above in Eq. (8.31) has two special characteristics which make it different from regular dynamics models of open-chain mechanical systems. First, the above reduced model is valid only (locally) for $q$ in the compact set $\Omega^*_q$. Second, since the parameterization $q = \sigma(\bar{q})$ is implicit, it is an implicit model.
Now, after mentioning the above concepts, we can list the necessary steps to get
the dynamics model of a system subject to holonomic constraints using the Euler–
Lagrange formulation and the projection method:
1. Define the sets of minimal and non-minimal coordinates so that $\bar{q} \in \Omega_{\bar{q}} \subset \mathbb{R}^n$ and $q \in \Omega_q \subset \mathbb{R}^m$.
2. Determine the functions $\gamma(q)$, $\alpha(q)$, and $\sigma(\bar{q})$ (if possible), so that Eqs. (8.1), (8.2), and (8.6) are met.
3. Compute $J_\psi(q)$ and $A(q)$ using Eqs. (8.3) and (8.7) (or Eq. (8.15), if possible).
4. Express the pose of each of the b rigid bodies of the robot in terms of $q$, i.e., find $p_l(q)$ and $^0R_l(q)$.
5. Compute the vectors of linear velocity $v_l$ and angular velocity $^l\omega_l$ using Eqs. (8.25) and (8.26).
6. Compute the total kinetic $\mathcal{K}(q, \dot{q})$ and potential $\mathcal{U}(q)$ energies using Eqs. (8.21)-(8.23).
7. Find the matrices of the non-minimal model Eq. (8.27), either expanding Eq. (8.24) or using Eqs. (8.28)-(8.30).
8. Left-multiply the model Eq. (8.27) by $A(q)^T$ to get the minimal dynamics model Eq. (8.31), as illustrated in the sketch below.
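The reduction in steps 7 and 8 can be sketched numerically. All matrices below are random placeholders standing in for the analytic $M_q(q)$, $C_q(q,\dot{q})\dot{q} + g_q(q)$, $A(q)$, and $\dot{A}$; the sketch only illustrates the projection pattern, not the Hexapod's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 9                        # minimal / non-minimal dimensions

# Random placeholders standing in for the analytic quantities
Mq = rng.standard_normal((m, m)); Mq = Mq @ Mq.T + m * np.eye(m)  # SPD M_q(q)
h  = rng.standard_normal(m)        # stands for C_q(q, dq) dq + g_q(q)
A  = rng.standard_normal((m, n))   # stands for A(q)
Ad = rng.standard_normal((m, n))   # stands for dA/dt
tau_q = rng.standard_normal(m)     # non-minimal generalized forces

qbar_d  = rng.standard_normal(n)   # minimal velocities
qbar_dd = rng.standard_normal(n)   # minimal accelerations

# Non-minimal accelerations implied by q = sigma(qbar):
q_dd = A @ qbar_dd + Ad @ qbar_d

# Left-multiplying the non-minimal model by A^T removes the constraint term
# J_gamma^T lambda (since J_gamma A = O), leaving the minimal balance
# A^T (M_q q_dd + h) = A^T tau_q  (nonzero here because inputs are random):
residual = A.T @ (Mq @ q_dd + h) - A.T @ tau_q

# The projected inertia matrix A^T M_q A stays symmetric:
M_bar = A.T @ Mq @ A
print(M_bar.shape, np.allclose(M_bar, M_bar.T))
```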
Figure 8.3 is again a picture of the Hexapod mechanism, but now with some marks
which will be useful for modeling the robot kinematics. These marks will be
described in the following paragraphs.
Points $T_1$, $T_2$, and $T_3$ define the vertices of an equilateral triangle which is fixed to the base and has a side length $L_B$. For simplicity, it is assumed that the centers of the six universal joints (labeled $D_0, D_1, \ldots, D_5$ in Fig. 8.3) lie on one of the sides of the $T_1T_2T_3$ triangle. Attached to the center of this base triangle is the reference
frame $R_0(X_0, Y_0, Z_0)$, with an orientation such that the $X_0$ axis points toward the vertex $T_2$, the $Y_0$ axis is parallel to the side $T_1T_3$, and the $Z_0$ axis points upward.
The points denoted as $Q_1$, $Q_2$, and $Q_3$ are placed where the last axis of each spherical joint meets the surface of the mobile platform, and they form a rigid equilateral triangle. The coordinate frame $R_F(X_F, Y_F, Z_F)$ is attached to the mobile platform, with its origin placed at the geometric center of triangle $Q_1Q_2Q_3$, and its orientation is such that the $X_F$ axis points to the center of the $Q_2Q_3$ side, and the $Z_F$ axis is normal to the $Q_1Q_2Q_3$ triangle and points upward. Due to the mechanical design of the spherical joints, the triangle defined by the points $P_1$, $P_2$, and $P_3$ (points located at the center of the spherical joints) is rigid and equilateral (see Fig. 8.3); moreover, the $P_1P_2P_3$ triangle is always parallel to the $Q_1Q_2Q_3$ triangle and has its same dimensions, so that together they constitute a rigid right triangular prism whose side length is $L_P = L_B/2$ and whose height is $H_{PQ}$.
The intrinsic symmetry of this mechanism greatly simplifies its kinematic analysis. Assuming hereinafter that the triad $(i, j, k)$ is an element of the set $S_3 = \{(1,2,3), (2,3,1), (3,1,2)\}$ of cyclic permutations, it is possible to obtain expressions for one side of the base equilateral triangle which are similar to those of the other two sides, simply by changing the indexes in the corresponding expressions.
In order to simplify the forthcoming analysis, a frame $R_{T_i}(X_{T_i}, Y_{T_i}, Z_{T_i})$ is assigned to each vertex $T_i$ of the base triangle. As shown in the schematic diagram of Fig. 8.4a, this frame is such that the axis $X_{T_i}$ has the direction of the vector $r_{T_kT_i}$, which goes from the point $T_k$ to the point $T_i$ (hereinafter, unless otherwise indicated, this vector notation will be used), and the axis $Z_{T_i}$ is always
perpendicular to the base and points upward. The matrix relating the orientation of frames $R_{T_i}$ and $R_0$ is:

$^0R_{T_i}(\beta_i) = \begin{bmatrix} \cos\beta_i & -\sin\beta_i & 0 \\ \sin\beta_i & \cos\beta_i & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad (8.36)$

where $\beta_i$ is the angle from $X_0$ to $X_{T_i}$ around the $Z_0$ axis, so that $\beta_1 = 270°$, $\beta_2 = 30°$, and $\beta_3 = 150°$.
Besides the centers of the six universal joints, denoted by $D_0, D_1, \ldots, D_5$, Fig. 8.4 also shows the midpoints of the segments $T_kT_i$, indicated by $B_i$. The active joint variables are named $q_0, q_1, \ldots, q_5$ (see Fig. 8.4a). Note that $q_{2i-2}$ and $q_{2i-1}$ $(i = 1, 2, 3)$ are the distances from the point $B_i$ to the points $D_{2i-2}$ and $D_{2i-1}$, respectively, which are on the same side of the $T_1T_2T_3$ triangle.
The active joint variables can be grouped into the following vector of active joint coordinates:

$\bar{q} = [\, q_0\ q_1\ q_2\ q_3\ q_4\ q_5 \,]^T \in \mathbb{R}^6.$
8.4 Kinematics
In this section, we describe the computation of the pose and velocity kinematics
models of the Hexapod parallel robot. It is worth mentioning that the kinematics
analysis of this mechanism has been previously reported in Campa et al. (2016).
As mentioned in Sect. 8.2.1, to get the FPK model of a parallel robot we need to calculate the position and orientation of the mobile platform, given respectively by $r_F(q)$ and $^0R_F(q)$. In order to get these expressions, we will first compute the position vector with respect to frame $R_0$ of each point $P_i$, i.e., $r_{P_i}$, as a function of $q$. From Fig. 8.4b, we can verify that the vector $r_{P_i}$ is given by:

$r_{P_i} = r_{B_i} + r_{B_iC_i} + r_{C_iP_i} \qquad (8.37)$
Vectors $r_{B_iC_i}$ and $r_{C_iP_i}$ can be expressed in terms of the joint variables of the corresponding side (i.e., $q_{2i-2}$ and $q_{2i-1}$); in particular,

$r_{B_iC_i} = {}^0R_{T_i}(\beta_i)\, {}^i r_{B_iC_i} = {}^0R_{T_i}(\beta_i)\begin{bmatrix} \frac{q_{2i-1} - q_{2i-2}}{2} \\ 0 \\ 0 \end{bmatrix} \qquad (8.39)$

where $\phi_i$ (the angle between $Y_{C_i}$ and $Y_{T_i}$ around $X_{T_i}$) is in general a function of $q$.
Substituting Eqs. (8.38), (8.39), and (8.40) in Eq. (8.37), we get:

$r_{P_i} = {}^0R_{T_i}(\beta_i)\begin{bmatrix} r_{B_iC_i} \\ -\frac{L_B}{2\sqrt{3}} + r_{C_iP_i}\cos\phi_i \\ r_{C_iP_i}\sin\phi_i \end{bmatrix} \qquad (8.41)$

where

$r_{B_iC_i} = \frac{q_{2i-1} - q_{2i-2}}{2}, \qquad (8.42)$

$r_{C_iP_i} = \frac{\sqrt{4L^2 - \rho_i^2}}{2}, \qquad (8.43)$

with

$\rho_i = q_{2i-1} + q_{2i-2}. \qquad (8.44)$
It is worth noticing here that Eqs. (8.42) and (8.43) can be considered to perform a change of coordinates from $(q_{2i-1}, q_{2i-2})$ to $(r_{B_iC_i}, r_{C_iP_i})$ and, as shown in Eq. (8.41), $r_{B_iC_i}$ and $r_{C_iP_i}$ together with $\phi_i$ can be used to describe the position of point $P_i$. The same rationale is applied in Carbonari et al. (2011) to transform the 2-1-PUS mechanisms in each side of the Hexapod into the equivalent PRPS structure.
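As a small numerical illustration of Eqs. (8.36) and (8.41)-(8.44), the sketch below computes $r_{P_i}$ for given joint values; the dimensions $L_B$ and $L$ and the configuration values are placeholders, and the sign conventions follow the reconstruction of Eq. (8.41) above.

```python
import numpy as np

L_B = 0.60          # base triangle side (placeholder value)
L   = 0.25          # leg length (placeholder value)
beta = {1: np.deg2rad(270), 2: np.deg2rad(30), 3: np.deg2rad(150)}

def R0Ti(b):
    # Eq. (8.36): rotation of frame T_i with respect to the base frame
    return np.array([[np.cos(b), -np.sin(b), 0.0],
                     [np.sin(b),  np.cos(b), 0.0],
                     [0.0,        0.0,       1.0]])

def r_Pi(i, q_even, q_odd, phi):
    # q_even = q_{2i-2}, q_odd = q_{2i-1}; Eqs. (8.42)-(8.44)
    r_BC = (q_odd - q_even) / 2.0
    rho  = q_odd + q_even
    r_CP = np.sqrt(4.0 * L**2 - rho**2) / 2.0
    # Eq. (8.41): position of the spherical joint center P_i
    local = np.array([r_BC,
                      -L_B / (2.0 * np.sqrt(3.0)) + r_CP * np.cos(phi),
                      r_CP * np.sin(phi)])
    return R0Ti(beta[i]) @ local

print(r_Pi(1, 0.17, 0.18, np.deg2rad(75)))
```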
Further, it should be noted that, given the vectors $r_{P_i}$, it is possible to compute:

$r_{21} = r_{P_1} - r_{P_2}, \qquad r_{23} = r_{P_3} - r_{P_2}. \qquad (8.45)$
On the other hand, from the geometry of the robot, it is possible to verify that the rotation matrix $^0R_F$ (whose columns 1, 2, and 3 are unit vectors in the direction of the axes $X_F$, $Y_F$, and $Z_F$, respectively, with respect to $R_0$) is given by:

$^0R_F = \left[\, \frac{2}{\sqrt{3}L_P^3}\, r_{23}\times(r_{23}\times r_{21}) \quad \frac{1}{L_P}\, r_{23} \quad \frac{2}{\sqrt{3}L_P^2}\, r_{23}\times r_{21} \,\right], \qquad (8.47)$

where

$\hat{x}_F(q) = \frac{2}{\sqrt{3}L_P^3}\, r_{23}\times(r_{23}\times r_{21}), \qquad \hat{y}_F(q) = \frac{1}{L_P}\, r_{23}, \qquad \hat{z}_F(q) = \frac{2}{\sqrt{3}L_P^2}\, r_{23}\times r_{21} \qquad (8.49)$
where $^Fr_{P_1F}$ is the vector from $P_1$ to $F$ with respect to frame $R_F$; it should be noted that $r_{P_1F}$ has components only in the direction of the $X_F$ and $Z_F$ axes, i.e., $^Fr_{P_1F} = \left[\, L_P/\sqrt{3}\ \ 0\ \ H_{PQ} \,\right]^T$, so that substituting Eq. (8.48) in Eq. (8.50) we get:

$r_F = r_{P_1} + \frac{1}{3}r_{23} - \frac{2}{3}r_{21} + \frac{2H_{PQ}}{\sqrt{3}L_P^2}\left(r_{23}\times r_{21}\right),$

or, equivalently,

$r_F = \frac{1}{3}\left(r_{P_1} + r_{P_2} + r_{P_3}\right) + H_{PQ}\,\hat{z}_F. \qquad (8.51)$
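Given the three vectors $r_{P_i}$, Eqs. (8.47)-(8.51) thus determine the platform pose. A sketch with placeholder geometric constants follows; here $\hat{x}_F$ is obtained as $\hat{y}_F \times \hat{z}_F$, which, for an orthonormal frame, is equivalent to the double cross product of Eq. (8.47).

```python
import numpy as np

L_P  = 0.30    # platform triangle side (placeholder)
H_PQ = 0.05    # prism height (placeholder)

def platform_pose(rP1, rP2, rP3):
    r23 = rP3 - rP2                    # per the r_{T_k T_i} convention
    r21 = rP1 - rP2
    y_F = r23 / L_P                                        # Eq. (8.49)
    z_F = 2.0 / (np.sqrt(3.0) * L_P**2) * np.cross(r23, r21)
    x_F = np.cross(y_F, z_F)           # completes the right-handed frame
    R0F = np.column_stack((x_F, y_F, z_F))                 # Eq. (8.47)
    r_F = (rP1 + rP2 + rP3) / 3.0 + H_PQ * z_F             # Eq. (8.51)
    return R0F, r_F

# Equilateral triangle of side L_P lying at height 0.4 m (illustrative)
rP1 = np.array([ L_P/np.sqrt(3.0), 0.0, 0.4])
rP2 = np.array([-L_P/(2*np.sqrt(3.0)),  L_P/2, 0.4])
rP3 = np.array([-L_P/(2*np.sqrt(3.0)), -L_P/2, 0.4])
R0F, rF = platform_pose(rP1, rP2, rP3)
print(np.round(R0F, 3)); print(np.round(rF, 3))
```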
From Eqs. (8.45), (8.47), and (8.51), we have that, in order to get the FPK model of the Hexapod, it is sufficient to know the vectors $r_{P_1}$, $r_{P_2}$, and $r_{P_3}$. However, to obtain $r_{P_i}$ as a function of $\bar{q}$, from Eq. (8.41) we need to compute $\phi_1$, $\phi_2$, and $\phi_3$ as functions of $\bar{q}$, and that is precisely the most difficult task when obtaining the FPK model of this robot.
In order to do so, we start by noticing that, substituting Eq. (8.41) in Eq. (8.46), we can get three expressions of the form given in Eqs. (8.52)-(8.54), whose coefficients, e.g.,

$c_i = r_{C_iP_i}\, r_{C_jP_j}, \qquad (8.55)$

$e_i = \frac{1}{4}L_B^2 + \frac{1}{2}L_B\left(r_{B_jC_j} - r_{B_iC_i}\right) - r_{B_iC_i}r_{B_jC_j} + r_{B_iC_i}^2 + r_{C_iP_i}^2 + r_{B_jC_j}^2 + r_{C_jP_j}^2 - L_P^2 \qquad (8.57)$

are in general functions of $\bar{q}$, and where we have employed the fact that $\beta_j - \beta_i = 120°$ for all $(i, j, k) \in S_3$.
The angles $\phi_i$ can be grouped with the active joint variables to form the vector of non-minimal generalized coordinates

$q = [\, q_0\ q_1\ q_2\ q_3\ q_4\ q_5\ \phi_1\ \phi_2\ \phi_3 \,]^T \in \mathbb{R}^9 \qquad (8.58)$
To solve the constraint equations for the angles $\phi_i$, the half-angle substitution

$\cos\phi_i = \frac{1 - x_i^2}{1 + x_i^2} \qquad\text{and}\qquad \sin\phi_i = \frac{2x_i}{1 + x_i^2}$

is employed, where $x_i = \tan(\phi_i/2)$. Rearranging terms in $\gamma_1(q)$, $\gamma_2(q)$, and $\gamma_3(q)$, we obtain the equations
$A_1 x_3^2 + B_1 x_3 + C_1 = 0 \qquad (8.63)$

$A_2 x_3^2 + B_2 x_3 + C_2 = 0 \qquad (8.64)$
After following a method similar to the one given in Nanua et al. (1990), we can eliminate $x_2$ from Eqs. (8.60) and (8.65) and get the following equation:

$r_1 x_1^{16} + r_2 x_1^{14} + r_3 x_1^{12} + r_4 x_1^{10} + r_5 x_1^{8} + r_6 x_1^{6} + r_7 x_1^{4} + r_8 x_1^{2} + r_9 = 0 \qquad (8.66)$
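Once the coefficients $r_1, \ldots, r_9$ have been computed for a given $\bar{q}$, the candidate values of $x_1 = \tan(\phi_1/2)$ are the real roots of Eq. (8.66). A sketch using numpy.roots with placeholder coefficients:

```python
import numpy as np

# Hypothetical coefficients r1..r9 of the univariate polynomial, Eq. (8.66);
# in practice they are functions of the active joint vector.
r = np.array([1.0, -3.2, 4.1, -2.0, 0.7, 1.1, -0.5, 0.2, -0.01])

# Eq. (8.66) is a degree-16 polynomial in x1 with only even powers, so
# substitute u = x1**2 and solve the degree-8 polynomial in u.
u_roots = np.roots(r)
u_real = u_roots[np.isreal(u_roots)].real
u_pos = u_real[u_real >= 0.0]          # u = x1**2 must be non-negative

# Each admissible u yields two candidates x1 = +/- sqrt(u), i.e. two values
# of phi_1 = 2*atan(x1); the actual configuration is then selected by
# continuity with the known home configuration (the diffeomorphism (8.8)).
x1_candidates = np.concatenate((np.sqrt(u_pos), -np.sqrt(u_pos)))
phi1_candidates = 2.0 * np.arctan(x1_candidates)
print(np.round(phi1_candidates, 4))
```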
so that $J_\psi(q)$ is invertible if and only if $\frac{\partial \gamma(\bar{q}, \phi)}{\partial \phi}$ is invertible, that is to say,

$\det J_\psi(q) \neq 0 \iff \det\left[\frac{\partial \gamma(\bar{q}, \phi)}{\partial \phi}\right] \neq 0. \qquad (8.67)$
The vectors of linear velocity $v$ and angular velocity $\omega$ of the Hexapod's platform can be computed using Eqs. (8.18) and (8.20), where

$J_i = \frac{\partial r_{P_i}}{\partial q} \quad (i = 1, 2, 3) \qquad (8.68)$

and

$J_x = \frac{\partial \hat{x}_F}{\partial q}, \qquad J_y = \frac{\partial \hat{y}_F}{\partial q}, \qquad J_z = \frac{\partial \hat{z}_F}{\partial q}. \qquad (8.69)$
For the computation of the matrix $A(q)$, let us consider the following analysis, which is a contribution of this work.
The constraint vector for the Hexapod robot is given by Eq. (8.59), and taking its time derivative we get

$\frac{\partial \gamma(\bar{q}, \phi)}{\partial \bar{q}}\dot{\bar{q}} + \frac{\partial \gamma(\bar{q}, \phi)}{\partial \phi}\dot{\phi} = 0, \qquad (8.70)$

or

$\frac{\partial \phi(\bar{q})}{\partial \bar{q}} = -\left[\frac{\partial \gamma(\bar{q}, \phi)}{\partial \phi}\right]^{-1}\frac{\partial \gamma(\bar{q}, \phi)}{\partial \bar{q}},$

so that

$A(q) = \begin{bmatrix} I \\ \frac{\partial \phi(\bar{q})}{\partial \bar{q}} \end{bmatrix} \in \mathbb{R}^{9\times 6}. \qquad (8.73)$
It is worth noticing that Eq. (8.73) allows computing the matrix $A(q)$ without explicitly knowing the function $\sigma(\bar{q})$. The key to this useful result was the selection of the non-minimal coordinates given by Eq. (8.58), and this suggests that a similar procedure could be applied to other types of parallel robots.
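Equation (8.73) lends itself to a direct numerical implementation. The sketch below builds $A(q)$ from finite-difference Jacobians of a placeholder constraint function standing in for $\gamma(\bar{q}, \phi)$ of Eq. (8.59):

```python
import numpy as np

def gamma(qbar, phi):
    # Placeholder standing in for the three loop-closure constraints (8.59);
    # the real gamma couples qbar and phi through Eqs. (8.52)-(8.57).
    return np.array([phi[0] - 0.1 * qbar[0] * qbar[1],
                     phi[1] - 0.1 * qbar[2] * qbar[3],
                     phi[2] - 0.1 * qbar[4] * qbar[5]])

def jac(f, x, eps=1e-6):
    # Simple forward-difference Jacobian of f with respect to x
    f0 = f(x)
    J = np.zeros((f0.size, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x); dx[k] = eps
        J[:, k] = (f(x + dx) - f0) / eps
    return J

def A_of_q(qbar, phi):
    dg_dqbar = jac(lambda qb: gamma(qb, phi), qbar)
    dg_dphi  = jac(lambda p: gamma(qbar, p), phi)
    dphi_dqbar = -np.linalg.solve(dg_dphi, dg_dqbar)   # from Eq. (8.70)
    return np.vstack((np.eye(6), dphi_dqbar))          # Eq. (8.73)

qbar = 0.1765 * np.ones(6)
phi  = 0.1 * np.ones(3)
print(A_of_q(qbar, phi).shape)   # (9, 6)
```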
Now, for the computation of the analytical Jacobian of the Hexapod robot (which will be required for the implementation of the two-loop controller described in Sect. 8.7.1), let us consider the fact that the pose of the platform can be written in terms of $\bar{q}$ and $\phi$, i.e.,

$\dot{\xi} = \frac{\partial \chi(\bar{q}, \phi)}{\partial \bar{q}}\dot{\bar{q}} + \frac{\partial \chi(\bar{q}, \phi)}{\partial \phi}\dot{\phi} \qquad (8.74)$
8.5 Dynamics
In this section, we show the application of the procedure described in Sect. 8.2.3 for
computing the inverse dynamics model of the Hexapod parallel robot.
The analysis considers that the Hexapod robot consists of a total of b ¼ 25
mobile rigid bodies, distributed as indicated below:
• The platform: 1.
• The legs: 6.
• The links between a P joint and a U joint: 6.
• The links between the two ends of a U joint: 6.
• The two links between the two ends of an S joint: 2 × 3 = 6.
The following subsections explain how to compute $v_l$ and $^l\omega_l$ for the different rigid bodies in the Hexapod. In each subsection, we first show how to describe the pose of the rigid body l, via the position vector $p_l \in \mathbb{R}^3$ and the rotation matrix $^0R_l \in SO(3)$; after that, we compute $v_l$ and $^l\omega_l$ using Eqs. (8.25) and (8.26). But in order to simplify the subsequent analysis, we employ the Jacobians $J_{G_l}$ and $K_{G_l}$ satisfying

$\begin{bmatrix} v_l \\ {}^l\omega_l \end{bmatrix} = \begin{bmatrix} J_{G_l} \\ K_{G_l} \end{bmatrix}\dot{q} \qquad (8.76)$

so that

$J_{G_l} = \frac{\partial p_l(q)}{\partial q} \qquad (8.77)$

$K_{G_l} = \frac{1}{2}\, {}^0R_l(q)^T\left[\, S(\hat{x}_l(q))\frac{\partial \hat{x}_l(q)}{\partial q} + S(\hat{y}_l(q))\frac{\partial \hat{y}_l(q)}{\partial q} + S(\hat{z}_l(q))\frac{\partial \hat{z}_l(q)}{\partial q} \,\right] \qquad (8.78)$
8.5.1.1 Platform
Let us consider that $l = 1$ in the case of the platform. From Eq. (8.51), the position vector of its com is

$p_1 = \frac{1}{3}\left(r_{P_1} + r_{P_2} + r_{P_3}\right) + H\,\hat{z}_F$

where H is the distance from the center of the triangle $P_1P_2P_3$ to the com of the platform. As we can choose frame $R_F$ to compute the angular velocity of the platform, we can write

$^0R_1 = {}^0R_F = [\, \hat{x}_F(q)\ \hat{y}_F(q)\ \hat{z}_F(q) \,],$

so that

$J_{G_1} = \frac{1}{3}\left(J_1 + J_2 + J_3\right) + H\,J_z \qquad (8.79)$
and

$K_{G_1} = \frac{1}{2}\, {}^0R_F^T\left[\, S(\hat{x}_F)J_x + S(\hat{y}_F)J_y + S(\hat{z}_F)J_z \,\right] \qquad (8.80)$

where the auxiliary Jacobians $J_i$, $J_x$, $J_y$, and $J_z$ are defined in Eqs. (8.68) and (8.69).
8.5.1.2 Legs
There are six legs in the Hexapod robot, two on each side of the base triangle. Figure 8.6 shows the side i ($i = 1, 2, 3$) with the legs $2i-2$ and $2i-1$ (which are coupled to the active joints with the same numbers). For the analysis, let us consider that the leg $2i-2$ corresponds to the body $l = 2i$ and the leg $2i-1$ to the body $l = 2i+1$. Thus, in the count of $b = 25$ rigid bodies, the legs are those with the numbers $l = 2, 3, \ldots, 7$.
For the sake of simplicity, let us assume that each of the legs is symmetric, so that its com, labeled either $G_{2i-2}$ or $G_{2i-1}$, is at the midpoint of the corresponding segment, either $P_iD_{2i-2}$ or $P_iD_{2i-1}$.
As can be seen in Fig. 8.6, $E_{2i-2}$ and $E_{2i-1}$ are the midpoints of the segments $C_iD_{2i-2}$ and $C_iD_{2i-1}$, respectively; therefore $r_{C_iE_{2i-1}} = -r_{C_iE_{2i-2}} = (\rho_i/4)\, {}^0R_{T_i}(\beta_i)[\, 1\ 0\ 0 \,]^T$ and $r_{E_{2i-1}G_{2i-1}} = r_{E_{2i-2}G_{2i-2}} = (1/2)r_{C_iP_i}$. Therefore, the position vectors of $G_{2i-2}$ and $G_{2i-1}$ are given by:

$p_{2i} = r_{G_{2i-2}} = r_{C_i} + r_{C_iE_{2i-2}} + r_{E_{2i-2}G_{2i-2}} = \frac{1}{2}r_{P_i} + {}^0R_{T_i}(\beta_i)\begin{bmatrix} \frac{1}{2}r_{B_iC_i} - \frac{\rho_i}{4} \\ -\frac{L_B}{4\sqrt{3}} \\ 0 \end{bmatrix}$

$p_{2i+1} = r_{G_{2i-1}} = r_{C_i} + r_{C_iE_{2i-1}} + r_{E_{2i-1}G_{2i-1}} = \frac{1}{2}r_{P_i} + {}^0R_{T_i}(\beta_i)\begin{bmatrix} \frac{1}{2}r_{B_iC_i} + \frac{\rho_i}{4} \\ -\frac{L_B}{4\sqrt{3}} \\ 0 \end{bmatrix}$
Now, by considering the frames $R_{C_i}$, $R_{D_{2i-2}}$, and $R_{D_{2i-1}}$, defined at the end of Sect. 8.3 and shown in Fig. 8.5, we have that the orientation of the legs, with respect to the base frame, is respectively given by

$^0R_{2i} = {}^0R_{D_{2i-1}} = {}^0R_{T_i}(\beta_i)R_x(\phi_i)R_z(\alpha_i),$

$^0R_{2i+1} = {}^0R_{D_{2i-2}} = {}^0R_{T_i}(\beta_i)R_x(\phi_i)R_z(-\alpha_i),$

where $\alpha_i$ is such that $\cos\alpha_i = r_{C_iP_i}/L$ and $\sin\alpha_i = \rho_i/(2L)$, with $r_{C_iP_i}$ and $\rho_i$ given by Eqs. (8.43) and (8.44), respectively, and $L$ the length of the link.
Now, applying Eqs. (8.77) and (8.78) for $l = 2i$ and $l = 2i+1$, we can verify that

$J_{G_{2i}} = \frac{1}{2}J_i + {}^0R_{T_i}(\beta_i)\frac{\partial}{\partial q}\left[\, -\frac{q_{2i-2}}{2}\ \ 0\ \ 0 \,\right]^T,$

$J_{G_{2i+1}} = \frac{1}{2}J_i + {}^0R_{T_i}(\beta_i)\frac{\partial}{\partial q}\left[\, \frac{q_{2i-1}}{2}\ \ 0\ \ 0 \,\right]^T, \qquad (8.82)$

$K_{G_{2i}} = \frac{r_{C_iP_i}}{L}\frac{\partial}{\partial q}\begin{bmatrix} \phi_i \\ 0 \\ 0 \end{bmatrix} - \frac{\rho_i}{2L}\frac{\partial}{\partial q}\begin{bmatrix} 0 \\ \phi_i \\ 0 \end{bmatrix} - \frac{1}{2r_{C_iP_i}}\frac{\partial}{\partial q}\begin{bmatrix} 0 \\ 0 \\ \rho_i \end{bmatrix},$
and

$K_{G_{2i+1}} = \frac{r_{C_iP_i}}{L}\frac{\partial}{\partial q}\begin{bmatrix} \phi_i \\ 0 \\ 0 \end{bmatrix} + \frac{\rho_i}{2L}\frac{\partial}{\partial q}\begin{bmatrix} 0 \\ \phi_i \\ 0 \end{bmatrix} + \frac{1}{2r_{C_iP_i}}\frac{\partial}{\partial q}\begin{bmatrix} 0 \\ 0 \\ \rho_i \end{bmatrix}, \qquad (8.83)$

where $r_{C_iP_i}$ and $\rho_i$ are defined in Eqs. (8.43) and (8.44), respectively, so that

$v_{2i} = \frac{1}{2}\dot{r}_{P_i} + {}^0R_{T_i}(\beta_i)\left[\, -\frac{\dot{q}_{2i-2}}{2}\ \ 0\ \ 0 \,\right]^T,$

$v_{2i+1} = \frac{1}{2}\dot{r}_{P_i} + {}^0R_{T_i}(\beta_i)\left[\, \frac{\dot{q}_{2i-1}}{2}\ \ 0\ \ 0 \,\right]^T,$

$^{2i}\omega_{2i} = \left[\, \frac{r_{C_iP_i}}{L}\dot{\phi}_i\ \ -\frac{\rho_i}{2L}\dot{\phi}_i\ \ -\frac{1}{2r_{C_iP_i}}\dot{\rho}_i \,\right]^T, \quad\text{and}$

$^{2i+1}\omega_{2i+1} = \left[\, \frac{r_{C_iP_i}}{L}\dot{\phi}_i\ \ \frac{\rho_i}{2L}\dot{\phi}_i\ \ \frac{1}{2r_{C_iP_i}}\dot{\rho}_i \,\right]^T.$
Notice that the prismatic (P) and the universal (U) joints can be seen as a 2-DOF
compound joint between the robot base and each of its legs. Each PU joint contains
two intermediate rigid bodies (or parts) which can be appreciated in Fig. 8.7a:
• Part 1 of PU joint: This is the base of the U joint, which moves along the side of
the base triangle; its configuration depends only on the active joint coordinate
corresponding to the P joint.
• Part 2 of PU joint: This is the coupling link between the part 1 of a PU joint and
the corresponding leg; its configuration depends on the active joint coordinate of
the P joint and the corresponding /i angle, which gives a rotation around the
same axis.
Fig. 8.7 Rigid bodies in the Hexapod’s joints: a PU joint and b S joint
Part 1 of PU joints
Let us consider that the part 1 of the PU joint $2i-2$ corresponds to the body $l = 2i+6$ and the same part of the PU joint $2i-1$ to the body $l = 2i+7$. Thus, the first parts of the PU joints of the Hexapod are those rigid bodies with the numbers $l = 8, 9, \ldots, 13$.
Notice that the position vector of the com for the first part of a PU joint is either $p_{2i+6} = {}^0R_{T_i}(\beta_i)[\, -q_{2i-2}\ \ d\ \ l_{c1} \,]^T$ or $p_{2i+7} = {}^0R_{T_i}(\beta_i)[\, q_{2i-1}\ \ d\ \ l_{c1} \,]^T$, with d and $l_{c1}$ constant offsets, so that

$J_{G_{2i+6}} = {}^0R_{T_i}(\beta_i)\frac{\partial}{\partial q}\left[\, -q_{2i-2}\ \ d\ \ l_{c1} \,\right]^T \quad\text{and}\quad J_{G_{2i+7}} = {}^0R_{T_i}(\beta_i)\frac{\partial}{\partial q}\left[\, q_{2i-1}\ \ d\ \ l_{c1} \,\right]^T \qquad (8.84)$

Since these bodies translate without rotating,

$K_{G_{2i+6}} = K_{G_{2i+7}} = O. \qquad (8.85)$
Part 2 of PU joints
For the count of rigid bodies, let us consider that the part 2 of the PU joint $2i-2$ corresponds to the body $l = 2i+12$ and the same part of the PU joint $2i-1$ to the body $l = 2i+13$. Thus, the second parts of the PU joints are those rigid bodies with the numbers $l = 14, 15, \ldots, 19$.
The position vector of the second part of a PU joint is either $p_{2i+12} = {}^0R_{T_i}(\beta_i)[\, -q_{2i-2}\ \ d\ \ 0 \,]^T$ or $p_{2i+13} = {}^0R_{T_i}(\beta_i)[\, q_{2i-1}\ \ d\ \ 0 \,]^T$, so that

$J_{G_{2i+12}} = {}^0R_{T_i}(\beta_i)\frac{\partial}{\partial q}\left[\, -q_{2i-2}\ \ d\ \ 0 \,\right]^T \quad\text{and}\quad J_{G_{2i+13}} = {}^0R_{T_i}(\beta_i)\frac{\partial}{\partial q}\left[\, q_{2i-1}\ \ d\ \ 0 \,\right]^T \qquad (8.86)$
Moreover,

$^0R_{2i+12} = {}^0R_{2i+13} = {}^0R_{T_i}(\beta_i)R_x(\phi_i),$

thus

$K_{G_{2i+12}} = K_{G_{2i+13}} = \frac{\partial}{\partial q}\left[\, \phi_i\ \ 0\ \ 0 \,\right]^T. \qquad (8.87)$
8.5.1.4 S Joints
There are only three spherical joints in the Hexapod robot. The spherical joint i is placed on the side i of the base triangle, and it joins the vertex $Q_i$ of the platform triangle with the legs $2i-2$ and $2i-1$. Figure 8.8 shows this side of the robot in the foreground.
Each spherical joint is assumed to be formed by three independent rotational joints; let $\theta_{1i}$, $\theta_{2i}$, and $\theta_{3i}$ be the joint coordinates of the S joint on the i side of the robot. Notice in Fig. 8.8 that these three angles can be considered as the Euler angles of the XYZ convention, which express the relative orientation of frame $R_F$ with respect to a frame denoted in the same figure as $R_{N_i}$. In other words, $\theta_{1i}$, $\theta_{2i}$, and $\theta_{3i}$ are the angles by which $R_{N_i}$ has to be rotated in order to have the same orientation as $R_F$.
Now, with respect to the pose of frame $R_{N_i}$, it is worth noticing that the origin of this frame is at the point $P_i$, and its orientation is given by the following composition of rotation matrices:
$^0R_{N_i} = {}^0R_{T_i}(\beta_i)R_x(\phi_i)\, {}^{C_i}R_{N_i}$

where

$^{C_i}R_{N_i} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \in SO(3)$

is the rotation matrix giving the relative orientation of frame $R_{N_i}$ with respect to $R_{C_i}$.
Analyzing the coordinate frames in Fig. 8.8, it should be clear that the following expression is valid:

$^0R_F = {}^0R_{T_i}(\beta_i)R_x(\phi_i)\, {}^{C_i}R_{N_i}\, R_x(\theta_{1i})R_y(\theta_{2i})R_z(\theta_{3i}) \qquad (8.88)$

where the elementary rotation matrices $R_x(\cdot)$, $R_y(\cdot)$, and $R_z(\cdot)$ were defined in Eq. (8.81).
Applying the property $R^{-1} = R^T$ of any rotation matrix $R \in SO(3)$, we can rewrite Eq. (8.88) as

$R_x(\theta_{1i})R_y(\theta_{2i})R_z(\theta_{3i}) = {}^{C_i}R_{N_i}^T\, R_x(\phi_i)^T\, {}^0R_{T_i}(\beta_i)^T\, {}^0R_F = {}^{N_i}R_F(q) \qquad (8.89)$
The (Euler) angles $\theta_{1i}$, $\theta_{2i}$, and $\theta_{3i}$ can now be computed using the standard formulas for the XYZ convention (see Craig 2004), that is:

$\theta_{1i} = \operatorname{atan2}\left(-R_{2,3}(q),\ R_{3,3}(q)\right) \qquad (8.90)$

$\theta_{2i} = \operatorname{atan2}\left(R_{1,3}(q),\ \sqrt{R_{2,3}(q)^2 + R_{3,3}(q)^2}\right) \qquad (8.91)$

$\theta_{3i} = \operatorname{atan2}\left(-R_{1,2}(q),\ R_{1,1}(q)\right)$

where $R_{u,v}(q)$ is the element $(u, v)$ of the matrix $^{N_i}R_F(q)$ defined in Eq. (8.89).
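The sketch below extracts these XYZ Euler angles from a rotation matrix and checks the factorization $R = R_x(\theta_1)R_y(\theta_2)R_z(\theta_3)$:

```python
import numpy as np

def Rx(a): c, s = np.cos(a), np.sin(a); return np.array([[1,0,0],[0,c,-s],[0,s,c]])
def Ry(a): c, s = np.cos(a), np.sin(a); return np.array([[c,0,s],[0,1,0],[-s,0,c]])
def Rz(a): c, s = np.cos(a), np.sin(a); return np.array([[c,-s,0],[s,c,0],[0,0,1]])

def euler_xyz(R):
    # Eqs. (8.90)-(8.91) plus the analogous formula for theta_3
    th1 = np.arctan2(-R[1, 2], R[2, 2])
    th2 = np.arctan2(R[0, 2], np.hypot(R[1, 2], R[2, 2]))
    th3 = np.arctan2(-R[0, 1], R[0, 0])
    return th1, th2, th3

R = Rx(0.3) @ Ry(-0.5) @ Rz(1.1)
print(np.round(euler_xyz(R), 6))   # recovers (0.3, -0.5, 1.1)
```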
Moreover, there are two rigid bodies (parts) in every spherical joint (see Fig. 8.7b):
• Part 1 of S joint: This is the link between the legs and the second part of the S joint; its configuration can be computed from the corresponding leg's configuration and the angle $\theta_{1i}$.
• Part 2 of S joint: This is the link between the part 1 of the same S joint and the platform; its configuration can be computed from the configuration of part 1 and the angle $\theta_{2i}$.
Part 1 of S joints

The first parts of the S joints correspond to the rigid bodies with $l = 20, 21, 22$, or, in terms of $i$, $l = 19+i$. For the sake of simplicity, let us consider that the com of the S joint part 1 is at point $P_i$, that is to say that $p_{19+i} = r_{P_i}$, so that

$$J_{G_{19+i}} = J_i, \qquad (8.92)$$

with $J_i = \partial r_{P_i}/\partial q$.
The frame attached to this body is labeled as $R_{1i}$ in Fig. 8.8, and its orientation is given by

$${}^0R_{19+i}(q) = {}^0R_i^T(\beta_i)\,R_x(\phi_i)\,{}^{C_i}R_{N_i}\,R_x(\theta_{1i}),$$

so that

$$K_{G_{19+i}} = R_x(\theta_{1i})^T\,{}^{C_i}R_{N_i}^T\,\frac{\partial}{\partial q}\,[\,\phi_i\;\;0\;\;0\,]^T + \frac{\partial}{\partial q}\,[\,\theta_{1i}\;\;0\;\;0\,]^T \qquad (8.93)$$

where the term $\partial\theta_{1i}/\partial q$ can be computed by taking the partial derivative of Eq. (8.90).
Part 2 of S joints

The second parts of the S joints correspond to the rigid bodies with $l = 23, 24, 25$, or, in terms of $i$, $l = 22+i$. In this case, the com of the S joint part 2 is not at point $P_i$ but at a distance $l_c$ in the direction of $\hat z_F$, that is to say that $p_{22+i} = r_{P_i} + l_c\,\hat z_F$, so that

$$J_{G_{22+i}} = J_i + l_c\,J_z, \qquad (8.94)$$

with $J_z = \partial\hat z_F/\partial q$. Notice that the angular velocity of the part 2 of each S joint is given as a function of the angular velocity of the part 1 of the same S joint.
According to Eqs. (8.21)–(8.23), the total kinetic and potential energies of the Hexapod robot are given by:

$$K(q,\dot q) = \sum_{l=1}^{25} K_l = \frac{1}{2}\sum_{l=1}^{25}\left[m_l\,v_l^T v_l + {}^l\omega_l^T\,I_l\,{}^l\omega_l\right] \qquad (8.96)$$

and

$$U(q) = \sum_{l=1}^{25} U_l = -\left[\sum_{l=1}^{25} m_l\,p_l^T\right] g_o, \qquad (8.97)$$
The inertia matrix of the robot is then

$$M(q) = \sum_{l=1}^{25}\left[m_l\,J_{G_l}^T J_{G_l} + K_{G_l}^T\,I_l\,K_{G_l}\right] \qquad (8.98)$$

where the Jacobians $J_{G_l}$ and $K_{G_l}$, with $l = 1, 2, \ldots, 25$, for the Hexapod robot were found in the previous subsection [Eqs. (8.79), (8.80), (8.82)–(8.87), and (8.92)–(8.95)], and the dynamics parameters $m_l$ and $I_l$ are to be determined for the Hexapod robot either by an experimental procedure or via CAD modeling.
Once $M(q)$ is computed, the vector of centrifugal and Coriolis forces can be obtained by rewriting Eq. (8.29) as:

$$C(q,\dot q)\,\dot q = \dot M(q)\,\dot q - \frac{\partial}{\partial q}\left[\frac{1}{2}\,\dot q^T M(q)\,\dot q\right]$$

$$C(q,\dot q) = \dot M(q) - \frac{1}{2}\,\frac{\partial}{\partial q}\left(\dot q^T M(q)\right) \qquad (8.99)$$
Finally, the vector of gravitational forces can be obtained from Eqs. (8.30) and (8.97), i.e.,

$$g(q) = \frac{\partial U(q)}{\partial q} = -\left[\sum_{l=1}^{25} m_l\left(\frac{\partial p_l(q)}{\partial q}\right)^T\right] g_o = -\left[\sum_{l=1}^{25} m_l\,J_{G_l}^T\right] g_o \qquad (8.100)$$

These matrices are then used to find the corresponding matrices of the minimal dynamics model Eq. (8.31) using Eqs. (8.32)–(8.35).
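The following Python sketch illustrates how Eqs. (8.98)–(8.100) can be evaluated numerically once the Jacobians are available as functions of $q$; the function names and the finite-difference scheme are our own illustration, not part of the chapter:

```python
import numpy as np

def inertia_matrix(q, bodies):
    """Eq. (8.98): bodies is a list of dicts with mass m, inertia tensor I (3x3),
    and callables JG(q), KG(q) returning the 3xn com and angular-velocity Jacobians."""
    n = len(q)
    M = np.zeros((n, n))
    for b in bodies:
        JG, KG = b["JG"](q), b["KG"](q)
        M += b["m"] * JG.T @ JG + KG.T @ b["I"] @ KG
    return M

def coriolis_times_qd(q, qd, bodies, h=1e-6):
    """Eq. (8.99): C(q,qd)qd = Mdot qd - (1/2) d/dq [qd' M(q) qd], by central differences."""
    n = len(q)
    dV = np.zeros(n)          # gradient of the quadratic form qd' M(q) qd
    Mdot = np.zeros((n, n))   # Mdot = sum_i (dM/dq_i) qd_i
    for i in range(n):
        e = np.zeros(n); e[i] = h
        dMdqi = (inertia_matrix(q + e, bodies) - inertia_matrix(q - e, bodies)) / (2 * h)
        Mdot += dMdqi * qd[i]
        dV[i] = qd @ dMdqi @ qd
    return Mdot @ qd - 0.5 * dV

def gravity_vector(q, bodies, g_o=np.array([0.0, 0.0, -9.81])):
    """Eq. (8.100): g(q) = -sum_l m_l JG_l(q)' g_o."""
    return -sum(b["m"] * b["JG"](q).T @ g_o for b in bodies)
```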
In order to show the validity of the Hexapod's forward kinematics model and inverse dynamics model, obtained in the previous sections, we carried out some simulations in which the results produced by our analytical models for a given motion were compared with the results generated by the SolidWorks Motion software tool.

SolidWorks Motion is a module of the SolidWorks® product family that is useful for the analysis and design of mechanisms, provided their SolidWorks CAD model is available. SolidWorks Motion can numerically solve the kinematics and dynamics models for a time-based motion.
For the validation of the kinematics and dynamics models, we designed a motion profile in which the active joint variables were taken as inputs and time-varying functions were applied to them. Such a motion profile was then used in simulations employing the analytical expressions of the FPK and inverse dynamics models, developed in Sects. 8.4 and 8.5, respectively, and the results were compared with those given by SolidWorks Motion.

The trajectory for the active joints is the vector whose $i$th component is $q_{di}(t) = q_i(0) + c_i\,(1 - e^{-\kappa t^3})\sin(\omega_i t)$, where $q(0)$ corresponds to the vector of active joints at the home configuration (where the robot starts at $t = 0$) which, according to the specifications of the Hexapod robot, is given by $q(0) = 0.1765\,[\,1\;\;1\;\;1\;\;1\;\;1\;\;1\,]^T$ m, corresponding to the home pose of the platform given by $r_F(0) = [\,x(0)\;\;y(0)\;\;z(0)\,]^T = [\,0\;\;0\;\;0.424\,]^T$ m and ${}^0R_F = I$ (or $[\,\lambda(0)\;\;\mu(0)\;\;\nu(0)\,]^T = [\,0\;\;0\;\;0\,]^T$ rad).
Then, by using SolidWorks Motion, we computed: (a) the vector $\xi \in \mathbb{R}^6$ of coordinates describing the pose of the platform (the three Cartesian coordinates and the three ZYX-convention Euler angles), which is the output of the FPK model; and (b) the vector of active joint generalized forces $\tau_q$, which is the output of the inverse dynamics model.
The parameters of the trajectory were chosen to be $c_1 = c_4 = c_6 = 0.05$ m, $c_2 = c_3 = c_5 = 0.08$ m, $\kappa = 1$ s$^{-3}$, and $\omega_1 = \omega_2 = 2\omega_3 = 2\omega_4 = 4\omega_5 = 4\omega_6 = 3$ rad/s. It is worth noticing that this trajectory starts at the home configuration with null velocity and null acceleration (i.e., $q_d(0) = q(0)$ and $\dot q_d(0) = \ddot q_d(0) = 0$). Also notice that as $t \to \infty$, the desired trajectory reduces to simple sinusoidal functions in each axis.
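A minimal sketch of this motion profile (our own code, using the parameter values quoted above) shows the smooth start from the home configuration:

```python
import numpy as np

q0 = 0.1765 * np.ones(6)                                 # home configuration [m]
c  = np.array([0.05, 0.08, 0.08, 0.05, 0.08, 0.05])      # amplitudes [m]
w  = np.array([3.0, 3.0, 1.5, 1.5, 0.75, 0.75])          # frequencies [rad/s]
kappa = 1.0                                              # [1/s^3]

def qd(t):
    """Desired active-joint trajectory: starts at q0 with zero velocity/acceleration
    and tends to pure sinusoids as t grows, since (1 - exp(-kappa t^3)) -> 1."""
    return q0 + c * (1.0 - np.exp(-kappa * t**3)) * np.sin(w * t)

# Numerical check of the smooth start (derivatives vanish at t = 0)
h = 1e-4
print(np.allclose((qd(h) - qd(0)) / h, 0, atol=1e-5))                 # velocity ~ 0
print(np.allclose((qd(h) - 2*qd(0) + qd(-h)) / h**2, 0, atol=1e-2))   # acceleration ~ 0
```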
Once the position vector $r_F$ and the rotation matrix ${}^0R_F$ were computed by following the steps at the end of Sect. 8.4.1, the ZYX Euler angles were determined using the following expressions:

$$\lambda = \operatorname{atan2}\left({}^0R_{F_{2,1}},\ {}^0R_{F_{1,1}}\right),\quad \mu = \operatorname{atan2}\left(-{}^0R_{F_{3,1}},\ \sqrt{{}^0R_{F_{1,1}}^2 + {}^0R_{F_{2,1}}^2}\right),\quad \nu = \operatorname{atan2}\left({}^0R_{F_{3,2}},\ {}^0R_{F_{3,3}}\right)$$

where ${}^0R_{F_{u,v}}$ is the element $(u,v)$ of matrix ${}^0R_F$, and we take $L_B = 0.866$ m, $L = 0.3689$ m, and $H_{PQ} = 0.090$ m as kinematic parameters.
Figure 8.9 shows the time evolution of the six variables giving the pose of the platform obtained by SolidWorks and by our analytical expressions. It should be noticed that the graphs for each coordinate are very similar. Figure 8.10 shows the norm of the vectors giving the difference between the SolidWorks and analytical models for both the position and orientation parts of the pose. If we consider the first 20 s shown in Fig. 8.9, the maximum deviation between both graphs is given by $\tilde x = 0.55$ mm, $\tilde y = 1.28$ mm and $\tilde z = 0.235$ mm for the position, and $\tilde\alpha = 0.1314$, $\tilde\beta = 0.1050$ and $\tilde\gamma = 0.1317$ degrees for the orientation.
In the case of the dynamics model, we also employed $H = 0.0791$ m, $l_{c1} = 0.0287$ m, $l_{c2} = 0.03081$ m, and the dynamics parameters given in Table 8.1. Notice in this table that the last column gives the moment of inertia tensor with respect to the frame associated with the corresponding body (i.e., $I_l$), but only the terms in its diagonal are considered.

Figure 8.11 shows the time evolution of the resulting joint generalized forces obtained from SolidWorks and using the minimal dynamics model Eq. (8.32).
Fig. 8.9 Pose coordinates of the platform, computed by both the analytical model (black solid
line) and the CAD model (green circled line)
Fig. 8.10 Norm of the vector giving the difference between the analytical and the CAD model for
the platform’s a position and b orientation
Table 8.1 Dynamics parameters of the Hexapod

Rigid body      | Mass [kg] | (Ixx, Iyy, Izz) [kg cm²]
Mobile platform | 2.085     | (198.58, 199.84, 396.31)
Leg k           | 0.44917   | (54.66, 0.50, 54.78)
PU joint part 1 | 0.3194    | (4.94, 5.51, 2.51)
PU joint part 2 | 0.2200    | (1.76, 3.36, 1.75)
S joint part 1  | 0.2200    | (1.75, 1.76, 3.36)
S joint part 2  | 0.3025    | (5.864, 1.22, 5.218)
Figure 8.12 shows the time evolution of the norm of the vector formed by the difference between the generalized forces computed by those models for all the six joints. The maximum deviation between both graphs is given by: $\tilde\tau_0 = 0.179$ N, $\tilde\tau_1 = 0.239$ N, $\tilde\tau_2 = 0.143$ N, $\tilde\tau_3 = 0.116$ N, $\tilde\tau_4 = 0.094$ N, $\tilde\tau_5 = 0.096$ N.
It is clear from the results presented in this section that the analytical expressions we obtained for the FPK model and the inverse dynamics model of the Hexapod robot are validated by the SolidWorks Motion software.
Fig. 8.11 Active joint generalized forces, computed by both the analytical model (black solid
line) and the CAD model (green circled line)
Figure 8.13 shows the block diagram of the two-loop controller proposed as a tracking controller for the Hexapod robot. This controller, applied to serial robot manipulators, has been studied, and its stability analyzed, by Camarillo et al. (2008).
By kinematic control, we refer to any scheme that uses an inverse Jacobian algorithm to resolve the desired joint velocities directly from the pose variables of the desired task. Thus, a kinematic controller is often employed as the outer loop of a two-loop controller such as the one in Fig. 8.13. In this work, we use as kinematic controller the so-called resolved motion rate controller (RMRC), which was first proposed by Whitney (1969). Using this scheme, the desired joint velocity for the inner loop can be written as

$$\nu_d = J_A(q)^{-1}\left[\dot\xi_d + K\tilde\xi\right], \qquad (8.101)$$

where $\tilde\xi = \xi_d - \xi$ is the pose error. Under ideal velocity tracking ($\dot q = \nu_d$), the closed-loop pose error dynamics become

$$\dot{\tilde\xi} = -K\tilde\xi, \qquad (8.102)$$

which is exponentially stable for a positive definite gain matrix $K$. In practice, however, the inner loop achieves only approximate velocity tracking instead of ideal velocity tracking. So, the implementation of the kinematic control given by Eq. (8.101) requires the design of a joint velocity controller. To this end, let us define the joint velocity error as

$$\tilde\nu = \nu_d - \dot q \in \mathbb{R}^n. \qquad (8.103)$$
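A minimal sketch of the RMRC outer loop of Eq. (8.101) follows (our own illustration; the analytical Jacobian and the forward-kinematics pose function are assumed to be available as callables):

```python
import numpy as np

def rmrc_outer_loop(q, xi_d, xi_dot_d, fk_pose, jacobian_A, K):
    """Resolved motion rate control, Eq. (8.101):
    nu_d = JA(q)^(-1) [ xi_dot_d + K (xi_d - xi) ].
    fk_pose(q) returns the 6-vector pose xi; jacobian_A(q) returns the 6x6 JA."""
    xi_err = xi_d - fk_pose(q)          # pose error (positions + Euler angles)
    # Solve JA(q) nu_d = xi_dot_d + K xi_err instead of forming the inverse explicitly
    return np.linalg.solve(jacobian_A(q), xi_dot_d + K @ xi_err)

# The resulting nu_d is the reference for the inner joint-velocity PI loop,
# which acts on the velocity error nu_tilde = nu_d - q_dot, Eq. (8.103).
```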
8.7.1.1 Experiments

The initial conditions correspond to the home configuration:

$$q(0) = 0.1765\,[\,1\;\;1\;\;1\;\;1\;\;1\;\;1\,]^T\ \text{m}, \qquad r_F(0) = [\,0\;\;0\;\;0.424\,]^T\ \text{m}, \qquad \psi(0) = [\,0\;\;0\;\;0\,]^T\ \text{rad} \qquad (8.104)$$

and the desired trajectory for the position and orientation of the platform was chosen as

$$p_d = \begin{bmatrix} c_1\,(1-e^{-\gamma t^3})\cos(\omega t)\\ c_2\,(1-e^{-\gamma t^3})\sin(\omega t)\\ c_3\,(1-e^{-\gamma t^3})\sin(\omega t) + 0.424 \end{bmatrix}\ \text{m}, \qquad \psi_d = \begin{bmatrix} c_4\,(1-e^{-\gamma t^3})\cos(\omega t)\\ c_5\,(1-e^{-\gamma t^3})\sin(\omega t)\\ c_6\,(1-e^{-\gamma t^3})\sin(\omega t) \end{bmatrix}\ \text{rad}$$
An inverse dynamics controller linearizes and decouples the mechanical system by adding the necessary nonlinear terms to the control law (Cheah and Haghighi 2014).

In this work, we employ the inverse dynamics controller in operational space proposed by Khatib (1987), which uses Euler angles to parameterize the orientation. This controller is given by:

$$\tau_q = M_q(q)\,J_A(q)^{-1}\left[\ddot\xi_d + K_V\,\dot{\tilde\xi} + K_P\,\tilde\xi - \dot J_A(q)\,\dot q\right] + C_q(q,\dot q)\,\dot q + g_q(q), \qquad (8.105)$$

where $J_A(q)$ is the analytical Jacobian; $K_P$ and $K_V$ are diagonal matrices of control gains, and $\tilde\xi = \xi_d - \xi$, with $\xi_d$, $\dot\xi_d$, $\ddot\xi_d$ the vectors of desired pose coordinates, velocities, and accelerations, respectively.
Figure 8.17 shows the block diagram of the controller given by Eq. (8.105). Substituting this control law into the minimal robot dynamics Eq. (8.31), and assuming that $J_A(q)$ is invertible in the region of the workspace where the robot operates, it is possible to demonstrate, using $\dot\xi = J_A(q)\,\dot q$ and its time derivative, that the closed-loop system is $\ddot{\tilde\xi} + K_V\,\dot{\tilde\xi} + K_P\,\tilde\xi = 0 \in \mathbb{R}^6$, which is a linear system whose stability is easy to demonstrate.
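The control law of Eq. (8.105) translates almost directly into code; the sketch below is our own illustration, assuming the model terms and the analytical Jacobian (with its time derivative) are available as functions:

```python
import numpy as np

def inverse_dynamics_control(q, qdot, xi_d, xid_dot, xid_ddot, model, KP, KV):
    """Operational-space inverse dynamics control, Eq. (8.105).
    model supplies: Mq(q), Cq(q,qdot), gq(q), JA(q), JAdot(q,qdot), pose(q)."""
    JA = model["JA"](q)
    xi_err = xi_d - model["pose"](q)        # pose error
    xi_err_dot = xid_dot - JA @ qdot        # its derivative, using xi_dot = JA(q) q_dot
    # Desired task-space acceleration mapped to joint space
    a_joint = np.linalg.solve(
        JA, xid_ddot + KV @ xi_err_dot + KP @ xi_err - model["JAdot"](q, qdot) @ qdot)
    return model["Mq"](q) @ a_joint + model["Cq"](q, qdot) @ qdot + model["gq"](q)
```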
8.7.2.1 Experiments

For the implementation of the inverse dynamics controller, we used the same desired trajectory and initial conditions as for the two-loop controller. Moreover, the gain matrices were defined as $K_P = \mathrm{diag}\{35, 35, 50, 80, 100, 90\}\times 10^3$ 1/s² and $K_V = \mathrm{diag}\{190, 250, 120, 550, 500\}$ 1/s.

Figures 8.18 and 8.19 show the time evolution of the norm of the position error (in Cartesian coordinates) and of the orientation error (in ZYX Euler angles) parts of $\tilde\xi$. Note that both norms are kept bounded, meaning that the platform follows the desired path with a relatively small error.
Figure 8.20 shows the generalized forces applied to the prismatic joints.
8.8 Conclusions
This work first recalls the kinematics and dynamics modeling of platform-type
parallel manipulators. The Lagrangian formulation, together with the projection
method, is suggested for obtaining the minimal dynamics model. The proposed
methodology is then employed to model a 6-3-PUS-type parallel robot, known as
Hexapod. The effect of all mechanical parts of the robot (including those of the
joints) is taken into account, and the computed kinematics and dynamics models are
validated by comparing them with numerical simulations using SolidWorks
Motion. It is worth noticing that the proposed method can be used for similar
parallel robotic structures.
Additionally, we show how to implement two tracking controllers in operational space (i.e., employing Euler angles for describing the orientation) in the Hexapod robot. The first controller has a two-loop structure: a resolved motion rate controller (RMRC) in the outer loop and a joint velocity PI controller in the inner loop. The second controller is of the inverse dynamics type, and it requires the computation of the inverse dynamics model. The experimental results show a good performance for both controllers, which also allows us to conclude the validity of the kinematics and dynamics models we have obtained for the mechanism under study.
References
Arczewski, K., & Blajer, W. (1996). A unified approach to the modelling of holonomic and
nonholonomic mechanical systems. Mathematical Modelling of Systems, 2(3), 157–174.
Betsch, P. (2005). The discrete null space method for the energy consistent integration of
constrained mechanical systems: Part I: Holonomic constraints. Computer Methods in Applied
Mechanics and Engineering, 194, 5159–5190.
Blajer, W. (1997). A geometric unification of constrained system dynamics. Multibody System
Dynamics, 1, 3–21.
Camarillo, K., Campa, R., Santibáñez, V., & Moreno-Valenzuela, J. (2008). Stability analysis of
the operational space control for industrial robots using their own joint velocity PI controllers.
Robotica, 26(6), 729. https://doi.org/10.1017/S0263574708004335.
Campa, R., Bernal, J., & Soto, I. (2016). Kinematic modeling and control of the Hexapod parallel
robot. In Proceedings of the 2016 American Control Conference (pp. 1203–1208). IEEE.
http://doi.org/10.1109/ACC.2016.7525081.
Campa, R., & de la Torre, H. (2009). Pose control of robot manipulators using different orientation
representations: A comparative review. In Proceedings of the American Control Conference.
St. Louis, MO, USA.
Carbonari, L., Krovi, V. N., & Callegari, M. (2011). Polynomial solution to the forward kinematics
problem of a 6-PUS parallel-architecture robot (in Italian). In Proceedings of the Congresso
dell’Associazione Italiana di Meccanica Teorica e Applicata. Bologna, Italy.
Cheah, C. C., & Haghighi, R. (2014). Motion control of robot manipulators. In Handbook of
Manufacturing Engineering and Technology (pp. 1–40). London: Springer London. http://doi.
org/10.1007/978-1-4471-4976-7_93-1.
Craig, J. J. (2004). Introduction to robotics: Mechanics and control. Pearson.
Dasgupta, B., & Mruthyunjaya, T. S. (2000). The Stewart platform manipulator: A review.
Mechanism and Machine Theory, 35(1), 15–40.
Dontchev, A. L., & Rockafellar, R. T. (2014). Implicit functions and solution mappings: A view
from variational analysis. Springer.
Geng, Z., Haynes, L. S., Lee, J. D., & Carroll, R. L. (1992). On the dynamic model and kinematic
analysis of a class of Stewart platforms. Robotics and Autonomous Systems, 9(4), 237–254.
Ghorbel, F. H., Chételat, O., Gunawardana, R., & Longchamp, R. (2000). Modeling and set point
control of closed-chain mechanisms: Theory and experiment. IEEE Transactions on Control
Systems Technology, 8(5), 801–815.
Hopkins, B. R., & Williams, R. L., II. (2002). Kinematics, design and control of the 6-PSU
platform. Industrial Robot: An International Journal, 29(5), 443–451.
Kapur, D. (1995). Algorithmic elimination methods. In Tutorial Notes of the International
Symposium on Symbolic and Algebraic Computation. Montreal, Canada.
Kelly, R., Santibáñez, V., & Loría, A. (2005). Control of robot manipulators in joint space.
Springer.
Khatib, O. (1987). A unified approach for motion and force control of robot manipulators: The
operational space formulation. IEEE Journal on Robotics and Automation, 3(1), 43–53. https://
doi.org/10.1109/JRA.1987.1087068.
Liu, C. H., Huang, K. C., & Wang, Y. T. (2012). Forward position analysis of 6-3 Linapod parallel
manipulators. Meccanica, 47(5), 1271–1282.
Liu, M. J., Li, C. X., & Li, C. N. (2000). Dynamics analysis of the Gough-Stewart platform
manipulator. IEEE Transactions on Robotics and Automation, 16(1), 94–98.
Merlet, J.-P. (1999). Parallel robots: Open problems. In Proceedings of the International
Symposium of Robotics Research. Snowbird, UT, USA.
Merlet, J.-P. (2006). Parallel robots. Springer.
Murray, J. J., & Lovell, G. H. (1989). Dynamic modeling of closed-chain robotic manipulators and
implications for trajectory control. IEEE Transactions on Robotics and Automation, 5(4),
522–528. https://doi.org/10.1109/70.88066.
Nanua, P., Waldron, K. J., & Murthy, V. (1990). Direct kinematic solution of a Stewart platform.
IEEE Transactions on Robotics and Automation, 6(4), 438–444.
Narayanan, M. S., Chakravarty, S., Shah, H., & Krovi, V. N. (2010). Kinematic, static and
workspace analysis of a 6-PUS parallel manipulator. In Volume 2: 34th Annual Mechanisms
and Robotics Conference, Parts A and B (pp. 1456–1456.8). Montreal, Canada: ASME. http://
doi.org/10.1115/DETC2010-28978.
Siciliano, B., Sciavicco, L., Villani, L., & Oriolo, G. (2009). Robotics. London: Springer London.
https://doi.org/10.1007/978-1-84628-642-1.
Tsai, L. W. (1999). Robot analysis: The mechanics of serial and parallel manipulators. Wiley.
Whitney, D. (1969). Resolved motion rate control of manipulators and human prostheses. IEEE
Transactions on Man Machine Systems, 10(2), 47–53. https://doi.org/10.1109/TMMS.1969.
299896.
Chapter 9
A Finite-Time Nonlinear PID Set-Point
Controller for a Parallel Manipulator
Abstract In recent years, finite-time controllers have attracted attention from researchers in control, who have applied them to several processes and systems, including serial robotic manipulators. In this work, we report the application of a finite-time nonlinear PID controller to a Five-Bar Mechanism, which is a parallel manipulator, for set-point control. The stability analysis of the closed-loop system shows global finite-time stability of the system. The dynamic model of the Five-Bar Mechanism developed in this work is a so-called reduced model, which has a structure similar to that of a serial robot. Moreover, the results of the numerical simulations carried out confirm the usefulness of the proposed application. The contribution of this work is to show the feasibility of applying a finite-time nonlinear controller to a Five-Bar Mechanism and the usefulness of the proposed approach by numerical simulations.
9.1 Introduction
In this work, vectors are denoted with bold–italic lowercase letters, e.g., $\boldsymbol{x}$. Matrices are denoted with italic capital letters, e.g., $A$. $\|x\| = \sqrt{x^T x}$ represents the Euclidean norm of vector $x$. $\lambda_{\max}\{A\}$ and $\lambda_{\min}\{A\}$ represent the largest and the smallest eigenvalues of matrix $A$, respectively.

In the following, based on Su and Zheng (2017), we define some useful vectors and vector functions, as well as a definition for the control design and analysis. Let

$$\mathrm{Sig}^a(x) = [\,|x_1|^a\,\mathrm{sign}(x_1),\ \ldots,\ |x_n|^a\,\mathrm{sign}(x_n)\,]^T \in \mathbb{R}^n \qquad (9.1)$$

where $a_0$ and $a$ are positive constants, and $x \in \mathbb{R}^n$. Furthermore, $0 < a < 1$; $\mathrm{sign}(\cdot)$ and $\mathrm{sech}(\cdot)$ are the standard scalar signum and hyperbolic secant functions, respectively, and $\mathrm{diag}(\cdot)$ denotes a diagonal matrix. By defining the vector function $\mathrm{Tanh}(x) = [\,\tanh(x_1),\ \ldots,\ \tanh(x_n)\,]^T$, it can be noticed that

$$x^T\,\mathrm{Sig}^a(x) = \sum_{i=1}^n |x_i|^{a+1} \ge \mathrm{Tanh}^T(x)\,\mathrm{Sig}^a(x) \ge 0 \qquad (9.4)$$
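A small Python check of these vector functions and of the inequality in Eq. (9.4) (our own illustrative code):

```python
import numpy as np

def sig(x, a):
    """Sig^a(x) of Eq. (9.1): componentwise |x_i|^a sign(x_i)."""
    return np.abs(x)**a * np.sign(x)

def tanh_vec(x):
    """Tanh(x): componentwise hyperbolic tangent."""
    return np.tanh(x)

rng = np.random.default_rng(0)
x, a = rng.normal(size=5), 0.5
lhs = x @ sig(x, a)                 # = sum |x_i|^(a+1)
mid = tanh_vec(x) @ sig(x, a)
print(np.isclose(lhs, np.sum(np.abs(x)**(a + 1))), lhs >= mid >= 0.0)  # True True
```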
The concept of finite-time stability was rigorously established in Bhat and Bernstein (2000). In Bhat and Bernstein (2005), some conditions for finite-time stability were further studied, in relation to the homogeneity of a system. In the following, some definitions are given in order to clarify the concepts of finite-time stability.

Definition 1 A function $V : \mathbb{R}^n \to \mathbb{R}$ is homogeneous of degree $d$ with respect to the weights $p = (p_1, \ldots, p_n) \in \mathbb{R}^n_+$ if, for any given $\delta > 0$, $V(\delta^{p_1}x_1, \ldots, \delta^{p_n}x_n) = \delta^d\,V(x)$, $\forall x \in \mathbb{R}^n$. A vector field $h$ is homogeneous of degree $d$ with respect to the weights $p = (p_1, \ldots, p_n) \in \mathbb{R}^n_+$ if, for all $1 \le i \le n$, the $i$th component $h_i$ is a homogeneous function of degree $p_i + d$.

Definition 2 Consider the system

$$\dot x = h(x) + \hat h(x), \qquad h(0) = 0,\quad \hat h(0) = 0,\quad x \in \mathbb{R}^n \qquad (9.8)$$

where $h(x)$ is a continuous homogeneous vector field of degree $d < 0$ with respect to $(p_1, \ldots, p_n)$. Assume that $x = 0$ is an asymptotically stable equilibrium of the nominal system $\dot x = h(x)$ in Eq. (9.7). Then, $x = 0$ is a locally finite-time stable equilibrium of system Eq. (9.8) if

$$\lim_{\varepsilon\to 0}\frac{\hat h_i(\varepsilon^{p_1}x_1, \ldots, \varepsilon^{p_n}x_n)}{\varepsilon^{d+p_i}} = 0, \qquad i = 1, \ldots, n,$$

for all $x \ne 0$.
The dynamic model of a parallel manipulator subject to holonomic constraints can be written as

$$M'(\bar q)\,\ddot{\bar q} + C'(\bar q, \dot{\bar q})\,\dot{\bar q} + g'(\bar q) + F'\,\dot{\bar q} = \tau' + D^T(\bar q)\,\lambda, \qquad \gamma(\bar q) = 0 \qquad (9.10)$$

where $\lambda$ is the vector of Lagrange multipliers and $\gamma(\bar q) = 0$ represents the constraints. The full joint velocities are related to the independent (actuated) ones by $\dot{\bar q} = R(q)\,\dot q$, with $R(q) \in \mathbb{R}^{s\times n}$. Notice that, given the differential kinematic model $\dot\beta = J_\beta(q)\,\dot q$, the matrix $R(q)$ can be constructed as

$$R(q) = \begin{bmatrix} I_n\\ J_\beta(q) \end{bmatrix} \qquad (9.12)$$
Premultiplying Eq. (9.10) by $R^T(q)$ and using $\dot{\bar q} = R(q)\,\dot q$ yields the reduced model

$$M(\bar q)\,\ddot q + C(\bar q, \dot{\bar q})\,\dot q + g(\bar q) + F\,\dot q = \tau \qquad (9.13)$$

where

$$M(\bar q) = R^T(q)\,M'(\bar q)\,R(q), \qquad (9.14)$$

$$C(\bar q, \dot{\bar q}) = R^T(q)\,M'(\bar q)\,\dot R(q) + R^T(q)\,C'(\bar q, \dot{\bar q})\,R(q), \qquad (9.15)$$

$$g(\bar q) = R^T(q)\,g'(\bar q), \qquad (9.16)$$

$$F = R^T(q)\,F'\,R(q), \qquad (9.17)$$

$$\tau = R^T(q)\,\tau'. \qquad (9.18)$$
Notice that the term of Eq. (9.10) containing the product of the constraint Jacobian and the Lagrange multipliers vanishes, because the columns of $R(q)$ span the null space of the constraint Jacobian, as was pointed out above.

Ghorbel et al. (2000) proved that there exists a unique parametrization $\bar q = g(q)$ of $\bar q$ inside a neighborhood $N_{\bar q}$, whenever the system is not in a singular configuration. Moreover, Muller (2005) established that, for a parallel machine, a subset $q$ of $n$ joint variables determines its configuration, by virtue of the existence of a smooth mapping $u$ that assigns to each $q$ the parallel machine configuration as $\bar q = u(q)$, where the map $u^{-1}$ is a local parametrization of the $n$-dimensional manifold $V$, with $V = \{\bar q \in \mathcal{V}^n : \gamma(\bar q) = 0\}$, where $V$ represents the set of all admissible configurations of the parallel machine and $\gamma(\bar q) = 0$ represents the holonomic constraints.
In consequence, we can write down, without loss of generality, the matrices and vectors of the dynamic model $M(\bar q)$, $C(\bar q, \dot{\bar q})$ and $g(\bar q)$ as $M(q)$, $C(q, \dot q)$ and $g(q)$, respectively. Thus, the dynamic model Eq. (9.13) takes the form

$$M(q)\,\ddot q + C(q, \dot q)\,\dot q + g(q) + F\,\dot q = \tau \qquad (9.19)$$

This reduced model satisfies the following properties.

Property 2 There exists a constant $M_M > 0$ such that $\|M(q)\| \le M_M$.
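As an illustration of the projection in Eqs. (9.13)–(9.18), the following Python sketch (our own code; the primed model terms are assumed given) builds the reduced matrices from the full ones:

```python
import numpy as np

def reduce_model(Mp, Cp, gp, Fp, taup, R, Rdot):
    """Project the constrained model of Eq. (9.10) onto the independent coordinates:
    M = R'Mp R, C = R'Mp Rdot + R'Cp R, g = R'gp, F = R'Fp R, tau = R'taup.
    The Lagrange-multiplier term drops out since the constraint Jacobian annihilates R."""
    M = R.T @ Mp @ R
    C = R.T @ Mp @ Rdot + R.T @ Cp @ R
    g = R.T @ gp
    F = R.T @ Fp @ R
    tau = R.T @ taup
    return M, C, g, F, tau
```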
The norm $\|M'\|$ is upper bounded whenever its entries are finite. For robots with only revolute joints, this is assured because the entries of matrix $M'$ are sinusoidal functions of the joint variables with constant coefficients. On the other hand, $\|R\|$ is upper bounded whenever its entries are finite, that is to say, whenever matrix $R$ is well posed. From Eq. (9.12) it can be noticed that $R$ is well posed whenever there exists a continuous mapping between $\dot q$ and $\dot{\bar q}$, i.e., the robot is not in a singular configuration.
Property 3 (Ghorbel et al. 2000; Cheng et al. 2003) The matrix $\frac{1}{2}\dot M(q) - C(q, \dot q)$ is skew-symmetric.

Property 4 (Khalil and Dombre 2004) There exists a constant $k_C > 0$ such that $\|C(q, \dot q)\| \le k_C\,\|\dot q\|$, for all $q, \dot q \in \mathbb{R}^n$.

Property 5 The friction matrix $F$ can be bounded as $f_m I \le F \le f_M I$.
Let us consider the application of the following finite-time nonlinear PID control law to a CCM:

$$\tau = -K_p\,\mathrm{Sig}^{a_1}(\tilde q) - K_d\,\mathrm{Sig}^{a_2}(\eta) - k_{p0}\,\tilde q - K_I\int_0^t \eta(\sigma)\,d\sigma - k_{d0}\,\dot q \qquad (9.20)$$

with $\tilde q = q - q_d$ the position error,

$$\eta = \dot q + a_0\,\mathrm{Tanh}(\tilde q) \qquad (9.21)$$

and

$$u = \int_0^t \eta(\sigma)\,d\sigma \qquad (9.22)$$

Substituting Eq. (9.20) into Eq. (9.19) (with the gravity term neglected, as justified below) gives the closed-loop equation

$$M(q)\,\ddot q + C(q, \dot q)\,\dot q + F\,\dot q + K_p\,\mathrm{Sig}^{a_1}(\tilde q) + K_d\,\mathrm{Sig}^{a_2}(\eta) + k_{p0}\,\tilde q + k_{d0}\,\dot q + K_I\,u = 0 \qquad (9.23)$$

With Eqs. (9.22) and (9.23), and taking into account that $\dot{\tilde q} = \dot q$ when $\dot q_d = 0$, the closed-loop equation can be written as
$$\frac{d}{dt}\begin{bmatrix} \tilde q\\ \dot q\\ u \end{bmatrix} = \begin{bmatrix} \dot q\\ -M^{-1}(q)\left[C(q,\dot q)\,\dot q + F\,\dot q + K_p\,\mathrm{Sig}^{a_1}(\tilde q) + K_d\,\mathrm{Sig}^{a_2}(\eta) + K_I\,u + k_{p0}\,\tilde q + k_{d0}\,\dot q\right]\\ \dot q + a_0\,\mathrm{Tanh}(\tilde q) \end{bmatrix} \qquad (9.24)$$

Notice that the origin of the system Eq. (9.24) is its only equilibrium.
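For reference, a minimal Euler-integration sketch of the closed-loop system Eq. (9.24) for a generic joint-space model (entirely our own illustration; the model functions M, C and the friction matrix F are placeholders to be supplied):

```python
import numpy as np

def sig(x, a):
    return np.abs(x)**a * np.sign(x)

def fnpid_step(qt, qd_dot, u, q_des, gains, model, dt):
    """One Euler step of Eq. (9.24). qt = q - q_des is the position error
    (set point: q_des constant), u is the integral state of Eq. (9.22)."""
    a0, a1, a2, kp0, kd0, Kp, Kd, KI = gains
    q = qt + q_des
    eta = qd_dot + a0 * np.tanh(qt)                      # Eq. (9.21)
    terms = (model["C"](q, qd_dot) @ qd_dot + model["F"] @ qd_dot
             + Kp @ sig(qt, a1) + Kd @ sig(eta, a2)
             + KI @ u + kp0 * qt + kd0 * qd_dot)
    qdd = -np.linalg.solve(model["M"](q), terms)
    return qt + dt * qd_dot, qd_dot + dt * qdd, u + dt * eta
```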
Consider the following Lyapunov function candidate:

$$V(\tilde q, \dot q, u) = \frac{1}{2}\dot q^T M(q)\,\dot q + a_0\,\mathrm{Tanh}^T(\tilde q)\,M(q)\,\dot q + \frac{1}{2}k_{p0}\,\tilde q^T\tilde q + \frac{1}{a_1+1}\sum_{i=1}^n k_{pi}\,|\tilde q_i|^{a_1+1} + a_0\sum_{i=1}^n (f_i + k_{d0})\ln(\cosh(\tilde q_i)) + \frac{1}{2}u^T K_I\,u \qquad (9.25)$$

where $f_i$ is the $i$th diagonal entry of the friction matrix $F$. In order to investigate the positive definiteness of Eq. (9.25), notice that, in virtue of
$$\frac{1}{4}\dot q^T M(q)\,\dot q + a_0\,\mathrm{Tanh}^T(\tilde q)\,M(q)\,\dot q + \frac{1}{2(a_1+1)}\sum_{i=1}^n k_{pi}\,|\tilde q_i|^{a_1+1} \ge \frac{1}{2(a_1+1)}\sum_{i=1}^n \left[k_{pi} - 2(a_1+1)\,a_0^2\,M_M\right]\tanh^2(\tilde q_i),$$

where we have used Property 2 and Eq. (9.5), we can lower bound Eq. (9.25) as

$$V \ge \frac{1}{2(a_1+1)}\sum_{i=1}^n \left[k_{pi} - 2(a_1+1)\,a_0^2\,M_M\right]\tanh^2(\tilde q_i) + \frac{1}{2}k_{p0}\,\tilde q^T\tilde q + \frac{1}{2}u^T K_I\,u + \frac{1}{4}\dot q^T M(q)\,\dot q + a_0\sum_{i=1}^n (f_i + k_{d0})\ln(\cosh(\tilde q_i)) \qquad (9.26)$$
The three last terms of the right side of inequality Eq. (9.26) can be lower bounded as

$$\frac{1}{2}u^T K_I\,u \ge \frac{1}{2}\lambda_{\min}\{K_I\}\,\|u\|^2 > 0, \quad \forall u \ne 0 \in \mathbb{R}^n$$

$$\frac{1}{4}\dot q^T M(q)\,\dot q \ge \frac{1}{4}\lambda_{\min}\{M(q)\}\,\|\dot q\|^2 > 0, \quad \forall \dot q \ne 0 \in \mathbb{R}^n$$

$$a_0\sum_{i=1}^n (f_i + k_{d0})\ln(\cosh(\tilde q_i)) > 0, \quad \forall \tilde q \ne 0 \in \mathbb{R}^n$$
The second term of the right side of inequality Eq. (9.26) is positive definite since $\frac{1}{2}k_{p0}\,\tilde q^T\tilde q = \frac{1}{2}k_{p0}\,\|\tilde q\|^2$. Notice that the first term of the right side of Eq. (9.26) is positive as long as $k_{pi} - 2(a_1+1)\,a_0^2\,M_M$ is positive, i.e.,

$$k_{pi} > 2(a_1+1)\,a_0^2\,M_M \qquad (9.27)$$

Therefore, since the four last terms of the right side of Eq. (9.26) are positive definite for all $\tilde q, \dot q, u \ne 0 \in \mathbb{R}^n$, the Lyapunov function candidate Eq. (9.25) is positive definite while Eq. (9.27) is satisfied.
The temporal derivative of the Lyapunov function candidate Eq. (9.25) is as follows:

$$\dot V(\tilde q, \dot q, u) = \frac{1}{2}\dot q^T\dot M(q)\,\dot q + \dot q^T M(q)\,\ddot q + a_0\left(\mathrm{Sech}^2(\tilde q)\,\dot{\tilde q}\right)^T M(q)\,\dot q + a_0\,\mathrm{Tanh}^T(\tilde q)\,\dot M(q)\,\dot q + a_0\,\mathrm{Tanh}^T(\tilde q)\,M(q)\,\ddot q + k_{p0}\,\dot{\tilde q}^T\tilde q + a_0\sum_{i=1}^n (f_i + k_{d0})\tanh(\tilde q_i)\,\dot{\tilde q}_i + \dot u^T K_I\,u \qquad (9.28)$$

where we have used Property 3 (skew symmetry). Here, we neglect the gravitational forces vector from Eq. (9.24) since the CCM is a horizontal Five-Bar Mechanism.
Moreover, the following bounds hold:

$$-\mathrm{Tanh}^T(\tilde q)\,C(q,\dot q)\,\dot q \le \left\|\mathrm{Tanh}(\tilde q)\right\|\left\|C(q,\dot q)\right\|\left\|\dot q\right\| \le \sqrt n\,k_C\,\|\dot q\|^2$$

$$-\left(\mathrm{Sech}^2(\tilde q)\,\dot q\right)^T M(q)\,\dot q \le \left\|\mathrm{Sech}^2(\tilde q)\,\dot q\right\|\left\|M(q)\right\|\left\|\dot q\right\| \le M_M\,\|\dot q\|^2$$

where we have used Eq. (9.2), Property 2, and Property 4. Thus, the fifth term of the right side of Eq. (9.29) can be upper bounded as

$$a_0\left[-\mathrm{Tanh}^T(\tilde q)\,C(q,\dot q)\,\dot q - \left(\mathrm{Sech}^2(\tilde q)\,\dot q\right)^T M(q)\,\dot q\right] \le a_0\left(\sqrt n\,k_C + M_M\right)\|\dot q\|^2 \qquad (9.30)$$
In addition, by using Property 5, the first term of the right side of Eq. (9.29) can be upper bounded as

$$-\dot q^T F\,\dot q \le -f_m\,\|\dot q\|^2 \qquad (9.31)$$
After substituting Eqs. (9.30) and (9.31) in Eq. (9.29) and rearranging terms, we can upper bound Eq. (9.29) as

$$\dot V \le -\left[f_m + k_{d0} - a_0\left(\sqrt n\,k_C + M_M\right)\right]\|\dot q\|^2 - a_0\,\mathrm{Tanh}^T(\tilde q)\,K_p\,\mathrm{Sig}^{a_1}(\tilde q) - \eta^T K_d\,\mathrm{Sig}^{a_2}(\eta) - a_0\,k_{p0}\,\mathrm{Tanh}^T(\tilde q)\,\tilde q$$

In virtue of the fact that $\tanh(x)$ and $x$ have the same sign, $\mathrm{Tanh}^T(\tilde q)\,\tilde q > 0$, $\forall \tilde q \ne 0$. Therefore, we can write

$$\dot V \le -\left[f_m + k_{d0} - a_0\left(\sqrt n\,k_C + M_M\right)\right]\|\dot q\|^2 - a_0\,\mathrm{Tanh}^T(\tilde q)\,K_p\,\mathrm{Sig}^{a_1}(\tilde q) - \eta^T K_d\,\mathrm{Sig}^{a_2}(\eta) \qquad (9.32)$$
After using the expression in Eq. (9.4), Eq. (9.32) can be rewritten as

$$\dot V \le -\left[f_m + k_{d0} - a_0\left(\sqrt n\,k_C + M_M\right)\right]\|\dot q\|^2 - a_0\sum_{i=1}^n k_{pi}\,|\tanh(\tilde q_i)|\,|\tilde q_i|^{a_1} - \sum_{i=1}^n k_{di}\,|\eta_i|^{a_2+1} \qquad (9.33)$$

where $k_{pi}$ and $k_{di}$ represent the $i$th diagonal elements of matrices $K_p$ and $K_d$, respectively. Therefore, we can conclude that $\dot V \le 0$ as long as

$$k_{d0} > a_0\left(\sqrt n\,k_C + M_M\right) - f_m \qquad (9.34)$$
The closed-loop system Eq. (9.24) can be written in the form of Eq. (9.8) [system Eq. (9.35)], where

$$\hat h_1 = -a_0\,\mathrm{Tanh}(y_1) \qquad (9.37)$$

$$\begin{aligned} \hat h_2 = {}& -M^{-1}(y_1 + q_d)\big[\left(C(y_1 + q_d,\ y_2 - a_0\,\mathrm{Tanh}(y_1)) + F + k_{d0}\,I\right)\left(y_2 - a_0\,\mathrm{Tanh}(y_1)\right) + k_{p0}\,y_1 + K_I\,y_3\big]\\ & - \widetilde M(y_1, q_d)\left[K_p\,\mathrm{Sig}^{a_1}(y_1) + K_d\,\mathrm{Sig}^{a_2}(y_2)\right] + a_0\,\mathrm{Sech}^2(y_1)\left(y_2 - a_0\,\mathrm{Tanh}(y_1)\right) \end{aligned} \qquad (9.38)$$

with

$$\widetilde M(y_1, q_d) = M^{-1}(y_1 + q_d) - M^{-1}(q_d) \qquad (9.39)$$
The weights $p_1, p_2, p_3$ and the degree $d$ satisfy

$$p_2 = d + p_1, \qquad a_1\,p_1 = a_2\,p_2 = d + p_2, \qquad p_2 = d + p_3 \qquad (9.41)$$
For the homogeneous part of the system, consider the Lyapunov function candidate

$$V_2 = \frac{1}{a_1+1}\sum_{i=1}^n k_{pi}\,|y_{1i}|^{a_1+1} + \frac{1}{2}y_2^T M(q_d)\,y_2 + \frac{1}{2}(y_1 - y_3)^T(y_1 - y_3) \qquad (9.42)$$

where $y_{1i}$ denotes the $i$th component of vector $y_1$. The temporal derivative of Eq. (9.42) can be computed and upper bounded as in Eq. (9.44). By using Eq. (9.4) in Eq. (9.44), it can be concluded that $\dot V_2 \le 0$, which implies that the origin is a stable equilibrium. By using the LaSalle invariance theorem (Kelly et al. 2005), the global asymptotic stability of the origin can be concluded.
where $o(\varepsilon^{p_1}y_1)$ denotes terms of order $\varepsilon^{p_1}y_1$ as $\varepsilon^{p_1}y_1 \to 0$. Therefore, for any fixed $y = (y_1^T\ y_2^T\ y_3^T)^T \in \mathbb{R}^{3n}$, and since $M^{-1}(y_1 + q_d)$ and $C(y_1 + q_d, y_2)$ are smooth [see Hong et al. (2002); Su and Zheng (2009)], we obtain

$$\begin{aligned} \lim_{\varepsilon\to 0}\ & \frac{M^{-1}(\varepsilon^{p_1}y_1 + q_d)}{\varepsilon^{d+p_2}}\big[\left(C(\varepsilon^{p_1}y_1 + q_d,\ \varepsilon^{p_2}y_2 - a_0\,\mathrm{Tanh}(\varepsilon^{p_1}y_1)) + F + k_{d0}\,I\right)\left(\varepsilon^{p_2}y_2 - a_0\,\mathrm{Tanh}(\varepsilon^{p_1}y_1)\right)\big]\\ & = M^{-1}(q_d)\left[\left(C(q_d, 0) + F + k_{d0}\,I\right)\left(y_2\lim_{\varepsilon\to 0}\varepsilon^{-d} - a_0\lim_{\varepsilon\to 0} o(\varepsilon^{p_1-d-p_2}y_1)\right)\right] = 0 \end{aligned} \qquad (9.47)$$

and

$$\lim_{\varepsilon\to 0}\frac{M^{-1}(\varepsilon^{p_1}y_1 + q_d)}{\varepsilon^{d+p_2}}\left[k_{p0}\,\varepsilon^{p_1}y_1 + K_I\,\varepsilon^{p_3}y_3\right] = M^{-1}(q_d)\left[k_{p0}\,y_1\lim_{\varepsilon\to 0}\varepsilon^{p_1-d-p_2} + K_I\,y_3\lim_{\varepsilon\to 0}\varepsilon^{p_3-d-p_2}\right] = 0$$
After applying the mean value theorem to each entry of $\widetilde M(y_1, q_d)$, we obtain

$$\lim_{\varepsilon\to 0}\frac{\widetilde M(\varepsilon^{p_1}y_1, q_d)\left[K_p\,\mathrm{Sig}^{a_1}(\varepsilon^{p_1}y_1) + K_d\,\mathrm{Sig}^{a_2}(\varepsilon^{p_2}y_2)\right]}{\varepsilon^{d+p_2}} = \lim_{\varepsilon\to 0} o(\varepsilon^{p_1-d-p_2}) = 0 \qquad (9.49)$$
Thus, according to Lemma 1, the local finite-time stability of the system Eq. (9.35) is proven. Moreover, by invoking Lemma 2, the global finite-time stability of the system Eq. (9.35) is proven.

9.4 Simulations

In order to show the feasibility of the proposed application of the finite-time regulation controller to a parallel manipulator, we carried out numerical simulations of the finite-time nonlinear PID controller applied to the model of a real horizontal Five-Bar Mechanism.
The transformation matrix $R(q)$ and its temporal derivative $\dot R(q)$ are

$$R(q) = \begin{bmatrix} 1 & 0\\ 0 & 1\\ r_{11} & r_{12}\\ r_{21} & r_{22} \end{bmatrix}, \qquad \dot R(q) = \begin{bmatrix} 0 & 0\\ 0 & 0\\ \dot r_{11} & \dot r_{12}\\ \dot r_{21} & \dot r_{22} \end{bmatrix}$$
where
sinðq1 q2 b2 Þ
r11 ¼ 1
sinðq1 q2 þ b1 b2 Þ
sinðb2 Þ
r12 ¼
sinðq1 q2 þ b1 b2 Þ
sinðb1 Þ
r21 ¼
sinðq1 q2 þ b1 b2 Þ
sinðq1 q2 b1 Þ
r22 ¼ 1
sinðq1 q2 þ b1 b2 Þ
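A direct transcription of these entries into Python (our own sketch; q1, q2 are the actuated joint angles and b1, b2 the passive ones):

```python
import numpy as np

def five_bar_R(q1, q2, b1, b2):
    """Transformation matrix R(q) of Eq. (9.12) for the Five-Bar Mechanism,
    with the r_ij entries quoted above. Valid away from singular configurations,
    where sin(q1 - q2 + b1 - b2) = 0."""
    den = np.sin(q1 - q2 + b1 - b2)
    r11 = 1.0 - np.sin(q1 - q2 - b2) / den
    r12 = np.sin(b2) / den
    r21 = np.sin(b1) / den
    r22 = 1.0 - np.sin(q1 - q2 - b1) / den
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [r11, r12],
                     [r21, r22]])
```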
where

$$m_{11} = m'_{44}\,r_{21}^2 + m'_{11} + m'_{13}\,r_{11} + r_{11}\left(m'_{13} + m'_{33}\,r_{11}\right)$$
$$m_{12} = m'_{24}\,r_{21} + r_{12}\left(m'_{13} + m'_{33}\,r_{11}\right) + m'_{44}\,r_{21}\,r_{22}$$
$$m_{21} = m'_{24}\,r_{21} + r_{12}\left(m'_{13} + m'_{33}\,r_{11}\right) + m'_{44}\,r_{21}\,r_{22}$$
$$m_{22} = m'_{33}\,r_{12}^2 + m'_{22} + m'_{24}\,r_{22} + r_{22}\left(m'_{24} + m'_{44}\,r_{22}\right)$$
and

$$c_{11} = c'_{11} + c'_{13}\,r_{11} + c'_{31}\,r_{11} + \dot r_{11}\left(m'_{13} + m'_{33}\,r_{11}\right) + m'_{44}\,r_{21}\,\dot r_{21}$$
$$c_{12} = c'_{13}\,r_{12} + c'_{42}\,r_{21} + \dot r_{12}\left(m'_{13} + m'_{33}\,r_{11}\right) + m'_{44}\,r_{21}\,\dot r_{22}$$
$$c_{21} = c'_{31}\,r_{12} + c'_{24}\,r_{21} + \dot r_{21}\left(m'_{24} + m'_{44}\,r_{22}\right) + m'_{33}\,r_{12}\,\dot r_{11}$$
$$c_{22} = c'_{22} + c'_{24}\,r_{22} + c'_{42}\,r_{22} + \dot r_{22}\left(m'_{24} + m'_{44}\,r_{22}\right) + m'_{33}\,r_{12}\,\dot r_{12}$$
The friction coefficient matrices of the model Eq. (9.10) and of the model Eq. (9.19) are

$$F' = \begin{bmatrix} f'_{11} & 0 & 0 & 0\\ 0 & f'_{22} & 0 & 0\\ 0 & 0 & f'_{33} & 0\\ 0 & 0 & 0 & f'_{44} \end{bmatrix}, \qquad F = \begin{bmatrix} f_{11} & f_{12}\\ f_{21} & f_{22} \end{bmatrix}$$

where

$$f_{11} = f'_{11} + r_{11}^2\,f'_{33} + r_{21}^2\,f'_{44}$$
$$f_{12} = r_{11}\,r_{12}\,f'_{33} + r_{21}\,r_{22}\,f'_{44}$$
$$f_{21} = r_{11}\,r_{12}\,f'_{33} + r_{21}\,r_{22}\,f'_{44}$$
$$f_{22} = f'_{22} + r_{12}^2\,f'_{33} + r_{22}^2\,f'_{44}$$
Table 9.3 Gains and parameters of the finite-time nonlinear PID controller

Gain | Joint 1 | Joint 2 | Units
kp0  | 0.2     | 0.22    | Nm/rad
kd0  | 0.5     | 0.5     | Nm s/rad
Kp   | 0.37    | 0.36    | Nm/rad
KI   | 0.1     | 0.06    | Nm s/rad
Kd   | 0.1     | 0.01    | Nm/rad
a0   | 0.1     | 0.1     | s⁻¹
a1   | 0.5     | 0.5     | (dimensionless)
a2   | 0.6666  | 0.6666  | (dimensionless)
For comparison purposes, we also simulated the nonlinear PID (NPID) controller of Kelly (1998):

$$\tau = -K_p\,\tilde q - K_i\int_0^t \mathrm{Tanh}(\tilde q(\sigma))\,d\sigma - K_d\,\dot q$$

The gains used for this controller are shown in Table 9.4. These gains were selected by trial and error, in order to obtain the best performance of the controller while avoiding exceeding the maximum torque values.
The results of the simulations are shown in Figs. 9.2, 9.3, 9.4, 9.5, 9.6 and 9.7. In Fig. 9.2, the position errors at joint 1 from both controllers, the finite-time nonlinear PID controller (FNPID) and the nonlinear PID (NPID) from Kelly (1998), are shown. In Fig. 9.3, the position errors at joint 2 from both controllers are shown. From these figures, notice that the steady-state position errors of the FNPID are smaller than those of the NPID. In Figs. 9.4 and 9.5, the commanded torques from the FNPID for joint 1 and joint 2, respectively, are shown. In Figs. 9.6 and 9.7, the commanded torques from the NPID for joint 1 and joint 2, respectively, are shown.
Fig. 9.2 Position errors in joint 1 from both controllers, FNPID and NPID
Fig. 9.3 Position errors in joint 2 from both controllers, FNPID and NPID
Notice that the torque signals from the NPID controller for both joints persist for longer times than the torque signals from the FNPID controller. This may imply smaller and shorter control efforts from the FNPID controller, which may result in improved durability of the drives and motors of the parallel machine. Notice also that, as was pointed out above, in the simulations we were careful to avoid exceeding the maximum torque value of 0.2 Nm.
Fig. 9.4 Commanded torque from the FNPID controller, for joint 1
Fig. 9.5 Commanded torque from the FNPID controller, for joint 2
Fig. 9.6 Commanded torque from the NPID controller, for joint 1
Fig. 9.7 Commanded torque from the NPID controller, for joint 2
9.5 Conclusion

In this work, we have reported the application of a finite-time nonlinear PID regulation controller to a Five-Bar Mechanism. The stability analysis of the system has been carried out, resulting in global finite-time stability of the closed-loop system.

A dynamic model of a parallel robot, which is subject to mechanical constraints, has been obtained with a structure similar to that of a serial robot. This allows us to analyze the closed-loop system in a way similar to the analysis of a system with a serial robot.
References
Amato, F., De Tommasi, G., & Pironti, A. (2013). Necessary and sufficient conditions for
finite-time stability of impulsive dynamical linear systems. Automatica, 49(8), 2546–2550.
Barnfather, J. D., Goodfellow, M. J., & Abram, T. (2017). Positional capability of a hexapod robot
for machining applications. The International Journal of Advanced Manufacturing
Technology, 89(1–4), 1103–1111. https://doi.org/10.1007/s00170-016-9051-0.
Bhat, S. P., & Bernstein, D. S. (1998). Continuous finite-time stabilization of the translational and
rotational double integrators. IEEE Transactions on Automatic Control, 43(5), 678–682.
https://doi.org/10.1109/9.668834.
Bhat, S. P., & Bernstein, D. S. (2000). Finite-Time stability of continuous autonomous systems.
SIAM Journal on Control and Optimization, 38(3), 751–766. https://doi.org/10.1137/
S0363012997321358.
Bhat, S. P., & Bernstein, D. S. (2005). Geometric homogeneity with applications to finite-time
stability. Mathematics of Control, Signals, and Systems, 17(2), 101–127. https://doi.org/10.
1007/s00498-005-0151-x.
Bourbonnais, F., Bigras, P., & Bonev, I. A. (2015). Minimum-time trajectory planning and control
of a pick-and-place Five-Bar parallel robot. IEEE/ASME Transactions on Mechatronics, 20(2),
740–749. https://doi.org/10.1109/TMECH.2014.2318999.
Cheng, H., Yiu, Y-K., & Li, Z. (2003). Dynamics and control of redundantly actuated parallel
manipulators. IEEE/ASME Transactions on Mechatronics, 8(4), 483–491.
Diaz-Rodriguez, M., Valera, A., Mata, V., & Valles, M. (2013). Model-based control of a 3-DOF
parallel robot based on identified relevant parameters. IEEE/ASME Transactions on
Mechatronics, 18(6), 1737–1744. https://doi.org/10.1109/TMECH.2012.2212716.
Dorato, P. (1961). Short time stability in linear time-varying systems. In IRE International Convention Record (pp. 83–87). New York, USA.
Enferadi, J., & Shahi, A. (2016). On the position analysis of a new spherical parallel robot with
orientation applications. Robotics and Computer-Integrated Manufacturing, 37, 151–161.
https://doi.org/10.1016/J.RCIM.2015.09.004.
Feng, Y., Yu, X., & Man, Z. (2002). Non-singular terminal sliding mode control of rigid manipulators.
Automatica, 38(12), 2159–2167. https://doi.org/10.1016/S0005-1098(02)00147-4.
Ghorbel, F. H., Chetelat, O., Gunawardana, R., & Longchamp, R. (2000). Modeling and set point
control of closed-chain mechanisms: theory and experiment. IEEE Transactions on Control
Systems Technology, 8(5), 801–815. https://doi.org/10.1109/87.865853.
Gruyitch, L. T., & Kokosy, A. (1999). Robot control for robust stability with finite reachability
time in the whole. Journal of Robotic Systems, 16(5), 263–283. http://doi.org/10.1002/(SICI)
1097-4563(199905)16:5<263::AID-ROB2>3.0.CO;2-Q.
Hong, Y., Xu, Y., & Huang, J. (2002). Finite-time control for robot manipulators. Systems &
Control Letters, 46(4), 243–253. https://doi.org/10.1016/S0167-6911(02)00130-5.
Huang, Z., & Cao, Y. (2005). Property identification of the singularity loci of a class of
Gough-Stewart manipulators. The International Journal of Robotics Research, 24(8), 675–685.
https://doi.org/10.1177/0278364905054655.
Kelaiaia, R. (2017). Improving the pose accuracy of the Delta robot in machining operations. The
International Journal of Advanced Manufacturing Technology, 91(5–8), 2205–2215. https://
doi.org/10.1007/s00170-016-9955-8.
Kelly, R. (1998). Global positioning of robot manipulators via PD control plus a class of nonlinear
integral actions. IEEE Transactions on Automatic Control, 43(7), 934–938. https://doi.org/10.
1109/9.701091.
Kelly, R., Santibáñez, V., & Loría, A. (2005). Control of robot manipulators in joint space. Berlin: Springer.
Khalil, W., & Dombre, E. (2004). Modeling, identification and control of robots. Kogan Page Science.
Khan, W. A., Krovi, V. N., Saha, S. K., & Angeles, J. (2005). Recursive kinematics and inverse
dynamics for a planar 3R parallel manipulator. Journal of Dynamic Systems, Measurement,
and Control, 127(4), 529. https://doi.org/10.1115/1.2098890.
Li, Q., Wu, W., Xiang, J., Li, H., & Wu, C. (2015). A hybrid robot for friction stir welding.
Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical
Engineering Science, 229(14), 2639–2650. https://doi.org/10.1177/0954406214562848.
Michel, A. (1970). Quantitative analysis of simple and interconnected systems: Stability,
boundedness, and trajectory behavior. IEEE Transactions on Circuit Theory, 17(3), 292–301.
https://doi.org/10.1109/TCT.1970.1083119.
Muller, A. (2005). Internal preload control of redundantly actuated parallel manipulators—Its
application to backlash avoiding control. IEEE Transactions on Robotics, 21(4), 668–677.
https://doi.org/10.1109/TRO.2004.842341.
Nan, R., Li, D., Jin, C., Wang, Q., Zhu, L., Zhu, W., … Qian, L. (2011). The Five-Hundred-Meter
Aperture Spherical Radio Telescope (FAST) Project. International Journal of Modern Physics
D, 6, 989–1024. http://doi.org/10.1142/S0218271811019335.
Pierrot, F., Reynaud, C., & Fournier, A. (1990). DELTA: a simple and efficient parallel robot.
Robotica, 8(2), 105. https://doi.org/10.1017/S0263574700007669.
Polyakov, A. (2014). Stability notions and Lyapunov functions for sliding mode control systems.
Journal of the Franklin Institute, 351(4), 1831–1865. https://doi.org/10.1016/J.JFRANKLIN.
2014.01.002.
Polyakov, A., & Poznyak, A. (2009). Lyapunov function design for finite-time convergence
analysis: “Twisting” controller for second-order sliding mode realization. Automatica, 45(2),
444–448. https://doi.org/10.1016/J.AUTOMATICA.2008.07.013.
Ren, L., Mills, J. K., & Sun, D. (2007). Experimental comparison of control approaches on
trajectory tracking control of a 3-DOF parallel robot. IEEE Transactions on Control Systems
Technology, 15(5), 982–988. https://doi.org/10.1109/TCST.2006.890297.
Salinas, A., Moreno-Valenzuela, J., & Kelly, R. (2016). A family of nonlinear PID-like regulators
for a class of torque-driven robot manipulators equipped with torque-constrained actuators.
Advances in Mechanical Engineering, 8(2), 168781401662849. https://doi.org/10.1177/
1687814016628492.
Soto, I., & Campa, R. (2014). On dynamic modelling of parallel manipulators: The Five-Bar
mechanism as a case study. International Review on Modelling and Simulations (IREMOS), 7
(3), 531–541. http://doi.org/10.15866/IREMOS.V7I3.1899.
Soto, I., & Campa, R. (2015). Modelling and control of a spherical inverted pendulum on a
five-bar mechanism. International Journal of Advanced Robotic Systems, 12(7), 95. https://doi.
org/10.5772/60027.
Su, Y., & Zheng, C. (2017). PID control for global finite-time regulation of robotic manipulators.
International Journal of Systems Science, 48(3), 547–558. https://doi.org/10.1080/00207721.
2016.1193256.
Weiss, L., & Infante, E. (1967). Finite time stability under perturbing forces and on product spaces.
IEEE Transactions on Automatic Control, 12(1), 54–59. https://doi.org/10.1109/TAC.1967.
1098483.
Wu, H., Handroos, H., & Pessi, P. (2008). Mobile parallel robot for assembly and repair of ITER
vacuum vessel. Industrial Robot: An International Journal, 35(2), 160–168. https://doi.org/10.
1108/01439910810854656.
264 F. Salas et al.
Xie, F., & Liu, X.-J. (2016). Analysis of the kinematic characteristics of a high-speed parallel robot
with Schönflies motion: Mobility, kinematics, and singularity. Frontiers of Mechanical
Engineering, 11(2), 135–143. https://doi.org/10.1007/s11465-016-0389-7.
Yu, S., Yu, X., Shirinzadeh, B., & Man, Z. (2005). Continuous finite-time control for robotic
manipulators with terminal sliding mode. Automatica, 41(11), 1957–1964. https://doi.org/10.
1016/J.AUTOMATICA.2005.07.001.
Su, Y., & Zheng, C. (2009). A simple nonlinear PID control for finite-time regulation of robot
manipulators. In 2009 IEEE International Conference on Robotics and Automation (pp. 2569–
2574). IEEE. http://doi.org/10.1109/ROBOT.2009.5152244.
Su, Y., & Zheng, C. (2010). A simple nonlinear PID control for global finite-time regulation of
robot manipulators without velocity measurements. In 2010 IEEE International Conference on
Robotics and Automation (pp. 4651–4656). IEEE. http://doi.org/10.1109/ROBOT.2010.
5509163.
Zhao, D., Li, S., Zhu, Q., & Gao, F. (2010). Robust finite-time control approach for robotic
manipulators. IET Control Theory and Applications, 4(1), 1–15. https://doi.org/10.1049/iet-cta.
2008.0014.
Chapter 10
Robust Control of a 3-DOF Helicopter
with Input Dead-Zone
10.1 Introduction
Recently, unmanned aerial vehicles (UAVs) have received a great deal of attention due to their potential applications, and a large list of works can be found in the existing literature. Unmanned helicopters have an advantage over other UAVs because of their unique capability to perform tasks such as hovering and vertical takeoff and landing, needing only a very limited space for that (Isidori and Astolfi 1992; Avila et al. 2003; Marconi and Naldi 2007; Gadewadikar et al. 2008).
In this work, we refer to a three-degree-of-freedom (3-DOF) laboratory helicopter developed by the Quanser Company that is often used in control research for the design and implementation of control concepts (Quanser 1998). The 3-DOF helicopter system consists of two DC motors mounted at the two ends of a rectangular frame (helicopter frame) that drive two propellers (back and front propellers). There are two input voltages: one for the front motor and the other for the back motor. The 3-DOF helicopter has three outputs, which correspond to the elevation angle, the pitch angle, and the travel angle. This plant represents a typical underactuated MIMO nonlinear system with large uncertainties, which can be utilized as an ideal platform to test the effectiveness of control schemes. The system contains various uncertainties such as nonlinearities, coupling effects, unmodeled dynamics, and parametric perturbations, which may further increase the difficulty of control. Another important drawback in designing a stable controller is the fact that the inputs are aerodynamic forces/torques.
The simplest dynamical model of the 3-DOF helicopter is described in (Quanser 1998), where the friction and other dynamics of the system are neglected. When this model is used to design a position controller, it is difficult for the closed-loop system to accurately reach the desired position. To improve the performance of the system, different models have been used to control the 3-DOF helicopter, such as (Ishutkina 2004; Shan et al. 2005; Andrievsky et al. 2007; Ishitobi et al. 2010). In this work, we consider input dynamics and a dead-zone phenomenon in a model based on those given in (Ishutkina 2004; Andrievsky et al. 2007). These input dynamics relate the input voltages to the torques, adding a lag to the system. These dynamics augment the order of the system and, equivalently, the degrees of freedom. None of the works referenced in this paper include these dynamics in the controller design; we can only find a reference to these dynamics in (Ishutkina 2004).
Either the attitude control problem (elevation and pitch channels) or the position control problem (elevation and travel angles) can be selected for the controlled outputs. The attitude control problem is solved taking into account only a partial dynamic of the system, which simplifies the problem (Zheng and Zhong 2011; Wang et al. 2013; Liu et al. 2014) because a fully actuated system is obtained. The position control problem focuses on tracking the references for the elevation and travel angles. To solve the position control problem, different solutions have been proposed; for example, in (Odelga et al. 2012; Liu et al. 2014) a hierarchical control is used: first, the control problem for the travel angle is solved using the reference position for the pitch angle as the control input, and then the attitude control problem is solved. Another solution for the position control problem is given in (Ishitobi et al. 2010), where a reference model is used. In this paper, we solve the position control problem, considering the pitch angle as dependent on the other desired positions.
Different control techniques have been applied to the 3-DOF helicopter, among which robust controllers are the most popular, e.g., PD control with a robust compensator (Zheng and Zhong 2011; Ferreira et al. 2012), $H_2$, $H_\infty$ or LQR controllers (Li and Shen 2007; Raafat and Akmeliawati 2012; Wang et al. 2013; Liu et al. 2014), and sliding mode controllers (Starkov et al. 2008; Meza-Sanchez et al. 2012a, b; Odelga et al. 2012). Besides these, adaptive controllers have been used for this plant (Andrievsky et al. 2007; Gao and Fang 2012). In this work, an $H_\infty$ synthesis is applied to a nonlinear time-varying system obtained from the model of the plant. Static $H_\infty$ controllers, which take into account the linearized system, are given by (Ferreira et al. 2012; Wang et al. 2013). The proposed $H_\infty$ synthesis solves the problem of having only position measurements, obtaining, through a filter, the unknown velocities of the given outputs. Moreover, the $H_\infty$ design ensures an $L_2$-gain bound from the disturbances that affect the system to the chosen output.
The aim of this work is to develop a controller capable of ensuring that the position of the system, given by the elevation and travel angles, tracks a desired, sufficiently smooth trajectory. The plant consists of a 3-DOF helicopter including input dynamics and an input dead-zone, which yields an eighth-order system, equivalent to a system with five degrees of freedom and two degrees of actuation. To simplify the problem, the full system is decomposed: the problem is solved first for the plant without considering the input dynamics, and then a hierarchical control is used to track the reference control, obtained from the $H_\infty$ synthesis, employing the input voltages as the input of the full system.
10.2 Preliminaries

Consider the linear time-varying system

$$\dot x = A(t)\,x + B_1(t)\,w + B_2(t)\,u, \qquad z = C_1(t)\,x + D_{12}(t)\,u, \qquad y = C_2(t)\,x + D_{21}(t)\,w \qquad (10.1)$$

with the state vector $x(t) \in \mathbb{R}^n$, the control input $u(t) \in \mathbb{R}^m$, the unknown disturbance $w(t) \in \mathbb{R}^r$, the output $z(t) \in \mathbb{R}^l$ to be controlled, and the available measurement $y(t) \in \mathbb{R}^p$, imposed on the system, and with matrices $A(t)$, $B_1(t)$, $B_2(t)$, $C_1(t)$, $C_2(t)$, $D_{12}(t)$, $D_{21}(t)$ of appropriate dimensions.

For the convenience of the reader, recall that the system Eq. (10.1) possesses an $L_2$-gain less than $\gamma$ if the following inequality holds:

$$\int_0^T \|z(t)\|^2\,dt < \gamma^2\int_0^T \|w(t)\|^2\,dt \qquad (10.2)$$
for all $T > 0$, for all system trajectories initialized at the origin, and for all piecewise continuous functions $w(t) \in L_2(0,T)$ such that the state trajectories remain in a vicinity of the origin.
The $H_\infty$-control problem for the system Eq. (10.1) is to find all admissible controllers

$$u = K(\xi, t), \qquad \dot\xi = F(\xi, y, t), \qquad (10.3)$$
with internal state $\xi \in \mathbb{R}^s$ such that the $L_2$-gain of the closed-loop system Eq. (10.1), driven by Eq. (10.3), is less than $\gamma$. Solving the above problem for $\gamma$ approaching the infimal achievable level $\gamma^*$ in Eq. (10.2) yields a (sub)optimal $H_\infty$-controller with the (sub)optimal disturbance attenuation level $\gamma$ ($\gamma > \gamma^*$).

The following assumptions are imposed on the system Eq. (10.1):

A1. $(A(t), B_1(t))$ is stabilizable, and $(C_1(t), A(t))$ is detectable;
A2. $(A(t), B_2(t))$ is stabilizable, and $(C_2(t), A(t))$ is detectable;
A3. $D_{12}^T(t)\,C_1(t) \equiv 0$ and $D_{12}^T(t)\,D_{12}(t) \equiv I$;
A4. $B_1(t)\,D_{21}^T(t) \equiv 0$ and $D_{21}^T(t)\,D_{21}(t) \equiv I$;
which are made to simplify the solution of the $H_\infty$-control problem.

Necessary and sufficient conditions for the above $H_\infty$ suboptimal control problem are formulated in terms of the existence of appropriate solutions of certain differential Riccati equations, for a given disturbance attenuation level $\gamma > 0$:

C1. There exists a positive constant $\varepsilon_0$ such that the corresponding perturbed differential Riccati equations admit uniformly bounded, symmetric, positive (semi)definite solutions $P(t)$ and $Q(t)$ for each $\varepsilon \in (0, \varepsilon_0]$.

The central controller is then given by

$$\dot\xi = A\,\xi + \left[\frac{1}{\gamma^2}B_1 B_1^T - B_2 B_2^T\right] P\,\xi + Q\,C_2^T\left(y - C_2\,\xi\right), \qquad u = -B_2^T(t)\,P(t)\,\xi(t) \qquad (10.6)$$
Now consider the nonlinear time-varying system

$$\dot x = f(x,t) + g_1(x,t)\,w + g_2(x,t)\,u, \qquad z = h_1(x,t) + k_{12}(x,t)\,u, \qquad y = h_2(x,t) + k_{21}(x,t)\,w \qquad (10.7)$$

where $x(t) \in \mathbb{R}^n$ is the state vector, $u(t) \in \mathbb{R}^m$ is the control input, $w(t) \in \mathbb{R}^r$ is the unknown disturbance, $z(t) \in \mathbb{R}^l$ is the output to be controlled, and $y(t) \in \mathbb{R}^p$ is the only available measurement on the system.
The nonlinear $H_\infty$-control problem for the system Eq. (10.7) is to find a locally stabilizing output feedback controller of the form

$$\dot{\tilde\xi} = \tilde F(\tilde\xi, y, t), \qquad \tilde u = \tilde K(\tilde\xi, t) \qquad (10.8)$$

with internal state $\tilde\xi \in \mathbb{R}^s$ such that the $L_2$-gain of the closed-loop system Eq. (10.7), driven by Eq. (10.8), is locally less than $\gamma$.

The following assumptions are made on the system Eq. (10.7):

A5. The functions $f(x(t),t)$, $g_1(x(t),t)$, $g_2(x(t),t)$, $h_1(x(t),t)$, $h_2(x(t),t)$, $k_{12}(x(t),t)$, and $k_{21}(x(t),t)$ are of appropriate dimensions and of class $C^1$;
A6. $f(0,t) = 0$, $h_1(0,t) = 0$, and $h_2(0,t) = 0$ for all $t$;
A7. $h_1^T(x(t),t)\,k_{12}(x(t),t) = 0$, $k_{12}^T(x(t),t)\,k_{12}(x(t),t) = I$, $k_{21}(x(t),t)\,g_1^T(x(t),t) = 0$, $k_{21}(x(t),t)\,k_{21}^T(x(t),t) = I$.

Assumptions A5 and A6 are typical of the nonlinear treatment (Isidori and Astolfi 1992), whereas assumption A7 contains simplifying assumptions related to the linear treatment.
The local synthesis involves the linear $H_\infty$-control problem of time-varying systems for the linearized system Eq. (10.1), where

$$A(t) = \frac{\partial f}{\partial x}(0,t), \quad B_1(t) = g_1(0,t), \quad B_2(t) = g_2(0,t),$$
$$C_1(t) = \frac{\partial h_1}{\partial x}(0,t), \quad C_2(t) = \frac{\partial h_2}{\partial x}(0,t),$$
$$D_{12}(t) = k_{12}(0,t), \quad D_{21}(t) = k_{21}(0,t). \qquad (10.9)$$
We present a controller that globally stabilizes a perturbed first-order system with state $x(t) \in \mathbb{R}$ and bounded disturbance $w(t) \in \mathbb{R}$,

$$\dot x = u + w, \qquad (10.11)$$

by means of a feedback law $u(x) \in \mathbb{R}$ referred to as high gain around the origin (HGAO) control. The following state feedback is proposed:

$$u(x) = -\kappa_1\frac{x}{(\epsilon + |x|)^a} \qquad (10.12)$$

For $a = 0$, this reduces to the simple control law $\upsilon(x) = -\kappa_1 x$. Therefore, using adequate parameter values of the proposed controller Eq. (10.12), we can get the benefits of a discontinuous controller, given by $\upsilon(x) = -\kappa_1\,\mathrm{sign}(x)$, while avoiding the undesirable chattering effects of the control signal.

The HGAO controller is more robust than the linear controller and can reach a performance similar to that of the discontinuous one. The following result states the robustness properties of the HGAO controller.
Theorem 3 Given $\kappa_1 > 0$, $\epsilon > 0$, and $0 < a < 1$, the continuous closed-loop system Eqs. (10.11)–(10.12) is globally asymptotically stable for any disturbance $w$ that satisfies the growth condition

$$|w(t)| \le \kappa_0\frac{|x|}{(\epsilon + |x|)^a} = \rho(x) \qquad (10.13)$$

with $\kappa_0 < \kappa_1$.

Proof Consider the Lyapunov function

$$V(x,t) = \frac{1}{2}x^2 \qquad (10.14)$$

whose time derivative along Eqs. (10.11)–(10.12) satisfies $\dot V = x(u + w) \le -(\kappa_1 - \kappa_0)\,x^2/(\epsilon + |x|)^a$, and since $\kappa_1 > \kappa_0$ by a condition of the theorem, the global asymptotic stability of Eqs. (10.11)–(10.12) is then established.
Figure 10.1 depicts an example of rejected disturbances using the following parameter values: $\kappa_1 = 1$, $a = 0.9$, and $\epsilon = 0.001$. This example shows that the closed-loop system can reject almost all disturbances with magnitude less than 1 unit.

The term $\epsilon$ is added to avoid singularities in the dynamic model Eqs. (10.11)–(10.12). Furthermore, this term limits the linear gain, $G(x)$, of the control input Eq. (10.12), which reaches its maximum value at the origin with $G(0) = \kappa_1\,\epsilon^{-a}$. The parameter $a$ lets us move the border of the rejected disturbances around the origin (see Fig. 10.1).
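A compact simulation of the HGAO law of Eq. (10.12) against a disturbance on the boundary of the growth condition Eq. (10.13) (our own sketch; the gain values below are illustrative):

```python
import numpy as np

kappa1, kappa0, a, eps = 1.0, 0.8, 0.9, 0.001

def u_hgao(x):
    """HGAO control of Eq. (10.12): high gain kappa1/eps^a near the origin."""
    return -kappa1 * x / (eps + abs(x))**a

def w_worst(x):
    """A disturbance on the boundary of the growth condition of Eq. (10.13)."""
    return kappa0 * abs(x) / (eps + abs(x))**a

# Euler simulation of the perturbed first-order system x_dot = u + w, Eq. (10.11)
x, dt = 2.0, 1e-3
for _ in range(20000):
    x += dt * (u_hgao(x) + w_worst(x))
print(abs(x) < 1e-3)   # the state is driven to (a small vicinity of) the origin
```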
Consider a second-order system of the form Eq. (10.16), where $v = [\,x(t)\;\;\dot x(t)\,]^T$ represents the state vector of the system, $t \in \mathbb{R}$ is the time variable, $u(t) \in \mathbb{R}$ and $w(t) \in \mathbb{R}$ are unknown inputs of the given system, $f(x,u,w,t) : \mathbb{R}\times\mathbb{R}\times\mathbb{R}\times\mathbb{R} \to \mathbb{R}$ is a bounded function, and $y(t) \in \mathbb{R}$ represents the output of the system.

For the described system, we state the following problem: using the only available information of the system, $x(t)$, and knowing the bound

$$|f(\cdot)| \le C, \qquad (10.18)$$

estimate the velocity $\dot x(t)$.

Assume that the observer gains satisfy condition Eq. (10.20) and that condition Eq. (10.18) holds for system Eq. (10.16). Then the state estimates $(\hat v_1, \hat v_2)$ of the observer Eq. (10.19) converge globally asymptotically to the states $(v_1, v_2)$ of system (10.17).
Proof First, we define the observation error as

$$\epsilon = \begin{bmatrix} \epsilon_1\\ \epsilon_2 \end{bmatrix} = \begin{bmatrix} \hat v_1 - v_1\\ \hat v_2 - v_2 \end{bmatrix}, \qquad (10.21)$$
so that the observation objective is

$$\lim_{t\to\infty}\epsilon = 0 \qquad (10.23)$$

Let us consider the Lyapunov candidate function (Moreno and Osorio 2008)

$$V = 2\kappa_2\,|\epsilon_1| + \frac{1}{2}\epsilon_2^2 + \frac{1}{2}s^2 \qquad (10.24)$$

with

$$s = \kappa_1\,|\epsilon_1|^{1/2}\,\mathrm{sign}(\epsilon_1) + \epsilon_2 \qquad (10.25)$$
By computing the time derivative of this function along the trajectories of system Eq. (10.22), we arrive at

$$\begin{aligned}\dot V &= -\kappa_1\kappa_2\,|\epsilon_1|^{1/2} - \frac{1}{2}\kappa_1\,|\epsilon_1|^{-1/2}\,s^2 - 2s\,f(\cdot) - \kappa_1\,|\epsilon_1|^{1/2}\,\mathrm{sign}(\epsilon_1)\,f(\cdot)\\ &\le -\kappa_1\kappa_2\,|\epsilon_1|^{1/2} - \frac{1}{2}\kappa_1\,|\epsilon_1|^{-1/2}\,s^2 + 2|s|\,|f(\cdot)| + \kappa_1\,|\epsilon_1|^{1/2}\,|f(\cdot)| \end{aligned} \qquad (10.26)$$
which is negative semidefinite when condition Eq. (10.20) holds. By applying the Invariance Principle (Khalil 2002), and since $\epsilon \equiv 0$ is the largest invariant set within $\{\epsilon_1 = 0\}$, the origin is globally asymptotically stable.

In addition to the robustness given by the above theorem, it can be shown that this velocity observer, using adequate parameter values, possesses finite-time convergence (Moreno and Osorio 2008; Orlov et al. 2011), which is a desirable feature.
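Since the observer equations themselves (Eq. (10.19)) do not survive in the text above, the sketch below implements the standard supertwisting velocity observer of Moreno and Osorio (2008) as an assumption about their form; the gains k1, k2 and the simulated plant are our own choices:

```python
import numpy as np

k1, k2 = 5.0, 4.0          # observer gains, assumed to satisfy condition (10.20)
dt, T = 1e-4, 5.0

def f(t):                   # unknown bounded dynamics acting on the plant
    return np.sin(3.0 * t)  # |f| <= C = 1

x, xd = 1.0, 0.0            # plant states v1 = x, v2 = x_dot
v1h, v2h = 0.0, 0.0         # observer states

for i in range(int(T / dt)):
    t = i * dt
    e1 = v1h - x
    # Standard supertwisting observer structure (our assumption for Eq. (10.19)):
    v1h += dt * (v2h - k1 * np.sqrt(abs(e1)) * np.sign(e1))
    v2h += dt * (-k2 * np.sign(e1))
    # Plant integration
    x, xd = x + dt * xd, xd + dt * f(t)

print(abs(v1h - x) < 1e-2, abs(v2h - xd) < 1e-1)  # estimates track the true states
```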
The dead-zone degrades the performance of the system; the inverse model can be used to compensate for the effects of the dead-zone on the system (Tao and Kokotovic 1997).

The dead-zone characteristic $D(\cdot)$ is described by

$$Z(t) = D(Y(t)) = \begin{cases} m_l\,(Y(t) + j_l) & \text{if } Y(t) < -j_l\\ 0 & \text{if } -j_l \le Y(t) \le j_r\\ m_r\,(Y(t) - j_r) & \text{if } Y(t) > j_r \end{cases} \qquad (10.28)$$
where $j_l$ and $j_r$ represent the size of the dead-zone, whereas $m_l$ and $m_r$ are the linear ratios between input and output. This model is shown in Fig. 10.2.

The inverse model $\bar D$ of the dead-zone characteristic, depicted in Fig. 10.3, is specified by

$$Y(t) = \bar D(Z(t)) = \begin{cases} \dfrac{Z(t) - m_l\,j_l}{m_l} & \text{if } Z(t) < 0\\[4pt] 0 & \text{if } Z(t) = 0\\[4pt] \dfrac{Z(t) + m_r\,j_r}{m_r} & \text{if } Z(t) > 0 \end{cases} \qquad (10.29)$$

When the parameters of the inverse model coincide with those of the dead-zone model, the dead-zone effect is canceled. If this does not happen, the effects of the dead-zone are at least minimized.
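The pair of Eqs. (10.28)–(10.29) is easy to verify numerically; the following Python sketch (our own code, with illustrative parameter values) checks that the inverse precompensation recovers the identity:

```python
import numpy as np

ml, mr, jl, jr = 1.2, 0.9, 0.3, 0.4   # example dead-zone parameters

def deadzone(Y):
    """Dead-zone characteristic D of Eq. (10.28)."""
    if Y < -jl:  return ml * (Y + jl)
    if Y >  jr:  return mr * (Y - jr)
    return 0.0

def deadzone_inv(Z):
    """Inverse model of Eq. (10.29)."""
    if Z < 0:  return (Z - ml * jl) / ml
    if Z > 0:  return (Z + mr * jr) / mr
    return 0.0

# D(Dbar(Z)) = Z when both use the same parameters
Z = np.linspace(-2, 2, 9)
print(np.allclose([deadzone(deadzone_inv(z)) for z in Z], Z))   # True
```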
The 3-DOF helicopter model used in this work, shown in Fig. 10.4, is analogous to a tandem rotor helicopter such as the Boeing CH-47 Chinook illustrated in Fig. 10.5.

The dynamical model is derived from (Quanser 1998; Ishutkina 2004), where $\theta$, $\phi$ and $\psi \in \mathbb{R}$ are the elevation, pitch, and travel angles; $J_e$, $J_p$, $J_t \in \mathbb{R}_+$ and $f_e$, $f_p$, $f_t$ are the inertia and viscous friction coefficients of the elevation, pitch, and rotation axes. The terms $c_e\sin(\theta)$ and $c_p\sin(\phi)$ represent the restorative spring torque relative to the elevation and pitch axes, respectively. $K_f$ is the force constant of the motor/propeller combination, and $K_p$ is the force required to maintain the helicopter in flight. $L_b$ is the distance from the pivot point to the helicopter body, and $L_h$ is the distance from the pitch axis to any of the motors. The input torques $\tau_f$ and $\tau_b$ represent the control action of the front and back DC motors applied to the system. Finally, $w_e$, $w_p$ and $w_t \in L_2$ are introduced to take into account different perturbations affecting the system.
The elevation angle, $\theta$, corresponds to the angular displacement of the main sustentation arm with respect to the horizontal axis $y$. The movement range of the elevation $\theta$ is limited to approximately between $-1$ and $1$ rad due to the hardware restrictions. The pitch angle, $\phi$, defines the movement of the helicopter body and is confined to the domain $\phi \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. The travel angle corresponds to the rotation of the entire system around the vertical axis. The inertia model of the system is simplified to point masses associated with the two motors and with the counterweight. In addition, friction and aerodynamic drag effects are assumed to be negligible. The force generated by each motor–propeller is assumed to be normal to the propeller plane.

The system is controlled by the action of two rotors driven by corresponding electric DC motors. The collective operation of the two rotors produces two actions acting simultaneously on the system: the lifting action, given by the sum of the torques, and the rolling action, given by the difference of the torques.

The torques $\tau_f$ and $\tau_b$ ($\tau_{f,b}$) applied to the system are the result of the torques $\tilde\tau_f$ and $\tilde\tau_b$ ($\tilde\tau_{f,b}$) affected by dead-zone phenomena, which are described as follows:

$$\tau_{f,b} = D(\tilde\tau_{f,b}) = \begin{cases} 0 & \text{if } |\tilde\tau_{f,b}| \le j_{f,b}\\ \tilde\tau_{f,b} - j_{f,b}\,\mathrm{sign}(\tilde\tau_{f,b}) & \text{otherwise} \end{cases} \qquad (10.31)$$
The objective of the control system is that the system position, $(\theta, \psi)$, tracks a desired, sufficiently smooth trajectory, $T(\theta_d(t), \psi_d(t))$, while also attenuating the system disturbances and the measurement errors $w_\theta$, $w_\phi$ and $w_\psi$.

The only available measurements of the system are the positions of the system states corrupted by some measurement errors, defining the system output as

$$y = \begin{bmatrix} \theta + w_\theta\\ \phi + w_\phi\\ \psi + w_\psi \end{bmatrix} \qquad (10.34)$$

The actuator (input) dynamics are given by

$$\dot{\tilde\tau}_f = -d_1\,\tilde\tau_f + d_2\,V_f, \qquad \dot{\tilde\tau}_b = -e_1\,\tilde\tau_b + e_2\,V_b. \qquad (10.37)$$

The output to be controlled is chosen as

$$z = [\,\tau_f + \tau_b\quad \tau_f - \tau_b\quad q_1 x_1\quad q_2 x_2\quad q_3 x_3\quad q_4 x_4\quad q_5 x_5\quad q_6 x_6\,]^T, \qquad (10.38)$$
Then, the H∞ tracking problem is stated in terms of the state deviation vector x ∈ ℝ⁶. Given the error system, Eq. (10.36), with the system output, Eq. (10.34), and a real number γ > 0, it is required to find (if any) a causal dynamic output controller for the output of Eq. (10.38), with internal state ξ ∈ ℝ⁶, such that the closed-loop system is internally uniformly asymptotically stable around the origin, whereas its L₂-gain is locally less than γ.
The solution of the above problem only solves the control problem of the plant without considering the actuator dynamics. Therefore, it is necessary to develop a strategy to generate the voltages V_f and V_b such that the actuators produce the torques τ_f and τ_b obtained from the proposed H∞ synthesis.
The tracking control problem is solved in two steps. In the first stage, we consider the plant without taking into account the actuator dynamics; in this stage, it is required to find the input torques τ_f and τ_b such that the H∞ control problem is solved for the system of Eqs. (10.36) and (10.38). The second stage consists of finding the voltages V_f and V_b such that the input torques τ_f and τ_b, derived with the H∞ synthesis, are generated by the actuators.
The first stage of the proposed control problem involves a sixth-order system, Eq. (10.36), i.e., a 3-DOF system with two degrees of actuation. To simplify the control problem of this system, the following control inputs are proposed:
\[
\tau_f + \tau_b = v_1 = \frac{u_1 + f_e\dot{\theta}_d + c_e\sin(\theta) + J_e\ddot{\theta}_d}{K_f L_b\cos(\phi)},
\qquad
\tau_f - \tau_b = v_2 = \frac{u_2 + f_p\dot{\phi}_d + c_p\sin(\phi) + J_p\ddot{\phi}_d}{K_f L_h}.
\tag{10.39}
\]
\[
Z_2 = \begin{bmatrix} u_2 & q_3 x_3 & q_4 x_4 & q_5 x_5 & q_6 x_6 \end{bmatrix}^{T}.
\tag{10.46}
\]
For each subsystem, Σ₁ and Σ₂, the H∞ tracking problem is restated in terms of the state deviation vectors x₁ and x₂. Given the error systems Σ₁ and Σ₂ and real numbers γ₁, γ₂ > 0, it is required to find (if any) causal dynamic output controllers, Eqs. (10.45) and (10.46), with internal states ξ₁ and ξ₂, such that the closed-loop systems are internally uniformly asymptotically stable around the origin, whereas their L₂-gains are locally less than γ₁ and γ₂, respectively.
Solving the H∞ problem allows us to find the control inputs τ_f and τ_b from Eq. (10.39), where
\[
\tau_f = \tfrac{1}{2}(v_1 + v_2),
\tag{10.47}
\]
\[
\tau_b = \tfrac{1}{2}(v_1 - v_2);
\tag{10.48}
\]
the voltages V_f and V_b are obtained such that the torques τ̂_f and τ̂_b fulfill the condition
\[
\lim_{t\to\infty}\begin{bmatrix} \hat{\tau}_f - \tilde{\tau}_f \\ \hat{\tau}_b - \tilde{\tau}_b \end{bmatrix} = 0.
\tag{10.51}
\]
\[
V_b = \frac{1}{e_2}\left(e_1\tilde{\tau}_b + \dot{\hat{\tau}}_b + \nu_b\right),
\tag{10.55}
\]
where \dot{\hat{\tau}}_f and \dot{\hat{\tau}}_b are estimated values of \dot{\tilde{\tau}}_f and \dot{\tilde{\tau}}_b, whereas ν_f and ν_b are defined by the control law of Eq. (10.12), i.e.,
\[
\nu_f = -j_{1f}\,(\varepsilon_f + |\zeta_f|)^{\alpha_f}\,\zeta_f,
\qquad
\nu_b = -j_{1b}\,(\varepsilon_b + |\zeta_b|)^{\alpha_b}\,\zeta_b.
\tag{10.56}
\]
The estimates \dot{\hat{\tau}}_f and \dot{\hat{\tau}}_b are obtained by applying the super-twisting observer of Eq. (10.19).
The inverse model of Eq. (10.29) is used to compensate the dead-zone of Eq. (10.31). We consider a symmetric dead-zone model, i.e., j_l = j_r and m_l = m_r. In this case, it is necessary to estimate the gap of the dead-zone for both motors (front and back), which corresponds to the values j_f and j_b.
The complete control system is depicted in Fig. 10.7.
To fulfill requirement A6, the trajectory φ_d and its time derivatives are prespecified in the form
\[
\phi_d = \sin^{-1}\!\left(\frac{J_t}{K_p L_b}\left(f_t J_t^{-1}\dot{\psi}_d + \ddot{\psi}_d\right)\right),
\]
\[
\ddot{\phi}_d = \tan(\phi_d)\,\dot{\phi}_d^{\,2} + \frac{J_t}{K_p L_b\cos(\phi_d)}\left(f_t J_t^{-1}\psi_d^{(3)} + \psi_d^{(4)}\right).
\]
Some numerical simulations were performed to show the efficacy of the proposed method. The parameters of the helicopter, drawn from the Quanser 3-DOF helicopter manual (Quanser 1998), are given in Table 10.1. The numerical setup was implemented in Simulink Version 7.5 (R2010a) under MATLAB 7.10.0.499 (R2010a), 64-bit (win64), running on a personal computer with an Intel Core i3-3120 processor at 2.50 GHz and 4 GB of RAM.
The parameter values for the input dynamics were obtained from (Ishutkina 2004), where d₁ = 7.3, d₂ = 1, e₁ = 6.2, and e₂ = 1.
We consider a perturbed system with parametric variations. The H∞-controller parameters used in the numerical simulations were γ₁ = 320 and q₁ = [300 0], with the perturbations chosen as
\[
w_e = 0.05\sin(0.4\pi t), \qquad
w_p = 0.02\sin\!\left(\tfrac{2}{7}\pi t\right), \qquad
w_t = 0.01\sin(0.25\pi t).
\tag{10.61}
\]
The position behavior is shown in Figs. 10.8, 10.9, and 10.10, and the control
input is shown in Fig. 10.11. These figures demonstrate the effectiveness of the
proposed control method.
10.5 Conclusions
The tracking control problem for a 3-DOF helicopter was solved using H∞ synthesis. For this problem, the elevation and travel angles were selected as the outputs to be controlled, while the desired trajectory for the pitch angle was obtained from condition A6. The input dead-zone was compensated using its inverse model, and a reference model was used to compensate the first-order dynamics of the actuators, where a first-order controller was developed from a discontinuous one. The system positions (θ, φ, ψ) were the only available measurements, and external disturbances and parametric variations were considered. Numerical results show the effectiveness of the proposed control.
References
Andrievsky, B., Peaucelle, D., & Fradkov, A. L. (2007). Adaptive control of 3DOF motion for LAAS helicopter benchmark: Design and experiments. In American Control Conference, 2007, ACC '07 (pp. 3312–3317). https://doi.org/10.1109/acc.2007.4282243.
Avila Vilchis, J. C., Brogliato, B., Dzul, A., & Lozano, R. (2003). Nonlinear modelling and control of helicopters. Automatica, 39, 1583–1596. https://doi.org/10.1016/S0005-1098(03)00168-7.
Davila, J., Fridman, L., & Levant, A. (2005). Second-order sliding-mode observer for mechanical systems. IEEE Transactions on Automatic Control, 50, 1785–1789. https://doi.org/10.1109/tac.2005.858636.
Ferreira de Loza, A., Rios, H., & Rosales, A. (2012). Robust regulation for a 3-DOF helicopter via
sliding-mode observation and identification. Journal of the Franklin Institute, 349, 700–718.
https://doi.org/10.1016/j.jfranklin.2011.09.006.
Gadewadikar, J., Lewis, F., Subbarao, K., & Chen, B. M. (2008). Structured H-infinity command and control-loop design for unmanned helicopters. Journal of Guidance, Control, and Dynamics, 31, 1093–1102. https://doi.org/10.2514/1.31377.
Gao, W.-N., & Fang, Z. (2012). Adaptive integral backstepping control for a 3-DOF helicopter. In 2012 International Conference on Information and Automation (ICIA) (pp. 190–195). https://doi.org/10.1109/icinfa.2012.6246806.
Ishitobi, M., Nishi, M., & Nakasaki, K. (2010). Nonlinear adaptive model following control for a 3-DOF tandem-rotor model helicopter. Control Engineering Practice, 18, 936–943. https://doi.org/10.1016/j.conengprac.2010.03.017.
Ishutkina, M. A. (2004). Design and implementation of a supervisory safety controller for a 3DOF helicopter. Ph.D. thesis. Massachusetts Institute of Technology.
Isidori, A., & Astolfi, A. (1992). Disturbance attenuation and H∞-control via measurement feedback in nonlinear systems. IEEE Transactions on Automatic Control, 37, 1283–1293.
Khalil, H. K. (2002). Nonlinear systems. Prentice Hall.
Li, P.-R., & Shen, T. (2007). The research of 3 DOF helicopter tracking controller. In 2007 International Conference on Machine Learning and Cybernetics, 1 (pp. 578–582). https://doi.org/10.1109/icmlc.2007.4370211.
Liu, H., Xi, J., & Zhong, Y. (2014). Robust hierarchical control of a laboratory helicopter. Journal
of the Franklin Institute, 351, 259–276. https://doi.org/10.1016/j.jfranklin.2013.08.020.
Marconi, L., & Naldi, R. (2007). Robust full degree-of-freedom tracking control of a helicopter.
Automatica, 43, 1909–1920. https://doi.org/10.1016/j.automatica.2007.03.028.
Meza-Sanchez, I. M., Orlov, Y., & Aguilar, L. T. (2012a). Periodic motion stabilization of a virtually constrained 3-DOF underactuated helicopter using second order sliding modes. In 2012 12th International Workshop on Variable Structure Systems (VSS) (pp. 422–427). https://doi.org/10.1109/vss.2012.6163539.
Meza-Sanchez, I. M., Orlov, Y., & Aguilar, L. T. (2012b). Stabilization of a 3-DOF underactuated helicopter prototype: Second order sliding mode algorithm synthesis, stability analysis, and numerical verification. In 2012 12th International Workshop on Variable Structure Systems (VSS) (pp. 361–366). https://doi.org/10.1109/vss.2012.6163529.
Moreno, J. A., & Osorio, M. (2008). A Lyapunov approach to second-order sliding mode controllers and observers. In 2008 47th IEEE Conference on Decision and Control (CDC) (pp. 2856–2861). https://doi.org/10.1109/cdc.2008.4739356.
Odelga, M., Chriette, A., & Plestan, F. (2012). Control of 3 DOF helicopter: A novel autopilot scheme based on adaptive sliding mode control. In 2012 American Control Conference (ACC) (pp. 2545–2550).
Orlov, Y. V., & Aguilar, L. T. (2014). Advanced H∞ control: Towards nonsmooth theory and applications. New York: Birkhäuser.
Orlov, Y., Aoustin, Y., & Chevallereau, C. (2011). Finite time stabilization of a perturbed double integrator: Part I, continuous sliding mode-based output feedback synthesis. IEEE Transactions on Automatic Control, 56, 614–618. https://doi.org/10.1109/tac.2010.2090708.
Quanser. (1998). 3D helicopter system with active disturbance. Available: http://www.quanser.com/choice.asp.
Raafat, S. M., & Akmeliawati, R. (2012). Robust disturbance rejection control of helicopter system
using intelligent identification of uncertainties. Procedia Engineering, 41, 120–126. https://doi.
org/10.1016/j.proeng.2012.07.151.
Shan, J., Liu, H.-T., & Nowotny, S. (2005). Synchronised trajectory-tracking control of multiple 3-DOF experimental helicopters. IEE Proceedings: Control Theory and Applications, 152, 683–692. https://doi.org/10.1049/ip-cta:20050008.
Starkov, K. K., Aguilar, L. T., & Orlov, Y. (2008). Sliding mode control synthesis of a 3-DOF helicopter prototype using position feedback. In 2008 International Workshop on Variable Structure Systems (VSS '08) (pp. 233–237). https://doi.org/10.1109/vss.2008.4570713.
Tao, G., & Kokotovic, P. V. (1997). Adaptive control of systems with unknown non-smooth
non-linearities. International Journal of Adaptive Control and Signal Processing, 11, 81–100.
Wang, X., Lu, G., & Zhong, Y. (2013). Robust attitude control of a laboratory helicopter. Robotics
and Autonomous Systems, 61, 1247–1257. https://doi.org/10.1016/j.robot.2013.09.006.
Zheng, B., & Zhong, Y. (2011). Robust attitude regulation of a 3-DOF helicopter benchmark: Theory and experiments. IEEE Transactions on Industrial Electronics, 58, 660–670. https://doi.org/10.1109/tie.2010.2046579.
Part III
Robotics
Chapter 11
Mechatronic Integral Ankle
Rehabilitation System: Ankle
Rehabilitation Robot, Serious Game,
and Facial Expression Recognition
System
Abstract People who have suffered an injury require a rehabilitation process of the
affected muscle. Rehabilitation machines have been proposed to recover and
strengthen the affected muscle. In this chapter, we propose a novel ankle rehabil-
itation parallel robot of two degrees of freedom consisting of two linear guides. For
the integral rehabilitation, a serious game and a facial expression recognition sys-
tem are added for entertainment and to improve patient engagement in the reha-
bilitation process. The serious game has a simple design: it has three levels and is controlled with an impedance control, in which specific commands allow the game character to jump over the obstacles. A facial expression recognition system assists the serious game. We propose to recognize three facial expressions different from the basic expressions. Based on the experimental results, we conclude that our system performs well, achieving a recognition performance of 0.95 (95%).
11.1 Introduction
Ligamentous ankle injuries are the most common sports trauma, accounting for approximately 10–30% of all sports injuries according to data from different publications (Zoch et al. 2003). An ankle sprain occurs when the ankle unexpectedly twists or turns in an awkward way beyond what the ligaments can tolerate; the most common cause is an excess of inversion movement, which damages the lateral ankle ligaments. When a muscle remains immobilized, it tends to weaken, becomes stiff, loses tone, and shortens. Salter recommended that all affected joints should be moved continuously through a full range of motion, and he invented the concept of continuous passive motion, known as CPM (O'Driscoll and Giori 2000).
Rehabilitation is the process by which a person who has had an illness or injury restores his or her skills so as to regain maximum self-sufficiency and function in a normal, or as near normal, manner as possible. Rehabilitation is beneficial to reduce spasticity, to increase muscle mass, and to control muscle movement. Robotic ankle rehabilitation devices like CPM machines are used to produce smooth and controlled motions in rehabilitation therapies, helping patients perform repetitive movements over a well-defined interval and at a given speed.
Consequently, there is an increasing research interest in developing rehabilitation machines by technology development companies, institutions, and universities around the world. The main objectives of these machines are to (a) rehabilitate the affected part (e.g., knee, ankle, hands, hip), (b) restore mobility, (c) reduce the repetitive work of a therapist, (d) increase the number of therapy services, (e) reduce recovery time, and (f) offer a wider range of personalized therapies with precise and safe movements (Blanco-Ortega et al. 2012).
The repetitive nature of exercises in therapy sessions is another problem: patients tend to drop out of the rehabilitation process. To address this problem, serious games have been proposed. Serious games refer to the use of computer games whose main purpose is not pure entertainment. These games contribute to increased motivation in rehabilitation sessions (Rego et al. 2010). Our game has simple rules to minimize the learning period. Accurate detection of the patient's emotions is also important. Automatic facial expression recognition provides nonverbal information about the patient. Facial expression recognition is a topic widely discussed in computer vision, and many researchers have been interested in the analysis of the six basic facial expressions for different applications. We propose facial expression recognition as an interface between the serious game and the rehabilitation robot. For this interaction, we recognize three expressions different from the basic facial expressions.
In this chapter, we propose a novel comprehensive rehabilitation system that
considers an ankle rehabilitation of two degrees of freedom (DOF) (dorsiflexion–
plantarflexion and inversion–eversion), a serious game and an artificial vision
system for the detection of facial expressions of the patient, which improves the
rehabilitation process of the ankle. The movement of dorsiflexion–plantarflexion is
considered in this game. Figure 11.1 shows the integration scheme of the proposed
integral rehabilitation system.
The main contributions of this chapter are: an ankle rehabilitation system for young people, which allows 25° of dorsiflexion, 45° of plantarflexion, 25° of inversion, and 15° of eversion; a serious game with three levels of difficulty matched to the stages of rehabilitation, which helps the ankle regain strength and mobility while keeping the rehabilitation process entertaining; and, finally, a system for recognizing facial expressions different from the basic ones (motivated, unmotivated, and pain), which gives feedback to the serious game by indicating whether to increase or decrease the frequency of obstacles.
The rest of the chapter is organized as follows: Sect. 11.2 reviews the state of the art of ankle rehabilitation systems, serious games, and facial expression recognition. Section 11.3 explains the design of the proposed rehabilitation robot. In Sect. 11.4, the serious game design is presented; in Sect. 11.5, the proposed facial expression recognition system is described. Section 11.6 explains the methodology followed in the experimentation and analyzes the results obtained. Finally, we provide the conclusions in Sect. 11.7.
patient’s own home. With respect to manual therapy, the robotized one is more
reproducible, more repeatable, and less dependent to the therapist ability; less tiring
for the therapist and sometimes may be remotely performed at the patient home.
The results are a better quality of life and a reduction of health expenses (Perdereau
et al. 2011).
Mechatronic systems for rehabilitation are devices that seek to improve the recovery of a patient after some kind of illness or injury in any part of the body. Some proposed ankle rehabilitation machines are based on a parallel robot configuration, whose mechanical structure is a closed-chain mechanism in which the end effector is attached to the fixed base by at least two independent kinematic chains.
At Rutgers University, an ankle rehabilitation device called "The Rutgers Ankle" was proposed, see Fig. 11.2a. This device is a parallel robot with 6-DOF, even though the ankle only has 3-DOF, and it uses pneumatic actuators. It includes an interface where the patient interacts virtually through simulation games during the rehabilitation process, and it helps to improve balance and flexibility and to increase muscle strength. It has been used with patients to determine its effectiveness as a rehabilitation device, with the conclusion that it requires a large-capacity compressor to maintain pressure and to prevent overheating and load drops in the system (Girone et al. 1999, 2000; Deutsch et al. 2001). A new Rutgers Ankle
Fig. 11.2 Parallel robots for ankle rehabilitation: a 6-DOF (Cioi et al. 2011), b 4-DOF (Yoon and
Ryu 2005), c 2-DOF (Saglia et al. 2009), d 3-DOF (Liu et al. 2006), and e 1-DOF (Chou-Ching
et al. 2008)
device was used to train ankle strength and improve motor control for children with cerebral palsy (CP) (Cioi et al. 2011).
As shown in Fig. 11.2b, another parallel mechanism for ankle rehabilitation comprises two plates for supporting the foot and provides flexion–extension of the toes. The mechanism has 4-DOF and uses four pneumatic actuators, which provide dorsiflexion/plantarflexion and inversion/eversion movements for the ankle (Yoon and Ryu 2005). A 2-DOF parallel robot for ankle rehabilitation is shown in Fig. 11.2c (Saglia et al. 2009). Using a PD control, the parallel robot is operated redundantly to avoid singularities and thus provide dorsiflexion/plantarflexion and inversion/eversion movements. Saglia et al. (2010) present the development of an admittance-based assistive controller for this ankle rehabilitation system. An admittance control technique is used to perform patient-active exercises with and without motion assistance, and electromyography (EMG) signals are used to evaluate the patient's effort during training/exercising.
Another parallel robot, with 3-DOF (Liu et al. 2006), shown in Fig. 11.2d, has a link in the central part to connect the mobile base with the fixed base, giving greater rigidity to the structure and limiting its movement. The authors present simulation results of a virtual prototype in MSC ADAMS and also present the physical prototype.
A 1-DOF robot assistant for ankle rehabilitation, shown in Fig. 11.2e, provides dorsiflexion/plantarflexion movement to reduce spasticity, increase muscle tone, and improve motor control (Chou-Ching et al. 2008). The authors implemented a proportional-derivative (PD) fuzzy controller combined with a conventional integral control with feedback of the angular position and the torque exerted by the patient's foot on the robot base.
Fig. 11.3 Factors that influence facial expressions formation (Fasel and Luettin 2003)
units, where the automatic detection of the facial components is done through the regions of interest (ROI) of the face, which are mainly the eyes, mouth, nose, and eyebrows. Facial expression recognition systems can be developed by classifying expressions based on the facial action coding system and by direct or indirect interpretation of facial expressions.
The facial action coding system (FACS) is a coding system created by Paul Ekman and Wallace Friesen (Kumari et al. 2015) for describing facial movements. The FACS identifies the facial muscles that individually or in groups cause changes in facial behaviors. These changes in the face are called action units (AU); the FACS is thus made up of several such action units. This facial action coding system has become a standard for automatic facial expression recognition (FER).
FER has been an active line of research in recent decades, aimed at obtaining nonverbal information about the behavior of people. Generally, automatic facial expression systems are built by modeling the action units. However, expression analysis is still complex for current automatic recognition systems based on action units or specific points, since determining the internal state of a person through their facial muscle movements requires weighing many variables (Porras-Luraschi 2005).
Table 11.2 shows a summary of the state of the art in facial expression recognition. It can be seen that the most popular techniques for feature extraction are Gabor filters, local binary patterns, principal component analysis (PCA), independent component analysis (ICA), linear discriminant analysis (LDA), local gradient code (LGC), and local directional pattern (LDP).
The most popular classification techniques include, but are not limited to, support vector machines (SVM), nearest neighbor (NN), artificial neural networks (ANN), and decision trees. We can also observe that in recent years the use of the Kinect sensor for the acquisition of facial expression images has increased.
The parallel robot proposed for ankle rehabilitation consists of two linear guides actuated with DC geared motors for vertical movements, resulting in a 2-DOF mechanism. This robot has a movable platform where the foot-ankle is supported. Spherical and translational joints are used to link the movable base through bars. The strut plays an important role in the mechanical design, since it is positioned to counterbalance the foot-leg weight of the patient, and it is attached to the movable base by means of a spherical joint. The ankle rehabilitation robot provides dorsiflexion/plantarflexion and inversion/eversion movements, as can be observed in Fig. 11.4.
Table 11.2 Summary of automatic facial expressions recognition
Consider the schematic diagram of the parallel robot shown in Fig. 11.6, where r1,
r2, r3, and r4 are the distances of the mobile platform, driven link, mobile base, and
ground link, respectively. The kinematic model can be expressed in polar form by
means of Eq. (11.1).
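In complex polar form, a vector-loop closure consistent with the component equations in Eq. (11.2) below can be written as follows; the exact arrangement of the loop in Fig. 11.6 is an assumption here:
\[
r_4 + r_1 e^{j\theta_1} = r_2 + j\,r_3.
\tag{11.1}
\]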
Note that r₁ and r₄ are constant vectors, see Fig. 11.6. Using the Euler identity and solving simultaneously for the unknown displacements, namely those of the driven link and of the mobile base of the linear guide, yields
\[
r_2 = r_4 + r_1\cos\theta_1, \qquad r_3 = r_1\sin\theta_1.
\tag{11.2}
\]
Taking the time derivative of the vector loop in Eq. (11.2), using the Euler identity and solving for the velocities of the driven link and the mobile base, we obtain
\[
\dot{r}_2 = -r_1\dot{\theta}_1\sin\theta_1, \qquad \dot{r}_3 = r_1\dot{\theta}_1\cos\theta_1.
\tag{11.3}
\]
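A minimal numerical sketch of Eqs. (11.2) and (11.3) follows; the link lengths r₁ and r₄ are illustrative placeholders, not the dimensions of the built prototype.

    import numpy as np

    # Link lengths of the guide mechanism (illustrative values in meters).
    r1, r4 = 0.10, 0.05

    def positions(theta1):
        # Displacements of the driven link and mobile base, Eq. (11.2).
        return r4 + r1 * np.cos(theta1), r1 * np.sin(theta1)

    def velocities(theta1, dtheta1):
        # Velocities of the driven link and mobile base, Eq. (11.3).
        return (-r1 * dtheta1 * np.sin(theta1),
                r1 * dtheta1 * np.cos(theta1))

    print(positions(np.pi / 12))   # platform at 15 deg of dorsiflexion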
\[
J_e\ddot{\theta}_1 = F_1 d_1 - P
\tag{11.4}
\]
where J_e is the moment of inertia of the movable platform and d₁ is the distance between the strut and the Acme screw. The control force is F₁, and P is an unknown disturbance (e.g., ankle stiffness, viscous damping, friction forces).
The use of this PID-type controller yields the following closed-loop dynamics for the trajectory tracking error e_θ = θ₁ − θ_{1d}:
The controller gains a₀, a₁, and a₂ were selected such that the associated characteristic polynomial of the closed-loop system is a Hurwitz polynomial (a polynomial whose roots are located in the open left half of the complex plane), which guarantees that the error dynamics are globally asymptotically stable. The controller gains were set to coincide with those of the desired characteristic polynomial (s² + 2ζω_n s + ω_n²)(s + β) with ω_n = 10, ζ = 0.7, and β = 10.
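As a quick check of the pole-placement arithmetic, the sketch below expands the desired polynomial with numpy; mapping its coefficients onto a₀, a₁, and a₂ assumes the closed-loop error dynamics take the standard third-order form e⃛ + a₂ë + a₁ė + a₀e = 0, which is a plausible but unverified reading of the missing closed-loop equation.

    import numpy as np

    wn, zeta, beta = 10.0, 0.7, 10.0
    # Expand (s^2 + 2*zeta*wn*s + wn^2)(s + beta).
    desired = np.polymul([1.0, 2 * zeta * wn, wn**2], [1.0, beta])
    # Term-by-term matching with the assumed third-order error dynamics:
    a2, a1, a0 = desired[1], desired[2], desired[3]
    print(a2, a1, a0)  # 24.0 240.0 1000.0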
The desired position trajectory providing dorsiflexion/plantarflexion movements is given by the following Bézier polynomial:
\[
\theta_{1d}(t) = \theta_i + (\theta_f - \theta_i)\,\sigma(t, t_i, t_f)\,\mu_p^5,
\]
\[
\sigma(t, t_i, t_f) = c_1 - c_2\mu_p + c_3\mu_p^2 - c_4\mu_p^3 + c_5\mu_p^4 - c_6\mu_p^5,
\tag{11.8}
\]
\[
\mu_p = \frac{t - t_i}{t_f - t_i},
\]
where θ_i = θ_{1d}(t_i) and θ_f = θ_{1d}(t_f) are the initial and final desired positions, so that the rehabilitation movement starts from an initial position and reaches the final position with a smooth change, such that:
\[
\theta_{1d}(t) =
\begin{cases}
0 & 0 \le t < t_i\\
\sigma(t, t_i, t_f)\,\theta_f & t_i \le t < t_f\\
\theta_f & t \ge t_f
\end{cases}
\tag{11.9}
\]
The parameters of the Bézier polynomial θ_{1d}(t) in Eq. (11.8) are c₁ = 252, c₂ = 1050, c₃ = 1800, c₄ = 1575, c₅ = 700, and c₆ = 126.
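A short sketch of this trajectory generator follows. Taking the published coefficients with the alternating signs of the reconstruction of Eq. (11.8) above is an assumption, justified by the fact that it gives σ(1) = 1 so the trajectory ends exactly at θ_f.

    import numpy as np

    # Coefficients of Eq. (11.8); the alternating signs make sigma(1) = 1.
    C = (252.0, -1050.0, 1800.0, -1575.0, 700.0, -126.0)

    def theta_1d(t, t_i, t_f, theta_i, theta_f):
        # Smooth rest-to-rest trajectory of Eqs. (11.8)-(11.9).
        if t <= t_i:
            return theta_i
        if t >= t_f:
            return theta_f
        mu = (t - t_i) / (t_f - t_i)
        sigma = sum(c * mu**k for k, c in enumerate(C))
        return theta_i + (theta_f - theta_i) * sigma * mu**5

    # Dorsiflexion from 0 to 15 deg (pi/12 rad) in 5 s, as in Fig. 11.13.
    for t in (0.0, 1.25, 2.5, 3.75, 5.0):
        print(t, np.degrees(theta_1d(t, 0.0, 5.0, 0.0, np.pi / 12)))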
The disturbance is modeled locally as a third-degree time polynomial,
\[
P_m = a_3 t^3 + a_2 t^2 + a_1 t + a_0.
\tag{11.12}
\]
The robust GPI controller includes the following iterated-integral error terms:
\[
\begin{aligned}
\cdots\; &- k_3\int_0^t (\theta_1 - \theta_{1d})\,ds
- k_2\int_0^t\!\!\int_0^s (\theta_1 - \theta_{1d})\,dk\,ds\\
&- k_1\int_0^t\!\!\int_0^s\!\!\int_0^k (\theta_1 - \theta_{1d})\,dr\,dk\,ds
- k_0\int_0^t\!\!\int_0^s\!\!\int_0^k\!\!\int_0^r (\theta_1 - \theta_{1d})\,dq\,dr\,dk\,ds
\end{aligned}
\tag{11.13}
\]
where
\[
\hat{\dot{\theta}}_1 = \int_0^t u_x\,ds, \qquad
\dot{\theta}_1 = \hat{\dot{\theta}}_1 + \dot{\theta}_1(0).
\tag{11.14}
\]
Substituting the controller of Eq. (11.13) and the disturbance of Eq. (11.12) into Eq. (11.4), and considering the error e = θ₁ − θ_{1d} and its respective derivatives, we obtain:
The associated characteristic polynomial for the closed-loop system, Eq. (11.15), is given by:
\[
s^6 + k_5 s^5 + k_4 s^4 + k_3 s^3 + k_2 s^2 + k_1 s + k_0 = 0.
\tag{11.16}
\]
The parameters were selected to ensure that the error dynamics are globally asymptotically stable and were set to coincide with those of the desired characteristic polynomial (s² + 2ζω_n s + ω_n²)³ with ω_n = 10 and ζ = 0.7.
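For reference, the gains k₀ to k₅ implied by this choice follow directly from expanding the cube; a minimal numpy sketch:

    import numpy as np

    wn, zeta = 10.0, 0.7
    p2 = [1.0, 2 * zeta * wn, wn**2]            # s^2 + 14 s + 100
    poly = np.polymul(np.polymul(p2, p2), p2)   # (s^2 + 14 s + 100)^3
    # Coefficients k5 ... k0 of Eq. (11.16):
    print(poly[1:])  # [42. 888. 11144. 88800. 420000. 1000000.]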
The serious game needs to be interesting, entertaining, and interactive (Zhang et al. 2014) to motivate the patient. Following the recommendations given in (Michmizos and Krebs 2012), we designed a simple visual interface and simple controls so that the learning period is short, helping the patient to be autonomous and, if possible, to perform the therapy at home.
The game developed is aimed at young people, ranging from 12 to 20 years old, who need motivation not to abandon their rehabilitation process by making the therapy more attractive and fun. The tool used for its development was Unity3D, a flexible development platform that can be deployed on all major operating systems (Linux, Mac, Windows, etc.).
The serious game developed uses 1-DOF, corresponding to the dorsiflexion and plantarflexion movements. The game is intended for use in active and resistive rehabilitation (strengthening or resistance training). In these rehabilitation modes, the patient performs all the effort in the exercises. The parallel robot presents an opposing force to the active patient movement, which is gradually increased to improve muscular endurance, see Table 11.4. The game contains three levels of difficulty, and each level has its own game character. The force and angle the patient must apply on the mobile platform increase with the level, and the frequency of the obstacles that the game character must jump also increases with the game level, see Fig. 11.8.
Fig. 11.8 Levels of serious game corresponding to dorsiflexion and plantarflexion movements
The relations between the game character, the angle of the movable platform, and the force that the patient must apply at each game level are shown in Table 11.4.
The purpose of the first game level is for the patient to regain mobility in a small range of motion by applying a small force that does not cause discomfort or pain. In the second level, the patient must apply a greater force to strengthen the affected muscle. Finally, the third level helps both mobility and muscle strengthening; the range of motion and the force that the patient must apply on the mobile platform are increased. In Fig. 11.9, the relation between the angular displacement of the movable platform and the linear displacement of the mobile base of the linear guide is shown.
Blue and green tones are used to convey a feeling of well-being, while the obstacles are red to capture the players' attention, see Fig. 11.10.
Finally, our ankle rehabilitation system stores the aforementioned information about strength, number of movements performed, and therapy time, which allows supervision by the therapist. Additionally, to improve the interaction between the rehabilitation prototype and the serious game, it was proposed to
psychological activity such as pain. According to Fasel and Luettin (2003), this mental and physical information can be identified visually through facial expressions.
In this chapter, the steps considered in the automatic facial expression recognition system are (a) face acquisition, (b) detection and description of FACS action units, and (c) facial expression recognition. We used the Kinect sensor for the first and second steps.
The Kinect sensor was built to revolutionize the way people play video games and the entertainment experience; however, it now has many more uses. In recent years, Kinect has been used in other areas such as the recognition of facial expressions, helping to obtain nonverbal information about what a person is feeling.
Kinect provides the Microsoft Face Tracking SDK development tool for the detection, tracking, and description of the components of the face and their movements. This software enabled us to develop our application for tracking the face in real time. The library describes the face using different action units. We propose to recognize only three expressions: pain, motivated or concentrated, and unmotivated or distracted. For the facial expression recognition, we used six action units and two movements of the head, which are shown in Fig. 11.11.
Fig. 11.11 Action units and movements that Kinect sensor recognizes
The action unit AU0 refers to lifting the upper lip, AU1 to lowering the jaw, AU2 to stretching the lips, AU3 to lowering the eyebrows, AU4 to lowering the corners of the lips, and AU5 to raising the eyebrows. These action units are measured in a range of −1 to 1. The head movements that the Kinect sensor recognizes are Pitch (raising and lowering the head), Yaw (moving the head from left to right), and Roll (tilting the head), which are measured in a range of −90° to 90°. For the detection of these movements and action units, Candide-3 is used, which is a parameterized mask of 113 vertices and 168 surfaces (see Fig. 11.12).
The FACS proposed by Ekman is a complete description of the facial muscles. However, the set of facial actions considered by the Kinect sensor to carry out the recognition of facial expressions is smaller, so it was necessary to build a table of equivalences between both systems. It is important to mention that we only built this table of equivalences for those action units that are part of the three facial expressions considered in this chapter, see Table 11.5.
Generally, the recognition of the basic facial expressions is based on the description made by Ekman (through the FACS) for these emotions. However, we did not find in the literature a description of which action units are involved in the recognition of the three facial expressions considered in this chapter. We therefore analyzed the data and propose a description of the three facial expressions under study in terms of four action units and two head movements.
Table 11.6 Proposed description of the facial expressions pain, concentrate or motivated, distraction or unmotivated, and transitions

Pain (class 1): lip corner depressor (AU4) with values greater than 0.01; brow lowerer (AU3) and jaw drop (AU1) with values greater than 0.25; brow lowerer (AU3) and lip stretcher (AU2) with values greater than 0.25.
Distraction or unmotivated (class 2): Pitch with values greater than 35° or lower than −10°; Yaw with values greater than 35° or lower than −25°.
Concentrate or motivated (class 3): neutral, all action units between −0.20 and 0.20; happy, lip corner depressor (AU4) with values lower than −0.35.
Transitions (class 4): any observation not included in the previous classes.
The face is a dynamic object that changes continually through blinking, yawning, moving the head from one side to another, lowering the head, and so on. To cover these movements, we considered it necessary to add another class, named transitions. The proposed descriptions can be seen in Table 11.6.
The feature vector of each expression is thus composed of eight attributes, six action units and two head-pose angles from the Kinect sensor: x = {AU0, AU1, AU2, AU3, AU4, AU5, Pitch, Yaw}.
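The descriptions of Table 11.6 translate almost directly into code. The following is a hedged sketch in Python; the order of the checks and the tie-breaking between classes are assumptions, since the table lists class descriptions rather than an explicit decision procedure, and the happy rule uses AU4 (lip corner depressor) consistently with the AU definitions above.

    def classify_expression(au, pitch, yaw):
        # Rule-based classifier following Table 11.6; au maps "AU0".."AU5"
        # to values in [-1, 1], pitch and yaw are head angles in degrees.
        if (au["AU4"] > 0.01
                or (au["AU3"] > 0.25 and au["AU1"] > 0.25)
                or (au["AU3"] > 0.25 and au["AU2"] > 0.25)):
            return "pain"                      # class 1
        if pitch > 35 or pitch < -10 or yaw > 35 or yaw < -25:
            return "distraction/unmotivated"   # class 2
        if (all(-0.20 <= v <= 0.20 for v in au.values())
                or au["AU4"] < -0.35):         # happy: raised lip corners
            return "concentrate/motivated"     # class 3
        return "transition"                    # class 4

    neutral = {f"AU{i}": 0.0 for i in range(6)}
    print(classify_expression(neutral, pitch=5.0, yaw=0.0))
    # -> concentrate/motivated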
11.6 Experimentation
In this section, three tests are presented. First, we provide simulation results for the ankle rehabilitation robot. The second test concerns the facial expression recognition system, and the last test concerns the integration of all the proposed systems.
The ankle rehabilitation parallel robot can provide passive rehabilitation using the PID-type controller or the robust GPI controller. In passive exercises, no patient effort is required, since the parallel robot moves the ankle-foot in a smooth way. Table 11.7 shows the simulation parameters obtained from the virtual prototype, see Fig. 11.13.
In Fig. 11.13, the real and desired dorsiflexion responses and the control forces are shown, using the virtual prototype (see Fig. 11.5) and the PID-type controller of Eq. (11.6). It shows how a smooth movement from 0° to 15° (π/12 rad) is obtained using the Bézier polynomial of Eq. (11.8). The aim is for the physiotherapist to be able to set the angle interval and the desired time, so that the parallel robot provides the required speed based on the rehabilitation progress of the affected part. It can be seen that for the dorsiflexion movement the tracking error tends to zero and that the movement is performed smoothly, 15° in 5 s.
The simulation results corresponding to the dorsiflexion movement under a constant disturbance (P = 0.5 Nm) are shown in Fig. 11.14. It can be seen that the control force does not compensate for the disturbance.
Fig. 11.14 Dorsiflexion response using the PID-type controller for P = 0.5 Nm
Fig. 11.16 Dorsiflexion response using the robust GPI controller, P = 0.5 Nm
11.6.2 Database
A training database was created with 2303 instances and a database for tests with
306 instances. The images were captured using the Kinect sensor, at 30 frames in
one second. We recorded only three expressions for 45 people (15 women and 30
men) between 22 and 30 years old, from México. The facial expression images are
true color (24 bits) with measure 640 480 pixels. All images have a frontal-view
of a single person. The images were acquired under various lighting conditions, in a natural environment.
The Kinect sensor imposes some restrictions on building the database. First, Kinect needs a distance of 60 cm to 1.20 m between the volunteer and the Kinect camera; at a greater or lesser distance, the Candide-3 mask cannot be positioned on the face for tracking. Second, the Kinect sensor does not work correctly in the presence of accessories on the face, such as a mustache, hair over the face, glasses, or caps. Our volunteers did not have these accessories.
The goal of this test was to evaluate the facial expression recognition system. In this work, we compare three classification algorithms in two frameworks in order to obtain the best model for the recognition of facial expressions. Weka (Waikato 2017) is open-source software that contains a collection of machine learning algorithms for data mining tasks; Weka 3.8 was used, and for all algorithms we used the cross-validation option. The R language (R Foundation 2017) is a free software environment for statistical computing and graphics that compiles and runs on a wide variety of platforms. The algorithms evaluated were Naïve Bayes, C4.5, and support vector machines (SVMs) with an RBF kernel, using the frameworks' default values for the cost and gamma parameters.
The database was divided into two parts. For training, we used a database with captures of 30 people; the test phase was carried out with a database of 15 people different from those considered in the training set. Table 11.8 shows the performance achieved by each of the classification algorithms considered.
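The authors' experiments ran in Weka 3.8 and R; for readers working in Python, the following is an analogous sketch using scikit-learn. The DecisionTreeClassifier is CART rather than C4.5 (J48 in Weka), a close but not identical analogue, and the random arrays are placeholders standing in for the real 2303-instance database.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    # Placeholder data; columns are [AU0..AU5, Pitch, Yaw], labels 1-4.
    rng = np.random.default_rng(0)
    X = np.hstack([rng.uniform(-1, 1, (2303, 6)),      # action units
                   rng.uniform(-90, 90, (2303, 2))])   # pitch, yaw
    y = rng.integers(1, 5, 2303)

    clf = DecisionTreeClassifier()  # CART, an analogue of C4.5/J48
    print(cross_val_score(clf, X, y, cv=10).mean())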
The classifier with the lowest performance was the Naïve Bayes algorithm, which had been considered for its simplicity of implementation within the integrated rehabilitation system. The results obtained by Naïve Bayes on both platforms are similar.
The algorithm with the best performance, in both Weka and R, is C4.5, reaching 99% success in the training phase on both platforms, although its performance decreased in the test phase to 90% in the R language; in Weka, the algorithm maintained its good result in the test phase. The tree generated for this model is shown in Fig. 11.17. We decided to use this decision tree in the integrated rehabilitation system for the simplicity of its rules, its easy representation, its rapid response, and for achieving performance close to that of the SVM in the test phase in R.
Support vector machines obtain a good result, in agreement with the literature; however, it is interesting to note that the implementation in each platform affects the results obtained. The SVM achieves better performance in R. However, this model was not chosen because its implementation and integration are more difficult.
Analyzing the confusion matrices of the evaluated algorithms, we conclude that the two main problems in the recognition of facial expressions are the changes in lighting intensity and the large number of movements that the face presents continuously, such as blinking, yawning, moving the head from one side to another, or lowering the gaze.
For the development and integration of these three systems, different software and hardware tools were used: a laptop with the Windows 10 operating system, the Kinect sensor, the 1-DOF ankle rehabilitation setup, a force sensor, the Unity Engine with C#, Visual C++, the Kinect SDK, and an Arduino.
The main elements of the ankle rehabilitation integral system are shown in
Fig. 11.18. The system consists of a serious game, a Kinect sensor, ankle reha-
bilitation parallel robot, and the facial expression recognition system. The parallel
robot has a force sensor to acquire the force applied by the patient on the movable
platform.
The interaction between the patient and the serious game is performed through the Kinect sensor, which, by means of the facial expression recognition system, detects the patient's condition in order to increase or decrease the game level, that is, the frequency of appearance of the obstacles that the game character must jump. If the game level changes, the patient must exert force on the movable platform so that the game character jumps the obstacles. A force sensor placed on the mobile platform measures whether the force magnitude required by the current game level is reached. That is, if the required force is not exerted, the angle amplitude is not achieved and the character cannot jump the obstacle; on the contrary, if the force is applied, the angle amplitude is achieved and the game character jumps the obstacle.
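The game logic just described reduces to a threshold test per level. A minimal sketch follows; the numbers play the role of the force/angle requirements of Table 11.4 but are placeholders, not the published values.

    # Per-level force and angle requirements, standing in for Table 11.4
    # (the numbers are placeholders, not the published values).
    LEVELS = {1: {"force_N": 10.0, "angle_deg": 10.0},
              2: {"force_N": 20.0, "angle_deg": 15.0},
              3: {"force_N": 30.0, "angle_deg": 20.0}}

    def character_jumps(level, force_N, angle_deg):
        # The character jumps only when the patient reaches both the force
        # and the platform-angle amplitude demanded by the current level.
        req = LEVELS[level]
        return force_N >= req["force_N"] and angle_deg >= req["angle_deg"]

    print(character_jumps(2, force_N=22.0, angle_deg=16.0))  # True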
The integral system was evaluated with six healthy people, over two 10-min therapy sessions each. Each person used the rehabilitation robot and played the developed game while the automatic recognition of their facial expressions was performed. The results obtained show that the game entertains them while performing therapy. The interface of the game is simple, easy to understand, and quick to learn; however, it is necessary to add more challenges to hold the person's concentration over a longer period. The communication between the rehabilitation robot and the game is carried out in near real time, with responses of less than 10 s. The rehabilitation robot fulfills its objective by providing the movements necessary for the rehabilitation of the ankle. Finally, the facial expression recognition system extracts nonverbal information from the patient's face, allowing the levels of the game to be modified depending on the patient's emotional state.
11.7 Conclusions
References
Agas, A., Daitol, A., Shah, U., Fraser, L., Abbruzzese, K., Karunakaran, K., & Foulds, R. (2015).
3-DOF admittance control robotic arm with a 3D virtual game for facilitated training of the
hemiparetic hand. In 41st Annual Northeast Biomedical Engineering Conference, (NEBEC)
(pp. 1–2).
Aly, S., Trubanova, A., Abbott, L., White, S., & Youssef, A. (2015). VT-KFER: A kinect-based
RGBD + Time dataset for spontaneous and non-spontaneous facial expression recognition. In
International Conference of Biometrics, Miami (pp. 1–8).
Aly, S., Youssef, A., & Abbott, L. (2014). Adaptive feature selection and data pruning for 3d facial
expression recognition using the kinect. In 2014 IEEE International Conference on Image
Processing (ICIP), Paris, France (pp. 1361–1365).
Arenas, Á., Cotacio, B., Isaza, E., Garcia, J., Morales, J., & Marín, J. (2012). Sistema de reconocimiento de rostros en 3D usando Kinect.
Blanco-Ortega, A., Beltrán, F., Silva, G., & Oliver, M. (2010). Active vibration control of a
rotor-bearing system based on dynamic stiffness. Revista Facultad de Ingeniería Universidad
de Antioquia, 55, 125–133.
Blanco-Ortega, A., Quintero-Mármol, E., Vela-Valdés, G., López-López, G., & Azcaray-Rivera, H.
(2012). Control of a virtual prototype for ankle rehabilitation. In Eighth International
Conference on Intelligent Environments, (IE’12), Guanajuato, Mexico (pp. 80–86).
Burdea, G., Cioi, D., Kale, A., Janes, W. E., Ross, S. A., & Engsberg, J. R. (2013). Robotics and
gaming to improve ankle strength, motor control, and function in children with cerebral palsy
—A case study series. IEEE Transactions on Neural Systems and Rehabilitation Engineering,
21(2), 165–173.
Chou-Ching, K., Ming-Shaung, J., Shu-Min, C., & Bo-Wei, P. (2008). A specialized robot for ankle
rehabilitation and evaluation. Journal of Medical and Biological Engineering, 28(2), 79–86.
Cioi, D., Kale, A., Burdea, G., Engsberg, J., Janes, W., & Ross, S. (2011). Ankle control and
strength training for children with cerebral palsy using the Rutgers Ankle CP. In 2011 IEEE
International Conference on Rehabilitation Robotics, Zurich, Switzerland (pp. 1–6).
Deutsch, J., Latonio, J., Burdea, G., & Boian, R. (2001). Rehabilitation of musculoskeletal injuries
using the rutgers ankle haptic interface: Three case reports. In Eurohaptics Conference,
Birmingham, UK (pp. 1–4).
Ekman, P., & Friesen, W. (1978). Facial action coding system: A technique for the measurement
of facial movement. Palo Alto: Consulting Psychologists Press.
Farjadian, A., Nabian, M., Holden, M., & Mavroidis, C. (2014). Development of 2-DOF ankle
rehabilitation system. In 40th Annual Northeast Bioengineering Conference, (NEBEC),
Boston, MA (pp. 1–2).
Fasel, B., & Luettin, J. (2003). Automatic facial expression analysis: A survey. Pattern
Recognition, 36(1), 259–275.
Fliess, M., & Join, C. (2008). Commande sans modèle et commande à modèle restreint. e-STA,
5(4), 1–23.
R Foundation. (2017). The R project for statistical computing. Retrieved from https://www.r-project.org/.
Franco-González, A., Márquez, R., & Sira-Ramírez, H. (2007). On the generalized-
proportional-integral sliding mode control of the “boost-boost” converter. In 4th
International Conference on Electrical and Electronics Engineering, Mexico City (pp. 209–
212).
Garcia, J., & Navarro, K. (2014). The mobile RehAppTM: An AR-based mobile game for ankle sprain rehabilitation. In 2014 IEEE 3rd International Conference on Serious Games and Applications for Health (SeGAH), Rio de Janeiro (pp. 1–6).
Girone, M., Burdea, G., & Bouzit, M. (1999). The rutgers ankle orthopedic rehabilitation interface.
In Proceedings of the ASME Haptics Symposium, DSC 67 (pp. 305–312).
Girone, M., Buerdea, G., Bouzit, M., Popescu, V., & Deutsch, J. (2000). Orthopedic rehabilitation
using the rutgers ankle interface. In Proceedings of Medicine Meets Virtual Reality, IOS Press
(pp. 89–95).
Goncalves, A., Dos Santos, W., Consoni, L., & Siqueira, A. (2014). Serious games for assessment and rehabilitation of ankle movements. In 2014 IEEE 3rd International Conference on Serious Games and Applications for Health (SeGAH), Rio de Janeiro (pp. 1–6).
Gupta, S., Verma, K., & Perveen, N. (2012). Facial expression recognition system using facial
characteristic points and ID3. International Journal of Computer & Communication
Technology (IJCCT), 3(1), 45–49.
Hsu, T. (1997). Mechatronics. An overview. IEEE Transactions on Components, Packaging, and
Manufacturing Technology: Part C, 20(1), 4–7.
Ijjina, E., & Mohan, C. (2014). Facial expression recognition using kinect depth sensor and
convolutional neural networks. In 13th International Conference in Machine Learning and
Applications (ICMLA), Detroit, MI (pp. 392–396).
Jaume-i-Capó, A., & Samčović, A. (2014). Vision-based interaction as an input of serious game for motor rehabilitation. In 22nd Telecommunications Forum Telfor (TELFOR), Belgrade (pp. 854–857).
Kakarla, M., & Reddy, G. (2014). A real time facial emotion recognition using depth sensor and
interfacing with second life based virtual 3D avatar. In International Conference on Recent
Advances and Innovations in Engineering (ICRAIE-2014), Jaipur (pp. 1–7).
Kumari, J., Rajesh, R., & Pooja, K. (2015). Facial expression recognition: A survey. Procedia
Computer Science, 58, 486–491.
Li, D., Sun, C., Hu, F., Zang, D., Wang, L., & Zhang, M. (2013). Real-time performance-driven
facial animation with 3ds max and kinect. In 3rd International Conference on Consumer
Electronics, Communications and Networks, CECNet, Xianning, China (pp. 473–476).
Liu, G., Gao, J., Yue, H., Zhang, X., & Lu, G. (2006). Design and kinematics simulation of
parallel robots for ankle rehabilitation. In International Conference on Mechatronics and
Automation, Luoyang, Henan (pp. 1109–1113).
Mao, Q., Pan, X., Zhan, Y., & Shen, X. (2015). Using Kinect for real-time emotion recognition via facial expressions. Frontiers of Information Technology & Electronic Engineering, 16(4), 272–282.
Menezes, R., Batista, P., Ramos, A., & Medeiros, A. (2014). Development of a complete game based system for physical therapy with kinect. In IEEE 3rd International Conference on Serious Games and Applications for Health (SeGAH 2014), Rio de Janeiro (pp. 1–6).
Michel, P., & El Kaliouby, R. (2003). Real time facial expression recognition in video using
support vector machines. In 5th ACM International Conference on Multimodal Interaction—
ICMI ‘03, Vancouver, British Columbia (pp. 258–264).
Michmizos, K., & Krebs, H. (2012). Serious games for the pediatric anklebot. In 4th IEEE RAS &
EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob),
Rome, Italy (pp. 1710–1714).
O'Driscoll, S., & Giori, N. (2000). Continuous passive motion (CPM): Theory and principles of clinical application. Journal of Rehabilitation Research and Development, 37(2), 179–188.
Omelina, L., Jansen, B., Bonnechère, B., Jan, S. V., & Cornelis, J. (2012). Serious games for
physical rehabilitation: Designing highly configurable and adaptable games. In Proceeding of
9th International Conference Disability, Virtual Reality & Associated Technologies, Laval,
France (pp. 195–201).
Pasqual, T., Caurin, G., & Siqueira, A. (2016). Serious game development for ankle rehabilitation
aiming at user experience. In 6th IEEE International Conference on Biomedical Robotics and
Biomechatronics (BioRob), Singapore (pp. 1015–1020).
Perdereau, V., Legnani, G., Pasqui, V., Sardini, E., & Visioli, A. (2011). International master program on mechatronic systems for rehabilitation. J3eA: Journal sur l'enseignement des sciences et technologies de l'information et des systèmes, 10.
Porras-Luraschi, J. (2005). Sistema de reconocimiento de expresiones faciales aplicado a la
interacción humano-computadora usando redes neuronales y flujo óptico. UNAM.
Rego, P., Moreira, P. M., & Reis, L. P. (2010). Serious games for rehabilitation: A survey and a
classification towards a taxonomy. In 5th Iberian Conference on Information Systems and
Technologies, Santiago de Compostela (pp. 1–6).
Saglia, J., Tsagarakis, N., Dai, J., & Caldwell, D. (2009). A high performance 2-DOF
over-actuated parallel mechanism for ankle rehabilitation. In IEEE International Conference on
Robotics and Automation, (ICRA 2009), Kobe, Japan (pp. 2180–2186).
Saglia, J., Tsagarakis, N., Dai, J., & Caldwell, D. (2010). Assessment of the assistive performance
of an ankle exerciser using electromyographic signals. In 2010 Annual International
Conference of the IEEE Engineering in Medicine and Biology, Buenos Aires, Argentina
(pp. 5854–5858).
Seddik, B., Maamatou, H., Gazzah, S., Chateau, T., & Ben Amara, N. E. (2013). Unsupervised
facial expressions recognition and avatar reconstruction from Kinect. In 10th International
Multi-Conferences on Systems, Signals & Devices, (SSD13), Hammamet, Tunisia (pp. 1–6).
Shah, N., Amirabdollahian, F., & Basteris, A. (2014). Designing motivational games for stroke
rehabilitation. In 2014 7th International Conference on Human System Interactions (HSI),
Costa da Caparica (pp. 166–171).
Sira-Ramírez, H., Beltrán, F., & Blanco, A. (2008). A generalized proportional integral output
feedback controller for the robust perturbation rejection in a mechanical system. eSTA Sciences
et Technologies de l’Automotive, 5, 24–32.
Stocchi, L. (2014). 3D facial expressions recognition using the microsoft kinect. In 18th
International Conference on Image Processing (ICIP), Dublin, Ireland (pp. 773–776).
Surbhi, V. (2012). ROI segmentation for feature extraction from human facial images.
International Journal of Research in Computer Science, 61–64.
Tannous, H., Dao, T., Istrate, D., & Tho, M. (2015). Serious game for functional rehabilitation.
Advances in Biomedical Engineering (ICABME), Beirut (pp. 242–245).
Tsalakanidou, F., & Malassiotis, S. (2010). Real-time 2D + 3D facial action and expression
recognition. Pattern Recognition, 43(5), 1763–1775.
University of Waikato. (2017). Weka 3: Data mining software in Java. Retrieved from https://www.cs.waikato.ac.nz/ml/weka/.
Yoon, J., & Ryu, J. (2005). A novel reconfigurable ankle/foot rehabilitation robot. In IEEE
International Conference on Robotics and Automation, (ICRA 2005), Barcelona, Spain
(pp. 2290–2295).
Zhang, M., Zhu, G., Nandakumar, A., Gong, S., & Xie, S. (2014). A virtual-reality tracking game
for use in robot-assisted ankle rehabilitation. In 2014 IEEE/ASME 10th International
Conference on Mechatronic and Embedded Systems and Applications (MESA), Senigallia
(pp. 1–4).
Zoch, C., Fialka-Moser, V., & Quittan, M. (2003). Rehabilitation of ligamentous ankle injuries: A
review of recent studies. British Journal of Sports Medicine, 37(4), 291–295.
Chapter 12
Cognitive Robotics: The New Challenges
in Artificial Intelligence
Keywords: Cognitive robotics · Artificial intelligence · Embodied cognition · Embodied robotics · Internal models
12.1 Introduction
The main aim of this chapter is to present a different type of robotics research,
namely cognitive robotics. The development of machines for automation has
always been inspired by the imitation of human agents. Robot arms, in their attempt
to perform tasks traditionally performed by human operators, have a close resem-
blance to their human counterparts. Moreover, after many years of successful
development, industrial arms are capable of performing a wide range of tasks with
very high precision, with minimum wear and no boredom.
However, there still exists the issue of imitation, as this is not addressed in depth
in all these industrial developments. To a certain extent, the concept of intelligence
for industrial robots is, if not irrelevant, very limited. It is in this quest that robotics
and artificial intelligence come together. Research in cognitive robotics aims at
making use of artificial agents to model, simulate, and understand cognitive
processes.
This chapter is organized as follows. Section 12.2 provides a short history of industrial robotics, highlighting its limitations in the exploration of human-level intelligence and cognitive processes. It is followed by a quick review of the problems in artificial intelligence, its shortcomings, and its changes of paradigm.
Section 12.3 then presents the new field of embodied cognition and robotics and its attempt to understand cognition by studying low-level cognitive processes.
Section 12.4 presents two significant studies that try to model very specific and
basic human cognitive abilities. Finally, Sect. 12.5 presents the conclusions of this
chapter.
In one of the first patents registered, the inventor Devol (1967) put forward some of
the first ideas for the automation of machinery and manufacturing processes. The
first manufacturing robot was sold to the Ford Company, which used it to tend a
die-casting machine (Mortimer and Rooks 1987). The company was UNIMATION,
creators of the Programmable Universal Machine for Assembly (PUMA) robot
developed in 1978.
Since then, robot companies in this field have come out with a variety of automated machines to fulfill manufacturing tasks. Nowadays, robots are sophisticated apparatus that can operate in different environments, performing tasks deemed too tiring or too dangerous for human operators, such as painting (Graca et al. 2016; Li et al. 2016), assembling heavy loads (Chuy et al. 2017), or soldering (Draghiciu et al. 2017). They can also perform highly precise work in fields such as the biosciences (Wu et al. 2016; Zhuang et al. 2018) or medicine (Brown et al. 2017; Rosen et al. 2017).
During all this time and development, a parallel quest has always been present
and relates to a universal and far older curiosity: Can we build a machine that acts
and thinks as a human being? This has been a central question in what became
known as artificial intelligence (AI). The definition of AI has been a topic of debate
and analysis since the creation of the field. In general, it is possible to define AI as
the field devoted to building tools or agents capable of displaying intelligent
behaviors. In 1997, IBM designed Deep Blue, a computer chess system that defeated Garry Kasparov, the then world chess champion. Later, in 2011, Watson, a "question answering machine," defeated the two human champions on the quiz show Jeopardy!, a game that requires answering complex natural language questions very quickly. Intuitively, playing chess or a quiz game are activities that require intelligence. However, do these computer programs show genuinely intelligent behavior? A deep philosophical question arises in terms of defining what intelligent behavior is and, even more problematic, what it means to have a mind and how the mind is capable of performing intelligent behavior in different contexts and circumstances.
Turing (1950), one of the most influential computer science theoreticians, asked the question "Can machines think?" He proposed what became known as the "Turing Test" and claimed that a computer able to pass this test could be considered an intelligent machine. The Turing Test can be described in terms of an imitation game played by three people: a man, a woman, and an interrogator. The aim of the game is for the interrogator to identify which of the two players is the man and which is the woman. The role of the man in the game is to confuse the interrogator and cause a wrong identification; the role of the woman, on the other hand, is to help the interrogator in the identification task. During the game, the interrogator stays in a different room and is allowed to pose questions about the identity of the two players. The answers of the two players should be typewritten so that the voice does not help the identification. Now, it is possible to ask the question:
“What will happen when a machine takes the part of A (the man) in this game?” Will the
interrogator decide wrongly as often when the game is played like this as he does when the
game is played between a man and a woman? These questions replace our original, “Can
machines think?". (Turing 1950, p. 443)
In response, Searle (1990) put forward the well-known "Chinese Room" thought
experiment, in which he imagines himself locked in a room, manipulating batches
of Chinese symbols according to rules written in English, despite understanding
no Chinese. In this analogy, the people who wrote the rules are the
"programmers," Searle is the "computer," the boxes with symbols are the
"database," the bunches of symbols that are handed to Searle are the "questions,"
and the bunches handed out are the "answers."
The Chinese Room argument claims that, despite the fact that Searle does not
understand a word of Chinese, the outputs are indistinguishable from those of a
native Chinese speaker. Although this "computer program" actually passes the
Turing Test, this does not mean that the computer understands the meaning of the
symbols. Just manipulating the Chinese symbols is not enough to guarantee cog-
nition, perception, understanding, and thinking (Searle 1990). Human minds have
mental contents (semantics), and manipulating the symbols (syntax) is not sufficient
for having semantics. Computers would have semantics and not just syntax if their
inputs and outputs were put in an appropriate causal relation to the rest of the world
(Searle 1990, p. 30).
The Chinese Room was put forward mainly as a response to the work of Schank
and Abelson (1977) about “conceptual representation” which claims that computer
programs understand the meaning of the words and sentences they are programmed
to respond to. However, the main argument also applies to Winograd’s SHRDLU
(Winograd 1973), Weizenbaum’s ELIZA (Weizenbaum 1965), and of course the
Turing Test (Turing 1950).
Harnad (1989) defended Searle’s argument arguing that symbol meaning is
grounded in perceptuomotor categories. Specifically, Harnad (1990) aimed to
answer how symbol meaning is to be grounded in something other than just more
meaningless symbols. This came to be known as the symbol grounding problem.
The standard reply to the symbol grounding problem is that the meaning of the
symbols comes from connecting the system with the world (Fodor 1978). However,
this assumption underestimates the difficulty of selecting the proper objects, events,
and states that symbols refer to (Harnad 1990). He proposes as a possible solution a
hybrid nonsymbolic/symbolic system. The nonsymbolic part of the hybrid system
refers to the ability to discriminate inputs which depends on the “iconic represen-
tations” that are analogs of the proximal sensory projections of distal objects and
events. The symbolic part of the system refers to the ability to identify an input
reducing the icons to those “invariant features” that will reliably distinguish a
member of a category. The output of the category-specific feature detector is the
"category representation." With this hybrid system, the match between words and
the world is grounded in perceptual categories or "categorical representations,"
which are based on the invariant features of the "iconic representations." How
does the hybrid system find the invariant features of the sensory projection
that make it possible to categorize and identify objects correctly? The names of
the elementary symbols (categorical representations) are connected to
nonsymbolic representations (iconic representations) via connectionist networks
that extract the invariant features of their analog sensory projections, so that
it is possible to select the objects to which they refer.
The main focus of AI research turned toward the physical grounding hypothesis,
which states that to build a system with intelligent behavior it is necessary to have
its representations grounded in the physical world. However, Brooks (1990)
suggested that when this approach is implemented, the need for traditional symbolic
representations fades entirely.
The main assumption is that the world is the best model of itself as it contains
every detail that has to be known. Likewise, an agent must respond continuously to
its inputs using its perception of the world instead of a world model (Brooks
1991a). This is the key element of situatedness. Therefore, in this framework,
intelligence is determined by the total behavior of the system and how that behavior
emerges in relation to the environment. Now, the line between intelligence and
environmental interaction disappears.
The idea that intelligence can be conceived as non-representational (Brooks
1991b) is often criticized. However, what Brooks suggested relies on the idea that
there are representations, but they are partial models of the world. These repre-
sentations extract only those aspects of the world that are relevant within the
context and the specific task. Nevertheless, he highlighted the idea that if the world
is the best model of itself, it is necessary to sense it appropriately, so that building a
system that is connected to the world via a set of sensors and actuators turns out to
be fundamental. In this framework, the agent has a body, sensors, and a motor
system, so that it is embodied (Brooks 1991a). Two main reasons make the
embodiment of an intelligent system critical. First, only an embodied intelligent
agent can deal with the real world. Second, only with a physical grounding
framework can any internal symbolic system give meaning to the processing going
on within the system.
Pfeifer and Bongard (2007) proposed that only agents that are embodied, whose
behavior can be observed as they interact with the environment, are intelligent.
Having a body is a prerequisite for any kind of intelligence, and it is necessary for
cognition. The embodied cognition framework requires working with real-world
physical systems, like robots. Autonomous robots, which are independent of human
control, have to be situated by being able to learn about the world through their
sensory system during interaction. The ideas around the concept of embodiment
produced a major shift in research in AI, which is addressed in the next section.
In the last decades, a new paradigm has started to surface in the sciences concerned
with the study of the brain. In this, the body and the environment take an important
role in the shaping of the mind. Known globally as embodied cognition, this new
way of thinking puts forward the idea that agents are entities that have a body and
interact with their environment as they develop (Wilson 2002). It is through this
interaction that knowledge arises and forms the basis of cognitive abilities.
In this chapter, two telling examples are presented where an artificial agent
acquires specific cognitive abilities through the interaction with the world. In these
examples, we explore the concept of affordances, a term coined by psychologist
Gibson in his seminal (1979) book. According to this ecological approach to
perception, affordances are the possibilities for action that the environment
offers an agent.
The faculties, capabilities, and skills to dynamically interact with the world, which
as adult humans we possess, emerge through a long process of tuning and
rehearsing of sensorimotor schemes. This idea has taken a central role in research in
the cognitive sciences.
A very important example is the acquisition of the sensorimotor schemes that
code for the capabilities and reach of our body. This set of schemes provides us
with many cognitive tools, among them the knowledge and coding of our body map,
essential for navigating the environment.
Distance perception has long been studied in the cognitive sciences and is still
a complex problem (Turvey 2004). According to some research hypotheses, the
perception of distance is not a geometrical process but an association of
multimodal (visual and tactile) sensory information (Braund 2007), influenced by
the body, self-motion (Proffitt 2006), and the environment (Lappin et al. 2006).
To the best of our knowledge, modeling distance perception without a geo-
metrical framework is an issue that has not been addressed in cognitive robotics.
Indeed, the study of spatial cognition in robotics has a long history, and several
different techniques have been proposed (see Thrun and Leonard (2008) for an
exhaustive review). In the frame of cognitive robotics, the work presented here
differs from the brain-anatomical approaches (Tolman 1948; Arleo et al. 2004) in
that here the work models basic cognitive functions through internal models (Miall
and Wolpert 1996; Wolpert et al. 2001), where the sensorimotor cycle is considered
the fundamental unit of cognition and from which, it is hypothesized, the
modeling of cognitive processes should start (Lungarella et al. 2003).
Instead of giving the robot the explicit means to model the external metric or
topology of the free space or to exploit geometrical information from stereo vision
techniques (Moons 1998), as in Experiment 1, this experiment goes a step further.
Here, the aim is to validate internal models that associate sensorimotor relationships
which code distance. The distance affordance is obtained by means of the prediction
and reenaction of visuomotor cycles.
Spatial cognitive abilities using internal models and visual sensors have been
proposed in the past for the recognition of particular spatial features and for
localization purposes (Hoffmann and Möller 2004; Möller and Schenck 2008). These works
used refined data structures from preprocessed visual information together with
forward models. The work presented here differs from these approaches in that here
foveated images are used as the sensory situation together with current motor
commands as input to a forward model that predicts the next images and a proximal
tactile sensory situation. In this way, a notion of distance is coded in robot
motor coordinates (i.e., a distance perception in the robot's own body-scale
units), representing the spatial relationships between the artificial agent and
the objects in the environment.
We provide details of the implemented cognitive process in the next section.
The artificial agent used is a Pioneer 3-DX, shown in Fig. 12.1. It has two
motorized wheels, a frontal ring of sonars, and a stereoscopic camera. The robot
can execute forward and backward movements as well as turns to the right or
left, with velocities controlled independently for each wheel.
The range sensors are eight SensComp Series 600 sonars with a sensing range of
0.15–5 m and a coverage angle of 15°. The sonars are arranged in a ring around
the sides and front of the robot, with sonar number 1 pointing to the left of
the robot, sonar number 8 to the right, and the remaining six distributed evenly
in between.
The stereoscopic camera is a STOC-9CM from Videre Design with a resolution of
640 × 480 pixels and a baseline of 9 cm. It has two f/1.4, 6.0 mm lenses, which
give it a 57.3° horizontal field of view (HFOV). The stereoscopic pair is
arranged as two digital cameras placed at the same height, with parallel optical
axes, separated by a known distance, and whose intrinsic parameters are
irrelevant in our framework. Both cameras provide a monochromatic image
(320 × 240) of the scene with values in the range [0, 255]. An example of an
image pair acquired in the environment designed for the experiments is shown in
Fig. 12.2.
Fig. 12.1 The Pioneer 3-DX robot, with the stereo camera and the sonars indicated
Fig. 12.2 Stereo pair images from the left (a) and right (b) cameras
In stereo vision, the basic principle is that, having two simultaneous images from a
scene, a matching is made between features from one image and features from the
other. The disparity found between the features from the images is a relative
measure of the distance these have to the camera pair or any other pre-defined
reference frame. The disparity d of a point X in 3-D, with coordinates
(x_l, y_l) and (x_r, y_r) in the left and right camera projections,
respectively, is found
by computing d = xl − xr. As can be seen from Fig. 12.3, d comes from Eq. 12.1
and is obtained by doing basic geometric correspondence when all other parameters
are known, which is the case for a calibrated pair (Hartley and Zisserman 2003).
\frac{b - (x_l - x_r)}{Z - f} = \frac{b}{Z} \qquad (12.1)
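Solving Eq. 12.1 for Z yields the familiar depth-from-disparity relation
Z = bf/d, where b is the baseline, f the focal length, and d = x_l − x_r the
disparity. A minimal sketch in Python; the baseline matches the 9 cm of the
STOC-9CM, while the focal length in pixels is an illustrative assumption, not a
value given in the text:

```python
# Hedged sketch: depth from disparity via Z = b*f/d (Eq. 12.1 solved for Z).
B = 0.09    # baseline in meters (9 cm, as for the STOC-9CM pair)
F = 770.0   # focal length in pixels -- illustrative value, not from the chapter

def depth_from_disparity(d: float) -> float:
    """Depth (in meters) of a point with disparity d = xl - xr (in pixels)."""
    if d <= 0:
        raise ValueError("disparity must be positive for points in front of the cameras")
    return B * F / d
```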
A disparity map is formed by the set of points that overlap in the two images.
Autonomous navigation strategies based on stereo vision have been used with
different rates of success (Collins and Kornhauser 2006; Murarka and Kuipers
2009). In particular, the work of Hasan et al. (2009) presents a system that is
capable of navigating its way among obstacles. However, the decisions the system
takes are strictly based on the values of the disparity map.
All of these works form a background on the use of traditional computer vision
methods for autonomous navigation. However, the concern of this work is the use
of cognitive models and their applicability; the attempt made in this exercise
is to understand their relevance in the search for artificial intelligence. In
particular, the aim is to provide the robot with the cognitive tools that allow
it to anticipate collisions by means of reenacted visuomotor cycles predicting
proximal tactile situations, without actually moving. To test the proposed
model, a look-for-an-exit experimental task is constructed.
The model presented here learns a basic body map using a forward model. The
model takes as input sensory information and a constant motor command and
predicts the next sensory situation. The input sensory data are formed by visual
information coming from the disparity map of two images (Dt). The output is
formed by the predicted visual information (Dt+1) and simulated tactile stimuli
(Bt+1) coded from threshold-capped sonar values (Fig. 12.4).
Visual information and tactile data form a representation of the obstacles in the
arena of the robot. Making use of this representation, the agent is capable of
performing predictions about the sensory changes in the environment. The motor
command in Fig. 12.4 can be an executed command or a planned action that is not
necessarily executed. Planned actions allow long-term predictions, as the output
of the forward model can be used as input to a next forward model.
Tactile information is obtained by thresholding the values of the sonars. Given
the size of the robot and the characteristics of the visual data (see below), a
value of 440 or less is defined as a collision, meaning an obstacle is 44 cm or
closer to the robot.
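As a quick illustration of this thresholding, assuming raw sonar readings in
millimeters (consistent with the 440 → 44 cm figure above):

```python
def tactile_from_sonars(sonar_mm, threshold=440):
    """Binary collision signal from raw sonar readings (assumed in mm):
    1 if an obstacle is within 44 cm, 0 otherwise."""
    return [1 if reading <= threshold else 0 for reading in sonar_mm]
```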
The necessary steps to acquire the visual data from the two images coming from
the stereo camera can be seen in Fig. 12.5 and can be described as follows:
Image Acquisition Using the calibrated STOC-9CM camera pair, two simultaneous
320 × 240 images of the scene are obtained. These images are rectified to correct
for the distortion caused by the lenses and the sensor geometries.
Disparity Map The disparity map for two images is based on the difference in
pixels between the projection of the same point in the left and the right images. The
matching of each point in one of the images with its pair in the other image is done
using the sum of absolute differences (SAD):
\phi_{SAD}(x, y, d) = \sum_{j=1}^{m} \sum_{i=1}^{n} \left| V_r(i, j) - V_l(i, j) \right| \qquad (12.2)
where V_l(i, j) is the pixel (i, j) of the n × m window V_l centered at the
(x, y) pixel of the left image I_l; likewise, V_r(i, j) is the corresponding
pixel of a window on the right image centered at I_r(x + d, y). The central
pixels of the two most similar windows are considered to represent the same
point in the three-dimensional world.
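A compact sketch of SAD-based window matching along the lines of Eq. 12.2, using
NumPy; the window size and the assumption that all indices stay inside the
images are illustrative:

```python
import numpy as np

def sad(left, right, x, y, d, half=3):
    """Sum of absolute differences (Eq. 12.2) between a window centered at
    (x, y) in the left image and one centered at (x + d, y) in the right."""
    wl = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.int32)
    wr = right[y - half:y + half + 1, x + d - half:x + d + half + 1].astype(np.int32)
    return int(np.abs(wl - wr).sum())

def best_disparity(left, right, x, y, d_range=64, half=3):
    """Candidate disparity minimizing the SAD cost, searched over 64 values
    as in the text (indices assumed valid for this sketch)."""
    costs = [sad(left, right, x, y, d, half) for d in range(d_range)]
    return int(np.argmin(costs))
```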
Several parameters define the maximum and minimum number of disparity
values that can be calculated from a pair of images, and this number directly relates
to the distance that will be coded in the disparity map. For this work, a good
and safe compromise, considering the size of the robot and the parameters of the
stereoscopic pair, was to search for 64 disparity values. For our system, this
means distances will be coded in the range between 34 and 215 cm (the interested
reader can refer to Konolige (1997)).
Region of Interest (ROI) From the disparity map, a 228 × 6 ROI is extracted.
The upper limit of the ROI is located at line 152 of the image, which, in a scene
without obstacles, corresponds to 2.15 m from the robot; this is the maximum
distance for which a disparity value can be calculated. In the horizontal
direction, 228 pixels are taken, as they are the effectively processed area of
the image given the size of the masks used for calculating the disparity.
Maximum Disparity Vector (MDV) This vector is formed by taking the maximum
disparity of each column of the ROI and represents the closest obstacles across
the 57.3° visible field of the camera.
Low Pass Filter Finally, a Gaussian filter with a five-pixel mask is applied to the
MDV. This is done primarily to facilitate the learning of the forward model.
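Putting the ROI, MDV, and filtering steps together, a sketch might look as
follows; the exact crop offsets and the Gaussian width are assumptions
consistent with the figures given in the text (line 152, 228 columns, a
five-pixel mask):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def maximum_disparity_vector(disparity_map):
    """ROI -> MDV -> low-pass sketch. Assumes rows 152-157 of the disparity
    map form the 228 x 6 ROI; offsets are illustrative."""
    roi = disparity_map[152:158, :228]       # 228 x 6 region of interest
    mdv = roi.max(axis=0).astype(float)      # closest obstacle per column
    # sigma=1, truncate=2 gives a 5-point kernel (the "five-pixel mask")
    return gaussian_filter1d(mdv, sigma=1.0, truncate=2.0)
```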
To obtain the forward model, we used 57 MLPs trained with resilient
back-propagation (Riedmiller and Braun 1993). The input is the 228-value MDV for
time t, and the output is the MDV for time t + 1 together with the bumper state
Bt+1; each of these two vectors has 228 values. Each of the 57 MLPs takes as
input a 14-value window from the 228 values of the MDV and predicts the central
four values of the next time step (Lara et al. 2007). The 57.3° of the MDV is
covered by the two front sonars of the Pioneer (sonars 4 and 5), so the bumper
vector is composed of 228 binary values depending on whether either of these two
sonars presents the predefined activation. It is important to note that all of
the values of the bumper vector are set to 1 when there is a collision,
regardless of which of the sonars detected it, and to 0 otherwise.
The MLPs are trained offline using data collected during walks of the robot in an
arena filled with obstacles. The obstacles are texturized to ease the stereo
matching problem; they vary in shape and range in height from 30 to 60 cm, which
ensures their visibility by the camera pair, given that it is mounted on top of
the robot.
The robot has a diameter of 38 cm and performs steps of 15 cm. At every step, it
takes a snapshot of the scene and makes a one-step prediction (OSP) using the
forward model. A threshold is set so that if 35 or more of the neurons
predicting the MDV show an activation of 0.45 or higher, a long-term prediction
(LTP) is triggered. This threshold serves as a warning of a possible future
collision. A second threshold is also set: four or more neurons predicting the
bumper with an activation of 0.95 or higher is considered a collision.
The LTP is an internal simulation of the trajectory and consists of using the
predicted MDV as input to a next forward model, which in turn predicts the next
MDV. This process can only be carried out for a small number of steps, as small
errors in the OSP accumulate, turning the MDV into noise.
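The LTP itself is triggered when the one-step prediction meets the
35-neuron/0.45 criterion described above; once triggered, the chaining can be
sketched as follows (the `forward_model` callable standing in for the bank of
MLPs is a hypothetical interface, not the authors' code):

```python
import numpy as np

def long_term_prediction(forward_model, mdv, max_steps=10):
    """Chain one-step predictions, feeding each predicted MDV back as input.
    `forward_model(mdv)` is assumed to return (next_mdv, bumper) arrays."""
    for step in range(1, max_steps + 1):
        mdv, bumper = forward_model(mdv)
        # Collision criterion from the text: >= 4 bumper neurons at >= 0.95
        if np.sum(bumper >= 0.95) >= 4:
            return step        # predicted collision after `step` internal steps
    return None                # no collision within the simulated horizon
```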
Figure 12.6 shows the results of a typical run of the system. The first column
shows the MDVs as the agent moves in the environment, with t = 0 at the top of
the image. The next column shows the prediction of the forward model; the
network is very accurate, with an average sum squared error of SSE = 0.0043,
although small errors are still apparent. After a few steps, the system triggers
the LTP. In the example, the LTP is triggered correctly, and a collision is
detected after four steps of internal simulation, which leaves the agent
sufficient distance to take corrective action.
The increase in activation of the output neurons coding for the bumper states
along the trajectory of Fig. 12.6 can be seen in Fig. 12.7. The activation
corresponds to the time steps where the LTP, an internal simulation of the
events, is performed.
A remarkable emergent property of the system, not predetermined by design, is
that the activation of the neurons coding for the bumper states corresponds to
the proximity of the obstacles in the MDV. That is, the activation of the right
and left bumpers corresponds to obstacles in the right and left image regions,
respectively. This activation can actually be interpreted as a body map.
Navigation
To further evaluate the capabilities of the forward model, the agent was set in
the center of the arena with obstacles all around. The obstacles had a single
passage through which the robot could go out; the distances between the
obstacles varied from 10 cm up to approximately 60 cm at the free passage.
This experiment allowed us to test the following hypothesis: with an acquired
basic body map, the agent should be able to find the gap in the obstacles where
it can pass through, without the need to move in that direction.
The agent turns 360° about its axis; every 10° it takes a snapshot of the scene
and performs a long-term prediction, recording the number of steps it can
predict without registering a collision. Once the robot has completed a whole
turn, it heads in the direction where no collision was detected while performing
the LTP, or where a collision was detected after the largest number of steps. At
this point, the obstacle avoidance behavior from the previous experiment takes
over, taking the robot out of the circle of obstacles.
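The gap-finding behavior can be summarized in a few lines; `robot` and the
simulation callback are hypothetical interfaces, not the authors' API:

```python
def find_exit_heading(robot, simulate):
    """Rotate in 10-degree increments; at each heading, internally simulate
    moving forward and record how many steps pass before a predicted
    collision (None meaning none). Head where that number is largest."""
    best_heading, best_score = 0, -1.0
    for heading in range(0, 360, 10):
        robot.rotate_to(heading)          # hypothetical motion call
        mdv = robot.snapshot_mdv()        # acquire and process a stereo snapshot
        steps = simulate(mdv)             # LTP steps before collision, or None
        score = float("inf") if steps is None else float(steps)
        if score > best_score:
            best_heading, best_score = heading, score
    return best_heading
```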
Figure 12.9 shows a typical run of the experiment. In Fig. 12.9a, the agent is
turning around within the circle of obstacles, performing the internal
simulation of heading in each direction. After completing a whole turn, the
robot heads in the direction where the gap between the obstacles is sufficient
to pass through. The agent undertakes a small correction of the path as it
predicts a possible future collision; this is due to errors in the encoders'
measurements and the skidding of the wheels when the agent rotates toward the
desired direction. Finally, in Fig. 12.9b, the agent reaches the exit.
The system was tried in different obstacle configurations with a 100% success
rate. The final path toward the exit had to be corrected 90% of the time due,
again, to the errors in the encoders' measurements and the skidding of the
wheels.
In this second experiment, the methodological steps for building distance perception
capabilities are explored, not as a geometrical process but as an association of
multimodal (visual and tactile) sensory information during the agent’s interaction
with its environment.
A schematic view of the proposed forward model is shown in Fig. 12.10. The model
receives the two images from the stereo pair (VL and VR) and a motor command (M)
at time t and produces as output the sensory consequences of executing that
motor command: two resulting sensory states, visual (for both cameras) and
tactile (B), at time t + 1.
The tactile output is coded as a continuous value in the range 0–1 and
represents a measure of the agent's proximity to objects in its arena.
The proposed forward model associates future visual and tactile modalities with
present visual and motor information. Hence, by creating a multimodal sensory
representation, the system relates different sensory modalities around the same
perceived situation, together with the executed or imagined action.
The proposed model produces a notion of distance for navigation through the
agent’s interaction with the environment. This notion of distance is coded by the
multimodal sensory representation, and its units are grounded in the physical
capabilities and characteristics of the agent.
As a first step, the images were inverted: in the original coding, a high pixel
value represents white and 0 codes the presence of an obstacle. To reduce the
dimensionality of the visual data, we use a foveated imaging technique based on
a Gaussian distribution. The foveation process applies a weighted mask that
produces images with high resolution at the center, decreasing toward the
periphery. This technique allowed us to reduce the size of the images provided
by the cameras while enhancing their central region. The result of applying this
process is two final images of size 23 × 24 pixels, shown in Fig. 12.11.
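The chapter does not give the exact mask, so the idea can only be sketched here
as a pooling whose neighborhood grows with eccentricity, keeping high resolution
at the center; the sizes follow the text (320 × 240 in, 23 × 24 out), everything
else is an assumption:

```python
import numpy as np

def foveate(image, out_h=23, out_w=24):
    """Center-dense downsampling sketch: each output pixel averages an input
    patch whose radius grows with distance from the image center."""
    h, w = image.shape
    cy, cx = (out_h - 1) / 2.0, (out_w - 1) / 2.0
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            ecc = max(abs(i - cy) / cy, abs(j - cx) / cx)  # eccentricity in [0, 1]
            r = int(1 + 6 * ecc)                           # patch radius grows outward
            y = int(i * (h - 1) / (out_h - 1))
            x = int(j * (w - 1) / (out_w - 1))
            out[i, j] = image[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1].mean()
    return out
```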
The motor commands were chosen from three classes: turn 5° to the left, turn 5°
to the right, and a forward movement of one step size, in this case 15 cm. Each
of these commands was transformed into a vector of values given by three
Gaussian functions with the same standard deviation but different means
according to the type of motor command, as can be seen in Fig. 12.12.
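A plausible reading of this population coding, with the vector length of 21
taken from the network description below and the Gaussian means and width chosen
purely for illustration:

```python
import numpy as np

def encode_motor_command(cmd, size=21, sigma=2.5):
    """One Gaussian bump per command class; the means are assumed placements."""
    means = {"left": 3.0, "forward": 10.0, "right": 17.0}
    x = np.arange(size, dtype=float)
    return np.exp(-(x - means[cmd]) ** 2 / (2.0 * sigma ** 2))
```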
Fig. 12.11 a Original image (320 × 240) and b foveated image (23 × 24)
The forward model was coded using artificial neural networks and is made up of
120 local predictors. Each of these network predictors receives as input two
windows of 5 × 5 pixels, one from the left and one from the right image, and a
vector of 21 values for the motor command. The output of each network is two
windows of 3 × 3 pixels, for the left and right images, and one value
representing the tactile output. Each input image (left and right) is divided
into 120 windows, 10 in the x-direction and 12 in the y-direction, and each
predictor outputs a 3 × 3 window of the next time step.
The inputs and outputs of the system overlap in the vertical and horizontal
directions, allowing the prediction of a full-sized output image. This
arrangement can be seen in Fig. 12.13, where three different input windows map
to their respective output windows. In effect, we have local predictors that
each take a window of the whole scene and predict a smaller region of the next
time step; the system as a whole predicts two full images. The tactile state of
the system is represented by a vector of 10 values: a bumper vector. Each of the
ten columns, composed of 12 predictors, contributes equally to one of the bumper
values. The whole vector contains binary values: it is 0 when there is no
collision and 1 when any of the four front sonars detects a collision.
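The tiling of each image into 120 overlapping input windows can be sketched as
follows; the strides are derived from the counts (12 × 10) and window size, so
the exact offsets are illustrative:

```python
import numpy as np

def local_windows(image, n_y=12, n_x=10, win=5):
    """Extract the 120 overlapping 5 x 5 input windows of one image
    (12 rows x 10 columns of predictors), as a list of arrays."""
    h, w = image.shape
    ys = np.linspace(0, h - win, n_y).round().astype(int)
    xs = np.linspace(0, w - win, n_x).round().astype(int)
    return [image[y:y + win, x:x + win] for y in ys for x in xs]
```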
The training patterns were collected through random and manual movements
executed by the robot using a combination of the previously specified motor
commands. As before, the system was trained using resilient back-propagation
(Riedmiller and Braun 1993).
Fig. 12.13 Distribution of input and output windows for the local predictors
Short-Term Prediction
The first test of the trained system was to perform short-term prediction; that
is, given a sensory situation (visual) and a motor command, to predict the next
sensory situation (visual and tactile states). A typical result can be seen in
Fig. 12.14, where subfigure (a) shows the initial sensory state. The bright
strip running from top to bottom in both images represents an obstacle; it is
clear from the images that the obstacle is to the left of the robot.
After a turn to the left, subfigure (b) shows the state of the visual data and
subfigure (c) shows the prediction of the system. The prediction for the tactile state,
represented by the bumper vector, can be seen in Fig. 12.15. It shows a clear
correspondence between the real and the predicted state, with a maximum
activation value near 0.05 (5% of the activation range), indicating that there
is no imminent collision.
Long-Term Prediction
To provide the artificial agent with a long internal simulation process, or
long-term prediction, the forward model described above can be chained a number
of times. This means that a consecutive sensorimotor reenaction is performed,
where the output of the model is used as input for the next time step. For a
given initial visual state and a motor command, the system performs the
prediction of the next visual and tactile states. The predicted visual state is
then used, together with the respective motor command, as input to the model to
produce the prediction for the next time step.
In the example of Fig. 12.16, the initial visual state is shown in Fig. 12.16a.
A chain of motor commands, right-right-forward, is used as covert actions; i.e.,
the actions are not executed but internally simulated. The long-term predictions
of the system for the next three time steps are shown in the right column of
Fig. 12.16; the left column shows the sensory situations once the robot actually
executes the chain of motor commands.
Fig. 12.17 Long-term prediction for the tactile state from time t + 1 to t + 3
The tactile prediction of the system during the execution of the three
internally simulated motor commands is shown in Fig. 12.17. It is worth noting
that the tactile prediction encodes the position of the obstacle in the arena:
at time t + 3, the neurons of the bumper vector on the right of the agent
produce high activation. As was indeed the case, had the robot continued forward
on that path, it would have collided with the obstacle.
These results allow us to conclude that the bumper vector values are coding, at
least in an incipient way, the body-referenced perception of distance we are
looking for: given the previous sequence of three movements, the system is able
to predict a possible collision expressed in terms of the robot's own motor
capabilities.
It is worth noting that the training examples in no way code the spatial
position of the obstacles in the tactile data: the whole tactile vector is set
to 1 whenever an obstacle is encountered. As an emergent property, the tactile
prediction indicates the position of the obstacles in the arena, reproducing
previous results (Lara et al. 2007).
It is evident, however, that the visual prediction deteriorates because the
errors of each step accumulate, distorting the visual input and making longer
predictions more uncertain. Nevertheless, a prediction of three time steps
allows the robot to take preventive action: the agent does not need to approach
a dangerous or undesired situation, as an internal simulation of its actions
provides it with knowledge of its surroundings.
12.5 Conclusions
In this work, a different area of the robotics literature is addressed, namely
cognitive robotics, which can be considered the artificial intelligence branch
of the cognitive sciences. Cognitive robotics uses artificial autonomous agents
to shed light on processes such as perception, learning through sensorimotor
interaction with the world, and intelligent adaptive behavior in dynamic
environments. The work in the area is framed within the embodied cognition
thesis.
Throughout its history, artificial intelligence has undergone a significant
number of paradigm shifts. The story started in the middle of the last century
with attempts to emulate high-level human cognitive abilities. These turned out
to be relatively easy to imitate, and sooner rather than later there were
machines capable of defeating chess world champions, solving general problems,
and holding very smart conversations. However, low-level abilities such as
walking on uneven terrain, distinguishing a rotten fruit from a ripe one, or
something as basic as estimating the distance to an object turned out to be very
complicated tasks for machines. Artificial intelligence research failed for many
decades to deliver in this quest.
It is only recently that research in artificial intelligence has turned to the
findings and results of the other cognitive sciences in search of a different
approach to understanding, modeling, and then implementing basic behaviors in
artificial agents. It is within this framework that the two telling examples
presented here are situated.
In the first, an artificial agent, making use of a stereo camera and the
disparity map calculated from its images, learns sensorimotor associations that
endow it with safe navigation abilities. The agent of the first example learns
the consequences that its forward movements have on the surrounding environment.
At the same time, changes in the disparity map are associated with the feeling
of crashing into obstacles. The associations are coded by means of a forward
model.
The second example takes the capabilities of the forward model a step further by
making use of important characteristics of the images coming from the stereo
camera. Here, the agent learns associations between visual stimuli, a range of
motor commands, and the feeling of touching obstacles. These associations allow
the agent to navigate complex corridors and sets of obstacles. The predicted
tactile values show an important association between the visual data
representing the obstacles and their actual position in space, without this
being explicitly coded in the training data.
Both examples have shown that a system of local predictors successfully forms
what are known as multimodal sensory representations, providing the agent with a
map of its own body and a notion of distance. Without needing to perform any
motor command, the agent is capable of predicting the sensory consequences of
its actions. The agent learns these representations by means of its interaction
with the environment. Furthermore, the self-body knowledge and distance
affordances are learned with regard to the agent's own sensorimotor
capabilities. To state it plainly, distance to an object is not learned in, for
example, centimeters, but as the number of motor commands needed to touch the
object. It is argued here that this type of grounded, body-referenced knowledge
is what allows an artificial agent's representations to carry meaning.
References
Abelson, R., & Schank, R. (1977). Scripts, plans, goals and understanding: An
inquiry into human knowledge structures. New Jersey: Lawrence Erlbaum Associates.
Arkoudas, A. & Bringsjord, S. (2014). Philosophical foundations. In Frankish, K., & Ramsey, W. M.
(Eds.), The Cambridge handbook of artificial intelligence (pp. 34–63). Cambridge University
Press.
Arleo, A., Smeraldi, F., & Gerstner, W. (2004). Cognitive navigation based on nonuniform gabor
space sampling, unsupervised growing networks, and reinforcement learning. IEEE
Transactions on Neural Networks, 15(3), 639–652.
Barsalou, L. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Barsalou, L. (2009). Simulation, situated conceptualization, and prediction. Philosophical
Transactions of the Royal Society of London B: Biological Sciences, 364(1521), 1281–1289.
Blakemore, S., Goodbody, S., & Wolpert, D. (1998). Predicting the consequences of our own
actions: The role of sensorimotor context estimation. The Journal of Neuroscience, 18(18),
7511–7518.
Braund, M. (2007). The indirect perception of distance: Interpretive
complexities in Berkeley's. Kritike, 1, 49–64.
Brooks, R. (1990). Elephants don't play chess. Robotics and Autonomous Systems, 6(1–2), 3–15.
Brooks, R. (1991a). Intelligence without reason. Artificial intelligence: critical concepts, 3, 107–
163.
Brooks, R. (1991b). Intelligence without representation. Artificial Intelligence, 47(1–3), 139–159.
Brown, J., O’Brien, C., Leung, S., Dumon, K., Lee, D., & Kuchenbecker, K. (2017). Using contact
forces and robot arm accelerations to automatically rate surgeon skill at peg transfer. IEEE
Transactions on Biomedical Engineering, 64(9), 2263–2275.
Chuy, O., Collins, E., Sharma, A., & Kopinsky, R. (2017). Using dynamics to consider torque
constraints in manipulator planning with heavy loads. Journal of Dynamic Systems,
Measurement, and Control, 139(5), 051001.
Collins, B., & Kornhauser, A. (2006). Stereo vision for obstacle detection in autonomous
navigation. DARPA grand challenge Princeton university technical paper, 255–264.
Devol, G. (1967). U.S. Patent No. 3,306,471. Washington, DC: U.S. Patent and Trademark Office.
Draghiciu, N., Burca, A., & Galasel, T. (2017). Improving production quality with the help of a
robotic soldering arm. Journal of Computer Science and Control Systems, 10(1), 11.
Escobar, E., Hermosillo, J., & Lara, B. (2012, November). Self body mapping in mobile robots
using vision and forward models. In 2012 IEEE Ninth Electronics, Robotics and Automotive
Mechanics Conference (CERMA), (pp. 72–77). IEEE.
Escobar-Juárez, E., Schillaci, G., Hermosillo-Valadez, J., & Lara-Guzmán, B. (2016). A
self-organized internal models architecture for coding sensory-motor schemes. Frontiers in
Robotics and AI, 3, 22.
Fodor, J. A. (1978). Tom Swift and his procedural grandmother. Cognition, 6(3), 229–247.
Gaona, W., Hermosillo, J., & Lara, B. (2012, November). Distance perception in mobile robots as
an emergent consequence of visuo-motor cycles using forward models. In IEEE Ninth
Electronics, Robotics and Automotive Mechanics Conference (CERMA), (pp. 42–47). IEEE.
Gibson, J. (1979). The Ecological Approach to Visual Perception. Psychology Press.
Graca, R., Xiao, D., & Cheng, S. (2016). U.S. Patent No. 9,227,322. Washington, DC: U.S. Patent
and Trademark Office.
Harnad, S. (1989). Minds, machines and Searle. Journal of Experimental & Theoretical Artificial
Intelligence, 1(1), 5–25.
Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1–3),
335–346.
Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge
university press.
Hasan, A., Hamzah, R., & Johar, M. (2009, November). Region of interest in disparity mapping
for navigation of stereo vision autonomous guided vehicle. In International Conference on
Computer Technology and Development, 2009. ICCTD’09, (Vol. 1, pp. 98–102). IEEE.
Hoffmann, H. (2007). Perception through visuomotor anticipation in a mobile robot. Neural
Networks, 20(1), 22–33.
Hoffmann, H., & Möller, R. (2004). Action selection and mental transformation based on a chain
of forward models. From Animals to Animats, 8, 213–222.
Jamone, L., Ugur, E., Cangelosi, A., Fadiga, L., Bernardino, A., Piater, J., & Santos-Victor,
J. (2016). Affordances in psychology, neuroscience and robotics: a survey. IEEE Transactions
on Cognitive and Developmental Systems.
Johnson-Laird, P. (1977). Procedural semantics. Cognition, 5(3), 189–214.
Konolige, K. (1997). Small vision systems: Hardware and implementation. In Proceedings of the
International Symposium on Robotics Research (pp. 111–116). ISRR.
Lappin, J., Shelton, A., & Rieser, J. (2006). Environmental context influences visually perceived
distance. Attention, Perception, & Psychophysics, 68(4), 571–581.
Lara, B., & Rendon, J. (2006, September). Prediction of undesired situations based on multi-modal
representations. In Electronics, Robotics and Automotive Mechanics Conference, 2006 (vol. 1,
pp. 131–136). IEEE.
Lara, B., Rendon, J., & Capistran, M. (2007). Prediction of multi-modal sensory situations, a
forward model approach. In Proceedings of the 4th IEEE Latin America Robotics Symposium
(Vol. 1, pp. 504–542).
Li, X., Wang, J., Choi, S., Li, R., Riveland, S., Landsnes, O., & Hara, M. (2016, June). Automatic
Gyro Effect Simulation for Robotic Painting Application. In Proceedings of ISR 2016: 47th
International Symposium on Robotics (pp. 1–4). VDE.
Lungarella, M., Metta, G., Pfeifer, R., & Sandini, G. (2003). Developmental robotics: A survey.
Connection Science, 15(4), 151–190.
Miall, R., & Wolpert, D. (1996). Forward models for physiological motor control. Neural
Networks, 9(8), 1265–1279.
Möller, R., & Schenck, W. (2008). Bootstrapping cognition from behavior—a computerized
thought experiment. Cognitive Science, 32(3), 504–542.
Moons, T. (1998, June). A guided tour through multiview relations. In SMILE (Vol. 98, pp. 304–
346).
Mortimer, J., & Rooks, B. (1987). Introduction. In The International Robot Industry Report
(pp. 1–7). Berlin, Heidelberg: Springer.
Murarka, A., & Kuipers, B. (2009, October). A stereo vision based mapping algorithm for
detecting inclines, drop-offs, and obstacles for safe local navigation. In Intelligent Robots and
Systems, 2009. IROS 2009. IEEE/RSJ International Conference on (pp. 1646–1653). IEEE.
Pezzulo, G., & Cisek, P. (2016). Navigating the affordance landscape: Feedback control as a
process model of behavior and cognition. Trends in Cognitive Sciences, 20(6), 414–424.
Pfeifer, R., & Bongard, J. (2007). How the body shapes the way we think: A new view of
intelligence. MIT press.
13.1 Introduction
It has been claimed that 90% of what is learned by doing is retained by the
person who is learning (Volunteer Development 4H-CLUB-100 2016). Sight, hearing,
and touch are thus the main senses that allow us to recognize and perceive an
environment. In virtual environments, such as virtual reality (VR) and augmented
reality (AR), the user interacts fully or partially with virtual objects only
through sight and sometimes also through hearing. Adding the sense of touch, as
haptic feedback from the virtual environment, could enhance the recognition and
perception of the environment, and meeting the computational requirements is
essential to accomplish this integration.
The evolution of computers in terms of processing power, graphics, and
peripherals has allowed the development of virtual environments. Nowadays,
virtual environments are studied and applied in different areas such as
medicine, education, industry, the military, entertainment, and aeronautics,
among others. All these application areas aim to represent a realistic
environment, but user interaction in virtual environments usually lacks the
feeling of touch, which is essential for a realistic experience.
Part of the human experience of interacting with any environment is touching; a
common reaction of users immersed in a virtual environment is to try to touch
the objects in it. Given the importance of the senses for interacting with an
environment, some robotic systems represent or simulate senses through the
integration of different devices and systems, such as sensors and actuators. For
example, vision, acoustic, and haptic systems are commonly used for simulating
sight, hearing, and touch, respectively.
Systems that represent senses in an artificial way require either unilateral or
bilateral transmission of information through an interface, as shown in
Fig. 13.1. The interface works as a means for the exchange of information. For
example, a camera as a vision system may represent sight; a camera, just like
sight, captures visual information from the environment.
Fig. 13.1 Acoustic, visual, and haptic information acquisition and reproduction
The virtual environments discussed in this chapter are visual systems based on
VR and AR. Haptic systems are represented as haptic feedback, either kinesthetic
or cutaneous (discussed in Sect. 13.1.1). Many systems developed by the
scientific community that integrate the feeling of touch use haptic interfaces.
The haptic interface captures information from the environment and sends it to
the user, giving him/her the feeling of touch through a device, while the device
senses the position and force of the user (Lin and Otaduy 2008).
In a virtual environment, users can see and touch rendered objects. Artificial
vision systems are realized by digital image processing techniques, and the
generation of images by computer graphics is called graphic rendering. In the
case of haptics, haptic rendering is the computation and simulation, in real
time, of the force and/or torque that the user feels when manipulating an object
in a virtual environment through a haptic device (Luo and Xiao 2004).
In general, a mathematical model is required to represent the behavior of an
object or a system from the real world. A mathematical model of a dynamic system
is defined as a set of equations that represent the dynamics of the system (Ogata
1998). When the behavior of an object is mathematically modeled, it is possible to
simulate it in a virtual environment. Mathematical models make it possible to
obtain a response from the interaction between virtual objects and to send it to
the user through an interface, giving the sensation of touching a virtual
object.
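As a concrete illustration of such a model, a common (though not
chapter-specific) choice in haptic rendering is a spring-damper penalty contact:
the reaction force grows with penetration into the virtual object. The gains
here are illustrative:

```python
def penalty_force(penetration, velocity, k=800.0, b=1.5):
    """Spring-damper (penalty) contact sketch: force sent back to the
    haptic device when the proxy penetrates a virtual surface."""
    if penetration <= 0.0:
        return 0.0                     # no contact, no force
    return k * penetration - b * velocity
```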
There is another way to perceive haptic feedback in a virtual environment, which
some authors call pseudo-haptics (Lecuyer et al. 2008; Punpongsanon et al. 2015;
Li et al. 2016; Neupert et al. 2016). Pseudo-haptics is the simulation of haptic
sensations when the user interacts with a virtual environment, usually through
sight. This technique has been shown to enhance the interaction with different
materials, helping the user to distinguish them.
The use of haptic systems in virtual environments has increased in the last
decades. In Sects. 13.1.1 and 13.1.2, haptic systems and virtual environments are
described, respectively.
The word haptic refers to the capability to sense a natural or synthetic
mechanical environment through touch (Hayward et al. 2004). A haptic system
works like a teleoperated system because of its bilateral communication nature.
Just like teleoperated systems, haptic systems have a master and a slave robot:
the master controls the slave's movements, and the slave sends feedback to the
master in response to the interaction with the remote environment. The objective
is that the user, through the master, feels an object even when not in direct
contact with it; the one in contact is the slave. In haptic systems for virtual
environments, the slave and the remote environment are computed, meaning they
are virtual (Hayward et al. 2004).
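A toy, one-degree-of-freedom sketch of this master-slave coupling (the gains and
integration step are illustrative, not from the chapter):

```python
def bilateral_step(x_master, x_slave, kp=50.0, dt=0.001):
    """The virtual slave tracks the master's position; the coupling force is
    returned to the master so the user feels the contact."""
    force = kp * (x_master - x_slave)   # slave tracking / coupling force
    x_slave = x_slave + dt * force      # simplistic slave update
    return x_slave, -force              # new slave state, force felt by the user
```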
Most of the commercial haptic devices developed are force-feedback based. The
lack of realistic touch sensations in the applications is largely due to
limitations in current device technology.
The objective of the search presented in this chapter is to show how haptic systems
have been applied in virtual environments. When these two technologies are
combined, a whole world of applications emerges, all seeking the immersion of
the user in the activity performed.
Table 13.1 Classification of user tracking in virtual environments, based on Rolland et al. (2001)
Time of flight (TOF): ultrasonic measurements; pulse infrared laser diode; GPS; optical gyroscope
Spatial scan: outside-in; inside-out (videometric; beam scanning)
Inertial sensing: mechanical gyroscope; accelerometer
Mechanical linkages
Phase difference
Direct field sensing: magnetic field sensing (sinusoidal alternating current; pulse direct current; magnetometer/compass); gravitational field sensing
Hybrid systems: hybrid inertial platforms (inside-out inertial; magnetic/videometric); TOF/mechanical linkages/videometric position tracker; TOF/mechanical linkages/videometric 5-DOF tracker
For this research, a virtual environment can be either VR or AR. Four main areas
were identified for the application of haptics in virtual environments.
Certainly, there are many other areas of application, but the ones described in
this chapter cover most of them in a global way.
The databases used for the search were IEEE Xplore (IEEE 2017), ScienceDirect
(Elsevier B.V. 2017), ACM Digital Library (ACM Inc. 2017), EMERALD (Emerald
Publishing 2017), and Springer (Springer International Publishing AG 2017).
These databases relate, in general, to the areas of computation, technology,
engineering, and electronics, and have the added advantage of being available in
the UACJ database repertory BIVIR (Biblioteca Virtual, virtual library). From
the start of the search, the main keywords used in the databases were haptics
and virtual, since these are the technologies of interest. Throughout the
investigation, other keywords such as augmented reality, visuo-haptic,
pseudo-haptics, mixed reality, virtual education, virtual training, haptics
augmented surgery, and virtual haptics entertainment were used.
The selection criteria for the articles were first based on four areas of
application: education, medicine, industry, and entertainment. The selected
studies were published in 2007 or later and were taken from journals and
conferences related to the two technologies (haptic systems and virtual
environments). Finally, other sources used were books, for the fundamental
theory, and online links, for commercial trends and the identification of haptic
devices and interfaces.
In this chapter, some works related to haptic systems in virtual environments
are described. Three categories are presented: training, assistance, and
entertainment. In the first two categories, applications related to education,
medicine, and industry are mentioned. The category of training covers
applications of haptic systems in virtual environments as tools or strategies
for acquiring knowledge about a specific task or topic. In contrast, the
category of assistance focuses on applications oriented to helping during an
activity, taking into account that the user already has the knowledge and
experience to perform it but the system is expected to enhance the performance.
The final category presented is entertainment. The entertainment industry has
played an important role in the development of haptic systems and virtual
environments, since they share the same objective: the immersion of the final
user.
13.3.1 Training
Throughout their lives, humans are constantly learning. From birth, the process
of learning is important, and it starts with interaction with other human beings
and with the environment in general. Later, when humans want to acquire an
explicit piece of knowledge, they proceed to training activities in which the
user interacts with a given environment. Haptic systems in virtual environments
can be applied in training applications where the user obtains knowledge through
interaction and immersion (Fig. 13.3). The user carries out a certain
interactive activity; through sight and touch, the user is immersed in the
activity; the immersion enhances the process of learning; and, finally, the
objective of acquiring certain knowledge is met. Training is the action of
teaching a person or animal a specific skill or type of behavior (Oxford
University Press 2017). To enhance the process of training, the integration of
different technologies has taken place in the last decades; that is the case of
haptic systems and virtual environments.
In general, virtual training reproduces an activity and gives feedback to the
user so as to create the feeling of doing it in real life. The purpose of
training applications is to transfer the user's knowledge from the virtual
experience to real-life operations.
This purpose can be achieved through immersion and interaction. If the user has
the feeling of reality, he/she will gain a certain degree of experience and
ability even though the environment is controlled and safe.
Haptic systems provide the feeling of touch in artificial systems through a
haptic interface, while virtual environments simulate, through computer
graphics, the physical surroundings humans interact with. In combination, these
two technologies have been used to enhance different user experiences, such as
the process of training. In the following subsections, some cases of training
applications are described, particularly in the areas of education, medicine,
and industry.
13.3.1.1 Education
In general, human-machine interfaces (HMIs) are the link between the user and an
artificial system, allowing interaction with computers and machines. Currently,
tangible interfaces offer new ways of interacting with virtual objects.
Nevertheless, the design of such interfaces has not been much studied from an
educational approach.
13.3.1.2 Medicine
Medical applications for training have become popular in the last decades. The
benefits that haptic systems and virtual environments bring to students are
mainly based on the possibility of experiencing a medical procedure without the
dangers of treating a living patient. It is certainly difficult to gain through
a simulation the same knowledge as through a real experience, due to the
limitations of such systems in terms of perception of the world and the lack of
realism. With the integration of haptic systems in virtual environments, the
user can have a more realistic experience, having both visual and haptic
feedback for medical training.
Rhienmora et al. (2010) developed a dental surgery training simulator. The
simulator was implemented in two virtual environments using two haptic devices.
The system had two modalities, one with VR and the other with AR. In the VR
mode, the environment was displayed on a computer screen, and the user was able
to interact with dental pieces for extraction training. The AR mode used a
head-mounted display (HMD), and the user manipulated the dental pieces shown,
also for extraction training; in this mode, the virtual objects were placed
using markers. Both modalities required two Phantom Omni haptic interfaces
(Sensable Technologies 2016c). It was reported that an experienced dentist
confirmed that the AR environment had many advantages over VR for dental
surgical simulation, such as the realistic clinical setting.
Lin et al. (2014) developed and validated a surgical training simulator with
haptic feedback as a safe, repeatable, and cost-effective alternative for
learning bone-sawing skills. The system used an Omega.6 as the haptic interface
and a Display300 as the 3D stereo display. For the haptic rendering, spindle
speed, feed velocity, and bone density were considered as variables, and a
multi-point collision detection method was applied. The position and orientation
of the virtual tool were continuously updated according to the position of the
end effector of the haptic device. A multi-threaded computation environment was
used to maintain update rates of 1000 Hz for haptic rendering and 30 Hz for
graphic rendering. Acoustic feedback was also added. Finally, the validation was
based on three experiments: the first proved that the system was able to
differentiate between experienced and novice participants and that performance
improved with repeated practice, decreasing the operative time; the second
tested whether the simulator behaved as expected; and the third validated the
knowledge transfer from training to the real procedure in terms of maximal
acceleration, where the lower maximal acceleration of the trained group
suggested that the simulator had positive effects on real sawing.
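The dual update rate mentioned above is a standard pattern: haptics must run
near 1 kHz for stable force display, while graphics can refresh at 30 Hz. A
minimal threading sketch (the step callbacks are hypothetical):

```python
import threading
import time

def run_dual_rate(haptic_step, graphic_step):
    """Run a 1000 Hz haptic loop on a background thread and a 30 Hz
    graphics loop on the calling thread."""
    def loop(step, hz):
        period = 1.0 / hz
        while True:
            t0 = time.perf_counter()
            step()
            time.sleep(max(0.0, period - (time.perf_counter() - t0)))
    threading.Thread(target=loop, args=(haptic_step, 1000.0), daemon=True).start()
    loop(graphic_step, 30.0)
```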
Chowriappa et al. (2015) developed and tested an AR- and haptic-based training
system for robot-assisted urethrovesical anastomosis (needle driving, needle
positioning, and suture placement). The environment, called hands-on surgical
training (HoST), consisted of a simulator that guides trainees step by step
through the procedures, with simultaneous proctoring throughout the training.
The user's experience is visual, auditory, and haptic, enabling didactic
explanations, annotations, and guidance.
13.3.1.3 Industry
The system reported consisted of three modules: the virtual environment module,
the physical motor module, and the haptic motor module. In the first module, the
virtual environment and objects were developed. In the second module, the
physical behavior of the objects was programmed. Finally, the third module was
related to the haptic device, a Phantom Desktop (Sensable Technologies 2016b),
now called Geomagic Touch X. On a screen, the user saw the correct order of
assembly and then had to perform the assembly, with haptic feedback improving
the experience of the virtual assembly task. The case study presented resulted
in the identification of haptic feedback as a beneficial technology for virtual
assembly tasks. The importance of the physical features was also identified for
a realistic simulation of the assembly task; features such as the restitution
coefficient and the control spring stiffness should have been included, as well
as stereoscopic visualization to enhance the user's immersion.
Carlson et al. (2016) evaluated a virtual assembly task using different
combinations of user interfaces. The task consisted of manipulating two
different pieces at the same time with two haptic interfaces in order to insert
one into the other. The haptic interfaces used for the first experiment were a
Phantom Omni (Sensable Technologies 2016c) and a 5DT Data Glove (Virtual
Realities, LLC 2017), and for the second, two Phantom Omni devices. Several
combinations of the device used by the dominant hand were also tested, but no
significant difference was found among these combinations. Insertion in assembly
tasks is a difficult operation for training simulation because of the complexity
of synchronizing the instruments. In general, it was reported that participants
performed equally well in all treatment conditions. The tests did not include
gravity acting on the objects; the haptic feedback was limited to object
collisions. Either way, the participants showed interest in the experiments.
13.3.2 Assistance
The combination of haptic systems and virtual environments, such as VR and AR, has attracted attention for assistance applications, mainly in the areas of education, medicine, and industry. Assistance is the action of helping someone by sharing work (Oxford University Press 2017). In this subsection, the cases presented are related to systems developed with the purpose of assisting in different tasks to enhance the performance, productivity, and/or precision of the user.
Figure 13.4 describes how the user interacts with a system through sight and touch with the integration of haptic systems and virtual environments, respectively. The integration of these two technologies contributes to the user's immersion in the task carried out. The overall system has the purpose of aiding the user in a specific task, taking into account that the user already has the knowledge and skills to do it. For example, if the task involves a dangerous procedure, the system could help by warning the user, visually and tangibly, when danger is close. The final objective of this kind of application, as mentioned before, is to enhance the performance, productivity, and/or precision of the user.
13.3.2.1 Education
In education, teachers usually use different tools and strategies to improve the teaching process. Unlike training applications for education, assistance applications are focused on facilitating the teaching–learning process rather than teaching a specific topic. For example, Csongei et al. (2012) developed a system called ClonAR that allowed the user to clone and edit objects from the real world. First, the real object was scanned with Kinect Fusion (Microsoft 2017). Then, the object was rendered and could be edited in a visuo-haptic AR environment. The information for the rendering was not managed as meshes; instead, signed distance fields (SDFs) were used because of Kinect Fusion. The authors asserted that this information flow was faster than with meshes. The system was tested as a didactic tool, but other possible applications were identified, such as medical training and medical education.
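Signed distance fields of the kind used in ClonAR can be illustrated with a simple analytic case. The sketch below is a generic illustration, not ClonAR code (sdf_sphere is a hypothetical helper); it shows why SDFs suit haptic loops: one evaluation yields both penetration depth and, via the gradient, a force direction, with no per-triangle mesh test.

import numpy as np

def sdf_sphere(p, center, radius):
    # Signed distance: negative inside, zero on the surface, positive outside
    return np.linalg.norm(p - center) - radius

p = np.array([0.5, 0.0, 0.0])
d = sdf_sphere(p, np.array([0.0, 0.0, 0.0]), 1.0)   # -0.5: the point is inside
# A haptic loop can use -d as penetration depth and the (numerical or
# analytic) gradient of the field as the force direction.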
Eck and Sandor (2013) defined the term visuo-haptic augmented reality (VHAR) as a technology that allows the user to see and touch virtual objects. The authors presented a software development platform called HARP that allowed programming VHAR applications. The platform worked either as an educational tool or simply for application development. The authors used H3DAPI (SenseGraphics AB 2012), an open-source haptic software development platform, and complemented it with the Phantom Omni haptic device (Sensable Technologies 2016c). The platform was tested and validated by undergraduate students who used it for different projects. The applications developed in HARP were limited to 30 FPS (frames per second), so the image was reported as looking shaky. Another limitation was that it did not allow the use of more efficient rendering techniques.
Murphy and Darrah (2015) created a set of twenty applications for teaching math and science to students with visual impairments. Certainly, the objective was to teach the students, but the haptic feedback was taken as a tool/strategy to improve the teaching–learning process for students with visual difficulties. The haptic device used was the Novint Falcon (Novint 2017). The applications were developed with the haptics software developers kit (HSDK) and the game engine GameStudio. The students were able to select the application of interest and interact with the virtual objects in the simulator through the haptic device, having the feeling of touching the objects; acoustic and visual feedback was also included. Some of the applications were a plant cell nucleus, volumes of shapes, gravity of planets, and exploration of atoms. Six applications were tested in the classroom with pre- and post-tests for each application. The results showed, in general, a significant learning gain for all the applications tested, and most of the teachers agreed on the ease of use of the whole system.
13.3.2.2 Medicine
coincide with the motivation of the subjects to do the exercises, besides the difficulty in depth perception (overcome with practice by the participants). Some of the participants felt arm fatigue because of the weight of the tangible object, but this could be changed in customized exercises depending on the subject's capabilities. In general, the study showed effective motivation of patients and the capability of the system to measure important performance factors, such as task completion time, for assessing the patient's treatment progress.
Unilateral spatial neglect (USN) is a post-stroke neurological disorder that causes a failure to respond to stimuli on the side of space opposite to the damaged brain hemisphere. Patients who have USN present spatial deficits, such as bumping into objects when walking, and can dress only one side of their body. Tsirlin et al. (2010) studied a therapy application based on a string haptic workbench. The technique for rehabilitation included a space interface device for artificial reality (SPIDAR) with a FASTRAK stylus attached. SPIDAR is a device that has a ring suspended by wires, a pair of red and green glasses, and a large screen. An object was displayed on the screen and perceived as a 3D object by the user. The user then moved the ring with a finger and had the feeling of touching the object. This occurred when the positions of the ring and the object coincided, an illusion made possible because the motion of the strings was restricted when the collision occurred. The study revealed that spatial biases could be induced when the user was in a scenario where he/she had to avoid a perturbed sensorimotor experience on one side of space. In the tests, subjects had to draw a trajectory with the FASTRAK stylus while feeling a disturbance on one side of the space. For example, when the user traced a line from left to right and the right hemispace was disturbed, a significant bias to the left was induced.
Yamamoto et al. (2012) presented a system for surgical robotic assistance tested on artificial tissue. The system had a pair of Phantom Premium haptic devices (Sensable Technologies 2016a) communicating through a master–slave control and a Bumblebee2 IEEE-1394 stereo-vision camera (FLIR Integrated Imaging Solutions, Inc. 2017). The authors implemented a user-defined prohibited region to make sure the procedure was minimally invasive and the healthy tissues stayed safe. The region of interest was augmented so the user could carry out the task easily and reliably. For the tests, the artificial prostate tissue was reconstructed as the user interacted with it. The task consisted of a teleoperated palpation of tissue to differentiate soft and harder surfaces in real time. The forbidden-region virtual fixture was found to be useful in the procedure, as was the haptic feedback during the experiments. The force feedback presented discontinuities; according to the authors, this could be fixed by modifying the impedance and the edge geometry of the virtual fixtures.
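A forbidden-region virtual fixture of the kind described by Yamamoto et al. (2012) can be sketched as a penalty force that pushes the tool out of a protected volume. The snippet below is a minimal illustration assuming a spherical region and an arbitrarily chosen spring constant k; it is not the authors' implementation.

import numpy as np

def forbidden_region_force(tool_pos, center, radius, k=500.0):
    # Spring-like force pushing the tool out of a spherical forbidden region
    offset = tool_pos - center
    dist = np.linalg.norm(offset)
    if dist == 0.0 or dist >= radius:
        return np.zeros(3)                 # outside the region: no force
    penetration = radius - dist
    return k * penetration * (offset / dist)   # push toward the boundary

f = forbidden_region_force(np.array([0.0, 0.0, 0.09]),
                           np.array([0.0, 0.0, 0.10]), 0.02)

Abrupt force changes at the region's edge are one source of the discontinuities the authors mention; smoothing the stiffness near the boundary is one way to address them.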
Haptic devices allow users to interact with a remote or virtual environment through the sense of touch (Díaz et al. 2014). Since some surgery devices are operated by pedal, Díaz et al. (2014) proposed the use of a pedal with a double haptic channel, referring to the hand and foot haptic feedback received during a procedure. The haptic feedback would help the surgeon perform the necessary task based not only on vision but also on touch, since the surgeon cannot feel what the instrument is touching. The one-DOF pedal system proposed consisted of a Maxon RE40 DC motor and a cable transmission (26.66:1), with a
Quantum Devices QD145 encoder. The pedal had a peak torque of 10.72 Nm, a continuous torque of 5.36 Nm, and a 15° workspace. The performance of the haptic pedal was validated in a user study, with warning signals and resistance to the tool's penetration, during a drilling procedure with a double haptic channel. The hand haptic feedback was delivered through a PHANToM 1.0 device with a micro-vibrating electric motor attached at the tip of the PHANToM's stylus. The haptic pedal controlled the speed of the drill, and the resistance torque of the tool's penetration was emulated back on the pedal, so the user could feel that resistance. In general, during the experiments, the users with haptic feedback reacted faster to warning signals. The results indicated that the haptic information is helpful during a drilling procedure and improves the surgeon's accuracy.
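The pedal's behavior can be summarized by two maps: pedal angle to drill speed, and penetration force to resistance torque saturated at the motor's continuous torque. The sketch below uses the figures reported above (15° workspace, 5.36 Nm continuous torque); the linear mappings and gains are assumptions for illustration, not the published control law.

def drill_speed(pedal_angle_deg, max_speed_rpm=40000.0, workspace_deg=15.0):
    # Assumed linear map from pedal angle to drill speed over the 15 deg workspace
    angle = min(max(pedal_angle_deg, 0.0), workspace_deg)
    return max_speed_rpm * angle / workspace_deg

def resistance_torque(penetration_force_n, gain=0.05, max_torque_nm=5.36):
    # Emulate tool-penetration resistance on the pedal, saturated at the
    # motor's continuous torque (gain in Nm per N is an assumption)
    return min(gain * penetration_force_n, max_torque_nm)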
Haptic systems in combination with VR have also helped in diagnosis and medical analysis tasks. This is the case of cephalometric diagnosis and analysis, where the current 2D and 3D tools are often complicated, impractical, and not intuitive (Medellín-Castillo et al. 2016). Medellín-Castillo et al. (2016) presented a solution to the disadvantages of cephalometric analysis based on a haptic approach. The proposed system required a haptic device, either a Phantom Omni (Sensable Technologies 2016c) or a Falcon (Novint 2017). Since they used the H3DAPI platform (SenseGraphics AB 2012), an open-source haptics software development platform that uses the open standards OpenGL and X3D, the system could interact with either of the two haptic devices. The 2D and 3D skull models were imported into the interface, where the haptic interaction was integrated. The user manipulated the skull model and had the feeling of touch through a pen-like stylus, easing the processes of diagnosis and surgery planning.
13.3.2.3 Industry
assistive techniques using the Phantom Omni. The reduction of these error rates and targeting times in industrial applications could improve productivity and efficiency in human–computer interaction operations. The techniques are based on a virtual plane designed with deformable cones and deformable switches to develop a haptic virtual switch for implementation on existing GUIs. For the experimental evaluation of the techniques, six measurements were defined in terms of characteristics of the clicking operation. Gravity wells and haptic cones were implemented: the first, based on a bounding volume with a spring force toward the center of that volume (Asque et al. 2014); the second, based on clamping the cursor to the apex at the target center, extracting the button position to embed the cones correctly into the mesh of a virtual plane. Finally, deformable virtual switches were developed to help people with physical disabilities target and operate different devices and interfaces accurately. The first experiment, a cursor analysis of the haptic assistance, showed significant improvements in the measures. The second experiment, on the effect of target size (small, medium, and large) and shape, showed that the haptic condition had a significant effect only for small and medium targets, and that target shape had a less significant effect on the participants' performance than the haptic condition and the target size.
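A gravity well of the type implemented by Asque et al. (2014) can be sketched as a spring force that activates only when the cursor enters the target's bounding volume. The snippet below is a generic illustration with an assumed stiffness k, not the authors' code.

import numpy as np

def gravity_well_force(cursor, target, radius, k=200.0):
    # Spring force toward the center of a bounding sphere around the target
    offset = target - cursor
    if np.linalg.norm(offset) > radius:
        return np.zeros(3)    # cursor outside the well: no assistance
    return k * offset         # inside: pull toward the target center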
Ni et al. (2017) noticed that programming remote robots for welding manipulation becomes difficult when the only feedback from the remote site is visual information. They proposed an AR application with an integrated haptic device. They used a display showing the real robot augmented with a virtual arm and end effector. The robot was manipulated remotely by moving the haptic device (PHANToM). The user received feedback from the remote robot on the haptic device before the end effector reached the welding surface, which helped keep a constant distance while the user defined the welding path. The workpiece was captured by a Kinect camera for 3D point cloud data acquisition. The virtual robotic arm was placed in the scene using a marker in the physical workspace of the real robot. The system was tested by ten users with no background in welding or robot programming. The test consisted of recording the path followed to weld two workpieces. The users could choose the welding points as they moved the remote virtual arm while seeing the real scene with the augmented robot on a display. The user-defined paths were within ±15 mm of the actual welding path.
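The constant-distance behavior can be sketched as a resistive force that grows as the haptic tool approaches the nearest point of the workpiece's point cloud. The snippet below is a hedged illustration; the standoff value and stiffness are assumptions, and the real system queried an implicit surface of the workpiece rather than the raw cloud.

import numpy as np

def approach_force(tool_pos, cloud, standoff=0.015, k=300.0):
    # cloud: (N, 3) array of workpiece surface points
    dists = np.linalg.norm(cloud - tool_pos, axis=1)
    i = int(np.argmin(dists))                  # nearest surface point
    d = dists[i]
    if d >= standoff:
        return np.zeros(3)
    direction = (tool_pos - cloud[i]) / d      # away from the surface
    return k * (standoff - d) * direction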
13.3.3 Entertainment
Entertainment has become such an important part of our lives that new studies have pursued the understanding of the "Psychology of Entertainment" (Invitto et al. 2016). Entertainment has been studied with a multidisciplinary approach, keeping in mind that it is related to learning, perception, emotions, communication, marketing, science, therapy, and other fields (Ricciardi and Paolis 2014). That might be the reason why entertainment is an attractive area of application for developers of haptic systems in virtual environments. In general, entertainment applications seek for the user to feel comfortable and immersed in the given environment, as described in Fig. 13.5. In the case of video games, immersion is usually accomplished through audio, graphics, and simple haptic feedback like vibration.
Other entertainment applications include haptic systems like the one developed by Magnenat-Thalmann et al. (2007). The system consisted of an interactive virtual environment where the user could style the hair of a virtual character. The user had the feeling of manipulating the hair by using different virtual tools to comb, wet, dry, and cut it. A SpaceBall 5000 (Spacemice 2017) and a Phantom Desktop (Sensable Technologies 2016b) were used, along with an algorithm called virtual coupling, based on physics modeling, for the haptic representation. The algorithm was also used to link the haptic device with the virtual tools in a stable way.
From Disney Research Laboratories, Bau and Poupyrev (2012) developed a system called REVEL. The system was based on AR, combining visual and haptic feedback for virtual and real objects inserted into reality. The visual feedback was delivered through a display that allowed the user to see reality with virtual objects inserted. The haptic feedback allowed the user to feel the virtual objects through reverse electrovibration (inducing an AC signal in the user instead of in the object). The system maintained a constant tactile sensation by dynamically adjusting the signal amplitude to compensate for the varying impedances. The signals generated and applied to the user were safe, since the current applied to the user was in the microampere range (max. 150 µA). When the user touched a real object, capacitive sensing of the touch occurred, and the augmented haptic feedback was delivered from a database. The touch sensing of virtual objects was optical, with the user's finger tracked through a Kinect. The virtual objects were inserted with markers. The system required an infrastructure previously prepared for the tactile augmentation when touching an object.
Sodhi et al. (2013), also from Disney Research Laboratories, developed the AIREAL project. AIREAL was a device that gave the user the feeling of free-air textures. The device consisted of a servo-actuated flexible nozzle that generated an air vortex and a camera, mounted on a gimbal structure, to measure the target's distance. The vortex control was based mainly on four dimensions: pulse frequency, intensity, location, and multiplicity. The user experience consisted of having an object projected, for example, over the hand, with the air haptic feedback coinciding with it in space and time. The system could synchronize with others of the same type to create a whole atmosphere. The authors tested the system by simulating an environment where the user felt seagulls flying around while seeing them in a computer game. The system presented by Sodhi et al. (2013) produced considerable noise and could make the user feel uncomfortable because of the position of the devices.
On the other hand, Ouarti et al. (2014) developed a test platform to differentiate between the visual, haptic, and visuo-haptic experiences of a user in a virtual world. The experiment of interest was the visuo-haptic one, where the system had the capacity to make the user feel like being inside an accelerating car. The user could see on a screen a video generated in a graphics engine to simulate the movement. The system had a Virtuose haptic device (Haption 2017) connected to a mechanism to simulate the movement when the car accelerated. The authors concluded that haptic feedback synchronized with the video is important for the user to be immersed in the game.
Israr et al. (2014) presented a story-telling application. Just like any other entertainment application, immersion was important for the user to have a satisfying experience. The application was oriented to kids. The system was capable of making the user feel like it was raining, like something was walking around, like a motor was starting, and so on. Each feeling was classified within a haptic vocabulary list and could be intensified as desired. The effects were also visual and auditory. The system had two modules. The first module was an arrangement of C-2 tactile vibrators, or tactors (Engineering Acoustics Inc. 2017), aligned along the back and waist (arranged in a vest). The second module was a graphic interface for manipulating the parameters of the vibrators (mainly time and intensity), which could be shown on a computer or a mobile device. When someone said a phrase from the haptic vocabulary, the corresponding vibration was produced.
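The haptic vocabulary can be pictured as a lookup table from phrases to vibration parameters (mainly time and intensity) that are then sent to the tactor array. The following sketch is a hypothetical illustration in that spirit; the phrase names, patterns, and fields are assumptions, not the published vocabulary.

HAPTIC_VOCABULARY = {
    "rain":  {"pattern": "random_taps", "intensity": 0.3, "duration_s": 2.0},
    "motor": {"pattern": "steady_buzz", "intensity": 0.7, "duration_s": 1.5},
}

def play_effect(phrase, scale=1.0):
    effect = HAPTIC_VOCABULARY.get(phrase.lower())
    if effect is None:
        return None
    # Intensity can be scaled ("intensified as desired") before the
    # command is sent to the vest's tactors
    return {**effect, "intensity": min(1.0, effect["intensity"] * scale)}

print(play_effect("rain", scale=1.5))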
Punpongsanon et al. (2015) developed a system called SoftAR, an AR application where the user could feel the softness of an object. The user could see a projection of a surface over a real object, and when the user touched it, he/she saw the deformation of the projected material. The projection created the haptic illusion. The authors also identified marketing and design applications of the system, to show clients or designers different options of materials to select from, based on the simulated softness.
Tables 13.2, 13.3, 13.4 and 13.5 show a summary of the studies described, representative of each area of application. From these tables, it should be noted that most of the applications use commercial haptic interfaces, so there are opportunities in the development of customized haptic devices. The design of customized devices could help in the development of more complex, cheaper, more specialized, and/or more precise applications to enhance the immersion of the user in a specific application. In the future, people will be able to touch virtual information in a natural way, just like interacting with the natural environment.
Nowadays, the educational applications of haptic systems in virtual environments have an impact on society, since digital technologies are increasingly being incorporated as teaching strategies. Although access to some of these technologies may not yet be at hand for everyone, one day haptic systems and virtual environments will be used as didactic materials, just as books and computers are used today. Certainly, with the advantage of access to smart mobile devices, it seems feasible to scale different applications for the classroom or for common use. Applying haptic systems in virtual environments to training for education has seemed difficult; the difficulty resides in the fact that a virtual experience might not be as good as gaining real experience. Multisensorial feedback has recently been taken as a solution to this problem: by integrating different technologies, such as computer vision, haptic feedback, and audio effects, it is possible to closely approximate real tasks. On the other hand, assistance applications for education have great potential for simplifying the teaching–learning process. Teachers will benefit from the flexibility of educational tools based on haptic systems in virtual environments. The use of multisensorial experiences through haptic systems and virtual environments has also taken place in medical applications.
On one hand, medical training applications have become very popular in the last decades. Their main limitation is the use of commercial haptic devices; the next step in haptic systems development for medical applications is to develop customized devices. Transparency is the main quality factor of a haptic device, and it could be improved by implementing low-friction actuators. The development of realistic simulators in virtual environments with haptic feedback has become an important trend, given that users in this field require the feeling of touch to gain the needed experience. On the other hand, medical assistance applications focus on enhancing the performance of the user with multiple channels of sensory feedback, such as visual and haptic.
Table 13.3 (Cont.) Researches of applications for haptic systems in virtual environments

References | Category | Area of application | Haptic feedback | Haptic interface | VE | Latency/real time/frequency | Graphic/haptic rendering | Subjects | Tracking | Display device
Bau and Poupyrev (2012) | Entertainment | Entertainment | Cutaneous | Reverse electrovibration system | AR | Real time; ~150 ms latency | ARToolkit library | 2 applications | Kinect Fusion | Projector; mobile device
Eck and Sandor (2013) | Assistance | Education | Kinesthetic | Phantom Omni | AR | 1000 FPS in the haptic loop and 30 FPS for graphical rendering | Parallel graphic render rate | 9 projects | Marker-based | Canon VH-2007 HWD
Sodhi et al. (2013) | Entertainment | Entertainment | Cutaneous | AIREAL | VR | ~139 ms latency of a vortex | – | 5 applications | Depth sensor PMD Camboard Nano; Kinect Fusion | Computer screen/tablet/projector
Díaz et al. (2014) | Assistance | Medicine | Kinesthetic and cutaneous | Phantom Premium and pedal | VR | Real time; 1 kHz sampling rate | dSPACE 1104; OpenGL; virtual springs model | 12 | Simulation | Computer screen
Ouarti et al. (2014) | Entertainment | Entertainment | Kinesthetic | Virtuose | VR | – | OpenGL | 17 | Simulation | Computer screen
Israr et al. (2014) | Entertainment | Entertainment | Cutaneous | C-2 tactors | VR | – | – | 85 | Button to enable the feeling described | iPad
Table 13.4 (Cont.) Researches of applications for haptic systems in virtual environments

References | Category | Area of application | Haptic feedback | Haptic interface | VE | Latency/real time/frequency | Graphic/haptic rendering | Subjects | Tracking | Display device
Lin (2014) | Training | Medicine | Kinesthetic | Omega.6 | VR | Real time; update rates of 1000 Hz for haptic rendering and 30 Hz for graphic rendering | CHAI3D; OpenGL; multi-threading computation environment | 25 | Polaris (NDI, Canada); simulation; markerless | Display300
C. T. Asque et al. (2014) | Assistance | Industry | Kinesthetic | Phantom Omni | VR | Real time | CHAI3D API | 6 | – | Computer screen
Abidi et al. (2015) | Training | Industry | Kinesthetic | Phantom Desktop | VR | Real time (physics–haptics engine communications) | OpenGL; GLUT libraries; PhysX; OpenHaptics | Case study: a blower house assembly | Simulation | Computer screen
Parinya and Kosuke (2015) | Entertainment | Entertainment | Visual | Pseudo-haptic | AR | Real time | NVIDIA GeForce GT520 2 GB | 17; 3 elastic objects simulated | Marker-based | NEC NP-L51WD (1280 × 800, 70 ANSI lumen)
Chowriappa et al. (2015) | Training | Medicine | Kinesthetic | Not specified | AR | Real time | – | 52 | – | –
Murphy and Darrah (2015) | Assistance | Education | Kinesthetic | Novint Falcon | VR | – | HSDK; GameStudio | 32 | Simulation | Computer screen
Table 13.5 (Cont.) Researches of applications for haptic systems in virtual environments

References | Category | Area of application | Haptic feedback | Haptic interface | VE | Latency/real time/frequency | Graphic/haptic rendering | Subjects | Tracking | Display device
Skulmowski et al. (2016) | Training | Education | Cutaneous | Stylus | VR | ~4 ms latency | Tracked position and rotation smoothed over 5 frames (approx. 83 ms) | 96 | Polhemus FASTRAK motion tracking system (six degrees of freedom, 60 Hz, 4 ms latency) | 24″ iiyama ProLite E2473HS screen (1920 × 1080 pixels)
Carlson et al. (2016) | Training | Industry | Kinesthetic and cutaneous | Phantom Omni/5DT Data Glove | VR | IS-900 at around 4 ms and the Patriot at around 17 ms | OpenSceneGraph; VR JuggLua | 52 | Polhemus Patriot magnetic tracker and InterSense IS-900 hybrid inertial and ultrasonic tracking system | Computer screen
Medellín-Castillo et al. (2016) | Assistance | Medicine | Kinesthetic | Phantom Omni/Falcon | VR | Haptic device latency | Microsoft Foundation Classes; Visualization Toolkit library; H3DAPI | 5 2D and 1 3D cephalometric radiographs; 21 dental surgeons | Simulation | Computer screen
Ni et al. (2017) | Assistance | Industry | Kinesthetic | Phantom device | AR | Real time | Point cloud data; implicit surface of workpieces | 10 | Marker-based | Computer screen
13.5 Conclusions
The technologies of VR, AR, and haptics are growing fast and have been of great interest to technology innovators. There are countless possibilities for applying these technologies. The studies described were classified into training, assistance, and entertainment, with the first two sub-classified according to their area of application: education, medicine, and industry. Nevertheless, several of the authors agree that the development of a haptic system in a virtual environment may have a multidisciplinary impact.
Users in all areas demand immersive experiences. The lack of the feeling of touch in virtual environments limits user immersion and could lead to a low-interest response from the user. Besides enhancing immersive experiences, training based on haptic virtual environments can improve the safe acquisition of technical and basic skills. On the other hand, assistance applications realize the benefits of haptic systems in virtual environments by enhancing operations in all areas of application.
References
Abidi, M., Ahmad, A., Darmoul, S., & Al-Ahmari, A. (2015). Haptics assisted virtual assembly.
IFAC-PapersOnLine, 48(3), 100–105.
ACM, Inc. (2017). ACM digital library. Retrieved from http://dl.acm.org/.
Aleotti, J., Micconi, G., & Caselli, S. (2016). Object interaction and task programming by demonstration in visuo-haptic augmented reality. Multimedia Systems, 22(6), 675–691.
Asque, C., Day, A., & Laycock, S. (2014). Augmenting graphical user interfaces with haptic
assistance for motion-impaired operators. International Journal of Human-Computer Studies,
72, 689–703.
Atif, A., & Saddik, A. E. (2010). AR-REHAB: An augmented reality framework for
poststroke-patient rehabilitation. IEEE Transactions on Instrumentation and Measurement,
59(10), 1–10.
Bau, O., & Poupyrev, I. (2012). REVEL: Tactile feedback technology for augmented reality. ACM
Transactions on Graphics, 89, 1–11.
Carlson, P., Vance, J., & BergNee, M. (2016). An evaluation of asymmetric interfaces for
bimanual virtual assembly with haptics. Virtual Reality, 20(4), 193–201.
Chowriappa, A., Raza, S., Fazili, A., Field, E., Malito, C., Samarasekera, D., et al. (2015).
Augmented-reality-based skills training for robot-assisted urethrovesical anastomosis: A
multi-institutional randomised controlled trial. BJU International, 115(2), 336–345.
Craig, A. B. (2013). Understanding augmented reality: Concepts and applications. Newnes.
Csongei, M., Hoang, L., Eck, U., & Sandor, C. (2012). ClonAR: Rapid redesign of real-world
objects. IEEE International Symposium on Mixed and Augmented Reality, 277–278.
CyberGlove Systems Inc. (2017). Overview. Retrieved from http://www.cyberglovesystems.com/
cybergrasp/.
Díaz, I., Gil, J., & Louredo, M. (2014). A haptic pedal for surgery assistance. Computer Methods
and Programs in Biomedicine, 116(2), 97–104.
Eck, U., & Sandor, C. (2013). HARP: A framework for visuo-haptic augmented reality. IEEE
Virtual Reality, 145–146.
Elsevier B.V. (2017). Explore scientific, technical, and medical research on sciencedirect.
Retrieved from http://www.sciencedirect.com/.
Emerald Publishing. (2017). Discover new things. Retrieved from http://www.emeraldinsight.com/.
Engineering Acoustics Inc. (2017). C2-HDLF. Retrieved from https://www.eaiinfo.com/product/
c2-lf/.
Faulhaber Group. (2017). DC-micromotors series 0615…S. Retrieved from https://www.faulhaber.
com/en/products/series/0615s/.
FLIR Integrated Imaging Solutions, Inc. (2017). Bumblebee2 1394a. Retrieved from https://www.
ptgrey.com/bumblebee2-firewire-stereo-vision-camera-systems.
Force Dimension. (2017). Omega.3. Retrieved from http://www.forcedimension.com/products/
omega-3/overview.
Han, G., Lee, J., Lee, I., & Choi, S. (2010). Effects of kinesthetic information on working memory
for 2D sequential selection task. IEEE Haptics Symposium, 43–46.
Han, I., & Black, J. (2011). Incorporating haptic feedback in simulation for learning physics.
Computers and Education, 2281–2290.
Haption SA. (2017). Virtuose 6D. Retrieved from https://www.haption.com/site/index.php/en/
products-menu-en/hardware-menu-en/virtuose-6d-menu-en.
Hassan, S., & Yoon, J. (2010). Haptic based optimized path planning approach to virtual
maintenance assembly/disassembly (MAD). In The 2010 IEEE/RSJ International Conference
on Intelligent Robots and Systems (pp. 1310–1315). Taipei, Taiwan: IEEE.
Hayward, V., Astley, O., Cruz-Hernandez, M., Grant, D., & Robles-De-La-Torre, G. (2004).
Haptic interfaces and devices. Sensor Review, 24, 16–29.
IEEE. (2017). IEEE Xplore digital library. Retrieved from http://ieeexplore.ieee.org/Xplore/home.
jsp.
Invitto, S., Faggiano, C., Sammarco, S., Luca, V., & Paolis, L. (2016). Haptic, virtual interaction
and motor imagery: Entertainment tools and psychophysiological testing. Sensors, 16(3), 1–17.
Israr, A., Zhao, S., Schwalje, K., Klatzky, R., & Lehman, J. (2014). Feel effects: Enriching
storytelling with haptic feedback. ACM Transactions on Applied Perception, 11(3), 1–14.
Lecuyer, A., Burkhardt, J.-M., & Tan, C.-H. (2008). A study of the modification of the speed and
size of the cursor for simulating pseudo-haptic bumps and holes. ACM Transactions on Applied
Perception, 5(13), 1–32.
Li, M., Sareh, S., Xu, G., Ridzuan, M., Luo, S., Xie, J., et al. (2016). Evaluation of pseudo-haptic
interactions with soft objects in virtual environments. PLoS One, 11(6), 1–17.
Lin, Y., Wang, X., Wu, F., Chen, X., Wang, C., & Shen, G. (2014). Development and validation
of a surgical training simulator with haptic feedback for learning bone-sawing skill. Journal of
Biomedical Informatics, 48, 122–129.
Lin, M., & Otaduy, M. (2008). Haptic rendering foundations, algorithms, and applications. A K
Peters.
Lindgren, R., Tscholl, M., Wang, S., & Johnson, E. (2016). Enhancing learning and engagement
through embodied interaction within a mixed reality simulation. Computers & Education, 95,
174–187.
Luo, Q., & Xiao, J. (2004). Physically accurate haptic rendering with dynamic effects. IEEE
Computer Graphics and Applications, 24(6), 60–69.
Magnenat-Thalmann, N., Montagnol, M., Bonanni, U., & Gupta, R. (2007). Visuo-haptic interface
for hair. In International Conference on Cyberworlds, 3–12.
Medellín-Castillo, H., Govea-Valladares, E., Pérez-Guerrero, C., Gil-Valladares, J., Lim, T., & Ritchie, J. (2016). The evaluation of a novel haptic-enabled virtual reality approach for computer-aided cephalometry. Computer Methods and Programs in Biomedicine, 130(C), 46–53.
Microsoft. (2017, March). Kinect fusion. Retrieved from https://msdn.microsoft.com/en-us/library/
dn188670.aspx.
Murphy, K., & Darrah, M. (2015). Haptics-based apps for middle school students with visual
impairments. IEEE Transactions on Haptics, 8(3), 318–326.
Ni, D., Yew, A., Ong, S., & Nee, A. (2017). Haptic and visual augmented reality interface for
programming welding robots. Advanced Manufacturing, 5(3), 191–198.
Neupert, C., Matich, S., Scherping, N., Kupnik, M., Werthscheutzky, R., & Hatzfeld, C. (2016).
Pseudo-haptic feedback in teleoperation. IEEE Transactions on Haptics, 9(3), 397–408.
Novint. (2017, March). Falcon technical specifications. Retrieved from http://www.novint.com/
index.php/novintxio/41.
Ogata, K. (1998). Ingeniería de Control Moderna. Pearson Educación.
Ouarti, N., Lécuyery, A., & Berthozz, A. (2014). Haptic motion: Improving sensation of
self-motion in virtual worlds with force feedback. IEEE Haptics Symposium, 167–174.
Oxford University Press. (2017). English oxford living dictionaries. Retrieved from https://en.
oxforddictionaries.com/.
Pacchierotti, C., Prattichizzo, D., & Kuchenbecker, K. (2016, February). Cutaneous feedback of
fingertip deformation and vibration for palpation in robotic surgery. IEEE Transactions on
Biomedical Engineering, 63(2), 278–287.
Pacchierotti, C., Tirmizi, A., & Prattichizzo, D. (2014). Improving transparency in teleoperation
by means of cutaneous tactile force feedback. ACM Transactions on Applied Perception, 11(1),
1–16.
Punpongsanon, P., & Kosuke, S. (2015). SoftAR: Visually manipulating haptic softness
perception in spatial augmented reality. IEEE Transactions on Visualization and Computer
Graphics, 21(11), 1279–1288.
Polhemus. (2017). FASTRAK. Retrieved from http://polhemus.com/motion-tracking/all-trackers/
fastrak.
Potkonjak, V., Gardner, M., Callaghan, V., Mattila, P., Guetl, C., Petrovic, V., et al. (2016).
Virtual laboratories for education in science, technology, and engineering: A review.
Computers & Education, 95, 309–327.
Rhienmora, P., Gajananan, K., Haddawy, P., Dailey, M., & Suebnukarn, S. (2010). Augmented
reality haptics system for dental surgical skills training. In VRST‘10 Proceedings of the 17th
ACM Symposium on Virtual Reality Software and Technology (pp. 97–98).
Ricciardi, F., & Paolis, L. (2014). A comprehensive review of serious games in health professions.
International Journal of Computer Games Technology, 1–14.
Rolland, J., Davis, L., & Baillot, Y. (2001). Survey of tracking technology for virtual
environments. In W. Barfield, & T. Caudell (Eds.), Fundamentals of wearable computers and
augmented reality (p. 836). CRC Press.
Sensable Technologies. (2016a). Geomagic phantom premium haptic devices. (Geomagic, Editor)
Retrieved from http://www.geomagic.com/es/products/phantom-premium/overview/.
Sensable Technologies. (2016b). Phantom desktop haptic device. Retrieved from http://www.
geomagic.com/archives/phantom-desktop/specifications/.
Sensable Technologies. (2016c). Phantom omni haptic device. (Geomagic, Editor) Retrieved from
http://www.geomagic.com/archives/phantom-omni/specifications/.
M. A. García-Terán (✉)
UACJ Department of Manufacturing and Industrial Engineering, Ciudad Juárez, Mexico
e-mail: angel.garcia@uacj.mx
M. A. García-Terán
CINVESTAV Ramos Arizpe, Coahuila, Mexico
E. Olguín-Díaz
Department of Robotics and Advanced Manufacturing, CINVESTAV,
Ramos Arizpe, Coahuila, Mexico
M. Gamboa-Marrufo
Department of Structures and Materials, UADY, Mérida, Yucatán, Mexico
A. Flores-Abad
University of Texas at El Paso, El Paso TX, USA
F. Tapia-Rodríguez
Department of Engineering, Universidad Panamericana de Guadalajara,
Zapopan, Jalisco, Mexico
14.1 Introduction
Birds have the ability to modify the position, attitude, and shape of both the wings and the tail independently, as well as the shape of their bodies, in order to develop a specific flight mode. Furthermore, birds change the attitude of their feathers during wing flapping. It is noteworthy that the flight modes of birds depend on the species, and the flight technique is particular to each individual bird, even among members of the same species (Gatesy and Dial 1993; Alexander 2002; Biewener 2003; Gottfried 2007; Tobalske 2007). From this, the definition of a bio-inspired morphing unmanned aerial vehicle (UAV) arises naturally as a UAV that changes its external shape during flight to adapt to the environment (Valasek 2011).
There are two opinions concerning the function of the tail during bird flight. Pennycuick (2008) affirms that the effects of the bird's tail are not significant during flight; hence, the flight modes of birds are developed by means of the wings. On the other hand, Tucker (1992), Gatesy and Dial (1993), Thomas (1993), Gottfried (2007), and Su et al. (2012) show that this element is very important during different locomotion movements. In the same sense, these works establish that it is necessary to study the physical contributions of the tail because of the complexity of understanding its functions during flight (Tucker 1992; Kirmse 1998; Biewener 2003; Videler 2005; Shyy et al. 2008).
The work proposed by Alexander (2002) established that birds use the tilt movement of the tail to counteract the tilting of the wings. It is noteworthy that the results showed that the effects of the wings were predominant and the effects of the tail were not important. In the same sense, Pennycuick (2008) stated that the tail of birds produces neither a lift force nor a moment; hence, the tail does not improve the stability of birds. Notwithstanding, those results suggested that birds modify the angle of attack of each wing to control their movements. On the other hand, Tucker (1992), Gatesy and Dial (1993), and Su et al. (2012) established that the tail of birds is an important element for taking off, for landing, for developing acrobatic movements, and for different flight modes. These works concluded that the interaction between the tail and the wings is not clear because of the problems involved in the measurement process. Hence, in Su et al. (2012) the hover flight was studied, because this flight mode was considered appropriate for analyzing the interactions between the tail and the wings. In accordance with Su et al. (2012), the birds changed both the attitude and the area of the tail (spreading and folding it) to modify the aerodynamic forces. These changes improved the stability and maneuverability of the birds, because the lift and drag forces were related to the area and the angle of attack of the tail. The birds then synchronized the tilting, folding, and spreading of the tail with the wings to recover the body posture.
On the other hand, Gottfried (2007) defined the contribution of the tail to directional stability as a function of the sideslip angle and the lift coefficient. The electrical activity of the tail muscles of pigeons for different locomotion movements and flight modes was presented in Gatesy and Dial (1993). The results showed that the electrical activity changed during the transition phases from take-off to flight and from flight to landing, as well as between flight modes. It is noteworthy that, in accordance with the results, flapping flight requires more contribution from the tail. Tucker (1992) analyzed the pitching equilibrium of the Harris hawk when the primary wing feathers were clipped. The hawks had problems achieving equilibrium when gliding as the percentage of clipped feathers increased. The results showed that the hawks spread the tail and changed the position of the wings to achieve longitudinal stability. On the other hand, in Thomas (1993) the aerodynamic properties of a bird's tail were determined. The work proposed a model based on slender lifting-surface theory, and it concluded that there are two situations where birds need the forces produced by the tail. The first is at slow velocities, where longitudinal instability appears; the second is during acrobatic movements and hover flight, where large control forces are required. These effects were produced by both the upward and downward movements of the tail to cancel the longitudinal unbalance and the banking motion.
The study and development of bio-inspired morphing UAVs is focused on new designs, materials, mechanisms, dynamic modeling, and controllers, all of them typically implemented on the wings (Valasek 2011; Paranjape et al. 2011a, b, 2012a, b). However, studies of bio-inspired empennages or bio-inspired tails are limited, because most of the time the aerodynamic effects of the empennage are considered negligible. Nevertheless, the main contributions of the empennage to a fixed-wing aerial vehicle are given by the moments produced by the aerodynamic forces, which are important for both the longitudinal and the lateral–directional stability.
The analysis of the stability of soaring birds using an RC model airplane with dimensions and weight similar to those of a raven is presented in Hoey (1992). The vehicle included an articulated empennage, which can develop both the bank motion and the tilt motion, as well as flaps on the lower surface of the wings. The results showed that the direction of the lateral force was defined by the attitude of the empennage. Moreover, there were at least two combinations that produced the same sign. The bank motion of the empennage affects not only the lateral–directional stability but also the airplane pitch motion. The results suggested that soaring birds control their lateral stability by means of adverse-yaw effects, and also that they use the dihedral angles and the motion of the tail when performing rapid turns. The work affirms that soaring birds use the dihedral angle more to stabilize flight when soaring than when gliding.
In Leveron (2005), Higgs (2005), and Rivera-Parga et al. (2007), the design, testing, and dynamic model of a micro-aerial vehicle (MAV) with a 2-DOF articulated empennage, intended as a portable aerial vehicle, were presented. The work determined the behavior of the MAV and the aerodynamic effects produced by the attitude changes of the empennage. The tail included both bank and tilt motions with respect to the MAV body. Tests were developed in a wind tunnel at low velocity, where the forces and moments on the vehicle were measured for different empennages and different attitude settings. The results showed that both the longitudinal and the lateral–directional stability of the vehicle were affected by the empennage. The results suggested that the empennage acted as a spoiler under specific conditions, because the lift did not increase despite the attitude changes; however, the longitudinal stability was not compromised, because the lift, drag, and pitching moment were only mildly affected. On the other hand, both the direction of the lateral force and the yaw moment were affected by the tilt movement of the empennage, with values similar to those of a typical airplane. Notice that these results were consistent with the results of Hoey (1992). In the same sense, the attitude of the empennage produces slight changes in the roll moment of the vehicle, but these effects can improve the orientation. The research showed that the empennage motions modify the attitude of the MAV; nevertheless, the longitudinal and lateral–directional stability are coupled, and it is necessary to define a control strategy.
The longitudinal stability and controllability of an ornithopter that included a variable tail with one DOF were presented in Han et al. (2008). The work proposed the dynamic model of the vehicle, and, using a path-following controller, the flapping frequency and the tilt angle of the tail were adjusted. The results suggested that the synchronization of the wings and the tail guarantees the longitudinal stability; it is important to note that the lateral stability was not analyzed. In the same sense, the design, perching control, and experimental tests of MAVs having articulated wings and a variable-tilt horizontal flat tail (lacking the vertical stabilizer) were presented in Paranjape et al. (2011a, b, 2012a, b). The dynamic model included the inertial effects produced by the attitude changes of the wings; furthermore, a perching control strategy was proposed based on the variable dihedral angles of the wings and the tilt motion of the tail. Experimental results showed that, while the lateral–directional motion was stabilized, the system required large control effort and the yaw dynamics became slow because of the lack of the vertical stabilizer.
The design of a bio-inspired autonomous aircraft with a rotatable empennage was presented in Muller et al. (2015). The vehicle includes five different airfoils per wing and an empennage with two DOF, which consists of an open kinematic chain, and a set of ailerons as control surfaces. The vehicle was tested in a wind tunnel, and the results were consistent with those presented previously in Hoey (1992) and Rivera-Parga et al. (2007). The previous works consider that the tail of birds has two degrees of freedom (the bank and tilt motions). However, in accordance with the observation process, there are at least two more DOF, which correspond to the pan motion and the capability of both folding and spreading the tail. Hence, in this work, the design of a bio-inspired empennage which includes the bank, tilt, and pan motions is presented, in order to mimic the main movements of the tail of birds and to analyze their effects.
14.3 Background
The aerodynamic forces and moments arise from the interaction between a body and the air (fluid) flow. These effects are defined to be a function of a set of dimensionless coefficients which depend on the shape of the airfoil, and they are weighted by some geometric parameters, the relative velocity, the angle of attack, the sideslip angle, and their rates of change (Stevens and Lewis 2003; Stengel 2004; Cook 2007). The aerodynamic coefficients of an aerial vehicle include the aerodynamic contributions of the vehicle components (the wings, the vertical stabilizer, the horizontal stabilizer, and the fuselage) and the control surfaces. It is noteworthy that the coefficients are affected by the position and attitude of those vehicle elements; hence, there is a set of coefficients for each configuration of the same vehicle. This is one of the reasons why the analysis of bio-inspired aerial vehicles is complicated. However, an aerodynamic sectional approach can recover the qualitative behavior of an aerial vehicle; it defines the aerodynamic forces and moments as the sum of the contributions acting on each vehicle component (Noth 2008; Roscam 2003; Olguín-Díaz and García-Terán 2014). The aerodynamic effects produced either by an articulated wing or by an empennage can be expressed as a function of its attitude using the appropriate transformations. Furthermore, the aerodynamic effects can be expressed by means of either a pair of 3D vectors (force and moment vectors) or a single 6D vector (known as a wrench) that includes both the force and the moment vectors.
The set of variables that define the position and attitude of a rigid body with respect to either an inertial reference frame or a local reference frame is known as the pose of the rigid body. Let $R_0$ be an inertial reference frame (Earth-fixed frame) and $R_1$ a local reference frame rigidly attached to a body at point 1, as shown in Fig. 14.1. Vector $r^{(0)} = (x, y, z)^T \in \mathbb{R}^3$ is the position of the local reference frame $R_1$ with respect to the inertial reference frame $R_0$, expressed in coordinates of $R_0$. Both reference frames can be related by means of a rotation matrix $R_0^1(\theta) \in SO(3)$ that is parameterized by an attitude vector $\theta \in \mathbb{R}^m$ (for $m = \{3, 4\}$). Notice that the dimension of the vector $\theta$ depends on the attitude representation; hence, there are multiple ways to parameterize the matrix $R_0^1(\theta)$. In aeronautics, the attitude is typically represented by the "roll-pitch-yaw" angles vector $\theta = (\phi, \theta, \psi)^T$. Therefore, for this attitude representation the rotation matrix $R_0^1(\theta)$ is given by (14.1), where the trigonometric functions $\sin(x)$ and $\cos(x)$ are abbreviated for simplification as $s_x = \sin(x)$ and $c_x = \cos(x)$ (Stevens and Lewis 2003; Stengel 2004; Olguín-Díaz and García-Terán 2014).
$$
R(\theta) = \begin{bmatrix}
c_\psi c_\theta & -s_\psi c_\phi + c_\psi s_\theta s_\phi & s_\psi s_\phi + c_\psi s_\theta c_\phi \\
s_\psi c_\theta & c_\psi c_\phi + s_\psi s_\theta s_\phi & -c_\psi s_\phi + s_\psi s_\theta c_\phi \\
-s_\theta & c_\theta s_\phi & c_\theta c_\phi
\end{bmatrix} \qquad (14.1)
$$
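As a quick numerical check of Eq. (14.1), the following Python sketch builds the roll-pitch-yaw rotation matrix and verifies its orthogonality ($R R^T = I$); the function name is illustrative.

import numpy as np

def rotation_rpy(phi, theta, psi):
    # Roll-pitch-yaw rotation matrix of Eq. (14.1)
    sp, cp = np.sin(phi), np.cos(phi)
    st, ct = np.sin(theta), np.cos(theta)
    ss, cs = np.sin(psi), np.cos(psi)
    return np.array([
        [cs * ct, -ss * cp + cs * st * sp,  ss * sp + cs * st * cp],
        [ss * ct,  cs * cp + ss * st * sp, -cs * sp + ss * st * cp],
        [-st,      ct * sp,                 ct * cp]])

R = rotation_rpy(0.1, 0.2, 0.3)
assert np.allclose(R @ R.T, np.eye(3))   # orthogonality: R^{-1} = R^T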
Since the above-mentioned reference frame $R_1$ may be used as the root frame of the aerial vehicle, additional frames for each aerodynamic section may be of use. Let the reference frame $R_a$ be the aerodynamic frame, as shown in Fig. 14.1, which is placed at the aerodynamic center of the wing (or section) and oriented along the chord line of the wing. This reference frame may be parameterized by means of the dihedral angle $\varphi$, the incidence angle $\alpha$, and the sweep angle $\Lambda_i$ of the section (Olguín-Díaz and García-Terán 2014). In accordance with the aerodynamic sectional modeling, if the wing is hinged, the matrix $R_1^a$ is parameterized by means of the additional movements; the rotation matrix $R_1^a(\varphi, \alpha, \Lambda) \in SO(3)$ is constant for constant parameters $\varphi, \alpha, \Lambda$.
The absolute rotation matrix of the aerodynamic frame with respect to the inertial one can be made by composed rotations in the appropriate order: $R_0^a = R_0^1 R_1^a$ (where arguments are excluded only for simplification of the notation). Moreover, since the rotation matrix is orthogonal, it follows that $(R_0^a)^{-1} = (R_0^a)^T$. Notice that the position of $R_a$ with respect to $R_0$, expressed in the inertial reference frame $R_0$, is the addition of the vector $r^{(0)}$ and the vector $r_a^{(a)}$, in accordance with Eq. (14.2). Then, the position and attitude of any point that belongs to the body can be determined by means of this procedure (Siciliano et al. 2009).

$$r_a^{(0)} = r^{(0)} + R_0^1 R_1^a\, r_a^{(a)} = r^{(0)} + R_0^a\, r_a^{(a)} \qquad (14.2)$$
An extended vector is given by the pair $(a, b)$, where $a \in \mathbb{R}^3$ is a linear vector and $b \in \mathbb{R}^3$ is a free vector, both belonging to the Euclidean space and expressed in the same reference frame. It is noteworthy that there are different extended vectors, and each one has a specific meaning and mathematical properties (Featherstone 2010a, b). Let $\nu_b^{(1)} \in \mathbb{R}^6$ be the extended velocity vector called twist, and $F_b^{(1)} \in \mathbb{R}^6$ the extended force vector or wrench, both in the reference frame $R_1$ (Fig. 14.1) and expressed by Eqs. (14.3) and (14.4):

$$\nu_b^{(1)} = \begin{pmatrix} v_b^{(1)} \\ \omega_b^{(1)} \end{pmatrix} \in \mathbb{M} \subseteq \mathbb{R}^6 \qquad (14.3)$$

$$F_b^{(1)} = \begin{pmatrix} f_b^{(1)} \\ n_b^{(1)} \end{pmatrix} \in \mathbb{F} \subseteq \mathbb{R}^6 \qquad (14.4)$$

where $\nu_b^{(1)}$, which belongs to the motion space $\mathbb{M} \subseteq \mathbb{R}^6$, contains the linear velocity $v_b^{(1)} \in \mathbb{R}^3$ of the point $b$ and the angular velocity of the body to which the point belongs; notice that both vectors describe the motion of the body. On the other hand, the wrench $F_b^{(1)}$, which belongs to the force space $\mathbb{F} \subseteq \mathbb{R}^6$, contains the force vector $f_b^{(1)} \in \mathbb{R}^3$ that acts on the point $b$ and the moment vector $n_b^{(1)} \in \mathbb{R}^3$ acting on the body, all of them expressed in the local reference frame $R_1$.
The extended vectors can be expressed between any two reference frames, e.g., $R_0$ and $R_1$, by means of an extended rotation $\mathcal{R}_0^1$ and an extended translation $T(r_{c/b}^{(1)})$, presented in Eqs. (14.5) and (14.6), respectively, where $[r]_\times$ denotes the skew-symmetric cross-product matrix of $r$:

$$\mathcal{R}_0^1 = \begin{bmatrix} R_0^1 & 0 \\ 0 & R_0^1 \end{bmatrix} \qquad (14.5)$$

$$T\!\left(r_{c/b}^{(1)}\right) = \begin{bmatrix} I & \left[r_{c/b}^{(1)}\right]_\times \\ 0 & I \end{bmatrix} \qquad (14.6)$$
Using the operators in Eqs. (14.5) and (14.6), both twists and wrenches can be transformed between any two reference frames. For instance, between $R_0$ and $R_1$ it follows (Olguín-Díaz and García-Terán 2014):

$$\nu_c^{(1)} = T\!\left(r_{c/b}^{(1)}\right) \nu_b^{(1)}, \qquad \nu_b^{(1)} = T^{-1}\!\left(r_{c/b}^{(1)}\right) \nu_c^{(1)} = T\!\left(-r_{c/b}^{(1)}\right) \nu_c^{(1)}$$

$$\nu_b^{(0)} = \mathcal{R}_0^1\, \nu_b^{(1)}, \qquad \nu_b^{(1)} = \mathcal{R}_0^{1T}\, \nu_b^{(0)}$$

$$F_c^{(1)} = T^{-T}\!\left(r_{c/b}^{(1)}\right) F_b^{(1)}, \qquad F_b^{(1)} = T^{T}\!\left(r_{c/b}^{(1)}\right) F_c^{(1)} = T^{-T}\!\left(-r_{c/b}^{(1)}\right) F_c^{(1)}$$

$$X\!\left(d_{a/b}, R_j^i\right) \triangleq \mathcal{R}_j^{iT}\, T\!\left(d_{a/b}^{(j)}\right) = T\!\left(d_{a/b}^{(i)}\right) \mathcal{R}_j^{iT} \qquad (14.7)$$

The first term of Eq. (14.7) is defined by rotating an extended vector that was previously translated, and the second one is obtained by translating the previously rotated vector; then, twists and wrenches are directly transformed as shown in the following expressions:

$$\nu_a^{(i)} = X\!\left(d_{a/b}, R_j^i\right) \nu_b^{(j)} \qquad (14.8)$$

$$F_a^{(i)} = X^{-T}\!\left(d_{a/b}, R_j^i\right) F_b^{(j)} \qquad (14.9)$$
$$f_A^{(w)} = \bar{q}\, S \begin{pmatrix} C_D(\cdot) \\ C_Y(\cdot) \\ C_L(\cdot) \end{pmatrix} \qquad (14.10)$$

$$n_A^{(a)} = \bar{q}\, S \begin{pmatrix} b\, C_l(\cdot) \\ \bar{c}\, C_m(\cdot) \\ b\, C_n(\cdot) \end{pmatrix} \qquad (14.11)$$

relative attitude of the aerodynamic frame $R_a$ with respect to the wind frame $R_w$, which can be represented by the following matrix:

$$R_{A_i}(\alpha, \beta) = R_w^a = \begin{bmatrix} c_\alpha c_\beta & s_\beta & s_\alpha c_\beta \\ -c_\alpha s_\beta & c_\beta & -s_\alpha s_\beta \\ -s_\alpha & 0 & c_\alpha \end{bmatrix} \in SO(3) \qquad (14.12)$$
being parameterized by means of the angle of attack $\alpha = \arctan(v_{r_z}/v_{r_x})$ and the sideslip angle $\beta = \arcsin(v_{r_y}/v_r)$, which are functions of the relative wind velocity vector $v_r^{(a)}$ (Stevens and Lewis 2003).
The relative velocity $v_r^{(a)}$ is computed as the difference between the section (wing) velocity $v_a^{(a)}$ and the wind velocity $v_w^{(0)}$. For a proper operation, both velocities need to be expressed with respect to the same reference frame, for instance, the aerodynamic frame of the section $R_a$:

$$v_r^{(a)} = v_a^{(a)} - R_0^{aT}\, v_w^{(0)} \qquad (14.13)$$
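As a small worked illustration of these definitions (a sketch with made-up velocity values, not experimental data), the angle of attack and sideslip angle can be computed directly from the relative velocity components:

import numpy as np

def aero_angles(v_rel):
    # alpha = arctan(v_rz / v_rx); beta = arcsin(v_ry / |v_rel|)
    vx, vy, vz = v_rel
    alpha = np.arctan2(vz, vx)
    beta = np.arcsin(vy / np.linalg.norm(v_rel))
    return alpha, beta

# Section velocity minus wind velocity, both in the same frame (Eq. 14.13)
v_r = np.array([20.0, 1.0, 2.0]) - np.array([-2.0, 0.0, 0.0])
print(aero_angles(v_r))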
Based on the concept of the EMO operator, the aerodynamic wrench $F_A^{(a)}$ can be expressed in any reference frame by means of Eq. (14.14), where $X_A$ is parameterized by the distance between $R_a$ and $R_1$ and by the rotation matrix that relates both reference frames. Notice that the wrench is a function of the relative wind twist $\nu_r^{(a)}$, which is computed by means of Eq. (14.15).

$$F_A^{(1)} = X_A^T\, F_A^{(a)}\!\left(\nu_r^{(a)}\right) \qquad (14.14)$$

$$\nu_r^{(a)} = X_A\, \nu_1^{(1)} - \mathcal{R}_0^{aT}\, \nu_w^{(0)} \qquad (14.15)$$
The term $\mathcal{R}_0^a$ in Eq. (14.15) is the extended rotation which relates the $R_0$ and $R_a$ reference frames. In the same expression, the term $\nu_w^{(0)}$ is the wind twist, which represents the environmental effects and is referred to the inertial reference frame. Then, in accordance with Eq. (14.14), it is possible to express the aerodynamic effects of an aerodynamic body with respect to any reference frame using the appropriate transformations.
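The extended operators of Eqs. (14.5)–(14.9) translate directly into a few lines of linear algebra. The following Python sketch (with numpy) transforms a twist and a wrench between frames; the sign convention of the skew-symmetric block in T follows the reconstruction of Eq. (14.6) above and should be checked against the source convention.

import numpy as np

def skew(r):
    # Cross-product (skew-symmetric) matrix [r]_x
    return np.array([[0, -r[2], r[1]],
                     [r[2], 0, -r[0]],
                     [-r[1], r[0], 0]])

def ext_rotation(R):
    # Extended rotation of Eq. (14.5): block-diagonal [R, 0; 0, R]
    return np.block([[R, np.zeros((3, 3))], [np.zeros((3, 3)), R]])

def ext_translation(d):
    # Extended translation of Eq. (14.6): [I, [d]_x; 0, I]
    return np.block([[np.eye(3), skew(d)], [np.zeros((3, 3)), np.eye(3)]])

def emo(d, R):
    # EMO operator of Eq. (14.7): X = R^T T(d)
    return ext_rotation(R).T @ ext_translation(d)

X = emo(np.array([0.1, 0.0, 0.0]), np.eye(3))
twist_b = np.array([1.0, 0, 0, 0, 0, 0.5])
twist_a = X @ twist_b                      # Eq. (14.8)
wrench_b = np.array([0, 0, 10.0, 0, 0, 0])
wrench_a = np.linalg.inv(X).T @ wrench_b   # Eq. (14.9): inverse transpose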
The design of bio-inspired UAVs with an articulated empennage typically considers that this aerodynamic body includes two DOF; however, in accordance with the analysis of the flight of birds, these living beings have the capability to develop at least four movements. The design of a bio-inspired empennage presents different problems related to the size, the weight, and the number of DOF, as well as undesirable effects related to the shape of the empennage. In the following sections, the design of a bio-inspired empennage is presented, where the PT-40 RC model from Great Planes® was considered to define the dimensions and the shape of the tail. Moreover, in order to analyze the static stability of the bio-inspired empennage, an experimental testbed was built to measure the forces and moments for different attitudes.
Based on the dynamic flight of birds and the movements of their tails, an empennage with three DOF was designed. The empennage consists of a single flat surface, which corresponds to the horizontal stabilizer of the PT-40 RC model. Figure 14.2 presents a 3D view of the mechanism, which consists of a set of conical gears that develop both the bank (Φ) and tilt (Θ) movements by combining the rotations of two servomotors. The third movement, which corresponds to the pan motion (Ψ), is produced by a pair of conical gears that changes the rotation axis in order to produce a rotation about the vertical axis. This arrangement reduces the size of the mechanism and allows all of the rotation axes to intersect at the same point, which reduces the complexity of the extended vector transformations.
Figure 14.3 shows the design and manufacture of the tail, which was made of two wood plates; some material was removed to reduce weight, and the plates were covered with polyester film.
I-shaped beam was included to firmly fix and locate the bio-inspired empennage at
the center of the wind tunnel test section, as shown in Fig. 14.5. The pitot tube was
aligned with the wind flow in order to measure the wind velocity.
Reaction forces and moments were measured with a JR3 six-axis force sensor (model 67M25A3-140-DH) and a DSP-based PCI receiver card that works at 33 MHz with 32-bit resolution. Table 14.1 presents the most relevant characteristics of the force sensor.
Figure 14.6 presents a block diagram of the instrumentation of the system. The attitude of the bio-inspired empennage was controlled by an open-source electronic board, which receives the angular set points from MATLAB through the computer serial port. The MATLAB program selects the direction, the angular displacement, and the sequence of rotations needed to achieve the desired attitude. The wind velocity was measured with a PCE-P01 manometer from PCE Instruments and a pitot tube. The reaction forces and moments, together with the wind velocity, were recorded in a text file.
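The original acquisition program ran in MATLAB; the Python sketch below (using the pyserial package) only illustrates the same serial workflow of sending an attitude set point to the board and appending a reading to a text file. The port name, baud rate, and message format are assumptions, not the actual protocol of the board.

import serial  # pyserial

# Hypothetical port and command format for the open-source board
ser = serial.Serial('COM3', 115200, timeout=1)
ser.write(b'PHI:10;THETA:-5;PSI:0\n')        # assumed set-point message

with open('measurements.txt', 'a') as log:
    line = ser.readline().decode().strip()   # assumed reply with a reading
    log.write(line + '\n')
ser.close()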
The objective of the experiments is to define the relationship between the attitude of the bio-inspired empennage and the aerodynamic forces and moments. The tests consist of measuring the reaction forces and moments produced by the aerodynamic effects over a given range of attitude combinations of the empennage. A proper transformation from the center of the sensor, corresponding to the experimental setup, must be performed to produce the force measurements in the appropriate coordinates at the aerodynamic center of the bio-inspired empennage.
These transformations are computed considering the reference frames defined in Fig. 14.7 and the EMO operator given by Eq. (14.7). The force sensor's reference frame is located at its own center, and the aerodynamic reference frame is at the aerodynamic center, at 25% of the chord line.
Three tests at the same constant wind velocity of $V_w = 20$ km/h were designed. The attitude is changed by combining two movements, termed the basic movement and the secondary movement. Table 14.2 presents a summary of the movements, ranges, and increments ($\Delta B$ for the basic and $\Delta S$ for the secondary movement, respectively) for each test. The first test was developed considering the pan motion $\Psi$ as the basic movement and the bank motion $\Phi$ as the secondary movement; the increments for both movements were $\Delta\Psi = \Delta\Phi = 8^\circ$. The second and third tests combine the pan-tilt ($\Psi$-$\Theta$) and tilt-bank ($\Theta$-$\Phi$) motions, respectively. Both the basic and the secondary movements were either increased or decreased in stepwise angular positions.
Table 14.3 presents the sequence and the values for each step, where the first column corresponds to the values of the basic movement and the first row contains the values of the secondary movement, so that the forces and moments were measured for each combination.
The following pseudocode summarizes the procedure for each test:

for i = 1:1:n
    apply basic movement ΔB(i)
    for j = 1:1:m
        apply secondary movement ΔS(j)
        k = 1
        for t = 0:Δt:20
            F^(S)(i,j,k) = read(JR3 sensor)
            k = k + 1
        end
        F_A^(S)(i,j) = mean(F^(S)(i,j,:))
    end
end
where i and j correspond to the row (basic movement) and column (secondary movement) positions, respectively. In accordance with the pseudocode, each test consists of establishing the value of the basic movement by selecting the ith row. For each position of the basic movement, a sweep over all values of the secondary movement is made by selecting the jth column. Forces and moments were measured at a sampling period of 0.001 s during 20 s for each combination, and the average of these measurements was recorded at position (i, j) of Table 14.3. This process defines a complete measurement and, to guarantee stable results, the average of six cycles was computed. Figure 14.8 presents the attitude of the bio-inspired empennage for the first (left) and third (right) tests.
The data measured by the sensor were transformed in order to obtain the aerodynamic forces and moments at the aerodynamic center of the bio-inspired empennage. The corresponding transformation was computed in accordance with Eq. (14.7), which is a function of the attitude of the empennage and the position of the aerodynamic center. The relationship between the attitude of the empennage and the aerodynamic coefficients was defined by a multiple regression analysis based on least-squares theory. The analysis of the results showed that the data can be approximated by a high-order (fifth-order) polynomial function, where the model includes interactions between the movements.
Each observed value is a function of the independent variables ($\Phi$, $\Theta$, and $\Psi$) and a set of regression coefficients, in accordance with the following expression:

$$C_{xi} = M_p B_{xi} \qquad (14.16)$$

where $M_p$ is the predictor matrix that expresses the relationship between the independent variables; it is represented by Eq. (14.17), where the subscript $n$ denotes the $n$th sample of the independent variables. In Eq. (14.16), the term $B_{xi}$ represents the regression coefficients vector, expressed by Eq. (14.18).

Fig. 14.8 Tests two and three. The figure shows the attitude of the stabilizer due to the bank (left) and tilt (right) motions
$$M_p = \begin{pmatrix} 1 & \Phi_1 & \Theta_1 & \Psi_1 & \Phi_1^2 & \Phi_1\Theta_1 & \Phi_1\Psi_1 & \Theta_1^2 & \cdots & \Psi_1^5 \\ 1 & \Phi_2 & \Theta_2 & \Psi_2 & \Phi_2^2 & \Phi_2\Theta_2 & \Phi_2\Psi_2 & \Theta_2^2 & \cdots & \Psi_2^5 \\ 1 & \Phi_3 & \Theta_3 & \Psi_3 & \Phi_3^2 & \Phi_3\Theta_3 & \Phi_3\Psi_3 & \Theta_3^2 & \cdots & \Psi_3^5 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \Phi_n & \Theta_n & \Psi_n & \Phi_n^2 & \Phi_n\Theta_n & \Phi_n\Psi_n & \Theta_n^2 & \cdots & \Psi_n^5 \end{pmatrix} \qquad (14.17)$$
$$B_{xi} = \begin{pmatrix} b_0(\cdot) \\ b_1(\cdot) \\ b_2(\cdot) \\ \vdots \\ b_n(\cdot) \end{pmatrix} \qquad (14.18)$$
Finally, $C_{xi}$ is the vector of observed values, which correspond to the experimental data; see Eq. (14.19). Notice that both Eqs. (14.18) and (14.19) are general expressions, so there is a pair of vectors $B_{xi}$ and $C_{xi}$ for each aerodynamic coefficient.
$$C_{xi} = \begin{pmatrix} c_0(\cdot) \\ c_1(\cdot) \\ c_2(\cdot) \\ \vdots \\ c_n(\cdot) \end{pmatrix} \qquad (14.19)$$
Figure 14.9 shows the raw data of the experimental tests. To facilitate the analysis of the experimental data, a multiple regression was applied according to Eq. (14.16), which leads to the definition of the six coefficient vectors of the form given by Eq. (14.18). Numerical results are presented in Tables 14.3-14.8 of the appendix. It is noteworthy that the regression model has a correlation greater than 95% in all cases. Using the obtained coefficients, the tests were reproduced, and the results are presented in Fig. 14.10.
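A minimal NumPy sketch of this kind of fit is shown below: it assembles a predictor matrix with all monomials of the three attitude angles up to fifth order, including the interaction terms of Eq. (14.17), and solves Eq. (14.16) in the least-squares sense. The data arrays and names are hypothetical stand-ins for the measured attitudes and one aerodynamic coefficient.

import numpy as np
from itertools import combinations_with_replacement

def predictor_matrix(phi, theta, psi, degree=5):
    # Columns of Eq. (14.17): 1, Phi, Theta, Psi, Phi^2, Phi*Theta, ...
    variables = [np.asarray(phi), np.asarray(theta), np.asarray(psi)]
    cols = [np.ones_like(variables[0])]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(3), d):
            term = np.ones_like(variables[0])
            for idx in combo:
                term = term * variables[idx]
            cols.append(term)
    return np.column_stack(cols)  # 56 columns for degree 5, matching b0..b55

phi_data = np.random.uniform(-0.3, 0.3, 200)    # hypothetical samples (rad)
theta_data = np.random.uniform(-0.3, 0.3, 200)
psi_data = np.random.uniform(-0.3, 0.3, 200)
c_observed = np.random.normal(size=200)         # stand-in for one coefficient

# One regression per aerodynamic coefficient: C_xi = M_p @ B_xi (Eq. 14.16)
Mp = predictor_matrix(phi_data, theta_data, psi_data)
B_xi, residuals, rank, sv = np.linalg.lstsq(Mp, c_observed, rcond=None)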
Figure 14.11 presents the results for the first test, where one can observe that the combination of both movements produces changes in all of the aerodynamic coefficients. According to the graph of the drag coefficient $C_D$, there are different combinations that produce the same output value. Notice that for the side force coefficient $C_Y$, which affects the lateral-directional stability, the value changes in magnitude and sign as a function of the pan motion; that is, the direction of the lateral force can be reversed through this movement. On the other hand, the magnitude of the yawing moment coefficient $C_n$ changes due to the pan motion, and for a specific value of the bank motion the pan motion changes the sign of the coefficient. It is noteworthy that the combination of the maximum and minimum values of these movements produces changes in both the sign and the amplitude of the rolling moment coefficient $C_l$. The bank motion changes the magnitude and direction of the lift coefficient $C_L$, and these effects are enhanced by the pan motion. The pitch moment coefficient $C_m$ exhibits both sign and amplitude changes, and there are different combinations that produce the same value. Notice that these combinations do not produce the same values for the rest of the coefficients.
Figure 14.12 shows the results of the second test, where it is possible to observe that the tilt motion principally affects the longitudinal stability ($C_D$, $C_L$, $C_m$) if the pan motion is null; however, when this condition is not fulfilled, the lateral-directional stability is also modified. According to the graphs, the coefficients ($C_D$, $C_L$, $C_m$) are maximized by the pan motion, with the maximum values produced by the combination of the maximum values of both motions; this behavior is expected due to the increase of the angle of attack. Notice that the aerodynamic coefficients related to the lateral-directional stability ($C_Y$, $C_l$, $C_n$) exhibit slight changes for positive values of both movements; however, for negative values of the movements the coefficients exhibit significant changes with respect to the rest of the possible combinations. The results of the third test are depicted in Fig. 14.13, and they suggest that the drag coefficient $C_D$ is symmetric with respect to the tilt movement. With respect to the bank motion, however, there seem to be no significant changes produced by this movement. Notice that, as in the other tests, there are different combinations that produce the same drag value.
The lift coefficient $C_L$ exhibits significant changes due to $\Theta$; this behavior is expected because of the dependence of the lift on the angle of attack. Furthermore, for the lift coefficient, the bank motion $\Phi$ produces minimal changes. However, the main contribution of $\Phi$ is seen in the lateral coefficient $C_Y$, where the direction of the lateral force can be changed by the bank position of the empennage. The lateral coefficient exhibits larger changes for positive values of both $\Theta$ and $\Phi$. The graph of the roll moment coefficient shows that $C_l$ varies mainly in response to positive values of the tilt motion $\Theta$. In the same graph, it is observed that there is no symmetry with respect to any of the movements; nevertheless, the effects are more significant for positive values of $\Theta$. The pitch moment coefficient $C_m$ is symmetric with respect to $\Theta$; however, the coefficient exhibits larger changes when combining positive values of $\Theta$ with the full range of values of $\Phi$. Notice that the bank motion seems to produce no major effect on $C_m$; furthermore, it is important to note that there are more than two combinations that produce the same value. The results corresponding to the yaw moment coefficient $C_n$ suggest that the significant changes are produced by positive values of $\Theta$; in the same sense, the bank movement, for any value of $\Theta$, affects the magnitude of $C_n$.
14.6 Conclusions
In this work, a 3-DOF empennage with the capability of aiding in the flight control of aerial vehicles was introduced. The proposed design was bio-inspired by the way some birds move their tails to control their flight. The chapter focused on the experimental aerodynamic analysis of the system to determine the aerodynamic coefficients and their relationship with the attitude changes of the empennage. The spatial transformations from the six-axis force sensor frame to a local frame located at the aerodynamic center of the empennage were performed using extended vectors, such that torque and force are treated in a single expression. A number of experiments were conducted at low velocity in a wind tunnel. The experimental results were fitted using a multiple regression method based on a least-squares approximation. Four-dimensional plots and contours were used to facilitate the visualization of the connection between the attitude changes and the variation of the aerodynamic coefficients.
Appendix
Tables 14.3-14.8 list the fitted regression coefficients ($b_0$ through $b_{55}$) for each aerodynamic coefficient model.
References
Alexander, D.-E. (2002). Nature's flyers: Birds, insects, and the biomechanics of flight. Maryland: The Johns Hopkins University Press.
Biewener, A.-A. (2003). Animal locomotion (Oxford animal biology series). Oxford: Oxford
University Press.
Cook, M.-V. (2007). Flight dynamics principles (2nd ed.). Amsterdam: Elsevier.
Featherstone, R. (2010a). A beginner’s guide to 6-D vectors (part 1) what they are, how they work,
and how to use them. IEEE Robotics and Automation Magazine, 17(3), 83–94.
Featherstone, R. (2010b). A beginner’s guide to 6-D vectors (part 2) from equations to software.
IEEE Robotics and Automation Magazine, 17(4), 88–99.
Gatesy, S.-M., & Dial, K. P. (1993). Tail muscle patterns in walking and flying pigeons Columba livia. The Journal of Experimental Biology, 176, 55–76.
Gottfried, S. (2007). Tail effects on yaw stability in birds. Journal of Theoretical Biology, 249(3),
464–472.
Han, J.-H., Lee, J.-Y., & Kim, D.-K. (2008). Ornithopter modeling for flight simulation.
International Conference on Control, Automation and Systems, in COEX, Seoul, Korea.
Higgs, T. J. (2005). Modeling, stability, and control of a rotatable tail on a micro air vehicle. Department of Aeronautics and Astronautics, Air Force Institute of Technology, Air University.
Hoey, R. G. (1992). Research on the stability and control of soaring birds. In 28th National Heat
Transfer Conference, AIAA, 393–401.
Kirmse, W. (1998). Morphometric features characterizing flight properties of Palearctic eagles.
In R. D. Chancellor, B.-U. Meyburg & J. J. Ferrero (Eds.), Holarctic birds of prey
ADENEX-WWGBP (pp. 339–348).
Leveron, T. A. (2005). Characterization of a rotary flat tail as a spoiler and parametric analysis of improving directional stability in a portable UAV. Department of Aeronautics and Astronautics, Air Force Institute of Technology, Air University.
Muller, B., Clothier, R., Watkins, S., & Fisher, A. (2015). Design of bio-inspired autonomous
aircraft for bird management. In Proceedings of the 16th Australian International Aerospace
Congress (AIAC16), pp. 370–377.
Noth, A. (2008). Design of solar powered airplanes for continuous flight (Ph.D. thesis). Swiss Federal Institute of Technology Zurich.
Olguín-Díaz, E., & García-Terán, M. A. (2014). Aerodynamic sectional modeling with the use of
extended vectors. In Unmanned Aircraft Systems (ICUAS), 2014 International Conference on,
(pp. 459–469).
Paranjape, A., Kim, J., Gandhi, N., & Chung, S.-J. (2011a). Experimental demonstration of
perching by an articulated wing MAV. In AIAA Guidance, Navigation, and Control
Conference, August.
Paranjape, A.-A., Chung, S.-J., & Selig, M. (2011b). Flight mechanics of a tailless articulated wing
aircraft. Bioinspiration & Biomimetic, 6(2), 1–20.
Paranjape, A. A., Chung, S.-J., Hilton, H. H., & Chakravarthy, A. (2012a). Dynamics and
performance of tailless micro aerial vehicle with flexible articulated wings. AIAA Journal, 50
(5), 1177–1188.
Paranjape, A.-A., Kim, J., & Chung, S.-J. (2012b). Closed-loop perching and spatial guidance
laws for bio-inspired articulated wing MAV. In AIAA Guidance, Navigation, and Control
Conference, 21.
Pennycuick, C.-J. (2008). Modelling the flying bird (theoretical ecology series). Amsterdam:
Elsevier.
Rivera-Parga, J., Reeder, M. F., Leveron, T., & Blackburn, K. (2007). Experimental study of a micro air vehicle with a rotatable tail. Journal of Aircraft, 44(6), 1761–1768.
Roscam, J. (2003). Airplane flight dynamics and automatic flight controls (6th ed.). DARcorporation.
Shyy, W., Yongsheng, L., Tang, J., Viieru, D., & Liu, H. (2008). Aerodynamics of low Reynolds number flyers. Cambridge: Cambridge University Press.
Siciliano, B., Sciavicco, L., Villani, L., & Oriolo, G. (2009). Robotics: Modelling, planning and control (2nd ed.). Berlin: Springer.
Stengel, R. (2004). Flight dynamics. Princeton: Princeton University Press.
Stevens, B., & Lewis, F. L. (2003). Aircraft control and simulation (2nd ed.). Hoboken: Wiley.
Su, J.-Y., Ting, S.-C., Chang, Y.-H., & Yang, J.-T. (2012). A passerine spreads its tail to facilitate
a rapid recovery of its body posture during hovering. Journal of the Royal Society, 9(72),
1674–1684.
Thomas, A.-L. (1993). On the aerodynamics of birds’ tails. Philosophical Transactions of the
Royal Society B, 340(1294), 361–380.
Tobalske, B. (2007). Biomechanics of bird flight. Journal of Experimental Biology, 210(18), 3135–3146.
Tucker, V.-A. (1992). Pitching equilibrium, wing span and tail span in a gliding Harris Hawk,
Parabuteo Unicinctus. The Journal of Experimental Biology, 165, 21–41.
Valasek, J. (2011). Morphing aerospace vehicles and structures (1st ed.). Hoboken: Wiley.
Videler, J.-J. (2005). Avian flight (Oxford ornithology series). Oxford: Oxford University Press.
Chapter 15
Consensus Strategy Applied
to Differential Mobile Robots
with Regulation Control
and Trajectory Tracking
Flabio Mirelez-Delgado
Abstract In this chapter, the problem of performing different tasks with a group of mobile robots is addressed. To cope with tasks like regulation to a point or trajectory tracking, a consensus scheme is considered. Three topologies were tested in simulation. The first goal was to reach consensus in the group of robots; the consensus point was then relocated to achieve regulation control. The last objective was to follow a desired trajectory by moving the consensus point along the predefined path. The proposal was validated through experimental tests with a group of three differential mobile robots.
15.1 Introduction
Historically, some of the earliest work on multiple robots grappled with the idea of swarming robots to make formations (Desai et al. 2001; Yamaguchi et al. 2001; Sun and Mills 2002; Takahashi et al. 2004; Sun and Mills 2007; Antonelli et al. 2009). Regulation to a fixed point is another research topic widely studied in mobile robotics (Huijberts et al. 2000), as is trajectory tracking, whether with a single robot (Nijmeijer and Rodríguez-Angeles 2004) or with a swarm (Siméon et al. 2002). Interest in this area is due to the ability of biological societies to complete tasks together faster than individually. One of the initial problems in the control of cooperative robots comes from the need to share information, such as the relative positions of the robots among themselves and the speed of each vehicle; sharing information is a necessary condition for cooperation, and its exchange becomes a crucial part of the problem.
The structure of this chapter is as follows: Section 15.2 is related to the main element of the group of robots, the differential mobile robot; in this section the kinematic model is explained. Section 15.3 is about the consensus strategy used in this chapter and the three different topologies. The control algorithms used to perform consensus, regulation, and tracking are explained in Sect. 15.4. Section 15.5 presents the simulation results, while Sect. 15.6 shows the experimental results. Finally, Sect. 15.7 provides the conclusions of this work.
Mobile robotic platforms are increasingly common in industry and as service robots. The most common are wheeled robots with differential drive, here called differential mobile robots (DMRs). In general, the tasks for this class of mobile robots are:
• Point-to-point movements: The robot is given a desired configuration and must reach it from an initial position.
• Trajectory tracking: A reference point on the robot must follow a certain desired trajectory in the Cartesian plane, starting from a certain initial position.
Let $q \in Q$ be the $n$-vector of generalized coordinates of a DMR. The simplest model is that of the unicycle, i.e., a single wheel rolling on a plane. The generalized coordinates are $q = (x, y, \theta) \in \mathbb{R}^2 \times SO(2)$ $(n = 3)$. The nonholonomic constraint, which means that the wheel cannot move laterally, is given by:

$$A(q)\dot{q} = \dot{x}\sin\theta - \dot{y}\cos\theta = 0 \qquad (15.1)$$

so the kinematic model of the unicycle, with linear velocity $v$ and angular velocity $\omega$ as inputs, is:

$$\dot{x} = v\cos\theta, \qquad \dot{y} = v\sin\theta, \qquad \dot{\theta} = \omega \qquad (15.2)$$
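As a quick numerical check of Eqs. (15.1) and (15.2), the following Python sketch (initial conditions and inputs are arbitrary) integrates the unicycle model with Euler steps and verifies that the nonholonomic constraint holds along the motion.

import numpy as np

x, y, th = 0.0, 0.0, 0.0      # initial configuration (hypothetical)
v, w = 0.5, 0.2               # constant linear and angular velocity inputs
dt = 0.01
for _ in range(1000):
    xdot, ydot = v * np.cos(th), v * np.sin(th)   # Eq. (15.2)
    # Nonholonomic constraint, Eq. (15.1): identically zero for this model
    assert abs(xdot * np.sin(th) - ydot * np.cos(th)) < 1e-12
    x, y, th = x + dt * xdot, y + dt * ydot, th + dt * w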
When multiple vehicles agree on the value of a variable of interest, it is said that the robots have reached consensus. To reach consensus, there must be a variable of interest that is shared by all the robots involved. Examples include a representation of the center of the formation figure, the time of arrival at a desired point, the direction of movement, and the size of the perimeter being monitored, among others. By necessity, the consensus is designed to be distributed, assuming only neighbor-to-neighbor interaction between the robots. The objective is to design an update law so that the information state of each vehicle converges to a common value. If there are n vehicles in the group, the communication topology can be represented through a directed graph.
$$\mathcal{G}_n \triangleq (\mathcal{V}_n, \mathcal{E}_n) \qquad (15.3)$$

where $\mathcal{V}_n = \{1, 2, \ldots, n\}$ is the set of nodes and $\mathcal{E}_n \subseteq \mathcal{V}_n \times \mathcal{V}_n$ is the set of edges. The most common continuous-time dynamic consensus algorithm is:

$$\dot{x}_i(t) = -\sum_{j=1}^{n} a_{ij}(t)\left(x_i(t) - x_j(t)\right), \qquad i = 1, \ldots, n \qquad (15.4)$$

where $a_{ij}$ is the $(i,j)$ entry of the adjacency matrix $A_n \in \mathbb{R}^{n \times n}$ associated with $\mathcal{G}_n$ at time $t$, and $x_i$ is the information state of vehicle $i$. If $a_{ij} = 0$, vehicle $i$ does not receive information from vehicle $j$. A consequence of Eq. (15.4) is that $x_i(t)$ is driven toward the information states of its neighbors.
The communication topology is the name given to the configuration or the way in which the robot members of the team communicate or exchange information. For this project, several topologies from Ren and Beard (2008) were used.
For the topology presented in Fig. 15.2a, we have the following system:

$$\dot{x}_1 = -a_{12}(x_1 - x_2), \qquad \dot{x}_2 = -a_{23}(x_2 - x_3), \qquad \dot{x}_3 = 0 \qquad (15.6)$$
For the topology presented in Fig. 15.2b, we have the following system:

$$\dot{x}_1 = -a_{12}(x_1 - x_2), \qquad \dot{x}_2 = -a_{23}(x_2 - x_3), \qquad \dot{x}_3 = -a_{32}(x_3 - x_2) \qquad (15.9)$$
For the topology presented in Fig. 15.2c, we have the following system:

$$\dot{x}_1 = -a_{12}(x_1 - x_2), \qquad \dot{x}_2 = -a_{23}(x_2 - x_3), \qquad \dot{x}_3 = -a_{31}(x_3 - x_1) \qquad (15.12)$$
15.4.1 Consensus
As previously mentioned, consensus is achieved when all the vehicles agree on the value of a variable of interest. Based on the topologies shown in the previous figures, the control needed for each topology to reach consensus is presented below.
15.4.1.1 Topology 1
The Laplacian matrix for topology 1 is constructed according to the group connections, as shown in Eq. (15.7):

$$L = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1.5 & -1.5 \\ 0 & 0 & 0 \end{pmatrix} \qquad (15.7)$$

The control law needed to achieve consensus in the group is given by Eq. (15.8).
15.4.1.2 Topology 2
For the second topology, the Laplacian matrix is as in Eq. (15.10):

$$L = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1.5 & -1.5 \\ 0 & -2 & 2 \end{pmatrix} \qquad (15.10)$$

The corresponding control law needed to achieve consensus in the group is given by Eq. (15.11).
15.4.1.3 Topology 3
Last, the Laplacian matrix for the third topology is as depicted in Eq. (15.13):

$$L = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1.5 & -1.5 \\ -2 & 0 & 2 \end{pmatrix} \qquad (15.13)$$

and the control law needed to achieve consensus in the group is given by Eq. (15.14).
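To illustrate how these matrices drive the group toward agreement, the Python sketch below integrates the matrix form of Eq. (15.4), $\dot{x} = -Lx$, with the topology-1 Laplacian of Eq. (15.7). The scalar states and initial values are simplifying assumptions for the sake of the example.

import numpy as np

L1 = np.array([[1.0, -1.0, 0.0],     # Laplacian of Eq. (15.7)
               [0.0, 1.5, -1.5],
               [0.0, 0.0, 0.0]])
x = np.array([2.0, -1.0, 0.5])       # hypothetical initial states
dt = 0.01
for _ in range(3000):
    x = x + dt * (-L1 @ x)           # Euler step of x_dot = -L x
# Robot 3 never updates (its row is zero), so robots 1 and 2
# converge to robot 3's initial state, 0.5.
print(x)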
The kinematic model presented in Eq. (15.2) cannot be transformed into a linear controllable system using static state feedback. However, the system can be transformed via feedback into simple integrators (De Luca et al. 2001):

$$\xi_1 = \theta, \qquad \xi_2 = x\cos\theta + y\sin\theta, \qquad \xi_3 = x\sin\theta - y\cos\theta \qquad (15.18)$$
The existence of a canonical form for the dynamic model of the DMR allows a general and systematic development of open-loop and closed-loop control strategies. The most useful structure is the so-called chained form, which is obtained by differentiating the previous system:

$$\dot{\xi}_1 = \dot{\theta} = u_1$$
$$\dot{\xi}_2 = \dot{x}\cos\theta - x\sin\theta\,\dot{\theta} + \dot{y}\sin\theta + y\cos\theta\,\dot{\theta} = u_2 \qquad (15.19)$$
$$\dot{\xi}_3 = \dot{x}\sin\theta + x\cos\theta\,\dot{\theta} - \dot{y}\cos\theta + y\sin\theta\,\dot{\theta} = \xi_2 u_1$$

Thus,

$$v = u_2 + \xi_3 u_1 \qquad (15.21)$$
$$\omega = u_1 \qquad (15.22)$$
For trajectory tracking, it is assumed that the DMR is represented by a point $(x, y)$ that must follow a trajectory in the Cartesian plane represented by $(x_d(t), y_d(t))$, where $t \in [0, T]$ and possibly $T \to \infty$. The time-parameterized reference trajectory used in this work is given by Eq. (15.23):

$$x_d = 0.5 + \frac{a}{3}\sin\!\left(2\left(\frac{2\pi t}{n}\right)\right), \qquad y_d = a\sin\!\left(2\left(\frac{2\pi t}{n/2}\right)\right), \qquad \theta_d = \operatorname{arctan2}(\dot{y}_d, \dot{x}_d) \qquad (15.23)$$
where $a$ is the width of the trajectory, $t$ is the current time, and $n$ is the time in which it is desired to complete the cycle. Therefore, the reference speed commands are given by:

$$v_d = \sqrt{\dot{x}_d^2(t) + \dot{y}_d^2(t)} \qquad (15.24)$$
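Under the parameterization reconstructed above (an assumption where the source layout was ambiguous), the reference velocity of Eq. (15.24) and the desired heading can be obtained numerically, as in this Python sketch with hypothetical values for the width and cycle time:

import numpy as np

a_w, n_cycle = 1.0, 60.0                   # hypothetical width and cycle time
t = np.arange(0.0, n_cycle, 0.01)
xd = 0.5 + (a_w / 3.0) * np.sin(2 * (2 * np.pi * t / n_cycle))
yd = a_w * np.sin(2 * (2 * np.pi * t / (n_cycle / 2)))
xd_dot, yd_dot = np.gradient(xd, t), np.gradient(yd, t)
vd = np.sqrt(xd_dot**2 + yd_dot**2)        # Eq. (15.24)
thd = np.arctan2(yd_dot, xd_dot)           # desired heading, Eq. (15.23)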
$$v = v_d\cos e_3 - u_1 \qquad (15.27)$$
$$\omega = \omega_d - u_2 \qquad (15.28)$$

where $e_1$, $e_2$, and $e_3$ are the tracking errors expressed in the robot frame, and

$$u_1 = -k_1 e_1 \qquad (15.30)$$
In terms of the original inputs, the design leads to the time-varying nonlinear controller of De Luca et al. (2001).
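The Python sketch below implements one standard version of this tracking law in the spirit of De Luca et al. (2001): errors in the robot frame, the feedback terms $u_1$ and $u_2$, and the inputs of Eqs. (15.27) and (15.28). The gains and the exact form of $u_2$ are illustrative choices, not necessarily those used in this chapter.

import numpy as np

def tracking_control(q, qd, vd, wd, k1=1.0, k2=4.0, k3=2.0):
    # Tracking errors expressed in the robot frame
    x, y, th = q
    xd, yd, thd = qd
    e1 = np.cos(th) * (xd - x) + np.sin(th) * (yd - y)
    e2 = -np.sin(th) * (xd - x) + np.cos(th) * (yd - y)
    e3 = thd - th
    u1 = -k1 * e1                                   # Eq. (15.30)
    # np.sinc(z / pi) equals sin(z)/z, a well-defined choice for u2
    u2 = -k2 * vd * np.sinc(e3 / np.pi) * e2 - k3 * e3
    v = vd * np.cos(e3) - u1                        # Eq. (15.27)
    w = wd - u2                                     # Eq. (15.28)
    return v, w

v, w = tracking_control((0.0, 0.0, 0.0), (0.1, 0.05, 0.1), vd=0.3, wd=0.0)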
15.5 Simulations
Using the different topologies shown in Fig. 15.2 and the control laws from Sect. 15.4, consensus, regulation, and trajectory tracking were achieved for all robot members of the team.
15.5.1 Topology 1
15.5.1.1 Consensus
The consensus in position and orientation on the plane was simulated for the first topology, and Fig. 15.3 shows the behavior of each robot. The circles denote where each robot begins, and the pentagons indicate where the robots finish their movements. The (*) mark is used to represent the front of the robot.
Figure 15.4 shows the orientation of the robots. At the end of the graph, it is clear how the heading angles converge to the same value as that of robot 1; this is due to the connections made in topology 1. The robots reached consensus, as shown in Figs. 15.3 and 15.4. The linear and angular speeds of each robot used to achieve the position and orientation consensus are shown in Figs. 15.5 and 15.6.
15.5.1.2 Regulation
Once the consensus process is over, the regulation stage follows, in which regulation control to a point modifies the states of the robots so that they reach a desired position and orientation. Figure 15.7 shows how the robots reach the desired position and orientation. The evolution of the orientation angles of each member of the group is depicted in Fig. 15.8.
The robots arrived at the desired position, as shown in Figs. 15.7 and 15.8. The linear and angular speeds of each robot used to achieve this are shown in Figs. 15.9 and 15.10.
Once the robots reach the desired point on the Cartesian plane, the next step is to apply a tracking control that guides the robots along a predetermined trajectory. In this case, the desired trajectory is an 8 shape, also known as a lemniscate.
Fig. 15.7 Robots' movements for regulation control after consensus for topology 1
Fig. 15.8 Robots' orientation for regulation control after consensus for topology 1
The results of the simulation are shown in Fig. 15.11. Figure 15.12 represents the orientation of each robot along the trajectory, and Figs. 15.13 and 15.14 show the linear and angular velocities, respectively.
Fig. 15.9 Linear velocities for robots, regulation on consensus for topology 1
Fig. 15.10 Angular velocities for robots, regulation on consensus for topology 1
Simulations were also performed for topologies 2 and 3 to compare the behavior of the group of robots. Tables 15.1 and 15.2 depict the comparison between topologies 2 and 3. Following the procedure used for topology 1, the main aspects to analyze are the Cartesian-plane movements, the orientation, the linear velocity, and the angular velocity. These four points are presented in three scenarios: consensus, regulation, and tracking.
Tables 15.1 and 15.2 Comparison of topologies 2 and 3: Cartesian-plane movements, orientations, linear velocities, and angular velocities for the consensus, regulation, and tracking scenarios

15.6 Experimental Results
Fig. 15.15 Experimental result using topology 3 for consensus, regulation, and trajectory tracking
Figure 15.16 shows the behavior of the heading angles of each robot during the experiment. At the end of this graph, we can see how the robots share the same orientation as they follow the desired path. Figures 15.17 and 15.18 show the evolution of the linear and angular velocities of the robots during the experiment.
Fig. 15.16 Robots' orientation for consensus, regulation, and trajectory tracking with topology 3
15.7 Conclusions
It was shown that three differential mobile robots can achieve consensus in their three states, perform regulation to a fixed point with consensus, and follow a path by simply displacing the consensus point.
The weights of the coefficients of the Laplacian matrix influence not only the value of the consensus point but also the behavior of the robots during regulation and trajectory tracking. This aspect must be handled carefully when designing the topology.
The results of the implementation differ from the simulations due to factors such as lighting, physical limitations of the robots, and other factors inherent to the experimental platform. The experimental validation demonstrates that cooperation in mobile robots can be established through consensus techniques.
References
Antonelli, G., Arrichiello, F., & Chiaverini, S. (2009). Experiments of formation control with
multirobot systems using the null-space-based behavioral control. IEEE Transactions on
Control Systems Technology, 17(5), 1173–1182.
Chung, S., & Slotine, J. (2009). Cooperative robot control and concurrent synchronization of
Lagrangian systems. IEEE Transactions on Robotics, 25(3), 686–700.
De Luca, A., Oriolo, G., & Vendittelli, M. (2001). Control of wheeled mobile robots: An
experimental overview. In S. Nicosia, B. Siciliano, A. Bicchi, & P. Valigi (Eds.), Lecture notes
in control and information sciences (Vol. 270). Berlin, Heidelberg: Springer.
Desai, J., Ostrowski, J., & Kumar, V. (2001). Modeling and control of formations of
non-holonomic mobile robots. IEEE Transactions on Robotics and Automation, 17(6), 905–
908.
Huijberts, H., Nijmeijer, H., & Willems, R. (2000). Regulation and controlled synchronization for complex dynamical systems. International Journal of Robust and Nonlinear Control, 10(5), 336–377.
Nijmeijer, H., & Rodríguez-Angeles, A. (2004). Control synchronization of differential mobile
robots. In 6th IFAC Symposium on Nonlinear Control Systems, California, USA, pp. 579–584.
Ren, W., & Beard, R. (2008). Distributed consensus in multi-vehicle cooperative control: Theory
and application. London: Springer.
Siméon, T., Leroy, S., & Laumond, J. (2002). Path coordination for multiple mobile robots: A
resolution-complete algorithm. IEEE Transactions on Robotics and Automation, 18(1), 42–49.
Sun, D., & Mills, J. (2002). Adaptive synchronized control for coordination of multi-robot
assembly tasks. IEEE Transactions on Robotics and Automation, 18(4), 498–510.
Sun, D., & Mills, J. K. (2007). Controlling swarms of mobile robots for switching between formations using synchronization concept. In IEEE International Conference on Robotics and Automation, Roma, Italy, pp. 2300–2305.
Takahashi, H., Nishi, H., & Ohnishi, K. (2004). Autonomous decentralized control for formation
of multiple mobile robots considering ability of robot. IEEE Transactions on Industrial
Electronics, 51(6), 1272–1279.
Yamaguchi, H., Arai, T., & Beni, G. (2001). A distributed control scheme for multiple robotic vehicles to make group formations. Robotics and Autonomous Systems, 36(4), 125–147.