A Fast Convergence Algorithm for Iterative Adaptation
of Feedforward Controller Parameters
Abstract
Feedforward control is a viable option for enhancing the response time and control accuracy of a wide variety of systems. Nevertheless, it is not able to compensate for the effects produced by modeling errors or disturbances. A solution to improve the feedforward performance is the use of an adaptation law that modifies the parameters of the feedforward control. In the case where real-time feedback is not possible, a solution is a run-to-run numerical optimization method that is fed with a cost based on a measured signal. Although the effectiveness of this approach has been demonstrated, its performance is hindered by slow convergence. In this paper, we present an algorithm based on Pattern Search and Adaptive Coordinate Descent methods that makes use of the sensitivity of the feedforward controller to its parameters so that the convergence speed improves significantly. Like many algorithms, this is a local strategy so the algorithm might converge to a local minimum. Therefore, we present two versions, one without a learning rate and one with it. To compare them and to demonstrate the effectiveness of the algorithm, simulated results are shown on a well-known control problem in electromechanics: the soft-landing control of electromechanical switching devices.
I Introduction
Feedforward control is an important element in control systems, offering immediate responses to reference changes or known disturbances. Despite their advantages, feedforward controllers alone are not robust to design errors, modeling errors, or system changes. To address these limitations, various complementary strategies exist, including conventional feedback controllers with observers [1], learning algorithms [2], and parameter adjustments based on measured variables [3].
As can be seen, there are many solutions when the state variables can be measured. However, in some situations these measurements are not feasible, either because the sensor is more expensive than the device to be controlled, or because the variables are not accessible. In our previous work [4] we proposed a solution to this situation for impact reduction control of electromechanical switching devices. A cost is calculated from an alternative measurement, such as the impact velocity in simulation or a measure of the impact sound in real-world experiments. Then, using a black-box approach, the parameters of the feedforward controller are iteratively modified. The initial results demonstrate the efficacy of the control structure, but we believe that the convergence of the black-box proposal in [4] can be improved. The black-box algorithm is a Pattern Search [5] algorithm. It is a derivative-free optimization method and, like other methods of this type, it requires neither derivatives nor finite differences, only comparisons of function values. This makes it very useful for the problem treated, since the relationship between the input and output of the black box is unknown. As an improvement, a dimensionality reduction and a change of coordinate system for the optimization algorithm are proposed in [6]. The results confirm the effectiveness of the approach and highlight the potential for further improvement.
Along these lines, [7] presents an Adaptive Coordinate Descent algorithm. The strategy involves periodically updating the coordinate system by means of a Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and Adaptive Encoding, so as to decompose the problem into as many one-dimensional problems as there are dimensions in the general problem. While the general concept can be useful in some applications, the use of a CMA-ES can be counterproductive in the control field. As concluded by the authors of [8], there are fundamental limitations to the possibilities of self-adaptation in evolutionary strategies: the number of function evaluations required to reliably achieve a significant change is high. The function evaluations they consider are 10 (where is the problem dimension), 30 for a real-world search problem, and 100 for complete adaptation.
In terms of one-dimensional search, the authors of [7] suggest derivative-free methods (such as Pattern Search methods) or the use of gradients. Gradient methods are a powerful tool, but in our problem the equation of the objective function is unknown and cannot be evaluated directly. However, the use of subgradients may be an option. Along these lines, a solution could be sign gradient descent methods, first introduced in the RProp algorithm [9]. The RProp (Resilient Propagation) algorithm is a gradient descent algorithm that uses only the signs of the gradients to compute updates. Although it is a gradient method, the computational load is low because it is not necessary to compute the gradient, only its sign. Moreover, these methods may reach minima other than the one closest to the initial condition, which makes them usable for global optimization. Nevertheless, adjusting their initial parameters and hyperparameters can be a challenging task. In contrast, some gradient descent methods set an adaptive step size without the need for hyperparameters. A popular and effective choice is the Polyak step size. This method has been coupled with others [10] based on momentum acceleration, moving-averaged gradients, or stochastic methods [11], among others. Furthermore, its use in the subgradient method is common.
To address the problems highlighted above, in this paper we present a new algorithm that performs the functions of the black box. The proposed algorithm is composed of a technique based on the sensitivity of the feedforward law that decomposes the initial -dimensional problem into one-dimensional problems, a method that selects the descent coordinate and moves in this direction, and a learning rate that enhances the performance of the algorithm. The main contribution is the transfer of optimization techniques more commonly used in other fields to the field of control, in particular the combination of derivative-free algorithms and gradient descent methods.
The paper is structured as follows. Section II presents the control structure on which this work is based and the first step taken to improve the convergence of the feedback loop. Section III develops the proposed algorithm in three parts: the basis change, the search method, and the subgradient learning rate. Section IV summarizes everything related to the simulation experiments: the dynamic system and feedforward control used, the simulation conditions, and the results that show the improvements. Finally, the conclusions are discussed in Section V.
II Background of the control system
The first proposal [4] focuses on the control of systems with differentially flat dynamical models. An n-th order system is differentially flat if the n-th derivative of the output is the first where the input appears explicitly [12]. This property allows the design of a feedforward controller by model inversion. However, despite its simplicity, errors in the model or in the parameter identification can significantly affect the accuracy of the controller. Therefore, the inclusion of a feedback loop is essential. The interest of this proposal lies in addressing scenarios in which the measurement of the signal to be controlled is not available. To address this challenge, a system measurement that can be processed and converted into a performance indicator is selected as the feedback measurement. Closing the feedback loop requires a block that relates the performance indicator to the primary control loop. The proposed control structure is schematized in Fig. 1. However, it can often be difficult to find a function that effectively links these two aspects and can be implemented online. As a solution, [4] proposes a pattern search algorithm that uses the performance indicator as a cost function to optimize the feedforward parameters.
After the initial proposal, subsequent work has focused on improving the convergence speed of the method. In [6], this is addressed by applying dimensionality reduction techniques to the parameter set. That work proposes two techniques based on the sensitivity of the controller to the parameters. The first one involves optimizing only the most sensitive parameters of . The second technique performs the reduction in an alternative orthogonal coordinate system. Using the sensitivity of the feedforward controller, the Fisher information matrix is computed to construct a basis change matrix composed of its eigenvectors. The main idea is to concentrate all the information into a smaller number of parameters to increase the controller accuracy when the dimensionality of the problem is reduced. Both proposals in that work use a fixed basis change matrix based on the nominal value of the feedforward controller parameters. However, we suggest the possibility of periodically updating the reduced parametric basis.
III New algorithm
The proposed algorithm tries to solve two different issues. The first one is to answer the questions of [6], i.e., how often the reduced parametric basis should be updated and what is the appropriate size of the search dimension at each update. The second is to be able to adapt when the error between actual and optimal parameters is significantly large.
For the first issue, following the idea presented in [7], the algorithm combines a simple optimization method, e.g., successive line searches along coordinates, with a method that periodically adapts the coordinate system. This method aims to decompose the problem into separable functions that set the control output , where is a set of auxiliary parameters. This set can be the parameter vector , a subset of this vector, or a function of it, e.g., the parameters normalized by their nominal values , , where denotes element-wise division. For the second issue, the idea is to add a learning rate parameter, based on subgradient methods, to the equation that calculates the next point.
III-A An alternative orthogonal coordinate system
In our proposal, we use the second method proposed in [6] as the technique that adapts the coordinate system. In short, this technique is based on calculating the new coordinate system which keeps constant the integral-square deviation of with respect to the nominal input , ,
(1) |
By a simplification of the Taylor expansion around the nominal parameter vector, the integral-square deviation of with respect to the nominal input is approximately given by the quadratic form
(2) |
where , and is the Fisher matrix, which can be calculated from the sensitivity of the feedforward law to the vector of the control parameters, , as follows:
(3) |
(4) |
On the other hand, the Fisher matrix can be decomposed into the matrix of its eigenvalues, , and the matrix of its eigenvectors, .
(5) |
Through a variable change, the transformation between the old coordinate system, , and the new coordinate system, , can be performed using a matrix whose columns are the eigenvectors of the Fisher matrix.
(6) |
This transformation not only allows the problem to be decomposed into separable functions, but also provides an orthogonal coordinate system sorted by the average sensitivity over time of the feedforward law to the new parameters .
(7) |
where denotes the -th element of .
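As a minimal numerical sketch of the construction described in this subsection, the following Python code approximates the Fisher matrix from finite-difference sensitivities of a feedforward law and sorts its eigenvectors by decreasing eigenvalue. The function names, the central-difference scheme, the rectangle-rule integration, and the Jacobi eigen-solver are illustrative assumptions, not the implementation used in the paper.

```python
import math


def fisher_matrix(u, q_nom, times, dt, eps=1e-6):
    """Approximate F = integral of S(t)^T S(t) dt, where S = du/dq at q_nom.

    `u(t, q)` is a hypothetical scalar feedforward law; the integral is
    approximated by a rectangle rule over `times` with spacing `dt`.
    """
    n = len(q_nom)
    F = [[0.0] * n for _ in range(n)]
    for t in times:
        sens = []
        for i in range(n):  # central finite difference per parameter
            q_plus, q_minus = list(q_nom), list(q_nom)
            q_plus[i] += eps
            q_minus[i] -= eps
            sens.append((u(t, q_plus) - u(t, q_minus)) / (2.0 * eps))
        for i in range(n):
            for j in range(n):
                F[i][j] += sens[i] * sens[j] * dt
    return F


def sorted_eigenbasis(F, sweeps=50, tol=1e-18):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, V): eigenvalues in decreasing order and the
    matching eigenvectors stored as the columns of V.
    """
    n = len(F)
    A = [row[:] for row in F]
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                d = A[q][q] - A[p][p]
                if d == 0.0:
                    theta = math.copysign(math.pi / 4.0, A[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    a_kp, a_kq = A[k][p], A[k][q]
                    A[k][p] = c * a_kp - s * a_kq
                    A[k][q] = s * a_kp + c * a_kq
                for k in range(n):  # A <- J^T A (rows p and q)
                    a_pk, a_qk = A[p][k], A[q][k]
                    A[p][k] = c * a_pk - s * a_qk
                    A[q][k] = s * a_pk + c * a_qk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    v_kp, v_kq = V[k][p], V[k][q]
                    V[k][p] = c * v_kp - s * v_kq
                    V[k][q] = s * v_kp + c * v_kq
    order = sorted(range(n), key=lambda i: A[i][i], reverse=True)
    eigenvalues = [A[i][i] for i in order]
    V_sorted = [[V[k][i] for i in order] for k in range(n)]
    return eigenvalues, V_sorted
```

Under these assumptions, the first column of the returned matrix is the direction in parameter space along which the feedforward signal changes the most on average, which is why the search can reasonably start there.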
III-B Search of the descending coordinate and line search
Once we have a coordinate system that we assume has a low correlation between its coordinates, we can apply successive line searches along the coordinates. To do this, the coordinate with the greatest decrease in cost must first be found, and then a method that searches for the minimum along that coordinate must be selected. The proposal for selecting the coordinate of greatest descent is based on Pattern Search. A pattern of points is created whose center is the evaluated point with the lowest cost, with two side points along each coordinate. Because the coordinates are sorted by the sensitivity of the feedforward law, we assume that the first evaluations can generate the greatest improvements. Once a cost-improving coordinate has been found, the algorithm continues to look for lower-cost points in that direction using a method that follows the philosophy of sign gradient descent algorithms. Thus, it is not necessary to complete the pattern to update the best point. When the next point does not improve the cost, a minimum is assumed to have been found and a new pattern is evaluated.
To obtain the new pattern, a new orthogonal coordinate system is calculated and the search starts again at the most sensitive coordinate, i.e., the first new coordinate. The only exception is when the algorithm has just moved along the first coordinate. In this case a new pattern is not calculated and the algorithm continues the pattern at the second coordinate. The main reason is to avoid turning the algorithm into a single gradient search method in which the descent coordinate is always the one that most modifies , because the speed of convergence could be reduced by the lack of opportunity to directly find a new best point. Fig. 2 outlines the movement rules. As can be inferred, if the pattern initialization block (“Init. pattern”) is reached through the left arrows, it is not necessary to recalculate the basis change matrix. This is because, after evaluating the entire pattern, the best point remains the initial one and, consequently, the basis change matrix remains unchanged.
As for the step size, , we follow the same philosophy as Pattern Search and RProp: when we seem to be moving in a good direction, the step size is increased to reach the optimal point faster, and when we have just passed over a minimum, the step size is decreased to allow us to get closer to the minimum cost. In short, the next point to evaluate can be calculated as
(8) | |||
(11) |
where is the unit vector along a coordinate of the system in the desired direction. The values are constants such that . The values and are the minimum and maximum allowed step sizes, respectively.
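The search rules of this subsection can be sketched as follows. This is a simplified illustration under stated assumptions: the coordinate system is kept fixed (no basis update between patterns), the step size is multiplied by hypothetical factors `inc` and `dec` after a successful or failed move, and all names and default values are ours rather than those of the paper.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def coordinate_pattern_search(cost, x0, step0=0.5, inc=2.0, dec=0.5,
                              step_min=1e-6, step_max=1.0, max_evals=500):
    """Derivative-free search that only compares cost values.

    Assumed rules (illustrative): the step grows by `inc` after an improving
    move, shrinks by `dec` after a failed one, and is kept within
    [step_min, step_max] during the line search.
    """
    best = list(x0)
    best_cost = cost(best)
    evals = 1
    step = step0
    while evals < max_evals and step > step_min:
        moved = False
        for i in range(len(best)):        # coordinates, most sensitive first
            for sign in (1.0, -1.0):      # two side points per coordinate
                if evals >= max_evals:
                    break
                trial = list(best)
                trial[i] += sign * step
                trial_cost = cost(trial)
                evals += 1
                if trial_cost >= best_cost:
                    continue
                best, best_cost, moved = trial, trial_cost, True
                # Keep moving along the improving direction with larger steps.
                while evals < max_evals:
                    step = clamp(step * inc, step_min, step_max)
                    trial = list(best)
                    trial[i] += sign * step
                    trial_cost = cost(trial)
                    evals += 1
                    if trial_cost < best_cost:
                        best, best_cost = trial, trial_cost
                    else:
                        step = clamp(step * dec, step_min, step_max)
                        break
                break
            if moved:
                break
        if not moved:
            step *= dec  # the whole pattern failed: contract around the best point
    return best, best_cost, evals
```

Note that the pattern is abandoned as soon as one side point improves the cost, so a full pattern is only evaluated when no coordinate offers an improvement at the current step size.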
III-C Subgradient learning rate
Although this new heuristic addresses the questions regarding the update of the reduced parametric basis, like its predecessors it is useful only if the target point is near the initial estimate or the objective function is globally convex. Otherwise, it may converge to a local minimum and not reach the global minimum. To mitigate this problem, we propose modifying (11) as a modified gradient method
(12) |
where is the gradient, and , is another step size in iteration . However, given that the objective function is unknown (only its evaluated values are available) and we cannot assume that it is continuously differentiable, the algorithm is treated as a subgradient method. As described in [10], a technique to calculate is to minimize the squared distance between and the optimal point .
(13) | |||
(14) |
Expanding the squared norm, the equation to be minimized is
(15) |
where and are the angle between and , i.e., between the coordinate of movement and the gradient, and the angle between and , i.e., between the descent direction and the theoretical next step, respectively. By the relationship
(16) |
and considering that we only want this extra term to act when the algorithm falls into a local minimum or the convergence is too slow, i.e., assuming , (15) is reduced to
(17) |
With the previous assumption, knowing that is a scalar and , the solution of the minimization is
(18) |
Although the point is unknown, if we consider an objective function with a general convex behavior we can suppose , where is the minimum cost. If we substitute , we obtain an upper bound of the minimization, and is calculated as
(19) |
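For reference, the classical Polyak rule on which this kind of derivation builds sets the step size to the cost gap divided by the squared norm of the subgradient. The sketch below applies that textbook rule to a hypothetical non-smooth convex cost whose minimum value is known. It does not include the angle terms or the coordinate-wise simplifications of (19), and all function names are illustrative.

```python
def polyak_subgradient(f, subgradient, x0, f_star, iterations=100):
    """Subgradient method with the classical Polyak step size.

    Step: eta_k = (f(x_k) - f_star) / ||g_k||^2, which requires the minimum
    cost f_star (or an estimate of it) to be known.
    """
    x = list(x0)
    best_x, best_f = list(x), f(x)
    for _ in range(iterations):
        g = subgradient(x)
        g_norm2 = sum(gi * gi for gi in g)
        if g_norm2 == 0.0:
            break  # zero subgradient of a convex cost: x is a minimizer
        eta = (f(x) - f_star) / g_norm2
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        fx = f(x)
        if fx < best_f:  # subgradient steps need not decrease the cost
            best_x, best_f = list(x), fx
    return best_x, best_f


def sign(v):
    return (v > 0) - (v < 0)


def example_cost(x):
    """Hypothetical convex, non-differentiable cost with minimum 0 at the origin."""
    return abs(x[0]) + 2.0 * abs(x[1])


def example_subgradient(x):
    return [float(sign(x[0])), 2.0 * sign(x[1])]


best_x, best_f = polyak_subgradient(
    example_cost, example_subgradient, [3.0, -1.0], f_star=0.0)
```

The step shrinks automatically as the cost approaches the assumed minimum, which is what removes the need for a hand-tuned learning rate; the price is that the minimum cost must be known or estimated.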
Since we are moving along only one coordinate, we consider , i.e., we assume that the coordinate is the descent direction. This assumption introduces an error but, of the two solutions obtained for , choosing the subtraction in the operation provides a conservative solution with a smaller error.
Because the gradient provides little information at each coordinate and the result may not be smooth, and are replaced by average values.
(20) | |||
(21) |
where is a positive constant that acts as a decay factor.
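The smoothing effect of such averages can be illustrated with a standard exponentially weighted average. The form used below, with a decay factor between 0 and 1, is an assumption made for illustration and may differ from the exact expressions in (20) and (21).

```python
def ema(previous, value, decay):
    """Exponentially weighted average: decay * previous + (1 - decay) * value."""
    return decay * previous + (1.0 - decay) * value


def smooth(values, decay, initial):
    """Return the running average after each new value."""
    averaged = []
    average = initial
    for value in values:
        average = ema(average, value, decay)
        averaged.append(average)
    return averaged


# A cost that oscillates between 1 and 3 is smoothed to a narrow band around 2.
smoothed = smooth([1.0, 3.0, 1.0, 3.0, 1.0, 3.0], decay=0.9, initial=2.0)
```

A decay factor close to 1 gives a long memory and strong smoothing, while a value close to 0 makes the average follow the latest evaluation.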
When working with an average value of the cost, , local minima, where the objective function is not globally convex, give higher values, causing the algorithm to avoid these points. The purpose of is to mitigate excessively oscillating learning rates. This technique is already used in other algorithms, such as Root Mean Square Propagation (RMSProp) and other stochastic gradient descent algorithms.
The only term that remains to be defined is . This value can be considered a constant and, if the real value is not known, is adequate in many situations, although it is usually a strong assumption. Alternatively, it can be considered as an iteration-dependent target value. This idea is more interesting in our case, because we want the additional terms to help (11) when the convergence is slower due to the evaluated point being far from the solution point. For this reason, the difference between and is saturated. We define the following function
(22) |
IV Simulated results
In this section we present through simulation an example of operation in a non-linear system based on the dynamics of electromechanical switching devices. These devices experience significant collisions at the end of switching operations, posing a continuous control challenge that has previously been addressed by soft landing controls. To illustrate the benefits of the new feedback algorithm, we discuss the improvements achieved over our previous work [6].
IV-A System dynamics
The dynamical model used for the simulated experiments is based on a single-coil reluctance actuator. This actuator is affected by two types of forces: passive elastic forces, which can generally be modeled as ideal springs, and magnetic force. The magnetic force is generated when current flows through the coil, causing an inner fixed core to become magnetized and attract the movable core. The typical method of supplying the actuator with power is by providing a voltage. We describe the dynamics of the system using a state-space model, where the voltage , is the input to our system, the position is the output, and velocity and magnetic flux linkage are auxiliary state variables. The state equations are defined as
(24) | ||||
(25) | ||||
(26) |
where , , , , and are the moving mass, the spring stiffness, the spring resting position, the coil resistance, and an auxiliary function based on the magnetic reluctance concept, respectively. This magnetic reluctance considers the phenomena of magnetic saturation and flux fringing in the model
(27) |
where , , , , , and are positive constants. Overall, the system dynamics depends on uncertain parameters, which can be grouped in the parameter vector .
(28) |
Note that the resistance is treated independently as a parameter without uncertainty, as it can be precisely measured. As explained in [4], the model (24)–(26) exhibits differential flatness and we can calculate the feedforward controller by inversion of the model
(29) |
where , and are derived from (25).
IV-B Description of the simulated experiments
In a real-world scenario, we determine the parameters of an electromechanical switching device through a cumbersome estimation process. However, due to manufacturing or other tolerances, we assume that not all devices are identical and that the parameters vary from device to device. The values of the estimated parameters, which may be representative of a typical solenoid actuator or electromagnetic relay, are shown in Table I.
In order to be able to compare the results, the desired position trajectory ( in Fig. 1), necessary for the feedforward control and for the calculation of the basis change matrix, is designed as in [6]. This trajectory is formulated as a th-degree polynomial with the following boundary conditions:
(30) | ||||||||
where and are the desired initial and final times of the switching operation, and and are the desired initial and final positions, which correspond to the mechanical limits of the motion of the movable core.
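To make the idea of a polynomial trajectory with boundary conditions concrete, the sketch below builds a rest-to-rest reference between two positions. The degree and the complete set of boundary conditions used in the simulations are those of (30); purely as an illustration, this example assumes a fifth-degree polynomial with zero velocity and zero acceleration at both ends, which may differ from the actual design. The function name and arguments are hypothetical.

```python
def quintic_trajectory(t, t0, tf, z0, zf):
    """Position and velocity of an assumed fifth-degree rest-to-rest trajectory.

    The normalized profile s(tau) = 10 tau^3 - 15 tau^4 + 6 tau^5 goes from
    0 to 1 with zero first and second derivatives at tau = 0 and tau = 1.
    """
    tau = min(1.0, max(0.0, (t - t0) / (tf - t0)))
    s = tau ** 3 * (10.0 - 15.0 * tau + 6.0 * tau ** 2)
    ds = 30.0 * tau ** 2 * (1.0 - tau) ** 2  # derivative of s with respect to tau
    position = z0 + (zf - z0) * s
    velocity = (zf - z0) * ds / (tf - t0)
    return position, velocity
```

Because the feedforward law is obtained by model inversion, the reference must be smooth enough for all the derivatives required by the inverse model to exist, which is what motivates the use of a polynomial with matched boundary conditions.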
In terms of variable measurements, the position of the movable core is not available. The electromechanical switching devices that we are considering are small and cheap, so using an expensive laser sensor for measurement would be impractical. Additionally, most of them are encapsulated within a protective housing, which impedes access to the component whose position needs to be known. In other works, during real-world experimentation, the impact sound or the bounces are selected as indicators of control performance. In these simulations, as in [6], we take the impact velocity as that indicator
(31) |
To emulate the real situation where the actual values of the parameters do not match the nominal ones, 10 000 different trials have been conducted. In each trial, we initialize the feedforward law with the estimated parameters of a device (see Table I). Because the algorithm is set up as
(32) |
the initial parameters take the value 1, i.e., . To account for parameter variation, each component of the model parameter vector is randomly and independently perturbed by a certain percentage.
The results presented in this paper can be divided into three parts. First, to demonstrate the functionality of the ACD algorithm (without applying the learning rate) and to observe the improvement, we replicate the simulation conditions described in [6]. In this simulation, the control algorithm is executed for 300 switching operations in each trial, with parameter perturbations set to ±5 %. In other words, the parameters of the real device under consideration vary between 95 % and 105 % of the values in Table I, with a uniform probability distribution. The second result is shown to test the influence of the learning coefficient. Finally, the third and last set of results addresses the impact of larger errors in the initial parameters. In this case, parameter perturbations are set to ±25 %. In this part, we compare the three algorithms again: the Pattern Search algorithm, the ACD algorithm without a learning rate, and the ACD algorithm with a learning rate.
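The generation of perturbed devices described above can be sketched as follows. The function names, the seed, and the example nominal values are hypothetical; only the idea of perturbing each parameter independently and uniformly within a given fraction of its nominal value is taken from the text.

```python
import random


def perturbed_parameters(nominal, max_fraction, rng):
    """Scale each nominal parameter by an independent uniform factor
    in [1 - max_fraction, 1 + max_fraction]."""
    return [p * rng.uniform(1.0 - max_fraction, 1.0 + max_fraction)
            for p in nominal]


def make_trials(nominal, max_fraction, n_trials, seed=0):
    """Draw the 'real' parameter vectors of n_trials simulated devices."""
    rng = random.Random(seed)  # fixed seed so that trials are repeatable
    return [perturbed_parameters(nominal, max_fraction, rng)
            for _ in range(n_trials)]


# Hypothetical nominal values, for illustration only.
trials = make_trials([2.0, 10.0], max_fraction=0.05, n_trials=100)
```

Reusing the same seed for every algorithm under comparison ensures that all of them face exactly the same set of simulated devices.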
IV-C Results
Fig. 3 shows the results for the first analysis. The graphs represent the evolution of the cost, , with respect to each evaluation or switching operation, . Due to the large number of simulations needed to reproduce the variability of the parameters between devices, the results are presented by the median () and the 10th and 90th percentiles ( and , respectively) of the distribution of values obtained for the 10 000 simulated experiments. For reference, the cost without control is also plotted. Fig. 3a shows the results when using the ACD algorithm, without applying the learning rate, to check its effectiveness and compare it with the previous results of [6] using the Pattern Search algorithm with an initial fixed change of the basis and a reduced dimensional coordinate system. Fig. 3b shows the results when only four dimensions are optimized, the situation with the least variability of results after 300 function evaluations. Fig. 3c shows the results when only two dimensions are optimized, the situation with the fastest convergence. From these results, we can conclude that the convergence is faster and the after 300 evaluations of the new algorithm is smaller. To facilitate comparisons between all the studied cases of the previous paper and the results obtained with the ACD algorithm, Fig. 4 shows the integrated (i.e., cumulative) average cost of each trial, denoted as ,
(33) |
where is the mean cost in the evaluation for the 10 000 simulated trials. The improvement is remarkable and, if we look at the trend, the intersection of the ACD curves with each other would be at infinity, i.e., the improvement is continuous over time.
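The statistics reported in this section can be computed along the following lines. This sketch assumes that the costs are stored as one list per trial and uses a nearest-rank percentile, which is one of several common percentile definitions and not necessarily the one used to produce the figures. The cumulative quantity is taken to be the running sum of the per-evaluation mean cost, as the description of (33) suggests.

```python
import math
from itertools import accumulate


def percentile(values, p):
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(costs):
    """costs[trial][k] is the cost of evaluation k in a given trial."""
    n_evals = len(costs[0])
    per_eval = [[trial[k] for trial in costs] for k in range(n_evals)]
    mean = [sum(column) / len(column) for column in per_eval]
    return {
        "p10": [percentile(column, 10) for column in per_eval],
        "p50": [percentile(column, 50) for column in per_eval],
        "p90": [percentile(column, 90) for column in per_eval],
        "cumulative_mean": list(accumulate(mean)),
    }
```

Percentiles are preferred over the mean and standard deviation here because the cost distribution across devices can be strongly skewed, and a few non-converged trials would otherwise dominate the summary.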
To demonstrate the effect of the learning rate, the simulation is repeated with the same parameters for each trial. The function is calculated using the of the results ( of Fig. 3a). Thus, the processes that do not require the learning rate, i.e., those at or below the 90th percentile, will remain unaffected. Fig. 5 shows the 90th, 97th and 98th percentiles (, and ) for both processes, without (ACD) and with learning rate (ACD+LR). As expected, the evolution of is identical in both cases, and as the percentile gets higher, the algorithm with learning rate achieves better results with fewer evaluations. If we look at , the ACD with learning rate reaches the target in less than 150 evaluations, while the version without learning rate does not reach it even after 300 evaluations. Additionally, Fig. 6 shows the evolution of two individual processes. These plots show the effect of the learning rate in a process with slow convergence (Fig. 6a) and convergence to an unacceptable cost (Fig. 6b).
Finally, Fig. 7 shows the behavior of the three algorithms when the parameters are not so close to the right ones. Fig. 7a shows the results of the Pattern Search algorithm with basis change and optimizing only four dimensions. Although the and have values similar to those obtained when the initial parameter estimates are within ±5 %, the shows much higher values. Fig. 7b shows the results of the ACD without learning rate. The conclusions are similar to the previous ones but, in this case, the , although elevated, is better than with the Pattern Search, and it even seems not to have converged yet. Fig. 7c shows the results of the complete new algorithm, ACD with learning rate. In this case, , after 150 evaluations, converges to values comparable to those obtained when the initial error in the parameters was within ±5 %, instead of ±25 %.
V Conclusions
In this work we have presented a new algorithm to adapt the parameters of a feedforward controller from an alternative measurement of the state variables of the system. The improvement with respect to our previous work has been achieved both for small initial parameter errors and for larger errors, where the previous technique is not useful. The improvements have been obtained by integrating three concepts into the algorithm: a periodic basis change based on the sensitivity of the feedforward law, the search for an optimal point in one dimension following the philosophy of Pattern Search algorithms and sign gradient methods, and the inclusion of a learning rate calculated with concepts from subgradient methods. Additionally, this new algorithm answers the questions raised in our previous work, namely the periodicity of the basis update and the number of dimensions to which the problem is reduced, because the ACD selects the minimum number of dimensions needed to improve the behavior of the feedforward controller.
As future work, we would like to perform an in-depth theoretical analysis and address some open questions, such as the possibility of a technique to estimate the target function or to relax certain assumptions. In addition, we intend to perform real laboratory tests on different systems to verify that the experimental results agree with those observed in simulation and to assess the generality of the method.
References
- [1] R. Schroedter, M. Roth, K. Janschek, and T. Sandner, “Flatness-based open-loop and closed-loop control for electrostatic quasi-static microscanners using jerk-limited trajectory design,” Mechatronics, vol. 56, pp. 318–331, 2018.
- [2] M. Grotjahn and B. Heimann, “Model-based feedforward control in industrial robotics,” The International Journal of Robotics Research, vol. 21, no. 1, pp. 45–60, 2002.
- [3] S.-S. Yeh and P.-L. Hsu, “An optimal and adaptive design of the feedforward motion controller,” IEEE/ASME transactions on mechatronics, vol. 4, no. 4, pp. 428–439, 1999.
- [4] E. Moya-Lasheras, E. Ramirez-Laboreo, and E. Serrano-Seco, “Run-to-Run Adaptive Nonlinear Feedforward Control of Electromechanical Switching Devices,” IFAC-PapersOnLine, vol. 56, no. 2, pp. 5358–5363, 2023, 22nd IFAC World Congr.
- [5] R. M. Lewis and V. Torczon, “Pattern search methods for linearly constrained minimization,” SIAM J. Optimization, vol. 10, no. 3, pp. 917–941, 2000.
- [6] E. Ramirez-Laboreo, E. Moya-Lasheras, and E. Serrano-Seco, “Faster run-to-run feedforward control of electromechanical switching devices: a sensitivity-based approach,” in Proc. Eur. Control Conf., Stockholm, Sweden, June 2024.
- [7] I. Loshchilov, M. Schoenauer, and M. Sebag, “Adaptive coordinate descent,” in Proc. 13th GECCO, 2011, pp. 885–892.
- [8] N. Hansen and A. Ostermeier, “Completely derandomized self-adaptation in evolution strategies,” Evolutionary computation, vol. 9, no. 2, pp. 159–195, 2001.
- [9] E. Moulay, V. Léchappé, and F. Plestan, “Properties of the sign gradient descent algorithms,” Information Sciences, vol. 492, pp. 29–39, 2019.
- [10] X. Wang, M. Johansson, and T. Zhang, “Generalized polyak step size for first order optimization with momentum,” in International Conference on Machine Learning. PMLR, 2023, pp. 35 836–35 863.
- [11] N. Loizou, S. Vaswani, I. H. Laradji, and S. Lacoste-Julien, “Stochastic polyak step-size for sgd: An adaptive learning rate for fast convergence,” in International Conference on Artificial Intelligence and Statistics. PMLR, 2021, pp. 1306–1314.
- [12] J. Lévine, “On necessary and sufficient conditions for differential flatness,” Appl. Algebra Eng., Commun. Comput., vol. 22, no. 1, pp. 47–90, 2011.