
EE363 Review Session 1: LQR, Controllability and Observability

In this review session we'll work through a variation on LQR in which we add an input smoothness cost, in addition to the usual penalties on the state and input. We will also (briefly) review the concepts of controllability and observability from EE263. If you haven't seen these before, or if you don't remember them, please read through EE263 lectures 18 and 19. As always, the TAs are more than happy to help if you have any questions.

Announcements: TA office hours: Tuesday 3-5pm, Packard 107; Wednesday 7-9pm, Packard 277; Thursday 4-6pm, Packard 277. Homework is due on Fridays. Homework is graded on a scale of 0-10.

Representing quadratic functions as quadratic forms

Let's first go over a method for representing quadratic functions that you might find useful for the homework. This representation can often simplify the algebra for LQR problems, especially when there are linear as well as quadratic terms in the costs. Consider the following quadratic function in u and v,
\[
f(u, v) = u^T F u + v^T G v + 2 u^T S v + 2 f^T u + 2 g^T v + s,
\]
where F > 0 and G > 0. We can write this as a pretty, symmetric quadratic form,
\[
f(u, v) =
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}^T
\begin{bmatrix} F & S & f \\ S^T & G & g \\ f^T & g^T & s \end{bmatrix}
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}.
\]
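If you want to sanity-check this representation numerically, here is a quick sketch in Python. The dimensions and the data F, G, S, f, g, s below are made up purely for illustration:

```python
import numpy as np

# Hypothetical small instance: dimensions and data are made up for illustration.
rng = np.random.default_rng(0)
n, m = 3, 2
F = 2.0 * np.eye(n)            # F > 0
G = 3.0 * np.eye(m)            # G > 0
S = rng.standard_normal((n, m))
f = rng.standard_normal(n)
g = rng.standard_normal(m)
s = 1.5

def f_uv(u, v):
    """The quadratic function f(u, v), written out term by term."""
    return (u @ F @ u + v @ G @ v + 2 * u @ S @ v
            + 2 * f @ u + 2 * g @ v + s)

# Symmetric quadratic-form representation: stack z = (u, v, 1).
M = np.block([
    [F,          S,          f[:, None]],
    [S.T,        G,          g[:, None]],
    [f[None, :], g[None, :], np.array([[s]])],
])

u = rng.standard_normal(n)
v = rng.standard_normal(m)
z = np.concatenate([u, v, [1.0]])
assert np.allclose(M, M.T)              # the form is symmetric
assert np.isclose(f_uv(u, v), z @ M @ z)
```

The only trick is appending the constant 1 to the stacked variable, which lets the linear terms and the constant ride along inside one symmetric matrix.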

Now we can apply many of the results we know for quadratic forms to quadratic functions. For example, suppose we want to minimize f(u, v) over u. We know (see homework 1, problem 3),
\[
\min_y
\begin{bmatrix} y \\ x \end{bmatrix}^T
\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix}
\begin{bmatrix} y \\ x \end{bmatrix}
= x^T \left( Q_{22} - Q_{12}^T Q_{11}^{-1} Q_{12} \right) x.
\]
Applying this result to our quadratic form representation of f(u, v), we get
\[
\min_u f(u, v) =
\begin{bmatrix} v \\ 1 \end{bmatrix}^T
\left(
\begin{bmatrix} G & g \\ g^T & s \end{bmatrix}
-
\begin{bmatrix} S^T \\ f^T \end{bmatrix}
F^{-1}
\begin{bmatrix} S & f \end{bmatrix}
\right)
\begin{bmatrix} v \\ 1 \end{bmatrix}.
\]
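Here is a quick numerical check of the Schur-complement minimization formula above, again with made-up data. We minimize over u directly (by setting the gradient to zero) and compare with the formula:

```python
import numpy as np

# Hypothetical data, just to check the Schur-complement formula numerically.
rng = np.random.default_rng(1)
n, m = 3, 2
F = 2.0 * np.eye(n)                      # F > 0, so the minimum over u exists
G = 3.0 * np.eye(m)
S = rng.standard_normal((n, m))
f = rng.standard_normal(n)
g = rng.standard_normal(m)
s = 1.5

def f_uv(u, v):
    return (u @ F @ u + v @ G @ v + 2 * u @ S @ v
            + 2 * f @ u + 2 * g @ v + s)

v = rng.standard_normal(m)

# Minimize over u directly: grad_u f = 2 F u + 2 S v + 2 f = 0.
u_star = -np.linalg.solve(F, S @ v + f)
direct_min = f_uv(u_star, v)

# Schur-complement formula: [G g; g^T s] - [S f]^T F^{-1} [S f].
Sf = np.hstack([S, f[:, None]])          # the [S  f] block
top = np.block([[G, g[:, None]], [g[None, :], np.array([[s]])]])
schur = top - Sf.T @ np.linalg.solve(F, Sf)
w = np.concatenate([v, [1.0]])
assert np.isclose(direct_min, w @ schur @ w)
```

Both routes give the same number, because [S f] applied to (v, 1) is exactly S v + f, the linear term seen by u.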

Example: Represent the following quadratic function in w and z,
\[
f(z, w) = z^T Q z + 2 q^T z + w^T R w + (Az + Bw)^T P (Az + Bw) + 2 p^T (Az + Bw) + s,
\]
as a quadratic form. (This quadratic function arises, for example, in the Bellman equation for an LQR problem.)

Solution. We can write f(z, w) as
\[
f(z, w) =
\begin{bmatrix} w \\ z \\ 1 \end{bmatrix}^T
\begin{bmatrix}
R + B^T P B & B^T P A & B^T p \\
A^T P B & Q + A^T P A & q + A^T p \\
p^T B & q^T + p^T A & s
\end{bmatrix}
\begin{bmatrix} w \\ z \\ 1 \end{bmatrix}.
\]
This means, for example, that we can derive the Riccati recursion for this problem by directly applying our Schur complement minimization formula above. This gives
\[
\min_w f(z, w) =
\begin{bmatrix} z \\ 1 \end{bmatrix}^T
\left(
\begin{bmatrix}
Q + A^T P A & q + A^T p \\
q^T + p^T A & s
\end{bmatrix}
-
\begin{bmatrix} A^T P B \\ p^T B \end{bmatrix}
\left( R + B^T P B \right)^{-1}
\begin{bmatrix} B^T P A & B^T p \end{bmatrix}
\right)
\begin{bmatrix} z \\ 1 \end{bmatrix}.
\]
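The Bellman-step formula above can be checked the same way: minimize over w directly and compare with the Schur complement. All the data below is invented for the check:

```python
import numpy as np

# Made-up Bellman-step data (P would be a value-function Hessian in practice).
rng = np.random.default_rng(2)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
Q, R = np.eye(n), np.eye(m)
P = 2.0 * np.eye(n)                      # P > 0
q = rng.standard_normal(n)
p = rng.standard_normal(n)
s = 0.7

def f_zw(z, w):
    y = A @ z + B @ w
    return (z @ Q @ z + 2 * q @ z + w @ R @ w
            + y @ P @ y + 2 * p @ y + s)

z = rng.standard_normal(n)

# Minimize over w directly:
# grad_w f = 2 (R + B^T P B) w + 2 B^T P A z + 2 B^T p = 0.
H = R + B.T @ P @ B
w_star = -np.linalg.solve(H, B.T @ P @ A @ z + B.T @ p)
direct_min = f_zw(z, w_star)

# Schur-complement formula for the minimum over w.
top = np.block([
    [Q + A.T @ P @ A,        (q + A.T @ p)[:, None]],
    [(q + A.T @ p)[None, :], np.array([[s]])],
])
cross = np.hstack([B.T @ P @ A, (B.T @ p)[:, None]])   # [B^T P A,  B^T p]
schur = top - cross.T @ np.linalg.solve(H, cross)
zz = np.concatenate([z, [1.0]])
assert np.isclose(direct_min, zz @ schur @ zz)
```

Note that the minimizer w* has exactly the form of an LQR feedback with an affine term, which is why this computation drives the Riccati recursion.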

LQR with smoothness penalty

Consider the following linear dynamical system
\[
x_{t+1} = A x_t + B u_t, \qquad x_0 = x_{\mathrm{init}}.
\]
In conventional LQR we choose u_0, u_1, ..., so that x_0, x_1, ... and u_0, u_1, ... are small. In this problem we'd like to add an additional penalty on the smoothness of the input. This means that we want the differences u_1 - u_0, u_2 - u_1, ... to be small as well. To do this, we define the following cost function,
\[
J = \sum_{\tau=0}^{N-1} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)
  + \sum_{\tau=0}^{N-1} (u_\tau - u_{\tau-1})^T \tilde{R} (u_\tau - u_{\tau-1})
  + x_N^T Q_f x_N,
\]
where Q = Q^T \geq 0, R = R^T > 0, \tilde{R} = \tilde{R}^T > 0, and u_{-1} = 0. Notice that J is still a quadratic function of u_0, ..., u_{N-1}, although the form of the quadratic is more complicated compared with conventional LQR. This means that we can solve for u_0, ..., u_{N-1} as a giant least squares problem and, if we use appropriate numerical linear algebra methods, we can solve this very fast (no slower than a Riccati recursion).

Another way of solving this problem is by transforming it into a standard LQR problem. Let's define w_t = u_t - u_{t-1}, t = 0, ..., N - 1 (with u_{-1} = 0). Then we can write
\[
\tilde{x}_{t+1} = \begin{bmatrix} x_{t+1} \\ u_t \end{bmatrix}
= \begin{bmatrix} A & B \\ 0 & I \end{bmatrix}
  \begin{bmatrix} x_t \\ u_{t-1} \end{bmatrix}
+ \begin{bmatrix} B \\ I \end{bmatrix} w_t
= \tilde{A} \tilde{x}_t + \tilde{B} w_t,
\]
where
\[
\tilde{x}_t = \begin{bmatrix} x_t \\ u_{t-1} \end{bmatrix}, \qquad
\tilde{A} = \begin{bmatrix} A & B \\ 0 & I \end{bmatrix}, \qquad
\tilde{B} = \begin{bmatrix} B \\ I \end{bmatrix}.
\]
We can write J as
\[
J = \sum_{\tau=0}^{N-1} \left( \tilde{x}_\tau^T \tilde{Q} \tilde{x}_\tau
  + w_\tau^T \tilde{R} w_\tau \right)
  + \tilde{x}_N^T \tilde{Q}_f \tilde{x}_N,
\]
where
\[
\tilde{Q} = \begin{bmatrix} Q & 0 \\ 0 & R \end{bmatrix}, \qquad
\tilde{Q}_f = \begin{bmatrix} Q_f & 0 \\ 0 & R \end{bmatrix}.
\]
(The R block in \tilde{Q}_f accounts for the term u_{N-1}^T R u_{N-1}, which is not covered by the sum, since \tilde{x}_\tau only carries u_{-1}, ..., u_{N-2} for \tau = 0, ..., N - 1.)
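As a numerical sanity check of the state-augmentation above, we can pick an arbitrary input sequence, compute J in the original variables, and compute the cost of the augmented system, and confirm they agree. All problem data below is made up, and `Rsm` stands in for \tilde{R}:

```python
import numpy as np

# Numerical sketch of the state-augmentation trick (problem data made up).
rng = np.random.default_rng(3)
n, m, N = 3, 2, 8
A = rng.standard_normal((n, n)) / np.sqrt(n)
B = rng.standard_normal((n, m))
Q, R, Rsm = np.eye(n), np.eye(m), 5.0 * np.eye(m)   # Rsm plays the role of R-tilde
Qf = 2.0 * np.eye(n)
x_init = rng.standard_normal(n)

# Augmented problem data: x_tilde_t = (x_t, u_{t-1}), input w_t = u_t - u_{t-1}.
At = np.block([[A, B], [np.zeros((m, n)), np.eye(m)]])
Bt = np.vstack([B, np.eye(m)])
Qt = np.block([[Q, np.zeros((n, m))], [np.zeros((m, n)), R]])
Qft = np.block([[Qf, np.zeros((n, m))], [np.zeros((m, n)), R]])

# Pick an arbitrary input sequence and compute J in the original variables.
ws = [rng.standard_normal(m) for _ in range(N)]
us = np.cumsum(ws, axis=0)                  # u_t = w_0 + ... + w_t  (u_{-1} = 0)
x, u_prev, J_orig = x_init.copy(), np.zeros(m), 0.0
for t in range(N):
    J_orig += (x @ Q @ x + us[t] @ R @ us[t]
               + (us[t] - u_prev) @ Rsm @ (us[t] - u_prev))
    x = A @ x + B @ us[t]
    u_prev = us[t]
J_orig += x @ Qf @ x

# Same cost through the augmented system, now in standard LQR form.
xt, J_aug = np.concatenate([x_init, np.zeros(m)]), 0.0
for t in range(N):
    J_aug += xt @ Qt @ xt + ws[t] @ Rsm @ ws[t]
    xt = At @ xt + Bt @ ws[t]
J_aug += xt @ Qft @ xt

assert np.isclose(J_orig, J_aug)
```

Since the two costs agree for every input sequence, minimizing one is the same as minimizing the other, and the augmented problem is exactly a conventional LQR problem in \tilde{x} and w.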

So we see that by defining a new state \tilde{x}_t that consists of the original state x_t and the previous input u_{t-1}, we have transformed this problem into a conventional LQR problem. This means that we can apply the standard LQR Riccati recursion to find the optimal w_0, ..., w_{N-1}, and hence u_0, ..., u_{N-1}.

I should point out here that there is nothing profound about anything we have done. State augmentation (increasing the size of the state) is a common trick for dealing with situations where the variables are coupled across several time periods. There is also no reason why we should compare every problem we encounter with the conventional LQR problem. It's much more important to learn and understand dynamic programming and the Bellman equation than to memorize LQR formulas; you can always look these up, or re-derive them if you need to.

Controllability

Now let's briefly review the concepts of controllability and observability for the discrete-time linear dynamical system
\[
x_{t+1} = A x_t + B u_t, \qquad t = 0, 1, \ldots,
\]

where x_t \in R^n and u_t \in R^m. Similar concepts exist for continuous-time systems; for more details, please refer to EE263 lectures 18 and 19.

We say that a state z is reachable in t steps if we can find an input sequence u_0, u_1, ..., u_{t-1} that steers the state from x_0 = 0 to x_t = z. We can write
\[
x_t =
\begin{bmatrix} B & AB & \cdots & A^{t-1} B \end{bmatrix}
\begin{bmatrix} u_{t-1} \\ \vdots \\ u_0 \end{bmatrix},
\]
and so the set of states reachable in t steps is simply R(C_t), where
\[
C_t = \begin{bmatrix} B & AB & \cdots & A^{t-1} B \end{bmatrix}.
\]
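To see this equation in action, here is a small steering computation (the system data is made up): we build C_t, solve C_t (u_{t-1}, ..., u_0) = z for the stacked inputs, and verify by simulation that they drive x_0 = 0 to z:

```python
import numpy as np

# Made-up system; with random A, B this pair is controllable with probability 1.
rng = np.random.default_rng(5)
n, m, t = 3, 1, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
Ct = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(t)])  # [B AB A^2B]
z = rng.standard_normal(n)      # desired final state

# Solve C_t (u_{t-1}; ...; u_0) = z (least-norm solution if underdetermined).
u_stack, *_ = np.linalg.lstsq(Ct, z, rcond=None)
us = [u_stack[(t - 1 - k) * m:(t - k) * m] for k in range(t)]  # u_0, ..., u_{t-1}

# Simulate from x_0 = 0 and check we land on z.
x = np.zeros(n)
for u in us:
    x = A @ x + B @ u
assert np.allclose(x, z)
```

The only subtlety is the ordering: the first block column of C_t (namely B) multiplies the last input u_{t-1}, so the stacked input vector is reversed relative to simulation order.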

By the Cayley-Hamilton theorem (EE263 lecture 12), we can express each A^k for k \geq n as a linear combination of A^0, ..., A^{n-1}, and so for t \geq n, R(C_t) = R(C_n). This means that any state that can be reached can be reached by t = n, and similarly, any state that cannot be reached by t = n is not reachable (in any number of steps). We call C = C_n the controllability matrix. The set of reachable states is exactly R(C). We say that the linear dynamical system is reachable or controllable if all states are reachable (i.e., R(C) = R^n). Thus, the system is controllable if and only if Rank(C) = n. We will often say that the pair of matrices (A, B) is controllable; this means that the controllability matrix formed from A and B is full rank.

Example: Suppose the system is controllable. Is there always an input that steers the state from an initial state x_init at time t = 0 to some desired state x_final?

Solution. Yes. We can write
\[
x_t = A^t x_0 +
\begin{bmatrix} B & AB & \cdots & A^{t-1} B \end{bmatrix}
\begin{bmatrix} u_{t-1} \\ \vdots \\ u_0 \end{bmatrix}.
\]
Thus, we can find an input that steers the state from x_init to x_final in time t if and only if x_final - A^t x_init \in R(C_t). Since the system is controllable, we know that Rank(C_n) = n, which implies that R(C_n) = R^n. This means that for any x_init and any x_final, we must have x_final - A^n x_init \in R(C_n). Thus, if the system is controllable, we can always find an input that transfers the state from any x_init to any x_final in time n.

Example: Suppose there is an input that steers the state from an initial state x_init at time t = 0 to some state x_final at time t = N (where N > n). Is there always an input that steers the state from x_init at time t = 0 to x_final at time t = n? (We do not assume that (A, B) is controllable.)

Solution. No. Let
\[
A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
x_{\mathrm{init}} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
x_{\mathrm{final}} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\]
Notice that this system is not controllable, and R(C_N) = span{(1, 1)} for any N. From the previous example, we know that we can drive the system from x_init at time t = 0 to x_final at time t = N if and only if x_final - A^N x_init \in R(C_N). Now
\[
x_{\mathrm{final}} - A^3 x_{\mathrm{init}} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \in R(C_3),
\]
but
\[
x_{\mathrm{final}} - A^n x_{\mathrm{init}} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \notin R(C_n).
\]
So the transfer is possible in N = 3 steps, but not in n = 2 steps.
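The example above is easy to verify numerically. The helper names `ctrb` and `reachable` below are my own, not from the notes; `reachable` tests membership in R(C) by checking that appending the target does not increase the rank:

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix C = [B  AB  ...  A^{n-1}B]."""
    n = A.shape[0]
    blocks, AkB = [], B
    for _ in range(n):
        blocks.append(AkB)
        AkB = A @ AkB
    return np.hstack(blocks)

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[1.0], [1.0]])
x_init = np.array([1.0, 0.0])
x_final = np.array([0.0, 1.0])

C = ctrb(A, B)
assert np.linalg.matrix_rank(C) == 1          # rank < n = 2: not controllable

def reachable(target):
    """Is target in R(C)?  True iff appending it does not raise the rank."""
    return (np.linalg.matrix_rank(np.column_stack([C, target]))
            == np.linalg.matrix_rank(C))

# Transfer in t = 3 steps is possible: x_final - A^3 x_init = (0, 0) in R(C_3) = R(C).
A3 = np.linalg.matrix_power(A, 3)
assert reachable(x_final - A3 @ x_init)

# Transfer in t = n = 2 steps is not: x_final - A^2 x_init = (-1, 1) not in R(C).
A2 = np.linalg.matrix_power(A, 2)
assert not reachable(x_final - A2 @ x_init)
```

Note that R(C_t) = R(C) for t >= n by Cayley-Hamilton, which is why checking against C alone suffices for both times.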

The point is that when the system is controllable, everything works out: a state transfer from any x_init to any x_final is possible. However, when the system is not controllable, we must be very careful; many results that seem perfectly plausible are simply not true. For instance, the result above is true if the initial state x_init = 0, but it is not true in general for arbitrary x_init. In the lectures you will see that we always state the controllability and observability assumptions very clearly.

Observability

We'll review observability for the discrete-time linear dynamical system
\[
x_{t+1} = A x_t + B u_t, \qquad y_t = C x_t + D u_t, \qquad t = 0, 1, \ldots,
\]

where x_t \in R^n, u_t \in R^m, and y_t \in R^p. Let's consider the problem of finding x_0, given u_0, ..., u_{t-1} and y_0, ..., y_{t-1}. We can write
\[
\begin{bmatrix} y_0 \\ \vdots \\ y_{t-1} \end{bmatrix}
= O_t x_0 + T_t
\begin{bmatrix} u_0 \\ \vdots \\ u_{t-1} \end{bmatrix},
\]
where
\[
O_t = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{t-1} \end{bmatrix}, \qquad
T_t =
\begin{bmatrix}
D & 0 & 0 & \cdots & 0 \\
CB & D & 0 & \cdots & 0 \\
CAB & CB & D & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
CA^{t-2}B & CA^{t-3}B & CA^{t-4}B & \cdots & D
\end{bmatrix}.
\]
This implies
\[
O_t x_0 =
\begin{bmatrix} y_0 \\ \vdots \\ y_{t-1} \end{bmatrix}
- T_t
\begin{bmatrix} u_0 \\ \vdots \\ u_{t-1} \end{bmatrix},
\]
and so we see that x_0 can be determined uniquely if and only if N(O_t) = {0}. By the Cayley-Hamilton theorem, we know that each A^k can be written as a linear combination of A^0, ..., A^{n-1}. Thus, for t \geq n, N(O_t) = N(O), where
\[
O = O_n = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix}
\]
is called the observability matrix. We say that the system is observable if N(O) = {0}. We will also often say that a pair of matrices (C, A) is observable; this means that the observability matrix formed from C and A has zero nullspace.
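The equations above give a concrete recipe for recovering x_0: subtract the known input contribution T_t u from the stacked outputs, then solve O_t x_0 = residual. Here is a sketch with made-up random system data (which is observable with probability 1):

```python
import numpy as np

# Made-up dimensions and system data; t >= n observations.
rng = np.random.default_rng(4)
n, m, p, t = 3, 2, 2, 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
D = rng.standard_normal((p, m))

# Observability matrix O_t = [C; CA; ...; CA^{t-1}].
O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(t)])
assert np.linalg.matrix_rank(O) == n      # zero nullspace: (C, A) observable

# Simulate t steps from a "secret" x0, recording inputs and outputs.
x0 = rng.standard_normal(n)
us = [rng.standard_normal(m) for _ in range(t)]
x, ys = x0.copy(), []
for u in us:
    ys.append(C @ x + D @ u)
    x = A @ x + B @ u

# Known input contribution at step k: D u_k + sum_j C A^{k-1-j} B u_j  (the T_t term).
Tu = []
for k in range(t):
    acc = D @ us[k]
    for j in range(k):
        acc = acc + C @ np.linalg.matrix_power(A, k - 1 - j) @ B @ us[j]
    Tu.append(acc)

# Solve O_t x0 = (stacked outputs) - T_t (stacked inputs).
resid = np.concatenate([ys[k] - Tu[k] for k in range(t)])
x0_hat, *_ = np.linalg.lstsq(O, resid, rcond=None)
assert np.allclose(x0_hat, x0)
```

With noiseless data the residual equals O_t x_0 exactly, so any full-rank solve recovers x_0; with noisy measurements the same least-squares computation gives the natural estimate.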
