
Abstract

In this article, we introduce an adaptive online model update algorithm designed for predictive control applications in networked systems, particularly focusing on power distribution systems. Unlike traditional methods that depend on historical data for offline model identification, our approach utilizes real-time data for continuous model updates. This method integrates seamlessly with existing online control and optimization algorithms and provides timely updates in response to real-time changes. This methodology offers significant advantages, including a reduction in the communication network bandwidth requirements by minimizing the data exchanged at each iteration and enabling the model to adapt after disturbances. Furthermore, our algorithm is tailored for non-linear convex models, enhancing its applicability to practical scenarios. The efficacy of the proposed method is validated through a numerical study, demonstrating improved control performance using a synthetic IEEE test case.
Keywords: model identification, data-driven model predictive control, distributed optimization, online optimization, power grid, networked systems.

Adaptive Online Model Update Algorithm for Predictive Control in Networked Systems

Vivek Khatana, Chin-Yao Chang, Wenbo Wang

Vivek Khatana is with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, USA (Email: {khata010}@umn.edu). Chin-Yao Chang and Wenbo Wang are with the National Renewable Energy Laboratory, Golden, CO 80401, USA (Email: {chinyao.chang, wenbo.wang}@nrel.gov).

This work was authored in part by NREL, operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding provided by DOE Office of Electricity, Advanced Grid Modeling Program, through agreement NO. 33652. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work or allow others to do so, for the U.S. Government purposes.

I Introduction

System models are essential for understanding and controlling complex systems, as they enable accurate predictions, analysis, and optimization of resources. The mathematical models abstract complex, real-world phenomena into manageable representations. With the increase in data availability, scientists and practitioners favor adaptive reconfiguration of the model representations to conform to the latest data measurements. In this article, we focus on identifying the sub-system models capturing the global behavior of a networked system to solve the following predictive control problem

$$
\begin{aligned}
\operatorname*{minimize}_{\text{u}(1),\text{u}(2),\dots,\text{u}(T)\in\mathbb{R}^{\mathbf{d_{u}}}}\quad & \sum_{k=1}^{T}\sum_{i=1}^{N}\ell_{i}(y(k),k)+h_{i}(\text{u}_{i}(k),k) \\
\text{subject to}\quad & \text{u}_{i}(k)\in\mathcal{U}_{i},\ \text{for all}\ i=1,2,\dots,N,\ \text{for all}\ k,
\end{aligned}
\tag{1}
$$

where $\text{u}(k)=[\text{u}_{1}(k),\text{u}_{2}(k),\dots,\text{u}_{N}(k)]\in\mathbb{R}^{\mathbf{d_{u}}}$ is the control decision at time $k$ with constraint sets $\mathcal{U}_{i}\subseteq\mathbb{R}^{\mathbf{d_{u}}_{i}}$, $\sum_{i=1}^{N}\mathbf{d_{u}}_{i}=\mathbf{d_{u}}$, and $y(k)\in\mathbb{R}^{\mathbf{d_{y}}}$ is an observable that captures the physical or behavioral inter-dependencies among the sub-systems. The functions $\ell_{i}$ and $h_{i}$ in (1) capture the costs due to the output $y(k)$ and the control inputs $\text{u}_{i}(k)$, respectively, at sub-system $i$. Predictive control problems of the form (1) appear in online optimal control, communication systems, and robotic networks [1, 2], to mention a few. More recently, the problem has also been of interest in the control and operation of power systems [3].

Suppose the sub-system inter-dependencies are modeled via a parametric description relating the observable $y$ and the inputs $\text{u}$,

$$
\mathcal{M}(\theta):\ y_{\theta}(t)=\phi(\text{u}(t),\theta), \tag{2}
$$

where $\theta\in\mathbb{R}^{\mathbf{d}_{\theta}}$ is the parameter of the map $\phi:\mathbb{R}^{\mathbf{d_{u}}}\times\mathbb{R}^{\mathbf{d}_{\theta}}\to\mathbb{R}^{\mathbf{d_{y}}}$ that captures the relation between the control inputs, $\text{u}(t)$, and the (parameterized) observable or output of the networked system, $y_{\theta}(t)$, at time $t$. It is assumed that there exists a vector $\theta^{\star}$ such that the true system output $y(t)$ is given by $y(t)=y_{\theta^{\star}}(t)=\phi(\text{u}(t),\theta^{\star})$. Note that optimization (1) presumes the knowledge of the input-output map (2). The decisions $\text{u}(k)$ are determined based on the postulated output $y_{\theta}(k)$ via a model of the form (2) and are sensitive to model mismatches. Under model imperfections, the generated control inputs might drive the network operation to an undesirable state.

Given (1) and (2), the problem addressed in the current article pertains to the development and analysis of an algorithm that enables the update of the parametric input-output map in an online manner based on current data measurements to incorporate real-time variations of the controlled system and generate optimal control inputs. The exact description of the class of parametric maps chosen is given in Section II.

I-A Literature Review

System identification [4] is a broad topic that spans multiple fields. Specialized methodologies, such as learning-based methods [5] and behavioral system theory for non-parametric models [6], can generally be viewed as system identification. As for the applications for power distribution systems, utilities typically maintain feeder models in distribution planning and geographic information system databases [7]. However, operational changes to the grid, such as upgrades and reconfigurations [8], as well as database errors [9], necessitate ongoing maintenance of these models and databases [10]. Voltage control and other operational controls [11, 12], if based on erroneous or outdated models and data, can adversely affect system stability and reliability. Model identification techniques can address these model consistency issues. Some approaches involving machine learning methods [13, 14] require centralized data collection, which raises data privacy concerns and necessitates communication infrastructures. A multi-agent-based distributed approach [15, 16] can mitigate these concerns. Building on this foundation, our previous work, [17], advanced the state-of-the-art in distributed identification methods that protect local data privacy for linear systems, albeit with some communication requirements.

Building on the merits of our prior research’s distributed and localized approach, the current article introduces several advancements. The main contributions are as follows:

1. We develop a distributed algorithm for online model identification in networked nonlinear systems to enable the observable estimate in problem (1) to align with the true output of the system.
2. We establish that the proposed online algorithm achieves sublinear regret in identifying the true convex input-output map of the system.
3. The developed algorithm has several desirable properties:
   - it preserves the local input data privacy of every sub-system;
   - it requires only the latest measurements for updates, which eliminates the need to store historical data and substantially reduces the communication bandwidth requirement.

We present a numerical simulation study to demonstrate the performance of the proposed algorithm. The predictive control problem (1) is instantiated as a voltage regulation problem in power systems. The numerical results corroborate the efficacy of the proposed algorithm in adaptively updating the input-output map of the test power system utilizing the latest measurements. The results establish that having access to an accurate nonlinear model provided by the proposed algorithm results in superior control performance compared to traditional linear models for power distribution systems, underscoring the practical value of our framework.
At this point, we emphasize that the current work is not related to the body of research on the reconstruction and identification of unknown topology of an interconnected system using time-series measurement data [18, 19, 20].
The rest of the paper is organized as follows: Section I-B introduces some key definitions and notations used throughout the article. Sections II and III delve into the distributed system model identification framework and provide the preliminary analysis to aid the development of the model identification algorithm. Section IV presents the proposed algorithm, its convergence analysis and the distributed implementation details. A predictive control problem with online model updates is presented in Section V. Section VI provides the simulation study and demonstrates the numerical results on the performance of the developed algorithm in solving the predictive control problem with online model updates introduced in Section V. The concluding remarks are provided in Section VII with some directions for future research.

I-B Definition and Notations

In this paper, we denote matrices in boldface. For a vector $x\in\mathbb{R}^{n}$, we denote its $\ell^{2}$-norm and the norm induced by a matrix $\mathbf{A}\succ 0$ by $\|x\|_{2}$ and $\|x\|_{\mathbf{A}}$, respectively. The vertical and horizontal concatenations of matrices $\mathbf{A}^{i}\in\mathbb{R}^{m\times n}$, $i=1,\dots,N$, are denoted as $[\mathbf{A}^{1};\mathbf{A}^{2};\cdots;\mathbf{A}^{N}]\in\mathbb{R}^{Nm\times n}$ and $[\mathbf{A}^{1},\mathbf{A}^{2},\cdots,\mathbf{A}^{N}]\in\mathbb{R}^{m\times Nn}$, respectively. The Kronecker product of matrices $\mathbf{A}^{i}$ and $\mathbf{A}^{j}$ is denoted as $\mathbf{A}^{i}\otimes\mathbf{A}^{j}$. For a matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, $\operatorname*{vec}(\mathbf{A})\in\mathbb{R}^{mn}$ is a column vector created by concatenating the column vectors of $\mathbf{A}$ from left to right, and $\operatorname{null}\{\mathbf{A}\}:=\{x\in\mathbb{R}^{n}\ |\ \mathbf{A}x=0\}$ denotes the null space of $\mathbf{A}$. Given a set $S$ of vectors, the linear span of $S$ is defined as $\operatorname{span}\{S\}:=\{\sum_{i=1}^{N}v_{i}x_{i}\ |\ v_{i}\in\mathbb{R},x_{i}\in S\}$. The scalar element of the $i^{th}$ row and $j^{th}$ column of $\mathbf{A}$ is denoted as $\mathbf{A}_{i}^{j}$, and the $j^{th}$ row and column of the matrix $\mathbf{A}$ are denoted as $\mathbf{A}_{j,:}$ and $\mathbf{A}_{:,j}$, respectively. The identity matrix and the vector with all entries equal to $1$ of dimension $n$ are denoted as $\mathbf{I}_{n}$ and $1_{n}$, respectively.

A graph $\mathcal{G}$ is denoted by a pair $(\mathcal{V},\mathcal{E})$, where $\mathcal{V}$ is a set of vertices (or nodes) and $\mathcal{E}$ is a set of edges, which are ordered pairs of distinct elements of $\mathcal{V}$. If an edge from $j\in\mathcal{V}$ to $i\in\mathcal{V}$ exists, it is denoted as $(i,j)\in\mathcal{E}$. The set of neighboring sub-systems of node $i\in\mathcal{V}$ is called the neighborhood of node $i$ and is denoted by $\mathcal{N}_{i}=\{j\ |\ (i,j)\in\mathcal{E}\}$. In what follows, we use the terms agents, nodes, and sub-systems interchangeably. A continuous function $f:\mathbb{R}^{p}\to\mathbb{R}$ is called Lipschitz continuous with constant $L>0$ if the following inequality holds: $|f(x)-f(y)|\leq L\|x-y\|,\ \forall\ x,y\in\mathbb{R}^{p}$. Given a norm $\|\cdot\|$ and a set $K\subset\mathbb{R}^{p}$, define the diameter of $K$ with respect to this norm as $Diam_{\|\cdot\|}(K):=\sup_{x,y\in K}\|x-y\|$. In what follows, $O(\cdot)$ and $o(\cdot)$ denote the standard Big-O and Little-o notations, respectively [21].

II Agent Based System Framework

We consider a networked system represented by a graph $\mathcal{G}(\mathcal{V},\mathcal{E})$ consisting of $|\mathcal{V}|:=N$ nodes (or sub-systems). Each node $i$ has an actuator applying the control decision $\text{u}_{i}\in\mathbb{R}^{\mathbf{d_{u}}_{i}}$. Assume that $\mathbf{d_{y}}$ sensors are deployed in the network $\mathcal{G}$. The measurements of all these sensors are sent to a fusion center that collects the sensor outputs to create a measurement $\widehat{y}(t)\in\mathbb{R}^{\mathbf{d_{y}}}$ of the true global observable $y(t)$. Every sub-system $i$ maintains a local estimate $\widehat{y}^{i}_{\theta_{i}}(t)$ of the global observable $y(t)$ via a model of the kind in (2). In particular,

$$
\widehat{y}^{i}_{\theta_{i}}(t):=\phi_{i}(\text{u}_{i}(t),\theta_{i}),\ \text{for all}\ i=1,2,\dots,N, \tag{3}
$$

where the parameter $\theta_{i}\in\mathbb{R}^{\mathbf{d}_{\theta_{i}}}$, with $\sum_{i=1}^{N}\mathbf{d}_{\theta_{i}}=\mathbf{d}_{\theta}$. The output estimate of the network is defined as

$$
\widehat{y}_{\theta}(t)=\frac{1}{N}\sum_{i=1}^{N}\widehat{y}^{i}_{\theta_{i}}(t)=\frac{1}{N}\sum_{i=1}^{N}\phi_{i}(\text{u}_{i}(t),\theta_{i}). \tag{4}
$$

Here we assume that the parametric model above is accurate in the sense that there exists a $\theta^{\star}:=[\theta_{1}^{\star};\theta_{2}^{\star};\dots;\theta_{N}^{\star}]\in\mathbb{R}^{\mathbf{d}_{\theta}}$ such that

$$
\widehat{y}(t)=\widehat{y}_{\theta^{\star}}(t)=\frac{1}{N}\sum_{i=1}^{N}\phi_{i}(\text{u}_{i}(t),\theta^{\star}_{i}),\ \text{for all}\ t. \tag{5}
$$

With (5), we formulate the following predictive control problem,

$$
\begin{aligned}
\operatorname*{minimize}_{\text{u}(1),\text{u}(2),\dots,\text{u}(T)\in\mathbb{R}^{\mathbf{d_{u}}}}\quad & \sum_{k=1}^{T}\sum_{i=1}^{N}\ell_{i}(\widehat{y}_{\theta}(k),k)+h_{i}(\text{u}_{i}(k),k) \\
\text{subject to}\quad & \text{u}_{i}(k)\in\mathcal{U}_{i},\ \text{for all}\ i=1,2,\dots,N,\ \text{for all}\ k.
\end{aligned}
\tag{6}
$$

Note that (6) is equivalent to problem (1) if $y(t)=\widehat{y}(t)=\widehat{y}_{\theta}(t)$ holds for all $t$. This is typically assumed in the state of the art when solving problem (1) (see, for example, [11], Assumption 4, and [12], Assumption 5). However, when the output estimate $\widehat{y}_{\theta}$ does not match the true measurements of the system, the control performance is adversely affected. In this article, we take the approach of adaptively improving the parametric output estimate of the sub-systems to maintain the validity of (5). Denote $L(\theta)=\left|\sum_{k=1}^{T}\sum_{i=1}^{N}\ell_{i}(\widehat{y}_{\theta}(k),k)-\sum_{k=1}^{T}\sum_{i=1}^{N}\ell_{i}(\widehat{y}(k),k)\right|$ as the cost of model mismatch with the controller running over a horizon $T$. Our goal is to develop algorithms that minimize the model mismatch quantified by $L(\theta)$ in real time, so that the performance of the closed-loop controllers in solving (6) is not compromised by model mismatches.

III Model Update Problem and the Distributed Reformulation

Given the criticality of (5), we aim to reduce the model mismatch by finding the parameter $\theta$ that solves

$$
\operatorname*{minimize}_{\theta}\ \mathbf{r}(\theta):=\frac{1}{2}\sum_{t=1}^{T}\left\lVert\widehat{y}(t)-\frac{1}{N}\sum_{i=1}^{N}\phi_{i}(\text{u}_{i}(t),\theta_{i})\right\rVert^{2}. \tag{7}
$$
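To make the objective in (7) concrete, the following is a minimal sketch that evaluates the averaged output estimate (4) and the mismatch $\mathbf{r}(\theta)$ on synthetic data. It assumes, purely for illustration, linear local maps $\phi_{i}(\text{u}_{i},\theta_{i})=\Theta_{i}\text{u}_{i}$ with $\theta_{i}=\operatorname*{vec}(\Theta_{i})$; the function names, dimensions, and random data are hypothetical and not taken from the article.

```python
import numpy as np

def network_output(thetas, u_t):
    """Average of the local estimates, as in (4); thetas[i] has shape (d_y, d_u_i)."""
    return sum(Th @ u for Th, u in zip(thetas, u_t)) / len(thetas)

def mismatch(thetas, inputs, measurements):
    """Objective r(theta) in (7): half the summed squared output error."""
    return 0.5 * sum(
        np.sum((y_hat - network_output(thetas, u_t)) ** 2)
        for u_t, y_hat in zip(inputs, measurements)
    )

rng = np.random.default_rng(0)
N, d_y, d_u, T = 3, 2, 2, 5                      # illustrative sizes
theta_star = [rng.normal(size=(d_y, d_u)) for _ in range(N)]
inputs = [[rng.normal(size=d_u) for _ in range(N)] for _ in range(T)]
# Measurements generated by the "true" parameters, mirroring assumption (5).
measurements = [network_output(theta_star, u_t) for u_t in inputs]

r_true = mismatch(theta_star, inputs, measurements)   # no model mismatch
theta_bad = [Th + 0.1 for Th in theta_star]           # perturbed model
r_bad = mismatch(theta_bad, inputs, measurements)     # positive mismatch
```

Under these assumptions the mismatch vanishes at the true parameters and becomes positive once the model is perturbed, which is the quantity the model update is meant to drive down.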

For the subsequent development, we make the following assumptions:

Assumption 1.

The control decisions generated via problem (6) ensure the stability of the networked system $\mathcal{S}$ .

Assumption 2.

Functions $\phi_{i}$ in (3) are proper, convex, and Lipschitz continuous with constant $L_{i}$ for all $i$ .

Note that under the network model, each sub-system $i$ creates an estimate $\widehat{y}^{i}_{\theta_{i}}(t)$ to determine the effect of its regional control decisions $\text{u}_{i}(t)$ on the output (reflected in the measurements). Problem (7) can be interpreted as a distributed optimization problem across a network of sub-systems as:

$$
\operatorname*{minimize}_{\theta}\ \frac{1}{2N^{2}}\sum_{t=1}^{T}\left\lVert\sum_{i=1}^{N}\left(\widehat{y}(t)-\phi_{i}(\text{u}_{i}(t),\theta_{i})\right)\right\rVert^{2}, \tag{8}
$$

where we rewrite $\widehat{y}$ as $\frac{1}{N}\sum_{i=1}^{N}\widehat{y}$ in (7) to derive (8). Optimization problem (8) couples the parameters and data of all the agents. We next consider a reformulation described in [22] to set up a formulation for a distributed algorithm, allowing the sub-systems to perform local computations and communicate with the neighboring sub-systems in the network $\mathcal{G}(\mathcal{V},\mathcal{E})$ to determine a solution for problem (8). Assuming the network $\mathcal{G}(\mathcal{V},\mathcal{E})$ is connected, let $\mathbf{P}\in\mathbb{R}^{N\times N}$ be a finite weight matrix associated with the graph satisfying the following assumption:

Assumption 3.

$\operatorname{null}\{\mathbf{P}\}=\operatorname{span}\{1_{N}\}$ .

A few examples of matrices $\mathbf{P}$ that satisfy Assumption 3 are:

(i) Laplacian matrix: The Laplacian matrix of the graph [23] is defined as

$$
\mathbf{P}_{ij}=\begin{cases}-1,&\text{if}\ (i,j)\in\mathcal{E},\\ |\mathcal{N}_{i}|,&\text{if}\ i=j,\\ 0,&\text{otherwise}.\end{cases}
$$

(ii) A matrix created via a column stochastic matrix: $\mathbf{P}=\mathbf{I}_{N}-\tilde{\mathbf{P}}^{\top}$, where $\tilde{\mathbf{P}}$ is a column stochastic matrix, that is, $\tilde{\mathbf{P}}_{ij}\in[0,1]$ for all $i,j$ and $1_{N}^{\top}\tilde{\mathbf{P}}=1_{N}^{\top}$.
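Both constructions can be checked numerically on a small example. The sketch below builds the Laplacian of an undirected four-node path graph and a matrix of type (ii) from a hand-picked column stochastic matrix, and then verifies that $\mathbf{P}1_{N}=0$ and that the null space is one-dimensional, as Assumption 3 requires. The graph, the stochastic matrix, and the helper name `laplacian` are illustrative assumptions.

```python
import numpy as np

def laplacian(num_nodes, edges):
    """Graph Laplacian for an undirected edge list of (i, j) pairs."""
    P = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        P[i, j] -= 1.0
        P[j, i] -= 1.0
        P[i, i] += 1.0
        P[j, j] += 1.0
    return P

# Example (i): Laplacian of the connected path graph 0 - 1 - 2 - 3.
P_lap = laplacian(4, [(0, 1), (1, 2), (2, 3)])
null_dim_lap = 4 - np.linalg.matrix_rank(P_lap, tol=1e-9)

# Example (ii): P = I - P_tilde^T with a column stochastic P_tilde.
P_tilde = np.array([[0.5, 0.25, 0.0],
                    [0.5, 0.5, 0.5],
                    [0.0, 0.25, 0.5]])   # every column sums to 1
P_cs = np.eye(3) - P_tilde.T
null_dim_cs = 3 - np.linalg.matrix_rank(P_cs, tol=1e-9)

# The all-ones vector lies in the null space in both cases.
residual_lap = P_lap @ np.ones(4)
residual_cs = P_cs @ np.ones(3)
```

In both cases the null space has dimension one, so it coincides with $\operatorname{span}\{1_{N}\}$; connectivity of the underlying graph is what rules out additional null directions.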

Let $\Phi(\text{u}(t),\theta):=[\phi_{1}(\text{u}_{1}(t),\theta_{1});\phi_{2}(\text{u}_{2}(t),\theta_{2});\dots;\phi_{N}(\text{u}_{N}(t),\theta_{N})]\in\mathbb{R}^{N\mathbf{d_{y}}}$, $\widehat{\mathbf{y}}(t):=[\widehat{y}(t);\widehat{y}(t);\dots;\widehat{y}(t)]\in\mathbb{R}^{N\mathbf{d_{y}}}$, $\widehat{\mathbf{P}}:=\mathbf{P}\otimes\mathbf{I}_{\mathbf{d_{y}}}$, and $\mathbf{x}=[\theta;w]\in\mathbb{R}^{\mathbf{d}_{\theta}+N\mathbf{d_{y}}}$. Consider the problem

$$
\operatorname*{argmin}_{\mathbf{x}=[\theta;w]}\sum_{t=1}^{T}\left[f_{t}(\mathbf{x}):=\frac{1}{2N^{2}}\left\lVert\Phi(\text{u}(t),\theta)-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w\right\rVert^{2}\right]. \tag{9}
$$

Let $F(\mathbf{x}):=\sum_{t=1}^{T}f_{t}(\mathbf{x})$ for brevity of notation. Problem (9) allows for the objective function $f_{t}$ to be distributed across different sub-systems and allows for the synthesis of a distributed algorithm. In the next result, we establish that solving problem (9) indeed aids towards our objective of solving (7). Specifically, Lemma 1 shows that the solution to (9) is also a solution for (8) and thus provides a solution for (7). We make the following assumption,

Assumption 4.

The set of minimizers of problem (9), $\operatorname*{argmin}_{\mathbf{x}}F(\mathbf{x})$, is non-empty and bounded.

Lemma 1.

(Optimal solutions of (8) and (9)). Let the matrix $\widehat{\mathbf{P}}$ in (9) be such that $\mathbf{P}$ satisfies Assumption 3. If $\mathbf{x}^{\star}=[\theta^{\star};w^{\star}]$ is a solution to problem (9), then $\theta^{\star}$ is also a solution to problem (8).

Proof.

Using the first order optimality conditions for (9) with convex function $F$ (sum of composition of convex and increasing functions), we have $\nabla_{\theta}F(\mathbf{x}^{\star})=0,\;\nabla_{w}F(\mathbf{x}^{\star})=0$ . Namely, for all $t\in\{1,2,\dots,T\}$ ,

$$
\nabla_{\theta}\Phi(\text{u}(t),\theta^{\star})^{\top}\big(\Phi(\text{u}(t),\theta^{\star})-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w^{\star}\big)=0, \tag{10}
$$

$$
\widehat{\mathbf{P}}^{\top}\big(\Phi(\text{u}(t),\theta^{\star})-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w^{\star}\big)=0. \tag{11}
$$

Since $\operatorname{null}\{\mathbf{P}\}=\operatorname{span}\{1_{N}\}$, it follows from (11) that there exists $z^{\star}$ such that

$$
1_{N}\otimes z^{\star}:=\Phi(\text{u}(t),\theta^{\star})-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w^{\star}. \tag{12}
$$

Multiplying both sides of (12) by $1_{N}^{\top}\otimes\mathbf{I}_{\mathbf{d_{y}}}$, we get

$$
\begin{aligned}
Nz^{\star}&=(1_{N}^{\top}\otimes\mathbf{I}_{\mathbf{d_{y}}})(1_{N}\otimes z^{\star})\\
&=(1_{N}^{\top}\otimes\mathbf{I}_{\mathbf{d_{y}}})\big(\Phi(\text{u}(t),\theta^{\star})-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w^{\star}\big)\\
&=\sum_{i=1}^{N}\big(\phi_{i}(\text{u}_{i}(t),\theta_{i}^{\star})-\widehat{y}(t)\big)-(1_{N}^{\top}\otimes\mathbf{I}_{\mathbf{d_{y}}})(\mathbf{P}\otimes\mathbf{I}_{\mathbf{d_{y}}})w^{\star}\\
&=\sum_{i=1}^{N}\big(\phi_{i}(\text{u}_{i}(t),\theta_{i}^{\star})-\widehat{y}(t)\big)-\big((1_{N}^{\top}\mathbf{P})\otimes\mathbf{I}_{\mathbf{d_{y}}}\big)w^{\star}\\
&=\sum_{i=1}^{N}\big(\phi_{i}(\text{u}_{i}(t),\theta_{i}^{\star})-\widehat{y}(t)\big),
\end{aligned}
$$

where we used Assumption 3 in the last step. Thus,

$$
z^{\star}=\frac{1}{N}\sum_{i=1}^{N}\big(\phi_{i}(\text{u}_{i}(t),\theta_{i}^{\star})-\widehat{y}(t)\big). \tag{13}
$$

Substituting (12) and (13) in (10) gives, for all $t\in\{1,\dots,T\}$,

$$
\nabla_{\theta_{j}}\phi_{j}(\text{u}_{j}(t),\theta^{\star}_{j})^{\top}\Big[\sum_{i=1}^{N}\big(\phi_{i}(\text{u}_{i}(t),\theta_{i}^{\star})-\widehat{y}(t)\big)\Big]=0,\ \forall j\in\{1,2,\dots,N\}. \tag{14}
$$

As problem (8) is convex, by the optimality conditions, any $\theta^{\prime}$ is a solution of (8) if and only if, for all $t\in\{1,\dots,T\}$,

$$
\nabla_{\theta_{j}}\phi_{j}(\text{u}_{j}(t),\theta^{\prime}_{j})^{\top}\Big[\sum_{i=1}^{N}\big(\phi_{i}(\text{u}_{i}(t),\theta_{i}^{\prime})-\widehat{y}(t)\big)\Big]=0,\ \forall j\in\{1,2,\dots,N\}.
$$

Thus, we conclude $\theta^{\star}$ is a solution to (8). ∎

Using Lemma 1, we can concentrate on solving (9), which facilitates the development of distributed solutions, as will be demonstrated in the following section.
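The step in the proof of Lemma 1 where the $w^{\star}$ term drops out relies on the mixed-product property of the Kronecker product, $(1_{N}^{\top}\otimes\mathbf{I}_{\mathbf{d_{y}}})(\mathbf{P}\otimes\mathbf{I}_{\mathbf{d_{y}}})=(1_{N}^{\top}\mathbf{P})\otimes\mathbf{I}_{\mathbf{d_{y}}}$. The following sketch checks this identity numerically, assuming a symmetric path-graph Laplacian for $\mathbf{P}$ (so that $1_{N}^{\top}\mathbf{P}=0$); the sizes and the test vector are illustrative.

```python
import numpy as np

N, d_y = 4, 2
# Symmetric Laplacian of the path graph 0 - 1 - 2 - 3 (illustrative choice of P).
P = np.array([[1.0, -1.0, 0.0, 0.0],
              [-1.0, 2.0, -1.0, 0.0],
              [0.0, -1.0, 2.0, -1.0],
              [0.0, 0.0, -1.0, 1.0]])

P_hat = np.kron(P, np.eye(d_y))                   # P ⊗ I_{d_y}
S = np.kron(np.ones((1, N)), np.eye(d_y))         # 1_N^T ⊗ I_{d_y}

lhs = S @ P_hat                                   # (1^T ⊗ I)(P ⊗ I)
rhs = np.kron(np.ones((1, N)) @ P, np.eye(d_y))   # (1^T P) ⊗ I

w = np.arange(N * d_y, dtype=float)               # arbitrary auxiliary variable
summed = S @ (P_hat @ w)                          # block-sum of P_hat w
```

For this choice of $\mathbf{P}$ the block-sum of $\widehat{\mathbf{P}}w$ vanishes for any $w$, which is why the auxiliary variable does not affect the averaged residual in (13).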

IV Online Model Update Algorithm

Recall the problem statement in Section II. Assume a local controller is available at each sub-system $i$ that solves problem (6). Starting at any time $t_{0}$ the local controller has access to the model with parameter $\theta^{0}$ that is used to estimate the output $\widehat{y}_{\theta^{0}}(t_{0})$ for $[t_{0},t_{0}+T-1]$ based on which it generates local control decisions $\text{u}_{i}$ that are implemented in the system by the local actuators at some time $t_{1}\geq t_{0}+\Delta t$ , where $\Delta t$ is the amount of time it takes to solve problem (6). At time $t_{1}$ a measurement $\widehat{y}(t_{1})$ of the observable is obtained. If the model predicted output, $\widehat{y}_{\theta^{0}}(t_{1})$ , does not match the measurement, $\widehat{y}(t_{1})$ , the model parameter needs to be updated. We meet this objective by developing an online algorithm to solve the system model update problem (9). The function $f_{t}(\mathbf{x})$ in (9) is used to update the parameters to a new value at time $t$ given the measurement $\widehat{y}(t)$ and the control decisions $\text{u}_{i}(t)$ . We aim to minimize the “regret” of the online algorithm compared to a model devised using all the input-output pairs in hindsight. Let $\{\mathbf{x}(t)\}_{t\geq 1}$ denote the solution parameters generated by our algorithm, we formally define regret of our algorithm after any time $T$ as,

$$
\mathcal{R}_{T}:=\sum_{t=1}^{T}f_{t}(\mathbf{x}(t))-\min_{\mathbf{x}}\sum_{t=1}^{T}f_{t}(\mathbf{x}). \tag{15}
$$

Note that if $\mathcal{R}_{T}$ is zero, then the solution sequence $\{\mathbf{x}(t)\}_{t\geq 1}$ is such that the total error incurred is equal to the error obtained by minimizing the error objective function in (9) created by using the control decisions and measurement data over the entire time horizon $[t_{0},t_{0}+T-1]$ . We propose Algorithm 1 to update $\mathbf{x}(t)=[\theta(t);w(t)]$ in an online manner.

For $t=0,1,2,\dots$

- Given $\mathbf{x}(t)=[\theta(t);w(t)]\in\mathbb{R}^{\mathbf{d}_{\theta}+N\mathbf{d_{y}}}$, $\text{u}(t)$, $\widehat{y}(t)$
- Update $\mathbf{x}(t+1)=\mathbf{x}(t)-\eta_{t}\nabla f_{t}(\mathbf{x}(t))$

Algorithm 1 Online Input-Output Map Update
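The update in Algorithm 1 can be sketched in a few lines of NumPy. The sketch below is not the authors' implementation: it assumes linear local maps $\phi_{i}(\text{u}_{i},\theta_{i})=\Theta_{i}\text{u}_{i}$, a path-graph Laplacian for $\mathbf{P}$, step sizes $\eta_{t}=c_{1}/\sqrt{t}$ starting at $t=1$, and, to keep the demonstration easy to reason about, a single operating point that is held fixed so that every $f_{t}$ is the same convex quadratic. All names and numerical values are hypothetical.

```python
import numpy as np

def online_step(Theta, w, u_t, y_t, P_hat, eta):
    """One update x(t+1) = x(t) - eta * grad f_t(x(t)) with x = [theta; w].

    Assumes linear local maps phi_i(u_i, theta_i) = Theta[i] @ u_i (illustrative).
    """
    N, d_y, _ = Theta.shape
    Phi = np.einsum("iab,ib->ia", Theta, u_t)              # row i: phi_i(u_i, theta_i)
    z = Phi.reshape(-1) - np.tile(y_t, N) - P_hat @ w      # stacked residual z(t)
    loss = 0.5 * (z @ z) / N**2                            # f_t(x(t)) from (9)
    Z = z.reshape(N, d_y)
    grad_Theta = np.einsum("ia,ib->iab", Z, u_t) / N**2    # gradient w.r.t. Theta[i]
    grad_w = -(P_hat.T @ z) / N**2                         # gradient w.r.t. w
    return Theta - eta * grad_Theta, w - eta * grad_w, loss

rng = np.random.default_rng(1)
N, d_y, d_u, T, c1 = 3, 2, 2, 200, 1.0
P = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])                           # path-graph Laplacian
P_hat = np.kron(P, np.eye(d_y))
Theta_star = rng.normal(size=(N, d_y, d_u))                # hypothetical true parameters
u_fixed = rng.uniform(-1.0, 1.0, size=(N, d_u))            # one operating point, held fixed
y_meas = np.einsum("iab,ib->a", Theta_star, u_fixed) / N   # measurement generated via (5)

Theta, w = np.zeros((N, d_y, d_u)), np.zeros(N * d_y)
losses = []
for t in range(1, T + 1):
    Theta, w, loss = online_step(Theta, w, u_fixed, y_meas, P_hat, c1 / np.sqrt(t))
    losses.append(loss)
```

Each step touches only the latest input and measurement, so no history needs to be stored. In this fixed-input demonstration the step sizes stay below the descent threshold of the quadratic, and the recorded losses are non-increasing; with time-varying inputs the same `online_step` function applies unchanged.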

Lemma 2 establishes the boundedness of the gradient steps involved in Algorithm 1.

Lemma 2.

Let Assumptions 1-4 hold. There exist constants $\eta_{t}>0$ and $\delta<\infty$ such that $\|\nabla f_{t}\|:=\left\|\left[\begin{array}{c}\nabla_{\theta}f_{t}\\ \nabla_{w}f_{t}\end{array}\right]\right\|\leq\delta$ for all $t\in\{1,\dots,T\}$, with $\mathbf{x}(t)$ updated by Algorithm 1.

Proof.

We start by presenting three supporting claims that we later utilize to prove the desired result.

Claim 1: Any $\gamma$ sub-level set $C_{\gamma}:=\{\mathbf{x}\ |\ f_{t}(\mathbf{x})\leq\gamma\}$ of $f_{t}$ is bounded.

Proof. Given $\beta>0$, let $v^{\star}\in\operatorname*{argmin}_{\mathbf{x}}f_{t}(\mathbf{x})$, with $\|v^{\star}\|<\infty$. Define $\Gamma_{\beta}:=\{\mathbf{x}\ |\ \|\mathbf{x}-v^{\star}\|=\beta\}$ and $v_{\beta}=\inf_{\mathbf{x}\in\Gamma_{\beta}}f_{t}(\mathbf{x})$. Note that $\Gamma_{\beta}$ is non-empty and compact. Since $f_{t}$ is continuous, by Weierstrass's theorem $v_{\beta}$ is attained at some point of $\Gamma_{\beta}$, and we have $v_{\beta}>f_{t}(v^{\star})$. For any $\mathbf{x}$ such that $\|\mathbf{x}-v^{\star}\|>\beta$, let $\alpha=\frac{\beta}{\|\mathbf{x}-v^{\star}\|}$ and $\tilde{\mathbf{x}}=(1-\alpha)v^{\star}+\alpha\mathbf{x}$. By convexity of $f_{t}$, we have

$$
(1-\alpha)f_{t}(v^{\star})+\alpha f_{t}(\mathbf{x})\geq f_{t}(\tilde{\mathbf{x}}).
$$

Since $\|\tilde{\mathbf{x}}-v^{\star}\|=\alpha\|\mathbf{x}-v^{\star}\|=\beta$, we have $\tilde{\mathbf{x}}\in\Gamma_{\beta}$ and

$$
f_{t}(\tilde{\mathbf{x}})\geq v_{\beta}=\inf_{\mathbf{x}\in\Gamma_{\beta}}f_{t}(\mathbf{x}).
$$

Combining the above two relations, we get

$$
\begin{aligned}
f_{t}(\mathbf{x})&\geq\frac{f_{t}(\tilde{\mathbf{x}})-f_{t}(v^{\star})}{\alpha}+f_{t}(v^{\star})\geq f_{t}(v^{\star})+\frac{v_{\beta}-f_{t}(v^{\star})}{\alpha}\\
&=f_{t}(v^{\star})+\frac{v_{\beta}-f_{t}(v^{\star})}{\beta}\|\mathbf{x}-v^{\star}\|.
\end{aligned}
$$

Because $v_{\beta}>f_{t}(v^{\star})$ and $f_{t}(\mathbf{x})\leq\gamma$, we derive

$$
\|\mathbf{x}-v^{\star}\|\leq\frac{\beta(\gamma-f_{t}(v^{\star}))}{v_{\beta}-f_{t}(v^{\star})}.
$$

Thus, $\|\mathbf{x}-v^{\star}\|\leq\max\left\{\beta,\frac{\beta(\gamma-f_{t}(v^{\star}))}{v_{\beta}-f_{t}(v^{\star})}\right\}$. ∎

Claim 2: There exists a sufficiently small $\eta_{t}>0$ such that, for all $t\in\{1,2,\dots,T\}$, $\mathbf{x}(t+1)$ updated by Algorithm 1 lies in the sub-level set $C_{f_{t}(\mathbf{x}(t))}:=\{\mathbf{x}\ |\ f_{t}(\mathbf{x})\leq f_{t}(\mathbf{x}(t))\}$.

Proof. By Taylor series expansion and $f_{t}\geq 0$,

$$
\begin{aligned}
f_{t}(\mathbf{x}(t+1))&=f_{t}\big(\mathbf{x}(t)-\eta_{t}\nabla f_{t}(\mathbf{x}(t))\big)\\
&=f_{t}(\mathbf{x}(t))-\eta_{t}\|\nabla f_{t}(\mathbf{x}(t))\|^{2}+o(\eta_{t}\nabla f_{t}(\mathbf{x}(t)))\\
&=f_{t}(\mathbf{x}(t))-\eta_{t}\left(\|\nabla f_{t}(\mathbf{x}(t))\|^{2}+\frac{o(\eta_{t}\nabla f_{t}(\mathbf{x}(t)))}{\eta_{t}}\right)\\
&\leq f_{t}(\mathbf{x}(t)),
\end{aligned}
$$

for sufficiently small $\eta_{t}>0$ by the definition of $o(\eta_{t})$, which completes the proof. ∎

Claim 3: Let $\mathbf{z}(t):=\Phi(\text{u}(t),\theta(t))-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w(t)$. Then, $\exists\ \overline{\mathbf{z}}<\infty$ such that $\|\mathbf{z}(t)\|\leq\overline{\mathbf{z}}$ for all $t\in\{1,2,\dots,T\}$.

Proof. Under Assumptions 1 and 2,

$$
\begin{aligned}
\|\mathbf{z}(t)-\mathbf{z}(1)\|&=\big\|\Phi(\text{u}(t),\theta(t))-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w(t)-\Phi(\text{u}(1),\theta(1))+\widehat{\mathbf{y}}(1)+\widehat{\mathbf{P}}w(1)\big\|\\
&\leq\big\|\big(\Phi(\text{u}(t),\theta(t))-\widehat{\mathbf{P}}w(t)\big)-\big(\Phi(\text{u}(1),\theta(1))-\widehat{\mathbf{P}}w(1)\big)\big\|+\|\widehat{\mathbf{y}}(t)-\widehat{\mathbf{y}}(1)\|\\
&\leq L_{m}\left\|\left[\begin{array}{c}\text{u}(t)\\ \mathbf{x}(t)\end{array}\right]-\left[\begin{array}{c}\text{u}(1)\\ \mathbf{x}(1)\end{array}\right]\right\|+\|\widehat{\mathbf{y}}(t)-\widehat{\mathbf{y}}(1)\|.
\end{aligned}
$$

Therefore, $\|\mathbf{z}(t)-\mathbf{z}(1)\|\leq 2\overline{\text{y}}+2L_{m}(\overline{\text{u}}+D_{m})$, where the results of Claims 1 and 2 are applied with $D_{m}:=\max_{t}Diam_{\|\cdot\|}(C_{f_{t}(\mathbf{x}(t))})$ and $L_{m}:=\max_{1\leq i\leq N}L_{i}$. Therefore, there exists a $\overline{\mathbf{z}}<\infty$ that bounds $\|\mathbf{z}(t)\|$ for all $t\in\{1,2,\dots,T\}$. ∎

With all the claims, we circle back to the proof of Lemma 2. At any time index $t$ ,

$$
\nabla f_{t}=\left[\begin{array}{c}\nabla_{\theta}f_{t}\\ \nabla_{w}f_{t}\end{array}\right]=\left[\begin{array}{c}\nabla_{\theta}\Phi(\text{u}(t),\theta(t))^{\top}\mathbf{z}(t)\\ -\widehat{\mathbf{P}}^{\top}\mathbf{z}(t)\end{array}\right].
$$

Thus,

$$
\begin{aligned}
\|\nabla f_{t}\|^{2}&\leq\|\nabla_{\theta}\Phi(\text{u}(t),\theta(t))^{\top}\mathbf{z}(t)\|^{2}+\|\widehat{\mathbf{P}}^{\top}\mathbf{z}(t)\|^{2}+2\|\nabla_{\theta}\Phi(\text{u}(t),\theta(t))^{\top}\mathbf{z}(t)\|\,\|\widehat{\mathbf{P}}^{\top}\mathbf{z}(t)\|\\
&\leq(NL_{m}+\|\widehat{\mathbf{P}}\|)^{2}\,\overline{\mathbf{z}}^{2},
\end{aligned}
$$

where we used Assumption 2 and Claim 3. Hence, $\|\nabla f_{t}\|\leq(NL_{m}+\|\widehat{\mathbf{P}}\|)\overline{\mathbf{z}}:=\delta$. This completes the proof. ∎
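The block structure of $\nabla f_{t}$ used above can be sanity-checked against central finite differences. The sketch below again assumes linear local maps $\phi_{i}(\text{u}_{i},\theta_{i})=\Theta_{i}\text{u}_{i}$ and a path-graph Laplacian, both illustrative choices, and it carries the $1/N^{2}$ scaling from the definition of $f_{t}$ in (9), which only rescales the gradient by a constant. The function names and random data are hypothetical.

```python
import numpy as np

def f_t(x, u, y, P_hat):
    """f_t(x) from (9) and the residual z, for phi_i(u_i, theta_i) = Theta[i] @ u_i."""
    N, d_u = u.shape
    d_y = y.size
    Theta = x[: N * d_y * d_u].reshape(N, d_y, d_u)
    w = x[N * d_y * d_u:]
    z = np.einsum("iab,ib->ia", Theta, u).reshape(-1) - np.tile(y, N) - P_hat @ w
    return 0.5 * (z @ z) / N**2, z

def grad_f_t(x, u, y, P_hat):
    """Analytic gradient: [grad_theta Phi^T z; -P_hat^T z], scaled by 1/N^2."""
    N, _ = u.shape
    d_y = y.size
    _, z = f_t(x, u, y, P_hat)
    g_theta = np.einsum("ia,ib->iab", z.reshape(N, d_y), u).reshape(-1)
    g_w = -(P_hat.T @ z)
    return np.concatenate([g_theta, g_w]) / N**2

rng = np.random.default_rng(2)
N, d_y, d_u = 3, 2, 2
P = np.array([[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
P_hat = np.kron(P, np.eye(d_y))
u, y = rng.normal(size=(N, d_u)), rng.normal(size=d_y)
x = rng.normal(size=N * d_y * d_u + N * d_y)

h = 1e-6  # finite-difference step
numeric = np.array([
    (f_t(x + h * e, u, y, P_hat)[0] - f_t(x - h * e, u, y, P_hat)[0]) / (2 * h)
    for e in np.eye(x.size)
])
analytic = grad_f_t(x, u, y, P_hat)
max_err = np.max(np.abs(numeric - analytic))
```

Agreement between `numeric` and `analytic` up to rounding confirms that the two blocks, one per sub-system parameter and one for the auxiliary variable $w$, are consistent with the objective in (9) under the stated assumptions.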

Theorem 1.

(Regret of Algorithm 1). Let Assumptions 1-4 hold. Let $\mathbf{x}^{\star}\in\operatorname*{argmin}_{\mathbf{x}}\sum_{t=1}^{T}f_{t}(\mathbf{x})$ and $\eta_{t}=\frac{c_{1}}{\sqrt{t}}$, $c_{1}>0$. Then, the regret of Algorithm 1 after any time $T$ is bounded. In particular,

$$
\mathcal{R}_{T}\leq\frac{\delta_{1}\sqrt{T}}{2}-\frac{\delta_{2}}{2}=O(\sqrt{T}),
$$

where $\delta_{1}$ and $\delta_{2}$ are some positive finite constants. Therefore, $\limsup_{T\to\infty}\mathcal{R}_{T}/T\rightarrow 0$.

Proof.

Let $\mathbf{x}^{\star}\in\operatorname*{argmin}_{\mathbf{x}}\sum_{t=1}^{T}f_{t}(\mathbf{x})$. Consider the update in Algorithm 1, $\mathbf{x}(t+1)-\mathbf{x}^{\star}=\mathbf{x}(t)-\eta_{t}\nabla f_{t}(\mathbf{x}(t))-\mathbf{x}^{\star}$. Then

$$
\|\mathbf{x}(t+1)-\mathbf{x}^{\star}\|^{2}\leq\|\mathbf{x}(t)-\mathbf{x}^{\star}\|^{2}+\eta_{t}^{2}\|\nabla f_{t}(\mathbf{x}(t))\|^{2}-2\eta_{t}\nabla f_{t}(\mathbf{x}(t))^{\top}(\mathbf{x}(t)-\mathbf{x}^{\star}).
$$

From Lemma 2, there exists $\delta=:\sup_{t}\|\nabla f_{t}(\mathbf{x}(t))\|<\infty$, so that

$$
\begin{aligned}
&\|\mathbf{x}(t+1)-\mathbf{x}^{\star}\|^{2}\leq\|\mathbf{x}(t)-\mathbf{x}^{\star}\|^{2}+\eta_{t}^{2}\delta^{2}-2\eta_{t}\nabla f_{t}(\mathbf{x}(t))^{\top}(\mathbf{x}(t)-\mathbf{x}^{\star})\\
&\Longrightarrow\ \nabla f_{t}(\mathbf{x}(t))^{\top}(\mathbf{x}(t)-\mathbf{x}^{\star})\leq\frac{\|\mathbf{x}(t)-\mathbf{x}^{\star}\|^{2}-\|\mathbf{x}(t+1)-\mathbf{x}^{\star}\|^{2}}{2\eta_{t}}+\frac{\eta_{t}\delta^{2}}{2}.
\end{aligned}
\tag{16}
$$

By convexity of $f_{t}$ ,

\displaystyle f_{t}(\mathbf{x}(t))-f_{t}(\mathbf{x}^{\star})\leq\nabla f_{t}(% \mathbf{x}(t))^{\top}(\mathbf{x}(t)-\mathbf{x}^{\star}).

(17)

Combining (16) and (17) gives

\displaystyle f_{t}(\mathbf{x}(t))-f_{t}(\mathbf{x}^{\star})\leq\frac{\|\mathbf{x}(t)-\mathbf{x}^{\star}\|^{2}-\|\mathbf{x}(t+1)-\mathbf{x}^{\star}\|^{2}}{2\eta_{t}}+\frac{\eta_{t}\delta^{2}}{2}.

Summing over $t=1$ to $T$ ,

\displaystyle\begin{aligned}
\mathcal{R}_{T}&=\sum_{t=1}^{T}f_{t}(\mathbf{x}(t))-\sum_{t=1}^{T}f_{t}(\mathbf{x}^{\star})\\
&\leq\sum_{t=1}^{T}\left(\frac{\|\mathbf{x}(t)-\mathbf{x}^{\star}\|^{2}-\|\mathbf{x}(t+1)-\mathbf{x}^{\star}\|^{2}}{2\eta_{t}}\right)+\sum_{t=1}^{T}\frac{\eta_{t}\delta^{2}}{2}\\
&=\frac{\|\mathbf{x}(1)-\mathbf{x}^{\star}\|^{2}}{2\eta_{1}}-\frac{\|\mathbf{x}(T+1)-\mathbf{x}^{\star}\|^{2}}{2\eta_{T}}+\frac{\delta^{2}}{2}\sum_{t=1}^{T}\eta_{t}\\
&\quad+\frac{1}{2}\sum_{t=2}^{T}\|\mathbf{x}(t)-\mathbf{x}^{\star}\|^{2}\left(\frac{1}{\eta_{t}}-\frac{1}{\eta_{t-1}}\right).
\end{aligned}

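From claims 1 and 2 in Lemma 2, there exists $\Xi<\infty$ such that $\sup_{t}\|\mathbf{x}(t)-\mathbf{x}^{\star}\|\leq\sup_{t}\|\mathbf{x}(t)\|+\|\mathbf{x}^{\star}\|\leq\Xi$ . Since the step size $\eta_{t}$ is non-increasing, $\frac{1}{\eta_{t}}-\frac{1}{\eta_{t-1}}\geq 0$ for all $t\geq 2$ . Dropping the non-positive term $-\|\mathbf{x}(T+1)-\mathbf{x}^{\star}\|^{2}/(2\eta_{T})$ and bounding each remaining $\|\mathbf{x}(t)-\mathbf{x}^{\star}\|^{2}$ by $\Xi^{2}$ , we obtain

\displaystyle\begin{aligned}
\mathcal{R}_{T}&\leq\frac{\Xi^{2}}{2}\left(\frac{1}{\eta_{1}}+\sum_{t=2}^{T}\left(\frac{1}{\eta_{t}}-\frac{1}{\eta_{t-1}}\right)\right)+\frac{\delta^{2}}{2}\sum_{t=1}^{T}\eta_{t}\\
&=\frac{\Xi^{2}}{2\eta_{T}}+\frac{\delta^{2}}{2}\sum_{t=1}^{T}\eta_{t}.
\end{aligned}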

For $\eta_{t}=\frac{c_{1}}{\sqrt{t}}$ , we have $\frac{1}{\eta_{T}}=\frac{\sqrt{T}}{c_{1}}$ and $\sum_{t=1}^{T}\eta_{t}=\sum_{t=1}^{T}\frac{c_{1}}{\sqrt{t}}\leq c_{1}+\int_{1}^{T}\frac{c_{1}}{\sqrt{t}}dt=c_{1}+[2c_{1}\sqrt{t}]_{1}^{T}=2c_{1}\sqrt{T}-c_{1}$ . Thus,

\displaystyle\mathcal{R}_{T}\leq\frac{(\Xi^{2}/c_{1}+2\delta^{2}c_{1})\sqrt{T}}{2}-\frac{c_{1}\delta^{2}}{2}.

Therefore, $\mathcal{R}_{T}/T=O(1/\sqrt{T})$ and $\limsup_{T\to\infty}\mathcal{R}_{T}/T\leq 0$ . ∎
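The regret bound can be checked numerically on a toy problem. The following is a minimal sketch, not the model update problem of this paper: it assumes hypothetical quadratic losses $f_{t}(\mathbf{x})=\frac{1}{2}\|\mathbf{x}-a_{t}\|^{2}$ with targets $a_{t}\in[-1,1]^{2}$ , and the names `ogd_regret` and `step_size_sum` are illustrative only. For this toy choice one may take $\Xi^{2}=\delta^{2}=8$ , so with $c_{1}=0.5$ the leading term of the bound is $12\sqrt{T}$ .

```python
import numpy as np

def ogd_regret(T, c1=0.5, seed=0):
    """Run online gradient descent on toy quadratic losses and return the regret."""
    rng = np.random.default_rng(seed)
    # Hypothetical loss sequence f_t(x) = 0.5 * ||x - a_t||^2 with bounded targets a_t.
    targets = rng.uniform(-1.0, 1.0, size=(T, 2))
    x = np.zeros(2)
    incurred = 0.0
    for t in range(1, T + 1):
        a_t = targets[t - 1]
        incurred += 0.5 * np.sum((x - a_t) ** 2)  # loss paid before the update
        eta_t = c1 / np.sqrt(t)                   # step size used in Theorem 1
        x = x - eta_t * (x - a_t)                 # gradient of f_t at x is (x - a_t)
    # The best fixed decision in hindsight is the mean of the targets.
    x_star = targets.mean(axis=0)
    best = 0.5 * np.sum((targets - x_star) ** 2)
    return incurred - best

def step_size_sum(T, c1=0.5):
    """Sum of the step sizes c1 / sqrt(t) for t = 1, ..., T."""
    return sum(c1 / np.sqrt(t) for t in range(1, T + 1))
```

Calling `ogd_regret(T) / T` for increasing `T` gives the time-averaged regret, which the bound predicts to decay at least as fast as $1/\sqrt{T}$ .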

Theorem 1 establishes that the time-averaged loss of Algorithm 1 approaches that of the best fixed model in hindsight. The estimated output provided by Algorithm 1 is therefore close to the one derived via that model, and Algorithm 1 solves the adaptive model update problem.

Up to this point, we have presented Algorithm 1 to solve the model update problem in online scenarios. In the following, we present how Algorithm 1 can be implemented in a distributed manner. Consider a communication network $\mathcal{G}^{c}(\mathcal{V}^{c},\mathcal{E}^{c})$ with $\mathcal{V}^{c}=\mathcal{V}\cup\{0\}$ , $|\mathcal{V}^{c}|=N+1$ , and $\mathcal{E}^{c}=\mathcal{E}\cup\{(0,1),(0,2),\dots,(0,N)\}\subseteq\mathcal{V}^{c}\times\mathcal{V}^{c}$ . Node $0$ is the fusion center to which all the sensor measurements are relayed. There are two kinds of communication links in the graph $\mathcal{G}^{c}$ : (a) $(i,j)\in\mathcal{E}$ with $i,j\in\{1,2,\dots,N\}$ and (b) $(0,j)\in\mathcal{E}^{c}$ with $j\in\{1,2,\dots,N\}$ . The sub-systems communicate with each other via links of kind (a), and the fusion center communicates with all the sub-systems via links of kind (b). The fusion center communicates the measurement $\widehat{y}(t)\in\mathbb{R}^{\mathbf{d_{y}}}$ to all the sub-systems $i\in\{1,2,\dots,N\}$ . The updates in Algorithm 1 utilize $\nabla f_{t}(\mathbf{x}(t))$ . From (10) and (11), we have

\displaystyle\nabla f_{t}(\mathbf{x}(t))=\left[\begin{array}[]{cc}\nabla_{% \theta}\Phi(\text{u}(t),\theta(t))^{\top}\operatorname{\mathbf{z}}(t)\\ -\widehat{\mathbf{P}}^{\top}\operatorname{\mathbf{z}}(t)\end{array}\right].

(20)

Note that $\operatorname{\mathbf{z}}(t)$ can be decomposed as $\operatorname{\mathbf{z}}(t)=[\operatorname{\mathbf{z}}_{1}(t);\operatorname{\mathbf{z}}_{2}(t);\dots;\operatorname{\mathbf{z}}_{N}(t)]$ , where for $i\in\{1,2,\dots,N\}$ ,

\displaystyle\operatorname{\mathbf{z}}_{i}(t)=\phi_{i}(\text{u}_{i}(t),\theta_% {i}(t))-\widehat{y}(t)-(\widehat{\mathbf{P}}_{ii}w_{i}(t)+\sum_{j\in% \operatorname{\mathcal{N}}_{i}}\widehat{\mathbf{P}}_{ij}w_{j}(t)).

A closer examination of (20) yields that $\nabla f_{t}$ can be further written as, $\nabla f_{t}=[(\nabla_{1}f_{t})^{\top};(\nabla_{2}f_{t})^{\top};\dots;(\nabla_% {N}f_{t})^{\top}]$ , where

\displaystyle\nabla_{i}f_{t}=\left[\begin{array}{c}\nabla_{\theta_{i}}\phi_{i}(\text{u}_{i}(t),\theta_{i}(t))^{\top}\operatorname{\mathbf{z}}_{i}(t)\\ -\widehat{\mathbf{P}}_{ii}\operatorname{\mathbf{z}}_{i}(t)-\sum_{j\in\operatorname{\mathcal{N}}_{i}}\widehat{\mathbf{P}}_{ji}\operatorname{\mathbf{z}}_{j}(t)\end{array}\right],

(23)

for all $i\in\{1,2,\dots,N\}.$ Thus, using (23) the updates in Algorithm 1 can be implemented in a distributed manner at any sub-system $i$ while maintaining an auxiliary variable $\operatorname{\mathbf{z}}_{i}$ as shown in Algorithm 2.

Algorithm 2 Distributed Online Input-Output Map Update at Sub-system $i$

For $t=0,1,2,\dots$

- Receive $\widehat{y}(t)$ from the fusion center

- Given $\theta_{i}(t)\in\mathbb{R}^{\theta_{i}}$ , $w_{i}(t),w_{j}(t)\in\mathbb{R}^{\mathbf{d_{y}}}$ , $j\in\operatorname{\mathcal{N}}_{i}$

$\operatorname{\mathbf{z}}_{i}(t)=\phi_{i}(\text{u}_{i}(t),\theta_{i}(t))-\widehat{y}(t)-\displaystyle\sum_{j\in\operatorname{\mathcal{N}}_{i}\cup\{i\}}\widehat{\mathbf{P}}_{ij}w_{j}(t)$

$\theta_{i}(t+1)=\theta_{i}(t)-\eta_{t}\nabla_{\theta_{i}}\phi_{i}(\text{u}_{i}(t),\theta_{i}(t))^{\top}\operatorname{\mathbf{z}}_{i}(t)$

$w_{i}(t+1)=w_{i}(t)+\eta_{t}\displaystyle\sum_{j\in\operatorname{\mathcal{N}}_{i}\cup\{i\}}\widehat{\mathbf{P}}_{ji}\operatorname{\mathbf{z}}_{j}(t)$

In Algorithm 2, each agent $i$ engages in two rounds of communication on auxiliary variables $\operatorname{\mathbf{z}}_{i}$ and $w_{i}$ . Importantly, the exchange of $\operatorname{\mathbf{z}}_{i}$ and $w_{i}$ among agents does not allow for the reconstruction of the model parameters $\theta_{i}$ or the local input data. As a result, the information transmitted across the communication network does not divulge any direct details regarding the sub-system’s parameters or local data, thereby bolstering the privacy and security of the individual sub-systems.
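The two communication rounds can be illustrated with a short simulation. This is a sketch under simplifying assumptions that are not taken from the paper: scalar outputs ( $\mathbf{d_{y}}=1$ ), a linear local map $\phi_{i}(\text{u}_{i},\theta_{i})=\theta_{i}^{\top}\text{u}_{i}$ , a path-graph Laplacian standing in for $\widehat{\mathbf{P}}$ , and a loss of the form $f_{t}=\frac{1}{2}\|\operatorname{\mathbf{z}}(t)\|^{2}$ , which is consistent with the gradient in (20). All function names are hypothetical. The centralized routine is included only to check that the neighbor-wise sums reproduce the full gradient step.

```python
import numpy as np

def path_laplacian(n):
    """Graph Laplacian of a path graph, used here as a stand-in weight matrix."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def distributed_round(theta, w, u, y_hat, P, eta):
    """One iteration of the per-sub-system updates, using only neighbor data."""
    n = len(w)
    nbrs = [[j for j in range(n) if j != i and P[i, j] != 0.0] for i in range(n)]
    # Round 1: sub-system i gathers w_j from its neighbors and forms z_i.
    z = np.empty(n)
    for i in range(n):
        phi_i = theta[i] @ u[i]  # assumed linear local map
        z[i] = phi_i - y_hat - sum(P[i, j] * w[j] for j in nbrs[i] + [i])
    # Round 2: sub-system i gathers z_j from its neighbors and updates (theta_i, w_i).
    theta_new = np.empty_like(theta)
    w_new = np.empty_like(w)
    for i in range(n):
        theta_new[i] = theta[i] - eta * u[i] * z[i]  # gradient of phi_i in theta_i is u_i
        w_new[i] = w[i] + eta * sum(P[j, i] * z[j] for j in nbrs[i] + [i])
    return theta_new, w_new, z

def centralized_round(theta, w, u, y_hat, P, eta):
    """The same step computed with global information, for comparison."""
    z = np.einsum("ij,ij->i", theta, u) - y_hat - P @ w
    return theta - eta * u * z[:, None], w + eta * P.T @ z, z
```

In this sketch each sub-system touches only its own row and column entries of the weight matrix, so the quantities crossing a link are $w_{j}$ in the first round and $\operatorname{\mathbf{z}}_{j}$ in the second.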

V Controller with Online Model Updates

In this section, we extend the control problem (6) beyond an input-output map that is fixed over the entire time horizon $T$ . We consider the input-output map in (6) to be frequently updated based on the latest control decision and measurement. The following formulation captures the time-varying aspects:

\displaystyle\begin{aligned}
\operatorname*{minimize}_{\text{u}(1),\dots,\text{u}(T)\in\mathbb{R}^{\mathbf{d_{u}}}}\quad&\sum_{k=1}^{T}\sum_{i=1}^{N}\ell_{i}(\widehat{y}_{\theta_{i}(t)}(k),k)+h_{i}(\text{u}_{i}(k),k)\\
\text{subject to}\quad&\text{u}_{i}(k)\in\mathcal{U}_{i},\ \text{for all}\ i=1,2,\dots,N,\ \forall k,
\end{aligned}

(24)

where $t$ is the model update counter. Whenever the input-output map is updated, the counter $t$ is increased by $1$ . Given $t$ , let $\tilde{\ell}_{i}(\widehat{y}_{\theta_{i}},T):=\sum_{k=1}^{T}\ell_{i}(\widehat{y}_{\theta_{i}(t)}(k),k)$ and $\tilde{h}_{i}(\text{u}_{i},T):=\sum_{k=1}^{T}h_{i}(\text{u}_{i}(k),k)$ . While the input-output map is fixed, the control decisions in problem (24) can be computed via a projected gradient iteration given by, for all $i\in\{1,2,\dots,N\}$ ,

\displaystyle\text{u}_{i}(\tau+1)=\operatorname*{Proj}_{\mathcal{U}_{i}}\big\{\text{u}_{i}(\tau)-\alpha\big[\nabla_{\widehat{y}_{\theta_{i}}}\tilde{\ell}_{i}(\widehat{y}_{\theta_{i}(t)}(\tau),T)^{\top}\nabla_{\text{u}_{i}}\widehat{y}_{\theta_{i}}(\text{u}_{i}(\tau));\nabla_{\text{u}_{i}}\tilde{h}_{i}(\text{u}_{i}(\tau),T)\big]\big\},

(25)

where $\alpha>0$ is the step size, $\tau$ is the iteration counter of the projected gradient steps, and $\operatorname*{Proj}_{\mathcal{U}_{i}}\{\cdot\}$ denotes the projection operator onto $\mathcal{U}_{i}$ . The control decisions obtained via (25) are implemented in the system. After the control decisions have been applied to the system for $T_{con}$ consecutive iterations, the model update counter $t$ is incremented by $1$ and the input-output map is updated via Algorithm 2 utilizing the current measurement. Subsequently, the projected gradient iterations are reinitialized and performed with the new input-output map corresponding to the parameters $\theta_{i}(t+1)$ . We summarize this in Algorithm 3.

Algorithm 3 Control with Online Input-Output Map Update for Solving Problem (24)

Initialize: $t=\tau=1$ , $\text{u}_{i}(0),\theta_{i}(0),w_{i}(0)$ for all $i$

Repeat

if $\mod(\tau,T_{con})=0$ then

- Update the input-output map via Algorithm 2

- Set $t=t+1$ and $\text{u}(\tau)=\text{u}(0)$

else

- Compute $\text{u}(\tau)$ using (25)

end if

- Apply $\text{u}(\tau)$ to the system

- Set $\tau=\tau+1$
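The interleaving of control steps and model updates can be sketched on a scalar toy system. Everything in this example is a hypothetical stand-in chosen for illustration: the plant $y=a_{\text{true}}\text{u}$ , the model $\widehat{y}=\theta\text{u}$ , the tracking cost $\frac{1}{2}(\widehat{y}-y_{\text{ref}})^{2}$ , the penalty $\frac{1}{2}\rho\text{u}^{2}$ , the feasible set $[0,1]$ , and the single-agent gradient step that takes the place of Algorithm 2.

```python
import numpy as np

def run_control_with_model_updates(a_true=2.0, theta0=0.5, u0=0.5, y_ref=1.0,
                                   rho=0.1, alpha=0.2, c1=0.5, T_con=5, steps=2000):
    """Scalar toy version of the interleaved control / model-update loop."""
    theta, u, t = theta0, u0, 1
    u_applied, y_meas = u0, a_true * u0  # initial measurement from the plant
    for tau in range(1, steps + 1):
        if tau % T_con == 0:
            # Model update: one online gradient step on 0.5 * (theta * u - y)^2.
            eta_t = c1 / np.sqrt(t)
            theta -= eta_t * (theta * u_applied - y_meas) * u_applied
            t += 1
            u = u0  # re-initialize the control iterate
        else:
            # Projected gradient step on 0.5 * (theta * u - y_ref)^2 + 0.5 * rho * u^2.
            grad = (theta * u - y_ref) * theta + rho * u
            u = float(np.clip(u - alpha * grad, 0.0, 1.0))  # projection onto [0, 1]
        # Apply u to the hypothetical plant and record the measurement.
        u_applied, y_meas = u, a_true * u
    return theta, u
```

With these toy settings the applied input stays away from zero, so the model parameter `theta` is driven toward `a_true` while the controller keeps using the latest model.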

VI Application Example and Numerical Simulations

In this section, we describe an application of the proposed framework and present numerical simulation results for it.

VI-A Power System Application Example

Consider the problem of controlling and optimally managing the operation of a distribution power grid with penetration of photovoltaic energy sources (PES). We formulate this as a voltage regulation problem that fits the formulation in problem (24). We assume that there are $N$ PES buses that can adjust their power injections for voltage regulation. The control decision at PES bus $i$ at any time $k$ is given by $\text{u}_{i}(k):=[P_{i}(k);Q_{i}(k)]\in\mathbb{R}^{2}$ , where $P_{i}(k)$ and $Q_{i}(k)$ are the net active and reactive power injections at PES bus $i$ at time $k$ , respectively. The feasible sets are $\mathcal{U}_{i}:=\{[P_{i};Q_{i}]:P^{2}_{i}+Q^{2}_{i}\leq S^{2}_{i,\mbox{max}},0\leq P_{i}\leq\overline{P}_{i}\}$ , where $S_{i,\mbox{max}}$ is the rated apparent power of PES $i$ and $\overline{P}_{i}$ is the maximum real power available at PES $i$ . The input-output map $\widehat{y}_{\theta(t)}(k)=\frac{1}{N}\sum_{i=1}^{N}\phi_{i}(\text{u}_{i}(k),\theta_{i}(t))$ gives the mapping from the power injections $\text{u}_{i}$ to the voltage magnitudes in the entire distribution grid. The functions $\ell_{i}(\cdot)$ are designed to capture the engineering constraint of keeping the true voltages within the interval $[0.95,1.05]$ p.u. The functions $\text{u}_{i}(k)\to h_{i}(\text{u}_{i}(k),k)$ are chosen to be quadratic and penalize active power curtailment and reactive power injections at the PES buses at time $k$ .
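The projection onto $\mathcal{U}_{i}$ required by (25) has no one-line formula, because $\mathcal{U}_{i}$ is the intersection of a disk and a slab. One possible way to compute it, shown here as a sketch and not as the method used in this paper, is Dykstra's alternating projection algorithm, which converges to the Euclidean projection onto an intersection of closed convex sets. The function names and the iteration count are assumptions made for this illustration.

```python
import numpy as np

def proj_disk(v, s_max):
    """Euclidean projection onto {P^2 + Q^2 <= s_max^2}."""
    norm = np.linalg.norm(v)
    return v if norm <= s_max else v * (s_max / norm)

def proj_slab(v, p_bar):
    """Euclidean projection onto {0 <= P <= p_bar}, with Q unconstrained."""
    return np.array([np.clip(v[0], 0.0, p_bar), v[1]])

def proj_feasible_set(v, s_max, p_bar, iters=200):
    """Dykstra's alternating projections onto the intersection of disk and slab."""
    x = np.asarray(v, dtype=float)
    p = np.zeros(2)  # correction term associated with the disk
    q = np.zeros(2)  # correction term associated with the slab
    for _ in range(iters):
        y = proj_disk(x + p, s_max)
        p = x + p - y
        x = proj_slab(y + q, p_bar)
        q = y + q - x
    return x
```

For example, with `s_max = 1.0` and `p_bar = 0.5`, the point `[2.0, 0.0]` is mapped to `[0.5, 0.0]`, while a point that is already feasible is returned unchanged.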

Figure 1: Schematic of the modified IEEE 37 bus system. The buses highlighted in red triangles are PES buses.

VI-B Illustrative Numerical Simulations

In this section, we instantiate the voltage regulation problem detailed in the previous section on a modified IEEE $37$ -bus system augmented with additional PES. We use solar irradiance data from Anatolia, CA, USA, and electric loads with realistic profiles spanning $4$ hours at a granularity of one second, introduced at different buses as illustrated in Fig. 1. A total of $18$ PES buses are considered. We utilize Algorithm 3 with $T_{con}=1$ so that the input-output map between the power injections and the voltage magnitudes is updated after every control decision. During the simulation study, the PES exchange estimates through a connected communication network with $19$ nodes ( $18$ PES and one fusion center). The graph Laplacian of the graph with $18$ nodes is used as the weight matrix $\mathbf{P}$ . We compare two parametric models for capturing the input-output map:

1. A linear model in which nodal active and reactive power injections serve as inputs and bus voltages as outputs, described by $\widehat{y}_{\theta_{i}(t)}(t)=A_{i}\text{u}_{i}(t)$ for all $t$ . This model is known as the LinDistFlow model [24], and the goal is to identify the matrix $A_{i}$ .

2. A non-linear model that posits a polynomial relationship between local power injections and bus voltage, described by $\widehat{y}_{\theta_{i}(t)}(t)=\frac{B_{i}-\sqrt{B_{i}^{2}-4(C_{i}-\overline{\text{u}}_{i}(t))}}{2}$ for all $t$ , where $\overline{\text{u}}_{i}=\sqrt{P_{i}^{2}+Q_{i}^{2}}$ is the apparent power magnitude at bus $i$ . Equivalently, $\widehat{y}_{\theta_{i}(t)}(t)$ is the smaller root of $y^{2}-B_{i}y+(C_{i}-\overline{\text{u}}_{i}(t))=0$ , which reflects the quadratic relationship between power injection and voltage magnitude observed in the power flow equations, also known as the constant power load model. $B_{i}$ and $C_{i}$ are the parameters of the model to be determined.
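Both model classes, together with the parameter gradient $\nabla_{\theta_{i}}\phi_{i}$ needed by the update in Algorithm 2, can be written down directly. The sketch below uses hypothetical function names and assumes scalar parameters $B_{i}$ , $C_{i}$ for the non-linear model. The gradient expressions follow from differentiating the closed form above.

```python
import numpy as np

def linear_model(A_i, u_i):
    """Linear map y_hat = A_i @ u_i with u_i = [P_i, Q_i]."""
    return A_i @ u_i

def nonlinear_model(B_i, C_i, u_i):
    """y_hat = (B - sqrt(B^2 - 4 (C - |u|))) / 2 with |u| = sqrt(P^2 + Q^2)."""
    u_bar = np.hypot(u_i[0], u_i[1])
    disc = B_i ** 2 - 4.0 * (C_i - u_bar)
    if disc < 0:
        raise ValueError("parameters give a negative discriminant")
    return 0.5 * (B_i - np.sqrt(disc))

def nonlinear_model_grad(B_i, C_i, u_i):
    """Partial derivatives of the non-linear model with respect to (B_i, C_i)."""
    u_bar = np.hypot(u_i[0], u_i[1])
    root = np.sqrt(B_i ** 2 - 4.0 * (C_i - u_bar))
    # d/dB: (1 - B / root) / 2,  d/dC: 1 / root
    return np.array([0.5 * (1.0 - B_i / root), 1.0 / root])
```

For the linear model the gradient with respect to the entries of $A_{i}$ is simply the input $\text{u}_{i}$ , which is why its update is cheaper than that of the non-linear model.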

For Algorithm 2, we adopt the step-size $\eta_{t}=0.01/\sqrt{t}$ . We plot the voltage levels in the modified IEEE $37$ -bus system during the control process, with the model updated via Algorithm 3 as new input-output data become available. Fig. 2 illustrates the violations of the voltage regulation limits that can occur without control measures. The linear model estimated in real time provides satisfactory control performance in Fig. 3, albeit with some fluctuations during periods of reduced PES generation. Notably, the control performance using the identified non-linear constant power load model, shown in Fig. 4, surpasses that of the linear model, which aligns with expectations given the non-linear model’s closer representation of the actual power flow. Overall, the models identified through our proposed algorithm demonstrate effective voltage regulation capabilities.

VII Conclusion

In this paper, we developed an online distributed algorithm in which each agent updates its estimate of the model via an online gradient descent scheme utilizing the most recent input-output pair. We proved that the developed distributed algorithm has sub-linear regret and therefore tracks the best fixed model in hindsight. Further, agents share only non-linear estimates, which preserves their private information. The numerical simulation study corroborates the efficacy of our developed algorithm through the identification of a more accurate quadratic power flow model, which improves the voltage regulation performance of the control system. Looking ahead, our future work will focus on extensive testing on real-world systems, aiming to further validate the performance and characterize the scalability of our method on networks with tens of thousands of nodes.

References

[1] P. Roque, “Coordination of multi-agent systems: Predictive and vision-based control for aerial and space robotics,” Ph.D. dissertation, KTH Royal Institute of Technology, 2022.
[2] V. Spudić, C. Conte, M. Baotić, and M. Morari, “Cooperative distributed model predictive control for wind farms,” Optimal Control Applications and Methods, vol. 36, no. 3, pp. 333–352, 2015.
[3] A. Bernstein and E. Dall’Anese, “Real-time feedback-based optimization of distribution grids: A unified approach,” IEEE Transactions on Control of Network Systems, vol. 6, no. 3, pp. 1197–1209, 2019.
[4] L. Ljung, “System identification,” in Signal analysis and prediction. Springer, 1998, pp. 163–173.
[5] T. Hastie, R. Tibshirani, and J. Friedman, “Overview of supervised learning,” The elements of statistical learning: Data mining, inference, and prediction, pp. 9–41, 2009.
[6] I. Markovsky and F. Dörfler, “Behavioral systems theory in data-driven analysis, signal processing, and control,” Annual Reviews in Control, vol. 52, pp. 42–64, 2021.
[7] K. Montano-Martinez, S. Thakar, V. Vittal, R. Ayyanar, and C. Rojas, “Detailed primary and secondary distribution system feeder modeling based on ami data,” in 2020 52nd North American Power Symposium (NAPS), 2021, pp. 1–6.
[8] W. Wang, S. Jazebi, F. de León, and Z. Li, “Looping radial distribution systems using superconducting fault current limiters: Feasibility and economic analysis,” IEEE Transactions on Power Systems, vol. 33, no. 3, pp. 2486–2495, 2018.
[9] J. D. Lankutis, “Verifying data integrity: If you cannot believe the data, how can you believe the analytics?” in 2013 IEEE Rural Electric Power Conference (REPC), 2013, pp. C4–1–C4–4.
[10] EPRI, “Distribution modeling guidelines: recommendations for system and asset modeling for distributed energy resource assessments,” Electric Power Research Institute, Palo Alto, California (United States), Tech. Rep., 2016.
[11] A. Bernstein, J. Comden, Y. Chen, and J. Wang, “Time-varying feedback optimization for quadratic programs with heterogeneous gradient step sizes,” in 2023 62nd IEEE Conference on Decision and Control (CDC). IEEE, 2023, pp. 4003–4011.
[12] A. Bernstein, E. Dall’Anese, and A. Simonetto, “Online primal-dual methods with measurement feedback for time-varying convex optimization,” IEEE Transactions on Signal Processing, vol. 67, no. 8, pp. 1978–1991, 2019.
[13] A. Chiuso and G. Pillonetto, “System identification: A machine learning perspective,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 2, no. 1, pp. 281–304, 2019.
[14] O. A. Alimi, K. Ouahada, and A. M. Abu-Mahfouz, “A review of machine learning approaches to power system security and stability,” IEEE Access, vol. 8, pp. 113512–113531, 2020.
[15] S. D. McArthur, E. M. Davidson, V. M. Catterson, A. L. Dimeas, N. D. Hatziargyriou, F. Ponci, and T. Funabashi, “Multi-agent systems for power engineering applications—Part I: Concepts, approaches, and technical challenges,” IEEE Transactions on Power systems, vol. 22, no. 4, pp. 1743–1752, 2007.
[16] O. P. Mahela, M. Khosravy, N. Gupta, B. Khan, H. H. Alhelou, R. Mahla, N. Patel, and P. Siano, “Comprehensive overview of multi-agent systems for controlling smart grids,” CSEE Journal of Power and Energy Systems, vol. 8, no. 1, pp. 115–131, 2020.
[17] C.-Y. Chang, “A privacy preserving distributed model identification algorithm for power distribution systems,” in 62nd IEEE Conference on Decision and Control, 2023.
[18] D. Materassi and M. V. Salapaka, “On the problem of reconstructing an unknown topology via locality properties of the wiener filter,” IEEE Transactions on Automatic Control, vol. 57, no. 7, pp. 1765–1777, 2012.
[19] H. H. Weerts, P. M. Van den Hof, and A. G. Dankers, “Identifiability of linear dynamic networks,” Automatica, vol. 89, pp. 247–258, 2018.
[20] M. S. Veedu, H. Doddi, and M. V. Salapaka, “Topology learning of linear dynamical systems with latent nodes using matrix decomposition,” IEEE Transactions on Automatic Control, vol. 67, no. 11, pp. 5746–5761, 2022.
[21] D. E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms. Addison-Wesley, 1997.
[22] Y. Huang, Z. Meng, and J. Sun, “Scalable distributed least square algorithms for large-scale linear equations via an optimization approach,” Automatica, vol. 146, p. 110572, 2022.
[23] R. Merris, “Laplacian matrices of graphs: a survey,” Linear algebra and its applications, vol. 197, pp. 143–176, 1994.
[24] M. E. Baran and F. F. Wu, “Network reconfiguration in distribution systems for loss reduction and load balancing,” IEEE Transactions on Power delivery, vol. 4, no. 2, pp. 1401–1407, 1989.

\displaystyle\begin{aligned}
\|\operatorname{\mathbf{z}}(t)-\operatorname{\mathbf{z}}(1)\|&=\|\Phi(\text{u}(t),\theta(t))-\widehat{\mathbf{y}}(t)-\widehat{\mathbf{P}}w(t)-\Phi(\text{u}(1),\theta(1))+\widehat{\mathbf{y}}(1)+\widehat{\mathbf{P}}w(1)\|\\
&\leq\|\big(\Phi(\text{u}(t),\theta(t))-\widehat{\mathbf{P}}w(t)\big)-\big(\Phi(\text{u}(1),\theta(1))-\widehat{\mathbf{P}}w(1)\big)\|+\|\widehat{\mathbf{y}}(t)-\widehat{\mathbf{y}}(1)\|\\
&\leq L_{m}\left\|\left[\begin{array}{c}\text{u}(t)\\ \mathbf{x}(t)\end{array}\right]-\left[\begin{array}{c}\text{u}(1)\\ \mathbf{x}(1)\end{array}\right]\right\|+\|\widehat{\mathbf{y}}(t)-\widehat{\mathbf{y}}(1)\|.
\end{aligned}
