1.1 Motivation
Modern power systems face increasing uncertainty due to the penetration of distributed energy resources. The safe and reliable operation of such systems requires improved observability and controllability, which is achieved through deep interconnection with
information and communication technology (ICT) systems, giving rise to
cyber-physical energy systems (CPESs) [
48]. A promising approach for operating such complex systems with system-wide ICT, a large number of actors, and a high degree of automation is to shift from a hierarchical system, operated mainly in a centralized manner by humans, to a distributed one, operated mainly autonomously but with several actors interacting with each other as well as with humans. Consequently, such CPESs can be regarded as organic computing systems [27, 34], as they continuously and dynamically adapt to exogenous and endogenous changes on various time scales, from significant changes in generation technologies driven by external regulatory and political factors to unforeseen demand fluctuations due to stochastic human behavior. To cope with these dynamics, the system is characterized by self-* properties, e.g., self-organization of battery storage swarms [
41], self-configuration of SCADA and protection systems [
43], self-explanation, and context awareness [
5].
Owing to their geographical extent and large number of actors, CPESs are among the largest and most complex human-made organic computing systems. At the same time, ICT penetration has further increased the overall system complexity, leading to new risks such as software failures, malfunctions, and cyberattacks [42]. This, in turn, increases the number of factors affecting the state and health of a CPES. In addition to established factors such as functional correctness and safety, factors such as security, credibility, and usability must be considered. Although these aspects may already be considered in certain subsystems (e.g., safety for power lines and security for communication networks), they have yet to be integrated into a holistic assessment of the state and health of the whole CPES that builds on the individual subsystems and can be used for its operation.
Following the idea of the transition of CPESs towards organic computing systems, the trust concept from the domain of organic computing (OC), referred to as OC-Trust [40] and used for assessing trust between autonomous actors and technical subsystems, is a promising approach for a holistic state and health assessment. Here, trust is defined as a context-dependent and multifaceted sense of a technical entity with respect to its functional correctness, safety, security, reliability, credibility, and usability. All of these facets can be directly mapped onto the state and health of complex CPESs. The assessment of trust on all levels of a CPES can contribute to a coherent state and health assessment, which may then be interpreted by technical subsystems or used by operators in their decision-making.
Assessing the trust of a CPES implies assessing the trust of its constituent (autonomous) subsystems, components, and services, which is challenging as they face a broad and diverse threat landscape [
42]. In addition to disturbances in power systems, disturbances in ICT systems can also impact the interconnected power system via the grid services. The 2003 North American blackout resulted from a software bug in the state estimation service, which gave incorrect situational awareness to the system operators, thereby hiding certain power line failures [
33]. The 2015 and 2016 Ukraine blackouts were caused by attackers who took over control rooms and shut down vital grid services [
47]. This illustrates the vulnerability of CPESs to cyberattacks. Other examples of ICT disturbances include sensor failures affecting observability, controller failures affecting controllability, and communication network failures affecting data transfer, causing data to be either delayed or lost [
30]. ICT disturbances thus impact the performance of grid services, which in turn impacts the operation of the power system. Therefore, these ICT-enabled grid services can be regarded as entities for which trust aspects such as functional correctness, security, reliability, and credibility must be assessed.
Operational state classification based solely on electrotechnical parameters (e.g., voltage, frequency, currents) is the prevailing practice among power system operators for assessing the current state (or performance) of a power system [
12,
15]. The operation of a power system can be classified into one of five states: normal, alert, emergency, blackout, and restorative. Disturbances in the system can cause the state to degrade, and suitable grid services (referred to as remedial actions) can be triggered to improve the state again. However, as already mentioned, certain disturbances can also impact the grid services, causing them (in the worst case) to fail [
30]. To address this, the authors of References [17, 20] propose an operational state classification for each ICT-enabled grid service. Based on three properties of the ICT system, i.e., availability of components and data, timeliness of data transfer, and data correctness, the state of a grid service can be classified into normal, limited, or failed. These properties are regarded as the requirements of the grid services. Figure 1 shows the operational states of the power grid as well as of ICT-enabled grid services, along with possible state transitions [
20]. The remedial actions in the operational state classification of a power system (left part) are performed via the ICT-enabled grid services, and the remedial actions of grid services (right part) would be actions in the ICT network, such as rerouting traffic or traffic shaping [
49]. The states of grid services are elaborated further in Section 2. The authors of Reference [20] present only the theoretical foundations for these states without concrete use cases. The states are formalized in Reference [17], and their benefits are demonstrated using simulations in Reference [31]. However, the impact of data correctness on the state of ICT-enabled grid services is not investigated because the ICT disturbances impacting the state are not considered.
As mentioned earlier, cyberattacks are disturbances that can impact all three aforementioned properties. A denial-of-service attack can render a component unable to communicate (availability) or able to communicate only with unacceptable delay (timeliness) [
46], whereas a false data injection attack can impact the data correctness by injecting manipulated data [
26]. In this regard, there is a fundamental difference between availability and timeliness on the one hand and data correctness on the other. The first two properties can be measured, e.g., by pinging (or heartbeat messages) and by comparing timestamps, respectively [
22]. However, data correctness cannot be directly measured due to the absence of ground truth (i.e., real/actual value).
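The difference between the measurable and the non-measurable properties can be illustrated with a minimal Python sketch. All names, the timeout, and the delay limit are hypothetical and chosen only for illustration, and the sketch assumes that sender and receiver clocks are synchronized; it does not reproduce the method of any cited reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    sent_at: float      # sender timestamp in seconds (assumes synchronized clocks)
    received_at: float  # receiver timestamp in seconds
    value: float        # payload, e.g., a voltage measurement


def is_available(last_heartbeat_at: Optional[float], now: float, timeout_s: float) -> bool:
    """Treat a component as available if a heartbeat arrived within the timeout."""
    return last_heartbeat_at is not None and (now - last_heartbeat_at) <= timeout_s


def is_timely(msg: Message, max_delay_s: float) -> bool:
    """Treat a message as timely if its end-to-end delay stays within the limit."""
    return (msg.received_at - msg.sent_at) <= max_delay_s


# Hypothetical values: heartbeat seen 2 s ago, message delayed by 0.5 s.
msg = Message(sent_at=100.0, received_at=100.5, value=1.02)
available = is_available(last_heartbeat_at=99.0, now=101.0, timeout_s=5.0)
timely = is_timely(msg, max_delay_s=1.0)

# Nothing in `msg` allows the receiver to check whether `value` is correct:
# without ground truth, correctness has to be estimated by other means.
```

Both checks rely only on information the receiver has at hand, whereas no comparable check exists for the payload itself.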
Traditionally, data correctness in a power system refers to the correctness of measurements. State estimation, in combination with bad data detection, is used for this purpose and is the basis for almost all other services [
32]. However, this requires redundant measurements and assumes measurement errors to be randomly distributed [
1]. While the former is not fulfilled in distribution grids (especially at the lower voltage levels) [
14], the latter does not hold for all ICT disturbances, e.g., for coordinated false data injection attacks [
26]. Therefore, data correctness is challenging to determine, especially considering the rising cyber threats in CPESs. Unlike other disturbances, such as link and sensor failures, the diverse nature of cyberattacks makes it challenging to investigate their impact on the ICT system as well as the interconnected power system [
8]. Furthermore, since ICT systems span large geographical areas (similar to power systems), data transfer from source to destination typically passes through several intermediate nodes (or hops), all of which are potential entry points for attackers. An intelligent attacker can make it challenging (or even impossible) for destination nodes to determine whether the received data is correct, especially without the ground truth. Additionally, as shown in Reference [
30], malfunctions and incorrect operation of components (e.g., sensors) can also impact data correctness.
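Why residual-based bad data detection depends on redundancy and on uncoordinated errors can be seen in a toy example. The following Python sketch is a deliberate simplification under stated assumptions: a single scalar state x, a linear measurement model z_i = h_i * x + e_i, unweighted least squares, and a fixed residual threshold in place of a statistical test. All numbers and function names are hypothetical.

```python
def estimate_state(z, h):
    """Least-squares estimate of a scalar state x from z_i = h_i * x + e_i."""
    return sum(hi * zi for hi, zi in zip(h, z)) / sum(hi * hi for hi in h)


def max_abs_residual(z, h):
    """Largest absolute measurement residual |z_i - h_i * x_hat|."""
    x_hat = estimate_state(z, h)
    return max(abs(zi - hi * x_hat) for hi, zi in zip(h, z))


# Four redundant measurements of one state (hypothetical model and noise).
h = [1.0, 1.0, 2.0, 2.0]
x_true = 1.0
noise = [0.01, -0.01, 0.02, -0.02]
z_clean = [hi * x_true + e for hi, e in zip(h, noise)]

# Case 1: a single gross error on one sensor.
z_bad = list(z_clean)
z_bad[0] += 0.5

# Case 2: a coordinated injection a_i = h_i * c that is consistent with the model.
c = 0.5
z_attack = [zi + hi * c for hi, zi in zip(h, z_clean)]

threshold = 0.1  # hypothetical detection threshold

detected_bad = max_abs_residual(z_bad, h) > threshold        # gross error stands out
detected_attack = max_abs_residual(z_attack, h) > threshold  # residuals are unchanged
shift = estimate_state(z_attack, h) - estimate_state(z_clean, h)  # estimate moves by c
```

In this sketch the single gross error produces a large residual, whereas the coordinated injection leaves every residual as it was and shifts the state estimate by c. With only one measurement per state, the residual would be zero in both cases, so nothing could be detected at all.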
1.2 Related Work
Several studies have addressed the correctness of process data. They can be categorized into field measures (e.g., References [
13,
36]), improved bad data detection (e.g., References [
16,
45]), and trust-based measures (e.g., References [
8,
25]). Field measures aim to improve cyber security and accuracy, e.g., by placing different types of sensors, but not by estimating data correctness. Research on improved bad data detection is specialized (e.g., against coordinated false data injection attacks). Both field measures and improved bad data detection tackle only a subset of the threats impacting the correctness of measurements, or they rely on assumptions that do not always hold (e.g., measurement redundancy). Considering the variety of power systems, grid services, and threats to the correctness of measurements, a more holistic approach capable of integrating existing approaches is needed. Trust-based approaches can provide a suitable framework to estimate data correctness, but the term trust is not interpreted uniformly. In Reference [
25], for example, field devices (referred to as agents) derive trust in a neighboring device from the deviation between the measurements they estimate for that device and the measurements they receive from it. A disadvantage of such univariate trust-based measures is that the trust in the measurements of the other device relies on only one piece of information, in this case, the deviation from the expected data.
The authors of References [
6,
7,
8] adapt OC-Trust to a multifaceted trust model for power systems, referred to as
trust in power system network assessment (PSNA-Trust). Here, trust is defined as a “context-dependent, and multivariate sense about an entity with respect to its functional correctness, safety, security, reliability, credibility, and usability”. Additionally, a methodology to assess or estimate the trust in measurements and state variables is presented in References [6, 7]. Due to the absence of ground truth, data correctness can often only be estimated rather than measured directly, which makes PSNA-Trust and its assessment methodology a promising approach.
Trust approaches also exist in other domains, such as the so-called ABI model (ability, benevolence, integrity) in the field of organizational trust [
29] or the concept of data veracity in the field of big data [
4,
21]. However, OC-Trust focuses on trust in and between autonomous software agents in complex systems and, for this specific environment, provides a more specific (compared with data veracity) and more fine-grained (compared with the ABI model) grouping of factors affecting trust. In addition, OC-Trust has already been applied in the domain of power systems [
3,
6,
37].