Standard network QoS analysis usually accounts for infrastructure performance/availability only, with scarce consideration of the user perspective. With reference to the General Packet Radio Service (GPRS), this paper addresses the problem of how to evaluate the impact of system unavailability periods on QoS measures, explicitly accounting for user characteristics. The ultimate goal of a service provider is user satisfaction; it is therefore extremely important to introduce the peculiarities of the user population when analyzing the system in such critical conditions as outages. The lack of service during outages is aggravated by the collision phenomenon caused by accumulated user requests, which negatively impacts the QoS provided by the system for some time after its restart. Depending on the specific behavior exhibited by the variety of users, such QoS degradation due to outages may be perceived differently by different user categories. We follow a compositional modeling approach, based on GPRS and user models; the focus is on the GPRS random access procedure on one side, and on different classes of user behavior on the other. A quantitative analysis, performed through simulation, shows the impact of outages on a relevant QoS indicator, in relation to the considered user characteristics and network load.
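The post-outage collision effect described in this abstract can be illustrated with a toy slotted random-access model. This is a deliberate simplification (not the paper's GPRS model): requests accumulated during an outage all contend after restart, and a request succeeds only when it is alone in its slot; the function name and parameters are illustrative.

```python
import random

def simulate_restart(backlog, slots_per_frame=8, frames=200, seed=0):
    """Toy slotted random-access model: `backlog` users accumulated during
    an outage all retry after restart; a request succeeds only if it is
    alone in its slot. Returns the number of frames needed to drain the
    backlog (a crude proxy for post-outage QoS degradation)."""
    rng = random.Random(seed)
    pending = backlog
    for frame in range(1, frames + 1):
        # each pending user picks one of the frame's access slots at random
        chosen = [rng.randrange(slots_per_frame) for _ in range(pending)]
        # a slot carries a success only if exactly one user picked it
        successes = sum(1 for s in range(slots_per_frame) if chosen.count(s) == 1)
        pending -= successes
        if pending == 0:
            return frame
    return frames  # backlog not drained within the horizon
```

A longer outage leaves a larger backlog, which clears disproportionately slowly because collisions waste most slots while contention is high — the degradation persisting "for some time after restart" that the abstract points to.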
This workshop summary gives a brief overview of the workshop on "Architecting Dependable Systems" held in conjunction with DSN 2007. The main aim of this workshop is to promote cross-fertilization between the software architecture and dependability communities. We believe that both communities will benefit from clarifying which approaches have been tested and have succeeded, as well as which have been tried but have not yet been shown to be successful.
In this chapter we present approaches for the analysis and monitoring of dependability and performance of connected systems, and their combined usage. These approaches need to account for the dynamicity and evolvability of connected systems. In particular, the chapter covers the quantitative assessment of dependability and performance properties through a stochastic model-based approach: first, an overview of dependability-related measurements and stochastic model-based approaches provides the necessary background. Then, our proposal in CONNECT of an automated and modular dependability analysis framework for dynamically connected systems is described. This framework can be used off-line for system design (specifically, in CONNECT, for connector synthesis) and on-line, to continuously assess system behaviour and detect possible issues arising at run-time. For the latter purpose, a generic, flexible and modular monitoring infrastructure has been developed. Monitoring is at the core of the CONNECT vision, in order to ensure run-time observation of specified quantitative properties and possibly trigger adequate reactions. We focus here on the interaction chain between monitoring and analysis, to allow for on-line continuous validation of specified dependability and performance properties. Illustrative examples of applications of analysis and monitoring are provided with reference to the CONNECT Terrorist Alert scenario.
Over the last few years, wireless local area networking (WLAN) has become a very important technology that offers high-speed communication services to mobile users in indoor environments. WLAN technology offers some very attractive characteristics, such as high data rates, increased QoS capabilities, and low installation costs, which has led many professionals to claim that it will be the main opponent of IMT-2000, despite the enormous effort invested in the specification and implementation of 3G systems. However, WLANs also present many important constraints, related mainly to their restricted coverage capabilities. On the other hand, 3G systems are being deployed gradually and carefully, since their business prospects have not been validated yet, and it is expected that 2G and 2G+ cellular systems will continue to play an important role for at least five more years. Thus, today's wireless networking environment is in fact a conglomeration of all these technologies, for which there is a strong need for cooperation. In this article we describe a heterogeneous wireless networking environment together with its features and user requirements. We explain the importance of WLANs and describe a framework and system architecture that supports the seamless integration of WLAN in heterogeneous cellular networking environments, focusing on support for efficient resource provision and management.
Electric Power Systems (EPS) are becoming more and more critical for our society, since they provide vital services for human activities. At the same time, obtaining dependable behaviour of EPS is a highly challenging task, both in terms of defining effective business management and in terms of analysis of dependability and performability attributes. A major concern when dealing with EPS is understanding and evaluating the interdependencies between the Electric Infrastructure (EI) and the Computer-based Control System (CCS), which controls the status and the activities of the EI. Studies on these interdependencies are only at an early stage of development; major difficulties are the complexity of the infrastructures under analysis and the lack of well-established models and tools for dealing with them. This paper presents an ad-hoc simulator for the evaluation of dependability and performability measures in EPS. The system model the simulator is based on focuses on the interdependencies between EI and CCS. Most existing modeling approaches for EPS do not provide explicit modeling of the interdependencies among the composing subsystems, so cascading or escalating phenomena cannot be deeply analyzed. Our stochastic model is composed of separate and simple, but representative, submodels representing the dynamics of the EI and different policies of reaction to disruptions and reconfigurations triggered by the CCS. In this way, the simulator provides explicit modeling of the interdependencies between the main subsystems, so that the impact of cascading or escalating failures on dependability and performability can be analyzed. In this paper, we describe the simulator and highlight the design choices.
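As a minimal illustration of the cascading phenomena this kind of simulator is designed to expose (a sketch under simplified assumptions, not the paper's model), consider a toy load-redistribution cascade over a set of power lines: when a line fails, its load is shared among the survivors, and any survivor pushed over capacity trips in turn.

```python
def cascade(loads, capacities, initial_failure):
    """Toy load-redistribution cascade: when a line fails, its load is
    shared equally among surviving lines (a simplification of real power
    flow); any line pushed over capacity trips in turn.
    Returns the set of failed line indices."""
    loads = list(loads)
    failed = {initial_failure}
    frontier = [initial_failure]
    while frontier:
        shed = sum(loads[i] for i in frontier)   # load lost by newly failed lines
        for i in frontier:
            loads[i] = 0.0
        alive = [i for i in range(len(loads)) if i not in failed]
        if not alive:
            break
        for i in alive:
            loads[i] += shed / len(alive)        # uniform redistribution
        frontier = [i for i in alive if loads[i] > capacities[i]]
        failed.update(frontier)
    return failed
```

With generous margins a single failure is absorbed; with tight margins the same failure escalates until the whole set trips, which is exactly the interdependency-driven behaviour that purely independent submodels cannot capture.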
Wireless networks are starting to be populated by interconnected devices that exhibit remarkable hardware and software differences. This fact raises a number of questions on the applicability of available results on dependability-related aspects of communication protocols, since they were obtained for wireless networks with homogeneous nodes. In this work, we study the impact of heterogeneous communication and computation capabilities of nodes on dependability aspects of diffusion protocols for wireless networks. We build a detailed stochastic model of the logical layers of the communication stack with the SAN formalism. The model takes into account relevant real-world aspects of wireless communication, such as transitional regions and the capture effect, as well as heterogeneous node capabilities. Dependability-related metrics are evaluated with analytical solution techniques for small networks, while simulation is employed in the case of large networks.
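A stripped-down flavour of what a diffusion-protocol dependability metric looks like can be given in a few lines. This is only a toy probabilistic-diffusion sketch (names and the single `p_hear` parameter are illustrative, far simpler than a SAN model of the full stack): node 0 broadcasts, and in each round every uninformed node hears a given rebroadcast with some probability standing in for link quality and capture effects.

```python
import random

def gossip_reach(n, p_hear, seed=1, rounds=20):
    """Toy diffusion model: node 0 injects a message; each round, every
    informed node rebroadcasts once, and each uninformed node hears a given
    rebroadcast with probability p_hear (a crude stand-in for link quality
    and capture effects). Returns the fraction of nodes reached, a simple
    coverage-style dependability metric."""
    rng = random.Random(seed)
    informed = {0}
    for _ in range(rounds):
        new = set()
        for u in range(n):
            if u in informed:
                continue
            # does u hear at least one of this round's rebroadcasts?
            if any(rng.random() < p_hear for _ in informed):
                new.add(u)
        if not new:
            break   # diffusion has died out
        informed |= new
    return len(informed) / n
```

Heterogeneity, in this sketch, would amount to giving each node its own `p_hear` instead of a shared one; the abstract's point is precisely that homogeneous-node results need not carry over to that case.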
Gracefully degradable algorithm for Byzantine agreement. F. Di Giandomenico, M.L. Guidotti, F. Grandoni, L. Simoncini. Comput. Syst. Sci. Eng. 3:11, 32-40, 1988. An algorithm for Byzantine agreement without authentication in ...
Electric power systems are critical infrastructures that provide vital services for human activities; assessing their dependability is thus a high-priority task. This paper presents an ad-hoc simulator for the evaluation of dependability and performability measures in electric power systems (EPS). Our stochastic model is composed of separate and simple submodels of the dynamics of the two subsystems composing an EPS: the electrical infrastructure and the computer-based control system. By providing explicit modeling of the interdependencies between the main subsystems, the impact of cascading or escalating failures on dependability and performability can be analyzed. Some preliminary analyses on a case study are also shown.
A new software fault tolerance scheme, called the Self-Configuring Optimistic Programming (SCOP) scheme, is proposed. It attempts to reduce the cost of fault-tolerant software and to eliminate some inflexibilities and rigidities present in existing software fault tolerance schemes. To achieve these goals, it is structured in phases, in order to produce acceptable results with the minimum possible effort and to release these results as soon as they are available, and it can be parameterized with respect to both the desired reliability and the desired response time. SCOP allows a trade-off between various attributes of system services (such as reliability, throughput and response time) as desired by designers, and it is thus a flexible and cost-effective redundant component for gracefully degradable systems.
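The phased idea can be sketched as follows. This is a heavily simplified illustration under assumed details (majority agreement as the adjudicator, one extra variant per later phase; the function and variant names are invented for the example), not SCOP's actual interface or adjudication scheme:

```python
def scop(variants, x, agreement=2):
    """Simplified phased execution in the spirit of SCOP: phase 1 runs only
    the minimal set of variants that could reach `agreement` matching
    results; each later phase adds one more variant, and the result is
    delivered as soon as agreement is reached, so effort scales with the
    faults actually encountered."""
    results = [v(x) for v in variants[:agreement]]   # phase 1: minimal set
    used = len(results)
    while True:
        for candidate in set(results):
            if results.count(candidate) >= agreement:
                return candidate, used               # result + variants spent
        if used == len(variants):
            raise RuntimeError("no agreement: degrade gracefully / signal failure")
        results.append(variants[used](x))            # next phase: one more variant
        used += 1

# illustrative variants: one correct, one injected with a design fault
square_ok = lambda x: x * x
square_buggy = lambda x: x * x + 1
```

When the first two variants agree, only two executions are paid for; a disagreement triggers a further phase, which is the reliability/response-time trade-off the abstract describes.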
Multiple-Phased Systems (MPS), i.e., systems whose operational life can be partitioned into a set of disjoint periods, called "phases", include several classes of systems, such as Phased Mission Systems and Scheduled Maintenance Systems. Because of their deployment in critical applications, the dependability modeling and analysis of Multiple-Phased Systems is a task of primary relevance; however, their phased behavior makes the analysis extremely complex. This paper describes the modeling methodology and the solution procedure implemented in DEEM, a dependability modeling and evaluation tool specifically tailored to Multiple-Phased Systems, and its use for the solution of representative MPS problems. DEEM relies upon Deterministic and Stochastic Petri Nets as the modeling formalism and on Markov Regenerative Processes for the model solution. When compared to existing general-purpose tools based on similar formalisms, DEEM offers advantages on both the modeling side (sub-models neatly capture the phase-dependent behaviors of MPS) and the evaluation side (a specialized algorithm allows a considerable reduction of the solution cost and time). Thus, DEEM is able to deal with all the MPS scenarios that have been treated analytically in the literature, at a cost comparable with that of the cheapest ones, completely solving the issues posed by the phased behavior of MPS.
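The essence of phased-mission analysis can be conveyed with a deliberately minimal hand computation, far below what DEEM's DSPN/MRGP machinery handles: for a single unit with a phase-dependent exponential failure rate, per-phase reliabilities simply chain together across phases. The phase list below is invented for illustration.

```python
from math import exp

def mission_reliability(phases):
    """Phased-mission reliability of a single unit with phase-dependent
    exponential failure rates: each phase i contributes exp(-lambda_i * t_i),
    and the phase behaviors chain, so R = prod_i exp(-lambda_i * t_i).
    `phases` is a list of (failure_rate, duration) pairs."""
    r = 1.0
    for lam, duration in phases:
        r *= exp(-lam * duration)
    return r

# hypothetical flight mission: taxi, climb, cruise, land, each with its own
# stress level (rates per hour, durations in hours)
flight = [(1e-4, 0.5), (5e-4, 0.2), (1e-4, 3.0), (8e-4, 0.3)]
```

The difficulty DEEM addresses is everything this sketch omits: redundancy, repair, and phase-dependent structure, where the per-phase behavior is a full stochastic submodel rather than a single exponential factor.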
In this paper, the consolidated identification of faults, distinguished as transient or permanent/intermittent, is addressed. Transient fault discrimination has long been performed in commercial systems: threshold-based techniques have been practiced for several years for this purpose. The present work aims to contribute to the usefulness of the count-and-threshold scheme through the analysis of its behaviour and the exploration of its effects on the system. To this end, the scheme is mechanized as a device named α-count, endowed with a few controllable parameters. α-count tries to balance two conflicting requirements: keeping in the system those components that have experienced only transient faults, and quickly removing those affected by permanent or intermittent faults. Analytical models are derived, allowing a detailed study of α-count's behaviour; the actual evaluation, in a range of configurations, is performed with standard tools, in terms of the delay in spotting faulty components and the probability of improperly blaming correct ones.
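A count-and-threshold device of this kind can be sketched in a few lines. The update rule below (increment on error, geometric decay otherwise) is the commonly cited α-count form and should be checked against the paper; the parameter names K and alpha_T follow that convention.

```python
class AlphaCount:
    """Count-and-threshold discrimination in the spirit of α-count: each
    error signal bumps the score by 1, each correct execution decays it by
    a factor K (0 < K < 1); the component is deemed permanently or
    intermittently faulty once the score reaches the threshold alpha_T.
    Isolated transients decay away; recurring faults accumulate."""

    def __init__(self, K=0.9, alpha_T=3.0):
        self.K, self.alpha_T, self.alpha = K, alpha_T, 0.0

    def record(self, error_signalled):
        if error_signalled:
            self.alpha += 1.0          # penalize each observed error
        else:
            self.alpha *= self.K       # let isolated transients fade away
        return self.alpha >= self.alpha_T  # True => remove the component
```

The two conflicting requirements show up directly in the parameters: a larger K or smaller alpha_T shortens the delay in spotting intermittent faults but raises the probability of blaming a component that only suffered transients.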
Analysis of interdependencies in Electric Power Systems (EPS) has been recognized as a crucial and challenging issue for improving their trustworthiness. The recent liberalization process in energy markets has promoted the entry of a variety of operators into the electricity industry. The resulting new organization has contributed to increased complexity, heterogeneity and interconnection. This paper proposes a framework for analyzing EPS organized as a set of interconnected regions, both from the point of view of the electric power grid and of the cyber control infrastructure. The emphasis is on interdependencies and on assessing their impact on indicators representative of the QoS perceived by users. Taking a reference power grid as a test case, the effects of failures on selected measures are shown, both when the grid is partitioned into a number of regions and when it consists of a single region, to illustrate the behavior of different grid and control configurations.
Significant effort has been invested in recent years in the analysis of interdependencies in critical infrastructures, including Electric Power Systems (EPS). Among other studies, a modeling and analysis framework has been proposed that is able to account for the impact of interdependencies in EPS organized as a set of interconnected regions, both from the point of view of the electric power grid and of the cyber control infrastructure. Indicators representative of the quality of service (QoS) perceived by users have been defined and assessed on reference test cases under homogeneous conditions of the network parameters and of the costs associated with power losses and power generation. This paper advances those previous studies by extending the developed framework to consider loads of different criticality (from the point of view of the consequences of power loss) and different failure rates for the power lines. The proposed extensions address relevant aspects of the non-homogeneity of real EPS infrastructures, which need to be taken into account in assessment studies to obtain accurate results that can be exploited as useful guidelines for appropriate configuration and reconfiguration policies.
This paper presents a class of count-and-threshold mechanisms, collectively named α-count, which are able to discriminate between transient faults and intermittent faults in computing systems. For many years, commercial systems have performed transient fault discrimination via threshold-based techniques. We aim to contribute to the utility of count-and-threshold schemes by exploring their effects on the system. We adopt a mathematically defined structure, which is simple enough to analyze with standard tools. α-count is equipped with internal parameters that can be tuned to suit environmental variables (such as the transient fault rate and intermittent fault occurrence patterns). We carried out an extensive behavior analysis of two versions of the count-and-threshold scheme, assuming first exponentially distributed fault occurrences and then more realistic fault patterns.
This paper considers planned conversations (in which a set of communicating processes are rolled back together), which also allow the recomputation after roll-back to use different code from the first computation, so that errors caused by software design faults may not be repeated in the new execution [Randell 1975], and atomic transactions (in which a sequence of changes on a set of data items is performed as a single indivisible unit).
The authors discuss backward error recovery for complex software systems, where different subsystems may belong to essentially different application areas. Such heterogeneous subsystems are naturally built according to different design 'models', namely the 'object-action' model (where the long-term state of the computation is encapsulated in data objects, and active processes invoke operations on these objects) and the 'process-conversation' model (where the state is contained in the processes, which communicate via messages). To allow backward error recovery in these two models of computation, two different schemes are most appropriate: atomic transactions for the object-action model, and conversations for the process-conversation model. Assuming that each of these two kinds of subsystem already has functioning mechanisms for backward error recovery, the authors describe the additional provisions needed for coordination between these heterogeneous subsystems. The solution involves altering the virtual machine on which the programs run, together with programming conventions that seem rather natural and can be automatically enforced. The approach is demonstrated by a simple example.
The real-time community is devoting considerable attention to flexible scheduling and adaptive systems. One popular means of increasing the flexibility, and hence effectiveness, of real-time systems is to use value-based scheduling. It is surprising, however, how little attention has been devoted, in the scheduling field, to the actual assignment of value. This paper deals with value assignment, presents a framework for undertaking value-based scheduling, and advises on the different methods that are available. A distinction is made between ordinal and cardinal value functions, and appropriate techniques from utility theory are reviewed. An approach based on constant value modes is introduced and evaluated via a case example.
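The general mechanism value-based scheduling relies on can be sketched with a simple overload-shedding example. This is a generic value-density sketch under invented job data, not the paper's constant-value-mode algorithm: jobs carry cardinal values, and under overload the scheduler keeps the jobs with the highest value per unit of execution time.

```python
def schedule_by_value(jobs, capacity):
    """Value-based admission under overload: greedily keep the jobs with
    the highest value density (value / execution time) until the capacity
    of the scheduling window is exhausted; the remaining jobs are shed.
    Each job is a dict with illustrative keys name, wcet, value."""
    ranked = sorted(jobs, key=lambda j: j["value"] / j["wcet"], reverse=True)
    admitted, used = [], 0.0
    for job in ranked:
        if used + job["wcet"] <= capacity:
            admitted.append(job["name"])
            used += job["wcet"]
    return admitted

# hypothetical workload: values are cardinal, so densities are comparable
jobs = [
    {"name": "logging", "wcet": 2.0, "value": 1.0},
    {"name": "control", "wcet": 1.0, "value": 10.0},
    {"name": "display", "wcet": 2.0, "value": 4.0},
]
```

Note that this greedy ranking only makes sense with cardinal values; with merely ordinal values the division by execution time is meaningless, which is one reason the ordinal/cardinal distinction the paper draws matters for how value should be assigned.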
Standard network QoS analysis usually accounts for the infrastructure performance/availability on... more Standard network QoS analysis usually accounts for the infrastructure performance/availability only, with scarce consideration of the user perspective. With reference to the General Packet Radio Service (GPRS), this paper addresses the problem of how to evaluate the impact of system unavailability periods on QoS measures, explicitly accounting for user characteristics. In fact, the ultimate goal of a service provider is user satisfaction, therefore it is extremely important to introduce the peculiarities of the user population when performing system analysis in such critical system conditions as during outages. The lack of service during outages is aggravated by the collision phenomenon determined by accumulated users requests, which (negatively) impacts on the QoS provided by the system for some time after its restart. Then, depending on the specific behavior exhibited by the variety of users, such QoS degradation due to outages may be perceived differently by different user categories. We follow a compositional modeling approach, based on the GPRS and user models; the focus is on the GPRS random access procedure on one side, and different classes of users behavior on the other side. Quantitative analysis, performed using a simulation approach, is carried out, showing the impact of outages on a relevant QoS indicator, in relation with the considered user characteristics and network load.
This workshop summary gives a brief overview of the workshop on "Architecting Dependable Systems"... more This workshop summary gives a brief overview of the workshop on "Architecting Dependable Systems" held in conjunction with DSN 2007. The main aim of this workshop is to promote cross-fertilization between the software architecture and dependability communities. We believe that both of them will benefit from clarifying approaches that have been previously tested and have succeeded as well as those that have been tried but have not yet been shown to be successful.
In this chapter we present approaches for analysis and monitoring of dependability and performanc... more In this chapter we present approaches for analysis and monitoring of dependability and performance of connected systems, and their combined usage. These approaches need to account for dynamicity and evolvability of connected systems. In particular, the chapter covers the quantitative assessment of dependability and performance properties through a stochastic model-based approach: first an overview of dependability-related measurements and stochastic model-based approaches provides the necessary background. Then, our proposal in connect of an automated and modular dependability analysis framework for dynamically connected systems is described. This framework can be used off-line for system design (specifically, in connect, for connector synthesis), and on-line, to continuously assess system behaviour and detect possible issues arising at run-time. For the latter purpose, a generic, flexible and modular monitoring infrastructure has been developed. Monitoring is at the core of the connect vision, in order to ensure run-time observation of specified quantitative properties and possibly trigger adequate reactions. We focus here on the interaction chain between monitoring and analysis, to allow for on-line continuous validation of specified dependability and performance properties. Illustrative examples of applications of analysis and monitoring are provided with reference to the connect Terrorist Alert scenario.
Over the last few years wireless local area networking (WLAN) has become a very important technol... more Over the last few years wireless local area networking (WLAN) has become a very important technology that offers high-speed communication services to mobile users in indoor environments. WLAN technology offers some very attractive characteristics such as high data rates, increased QoS capabilities, and low installation costs which has made many professionals claim that it will be the main opponent of IMT-2000, despite the enormous effort needed for the specification and implementation of 3G systems. However, WLANs also present many important constraints related mainly to their restricted coverage capabilities. On the other hand, 3G systems are deployed gradually and carefully since their business prospects have not been validated yet and it is expected that 2G and 2G+ cellular systems will continue to play an important role for at least five more years. Thus, today's wireless networking environment is in fact a conglomeration of all these technologies for which there is a strong need for cooperation. In this article we describe a heterogeneous wireless networking environment together with its features and user requirements. We explain the importance of the existence of WLANs and describe a framework and system architecture that supports seamless integration of WLAN in heterogeneous cellular networking environments, focusing on support for efficient resource provision and management.
Electric Power Systems (EPS) become more and more critical for our society, since they provide vi... more Electric Power Systems (EPS) become more and more critical for our society, since they provide vital services for the human activities. At the same time, obtaining dependable behaviour of EPS is an highly challenging task, both in terms of defining effective business management and in terms of analysis of dependability and performability attributes. A major concern when dealing with EPS is the understanding and the evaluation of the interdependencies between Electric Infrastructures (EI) and the Computer-based Control System (CCS), which controls the status and the activities of EI. Studies on these interdependencies are only at an early stage of development. Major difficulties are the complexity of the infrastructures under analysis and the lack of well-established models and tools for dealing with them. This paper presents an ad-hoc simulator for the evaluation of dependability and performability measures in EPS. The system model the simulator is based on focuses on interdependencies between EI and CCS. Most existing modeling approaches in EPS does not provide explicit modeling of interdependencies among the composing subsystems, so that the cascading or escalating phenomena cannot be deeply analyzed. Our stochastic model is composed by separated and simple, but representative, submodels representing the dynamics of EI and different policies of reactions to disruptions and reconfigurations triggered by CCS. In this way, the simulator aims at providing explicit modeling of the interdependencies between the main subsystems, so the impact on the dependability and performability of the cascading or escalating failures can be analyzed. In this paper, we describe the simulator and highlight the design choices.
Wireless networks are starting to be populated by interconnected devices that reveal remarkable h... more Wireless networks are starting to be populated by interconnected devices that reveal remarkable hardware and software differences. This fact raises a number of questions on the applicability of available results on dependability-related aspects of communication protocols, since they were obtained for wireless networks with homogeneous nodes. In this work, we study the impact of heterogeneous communication and computation capabilities of nodes on dependability aspects of diffusion protocols for wireless networks. We build a detailed stochastic model of the logic layers of the communication stack with the SAN formalism. The model takes into account relevant real-world aspects of wireless communication, such as transitional regions and capture effect, and heterogeneous node capabilities. Dependability-related metrics are evaluated with analytical solutions techniques for small networks, while simulation is employed in the case of large networks.
Gracefully degradable algorithm for Byzantine agreement. FD Giandomenico, ML Guidotti, F Grandoni... more Gracefully degradable algorithm for Byzantine agreement. FD Giandomenico, ML Guidotti, F Grandoni, L Simoncini COMP. SYST. SCI. ENG. 3:11, 32-40, 1988. An algorithm for Byzantine agreement without authentication in ...
Gracefully degradable algorithm for Byzantine agreement. FD Giandomenico, ML Guidotti, F Grandoni... more Gracefully degradable algorithm for Byzantine agreement. FD Giandomenico, ML Guidotti, F Grandoni, L Simoncini COMP. SYST. SCI. ENG. 3:11, 32-40, 1988. An algorithm for Byzantine agreement without authentication in ...
Electric power systems are critical infrastructures that provide vital services for the human act... more Electric power systems are critical infrastructures that provide vital services for the human activities; assessing their dependability is thus an high priority task. This paper presents an ad-hoc simulator for the evaluation of dependability and performability measures in electrical power system (EPS). Our stochastic model is composed by separated and simple submodels of the dynamics of the two subsystems composing EPS: the electrical infrastructure and the computer-based control system. By providing explicit modeling of the interdependencies between the main subsystems, the impact on the dependability and performability of the cascading or escalating failures can be analyzed. Some preliminary analysis on a case study are also shown.
A new software fault tolerance scheme, called the Self-Configuring Optimistic Programming scheme,... more A new software fault tolerance scheme, called the Self-Configuring Optimistic Programming scheme, (SCOP), is proposed. It attempts to reduce the cost of fault tolerant software and to eliminate some inflexibilities and rigidities present in the existing software fault tolerance schemes. For obtaining these goals, it is structured in phases in order to produce acceptable results with the minimum possible effort and to release these results as soon as available, and it can be parameterized with respect to both the desired reliability and the desired response time. SCOP allows a trade-off between various attributes of system services (such as reliability, throughput and response time) as desired by designers and it is thus a flexible and cost-effective redundant component for gracefully degradable systems.
& Conclusions Multiple-Phased Systems (MPS), i.e., systems whose operational life can be partitio... more & Conclusions Multiple-Phased Systems (MPS), i.e., systems whose operational life can be partitioned in a set of disjoint periods, called "phases", include several classes of systems such as Phased Mission Systems and Scheduled Maintenance Systems. Because of their deployment in critical applications, the dependability modeling and analysis of Multiple-Phased Systems is a task of primary relevance. The phased behavior makes the analysis of Multiple-Phased Systems extremely complex. This paper describes the modeling methodology and the solution procedure implemented in DEEM, a dependability modeling and evaluation tool specifically tailored for Multiple Phased Systems. It also describes its use for the solution of representative MPS problems. DEEM relies upon Deterministic and Stochastic Petri Nets as the modeling formalism, and on Markov Regenerative Processes for the model solution. When compared to existing general-purpose tools based on similar formalisms, DEEM offers advantages on both the modeling side (sub-models neatly model the phase-dependent behaviors of MPS), and on the evaluation side (a specialized algorithm allows a considerable reduction of the solution cost and time). Thus, DEEM is able to deal with all the scenarios of MPS which have been analytically treated in the literature, at a cost which is comparable with that of the cheapest ones, completely solving the issues posed by the phased-behavior of MPS.
In this paper the consolidate identification of faults, distinguished as transient or permanent/i... more In this paper the consolidate identification of faults, distinguished as transient or permanent/intermittent, is approached. Transient faults discrimination has long been performed in commercial systems: threshold-based techniques have been practiced for several years for this purpose. The present work aims to contribute to the usefulness of the count-and-threshold scheme, through the analysis of its behaviour and the exploration of its effects on the system. To this goal, the scheme is mechanized as a device named αcount, endowed with a few controllable parameters. α-count tries to balance between two conflicting requirements: to keep in the system those components that have experienced just transient faults; and to remove quickly those affected by permanent or intermittent faults. Analytical models are derived, allowing detailed study of α-count's behaviour; the actual evaluation, in a range of configurations, is performed by standard tools, in terms of the delay in spotting faulty components and the probability of improperly blaming correct ones.
Analysis of interdependencies in Electric Power Systems (EPS) has been recognized as a crucial and challenging issue for improving their trustworthiness. The recent liberalization process in energy markets has promoted the entry of a variety of operators into the electricity industry. The resulting new organization has contributed to an increase in complexity, heterogeneity and interconnection. This paper proposes a framework for analyzing EPS organized as a set of interconnected regions, both from the point of view of the electric power grid and of the cyber control infrastructure. The emphasis is on interdependencies and on assessing their impact on indicators representative of the QoS perceived by users. Taking a reference power grid as a test case, the effects of failures on selected measures are shown, both when the grid is partitioned into a number of regions and when it forms a single region, to illustrate the behavior of different grid and control configurations.
Significant effort has been invested in recent years in the analysis of interdependencies in critical infrastructures, including Electric Power Systems (EPS). Among other studies, a modeling and analysis framework has been proposed that accounts for the impact of interdependencies in EPS organized as a set of interconnected regions, both from the point of view of the electric power grid and of the cyber control infrastructure. Indicators representative of the quality of service (QoS) perceived by users have been defined and assessed on reference test cases under homogeneous conditions of the network parameters and of the costs associated with power losses and power generation. This paper advances such previous studies by extending the developed framework to consider loads of different criticality (from the point of view of the consequences of power loss) and different failure rates for the power lines. The proposed extensions address relevant aspects of the non-homogeneity of real EPS infrastructures, which need to be taken into account in assessment studies to obtain accurate results, to be exploited as useful guidelines for appropriate configuration and reconfiguration policies.
This paper presents a class of count-and-threshold mechanisms, collectively named α-count, which are able to discriminate between transient faults and intermittent faults in computing systems. For many years, commercial systems have been using transient fault discrimination via threshold-based techniques. We aim to contribute to the utility of count-and-threshold schemes by exploring their effects on the system. We adopt a mathematically defined structure, which is simple enough to analyze by standard tools. α-count is equipped with internal parameters that can be tuned to suit environmental variables (such as the transient fault rate and intermittent fault occurrence patterns). We carried out an extensive behavior analysis for two versions of the count-and-threshold scheme, assuming, first, exponentially distributed fault occurrences and, then, more realistic fault patterns.
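The two α-count abstracts above describe the mechanism only informally. A minimal sketch of the count-and-threshold idea (illustrative class, parameter names and default values, not the exact formulation evaluated in the papers) is:

```python
class AlphaCount:
    """Count-and-threshold filter in the spirit of alpha-count:
    each reported error raises a per-component score; fault-free
    steps geometrically decay it. A component whose score reaches
    the threshold is judged permanently/intermittently faulty."""

    def __init__(self, decay=0.9, threshold=3.0):
        self.decay = decay          # decay factor for fault-free steps (0 <= decay <= 1)
        self.threshold = threshold  # score at which the component is removed
        self.score = 0.0

    def step(self, error_detected):
        """Advance one observation step; return True if the
        component should now be removed from the system."""
        if error_detected:
            self.score += 1.0
        else:
            self.score *= self.decay
        return self.score >= self.threshold
```

This captures the balance between the two conflicting requirements: an isolated transient raises the score once and the decay pulls it back toward zero, while an intermittent fault that keeps re-manifesting drives the score past the threshold. Tuning `decay` and `threshold` trades detection delay against the probability of blaming a correct component.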
This paper discusses planned conversations (in which a set of communicating processes are rolled back together), which also allow the recomputation after roll-back to use different code from the first computation, so that errors caused by software design faults may not be repeated in the new execution [Randell 1975], and atomic transactions (in which a sequence of changes on a set of data items are
The authors discuss backward error recovery for complex software systems, where different subsystems may belong to essentially different application areas. Such heterogeneous subsystems are naturally built according to different design 'models', namely the 'object-action' model (where the long-term state of the computation is encapsulated in data objects, and active processes invoke operations on these objects), and the 'process-conversation' model (where the state is contained in the processes, which communicate via messages). To allow backward error recovery in these two 'models' of computation, two different schemes are most appropriate: atomic transactions for the object-action model, and conversations for the process-conversation model. Assuming that each of these two kinds of subsystem already has functioning mechanisms for backward error recovery, the authors describe the additional provisions needed for coordination between these heterogeneous subsystems. The solution involves altering the virtual machine on which the programs run, and programming conventions which seem rather natural and can be automatically enforced. The approach is demonstrated by a simple example.
The real-time community is devoting considerable attention to flexible scheduling and adaptive systems. One popular means of increasing the flexibility, and hence effectiveness, of real-time systems is to use value-based scheduling. It is surprising, however, how little attention has been devoted, in the scheduling field, to the actual assignment of value. This paper deals with value assignment and presents a framework for undertaking value-based scheduling, advising on the different methods that are available. A distinction is made between ordinal and cardinal value functions. Appropriate techniques from utility theory are reviewed. An approach based on constant value modes is introduced and evaluated via a case example.
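As a deliberately simplified illustration of value-based scheduling, one common heuristic is to dispatch the ready job with the highest value density, i.e., value per unit of computation time. The field names below are illustrative, and the paper's framework (ordinal vs. cardinal value functions, constant value modes) is considerably richer than this sketch:

```python
def pick_next(ready_jobs):
    """Return the ready job with the highest value density
    (value gained per unit of required computation time)."""
    return max(ready_jobs, key=lambda job: job["value"] / job["wcet"])

# Example: a lower-value but shorter job wins on value density.
jobs = [
    {"name": "logging", "value": 10.0, "wcet": 5.0},  # density 2.0
    {"name": "control", "value": 6.0,  "wcet": 2.0},  # density 3.0
]
print(pick_next(jobs)["name"])
```

Note that this heuristic presupposes a cardinal value function: densities can only be compared if the values are on a ratio scale, which is exactly the kind of assumption the ordinal/cardinal distinction in the paper is about.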
Papers by Felicita Di Giandomenico