Journal of Parallel and Distributed Computing, 2008
We investigate two distinct issues related to resource allocation heuristics: robustness and failure rate. The target system consists of a number of sensors feeding a set of heterogeneous applications that execute continuously on a set of heterogeneous machines connected by high-speed heterogeneous links. Two quality of service (QoS) constraints must be satisfied: a maximum end-to-end latency and a minimum throughput. A failure occurs if no allocation can be found that allows the system to meet its QoS constraints. The system is expected to operate in an uncertain environment where the workload, i.e., the load presented by the set of sensors, is likely to change unpredictably, possibly resulting in a QoS violation. The focus of this paper is the design of a static heuristic that (a) determines a robust resource allocation, i.e., a resource allocation that maximizes the allowable increase in workload before a run-time reallocation of resources is required to avoid a QoS violation, and (b) has a very low failure rate (i.e., the percentage of instances in which the heuristic fails). Two such heuristics proposed in this study are a genetic algorithm and a simulated annealing heuristic, both "seeded" with the best solution found by a set of fast greedy heuristics.
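To make the seeding idea concrete, below is a minimal Python sketch of a simulated-annealing search that starts from a greedy seed allocation and tries to maximize a robustness objective (the allowable increase in workload). The `robustness()` and `neighbor()` callables are hypothetical stand-ins for a system-specific model; this illustrates the general technique only, not the paper's actual implementation or parameter settings.

```python
import math
import random

def simulated_annealing(seed_allocation, robustness, neighbor,
                        t_start=1.0, t_end=1e-3, alpha=0.95, iters_per_temp=50):
    """Maximize the robustness of a resource allocation, starting from a greedy seed.

    robustness(allocation) -> float and neighbor(allocation) -> allocation are
    hypothetical, system-specific callables supplied by the caller.
    """
    current = seed_allocation
    best = current
    t = t_start
    while t > t_end:
        for _ in range(iters_per_temp):
            candidate = neighbor(current)            # e.g., move one application to another machine
            delta = robustness(candidate) - robustness(current)
            # Always accept improvements; accept some downhill moves while the temperature is high.
            if delta >= 0 or random.random() < math.exp(delta / t):
                current = candidate
            if robustness(current) > robustness(best):
                best = current
        t *= alpha                                    # geometric cooling schedule
    return best
```

A genetic algorithm would be seeded analogously, by injecting the best greedy solution into the initial population.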
IEEE Transactions on Parallel and Distributed Systems, 2004
Parallel and distributed systems may operate in an environment that undergoes unpredictable changes, causing certain system performance features to degrade. Such systems need robustness to guarantee limited degradation despite fluctuations in the behavior of its component ...
Dynamic real-time systems such as embedded systems operate in environments in which several parameters vary at run time. These systems must satisfy several performance requirements. Resource allocation in such systems is challenging because variations in run-time parameters may cause violations of the performance requirements. Such violations create the need for dynamic re-allocation, which is a costly operation. We develop a method for allocating resources such that the allocation can sustain the system in a continuously changing environment. We introduce a novel performance metric called MAIL (maximum allowable increase in load) to capture the effectiveness of a resource allocation: given a resource allocation, MAIL quantifies the amount of additional load that the system can sustain without any performance violations. A mixed-integer-programming (MIP) based approach is developed to determine a resource allocation with the highest MAIL value. Using simulations, several sets of experiments are conducted to evaluate our heuristics under various scenarios of machine and task heterogeneity. The performance of MIP is compared with three other heuristics: an integer-programming-based heuristic, a greedy heuristic, and the classic min-min heuristic. Our results show that MIP performs significantly better than the other heuristics.
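As a rough illustration of how a MAIL-style value could be computed for a fixed allocation, the sketch below bisects on a load scale factor, assuming a hypothetical `meets_qos()` system model that checks the performance requirements at a given load. It is a sketch of the metric's meaning only; the paper's MIP formulation is a different, optimization-based approach.

```python
def mail(allocation, base_load, meets_qos, tol=1e-3):
    """Estimate the maximum allowable increase in load (MAIL) for a fixed allocation.

    base_load is a list of per-source loads; meets_qos(allocation, load) -> bool is a
    hypothetical model that checks the performance requirements at that load.
    Returns the largest fractional increase over base_load that still satisfies QoS.
    """
    scaled = lambda f: [f * x for x in base_load]
    if not meets_qos(allocation, scaled(1.0)):
        return 0.0                               # allocation already violates QoS at the current load
    lo, hi = 1.0, 2.0
    # Grow the upper bound until a violation is found (or a cap is reached).
    while meets_qos(allocation, scaled(hi)) and hi < 1e6:
        lo, hi = hi, hi * 2
    # Bisect for the largest scale factor that still satisfies QoS.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if meets_qos(allocation, scaled(mid)):
            lo = mid
        else:
            hi = mid
    return lo - 1.0                              # allowable increase relative to the current load
```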
System builders are becoming increasingly interested in robust design. We believe that a methodology for generating robustness metrics will help robust design research efforts and, more generally, is an important step toward creating robust computing systems. The purpose of the research in this paper is to quantify the robustness of a resource allocation, with the eventual objective of setting a standard that could easily be instantiated for a particular computing system to generate a robustness metric. The paper shows how ignoring uncertainties can result in a gross overestimation of system capacity, presents a method for reducing the impact of uncertainty, and shows how correlated uncertainties can even be turned to our advantage. We present our theoretical foundation for a robustness metric and give its instantiation for a particular system.
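One simple way to instantiate such a robustness metric is as the smallest perturbation, in some norm, of the uncertain parameters that drives a performance feature past its tolerance. The sketch below estimates that distance from sampled parameter vectors; the feature model, the tolerance, and the sample set are hypothetical placeholders, not the formulation used in the paper.

```python
import numpy as np

def robustness_radius(feature, pi_orig, beta_max, samples):
    """Estimate a robustness radius for one performance feature.

    feature(pi) -> float is a hypothetical model of the performance feature as a
    function of the perturbation-parameter vector pi; pi_orig is the assumed
    (nominal) parameter vector; beta_max is the feature's tolerance; samples is a
    collection of candidate parameter vectors covering the region of interest.
    Returns the smallest Euclidean distance from pi_orig to a sampled vector at
    which the feature violates its tolerance.
    """
    pi_orig = np.asarray(pi_orig, dtype=float)
    violating = [np.asarray(p, dtype=float) for p in samples if feature(p) > beta_max]
    if not violating:
        return float("inf")          # no violation observed within the sampled region
    return min(np.linalg.norm(p - pi_orig) for p in violating)
```

A system-level robustness value could then be taken as the minimum radius over all performance features of interest.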