  • Naperville, Illinois, United States

Cindy Hood

ABSTRACT High performance computing clusters have been a critical resource for computational science for over a decade and have more recently become integral to large-scale industrial analysis. Despite their well-specified components, the aggregate behavior of clusters is poorly understood. The difficulties arise from complicated interactions between cluster components during operation. These interactions have been studied by many researchers, some of whom have identified the need for holistic multi-scale modeling that simultaneously includes network level, operating system level, process level, and user level behaviors. Each of these levels presents its own modeling challenges, but the user level is the most complex due to the adaptability of human beings. In this vein, there are several major user modeling goals, namely descriptive modeling, predictive modeling and automated weakness discovery. This study shows how multi-agent techniques were used to simulate a large-scale computing cluster at each of these levels.
ABSTRACT There is an increasing number of ways in which optical network devices and IP routers can interact during a network fault. To provide continuity of service, the interactions between the components in a network must be cooperative. Consequently, cooperating recovery processes produce network configurations that have certain structural relationships, which can be elaborated. A conflict detector can prove that service will be restored during a fault scenario by checking whether these structural properties hold. We use simulation to study the coordination of recovery strategies and whether different coordination strategies achieve the recovery goals attached to a network service. The network service carries a traffic stream, which is injected into and extracted from a network. For multilayer recovery to complete, the cumulative effect of device actions during a failure must be (1) a connected path between the endpoints of a service and (2) a traffic flow delivered to a destination at a quality that matches a service level agreement. We represent Optical and Multiprotocol Label Switching (MPLS) recovery actions as graph-maintenance operations that change the state of a digraph. For example, the actions of forwarding traffic between an access port and a trunk port, and of selecting traffic from a new trunk port and forwarding it to an access port, can be modeled as a sequence of edge additions and deletions. The state of the digraph represents the current configuration of a multilayer network as recovery actions are performed. In this paper, we define some structural properties that can be observed during a simulation as the network evolves from an initial state, before a failure occurs, to a final state.
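The graph view described in this abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the class name `NetworkDigraph`, the port labels, and the order of recovery steps are hypothetical, and only structural property (1), a connected path between the service endpoints, is checked.

```python
from collections import deque


class NetworkDigraph:
    """Directed graph whose edges stand for forwarding/selection state (illustrative only)."""

    def __init__(self):
        self.adj = {}

    def add_edge(self, u, v):
        # Edge addition, e.g. "forward traffic from port u to port v".
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set())

    def remove_edge(self, u, v):
        # Edge deletion, e.g. a fiber cut or a released cross-connect.
        self.adj.get(u, set()).discard(v)

    def has_path(self, src, dst):
        # Breadth-first search: is dst reachable from src?
        seen = {src}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                return True
            for nxt in self.adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


# Hypothetical two-node service with a working and a protection trunk.
g = NetworkDigraph()
g.add_edge("A.access", "A.trunk1")   # head end forwards onto working trunk
g.add_edge("A.trunk1", "B.trunk1")   # working link
g.add_edge("B.trunk1", "B.access")   # tail end selects from working trunk
g.add_edge("A.trunk2", "B.trunk2")   # protection link, idle before the fault
print(g.has_path("A.access", "B.access"))   # True: service is up

g.remove_edge("A.trunk1", "B.trunk1")       # failure: working link is cut
print(g.has_path("A.access", "B.access"))   # False: service is down

g.remove_edge("A.access", "A.trunk1")       # head end switches ...
g.add_edge("A.access", "A.trunk2")          # ... to the protection trunk
print(g.has_path("A.access", "B.access"))   # False: tail end has not selected yet

g.remove_edge("B.trunk1", "B.access")       # tail end releases working trunk ...
g.add_edge("B.trunk2", "B.access")          # ... and selects the protection trunk
print(g.has_path("A.access", "B.access"))   # True: property (1) holds again
```

Checking the property after every edge operation, rather than only at the end, is what lets a simulation observe intermediate states in which recovery actions at the two ends have not yet been coordinated.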
Abstract There is an increasing demand for higher levels of network availability and reliability. Effective network monitoring is necessary to meet this demand. Whereas most of the network monitoring research to date has been focused on combining the information collected in a meaningful way, in this research we focus on processing the information collected before it is combined. We propose a change detection methodology for each measurement variable, where we can detect changes from the variable's usual behavior. ...
To improve network management in today's increasingly complex communication networks, the authors propose an intelligent monitoring hierarchy. The hierarchy is composed of hidden Markov models (HMMs) and neural networks. As demonstrated on real network data, this hierarchy can detect abnormal behavior at high levels using only readily available low-level fault models. This allows the node to provide the network manager with a complete picture of the node's health
Automatic protection switching (APS) protocols assigned to different layers in wide-area networks require interworking functionality in order to restore a wide variety of services and accommodate an evolving network infrastructure. Without some coordination between restoration mechanisms, an outage duration would be lengthened as methods assigned to different layers interfere with each other, and the network would be locked up in
The plethora of new technologies and services such as MPLS, ATM, IP, SONET and WDM allows services to be restored at different layers and at different costs. Restoration schemes at multiple layers might collide, causing a race condition, where restoration agents at different layers keep trying to establish a connection. We define a model for restoration mechanisms at different layers
Social networking websites have become a vital means of communication that can provide information on various topics. The real time nature of the information published on social networking websites coupled with their accessibility as a publishing platform make them a powerful tool for information gathering. Furthermore, many individuals utilize these sorts of platforms to share their knowledge and opinions with
As communication networks continue to increase in size and complexity to support new applications and large numbers of new users, understanding and managing network behavior becomes increasingly difficult. Many applications, such as video, require networks to maintain a higher standard of network availability and reliability, thereby making effective network fault management critical. The dynamic nature and heterogeneity of current networks makes this more difficult. Fundamental changes to the network occur much more ...
Social Network Sites (SNSs) have become an important method for information exchange, and many people turn to these sites for their information needs. Users encounter a variety of challenges when performing information gathering on SNSs because the SNS messages are not organized like traditional documents. These challenges include small amounts of information in each message, unorganized messages, and lack of context in each message. In this work we focus specifically on how the information should be presented to a user to address these issues. We perform a study of SNS users to identify their preferences for such an interface and present the results.
A key to achieving widespread IT fluency is to make it part of the K-12 curriculum. Along these lines, there have been significant, ongoing efforts to motivate and establish standards for both students and teachers. This paper describes our experiences teaching K-12 teachers technology concepts.
Abstract— This paper describes the development, deployment and early results of a long-term, wide-band (30 MHz - 6 GHz) spectrum observatory system focused on downtown Chicago and the immediately surrounding areas. Previous short term studies have ...
As network providers implement multiple network types, including 2G and 3G networks and potentially 1G, WLL, and/or satellite, it becomes feasible to overflow traffic between these diverse networks. This results in an increase in capacity, because the Erlang model of traffic analysis shows that a large network provides efficiency of scale. This paper analyzes the gains in efficiency when dual
ABSTRACT As network providers implement multiple network types, including 2G/3G networks and potentially 1G, wireless local loops (WLL), and/or satellite, it becomes feasible to overflow traffic between these diverse networks. This will result in an increase in capacity, because the Erlang model of traffic analysis shows that a large network provides efficiency of scale. This paper analyzes the gains in efficiency when dual mode phones may overflow to diverse networks, via handover, thus increasing the number of channels available to these phones and minimizing the blocking rate. The analysis compares different call handling models, via the Erlang and M-dimensional Markov chain models
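The efficiency-of-scale argument can be made concrete with the standard Erlang B blocking formula. The sketch below is only an idealized illustration, not the paper's call handling models: the function name `erlang_b`, the channel counts, and the offered loads are assumptions, and full pooling of two networks is an upper bound on what overflow via handover could achieve.

```python
def erlang_b(offered_load, channels):
    """Blocking probability for `offered_load` Erlangs on `channels` channels.

    Uses the standard numerically stable recursion
    B(A, 0) = 1,  B(A, n) = A * B(A, n-1) / (n + A * B(A, n-1)).
    """
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


# Assumed example values: two networks, each with 10 channels and 5 Erlangs.
separate = erlang_b(5.0, 10)    # each network handles only its own traffic
pooled = erlang_b(10.0, 20)     # idealized case: all channels fully shared

print(f"separate networks: {separate:.4f}")
print(f"fully pooled:      {pooled:.4f}")
```

With the same total traffic and the same total number of channels, the pooled blocking probability is lower than that of either separate network, which is the trunking gain the abstract refers to.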
The challenge for 3G cellular is not only to support varied services at speeds up to 2 Mbit/s, but to support these services efficiently and with good quality. A method to optimize the capacity available to high-speed data is to consider placing dual mode circuit-switched calls on an alternate cellular network (i.e., 2G or 3G). For a mature UMTS network with
The acceleration in computational scale to solve problems in emerging "computational" fields from Nanoscience and Genetics to Astrophysics places increasingly heavy compute and data storage burdens on locally and globally distributed computer systems. We are focusing on the management of these loosely coupled systems (clusters and Grids) which are asked to behave as an increasingly large single entity, repeatably and
ABSTRACT We propose a programmable automatic protection switching (APS) protocol to repair an impaired lightpath traversing an optical link. Recovery agents repair impaired flows by searching through a space of policies before attempting a protection switch and after switching impaired traffic. A policy manager disseminates a changeable set of policies to each agent and ensures consistent interpretation of end-to-end QoS. QoS policies are structured to be interpreted in the same way by developing a model of end-to-end QoS over which logic formulae can be checked for satisfaction.
ABSTRACT Limited attention has been paid to the interactions of service restoration protocols that operate during a fiber cut to restore connectivity between communications equipment. Historically, restoration protocols were deployed only at the SONET layer in telephony networks. SONET frames that carried voice-grade signals such as a T1 or T3 would be redirected to a protection path over architectures that supported bidirectional communication. With the advent of new communication technologies such as ATM, IP and WDM, a cable cut can affect multiple routing processes at each of these layers even if a particular network region only supports a few of these technologies. For example, restoration processes at an arbitrary layer in adjacent networks might trigger if lower-layer protocols don't finish within specific deadlines. With the growth of data traffic and a wide range of service offerings, the telecommunication networks of the future are growing more complex, requiring multiple interactions between software systems. Failures will be difficult to pinpoint, and the cooperation of the repair processes will be key to ensuring that services traversing multiple networks are not interrupted.
As cellular networks diversify by expanding the number of services, cell sizes, and generations of technologies supported, it becomes possible to overflow dual-mode terminals using vertical handovers to other cellular networks or ‘component networks’. This study varies call placement algorithms, by defining, modeling, and evaluating Overflow and Return policies.
ABSTRACT Protection switching interactions in wide-area networks need to interoperate with each other in order to restore a wide variety of services, provide survivability to mission-critical applications and accommodate an evolving network infrastructure. Without some coordination between restoration mechanisms, an outage duration would be lengthened as methods assigned to each layer interfere with each other or the network would be locked up in a deadlocked state that never converges to a new topology. A set of control policies can be specified to coordinate between restoration mechanisms in a network that spans multiple layers and regions. These control policies are expressed as rules, and are collectively denoted as the escalation strategy. The escalation strategy can be provisioned by a network manager and is implemented as a distributed coordination protocol between peer recovery agents in the nodes. As rules for coordinating between restoration mechanisms are formalized, a mathematical proof could be provided to prove that the network does indeed converge to a new topology.
Abstract— It is critical to understand the effect of wireless interference so that we can better utilize the spectrum through dynamic allocation methods. This paper presents the results of an experimental study done to characterize UDP performance in wireless networks in the presence of ...
In this paper we describe an innovative method for using LEGO® bricks to teach programming and other computing concepts. LEGO® bricks are used to express a special purpose language to build creations out of LEGOs®. Using this language, students can execute and create programs. Both fundamental and more advanced concepts can be taught. The use of LEGOs® increases the tactile
ABSTRACT An experience-dominated subject like software project management cannot be learned by merely attending lectures. Additional labs, however, even with only modest real-life projects, call for substantial effort to be spent by the instructors as well as by the partaking students. Our experience shows that using a software development simulation tool enhances the mix of methods used in conventional teaching substantially.
In the past two years, there have been serious common channel signaling (CCS) network outages caused by software faults. Recovery from those outages required intervention of a craftsperson. A pro-active approach in which potential abnormal situations are created as test drill cases by inserting software faults of different types is proposed. These faults are systematically inserted in the critical software
Performance management of clusters and Grids poses many challenges. The sharing of large distributed sets of resources can provide efficiencies, but it also introduces complexity in terms of providing and maintaining adequate performance. Current application requirements focus on the amount of resources needed without explicitly characterizing the performance required from those resources. Inconsistent, or highly variable run-time of applications
Run time variability of parallel application codes continues to be a significant challenge in clusters. We are studying run time variability at the communication level from the perspective of the application, focusing on the network. To gain insight into this problem our earlier work developed a tool to emulate parallel applications and in particular their communication. This framework, called parallel
Performance management of clusters and Grids poses many challenges. Sharing large distributed sets of resources can provide efficiencies, but it also introduces complexity in terms of providing and maintaining adequate performance. Current application requirements focus on the amount of resources needed without explicitly characterizing the performance required from those resources. In clusters and Grids, inconsistent or highly variable application
Highly variable parallel application execution time is a persistent issue in cluster computing environments, and can be particularly acute in systems composed of Networks of Workstations (NOWs). We are looking at this issue in terms of consistency. In particular, we are focusing on network performance. Before we can use techniques from fault management to attain consistency, this paper presents
