Abstract
Cluster systems are becoming increasingly important as a platform for parallel computing. In this area, the power of a system is strongly coupled to the performance of its network, which has to provide high bandwidth and low latency. Beyond these performance aspects, fault tolerance within the network is very important. This paper shows how to build a flexible and fault-tolerant router, the main building block of a network. In addition, the overhead of executing fault-tolerant routing algorithms is examined.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Döring, A.C., Obelöer, W., Lustig, G., Maehle, E. (1998). A flexible approach for a fault-tolerant router. In: Rolim, J. (ed.) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64359-1_736
DOI: https://doi.org/10.1007/3-540-64359-1_736
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64359-3
Online ISBN: 978-3-540-69756-5
eBook Packages: Springer Book Archive