
On the validity of flow-level TCP network models for grid and cloud simulations

Published: 16 December 2013

Abstract

Researchers in the area of grid/cloud computing perform many of their experiments using simulations that must capture network behavior. In this context, packet-level simulations, which are widely used to study network protocols, are too costly given the typical large scales of simulated systems and applications. An alternative is to implement network simulations with less costly flow-level models. Several flow-level models have been proposed and implemented in grid/cloud simulators. Surprisingly, published validations of these models, if any, consist of verifications for only a few simple cases. Consequently, even when they have been used to obtain published results, the ability of these simulators to produce scientifically meaningful results is in doubt. This work evaluates these state-of-the-art flow-level network models of TCP communication via comparison to packet-level simulation. While it is straightforward to show cases in which previously proposed models lead to good results, instead we follow the critical method, which places model refutation at the center of the scientific activity, and we systematically seek cases that lead to invalid results. Careful analysis of these cases reveals fundamental flaws and also suggests improvements. One contribution of this work is that these improvements lead to a new model that, while far from being perfect, improves upon all previously proposed models in the context of simulation of grids or clouds. A more important contribution, perhaps, is provided by the pitfalls and unexpected behaviors encountered in this work, leading to a number of enlightening lessons. In particular, this work shows that model validation cannot be achieved solely by exhibiting (possibly many) “good cases.” Confidence in the quality of a model can only be strengthened through an invalidation approach that attempts to prove the model wrong.
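
To illustrate what a flow-level model computes, in contrast with simulating individual packets, the following minimal sketch assigns each flow a steady-state rate using max-min fair sharing by progressive filling. Max-min fairness is used here only as a familiar textbook allocation; it is an assumption made for illustration and is not claimed to be the model evaluated or proposed in this work. The function name, the link and flow identifiers, and the capacities are all hypothetical.

```python
from typing import Dict, List


def max_min_fair_rates(
    capacities: Dict[str, float],
    routes: Dict[str, List[str]],
) -> Dict[str, float]:
    """Return a max-min fair rate per flow via progressive filling.

    capacities maps a (hypothetical) link name to its capacity;
    routes maps a (hypothetical) flow name to the links it crosses.
    Flows with an empty route are left at rate 0.0 in this sketch.
    """
    remaining = dict(capacities)  # capacity not yet allocated on each link
    active = {f for f, links in routes.items() if links}  # flows still growing
    rates = {f: 0.0 for f in routes}

    while active:
        # Count the active flows crossing each link.
        load = {link: 0 for link in remaining}
        for f in active:
            for link in set(routes[f]):
                load[link] += 1

        # Equal share each loaded link could still give to its active flows.
        fair = {link: remaining[link] / n for link, n in load.items() if n > 0}
        share = min(fair.values())

        # Links that offer the smallest share are the bottlenecks of this round.
        saturated = {link for link, s in fair.items() if s <= share * (1 + 1e-9)}

        # Every active flow grows by the same increment.
        for link, n in load.items():
            remaining[link] -= share * n
        for f in active:
            rates[f] += share

        # Flows crossing a saturated link are frozen at their current rate.
        active = {f for f in active if not saturated & set(routes[f])}

    return rates
```

For example, with two links A (capacity 10) and B (capacity 4) and three flows, f1 over A, f2 over A and B, and f3 over B, this sketch gives f2 and f3 a rate of 2 each and f1 a rate of 8. A packet-level simulator would instead have to simulate every packet and acknowledgment of these transfers, which is where the cost difference mentioned in the abstract comes from.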

Published In

ACM Transactions on Modeling and Computer Simulation, Volume 23, Issue 4
October 2013
113 pages
ISSN: 1049-3301
EISSN: 1558-1195
DOI: 10.1145/2556945
© 2013 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 December 2013
Accepted: 01 July 2013
Revised: 01 April 2013
Received: 01 September 2012
Published in TOMACS Volume 23, Issue 4

Author Tags

  1. Grid and cloud computing simulation
  2. SimGrid
  3. scalability
  4. validation

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (last 12 months): 24
  • Downloads (last 6 weeks): 1

Reflects downloads up to 13 Jan 2025

Cited By

  • (2024) Studying the end-to-end performance, energy consumption and carbon footprint of fog applications. 2024 IEEE Symposium on Computers and Communications (ISCC), 1-7. DOI: 10.1109/ISCC61673.2024.10733735. Online publication date: 26-Jun-2024
  • (2024) Automated Calibration of Parallel and Distributed Computing Simulators: A Case Study. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1026-1035. DOI: 10.1109/IPDPSW63119.2024.00173. Online publication date: 27-May-2024
  • (2024) Modeling Distributed Computing Infrastructures for HEP Applications. EPJ Web of Conferences 295, 04032. DOI: 10.1051/epjconf/202429504032. Online publication date: 6-May-2024
  • (2024) An exploration of online-simulation-driven portfolio scheduling in Workflow Management Systems. Future Generation Computer Systems 161, 345-360. DOI: 10.1016/j.future.2024.07.005. Online publication date: Dec-2024
  • (2023) A Wi-Fi Energy Model for Scalable Simulation. 2023 IEEE 24th International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), 232-241. DOI: 10.1109/WoWMoM57956.2023.00038. Online publication date: Jun-2023
  • (2023) IO-Sets: Simple and Efficient Approaches for I/O Bandwidth Management. IEEE Transactions on Parallel and Distributed Systems 34:10, 2783-2796. DOI: 10.1109/TPDS.2023.3305028. Online publication date: 1-Oct-2023
  • (2023) Validation of ESDS Using Epidemic-Based Data Dissemination Algorithms. 2023 19th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), 277-284. DOI: 10.1109/DCOSS-IoT58021.2023.00054. Online publication date: Jun-2023
  • (2023) On the Feasibility of Simulation-Driven Portfolio Scheduling for Cyberinfrastructure Runtime Systems. Job Scheduling Strategies for Parallel Processing, 3-24. DOI: 10.1007/978-3-031-22698-4_1. Online publication date: 12-Jan-2023
  • (2022) A Flow-Level Wi-Fi Model for Large Scale Network Simulation. Proceedings of the 25th International ACM Conference on Modeling Analysis and Simulation of Wireless and Mobile Systems, 111-119. DOI: 10.1145/3551659.3559022. Online publication date: 24-Oct-2022
  • (2022) ElastiSim: A Batch-System Simulator for Malleable Workloads. Proceedings of the 51st International Conference on Parallel Processing, 1-11. DOI: 10.1145/3545008.3545046. Online publication date: 29-Aug-2022
