DOI: 10.1109/CCGrid.2015.106
research-article

Scalable in-memory computing

Published: 04 May 2015

Abstract

Data-intensive scientific workflows are composed of many tasks with data precedence constraints, so tasks communicate by means of intermediate files. In such scenarios, the storage layer is often a bottleneck that limits overall application scalability, because large volumes of data are generated at runtime at high I/O rates. To alleviate the storage pressure, applications take advantage of in-memory runtime distributed file systems that act as a fast, distributed cache, which greatly enhances I/O performance.
In this paper, we present scalability results for MemFS, a distributed in-memory runtime file system. MemFS takes the opposite approach to data locality: it scatters all data among the nodes, which leads to well-balanced storage and network traffic and thus makes the system both highly performant and scalable. Our results show that MemFS is platform-independent, performing equally well on both private clusters and commercial clouds. On such platforms, running on up to 1024 cores, MemFS shows excellent horizontal scalability (using more nodes), while the vertical scalability (using more cores per node) is limited only by the network bandwidth.
Furthermore, for this challenge we show how MemFS is able to scale elastically, at runtime, based on the application storage demands. In our experiments, we have successfully used up to 1 TB of memory when running a large instance of the Montage workflow.
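
To make the idea of scattering data among nodes concrete, the following is a minimal sketch, not the MemFS implementation. It assumes that each file is cut into fixed-size stripes and that every stripe is assigned to a node by hashing a stripe key. The names `STRIPE_SIZE`, `node_for`, `place_file`, and `stripes_per_node`, the stripe size, and the use of SHA-256 with modulo placement are all hypothetical choices made for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical stripe size; the abstract does not state the value MemFS uses.
STRIPE_SIZE = 512 * 1024


def node_for(key: str, num_nodes: int) -> int:
    """Map a stripe key to a node index with a deterministic hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def place_file(path: str, size: int, num_nodes: int) -> list[int]:
    """Return the node index chosen for each stripe of one file."""
    num_stripes = max(1, -(-size // STRIPE_SIZE))  # ceiling division
    return [node_for(f"{path}#{i}", num_nodes) for i in range(num_stripes)]


def stripes_per_node(files: dict[str, int], num_nodes: int) -> Counter:
    """Count how many stripes each node stores for a set of files."""
    load = Counter({node: 0 for node in range(num_nodes)})
    for path, size in files.items():
        load.update(place_file(path, size, num_nodes))
    return load


if __name__ == "__main__":
    # 64 hypothetical intermediate files of 8 MiB each, spread over 8 nodes.
    files = {f"/tmp/out_{i}.fits": 8 * 1024 * 1024 for i in range(64)}
    for node, count in sorted(stripes_per_node(files, 8).items()):
        print(f"node {node}: {count} stripes")
```

Because the hash ignores which node wrote a stripe, the stripes of any single file end up on many nodes, so both storage use and network traffic spread across the cluster. Plain modulo placement, as used in this sketch, remaps most keys when the node count changes; a system that resizes at runtime would more plausibly rely on a scheme such as consistent hashing.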


Cited By

  • (2016) In-staging data placement for asynchronous coupling of task-based scientific workflows. Proceedings of the Second International Workshop on Extreme Scale Programming Models and Middleware, pp. 2-9. DOI: 10.5555/3018814.3018816. Online publication date: 13-Nov-2016

Published In

CCGRID '15: Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing
May 2015
1277 pages
ISBN:9781479980062

Publisher

IEEE Press

