Cited By
- Park J, Huang X, Lee C (2023). Analyzing and predicting job failures from HPC system log. The Journal of Supercomputing 80:1 (435-462). doi:10.1007/s11227-023-05482-y. Online publication date: 24-Jun-2023
- Kumar R, Jha S, Mahgoub A, Kalyanam R, Harrell S, Song X, Kalbarczyk Z, Kramer W, Iyer R, Bagchi S (2020). The Mystery of the Failing Jobs: Insights from Operational Data from Two University-Wide Computing Systems. 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (158-171). doi:10.1109/DSN48063.2020.00034. Online publication date: Jun-2020
- Rojas E, Meneses E, Jones T, Maxwell D (2020). Towards a Model to Estimate the Reliability of Large-Scale Hybrid Supercomputers. Euro-Par 2020: Parallel Processing (37-51). doi:10.1007/978-3-030-57675-2_3. Online publication date: 24-Aug-2020