LS-CAT: A Large-Scale CUDA AutoTuning Dataset

Bjertnes, Lars; Tørring, Jacob O.; Elster, Anne C.

arXiv:2103.14409v1 (cs)

[Submitted on 26 Mar 2021]

Abstract: The effectiveness of Machine Learning (ML) methods depends on access to large, suitable datasets. In this article, we present how we build the LS-CAT (Large-Scale CUDA AutoTuning) dataset, sourced from GitHub, for the purpose of training NLP-based ML models. Our dataset includes 19 683 CUDA kernels focused on linear algebra. In addition to the CUDA codes, our LS-CAT dataset contains 5 028 536 associated runtimes, with different combinations of kernels, block sizes and matrix sizes. The runtimes are GPU benchmarks on both Nvidia GTX 980 and Nvidia T4 systems. This information creates a foundation upon which NLP-based models can find correlations between source-code features and the optimal choice of thread block size.
Several results can be drawn from our LS-CAT database. For example, our experimental results show that an optimal choice of thread block size yields a performance gain of 6% in the average case. We also analyze how much performance increase can be achieved in general, finding that in 10% of the cases a performance increase of more than 20% can be achieved by using the optimal block size. A description of current and future work is also included.
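
To make the two statistics above concrete, the following is a minimal Python sketch of how a per-kernel gain from the optimal block size could be computed from runtime records. It is not the authors' analysis code. The record layout, the kernel names, the runtimes, and the helper `gains_over_median` are all hypothetical, and the choice of the median runtime as the baseline is an assumption, since the abstract does not state which baseline the gains are measured against.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (kernel, matrix_size, block_size, runtime_ms).
# These values are made up for illustration; they are not LS-CAT data.
records = [
    ("kernel_a", 512, (8, 8), 1.20),
    ("kernel_a", 512, (16, 16), 1.00),
    ("kernel_a", 512, (32, 32), 1.10),
    ("kernel_b", 512, (8, 8), 2.00),
    ("kernel_b", 512, (16, 16), 2.00),
    ("kernel_b", 512, (32, 32), 1.50),
]


def gains_over_median(records):
    """Map (kernel, matrix_size) to (best_block, relative_gain).

    The gain is the runtime reduction of the fastest block size relative
    to the median runtime over all block sizes (an assumed baseline).
    """
    groups = defaultdict(list)
    for kernel, matrix_size, block, runtime in records:
        groups[(kernel, matrix_size)].append((runtime, block))

    result = {}
    for key, runs in groups.items():
        best_runtime, best_block = min(runs)  # fastest configuration
        baseline = median(runtime for runtime, _ in runs)
        result[key] = (best_block, (baseline - best_runtime) / baseline)
    return result


gains = gains_over_median(records)

# Average gain over all (kernel, matrix size) cases.
average_gain = mean(gain for _, gain in gains.values())

# Share of cases where the optimal block size gains more than 20%.
share_above_20 = sum(gain > 0.20 for _, gain in gains.values()) / len(gains)

print(f"average gain: {average_gain:.1%}")
print(f"cases with gain above 20%: {share_above_20:.0%}")
```

With real data, the same grouping would run over every combination of kernel and matrix size on each GPU, and a different baseline (for example, a fixed default block size) would change the reported percentages.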

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2103.14409 [cs.DC]
	(or arXiv:2103.14409v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2103.14409

Submission history

From: Lars Bjertnes
[v1] Fri, 26 Mar 2021 11:33:48 UTC (4,171 KB)
