  • Zhao W, Yuan L, Yan B, Ma P, Zhang Y, Wang L and Wang Z. Stencil Computation with Vector Outer Product. Proceedings of the 38th ACM International Conference on Supercomputing. (247-258).

    https://doi.org/10.1145/3650200.3656611

  • Tao X, Pang J, Xu J and Zhu Y. (2021). Compiler-directed scratchpad memory data transfer optimization for multithreaded applications on a heterogeneous many-core architecture. The Journal of Supercomputing. 77:12. (14502-14524). Online publication date: 1-Dec-2021.

    https://doi.org/10.1007/s11227-021-03853-x

  • Li Y, Sun H and Pang J. (2021). Revisiting split tiling for stencil computations in polyhedral compilation. The Journal of Supercomputing. 10.1007/s11227-021-03835-z.

    https://link.springer.com/10.1007/s11227-021-03835-z

  • Zhao J and Cohen A. (2019). Flextended Tiles. ACM Transactions on Architecture and Code Optimization. 16:4. (1-25). Online publication date: 31-Dec-2020.

    https://doi.org/10.1145/3369382

  • Yuan L, Huang S, Zhang Y and Cao H. Tessellating Star Stencils. Proceedings of the 48th International Conference on Parallel Processing. (1-10).

    https://doi.org/10.1145/3337821.3337835

  • Loffeld J and Hittinger J. (2019). On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws. International Journal of High Performance Computing Applications. 33:1. (25-52). Online publication date: 1-Jan-2019.

    https://doi.org/10.1177/1094342017691876

  • Kruse M and Grosser T. DeLICM: scalar dependence removal at zero memory cost. Proceedings of the 2018 International Symposium on Code Generation and Optimization. (241-253).

    https://doi.org/10.1145/3168815

  • Yuan L, Zhang Y, Guo P and Huang S. Tessellating stencils. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. (1-13).

    https://doi.org/10.1145/3126908.3126920

  • Bondhugula U, Bandishti V and Pananilath I. (2017). Diamond Tiling. IEEE Transactions on Parallel and Distributed Systems. 28:5. (1285-1298). Online publication date: 1-May-2017.

    https://doi.org/10.1109/TPDS.2016.2615094

  • Doerfert J, Grosser T and Hack S. Optimistic loop optimization. Proceedings of the 2017 International Symposium on Code Generation and Optimization. (292-304).

    https://dl.acm.org/doi/10.5555/3049832.3049864

  • Doerfert J, Grosser T and Hack S. (2017). Optimistic loop optimization. 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). 10.1109/CGO.2017.7863748. 978-1-5090-4931-8. (292-304).

    http://ieeexplore.ieee.org/document/7863748/

  • Bondhugula U, Acharya A and Cohen A. (2016). The Pluto+ Algorithm. ACM Transactions on Programming Languages and Systems. 38:3. (1-32). Online publication date: 2-May-2016.

    https://doi.org/10.1145/2896389

  • Bhaskaracharya S, Bondhugula U and Cohen A. (2016). Automatic Storage Optimization for Arrays. ACM Transactions on Programming Languages and Systems. 38:3. (1-23). Online publication date: 2-May-2016.

    https://doi.org/10.1145/2845078

  • Li D, Xu C, Wang Y, Song Z, Xiong M, Gao X and Deng X. (2016). Parallelizing and optimizing large-scale 3D multi-phase flow simulations on the Tianhe-2 supercomputer. Concurrency and Computation: Practice & Experience. 28:5. (1678-1692). Online publication date: 10-Apr-2016.

    https://doi.org/10.1002/cpe.3717