- research-article, May 2024
Hyper: A High-Performance and Memory-Efficient Learned Index via Hybrid Construction
Proceedings of the ACM on Management of Data (PACMMOD), Volume 2, Issue 3, Article No.: 145, Pages 1–26, https://doi.org/10.1145/3654948
Learned indexes use machine learning techniques to improve index construction. However, they often face a fundamental trade-off between performance and memory consumption, especially in dynamic environments with frequent insert and delete operations. ...
- research-article, July 2022
FIST-HOSVD: fused in-place sequentially truncated higher order singular value decomposition
PASC '22: Proceedings of the Platform for Advanced Scientific Computing Conference, Article No.: 15, Pages 1–11, https://doi.org/10.1145/3539781.3539798
In this paper, several novel methods of improving the memory locality of the Sequentially Truncated Higher Order Singular Value Decomposition (ST-HOSVD) algorithm for computing the Tucker decomposition are presented. We show how the two primary ...
- research-article, May 2020
Memory-Aware Framework for Efficient Second-Order Random Walk on Large Graphs
SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, Pages 1797–1812, https://doi.org/10.1145/3318464.3380562
Second-order random walk is an important technique for graph analysis. Many applications use it to capture higher-order patterns in the graph, thus improving the model accuracy. However, the memory explosion problem of this technique hinders it from ...
- research-article, September 2016
Parallel Memory-Efficient Adaptive Mesh Refinement on Structured Triangular Meshes with Billions of Grid Cells
ACM Transactions on Mathematical Software (TOMS), Volume 43, Issue 3, Article No.: 19, Pages 1–27, https://doi.org/10.1145/2947668
We present sam(oa)², a software package for a dynamically adaptive, parallel solution of 2D partial differential equations on triangular grids created via newest vertex bisection. An element order imposed by the Sierpinski space-filling curve provides ...
- research-article, March 2016
A Fast Memory Efficient Construction Algorithm for Hierarchically Semi-Separable Representations
SIAM Journal on Matrix Analysis and Applications (SIMAX), Volume 37, Issue 1, Pages 338–353, https://doi.org/10.1137/15M1028467
Existing hierarchically semi-separable construction algorithms for dense $n \times n$ matrices require as much as $O(n^2)$ peak workspace memory, at a cost of $O(n^2)$ flops. An algorithm is presented which requires $O(n^{1.5})$ peak workspace memory in ...
- Article, December 2013
Memory Efficient 3D Integral Volumes
ICCVW '13: Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, Pages 722–729, https://doi.org/10.1109/ICCVW.2013.99
Integral image data structures are very useful in computer vision applications that involve machine learning approaches based on ensembles of weak learners. The weak learners often are simply several regional sums of intensities subtracted from each ...
- Article, September 2012
A highly-efficient memory-compression approach for GPU-Accelerated virus signature matching
ISC'12: Proceedings of the 15th international conference on Information Security, Pages 354–369, https://doi.org/10.1007/978-3-642-33383-5_22
We are proposing an approach for implementing highly compressed Aho-Corasick and Commentz-Walter automatons for performing GPU-accelerated virus scanning, suitable for implementation in real-world software and hardware systems. We are performing ...
- research-article, February 2012
An application-level parallel I/O library for Earth system models
- John M. Dennis,
- Jim Edwards,
- Ray Loy,
- Robert Jacob,
- Arthur A. Mirin,
- Anthony P. Craig,
- Mariana Vertenstein
International Journal of High Performance Computing Applications (SAGE-HPCA), Volume 26, Issue 1, Pages 43–53, https://doi.org/10.1177/1094342011428143
We describe the design and implementation of an application-level parallel I/O (PIO) library for the reading and writing of distributed arrays to several common scientific data formats. PIO provides the flexibility to control the number of I/O tasks ...
- research-article, December 2010
Fast and memory efficient 2-D connected components using linked lists of line segments
IEEE Transactions on Image Processing (TIP), Volume 19, Issue 12, Pages 3222–3231, https://doi.org/10.1109/TIP.2010.2052826
In this paper we present a more efficient approach to the problem of finding the connected components in binary images. In conventional connected components algorithms, the main data structure to compute and store the connected components is the region ...
- research-article, April 2007
A memory efficient partially parallel decoder architecture for quasi-cyclic LDPC codes
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (ITVL), Volume 15, Issue 4, Pages 483–488, https://doi.org/10.1109/TED.2007.895247
This paper presents a memory efficient partially parallel decoder architecture suited for high rate quasi-cyclic low-density parity-check (QC-LDPC) codes using (modified) min-sum algorithm for decoding. In general, over 30% of memory can be saved over ...
- research-article, December 2005
Quantization of accumulated diffused errors in error diffusion
IEEE Transactions on Image Processing (TIP), Volume 14, Issue 12, Pages 1960–1976, https://doi.org/10.1109/TIP.2005.859372
Due to its high image quality and moderate computational complexity, error diffusion is a popular halftoning algorithm for use with inkjet printers. However, error diffusion is an inherently serial algorithm that requires buffering a full row of ...