SIMD-scan: ultra fast in-memory table scan using on-chip vector processing units

Published: 01 August 2009

Abstract

The availability of huge system memory, even on standard servers, has generated a lot of interest in main-memory database engines. In data warehouse systems, highly compressed column-oriented data structures are quite prominent. To scale with data volume and system load, many of these systems are highly distributed and follow a shared-nothing approach. The fundamental operation in all of these systems is a full table scan over one or more compressed columns. Recent research has proposed different techniques to speed up table scans, such as intelligent compression or the use of additional hardware such as graphics cards or FPGAs. In this paper, we show that utilizing the embedded Vector Processing Units (VPUs) found in standard superscalar processors can speed up main-memory full table scans severalfold. This is achieved without changing the hardware architecture and therefore without additional power consumption. Moreover, because on-chip VPUs access the system's RAM directly, no additional costly copy operations are needed to use the new SIMD-scan approach in standard main-memory database engines. We therefore propose this scan approach as the standard scan operator for compressed column-oriented main-memory storage. We then discuss how well our solution scales with the number of processor cores and, consequently, to what degree it can be applied in multi-threaded environments. To verify the feasibility of our approach, we implemented the proposed techniques on a modern Intel multi-core processor using Intel® Streaming SIMD Extensions (Intel® SSE). In addition, we integrated the new SIMD-scan approach into SAP® Netweaver® Business Warehouse Accelerator. We conclude by describing the performance benefits of using our approach for processing and scanning compressed data using VPUs in column-oriented main-memory database systems.
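To make the idea of a VPU-based column scan concrete, the following is a minimal sketch in C using SSE2 intrinsics. It is not the paper's implementation: it assumes the column has already been unpacked into 32-bit dictionary codes (the bit-unpacking of compressed data is omitted), it assumes an x86 processor with SSE2, and the function name `simd_scan_eq` and its parameters are hypothetical. It compares four codes per instruction against a search key and records the positions of the matches.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics (x86 only) */
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the paper: scans codes[0..n) for
 * entries equal to key and writes the matching row positions to out_pos,
 * which must have room for n entries. Returns the number of matches.
 * Assumes the column is already unpacked into 32-bit codes.
 */
size_t simd_scan_eq(const uint32_t *codes, size_t n, uint32_t key,
                    uint32_t *out_pos)
{
    assert(codes != NULL && out_pos != NULL);

    size_t hits = 0;
    size_t i = 0;

    /* Broadcast the search key into all four 32-bit lanes. */
    const __m128i vkey = _mm_set1_epi32((int)key);

    /* Main loop: compare four codes per iteration. */
    for (; i + 4 <= n; i += 4) {
        /* Unaligned load, so the column needs no special alignment. */
        __m128i v = _mm_loadu_si128((const __m128i *)(codes + i));

        /* Each lane becomes all ones on a match, all zeros otherwise. */
        __m128i eq = _mm_cmpeq_epi32(v, vkey);

        /* Collapse the four lane results into a 4-bit mask. */
        int mask = _mm_movemask_ps(_mm_castsi128_ps(eq));

        for (int lane = 0; lane < 4; ++lane) {
            if (mask & (1 << lane)) {
                out_pos[hits++] = (uint32_t)(i + (size_t)lane);
            }
        }
    }

    /* Scalar tail for the remaining (n mod 4) codes. */
    for (; i < n; ++i) {
        if (codes[i] == key) {
            out_pos[hits++] = (uint32_t)i;
        }
    }

    return hits;
}
```

Because the vector unit reads directly from ordinary memory, the sketch works on a plain array without copying it to a separate device, which is the property the abstract highlights in contrast to GPU- or FPGA-based scans.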

Published In

Proceedings of the VLDB Endowment, Volume 2, Issue 1
August 2009
1293 pages

Publisher

VLDB Endowment

Qualifiers

  • Research-article
