ARMv8 introduced special CPU instructions for calculating CRC-32C. Use
them, when available, for speed.
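
To illustrate what the instructions compute, here is a minimal sketch (not code from this patch) that calculates CRC-32C one byte at a time. It assumes the ACLE intrinsic `__crc32cb` from `<arm_acle.h>` when the compiler defines `__ARM_FEATURE_CRC32`, and otherwise falls back to a plain bitwise loop. The function name `crc32c_sketch` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>			/* ACLE CRC intrinsics such as __crc32cb */
#endif

/* Hypothetical helper, not part of the patch: CRC-32C of buf[0..len). */
static uint32_t
crc32c_sketch(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t	crc = 0xFFFFFFFFU;	/* conventional initial value */

	while (len-- > 0)
	{
#if defined(__ARM_FEATURE_CRC32)
		/* Target supports the CRC Extension: one instruction per byte. */
		crc = __crc32cb(crc, *p++);
#else
		/* Portable fallback: reflected Castagnoli polynomial, bit by bit. */
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78U & (0U - (crc & 1U)));
#endif
	}
	return crc ^ 0xFFFFFFFFU;	/* conventional final inversion */
}
```

Both branches should produce the same value for the same input; a production implementation would process wider chunks (for example eight bytes at a time) rather than single bytes.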
As with the similar Intel CRC instructions, several factors affect
whether the instructions can be used. The compiler intrinsics for them must
be supported by the compiler, and the instructions must be supported by the
target architecture. If the compilation target architecture does not
support the instructions, but adding "-march=armv8-a+crc" makes them
available, then we compile the code with a runtime check to determine if
the host we're running on supports them or not.
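
The "compile both and choose at runtime" arrangement can be sketched with a function pointer that initially points at a chooser. This is an illustration under stated assumptions, not the patch's code: all names are hypothetical, the "accelerated" variant is only a stand-in (in a real build it would live in a separate file compiled with the extra flag), and the CPU probe is a placeholder that always reports no support.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc32c_fn) (uint32_t crc, const void *data, size_t len);

/* Portable implementation: bitwise update, no init/final inversion. */
static uint32_t
crc32c_portable(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78U & (0U - (crc & 1U)));
	}
	return crc;
}

/* Stand-in for the variant built with the CRC instructions enabled. */
static uint32_t
crc32c_accelerated(uint32_t crc, const void *data, size_t len)
{
	return crc32c_portable(crc, data, len);
}

/* Placeholder probe (assumption): a real one would ask the OS or CPU. */
static bool
cpu_has_crc_instructions(void)
{
	return false;
}

static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len);

/* Callers always go through this pointer. */
static crc32c_fn crc32c_impl = crc32c_choose;

/* First call: pick an implementation, remember it, then delegate. */
static uint32_t
crc32c_choose(uint32_t crc, const void *data, size_t len)
{
	crc32c_impl = cpu_has_crc_instructions() ? crc32c_accelerated
		: crc32c_portable;
	return crc32c_impl(crc, data, len);
}
```

With this shape the probe runs once, on the first call, and later calls pay only for an indirect call.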
For the runtime check, use glibc's getauxval() function. Unfortunately,
that's not very portable, but I couldn't find any more portable way to do
it. If getauxval() is not available, the CRC instructions will still be
used if the target architecture supports them without any additional
compiler flags, but the runtime check will not be available.
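
A minimal sketch of such a runtime check, assuming Linux headers that provide `AT_HWCAP` and `HWCAP_CRC32`. The function name is hypothetical, and returning false when the macros are missing is a simplification made for this example so that it compiles everywhere.

```c
#include <stdbool.h>

#if defined(__linux__) && (defined(__aarch64__) || defined(__arm__))
#include <sys/auxv.h>			/* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>			/* HWCAP_CRC32, where the kernel defines it */
#endif

/*
 * Hypothetical helper: report whether the running CPU advertises the
 * ARMv8 CRC Extension.  Without the needed macros, assume "no".
 */
static bool
armv8_crc32c_available(void)
{
#if defined(AT_HWCAP) && defined(HWCAP_CRC32)
	return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
	return false;
#endif
}
```

The preprocessor guards play the role that the configure test plays in the build: the call is only compiled when both macros are known to exist.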
Original patch by Yuqi Gu, heavily modified by me. Reviewed by Andres
Freund, Thomas Munro.
Discussion: https://www.postgresql.org/message-id/HE1PR0801MB1323D171938EABC04FFE7FA9E3110%40HE1PR0801MB1323.eurprd08.prod.outlook.com
-# If we are targeting a processor that has SSE 4.2 instructions, we can use the
-# special CRC instructions for calculating CRC-32C. If we're not targeting such
-# a processor, but we can nevertheless produce code that uses the SSE
-# intrinsics, perhaps with some extra CFLAGS, compile both implementations and
-# select which one to use at runtime, depending on whether SSE 4.2 is supported
-# by the processor we're running on.
+# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
+# use the special CRC instructions for calculating CRC-32C. If we're not
+# targeting such a processor, but we can nevertheless produce code that uses
+# the SSE intrinsics, perhaps with some extra CFLAGS, compile both
+# implementations and select which one to use at runtime, depending on whether
+# SSE 4.2 is supported by the processor we're running on.
+#
+# Similarly, if we are targeting an ARM processor that has the CRC
+# instructions that are part of the ARMv8 CRC Extension, use them. And if
+# we're not targeting such a processor, but can nevertheless produce code that
+# uses the CRC instructions, compile both, and select at runtime.
 #
 # You can override this logic by setting the appropriate USE_*_CRC32 flag to 1
 # in the template or configure command line.
-if test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_SLICING_BY_8_CRC32C" = x""; then
+if test x"$USE_SLICING_BY_8_CRC32C" = x"" && test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_ARMV8_CRC32C" = x"" && test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x""; then
+  # Use Intel SSE 4.2 if available.
   if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
     USE_SSE42_CRC32C=1
   else
-    # the CPUID instruction is needed for the runtime check.
+    # Intel SSE 4.2, with runtime check? The CPUID instruction is needed for
+    # the runtime check.
     if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
       USE_SSE42_CRC32C_WITH_RUNTIME_CHECK=1
     else
-      # fall back to slicing-by-8 algorithm which doesn't require any special
-      # CPU support.
-      USE_SLICING_BY_8_CRC32C=1
+      # Use ARM CRC Extension if available.
+      if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$CFLAGS_ARMV8_CRC32C" = x""; then
+        USE_ARMV8_CRC32C=1
+      else
+        # ARM CRC Extension, with runtime check? The getauxval() function and
+        # HWCAP_CRC32 are needed for the runtime check.
+        if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$ac_cv_func_getauxval" = x"yes" && test x"$HAVE_HWCAP_CRC32" = x"1"; then
+          USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK=1
+        else
+          # fall back to slicing-by-8 algorithm, which doesn't require any
+# In order to detect at runtime, if the ARM CRC Extension is available,
+# we will do "getauxval(AT_HWCAP) & HWCAP_CRC32". Check if we have
+# everything we need for that.
+AC_CHECK_FUNCS([getauxval])
+AC_COMPILE_IFELSE([AC_LANG_PROGRAM([
+#include <sys/auxv.h>
+#include <asm/hwcap.h>
+], [
+#ifndef AT_HWCAP
+#error AT_HWCAP not defined
+#endif
+#ifndef HWCAP_CRC32
+#error HWCAP_CRC32 not defined
+#endif
+])], [HAVE_HWCAP_CRC32=1])
+
 # Select CRC-32C implementation.
 #
-# If we are targeting a processor that has SSE 4.2 instructions, we can use the
-# special CRC instructions for calculating CRC-32C. If we're not targeting such
-# a processor, but we can nevertheless produce code that uses the SSE
-# intrinsics, perhaps with some extra CFLAGS, compile both implementations and
-# select which one to use at runtime, depending on whether SSE 4.2 is supported
-# by the processor we're running on.
+# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
+# use the special CRC instructions for calculating CRC-32C. If we're not
+# targeting such a processor, but we can nevertheless produce code that uses
+# the SSE intrinsics, perhaps with some extra CFLAGS, compile both
+# implementations and select which one to use at runtime, depending on whether
+# SSE 4.2 is supported by the processor we're running on.
+#
+# Similarly, if we are targeting an ARM processor that has the CRC
+# instructions that are part of the ARMv8 CRC Extension, use them. And if
+# we're not targeting such a processor, but can nevertheless produce code that
+# uses the CRC instructions, compile both, and select at runtime.
 #
 # You can override this logic by setting the appropriate USE_*_CRC32 flag to 1
 # in the template or configure command line.
-if test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_SLICING_BY_8_CRC32C" = x""; then
+if test x"$USE_SLICING_BY_8_CRC32C" = x"" && test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_ARMV8_CRC32C" = x"" && test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x""; then
+  # Use Intel SSE 4.2 if available.
   if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
     USE_SSE42_CRC32C=1
   else
-    # the CPUID instruction is needed for the runtime check.
+    # Intel SSE 4.2, with runtime check? The CPUID instruction is needed for
+    # the runtime check.
     if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
       USE_SSE42_CRC32C_WITH_RUNTIME_CHECK=1
     else
-      # fall back to slicing-by-8 algorithm which doesn't require any special
-      # CPU support.
-      USE_SLICING_BY_8_CRC32C=1
+      # Use ARM CRC Extension if available.
+      if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$CFLAGS_ARMV8_CRC32C" = x""; then
+        USE_ARMV8_CRC32C=1
+      else
+        # ARM CRC Extension, with runtime check? The getauxval() function and
+        # HWCAP_CRC32 are needed for the runtime check.
+        if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$ac_cv_func_getauxval" = x"yes" && test x"$HAVE_HWCAP_CRC32" = x"1"; then
+          USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK=1
+        else
+          # fall back to slicing-by-8 algorithm, which doesn't require any
+          # special CPU support.
+          USE_SLICING_BY_8_CRC32C=1
+        fi
+      fi
     fi
   fi
 fi
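
The nested tests in this hunk form a priority cascade. As a reading aid, the same decision order can be sketched in C. This is an illustration only: the enum, struct, and function names are hypothetical, and each boolean stands for the configure result named in its comment.

```c
#include <stdbool.h>

typedef enum
{
	CRC_SSE42,					/* USE_SSE42_CRC32C */
	CRC_SSE42_RUNTIME,			/* USE_SSE42_CRC32C_WITH_RUNTIME_CHECK */
	CRC_ARMV8,					/* USE_ARMV8_CRC32C */
	CRC_ARMV8_RUNTIME,			/* USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK */
	CRC_SLICING_BY_8			/* USE_SLICING_BY_8_CRC32C */
} crc_impl;

typedef struct
{
	bool		sse42_intrinsics;	/* pgac_sse42_crc32_intrinsics = yes */
	bool		sse42_targeted;		/* SSE4_2_TARGETED = 1 */
	bool		have_cpuid;			/* __get_cpuid or __cpuid found */
	bool		armv8_intrinsics;	/* pgac_armv8_crc32c_intrinsics = yes */
	bool		armv8_needs_cflags;	/* CFLAGS_ARMV8_CRC32C is non-empty */
	bool		have_getauxval;		/* ac_cv_func_getauxval = yes */
	bool		have_hwcap_crc32;	/* HAVE_HWCAP_CRC32 = 1 */
} probe_results;

/* Mirror the order of the nested "if test" blocks, first match wins. */
static crc_impl
select_crc_impl(const probe_results *p)
{
	if (p->sse42_intrinsics && p->sse42_targeted)
		return CRC_SSE42;
	if (p->sse42_intrinsics && p->have_cpuid)
		return CRC_SSE42_RUNTIME;
	if (p->armv8_intrinsics && !p->armv8_needs_cflags)
		return CRC_ARMV8;
	if (p->armv8_intrinsics && p->have_getauxval && p->have_hwcap_crc32)
		return CRC_ARMV8_RUNTIME;
	return CRC_SLICING_BY_8;
}
```

The sketch covers only the automatic selection; it does not model the override where one of the USE_* variables is already set before this block runs.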
@@ -2038,12 +2083,24 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
 else
   if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
     AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])