Available on x86 or x86-64 only.
Functions
- _cvtmask8_u32 avx512dq
- Convert 8-bit mask a to a 32-bit integer value and store the result in dst.
- _cvtu32_mask8 avx512dq
- Convert 32-bit integer value a to an 8-bit mask and store the result in dst.
- _kadd_mask8 avx512dq
- Add 8-bit masks a and b, and store the result in dst.
- _kadd_mask16 avx512dq
- Add 16-bit masks a and b, and store the result in dst.
- _kand_mask8 avx512dq
- Bitwise AND of 8-bit masks a and b, and store the result in dst.
- _kandn_mask8 avx512dq
- Bitwise AND NOT of 8-bit masks a and b, and store the result in dst.
- _knot_mask8 avx512dq
- Bitwise NOT of 8-bit mask a, and store the result in dst.
- _kor_mask8 avx512dq
- Bitwise OR of 8-bit masks a and b, and store the result in dst.
- _kortest_mask8_u8 ⚠ avx512dq
- Compute the bitwise OR of 8-bit masks a and b. If the result is all zeros, store 1 in dst, otherwise store 0 in dst. If the result is all ones, store 1 in all_ones, otherwise store 0 in all_ones.
- _kortestc_mask8_u8 avx512dq
- Compute the bitwise OR of 8-bit masks a and b. If the result is all ones, store 1 in dst, otherwise store 0 in dst.
- _kortestz_mask8_u8 avx512dq
- Compute the bitwise OR of 8-bit masks a and b. If the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _kshiftli_mask8 avx512dq
- Shift 8-bit mask a left by count bits while shifting in zeros, and store the result in dst.
- _kshiftri_mask8 avx512dq
- Shift 8-bit mask a right by count bits while shifting in zeros, and store the result in dst.
- _ktest_mask8_u8 ⚠ avx512dq
- Compute the bitwise AND of 8-bit masks a and b, and if the result is all zeros, store 1 in dst, otherwise store 0 in dst. Compute the bitwise NOT of a and then AND with b, if the result is all zeros, store 1 in and_not, otherwise store 0 in and_not.
- _ktest_mask16_u8 ⚠ avx512dq
- Compute the bitwise AND of 16-bit masks a and b, and if the result is all zeros, store 1 in dst, otherwise store 0 in dst. Compute the bitwise NOT of a and then AND with b, if the result is all zeros, store 1 in and_not, otherwise store 0 in and_not.
- _ktestc_mask8_u8 avx512dq
- Compute the bitwise NOT of 8-bit mask a and then AND with 8-bit mask b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _ktestc_mask16_u8 avx512dq
- Compute the bitwise NOT of 16-bit mask a and then AND with 16-bit mask b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _ktestz_mask8_u8 avx512dq
- Compute the bitwise AND of 8-bit masks a and b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _ktestz_mask16_u8 avx512dq
- Compute the bitwise AND of 16-bit masks a and b, if the result is all zeros, store 1 in dst, otherwise store 0 in dst.
- _kxnor_mask8 avx512dq
- Bitwise XNOR of 8-bit masks a and b, and store the result in dst.
- _kxor_mask8 avx512dq
- Bitwise XOR of 8-bit masks a and b, and store the result in dst.
- _load_mask8 ⚠ avx512dq
- Load 8-bit mask from memory.
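The flag semantics of the OR-test and AND-test mask functions listed above may be easier to follow as plain integer arithmetic. The following is a scalar sketch on `u8` values, not a call into the intrinsics themselves; the names `kortest8_model` and `ktest8_model` are hypothetical, and returning both flags as a tuple is an assumption made for illustration.

```rust
/// Scalar model of the 8-bit OR test: returns (zero_flag, all_ones_flag).
/// Hypothetical helper for illustration; it does not call any intrinsic.
fn kortest8_model(a: u8, b: u8) -> (u8, u8) {
    let or = a | b;
    ((or == 0) as u8, (or == 0xFF) as u8)
}

/// Scalar model of the 8-bit AND test: returns (zero_flag, and_not_flag).
/// zero_flag is 1 when a & b is all zeros; and_not_flag is 1 when !a & b is all zeros.
fn ktest8_model(a: u8, b: u8) -> (u8, u8) {
    (((a & b) == 0) as u8, ((!a & b) == 0) as u8)
}
```

For example, `kortest8_model(0xF0, 0x0F)` reports a non-zero, all-ones result, while `ktest8_model(0xF0, 0x0F)` reports that the two masks share no set bits.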
- _mm256_
broadcast_ f32x2 avx512dq
andavx512vl
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm256_
broadcast_ f64x2 avx512dq
andavx512vl
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst.
- _mm256_
broadcast_ i32x2 avx512dq
andavx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst.
- _mm256_
broadcast_ i64x2 avx512dq
andavx512vl
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst.
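The lane pattern produced by the 256-bit broadcasts above can be sketched with plain arrays. This is a scalar model under the assumption that vectors are represented as arrays with element 0 as the lowest lane; the function names are hypothetical and no intrinsic is called.

```rust
/// Model of a 256-bit `broadcast_i32x2`: the lower two 32-bit elements of `a`
/// are repeated across all eight lanes of the result.
fn broadcast_i32x2_model(a: [i32; 4]) -> [i32; 8] {
    let mut dst = [0i32; 8];
    for (j, lane) in dst.iter_mut().enumerate() {
        *lane = a[j % 2];
    }
    dst
}

/// Model of a 256-bit `broadcast_i64x2`: both 64-bit elements of `a`
/// are repeated to fill four lanes.
fn broadcast_i64x2_model(a: [i64; 2]) -> [i64; 4] {
    [a[0], a[1], a[0], a[1]]
}
```

With `a = [1, 2, 3, 4]`, the `i32x2` model yields `[1, 2, 1, 2, 1, 2, 1, 2]`: the upper two elements of the source are ignored.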
- _mm256_
cvtepi64_ pd avx512dq
andavx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm256_
cvtepi64_ ps avx512dq
andavx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm256_
cvtepu64_ pd avx512dq
andavx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm256_
cvtepu64_ ps avx512dq
andavx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm256_
cvtpd_ epi64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm256_
cvtpd_ epu64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm256_
cvtps_ epi64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm256_
cvtps_ epu64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm256_
cvttpd_ epi64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm256_
cvttpd_ epu64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm256_
cvttps_ epi64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm256_
cvttps_ epu64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
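The difference between the plain and the "with truncation" conversions above can be illustrated one lane at a time. The sketch below assumes the default rounding mode (round to nearest, ties to even) for the non-truncating form and only covers finite, in-range inputs; NaN and out-of-range behaviour is not modelled. All function names are hypothetical.

```rust
/// Round to nearest, ties to even (assumed default rounding mode).
fn round_ties_even_model(x: f64) -> f64 {
    let r = x.round(); // `round` resolves ties away from zero
    if (x - x.trunc()).abs() == 0.5 && r % 2.0 != 0.0 {
        r - x.signum() // pull an odd tie result back to the even neighbour
    } else {
        r
    }
}

/// One lane of a `cvtpd_epi64`-style conversion under the assumed rounding mode.
fn cvt_lane_model(x: f64) -> i64 {
    round_ties_even_model(x) as i64
}

/// One lane of a `cvttpd_epi64`-style conversion: round toward zero.
fn cvtt_lane_model(x: f64) -> i64 {
    x.trunc() as i64
}
```

Under these assumptions, 2.5 converts to 2 and 3.5 to 4 in the plain form, while the truncating form maps both 2.9 and 2.1 to 2.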
- _mm256_
extractf64x2_ pd avx512dq
andavx512vl
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst.
- _mm256_
extracti64x2_ epi64 avx512dq
andavx512vl
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst.
- _mm256_
fpclass_ pd_ mask avx512dq
andavx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm256_
fpclass_ ps_ mask avx512dq
andavx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm256_
insertf64x2 avx512dq
andavx512vl
- Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by IMM8.
- _mm256_
inserti64x2 avx512dq
andavx512vl
- Copy a to dst, then insert 128 bits (composed of 2 packed 64-bit integers) from b into dst at the location specified by IMM8.
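The 128-bit extract and insert operations above can be modelled on arrays of four 64-bit lanes. This sketch assumes that only bit 0 of IMM8 is meaningful for a 256-bit source (selecting the lower or upper half) and uses a const generic parameter to mirror an immediate operand; the function names are hypothetical and no intrinsic is called.

```rust
/// Model of `extracti64x2` on a 256-bit vector: return the 128-bit half
/// selected by bit 0 of IMM8 (assumption: higher bits are ignored here).
fn extract_i64x2_model<const IMM8: i32>(a: [i64; 4]) -> [i64; 2] {
    let base = ((IMM8 & 1) as usize) * 2;
    [a[base], a[base + 1]]
}

/// Model of `inserti64x2` on a 256-bit vector: copy `a`, then overwrite
/// the 128-bit half selected by bit 0 of IMM8 with `b`.
fn insert_i64x2_model<const IMM8: i32>(a: [i64; 4], b: [i64; 2]) -> [i64; 4] {
    let mut dst = a;
    let base = ((IMM8 & 1) as usize) * 2;
    dst[base] = b[0];
    dst[base + 1] = b[1];
    dst
}
```

For instance, `extract_i64x2_model::<1>([10, 11, 12, 13])` returns the upper half `[12, 13]`, and `insert_i64x2_model::<0>` replaces the lower half while leaving the upper one untouched.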
- _mm256_
mask_ and_ pd avx512dq
andavx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ and_ ps avx512dq
andavx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ andnot_ pd avx512dq
andavx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ andnot_ ps avx512dq
andavx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ broadcast_ f32x2 avx512dq
andavx512vl
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ broadcast_ f64x2 avx512dq
andavx512vl
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ broadcast_ i32x2 avx512dq
andavx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ broadcast_ i64x2 avx512dq
andavx512vl
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtepi64_ pd avx512dq
andavx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtepi64_ ps avx512dq
andavx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtepu64_ pd avx512dq
andavx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtepu64_ ps avx512dq
andavx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtpd_ epi64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtpd_ epu64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtps_ epi64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvtps_ epu64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvttpd_ epi64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvttpd_ epu64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvttps_ epi64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ cvttps_ epu64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ extractf64x2_ pd avx512dq
andavx512vl
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ extracti64x2_ epi64 avx512dq
andavx512vl
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ fpclass_ pd_ mask avx512dq
andavx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm256_
mask_ fpclass_ ps_ mask avx512dq
andavx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm256_
mask_ insertf64x2 avx512dq
andavx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ inserti64x2 avx512dq
andavx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ mullo_ epi64 avx512dq
andavx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ or_ pd avx512dq
andavx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ or_ ps avx512dq
andavx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ range_ pd avx512dq
andavx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_
mask_ range_ ps avx512dq
andavx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_
mask_ reduce_ pd avx512dq
andavx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_
mask_ reduce_ ps avx512dq
andavx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_
mask_ xor_ pd avx512dq
andavx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
mask_ xor_ ps avx512dq
andavx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm256_
maskz_ and_ pd avx512dq
andavx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ and_ ps avx512dq
andavx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ andnot_ pd avx512dq
andavx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ andnot_ ps avx512dq
andavx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ broadcast_ f32x2 avx512dq
andavx512vl
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ broadcast_ f64x2 avx512dq
andavx512vl
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ broadcast_ i32x2 avx512dq
andavx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ broadcast_ i64x2 avx512dq
andavx512vl
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtepi64_ pd avx512dq
andavx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtepi64_ ps avx512dq
andavx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtepu64_ pd avx512dq
andavx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtepu64_ ps avx512dq
andavx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtpd_ epi64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtpd_ epu64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtps_ epi64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvtps_ epu64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvttpd_ epi64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvttpd_ epu64 avx512dq
andavx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvttps_ epi64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ cvttps_ epu64 avx512dq
andavx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ extractf64x2_ pd avx512dq
andavx512vl
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ extracti64x2_ epi64 avx512dq
andavx512vl
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ insertf64x2 avx512dq
andavx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ inserti64x2 avx512dq
andavx512vl
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ mullo_ epi64 avx512dq
andavx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ or_ pd avx512dq
andavx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ or_ ps avx512dq
andavx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ range_ pd avx512dq
andavx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_
maskz_ range_ ps avx512dq
andavx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_
maskz_ reduce_ pd avx512dq
andavx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_
maskz_ reduce_ ps avx512dq
andavx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_
maskz_ xor_ pd avx512dq
andavx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm256_
maskz_ xor_ ps avx512dq
andavx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
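The writemask (`mask_`) and zeromask (`maskz_`) forms listed above differ only in what fills the lanes whose mask bit is clear. The sketch below models this with the 64-bit low multiply on four lanes; it assumes bit j of `k` controls lane j, and the function names are hypothetical stand-ins rather than the intrinsics themselves.

```rust
/// Unmasked model: the low 64 bits of each 128-bit product,
/// which is the same as wrapping multiplication.
fn mullo_epi64_model(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    let mut dst = [0i64; 4];
    for j in 0..4 {
        dst[j] = a[j].wrapping_mul(b[j]);
    }
    dst
}

/// Writemask form: lanes whose bit in `k` is clear are copied from `src`.
fn mask_mullo_epi64_model(src: [i64; 4], k: u8, a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    let full = mullo_epi64_model(a, b);
    let mut dst = src;
    for j in 0..4 {
        if (k >> j) & 1 == 1 {
            dst[j] = full[j];
        }
    }
    dst
}

/// Zeromask form: equivalent to the writemask form with an all-zero `src`.
fn maskz_mullo_epi64_model(k: u8, a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    mask_mullo_epi64_model([0; 4], k, a, b)
}
```

With `k = 0b0101`, lanes 0 and 2 receive the product, while lanes 1 and 3 keep the `src` value in the writemask form and become zero in the zeromask form.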
- _mm256_
movepi32_ mask avx512dq
andavx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 32-bit integer in a.
- _mm256_
movepi64_ mask avx512dq
andavx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 64-bit integer in a.
- _mm256_
movm_ epi32 avx512dq
andavx512vl
- Set each packed 32-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
- _mm256_
movm_ epi64 avx512dq
andavx512vl
- Set each packed 64-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
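The two directions of the mask/vector moves above are inverses for vectors whose lanes are all ones or all zeros. A scalar sketch on four 64-bit lanes, assuming bit j of the mask corresponds to lane j; the function names are hypothetical.

```rust
/// Model of `movepi64_mask` on four lanes: bit j of the result is the
/// most significant (sign) bit of a[j].
fn movepi64_mask_model(a: [i64; 4]) -> u8 {
    let mut k = 0u8;
    for (j, &x) in a.iter().enumerate() {
        if x < 0 {
            k |= 1 << j;
        }
    }
    k
}

/// Model of `movm_epi64` on four lanes: lane j is all ones (-1) when
/// bit j of k is set, and zero otherwise.
fn movm_epi64_model(k: u8) -> [i64; 4] {
    let mut dst = [0i64; 4];
    for (j, lane) in dst.iter_mut().enumerate() {
        if (k >> j) & 1 == 1 {
            *lane = -1;
        }
    }
    dst
}
```

Feeding the output of `movm_epi64_model` back into `movepi64_mask_model` returns the low four bits of the original mask.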
- _mm256_
mullo_ epi64 avx512dq
andavx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst.
- _mm256_
range_ pd avx512dq
andavx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm256_
range_ ps avx512dq
andavx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. Lower 2 bits of IMM8 specifies the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Upper 2 bits of IMM8 specifies the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
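The two control fields of the range operations above can be decoded as in the following one-lane sketch. It is a simplified model, not the hardware definition: it assumes the operation control occupies bits 1:0 of imm8 and the sign control bits 3:2, reads "sign from compare result" as keeping the sign of the selected operand, and does not model NaN or denormal inputs. The name `range_lane_model` is hypothetical.

```rust
/// Simplified one-lane model of a `range_pd`-style operation.
/// Assumptions: op control = imm8 bits 1:0, sign control = imm8 bits 3:2;
/// NaN and denormal handling are not modelled.
fn range_lane_model(a: f64, b: f64, imm8: u8) -> f64 {
    // Select one of the two operands according to the operation control.
    let picked = match imm8 & 0b11 {
        0b00 => if a <= b { a } else { b },             // min
        0b01 => if a <= b { b } else { a },             // max
        0b10 => if a.abs() <= b.abs() { a } else { b }, // absolute min
        _ => if a.abs() <= b.abs() { b } else { a },    // absolute max
    };
    // Adjust the sign of the selected operand according to the sign control.
    match (imm8 >> 2) & 0b11 {
        0b00 => picked.copysign(a), // sign from a
        0b01 => picked,             // sign from compare result
        0b10 => picked.abs(),       // clear sign bit
        _ => -picked.abs(),         // set sign bit
    }
}
```

Under these assumptions, `range_lane_model(-3.0, 2.0, 0b1011)` selects the operand with the larger magnitude and clears its sign, giving `3.0`.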
- _mm256_
reduce_ pd avx512dq
andavx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm256_
reduce_ ps avx512dq
andavx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
and_ pd avx512dq
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst.
- _mm512_
and_ ps avx512dq
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst.
- _mm512_
andnot_ pd avx512dq
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst.
- _mm512_
andnot_ ps avx512dq
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst.
- _mm512_
broadcast_ f32x2 avx512dq
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm512_
broadcast_ f32x8 avx512dq
- Broadcasts the 8 packed single-precision (32-bit) floating-point elements from a to all elements of dst.
- _mm512_
broadcast_ f64x2 avx512dq
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst.
- _mm512_
broadcast_ i32x2 avx512dq
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst.
- _mm512_
broadcast_ i32x8 avx512dq
- Broadcasts the 8 packed 32-bit integers from a to all elements of dst.
- _mm512_
broadcast_ i64x2 avx512dq
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst.
- _mm512_
cvt_ roundepi64_ pd avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvt_ roundepi64_ ps avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvt_ roundepu64_ pd avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvt_ roundepu64_ ps avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvt_ roundpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvt_ roundpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvt_ roundps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvt_ roundps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst. Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
cvtepi64_ pd avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtepi64_ ps avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtepu64_ pd avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtepu64_ ps avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm512_
cvtpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm512_
cvtpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm512_
cvtps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm512_
cvtps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm512_
cvtt_ roundpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
cvtt_ roundpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
cvtt_ roundps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
cvtt_ roundps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
cvttpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm512_
cvttpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm512_
cvttps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm512_
cvttps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
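The difference between the `cvt` and `cvtt` families above can be sketched in scalar code. This is a minimal model, assuming the default MXCSR rounding mode (round to nearest, ties to even) for the non-truncating form and covering only finite, in-range inputs; NaN and out-of-range behaviour of the hardware is not modelled. All function names here are hypothetical.

```rust
/// Round to nearest, ties to even (assumed default rounding mode).
fn round_ties_even_model(x: f64) -> f64 {
    let r = x.round(); // `round` breaks ties away from zero
    let is_tie = (x - x.trunc()).abs() == 0.5;
    if is_tie && r % 2.0 != 0.0 {
        r - x.signum() // step back to the even neighbour
    } else {
        r
    }
}

/// Scalar model of `_mm512_cvtpd_epi64` for in-range inputs.
fn cvtpd_epi64_model(a: [f64; 8]) -> [i64; 8] {
    a.map(|x| round_ties_even_model(x) as i64)
}

/// Scalar model of `_mm512_cvttpd_epi64` for in-range inputs:
/// truncation always rounds toward zero.
fn cvttpd_epi64_model(a: [f64; 8]) -> [i64; 8] {
    a.map(|x| x.trunc() as i64)
}
```

For the input `2.5`, the rounding model gives `2` (tie to even) and the truncating model also gives `2`; for `3.5` they give `4` and `3` respectively.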
- _mm512_
extractf32x8_ ps avx512dq
- Extracts 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst.
- _mm512_
extractf64x2_ pd avx512dq
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst.
- _mm512_
extracti32x8_ epi32 avx512dq
- Extracts 256 bits (composed of 8 packed 32-bit integers) from a, selected with IMM8, and stores the result in dst.
- _mm512_
extracti64x2_ epi64 avx512dq
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst.
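The extract entries above select one aligned chunk of the 512-bit source with IMM8. The scalar sketch below models that selection on arrays; the helper names are hypothetical, and the model takes `IMM8` as a `usize` const parameter for easy indexing, which is an assumption of this example rather than the intrinsics' actual signature.

```rust
/// Scalar model of `_mm512_extracti64x2_epi64`: IMM8 (0..=3) picks one of the
/// four 128-bit chunks, i.e. two adjacent 64-bit lanes.
fn extracti64x2_model<const IMM8: usize>(a: [i64; 8]) -> [i64; 2] {
    assert!(IMM8 < 4, "IMM8 must select one of four 128-bit chunks");
    [a[2 * IMM8], a[2 * IMM8 + 1]]
}

/// Scalar model of `_mm512_extracti32x8_epi32`: IMM8 (0..=1) picks the lower
/// or upper 256-bit half, i.e. eight adjacent 32-bit lanes.
fn extracti32x8_model<const IMM8: usize>(a: [i32; 16]) -> [i32; 8] {
    assert!(IMM8 < 2, "IMM8 must select one of two 256-bit halves");
    core::array::from_fn(|i| a[8 * IMM8 + i])
}
```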
- _mm512_
fpclass_ pd_ mask avx512dq
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm512_
fpclass_ ps_ mask avx512dq
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm512_
insertf32x8 avx512dq
- Copy a to dst, then insert 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by IMM8.
- _mm512_
insertf64x2 avx512dq
- Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by IMM8.
- _mm512_
inserti32x8 avx512dq
- Copy a to dst, then insert 256 bits (composed of 8 packed 32-bit integers) from b into dst at the location specified by IMM8.
- _mm512_
inserti64x2 avx512dq
- Copy a to dst, then insert 128 bits (composed of 2 packed 64-bit integers) from b into dst at the location specified by IMM8.
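The "copy a to dst, then insert" wording of the insert entries above can be modelled directly on arrays. This is a scalar sketch with hypothetical helper names; as an assumption of the example, `IMM8` is a `usize` const parameter that indexes the aligned chunk being overwritten.

```rust
/// Scalar model of `_mm512_inserti64x2`: overwrite one of the four 128-bit
/// chunks of `a` (two 64-bit lanes) with `b`.
fn inserti64x2_model<const IMM8: usize>(a: [i64; 8], b: [i64; 2]) -> [i64; 8] {
    assert!(IMM8 < 4, "IMM8 must select one of four 128-bit chunks");
    let mut dst = a; // copy a to dst
    dst[2 * IMM8..2 * IMM8 + 2].copy_from_slice(&b);
    dst
}

/// Scalar model of `_mm512_inserti32x8`: overwrite the lower or upper 256-bit
/// half of `a` (eight 32-bit lanes) with `b`.
fn inserti32x8_model<const IMM8: usize>(a: [i32; 16], b: [i32; 8]) -> [i32; 16] {
    assert!(IMM8 < 2, "IMM8 must select one of two 256-bit halves");
    let mut dst = a; // copy a to dst
    dst[8 * IMM8..8 * IMM8 + 8].copy_from_slice(&b);
    dst
}
```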
- _mm512_
mask_ and_ pd avx512dq
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ and_ ps avx512dq
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ andnot_ pd avx512dq
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ andnot_ ps avx512dq
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ broadcast_ f32x2 avx512dq
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ broadcast_ f32x8 avx512dq
- Broadcasts the 8 packed single-precision (32-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ broadcast_ f64x2 avx512dq
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ broadcast_ i32x2 avx512dq
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ broadcast_ i32x8 avx512dq
- Broadcasts the 8 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ broadcast_ i64x2 avx512dq
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvt_ roundepi64_ pd avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvt_ roundepi64_ ps avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvt_ roundepu64_ pd avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvt_ roundepu64_ ps avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvt_ roundpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvt_ roundpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvt_ roundps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvt_ roundps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
mask_ cvtepi64_ pd avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtepi64_ ps avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtepu64_ pd avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtepu64_ ps avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvtt_ roundpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
mask_ cvtt_ roundpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
mask_ cvtt_ roundps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
mask_ cvtt_ roundps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
mask_ cvttpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvttpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvttps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ cvttps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ extractf32x8_ ps avx512dq
- Extracts 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ extractf64x2_ pd avx512dq
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ extracti32x8_ epi32 avx512dq
- Extracts 256 bits (composed of 8 packed 32-bit integers) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ extracti64x2_ epi64 avx512dq
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ fpclass_ pd_ mask avx512dq
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm512_
mask_ fpclass_ ps_ mask avx512dq
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm512_
mask_ insertf32x8 avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ insertf64x2 avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ inserti32x8 avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed 32-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ inserti64x2 avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ mullo_ epi64 avx512dq
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ or_ pd avx512dq
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ or_ ps avx512dq
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ range_ pd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_
mask_ range_ ps avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_
mask_ range_ round_ pd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
mask_ range_ round_ ps avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
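The IMM8 encoding described in the range entries above can be decoded per lane as in the scalar sketch below. It is a simplified model: the function names are hypothetical, the sign control is read from bits 3:2 of the immediate, and NaN and signed-zero special cases handled by the hardware are deliberately ignored.

```rust
/// Scalar model of one lane of the range operation (hypothetical helper).
fn range_lane_model(a: f64, b: f64, imm8: u8) -> f64 {
    // Bits 1:0: operation control, which operand is selected.
    let tmp = match imm8 & 0b11 {
        0b00 => if a <= b { a } else { b },             // min
        0b01 => if a <= b { b } else { a },             // max
        0b10 => if a.abs() <= b.abs() { a } else { b }, // absolute min
        _ => if a.abs() <= b.abs() { b } else { a },    // absolute max
    };
    // Bits 3:2: sign control applied to the selected operand.
    match (imm8 >> 2) & 0b11 {
        0b00 => tmp.copysign(a), // sign from a
        0b01 => tmp,             // sign from compare result
        0b10 => tmp.abs(),       // clear sign bit
        _ => -tmp.abs(),         // set sign bit
    }
}

/// Apply the lane model across eight double-precision lanes.
fn range_pd_model(a: [f64; 8], b: [f64; 8], imm8: u8) -> [f64; 8] {
    core::array::from_fn(|i| range_lane_model(a[i], b[i], imm8))
}
```

For example, with `a = -3.0` and `b = 2.0`, the immediate `0b1011` (absolute max, clear sign bit) yields `3.0` in this model, which is the per-lane equivalent of computing `max(|a|, |b|)`.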
- _mm512_
mask_ reduce_ pd avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
mask_ reduce_ ps avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
mask_ reduce_ round_ pd avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
mask_ reduce_ round_ ps avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
mask_ xor_ pd avx512dq
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm512_
mask_ xor_ ps avx512dq
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
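The `mask_` entries above use a writemask (unselected lanes are copied from src), while the `maskz_` entries use a zeromask (unselected lanes are zeroed). The scalar sketch below models both on arrays, with a bitwise XOR of double-precision lanes as the underlying operation. All helper names are hypothetical, and the mask is passed as a plain `u16` for simplicity.

```rust
/// Writemask: lane i takes `result[i]` if bit i of `k` is set, otherwise `src[i]`.
fn apply_writemask<T: Copy, const N: usize>(src: [T; N], k: u16, result: [T; N]) -> [T; N] {
    assert!(N <= 16, "a u16 mask covers at most 16 lanes");
    core::array::from_fn(|i| if (k >> i) & 1 == 1 { result[i] } else { src[i] })
}

/// Zeromask: lane i takes `result[i]` if bit i of `k` is set, otherwise zero
/// (modelled here as `T::default()`).
fn apply_zeromask<T: Copy + Default, const N: usize>(k: u16, result: [T; N]) -> [T; N] {
    assert!(N <= 16, "a u16 mask covers at most 16 lanes");
    core::array::from_fn(|i| if (k >> i) & 1 == 1 { result[i] } else { T::default() })
}

/// Unmasked bitwise XOR of the raw bit patterns of each f64 lane.
fn xor_pd_model(a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    core::array::from_fn(|i| f64::from_bits(a[i].to_bits() ^ b[i].to_bits()))
}
```

XOR-ing `1.0` with `-0.0` flips only the sign bit, so with `k = 0b0000_1111` the writemask form keeps src in the upper four lanes while the zeromask form clears them.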
- _mm512_
maskz_ and_ pd avx512dq
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ and_ ps avx512dq
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ andnot_ pd avx512dq
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ andnot_ ps avx512dq
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ broadcast_ f32x2 avx512dq
- Broadcasts the lower 2 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ broadcast_ f32x8 avx512dq
- Broadcasts the 8 packed single-precision (32-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ broadcast_ f64x2 avx512dq
- Broadcasts the 2 packed double-precision (64-bit) floating-point elements from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ broadcast_ i32x2 avx512dq
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ broadcast_ i32x8 avx512dq
- Broadcasts the 8 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ broadcast_ i64x2 avx512dq
- Broadcasts the 2 packed 64-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvt_ roundepi64_ pd avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvt_ roundepi64_ ps avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvt_ roundepu64_ pd avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvt_ roundepu64_ ps avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvt_ roundpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvt_ roundpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvt_ roundps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvt_ roundps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Rounding is done according to the ROUNDING parameter, which can be one of:
- _mm512_
maskz_ cvtepi64_ pd avx512dq
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtepi64_ ps avx512dq
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtepu64_ pd avx512dq
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtepu64_ ps avx512dq
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvtt_ roundpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
maskz_ cvtt_ roundpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
maskz_ cvtt_ roundps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
maskz_ cvtt_ roundps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC to the sae parameter.
- _mm512_
maskz_ cvttpd_ epi64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvttpd_ epu64 avx512dq
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvttps_ epi64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ cvttps_ epu64 avx512dq
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
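The zero-masked truncating conversions above can be sketched lane by lane in scalar Rust. This is only an illustrative model: `maskz_cvttpd_epi64_model` is a hypothetical helper, not a function in `core::arch`, and it assumes every input is finite and in range for `i64`. NaN and out-of-range inputs are handled differently by the hardware and are not modelled here.

```rust
/// Hypothetical scalar model of a zero-masked truncating f64 -> i64 conversion
/// over 8 lanes. Not part of `core::arch`; for illustration only.
fn maskz_cvttpd_epi64_model(k: u8, a: [f64; 8]) -> [i64; 8] {
    // Zeromask: every lane starts at zero and is only written if its mask bit is set.
    let mut dst = [0i64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            // For finite, in-range values `as` truncates toward zero.
            dst[i] = a[i] as i64;
        }
    }
    dst
}
```

With mask `0b0000_0111` only the three lowest lanes are converted; the remaining lanes are zero regardless of the input.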
- _mm512_
maskz_ extractf32x8_ ps avx512dq
- Extracts 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ extractf64x2_ pd avx512dq
- Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ extracti32x8_ epi32 avx512dq
- Extracts 256 bits (composed of 8 packed 32-bit integers) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ extracti64x2_ epi64 avx512dq
- Extracts 128 bits (composed of 2 packed 64-bit integers) from a, selected with IMM8, and stores the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
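As a rough scalar picture of the 128-bit extract with a zeromask: a 512-bit vector holds four 128-bit chunks, so the sketch below assumes only the low 2 bits of IMM8 are meaningful. `maskz_extracti64x2_model` is a hypothetical name used for illustration and is not the intrinsic itself.

```rust
/// Hypothetical scalar model of a zero-masked 128-bit extract from eight i64 lanes.
fn maskz_extracti64x2_model(k: u8, a: [i64; 8], imm8: usize) -> [i64; 2] {
    // Assumption: the selector picks one of four 128-bit chunks (2 lanes each).
    let base = (imm8 & 0b11) * 2;
    let mut dst = [0i64; 2];
    for i in 0..2 {
        // Mask bit i refers to lane i of the 128-bit result, not of the source.
        if (k >> i) & 1 == 1 {
            dst[i] = a[base + i];
        }
    }
    dst
}
```

Note that the mask indexes the destination lanes, so only its two lowest bits have any effect here.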
- _mm512_
maskz_ insertf32x8 avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ insertf64x2 avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ inserti32x8 avx512dq
- Copy a to tmp, then insert 256 bits (composed of 8 packed 32-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ inserti64x2 avx512dq
- Copy a to tmp, then insert 128 bits (composed of 2 packed 64-bit integers) from b into tmp at the location specified by IMM8, and copy tmp to dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
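The "copy a to tmp, insert b, then mask" sequence described for the insert intrinsics can be written out step by step. The following is a scalar sketch under the assumption that IMM8 selects one of four 128-bit positions; `maskz_inserti64x2_model` is a hypothetical helper, not part of `core::arch`.

```rust
/// Hypothetical scalar model of a zero-masked 128-bit insert into eight i64 lanes.
fn maskz_inserti64x2_model(k: u8, a: [i64; 8], b: [i64; 2], imm8: usize) -> [i64; 8] {
    // Step 1: copy a to tmp.
    let mut tmp = a;
    // Step 2: overwrite the selected 128-bit chunk (2 lanes) with b.
    let base = (imm8 & 0b11) * 2;
    tmp[base] = b[0];
    tmp[base + 1] = b[1];
    // Step 3: copy tmp to dst under the zeromask.
    let mut dst = [0i64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = tmp[i];
        }
    }
    dst
}
```

Because the mask is applied after the insertion, a cleared mask bit zeroes a lane even if that lane was just filled from b.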
- _mm512_
maskz_ mullo_ epi64 avx512dq
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ or_ pd avx512dq
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ or_ ps avx512dq
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ range_ pd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_
maskz_ range_ ps avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_
maskz_ range_ round_ pd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
maskz_ range_ round_ ps avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
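The IMM8 encoding used by the range intrinsics is easier to follow when decoded for a single lane. The sketch below is a scalar model under stated assumptions: the operation control sits in bits 1:0 and the sign control in bits 3:2 of IMM8, and NaN and signed-zero special cases are ignored. `range_model` is a hypothetical helper, not a `core::arch` function.

```rust
/// Hypothetical one-lane model of the range operation for finite, non-NaN inputs.
fn range_model(a: f64, b: f64, imm8: u8) -> f64 {
    // Bits 1:0 select the comparison.
    let tmp = match imm8 & 0b11 {
        0b00 => if a <= b { a } else { b },             // min
        0b01 => if a <= b { b } else { a },             // max
        0b10 => if a.abs() <= b.abs() { a } else { b }, // absolute min
        _ => if a.abs() <= b.abs() { b } else { a },    // absolute max
    };
    // Bits 3:2 select how the sign of the result is produced.
    match (imm8 >> 2) & 0b11 {
        0b00 => tmp.copysign(a), // sign from a
        0b01 => tmp,             // sign from compare result
        0b10 => tmp.abs(),       // clear sign bit
        _ => -tmp.abs(),         // set sign bit
    }
}
```

For example, with a = -3.0 and b = 2.0, an IMM8 of `0b1011` picks the larger magnitude (-3.0) and then clears its sign, giving 3.0.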
- _mm512_
maskz_ reduce_ pd avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
maskz_ reduce_ ps avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
maskz_ reduce_ round_ pd avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
maskz_ reduce_ round_ ps avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
maskz_ xor_ pd avx512dq
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
maskz_ xor_ ps avx512dq
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm512_
movepi32_ mask avx512dq
- Set each bit of mask register k based on the most significant bit of the corresponding packed 32-bit integer in a.
- _mm512_
movepi64_ mask avx512dq
- Set each bit of mask register k based on the most significant bit of the corresponding packed 64-bit integer in a.
- _mm512_
movm_ epi32 avx512dq
- Set each packed 32-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
- _mm512_
movm_ epi64 avx512dq
- Set each packed 64-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
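The two directions, vector to mask and mask to vector, are inverses of each other on the sign bits. A scalar sketch of both, using hypothetical helper names that are not part of `core::arch`:

```rust
/// Hypothetical model of the vector-to-mask direction: bit i of the result is
/// the most significant bit of lane i.
fn movepi64_mask_model(a: [i64; 8]) -> u8 {
    let mut k = 0u8;
    for i in 0..8 {
        // In two's complement the most significant bit is set exactly for negative values.
        if a[i] < 0 {
            k |= 1 << i;
        }
    }
    k
}

/// Hypothetical model of the mask-to-vector direction: lane i becomes all ones
/// (which is -1 as an i64) if bit i of k is set, else all zeros.
fn movm_epi64_model(k: u8) -> [i64; 8] {
    let mut dst = [0i64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = -1;
        }
    }
    dst
}
```

Expanding a mask and collapsing it again returns the original mask, since every all-ones lane has its top bit set.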
- _mm512_
mullo_ epi64 avx512dq
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst.
- _mm512_
or_ pd avx512dq
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst.
- _mm512_
or_ ps avx512dq
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst.
- _mm512_
range_ pd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_
range_ ps avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm512_
range_ round_ pd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
range_ round_ ps avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm512_
reduce_ pd avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
reduce_ ps avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
reduce_ round_ pd avx512dq
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
reduce_ round_ ps avx512dq
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm512_
xor_ pd avx512dq
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst.
- _mm512_
xor_ ps avx512dq
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst.
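The bitwise OR and XOR entries above act on the raw bit patterns of the floating-point lanes, not on their numeric values. A one-lane scalar sketch using `f64::to_bits` and `f64::from_bits` shows the common use of flipping or forcing the sign bit; the helper names are hypothetical and not part of `core::arch`.

```rust
/// Hypothetical one-lane model of a bitwise XOR on f64 bit patterns.
fn xor_pd_model(a: f64, b: f64) -> f64 {
    f64::from_bits(a.to_bits() ^ b.to_bits())
}

/// Hypothetical one-lane model of a bitwise OR on f64 bit patterns.
fn or_pd_model(a: f64, b: f64) -> f64 {
    f64::from_bits(a.to_bits() | b.to_bits())
}
```

Since `-0.0` has only the sign bit set, XOR with `-0.0` negates a value, while OR with `-0.0` forces the result to be negative.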
- _mm_
broadcast_ i32x2 avx512dq
and avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst.
- _mm_
cvtepi64_ pd avx512dq
and avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm_
cvtepi64_ ps avx512dq
and avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm_
cvtepu64_ pd avx512dq
and avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
- _mm_
cvtepu64_ ps avx512dq
and avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
- _mm_
cvtpd_ epi64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm_
cvtpd_ epu64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm_
cvtps_ epi64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst.
- _mm_
cvtps_ epu64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst.
- _mm_
cvttpd_ epi64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm_
cvttpd_ epu64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm_
cvttps_ epi64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst.
- _mm_
cvttps_ epu64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst.
- _mm_
fpclass_ pd_ mask avx512dq
and avx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm_
fpclass_ ps_ mask avx512dq
and avx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm_
fpclass_ sd_ mask avx512dq
- Test the lower double-precision (64-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm_
fpclass_ ss_ mask avx512dq
- Test the lower single-precision (32-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k. imm can be a combination of:
- _mm_
mask_ and_ pd avx512dq
and avx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ and_ ps avx512dq
and avx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ andnot_ pd avx512dq
and avx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ andnot_ ps avx512dq
and avx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ broadcast_ i32x2 avx512dq
and avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using writemask k (elements are copied from src if the corresponding bit is not set).
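A small scalar sketch may help with what "broadcast the lower 2 packed 32-bit integers" means under a writemask: the pair (a[0], a[1]) is repeated across the destination, and lanes whose mask bit is clear keep the value from src. `mask_broadcast_i32x2_model` is a hypothetical helper for the 128-bit (four-lane) case, not a `core::arch` function.

```rust
/// Hypothetical scalar model of a write-masked broadcast of the low i32 pair
/// across four i32 lanes.
fn mask_broadcast_i32x2_model(src: [i32; 4], k: u8, a: [i32; 4]) -> [i32; 4] {
    // Writemask: start from src and overwrite only lanes whose mask bit is set.
    let mut dst = src;
    for i in 0..4 {
        if (k >> i) & 1 == 1 {
            // Even lanes take a[0], odd lanes take a[1]; a[2] and a[3] are never read.
            dst[i] = a[i % 2];
        }
    }
    dst
}
```

Contrast this with the zeromask form, where the unselected lanes would be 0 instead of the src values.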
- _mm_
mask_ cvtepi64_ pd avx512dq
and avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvtepi64_ ps avx512dq
and avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvtepu64_ pd avx512dq
and avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvtepu64_ ps avx512dq
and avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvtpd_ epi64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvtpd_ epu64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvtps_ epi64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvtps_ epu64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvttpd_ epi64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvttpd_ epu64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvttps_ epi64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ cvttps_ epu64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using writemask k (elements are copied from src if the corresponding bit is not set).
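The write-masked conversions above differ from the zero-masked ones only in what unselected lanes receive. The sketch below models the 128-bit single-precision to unsigned 64-bit case, assuming that only the two lowest f32 lanes of a are consumed (two 64-bit results fill a 128-bit destination) and that inputs are finite, non-negative, and in range. `mask_cvttps_epu64_model` is a hypothetical helper, not a `core::arch` function.

```rust
/// Hypothetical scalar model of a write-masked truncating f32 -> u64 conversion
/// producing two u64 lanes.
fn mask_cvttps_epu64_model(src: [u64; 2], k: u8, a: [f32; 4]) -> [u64; 2] {
    // Writemask: lanes with a clear mask bit keep the value from src.
    let mut dst = src;
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            // For finite, in-range values `as` truncates toward zero.
            dst[i] = a[i] as u64;
        }
    }
    dst
}
```

With mask `0b01`, lane 0 is converted and lane 1 is passed through from src unchanged.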
- _mm_
mask_ fpclass_ pd_ mask avx512dq
and avx512vl
- Test packed double-precision (64-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_
mask_ fpclass_ ps_ mask avx512dq
and avx512vl
- Test packed single-precision (32-bit) floating-point elements in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_
mask_ fpclass_ sd_ mask avx512dq
- Test the lower double-precision (64-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_
mask_ fpclass_ ss_ mask avx512dq
- Test the lower single-precision (32-bit) floating-point element in a for special categories specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). imm can be a combination of:
- _mm_
mask_ mullo_ epi64 avx512dq
and avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ or_ pd avx512dq
and avx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ or_ ps avx512dq
and avx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
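For the `mullo_epi64` entries in this list, "producing intermediate 128-bit integers, and store the low 64 bits" amounts to a wrapping multiplication per lane. A one-lane scalar sketch, with `mullo_epi64_lane` as a hypothetical helper name:

```rust
/// Hypothetical one-lane model of the low-64-bit multiply.
fn mullo_epi64_lane(a: i64, b: i64) -> i64 {
    // The full product of two i64 values always fits in an i128.
    let wide = (a as i128) * (b as i128);
    // A narrowing `as` cast keeps exactly the low 64 bits.
    wide as i64
}
```

The result matches `i64::wrapping_mul`, which is usually the simpler way to express the same lane operation in scalar code.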
- _mm_
mask_ range_ pd avx512dq
and avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
mask_ range_ ps avx512dq
and avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
mask_ range_ round_ sd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ range_ round_ ss avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
mask_ range_ sd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
mask_ range_ ss avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in IMM8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The lower 2 bits of IMM8 specify the operation control: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of IMM8 specify the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
mask_ reduce_ pd avx512dq
and avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
mask_ reduce_ ps avx512dq
and avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using writemask k (elements are copied from src to dst if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
mask_ reduce_ round_ sd avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
mask_ reduce_ round_ ss avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
mask_ reduce_ sd avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
mask_ reduce_ ss avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
mask_ xor_ pd avx512dq
and avx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
mask_ xor_ ps avx512dq
and avx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using writemask k (elements are copied from src if the corresponding bit is not set).
- _mm_
maskz_ and_ pd avx512dq
and avx512vl
- Compute the bitwise AND of packed double-precision (64-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ and_ ps avx512dq
and avx512vl
- Compute the bitwise AND of packed single-precision (32-bit) floating point numbers in a and b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ andnot_ pd avx512dq
and avx512vl
- Compute the bitwise NOT of packed double-precision (64-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ andnot_ ps avx512dq
and avx512vl
- Compute the bitwise NOT of packed single-precision (32-bit) floating point numbers in a and then bitwise AND with b and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ broadcast_ i32x2 avx512dq
and avx512vl
- Broadcasts the lower 2 packed 32-bit integers from a to all elements of dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtepi64_ pd avx512dq
and avx512vl
- Convert packed signed 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtepi64_ ps avx512dq
and avx512vl
- Convert packed signed 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtepu64_ pd avx512dq
and avx512vl
- Convert packed unsigned 64-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtepu64_ ps avx512dq
and avx512vl
- Convert packed unsigned 64-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtpd_ epi64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtpd_ epu64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtps_ epi64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvtps_ epu64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvttpd_ epi64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvttpd_ epu64 avx512dq
and avx512vl
- Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvttps_ epi64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed signed 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ cvttps_ epu64 avx512dq
and avx512vl
- Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 64-bit integers with truncation, and store the result in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ mullo_ epi64 avx512dq
and avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ or_ pd avx512dq
and avx512vl
- Compute the bitwise OR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ or_ ps avx512dq
and avx512vl
- Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ range_ pd avx512dq
and avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
maskz_ range_ ps avx512dq
and avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
maskz_ range_ round_ sd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
maskz_ range_ round_ ss avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
maskz_ range_ sd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
maskz_ range_ ss avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
maskz_ reduce_ pd avx512dq
and avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
maskz_ reduce_ ps avx512dq
and avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out if the corresponding mask bit is not set). Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
maskz_ reduce_ round_ sd avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
maskz_ reduce_ round_ ss avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
maskz_ reduce_ sd avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
maskz_ reduce_ ss avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
maskz_ xor_ pd avx512dq
and avx512vl
- Compute the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
maskz_ xor_ ps avx512dq
and avx512vl
- Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out if the corresponding bit is not set).
- _mm_
movepi32_ mask avx512dq
and avx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 32-bit integer in a.
- _mm_
movepi64_ mask avx512dq
and avx512vl
- Set each bit of mask register k based on the most significant bit of the corresponding packed 64-bit integer in a.
- _mm_
movm_ epi32 avx512dq
and avx512vl
- Set each packed 32-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
- _mm_
movm_ epi64 avx512dq
and avx512vl
- Set each packed 64-bit integer in dst to all ones or all zeros based on the value of the corresponding bit in k.
- _mm_
mullo_ epi64 avx512dq
and avx512vl
- Multiply packed 64-bit integers in a and b, producing intermediate 128-bit integers, and store the low 64 bits of the intermediate integers in dst.
- _mm_
range_ pd avx512dq
and avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
range_ ps avx512dq
and avx512vl
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit.
- _mm_
range_ round_ sd avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower double-precision (64-bit) floating-point element in a and b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
range_ round_ ss avx512dq
- Calculate the max, min, absolute max, or absolute min (depending on control in imm8) for the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. Bits 1:0 of imm8 select the operation: 00 = min, 01 = max, 10 = absolute min, 11 = absolute max. Bits 3:2 of imm8 select the sign control: 00 = sign from a, 01 = sign from compare result, 10 = clear sign bit, 11 = set sign bit. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter.
- _mm_
reduce_ pd avx512dq
and avx512vl
- Extract the reduced argument of packed double-precision (64-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
reduce_ ps avx512dq
and avx512vl
- Extract the reduced argument of packed single-precision (32-bit) floating-point elements in a by the number of bits specified by imm8, and store the results in dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
reduce_ round_ sd avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
reduce_ round_ ss avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
reduce_ sd avx512dq
- Extract the reduced argument of the lower double-precision (64-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _mm_
reduce_ ss avx512dq
- Extract the reduced argument of the lower single-precision (32-bit) floating-point element in b by the number of bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. Rounding is done according to the imm8 parameter, which can be one of:
- _store_
mask8 ⚠avx512dq
- Store 8-bit mask to memory.
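
The imm8 encoding described for the range intrinsics above can be hard to read from prose alone. The following is a minimal scalar sketch of a single lane, not the intrinsic itself: `range_lane` is a hypothetical helper name, it assumes finite, non-NaN inputs, and it does not try to reproduce the hardware rules for NaNs, signed zeros, or equal-magnitude ties.

```rust
/// Scalar sketch of one lane of a range operation.
/// Hypothetical helper for illustration only; assumes finite, non-NaN inputs
/// and ignores the special handling of NaNs, signed zeros and magnitude ties.
pub fn range_lane(a: f64, b: f64, imm8: u8) -> f64 {
    // Bits 1:0 of imm8 select which operand is picked.
    let picked = match imm8 & 0b11 {
        0b00 => if a <= b { a } else { b },             // min
        0b01 => if a <= b { b } else { a },             // max
        0b10 => if a.abs() <= b.abs() { a } else { b }, // absolute min
        _ => if a.abs() <= b.abs() { b } else { a },    // absolute max
    };
    // Bits 3:2 of imm8 select the sign of the result.
    match (imm8 >> 2) & 0b11 {
        0b00 => picked.abs().copysign(a), // sign from a
        0b01 => picked,                   // sign from compare result
        0b10 => picked.abs(),             // clear sign bit
        _ => -picked.abs(),               // set sign bit
    }
}
```

For example, with a = -3.0 and b = 2.0, imm8 = 0b0001 picks the max (2.0) but takes the sign from a, giving -2.0, while imm8 = 0b0101 keeps the sign of the picked operand and gives 2.0.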
- vcvtpd2qq_
128 🔒 ⚠
- vcvtpd2qq_
256 🔒 ⚠
- vcvtpd2qq_
512 🔒 ⚠
- vcvtpd2uqq_
128 🔒 ⚠
- vcvtpd2uqq_
256 🔒 ⚠
- vcvtpd2uqq_
512 🔒 ⚠
- vcvtps2qq_
128 🔒 ⚠
- vcvtps2qq_
256 🔒 ⚠
- vcvtps2qq_
512 🔒 ⚠
- vcvtps2uqq_
128 🔒 ⚠
- vcvtps2uqq_
256 🔒 ⚠
- vcvtps2uqq_
512 🔒 ⚠
- vcvtqq2pd_
128 🔒 ⚠
- vcvtqq2pd_
256 🔒 ⚠
- vcvtqq2pd_
512 🔒 ⚠
- vcvtqq2ps_
128 🔒 ⚠
- vcvtqq2ps_
256 🔒 ⚠
- vcvtqq2ps_
512 🔒 ⚠
- vcvttpd2qq_
128 🔒 ⚠
- vcvttpd2qq_
256 🔒 ⚠
- vcvttpd2qq_
512 🔒 ⚠
- vcvttpd2uqq_
128 🔒 ⚠
- vcvttpd2uqq_
256 🔒 ⚠
- vcvttpd2uqq_
512 🔒 ⚠
- vcvttps2qq_
128 🔒 ⚠
- vcvttps2qq_
256 🔒 ⚠
- vcvttps2qq_
512 🔒 ⚠
- vcvttps2uqq_
128 🔒 ⚠
- vcvttps2uqq_
256 🔒 ⚠
- vcvttps2uqq_
512 🔒 ⚠
- vcvtuqq2pd_
128 🔒 ⚠
- vcvtuqq2pd_
256 🔒 ⚠
- vcvtuqq2pd_
512 🔒 ⚠
- vcvtuqq2ps_
128 🔒 ⚠
- vcvtuqq2ps_
256 🔒 ⚠
- vcvtuqq2ps_
512 🔒 ⚠
- vfpclasspd_
128 🔒 ⚠
- vfpclasspd_
256 🔒 ⚠
- vfpclasspd_
512 🔒 ⚠
- vfpclassps_
128 🔒 ⚠
- vfpclassps_
256 🔒 ⚠
- vfpclassps_
512 🔒 ⚠
- vfpclasssd 🔒 ⚠
- vfpclassss 🔒 ⚠
- vrangepd_
128 🔒 ⚠
- vrangepd_
256 🔒 ⚠
- vrangepd_
512 🔒 ⚠
- vrangeps_
128 🔒 ⚠
- vrangeps_
256 🔒 ⚠
- vrangeps_
512 🔒 ⚠
- vrangesd 🔒 ⚠
- vrangess 🔒 ⚠
- vreducepd_
128 🔒 ⚠
- vreducepd_
256 🔒 ⚠
- vreducepd_
512 🔒 ⚠
- vreduceps_
128 🔒 ⚠
- vreduceps_
256 🔒 ⚠
- vreduceps_
512 🔒 ⚠
- vreducesd 🔒 ⚠
- vreducess 🔒 ⚠
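
Several entries above rely on the same two ideas: zero-masking (a lane is written only when its bit in k is set, otherwise it is zeroed) and converting between a mask register and vector lanes. The sketch below models both in plain scalar Rust for a 128-bit vector of two 64-bit lanes. The function names are hypothetical stand-ins chosen for illustration and are not part of `core::arch`; the code only mirrors the behaviour the descriptions state.

```rust
/// Scalar model of a zero-masked low-64-bit multiply over two 64-bit lanes.
/// Hypothetical helper; wrapping_mul keeps the low 64 bits of the 128-bit product.
pub fn model_maskz_mullo_epi64(k: u8, a: [i64; 2], b: [i64; 2]) -> [i64; 2] {
    let mut dst = [0i64; 2]; // lanes start zeroed, as zero-masking requires
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].wrapping_mul(b[i]);
        }
    }
    dst
}

/// Scalar model of building a mask from the most significant bit of each lane.
pub fn model_movepi64_mask(a: [i64; 2]) -> u8 {
    let mut k = 0u8;
    for (i, lane) in a.iter().enumerate() {
        if *lane < 0 {
            k |= 1 << i; // a negative lane has its most significant bit set
        }
    }
    k
}

/// Scalar model of expanding mask bits into all-ones or all-zeros lanes.
pub fn model_movm_epi64(k: u8) -> [i64; 2] {
    let mut dst = [0i64; 2];
    for i in 0..2 {
        if (k >> i) & 1 == 1 {
            dst[i] = -1; // all 64 bits set
        }
    }
    dst
}
```

With k = 0b01, only lane 0 of the product is kept and lane 1 is zeroed. Expanding a mask and reading it back returns the original low mask bits, which makes the two mask helpers a convenient sanity check for each other.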