[DOXYGEN] Minor improvements in doxygen comments.

Split very long \brief sections into two: a short \brief sentence followed by a separate detail paragraph.

I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.

llvm-svn: 303031
Ekaterina Romanova 2017-05-15 03:25:04 +00:00
parent 718b7457e8
commit 1d4a0f270c
8 changed files with 432 additions and 274 deletions

@ -1458,12 +1458,13 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/// \brief Computes two dot products in parallel, using the lower and upper
/// halves of two [8 x float] vectors as input to the two computations, and
/// returning the two dot products in the lower and upper halves of the
-/// [8 x float] result. The immediate integer operand controls which input
-/// elements will contribute to the dot product, and where the final results
-/// are returned. In general, for each dot product, the four corresponding
-/// elements of the input vectors are multiplied; the first two and second
-/// two products are summed, then the two sums are added to form the final
-/// result.
+/// [8 x float] result.
+///
+/// The immediate integer operand controls which input elements will
+/// contribute to the dot product, and where the final results are returned.
+/// In general, for each dot product, the four corresponding elements of the
+/// input vectors are multiplied; the first two and second two products are
+/// summed, then the two sums are added to form the final result.
///
/// \headerfile <x86intrin.h>
///
@ -1497,15 +1498,16 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/* Vector shuffle */
/// \brief Selects 8 float values from the 256-bit operands of [8 x float], as
-/// specified by the immediate value operand. The four selected elements in
-/// each operand are copied to the destination according to the bits
-/// specified in the immediate operand. The selected elements from the first
-/// 256-bit operand are copied to bits [63:0] and bits [191:128] of the
-/// destination, and the selected elements from the second 256-bit operand
-/// are copied to bits [127:64] and bits [255:192] of the destination. For
-/// example, if bits [7:0] of the immediate operand contain a value of 0xFF,
-/// the 256-bit destination vector would contain the following values: b[7],
-/// b[7], a[7], a[7], b[3], b[3], a[3], a[3].
+/// specified by the immediate value operand.
+///
+/// The four selected elements in each operand are copied to the destination
+/// according to the bits specified in the immediate operand. The selected
+/// elements from the first 256-bit operand are copied to bits [63:0] and
+/// bits [191:128] of the destination, and the selected elements from the
+/// second 256-bit operand are copied to bits [127:64] and bits [255:192] of
+/// the destination. For example, if bits [7:0] of the immediate operand
+/// contain a value of 0xFF, the 256-bit destination vector would contain the
+/// following values: b[7], b[7], a[7], a[7], b[3], b[3], a[3], a[3].
///
/// \headerfile <x86intrin.h>
///
@ -1557,13 +1559,14 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
12 + (((mask) >> 6) & 0x3)); })
/// \brief Selects four double-precision values from the 256-bit operands of
-/// [4 x double], as specified by the immediate value operand. The selected
-/// elements from the first 256-bit operand are copied to bits [63:0] and
-/// bits [191:128] in the destination, and the selected elements from the
-/// second 256-bit operand are copied to bits [127:64] and bits [255:192] in
-/// the destination. For example, if bits [3:0] of the immediate operand
-/// contain a value of 0xF, the 256-bit destination vector would contain the
-/// following values: b[3], a[3], b[1], a[1].
+/// [4 x double], as specified by the immediate value operand.
+///
+/// The selected elements from the first 256-bit operand are copied to bits
+/// [63:0] and bits [191:128] in the destination, and the selected elements
+/// from the second 256-bit operand are copied to bits [127:64] and bits
+/// [255:192] in the destination. For example, if bits [3:0] of the immediate
+/// operand contain a value of 0xF, the 256-bit destination vector would
+/// contain the following values: b[3], a[3], b[1], a[1].
///
/// \headerfile <x86intrin.h>
///
@ -1641,9 +1644,11 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/// \brief Compares each of the corresponding double-precision values of two
/// 128-bit vectors of [2 x double], using the operation specified by the
-/// immediate integer operand. Returns a [2 x double] vector consisting of
-/// two doubles corresponding to the two comparison results: zero if the
-/// comparison is false, and all 1's if the comparison is true.
+/// immediate integer operand.
+///
+/// Returns a [2 x double] vector consisting of two doubles corresponding to
+/// the two comparison results: zero if the comparison is false, and all 1's
+/// if the comparison is true.
///
/// \headerfile <x86intrin.h>
///
@ -1699,9 +1704,11 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/// \brief Compares each of the corresponding values of two 128-bit vectors of
/// [4 x float], using the operation specified by the immediate integer
-/// operand. Returns a [4 x float] vector consisting of four floats
-/// corresponding to the four comparison results: zero if the comparison is
-/// false, and all 1's if the comparison is true.
+/// operand.
+///
+/// Returns a [4 x float] vector consisting of four floats corresponding to
+/// the four comparison results: zero if the comparison is false, and all 1's
+/// if the comparison is true.
///
/// \headerfile <x86intrin.h>
///
@ -1757,9 +1764,11 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/// \brief Compares each of the corresponding double-precision values of two
/// 256-bit vectors of [4 x double], using the operation specified by the
-/// immediate integer operand. Returns a [4 x double] vector consisting of
-/// four doubles corresponding to the four comparison results: zero if the
-/// comparison is false, and all 1's if the comparison is true.
+/// immediate integer operand.
+///
+/// Returns a [4 x double] vector consisting of four doubles corresponding to
+/// the four comparison results: zero if the comparison is false, and all 1's
+/// if the comparison is true.
///
/// \headerfile <x86intrin.h>
///
@ -1815,9 +1824,11 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/// \brief Compares each of the corresponding values of two 256-bit vectors of
/// [8 x float], using the operation specified by the immediate integer
-/// operand. Returns a [8 x float] vector consisting of eight floats
-/// corresponding to the eight comparison results: zero if the comparison is
-/// false, and all 1's if the comparison is true.
+/// operand.
+///
+/// Returns a [8 x float] vector consisting of eight floats corresponding to
+/// the eight comparison results: zero if the comparison is false, and all
+/// 1's if the comparison is true.
///
/// \headerfile <x86intrin.h>
///
@ -1873,8 +1884,10 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/// \brief Compares each of the corresponding scalar double-precision values of
/// two 128-bit vectors of [2 x double], using the operation specified by the
-/// immediate integer operand. If the result is true, all 64 bits of the
-/// destination vector are set; otherwise they are cleared.
+/// immediate integer operand.
+///
+/// If the result is true, all 64 bits of the destination vector are set;
+/// otherwise they are cleared.
///
/// \headerfile <x86intrin.h>
///
@ -1930,8 +1943,10 @@ _mm256_blendv_ps(__m256 __a, __m256 __b, __m256 __c)
/// \brief Compares each of the corresponding scalar values of two 128-bit
/// vectors of [4 x float], using the operation specified by the immediate
-/// integer operand. If the result is true, all 32 bits of the destination
-/// vector are set; otherwise they are cleared.
+/// integer operand.
+///
+/// If the result is true, all 32 bits of the destination vector are set;
+/// otherwise they are cleared.
///
/// \headerfile <x86intrin.h>
///
@ -2536,7 +2551,9 @@ _mm256_unpacklo_ps(__m256 __a, __m256 __b)
/// \brief Given two 128-bit floating-point vectors of [2 x double], perform an
/// element-by-element comparison of the double-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of double-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2563,7 +2580,9 @@ _mm_testz_pd(__m128d __a, __m128d __b)
/// \brief Given two 128-bit floating-point vectors of [2 x double], perform an
/// element-by-element comparison of the double-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of double-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2590,7 +2609,9 @@ _mm_testc_pd(__m128d __a, __m128d __b)
/// \brief Given two 128-bit floating-point vectors of [2 x double], perform an
/// element-by-element comparison of the double-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of double-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2618,7 +2639,9 @@ _mm_testnzc_pd(__m128d __a, __m128d __b)
/// \brief Given two 128-bit floating-point vectors of [4 x float], perform an
/// element-by-element comparison of the single-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of single-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2645,7 +2668,9 @@ _mm_testz_ps(__m128 __a, __m128 __b)
/// \brief Given two 128-bit floating-point vectors of [4 x float], perform an
/// element-by-element comparison of the single-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of single-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2672,7 +2697,9 @@ _mm_testc_ps(__m128 __a, __m128 __b)
/// \brief Given two 128-bit floating-point vectors of [4 x float], perform an
/// element-by-element comparison of the single-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of single-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2700,7 +2727,9 @@ _mm_testnzc_ps(__m128 __a, __m128 __b)
/// \brief Given two 256-bit floating-point vectors of [4 x double], perform an
/// element-by-element comparison of the double-precision elements in the
/// first source vector and the corresponding elements in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of double-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2727,7 +2756,9 @@ _mm256_testz_pd(__m256d __a, __m256d __b)
/// \brief Given two 256-bit floating-point vectors of [4 x double], perform an
/// element-by-element comparison of the double-precision elements in the
/// first source vector and the corresponding elements in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of double-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2754,7 +2785,9 @@ _mm256_testc_pd(__m256d __a, __m256d __b)
/// \brief Given two 256-bit floating-point vectors of [4 x double], perform an
/// element-by-element comparison of the double-precision elements in the
/// first source vector and the corresponding elements in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of double-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2782,7 +2815,9 @@ _mm256_testnzc_pd(__m256d __a, __m256d __b)
/// \brief Given two 256-bit floating-point vectors of [8 x float], perform an
/// element-by-element comparison of the single-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of single-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2809,7 +2844,9 @@ _mm256_testz_ps(__m256 __a, __m256 __b)
/// \brief Given two 256-bit floating-point vectors of [8 x float], perform an
/// element-by-element comparison of the single-precision element in the
/// first source vector and the corresponding element in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of single-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2836,7 +2873,9 @@ _mm256_testc_ps(__m256 __a, __m256 __b)
/// \brief Given two 256-bit floating-point vectors of [8 x float], perform an
/// element-by-element comparison of the single-precision elements in the
/// first source vector and the corresponding elements in the second source
-/// vector. The EFLAGS register is updated as follows: \n
+/// vector.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of single-precision elements where the
/// sign-bits of both elements are 1, the ZF flag is set to 0. Otherwise the
/// ZF flag is set to 1. \n
@ -2862,7 +2901,9 @@ _mm256_testnzc_ps(__m256 __a, __m256 __b)
}
/// \brief Given two 256-bit integer vectors, perform a bit-by-bit comparison
-/// of the two source vectors and update the EFLAGS register as follows: \n
+/// of the two source vectors.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of bits where both bits are 1, the ZF flag
/// is set to 0. Otherwise the ZF flag is set to 1. \n
/// If there is at least one pair of bits where the bit from the first source
@ -2886,7 +2927,9 @@ _mm256_testz_si256(__m256i __a, __m256i __b)
}
/// \brief Given two 256-bit integer vectors, perform a bit-by-bit comparison
-/// of the two source vectors and update the EFLAGS register as follows: \n
+/// of the two source vectors.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of bits where both bits are 1, the ZF flag
/// is set to 0. Otherwise the ZF flag is set to 1. \n
/// If there is at least one pair of bits where the bit from the first source
@ -2910,7 +2953,9 @@ _mm256_testc_si256(__m256i __a, __m256i __b)
}
/// \brief Given two 256-bit integer vectors, perform a bit-by-bit comparison
-/// of the two source vectors and update the EFLAGS register as follows: \n
+/// of the two source vectors.
+///
+/// The EFLAGS register is updated as follows: \n
/// If there is at least one pair of bits where both bits are 1, the ZF flag
/// is set to 0. Otherwise the ZF flag is set to 1. \n
/// If there is at least one pair of bits where the bit from the first source
@ -4466,9 +4511,10 @@ _mm256_castsi256_si128(__m256i __a)
}
/// \brief Constructs a 256-bit floating-point vector of [4 x double] from a
-/// 128-bit floating-point vector of [2 x double]. The lower 128 bits
-/// contain the value of the source vector. The contents of the upper 128
-/// bits are undefined.
+/// 128-bit floating-point vector of [2 x double].
+///
+/// The lower 128 bits contain the value of the source vector. The contents
+/// of the upper 128 bits are undefined.
///
/// \headerfile <x86intrin.h>
///
@ -4486,9 +4532,10 @@ _mm256_castpd128_pd256(__m128d __a)
}
/// \brief Constructs a 256-bit floating-point vector of [8 x float] from a
-/// 128-bit floating-point vector of [4 x float]. The lower 128 bits contain
-/// the value of the source vector. The contents of the upper 128 bits are
-/// undefined.
+/// 128-bit floating-point vector of [4 x float].
+///
+/// The lower 128 bits contain the value of the source vector. The contents
+/// of the upper 128 bits are undefined.
///
/// \headerfile <x86intrin.h>
///
@ -4506,6 +4553,7 @@ _mm256_castps128_ps256(__m128 __a)
}
/// \brief Constructs a 256-bit integer vector from a 128-bit integer vector.
+///
/// The lower 128 bits contain the value of the source vector. The contents
/// of the upper 128 bits are undefined.
///
@ -4586,8 +4634,10 @@ _mm256_zextsi128_si256(__m128i __a)
/// \brief Constructs a new 256-bit vector of [8 x float] by first duplicating
/// a 256-bit vector of [8 x float] given in the first parameter, and then
/// replacing either the upper or the lower 128 bits with the contents of a
-/// 128-bit vector of [4 x float] in the second parameter. The immediate
-/// integer parameter determines between the upper or the lower 128 bits.
+/// 128-bit vector of [4 x float] in the second parameter.
+///
+/// The immediate integer parameter determines between the upper or the lower
+/// 128 bits.
///
/// \headerfile <x86intrin.h>
///
@ -4631,8 +4681,10 @@ _mm256_zextsi128_si256(__m128i __a)
/// \brief Constructs a new 256-bit vector of [4 x double] by first duplicating
/// a 256-bit vector of [4 x double] given in the first parameter, and then
/// replacing either the upper or the lower 128 bits with the contents of a
-/// 128-bit vector of [2 x double] in the second parameter. The immediate
-/// integer parameter determines between the upper or the lower 128 bits.
+/// 128-bit vector of [2 x double] in the second parameter.
+///
+/// The immediate integer parameter determines between the upper or the lower
+/// 128 bits.
///
/// \headerfile <x86intrin.h>
///
@ -4672,8 +4724,10 @@ _mm256_zextsi128_si256(__m128i __a)
/// \brief Constructs a new 256-bit integer vector by first duplicating a
/// 256-bit integer vector given in the first parameter, and then replacing
/// either the upper or the lower 128 bits with the contents of a 128-bit
-/// integer vector in the second parameter. The immediate integer parameter
-/// determines between the upper or the lower 128 bits.
+/// integer vector in the second parameter.
+///
+/// The immediate integer parameter determines between the upper or the lower
+/// 128 bits.
///
/// \headerfile <x86intrin.h>
///

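For illustration only, here is a minimal sketch of the mask convention that the comparison comments in this change describe: each element of the result is all 1's when the comparison is true and zero when it is false. It uses `_mm_cmple_pd` from `<emmintrin.h>` when SSE2 is available. The helper name `cmple_pd_masks` and the scalar fallback for non-SSE2 targets are assumptions made for this example and are not part of the headers being changed.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: writes the two 64-bit comparison masks for
 * a[i] <= b[i] into out[0] and out[1]. */
static void cmple_pd_masks(const double a[2], const double b[2],
                           uint64_t out[2])
{
#if defined(__SSE2__)
    __m128d va = _mm_loadu_pd(a);           /* unaligned load of a[0], a[1] */
    __m128d vb = _mm_loadu_pd(b);           /* unaligned load of b[0], b[1] */
    __m128d mask = _mm_cmple_pd(va, vb);    /* per-element a <= b */
    /* Reinterpret the result as integers so the bit patterns can be read. */
    _mm_storeu_si128((__m128i *)out, _mm_castpd_si128(mask));
#else
    /* Scalar model of the same convention for targets without SSE2. */
    for (int i = 0; i < 2; ++i)
        out[i] = (a[i] <= b[i]) ? UINT64_MAX : 0;
#endif
}
```

Calling the helper with `a = {1.0, 5.0}` and `b = {2.0, 3.0}` should leave `FFFFFFFFFFFFFFFFh` in `out[0]` and `0h` in `out[1]`. A NaN in either input makes the ordered comparison false, so that element's mask is zero.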
@ -462,8 +462,9 @@ _mm_cmplt_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are less than or equal to those in the second operand. Each
-/// comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// operand are less than or equal to those in the second operand.
+///
+/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -482,8 +483,9 @@ _mm_cmple_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are greater than those in the second operand. Each comparison
-/// yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// operand are greater than those in the second operand.
+///
+/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -502,8 +504,9 @@ _mm_cmpgt_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are greater than or equal to those in the second operand. Each
-/// comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// operand are greater than or equal to those in the second operand.
+///
+/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -522,9 +525,10 @@ _mm_cmpge_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are ordered with respect to those in the second operand. A pair
-/// of double-precision values are "ordered" with respect to each other if
-/// neither value is a NaN. Each comparison yields 0h for false,
+/// operand are ordered with respect to those in the second operand.
+///
+/// A pair of double-precision values are "ordered" with respect to each
+/// other if neither value is a NaN. Each comparison yields 0h for false,
/// FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
@ -544,9 +548,10 @@ _mm_cmpord_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are unordered with respect to those in the second operand. A pair
-/// of double-precision values are "unordered" with respect to each other if
-/// one or both values are NaN. Each comparison yields 0h for false,
+/// operand are unordered with respect to those in the second operand.
+///
+/// A pair of double-precision values are "unordered" with respect to each
+/// other if one or both values are NaN. Each comparison yields 0h for false,
/// FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
@ -567,8 +572,9 @@ _mm_cmpunord_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are unequal to those in the second operand. Each comparison
-/// yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// operand are unequal to those in the second operand.
+///
+/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -587,8 +593,9 @@ _mm_cmpneq_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are not less than those in the second operand. Each comparison
-/// yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// operand are not less than those in the second operand.
+///
+/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -607,8 +614,9 @@ _mm_cmpnlt_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are not less than or equal to those in the second operand. Each
-/// comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// operand are not less than or equal to those in the second operand.
+///
+/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -627,8 +635,9 @@ _mm_cmpnle_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
-/// operand are not greater than those in the second operand. Each
-/// comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// operand are not greater than those in the second operand.
+///
+/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -648,6 +657,7 @@ _mm_cmpngt_pd(__m128d __a, __m128d __b)
/// \brief Compares each of the corresponding double-precision values of the
/// 128-bit vectors of [2 x double] to determine if the values in the first
/// operand are not greater than or equal to those in the second operand.
+///
/// Each comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
@ -666,8 +676,9 @@ _mm_cmpnge_pd(__m128d __a, __m128d __b)
}
/// \brief Compares the lower double-precision floating-point values in each of
-/// the two 128-bit floating-point vectors of [2 x double] for equality. The
-/// comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
+/// the two 128-bit floating-point vectors of [2 x double] for equality.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -690,8 +701,9 @@ _mm_cmpeq_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than the corresponding value in
-/// the second parameter. The comparison yields 0h for false,
-/// FFFFFFFFFFFFFFFFh for true.
+/// the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -714,8 +726,9 @@ _mm_cmplt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than or equal to the
-/// corresponding value in the second parameter. The comparison yields 0h for
-/// false, FFFFFFFFFFFFFFFFh for true.
+/// corresponding value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -738,8 +751,9 @@ _mm_cmple_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than the corresponding value
-/// in the second parameter. The comparison yields 0h for false,
-/// FFFFFFFFFFFFFFFFh for true.
+/// in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -763,8 +777,9 @@ _mm_cmpgt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than or equal to the
-/// corresponding value in the second parameter. The comparison yields 0h for
-/// false, FFFFFFFFFFFFFFFFh for true.
+/// corresponding value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -788,9 +803,11 @@ _mm_cmpge_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is "ordered" with respect to the
-/// corresponding value in the second parameter. The comparison yields 0h for
-/// false, FFFFFFFFFFFFFFFFh for true. A pair of double-precision values are
-/// "ordered" with respect to each other if neither value is a NaN.
+/// corresponding value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true. A pair of
+/// double-precision values are "ordered" with respect to each other if
+/// neither value is a NaN.
///
/// \headerfile <x86intrin.h>
///
@ -813,9 +830,11 @@ _mm_cmpord_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is "unordered" with respect to the
-/// corresponding value in the second parameter. The comparison yields 0h
-/// for false, FFFFFFFFFFFFFFFFh for true. A pair of double-precision values
-/// are "unordered" with respect to each other if one or both values are NaN.
+/// corresponding value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true. A pair of
+/// double-precision values are "unordered" with respect to each other if one
+/// or both values are NaN.
///
/// \headerfile <x86intrin.h>
///
@ -839,8 +858,9 @@ _mm_cmpunord_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is unequal to the corresponding value in
-/// the second parameter. The comparison yields 0h for false,
-/// FFFFFFFFFFFFFFFFh for true.
+/// the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -863,8 +883,9 @@ _mm_cmpneq_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is not less than the corresponding
-/// value in the second parameter. The comparison yields 0h for false,
-/// FFFFFFFFFFFFFFFFh for true.
+/// value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -887,8 +908,9 @@ _mm_cmpnlt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is not less than or equal to the
-/// corresponding value in the second parameter. The comparison yields 0h
-/// for false, FFFFFFFFFFFFFFFFh for true.
+/// corresponding value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -911,8 +933,9 @@ _mm_cmpnle_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is not greater than the corresponding
-/// value in the second parameter. The comparison yields 0h for false,
-/// FFFFFFFFFFFFFFFFh for true.
+/// value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -936,8 +959,9 @@ _mm_cmpngt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is not greater than or equal to the
-/// corresponding value in the second parameter. The comparison yields 0h
-/// for false, FFFFFFFFFFFFFFFFh for true.
+/// corresponding value in the second parameter.
+///
+/// The comparison yields 0h for false, FFFFFFFFFFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -982,7 +1006,9 @@ _mm_comieq_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than the corresponding value in
/// the second parameter. The comparison yields 0 for false, 1 for true.
/// the second parameter.
///
/// The comparison yields 0 for false, 1 for true.
///
/// \headerfile <x86intrin.h>
///
@ -1004,8 +1030,9 @@ _mm_comilt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than or equal to the
/// corresponding value in the second parameter. The comparison yields 0 for
/// false, 1 for true.
/// corresponding value in the second parameter.
///
/// The comparison yields 0 for false, 1 for true.
///
/// \headerfile <x86intrin.h>
///
@ -1027,7 +1054,9 @@ _mm_comile_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than the corresponding value
/// in the second parameter. The comparison yields 0 for false, 1 for true.
/// in the second parameter.
///
/// The comparison yields 0 for false, 1 for true.
///
/// \headerfile <x86intrin.h>
///
@ -1049,8 +1078,9 @@ _mm_comigt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than or equal to the
/// corresponding value in the second parameter. The comparison yields 0 for
/// false, 1 for true.
/// corresponding value in the second parameter.
///
/// The comparison yields 0 for false, 1 for true.
///
/// \headerfile <x86intrin.h>
///
@ -1072,7 +1102,9 @@ _mm_comige_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is unequal to the corresponding value in
/// the second parameter. The comparison yields 0 for false, 1 for true.
/// the second parameter.
///
/// The comparison yields 0 for false, 1 for true.
///
/// \headerfile <x86intrin.h>
///
@ -1093,8 +1125,9 @@ _mm_comineq_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] for equality. The
/// comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 1 is returned.
/// comparison yields 0 for false, 1 for true.
///
/// If either of the two lower double-precision values is NaN, 1 is returned.
///
/// \headerfile <x86intrin.h>
///
@ -1117,8 +1150,10 @@ _mm_ucomieq_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than the corresponding value in
/// the second parameter. The comparison yields 0 for false, 1 for true. If
/// either of the two lower double-precision values is NaN, 1 is returned.
/// the second parameter.
///
/// The comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 1 is returned.
///
/// \headerfile <x86intrin.h>
///
@ -1141,9 +1176,10 @@ _mm_ucomilt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than or equal to the
/// corresponding value in the second parameter. The comparison yields 0 for
/// false, 1 for true. If either of the two lower double-precision values is
/// NaN, 1 is returned.
/// corresponding value in the second parameter.
///
/// The comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 1 is returned.
///
/// \headerfile <x86intrin.h>
///
@ -1166,8 +1202,10 @@ _mm_ucomile_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than the corresponding value
/// in the second parameter. The comparison yields 0 for false, 1 for true.
/// If either of the two lower double-precision values is NaN, 0 is returned.
/// in the second parameter.
///
/// The comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 0 is returned.
///
/// \headerfile <x86intrin.h>
///
@ -1190,9 +1228,10 @@ _mm_ucomigt_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than or equal to the
/// corresponding value in the second parameter. The comparison yields 0 for
/// false, 1 for true. If either of the two lower double-precision values
/// is NaN, 0 is returned.
/// corresponding value in the second parameter.
///
/// The comparison yields 0 for false, 1 for true. If either of the two
/// lower double-precision values is NaN, 0 is returned.
///
/// \headerfile <x86intrin.h>
///
@ -1215,8 +1254,10 @@ _mm_ucomige_sd(__m128d __a, __m128d __b)
/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is unequal to the corresponding value in
/// the second parameter. The comparison yields 0 for false, 1 for true. If
/// either of the two lower double-precision values is NaN, 0 is returned.
/// the second parameter.
///
/// The comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 0 is returned.
///
/// \headerfile <x86intrin.h>
///
@ -1278,8 +1319,9 @@ _mm_cvtps_pd(__m128 __a)
/// \brief Converts the lower two integer elements of a 128-bit vector of
/// [4 x i32] into two double-precision floating-point values, returned in a
/// 128-bit vector of [2 x double]. The upper two elements of the input
/// vector are unused.
/// 128-bit vector of [2 x double].
///
/// The upper two elements of the input vector are unused.
///
/// \headerfile <x86intrin.h>
///
@ -1287,7 +1329,9 @@ _mm_cvtps_pd(__m128 __a)
///
/// \param __a
/// A 128-bit integer vector of [4 x i32]. The lower two integer elements are
/// converted to double-precision values. The upper two elements are unused.
/// converted to double-precision values.
///
/// The upper two elements are unused.
/// \returns A 128-bit vector of [2 x double] containing the converted values.
static __inline__ __m128d __DEFAULT_FN_ATTRS
_mm_cvtepi32_pd(__m128i __a)
@ -1409,10 +1453,11 @@ _mm_cvtss_sd(__m128d __a, __m128 __b)
/// \brief Converts the two double-precision floating-point elements of a
/// 128-bit vector of [2 x double] into two signed 32-bit integer values,
/// returned in the lower 64 bits of a 128-bit vector of [4 x i32]. If the
/// result of either conversion is inexact, the result is truncated (rounded
/// towards zero) regardless of the current MXCSR setting. The upper 64 bits
/// of the result vector are set to zero.
/// returned in the lower 64 bits of a 128-bit vector of [4 x i32].
///
/// If the result of either conversion is inexact, the result is truncated
/// (rounded towards zero) regardless of the current MXCSR setting. The upper
/// 64 bits of the result vector are set to zero.
///
/// \headerfile <x86intrin.h>
///
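The truncating conversion described above can be modelled per element in plain C, since a C cast from `double` to an integer type also rounds towards zero. This is a sketch under stated assumptions: `trunc_to_i32` is a hypothetical name, and the handling of out-of-range and NaN inputs (returning `INT32_MIN`) is an assumption modelled on the x86 "integer indefinite" value, not something the comment above states.

```c
#include <stdint.h>

/* Hypothetical scalar model of one element of a truncating
   double -> signed 32-bit conversion. */
static int32_t trunc_to_i32(double d) {
    /* Assumption: inputs that cannot be represented (including NaN, for
       which both comparisons are false) map to INT32_MIN. */
    if (!(d > -2147483649.0 && d < 2147483648.0))
        return INT32_MIN;
    return (int32_t)d;  /* a C cast discards the fraction (rounds towards zero) */
}
```

Because the rounding direction is fixed, the result does not depend on the rounding mode selected in MXCSR, unlike the non-truncating conversions.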
@ -1466,9 +1511,10 @@ _mm_cvtpd_pi32(__m128d __a)
/// \brief Converts the two double-precision floating-point elements of a
/// 128-bit vector of [2 x double] into two signed 32-bit integer values,
/// returned in a 64-bit vector of [2 x i32]. If the result of either
/// conversion is inexact, the result is truncated (rounded towards zero)
/// regardless of the current MXCSR setting.
/// returned in a 64-bit vector of [2 x i32].
///
/// If the result of either conversion is inexact, the result is truncated
/// (rounded towards zero) regardless of the current MXCSR setting.
///
/// \headerfile <x86intrin.h>
///
@ -1980,8 +2026,9 @@ _mm_storel_pd(double *__dp, __m128d __a)
/// \brief Adds the corresponding elements of two 128-bit vectors of [16 x i8],
/// saving the lower 8 bits of each sum in the corresponding element of a
/// 128-bit result vector of [16 x i8]. The integer elements of both
/// parameters can be either signed or unsigned.
/// 128-bit result vector of [16 x i8].
///
/// The integer elements of both parameters can be either signed or unsigned.
///
/// \headerfile <x86intrin.h>
///
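Keeping only the lower 8 bits of each sum is what makes the operation valid for both signed and unsigned inputs: in two's complement the low bits of a sum are the same either way. A minimal scalar sketch, with hypothetical helper names that are not part of the header:

```c
#include <stdint.h>

/* Hypothetical scalar model of one 8-bit element: add, keep the low 8 bits. */
static uint8_t add_wrap_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* The same operation applied to all 16 elements of a 128-bit vector,
   represented here as plain byte arrays. */
static void add_wrap_u8x16(const uint8_t a[16], const uint8_t b[16],
                           uint8_t out[16]) {
    for (int i = 0; i < 16; ++i)
        out[i] = add_wrap_u8(a[i], b[i]);
}
```

For example, the byte 0xC8 is 200 when read as unsigned and -56 when read as signed; adding 100 gives 0x2C (44) under both readings.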
@ -2001,8 +2048,9 @@ _mm_add_epi8(__m128i __a, __m128i __b)
/// \brief Adds the corresponding elements of two 128-bit vectors of [8 x i16],
/// saving the lower 16 bits of each sum in the corresponding element of a
/// 128-bit result vector of [8 x i16]. The integer elements of both
/// parameters can be either signed or unsigned.
/// 128-bit result vector of [8 x i16].
///
/// The integer elements of both parameters can be either signed or unsigned.
///
/// \headerfile <x86intrin.h>
///
@ -2022,8 +2070,9 @@ _mm_add_epi16(__m128i __a, __m128i __b)
/// \brief Adds the corresponding elements of two 128-bit vectors of [4 x i32],
/// saving the lower 32 bits of each sum in the corresponding element of a
/// 128-bit result vector of [4 x i32]. The integer elements of both
/// parameters can be either signed or unsigned.
/// 128-bit result vector of [4 x i32].
///
/// The integer elements of both parameters can be either signed or unsigned.
///
/// \headerfile <x86intrin.h>
///
@ -2061,8 +2110,9 @@ _mm_add_si64(__m64 __a, __m64 __b)
/// \brief Adds the corresponding elements of two 128-bit vectors of [2 x i64],
/// saving the lower 64 bits of each sum in the corresponding element of a
/// 128-bit result vector of [2 x i64]. The integer elements of both
/// parameters can be either signed or unsigned.
/// 128-bit result vector of [2 x i64].
///
/// The integer elements of both parameters can be either signed or unsigned.
///
/// \headerfile <x86intrin.h>
///
@ -2208,10 +2258,12 @@ _mm_avg_epu16(__m128i __a, __m128i __b)
/// \brief Multiplies the corresponding elements of two 128-bit signed [8 x i16]
/// vectors, producing eight intermediate 32-bit signed integer products, and
/// adds the consecutive pairs of 32-bit products to form a 128-bit signed
/// [4 x i32] vector. For example, bits [15:0] of both parameters are
/// multiplied producing a 32-bit product, bits [31:16] of both parameters
/// are multiplied producing a 32-bit product, and the sum of those two
/// products becomes bits [31:0] of the result.
/// [4 x i32] vector.
///
/// For example, bits [15:0] of both parameters are multiplied producing a
/// 32-bit product, bits [31:16] of both parameters are multiplied producing
/// a 32-bit product, and the sum of those two products becomes bits [31:0]
/// of the result.
///
/// \headerfile <x86intrin.h>
///
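The multiply-and-add step described above can be sketched for a single 32-bit result element as follows. `madd_pair` is a hypothetical helper, not part of the header; the wrap-around in the one overflowing case is an assumption about the underlying instruction rather than something stated in the comment.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit result element: two signed 16-bit
   products are formed at 32-bit width and then added.
   The sum is computed in unsigned arithmetic so that the single overflowing
   case (all four inputs equal to -32768) wraps instead of being undefined. */
static int32_t madd_pair(int16_t a0, int16_t a1, int16_t b0, int16_t b1) {
    uint32_t p0 = (uint32_t)((int32_t)a0 * b0);  /* product of bits [15:0]  */
    uint32_t p1 = (uint32_t)((int32_t)a1 * b1);  /* product of bits [31:16] */
    return (int32_t)(p0 + p1);                   /* becomes bits [31:0]     */
}
```

A full [8 x i16] input would apply this to four consecutive element pairs, producing the [4 x i32] result.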
@ -3146,8 +3198,9 @@ _mm_cmpgt_epi8(__m128i __a, __m128i __b)
/// \brief Compares each of the corresponding signed 16-bit values of the
/// 128-bit integer vectors to determine if the values in the first operand
/// are greater than those in the second operand. Each comparison yields 0h
/// for false, FFFFh for true.
/// are greater than those in the second operand.
///
/// Each comparison yields 0h for false, FFFFh for true.
///
/// \headerfile <x86intrin.h>
///
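A per-element mask of 0h or FFFFh is convenient for branch-free selection. The sketch below models one 16-bit element in scalar C and uses the mask to pick the larger of two values; both function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical scalar model of one element of a signed 16-bit
   greater-than comparison: FFFFh for true, 0h for false. */
static uint16_t cmpgt_i16_mask(int16_t a, int16_t b) {
    return a > b ? 0xFFFFu : 0u;
}

/* Uses the mask to select the larger value without branching:
   (a AND mask) OR (b AND NOT mask). */
static int16_t max_i16_via_mask(int16_t a, int16_t b) {
    uint16_t m = cmpgt_i16_mask(a, b);
    return (int16_t)(((uint16_t)a & m) | ((uint16_t)b & (uint16_t)~m));
}
```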
@ -3166,8 +3219,9 @@ _mm_cmpgt_epi16(__m128i __a, __m128i __b)
/// \brief Compares each of the corresponding signed 32-bit values of the
/// 128-bit integer vectors to determine if the values in the first operand
/// are greater than those in the second operand. Each comparison yields 0h
/// for false, FFFFFFFFh for true.
/// are greater than those in the second operand.
///
/// Each comparison yields 0h for false, FFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -3186,8 +3240,9 @@ _mm_cmpgt_epi32(__m128i __a, __m128i __b)
/// \brief Compares each of the corresponding signed 8-bit values of the 128-bit
/// integer vectors to determine if the values in the first operand are less
/// than those in the second operand. Each comparison yields 0h for false,
/// FFh for true.
/// than those in the second operand.
///
/// Each comparison yields 0h for false, FFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -3206,8 +3261,9 @@ _mm_cmplt_epi8(__m128i __a, __m128i __b)
/// \brief Compares each of the corresponding signed 16-bit values of the
/// 128-bit integer vectors to determine if the values in the first operand
/// are less than those in the second operand. Each comparison yields 0h for
/// false, FFFFh for true.
/// are less than those in the second operand.
///
/// Each comparison yields 0h for false, FFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -3226,8 +3282,9 @@ _mm_cmplt_epi16(__m128i __a, __m128i __b)
/// \brief Compares each of the corresponding signed 32-bit values of the
/// 128-bit integer vectors to determine if the values in the first operand
/// are less than those in the second operand. Each comparison yields 0h for
/// false, FFFFFFFFh for true.
/// are less than those in the second operand.
///
/// Each comparison yields 0h for false, FFFFFFFFh for true.
///
/// \headerfile <x86intrin.h>
///
@ -3925,10 +3982,11 @@ _mm_storeu_si128(__m128i *__p, __m128i __b)
/// \brief Moves bytes selected by the mask from the first operand to the
/// specified unaligned memory location. When a mask bit is 1, the
/// corresponding byte is written, otherwise it is not written. To minimize
/// caching, the data is flagged as non-temporal (unlikely to be used again
/// soon). Exception and trap behavior for elements not selected for storage
/// to memory is implementation dependent.
/// corresponding byte is written, otherwise it is not written.
///
/// To minimize caching, the data is flagged as non-temporal (unlikely to be
/// used again soon). Exception and trap behavior for elements not selected
/// for storage to memory is implementation dependent.
///
/// \headerfile <x86intrin.h>
///
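The byte-masked store described above can be modelled with a simple loop. This is a sketch only: `masked_store16` is a hypothetical name, and it assumes that the selecting "mask bit" is the most significant bit of each mask byte, which is how the underlying instruction is usually documented but is not spelled out in the comment above. The non-temporal hint has no scalar equivalent and is not modelled.

```c
#include <stdint.h>

/* Hypothetical scalar model of a byte-masked store of 16 bytes.
   Assumption: a byte is written when bit 7 of its mask byte is set;
   unselected destination bytes are left untouched. */
static void masked_store16(uint8_t *dst, const uint8_t src[16],
                           const uint8_t mask[16]) {
    for (int i = 0; i < 16; ++i) {
        if (mask[i] & 0x80)
            dst[i] = src[i];
    }
}
```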
@ -3972,8 +4030,10 @@ _mm_storel_epi64(__m128i *__p, __m128i __a)
}
/// \brief Stores a 128-bit floating point vector of [2 x double] to a 128-bit
/// aligned memory location. To minimize caching, the data is flagged as
/// non-temporal (unlikely to be used again soon).
/// aligned memory location.
///
/// To minimize caching, the data is flagged as non-temporal (unlikely to be
/// used again soon).
///
/// \headerfile <x86intrin.h>
///
@ -3990,6 +4050,7 @@ _mm_stream_pd(double *__p, __m128d __a)
}
/// \brief Stores a 128-bit integer vector to a 128-bit aligned memory location.
///
/// To minimize caching, the data is flagged as non-temporal (unlikely to be
/// used again soon).
///
@ -4007,8 +4068,9 @@ _mm_stream_si128(__m128i *__p, __m128i __a)
__builtin_nontemporal_store((__v2di)__a, (__v2di*)__p);
}
/// \brief Stores a 32-bit integer value in the specified memory location. To
/// minimize caching, the data is flagged as non-temporal (unlikely to be
/// \brief Stores a 32-bit integer value in the specified memory location.
///
/// To minimize caching, the data is flagged as non-temporal (unlikely to be
/// used again soon).
///
/// \headerfile <x86intrin.h>
@ -4026,8 +4088,9 @@ _mm_stream_si32(int *__p, int __a)
}
#ifdef __x86_64__
/// \brief Stores a 64-bit integer value in the specified memory location. To
/// minimize caching, the data is flagged as non-temporal (unlikely to be
/// \brief Stores a 64-bit integer value in the specified memory location.
///
/// To minimize caching, the data is flagged as non-temporal (unlikely to be
/// used again soon).
///
/// \headerfile <x86intrin.h>

@ -608,10 +608,11 @@ _mm_subs_pi16(__m64 __m1, __m64 __m2)
/// \brief Subtracts each 8-bit unsigned integer element of the second 64-bit
/// integer vector of [8 x i8] from the corresponding 8-bit unsigned integer
/// element of the first 64-bit integer vector of [8 x i8]. If an element of
/// the first vector is less than the corresponding element of the second
/// vector, the result is saturated to 0. The results are packed into a
/// 64-bit integer vector of [8 x i8].
/// element of the first 64-bit integer vector of [8 x i8].
///
/// If an element of the first vector is less than the corresponding element
/// of the second vector, the result is saturated to 0. The results are
/// packed into a 64-bit integer vector of [8 x i8].
///
/// \headerfile <x86intrin.h>
///
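Saturation to 0 means the unsigned subtraction never wraps around. A one-element scalar sketch, using the hypothetical helper name `subs_u8`:

```c
#include <stdint.h>

/* Hypothetical scalar model of one element of an unsigned saturating
   8-bit subtraction: results that would be negative are clamped to 0. */
static uint8_t subs_u8(uint8_t a, uint8_t b) {
    return a > b ? (uint8_t)(a - b) : 0;
}
```

Compare this with a plain wrapping subtraction, where 3 - 10 on 8-bit unsigned values would yield 249 instead of 0.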
@ -631,10 +632,11 @@ _mm_subs_pu8(__m64 __m1, __m64 __m2)
/// \brief Subtracts each 16-bit unsigned integer element of the second 64-bit
/// integer vector of [4 x i16] from the corresponding 16-bit unsigned
/// integer element of the first 64-bit integer vector of [4 x i16]. If an
/// element of the first vector is less than the corresponding element of the
/// second vector, the result is saturated to 0. The results are packed into
/// a 64-bit integer vector of [4 x i16].
/// integer element of the first 64-bit integer vector of [4 x i16].
///
/// If an element of the first vector is less than the corresponding element
/// of the second vector, the result is saturated to 0. The results are
/// packed into a 64-bit integer vector of [4 x i16].
///
/// \headerfile <x86intrin.h>
///
@ -657,9 +659,11 @@ _mm_subs_pu16(__m64 __m1, __m64 __m2)
/// element of the second 64-bit integer vector of [4 x i16] and get four
/// 32-bit products. Adds adjacent pairs of products to get two 32-bit sums.
/// The lower 32 bits of these two sums are packed into a 64-bit integer
/// vector of [2 x i32]. For example, bits [15:0] of both parameters are
/// multiplied, bits [31:16] of both parameters are multiplied, and the sum
/// of both results is written to bits [31:0] of the result.
/// vector of [2 x i32].
///
/// For example, bits [15:0] of both parameters are multiplied, bits [31:16]
/// of both parameters are multiplied, and the sum of both results is written
/// to bits [31:0] of the result.
///
/// \headerfile <x86intrin.h>
///
@ -851,10 +855,11 @@ _mm_slli_si64(__m64 __m, int __count)
/// \brief Right-shifts each 16-bit integer element of the first parameter,
/// which is a 64-bit integer vector of [4 x i16], by the number of bits
/// specified by the second parameter, which is a 64-bit integer. High-order
/// bits are filled with the sign bit of the initial value of each 16-bit
/// element. The 16-bit results are packed into a 64-bit integer vector of
/// [4 x i16].
/// specified by the second parameter, which is a 64-bit integer.
///
/// High-order bits are filled with the sign bit of the initial value of each
/// 16-bit element. The 16-bit results are packed into a 64-bit integer
/// vector of [4 x i16].
///
/// \headerfile <x86intrin.h>
///
@ -874,6 +879,7 @@ _mm_sra_pi16(__m64 __m, __m64 __count)
/// \brief Right-shifts each 16-bit integer element of a 64-bit integer vector
/// of [4 x i16] by the number of bits specified by a 32-bit integer.
///
/// High-order bits are filled with the sign bit of the initial value of each
/// 16-bit element. The 16-bit results are packed into a 64-bit integer
/// vector of [4 x i16].
@ -896,10 +902,11 @@ _mm_srai_pi16(__m64 __m, int __count)
/// \brief Right-shifts each 32-bit integer element of the first parameter,
/// which is a 64-bit integer vector of [2 x i32], by the number of bits
/// specified by the second parameter, which is a 64-bit integer. High-order
/// bits are filled with the sign bit of the initial value of each 32-bit
/// element. The 32-bit results are packed into a 64-bit integer vector of
/// [2 x i32].
/// specified by the second parameter, which is a 64-bit integer.
///
/// High-order bits are filled with the sign bit of the initial value of each
/// 32-bit element. The 32-bit results are packed into a 64-bit integer
/// vector of [2 x i32].
///
/// \headerfile <x86intrin.h>
///
@ -919,6 +926,7 @@ _mm_sra_pi32(__m64 __m, __m64 __count)
/// \brief Right-shifts each 32-bit integer element of a 64-bit integer vector
/// of [2 x i32] by the number of bits specified by a 32-bit integer.
///
/// High-order bits are filled with the sign bit of the initial value of each
/// 32-bit element. The 32-bit results are packed into a 64-bit integer
/// vector of [2 x i32].
@ -941,9 +949,10 @@ _mm_srai_pi32(__m64 __m, int __count)
/// \brief Right-shifts each 16-bit integer element of the first parameter,
/// which is a 64-bit integer vector of [4 x i16], by the number of bits
/// specified by the second parameter, which is a 64-bit integer. High-order
/// bits are cleared. The 16-bit results are packed into a 64-bit integer
/// vector of [4 x i16].
/// specified by the second parameter, which is a 64-bit integer.
///
/// High-order bits are cleared. The 16-bit results are packed into a 64-bit
/// integer vector of [4 x i16].
///
/// \headerfile <x86intrin.h>
///
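The difference between the two families of right shifts documented above (high-order bits filled with the sign bit versus high-order bits cleared) can be sketched per 16-bit element. The helper names are hypothetical, and the treatment of counts greater than 15 is an assumption taken from the usual description of the underlying instructions, not from the comments above.

```c
#include <stdint.h>

/* Hypothetical scalar model of a logical right shift of one 16-bit element:
   vacated high-order bits are cleared.
   Assumption: counts above 15 produce 0. */
static uint16_t srl_i16(uint16_t v, uint64_t count) {
    return count > 15 ? 0 : (uint16_t)(v >> count);
}

/* Hypothetical scalar model of an arithmetic right shift of one 16-bit
   element: vacated high-order bits are copies of the original sign bit.
   Assumption: counts above 15 behave like a shift by 15. */
static int16_t sra_i16(int16_t v, uint64_t count) {
    if (count > 15)
        count = 15;
    /* Right-shifting a negative value is implementation-defined in C, so
       negative inputs are shifted in complemented form instead. */
    return v < 0 ? (int16_t)~(~(int32_t)v >> count)
                 : (int16_t)(v >> count);
}
```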
@ -963,6 +972,7 @@ _mm_srl_pi16(__m64 __m, __m64 __count)
/// \brief Right-shifts each 16-bit integer element of a 64-bit integer vector
/// of [4 x i16] by the number of bits specified by a 32-bit integer.
///
/// High-order bits are cleared. The 16-bit results are packed into a 64-bit
/// integer vector of [4 x i16].
///
@ -984,9 +994,10 @@ _mm_srli_pi16(__m64 __m, int __count)
/// \brief Right-shifts each 32-bit integer element of the first parameter,
/// which is a 64-bit integer vector of [2 x i32], by the number of bits
/// specified by the second parameter, which is a 64-bit integer. High-order
/// bits are cleared. The 32-bit results are packed into a 64-bit integer
/// vector of [2 x i32].
/// specified by the second parameter, which is a 64-bit integer.
///
/// High-order bits are cleared. The 32-bit results are packed into a 64-bit
/// integer vector of [2 x i32].
///
/// \headerfile <x86intrin.h>
///
@ -1006,6 +1017,7 @@ _mm_srl_pi32(__m64 __m, __m64 __count)
/// \brief Right-shifts each 32-bit integer element of a 64-bit integer vector
/// of [2 x i32] by the number of bits specified by a 32-bit integer.
///
/// High-order bits are cleared. The 32-bit results are packed into a 64-bit
/// integer vector of [2 x i32].
///
@ -1026,8 +1038,9 @@ _mm_srli_pi32(__m64 __m, int __count)
}
/// \brief Right-shifts the first 64-bit integer parameter by the number of bits
/// specified by the second 64-bit integer parameter. High-order bits are
/// cleared.
/// specified by the second 64-bit integer parameter.
///
/// High-order bits are cleared.
///
/// \headerfile <x86intrin.h>
///
@ -1046,7 +1059,9 @@ _mm_srl_si64(__m64 __m, __m64 __count)
/// \brief Right-shifts the first parameter, which is a 64-bit integer, by the
/// number of bits specified by the second parameter, which is a 32-bit
/// integer. High-order bits are cleared.
/// integer.
///
/// High-order bits are cleared.
///
/// \headerfile <x86intrin.h>
///
@ -1140,8 +1155,9 @@ _mm_xor_si64(__m64 __m1, __m64 __m2)
/// \brief Compares the 8-bit integer elements of two 64-bit integer vectors of
/// [8 x i8] to determine if the element of the first vector is equal to the
/// corresponding element of the second vector. The comparison yields 0 for
/// false, 0xFF for true.
/// corresponding element of the second vector.
///
/// The comparison yields 0 for false, 0xFF for true.
///
/// \headerfile <x86intrin.h>
///
@ -1161,8 +1177,9 @@ _mm_cmpeq_pi8(__m64 __m1, __m64 __m2)
/// \brief Compares the 16-bit integer elements of two 64-bit integer vectors of
/// [4 x i16] to determine if the element of the first vector is equal to the
/// corresponding element of the second vector. The comparison yields 0 for
/// false, 0xFFFF for true.
/// corresponding element of the second vector.
///
/// The comparison yields 0 for false, 0xFFFF for true.
///
/// \headerfile <x86intrin.h>
///
@ -1182,8 +1199,9 @@ _mm_cmpeq_pi16(__m64 __m1, __m64 __m2)
/// \brief Compares the 32-bit integer elements of two 64-bit integer vectors of
/// [2 x i32] to determine if the element of the first vector is equal to the
/// corresponding element of the second vector. The comparison yields 0 for
/// false, 0xFFFFFFFF for true.
/// corresponding element of the second vector.
///
/// The comparison yields 0 for false, 0xFFFFFFFF for true.
///
/// \headerfile <x86intrin.h>
///
@ -1203,8 +1221,9 @@ _mm_cmpeq_pi32(__m64 __m1, __m64 __m2)
/// \brief Compares the 8-bit integer elements of two 64-bit integer vectors of
/// [8 x i8] to determine if the element of the first vector is greater than
/// the corresponding element of the second vector. The comparison yields 0
/// for false, 0xFF for true.
/// the corresponding element of the second vector.
///
/// The comparison yields 0 for false, 0xFF for true.
///
/// \headerfile <x86intrin.h>
///
@ -1224,8 +1243,9 @@ _mm_cmpgt_pi8(__m64 __m1, __m64 __m2)
/// \brief Compares the 16-bit integer elements of two 64-bit integer vectors of
/// [4 x i16] to determine if the element of the first vector is greater than
/// the corresponding element of the second vector. The comparison yields 0
/// for false, 0xFFFF for true.
/// the corresponding element of the second vector.
///
/// The comparison yields 0 for false, 0xFFFF for true.
///
/// \headerfile <x86intrin.h>
///
@ -1245,8 +1265,9 @@ _mm_cmpgt_pi16(__m64 __m1, __m64 __m2)
/// \brief Compares the 32-bit integer elements of two 64-bit integer vectors of
/// [2 x i32] to determine if the element of the first vector is greater than
/// the corresponding element of the second vector. The comparison yields 0
/// for false, 0xFFFFFFFF for true.
/// the corresponding element of the second vector.
///
/// The comparison yields 0 for false, 0xFFFFFFFF for true.
///
/// \headerfile <x86intrin.h>
///

@ -31,9 +31,11 @@
__attribute__((__always_inline__, __nodebug__, __target__("sse3")))
/// \brief Loads data from an unaligned memory location to elements in a 128-bit
/// vector. If the address of the data is not 16-byte aligned, the
/// instruction may read two adjacent aligned blocks of memory to retrieve
/// the requested data.
/// vector.
///
/// If the address of the data is not 16-byte aligned, the instruction may
/// read two adjacent aligned blocks of memory to retrieve the requested
/// data.
///
/// \headerfile <x86intrin.h>
///

@ -50,8 +50,10 @@ _m_prefetch(void *__P)
/// the L1 data cache and sets the cache-coherency to modified. This
/// provides a hint to the processor that the cache line will be modified.
/// It is intended for use when the cache line will be written to shortly
/// after the prefetch is performed. Note that the effect of this intrinsic
/// is dependent on the processor implementation.
/// after the prefetch is performed.
///
/// Note that the effect of this intrinsic is dependent on the processor
/// implementation.
///
/// \headerfile <x86intrin.h>
///

@ -586,7 +586,9 @@ _mm_mul_epi32 (__m128i __V1, __m128i __V2)
/* SSE4 Floating Point Dot Product Instructions. */
/// \brief Computes the dot product of the two 128-bit vectors of [4 x float]
/// and returns it in the elements of the 128-bit result vector of
/// [4 x float]. The immediate integer operand controls which input elements
/// [4 x float].
///
/// The immediate integer operand controls which input elements
/// will contribute to the dot product, and where the final results are
/// returned.
///
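How the immediate operand steers the dot product can be sketched in scalar C. This is an illustrative model, not the header's implementation: `dp_ps_model` is a hypothetical name, and the bit assignment (bits [7:4] select which products contribute, bits [3:0] select which result elements receive the sum) is an assumption based on the usual description of the instruction, since the comment above does not spell it out.

```c
/* Hypothetical scalar model of a [4 x float] dot product controlled by an
   8-bit immediate.
   Assumption: bit 4+i of imm enables the product a[i] * b[i];
               bit i of imm writes the sum to out[i], otherwise out[i] is 0. */
static void dp_ps_model(const float a[4], const float b[4], unsigned imm,
                        float out[4]) {
    float p[4];
    for (int i = 0; i < 4; ++i)
        p[i] = (imm & (0x10u << i)) ? a[i] * b[i] : 0.0f;

    /* Pairs are summed first, then the two partial sums are added. */
    float sum = (p[0] + p[1]) + (p[2] + p[3]);

    for (int i = 0; i < 4; ++i)
        out[i] = (imm & (1u << i)) ? sum : 0.0f;
}
```

Under these assumptions, an immediate of 0xF1 computes the full four-element dot product and places it in the lowest result element only.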
@ -620,7 +622,9 @@ _mm_mul_epi32 (__m128i __V1, __m128i __V2)
/// \brief Computes the dot product of the two 128-bit vectors of [2 x double]
/// and returns it in the elements of the 128-bit result vector of
/// [2 x double]. The immediate integer operand controls which input
/// [2 x double].
///
/// The immediate integer operand controls which input
/// elements will contribute to the dot product, and where the final results
/// are returned.
///
@ -875,7 +879,7 @@ _mm_max_epu32 (__m128i __V1, __m128i __V2)
/// int _mm_extract_ps(__m128 X, const int N);
/// \endcode
///
/// This intrinsic corresponds to the <c> VEXTRACTPS / EXTRACTPS </c>
/// This intrinsic corresponds to the <c> VEXTRACTPS / EXTRACTPS </c>
/// instruction.
///
/// \param X

@ -469,10 +469,11 @@ _mm_hsubs_pi16(__m64 __a, __m64 __b)
/// values contained in the first source operand and packed 8-bit signed
/// integer values contained in the second source operand, adds pairs of
/// contiguous products with signed saturation, and writes the 16-bit sums to
/// the corresponding bits in the destination. For example, bits [7:0] of
/// both operands are multiplied, bits [15:8] of both operands are
/// multiplied, and the sum of both results is written to bits [15:0] of the
/// destination.
/// the corresponding bits in the destination.
///
/// For example, bits [7:0] of both operands are multiplied, bits [15:8] of
/// both operands are multiplied, and the sum of both results is written to
/// bits [15:0] of the destination.
///
/// \headerfile <x86intrin.h>
///
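The "signed saturation" in the description above matters because two byte products can exceed the 16-bit range. A scalar sketch of one 16-bit result follows; `maddubs_pair` is a hypothetical helper, and treating the first operand's bytes as unsigned is an assumption taken from the intrinsic's name, since that part of the comment falls outside the lines shown.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit result element.
   Assumption: bytes of the first operand are unsigned, bytes of the second
   operand are signed. The two products are added and the sum is clamped to
   the signed 16-bit range. */
static int16_t maddubs_pair(uint8_t a0, uint8_t a1, int8_t b0, int8_t b1) {
    int32_t sum = (int32_t)a0 * b0    /* product of bits [7:0]  */
                + (int32_t)a1 * b1;   /* product of bits [15:8] */
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;              /* written to bits [15:0] */
}
```

For instance, 255 * 127 + 255 * 127 = 64770 does not fit in 16 signed bits, so the model returns 32767.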
@ -502,10 +503,11 @@ _mm_maddubs_epi16(__m128i __a, __m128i __b)
/// values contained in the first source operand and packed 8-bit signed
/// integer values contained in the second source operand, adds pairs of
/// contiguous products with signed saturation, and writes the 16-bit sums to
/// the corresponding bits in the destination. For example, bits [7:0] of
/// both operands are multiplied, bits [15:8] of both operands are
/// multiplied, and the sum of both results is written to bits [15:0] of the
/// destination.
/// the corresponding bits in the destination.
///
/// For example, bits [7:0] of both operands are multiplied, bits [15:8] of
/// both operands are multiplied, and the sum of both results is written to
/// bits [15:0] of the destination.
///
/// \headerfile <x86intrin.h>
///
@ -619,13 +621,14 @@ _mm_shuffle_pi8(__m64 __a, __m64 __b)
}
/// \brief For each 8-bit integer in the first source operand, perform one of
/// the following actions as specified by the second source operand: If the
/// byte in the second source is negative, calculate the two's complement of
/// the corresponding byte in the first source, and write that value to the
/// destination. If the byte in the second source is positive, copy the
/// corresponding byte from the first source to the destination. If the byte
/// in the second source is zero, clear the corresponding byte in the
/// destination.
/// the following actions as specified by the second source operand.
///
/// If the byte in the second source is negative, calculate the two's
/// complement of the corresponding byte in the first source, and write that
/// value to the destination. If the byte in the second source is positive,
/// copy the corresponding byte from the first source to the destination. If
/// the byte in the second source is zero, clear the corresponding byte in
/// the destination.
///
/// \headerfile <x86intrin.h>
///
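The three cases listed above (negate, copy, clear) map directly onto a small scalar function. This is a sketch with a hypothetical name, `sign_i8`, modelling a single 8-bit element:

```c
#include <stdint.h>

/* Hypothetical scalar model of one 8-bit element: the sign of b decides
   what happens to a. */
static int8_t sign_i8(int8_t a, int8_t b) {
    if (b < 0)
        /* Two's complement negation, done in unsigned arithmetic so that
           a == -128 wraps back to -128 instead of overflowing. */
        return (int8_t)(0u - (uint8_t)a);
    if (b > 0)
        return a;   /* copy unchanged */
    return 0;       /* b == 0: clear the element */
}
```

Note the edge case: negating -128 has no positive counterpart in 8 bits, so the two's complement of -128 is again -128.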
@ -644,13 +647,14 @@ _mm_sign_epi8(__m128i __a, __m128i __b)
}
/// \brief For each 16-bit integer in the first source operand, perform one of
/// the following actions as specified by the second source operand: If the
/// word in the second source is negative, calculate the two's complement of
/// the corresponding word in the first source, and write that value to the
/// destination. If the word in the second source is positive, copy the
/// corresponding word from the first source to the destination. If the word
/// in the second source is zero, clear the corresponding word in the
/// destination.
/// the following actions as specified by the second source operand.
///
/// If the word in the second source is negative, calculate the two's
/// complement of the corresponding word in the first source, and write that
/// value to the destination. If the word in the second source is positive,
/// copy the corresponding word from the first source to the destination. If
/// the word in the second source is zero, clear the corresponding word in
/// the destination.
///
/// \headerfile <x86intrin.h>
///
@ -669,8 +673,9 @@ _mm_sign_epi16(__m128i __a, __m128i __b)
}
/// \brief For each 32-bit integer in the first source operand, perform one of
/// the following actions as specified by the second source operand: If the
/// doubleword in the second source is negative, calculate the two's
/// the following actions as specified by the second source operand.
///
/// If the doubleword in the second source is negative, calculate the two's
///    complement of the corresponding doubleword in the first source, and
///    write that value to the destination. If the doubleword in the second
///    source is positive, copy the corresponding doubleword from the first
@ -694,13 +699,14 @@ _mm_sign_epi32(__m128i __a, __m128i __b)
}
/// \brief For each 8-bit integer in the first source operand, perform one of
/// the following actions as specified by the second source operand: If the
/// byte in the second source is negative, calculate the two's complement of
/// the corresponding byte in the first source, and write that value to the
/// destination. If the byte in the second source is positive, copy the
/// corresponding byte from the first source to the destination. If the byte
/// in the second source is zero, clear the corresponding byte in the
/// destination.
/// the following actions as specified by the second source operand.
///
/// If the byte in the second source is negative, calculate the two's
/// complement of the corresponding byte in the first source, and write that
/// value to the destination. If the byte in the second source is positive,
/// copy the corresponding byte from the first source to the destination. If
/// the byte in the second source is zero, clear the corresponding byte in
/// the destination.
///
/// \headerfile <x86intrin.h>
///
@ -719,13 +725,14 @@ _mm_sign_pi8(__m64 __a, __m64 __b)
}
/// \brief For each 16-bit integer in the first source operand, perform one of
/// the following actions as specified by the second source operand: If the
/// word in the second source is negative, calculate the two's complement of
/// the corresponding word in the first source, and write that value to the
/// destination. If the word in the second source is positive, copy the
/// corresponding word from the first source to the destination. If the word
/// in the second source is zero, clear the corresponding word in the
/// destination.
/// the following actions as specified by the second source operand.
///
/// If the word in the second source is negative, calculate the two's
/// complement of the corresponding word in the first source, and write that
/// value to the destination. If the word in the second source is positive,
/// copy the corresponding word from the first source to the destination. If
/// the word in the second source is zero, clear the corresponding word in
/// the destination.
///
/// \headerfile <x86intrin.h>
///
@ -744,8 +751,9 @@ _mm_sign_pi16(__m64 __a, __m64 __b)
}
/// \brief For each 32-bit integer in the first source operand, perform one of
/// the following actions as specified by the second source operand: If the
/// doubleword in the second source is negative, calculate the two's
/// the following actions as specified by the second source operand.
///
/// If the doubleword in the second source is negative, calculate the two's
/// complement of the corresponding doubleword in the first source, and
/// write that value to the destination. If the doubleword in the second
/// source is positive, copy the corresponding doubleword from the first


@ -2331,8 +2331,10 @@ _mm_mulhi_pu16(__m64 __a, __m64 __b)
/// \brief Conditionally copies the values from each 8-bit element in the first
/// 64-bit integer vector operand to the specified memory location, as
/// specified by the most significant bit in the corresponding element in the
/// second 64-bit integer vector operand. To minimize caching, the data is
/// flagged as non-temporal (unlikely to be used again soon).
/// second 64-bit integer vector operand.
///
/// To minimize caching, the data is flagged as non-temporal
/// (unlikely to be used again soon).
///
/// \headerfile <x86intrin.h>
///
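The conditional copy described above can be sketched in scalar C as follows. This is an illustrative model only: `masked_store_model` and its check function are hypothetical names, and the non-temporal cache hint has no scalar equivalent, so it is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the conditional byte store; not part of the
 * header. A byte of src is written to dst only when the most significant
 * bit of the matching mask byte is set. The non-temporal hint is ignored. */
static void masked_store_model(const uint8_t src[8], const uint8_t mask[8],
                               uint8_t *dst) {
    for (int i = 0; i < 8; ++i) {
        if (mask[i] & 0x80u)
            dst[i] = src[i];
    }
}

/* Small self-check: only positions 0 and 7 have the mask bit set. */
static void masked_store_model_check(void) {
    const uint8_t src[8]  = { 1, 2, 3, 4, 5, 6, 7, 8 };
    const uint8_t mask[8] = { 0x80, 0x7F, 0, 0, 0, 0, 0x01, 0xFF };
    uint8_t dst[8] = { 9, 9, 9, 9, 9, 9, 9, 9 };

    masked_store_model(src, mask, dst);
    assert(dst[0] == 1); /* MSB set: stored */
    assert(dst[1] == 9); /* 0x7F has a clear MSB: untouched */
    assert(dst[6] == 9);
    assert(dst[7] == 8);
}
```

Bytes whose mask bit is clear leave the destination memory untouched rather than writing zero.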
@ -2815,11 +2817,12 @@ _mm_cvtpi32x2_ps(__m64 __a, __m64 __b)
/// \brief Converts each single-precision floating-point element of a 128-bit
/// floating-point vector of [4 x float] into a 16-bit signed integer, and
/// packs the results into a 64-bit integer vector of [4 x i16]. If the
/// floating-point element is NaN or infinity, or if the floating-point
/// element is greater than 0x7FFFFFFF or less than -0x8000, it is converted
/// to 0x8000. Otherwise if the floating-point element is greater than
/// 0x7FFF, it is converted to 0x7FFF.
/// packs the results into a 64-bit integer vector of [4 x i16].
///
/// If the floating-point element is NaN or infinity, or if the
/// floating-point element is greater than 0x7FFFFFFF or less than -0x8000,
/// it is converted to 0x8000. Otherwise if the floating-point element is
/// greater than 0x7FFF, it is converted to 0x7FFF.
///
/// \headerfile <x86intrin.h>
///
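A per-element sketch of the conversion rules stated above may help when reading them. It is a scalar model under assumptions, not the header's implementation: `cvt_float_to_i16_model` is a hypothetical name, and rounding is assumed to be round-to-nearest-even (the default rounding mode), implemented by hand so that no math library is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the documented float to i16 conversion for
 * one element; not part of the header. Assumes round-to-nearest-even. */
static int16_t cvt_float_to_i16_model(float f) {
    /* NaN (f != f), infinity, or a value outside the 32-bit integer range
     * is converted to 0x8000. */
    if (f != f || f >= 2147483648.0f || f < -2147483648.0f)
        return INT16_MIN;

    int32_t t = (int32_t)f;    /* truncate toward zero */
    float frac = f - (float)t; /* exact remainder in (-1, 1) */

    /* Round to nearest, with ties going to the even integer. */
    if (frac > 0.5f || (frac == 0.5f && (t & 1)))
        t += 1;
    else if (frac < -0.5f || (frac == -0.5f && (t & 1)))
        t -= 1;

    /* Saturate to the 16-bit signed range. */
    if (t > INT16_MAX)
        return INT16_MAX;      /* 0x7FFF */
    if (t < INT16_MIN)
        return INT16_MIN;      /* 0x8000 */
    return (int16_t)t;
}

/* Small self-check of rounding, saturation, and the special cases. */
static void cvt_float_to_i16_model_check(void) {
    volatile float zero = 0.0f;
    float nan_value = zero / zero;

    assert(cvt_float_to_i16_model(1.5f) == 2);
    assert(cvt_float_to_i16_model(2.5f) == 2);
    assert(cvt_float_to_i16_model(-1.5f) == -2);
    assert(cvt_float_to_i16_model(40000.0f) == INT16_MAX);
    assert(cvt_float_to_i16_model(-40000.0f) == INT16_MIN);
    assert(cvt_float_to_i16_model(3.0e9f) == INT16_MIN);
    assert(cvt_float_to_i16_model(nan_value) == INT16_MIN);
}
```

The 8-bit conversion described in the next comment follows the same shape with the limits 0x7F and 0x80.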
@ -2845,11 +2848,12 @@ _mm_cvtps_pi16(__m128 __a)
/// \brief Converts each single-precision floating-point element of a 128-bit
/// floating-point vector of [4 x float] into an 8-bit signed integer, and
/// packs the results into the lower 32 bits of a 64-bit integer vector of
/// [8 x i8]. The upper 32 bits of the vector are set to 0. If the
/// floating-point element is NaN or infinity, or if the floating-point
/// element is greater than 0x7FFFFFFF or less than -0x80, it is converted
/// to 0x80. Otherwise if the floating-point element is greater than 0x7F,
/// it is converted to 0x7F.
/// [8 x i8]. The upper 32 bits of the vector are set to 0.
///
/// If the floating-point element is NaN or infinity, or if the
/// floating-point element is greater than 0x7FFFFFFF or less than -0x80, it
/// is converted to 0x80. Otherwise if the floating-point element is greater
/// than 0x7F, it is converted to 0x7F.
///
/// \headerfile <x86intrin.h>
///