C++ - An optimal select function for vector extensions?
OpenCL has a select function, usable with all-vector arguments. Both clang and gcc support vector types well, and gcc supports the ternary operator with vector operands, but neither of them has an OpenCL-like select function yet. I've tried to implement a replacement, but the results are non-optimal, with both gcc and clang producing conditional jumps. However, gcc does a pretty good job with the ternary operator, which serves as a drop-in replacement. Does an optimal solution for a select-like function exist (particularly under clang), and what is it? Here are some non-optimal ideas:
    template <typename u, typename v>
    inline v select(u const s, v const a, v const b) noexcept
    {
      // a loop gives horrible results
    /*
      constexpr u zero{};

      // clang
      return (-(s != zero) * a) - ((s == zero) * b);
    */
      return v{
        s[0] ? a[0] : b[0],
        s[1] ? a[1] : b[1],
        s[2] ? a[2] : b[2],
        s[3] ? a[3] : b[3]
      }; // 4-component vectors only; one can generalize with the indices trick
    /*
      return s ? a : b; // gcc
    */
    }
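The "indices trick" mentioned in the comment can be sketched as follows. This is only an illustration, assuming C++14 and the gcc/clang `vector_size` attribute; the names `select_n`, `select_impl`, `int4` and `float4` are hypothetical and not part of the code above. It removes the 4-component restriction but is still a per-lane select, so it is not expected to generate better code than the hand-written version.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical 4-lane vector types built with the gcc/clang vector extension.
typedef int   int4   __attribute__((vector_size(16)));
typedef float float4 __attribute__((vector_size(16)));

// Expands to V{ s[0] ? a[0] : b[0], s[1] ? a[1] : b[1], ... } for every index in I.
template <typename U, typename V, std::size_t... I>
inline V select_impl(U const s, V const a, V const b,
                     std::index_sequence<I...>) noexcept
{
  return V{ (s[I] ? a[I] : b[I])... };
}

// Lane count is derived from the vector size and the element size
// (sizeof is unevaluated, so this is a constant expression).
template <typename U, typename V>
inline V select_n(U const s, V const a, V const b) noexcept
{
  return select_impl(s, a, b,
                     std::make_index_sequence<sizeof(V) / sizeof(a[0])>{});
}
```

With this sketch, `select_n(s, a, b)` picks `a[i]` in every lane where `s[i]` is non-zero and `b[i]` elsewhere, for any lane count.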
The following compiles to semi-good code under both gcc and clang:
    template <typename u, typename v>
    constexpr inline v select(u const s, v const a, v const b) noexcept
    {
      return v((s & u(a)) | (~s & u(b)));
    }
Asm listing:
    Dump of assembler code for function select<int __vector(4), float __vector(4)>(int __vector(4), float __vector(4), float __vector(4)):
       0x00000000004009c0 <+0>:     andps  %xmm0,%xmm1
       0x00000000004009c3 <+3>:     andnps %xmm2,%xmm0
       0x00000000004009c6 <+6>:     orps   %xmm1,%xmm0
       0x00000000004009c9 <+9>:     retq
    End of assembler dump.
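This is still not as good as what the optimizer can produce. It works by abusing vector casts and the fact that comparison operations produce -1 (all ones) integers in the lanes where the comparison matches, so the mask can be applied directly with bitwise AND, AND-NOT and OR. At least no arithmetic operations are necessary.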
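As a small illustration of the mask trick described above, the sketch below feeds the result of a vector comparison straight into a bitwise select. It assumes the gcc/clang `vector_size` extension, where a C-style cast between same-sized vector types reinterprets the bits; the names `bit_select`, `lane_min`, `int4` and `float4` are hypothetical and chosen only for this example.

```cpp
// Hypothetical 4-lane vector types built with the gcc/clang vector extension.
typedef int   int4   __attribute__((vector_size(16)));
typedef float float4 __attribute__((vector_size(16)));

// Bitwise select: lanes where mask is all ones (-1) take a,
// lanes where mask is zero take b. The casts reinterpret bits,
// they do not convert float values to int.
inline float4 bit_select(int4 const mask, float4 const a, float4 const b) noexcept
{
  return (float4)((mask & (int4)a) | (~mask & (int4)b));
}

// Usage example: lane-wise minimum.
// a < b yields -1 in lanes where the comparison holds and 0 elsewhere,
// which is exactly the mask shape bit_select expects.
inline float4 lane_min(float4 const a, float4 const b) noexcept
{
  int4 const mask = (int4)(a < b);
  return bit_select(mask, a, b);
}
```

The approach only works when the mask lanes are exactly 0 or -1, as comparison results are; an arbitrary non-zero integer in a lane would blend bits from both inputs rather than select one of them.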