This patch fixes a classic "confuse the enemy" bug.
_mm_andnot_si128 (SSE) and vec_andc (VMX) perform the same operation, but
take their arguments in the opposite order:
_mm_andnot_si128 performs "r = (~a) & b", while
vec_andc performs "r = a & (~b)".
To make sure this error doesn't resurface elsewhere, I added a wrapper
function, vec_andnot_si128, in u_pwr8.h, which swaps the arguments internally.
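
For illustration only, here is a scalar sketch of the mismatch on a single
32-bit lane. It is not part of the patch: the *_model names are hypothetical
stand-ins that only mirror the semantics described above, and no real SSE or
VMX intrinsics are called.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm_andnot_si128: r = (~a) & b */
static inline uint32_t
sse_andnot_model(uint32_t a, uint32_t b)
{
   return ~a & b;
}

/* Hypothetical scalar model of vec_andc: r = a & (~b) */
static inline uint32_t
vmx_andc_model(uint32_t a, uint32_t b)
{
   return a & ~b;
}

/* Same idea as the vec_andnot_si128 wrapper: swap the arguments so the
 * andc-style primitive yields the SSE-style result. */
static inline uint32_t
andnot_via_andc_model(uint32_t a, uint32_t b)
{
   return vmx_andc_model(b, a);
}
```

With a = 0xFFFF0000 and b = 0x00FFFF00, the SSE-style model and the swapped
wrapper both give 0x0000FF00, while calling the andc-style model with the
arguments unswapped gives 0xFF000000.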
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
return v;
}
+static inline __m128i
+vec_andnot_si128 (__m128i a, __m128i b)
+{
+ return vec_andc (b, a);
+}
+
static inline void
transpose4_epi32(const __m128i * restrict a,
const __m128i * restrict b,
/* Calculate trivial reject values:
*/
- eo = vec_sub_epi32(vec_andc(dcdy_neg_mask, dcdy),
+ eo = vec_sub_epi32(vec_andnot_si128(dcdy_neg_mask, dcdy),
vec_and(dcdx_neg_mask, dcdx));
/* ei = _mm_sub_epi32(_mm_sub_epi32(dcdy, dcdx), eo); */