In the old code, we would generate the exact same instruction for
extract_u8(some_u64, 0) and extract_u8(some_u64, 1). The mask-a-word
trick only works for even numbered bytes.
This fixes the (new) piglit test
tests/spec/arb_gpu_shader_int64/execution/fs-ushr-and-mask.shader_test.
v2: Use a SHR instead of an AND. This saves an instruction compared to
using two moves. Suggested by Jason.
Fixes: 6ac2d169019 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
fs_reg w_temp = bld.vgrf(BRW_REGISTER_TYPE_W);
bld.MOV(w_temp, subscript(op[0], type, byte));
bld.MOV(result, w_temp);
+ } else if (byte & 1) {
+ /* Extract the high byte from the word containing the desired byte
+ * offset.
+ */
+ bld.SHR(result,
+ subscript(op[0], BRW_REGISTER_TYPE_UW, byte / 2),
+ brw_imm_uw(8));
} else {
/* Otherwise use an AND with 0xff and a word type */
bld.AND(result,