Broadwell removed the F32TO16 and F16TO32 instructions. However, it has
native support for HF (half-float) values, so both conversions are just
MOVs with an HF-typed destination or source.
Fixes vs-packHalf2x16 and vs-unpackHalf2x16 tests (both the ARB
extension and ES 3.0 variants).
v2: Emulate Gen7 F32TO16's align16 zeroing bug by writing 0 to the
UD-typed destination before the converting MOV, since Chad's front-end
code relies on it. This can probably be refactored later.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
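For readers unfamiliar with HF, here is a rough CPU-side model of what a MOV from an HF-typed source to a float destination is expected to compute, and how that relates to unpackHalf2x16. This is only a sketch under stated assumptions: the helper names half_bits_to_float and unpack_half_2x16 are hypothetical and are not part of the driver, and the model follows the IEEE 754 binary16 layout and the GLSL definition of unpackHalf2x16 rather than anything taken from the hardware documentation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen IEEE 754 binary16 bits to a binary32 float.
 * Every half value is exactly representable as a float, so no rounding
 * is needed in this direction. */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = ((uint32_t)h & 0x8000u) << 16;
    uint32_t exp  = ((uint32_t)h >> 10) & 0x1fu;
    uint32_t man  = (uint32_t)h & 0x3ffu;
    uint32_t bits;
    float f;

    if (exp == 0x1fu) {
        /* Inf or NaN: all-ones exponent, keep the mantissa payload. */
        bits = sign | 0x7f800000u | (man << 13);
    } else if (exp != 0u) {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man != 0u) {
        /* Subnormal half: normalize it, since it is a normal float. */
        uint32_t e = 113u;
        while ((man & 0x400u) == 0u) {
            man <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((man & 0x3ffu) << 13);
    } else {
        /* Signed zero. */
        bits = sign;
    }

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper modelling GLSL unpackHalf2x16: the low 16 bits
 * give the first component, the high 16 bits give the second. */
static void unpack_half_2x16(uint32_t v, float out[2])
{
    assert(out != NULL);
    out[0] = half_bits_to_float((uint16_t)(v & 0xffffu));
    out[1] = half_bits_to_float((uint16_t)(v >> 16));
}
```

For example, under this model unpack_half_2x16(0xc0003c00u, out) yields out[0] == 1.0f and out[1] == -2.0f.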
break;
case BRW_OPCODE_F32TO16:
- F32TO16(dst, src[0]);
+ MOV(retype(dst, BRW_REGISTER_TYPE_HF), src[0]);
break;
case BRW_OPCODE_F16TO32:
- F16TO32(dst, src[0]);
+ MOV(dst, retype(src[0], BRW_REGISTER_TYPE_HF));
break;
case BRW_OPCODE_CMP:
ALU3(BFE)
ALU2(BFI1)
ALU3(BFI2)
-ALU1(F32TO16)
-ALU1(F16TO32)
ALU1(BFREV)
ALU1(CBIT)
ALU2_ACCUMULATE(ADDC)
break;
case BRW_OPCODE_F32TO16:
- F32TO16(dst, src[0]);
+ /* Emulate the Gen7 zeroing bug. */
+ MOV(retype(dst, BRW_REGISTER_TYPE_UD), brw_imm_ud(0u));
+ MOV(retype(dst, BRW_REGISTER_TYPE_HF), src[0]);
break;
case BRW_OPCODE_F16TO32:
- F16TO32(dst, src[0]);
+ MOV(dst, retype(src[0], BRW_REGISTER_TYPE_HF));
break;
case BRW_OPCODE_LRP:
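
The two-MOV sequence in the last hunk above can be modelled on the CPU roughly as follows. This is a sketch, not driver code: float_to_half_bits and emulate_f32to16 are hypothetical names, the conversion assumes round-to-nearest-even, and the model assumes that the HF-typed MOV writes only the low 16 bits of each 32-bit channel, which is why the preceding UD MOV of 0 is what clears the high half.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: narrow a binary32 float to IEEE 754 binary16
 * bits, assuming round-to-nearest-even. */
static uint16_t float_to_half_bits(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);

    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp  = (x >> 23) & 0xffu;
    uint32_t man  = x & 0x7fffffu;

    if (exp == 0xffu)                       /* Inf, or a quiet NaN */
        return (uint16_t)(sign | (man ? 0x7e00u : 0x7c00u));

    int32_t e = (int32_t)exp - 127 + 15;    /* rebias 127 -> 15 */

    if (e >= 31)                            /* too large: infinity */
        return (uint16_t)(sign | 0x7c00u);

    if (e <= 0) {                           /* half subnormal or zero */
        if (e < -10)                        /* below half the smallest */
            return (uint16_t)sign;          /* subnormal: signed zero  */
        man |= 0x800000u;                   /* make implicit 1 explicit */
        uint32_t shift   = (uint32_t)(14 - e);      /* 14..24 */
        uint32_t hman    = man >> shift;
        uint32_t rem     = man & ((1u << shift) - 1u);
        uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (hman & 1u)))
            hman++;                         /* may carry into exponent */
        return (uint16_t)(sign | hman);
    }

    uint32_t half = ((uint32_t)e << 10) | (man >> 13);
    uint32_t rem  = man & 0x1fffu;
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
        half++;                             /* may round up to infinity */
    return (uint16_t)(sign | half);
}

/* Model of one 32-bit destination channel under the two-MOV sequence. */
static void emulate_f32to16(uint32_t *dst, float src)
{
    assert(dst != NULL);
    /* MOV(retype(dst, UD), 0u): clear the whole dword first. */
    *dst = 0u;
    /* MOV(retype(dst, HF), src): assumed to touch only the low word. */
    *dst = (*dst & 0xffff0000u) | float_to_half_bits(src);
}
```

Starting from a destination holding 0xdeadbeef, emulate_f32to16(&d, 1.0f) leaves d == 0x00003c00 in this model. Without the first MOV the stale high word would survive, which is the behaviour the front-end code cannot tolerate.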