From: Alejandro Piñeiro <apinheiro@igalia.com>
Date: Sat, 1 Jul 2017 06:11:05 +0000 (+0200)
Subject: i965/fs: Handle 32-bit to 16-bit conversions
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=5d5ee507fb4a385f98ba19bd901ce4e3aca7def4;p=mesa.git

i965/fs: Handle 32-bit to 16-bit conversions

Conversions to 16-bit require alignment between the 16-bit
and 32-bit types. So the conversion operations unpack the 16-bit types
with stride=2 and then apply a MOV with the conversion.

v2 (Jason Ekstrand):
  - Avoid the general use of stride=2 for 16-bit register types.

v3 (Topi Pohjolainen):
  - Code style fix
   (Jason Ekstrand)
  - nir_op_f2f16 is now named nir_op_f2f16_undef, because conversion
    to f16 with undefined rounding is explicit

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
---
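Note: the two MOVs in the hunk below can be pictured with a small
standalone model. This is only a sketch under stated assumptions, not
Mesa code: convert_u32_to_u16_two_step is a hypothetical helper, registers
are modelled as plain vectors, and only the integer case (u2u16,
truncation) is shown. The first loop plays the role of the converting MOV
into the 32-bit temporary viewed through subscript(tmp, result.type, 0),
which places each 16-bit value at a stride of 2. The second loop plays the
role of the same-type MOV into the packed result.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Mesa): models the two-MOV sequence
// for a 32-bit to 16-bit unsigned conversion, one entry per channel.
std::vector<uint16_t>
convert_u32_to_u16_two_step(const std::vector<uint32_t> &src)
{
   const std::size_t n = src.size();

   // Temporary with one 32-bit slot per channel, viewed as 16-bit
   // elements. Element 2 * ch is the low half of slot ch.
   std::vector<uint16_t> tmp(2 * n, 0);

   // First MOV: converting write, destination stride of 2 in 16-bit
   // units, so each result stays aligned with its 32-bit source channel.
   for (std::size_t ch = 0; ch < n; ch++)
      tmp[2 * ch] = static_cast<uint16_t>(src[ch]);

   // Second MOV: same-type copy from the strided temporary into the
   // tightly packed 16-bit destination.
   std::vector<uint16_t> dst(n);
   for (std::size_t ch = 0; ch < n; ch++)
      dst[ch] = tmp[2 * ch];

   return dst;
}
```

In this model the upper half of every 32-bit slot in the temporary is
simply unused, which is the cost of keeping source and destination
channels aligned during the converting write.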

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index bed1cd3b492..ddc0c6d105e 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -724,6 +724,31 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
       inst->saturate = instr->dest.saturate;
       break;
 
+      /* In theory, it would be better to use BRW_OPCODE_F32TO16. Depending
+       * on the HW gen, it is a special hw opcode or just a MOV, and
+       * brw_F32TO16 (at brw_eu_emit) would do the work to choose.
+       *
+       * But if we want to use that opcode, we need to provide support on
+       * different optimizations and lowerings. As HF support is currently
+       * only for gen8+, it is better to use the MOV directly, and to use
+       * BRW_OPCODE_F32TO16 when/if we work on HF support for gen7.
+       */
+
+   case nir_op_f2f16_undef:
+   case nir_op_i2i16:
+   case nir_op_u2u16: {
+      /* TODO: Fixing alignment rules for conversions from 32-bit to
+       * 16-bit types should be moved to lower_conversions
+       */
+      fs_reg tmp = bld.vgrf(op[0].type, 1);
+      tmp = subscript(tmp, result.type, 0);
+      inst = bld.MOV(tmp, op[0]);
+      inst->saturate = instr->dest.saturate;
+      inst = bld.MOV(result, tmp);
+      inst->saturate = instr->dest.saturate;
+      break;
+   }
+
    case nir_op_f2f64:
    case nir_op_f2i64:
    case nir_op_f2u64: