i965/vec4: Fix DCE for VEC4_OPCODE_SET_{LOW,HIGH}_32BIT
author Iago Toral Quiroga <itoral@igalia.com>
Wed, 24 Aug 2016 09:21:57 +0000 (11:21 +0200)
committer Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tue, 3 Jan 2017 10:26:50 +0000 (11:26 +0100)
These align1 opcodes do partial writes of 64-bit data. We want to use a pair
of them on the same register to implement packDouble2x32, but from the point
of view of DCE both opcodes write to the same register, so it assumes that
only the last write matters and eliminates the first. That is not correct,
so prevent it from happening.
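
The failure mode described above can be reproduced in a toy model. The sketch below is not the i965 code: ToyOpcode, ToyInst, toy_dce and toy_pack_double_2x32 are hypothetical names, and liveness is tracked per whole register rather than per channel and per written register as the real pass does. It only illustrates why a backward liveness scan must not clear liveness at an align1 partial write.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opcodes for this toy model; not the real i965 opcode enum.
enum ToyOpcode { TOY_MOV, TOY_SET_LOW_32BIT, TOY_SET_HIGH_32BIT, TOY_USE };

struct ToyInst {
   ToyOpcode opcode;
   int dst;   // destination register index, or -1 if there is none
   int src;   // source register index, or -1 if there is none

   // Mirrors the idea of the helper added by this patch.
   bool is_align1_partial_write() const
   {
      return opcode == TOY_SET_LOW_32BIT || opcode == TOY_SET_HIGH_32BIT;
   }
};

// Backward liveness scan. Returns one flag per instruction: true means kept.
// With honor_partial_writes == false, every write is treated as a full
// overwrite, which is the behavior this patch fixes.
std::vector<bool> toy_dce(const std::vector<ToyInst> &prog, int num_regs,
                          bool honor_partial_writes)
{
   std::vector<bool> live(num_regs, false);
   std::vector<bool> kept(prog.size(), true);

   for (std::size_t i = prog.size(); i-- > 0;) {
      const ToyInst &inst = prog[i];

      if (inst.dst >= 0) {
         if (!live[inst.dst]) {
            // Nothing later reads this result: the instruction is dead.
            kept[i] = false;
            continue;
         }
         // A full write kills liveness of the destination above this point.
         // A partial write must not, because the other half is still needed.
         const bool partial =
            honor_partial_writes && inst.is_align1_partial_write();
         if (!partial)
            live[inst.dst] = false;
      }

      if (inst.src >= 0)
         live[inst.src] = true;
   }
   return kept;
}

// Toy packDouble2x32: r2 is assembled from r0 (low half) and r1 (high half).
std::vector<ToyInst> toy_pack_double_2x32()
{
   return {
      {TOY_MOV, 0, -1},            // r0 = low 32 bits
      {TOY_MOV, 1, -1},            // r1 = high 32 bits
      {TOY_SET_LOW_32BIT, 2, 0},   // r2.low = r0
      {TOY_SET_HIGH_32BIT, 2, 1},  // r2.high = r1
      {TOY_USE, -1, 2},            // consume r2
   };
}
```

Calling toy_dce(toy_pack_double_2x32(), 3, false) drops the SET_LOW instruction and, as a knock-on effect, the MOV that feeds it; passing true keeps all five instructions.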

v2: Make a helper in vec4_instruction to know if the instruction is an
    align1 partial write. This will come in handy when we implement a
    simd splitting pass in a later patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
src/mesa/drivers/dri/i965/brw_ir_vec4.h
src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp

index 5dfdfce17ab7907ac7902c942b2fc53d2f533a10..766cec7e9e1b71e50b3c74c5fa93827ecf157b2c 100644
@@ -280,6 +280,12 @@ public:
    bool can_change_types() const;
    bool has_source_and_destination_hazard() const;
 
+   bool is_align1_partial_write()
+   {
+      return opcode == VEC4_OPCODE_SET_LOW_32BIT ||
+             opcode == VEC4_OPCODE_SET_HIGH_32BIT;
+   }
+
    bool reads_flag()
    {
       return predicate || opcode == VS_OPCODE_UNPACK_FLAGS_SIMD4X2;
index 65f9f3889883904c51900d3f4945b02666bf517e..9185d5202b91e15e190ce77e9b60f77359fce26b 100644
@@ -110,7 +110,8 @@ vec4_visitor::dead_code_eliminate()
             }
          }
 
-         if (inst->dst.file == VGRF && !inst->predicate) {
+         if (inst->dst.file == VGRF && !inst->predicate &&
+             !inst->is_align1_partial_write()) {
             for (unsigned i = 0; i < regs_written(inst); i++) {
                for (int c = 0; c < 4; c++) {
                   if (inst->dst.writemask & (1 << c)) {