i965/fs/lower_simd_width: Fix registers written for split instructions
authorIago Toral Quiroga <itoral@igalia.com>
Fri, 15 Jan 2016 13:59:13 +0000 (14:59 +0100)
committerSamuel Iglesias Gonsálvez <siglesias@igalia.com>
Tue, 10 May 2016 09:25:09 +0000 (11:25 +0200)
When the original instruction had a stride > 1, the combined registers
written by the split instructions won't amount to the same register space
written by the original instruction because the split instructions will
use a stride of 1. The current code assumed otherwise and computed the
number of registers written by split instructions as an equal share based
on the relation between the lowered width and the original execution size
of the instruction.

It is only after the split, when we interleave the components of the result
from the lowered instructions back into the original dst register, that the
original stride takes effect and we write all the registers specified by
the original instruction.

Just make the number of register written the same as the vgrf space we
allocate for the dst of the split instruction.

Fixes crashes in fp64 tests produced as a result of assigning incorrectly the
number of registers written by split instructions, which led to incorrect
validation of the size of the writes against the allocated vgrf space.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
src/mesa/drivers/dri/i965/brw_fs.cpp

index 3d331d191b732a23fb6c404bd199b6e1e09f6ef0..dcbb4bd8b849fc7136adc14daec0a4485aeb7113 100644 (file)
@@ -4722,8 +4722,8 @@ fs_visitor::lower_simd_width()
                split_inst.dst = dsts[i] =
                   lbld.vgrf(inst->dst.type, dst_size);
                split_inst.regs_written =
-                  DIV_ROUND_UP(inst->regs_written * lower_width,
-                               inst->exec_size);
+                  DIV_ROUND_UP(type_sz(inst->dst.type) * dst_size * lower_width,
+                               REG_SIZE);
             }
 
             lbld.emit(split_inst);