From: Francisco Jerez Date: Wed, 18 Mar 2015 13:27:58 +0000 (+0200) Subject: i965/vec4: Improve src_reg/dst_reg conversion constructors. X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=430c6bf70e48c08ba4dc9e00f2b88e2230793010;p=mesa.git i965/vec4: Improve src_reg/dst_reg conversion constructors. This simplifies the src_reg/dst_reg conversion constructors using the swizzle utils introduced in a previous patch. It also makes them more useful by changing their semantics slightly: dst_reg(src_reg) used to set the writemask to XYZW if the src_reg swizzle was anything other than XXXX, which was almost certainly not what the caller intended if the swizzle was non-trivial. After this patch the same components that are present in the swizzle will be enabled in the resulting writemask. src_reg(dst_reg) used to set the first components of the swizzle to the enabled components of the writemask and then replicate the last enabled component to fill the swizzle, which, in cases where the writemask didn't have exactly the first n components set, would in general not be compatible with the original dst_reg. E.g.: | ADD(tmp, src_reg(tmp), src_reg(1)); would *not* do what one would expect (add one to each of the enabled components of tmp) if tmp didn't have a writemask of the described form (e.g. YZ, YW, XZW would all fail). This pattern actually occurs in many different places in the VEC4 back-end, it's a wonder that it hasn't caused piglit failures until now. After this patch src_reg(dst_reg) will construct a swizzle with each enabled component at its natural position (e.g. Y at the second position, Z at the third, and so on). The resulting swizzle will behave like the identity when used in any instruction with the original writemask. I've manually verified that *none* of the callers of both conversion constructors were relying on the previous broken semantics. There are no piglit regressions on any generation. Reviewed-by: Matt Turner --- diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 5606e03dbca..68ec4e34550 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -133,24 +133,7 @@ src_reg::src_reg(const dst_reg ®) this->type = reg.type; this->reladdr = reg.reladdr; this->fixed_hw_reg = reg.fixed_hw_reg; - - int swizzles[4]; - int next_chan = 0; - int last = 0; - - for (int i = 0; i < 4; i++) { - if (!(reg.writemask & (1 << i))) - continue; - - swizzles[next_chan++] = last = i; - } - - for (; next_chan < 4; next_chan++) { - swizzles[next_chan] = last; - } - - this->swizzle = BRW_SWIZZLE4(swizzles[0], swizzles[1], - swizzles[2], swizzles[3]); + this->swizzle = brw_swizzle_for_mask(reg.writemask); } void @@ -202,14 +185,7 @@ dst_reg::dst_reg(const src_reg ®) this->reg = reg.reg; this->reg_offset = reg.reg_offset; this->type = reg.type; - /* How should we do writemasking when converting from a src_reg? It seems - * pretty obvious that for src.xxxx the caller wants to write to src.x, but - * what about for src.wx? Just special-case src.xxxx for now. - */ - if (reg.swizzle == BRW_SWIZZLE_XXXX) - this->writemask = WRITEMASK_X; - else - this->writemask = WRITEMASK_XYZW; + this->writemask = brw_mask_for_swizzle(reg.swizzle); this->reladdr = reg.reladdr; this->fixed_hw_reg = reg.fixed_hw_reg; }