i965/fs: Skip SIMD lowering destination zipping if possible.
author Francisco Jerez <currojerez@riseup.net>
Fri, 27 May 2016 07:45:04 +0000 (00:45 -0700)
committer Francisco Jerez <currojerez@riseup.net>
Thu, 2 Jun 2016 20:24:48 +0000 (13:24 -0700)
commit 7aa76d66a1f5edad9e8c1d54aafdce99ffa6c345
tree 9ea23bcd1c5d33a41b044e270d079988f1c28dee
parent 75da9c9933a97e6f2baf0884b98350df800ee785

Skipping the temporary allocation and copy instructions is easy (just
return dst), but the conditions used to find out whether the copy can
be optimized out without breaking the program are rather complex: the
destination must be exactly one component of at most the execution
width of the lowered instruction, and every source region of the
instruction must be either fully disjoint from the destination or
aligned with it group by group.

v2: Don't handle partial source-destination overlap for simplicity
    (Jason).  No instruction count regressions with respect to v1 in
    either shader-db or the few FP64 shader_runner test-cases with
    partial overlap I've checked manually.
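
The source-overlap part of the conditions above can be illustrated with
a small standalone sketch. It is not the code from brw_fs.cpp: the
Region type and the helper names are hypothetical, regions are modelled
as plain byte ranges, and "aligned group by group" is approximated
conservatively by requiring an overlapping source to cover exactly the
same range as the destination, in the spirit of the v2 simplification.
The single-component and execution-width conditions are left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a register region: the half-open byte range
// [offset, offset + size) in a flat register file.
struct Region {
    std::size_t offset;
    std::size_t size;
};

// True if the two byte ranges share at least one byte.
bool overlaps(const Region &a, const Region &b)
{
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// True if the two regions cover exactly the same bytes.
bool same_region(const Region &a, const Region &b)
{
    return a.offset == b.offset && a.size == b.size;
}

// Returns true when every source is either fully disjoint from dst or
// identical to it.  Any partial overlap makes the function return
// false, meaning a temporary and a copy would still be needed
// (assumption: this conservative rule stands in for the real check).
bool sources_allow_direct_write(const Region &dst,
                                const std::vector<Region> &srcs)
{
    for (const Region &src : srcs) {
        if (overlaps(dst, src) && !same_region(dst, src))
            return false;
    }
    return true;
}
```

With this model, a destination at bytes 0..31 can be written directly
when the sources sit at bytes 32..63, or when a source is the very same
0..31 range, but not when a source starts at byte 16 and straddles the
destination.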

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
src/mesa/drivers/dri/i965/brw_fs.cpp