i965/fs: Fix test for smearing enabled on an instruction.
author Eric Anholt <eric@anholt.net>
Fri, 3 May 2013 00:44:28 +0000 (17:44 -0700)
committer Eric Anholt <eric@anholt.net>
Wed, 29 May 2013 17:20:26 +0000 (10:20 -0700)
We were expanding the live range too far, breaking register_coalesce_2()
and compute_to_mrf() on 16-wide shaders.  Turning it back on improves
GLB2.7 performance by 0.239355% +/- 0.0850649% (n=398). shader-db stats
are:

total instructions in shared programs: 1627211 -> 1609262 (-1.10%)
instructions in affected programs:     450351 -> 432402 (-3.99%)

While 33 new 16-wide shaders are gained, 70 are lost.  Despite that,
tropics (the app that lost the most 16-wide) shows a .41% +/- .16%
(n=7/8, first-run outlier removed) performance improvement on my HSW.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp

index 3daf8fa7786d63b12fd4e2a2fca6459e835ad8d0..f5daab2d2ce3ba7ae5040cebf0a0caa6bf1e5a54 100644
@@ -216,7 +216,7 @@ fs_visitor::calculate_live_intervals()
              * pixel_x/pixel_y, which are registers of 16-bit values and thus
              * would get stomped by the first decode as well.
              */
-            if (dispatch_width == 16 && (inst->src[i].smear ||
+            if (dispatch_width == 16 && (inst->src[i].smear >= 0 ||
                                          (this->pixel_x.reg == reg ||
                                           this->pixel_y.reg == reg))) {
                end_ip++;
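
The one-line change above is easiest to see in isolation. The sketch below is not the driver code. It assumes that `smear` is a signed integer where -1 means "no smear" and any value >= 0 names the channel to smear. The hunk suggests this encoding but does not show it. The helper names `smear_enabled_old` and `smear_enabled_new` are hypothetical and exist only for illustration.

```cpp
// Assumed encoding (not shown in the hunk): smear == -1 means "disabled",
// smear >= 0 is the channel being smeared.

// Old test: mirrors `if (inst->src[i].smear)`, a plain truthiness check.
constexpr bool smear_enabled_old(int smear) { return smear != 0; }

// New test: mirrors `if (inst->src[i].smear >= 0)`.
constexpr bool smear_enabled_new(int smear) { return smear >= 0; }

// With the -1 sentinel, the old test reports smearing on every non-smeared
// source, which would lengthen its live interval for no reason.
static_assert(smear_enabled_old(-1), "old test misfires on the sentinel");
static_assert(!smear_enabled_new(-1), "new test ignores the sentinel");

// The old test also misses a smear of channel 0.
static_assert(!smear_enabled_old(0), "old test misses channel 0");
static_assert(smear_enabled_new(0), "new test catches channel 0");

// Both tests agree on any other channel.
static_assert(smear_enabled_old(3) && smear_enabled_new(3), "agree on 3");
```

Under that assumption, the old condition would have been true for nearly every source in a 16-wide shader, which is consistent with the commit message's description of live ranges being expanded too far.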