i965/fs: Improve performance of shaders that start out with a discard.
author Eric Anholt <eric@anholt.net>
Thu, 6 Dec 2012 18:15:08 +0000 (10:15 -0800)
committer Eric Anholt <eric@anholt.net>
Tue, 11 Dec 2012 18:13:15 +0000 (10:13 -0800)
commit beafced21c3c11315a8b94f20508562729453175
tree 72e4c1b18972ac7f05310eab9a0515cf7e1f61d4
parent d5016495cc1b50b1673d0d3ab8e6af8249b071d5

I had tried this in the past, but ran into trouble with applications
that sample from undiscarded pixels in the same subspan.  To fix that
issue, only jump to the end for an entire subspan at a time.
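
The subspan rule can be modeled on the CPU as a mask computation. This is
only an illustrative sketch, not the code the driver emits: the helper name
channels_allowed_to_jump() is hypothetical, and it assumes a 16-wide
dispatch in which each group of four consecutive channels forms one 2x2
subspan. A channel may skip to the end of the shader only when all four
pixels of its subspan have been discarded, so a pixel that is still live
keeps its neighbours running and can still sample from them.

```c
#include <stdint.h>

/* Hypothetical model (not driver code): bit i of live_mask is set while
 * channel i is still live after the discard.  Assumes 16 channels, with
 * each run of four consecutive channels forming one 2x2 subspan.
 *
 * Returns the channels that may jump to the end of the shader: only the
 * channels whose whole subspan has been discarded.
 */
static uint16_t
channels_allowed_to_jump(uint16_t live_mask)
{
   uint16_t jump_mask = 0;

   for (int subspan = 0; subspan < 4; subspan++) {
      uint16_t subspan_bits = (uint16_t)(0xf << (subspan * 4));

      /* Jump only if no pixel in this subspan is still live. */
      if ((live_mask & subspan_bits) == 0)
         jump_mask |= subspan_bits;
   }

   return jump_mask;
}
```

For example, with live_mask 0x00f1 the first subspan still has one live
pixel and the second is fully live, so neither may jump; the result is
0xff00, covering only the two fully discarded subspans.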

Improves GLBenchmark 2.7 (1024x768) performance by 7.9 +/- 1.5% (n=8).

v2: Drop the br variable in the jump instruction -- if I ever do jumps
    pre-gen6, it'll be a different code block anyway since we don't have
    HALT until gen6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
src/mesa/drivers/dri/i965/brw_defines.h
src/mesa/drivers/dri/i965/brw_eu.h
src/mesa/drivers/dri/i965/brw_eu_emit.c
src/mesa/drivers/dri/i965/brw_fs.h
src/mesa/drivers/dri/i965/brw_fs_emit.cpp
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp