As far as I can see, the intention of the requirement that we do so is to
prevent instruction prefetch from wandering out into either unmapped memory or
memory with a different caching type, and hanging the chip. The kernel makes
sure that the page after your BO is a valid page of the same caching type,
which meets this requirement, so there's no need to waste space between our
programs (and in the instruction cache) on this.

Saves another 9kb of instructions in l4d2 shaders.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
 const GLuint *brw_get_program( struct brw_compile *p,
                                GLuint *sz )
 {
-   GLuint i;
-
    brw_compact_instructions(p);
-   /* We emit a cacheline (8 instructions) of NOPs at the end of the program to
-    * make sure that instruction prefetch doesn't wander off into some other BO.
-    */
-   for (i = 0; i < 8; i++)
-      brw_NOP(p);
-
    *sz = p->next_insn_offset;
    return (const GLuint *)p->store;
 }
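
For a rough sense of what the removed loop cost per program, here is a minimal
sketch of the padding arithmetic. It assumes each brw_NOP() emitted after
compaction is a full 16-byte native instruction. INSN_BYTES, PAD_NOPS and both
helper functions are hypothetical names used for illustration only; they are
not part of the driver.

```c
#include <stddef.h>

/* Assumption: an uncompacted native instruction occupies 16 bytes. */
enum { INSN_BYTES = 16 };

/* The removed loop appended this many NOPs to every program. */
enum { PAD_NOPS = 8 };

/* Bytes of NOP padding that the old code added to a single program. */
static size_t pad_bytes_per_program(void)
{
    return (size_t)PAD_NOPS * INSN_BYTES;
}

/* Total padding across num_programs compiled programs (hypothetical helper). */
static size_t pad_bytes_total(size_t num_programs)
{
    return num_programs * pad_bytes_per_program();
}
```

Under that assumption each program carried 128 bytes of padding, so the total
saving scales linearly with the number of programs compiled.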