From: Kenneth Graunke
Date: Wed, 18 Nov 2015 02:24:11 +0000 (-0800)
Subject: i965: Fix JIP to properly skip over unrelated control flow.
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=1ac1581f3889d5f7e6e231c05651f44fbd80f0b6;p=mesa.git

i965: Fix JIP to properly skip over unrelated control flow.

We've apparently always been botching JIP for sequences such as:

   do
      cmp.f0.0 ...
      (+f0.0) break
      ...
      if
         ...
      else
         ...
      endif
      ...
   while

Normally, UIP is supposed to point to the final destination of the jump,
while in nested control flow, JIP is supposed to point to the end of the
current nesting level.  It essentially bounces out of the current nested
control flow, to an instruction that has a JIP which bounces out another
level, and so on.

In the above example, when setting JIP for the BREAK, we call
brw_find_next_block_end(), which begins a search after the BREAK for the
next ENDIF, ELSE, WHILE, or HALT.  It ignores the IF and finds the ELSE,
setting JIP there.

This makes no sense at all.  The break is supposed to skip over the whole
if/else/endif block entirely.  They have a sibling relationship, not a
nesting relationship.

This patch fixes brw_find_next_block_end() to track depth as it does its
search, and ignore anything not at depth 0.  So when it sees the IF, it
ignores everything until after the ENDIF.  That way, it finds the end of
the right block.

I noticed this while reading some assembly code.  We believe jumping
earlier is harmless, but it makes the EU walk through a bunch of disabled
instructions for no reason.  I noticed that GLBenchmark Manhattan had a
shader that contained a BREAK with a bogus JIP, but didn't measure any
performance improvement (it's likely minuscule, if there is any).

Signed-off-by: Kenneth Graunke
Reviewed-by: Matt Turner
Reviewed-by: Francisco Jerez
---

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index bb6f5dce91b..25064c0eb87 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2617,17 +2617,27 @@ brw_find_next_block_end(struct brw_codegen *p, int start_offset)
    void *store = p->store;
    const struct brw_device_info *devinfo = p->devinfo;
 
+   int depth = 0;
+
    for (offset = next_offset(devinfo, store, start_offset);
         offset < p->next_insn_offset;
         offset = next_offset(devinfo, store, offset)) {
       brw_inst *insn = store + offset;
 
       switch (brw_inst_opcode(devinfo, insn)) {
+      case BRW_OPCODE_IF:
+         depth++;
+         break;
       case BRW_OPCODE_ENDIF:
+         if (depth == 0)
+            return offset;
+         depth--;
+         break;
       case BRW_OPCODE_ELSE:
       case BRW_OPCODE_WHILE:
       case BRW_OPCODE_HALT:
-         return offset;
+         if (depth == 0)
+            return offset;
       }
    }
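
The depth-tracking search described in the commit message can be tried
outside the driver with a small standalone sketch. Everything below is an
illustrative assumption rather than Mesa code: the `enum op` values stand
in for the BRW_OPCODE_* constants, the program is a plain array indexed by
instruction instead of a byte offset into `p->store`, and returning -1 when
no block end is found is a choice made for this sketch only.
`find_block_end_flat()` mimics the pre-patch search and
`find_block_end_depth()` mimics the patched one.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the BRW_OPCODE_* values (not the driver's enum). */
enum op {
   OP_DO, OP_CMP, OP_BREAK, OP_IF, OP_ELSE, OP_ENDIF, OP_WHILE, OP_HALT,
   OP_OTHER
};

/* The sequence from the commit message, one array slot per instruction. */
static const enum op example_prog[] = {
   OP_DO,     /*  0: do                */
   OP_CMP,    /*  1:    cmp.f0.0 ...   */
   OP_BREAK,  /*  2:    (+f0.0) break  */
   OP_OTHER,  /*  3:    ...            */
   OP_IF,     /*  4:    if             */
   OP_OTHER,  /*  5:       ...         */
   OP_ELSE,   /*  6:    else           */
   OP_OTHER,  /*  7:       ...         */
   OP_ENDIF,  /*  8:    endif          */
   OP_OTHER,  /*  9:    ...            */
   OP_WHILE,  /* 10: while             */
};
static const int example_len =
   (int)(sizeof(example_prog) / sizeof(example_prog[0]));

/* Pre-patch behaviour: stop at the first block-ending opcode after `start`,
 * regardless of whether it belongs to a sibling IF block. */
static int
find_block_end_flat(const enum op *prog, int n, int start)
{
   for (int i = start + 1; i < n; i++) {
      switch (prog[i]) {
      case OP_ENDIF:
      case OP_ELSE:
      case OP_WHILE:
      case OP_HALT:
         return i;
      default:
         break;
      }
   }
   return -1; /* sketch-only sentinel */
}

/* Patched behaviour: count IFs opened after `start` and only accept a
 * block-ending opcode once every one of them has been closed. */
static int
find_block_end_depth(const enum op *prog, int n, int start)
{
   int depth = 0;

   for (int i = start + 1; i < n; i++) {
      switch (prog[i]) {
      case OP_IF:
         depth++;
         break;
      case OP_ENDIF:
         if (depth == 0)
            return i;
         depth--;
         break;
      case OP_ELSE:
      case OP_WHILE:
      case OP_HALT:
         if (depth == 0)
            return i;
         break;
      default:
         break;
      }
   }
   return -1; /* sketch-only sentinel */
}
```

Starting from the BREAK at index 2, the flat search stops at the ELSE
(index 6), which is the bogus JIP target the commit message describes,
while the depth-tracking search skips the whole if/else/endif block and
stops at the WHILE (index 10).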