From 47b7f104a0aa3692e9fb202741406a0c6d9ac8ad Mon Sep 17 00:00:00 2001
From: Rhys Perry
Date: Thu, 27 Feb 2020 19:47:01 +0000
Subject: [PATCH] aco: consider non-hazard writes in handle_raw_hazard_internal
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

I think this helps GFX6 in particular because code like this is common:
s_add_i32 s4, 0x60, s3
s_mov_b32 s5, 0
s_load_dwordx4 s[4:7], s[4:5], 0x0
s_buffer_load_dword s4, s[4:7], 0xcc

pipeline-db (Tahiti):
Totals from affected shaders:
SGPRS: 1923878 -> 1923878 (0.00 %)
VGPRS: 1528964 -> 1528964 (0.00 %)
Spilled SGPRs: 476 -> 476 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 88723604 -> 88528880 (-0.22 %) bytes
LDS: 241 -> 241 (0.00 %) blocks
Max Waves: 145402 -> 145402 (0.00 %)

pipeline-db (Polaris):
Totals from affected shaders:
SGPRS: 428128 -> 428128 (0.00 %)
VGPRS: 353092 -> 353092 (0.00 %)
Spilled SGPRs: 119251 -> 119251 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 57580468 -> 57563964 (-0.03 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 11631 -> 11631 (0.00 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 425016 -> 425016 (0.00 %)
VGPRS: 349588 -> 349588 (0.00 %)
Spilled SGPRs: 117835 -> 117835 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 54890792 -> 54874432 (-0.03 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 54 -> 54 (0.00 %)

Signed-off-by: Rhys Perry
Reviewed-by: Daniel Schürmann
Part-of:
---
 src/amd/compiler/aco_insert_NOPs.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/compiler/aco_insert_NOPs.cpp b/src/amd/compiler/aco_insert_NOPs.cpp
index 9c5b1c8b7c6..4302711ba81 100644
--- a/src/amd/compiler/aco_insert_NOPs.cpp
+++ b/src/amd/compiler/aco_insert_NOPs.cpp
@@ -218,9 +218,10 @@ int handle_raw_hazard_internal(Program *program, Block *block,
       if (is_hazard)
          return nops_needed;
 
+      mask &= ~writemask;
       nops_needed -= get_wait_states(pred);
 
-      if (nops_needed <= 0)
+      if (nops_needed <= 0 || mask == 0)
          return 0;
    }
 
-- 
2.30.2
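
The effect of the added `mask &= ~writemask;` line can be sketched outside of ACO. The following is a minimal, simplified model and not the actual implementation: the `Pred` struct, the function `nops_for_read`, the bit-per-register encoding, and the wait-state numbers are all hypothetical stand-ins chosen for illustration. It walks earlier instructions from nearest to farthest and stops as soon as every register being read has been overwritten by a non-hazardous write, since any older hazardous write to those registers can no longer be observed.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an earlier instruction (not an ACO type).
struct Pred {
   uint32_t writes;   // bitmask of registers this instruction writes
   bool hazardous;    // true if this kind of write triggers the hazard
   int wait_states;   // wait states this instruction itself provides
};

// Simplified model of the backward scan: `mask` holds the registers read by
// the instruction under consideration, `nops_needed` the wait states that a
// hazardous write would require. Returns the number of NOPs still needed.
int nops_for_read(const std::vector<Pred>& preds_nearest_first,
                  uint32_t mask, int nops_needed)
{
   for (const Pred& pred : preds_nearest_first) {
      uint32_t writemask = pred.writes & mask;

      // A hazardous write to a register we still care about: stop here.
      if (writemask != 0 && pred.hazardous)
         return nops_needed;

      // Non-hazardous write: older writes to these registers are shadowed,
      // so stop tracking them (this is the line the patch adds).
      mask &= ~writemask;
      nops_needed -= pred.wait_states;

      // Either enough wait states have passed or nothing is left to track.
      if (nops_needed <= 0 || mask == 0)
         return 0;
   }
   return 0;
}
```

Modelling the sequence from the commit message with s4..s7 as bits 4..7 (an assumed encoding), the read of s[4:7] by `s_buffer_load_dword` first meets the `s_load_dwordx4` write of s[4:7], which clears the mask, so the older `s_mov_b32 s5` write is never reached and `nops_for_read({{0xF0, false, 1}, {0x20, true, 1}, {0x10, true, 1}}, 0xF0, 4)` yields 0 in this model.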