ac: fix exclusive scans on GFX8-GFX9
author Samuel Pitoiset <samuel.pitoiset@gmail.com>
Wed, 21 Aug 2019 14:29:46 +0000 (16:29 +0200)
committer Samuel Pitoiset <samuel.pitoiset@gmail.com>
Thu, 22 Aug 2019 06:43:15 +0000 (08:43 +0200)
This fixes a regression on GFX8-GFX9 introduced by the GFX10
implementation of scan & reduce operations. Note that some subgroup
CTS tests still fail on GFX10, but I assume that is a different issue.

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.

Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
src/amd/common/ac_llvm_build.c

index 05871f5ea98abdacf4019392b351ce21d10d8f01..5abae00d8f60b9a5d501b32bc10cf111149b383f 100644
@@ -4221,10 +4221,9 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
        if (ctx->chip_class >= GFX10) {
                result = inclusive ? src : identity;
        } else {
-               if (inclusive)
-                       result = src;
-               else
-                       result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
+               if (!inclusive)
+                       src = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
+               result = src;
        }
        if (maxprefix <= 1)
                return result;
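
The idea behind the change can be illustrated with a scalar model: an exclusive scan equals an inclusive scan over the input shifted right by one lane, with the identity placed in lane 0. The point of the fix is that `src` itself has to be shifted, not only `result`, since (presumably) the code following this hunk keeps reading `src` for the remaining scan steps. The sketch below is not Mesa code. `WAVE_SIZE`, `shift_right_1`, `inclusive_scan_add` and `exclusive_scan_add` are hypothetical names, plain arrays stand in for lanes, and integer addition stands in for the generic `nir_op`.

```c
#include <assert.h>
#include <stddef.h>

#define WAVE_SIZE 64 /* hypothetical upper bound on the lane count */

/* Model of a whole-wave shift right by one lane (the role played by
 * dpp_wf_sr1 in the patch): lane 0 receives the identity value. */
static void shift_right_1(int identity, const int *src, int *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (i == 0) ? identity : src[i - 1];
}

/* Inclusive add scan: dst[i] = src[0] + ... + src[i]. */
static void inclusive_scan_add(const int *src, int *dst, size_t n)
{
    int acc = 0; /* identity for addition */
    for (size_t i = 0; i < n; i++) {
        acc += src[i];
        dst[i] = acc;
    }
}

/* Exclusive add scan built the way the fixed code path does it:
 * shift the source first, then scan the shifted source inclusively. */
static void exclusive_scan_add(const int *src, int *dst, size_t n)
{
    int shifted[WAVE_SIZE];

    assert(n <= WAVE_SIZE);
    shift_right_1(0, src, shifted, n);
    inclusive_scan_add(shifted, dst, n);
}
```

For the input {1, 2, 3, 4}, the shifted source is {0, 1, 2, 3}, so the exclusive scan gives {0, 1, 3, 6}. If only the initial result were shifted while later steps still accumulated from the unshifted source, the partial sums would mix shifted and unshifted values.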