mesa.git
5 years agogitlab-ci: update deqp build so we can generate xml
Rob Clark [Fri, 15 Nov 2019 18:15:32 +0000 (10:15 -0800)]
gitlab-ci: update deqp build so we can generate xml

Update the deqp build to preserve testlog-to-xml and stylesheets, so
deqp runner can extract .qpa for failed/flaked tests, and convert to
xml.  With this, we will be able to browse output from failed tests
directly from the artifacts.

The main motivation is to give better visibility into what happens with
flaked tests, when it is difficult/impossible to reproduce the flake
locally (i.e. when it happens once out of N million tests).  But this
should also make it easier to debug regressions that an MR triggers,
especially when it is on hw that you don't have.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodrirc: Enable glthread for dolphin/citra/yuzu.
Markus Wick [Tue, 5 Nov 2019 08:16:37 +0000 (09:16 +0100)]
drirc: Enable glthread for dolphin/citra/yuzu.

Dolphin: 75 fps -> 88 fps - Super Mario Galaxy
Citra:   81 fps -> 91 fps - A Link Between Worlds
Yuzu:    21 fps -> 27 fps - Super Mario Odyssey

Dolphin still has many syncs because of glFenceSync and glClientWaitSync.
Moving them to the dispatcher thread might yield another speedup.

Yuzu uses a compatibility profile by default. This benchmark used the variable
MESA_GL_VERSION_OVERRIDE=4.5FC to override this behavior.

This profiling was done on a mobile i7-8550U CPU with i965.

Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/glthread: Implement ARB_multi_bind.
Markus Wick [Sun, 3 Nov 2019 08:49:59 +0000 (09:49 +0100)]
mesa/glthread: Implement ARB_multi_bind.

Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoaco: fix waitcnts for barriers at block ends
Rhys Perry [Fri, 22 Nov 2019 19:38:51 +0000 (19:38 +0000)]
aco: fix waitcnts for barriers at block ends

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: d1b9deee ('aco: improve waitcnt insertion around loops')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoRevert "draw: revert using correct order for prim decomposition."
Zebediah Figura [Tue, 5 Nov 2019 16:21:21 +0000 (10:21 -0600)]
Revert "draw: revert using correct order for prim decomposition."

This reverts commit f97b731c82afb06cfd6ffebc90a3e098a9a1b308.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/250
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoiris: Change keybox parenting
Kenneth Graunke [Wed, 5 Jun 2019 20:15:35 +0000 (13:15 -0700)]
iris: Change keybox parenting

For temporary lookups, just allocate out of the NULL ralloc context,
so we don't have to edit the linked list of ralloc children to add it
and then immediately remove it again.

When uploading a new shader, allocate the keybox off the shader, so
if we delete the shader the keybox also goes away.  Less manual cleanup.

5 years agonir/range_analysis: Make sure the table validation only occurs once
Ian Romanick [Sat, 16 Nov 2019 21:19:47 +0000 (13:19 -0800)]
nir/range_analysis: Make sure the table validation only occurs once

All of the tables are static const, so they only need to be validated
once.  As noted in the previous commit, the compiler should be able to
eliminate all of this code when the assertions would pass.  Even with
the help of the previous commit, this does not always occur.
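
A minimal sketch of the "validate once" idea, assuming a hypothetical table (`example_table`) and helper names that are not taken from nir_range_analysis.c. A function-local static flag guards the assertion loop so that repeated lookups pay for validation only on the first call. The flag is not thread-safe as written; that is a simplification of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one of the static const lookup tables. */
static const int example_table[4][4] = {
   { 0, 1, 2, 3 },
   { 1, 1, 3, 3 },
   { 2, 3, 2, 3 },
   { 3, 3, 3, 3 },
};

/* Counts how often the validation loop actually runs. */
static unsigned validation_runs = 0;

static void
validate_example_table(void)
{
   validation_runs++;

   for (size_t a = 0; a < 4; a++) {
      for (size_t b = 0; b < 4; b++) {
         /* This example table is expected to be symmetric. */
         assert(example_table[a][b] == example_table[b][a]);
      }
   }
}

static int
lookup(size_t a, size_t b)
{
   /* The table never changes, so one successful validation covers every
    * later call.
    */
   static bool validated = false;

   if (!validated) {
      validate_example_table();
      validated = true;
   }

   return example_table[a][b];
}
```

Calling `lookup()` any number of times leaves `validation_runs` at 1.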

-Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
-O1: No difference proven at 95.0% confidence. N=5
-O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir/range-analysis: Add pragmas to help loop unrolling
Ian Romanick [Sat, 16 Nov 2019 21:23:31 +0000 (13:23 -0800)]
nir/range-analysis: Add pragmas to help loop unrolling

I was pretty liberal with these assertions when I wrote this code
because I had assumed that GCC would unroll the loops, inline the lookups
of static const arrays with now constant indices, and then eliminate
all the actual assertions.  It seems none of this happens even at -O3.

Adding the pragmas helps encourage loop unrolling at some optimization
levels.  I tested by running shader-db with NIR_VALIDATE=false on a Core
i7 Haswell desktop system.
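
A rough sketch of how an unroll hint can be attached to a loop through `_Pragma`. The macro name `PRAGMA_UNROLL_4`, the `weights` table and `weighted_sum()` are made up for illustration and do not come from the patch. The hint is only a request; whether the compiler then folds the assertion away still depends on the optimization level.

```c
#include <assert.h>

/* Compiler-specific spelling of the unroll hint; empty elsewhere. */
#if defined(__clang__)
#define PRAGMA_UNROLL_4 _Pragma("unroll 4")
#elif defined(__GNUC__) && __GNUC__ >= 8
#define PRAGMA_UNROLL_4 _Pragma("GCC unroll 4")
#else
#define PRAGMA_UNROLL_4
#endif

/* Hypothetical static const table indexed inside the loop. */
static const unsigned char weights[4] = { 1, 2, 4, 8 };

static unsigned
weighted_sum(const unsigned char bits[4])
{
   unsigned sum = 0;

   /* The pragma must directly precede the loop it applies to. */
   PRAGMA_UNROLL_4
   for (unsigned i = 0; i < 4; i++) {
      /* With the loop unrolled, i is a constant in each copy of the body,
       * which gives the compiler a chance to evaluate this at build time.
       */
      assert(weights[i] == 1u << i);
      sum += bits[i] * weights[i];
   }

   return sum;
}
```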

-Og: No difference proven at 95.0% confidence. N=5
-O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
-O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5

v2: Add a _Pragma to an inner loop that was accidentally dropped during
a rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoglsl: Add varyings to "zero-init of uninitialized vars" workaround
Danylo Piliaiev [Thu, 21 Nov 2019 13:04:37 +0000 (15:04 +0200)]
glsl: Add varyings to "zero-init of uninitialized vars" workaround

Varyings are similar to already handled cases. And "glsl_zero_init"
name of the workaround already looks like it should include varyings.

The issue was observed in GiMark subtest from GpuTest.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agopan/midgard: Use lower_tex_without_implicit_lod
Alyssa Rosenzweig [Thu, 21 Nov 2019 18:40:00 +0000 (13:40 -0500)]
pan/midgard: Use lower_tex_without_implicit_lod

Just a bit of cleanup. lower_tex can do this lowering for us, which
should also eliminate some special cases (one less thing to fix if we
ever need texturing in tess/geom/etc, perhaps?)

Closes #2133

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agoetnaviv: use a more self-explanatory param name
Christian Gmeiner [Fri, 15 Nov 2019 16:35:50 +0000 (17:35 +0100)]
etnaviv: use a more self-explanatory param name

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: drop not used config_out function param
Christian Gmeiner [Fri, 15 Nov 2019 16:34:11 +0000 (17:34 +0100)]
etnaviv: drop not used config_out function param

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agogitlab-ci: reduce the number of scons build
Samuel Pitoiset [Thu, 21 Nov 2019 07:29:25 +0000 (08:29 +0100)]
gitlab-ci: reduce the number of scons build

It seems overkill to me to build scons 7x for every pipeline.
Scons is now built with the oldest llvm version in scons-old-llvm
and with the newest llvm version in scons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agopanfrost: Add lcra.c to Android.mk
Alyssa Rosenzweig [Thu, 21 Nov 2019 13:43:21 +0000 (08:43 -0500)]
panfrost: Add lcra.c to Android.mk

This was forgotten.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Enable LOD lowering only on buggy chips
Alyssa Rosenzweig [Thu, 21 Nov 2019 13:45:27 +0000 (08:45 -0500)]
pan/midgard: Enable LOD lowering only on buggy chips

T720 and earlier need this workaround, so check the quirk before
lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Describe quirk MIDGARD_BROKEN_LOD
Alyssa Rosenzweig [Wed, 20 Nov 2019 02:21:19 +0000 (21:21 -0500)]
pan/midgard: Describe quirk MIDGARD_BROKEN_LOD

Corresponds to errata #10471, applies to T6xx and T720. Fixed in T760.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Add LOD bias/clamp lowering
Alyssa Rosenzweig [Thu, 21 Nov 2019 13:43:53 +0000 (08:43 -0500)]
pan/midgard: Add LOD bias/clamp lowering

We fetch the info with the new intrinsic and lower with ALU ops for txl
instructions, which seemingly correspond to "TEXGRD" instructions (what
we call textureLod).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Implement load_sampler_lod_paramaters_pan
Alyssa Rosenzweig [Thu, 21 Nov 2019 13:42:28 +0000 (08:42 -0500)]
pan/midgard: Implement load_sampler_lod_paramaters_pan

We can stuff this information in as parametrized system values, like we
currently do texture size and SSBO addresses.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agonir: Add load_sampler_lod_paramaters_pan intrinsic
Alyssa Rosenzweig [Thu, 21 Nov 2019 13:41:22 +0000 (08:41 -0500)]
nir: Add load_sampler_lod_paramaters_pan intrinsic

This loads in the <min_lod, max_lod, lod_bias> settings for a given
sampler, which is necessary for lowering clamps/biases on certain
Midgard chips.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agomapi/glapi: Generate sizeof() helpers instead of fixed sizes.
Markus Wick [Sun, 17 Nov 2019 18:12:04 +0000 (19:12 +0100)]
mapi/glapi: Generate sizeof() helpers instead of fixed sizes.

Generating source code with a fixed size leads to issues with platform-dependent types.
We either hard code 4 or 8 bytes there, and both are wrong on the other platform.
So this patch solves this issue by generating e.g. sizeof(GLsizeiptr), which is valid both
on 32 and on 64 bit platforms.
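
To illustrate the difference, here is a small sketch that uses `ptrdiff_t` as a stand-in for `GLsizeiptr` (an assumption, so that no GL headers are needed). `marshal_sizeiptr()` is a hypothetical helper, not the generated glthread code. Taking the size from `sizeof()` keeps the copy correct on both 32-bit and 64-bit builds, where a literal 4 or 8 would be wrong on one of them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for GLsizeiptr, which is a pointer-sized signed integer. */
typedef ptrdiff_t example_sizeiptr;

/* Copies one parameter into a command buffer and returns the number of
 * bytes written.  The size comes from the type itself rather than from a
 * constant chosen by the code generator.
 */
static size_t
marshal_sizeiptr(unsigned char *buf, size_t buf_size, example_sizeiptr value)
{
   assert(buf_size >= sizeof(example_sizeiptr));
   memcpy(buf, &value, sizeof(example_sizeiptr));
   return sizeof(example_sizeiptr);
}
```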

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
5 years agointel/fs: Disable conditional discard optimization on Gen4 and Gen5
Ian Romanick [Mon, 18 Nov 2019 19:52:47 +0000 (11:52 -0800)]
intel/fs: Disable conditional discard optimization on Gen4 and Gen5

The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of
valid data and 31 bits of junk.  Results of comparisons that are used as
Boolean values need to have a fixup applied to generate the proper 0/~0
values.

Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup
code from being generated.  This results in a sequence like:

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
(+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD     g8<8,8,1>UD

instead of

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
        or(16) g4<1>UD g4<8,8,1>UD     g8<8,8,1>UD
(+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD     1UD

I examined a couple of the shaders hurt by this change, and ALL of them
would have been affected by this bug. :(
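
The effect of the missing fixup can be modelled on the CPU. This is only a sketch: `fake_cmp_result()` and `fixup_bool()` are invented names, and the junk value is arbitrary. It shows why OR-ing two raw comparison results and testing for zero is unreliable, while masking the LSB first gives a proper 0/~0 Boolean.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a Gen4/Gen5 CMP result: bit 0 holds the comparison result,
 * the upper 31 bits are whatever junk happened to be there.
 */
static uint32_t
fake_cmp_result(int condition, uint32_t junk)
{
   return (junk & ~UINT32_C(1)) | (condition ? UINT32_C(1) : UINT32_C(0));
}

/* Fixup: keep only the valid bit and expand it to 0 or ~0. */
static uint32_t
fixup_bool(uint32_t raw)
{
   return (uint32_t)(UINT32_C(0) - (raw & UINT32_C(1)));
}
```

With both conditions false, `fake_cmp_result(0, a) | fake_cmp_result(0, b)` can still be nonzero because of the junk bits, which is the situation the first instruction sequence above runs into.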

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836
Fixes: 0ba9497e66a ("intel/fs: Improve discard_if code generation")
Iron Lake
total instructions in shared programs: 8122757 -> 8122957 (<.01%)
instructions in affected programs: 8307 -> 8507 (2.41%)
helped: 0
HURT: 100
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.58% 3.03%
Instructions are HURT.

total cycles in shared programs: 188510100 -> 188510376 (<.01%)
cycles in affected programs: 76018 -> 76294 (0.36%)
helped: 0
HURT: 55
HURT stats (abs)   min: 2 max: 12 x̄: 5.02 x̃: 4
HURT stats (rel)   min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
95% mean confidence interval for cycles value: 4.33 5.71
95% mean confidence interval for cycles %-change: 0.60% 1.12%
Cycles are HURT.

GM45
total instructions in shared programs: 4994403 -> 4994503 (<.01%)
instructions in affected programs: 4212 -> 4312 (2.37%)
helped: 0
HURT: 50
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.45% 3.07%
Instructions are HURT.

total cycles in shared programs: 128928750 -> 128928982 (<.01%)
cycles in affected programs: 67442 -> 67674 (0.34%)
helped: 0
HURT: 47
HURT stats (abs)   min: 2 max: 12 x̄: 4.94 x̃: 4
HURT stats (rel)   min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
95% mean confidence interval for cycles value: 4.19 5.68
95% mean confidence interval for cycles %-change: 0.50% 1.00%
Cycles are HURT.

5 years agodocs: update calendar, add news item and link release notes for 19.2.6
Dylan Baker [Fri, 22 Nov 2019 00:33:19 +0000 (16:33 -0800)]
docs: update calendar, add news item and link release notes for 19.2.6

5 years agodocs: Add SHA256 sum for 19.2.6
Dylan Baker [Fri, 22 Nov 2019 00:31:47 +0000 (16:31 -0800)]
docs: Add SHA256 sum for 19.2.6

5 years agodocs: Add release notes for 19.2.6
Dylan Baker [Fri, 22 Nov 2019 00:04:11 +0000 (16:04 -0800)]
docs: Add release notes for 19.2.6

5 years agonir/serialize: do ctx = {0} instead of manual initializations
Marek Olšák [Tue, 5 Nov 2019 02:29:56 +0000 (21:29 -0500)]
nir/serialize: do ctx = {0} instead of manual initializations
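
As a generic illustration of the idiom (the struct below is hypothetical and much smaller than the real serialization context), `= {0}` zero-initializes every member, so only the fields that need a non-zero value are assigned afterwards:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context with a few fields of different types. */
struct example_ctx {
   const void *shader;
   void *blob;
   unsigned next_idx;
   bool strip;
};

static struct example_ctx
make_ctx(const void *shader)
{
   /* All members start out as zero/NULL/false. */
   struct example_ctx ctx = {0};

   ctx.shader = shader;
   return ctx;
}
```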

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonir: strip as we serialize to remove the nir_shader_clone call
Marek Olšák [Mon, 4 Nov 2019 23:09:26 +0000 (18:09 -0500)]
nir: strip as we serialize to remove the nir_shader_clone call

Serializing stripped NIR is faster now.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agoetnaviv: add drm-shim
Christian Gmeiner [Tue, 6 Aug 2019 21:49:03 +0000 (23:49 +0200)]
etnaviv: add drm-shim

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agovk_util: drop duplicate formats in vk_format_map[]
Eric Engestrom [Thu, 21 Nov 2019 20:29:35 +0000 (20:29 +0000)]
vk_util: drop duplicate formats in vk_format_map[]

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: implement UBWC
Jonathan Marek [Mon, 18 Nov 2019 21:46:39 +0000 (16:46 -0500)]
turnip: implement UBWC

This enables UBWC for everything except 3D textures.

It breaks many image_to_image copies but those aren't important and it can
be worked around later (image_to_image copy needs to be done in two steps,
decode from the source format and then encode to the destination format).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agofreedreno/regs: update UBWC related bits
Jonathan Marek [Mon, 18 Nov 2019 21:17:55 +0000 (16:17 -0500)]
freedreno/regs: update UBWC related bits

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoswr: Fix build with llvm-10.0.
Vinson Lee [Wed, 20 Nov 2019 06:51:22 +0000 (06:51 +0000)]
swr: Fix build with llvm-10.0.

Fix build error after llvm-10.0 commit 1dfede3122ee ("Move
CodeGenFileType enum to Support/CodeGen.h").

../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp: In member function ‘void JitManager::DumpAsm(llvm::Function*, const char*)’:
../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp:428:45: error: ‘CGFT_AssemblyFile’ is not a member of ‘llvm::TargetMachine’
             *pMPasses, filestream, nullptr, TargetMachine::CGFT_AssemblyFile);
                                             ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
5 years agoaco: fix copy+paste error
Rhys Perry [Tue, 22 Oct 2019 14:16:37 +0000 (15:16 +0100)]
aco: fix copy+paste error

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: improve waitcnt insertion around loops
Rhys Perry [Mon, 21 Oct 2019 20:36:41 +0000 (21:36 +0100)]
aco: improve waitcnt insertion around loops

Do this by repeating processing of loops until no progress is made.
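
"Repeat until no progress" is a fixed-point iteration. The sketch below is a generic version over a tiny hypothetical CFG, not ACO's waitcnt pass: state flows from predecessors to successors, and the back edge into the loop header means a single pass is not enough.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BLOCKS 4

/* Hypothetical CFG: block 1 is a loop header reached from block 0 and
 * from the back edge out of block 2.  -1 marks an unused slot.
 */
static const int preds[NUM_BLOCKS][2] = {
   { -1, -1 },   /* 0: entry */
   {  0,  2 },   /* 1: loop header */
   {  1, -1 },   /* 2: loop body */
   {  1, -1 },   /* 3: loop exit */
};

/* Bitmask of pending events each block generates on its own. */
static const unsigned gen[NUM_BLOCKS] = { 0x1, 0x0, 0x2, 0x0 };

/* Fills out[] with the events pending at the end of each block and
 * returns the number of passes needed to reach a fixed point.
 */
static unsigned
propagate(unsigned out[NUM_BLOCKS])
{
   unsigned passes = 0;
   bool progress;

   for (int b = 0; b < NUM_BLOCKS; b++)
      out[b] = 0;

   do {
      progress = false;
      passes++;

      for (int b = 0; b < NUM_BLOCKS; b++) {
         unsigned in = 0;

         for (int p = 0; p < 2; p++) {
            if (preds[b][p] >= 0)
               in |= out[preds[b][p]];
         }

         const unsigned new_out = in | gen[b];
         if (new_out != out[b]) {
            out[b] = new_out;
            progress = true;
         }
      }
   } while (progress);

   return passes;
}
```

In this example the event generated in the loop body (0x2) only reaches the loop header and the exit block on the second pass, and a third pass confirms that nothing changes any more.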

Totals from affected shaders:
SGPRS: 162576 -> 162576 (0.00 %)
VGPRS: 145228 -> 145228 (0.00 %)
Spilled SGPRs: 668 -> 668 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 15778640 -> 15771336 (-0.05 %) bytes
LDS: 146 -> 146 (0.00 %) blocks
Max Waves: 6087 -> 6087 (0.00 %)

v2: use block_kind_loop_header/block_kind_loop_exit to repeat at the end
    of loops instead of at each continue

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agofreedreno/perfctrs/fdperf: periodically restore counters
Rob Clark [Wed, 20 Nov 2019 19:56:57 +0000 (11:56 -0800)]
freedreno/perfctrs/fdperf: periodically restore counters

When GPU is idle and suspends, the currently selected countables
will all reset to the first one.  So periodically restore the selected
countables.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/perfcntrs: add fdperf
Rob Clark [Tue, 19 Nov 2019 22:53:49 +0000 (14:53 -0800)]
freedreno/perfcntrs: add fdperf

Port from the envytools tree, but converted to use the .c tables for
describing the perfcounter groups/countables, rather than using rnndec
to get this at runtime from the register xml.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/perfcntrs/a6xx: remove RBBM counters
Rob Clark [Wed, 20 Nov 2019 00:37:18 +0000 (16:37 -0800)]
freedreno/perfcntrs/a6xx: remove RBBM counters

Currently these are getting blocked by the kernel.  These counters don't
seem to be the most useful ones, and to use them we'd have to somehow
probe the kernel by submitting cmdstream to write the selector regs and
see if that triggers a GPU fault.  So let's just skip them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/perfctrs/a2xx: move CP to be first group
Rob Clark [Wed, 20 Nov 2019 00:35:58 +0000 (16:35 -0800)]
freedreno/perfctrs/a2xx: move CP to be first group

fdperf expects this, to find the ALWAYS_COUNT counter

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/perfcntrs: add accessor to get per-gen tables
Rob Clark [Tue, 19 Nov 2019 19:43:40 +0000 (11:43 -0800)]
freedreno/perfcntrs: add accessor to get per-gen tables

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/perfcntrs: move to shared location
Rob Clark [Tue, 19 Nov 2019 19:05:59 +0000 (11:05 -0800)]
freedreno/perfcntrs: move to shared location

This should eventually be useful for VK_KHR_performance_query as well.
And in the more near term, for fdperf.

Attempt to not break android build is best-effort and untested.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/perfcntrs: remove gallium dependencies
Rob Clark [Tue, 19 Nov 2019 18:54:04 +0000 (10:54 -0800)]
freedreno/perfcntrs: remove gallium dependencies

Prep work to move to a shared location.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/perfcntrs: small cleanup
Rob Clark [Tue, 19 Nov 2019 18:22:44 +0000 (10:22 -0800)]
freedreno/perfcntrs: small cleanup

When we had one gen supporting performance counters, it made sense to
have these builder macros in the .c file with the table.  But time has
come to de-duplicate.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agonir: fix deref offset builder
Dave Airlie [Mon, 18 Nov 2019 22:26:54 +0000 (08:26 +1000)]
nir: fix deref offset builder

Use the correct bit size

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agovtn/opencl: add clz support
Dave Airlie [Mon, 18 Nov 2019 07:04:35 +0000 (17:04 +1000)]
vtn/opencl: add clz support

This is needed for OpenCL

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonouveau: request ufind_msb64 lowering in the frontend.
Dave Airlie [Tue, 19 Nov 2019 23:23:46 +0000 (09:23 +1000)]
nouveau: request ufind_msb64 lowering in the frontend.

This passes the piglit CL builtin-ulong-clz-1.0.generated.cl
test.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agonir: add 64-bit ufind_msb lowering support. (v2)
Dave Airlie [Tue, 19 Nov 2019 23:23:14 +0000 (09:23 +1000)]
nir: add 64-bit ufind_msb lowering support. (v2)

This adds the option to lower 64-bit ufind_msb opcodes.
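
One way to picture such a lowering, written as plain C rather than NIR builder code: split the 64-bit value into two 32-bit halves and combine the 32-bit results. The function names are illustrative, and `ufind_msb32()` is a simple reference loop that returns -1 for zero, matching the usual findMSB convention.

```c
#include <assert.h>
#include <stdint.h>

/* Reference 32-bit version: index of the most significant set bit,
 * or -1 if no bit is set.
 */
static int
ufind_msb32(uint32_t v)
{
   int msb = -1;

   while (v) {
      v >>= 1;
      msb++;
   }

   return msb;
}

/* 64-bit version expressed only in terms of 32-bit operations. */
static int
ufind_msb64_lowered(uint64_t v)
{
   const uint32_t lo = (uint32_t)v;
   const uint32_t hi = (uint32_t)(v >> 32);
   const int hi_msb = ufind_msb32(hi);

   /* If the high half has a set bit it wins; otherwise fall back to the
    * low half, which also yields -1 when the whole value is zero.
    */
   return hi_msb >= 0 ? hi_msb + 32 : ufind_msb32(lo);
}
```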

v2: use split_x/y removes component loops (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv/nir/opencl: handle some multiply instructions.
Dave Airlie [Mon, 29 Apr 2019 20:57:11 +0000 (06:57 +1000)]
spirv/nir/opencl: handle some multiply instructions.

This adds support for some missing 24-bit and hi multiply
variants.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv: get the correct type for function returns.
Dave Airlie [Tue, 19 Nov 2019 22:33:10 +0000 (08:33 +1000)]
spirv: get the correct type for function returns.

This needs to be derived from the address format, not always 1/32.

Suggested by Jason

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv: don't store 0 to cs.ptr_size for non kernel stages.
Dave Airlie [Tue, 19 Nov 2019 22:29:30 +0000 (08:29 +1000)]
spirv: don't store 0 to cs.ptr_size for non kernel stages.

cs is a union so storing this there is wrong.
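
A simplified sketch of the problem, using a hypothetical `example_info` struct rather than the real shader info: members of a union share storage, so writing even a 0 through the `cs` member overwrites whatever another stage keeps in the same bytes. Guarding the store by stage avoids that.

```c
#include <assert.h>

/* Hypothetical per-stage info; only one union member is meaningful for
 * any given stage.
 */
struct example_info {
   int is_kernel;
   union {
      struct { unsigned ptr_size; } cs;
      struct { unsigned vertices_out; } gs;
   } u;
};

static void
set_ptr_size(struct example_info *info, unsigned ptr_size)
{
   /* Only touch the cs member when this stage actually owns it. */
   if (info->is_kernel)
      info->u.cs.ptr_size = ptr_size;
}
```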

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoutil: add missing R8G8B8A8_SRGB format to vk_format_map
Jonathan Marek [Thu, 21 Nov 2019 13:24:19 +0000 (08:24 -0500)]
util: add missing R8G8B8A8_SRGB format to vk_format_map

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agodocs: fix ascii html representation
Elie Tournier [Fri, 15 Nov 2019 13:41:25 +0000 (13:41 +0000)]
docs: fix ascii html representation

v2 (Eric): Use more readable ascii version

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoDocs: remove duplicate meson docs for windows
Elie Tournier [Fri, 15 Nov 2019 13:29:58 +0000 (13:29 +0000)]
Docs: remove duplicate meson docs for windows

This block is duplicated; we already have the Windows instructions above.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoci: Move freedreno's parallelism to the runner instead of gitlab-ci jobs.
Eric Anholt [Thu, 21 Nov 2019 13:12:58 +0000 (05:12 -0800)]
ci: Move freedreno's parallelism to the runner instead of gitlab-ci jobs.

I set the runners to concurrency=1, so they serve only one gitlab-ci job
at a time.  Swap over to using the parallel runner now to keep the
runners busy, more efficiently than spawning many docker containers and
downloading artifacts multiple times, and producing easier-to-understand
results for browsing on the web.

This bumps the a306 runners to 4x parallel instead of 2x like before, but
cheza gles3 drops from 6 to 4.  Current rough timings of the jobs (if no
container download):

db410c-gles2: 5:00
a630-gles2: 1:30
a630-gles3: 6:00
a630-gles31: 5:30

a630-gles3 is a bit longer than I like, but it should come back down once
I can sort out the NIR algebraic rewinding.

5 years agoglsl: add missing initialization of the location path field
Iago Toral Quiroga [Thu, 21 Nov 2019 09:05:49 +0000 (10:05 +0100)]
glsl: add missing initialization of the location path field

This was apparently missed in 67b32190f3c95, which added support
for ARB_shading_language_include to #line, including the 'path'
field for the location.

Fixes crashes in CTS with all drivers as they attempt to access
an uninitialized path string during parsing.

Fixes: 67b32190f3c95 ("glsl: add ARB_shading_language_include support to #line")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
5 years agodocs: update features.txt for RADV
Rhys Perry [Wed, 20 Nov 2019 15:08:30 +0000 (15:08 +0000)]
docs: update features.txt for RADV

[skip ci]

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agogitlab-ci: Directly use host-mapped directory for ccache
Michel Dänzer [Wed, 20 Nov 2019 08:11:35 +0000 (09:11 +0100)]
gitlab-ci: Directly use host-mapped directory for ccache

Use hardcoded /cache/mesa/ccache for the cache, so it will be shared by
all jobs of all Mesa projects running on the same runner host. This
should increase the hit rate and decrease the worst case storage used.

Further benefits of directly using a host-mapped directory:

* Saves up to ~1 minute per job for restoring and saving the cache
  contents via the GitLab CI cache mechanism
* Cache contents generated by failed jobs are no longer lost
* Jobs running in parallel on the same runner host can get hits from
  each other

Also enable compression, so the default maximum cache size of 5G might
be sufficient.

v2:
* Move CCACHE_DIR variable to the .build-linux template

Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net> # v1
5 years agogitlab-ci: remove now useless meson-swr-glvnd build job
Samuel Pitoiset [Tue, 19 Nov 2019 13:39:31 +0000 (14:39 +0100)]
gitlab-ci: remove now useless meson-swr-glvnd build job

All things are already part of meson-main.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years agogitlab-ci: build GLVND in meson-clang
Samuel Pitoiset [Tue, 19 Nov 2019 13:37:32 +0000 (14:37 +0100)]
gitlab-ci: build GLVND in meson-clang

Building GLVND in meson-main doesn't work because this disables
libEGL and it's needed for running shader-db.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years agogitlab-ci: build swr in meson-main
Samuel Pitoiset [Tue, 19 Nov 2019 13:36:02 +0000 (14:36 +0100)]
gitlab-ci: build swr in meson-main

Now that debugoptimized isn't set and that all test jobs depend on
meson-testing, enabling swr shouldn't slow down the CI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years agogitlab-ci: do not build with debugoptimized for meson-main
Samuel Pitoiset [Tue, 19 Nov 2019 11:25:36 +0000 (12:25 +0100)]
gitlab-ci: do not build with debugoptimized for meson-main

This should reduce compile time because optimizations are costly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years agogitlab-ci: add a job that only build things needed for testing
Samuel Pitoiset [Tue, 19 Nov 2019 11:23:41 +0000 (12:23 +0100)]
gitlab-ci: add a job that only build things needed for testing

For turnip and RADV testing, we will need a debugoptimized build
without UBSAN. This introduces meson-testing which builds only the
things that are needed by the test stage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years agogitlab-ci: fix ldd check for Vulkan drivers
Samuel Pitoiset [Thu, 14 Nov 2019 13:00:46 +0000 (14:00 +0100)]
gitlab-ci: fix ldd check for Vulkan drivers

The 'dri' directory isn't created when building Vulkan drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years agogitlab-ci: move building piglit into a separate script
Samuel Pitoiset [Fri, 15 Nov 2019 11:02:08 +0000 (12:02 +0100)]
gitlab-ci: move building piglit into a separate script

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years agopipe-loader: check that the pointer to driconf_xml isn't NULL
Samuel Pitoiset [Wed, 20 Nov 2019 08:14:17 +0000 (09:14 +0100)]
pipe-loader: check that the pointer to driconf_xml isn't NULL

This happens when mesa is built with only swrast. The default
driver being kmsro and the default driconf file being v3d,
it's NULL and then strdup crashes.

This fixes a crash with piglit spec/egl_mesa_query_driver/conformance.
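
The shape of the guard, sketched with a hypothetical helper (`dup_driconf_xml()` is not the pipe-loader function, and the pointer-to-pointer layout is an assumption of this sketch). Both the pointer to the XML string and the string itself are checked before copying; `malloc` plus `memcpy` stands in for `strdup` to keep the example within standard C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of *xml_ptr, or NULL when there is nothing to copy
 * or the allocation fails.  The caller frees the result.
 */
static char *
dup_driconf_xml(const char *const *xml_ptr)
{
   if (!xml_ptr || !*xml_ptr)
      return NULL;

   const size_t len = strlen(*xml_ptr) + 1;
   char *copy = malloc(len);

   if (copy)
      memcpy(copy, *xml_ptr, len);

   return copy;
}
```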

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agopanfrost: Add the lod_bias field
Alyssa Rosenzweig [Wed, 20 Nov 2019 14:26:48 +0000 (09:26 -0500)]
panfrost: Add the lod_bias field

Enough trial and error ... just think even *more* Midgard about where
this field might be!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agocompiler: move build definition of pp_standalone_scaffolding.c
Timothy Arceri [Thu, 21 Nov 2019 00:18:54 +0000 (11:18 +1100)]
compiler: move build definition of pp_standalone_scaffolding.c

This should fix android build issues while still allowing scons to
build the standalone compiler.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
5 years agonir/validate: validate num_components on registers and intrinsics
Karol Herbst [Fri, 15 Nov 2019 11:44:58 +0000 (12:44 +0100)]
nir/validate: validate num_components on registers and intrinsics

Also make 8 and 16 components invalid.  We will enable them again later
when we actually support them.

v2: fix validation of nir_intrinsic_instr::num_components
    correct validation of instr->num_components

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoRevert "st/mesa: keep serialized NIR instead of nir_shader in st_program"
Mark Janes [Wed, 20 Nov 2019 22:05:50 +0000 (14:05 -0800)]
Revert "st/mesa: keep serialized NIR instead of nir_shader in st_program"

This reverts commit db0c89d4bffa01ab15dfa819dbb518739131e1a9.

Gitlab: mesa/mesa#2128
Acked-by: Marek Olšák <maraeo@gmail.com>
5 years agoRevert "st/mesa: call nir_serialize only once per shader"
Mark Janes [Wed, 20 Nov 2019 22:05:41 +0000 (14:05 -0800)]
Revert "st/mesa: call nir_serialize only once per shader"

This reverts commit 3a8d6868897c7dfe72bac09c1eddd551144ca751.

Acked-by: Marek Olšák <maraeo@gmail.com>
5 years agolima/ppir: add lod-bias support
Arno Messiaen [Sat, 2 Nov 2019 21:09:21 +0000 (22:09 +0100)]
lima/ppir: add lod-bias support

Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
5 years agoRevert "i965/fs: Merge CMP and SEL into CSEL on Gen8+"
Jason Ekstrand [Fri, 1 Nov 2019 19:22:35 +0000 (14:22 -0500)]
Revert "i965/fs: Merge CMP and SEL into CSEL on Gen8+"

This reverts commit 52c7df1643ec9af119fd66f916f7fbdbcc798d2d.  The pass,
while clearly useful for some shaders, has at least three bugs that I
was able to find fairly quickly:

 1. It doesn't work for type-converting MOVs because f > 0 is not the
    same as f2i(f) > 0

 2. CSEL is a 3src instruction and only supports one source type; it
    doesn't take this into account and tries to create instructions
    which do a F compare and a D select.  This is especially nasty to
    debug because you don't see that in the dumped assembly because we
    don't properly assert that types are the same in codegen.

 3. While you can handle 2, in theory, by reinterpreting types, you
    can't do that in the presence of source modifiers.  This pass
    doesn't even attempt to detect that.
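
The first item in the list above can be reproduced in plain C, with a cast standing in for f2i (both truncate toward zero). The helper names are made up for this sketch. Any value strictly between 0 and 1 compares greater than zero as a float but not after conversion.

```c
#include <assert.h>
#include <stdbool.h>

/* Compare the original float against zero. */
static bool
cmp_before_convert(float f)
{
   return f > 0.0f;
}

/* Compare the converted integer against zero instead. */
static bool
cmp_after_convert(float f)
{
   return (int)f > 0;
}
```

For `0.5f` the first helper returns true and the second returns false, so the comparison cannot be moved across a type-converting MOV.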

Those are just the ones I found with the one almost trivial shader I was
debugging.  There are very likely more.  The best thing to do for now
is just shut it off until someone has the time to figure out how to do
this properly and write tests to ensure it's correct.

Fixes: 3cb085e6d61a "i965/fs: Merge CMP and SEL into CSEL on Gen8+"
Reviewed-by: Brian Paul <brianp@vmware.com>
5 years ago radv: Enable Subgroup Arithmetic and Clustered for SI
Daniel Schürmann [Wed, 20 Nov 2019 11:41:19 +0000 (12:41 +0100)]
radv: Enable Subgroup Arithmetic and Clustered for SI

This patch also allows enabling VK_AMD_shader_ballot on SI.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago amd/llvm: Add Subgroup Scan functions for SI
Daniel Schürmann [Wed, 20 Nov 2019 11:40:07 +0000 (12:40 +0100)]
amd/llvm: Add Subgroup Scan functions for SI

The idea of this implementation is taken from the ROCm Device Libs:
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago lima/streamparser: Add findings introduced with gl_PointSize
Andreas Baierl [Mon, 18 Nov 2019 09:03:32 +0000 (10:03 +0100)]
lima/streamparser: Add findings introduced with gl_PointSize

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
5 years ago lima/streamparser: Fix typo in vs semaphore parser
Andreas Baierl [Mon, 18 Nov 2019 08:46:21 +0000 (09:46 +0100)]
lima/streamparser: Fix typo in vs semaphore parser

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
5 years ago meson: Fix linkage of libgallium_nine with libgalliumvl
Yevhenii Kolesnikov [Thu, 7 Nov 2019 15:22:26 +0000 (17:22 +0200)]
meson: Fix linkage of libgallium_nine with libgalliumvl

Do not link libgallium_nine with libgalliumvl_stub if it's already
linked with libgalliumvl. Linking with the stub as well leads to
"duplicate symbol" errors.

Fixes: 6b4c7047d57178d3362a710ad503057c6a582ca3
       ("meson: build gallium nine state_tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2040
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago docs/release-calendar: Update for extended 19.3 rc period
Dylan Baker [Wed, 20 Nov 2019 17:57:05 +0000 (09:57 -0800)]
docs/release-calendar: Update for extended 19.3 rc period

5 years ago docs: update calendar, add news item and link release notes for 19.2.5
Dylan Baker [Wed, 20 Nov 2019 17:22:29 +0000 (09:22 -0800)]
docs: update calendar, add news item and link release notes for 19.2.5

5 years ago docs/relnotes/19.2.5: Add SHA256 sum
Dylan Baker [Wed, 20 Nov 2019 17:18:11 +0000 (09:18 -0800)]
docs/relnotes/19.2.5: Add SHA256 sum

5 years ago docs: Add relnotes for 19.2.5
Dylan Baker [Wed, 20 Nov 2019 16:54:10 +0000 (08:54 -0800)]
docs: Add relnotes for 19.2.5

5 years ago nir/large_constants: use nir_index_vars and nir_variable::index
Rhys Perry [Fri, 15 Nov 2019 16:13:20 +0000 (16:13 +0000)]
nir/large_constants: use nir_index_vars and nir_variable::index

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years ago nir: add nir_variable::index and nir_index_vars
Rhys Perry [Fri, 15 Nov 2019 15:15:14 +0000 (15:15 +0000)]
nir: add nir_variable::index and nir_index_vars

This will be useful as a deterministic identifier/index for the variable.

v2: fix comment style

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)
5 years ago nir: make nir_variable::{num_members,num_state_slots} a uint16_t
Rhys Perry [Fri, 15 Nov 2019 14:40:19 +0000 (14:40 +0000)]
nir: make nir_variable::{num_members,num_state_slots} a uint16_t

Doesn't shrink it (at least, on x86-64) and leaves space for more members.
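
A rough C sketch of why narrowing the counters need not shrink the struct.
The two structs below are hypothetical stand-ins, not the real nir_variable
layout; they only assume that the counters are followed by a
pointer-aligned member:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: two full-width counters, then a pointer. */
struct var_before {
    unsigned num_state_slots;
    unsigned num_members;
    void *state_slots;
};

/* Hypothetical "after" layout: the counters are narrowed to 16 bits.
 * The compiler pads after them so the pointer stays naturally aligned,
 * and that padding is the space left for future small members. */
struct var_after {
    uint16_t num_state_slots;
    uint16_t num_members;
    void *state_slots;
};

/* Narrowing the counters must never make the struct larger. */
static_assert(sizeof(struct var_after) <= sizeof(struct var_before),
              "narrowed struct must not grow");

/* Bytes of padding between the two counters and the pointer. */
static size_t padding_after_counters(void)
{
    return offsetof(struct var_after, state_slots) - 2 * sizeof(uint16_t);
}
```

With 8-byte pointers the pointer member still starts at offset 8, so the
overall size is unchanged while the bytes after the two counters become
available for new fields.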

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years ago docs: add missing new features for RADV
Samuel Pitoiset [Wed, 20 Nov 2019 14:54:43 +0000 (15:54 +0100)]
docs: add missing new features for RADV

[skip ci]

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago freedreno/ir3: enable half precision for pre-fs texture fetch
Hyunjun Ko [Thu, 24 Oct 2019 05:30:58 +0000 (05:30 +0000)]
freedreno/ir3: enable half precision for pre-fs texture fetch

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: fixup when changing to mad.f16
Hyunjun Ko [Fri, 30 Aug 2019 08:29:10 +0000 (08:29 +0000)]
freedreno/ir3: fixup when changing to mad.f16

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: fix printing output registers of FS.
Hyunjun Ko [Fri, 21 Jun 2019 03:18:33 +0000 (03:18 +0000)]
freedreno/ir3: fix printing output registers of FS.

Fixes: cea39af2fbf1 ("freedreno/ir3: Generalize ir3_shader_disasm()")
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: Enabling lowering 16-bit flrp
Neil Roberts [Thu, 6 Jun 2019 14:29:35 +0000 (16:29 +0200)]
freedreno/ir3: Enabling lowering 16-bit flrp

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno: support 16b for the sampler opcode
Hyunjun Ko [Wed, 23 Oct 2019 09:08:57 +0000 (09:08 +0000)]
freedreno: support 16b for the sampler opcode

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: Implement f2b16 and i2b16
Neil Roberts [Wed, 27 Feb 2019 10:58:18 +0000 (11:58 +0100)]
freedreno/ir3: Implement f2b16 and i2b16

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: Add implementation of nir_op_b16csel
Neil Roberts [Wed, 30 Jan 2019 15:41:46 +0000 (16:41 +0100)]
freedreno/ir3: Add implementation of nir_op_b16csel

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: Support 16-bit comparison instructions
Neil Roberts [Wed, 30 Jan 2019 15:33:05 +0000 (16:33 +0100)]
freedreno/ir3: Support 16-bit comparison instructions

v2. [Hyunjun Ko (zzoon@igalia.com)]
Avoid using too much open code like "instr->regs[n]->flags |= FOO"

v3. [Hyunjun Ko (zzoon@igalia.com)]
Remove redundant code for both 16b and 32b operations.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: cleanup by removing repeated code
Hyunjun Ko [Fri, 27 Sep 2019 05:41:02 +0000 (05:41 +0000)]
freedreno/ir3: cleanup by removing repeated code

Prep-work for the corresponding patch.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops
Neil Roberts [Wed, 30 Jan 2019 15:31:40 +0000 (16:31 +0100)]
nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago nir: Add a 8-bit bool type
Neil Roberts [Thu, 31 Jan 2019 15:05:44 +0000 (16:05 +0100)]
nir: Add a 8-bit bool type

Adds nir_type_bool8 as well as 8-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago nir: Add a 16-bit bool type
Neil Roberts [Wed, 30 Jan 2019 10:02:39 +0000 (11:02 +0100)]
nir: Add a 16-bit bool type

Adds nir_type_bool16 as well as 16-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago nir/opcodes: Add a helper function to generate reduce opcodes
Neil Roberts [Wed, 30 Jan 2019 09:58:48 +0000 (10:58 +0100)]
nir/opcodes: Add a helper function to generate reduce opcodes

Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit
versions of the reduce operation. This reduces the code duplication a
bit and will make it easier to later add 16-bit versions as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago nir/opcodes: Add a helper function to generate the comparison binops
Neil Roberts [Wed, 30 Jan 2019 09:50:09 +0000 (10:50 +0100)]
nir/opcodes: Add a helper function to generate the comparison binops

Adds binop_compare_all_sizes which generates both 1-bit and 32-bit
versions of the comparison operation. This reduces the code
duplication a bit and will make it easier to later add 16-bit versions
as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago radv: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7
Samuel Pitoiset [Wed, 20 Nov 2019 10:09:10 +0000 (11:09 +0100)]
radv: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7

Most of the dEQP-VK.subgroups tests are skipped because 16-bit floats
aren't supported, but the others pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago v3d: adds an extra MOV for any sig.ld*
Alejandro Piñeiro [Tue, 19 Nov 2019 10:13:15 +0000 (11:13 +0100)]
v3d: adds an extra MOV for any sig.ld*

Specifically when we are in non-uniform control flow, as we would need
to set the condition for the last instruction. If (for example) an
image atomic load stores its value directly in a NIR register,
last_inst would be a nop, and setting the condition on it would fail.

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test

Fixes: 6281f26f064ada ("v3d: Add support for shader_image_load_store.")
v2: (Changes suggested by Eric Anholt)
   * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of
     them have the same restriction.
   * Update comment explaining why we add a MOV in that case
   * Tweak commit message.

v3:
   * Drop extra set of parens (Eric)
   * Add missing ld signal to is_ld_signal to fix shader-db regression.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago v3d: Fix predication with atomic image operations
Jose Maria Casanova Crespo [Fri, 15 Nov 2019 13:46:30 +0000 (14:46 +0100)]
v3d: Fix predication with atomic image operations

Fixes dEQP test:
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test

Fixes: 6281f26f064ada ("v3d: Add support for shader_image_load_store.")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>