mesa.git
4 years ago freedreno: Use common guardband helper
Connor Abbott [Fri, 17 Jul 2020 13:18:33 +0000 (15:18 +0200)]
freedreno: Use common guardband helper

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5950>

4 years ago tu: Use common guardband helper
Connor Abbott [Fri, 17 Jul 2020 13:18:15 +0000 (15:18 +0200)]
tu: Use common guardband helper

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5950>

4 years ago freedreno: Add a helper for computing guardband sizes
Connor Abbott [Fri, 17 Jul 2020 13:15:42 +0000 (15:15 +0200)]
freedreno: Add a helper for computing guardband sizes

This should be much better than the previous method that was more
guesswork-based than anything else. It returns a value within 1 of the
blob for every input value I've tested, and it seems like it returns
slightly better (but still legal) answers when it differs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5950>

4 years ago panfrost: Remove unused batch_fence->ctx
Alyssa Rosenzweig [Mon, 20 Jul 2020 17:55:22 +0000 (13:55 -0400)]
panfrost: Remove unused batch_fence->ctx

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>

4 years ago panfrost: Remove unused batch_fence->signaled
Alyssa Rosenzweig [Mon, 20 Jul 2020 17:53:42 +0000 (13:53 -0400)]
panfrost: Remove unused batch_fence->signaled

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>

4 years ago panfrost: Allocate syncobjs in panfrost_flush
Alyssa Rosenzweig [Mon, 20 Jul 2020 17:34:42 +0000 (13:34 -0400)]
panfrost: Allocate syncobjs in panfrost_flush

For implementing panfrost_flush, it suffices to wait on only a single
syncobj, not an entire array of them. This lets us wait on it directly,
without coercing to/from syncfds in the middle (although some complexity
may be added later to support Android winsys).

Further, we should let the fence own the syncobj, tying together the
lifetimes and thus removing the connection between syncobjs and
batch_fence.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>

4 years ago panfrost: Skip specifying in_syncs
Alyssa Rosenzweig [Mon, 20 Jul 2020 15:55:25 +0000 (11:55 -0400)]
panfrost: Skip specifying in_syncs

With the current kernel UABI, there is no benefit to explicitly
specifying dependencies, since the kernel by design adds implicit
dependencies to any referenced BOs. This is something we'd like to
address in the future, but efficient handling with future kernels will
require a tweaked design in userspace as well. So let's do the obvious
thing now, and extend later.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>

4 years ago panfrost: Remove wait parameter to flush_all_batches
Alyssa Rosenzweig [Mon, 20 Jul 2020 15:48:16 +0000 (11:48 -0400)]
panfrost: Remove wait parameter to flush_all_batches

It is always false now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>

4 years ago panfrost: Avoid wait=true flushing all batches
Alyssa Rosenzweig [Mon, 20 Jul 2020 15:44:10 +0000 (11:44 -0400)]
panfrost: Avoid wait=true flushing all batches

What is intended is to flush the batches and wait on a particular BO at
a later time. Explicitly forcing a wait immediately is redundant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>

4 years ago aco: implement b2i8/b2i16
Rhys Perry [Mon, 20 Jul 2020 18:21:20 +0000 (19:21 +0100)]
aco: implement b2i8/b2i16

Fixes lots of tests under dEQP-VK.spirv_assembly.type.*

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5993>

4 years ago nv50/ir: initialize persampleInvocation to false
Karol Herbst [Tue, 21 Jul 2020 00:30:35 +0000 (02:30 +0200)]
nv50/ir: initialize persampleInvocation to false

Fixes: random KHR-GL45.sample_variables.mask.* fails
Fixes: 66ed9792edb702 ("nv50: Clear nv50_ir_prog_info of dead and codegen specific variables")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6001>

4 years ago nv50/ir/tgsi: silence warning about unhandled GS_INPUT_PRIM property
Karol Herbst [Mon, 20 Jul 2020 21:46:39 +0000 (23:46 +0200)]
nv50/ir/tgsi: silence warning about unhandled GS_INPUT_PRIM property

Fixes: 66ed9792edb702 ("nv50: Clear nv50_ir_prog_info of dead and codegen specific variables")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6001>

4 years ago radv: disable CPU caching for the upload BO to reduce fetch latency
Samuel Pitoiset [Mon, 20 Jul 2020 11:47:19 +0000 (13:47 +0200)]
radv: disable CPU caching for the upload BO to reduce fetch latency

AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978>

4 years ago radv: do not perform read-modify-write with the upload BO
Samuel Pitoiset [Mon, 20 Jul 2020 11:43:40 +0000 (13:43 +0200)]
radv: do not perform read-modify-write with the upload BO

To disable CPU caching.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978>

4 years ago radv: replace discard with demote for Quantic Dream games
Rhys Perry [Mon, 20 Jul 2020 15:54:22 +0000 (16:54 +0100)]
radv: replace discard with demote for Quantic Dream games

Detroit: Become Human uses dFdx/dFdy immediately after a quad-divergent
discard, which can cause the image to become white.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3212
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991>

4 years ago aco: always set FI on GFX10
Rhys Perry [Mon, 20 Jul 2020 16:19:40 +0000 (17:19 +0100)]
aco: always set FI on GFX10

bounds_ctrl is set to true by default which works around some game bugs,
but that isn't enough on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991>

4 years ago ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS.
Eric Anholt [Mon, 20 Jul 2020 17:46:51 +0000 (10:46 -0700)]
ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS.

We don't want these files shared between builds (it'll get blown away by
the next rsync), and NFS will just increase our latency for hitting the
cache.

Drops a630 gles31 run from 11-17 minutes to 5.5.  Maximum cache size on a
run I've seen is 153M, which it seems we can easily spare.

Fixes: f97acb4bb4b1 ("freedreno/ir3: disk-cache support")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5998>

4 years ago gitlab-ci: Fix needs: of the arm64 LAVA test jobs
Tomeu Vizoso [Mon, 13 Jul 2020 15:20:28 +0000 (17:20 +0200)]
gitlab-ci: Fix needs: of the arm64 LAVA test jobs

They were still depending on arm_build, but the build of kernel and
rootfs has been moved to a separate job.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>

4 years ago gitlab-ci: Upload tracie artifacts to MinIO
Tomeu Vizoso [Thu, 9 Jul 2020 20:29:39 +0000 (22:29 +0200)]
gitlab-ci: Upload tracie artifacts to MinIO

Upload failed images and the results.yml file to MinIO, to facilitate
debugging.

Also, fix version checking when git is installed as Mesa is going to
output a different renderer string if git is installed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>

4 years ago gitlab-ci: Download traces from MinIO
Tomeu Vizoso [Thu, 9 Jul 2020 10:42:02 +0000 (12:42 +0200)]
gitlab-ci: Download traces from MinIO

Downloading the traces directly from git causes very high egress from
GCE, which is expensive.

So we can expand trace testing further, we are going to keep a cache in
freedesktop.org's MinIO instance. This commit implements downloading
from it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>

4 years ago gitlab-ci: Replay traces on lava devices
Rohan Garg [Tue, 28 Jan 2020 14:19:53 +0000 (15:19 +0100)]
gitlab-ci: Replay traces on lava devices

Submit lava jobs to replay traces on Veyron (Mali T760) and Kevin (Mali
T860) boards.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>

4 years ago iris: Fix CCS check in iris_texture_subdata().
Kenneth Graunke [Tue, 21 Jul 2020 00:08:46 +0000 (17:08 -0700)]
iris: Fix CCS check in iris_texture_subdata().

The intention here was to check "Would the GPU be able to compress
this if we used the PBO-based texture upload path?"  Prior to Gen12,
that meant checking for CCS_E.  On Gen12, there are a lot more types
of compression, and basic CCS_E was replaced by GEN12_CCS_E, making
this check simply not work, so we'd take the CPU path instead.

Instead, check if it has CCS, and isn't the basic "fast clear" CCS_D.

Fixes: 39f06e28485 ("iris: Implement pipe->texture_subdata directly")
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6005>
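
The revised check described in this commit can be sketched as follows. This is an illustrative Python model, not the iris C code: the `AuxUsage` enum, its `NONE` and `MCS` members, the `CCS_USAGES` set, and both function names are assumptions made for the example. Only CCS_D, CCS_E and GEN12_CCS_E are named in the commit message.

```python
from enum import Enum, auto


class AuxUsage(Enum):
    NONE = auto()         # hypothetical: no auxiliary surface
    MCS = auto()          # hypothetical: an aux type that is not CCS
    CCS_D = auto()        # basic "fast clear" CCS
    CCS_E = auto()        # pre-Gen12 compression
    GEN12_CCS_E = auto()  # Gen12 replacement for CCS_E


# Assumed set of CCS-based usages; per the commit message Gen12 has more
# compression types than the ones listed here.
CCS_USAGES = {AuxUsage.CCS_D, AuxUsage.CCS_E, AuxUsage.GEN12_CCS_E}


def old_check(aux):
    """Pre-fix behaviour as described: only plain CCS_E counts."""
    return aux is AuxUsage.CCS_E


def new_check(aux):
    """Post-fix behaviour as described: any CCS except fast-clear-only CCS_D."""
    return aux in CCS_USAGES and aux is not AuxUsage.CCS_D
```

With this model, `GEN12_CCS_E` fails the old check but passes the new one, which matches the symptom the commit describes (the CPU path being taken on Gen12).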

4 years ago nir/lower_int64: lower 64-bit amul
Rhys Perry [Wed, 1 Jul 2020 15:14:16 +0000 (16:14 +0100)]
nir/lower_int64: lower 64-bit amul

Fixes an issue with Renderdoc's shader debugging with ACO.

If nir_opt_algebraic isn't called in-between nir_lower_explicit_io and
nir_lower_int64, we can end up with 64-bit multiplications.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6320e37d4be ('nir: add amul instruction')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5709>
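
As background, lowering a 64-bit multiplication means expressing it with 32-bit operations. The sketch below shows the general idea in Python; it is not the nir_lower_int64 implementation, and the function name `mul64_from_32` is made up for this example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul64_from_32(a, b):
    """Low 64 bits of a * b, built only from 32-bit halves."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # 32x32 -> 64 bit product of the low halves.
    lo_full = a_lo * b_lo
    lo = lo_full & MASK32

    # The cross terms only affect the high 32 bits; a_hi * b_hi would land
    # entirely above bit 63 and is dropped.
    hi = ((lo_full >> 32) + a_lo * b_hi + a_hi * b_lo) & MASK32

    return (hi << 32) | lo
```

For any two 64-bit inputs the result equals `(a * b) & MASK64`.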

4 years ago anv: Advertise support for VK_EXT_shader_atomic_float
Jason Ekstrand [Tue, 19 May 2020 15:26:58 +0000 (10:26 -0500)]
anv: Advertise support for VK_EXT_shader_atomic_float

We already have all of the shader code for load/store/exchange.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>

4 years ago intel/fs: Use the correct logical op for global float atomics
Jason Ekstrand [Tue, 14 Jul 2020 19:40:35 +0000 (14:40 -0500)]
intel/fs: Use the correct logical op for global float atomics

Fixes: e644ed468f98 "intel/fs: Implement nir_intrinsic_global_atomic_*"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>

4 years ago spirv: Add support for SPV_EXT_shader_atomic_float
Jason Ekstrand [Tue, 19 May 2020 15:19:55 +0000 (10:19 -0500)]
spirv: Add support for SPV_EXT_shader_atomic_float

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>

4 years ago spirv: Update headers and grammar json
Jason Ekstrand [Mon, 20 Jul 2020 14:53:33 +0000 (09:53 -0500)]
spirv: Update headers and grammar json

This pulls in commit 63cb1fc131573fa from KhronosGroup/SPIRV-Headers

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>

4 years ago egl: inline _EGLAPI into _EGLDriver
Eric Engestrom [Mon, 20 Jul 2020 11:38:24 +0000 (13:38 +0200)]
egl: inline _EGLAPI into _EGLDriver

_EGLDriver was an empty wrapper around _EGLAPI, so let's only keep one
of them. "driver" represents better what's being accessed, so that's the
one we're keeping.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5987>

4 years ago radeonsi: Inhibit clock-gating for perf counters.
Bas Nieuwenhuizen [Sun, 19 Jul 2020 23:43:18 +0000 (01:43 +0200)]
radeonsi: Inhibit clock-gating for perf counters.

Otherwise most counters return 0. Should be much more user friendly
than having to totally disable clock-gating on the kernel cmdline.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972>

4 years ago amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10
Bas Nieuwenhuizen [Sun, 19 Jul 2020 23:37:11 +0000 (01:37 +0200)]
amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972>

4 years ago anv: Advertise VK_EXT_image_robustness
Jason Ekstrand [Tue, 9 Jun 2020 15:59:16 +0000 (10:59 -0500)]
anv: Advertise VK_EXT_image_robustness

We already support a superset of VK_EXT_image_robustness via
VK_EXT_robustness2.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5985>

4 years ago freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test.
Eric Anholt [Mon, 20 Jul 2020 17:06:02 +0000 (10:06 -0700)]
freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test.

It triggers the disk cache for me, and asserts about not getting the
build id right.

Fixes: f97acb4bb4b1 ("freedreno/ir3: disk-cache support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5989>

4 years ago radv: advertise VK_EXT_image_robustness
Samuel Pitoiset [Mon, 20 Jul 2020 12:54:58 +0000 (14:54 +0200)]
radv: advertise VK_EXT_image_robustness

All new dEQP-VK.robustness.image_robustness.* pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5979>

4 years ago ci: bare-metal: use nginx to get results from DUT
Christian Gmeiner [Wed, 10 Jun 2020 12:44:17 +0000 (14:44 +0200)]
ci: bare-metal: use nginx to get results from DUT

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2655
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5661>

4 years ago mesa: change error code of *TextureSubImage* for incorrect target
Yevhenii Kolesnikov [Thu, 16 Jul 2020 12:13:08 +0000 (15:13 +0300)]
mesa: change error code of *TextureSubImage* for incorrect target

According to the "Errors" list of the OpenGL 4.6 spec, section 8.6
"Alternate Texture Image Specification Commands":

An INVALID_OPERATION error is generated by *TextureSubImage* if the
effective target of texture does not match the command, as shown in table 8.15.

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5934>

4 years ago freedreno/ir3: Fix disasm of register offsets in ldp/stp.
Eric Anholt [Wed, 8 Jul 2020 20:38:18 +0000 (13:38 -0700)]
freedreno/ir3: Fix disasm of register offsets in ldp/stp.

I had a stp testcase that was getting its offset wrong, and by twiddling
bits and feeding it to qc disasm, I found that the comment was sort of
right: some of the cat6a bits implicated in the old comment do get used, as
the high bits of the cat6c offset.  Reallocating those bits also fixes how
we were getting r960.y for r0.y.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>

4 years ago freedreno/ir3: Refactor cat6 general dst printing.
Eric Anholt [Wed, 8 Jul 2020 23:51:16 +0000 (16:51 -0700)]
freedreno/ir3: Refactor cat6 general dst printing.

We didn't need the extra branch and temp, we can move it inside of the dst
handling by just duplicating the print of the dst reg.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>

4 years ago freedreno/ir3: Add a bunch more tests for cat6 opcodes.
Eric Anholt [Wed, 8 Jul 2020 23:09:48 +0000 (16:09 -0700)]
freedreno/ir3: Add a bunch more tests for cat6 opcodes.

This started with making note of some ldp/stp instructions from the blob
and how we differ from them.  In the process of fixing it, I accidentally
modified behavior of other opcodes, and the other instructions listed will
keep us from doing that.  I also dropped an old stl test that it looks like I
took from after a shader 'end' instruction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>

4 years ago freedreno/ir3: Add a note about the instructions in the disasm test.
Eric Anholt [Wed, 8 Jul 2020 23:37:55 +0000 (16:37 -0700)]
freedreno/ir3: Add a note about the instructions in the disasm test.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>

4 years ago vulkan: Update Vulkan XML and headers to 1.2.148
Jason Ekstrand [Mon, 20 Jul 2020 14:50:21 +0000 (09:50 -0500)]
vulkan: Update Vulkan XML and headers to 1.2.148

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5983>

4 years ago ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env.
Eric Anholt [Fri, 26 Jun 2020 17:59:41 +0000 (10:59 -0700)]
ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env.

fd.o has retuned the x86 runners on packet for -j8.  Rather than having to
tweak our CI every time fd.o decides to rebalance job concurrency, respect
what the runner admin has chosen for their builds (this will also be
convenient for people with large local runners).

Reviewed-by: Michel Dänzer <michel@daenzer.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5669>
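
The behaviour described in this commit, taking the job count from the runner environment when it is present, can be sketched like this. The real CI scripts are not shown in the commit message; the function names, the use of `ninja`, and the fallback value of 4 are assumptions made for illustration.

```python
import os


def parallel_jobs(env=None, default=4):
    """Return the -j value: FDO_CI_CONCURRENT if set and valid, else a fallback.

    The fallback of 4 is a placeholder, not a value taken from Mesa's CI.
    """
    env = os.environ if env is None else env
    value = env.get("FDO_CI_CONCURRENT", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return default


def build_command(env=None):
    """Hypothetical build invocation that respects the runner's choice."""
    return ["ninja", f"-j{parallel_jobs(env)}"]
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment, for example `build_command({"FDO_CI_CONCURRENT": "8"})`.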

4 years ago nir/algebraic: fold some nested bcsel
Daniel Schürmann [Wed, 15 Jul 2020 17:31:22 +0000 (19:31 +0200)]
nir/algebraic: fold some nested bcsel

Totals from 14266 (10.62% of 134368) affected shaders (Polaris):
SGPRs: 761756 -> 762732 (+0.13%); split: -0.00%, +0.13%
VGPRs: 430392 -> 430924 (+0.12%); split: -0.05%, +0.17%
SpillSGPRs: 4652 -> 4628 (-0.52%); split: -0.60%, +0.09%
CodeSize: 30133000 -> 29949780 (-0.61%); split: -0.66%, +0.05%
MaxWaves: 102122 -> 102111 (-0.01%); split: +0.00%, -0.01%
Instrs: 5845085 -> 5841668 (-0.06%); split: -0.08%, +0.03%
Cycles: 69033140 -> 68889188 (-0.21%); split: -0.22%, +0.01%
VMEM: 8479021 -> 8474978 (-0.05%); split: +0.03%, -0.08%
SMEM: 831437 -> 830464 (-0.12%); split: +0.06%, -0.18%
VClause: 105411 -> 105410 (-0.00%); split: -0.01%, +0.01%
SClause: 327727 -> 327780 (+0.02%); split: -0.00%, +0.02%
Copies: 372704 -> 373306 (+0.16%); split: -0.16%, +0.32%
Branches: 112260 -> 112269 (+0.01%); split: -0.00%, +0.01%
PreSGPRs: 433308 -> 433631 (+0.07%); split: -0.01%, +0.09%
PreVGPRs: 397888 -> 397905 (+0.00%); split: -0.01%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
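
The commit message does not list the patterns it folds. As an illustration of what folding a nested bcsel can look like, the sketch below checks two identities exhaustively over small inputs. Both identities and the helper names are assumptions chosen for the example, not rules quoted from nir_opt_algebraic.py.

```python
from itertools import product


def bcsel(c, t, f):
    """Model of NIR's bcsel: pick t when c is true, otherwise f."""
    return t if c else f


def nested_bcsel_folds():
    """Check two candidate folds for every boolean pair and small values."""
    for a, b in product([False, True], repeat=2):
        for c, d in product(range(3), repeat=2):
            # bcsel(a, bcsel(b, c, d), d) == bcsel(a && b, c, d)
            if bcsel(a, bcsel(b, c, d), d) != bcsel(a and b, c, d):
                return False
            # bcsel(a, c, bcsel(b, c, d)) == bcsel(a || b, c, d)
            if bcsel(a, c, bcsel(b, c, d)) != bcsel(a or b, c, d):
                return False
    return True
```

Each fold replaces two selects with one select plus a cheap boolean operation.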

4 years ago nir/algebraic: propagate b2i out of ior/iand
Daniel Schürmann [Wed, 15 Jul 2020 17:30:34 +0000 (19:30 +0200)]
nir/algebraic: propagate b2i out of ior/iand

Totals from 761 (0.57% of 134368) affected shaders (Polaris):
SGPRs: 29496 -> 29488 (-0.03%)
SpillSGPRs: 41 -> 43 (+4.88%)
CodeSize: 1922036 -> 1882408 (-2.06%); split: -2.08%, +0.02%
Instrs: 366051 -> 360362 (-1.55%); split: -1.57%, +0.02%
Cycles: 7692516 -> 7661216 (-0.41%); split: -0.41%, +0.01%
VMEM: 365175 -> 365172 (-0.00%)
VClause: 15324 -> 15322 (-0.01%)
SClause: 9825 -> 9824 (-0.01%); split: -0.02%, +0.01%
Copies: 41216 -> 41294 (+0.19%); split: -0.01%, +0.20%
Branches: 7020 -> 7033 (+0.19%)
PreSGPRs: 22103 -> 22106 (+0.01%)
PreVGPRs: 26518 -> 26515 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
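
The identity behind this kind of rewrite can be checked exhaustively, since booleans have only two values. The sketch below models `b2i` as a 0/1 conversion; the exact rule syntax used in nir_opt_algebraic.py is not reproduced here, and the function names are illustrative.

```python
from itertools import product


def b2i(b):
    """Model of NIR's b2i: boolean to integer 0 or 1."""
    return 1 if b else 0


def b2i_propagation_holds():
    """ior/iand of two b2i results equals b2i of the boolean or/and."""
    for a, b in product([False, True], repeat=2):
        if (b2i(a) | b2i(b)) != b2i(a or b):
            return False
        if (b2i(a) & b2i(b)) != b2i(a and b):
            return False
    return True
```

After the rewrite only one conversion remains, and the logic operation runs on booleans instead of integers.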

4 years ago nir/algebraic: add distributive rules for ior/iand
Daniel Schürmann [Wed, 15 Jul 2020 17:23:54 +0000 (19:23 +0200)]
nir/algebraic: add distributive rules for ior/iand

Totals from 581 (0.43% of 134368) affected shaders (Polaris):
CodeSize: 1389560 -> 1386488 (-0.22%)
Instrs: 264488 -> 263984 (-0.19%)
Cycles: 1057952 -> 1055936 (-0.19%)
VMEM: 296016 -> 291613 (-1.49%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
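
The distributive laws for bitwise OR and AND can be verified by brute force over small bit widths. The sketch below checks both directions over all 4-bit values; which exact patterns the commit adds is an assumption, since the message only names the category.

```python
from itertools import product


def distributive_rules_hold(bits=4):
    """Check both distributive laws for every triple of `bits`-wide values."""
    for a, b, c in product(range(1 << bits), repeat=3):
        # (a & b) | (a & c) == a & (b | c): three ops become two
        if ((a & b) | (a & c)) != (a & (b | c)):
            return False
        # (a | b) & (a | c) == a | (b & c): three ops become two
        if ((a | b) & (a | c)) != (a | (b & c)):
            return False
    return True
```

Because the operations act on each bit independently, a check at a small width is representative of wider integers.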

4 years ago nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a)
Daniel Schürmann [Thu, 30 Apr 2020 09:58:08 +0000 (10:58 +0100)]
nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a)

Totals from affected shaders: (VEGA)
SGPRS: 13920 -> 13920 (0.00 %)
VGPRS: 10252 -> 10252 (0.00 %)
Spilled SGPRs: 62 -> 62 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 587648 -> 587224 (-0.07 %) bytes
LDS: 5 -> 5 (0.00 %) blocks
Max Waves: 1489 -> 1489 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
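
A quick numeric check of the pattern from the title, using Python floats (IEEE doubles) as a stand-in for shader floats. The helper names and the sample values are made up for this example.

```python
import math


def select_abs(a):
    """Source pattern: (a < 0.0) ? -a : a"""
    return -a if a < 0.0 else a


def agrees_with_fabs(values):
    """True if the select form and fabs compare equal on every value."""
    return all(select_abs(v) == math.fabs(v) for v in values)


SAMPLES = [-3.5, -1.0, 0.0, 1.0, 2.25, math.inf, -math.inf]
```

One detail the model exposes: for an input of -0.0 the select form returns -0.0 while `fabs` returns +0.0. The two compare equal but differ in sign bit. How NIR treats that case is not covered by the commit message.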

4 years ago nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x)
Daniel Schürmann [Wed, 29 Apr 2020 14:38:54 +0000 (15:38 +0100)]
nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x)

Totals from affected shaders: (VEGA)
SGPRS: 545712 -> 545712 (0.00 %)
VGPRS: 413092 -> 413116 (0.01 %)
Spilled SGPRs: 10616 -> 10616 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 37031684 -> 36984248 (-0.13 %) bytes
LDS: 427 -> 427 (0.00 %) blocks
Max Waves: 54350 -> 54340 (-0.02 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
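
The rewrite in the title trades a multiply for a negate. The sketch below checks, with Python floats standing in for shader floats, that both forms give the same result including for signed zeros, infinities and NaN. `same_float` and `rewrite_holds` are helper names invented for this example.

```python
import math


def bcsel(c, t, f):
    """Model of NIR's bcsel: pick t when c is true, otherwise f."""
    return t if c else f


def same_float(p, q):
    """Equality that treats NaN as equal to NaN and distinguishes +0.0/-0.0."""
    if math.isnan(p) or math.isnan(q):
        return math.isnan(p) and math.isnan(q)
    return p == q and math.copysign(1.0, p) == math.copysign(1.0, q)


def rewrite_holds(x, c):
    before = x * bcsel(c, -1.0, 1.0)   # fmul(x, bcsel(c, -1.0, 1.0))
    after = bcsel(c, -x, x)            # bcsel(c, -x, x)
    return same_float(before, after)


SAMPLES = [0.0, -0.0, 1.5, -2.0, math.inf, -math.inf, math.nan]
```

Multiplying by exactly 1.0 or -1.0 never rounds, which is why the two forms agree on every sample.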

4 years ago nir/algebraic: add some more unop + bcsel optimizations
Daniel Schürmann [Wed, 29 Apr 2020 17:50:27 +0000 (18:50 +0100)]
nir/algebraic: add some more unop + bcsel optimizations

Totals from affected shaders: (VEGA)
SGPRS: 284392 -> 284400 (0.00 %)
VGPRS: 261080 -> 261076 (-0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 24698596 -> 24277788 (-1.70 %) bytes
LDS: 196 -> 196 (0.00 %) blocks
Max Waves: 10101 -> 10105 (0.04 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>

4 years ago nir/algebraic: add optimizations for fsign/isign
Daniel Schürmann [Wed, 29 Apr 2020 16:56:05 +0000 (17:56 +0100)]
nir/algebraic: add optimizations for fsign/isign

This just reverts fsign/isign lowering.

Totals from affected shaders:
SGPRS: 257496 -> 256672 (-0.32 %)
VGPRS: 181800 -> 178864 (-1.61 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 11355852 -> 11141840 (-1.88 %) bytes
LDS: 3789 -> 3789 (0.00 %) blocks
Max Waves: 30453 -> 30951 (1.64 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>

4 years ago nir/algebraic: optimize iand/ior of (n)eq zero
Daniel Schürmann [Wed, 29 Apr 2020 16:49:45 +0000 (17:49 +0100)]
nir/algebraic: optimize iand/ior of (n)eq zero

Found in some Detroit: Become Human shaders.

Totals from affected shaders:
SGPRS: 700256 -> 700256 (0.00 %)
VGPRS: 507208 -> 507212 (0.00 %)
Spilled SGPRs: 142531 -> 142531 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 76404616 -> 76301768 (-0.13 %) bytes
LDS: 43 -> 43 (0.00 %) blocks
Max Waves: 21438 -> 21438 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
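
The commit message does not spell out the patterns. One plausible reading of "iand/ior of (n)eq zero" is that two comparisons against zero are merged into one comparison of the OR of both values. The sketch below verifies that reading by brute force; it is an assumption, not a quote of the actual rules.

```python
from itertools import product


def zero_compare_merge_holds(bits=4):
    """Check the merged-comparison identities for all `bits`-wide pairs."""
    for a, b in product(range(1 << bits), repeat=2):
        # iand(a == 0, b == 0)  <=>  (a | b) == 0
        if ((a == 0) and (b == 0)) != ((a | b) == 0):
            return False
        # ior(a != 0, b != 0)  <=>  (a | b) != 0
        if ((a != 0) or (b != 0)) != ((a | b) != 0):
            return False
    return True
```

Under this reading, two compares and a boolean operation become one bitwise OR and one compare.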

4 years ago nir: also move b2i in case of nir_move_copies
Daniel Schürmann [Wed, 29 Apr 2020 14:36:41 +0000 (15:36 +0100)]
nir: also move b2i in case of nir_move_copies

Booleans are often more efficient with register usage.
This also allows comparisons to be moved further.

Totals from affected shaders: (VEGA)
SGPRS: 451608 -> 450320 (-0.29 %)
VGPRS: 351448 -> 351256 (-0.05 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 1008 -> 1008 (0.00 %) dwords per thread
Code Size: 26555596 -> 26551080 (-0.02 %) bytes
LDS: 10323 -> 10323 (0.00 %) blocks
Max Waves: 42850 -> 42934 (0.20 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>

4 years ago nir/algebraic: optimize bcsel(a, 0, 1) to b2i
Daniel Schürmann [Tue, 28 Apr 2020 10:45:07 +0000 (11:45 +0100)]
nir/algebraic: optimize bcsel(a, 0, 1) to b2i

This avoids combination with other bcsel operations,
and as b2i is often a no-op (when used for iadd and such),
the resulting pattern is preferable.

Totals from affected shaders: (VEGA)
SGPRS: 598448 -> 598448 (0.00 %)
VGPRS: 457940 -> 457352 (-0.13 %)
Spilled SGPRs: 127154 -> 127154 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 64836352 -> 64802728 (-0.05 %) bytes
LDS: 781 -> 781 (0.00 %) blocks
Max Waves: 22931 -> 22931 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
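
Since `bcsel(a, 0, 1)` yields 0 when `a` is true, the equivalent `b2i` form has to convert the negated condition. The sketch below checks that, along with the unnegated sibling `bcsel(a, 1, 0)`. The helper names are illustrative and the exact rule text is not taken from the commit.

```python
def bcsel(c, t, f):
    """Model of NIR's bcsel: pick t when c is true, otherwise f."""
    return t if c else f


def b2i(b):
    """Model of NIR's b2i: boolean to integer 0 or 1."""
    return 1 if b else 0


def bcsel_to_b2i_holds():
    for a in (False, True):
        if bcsel(a, 0, 1) != b2i(not a):   # inverted select
            return False
        if bcsel(a, 1, 0) != b2i(a):       # direct select
            return False
    return True
```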

4 years ago pan/mdg: Use the blend RT for blend shader framebuffer fetches
Icecream95 [Sun, 19 Jul 2020 10:31:26 +0000 (22:31 +1200)]
pan/mdg: Use the blend RT for blend shader framebuffer fetches

Fixes piglit test fbo-drawbuffers-blend-add when fixed-function
blending is disabled in panfrost_get_blend_for_context.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>

4 years ago panfrost: 8x MRT support
Icecream95 [Tue, 14 Jul 2020 00:05:47 +0000 (12:05 +1200)]
panfrost: 8x MRT support

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>

4 years ago panfrost: Use more tilebuffer sizes
Icecream95 [Mon, 13 Jul 2020 23:50:10 +0000 (11:50 +1200)]
panfrost: Use more tilebuffer sizes

This will be needed for 8x MRT with 128-bit framebuffer formats.

Adds support for 256-bit, 1024-bit, and 2048-bit tilebuffer allocations,
depending on the amount of data required.

v2: Squash commits (Alyssa)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>

4 years ago panfrost: Fake RGTC support
Icecream95 [Mon, 13 Jul 2020 10:45:51 +0000 (22:45 +1200)]
panfrost: Fake RGTC support

For most GPUs RGTC is disabled, so it needs to be emulated, using the
fake_rgtc option of u_transfer_helper.

Passes the rgtc-teximage tests in piglit.

v2: Update docs/features.txt (Alyssa)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5975>

4 years ago spirv: don't split memory barriers
Rhys Perry [Fri, 17 Jul 2020 10:46:47 +0000 (11:46 +0100)]
spirv: don't split memory barriers

If the SPIR-V had a shared+image memory barrier, we would emit two NIR
barriers: a shared barrier and an image barrier.

Unlike a single barrier, two barriers allow transformations such as:

intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_shared () ()
intrinsic memory_barrier_image () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
->
intrinsic memory_barrier_shared () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_image () ()

This commit fixes two dEQP-VK.memory_model.* CTS tests with ACO.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5951>
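
Why split barriers permit the reordering shown in this commit can be illustrated with a deliberately simplified model. It assumes that a memory access may move across a barrier only when the barrier does not cover the access's memory mode, and that the two stores are otherwise independent. This is not NIR's actual scheduling logic, and `may_cross` and `can_swap` are names invented for the sketch.

```python
def may_cross(access_mode, barrier_modes):
    """An access may move past a barrier that does not cover its mode."""
    return access_mode not in barrier_modes


def can_swap(first_mode, barriers, second_mode):
    """Can two stores separated by `barriers` end up in the opposite order?

    The first store sinks past some leading barriers and the second store
    hoists above the remaining ones; barriers keep their relative order.
    """
    for k in range(len(barriers) + 1):
        sinks = all(may_cross(first_mode, b) for b in barriers[:k])
        hoists = all(may_cross(second_mode, b) for b in barriers[k:])
        if sinks and hoists:
            return True
    return False


split = [{"shared"}, {"image"}]      # memory_barrier_shared, memory_barrier_image
combined = [{"shared", "image"}]     # one barrier covering both modes
```

In this model `can_swap("image", split, "shared")` is true, matching the transformation in the commit message, while `can_swap("image", combined, "shared")` is false.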

4 years ago radv/winsys: always allow GTT placements on APUs
Samuel Pitoiset [Thu, 30 Apr 2020 18:47:46 +0000 (20:47 +0200)]
radv/winsys: always allow GTT placements on APUs

When the VRAM size is small and the preferred heap is only VRAM,
the kernel tries to always honor the requested heap and it does
a ton of evictions which is a disaster for performance.

On APUs, VRAM and GTT have similar performance, so allow the
kernel to choose the best placement (GTT or VRAM) itself.

This gives a huge performance boost with Doom Eternal on RAVEN.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5665>

4 years ago radv: disable CPU caching for IBS to reduce fetch latency
Samuel Pitoiset [Fri, 17 Jul 2020 20:51:34 +0000 (22:51 +0200)]
radv: disable CPU caching for IBS to reduce fetch latency

AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5959>

4 years agoradeonsi: adjust epitch for PIPE_FORMAT_R8G8_R8B8_UNORM
Pierre-Eric Pelloux-Prayer [Mon, 13 Jul 2020 13:36:25 +0000 (15:36 +0200)]
radeonsi: adjust epitch for PIPE_FORMAT_R8G8_R8B8_UNORM

This fixes si_compute_copy_image for YUYV images (which use PIPE_FORMAT_R8G8_R8B8_UNORM).

With this change, the following gst pipeline produces the expected results for various
image sizes (with or without AMD_DEBUG=nodma):

gst-launch-1.0 filesrc location=input.jpg ! jpegparse ! vaapijpegdec ! filesink location=output.yuv

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5841>

4 years agoac/surface: adapt surf_size when modifying surf_pitch
Pierre-Eric Pelloux-Prayer [Thu, 9 Jul 2020 12:10:51 +0000 (14:10 +0200)]
ac/surface: adapt surf_size when modifying surf_pitch

Otherwise we might get VM_L2_PROTECTION_FAULT_STATUS errors.

Fixes: 8275dc1ed57 ("ac/surface: fix epitch when modifying surf_pitch")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5841>

4 years agor600/sfn: write stream outputs to correct mem ring
Gert Wollny [Sat, 18 Jul 2020 20:42:54 +0000 (22:42 +0200)]
r600/sfn: write stream outputs to correct mem ring

Fixes: arb_gpu_shader5-xfb-streams
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: Make the pin_to_channel generic
Gert Wollny [Sun, 5 Jul 2020 14:49:56 +0000 (16:49 +0200)]
r600/sfn: Make the pin_to_channel generic

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: Only use sample mask if the according shader key is set
Gert Wollny [Sun, 5 Jul 2020 14:49:14 +0000 (16:49 +0200)]
r600/sfn: Only use sample mask if the according shader key is set

This fixes all the piglits from arb_sample_shading "samplemask * *"
with the nir backend.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600: Add shader key item to identify when the sample mask should be used
Gert Wollny [Sun, 5 Jul 2020 16:35:35 +0000 (18:35 +0200)]
r600: Add shader key item to identify when the sample mask should be used

The sample mask must be applied when more than one sample is available or
multisampling is not enabled, so add a shader key to track this.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: Fix default z swizzle for GDS instructions
Gert Wollny [Sun, 5 Jul 2020 14:46:32 +0000 (16:46 +0200)]
r600/sfn: Fix default z swizzle for GDS instructions

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: Fix Ring output swizzle masks
Gert Wollny [Sun, 5 Jul 2020 14:54:10 +0000 (16:54 +0200)]
r600/sfn: Fix Ring output swizzle masks

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: Add a forced output swizzle for depth write
Gert Wollny [Sat, 18 Jul 2020 19:33:54 +0000 (21:33 +0200)]
r600/sfn: Add a forced output swizzle for depth write

This makes sure no components are written that shouldn't be written.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: correct handling of loading vec4 with fetching constants
Gert Wollny [Sat, 18 Jul 2020 18:34:37 +0000 (20:34 +0200)]
r600/sfn: correct handling of loading vec4 with fetching constants

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: Add option to get a temp value for a specific channel
Gert Wollny [Sat, 18 Jul 2020 18:21:10 +0000 (20:21 +0200)]
r600/sfn: Add option to get a temp value for a specific channel

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: emit texture instructions in one block
Gert Wollny [Sun, 5 Jul 2020 14:39:09 +0000 (16:39 +0200)]
r600/sfn: emit texture instructions in one block

Setting the offset must happen in the same CF as its use, so don't
emit ALU instructions between the tex instructions.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: Pipe through requesting a register at a given channel
Gert Wollny [Sun, 5 Jul 2020 14:46:03 +0000 (16:46 +0200)]
r600/sfn: Pipe through requesting a register at a given channel

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agor600/sfn: lower rotate ALU ops
Gert Wollny [Sun, 5 Jul 2020 14:36:11 +0000 (16:36 +0200)]
r600/sfn: lower rotate ALU ops

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>

4 years agoci/llvmpipe: reenable gpu shader5 tests
Dave Airlie [Mon, 20 Jul 2020 00:45:42 +0000 (10:45 +1000)]
ci/llvmpipe: reenable gpu shader5 tests

I hadn't realised these were disabled; llvmpipe now exposes this extension.

One additional failure is fine to get the added testing coverage.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5973>

4 years agollvmpipe: add framebuffer fetching support (v1.1)
Dave Airlie [Tue, 14 Jul 2020 23:53:03 +0000 (09:53 +1000)]
llvmpipe: add framebuffer fetching support (v1.1)

v1.1:
Merge two if blocks (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5914>

4 years agollvmpipe/cs: respect render condition
Dave Airlie [Thu, 16 Jul 2020 19:22:18 +0000 (05:22 +1000)]
llvmpipe/cs: respect render condition

Running complete CTS turned up a missing cond render.

Fixes KHR-GL45.compute_shader.conditional-dispatching

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5944>

4 years agofreedreno/ir3/ra: fix array conflicts for split/merged
Rob Clark [Fri, 17 Jul 2020 18:06:55 +0000 (11:06 -0700)]
freedreno/ir3/ra: fix array conflicts for split/merged

Properly handle the difference between split and merged register file
when determining where arrays can fit without conflicting with other
arrays or pre-colored instructions.

1) if not mergedregs, only consider other things with same precision
   as potentially conflicting
2) if mergedregs, calculate everything in terms of half-regs and
   convert back to fullregs in the end
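The two cases above can be sketched as an interval-overlap test. This is a sketch under stated assumptions, not the ir3 register allocator: `struct reg_range` and `ranges_conflict` are hypothetical, and the merged case assumes that full register N aliases half registers 2N and 2N+1.

```c
#include <stdbool.h>

/* Hypothetical register range, for illustration only. */
struct reg_range {
   unsigned base;   /* first register, in units of its own precision */
   unsigned length; /* number of registers in the range */
   bool half;       /* true for half-precision registers */
};

static bool
ranges_conflict(struct reg_range a, struct reg_range b, bool mergedregs)
{
   /* Split register file: half and full registers never alias, so only
    * ranges of the same precision can conflict. */
   if (!mergedregs && a.half != b.half)
      return false;

   /* Merged register file: compare in half-reg units, assuming one
    * full register covers two half registers. */
   unsigned a_scale = (mergedregs && !a.half) ? 2 : 1;
   unsigned b_scale = (mergedregs && !b.half) ? 2 : 1;

   unsigned a_start = a.base * a_scale, a_end = a_start + a.length * a_scale;
   unsigned b_start = b.base * b_scale, b_end = b_start + b.length * b_scale;

   return a_start < b_end && b_start < a_end;
}
```

A position found in half-reg units would then be converted back to full registers (for example by rounding up to an even half-reg and dividing by two) before it is assigned to a full-precision array.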

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agofreedreno/ir3/ra: assign vreg names to all array elements
Rob Clark [Thu, 16 Jul 2020 19:41:11 +0000 (12:41 -0700)]
freedreno/ir3/ra: assign vreg names to all array elements

We shouldn't divide-by-two for half-reg arrays.  We set the proper node
interference class, based on `arr->half`.

Fixes a RA fail with 16b arrays:

  src/freedreno/ir3/ir3_ra.c:633: name_to_array: Assertion `!"invalid array name"' failed.

Caused by use/def iterators returning `arr->length` vreg names, but
only assigning the array half that many names.

Also, since we are assigning unique vreg names to each array element,
there is no need to try to convert from a half-reg to its conflicting
full reg when pre-coloring the array elements.  This gets us closer to
having half-arrays work sanely with split-register-file (a5xx and
earlier).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agofreedreno/ir3/ra: debug msgs tweak
Rob Clark [Thu, 16 Jul 2020 19:11:24 +0000 (12:11 -0700)]
freedreno/ir3/ra: debug msgs tweak

Print out the assigned vreg names earlier.  Also print the few special
nodes.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agofreedreno/ir3: fix half-reg array stores
Rob Clark [Thu, 16 Jul 2020 22:20:45 +0000 (15:20 -0700)]
freedreno/ir3: fix half-reg array stores

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agofreedreno/ir3: set array precision on creation
Rob Clark [Thu, 16 Jul 2020 22:13:53 +0000 (15:13 -0700)]
freedreno/ir3: set array precision on creation

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agofreedreno/ir3/parser: half-precision relative regs
Rob Clark [Fri, 17 Jul 2020 16:35:18 +0000 (09:35 -0700)]
freedreno/ir3/parser: half-precision relative regs

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agofreedreno: whitespace fix
Rob Clark [Tue, 14 Jul 2020 18:48:10 +0000 (11:48 -0700)]
freedreno: whitespace fix

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agofreedreno: small comment re-word
Rob Clark [Thu, 16 Jul 2020 14:40:02 +0000 (07:40 -0700)]
freedreno: small comment re-word

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>

4 years agozink: free all ntv allocations after creating shader module
Mike Blumenkrantz [Wed, 3 Jun 2020 15:41:43 +0000 (11:41 -0400)]
zink: free all ntv allocations after creating shader module

these are all fairly large sources of leaks

Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>

4 years agozink: free pipeline cache during program destroy
Mike Blumenkrantz [Wed, 3 Jun 2020 15:42:10 +0000 (11:42 -0400)]
zink: free pipeline cache during program destroy

more leaks

Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>

4 years agozink: destroy descriptor pools on context destroy
Mike Blumenkrantz [Wed, 3 Jun 2020 15:13:35 +0000 (11:13 -0400)]
zink: destroy descriptor pools on context destroy

this is a big leak

Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>

4 years agozink: destroy gfx program when a shader is freed
Mike Blumenkrantz [Wed, 3 Jun 2020 15:08:34 +0000 (11:08 -0400)]
zink: destroy gfx program when a shader is freed

there's no sense in having these objects sitting around when they can
never be used again

requires adding a zink_context* pointer to each program in order to prune
the hash table entry

Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>

4 years agoandroid: panfrost/encoder: add libmesa_nir static dependency
Mauro Rossi [Fri, 17 Jul 2020 22:11:51 +0000 (00:11 +0200)]
android: panfrost/encoder: add libmesa_nir static dependency

Fixes the following build error:

In file included from external/mesa/src/panfrost/encoder/pan_blit.c:34:
In file included from external/mesa/src/panfrost/encoder/../midgard/midgard_compile.h:27:
external/mesa/src/compiler/nir/nir.h:52:10: fatal error: 'nir_opcodes.h' file not found
         ^~~~~~~~~~~~~~~
1 error generated.

Fixes: 293f251871b ("panfrost: Use Midgard-specific reloads")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5961>

4 years agopanfrost: Fix calls to panfrost_flush_batches_accessing_bo
Icecream95 [Fri, 17 Jul 2020 23:39:45 +0000 (11:39 +1200)]
panfrost: Fix calls to panfrost_flush_batches_accessing_bo

The function now takes a bool flush_readers instead of an access type,
but some calls were not updated.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5962>

4 years agopanfrost: Make panfrost_bo_wait take a wait_readers bool
Icecream95 [Fri, 17 Jul 2020 23:36:36 +0000 (11:36 +1200)]
panfrost: Make panfrost_bo_wait take a wait_readers bool

panfrost_bo_wait is often used after
panfrost_flush_batches_accessing_bo, so make them take similar
arguments for consistency.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5962>

4 years agofreedreno/ir3: Add unit tests for derivatives disasm.
Eric Anholt [Tue, 30 Jun 2020 20:05:51 +0000 (13:05 -0700)]
freedreno/ir3: Add unit tests for derivatives disasm.

Since I was going back to look at fine derivs again, add some tests of
instruction encoding.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>

4 years agofreedreno/ir3: Fix duplicated fine derivatives instructions.
Eric Anholt [Tue, 30 Jun 2020 19:17:10 +0000 (12:17 -0700)]
freedreno/ir3: Fix duplicated fine derivatives instructions.

legalize_block() can get run multiple times, which I didn't notice when
adding fine derivs support.  Other instruction clones change things such
that the legalization won't trigger again, but that didn't apply to the
DS.PP legalization.  To keep someone else from tripping over this, split
the one-shot legalization out of the iterative sync flag application.

Fixes failures in dEQP-VK.glsl.derivate.dfdxfine.*

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>

4 years agoamd/addrlib: Clean up unused colorFlags argument
Bas Nieuwenhuizen [Sat, 11 Jul 2020 12:34:58 +0000 (14:34 +0200)]
amd/addrlib: Clean up unused colorFlags argument

Cleanup.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>

4 years agoamd/common: Cache intra-tile addresses for retile map.
Bas Nieuwenhuizen [Sat, 11 Jul 2020 20:04:25 +0000 (22:04 +0200)]
amd/common: Cache intra-tile addresses for retile map.

However complicated DCC addressing is, it is still based on tiles.
If we have the intra-tile offsets + tile dimensions we can expand
that to the full image ourselves.

Behavior around ~1080p on a 2500U:

old:
  30-60 ms on every miss

new:
  5 ms initially (miss in the tile cache)
  <0.5 ms afterwards

The most common case is that the tile cache only contains data for
2 tiles, which for Raven/Renoir/Navi14 will be 4 KiB each, so the
size increase is fairly modest.
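As a rough sketch of the expansion described above: given a cached table of intra-tile offsets and the tile dimensions, a full-image address is the tile's base plus the cached offset. `struct tile_cache` and `expand_address` are hypothetical, and the assumption that tiles are laid out linearly, row by row, is a simplification that may not match the real DCC layout.

```c
#include <stdint.h>

/* Hypothetical cached description of one tile, for illustration only. */
struct tile_cache {
   unsigned tile_width;    /* tile width in elements */
   unsigned tile_height;   /* tile height in elements */
   unsigned tile_bytes;    /* size of one tile in bytes */
   const uint32_t *intra;  /* tile_width * tile_height intra-tile offsets */
};

/* Expand cached intra-tile offsets to an address in the full image,
 * assuming tiles are stored linearly, row by row. */
static uint32_t
expand_address(const struct tile_cache *t, unsigned tiles_per_row,
               unsigned x, unsigned y)
{
   unsigned tile_x = x / t->tile_width;
   unsigned tile_y = y / t->tile_height;
   unsigned in_x = x % t->tile_width;
   unsigned in_y = y % t->tile_height;

   uint32_t tile_index = tile_y * tiles_per_row + tile_x;

   return tile_index * t->tile_bytes + t->intra[in_y * t->tile_width + in_x];
}
```

Only the per-tile table has to be computed on a cache miss; every other lookup is integer arithmetic, which is in line with the drop in per-miss cost reported above.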

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>

4 years agoaco: use s_waitcnt_depctr to mitigate VMEMtoScalarWriteHazard
Rhys Perry [Wed, 15 Jul 2020 16:08:01 +0000 (17:08 +0100)]
aco: use s_waitcnt_depctr to mitigate VMEMtoScalarWriteHazard

Apparently this is potentially faster than v_nop:
https://reviews.llvm.org/D83872

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5923>

4 years agoaco: properly recognize that s_waitcnt mitigates VMEMtoScalarWriteHazard
Rhys Perry [Wed, 15 Jul 2020 16:34:21 +0000 (17:34 +0100)]
aco: properly recognize that s_waitcnt mitigates VMEMtoScalarWriteHazard

fossil-db (Navi):
Totals from 555 (0.41% of 135946) affected shaders:
CodeSize: 1005716 -> 1003400 (-0.23%)
Instrs: 195326 -> 194744 (-0.30%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5923>

4 years agomeson: Enable GCing of functions and data from compilation units by default.
Eric Anholt [Fri, 3 Jul 2020 19:03:52 +0000 (12:03 -0700)]
meson: Enable GCing of functions and data from compilation units by default.

Normally, the linker will pull in any compilation unit (aka .c file) from
a static lib (such as our shared util code) that is depended on by the
code linking against it.  Since that code is already compiled, the .text
section is allowed to jump anywhere in .text, and the linker can't
garbage collect unused functions inside of a compile unit.

Teasing callgraphs apart so that normal compilation-unit-level GCing can
reduce driver size hurts the logical organization of the code and is
difficult.  As an example, once I'd split the format pack/unpack tables, I
had to split out util_format_read/write() from util_format.c to avoid
pulling in pack/unpack.  But even then it didn't help, because it turns
out turnip's pack calls pull in util_format_bptc.c for bptc packing, but
that file also includes the unpack impls, and those internally call
util_format_unpack, and thus we pulled in all of unpack.  Splitting all of
this to separate files makes code harder to find and maintain, and is a
waste of dev time.

By setting these compiler flags, the compiler puts each function and data
symbol in a separate ELF section and the linker can then safely GC unused
text and data sections from a compile unit that gets pulled in.  There's a
bit of a space cost due to having those separate sections, but it ends up
being a huge win in disk space on my personal release driver builds:

- i965_dri.so -213k
- x86 gallium dri.so -430k
- libvulkan_intel.so -272k
- aarch64 gallium dri.so -330k
- libvulkan_freedreno.so -783k
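For illustration, a minimal sketch of the mechanism, assuming the flags in question are the usual GCC/Clang `-ffunction-sections` and `-fdata-sections` together with the linker's `--gc-sections`. The file name and function names are hypothetical.

```c
/* gc_demo.c: hypothetical compilation unit, for illustration only.
 *
 * Built with something like:
 *   cc -O2 -ffunction-sections -fdata-sections -c gc_demo.c
 *   cc -Wl,--gc-sections main.o gc_demo.o -o demo
 *
 * Each function below then lands in its own section (.text.used_helper,
 * .text.unused_helper) instead of one shared .text, so the linker can
 * drop the section that nothing references. */

/* Referenced by the program, so its section is kept. */
int
used_helper(int x)
{
   return x * 2;
}

/* Never referenced, so its section can be discarded at link time even
 * though the rest of this compilation unit is pulled in. */
int
unused_helper(int x)
{
   return x * 3;
}
```

Without the per-function sections, pulling in `used_helper` would drag `unused_helper` along, because both would live in the same `.text` section of `gc_demo.o`.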

No difference on iris drawoverhead -compat -test 1 on my skylake (n=60)

Effect on debugoptimized build times (n=5)
touch nir_lower_io.c build time (bfd)        +15.999% +/- 3.80377%
touch freedreno fd6_gmem.c build time (bfd)  +13.5294% +/- 4.86363%
touch nir_lower_io.c build time (lld)        no change
touch freedreno fd6_gmem.c build time (lld)  +2.45375% +/- 2.2383%

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5739>

4 years agopanfrost: Enable FP16 by default
Alyssa Rosenzweig [Fri, 17 Jul 2020 21:04:41 +0000 (17:04 -0400)]
panfrost: Enable FP16 by default

I see no reason to hide this. The small hit in cycle count is offset in
practice by the increase in thread count. So let's ship it and get some
testing.

If this regresses a workload:

1. Open an issue on the tracker and attach an apitrace.
2. In the meantime set PAN_MESA_DEBUG=nofp16 to override.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5960>

4 years agogitlab-ci: re-enable all a630 jobs
Rob Clark [Thu, 16 Jul 2020 21:20:22 +0000 (14:20 -0700)]
gitlab-ci: re-enable all a630 jobs

I haven't noticed tftp boot issues in the last few days; not sure if they
were just a fluke on Mon or if it is somehow related to the # of jobs we
run (i.e. having more of the c630 runners powered up and running more
of the time).

Let's turn them back on and see what happens.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5952>

4 years agofreedreno/a2xx: Fix compiler warning in disasm.
Eric Anholt [Fri, 17 Jul 2020 17:48:56 +0000 (10:48 -0700)]
freedreno/a2xx: Fix compiler warning in disasm.

warning: converting a packed ‘instr_cf_t’ {aka ‘union <anonymous>’}
pointer (alignment 1) to a ‘uint16_t’ {aka ‘short unsigned int’} pointer
(alignment 2) may result in an unaligned pointer value
[-Waddress-of-packed-member]

We may know that we'll only ever have aligned instr_cf_ts, but gcc
doesn't.
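One common way to avoid this class of warning is to copy the bytes out with `memcpy` instead of forming a `uint16_t` pointer into the packed object. The sketch below uses a hypothetical `packed_cf_t` as a stand-in for the real `instr_cf_t` and does not claim to be the fix applied in this commit.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical packed type standing in for instr_cf_t (alignment 1). */
typedef struct __attribute__((packed)) {
   uint16_t words[3];
} packed_cf_t;

/* Read the idx-th 16-bit word without taking a uint16_t pointer into
 * the packed object; memcpy has no alignment requirement. */
static uint16_t
read_word(const packed_cf_t *cf, unsigned idx)
{
   uint16_t w;

   memcpy(&w, (const uint8_t *)cf + idx * sizeof(w), sizeof(w));
   return w;
}
```

Compilers typically turn the fixed-size `memcpy` into a plain load where unaligned access is cheap, so the safe form usually costs nothing.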

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5955>