Rhys Perry [Tue, 15 Oct 2019 16:25:57 +0000 (17:25 +0100)]
aco: add NUW flag
This (combined with a pass to actually set the corresponding NIR flags)
should help fix a lot of the regressions from the SMEM addition combining
change.
fossil-db (Navi):
Totals from 12 (0.01% of 135946) affected shaders:
CodeSize: 12376 -> 12304 (-0.58%)
Instrs: 2436 -> 2422 (-0.57%)
VMEM: 1105 -> 1096 (-0.81%)
SClause: 133 -> 130 (-2.26%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
Rhys Perry [Fri, 17 Jul 2020 14:01:41 +0000 (15:01 +0100)]
aco: allow overflow for some SMEM instructions
fossil-db (Navi):
Totals from 10184 (7.49% of 135946) affected shaders:
CodeSize:
83419748 ->
82430824 (-1.19%); split: -1.19%, +0.01%
Instrs:
16054612 ->
15908523 (-0.91%); split: -0.93%, +0.02%
VMEM:
1608018 ->
1581829 (-1.63%); split: +0.20%, -1.83%
SMEM: 577031 -> 563492 (-2.35%); split: +0.10%, -2.45%
VClause: 242643 -> 242512 (-0.05%); split: -0.06%, +0.00%
SClause: 640966 -> 569897 (-11.09%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
Rhys Perry [Tue, 15 Oct 2019 16:00:55 +0000 (17:00 +0100)]
aco: be more careful combining additions that could wrap into loads/stores
SMEM does the addition with 64-bits, not 32. So if the original code
relied on wrapping around (for example, for subtraction), it would break.
Apparently swizzled MUBUF accesses also have issues with combining
additions that could overflow. Normal MUBUF accesses seem fine.
fossil-db (Navi):
Totals from 27219 (20.02% of 135946) affected shaders:
CodeSize:
128303256 ->
131062756 (+2.15%); split: -0.00%, +2.15%
Instrs:
24818911 ->
25280558 (+1.86%); split: -0.01%, +1.87%
VMEM:
162311926 ->
177226874 (+9.19%); split: +9.36%, -0.17%
SMEM:
18182559 ->
20218734 (+11.20%); split: +11.53%, -0.34%
VClause: 423635 -> 424398 (+0.18%); split: -0.02%, +0.20%
SClause: 865384 ->
1104986 (+27.69%); split: -0.00%, +27.69%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2748
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
Andres Gomez [Wed, 8 Jul 2020 18:33:58 +0000 (21:33 +0300)]
gitlab-ci/traces: updated paths and checksums for POLARIS10 traces
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5890>
Andres Gomez [Mon, 13 Jul 2020 22:11:26 +0000 (01:11 +0300)]
gitlab-ci: get the last frame from a gfxr trace using gfxrecon-info
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5890>
Andres Gomez [Wed, 1 Jul 2020 20:45:08 +0000 (23:45 +0300)]
gitlab-ci: build gfxreconstruct from the "dev" branch
We want to use the fix for
https://github.com/LunarG/gfxreconstruct/issues/328 while it is yet
not available in the "master" branch.
Additionally, we get the gfxreconstruct-info tool as an extra.
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5890>
Connor Abbott [Fri, 17 Jul 2020 13:18:33 +0000 (15:18 +0200)]
freedreno: Use common guardband helper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5950>
Connor Abbott [Fri, 17 Jul 2020 13:18:15 +0000 (15:18 +0200)]
tu: Use common guardband helper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5950>
Connor Abbott [Fri, 17 Jul 2020 13:15:42 +0000 (15:15 +0200)]
freedreno: Add a helper for computing guardband sizes
This should be much better than the previous method that was more
guesswork-based than anything else. It returns a value within 1 of the
blob for every input value I've tested, and it seems like it returns
slightly better (but still legal) answers when it differs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5950>
Alyssa Rosenzweig [Mon, 20 Jul 2020 17:55:22 +0000 (13:55 -0400)]
panfrost: Remove unused batch_fence->ctx
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>
Alyssa Rosenzweig [Mon, 20 Jul 2020 17:53:42 +0000 (13:53 -0400)]
panfrost: Remove unused batch_fence->signaled
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>
Alyssa Rosenzweig [Mon, 20 Jul 2020 17:34:42 +0000 (13:34 -0400)]
panfrost: Allocate syncobjs in panfrost_flush
For implementing panfrost_flush, it suffices to wait on only a single
syncobj, not an entire array of them. This lets us wait on it directly,
without coercing to/from syncfds in the middle (although some complexity
may be added later to support Android winsys).
Further, we should let the fence own the syncobj, tying together the
lifetimes and thus removing the connection between syncobjs and
batch_fence.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>
Alyssa Rosenzweig [Mon, 20 Jul 2020 15:55:25 +0000 (11:55 -0400)]
panfrost: Skip specifying in_syncs
With the current kernel UABI, there is no benefit to explicitly
specifiying dependencies, since the kernel by design adds implicit
dependencies to any referenced BOs. This is something we'd like to
address in the future, but efficient handling with future kernels will
require a tweaked design in userspace as well. So let's do the obvious
thing now, and extend later.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>
Alyssa Rosenzweig [Mon, 20 Jul 2020 15:48:16 +0000 (11:48 -0400)]
panfrost: Remove wait parameter to flush_all_batches
It is always false now.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>
Alyssa Rosenzweig [Mon, 20 Jul 2020 15:44:10 +0000 (11:44 -0400)]
panfrost: Avoid wait=true flushing all batches
What is intended is to flush the batches and wait on a particular BO at
a later time. Explicitly forcing a wait immediately is redundant.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995>
Rhys Perry [Mon, 20 Jul 2020 18:21:20 +0000 (19:21 +0100)]
aco: implement b2i8/b2i16
Fixes lots of tests under dEQP-VK.spirv_assembly.type.*
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5993>
Karol Herbst [Tue, 21 Jul 2020 00:30:35 +0000 (02:30 +0200)]
nv50/ir: initialize persampleInvocation to false
Fixes: random KHR-GL45.sample_variables.mask.* fails
Fixes: 66ed9792edb702 ("nv50: Clear nv50_ir_prog_info of dead and codegen specific variables")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6001>
Karol Herbst [Mon, 20 Jul 2020 21:46:39 +0000 (23:46 +0200)]
nv50/ir/tgsi: silence warning about unhandled GS_INPUT_PRIM property
Fixes: 66ed9792edb702 ("nv50: Clear nv50_ir_prog_info of dead and codegen specific variables")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6001>
Samuel Pitoiset [Mon, 20 Jul 2020 11:47:19 +0000 (13:47 +0200)]
radv: disable CPU caching for the upload BO to reduce fetch latency
AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978>
Samuel Pitoiset [Mon, 20 Jul 2020 11:43:40 +0000 (13:43 +0200)]
radv: do not perform read-modify-write with the upload BO
To disable CPU caching.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978>
Rhys Perry [Mon, 20 Jul 2020 15:54:22 +0000 (16:54 +0100)]
radv: replace discard with demote for Quantic Dream games
Detroit: Become Human uses dFdx/dFdy immediately after a quad-divergent
discard, which can cause the image to become white.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3212
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991>
Rhys Perry [Mon, 20 Jul 2020 16:19:40 +0000 (17:19 +0100)]
aco: always set FI on GFX10
bounds_ctrl is set to true by default which works around some game bugs,
but that isn't enough on GFX10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991>
Eric Anholt [Mon, 20 Jul 2020 17:46:51 +0000 (10:46 -0700)]
ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS.
We don't want these files shared between builds (it'll get blown away by
the next rsync), and NFS will just increase our latency for hitting the
cache.
Drops a630 gles31 run from 11-17 minutes to 5.5. Maximum cache size on a
run I've seen is 153M, which it seems we can easily spare.
Fixes: f97acb4bb4b1 ("freedreno/ir3: disk-cache support")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5998>
Tomeu Vizoso [Mon, 13 Jul 2020 15:20:28 +0000 (17:20 +0200)]
gitlab-ci: Fix needs: of the arm64 LAVA test jobs
They were still depending on arm_build, but the build of kernel and
rootfs has been moved to a separate job.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Tomeu Vizoso [Thu, 9 Jul 2020 20:29:39 +0000 (22:29 +0200)]
gitlab-ci: Upload tracie artifacts to MinIO
Upload failed images and the results.yml file to MinIO, to facilitate
debugging.
Also, fix version checking when git is installed as Mesa is going to
output a different renderer string if git is installed.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Tomeu Vizoso [Thu, 9 Jul 2020 10:42:02 +0000 (12:42 +0200)]
gitlab-ci: Download traces from MinIO
Downloading the traces directly from git causes very high egress from
GCE, which is expensive.
So we can expand trace testing further, we are going to keep a cache in
freedesktop.org's MinIO instance. This commit implements downloading
from it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Rohan Garg [Tue, 28 Jan 2020 14:19:53 +0000 (15:19 +0100)]
gitlab-ci: Replay traces on lava devices
Submit lava jobs to replay traces on Veyron (Mali T760) and Kevin (Mali
T860) boards.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Kenneth Graunke [Tue, 21 Jul 2020 00:08:46 +0000 (17:08 -0700)]
iris: Fix CCS check in iris_texture_subdata().
The intention here was to check "Would the GPU be able to compress
this if we used the PBO-based texture upload path?" Prior to Gen12,
that meant checking for CCS_E. On Gen12, there are a lot more types
of compression, and basic CCS_E was replaced by GEN12_CCS_E, making
this check simply not work, so we'd take the CPU path instead.
Instead, check if it has CCS, and isn't the basic "fast clear" CCS_D.
Fixes: 39f06e28485 ("iris: Implement pipe->texture_subdata directly")
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6005>
Rhys Perry [Wed, 1 Jul 2020 15:14:16 +0000 (16:14 +0100)]
nir/lower_int64: lower 64-bit amul
Fixes an issue with Renderdoc's shader debugging with ACO.
If nir_opt_algebraic isn't called in-between nir_lower_explicit_io and
nir_lower_int64, we can end up with 64-bit multiplications.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6320e37d4be ('nir: add amul instruction')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5709>
Jason Ekstrand [Tue, 19 May 2020 15:26:58 +0000 (10:26 -0500)]
anv: Advertise support for VK_EXT_shader_atomic_float
We already have all of the shader code for load/store/exchange.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Jason Ekstrand [Tue, 14 Jul 2020 19:40:35 +0000 (14:40 -0500)]
intel/fs: Use the correct logical op for global float atomics
Fixes: e644ed468f98 "intel/fs: Implement nir_intrinsic_global_atomic_*"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Jason Ekstrand [Tue, 19 May 2020 15:19:55 +0000 (10:19 -0500)]
spirv: Add support for SPV_EXT_shader_atomic_float
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Jason Ekstrand [Mon, 20 Jul 2020 14:53:33 +0000 (09:53 -0500)]
spirv: Update headers and grammar json
This pulls in commit
63cb1fc131573fa from KhronosGroup/SPIRV-Headers
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Eric Engestrom [Mon, 20 Jul 2020 11:38:24 +0000 (13:38 +0200)]
egl: inline _EGLAPI into _EGLDriver
_EGLDriver was an empty wrapper around _EGLAPI, so let's only keep one
of them. "driver" represents better what's being accessed, so that's the
one we're keeping.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5987>
Bas Nieuwenhuizen [Sun, 19 Jul 2020 23:43:18 +0000 (01:43 +0200)]
radeonsi: Inhibit clock-gating for perf counters.
Otherwise most counters return 0. Should be much more user friendly
than having to totally disable clock-gating on the kernel cmdline.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972>
Bas Nieuwenhuizen [Sun, 19 Jul 2020 23:37:11 +0000 (01:37 +0200)]
amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972>
Jason Ekstrand [Tue, 9 Jun 2020 15:59:16 +0000 (10:59 -0500)]
anv: Advertise VK_EXT_image_robustness
We already support a superset of VK_EXT_image_robustness via
VK_EXT_robustness2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5985>
Eric Anholt [Mon, 20 Jul 2020 17:06:02 +0000 (10:06 -0700)]
freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test.
It triggers the disk cache for me, and asserts abount not getting the
build id right.
Fixes: f97acb4bb4b1 ("freedreno/ir3: disk-cache support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5989>
Samuel Pitoiset [Mon, 20 Jul 2020 12:54:58 +0000 (14:54 +0200)]
radv: advertise VK_EXT_image_robustness
All new dEQP-VK.robustness.image_robustness.* pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5979>
Christian Gmeiner [Wed, 10 Jun 2020 12:44:17 +0000 (14:44 +0200)]
ci: bare-metal: use nginx to get results from DUT
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2655
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5661>
Yevhenii Kolesnikov [Thu, 16 Jul 2020 12:13:08 +0000 (15:13 +0300)]
mesa: change error code of *TextureSubImage* for incorreect target
According to the "Errors" list of the OpenGL 4.6 spec, section 8.6
"Alternate Texture Image Specification Commands":
An INVALID_OPERATION error is generated by *TextureSubImage* if the
effective target of texture does not match the command, as shown in table 8.15.
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5934>
Eric Anholt [Wed, 8 Jul 2020 20:38:18 +0000 (13:38 -0700)]
freedreno/ir3: Fix disasm of register offsets in ldp/stp.
I had a stp testcase that was getting its offset wrong, and by twiddling
bits and feeding it to qc disasm, I found that the comment was sort of
right: some the cat6a bits implicated in the old comment do get used, as
the high bits of the cat6c offset. Reallocating those bits also fixes how
we were getting r960.y for r0.y.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Eric Anholt [Wed, 8 Jul 2020 23:51:16 +0000 (16:51 -0700)]
freedreno/ir3: Refactor cat6 general dst printing.
We didn't need the extra branch and temp, we can move it inside of the dst
handling by just duplicating the print of the dst reg.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Eric Anholt [Wed, 8 Jul 2020 23:09:48 +0000 (16:09 -0700)]
freedreno/ir3: Add a bunch more tests for cat6 opcodes.
This started with making note of some ldp/stp instructions from the blob
and how we differ from them. In the process of fixing it, I accidentally
modified behavior of other opcodes, and the other instructions listed will
keep us from doing that. I also dropped an old stl test that looks like I
took from after a shader 'end' instruction.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Eric Anholt [Wed, 8 Jul 2020 23:37:55 +0000 (16:37 -0700)]
freedreno/ir3: Add a note about the instructions in the disasm test.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Jason Ekstrand [Mon, 20 Jul 2020 14:50:21 +0000 (09:50 -0500)]
vulkan: Update Vulkan XML and headers to 1.2.148
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5983>
Eric Anholt [Fri, 26 Jun 2020 17:59:41 +0000 (10:59 -0700)]
ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env.
fd.o has retuned the x86 runners on packet for -j8. Rather than having to
tweak our CI every time fd.o decides to rebalance job concurrency, respect
what the runner admin has chosen for their builds (this will also be
convenient for people with large local runners).
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5669>
Daniel Schürmann [Wed, 15 Jul 2020 17:31:22 +0000 (19:31 +0200)]
nir/algebraic: fold some nested bcsel
Totals from 14266 (10.62% of 134368) affected shaders (Polaris):
SGPRs: 761756 -> 762732 (+0.13%); split: -0.00%, +0.13%
VGPRs: 430392 -> 430924 (+0.12%); split: -0.05%, +0.17%
SpillSGPRs: 4652 -> 4628 (-0.52%); split: -0.60%, +0.09%
CodeSize:
30133000 ->
29949780 (-0.61%); split: -0.66%, +0.05%
MaxWaves: 102122 -> 102111 (-0.01%); split: +0.00%, -0.01%
Instrs:
5845085 ->
5841668 (-0.06%); split: -0.08%, +0.03%
Cycles:
69033140 ->
68889188 (-0.21%); split: -0.22%, +0.01%
VMEM:
8479021 ->
8474978 (-0.05%); split: +0.03%, -0.08%
SMEM: 831437 -> 830464 (-0.12%); split: +0.06%, -0.18%
VClause: 105411 -> 105410 (-0.00%); split: -0.01%, +0.01%
SClause: 327727 -> 327780 (+0.02%); split: -0.00%, +0.02%
Copies: 372704 -> 373306 (+0.16%); split: -0.16%, +0.32%
Branches: 112260 -> 112269 (+0.01%); split: -0.00%, +0.01%
PreSGPRs: 433308 -> 433631 (+0.07%); split: -0.01%, +0.09%
PreVGPRs: 397888 -> 397905 (+0.00%); split: -0.01%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 15 Jul 2020 17:30:34 +0000 (19:30 +0200)]
nir/algebraic: propagate b2i out of ior/iand
Totals from 761 (0.57% of 134368) affected shaders (Polaris):
SGPRs: 29496 -> 29488 (-0.03%)
SpillSGPRs: 41 -> 43 (+4.88%)
CodeSize:
1922036 ->
1882408 (-2.06%); split: -2.08%, +0.02%
Instrs: 366051 -> 360362 (-1.55%); split: -1.57%, +0.02%
Cycles:
7692516 ->
7661216 (-0.41%); split: -0.41%, +0.01%
VMEM: 365175 -> 365172 (-0.00%)
VClause: 15324 -> 15322 (-0.01%)
SClause: 9825 -> 9824 (-0.01%); split: -0.02%, +0.01%
Copies: 41216 -> 41294 (+0.19%); split: -0.01%, +0.20%
Branches: 7020 -> 7033 (+0.19%)
PreSGPRs: 22103 -> 22106 (+0.01%)
PreVGPRs: 26518 -> 26515 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 15 Jul 2020 17:23:54 +0000 (19:23 +0200)]
nir/algebraic: add distributive rules for ior/iand
Totals from 581 (0.43% of 134368) affected shaders (Polaris):
CodeSize:
1389560 ->
1386488 (-0.22%)
Instrs: 264488 -> 263984 (-0.19%)
Cycles:
1057952 ->
1055936 (-0.19%)
VMEM: 296016 -> 291613 (-1.49%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Thu, 30 Apr 2020 09:58:08 +0000 (10:58 +0100)]
nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a)
Totals from affected shaders: (VEGA)
SGPRS: 13920 -> 13920 (0.00 %)
VGPRS: 10252 -> 10252 (0.00 %)
Spilled SGPRs: 62 -> 62 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 587648 -> 587224 (-0.07 %) bytes
LDS: 5 -> 5 (0.00 %) blocks
Max Waves: 1489 -> 1489 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 14:38:54 +0000 (15:38 +0100)]
nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x)
Totals from affected shaders: (VEGA)
SGPRS: 545712 -> 545712 (0.00 %)
VGPRS: 413092 -> 413116 (0.01 %)
Spilled SGPRs: 10616 -> 10616 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
37031684 ->
36984248 (-0.13 %) bytes
LDS: 427 -> 427 (0.00 %) blocks
Max Waves: 54350 -> 54340 (-0.02 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 17:50:27 +0000 (18:50 +0100)]
nir/algebraic: add some more unop + bcsel optimizations
Totals from affected shaders: (VEGA)
SGPRS: 284392 -> 284400 (0.00 %)
VGPRS: 261080 -> 261076 (-0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
24698596 ->
24277788 (-1.70 %) bytes
LDS: 196 -> 196 (0.00 %) blocks
Max Waves: 10101 -> 10105 (0.04 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 16:56:05 +0000 (17:56 +0100)]
nir/algebraic: add optimizations for fsign/isign
This just reverts fsign/isign lowering.
Totals from affected shaders:
SGPRS: 257496 -> 256672 (-0.32 %)
VGPRS: 181800 -> 178864 (-1.61 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
11355852 ->
11141840 (-1.88 %) bytes
LDS: 3789 -> 3789 (0.00 %) blocks
Max Waves: 30453 -> 30951 (1.64 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 16:49:45 +0000 (17:49 +0100)]
nir/algebraic: optimize iand/ior of (n)eq zero
Found in some Detroit: Become Human shaders.
Totals from affected shaders:
SGPRS: 700256 -> 700256 (0.00 %)
VGPRS: 507208 -> 507212 (0.00 %)
Spilled SGPRs: 142531 -> 142531 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
76404616 ->
76301768 (-0.13 %) bytes
LDS: 43 -> 43 (0.00 %) blocks
Max Waves: 21438 -> 21438 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 14:36:41 +0000 (15:36 +0100)]
nir: also move b2i in case of nir_move_copies
Booleans are often more efficient with register usage.
This also allows to move comparisons further.
Totals from affected shaders: (VEGA)
SGPRS: 451608 -> 450320 (-0.29 %)
VGPRS: 351448 -> 351256 (-0.05 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 1008 -> 1008 (0.00 %) dwords per thread
Code Size:
26555596 ->
26551080 (-0.02 %) bytes
LDS: 10323 -> 10323 (0.00 %) blocks
Max Waves: 42850 -> 42934 (0.20 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Tue, 28 Apr 2020 10:45:07 +0000 (11:45 +0100)]
nir/algebraic: optimize bcsel(a, 0, 1) to b2i
This avoids combination with other bcsel operations,
and as b2i is often a no-op (when used for iadd and such),
the resulting pattern is preferable.
Totals from affected shaders: (VEGA)
SGPRS: 598448 -> 598448 (0.00 %)
VGPRS: 457940 -> 457352 (-0.13 %)
Spilled SGPRs: 127154 -> 127154 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
64836352 ->
64802728 (-0.05 %) bytes
LDS: 781 -> 781 (0.00 %) blocks
Max Waves: 22931 -> 22931 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Icecream95 [Sun, 19 Jul 2020 10:31:26 +0000 (22:31 +1200)]
pan/mdg: Use the blend RT for blend shader framebuffer fetches
Fixes piglit test fbo-drawbuffers-blend-add when fixed-function
blending is disabled in panfrost_get_blend_for_context.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>
Icecream95 [Tue, 14 Jul 2020 00:05:47 +0000 (12:05 +1200)]
panfrost: 8x MRT support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>
Icecream95 [Mon, 13 Jul 2020 23:50:10 +0000 (11:50 +1200)]
panfrost: Use more tilebuffer sizes
This will be needed for 8x MRT with 128-bit framebuffer formats.
Adds support for 256-bit, 1024-bit, and 2048-bit tilebuffer allocations,
depending on the amount of data required.
v2: Squash commits (Alyssa)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>
Icecream95 [Mon, 13 Jul 2020 10:45:51 +0000 (22:45 +1200)]
panfrost: Fake RGTC support
For most GPUs RGTC is disabled, so it needs to be emulated, using the
fake_rgtc option of u_transfer_helper.
Passes the rgtc-teximage tests in piglit.
v2: Update docs/features.txt (Alyssa)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5975>
Rhys Perry [Fri, 17 Jul 2020 10:46:47 +0000 (11:46 +0100)]
spirv: don't split memory barriers
If the SPIR-V had a shared+image memory barrier, we would emit two NIR
barriers: a shared barrier and an image barrier.
Unlike a single barrier, two barriers allows transformations such as:
intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_shared () ()
intrinsic memory_barrier_image () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
->
intrinsic memory_barrier_shared () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_image () ()
This commit fixes two dEQP-VK.memory_model.* CTS tests with ACO.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5951>
Samuel Pitoiset [Thu, 30 Apr 2020 18:47:46 +0000 (20:47 +0200)]
radv/winsys: always allow GTT placements on APUs
When the VRAM size is small and the preferred heap only VRAM,
the kernel tries to always honor the requested heap and it does
a ton of evictions which is a disaster for performance.
On APUs, VRAM and GTT have similar performance, so allow the
kernel to choose the best placement (GTT or VRAM) itself.
This gives a huge performance boost with Doom Eternal on RAVEN.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5665>
Samuel Pitoiset [Fri, 17 Jul 2020 20:51:34 +0000 (22:51 +0200)]
radv: disable CPU caching for IBS to reduce fetch latency
AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5959>
Pierre-Eric Pelloux-Prayer [Mon, 13 Jul 2020 13:36:25 +0000 (15:36 +0200)]
radeonsi: adjust epitch for PIPE_FORMAT_R8G8_R8B8_UNORM
This fix si_compute_copy_image for yuyv image (so using PIPE_FORMAT_R8G8_R8B8_UNORM).
With this change, the following gst pipeline produce the expected results for various
image sizes (with or without AMD_DEBUG=nodma):
gst-launch-1.0 filesrc location=input.jpg ! jpegparse ! vaapijpegdec ! filesink location=output.yuv
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5841>
Pierre-Eric Pelloux-Prayer [Thu, 9 Jul 2020 12:10:51 +0000 (14:10 +0200)]
ac/surface: adapt surf_size when modifying surf_pitch
Otherwise we might get VM_L2_PROTECTION_FAULT_STATUS errors.
Fixes: 8275dc1ed57 ("ac/surface: fix epitch when modifying surf_pitch")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5841>
Gert Wollny [Sat, 18 Jul 2020 20:42:54 +0000 (22:42 +0200)]
d600/sfn: write stream outputs to correct mem ring
Fixes: arb_gpu_shader5-xfb-streams
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:49:56 +0000 (16:49 +0200)]
r600/sfn: Make the pin_to_channel generic
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:49:14 +0000 (16:49 +0200)]
r600/sfn: Only use sample mask if the according shader key is set
This fixes all the piglits from arb_sample_shading "samplemask * *"
with the nir backend.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 16:35:35 +0000 (18:35 +0200)]
r600: Add shader key item to identify when the sample mask should be used
The sample mask must be applied when more then one sample is available or
multisamplig is not enabled, so add a shader key to track this.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:46:32 +0000 (16:46 +0200)]
r600/sfn: Fix default z swizzle for GDS instructions
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:54:10 +0000 (16:54 +0200)]
r600/sfn: Fix Ring output swizzle masks
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sat, 18 Jul 2020 19:33:54 +0000 (21:33 +0200)]
r600/sfn: Add a forced output swizzle for depth write
This makes sure no components are written that shouldn't be written.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sat, 18 Jul 2020 18:34:37 +0000 (20:34 +0200)]
r600/sfn: correct handling of loading vec4 with fetching constants
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sat, 18 Jul 2020 18:21:10 +0000 (20:21 +0200)]
r600/sfn: Add option to get a temp value for a specific channel
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:39:09 +0000 (16:39 +0200)]
r600/sfn: emit texture instructions in one block
Setting the offset must happen in the same CF like using it, so don't
emit ALU instruction between the tex instructions.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:46:03 +0000 (16:46 +0200)]
r600/sfn: Pipe through requesting a register at a given channel
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:36:11 +0000 (16:36 +0200)]
r600/sfn: lower rotate ALU ops
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Dave Airlie [Mon, 20 Jul 2020 00:45:42 +0000 (10:45 +1000)]
ci/llvmpipe: reenable gpu shader5 tests
I hadn't realised these were disabled, llvmpipe now exposes this extension.
One additional failure is fine to get the added testing coverage.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5973>
Dave Airlie [Tue, 14 Jul 2020 23:53:03 +0000 (09:53 +1000)]
llvmpipe: add framebuffer fetching support (v1.1)
v1.1:
Merge two if blocks (Roland)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5914>
Dave Airlie [Thu, 16 Jul 2020 19:22:18 +0000 (05:22 +1000)]
llvmpipe/cs: respect render condition
Running complete CTS turned up a missing cond render.
Fixes KHR-GL45.compute_shader.conditional-dispatching
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5944>
Rob Clark [Fri, 17 Jul 2020 18:06:55 +0000 (11:06 -0700)]
freedreno/ir3/ra: fix array conflicts for split/merged
Properly handle the difference between split and merged register file
when determining where arrays can fit without conflicting with other
arrays or pre-colored instructions.
1) if not mergedregs, only consider other things with same precision
as potentially conflicting
2) if mergedregs, calculate everything in therms of half-regs and
convert back to fullregs in the end
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 19:41:11 +0000 (12:41 -0700)]
freedreno/ir3/ra: assign vreg names to all array elements
We shouldn't divide-by-two for half-reg arrays. We set the proper node
interference class, based on `arr->half`.
Fixes a RA fail with 16b arrays:
src/freedreno/ir3/ir3_ra.c:633: name_to_array: Assertion `!"invalid array name"' failed.
Caused by use/def iterators returning `arr->length` vreg namess, but
only assigning the array half that many names.
Also, since we are assigning unique vreg names to each array element,
there is no need to try and convert from half-reg to it's conflicting
full reg when pre-coloring the array elements. Getting us closer to
having half-arrays work sanely with split-register-file (a5xx and
earlier).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 19:11:24 +0000 (12:11 -0700)]
freedreno/ir3/ra: debug msgs tweak
Print out the assigned vreg names earlier. Also print the few special
nodes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 22:20:45 +0000 (15:20 -0700)]
freedreno/ir3: fix half-reg array stores
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 22:13:53 +0000 (15:13 -0700)]
freedreno/ir3: set array precision on creation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Fri, 17 Jul 2020 16:35:18 +0000 (09:35 -0700)]
freedreno/ir3/parser: half-precision relative regs
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Tue, 14 Jul 2020 18:48:10 +0000 (11:48 -0700)]
freedreno: whitespace fix
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 14:40:02 +0000 (07:40 -0700)]
freedreno: small comment re-word
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:41:43 +0000 (11:41 -0400)]
zink: free all ntv allocations after creating shader module
these are all fairly large sources of leaks
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:42:10 +0000 (11:42 -0400)]
zink: free pipeline cache during program destroy
more leaks
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:13:35 +0000 (11:13 -0400)]
zink: destroy descriptor pools on context destroy
this is a big leak
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:08:34 +0000 (11:08 -0400)]
zink: destroy gfx program when a shader is freed
there's no sense in having these objects sitting around when they can
never be used again
requires adding a zink_context* pointer to each program in order to prune
the hash table entry
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mauro Rossi [Fri, 17 Jul 2020 22:11:51 +0000 (00:11 +0200)]
android: panfrost/encoder: add libmesa_nir static dependency
Fixes the following build error:
In file included from external/mesa/src/panfrost/encoder/pan_blit.c:34:
In file included from external/mesa/src/panfrost/encoder/../midgard/midgard_compile.h:27:
external/mesa/src/compiler/nir/nir.h:52:10: fatal error: 'nir_opcodes.h' file not found
^~~~~~~~~~~~~~~
1 error generated.
Fixes: 293f251871b ("panfrost: Use Midgard-specific reloads")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5961>
Icecream95 [Fri, 17 Jul 2020 23:39:45 +0000 (11:39 +1200)]
panfrost: Fix calls to panfrost_flush_batches_accessing_bo
The function now takes a bool flush_readers instead of an access type,
but some calls were not updated.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5962>
Icecream95 [Fri, 17 Jul 2020 23:36:36 +0000 (11:36 +1200)]
panfrost: Make panfrost_bo_wait take a wait_readers bool
panfrost_bo_wait is often used after
panfrost_flush_batches_accessing_bo, so make them take similar
arguments for consistency.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5962>
Eric Anholt [Tue, 30 Jun 2020 20:05:51 +0000 (13:05 -0700)]
freedreno/ir3: Add unit tests for derivatives disasm.
Since I was going back to look at fine derivs again, add some tests of
instruction encoding.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
Eric Anholt [Tue, 30 Jun 2020 19:17:10 +0000 (12:17 -0700)]
freedreno/ir3: Fix duplicated fine derivatives instructions.
legalize_block() can get run multiple times, which I didn't notice when
adding fine derivs support. Other instruction clones change things such
that the legalization won't trigger again, but that didn't apply to the
DS.PP legalization. To keep someone else from tripping over this, split
the one-shot legalization out of the iterative sync flag application.
Fixes failures in dEQP-VK.glsl.derivate.dfdxfine.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
Bas Nieuwenhuizen [Sat, 11 Jul 2020 12:34:58 +0000 (14:34 +0200)]
amd/addrlib: Clean up unused colorFlags argument
Cleanup.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>
Bas Nieuwenhuizen [Sat, 11 Jul 2020 20:04:25 +0000 (22:04 +0200)]
amd/common: Cache intra-tile addresses for retile map.
However complicated DCC addressing is it is still based on tiles.
If we have the intra-tile offsets + tile dimensions we can expand
that to the full image ourselves.
Behavior around ~1080p on a 2500U:
old:
30-60 ms on every miss
new:
5 ms initally (miss in the tile cache)
<0.5 ms afterwards
The most common case is that the tile cache only contains data for
2 tiles, which for Raven/Renoir/Navi14 will be 4 KiB each, so the
size increase is fairly modest.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>