Samuel Pitoiset [Mon, 20 Jul 2020 11:47:19 +0000 (13:47 +0200)]
radv: disable CPU caching for the upload BO to reduce fetch latency
AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978>
Samuel Pitoiset [Mon, 20 Jul 2020 11:43:40 +0000 (13:43 +0200)]
radv: do not perform read-modify-write with the upload BO
To disable CPU caching.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978>
Rhys Perry [Mon, 20 Jul 2020 15:54:22 +0000 (16:54 +0100)]
radv: replace discard with demote for Quantic Dream games
Detroit: Become Human uses dFdx/dFdy immediately after a quad-divergent
discard, which can cause the image to become white.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3212
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991>
Rhys Perry [Mon, 20 Jul 2020 16:19:40 +0000 (17:19 +0100)]
aco: always set FI on GFX10
bounds_ctrl is set to true by default which works around some game bugs,
but that isn't enough on GFX10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991>
Eric Anholt [Mon, 20 Jul 2020 17:46:51 +0000 (10:46 -0700)]
ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS.
We don't want these files shared between builds (it'll get blown away by
the next rsync), and NFS will just increase our latency for hitting the
cache.
Drops a630 gles31 run from 11-17 minutes to 5.5. Maximum cache size on a
run I've seen is 153M, which it seems we can easily spare.
Fixes: f97acb4bb4b1 ("freedreno/ir3: disk-cache support")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5998>
Tomeu Vizoso [Mon, 13 Jul 2020 15:20:28 +0000 (17:20 +0200)]
gitlab-ci: Fix needs: of the arm64 LAVA test jobs
They were still depending on arm_build, but the build of kernel and
rootfs has been moved to a separate job.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Tomeu Vizoso [Thu, 9 Jul 2020 20:29:39 +0000 (22:29 +0200)]
gitlab-ci: Upload tracie artifacts to MinIO
Upload failed images and the results.yml file to MinIO, to facilitate
debugging.
Also, fix version checking when git is installed as Mesa is going to
output a different renderer string if git is installed.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Tomeu Vizoso [Thu, 9 Jul 2020 10:42:02 +0000 (12:42 +0200)]
gitlab-ci: Download traces from MinIO
Downloading the traces directly from git causes very high egress from
GCE, which is expensive.
So we can expand trace testing further, we are going to keep a cache in
freedesktop.org's MinIO instance. This commit implements downloading
from it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Rohan Garg [Tue, 28 Jan 2020 14:19:53 +0000 (15:19 +0100)]
gitlab-ci: Replay traces on lava devices
Submit lava jobs to replay traces on Veyron (Mali T760) and Kevin (Mali
T860) boards.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472>
Kenneth Graunke [Tue, 21 Jul 2020 00:08:46 +0000 (17:08 -0700)]
iris: Fix CCS check in iris_texture_subdata().
The intention here was to check "Would the GPU be able to compress
this if we used the PBO-based texture upload path?" Prior to Gen12,
that meant checking for CCS_E. On Gen12, there are a lot more types
of compression, and basic CCS_E was replaced by GEN12_CCS_E, making
this check simply not work, so we'd take the CPU path instead.
Instead, check if it has CCS, and isn't the basic "fast clear" CCS_D.
Fixes: 39f06e28485 ("iris: Implement pipe->texture_subdata directly")
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6005>
Rhys Perry [Wed, 1 Jul 2020 15:14:16 +0000 (16:14 +0100)]
nir/lower_int64: lower 64-bit amul
Fixes an issue with Renderdoc's shader debugging with ACO.
If nir_opt_algebraic isn't called in-between nir_lower_explicit_io and
nir_lower_int64, we can end up with 64-bit multiplications.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6320e37d4be ('nir: add amul instruction')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5709>
Jason Ekstrand [Tue, 19 May 2020 15:26:58 +0000 (10:26 -0500)]
anv: Advertise support for VK_EXT_shader_atomic_float
We already have all of the shader code for load/store/exchange.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Jason Ekstrand [Tue, 14 Jul 2020 19:40:35 +0000 (14:40 -0500)]
intel/fs: Use the correct logical op for global float atomics
Fixes: e644ed468f98 "intel/fs: Implement nir_intrinsic_global_atomic_*"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Jason Ekstrand [Tue, 19 May 2020 15:19:55 +0000 (10:19 -0500)]
spirv: Add support for SPV_EXT_shader_atomic_float
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Jason Ekstrand [Mon, 20 Jul 2020 14:53:33 +0000 (09:53 -0500)]
spirv: Update headers and grammar json
This pulls in commit
63cb1fc131573fa from KhronosGroup/SPIRV-Headers
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
Eric Engestrom [Mon, 20 Jul 2020 11:38:24 +0000 (13:38 +0200)]
egl: inline _EGLAPI into _EGLDriver
_EGLDriver was an empty wrapper around _EGLAPI, so let's only keep one
of them. "driver" represents better what's being accessed, so that's the
one we're keeping.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5987>
Bas Nieuwenhuizen [Sun, 19 Jul 2020 23:43:18 +0000 (01:43 +0200)]
radeonsi: Inhibit clock-gating for perf counters.
Otherwise most counters return 0. Should be much more user friendly
than having to totally disable clock-gating on the kernel cmdline.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972>
Bas Nieuwenhuizen [Sun, 19 Jul 2020 23:37:11 +0000 (01:37 +0200)]
amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972>
Jason Ekstrand [Tue, 9 Jun 2020 15:59:16 +0000 (10:59 -0500)]
anv: Advertise VK_EXT_image_robustness
We already support a superset of VK_EXT_image_robustness via
VK_EXT_robustness2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5985>
Eric Anholt [Mon, 20 Jul 2020 17:06:02 +0000 (10:06 -0700)]
freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test.
It triggers the disk cache for me, and asserts abount not getting the
build id right.
Fixes: f97acb4bb4b1 ("freedreno/ir3: disk-cache support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5989>
Samuel Pitoiset [Mon, 20 Jul 2020 12:54:58 +0000 (14:54 +0200)]
radv: advertise VK_EXT_image_robustness
All new dEQP-VK.robustness.image_robustness.* pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5979>
Christian Gmeiner [Wed, 10 Jun 2020 12:44:17 +0000 (14:44 +0200)]
ci: bare-metal: use nginx to get results from DUT
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2655
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5661>
Yevhenii Kolesnikov [Thu, 16 Jul 2020 12:13:08 +0000 (15:13 +0300)]
mesa: change error code of *TextureSubImage* for incorreect target
According to the "Errors" list of the OpenGL 4.6 spec, section 8.6
"Alternate Texture Image Specification Commands":
An INVALID_OPERATION error is generated by *TextureSubImage* if the
effective target of texture does not match the command, as shown in table 8.15.
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5934>
Eric Anholt [Wed, 8 Jul 2020 20:38:18 +0000 (13:38 -0700)]
freedreno/ir3: Fix disasm of register offsets in ldp/stp.
I had a stp testcase that was getting its offset wrong, and by twiddling
bits and feeding it to qc disasm, I found that the comment was sort of
right: some the cat6a bits implicated in the old comment do get used, as
the high bits of the cat6c offset. Reallocating those bits also fixes how
we were getting r960.y for r0.y.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Eric Anholt [Wed, 8 Jul 2020 23:51:16 +0000 (16:51 -0700)]
freedreno/ir3: Refactor cat6 general dst printing.
We didn't need the extra branch and temp, we can move it inside of the dst
handling by just duplicating the print of the dst reg.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Eric Anholt [Wed, 8 Jul 2020 23:09:48 +0000 (16:09 -0700)]
freedreno/ir3: Add a bunch more tests for cat6 opcodes.
This started with making note of some ldp/stp instructions from the blob
and how we differ from them. In the process of fixing it, I accidentally
modified behavior of other opcodes, and the other instructions listed will
keep us from doing that. I also dropped an old stl test that looks like I
took from after a shader 'end' instruction.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Eric Anholt [Wed, 8 Jul 2020 23:37:55 +0000 (16:37 -0700)]
freedreno/ir3: Add a note about the instructions in the disasm test.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815>
Jason Ekstrand [Mon, 20 Jul 2020 14:50:21 +0000 (09:50 -0500)]
vulkan: Update Vulkan XML and headers to 1.2.148
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5983>
Eric Anholt [Fri, 26 Jun 2020 17:59:41 +0000 (10:59 -0700)]
ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env.
fd.o has retuned the x86 runners on packet for -j8. Rather than having to
tweak our CI every time fd.o decides to rebalance job concurrency, respect
what the runner admin has chosen for their builds (this will also be
convenient for people with large local runners).
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5669>
Daniel Schürmann [Wed, 15 Jul 2020 17:31:22 +0000 (19:31 +0200)]
nir/algebraic: fold some nested bcsel
Totals from 14266 (10.62% of 134368) affected shaders (Polaris):
SGPRs: 761756 -> 762732 (+0.13%); split: -0.00%, +0.13%
VGPRs: 430392 -> 430924 (+0.12%); split: -0.05%, +0.17%
SpillSGPRs: 4652 -> 4628 (-0.52%); split: -0.60%, +0.09%
CodeSize:
30133000 ->
29949780 (-0.61%); split: -0.66%, +0.05%
MaxWaves: 102122 -> 102111 (-0.01%); split: +0.00%, -0.01%
Instrs:
5845085 ->
5841668 (-0.06%); split: -0.08%, +0.03%
Cycles:
69033140 ->
68889188 (-0.21%); split: -0.22%, +0.01%
VMEM:
8479021 ->
8474978 (-0.05%); split: +0.03%, -0.08%
SMEM: 831437 -> 830464 (-0.12%); split: +0.06%, -0.18%
VClause: 105411 -> 105410 (-0.00%); split: -0.01%, +0.01%
SClause: 327727 -> 327780 (+0.02%); split: -0.00%, +0.02%
Copies: 372704 -> 373306 (+0.16%); split: -0.16%, +0.32%
Branches: 112260 -> 112269 (+0.01%); split: -0.00%, +0.01%
PreSGPRs: 433308 -> 433631 (+0.07%); split: -0.01%, +0.09%
PreVGPRs: 397888 -> 397905 (+0.00%); split: -0.01%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 15 Jul 2020 17:30:34 +0000 (19:30 +0200)]
nir/algebraic: propagate b2i out of ior/iand
Totals from 761 (0.57% of 134368) affected shaders (Polaris):
SGPRs: 29496 -> 29488 (-0.03%)
SpillSGPRs: 41 -> 43 (+4.88%)
CodeSize:
1922036 ->
1882408 (-2.06%); split: -2.08%, +0.02%
Instrs: 366051 -> 360362 (-1.55%); split: -1.57%, +0.02%
Cycles:
7692516 ->
7661216 (-0.41%); split: -0.41%, +0.01%
VMEM: 365175 -> 365172 (-0.00%)
VClause: 15324 -> 15322 (-0.01%)
SClause: 9825 -> 9824 (-0.01%); split: -0.02%, +0.01%
Copies: 41216 -> 41294 (+0.19%); split: -0.01%, +0.20%
Branches: 7020 -> 7033 (+0.19%)
PreSGPRs: 22103 -> 22106 (+0.01%)
PreVGPRs: 26518 -> 26515 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 15 Jul 2020 17:23:54 +0000 (19:23 +0200)]
nir/algebraic: add distributive rules for ior/iand
Totals from 581 (0.43% of 134368) affected shaders (Polaris):
CodeSize:
1389560 ->
1386488 (-0.22%)
Instrs: 264488 -> 263984 (-0.19%)
Cycles:
1057952 ->
1055936 (-0.19%)
VMEM: 296016 -> 291613 (-1.49%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Thu, 30 Apr 2020 09:58:08 +0000 (10:58 +0100)]
nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a)
Totals from affected shaders: (VEGA)
SGPRS: 13920 -> 13920 (0.00 %)
VGPRS: 10252 -> 10252 (0.00 %)
Spilled SGPRs: 62 -> 62 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 587648 -> 587224 (-0.07 %) bytes
LDS: 5 -> 5 (0.00 %) blocks
Max Waves: 1489 -> 1489 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 14:38:54 +0000 (15:38 +0100)]
nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x)
Totals from affected shaders: (VEGA)
SGPRS: 545712 -> 545712 (0.00 %)
VGPRS: 413092 -> 413116 (0.01 %)
Spilled SGPRs: 10616 -> 10616 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
37031684 ->
36984248 (-0.13 %) bytes
LDS: 427 -> 427 (0.00 %) blocks
Max Waves: 54350 -> 54340 (-0.02 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 17:50:27 +0000 (18:50 +0100)]
nir/algebraic: add some more unop + bcsel optimizations
Totals from affected shaders: (VEGA)
SGPRS: 284392 -> 284400 (0.00 %)
VGPRS: 261080 -> 261076 (-0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
24698596 ->
24277788 (-1.70 %) bytes
LDS: 196 -> 196 (0.00 %) blocks
Max Waves: 10101 -> 10105 (0.04 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 16:56:05 +0000 (17:56 +0100)]
nir/algebraic: add optimizations for fsign/isign
This just reverts fsign/isign lowering.
Totals from affected shaders:
SGPRS: 257496 -> 256672 (-0.32 %)
VGPRS: 181800 -> 178864 (-1.61 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
11355852 ->
11141840 (-1.88 %) bytes
LDS: 3789 -> 3789 (0.00 %) blocks
Max Waves: 30453 -> 30951 (1.64 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 16:49:45 +0000 (17:49 +0100)]
nir/algebraic: optimize iand/ior of (n)eq zero
Found in some Detroit: Become Human shaders.
Totals from affected shaders:
SGPRS: 700256 -> 700256 (0.00 %)
VGPRS: 507208 -> 507212 (0.00 %)
Spilled SGPRs: 142531 -> 142531 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
76404616 ->
76301768 (-0.13 %) bytes
LDS: 43 -> 43 (0.00 %) blocks
Max Waves: 21438 -> 21438 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Wed, 29 Apr 2020 14:36:41 +0000 (15:36 +0100)]
nir: also move b2i in case of nir_move_copies
Booleans are often more efficient with register usage.
This also allows to move comparisons further.
Totals from affected shaders: (VEGA)
SGPRS: 451608 -> 450320 (-0.29 %)
VGPRS: 351448 -> 351256 (-0.05 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 1008 -> 1008 (0.00 %) dwords per thread
Code Size:
26555596 ->
26551080 (-0.02 %) bytes
LDS: 10323 -> 10323 (0.00 %) blocks
Max Waves: 42850 -> 42934 (0.20 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Daniel Schürmann [Tue, 28 Apr 2020 10:45:07 +0000 (11:45 +0100)]
nir/algebraic: optimize bcsel(a, 0, 1) to b2i
This avoids combination with other bcsel operations,
and as b2i is often a no-op (when used for iadd and such),
the resulting pattern is preferable.
Totals from affected shaders: (VEGA)
SGPRS: 598448 -> 598448 (0.00 %)
VGPRS: 457940 -> 457352 (-0.13 %)
Spilled SGPRs: 127154 -> 127154 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
64836352 ->
64802728 (-0.05 %) bytes
LDS: 781 -> 781 (0.00 %) blocks
Max Waves: 22931 -> 22931 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
Icecream95 [Sun, 19 Jul 2020 10:31:26 +0000 (22:31 +1200)]
pan/mdg: Use the blend RT for blend shader framebuffer fetches
Fixes piglit test fbo-drawbuffers-blend-add when fixed-function
blending is disabled in panfrost_get_blend_for_context.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>
Icecream95 [Tue, 14 Jul 2020 00:05:47 +0000 (12:05 +1200)]
panfrost: 8x MRT support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>
Icecream95 [Mon, 13 Jul 2020 23:50:10 +0000 (11:50 +1200)]
panfrost: Use more tilebuffer sizes
This will be needed for 8x MRT with 128-bit framebuffer formats.
Adds support for 256-bit, 1024-bit, and 2048-bit tilebuffer allocations,
depending on the amount of data required.
v2: Squash commits (Alyssa)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892>
Icecream95 [Mon, 13 Jul 2020 10:45:51 +0000 (22:45 +1200)]
panfrost: Fake RGTC support
For most GPUs RGTC is disabled, so it needs to be emulated, using the
fake_rgtc option of u_transfer_helper.
Passes the rgtc-teximage tests in piglit.
v2: Update docs/features.txt (Alyssa)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5975>
Rhys Perry [Fri, 17 Jul 2020 10:46:47 +0000 (11:46 +0100)]
spirv: don't split memory barriers
If the SPIR-V had a shared+image memory barrier, we would emit two NIR
barriers: a shared barrier and an image barrier.
Unlike a single barrier, two barriers allows transformations such as:
intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_shared () ()
intrinsic memory_barrier_image () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
->
intrinsic memory_barrier_shared () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_image () ()
This commit fixes two dEQP-VK.memory_model.* CTS tests with ACO.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5951>
Samuel Pitoiset [Thu, 30 Apr 2020 18:47:46 +0000 (20:47 +0200)]
radv/winsys: always allow GTT placements on APUs
When the VRAM size is small and the preferred heap only VRAM,
the kernel tries to always honor the requested heap and it does
a ton of evictions which is a disaster for performance.
On APUs, VRAM and GTT have similar performance, so allow the
kernel to choose the best placement (GTT or VRAM) itself.
This gives a huge performance boost with Doom Eternal on RAVEN.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5665>
Samuel Pitoiset [Fri, 17 Jul 2020 20:51:34 +0000 (22:51 +0200)]
radv: disable CPU caching for IBS to reduce fetch latency
AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5959>
Pierre-Eric Pelloux-Prayer [Mon, 13 Jul 2020 13:36:25 +0000 (15:36 +0200)]
radeonsi: adjust epitch for PIPE_FORMAT_R8G8_R8B8_UNORM
This fix si_compute_copy_image for yuyv image (so using PIPE_FORMAT_R8G8_R8B8_UNORM).
With this change, the following gst pipeline produce the expected results for various
image sizes (with or without AMD_DEBUG=nodma):
gst-launch-1.0 filesrc location=input.jpg ! jpegparse ! vaapijpegdec ! filesink location=output.yuv
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5841>
Pierre-Eric Pelloux-Prayer [Thu, 9 Jul 2020 12:10:51 +0000 (14:10 +0200)]
ac/surface: adapt surf_size when modifying surf_pitch
Otherwise we might get VM_L2_PROTECTION_FAULT_STATUS errors.
Fixes: 8275dc1ed57 ("ac/surface: fix epitch when modifying surf_pitch")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5841>
Gert Wollny [Sat, 18 Jul 2020 20:42:54 +0000 (22:42 +0200)]
d600/sfn: write stream outputs to correct mem ring
Fixes: arb_gpu_shader5-xfb-streams
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:49:56 +0000 (16:49 +0200)]
r600/sfn: Make the pin_to_channel generic
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:49:14 +0000 (16:49 +0200)]
r600/sfn: Only use sample mask if the according shader key is set
This fixes all the piglits from arb_sample_shading "samplemask * *"
with the nir backend.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 16:35:35 +0000 (18:35 +0200)]
r600: Add shader key item to identify when the sample mask should be used
The sample mask must be applied when more then one sample is available or
multisamplig is not enabled, so add a shader key to track this.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:46:32 +0000 (16:46 +0200)]
r600/sfn: Fix default z swizzle for GDS instructions
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:54:10 +0000 (16:54 +0200)]
r600/sfn: Fix Ring output swizzle masks
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sat, 18 Jul 2020 19:33:54 +0000 (21:33 +0200)]
r600/sfn: Add a forced output swizzle for depth write
This makes sure no components are written that shouldn't be written.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sat, 18 Jul 2020 18:34:37 +0000 (20:34 +0200)]
r600/sfn: correct handling of loading vec4 with fetching constants
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sat, 18 Jul 2020 18:21:10 +0000 (20:21 +0200)]
r600/sfn: Add option to get a temp value for a specific channel
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:39:09 +0000 (16:39 +0200)]
r600/sfn: emit texture instructions in one block
Setting the offset must happen in the same CF like using it, so don't
emit ALU instruction between the tex instructions.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:46:03 +0000 (16:46 +0200)]
r600/sfn: Pipe through requesting a register at a given channel
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Gert Wollny [Sun, 5 Jul 2020 14:36:11 +0000 (16:36 +0200)]
r600/sfn: lower rotate ALU ops
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5963>
Dave Airlie [Mon, 20 Jul 2020 00:45:42 +0000 (10:45 +1000)]
ci/llvmpipe: reenable gpu shader5 tests
I hadn't realised these were disabled, llvmpipe now exposes this extension.
One additional failure is fine to get the added testing coverage.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5973>
Dave Airlie [Tue, 14 Jul 2020 23:53:03 +0000 (09:53 +1000)]
llvmpipe: add framebuffer fetching support (v1.1)
v1.1:
Merge two if blocks (Roland)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5914>
Dave Airlie [Thu, 16 Jul 2020 19:22:18 +0000 (05:22 +1000)]
llvmpipe/cs: respect render condition
Running complete CTS turned up a missing cond render.
Fixes KHR-GL45.compute_shader.conditional-dispatching
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5944>
Rob Clark [Fri, 17 Jul 2020 18:06:55 +0000 (11:06 -0700)]
freedreno/ir3/ra: fix array conflicts for split/merged
Properly handle the difference between split and merged register file
when determining where arrays can fit without conflicting with other
arrays or pre-colored instructions.
1) if not mergedregs, only consider other things with same precision
as potentially conflicting
2) if mergedregs, calculate everything in therms of half-regs and
convert back to fullregs in the end
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 19:41:11 +0000 (12:41 -0700)]
freedreno/ir3/ra: assign vreg names to all array elements
We shouldn't divide-by-two for half-reg arrays. We set the proper node
interference class, based on `arr->half`.
Fixes a RA fail with 16b arrays:
src/freedreno/ir3/ir3_ra.c:633: name_to_array: Assertion `!"invalid array name"' failed.
Caused by use/def iterators returning `arr->length` vreg namess, but
only assigning the array half that many names.
Also, since we are assigning unique vreg names to each array element,
there is no need to try and convert from half-reg to it's conflicting
full reg when pre-coloring the array elements. Getting us closer to
having half-arrays work sanely with split-register-file (a5xx and
earlier).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 19:11:24 +0000 (12:11 -0700)]
freedreno/ir3/ra: debug msgs tweak
Print out the assigned vreg names earlier. Also print the few special
nodes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 22:20:45 +0000 (15:20 -0700)]
freedreno/ir3: fix half-reg array stores
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 22:13:53 +0000 (15:13 -0700)]
freedreno/ir3: set array precision on creation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Fri, 17 Jul 2020 16:35:18 +0000 (09:35 -0700)]
freedreno/ir3/parser: half-precision relative regs
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Tue, 14 Jul 2020 18:48:10 +0000 (11:48 -0700)]
freedreno: whitespace fix
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Rob Clark [Thu, 16 Jul 2020 14:40:02 +0000 (07:40 -0700)]
freedreno: small comment re-word
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:41:43 +0000 (11:41 -0400)]
zink: free all ntv allocations after creating shader module
these are all fairly large sources of leaks
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:42:10 +0000 (11:42 -0400)]
zink: free pipeline cache during program destroy
more leaks
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:13:35 +0000 (11:13 -0400)]
zink: destroy descriptor pools on context destroy
this is a big leak
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mike Blumenkrantz [Wed, 3 Jun 2020 15:08:34 +0000 (11:08 -0400)]
zink: destroy gfx program when a shader is freed
there's no sense in having these objects sitting around when they can
never be used again
requires adding a zink_context* pointer to each program in order to prune
the hash table entry
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5887>
Mauro Rossi [Fri, 17 Jul 2020 22:11:51 +0000 (00:11 +0200)]
android: panfrost/encoder: add libmesa_nir static dependency
Fixes the following build error:
In file included from external/mesa/src/panfrost/encoder/pan_blit.c:34:
In file included from external/mesa/src/panfrost/encoder/../midgard/midgard_compile.h:27:
external/mesa/src/compiler/nir/nir.h:52:10: fatal error: 'nir_opcodes.h' file not found
^~~~~~~~~~~~~~~
1 error generated.
Fixes: 293f251871b ("panfrost: Use Midgard-specific reloads")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5961>
Icecream95 [Fri, 17 Jul 2020 23:39:45 +0000 (11:39 +1200)]
panfrost: Fix calls to panfrost_flush_batches_accessing_bo
The function now takes a bool flush_readers instead of an access type,
but some calls were not updated.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5962>
Icecream95 [Fri, 17 Jul 2020 23:36:36 +0000 (11:36 +1200)]
panfrost: Make panfrost_bo_wait take a wait_readers bool
panfrost_bo_wait is often used after
panfrost_flush_batches_accessing_bo, so make them take similar
arguments for consistency.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5962>
Eric Anholt [Tue, 30 Jun 2020 20:05:51 +0000 (13:05 -0700)]
freedreno/ir3: Add unit tests for derivatives disasm.
Since I was going back to look at fine derivs again, add some tests of
instruction encoding.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
Eric Anholt [Tue, 30 Jun 2020 19:17:10 +0000 (12:17 -0700)]
freedreno/ir3: Fix duplicated fine derivatives instructions.
legalize_block() can get run multiple times, which I didn't notice when
adding fine derivs support. Other instruction clones change things such
that the legalization won't trigger again, but that didn't apply to the
DS.PP legalization. To keep someone else from tripping over this, split
the one-shot legalization out of the iterative sync flag application.
Fixes failures in dEQP-VK.glsl.derivate.dfdxfine.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
Bas Nieuwenhuizen [Sat, 11 Jul 2020 12:34:58 +0000 (14:34 +0200)]
amd/addrlib: Clean up unused colorFlags argument
Cleanup.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>
Bas Nieuwenhuizen [Sat, 11 Jul 2020 20:04:25 +0000 (22:04 +0200)]
amd/common: Cache intra-tile addresses for retile map.
However complicated DCC addressing is it is still based on tiles.
If we have the intra-tile offsets + tile dimensions we can expand
that to the full image ourselves.
Behavior around ~1080p on a 2500U:
old:
30-60 ms on every miss
new:
5 ms initally (miss in the tile cache)
<0.5 ms afterwards
The most common case is that the tile cache only contains data for
2 tiles, which for Raven/Renoir/Navi14 will be 4 KiB each, so the
size increase is fairly modest.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>
Rhys Perry [Wed, 15 Jul 2020 16:08:01 +0000 (17:08 +0100)]
aco: use s_waitcnt_depctr to mitigate VMEMtoScalarWriteHazard
Apparently this is potentially faster than v_nop:
https://reviews.llvm.org/D83872
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5923>
Rhys Perry [Wed, 15 Jul 2020 16:34:21 +0000 (17:34 +0100)]
aco: properly recognize that s_waitcnt mitigates VMEMtoScalarWriteHazard
fossil-db (Navi):
Totals from 555 (0.41% of 135946) affected shaders:
CodeSize:
1005716 ->
1003400 (-0.23%)
Instrs: 195326 -> 194744 (-0.30%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5923>
Eric Anholt [Fri, 3 Jul 2020 19:03:52 +0000 (12:03 -0700)]
meson: Enable GCing of functions and data from compilation units by default.
Normally, the linker will pull in any compilation unit (aka .c file) from
a static lib (such as our shared util code) that is depended on by the
code linking against it. Since that code is already compiled, the .text
section is allowed to jump anywhere in .text, and the compiler can't
garbage collect unused functions inside of a compile unit.
Teasing callgraphs apart so that normal compilation-unit-level GCing can
reduce driver size hurts the logical organization of the code and is
difficult. As an example, once I'd split the format pack/unpack tables, I
had to split out util_format_read/write() from util_format.c to avoid
pulling in pack/unpack. But even then it didn't help, because it turns
out turnip's pack calls pull in util_format_bptc.c for bptc packing, but
that file also includes the unpack impls, and those internally call
util_format_unpack, and thus we pulled in all of unpack. Splitting all of
this to separate files makes code harder to find and maintain, and is a
waste of dev time.
By setting these compiler flags, the compiler puts each function and data
symbol in a separate ELF section and the linker can then safely GC unused
text and data sections from a compile unit that gets pulled in. There's a
bit of a space cost due to having those separate sections, but it ends up
being a huge win in disk space on my personal release driver builds:
- i965_dri.so -213k
- x86 gallium dri.so -430k
- libvulkan_intel.so -272k
- aarch64 gallium dri.so -330k
- libvulkan_freedreno.so -783k
No difference on iris drawoverhead -compat -test 1 on my skylake (n=60)
Effect on debugoptimized build times (n=5)
touch nir_lower_io.c build time (bfd) +15.999% +/- 3.80377%
touch freedreno fd6_gmem.c build time (bfd) +13.5294% +/- 4.86363%
touch nir_lower_io.c build time (lld) no change
touch freedreno fd6_gmem.c build time (lld) +2.45375% +/- 2.2383%
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5739>
Alyssa Rosenzweig [Fri, 17 Jul 2020 21:04:41 +0000 (17:04 -0400)]
panfrost: Enable FP16 by default
I see no reason to hide this. The small hit in cycle count is offset in
practice by the increase in thread count. So let's ship it and get some
testing.
If this regresses a workload:
1. Open an issue on the tracker and attach an apitrace.
2. In the meantime set PAN_MESA_DEBUG=nofp16 to override.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5960>
Rob Clark [Thu, 16 Jul 2020 21:20:22 +0000 (14:20 -0700)]
gitlab-ci: re-enable all a630 jobs
I haven't noticed tftp boot issues in last few days, not sure if they
where just a fluke on Mon or if it is somehow related to # of jobs we
run (ie. having more of the c630 runners powered up and running more
of the time).
Let's turn them back on and see what happens.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5952>
Eric Anholt [Fri, 17 Jul 2020 17:48:56 +0000 (10:48 -0700)]
freedreno/a2xx: Fix compiler warning in disasm.
warning: converting a packed ‘instr_cf_t’ {aka ‘union <anonymous>’}
pointer (alignment 1) to a ‘uint16_t’ {aka ‘short unsigned int’} pointer
(alignment 2) may result in an unaligned pointer value
[-Waddress-of-packed-member]
We may know that we'll only ever have aligned instr_cf_ts, but gcc
doesn't.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5955>
Tomeu Vizoso [Thu, 9 Jul 2020 20:38:51 +0000 (22:38 +0200)]
gitlab-ci: Re-add kernels for bare-metal
I mistakenly removed what I thought were remnants of when Freedreno used
LAVA for their DUTs. lava_arm.sh is used for baremetal, so re-add that
code.
Fixes: dcd171f5e9bd ("gitlab-ci: More stable URL for kernel and ramdisk artifacts, for LAVA")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5839>
Icecream95 [Fri, 17 Jul 2020 11:26:41 +0000 (23:26 +1200)]
nir: Set the alignment for SSBO lowering
The alignment can just be copied from the source intrinsic.
Fixes the assertion
nir_intrinsic_align_offset(instr) < nir_intrinsic_align_mul(instr)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5949>
Eric Anholt [Thu, 9 Jul 2020 16:41:45 +0000 (09:41 -0700)]
intel/perf: Move perf query register programming to static tables.
And now that they're static tables, we don't need to ralloc a copy in
non-shared memory.
Saves ~210k in the built intel drivers.
Bug: https://bugs.chromium.org/p/chromium/issues/detail?id=
1048434
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5829>
Eric Anholt [Thu, 9 Jul 2020 17:06:08 +0000 (10:06 -0700)]
intel/perf: Fix unused var warning in release builds.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5829>
Eric Anholt [Thu, 9 Jul 2020 18:07:42 +0000 (11:07 -0700)]
intel: Fix release-build warnings about sf_entry_size.
In one side of the ifdef it's only used in an assert.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5829>
Erik Faye-Lund [Fri, 17 Jul 2020 17:18:58 +0000 (19:18 +0200)]
zink: use ralloc for spirv_builder as well
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5954>
Erik Faye-Lund [Fri, 17 Jul 2020 17:03:05 +0000 (19:03 +0200)]
zink: pass mem_ctx to ralloc_size-call
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5954>
Erik Faye-Lund [Fri, 17 Jul 2020 16:55:53 +0000 (18:55 +0200)]
zink: use ralloc for plain malloc-calls
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5954>
Erik Faye-Lund [Fri, 17 Jul 2020 16:52:13 +0000 (18:52 +0200)]
zink: use ralloc in nir-to-spirv
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5954>
Rhys Perry [Thu, 2 Jul 2020 12:38:18 +0000 (13:38 +0100)]
radv: enable more float_controls features
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>
Rhys Perry [Thu, 2 Jul 2020 12:37:10 +0000 (13:37 +0100)]
aco: set tcs_in_out_eq=false if float controls of VS and TCS stages differ
Otherwise, we might have both VS and TCS code in the same block but float
controls are set per-block.
We also rely on VS code not dominating TCS code for the optimizer to work
correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>
Rhys Perry [Thu, 2 Jul 2020 12:35:41 +0000 (13:35 +0100)]
aco: fix nir_op_f2f16_rtne with non-default rounding modes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>