mesa.git
Rhys Perry [Fri, 8 May 2020 16:58:07 +0000 (17:58 +0100)]
aco: consider affinities when creating v_mac_f32

Totals from 8487 (6.65% of 127638) affected shaders:
CodeSize: 62061988 -> 62058020 (-0.01%); split: -0.01%, +0.01%
Instrs: 11910757 -> 11885409 (-0.21%); split: -0.21%, +0.00%
Copies: 1065244 -> 1040945 (-2.28%); split: -2.30%, +0.02%
Branches: 349665 -> 348914 (-0.21%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
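
The percentages in these totals can be reproduced with plain arithmetic. The following is a minimal sketch in C; the helper names `percent_change` and `percent_share` are made up for illustration and are not part of any Mesa or shader-db tool.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * Hypothetical helper, named for this example only. */
double percent_change(long before, long after)
{
   assert(before != 0);
   return 100.0 * (double)(after - before) / (double)before;
}

/* Share of `part` in `total`, in percent (hypothetical helper). */
double percent_share(long part, long total)
{
   assert(total != 0);
   return 100.0 * (double)part / (double)total;
}
```

For the Copies row, `percent_change(1065244, 1040945)` is roughly -2.28, and `percent_share(8487, 127638)` is roughly 6.65, which matches the reported figures after rounding.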

Rhys Perry [Fri, 8 May 2020 10:45:57 +0000 (11:45 +0100)]
aco: mark phi definitions as last-seen phi operands

Totals from 14340 (11.23% of 127638) affected shaders:
SGPRs: 1251648 -> 1251512 (-0.01%)
VGPRs: 994556 -> 994104 (-0.05%); split: -0.06%, +0.01%
CodeSize: 122894528 -> 121099604 (-1.46%); split: -1.49%, +0.03%
MaxWaves: 106039 -> 106103 (+0.06%); split: +0.06%, -0.00%
Instrs: 23860066 -> 23414317 (-1.87%); split: -1.90%, +0.03%
Copies: 2448228 -> 2049305 (-16.29%); split: -16.37%, +0.07%
Branches: 789381 -> 757921 (-3.99%); split: -4.62%, +0.64%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>

Rhys Perry [Thu, 7 May 2020 13:27:42 +0000 (14:27 +0100)]
aco: fix consecutively written vgprs from vmem instructions

If one VMEM instruction uses a sampler and the other doesn't, we can't do
this optimization.

Totals from 47 (0.04% of 127638) affected shaders:
CodeSize: 271744 -> 271656 (-0.03%); split: -0.04%, +0.01%
Instrs: 52783 -> 52761 (-0.04%); split: -0.05%, +0.01%
Cycles: 5547040 -> 5546952 (-0.00%); split: -0.00%, +0.00%
VMEM: 10022 -> 9887 (-1.35%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>

Rhys Perry [Thu, 7 May 2020 14:02:20 +0000 (15:02 +0100)]
aco: simplify consecutive ordered vmem/lds writes optimization

This was unnecessary and messed with statistics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>

Andres Gomez [Thu, 30 Apr 2020 21:05:07 +0000 (00:05 +0300)]
gitlab-ci: correct tracie behavior with replay errors

[dump_trace_images] Info: Dumping trace /tmp/tracie.test.ap5pshYcsg/traces-db/trace1/magenta.testtrace... ERROR
[dump_trace_images] Debug: === Failure log start ===
invalid literal for int() with base 16: 'in'
[dump_trace_images] Debug: === Failure log end ===
[check_image] Trace /tmp/tracie.test.ap5pshYcsg/traces-db/trace1/magenta.testtrace couldn't be replayed. See above logs for more information.
Traceback (most recent call last):
  File "/tmp/tracie.test.ap5pshYcsg/tracie.py", line 176, in <module>
    main()
  File "/tmp/tracie.test.ap5pshYcsg/tracie.py", line 164, in main
    ok, result = gitlab_check_trace(project_url, commit_id, args.device_name, trace, expectation)
TypeError: cannot unpack non-iterable bool object

Fixes: efbbf8bb81e ("tracie: Print results in a machine readable format")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4839>

Andres Gomez [Thu, 30 Apr 2020 19:49:58 +0000 (22:49 +0300)]
gitlab-ci: create always the "results" directory with tracie

Otherwise, we will fail when the traces description file doesn't
contain any checksum for the specified device.

Fixes: efbbf8bb81e ("tracie: Print results in a machine readable format")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4839>

Samuel Pitoiset [Mon, 11 May 2020 07:54:11 +0000 (09:54 +0200)]
radv: add a LLVM version string workaround for SotTR and ACO

When the LLVM version is too old or missing, SotTR applies shader
workarounds and that reduces performance by 2-5% with ACO.

SotTR workarounds are applied with LLVM 8 and older, so reporting
LLVM 9.0.1 should be fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4984>

Samuel Pitoiset [Tue, 12 May 2020 14:17:31 +0000 (16:17 +0200)]
turnip: use the common code for generating extensions and dispatch tables

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>

Samuel Pitoiset [Mon, 11 May 2020 13:08:16 +0000 (15:08 +0200)]
anv: use the common code for generating extensions and dispatch tables

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>

Samuel Pitoiset [Mon, 11 May 2020 09:33:00 +0000 (11:33 +0200)]
radv: use the common code for generating extensions and dispatch tables

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>

Samuel Pitoiset [Mon, 11 May 2020 12:36:02 +0000 (14:36 +0200)]
vulkan: import common code for generating extensions

ANV and RADV have similar Python code for generating extensions
and dispatch tables.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>

Samuel Pitoiset [Wed, 29 Apr 2020 08:19:11 +0000 (10:19 +0200)]
radv: implement VK_EXT_private_data

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>

Samuel Pitoiset [Wed, 29 Apr 2020 12:57:20 +0000 (14:57 +0200)]
radv: use the base object struct types

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>

Samuel Pitoiset [Wed, 29 Apr 2020 08:16:32 +0000 (10:16 +0200)]
radv: use the common base object type for VkDevice

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>

Marek Vasut [Sat, 2 May 2020 20:24:25 +0000 (22:24 +0200)]
etnaviv: Disable seamless cube map on GC880

The GC880 on iMX6DL indicates in its minorFeatures2 register that it
does support SEAMLESS_CUBE_MAP. However, when the TE.SAMPLER_CONFIG1
VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit is set on GC880 on iMX6DL,
the result is a corrupted image. In particular, the following ~112 dEQPs
are affected and fail:

  dEQP-GLES2.functional.texture.filtering.cube.*

This only happens on MX6DL GC880. MX6Q GC2000 and STM32MP1 GC400 (GCnano)
do not report the minorFeatures2 SEAMLESS_CUBE_MAP bit and ignore the
TE_SAMPLER_CONFIG1 VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit (note
that ss->seamless_cube_map is unconditionally set by mesa at times, even
when PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE returns 0), so there is no
visible problem and there are no failing dEQP tests on the GC2000 and GCnano.

This might imply that the minorFeatures2 SEAMLESS_CUBE_MAP has some
different meaning on GC880 or the SEAMLESS_CUBE_MAP behaves differently
on the GC880.

This patch does not set the SEAMLESS_CUBE_MAP bit on hardware which does
not indicate support for seamless cube map, nor on GC880. This results
in a reduction in failed dEQPs: from 635 to 186 on GC880, from 274 to 270
on GC2000, and no change on GC400 (GCnano).

Fixes: 8dd26fa2f06 ("etnaviv: support GL_ARB_seamless_cubemap_per_texture")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4865>
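
The rule described in this commit can be sketched as a small predicate. This is an illustration only: the struct, its fields, the function name, and the use of 0x880 as the model ID for GC880 are assumptions made for this example and do not mirror the etnaviv driver's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a Vivante GPU core. */
struct gpu_info {
   uint32_t model;                  /* assumed encoding, e.g. 0x880 for GC880 */
   bool reports_seamless_cube_map;  /* minorFeatures2 SEAMLESS_CUBE_MAP bit */
};

/* Decide whether the SEAMLESS_CUBE_MAP sampler bit may be set. */
bool can_use_seamless_cube_map(const struct gpu_info *gpu)
{
   assert(gpu);

   /* Hardware that does not advertise the feature never gets the bit. */
   if (!gpu->reports_seamless_cube_map)
      return false;

   /* GC880 advertises the feature, but setting the bit corrupts the image. */
   if (gpu->model == 0x880)
      return false;

   return true;
}
```

Keeping both conditions in one place means the sampler state code only has to consult a single flag, whatever the feature register claims.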

Rob Clark [Tue, 12 May 2020 23:39:20 +0000 (16:39 -0700)]
freedreno/a6xx: fix max-scissor opt

On a6xx we need a 0,0 based scissor in the binning pass, but can use the
blit-scissor to avoid restore/resolve of untouched pixels, and use the
conditional execution if the IB to bin to skip bins with no geometry
(due to the scissor).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5021>

Rob Clark [Wed, 6 May 2020 17:29:01 +0000 (10:29 -0700)]
freedreno/ir3/sched: try to avoid syncs

Similar to what we do in postsched.  It is useful for pre-RA sched to be
a bit aware of things that would cause syncs.  In particular for the tex
fetches, since the vecN src/dst tends to limit postsched's ability to
re-order them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>

Rob Clark [Wed, 6 May 2020 17:20:14 +0000 (10:20 -0700)]
freedreno/ir3/sched: avoid scheduling outputs

If an instruction's only use is as an output, and it increases register
pressure, then try to avoid scheduling it until there are no other
options.

A semi-common pattern is `fragcolN.a = 1.0`; this change pushes all
these immed loads to the end of the shader.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>

Rob Clark [Wed, 6 May 2020 17:06:17 +0000 (10:06 -0700)]
freedreno/ir3/postsched: try to avoid (sy) syncs

Similar to avoidance of `(ss)` syncs, it turns out to be helpful to
avoid `(sy)` syncs as well.  This helps us turn a tex, (sy)alu, tex,
(sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in
gfxbench gl_fill2.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>

Rob Clark [Wed, 6 May 2020 17:01:08 +0000 (10:01 -0700)]
freedreno/ir3/postsched: reset sfu_delay on sync

Once we schedule an instruction that will require an `(ss)` sync flag,
there is no need to delay any further instructions that consume an
SFU result (until the next SFU instruction is scheduled).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
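
The bookkeeping described above might look roughly like the following. This is a sketch under stated assumptions, not the ir3 implementation: the struct, the function name, and the delay value of 8 are hypothetical and chosen only to make the example concrete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical number of instruction slots before an SFU result is ready. */
#define SFU_DELAY 8

struct sched_state {
   unsigned sfu_delay;   /* slots left before an SFU result can be consumed */
};

/* Update the counter after one instruction has been scheduled. */
void note_scheduled(struct sched_state *s, bool is_sfu, bool needs_ss_sync)
{
   assert(s);

   if (is_sfu) {
      /* A new SFU instruction restarts the delay window. */
      s->sfu_delay = SFU_DELAY;
   } else if (needs_ss_sync) {
      /* The (ss) flag already waits for outstanding SFU results, so
       * later consumers do not need to be held back any more. */
      s->sfu_delay = 0;
   } else if (s->sfu_delay > 0) {
      s->sfu_delay--;
   }
}
```

The counter only becomes non-zero again once another SFU instruction is scheduled, which corresponds to the "until the next SFU instruction" condition in the commit message.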

Rob Clark [Mon, 4 May 2020 22:13:20 +0000 (15:13 -0700)]
freedreno/ir3: limit # of tex prefetch by shader size

It seems for short frag shaders, too much prefetch can be detrimental.
I think what we *really* want to do is decide after pre-RA sched, when
we also know about nop's and what the actual ir3 instruction count is.
But that will require re-working how prefetch lowering works.  For now
this is a super crude heuristic to attempt to approximate a good
solution.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>

Rob Clark [Thu, 7 May 2020 20:24:46 +0000 (13:24 -0700)]
freedreno/ir3: fix indirect cb0 load_ubo lowering

We can no longer assume that `state->ranges[0]` is block 0.  It *often*
is, but when we encounter a "real" ubo that we lower to `load_uniform`
before a block 0 `load_ubo`, it could end up as another entry in the
table.  As a result, the second pass after gathering ubo ranges does not
find a valid range, and a `load_ubo` for a thing that is not actually a
ubo makes its way into the ir3 frontend.  There we grab what we think is
a ubo address out of some unrelated const register and try to
dereference that.  Which, as you can imagine, fails in amusing ways.

Fixes: fc850080ee3 ("ir3: Rewrite UBO push analysis to support bindless")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>

Rob Clark [Thu, 7 May 2020 18:09:17 +0000 (11:09 -0700)]
freedreno/ir3: don't allow negative const_offset

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>

Alyssa Rosenzweig [Tue, 12 May 2020 18:14:29 +0000 (14:14 -0400)]
panfrost: Run dEQP-GLES3.functional.shaders.derivate.* on CI

Should be stable now, and should pass except for MSAA tests
(multisampling is still a todo overall).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Alyssa Rosenzweig [Mon, 11 May 2020 13:20:39 +0000 (09:20 -0400)]
pan/mdg: Fix derivative swizzle

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Alyssa Rosenzweig [Wed, 6 May 2020 18:17:34 +0000 (14:17 -0400)]
pan/mdg: Set types for derivatives

Closes #2900

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Alyssa Rosenzweig [Tue, 12 May 2020 17:36:51 +0000 (13:36 -0400)]
pan/mdg: Remove texture_op_count

Was used as a crude approximation of the terminate flag, which we now
can do properly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Alyssa Rosenzweig [Tue, 12 May 2020 17:34:52 +0000 (13:34 -0400)]
pan/mdg: Use analysis to set .cont/.last flags

Corresponds roughly to what we analyze. Note that "terminate AND
execute" is a contradiction (rather: it's equivalent to just
terminating), hence why there are only three possibilities for the
states of the flags:

   .cont = continue, don't execute
   .last = don't continue, don't execute
   .cont.last = continue and execute

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
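
The three flag states listed above can be produced from two analysis results with a pair of boolean expressions. The sketch below is one encoding that is consistent with that list; the struct and function names are hypothetical, and the exact expressions are an assumption derived from the three states rather than a copy of the Midgard compiler code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the two texture flags. */
struct tex_flags {
   bool cont;
   bool last;
};

/* Map the analysis results onto the .cont/.last flags. */
struct tex_flags encode_helper_flags(bool terminate, bool execute)
{
   struct tex_flags f;

   f.cont = !terminate;            /* keep helper invocations alive */
   f.last = terminate || execute;  /* set alone, it means "terminate" */

   /* "terminate AND execute" collapses into plain termination, so only
    * three states exist and at least one flag is always set. */
   assert(f.cont || f.last);
   return f;
}
```

With this mapping, (terminate) gives `.last`, (continue, don't execute) gives `.cont`, and (continue, execute) gives `.cont.last`.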

Alyssa Rosenzweig [Tue, 12 May 2020 17:26:32 +0000 (13:26 -0400)]
pan/mdg: Use the helper invo analyze passes

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Alyssa Rosenzweig [Tue, 12 May 2020 17:19:23 +0000 (13:19 -0400)]
pan/mdg: Analyze helper execution requirements

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Alyssa Rosenzweig [Tue, 12 May 2020 00:22:16 +0000 (20:22 -0400)]
pan/mdg: Analyze helper invocation termination

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Alyssa Rosenzweig [Tue, 12 May 2020 00:05:24 +0000 (20:05 -0400)]
pan/mdg: Explain helper invocations dataflow theory

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>

Arcady Goldmints-Orlov [Mon, 11 May 2020 23:31:49 +0000 (18:31 -0500)]
intel/compiler: fix alignment assert in nir_emit_intrinsic

Fixes: c643979228 (intel/fs: Choose memory message type based on bit size)
Fixes: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec2
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5000>

Eric Anholt [Tue, 12 May 2020 16:39:20 +0000 (09:39 -0700)]
freedreno: Skip taking the lock for resource usage if it's already flagged.

Improves nohw drawoverhead 8-ubos update throughput by 13.493% +/-
0.391444% (n=15).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5011>
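
The general pattern of checking a flag before taking a lock can be sketched as follows. Everything here is hypothetical: the spinlock, the `flagged` field, and the acquisition counter exist only for this example, and the sketch assumes the flag is written solely by the thread that owns the resource's batch, so reading it without the lock is safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock guarding shared resource-tracking state. */
static atomic_flag tracking_lock = ATOMIC_FLAG_INIT;
unsigned lock_acquisitions;   /* instrumentation for this example only */

struct resource {
   bool flagged;   /* already recorded as used by the current batch */
};

static void tracking_lock_acquire(void)
{
   while (atomic_flag_test_and_set(&tracking_lock)) {
      /* spin until the lock is free */
   }
   lock_acquisitions++;
}

static void tracking_lock_release(void)
{
   atomic_flag_clear(&tracking_lock);
}

void resource_mark_used(struct resource *rsc)
{
   assert(rsc);

   /* Fast path: nothing to update, so the lock is never touched. */
   if (rsc->flagged)
      return;

   tracking_lock_acquire();
   rsc->flagged = true;   /* shared bookkeeping would be updated here */
   tracking_lock_release();
}
```

Repeated calls for the same resource take the lock only once, which is where a draw-heavy workload would save time.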

Eric Anholt [Mon, 11 May 2020 22:08:35 +0000 (15:08 -0700)]
freedreno: Move the resource_read early out to an inline.

Looking at perf, the drawoverhead test case was now spending 13% CPU (89%
in that function) on stack management.

nohw drawoverhead throughput 1.03902% +/- 0.380257% (n=13).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>

Eric Anholt [Mon, 11 May 2020 20:46:33 +0000 (13:46 -0700)]
freedreno: Add an early out for preparing to read a resource.

nohw drawoverhead 8 UBOs test throughput 1.06093% +/- 0.363376% (n=10).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>

Eric Anholt [Mon, 11 May 2020 20:28:58 +0000 (13:28 -0700)]
freedreno: Split the fd_batch_resource_used by read vs write.

This is for an optimization I plan in a following commit.  I found I had
to add likely()s to avoid a perf regression from branch prediction.

On the drawoverhead 8 UBOs test, the HW can't quite keep up with the CPU,
but if I set nohw then this change is 1.32023% +/- 0.373053% (n=10).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
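
Splitting the read and write entry points and annotating the common case with a branch hint could look roughly like this. The sketch assumes a GCC or Clang compiler (for `__builtin_expect`); the struct, the function names, and the slow-path counter are hypothetical and not the freedreno API.

```c
#include <assert.h>
#include <stdbool.h>

/* Branch hint in the style of likely(); relies on a GCC/Clang builtin. */
#define likely(x) __builtin_expect(!!(x), 1)

struct resource {
   bool read_tracked;
   bool write_tracked;
};

unsigned slow_path_calls;   /* instrumentation for this example only */

/* Out-of-line bookkeeping shared by both entry points. */
static void track_slow(struct resource *rsc, bool write)
{
   slow_path_calls++;
   rsc->read_tracked = true;   /* a write also counts as a read */
   if (write)
      rsc->write_tracked = true;
}

void resource_read(struct resource *rsc)
{
   assert(rsc);
   if (likely(rsc->read_tracked))
      return;
   track_slow(rsc, false);
}

void resource_write(struct resource *rsc)
{
   assert(rsc);
   if (likely(rsc->write_tracked))
      return;
   track_slow(rsc, true);
}
```

The hint tells the compiler to lay out the early return as the fall-through path, which is the case that matters once a resource has been seen.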

Eric Anholt [Mon, 11 May 2020 20:53:48 +0000 (13:53 -0700)]
freedreno: Add a nohw flag to skip submitting to the kernel.

For some CPU-side-only optimizations, it can be nice to disable rendering
so that we can see what the impact is even on cases where the GPU can't
quite keep up.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>

Brian Ho [Mon, 4 May 2020 20:55:06 +0000 (13:55 -0700)]
turnip: Execute ir3_nir_lower_gs pass again

This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.

As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now that we lower
the GS first, we need to remove the assertion that only a FS can use
the discard_if intrinsic.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>

Rob Clark [Fri, 8 May 2020 23:35:29 +0000 (16:35 -0700)]
freedreno/gmem: rework gmem layout algo

And try a bit harder to find an optimal layout.  Improves on a sub-
optimal layout we arrive at in the 4 MRT pass in manhattan, picking
up a bit more than 3%.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>

Rob Clark [Sat, 9 May 2020 16:42:14 +0000 (09:42 -0700)]
freedreno/gmem: relax alignment on a6xx

The blob only uses single page alignment, and empirically that appears
to work just fine.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>

Rob Clark [Sat, 9 May 2020 20:19:14 +0000 (13:19 -0700)]
freedreno: add gmemtool

A simple standalone thing to run through a bunch of GMEM layouts for a
given gpu.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>

Rob Clark [Sat, 9 May 2020 16:38:02 +0000 (09:38 -0700)]
freedreno/gmem: add helper to dump GMEM layout

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>

Rob Clark [Fri, 8 May 2020 23:13:59 +0000 (16:13 -0700)]
freedreno/gmem: add div_align() helper

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>

Rob Clark [Sat, 9 May 2020 19:31:20 +0000 (12:31 -0700)]
freedreno: initialize max_scissor

Somehow the initialization of this got lost somewhere along the way,
resulting in assuming minx/miny are always zero.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>

Rob Clark [Sat, 9 May 2020 19:29:43 +0000 (12:29 -0700)]
freedreno/gmem: don't assume scissor opt when estimating # of bins

We potentially don't know yet what the resulting scissor bounds are, so
we can't assume this when estimating number of bins per pipe for VSC
size calculations.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>

Jason Ekstrand [Fri, 8 May 2020 07:06:26 +0000 (02:06 -0500)]
vulkan: Handle vkGet/SetPrivateDataEXT on Android swapchains

There is an annoying spec corner on Android.  Because VkSwapchain is
implemented in the Vulkan loader on Android which may not know about
this extension, we have to handle it as a special case inside the
driver.  We only have to do this on Android and only for VkSwapchainKHR.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4882>

Jason Ekstrand [Tue, 21 Apr 2020 21:31:25 +0000 (16:31 -0500)]
anv,vulkan: Implement VK_EXT_private_data

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4882>

Jonathan Marek [Tue, 12 May 2020 15:28:51 +0000 (11:28 -0400)]
turnip: enable tiling for compressed formats

Now that layout code supports this, we can enable it.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>

Jonathan Marek [Tue, 12 May 2020 15:26:05 +0000 (11:26 -0400)]
turnip: update "fetchsize" value to match fdl6_layout changes

It seems this is actually a "minimum pitch" value. For example
TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels.

This fixes breakage with compressed formats. For example this test:

dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3

Fixes: a34b3fa198a4f ("freedreno/fdl: Align after dividing by block size")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>

Eric Anholt [Mon, 11 May 2020 18:18:02 +0000 (11:18 -0700)]
freedreno: Fix non-constbuf-upload UBO block indices and count.

The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse
what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers
to the HW for UBO block 1-N, so let's just fix up the shader state.

Fixes an off by one in const state layout setup, and some really dodgy
register addressing trying to deal with dynamic UBO indices when the UBO
pointers happen to be at the start of the constbuf.

There's no fixes tag, though this fixes a bug from September, because it
would require the num_ubos fix in nir_lower_uniforms_to_ubo.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>

Eric Anholt [Mon, 11 May 2020 18:53:22 +0000 (11:53 -0700)]
nir: Fix count when we didn't lower load_uniforms but did shift load_ubos.

The fixed commit was really nice in mostly fixing num_ubos to reflect the
shader after lowering, but for
dEQP-GLES31.functional.compute.basic.ubo_to_ssbo_single_invocation there
are no default uniforms and so we skipped the increment, even though we
shifted the block index up.

Fixes: 4777ee1a62f0 ("nir: Always create UBO variable when lowering uniforms to ubo")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>

Eric Anholt [Mon, 11 May 2020 16:46:03 +0000 (09:46 -0700)]
freedreno: Drop the "write" arg to emit_const_bo now relocs don't care.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>

Eric Anholt [Fri, 8 May 2020 19:39:36 +0000 (12:39 -0700)]
freedreno: Replace OUT_RELOCW with OUT_RELOC.

Final cleanup commit now that they're the same.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>

Eric Anholt [Fri, 8 May 2020 19:28:08 +0000 (12:28 -0700)]
freedreno: Tell the kernel that all BOs are for writing.

Using non-write flags is pretty dubious -- it means the kernel tracking an
array of read-only consumers of the BO and having exclusive consumers wait
on each reader's fence.  It allows multiple readers through dma-bufs to do
work in parallel, but at the cost of kernel CPU time and memory management
of the shared array.  Other drivers have dropped this distinction since
dma-buf sharing is usually producer-consumer, not producer-two-consumers,
and the userspace and kernel space tracking is expensive.

For us, this lets us drop the flags passed in for relocs and tracked in
the ringbuffer reloc lists.  The end result of the flags reduction work is
drawoverhead uniforms test throughput 2.37195% +/- 0.365579% (n=15)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>

Eric Anholt [Fri, 8 May 2020 18:28:14 +0000 (11:28 -0700)]
freedreno: Mark all ringbuffer BOs as to be dumped on crash.

We can avoid passing these flags around in the DRM backends by just
marking ring BOs up front.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>

Eric Anholt [Fri, 8 May 2020 18:24:12 +0000 (11:24 -0700)]
freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>

Eric Anholt [Fri, 8 May 2020 18:20:07 +0000 (11:20 -0700)]
freedreno: Start moving relocs flags into the BOs.

It's silly to have all the reloc emitters passing around FD_RELOC_READ
when you have to have it set on all relocs (that don't include WRITE,
which implies read) for the kernel to actually track the fences on the BO.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>

Samuel Pitoiset [Thu, 2 Apr 2020 15:41:36 +0000 (17:41 +0200)]
aco: optimize add/sub(a, cndmask(b, 0, 1, cond)) -> addc/subbrev_co(0, a, b)

v2: outline into a separate function and also optimize additions (by Daniel Schürmann)

Totals from affected shaders: (VEGA)
SGPRS: 938888 -> 941496 (0.28 %)
VGPRS: 832068 -> 831532 (-0.06 %)
Spilled SGPRs: 618 -> 618 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 3696 -> 3696 (0.00 %) dwords per thread
Code Size: 72893900 -> 72558928 (-0.46 %) bytes
LDS: 18201 -> 18201 (0.00 %) blocks
Max Waves: 64256 -> 64268 (0.02 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4419>
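
The arithmetic identity behind this rewrite can be checked with a small model. The sketch reads the commit title as "one operand is a 0/1 select on a condition", which is an interpretation; the C functions below are simplified stand-ins for the instruction semantics (32-bit wraparound, carry-out ignored), not ACO code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Select between two values on a condition. */
static uint32_t cndmask(uint32_t if_false, uint32_t if_true, bool cond)
{
   return cond ? if_true : if_false;
}

/* Add with carry-in: src0 + src1 + carry. */
static uint32_t addc(uint32_t src0, uint32_t src1, bool carry_in)
{
   return src0 + src1 + (carry_in ? 1u : 0u);
}

/* Reversed subtract with borrow-in: src1 - src0 - borrow. */
static uint32_t subbrev(uint32_t src0, uint32_t src1, bool borrow_in)
{
   return src1 - src0 - (borrow_in ? 1u : 0u);
}

/* True if adding or subtracting the 0/1 select equals feeding the
 * condition in as carry or borrow with a zero operand. */
bool rewrite_is_equivalent(uint32_t a, bool cond)
{
   uint32_t sel = cndmask(0, 1, cond);

   return a + sel == addc(0, a, cond) &&
          a - sel == subbrev(0, a, cond);
}
```

Because the condition is consumed directly as carry or borrow, the separate select instruction and its result register are no longer needed.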

Daniel Schürmann [Thu, 7 May 2020 17:19:54 +0000 (18:19 +0100)]
aco: coalesce parallelcopies during register allocation

These are the result of lowering to CSSA, and should be removed if possible

Totals from affected shaders: (VEGA)
SGPRS: 544544 -> 544544 (0.00 %)
VGPRS: 418224 -> 418224 (0.00 %)
Spilled SGPRs: 141826 -> 141826 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 65853740 -> 64703380 (-1.75 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13669 -> 13669 (0.00 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4952>

Jon Turney [Wed, 6 May 2020 15:09:56 +0000 (16:09 +0100)]
glthread: Fix use of alloca() without #include "c99_alloca.h"

../src/mesa/main/glthread_draw.c: In function ‘_mesa_marshal_MultiDrawElementsBaseVertex’:
../src/mesa/main/glthread_draw.c:812:36: error: implicit declaration of function ‘alloca’; did you mean ‘malloc’? [-Werror=implicit-function-declaration]
  812 |       const GLvoid **out_indices = alloca(sizeof(indices[0]) * draw_count);
      |                                    ^~~~~~
      |                                    malloc
../src/mesa/main/glthread_draw.c:812:36: error: initialization of ‘const GLvoid **’ {aka ‘const void **’} from ‘int’ makes pointer from integer without a cast [-Werror=int-conversion]
cc1: some warnings being treated as errors

Include c99_alloca.h to portably make the alloca() prototype available.

Fixes: 2840bc30
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4920>

3 years agoetnaviv: generalize FE stall before loading shader and sampler states
Lucas Stach [Thu, 6 Feb 2020 16:11:03 +0000 (17:11 +0100)]
etnaviv: generalize FE stall before loading shader and sampler states

It seems that some of the new shader and sampler states added with
Halti0 are not self-synchronizing anymore. Make sure to stall the FE
before loading those new states to avoid corruption of the in-flight
draw state.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3963>

3 years agoCI: Re-enable Panfrost T7x0 jobs
Daniel Stone [Tue, 12 May 2020 10:33:06 +0000 (11:33 +0100)]
CI: Re-enable Panfrost T7x0 jobs

The hardware issue in the lab preventing jobs from being run on those
machines (and limiting T820 availability), leading to them being
disabled in !4965, has been fixed.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 696bafac40f5 ("CI: Disable Panfrost T7x0 jobs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5006>

3 years agoradv: update the list of allowed Android extensions
Samuel Pitoiset [Mon, 11 May 2020 10:04:51 +0000 (12:04 +0200)]
radv: update the list of allowed Android extensions

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>

3 years agoradv: handle different Vulkan API versions correctly
Samuel Pitoiset [Mon, 11 May 2020 09:58:26 +0000 (11:58 +0200)]
radv: handle different Vulkan API versions correctly

Loosely based on ANV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>

3 years agoradv: limit the Vulkan version to 1.1 for Android
Samuel Pitoiset [Mon, 11 May 2020 08:52:18 +0000 (10:52 +0200)]
radv: limit the Vulkan version to 1.1 for Android

Vulkan 1.2 seems to be rejected. This hardcodes the Android version to
1.1.107.
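
As an illustration of what a hardcoded 1.1.107 means numerically, the sketch below packs a version the way the Vulkan VK_MAKE_VERSION macro does (major in bits 22 and up, minor in bits 12-21, patch in bits 0-11). The Python helper names are hypothetical and are not taken from the radv change itself.

```python
def vk_make_version(major, minor, patch):
    """Pack a Vulkan version like VK_MAKE_VERSION: major<<22 | minor<<12 | patch."""
    return (major << 22) | (minor << 12) | patch


def vk_version_parts(version):
    """Unpack a packed Vulkan version into (major, minor, patch)."""
    return version >> 22, (version >> 12) & 0x3FF, version & 0xFFF


# The version this commit hardcodes for Android.
ANDROID_API_VERSION = vk_make_version(1, 1, 107)
```

A packed 1.2.x value always compares greater than ANDROID_API_VERSION, so a plain integer comparison is enough to tell the two apart.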

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2936
Fixes: 7f5462e349a ("radv: enable Vulkan 1.2")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>

3 years agor600: Fix nir compiler options, i.e. don't lower IO to temps for TESS
Gert Wollny [Mon, 11 May 2020 07:03:41 +0000 (09:03 +0200)]
r600: Fix nir compiler options, i.e. don't lower IO to temps for TESS

Also fix alignments and add umad24 and umul24 options.

Fixes: 6747a984f59ea9a2dd74b98d59cb8fdb028969ae
    r600: Enable tesselation for NIR

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4982>

3 years agov3d/tex: use TMUSLOD register if possible
Alejandro Piñeiro [Thu, 7 May 2020 09:46:25 +0000 (11:46 +0200)]
v3d/tex: use TMUSLOD register if possible

The TMUSLOD register is the same as TMUS, but has the same effect as
setting disable_autolod in TMU configuration parameter 2.

So using that register is potentially more efficient, as in several
cases we would be able to skip writing P2.

One case where we can't use it is for texture cube maps, as we need to
use TMUSCM.

v2: don't put a comment in the middle of the conditions (Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>

3 years agov3d/tex: set up default values for Configuration Parameter 1 if possible
Alejandro Piñeiro [Wed, 29 Apr 2020 08:29:50 +0000 (10:29 +0200)]
v3d/tex: set up default values for Configuration Parameter 1 if possible

Texture access has three configuration parameters, P0 (texture), P1
(sampler) and P2 (lookup). P1 and P2 are optional, but if P2 is needed
(like for example to set the offset for texelFetchOffset), then you
need to set P1.

Until now, when setting up P1 we asked the driver to fill in the
address with the shader state. In that case, though, we can just fill
that address with the default value, NULL.

So let's avoid asking the driver to fill in those default values, and do
it directly in the compiler. This is good to have on OpenGL, and
will likely be needed on Vulkan.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>

3 years agov3d/tex: only look up the 2nd texture gather offset for 1d non-arrays
Alejandro Piñeiro [Tue, 28 Apr 2020 22:33:47 +0000 (00:33 +0200)]
v3d/tex: only look up the 2nd texture gather offset for 1d non-arrays

Commit 1bc71e8b655f2f02b3e3a0af34c7cad12b9cb83d already did that for
the 3rd offset, but it also needs to do it for the 2nd (to handle 1d
array).

Fixes assertion failures with Vulkan CTS tests using 1darray
targets. It seems that there aren't many 1darray tests in the OpenGL CTS,
and OpenGL ES doesn't support 1d arrays, but the same problem could
eventually arise on OpenGL.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>

3 years agodrirc: Enable glthread for rpcs3
Ani [Mon, 11 May 2020 14:45:47 +0000 (14:45 +0000)]
drirc: Enable glthread for rpcs3

Closes: #2939
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4988>

3 years agopan/midgard: Fix old style shadows
Icecream95 [Mon, 11 May 2020 22:16:31 +0000 (10:16 +1200)]
pan/midgard: Fix old style shadows

This fixes the sky being red in OpenMW, as well as some of the Mesa
demos using shadows (shadowtex, shadow_sampler).

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4997>

3 years agogallium/util: Fix leak in the live shader cache
Axel Davy [Sun, 10 May 2020 18:12:56 +0000 (20:12 +0200)]
gallium/util: Fix leak in the live shader cache

When the nir backend is used, the create_shader
call is supposed to release state->ir.nir.
When the cache hits, create_shader is not called,
thus state->ir.nir should be freed.

There is nothing to be done for the TGSI case as the
tokens release is done by the caller.
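
The ownership rule described above can be sketched with a toy model. This is only an illustration under assumed names (FakeIR, LiveShaderCache and create_shader are hypothetical stand-ins, not the gallium/util code): on a miss the creation hook consumes the IR, and on a hit the cache has to release it itself.

```python
class FakeIR:
    """Stand-in for state->ir.nir; records whether it has been released."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


class LiveShaderCache:
    """Hypothetical cache keyed by a shader hash."""

    def __init__(self, create_shader):
        self._create_shader = create_shader
        self._shaders = {}

    def get(self, key, ir):
        cached = self._shaders.get(key)
        if cached is not None:
            # Cache hit: create_shader is never called, so nothing else
            # would release the IR. Release it here to avoid the leak.
            ir.release()
            return cached
        # Cache miss: create_shader takes ownership of the IR and releases it.
        shader = self._create_shader(ir)
        self._shaders[key] = shader
        return shader


def create_shader(ir):
    """Hypothetical driver hook that consumes the IR it is given."""
    ir.release()
    return object()
```

In this model both the first lookup (miss) and the second lookup (hit) leave their IR released, which is the invariant the fix restores.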

This fixes a leak noticed in:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/2931

Fixes: 4bb919b0b8b4ed6f6a7049c3f8d294b74b50e198
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4980>

3 years agonir/algebraic: Eliminate useless extract before unpack
Ian Romanick [Tue, 31 Mar 2020 23:57:03 +0000 (16:57 -0700)]
nir/algebraic: Eliminate useless extract before unpack

The shader helped for spills and fills is the big compute shader in Dirt
Showdown.  One of the shaders hurt for spills and fills on Broadwell is
the big compute shader in Bioshock Infinite, but combined with the
previous commit, it's still an improvement.

Tiger Lake
total instructions in shared programs: 21833218 -> 21832449 (<.01%)
instructions in affected programs: 66104 -> 65335 (-1.16%)
helped: 106
HURT: 14
helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5
helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95%
HURT stats (abs)   min: 1 max: 14 x̄: 4.64 x̃: 1
HURT stats (rel)   min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19%
95% mean confidence interval for instructions value: -8.51 -4.30
95% mean confidence interval for instructions %-change: -1.23% -0.69%
Instructions are helped.

total cycles in shared programs: 506180109 -> 506196314 (<.01%)
cycles in affected programs: 1671429 -> 1687634 (0.97%)
helped: 37
HURT: 84
helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24
helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41%
HURT stats (abs)   min: 1 max: 5000 x̄: 225.19 x̃: 8
HURT stats (rel)   min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42%
95% mean confidence interval for cycles value: 2.85 265.00
95% mean confidence interval for cycles %-change: 0.04% 0.88%
Cycles are HURT.

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 19961317 -> 19960543 (<.01%)
instructions in affected programs: 30268 -> 29494 (-2.56%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31%
95% mean confidence interval for instructions value: -29.46 -10.23
95% mean confidence interval for instructions %-change: -2.95% -1.71%
Instructions are helped.

total cycles in shared programs: 498863755 -> 498865843 (<.01%)
cycles in affected programs: 1831136 -> 1833224 (0.11%)
helped: 57
HURT: 65
helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25
helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71%
HURT stats (abs)   min: 1 max: 1887 x̄: 145.18 x̃: 15
HURT stats (rel)   min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73%
95% mean confidence interval for cycles value: -58.30 92.53
95% mean confidence interval for cycles %-change: 0.16% 0.97%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8774 -> 8773 (-0.01%)
spills in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0

total fills in shared programs: 9496 -> 9494 (-0.02%)
fills in affected programs: 40 -> 38 (-5.00%)
helped: 1
HURT: 0

Broadwell
total instructions in shared programs: 17859373 -> 17858548 (<.01%)
instructions in affected programs: 38452 -> 37627 (-2.15%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69%
95% mean confidence interval for instructions value: -39.79 -13.44
95% mean confidence interval for instructions %-change: -3.25% -1.89%
Instructions are helped.

total cycles in shared programs: 525858109 -> 525869236 (<.01%)
cycles in affected programs: 2058597 -> 2069724 (0.54%)
helped: 44
HURT: 75
helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23
helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85%
HURT stats (abs)   min: 1 max: 3915 x̄: 258.56 x̃: 47
HURT stats (rel)   min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21%
95% mean confidence interval for cycles value: -26.06 213.07
95% mean confidence interval for cycles %-change: 0.19% 1.78%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 25744 -> 25730 (-0.05%)
spills in affected programs: 1578 -> 1564 (-0.89%)
helped: 4
HURT: 2

total fills in shared programs: 31710 -> 31689 (-0.07%)
fills in affected programs: 4346 -> 4325 (-0.48%)
helped: 3
HURT: 3

Haswell
total instructions in shared programs: 16228399 -> 16227783 (<.01%)
instructions in affected programs: 22201 -> 21585 (-2.77%)
helped: 27
HURT: 0
helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86%
95% mean confidence interval for instructions value: -31.96 -13.66
95% mean confidence interval for instructions %-change: -3.68% -2.15%
Instructions are helped.

total cycles in shared programs: 538613967 -> 538701354 (0.02%)
cycles in affected programs: 1653044 -> 1740431 (5.29%)
helped: 36
HURT: 81
helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17
helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65%
HURT stats (abs)   min: 1 max: 30100 x̄: 1125.30 x̃: 304
HURT stats (rel)   min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60%
95% mean confidence interval for cycles value: 23.78 1470.01
95% mean confidence interval for cycles %-change: 4.29% 7.12%
Cycles are HURT.

total spills in shared programs: 23418 -> 23409 (-0.04%)
spills in affected programs: 177 -> 168 (-5.08%)
helped: 2
HURT: 0

total fills in shared programs: 25919 -> 25896 (-0.09%)
fills in affected programs: 568 -> 545 (-4.05%)
helped: 3
HURT: 0

Ivy Bridge
total instructions in shared programs: 15265983 -> 15265759 (<.01%)
instructions in affected programs: 8418 -> 8194 (-2.66%)
helped: 5
HURT: 0
helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26
helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00%
95% mean confidence interval for instructions value: -86.29 -3.31
95% mean confidence interval for instructions %-change: -4.43% -1.81%
Instructions are helped.

total cycles in shared programs: 422930336 -> 422929589 (<.01%)
cycles in affected programs: 59347 -> 58600 (-1.26%)
helped: 3
HURT: 2
helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168
helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06%
HURT stats (abs)   min: 265 max: 288 x̄: 276.50 x̃: 276
HURT stats (rel)   min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22%
95% mean confidence interval for cycles value: -829.08 530.28
95% mean confidence interval for cycles %-change: -4.43% 5.93%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 4953 -> 4946 (-0.14%)
spills in affected programs: 344 -> 337 (-2.03%)
helped: 2
HURT: 0

total fills in shared programs: 5548 -> 5521 (-0.49%)
fills in affected programs: 838 -> 811 (-3.22%)
helped: 2
HURT: 0

No shader-db changes on any earlier Intel platform.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>

3 years agonir/algebraic: Add some half packing optimizations for pack_half_2x16_split
Ian Romanick [Thu, 2 Apr 2020 19:20:57 +0000 (12:20 -0700)]
nir/algebraic: Add some half packing optimizations for pack_half_2x16_split

Like 1f72857739b ("nir/algebraic: add some half packing optimizations"),
but for the pack_half_2x16_split variant.

The shader helped for spills and fills is the big compute shader in
Bioshock Infinite.

Tiger Lake
total instructions in shared programs: 21834539 -> 21833218 (<.01%)
instructions in affected programs: 60119 -> 58798 (-2.20%)
helped: 105
HURT: 0
helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9
helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70%
95% mean confidence interval for instructions value: -14.35 -10.81
95% mean confidence interval for instructions %-change: -3.20% -1.97%
Instructions are helped.

total cycles in shared programs: 506215169 -> 506180109 (<.01%)
cycles in affected programs: 1445088 -> 1410028 (-2.43%)
helped: 97
HURT: 8
helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26
helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34%
HURT stats (abs)   min: 21 max: 635 x̄: 319.12 x̃: 212
HURT stats (rel)   min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46%
95% mean confidence interval for cycles value: -782.96 115.15
95% mean confidence interval for cycles %-change: -1.74% -0.16%
Inconclusive result (value mean confidence interval includes 0).

Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown)
total instructions in shared programs: 19962974 -> 19961317 (<.01%)
instructions in affected programs: 63471 -> 61814 (-2.61%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11
helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16%
95% mean confidence interval for instructions value: -18.38 -13.18
95% mean confidence interval for instructions %-change: -3.86% -2.48%
Instructions are helped.

total cycles in shared programs: 498908953 -> 498863755 (<.01%)
cycles in affected programs: 1566998 -> 1521800 (-2.88%)
helped: 89
HURT: 15
helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69
helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12%
HURT stats (abs)   min: 3 max: 661 x̄: 144.47 x̃: 16
HURT stats (rel)   min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30%
95% mean confidence interval for cycles value: -903.93 34.74
95% mean confidence interval for cycles %-change: -4.50% -2.32%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8776 -> 8774 (-0.02%)
spills in affected programs: 25 -> 23 (-8.00%)
helped: 1
HURT: 0

total fills in shared programs: 9500 -> 9496 (-0.04%)
fills in affected programs: 46 -> 42 (-8.70%)
helped: 1
HURT: 0

Haswell
total instructions in shared programs: 16229912 -> 16228399 (<.01%)
instructions in affected programs: 61257 -> 59744 (-2.47%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11
helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15%
95% mean confidence interval for instructions value: -16.14 -12.68
95% mean confidence interval for instructions %-change: -3.77% -2.40%
Instructions are helped.

total cycles in shared programs: 538654481 -> 538613967 (<.01%)
cycles in affected programs: 1448966 -> 1408452 (-2.80%)
helped: 58
HURT: 47
helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74
helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03%
HURT stats (abs)   min: 5 max: 3720 x̄: 318.98 x̃: 49
HURT stats (rel)   min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12%
95% mean confidence interval for cycles value: -999.84 228.14
95% mean confidence interval for cycles %-change: -2.86% 0.51%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 15266086 -> 15265983 (<.01%)
instructions in affected programs: 7272 -> 7169 (-1.42%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41
helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23%

total cycles in shared programs: 422930883 -> 422930336 (<.01%)
cycles in affected programs: 49259 -> 48712 (-1.11%)
helped: 3
HURT: 0
helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220
helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72%

No changes on any earlier Intel platforms.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>

3 years agonir/algebraic: Optimize ushr of pack_half, not ishr
Ian Romanick [Thu, 2 Apr 2020 19:14:12 +0000 (12:14 -0700)]
nir/algebraic: Optimize ushr of pack_half, not ishr

When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce
0xBC000000.  The ishr will produce 0xFFFFBC00.  The replacement
pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00.
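
The numbers above can be reproduced with a small sketch. The helpers below are illustrative Python stand-ins for the NIR opcodes (they are not the nir_opt_algebraic rules themselves) and assume 32-bit values with the first vector component in the low 16 bits.

```python
import struct


def pack_half_2x16(x, y):
    """Pack two floats as IEEE halfs: x in the low 16 bits, y in the high 16."""
    lo, = struct.unpack("<H", struct.pack("<e", x))
    hi, = struct.unpack("<H", struct.pack("<e", y))
    return (hi << 16) | lo


def ushr(value, shift):
    """Logical right shift of a 32-bit value (zero fill)."""
    return (value & 0xFFFFFFFF) >> shift


def ishr(value, shift):
    """Arithmetic right shift of a 32-bit value (sign fill)."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative signed integer
    return (value >> shift) & 0xFFFFFFFF


a = -1.0
packed = pack_half_2x16(0.0, a)        # 0xBC000000
replacement = pack_half_2x16(a, 0.0)   # 0x0000BC00
```

Only ushr(packed, 16) matches the replacement; ishr(packed, 16) smears the half-float sign bit into the upper 16 bits.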

Fixes: 1f72857739b ("nir/algebraic: add some half packing optimizations")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>

3 years agointel: Delete hardcoded devinfo->urb.size values for Gen7+ (sans DG1).
Kenneth Graunke [Fri, 8 May 2020 19:51:11 +0000 (12:51 -0700)]
intel: Delete hardcoded devinfo->urb.size values for Gen7+ (sans DG1).

On all Gen7+ platforms except DG1, the URB is a subsection of the
configurable L3 cache, and so the size can vary.  The size listed
in the documentation on those platforms is an "example size", picked
by calculating it based on an arbitrarily chosen L3 config.

Hardcoding a value for those platforms provides no value and only
confuses people trying to fill out these tables when doing hardware
enabling.  anv and iris never use this field.  i965 uses it to
initialize brw->urb.size, but then updates that in update_urb_size()
to be the correct value, so the initial value doesn't matter.

Delete the values for Gen7+ and update the comment accordingly.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4969>

3 years agoegl: Limit the EGL ver for android
Abhishek Kumar [Thu, 7 May 2020 16:32:02 +0000 (22:02 +0530)]
egl: Limit the EGL ver for android

Android supports EGL 1.5 from Q onwards,
so limit the EGL version to 1.4 for P and below.

Closes: #2892
Signed-off-by: Abhishek Kumar <abhishek4.kumar@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4951>

3 years agoamd/common: Fix incorrect use of asprintf instead of vasprintf
Serge Martin [Sun, 10 May 2020 16:23:25 +0000 (18:23 +0200)]
amd/common: Fix incorrect use of asprintf instead of vasprintf

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

3 years agodocs/features: mark GL_NV_conditional_render as done for zink
Erik Faye-Lund [Thu, 30 Apr 2020 18:01:22 +0000 (20:01 +0200)]
docs/features: mark GL_NV_conditional_render as done for zink

Requires VK_EXT_conditional_rendering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4835>

3 years agozink: enable conditional rendering if available
Dave Airlie [Sun, 14 Oct 2018 23:15:50 +0000 (00:15 +0100)]
zink: enable conditional rendering if available

This doesn't seem to work perfectly, but I'm not sure what is possible
in GL vs Vulkan here.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2867
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4835>

3 years agozink: add a GET_PROC_ADDR macro to simplify load_device_extensions
Erik Faye-Lund [Fri, 8 May 2020 10:07:48 +0000 (12:07 +0200)]
zink: add a GET_PROC_ADDR macro to simplify load_device_extensions

This doesn't do much for now, but it will keep things cleaner in the next
commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4835>

3 years agozink: load vk_GetMemoryFdKHR while creating screen
Erik Faye-Lund [Thu, 30 Apr 2020 17:06:51 +0000 (19:06 +0200)]
zink: load vk_GetMemoryFdKHR while creating screen

We're about to load some more extension-pointers as well, so let's
create a separate place for doing this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4835>

3 years agoradeonsi: do not use cmask with encrypted texture
Pierre-Eric Pelloux-Prayer [Thu, 7 May 2020 19:45:49 +0000 (21:45 +0200)]
radeonsi: do not use cmask with encrypted texture

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoradeonsi: determine secure flag must be set for gfx IB
Pierre-Eric Pelloux-Prayer [Tue, 25 Feb 2020 20:56:12 +0000 (21:56 +0100)]
radeonsi: determine secure flag must be set for gfx IB

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoamdgpu: use AMDGPU_IB_FLAGS_SECURE when requested
Pierre-Eric Pelloux-Prayer [Fri, 28 Feb 2020 10:54:41 +0000 (11:54 +0100)]
amdgpu: use AMDGPU_IB_FLAGS_SECURE when requested

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoradeonsi: add support for PIPE_RESOURCE_FLAG_ENCRYPTED
Pierre-Eric Pelloux-Prayer [Fri, 6 Dec 2019 09:33:56 +0000 (10:33 +0100)]
radeonsi: add support for PIPE_RESOURCE_FLAG_ENCRYPTED

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agogallium: PIPE_RESOURCE_FLAG_ENCRYPTED
Pierre-Eric Pelloux-Prayer [Mon, 27 Apr 2020 09:21:35 +0000 (11:21 +0200)]
gallium: PIPE_RESOURCE_FLAG_ENCRYPTED

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoradeonsi/sdma: implement tmz support
Pierre-Eric Pelloux-Prayer [Fri, 6 Dec 2019 09:29:31 +0000 (10:29 +0100)]
radeonsi/sdma: implement tmz support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoradeonsi: force using staging texture when uploading to secure texture
Pierre-Eric Pelloux-Prayer [Fri, 28 Feb 2020 16:10:45 +0000 (17:10 +0100)]
radeonsi: force using staging texture when uploading to secure texture

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoamdgpu: add encrypted slabs support
Pierre-Eric Pelloux-Prayer [Fri, 6 Dec 2019 09:40:02 +0000 (10:40 +0100)]
amdgpu: add encrypted slabs support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoradeonsi: allocate framebuffer texture as secure when using tmz
Pierre-Eric Pelloux-Prayer [Fri, 28 Feb 2020 13:25:54 +0000 (14:25 +0100)]
radeonsi: allocate framebuffer texture as secure when using tmz

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoradeon: add RADEON_CREATE_ENCRYPTED flag
Pierre-Eric Pelloux-Prayer [Fri, 6 Dec 2019 09:33:43 +0000 (10:33 +0100)]
radeon: add RADEON_CREATE_ENCRYPTED flag

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoradeonsi: add AMD_DEBUG=tmz option
Pierre-Eric Pelloux-Prayer [Fri, 28 Feb 2020 13:24:29 +0000 (14:24 +0100)]
radeonsi: add AMD_DEBUG=tmz option

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoamdgpu/radeon: add secure api
Pierre-Eric Pelloux-Prayer [Fri, 6 Dec 2019 09:28:10 +0000 (10:28 +0100)]
amdgpu/radeon: add secure api

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>

3 years agoac/surface: remove shadowing declaration
Pierre-Eric Pelloux-Prayer [Mon, 11 May 2020 07:18:49 +0000 (09:18 +0200)]
ac/surface: remove shadowing declaration

Fixes: 7691de0dcef ("ac/surface,radeonsi: move the set/get_bo_metadata code to ac_surface.c")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2929
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4983>

3 years agoaco: prevent invalid loads/stores vectorization if robustness is enabled
Samuel Pitoiset [Mon, 4 May 2020 14:03:35 +0000 (16:03 +0200)]
aco: prevent invalid loads/stores vectorization if robustness is enabled

Only UBO, SSBO, global and push constants accesses should matter.

This fixes a bunch of new robustness2 failures. Note that RADV/LLVM
isn't affected because it relies on LLVM for loads/stores
vectorization and LLVM doesn't vectorize in this situation either.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>

3 years agonir: do not vectorize load/store if offset can overflow and robustness enabled
Samuel Pitoiset [Mon, 4 May 2020 14:02:38 +0000 (16:02 +0200)]
nir: do not vectorize load/store if offset can overflow and robustness enabled

This prevents vectorization for loads/stores that can overflow, i.e. if
the low offset is negative and the range is greater than or equal to 0.

The caller can pass the list of variable modes that matter for
robust access.
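
A minimal sketch of that condition, assuming "the range" means the low offset plus the size of the low access (that reading, the function name and the parameter names are assumptions, not the NIR pass itself):

```python
def offset_may_overflow(low_offset, low_size):
    """True if an access starting at a negative signed offset reaches or
    crosses zero, i.e. the unsigned offset would wrap around."""
    return low_offset < 0 and low_offset + low_size >= 0


def end_offset_u32(low_offset, low_size):
    """End offset of the access as seen in unsigned 32-bit arithmetic."""
    return (low_offset + low_size) & 0xFFFFFFFF
```

For example, an 8-unit access at offset -4 starts at 0xFFFFFFFC as an unsigned value and ends at 4, so merging it into one wider access could change which parts a robust bounds check rejects.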

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>

3 years agoaco: fix 64-bit trunc with negative exponents on GFX6
Samuel Pitoiset [Wed, 6 May 2020 13:34:07 +0000 (15:34 +0200)]
aco: fix 64-bit trunc with negative exponents on GFX6

v_frexp_exp returns the exponent as an unsigned value.

Also, v_ashr returns either 0 or -1 depending on the sign of the
source operand, but what we want is only the sign bit.
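
The difference between the two results can be shown on a 32-bit word. This is only an arithmetic illustration with hypothetical helper names, not the aco instruction sequence:

```python
def ashr32(value, shift):
    """Arithmetic right shift of a 32-bit value (sign fill)."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative signed integer
    return (value >> shift) & 0xFFFFFFFF


def sign_bit32(value):
    """Keep only bit 31 of a 32-bit value."""
    return value & 0x80000000


# High dword of the double -1.0 (0xBFF0000000000000).
HIGH_DWORD_NEG_ONE = 0xBFF00000
```

Shifting the negative word right by 31 yields 0xFFFFFFFF (that is, -1), while masking yields 0x80000000, which is the high dword of -0.0.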

Fixes a bunch of recent dEQP-VK.glsl.builtin.precision_double.* tests.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4921>

3 years agoetnaviv: drm: Normalize nano seconds
Guido Günther [Thu, 23 Jan 2020 08:20:00 +0000 (09:20 +0100)]
etnaviv: drm: Normalize nano seconds

Make sure the nanosecond part is less than one second. This matches
what clock_settime expects and allows for more concise kernel
interfaces.
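
A minimal sketch of such a normalization, with hypothetical names (the actual etnaviv helper is not shown here):

```python
NSEC_PER_SEC = 1_000_000_000


def normalize_timespec(sec, nsec):
    """Return (sec, nsec) with 0 <= nsec < NSEC_PER_SEC, carrying whole
    seconds from the nanosecond part into the seconds part."""
    carry, nsec = divmod(nsec, NSEC_PER_SEC)
    return sec + carry, nsec
```

For instance, 1 s plus 1,500,000,000 ns becomes 2 s plus 500,000,000 ns, and an already normalized pair is returned unchanged.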

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3534>