Ian Romanick [Mon, 24 Jun 2019 22:12:56 +0000 (15:12 -0700)]
nir/algebraic: Fail build when too many commutative expressions are used
Search patterns that are expected to have too many (e.g., the giant
bitfield_reverse pattern) can be added to a white list.
This would have saved me a few hours debugging. :(
v2: Implement the expected-failure annotation as a property of the
search-replace pattern instead of as a property of the whole list of
patterns. Suggested by Connor.
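
The check described above could look roughly like the following minimal sketch. It assumes search patterns are nested tuples of the form (opcode, src, ...); the opcode set, the limit of 8, and the names count_commutative, check_pattern and expect_many are hypothetical and are not taken from nir_algebraic.py.

```python
# Hypothetical sketch; names and the limit are illustrative, not Mesa's.
COMMUTATIVE_OPS = {"fadd", "fmul", "iadd", "imul", "iand", "ior", "ixor"}
MAX_COMMUTATIVE = 8  # assumed limit on commutative expressions per pattern


def count_commutative(expr):
    """Count commutative opcodes in a nested-tuple search pattern."""
    if not isinstance(expr, tuple):
        return 0  # a variable or constant leaf
    opcode, *sources = expr
    own = 1 if opcode in COMMUTATIVE_OPS else 0
    return own + sum(count_commutative(src) for src in sources)


def check_pattern(search, expect_many=False):
    """Raise unless the pattern is within the limit or explicitly annotated."""
    count = count_commutative(search)
    if count > MAX_COMMUTATIVE and not expect_many:
        raise ValueError(
            "pattern uses %d commutative expressions (limit %d)"
            % (count, MAX_COMMUTATIVE))
    return count
```

Keeping the annotation on the individual pattern (the expect_many argument here) mirrors the v2 note: one oversized pattern can be allowed without silencing the check for the rest of the list.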
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Ian Romanick [Tue, 25 Jun 2019 17:04:21 +0000 (10:04 -0700)]
nir/algebraic: Fix whitespace error
Trivial
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Alyssa Rosenzweig [Sat, 29 Jun 2019 01:47:10 +0000 (18:47 -0700)]
panfrost: Allow R11G11B10 rendering
Doesn't fully work yet, but better than crashing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Sat, 29 Jun 2019 01:46:43 +0000 (18:46 -0700)]
panfrost: Default to util_pack_color for clears
This might help as we bring up more render-target formats.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Ian Romanick [Wed, 26 Jun 2019 20:36:17 +0000 (13:36 -0700)]
intel/vec4: Try both sources as candidates for being immediates
For some reason, when I first wrote try_immediate_source, I thought the
sources had already been ordered so that the immediate value was the
second source. That's rubbish. The generator assumes *neither* source
is immediate, and it relies on later copy/constant propagation passes to
do the reordering.
For this reason, the changes to try_immediate_source have to go to some
effort to reorder the operands and tell the caller when it reordered
them. The generator for comparison instructions uses this to determine
when the comparison needs to change (e.g., from GT to LT).
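
As a rough illustration of the reordering described above, the sketch below models a source as a (kind, value) pair and assumes that only the second source slot may hold an immediate. The data model and the emit_compare helper are hypothetical; only the name try_immediate_source comes from the commit message, and its real signature is not shown there.

```python
# Hypothetical model of operand reordering; not the actual vec4 backend code.
# A source is a (kind, value) pair, where kind is "imm" or "reg".

# Condition to use after the two operands of a comparison are swapped.
SWAPPED_CONDITION = {
    "GT": "LT", "LT": "GT",
    "GE": "LE", "LE": "GE",
    "EQ": "EQ", "NE": "NE",
}


def try_immediate_source(src0, src1):
    """Return (src0, src1, reordered) with any immediate in the second slot."""
    if src1[0] == "imm":
        return src0, src1, False
    if src0[0] == "imm":
        return src1, src0, True
    return src0, src1, False


def emit_compare(condition, src0, src1):
    """Return the (condition, src0, src1) triple that would be emitted."""
    src0, src1, reordered = try_immediate_source(src0, src1)
    if reordered:
        # 1.0 > x is the same test as x < 1.0, so the condition must flip.
        condition = SWAPPED_CONDITION[condition]
    return condition, src0, src1
```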
No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.
Haswell
total instructions in shared programs: 13484431 -> 13480500 (-0.03%)
instructions in affected programs: 441138 -> 437207 (-0.89%)
helped: 1883
HURT: 0
helped stats (abs) min: 1 max: 49 x̄: 2.09 x̃: 1
helped stats (rel) min: 0.07% max: 8.91% x̄: 1.10% x̃: 0.90%
95% mean confidence interval for instructions value: -2.19 -1.98
95% mean confidence interval for instructions %-change: -1.14% -1.06%
Instructions are helped.
total cycles in shared programs: 376420286 -> 376406400 (<.01%)
cycles in affected programs: 15995668 -> 15981782 (-0.09%)
helped: 1692
HURT: 219
helped stats (abs) min: 2 max: 764 x̄: 13.78 x̃: 4
helped stats (rel) min: <.01% max: 9.69% x̄: 0.69% x̃: 0.35%
HURT stats (abs) min: 2 max: 516 x̄: 43.09 x̃: 22
HURT stats (rel) min: 0.02% max: 12.09% x̄: 2.30% x̃: 1.13%
95% mean confidence interval for cycles value: -9.70 -4.83
95% mean confidence interval for cycles %-change: -0.42% -0.28%
Cycles are helped.
total spills in shared programs: 23166 -> 23158 (-0.03%)
spills in affected programs: 66 -> 58 (-12.12%)
helped: 2
HURT: 0
total fills in shared programs: 34592 -> 34580 (-0.03%)
fills in affected programs: 75 -> 63 (-16.00%)
helped: 2
HURT: 0
Ivy Bridge
total instructions in shared programs: 12051590 -> 12048513 (-0.03%)
instructions in affected programs: 355911 -> 352834 (-0.86%)
helped: 1481
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 2.08 x̃: 1
helped stats (rel) min: 0.07% max: 4.92% x̄: 1.08% x̃: 0.90%
95% mean confidence interval for instructions value: -2.17 -1.98
95% mean confidence interval for instructions %-change: -1.12% -1.04%
Instructions are helped.
total cycles in shared programs: 180319624 -> 180307642 (<.01%)
cycles in affected programs: 15591028 -> 15579046 (-0.08%)
helped: 1340
HURT: 174
helped stats (abs) min: 2 max: 764 x̄: 14.19 x̃: 2
helped stats (rel) min: <.01% max: 8.68% x̄: 0.64% x̃: 0.32%
HURT stats (abs) min: 2 max: 518 x̄: 40.41 x̃: 14
HURT stats (rel) min: 0.02% max: 8.37% x̄: 1.59% x̃: 0.67%
95% mean confidence interval for cycles value: -10.85 -4.97
95% mean confidence interval for cycles %-change: -0.45% -0.31%
Cycles are helped.
All Gen6 and earlier platforms had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10863159 -> 10861462 (-0.02%)
instructions in affected programs: 157839 -> 156142 (-1.08%)
helped: 715
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 2.37 x̃: 2
helped stats (rel) min: 0.23% max: 4.33% x̄: 1.07% x̃: 0.85%
95% mean confidence interval for instructions value: -2.53 -2.21
95% mean confidence interval for instructions %-change: -1.13% -1.02%
Instructions are helped.
total cycles in shared programs: 153957782 -> 153948778 (<.01%)
cycles in affected programs: 3171648 -> 3162644 (-0.28%)
helped: 696
HURT: 62
helped stats (abs) min: 2 max: 390 x̄: 15.72 x̃: 4
helped stats (rel) min: 0.02% max: 10.57% x̄: 0.57% x̃: 0.12%
HURT stats (abs) min: 2 max: 300 x̄: 31.29 x̃: 2
HURT stats (rel) min: 0.11% max: 7.23% x̄: 0.83% x̃: 0.34%
95% mean confidence interval for cycles value: -15.65 -8.11
95% mean confidence interval for cycles %-change: -0.56% -0.36%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 26 Jun 2019 01:47:18 +0000 (18:47 -0700)]
intel/vec4: Try immediate sources for dot products too
No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.
All Haswell and earlier platforms had similar results. (Haswell shown)
total instructions in shared programs: 13484467 -> 13484431 (<.01%)
instructions in affected programs: 8540 -> 8504 (-0.42%)
helped: 33
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.09 x̃: 1
helped stats (rel) min: 0.31% max: 1.53% x̄: 0.49% x̃: 0.35%
95% mean confidence interval for instructions value: -1.19 -0.99
95% mean confidence interval for instructions %-change: -0.60% -0.38%
Instructions are helped.
total cycles in shared programs: 376420572 -> 376420286 (<.01%)
cycles in affected programs: 56260 -> 55974 (-0.51%)
helped: 26
HURT: 5
helped stats (abs) min: 2 max: 204 x̄: 11.85 x̃: 2
helped stats (rel) min: 0.11% max: 3.08% x̄: 0.39% x̃: 0.13%
HURT stats (abs) min: 2 max: 6 x̄: 4.40 x̃: 6
HURT stats (rel) min: 0.03% max: 0.35% x̄: 0.24% x̃: 0.35%
95% mean confidence interval for cycles value: -22.91 4.45
95% mean confidence interval for cycles %-change: -0.56% -0.02%
Inconclusive result (value mean confidence interval includes 0).
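
The arithmetic behind the summary lines above can be reproduced with a small sketch. The helper names are hypothetical and this is not the reporting tool's code; it only shows how the percentage change and the "includes 0" verdict follow from the numbers printed in this commit message.

```python
def relative_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100.0


def interval_includes_zero(low, high):
    """True when a confidence interval cannot rule out 'no change'."""
    return low <= 0.0 <= high


# Cycles in affected programs: 56260 -> 55974
change = round(relative_change(56260, 55974), 2)

# The value interval (-22.91, 4.45) straddles zero, hence "inconclusive",
# even though the %-change interval (-0.56, -0.02) does not.
value_inconclusive = interval_includes_zero(-22.91, 4.45)
percent_inconclusive = interval_includes_zero(-0.56, -0.02)
```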
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 26 Jun 2019 01:39:59 +0000 (18:39 -0700)]
intel/vec4: Try emitting non-scalar immediates
Sometimes an instruction has a vector as a source, but all of the
components have the same value. For example,
vec3 32 ssa_16 = load_const (1.0, 1.0, 1.0)
...
vec3 32 ssa_82 = fadd ssa_16, -ssa_81.xyz
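
A minimal sketch of the "all components have the same value" test follows. The function name splat_value is hypothetical, and comparing 32-bit float bit patterns (so that 0.0 and -0.0 are kept distinct) is an assumption about what a backend would want, not something stated in the commit.

```python
import struct


def splat_value(components):
    """Return the common value if all components are bit-identical, else None."""
    if not components:
        return None
    # Compare encoded 32-bit floats so that 0.0 and -0.0 are not merged.
    encoded = [struct.pack("<f", c) for c in components]
    if all(e == encoded[0] for e in encoded[1:]):
        return components[0]
    return None
```

For the load_const in the example, splat_value((1.0, 1.0, 1.0)) yields 1.0, which is the case where a single scalar immediate could stand in for the whole vector.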
No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.
Haswell
total instructions in shared programs: 13487811 -> 13484467 (-0.02%)
instructions in affected programs: 421981 -> 418637 (-0.79%)
helped: 1859
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.04% max: 9.80% x̄: 1.04% x̃: 0.84%
95% mean confidence interval for instructions value: -1.85 -1.74
95% mean confidence interval for instructions %-change: -1.07% -1.00%
Instructions are helped.
total cycles in shared programs: 376423252 -> 376420572 (<.01%)
cycles in affected programs: 14800970 -> 14798290 (-0.02%)
helped: 1519
HURT: 329
helped stats (abs) min: 2 max: 462 x̄: 10.59 x̃: 4
helped stats (rel) min: 0.03% max: 16.73% x̄: 0.79% x̃: 0.36%
HURT stats (abs) min: 2 max: 598 x̄: 40.74 x̃: 16
HURT stats (rel) min: <.01% max: 10.32% x̄: 2.56% x̃: 0.98%
95% mean confidence interval for cycles value: -3.53 0.63
95% mean confidence interval for cycles %-change: -0.30% -0.09%
Inconclusive result (value mean confidence interval includes 0).
total fills in shared programs: 34601 -> 34592 (-0.03%)
fills in affected programs: 91 -> 82 (-9.89%)
helped: 9
HURT: 0
Ivy Bridge
total instructions in shared programs: 12053565 -> 12051626 (-0.02%)
instructions in affected programs: 298103 -> 296164 (-0.65%)
helped: 1228
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 1.58 x̃: 1
helped stats (rel) min: 0.04% max: 3.57% x̄: 0.91% x̃: 0.81%
95% mean confidence interval for instructions value: -1.63 -1.53
95% mean confidence interval for instructions %-change: -0.95% -0.88%
Instructions are helped.
total cycles in shared programs: 180322270 -> 180319922 (<.01%)
cycles in affected programs: 14123840 -> 14121492 (-0.02%)
helped: 1036
HURT: 195
helped stats (abs) min: 2 max: 462 x̄: 11.93 x̃: 2
helped stats (rel) min: 0.03% max: 14.05% x̄: 0.82% x̃: 0.35%
HURT stats (abs) min: 2 max: 598 x̄: 51.33 x̃: 16
HURT stats (rel) min: <.01% max: 9.68% x̄: 3.02% x̃: 0.72%
95% mean confidence interval for cycles value: -4.92 1.10
95% mean confidence interval for cycles %-change: -0.35% -0.07%
Inconclusive result (value mean confidence interval includes 0).
Sandy Bridge
total instructions in shared programs: 10864286 -> 10863189 (-0.01%)
instructions in affected programs: 159722 -> 158625 (-0.69%)
helped: 724
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.52 x̃: 1
helped stats (rel) min: 0.10% max: 2.91% x̄: 0.79% x̃: 0.62%
95% mean confidence interval for instructions value: -1.58 -1.46
95% mean confidence interval for instructions %-change: -0.82% -0.75%
Instructions are helped.
total cycles in shared programs: 153967938 -> 153957926 (<.01%)
cycles in affected programs: 1923186 -> 1913174 (-0.52%)
helped: 654
HURT: 56
helped stats (abs) min: 2 max: 170 x̄: 20.00 x̃: 4
helped stats (rel) min: 0.03% max: 11.82% x̄: 0.89% x̃: 0.18%
HURT stats (abs) min: 2 max: 390 x̄: 54.75 x̃: 32
HURT stats (rel) min: 0.05% max: 6.92% x̄: 3.09% x̃: 2.92%
95% mean confidence interval for cycles value: -17.42 -10.78
95% mean confidence interval for cycles %-change: -0.76% -0.40%
Cycles are helped.
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8142677 -> 8141721 (-0.01%)
instructions in affected programs: 139511 -> 138555 (-0.69%)
helped: 588
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 1.63 x̃: 1
helped stats (rel) min: 0.21% max: 4.39% x̄: 0.84% x̃: 0.46%
95% mean confidence interval for instructions value: -1.70 -1.55
95% mean confidence interval for instructions %-change: -0.89% -0.78%
Instructions are helped.
total cycles in shared programs: 188549394 -> 188547676 (<.01%)
cycles in affected programs: 3171960 -> 3170242 (-0.05%)
helped: 527
HURT: 0
helped stats (abs) min: 2 max: 18 x̄: 3.26 x̃: 2
helped stats (rel) min: <.01% max: 0.80% x̄: 0.08% x̃: 0.06%
95% mean confidence interval for cycles value: -3.49 -3.03
95% mean confidence interval for cycles %-change: -0.09% -0.07%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Thu, 27 Jun 2019 21:29:37 +0000 (14:29 -0700)]
nir: Fix lowering of bitfield_insert to shifts.
The bfi/bfm behavior change replaced the bfi/bfm usage in
lower_bitfield_insert_to_shifts with actual shifts like the name says,
but it failed to handle the offset=0, bits==32 case in the new
lowering.
v2: Use 31 < bits instead of bits == 32, to get the 31 < (iand bits,
31) -> false optimization.
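
The offset=0, bits==32 failure can be modelled with a short sketch. It assumes a 32-bit shift that only looks at the low 5 bits of the shift amount; the helper names are hypothetical and the code is a model of the arithmetic, not the actual NIR lowering.

```python
MASK32 = 0xFFFFFFFF


def shl32(value, amount):
    """32-bit left shift that uses only the low 5 bits of the amount (assumed)."""
    return (value << (amount & 31)) & MASK32


def bitfield_insert_naive(base, insert, offset, bits):
    # With bits == 32, shl32(1, 32) is 1, so the mask collapses to 0.
    mask = shl32((shl32(1, bits) - 1) & MASK32, offset)
    return ((shl32(insert, offset) & mask) | (base & ~mask)) & MASK32


def bitfield_insert_fixed(base, insert, offset, bits):
    # Select an all-ones mask when the field covers the whole word.
    if 31 < bits:
        mask = MASK32
    else:
        mask = shl32((shl32(1, bits) - 1) & MASK32, offset)
    return ((shl32(insert, offset) & mask) | (base & ~mask)) & MASK32
```

In this model the naive version returns base unchanged for bits == 32, while the version with the 31 < bits select returns insert, which is the expected result.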
Fixes regressions in dEQP-GLES31.*bitfield_insert* on freedreno.
Fixes: 165b7f3a4487 ("nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Dylan Baker [Fri, 28 Jun 2019 23:36:38 +0000 (16:36 -0700)]
Revert "meson: Add support for using cmake for finding LLVM"
This reverts commit 5157a4276500c77e2210e853b262be1d1b30aedf.
There is a meson bug that causes llvm to always be statically linked,
which is obviously not what we want. I haven't had time to look into it
yet, but for now let's just revert it.
Dylan Baker [Fri, 28 Jun 2019 23:36:27 +0000 (16:36 -0700)]
Revert "meson: try to use cmake as a finder for clang"
This reverts commit 0ba0c0c15c633a5a3b7a4651a743f800f30bcbf6.
Eric Engestrom [Sat, 22 Jun 2019 12:49:02 +0000 (13:49 +0100)]
mesa: stop trying new filenames if the filename existing is not the issue
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Wed, 12 Jun 2019 14:46:11 +0000 (15:46 +0100)]
mesa: use os_file_create_unique()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Mon, 3 Jun 2019 16:51:37 +0000 (17:51 +0100)]
util: add os_file_create_unique()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Alyssa Rosenzweig [Wed, 26 Jun 2019 23:38:50 +0000 (16:38 -0700)]
panfrost: Disable DXT-style texture compression
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 26 Jun 2019 23:36:17 +0000 (16:36 -0700)]
panfrost: Dump unknown formats before aborting
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 26 Jun 2019 23:31:31 +0000 (16:31 -0700)]
panfrost/midgard: Fix 3D texture regression
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 26 Jun 2019 23:24:28 +0000 (16:24 -0700)]
panfrost: Add some special formats
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 26 Jun 2019 23:12:28 +0000 (16:12 -0700)]
panfrost/midgard: Implement integer sampler
Turns out one of the magic bits in the texture instruction meant
'float'. Different magic bits mean int and uint then :)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 26 Jun 2019 22:59:59 +0000 (15:59 -0700)]
panfrost: Remove dubious assert
We already *can* support texture formats with bpp > 4, so..
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 26 Jun 2019 22:59:29 +0000 (15:59 -0700)]
panfrost: Implement primitive restart
For GLES3, just pass the flag through.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Anuj Phogat [Fri, 28 Jun 2019 16:24:29 +0000 (09:24 -0700)]
i965/icl: Apply WA_1606682166 to compute workloads
We missed the workaround for compute workloads in earlier patches.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Anuj Phogat [Wed, 26 Jun 2019 21:27:01 +0000 (14:27 -0700)]
Revert "iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace.
This patch silences a simulator warning about it.
We don't need to add this workaround in the Linux kernel as the WA description
says it's fixed on the latest stepping.
This reverts commit 9c421d6b47e0c5f206959acd68814b63232946be.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Anuj Phogat [Wed, 26 Jun 2019 21:23:35 +0000 (14:23 -0700)]
Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace.
This patch silences a simulator warning about it.
We don't need to add this workaround in the Linux kernel as the WA description
says it's fixed on the latest stepping.
This reverts commit 2be60e0c73ed1555a919c5725cc0cab119a2b6de.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Anuj Phogat [Wed, 26 Jun 2019 21:19:53 +0000 (14:19 -0700)]
Revert "i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace.
This patch silences a simulator warning about it.
We don't need to add this workaround in the Linux kernel as the WA description
says it's fixed on the latest stepping.
This reverts commit 85ecd14ef6a084f5e82860de6dbc79870b335682.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Anuj Phogat [Wed, 26 Jun 2019 20:18:38 +0000 (13:18 -0700)]
i965/icl: Fix WA_1606682166
An earlier change was setting the SamplerCount = 0 for Gen 11
under #if GEN_GEN < 7. This commit fixes the problem.
This WA has also been added to the Linux kernel.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rob Clark [Fri, 28 Jun 2019 13:27:17 +0000 (06:27 -0700)]
freedreno/ir3: small cleanup
`target` cannot be NULL here.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 27 Jun 2019 15:24:32 +0000 (08:24 -0700)]
freedreno/ir3: fix missing (ss) in dummy bary.f case
In case we need to insert a dummy bary.f for the (ei) flag, it also
needs (ss) so we don't release varying storage to the next VS wave
before the ldlv completed. Fixes random failures in:
dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.*
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Rob Clark [Thu, 27 Jun 2019 20:37:21 +0000 (13:37 -0700)]
freedreno/a6xx: wire up dither state
Fixes:
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.no_rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.no_rebind_rbo_rgba4_stencil_index8
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Arfrever Frehtes Taifersar Arahesis [Mon, 17 Jun 2019 18:37:44 +0000 (18:37 +0000)]
meson: Improve detection of Python when using Meson >=0.50.
Previously, on systems where multiple versions of Python 3 (e.g. 3.6 and 3.7)
are installed, the wrong version of Python 3 could have been used.
The proper fix requires the path() method in Meson's python module, which
was added in Meson 0.50:
https://github.com/mesonbuild/meson/pull/4616
Distro Bug: https://bugs.gentoo.org/671308
Signed-off-by: Arfrever Frehtes Taifersar Arahesis <Arfrever@Apache.Org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
v2: - Add missing `endif` keyword (Dylan)
Pierre-Eric Pelloux-Prayer [Fri, 21 Jun 2019 08:02:49 +0000 (10:02 +0200)]
radeon/uvd: fix calc_ctx_size_h265_main10
Left shift was applied twice.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110702
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: <irherder@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
Pierre-Eric Pelloux-Prayer [Tue, 4 Jun 2019 14:05:53 +0000 (16:05 +0200)]
mesa: add display list support for gl(Compressed)TextureSubImage2DEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 4 Jun 2019 12:11:46 +0000 (14:11 +0200)]
mesa: add glTextureParameteri/iv/f/fvEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 4 Jun 2019 13:47:05 +0000 (15:47 +0200)]
mesa: extend _mesa_lookup_or_create_texture to support EXT_dsa
Adds a boolean to implement EXT_dsa specifics.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 25 Jun 2019 14:53:49 +0000 (16:53 +0200)]
mesa: refactor bind_texture
Splits texture lookup and binding actions.
The new _mesa_lookup_or_create_texture will be useful to implement the EXT_direct_state_access extension.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 25 Jun 2019 14:39:37 +0000 (16:39 +0200)]
mesa: extract helper function for glTexParameter*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 26 Apr 2019 15:47:05 +0000 (17:47 +0200)]
mesa: add buffer != 0 checks to glNamedBufferEXT functions
The EXT_direct_state_access spec says:
INVALID_OPERATION is generated by GetNamedBufferParameterivEXT,
GetNamedBufferPointervEXT, GetNamedBufferSubDataEXT,
MapNamedBufferEXT, NamedBufferDataEXT, NamedBufferSubDataEXT, and
UnmapNamedBufferEXT if the buffer parameter is zero.
This commit adds buffer != 0 validation to the implemented functions.
glNamedBufferStorageEXT isn't included in this list and the EXT_buffer_storage
spec doesn't say that buffer = 0 is an error either, so I didn't add the same
validation for this function.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Wed, 24 Apr 2019 17:44:12 +0000 (13:44 -0400)]
mesa: fix a typo in map_named_buffer_range
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 5 Sep 2018 05:48:07 +0000 (15:48 +1000)]
mesa: add support for glMapNamedBufferEXT()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 5 Sep 2018 05:18:04 +0000 (15:18 +1000)]
mesa: add support for glUnmapNamedBufferEXT()
Since the ARB DSA function glUnmapNamedBuffer() is only exposed
for 3.1 or above we make glUnmapNamedBuffer() an alias of
glUnmapNamedBufferEXT() rather than the other way around.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Mon, 3 Sep 2018 00:27:38 +0000 (10:27 +1000)]
mesa: add support for glCompressedTextureSubImage2DEXT()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Sun, 2 Sep 2018 23:53:31 +0000 (09:53 +1000)]
mesa: add support for glTextureSubImage2DEXT()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 25 May 2018 03:24:47 +0000 (13:24 +1000)]
mesa: add support for glMapNamedBufferRangeEXT()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 18 May 2018 03:23:15 +0000 (13:23 +1000)]
mesa: add support for glNamedBufferStorageEXT
This is available in ARB_buffer_storage when
EXT_direct_state_access is present.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 18 May 2018 05:20:35 +0000 (15:20 +1000)]
mesa: add support for glNamedBuffer*DataEXT()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 16 Aug 2018 01:21:38 +0000 (11:21 +1000)]
mesa: add support for glBindMultiTextureEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Mon, 10 Jun 2019 14:45:23 +0000 (16:45 +0200)]
mesa: delete framebuffer texture attachment sampler views
When a context is destroyed the destroy_tex_sampler_cb makes sure that all the
sampler views created by that context are destroyed.
This is done by walking the ctx->Shared->TexObjects hash table.
In a multiple context environment the texture can be deleted by a different context,
so it will be removed from the TexObjects table, which prevents the above mechanism
from working.
This can result in an assertion in st_save_zombie_sampler_view because the
sampler_view owns a reference to a destroyed context.
This issue occurs in blender 2.80.
This commit fixes this by explicitly releasing the sampler views created by the
destroyed context for all texture attachments.
Fixes: 593e36f956 (st/mesa: implement "zombie" sampler views (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110944
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
James Clarke [Sat, 4 May 2019 20:54:40 +0000 (21:54 +0100)]
meson: GNU/kFreeBSD has DRM/KMS and requires -D_GNU_SOURCE
This is a regression from the old autotools build system.
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Kenneth Graunke [Fri, 28 Jun 2019 00:00:46 +0000 (17:00 -0700)]
gallium/u_transfer_helper: Don't leak a reference to the resource.
We pipe_resource_reference when handling transfers in map, so we need to
do a corresponding unreference in unmap.
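
The balance between map and unmap can be pictured with a toy reference-counting model. The Resource and TransferHelper classes below are hypothetical stand-ins, not the gallium structures; the sketch only shows that a reference taken in map has to be dropped in unmap for the count to return to its starting value.

```python
class Resource:
    """Toy refcounted object; the creator holds the first reference."""

    def __init__(self):
        self.refcount = 1


class TransferHelper:
    """Hypothetical stand-in for a helper that wraps map/unmap."""

    def map(self, resource):
        resource.refcount += 1  # the transfer keeps the resource alive
        return {"resource": resource}

    def unmap(self, transfer):
        transfer["resource"].refcount -= 1  # matching unreference
        transfer["resource"] = None
```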
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Eric Engestrom [Tue, 25 Jun 2019 09:13:17 +0000 (10:13 +0100)]
meson: only add empty lines between active summary sections
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Mon, 24 Jun 2019 16:47:15 +0000 (17:47 +0100)]
meson: bump required libdrm version to 2.4.81
dbb4457d9858fa977246 started using drmDevicesEqual(), which was
introduced in libdrm 2.4.81.
We could either copy the function locally, or bump the required version.
Since the function is non-trivial and 2.4.81 is old enough already,
I suggest the latter.
Fixes: dbb4457d9858fa977246 ("egl: add EGL_EXT_device_drm support")
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Fri, 21 Jun 2019 16:26:44 +0000 (17:26 +0100)]
ac: change ac_query_gpu_info() signature
Currently libdrm_amdgpu provides a typedef of the various handles. While
the goal was to make those opaque, it effectively became part of the API.
To the best of my knowledge there are two ways to have opaque handles:
- "typedef void *foo;" - rather messy IMHO
- "struct foo;" and use "struct foo *" through the API
In our case amdgpu_device_handle is used only internally, plus the
respective code is not used or applicable for r300 and r600. Hence we
copied the typedef.
Seemingly this will be a problem since libdrm_amdgpu wants to change the
API, while not updating the code(?).
Either way, we can safely s/amdgpu_device_handle/void */ and carry on.
Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tomeu Vizoso [Fri, 28 Jun 2019 07:17:55 +0000 (09:17 +0200)]
panfrost: Only tag AFBC addresses when sampling
Rendering to AFBC was broken, as the HW will complain loudly if we pass
a tagged pointer in bifrost_render_target.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Fixes: 3609b50a6443 ("panfrost: Merge AFBC slab with BO backing")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Jose Fonseca [Fri, 31 May 2019 16:10:40 +0000 (17:10 +0100)]
gallivm: Improve lp_build_rcp_refine.
Use the alternative more accurate expression from
https://en.wikipedia.org/wiki/Division_algorithm#Newton%E2%80%93Raphson_division
v2: Use lp_build_fmuladd as suggested by Roland
Tested by enabling this code path, and running lp_test_arit.
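
For reference, the two forms of the Newton-Raphson reciprocal step given on the linked Wikipedia page can be written as below. This is a plain-arithmetic sketch with hypothetical function names; it is not the gallivm code, and the fused multiply-adds are only indicated in comments rather than actually fused.

```python
def rcp_refine_classic(d, x):
    """One step of x * (2 - d * x), refining x as an estimate of 1/d."""
    return x * (2.0 - d * x)


def rcp_refine_fma(d, x):
    """Same step written as x + x * (1 - d * x), i.e. two multiply-adds."""
    error = 1.0 - d * x   # first multiply-add: fma(-d, x, 1)
    return x + x * error  # second multiply-add: fma(x, error, x)


def refine(d, x, steps):
    """Apply the multiply-add form repeatedly."""
    for _ in range(steps):
        x = rcp_refine_fma(d, x)
    return x
```

Starting from 0.3 as an estimate of 1/3, one step gives roughly 0.33, and the error shrinks quadratically with each further step.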
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tomeu Vizoso [Fri, 28 Jun 2019 06:10:29 +0000 (08:10 +0200)]
panfrost/ci: Don't error out on RK3288
At the moment we don't have enough people to ensure that RK3288 is
regression-free, so don't fail the CI in that case.
For now we'll focus on not regressing on RK3399 and we can expand to
other SoCs as more people join the effort.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Wed, 26 Jun 2019 11:36:30 +0000 (13:36 +0200)]
panfrost/ci: Don't print every kernel file
There are lots of them, and GitLab struggles to render logs with so many
lines.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Wed, 26 Jun 2019 10:50:52 +0000 (12:50 +0200)]
panfrost/ci: Fix the image name
These changes will make sure we get the right image from the container
registry.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Wed, 26 Jun 2019 06:02:31 +0000 (08:02 +0200)]
panfrost/ci: Remove batching
Panfrost has grown and doesn't leak as much as before.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Kenneth Graunke [Fri, 28 Jun 2019 00:16:20 +0000 (17:16 -0700)]
iris: Don't leak resources in iris_create_surface for incomplete FBOs
We were failing to pipe_resource_unreference on the failure path due
to a non-renderable format. Instead of fixing this, just move the
checks earlier, before we even bother with refcounting or calloc.
Samuel Pitoiset [Thu, 27 Jun 2019 17:29:13 +0000 (19:29 +0200)]
radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+
These two extensions are supported on GFX8, but the throughput
of 16-bit floats/integers is the same as 32-bit. Also, shaderInt16
is only enabled on GFX9+ for the same reason, so this is more consistent.
This fixes a crash with Wolfenstein II because it expects
shaderInt16 to be enabled when VK_AMD_gpu_shader_half_float is
exposed. Note that AMDVLK only enables these extensions on GFX9+.
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 26 Jun 2019 07:23:31 +0000 (09:23 +0200)]
radv: add si_emit_ia_multi_vgt_param() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Alexandros Frantzis [Thu, 27 Jun 2019 07:48:50 +0000 (10:48 +0300)]
virgl: Don't allow creating staging pipe_resources
Staging buffers are now created directly by the virgl_staging_mgr. We
don't need to support creating staging pipe_resources.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Alexandros Frantzis [Mon, 24 Jun 2019 13:57:46 +0000 (16:57 +0300)]
virgl: Use virgl_staging_mgr
Use an instance of virgl_staging_mgr instead of u_upload_mgr to handle
the staging buffer. This removes the need to track the availability
of the staging manager, since virgl_staging_mgr can handle concurrent
active allocations.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Alexandros Frantzis [Tue, 25 Jun 2019 10:56:52 +0000 (13:56 +0300)]
virgl: Add tests for virgl_staging_mgr
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Alexandros Frantzis [Mon, 24 Jun 2019 13:30:07 +0000 (16:30 +0300)]
virgl: Introduce virgl_staging_mgr
Add a manager for the staging buffer used in virgl. The staging manager
is heavily inspired by u_upload_mgr, but is simpler and is a better fit
for virgl's purposes. In particular, the staging manager:
* Allows concurrent staging allocations.
* Calls the virgl winsys directly to create and map resources, avoiding
unnecessarily going through gallium resources and transfers.
olv: make virgl_staging_alloc_buffer return a bool
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Alexandros Frantzis [Wed, 26 Jun 2019 09:12:17 +0000 (12:12 +0300)]
virgl: Store the virgl_hw_res for copy transfers
Store the virgl_hw_res instead of the pipe_resource for copy transfer
sources. This prepares the codebase for a change to provide only the
virgl_hw_res for the staging buffers in upcoming commits.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Kenneth Graunke [Thu, 27 Jun 2019 23:54:47 +0000 (16:54 -0700)]
iris: Fix major resource leak in iris_set_shader_images
We were failing to unreference the old image resource. Instead of open
coding this and doing it badly, just use the copier function which does
the right thing.
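The leak described above (replacing a held reference without dropping the old
one) can be illustrated with a minimal reference-counting sketch. The `res`
type and the `res_create`/`res_reference` helpers are hypothetical stand-ins
for illustration only; they are not the gallium or iris API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted object, standing in for a driver resource. */
struct res {
   int refcount;
};

/* Allocate an object that starts with one reference (the caller's). */
static struct res *res_create(void)
{
   struct res *r = calloc(1, sizeof(*r));
   if (r)
      r->refcount = 1;
   return r;
}

/* Point *dst at src: take a reference on src and drop the reference that
 * was previously held through *dst. Returns 1 if the old object was freed.
 * Overwriting *dst directly instead would leak the old reference. */
static int res_reference(struct res **dst, struct res *src)
{
   assert(dst != NULL);
   struct res *old = *dst;
   int freed = 0;

   if (old == src)
      return 0;
   if (src)
      src->refcount++;
   if (old && --old->refcount == 0) {
      free(old);
      freed = 1;
   }
   *dst = src;
   return freed;
}
```

Routing every slot update through one helper of this kind is what makes a
shared copier function safer than open-coding the assignment.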
Kenneth Graunke [Thu, 27 Jun 2019 23:50:00 +0000 (16:50 -0700)]
gallium: Make util_copy_image_view handle shader_access
A while back, we added a new field, but failed to update the copier.
I believe iris is the only current user of the new field, and it hasn't
used the copier, so no one noticed.
Fixes: 8b626a22b24 st/mesa: Record shader access qualifiers for images
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Kenneth Graunke [Thu, 27 Jun 2019 21:35:13 +0000 (14:35 -0700)]
gallium: Teach GALLIUM_REFCNT_LOG about array textures
Otherwise they are classified as pipe_martian_resource, and don't
contain any helpful information about the texture.
Reviewed-by: Eric Anholt <eric@anholt.net>
Nanley Chery [Mon, 20 May 2019 21:50:23 +0000 (14:50 -0700)]
isl: Don't align phys_level0_sa by block dimension
Aligning phys_level0_sa by the compression block dimension prior to
mipmap layout causes the layout of compressed surfaces to differ from
the sampler's expectations in certain cases. The hardware docs agree:
From the BDW PRM, Vol. 5, Compressed Mipmap Layout,
The compressed mipmaps are stored in a similar fashion to
uncompressed mipmaps [...]
The following exceptions apply to the layout of compressed (vs.
uncompressed) mipmaps:
* [...]
* The dimensions of the mip maps are first determined by applying
the sizing algorithm presented in Non-Power-of-Two Mipmaps
above. Then, if necessary, they are padded out to compression
block boundaries.
The last bullet indicates that alignment should not be done for
calculating a miplevel's dimensions, but rather for determining miplevel
placement/padding. Comply with this text by removing the extra
alignment.
Fixes some fbo-generatemipmap-formats piglit failures on all tested
platforms (SNB-KBL).
v2:
- Note fixed platforms.
- Update some consumers via a helper function.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
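The ordering issue in the isl commit above can be sketched in a few lines of
C. This is an illustration under stated assumptions, not isl code: the helper
names are hypothetical, minification is assumed to be a right shift clamped to
1, and only one dimension is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a (a > 0). */
static uint32_t align_up(uint32_t v, uint32_t a)
{
   assert(a > 0);
   return (v + a - 1) / a * a;
}

/* Size of mip `level` for a level-0 size of `size0`, never below 1.
 * Assumes level < 32. */
static uint32_t minify(uint32_t size0, uint32_t level)
{
   uint32_t s = size0 >> level;
   return s ? s : 1;
}

/* Order implied by the quoted PRM text: size the miplevel first, then pad
 * it out to the compression block boundary. */
static uint32_t width_minify_then_pad(uint32_t w0, uint32_t level,
                                      uint32_t block_w)
{
   return align_up(minify(w0, level), block_w);
}

/* Order with the extra level-0 alignment: pad level 0 first, then minify. */
static uint32_t width_pad_then_minify(uint32_t w0, uint32_t level,
                                      uint32_t block_w)
{
   return align_up(minify(align_up(w0, block_w), level), block_w);
}
```

With a level-0 width of 9 and a 4-wide block, level 1 comes out as 4 elements
wide in the first ordering and 8 in the second, which is the kind of mismatch
the commit message describes.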
Nanley Chery [Thu, 23 May 2019 20:44:52 +0000 (13:44 -0700)]
intel: Add and use helpers for level0 extent
Prepare for a bug fix by adding and using helpers which convert
isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of
surface elements.
v2:
- Update iris (Ken).
- Update anv.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dylan Baker [Wed, 22 May 2019 22:49:01 +0000 (15:49 -0700)]
meson: try to use cmake as a finder for clang
Clang (like LLVM) very annoyingly refuses to provide pkg-config, and
only provides cmake (unlike LLVM, which at least provides llvm-config,
even if llvm-config is terrible). Meson has gained the ability to use
cmake to find dependencies, and can successfully find Clang. This change
attempts to use cmake to find clang instead of a bunch of library
searches. When paired with -Dcmake_prefix_path we can much more reliably
use cmake to control which clang we're getting. This is only enabled for
meson >= 0.51, which adds the required options.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Dylan Baker [Wed, 22 May 2019 18:01:17 +0000 (11:01 -0700)]
meson: Add support for using cmake for finding LLVM
Meson has support for using cmake as a finder for some dependencies,
including LLVM. Using cmake has a lot of advantages: it needs less meson
maintenance to keep working (even for llvm updates); it works more
sanely for cross compiles (as llvm-config is a compiled binary not a
shell script). Meson 0.51.0 also has a new generic variable getter that
can be used to get information from either cmake, pkg-config, or
config-tools dependencies, which is needed for cmake. We continue to
support using llvm-config if you don't have cmake installed, or if cmake
cannot find a suitable version.
Fixes: 0d59459432cf077d768164091318af8fb1612500
("meson: Force the use of config-tool for llvm")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Kenneth Graunke [Thu, 27 Jun 2019 21:09:05 +0000 (14:09 -0700)]
iris: Fix memory leak of SO targets
We need to pitch these on context destroy.
Kenneth Graunke [Thu, 27 Jun 2019 18:49:41 +0000 (11:49 -0700)]
iris: Fix memory leak for draw parameter resources
Need to pitch these on context destroy.
Kenneth Graunke [Thu, 27 Jun 2019 18:15:10 +0000 (11:15 -0700)]
iris: Drop u_upload_unmap
We use persistent maps so this does nothing.
Lionel Landwerlin [Tue, 25 Jun 2019 08:10:14 +0000 (11:10 +0300)]
intel/compiler: fix derivative on y axis implementation
This rewrites the ddy in EXECUTE_4 mode with a loop to make it more
obvious what is going on and also sets the group each of the 4 threads
in the groups are supposed to execute.
Fixes the following CTS tests :
dEQP-VK.glsl.derivate.dfdyfine.dynamic_*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Co-Authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 2134ea380033d5 ("intel/compiler/fs: Implement ddy without using align16 for Gen11+")
Eric Engestrom [Wed, 22 May 2019 15:37:10 +0000 (16:37 +0100)]
meson: set up a proper internal dependency for xmlconfig
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Wed, 22 May 2019 14:32:27 +0000 (15:32 +0100)]
xmlconfig: add missing #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Wed, 22 May 2019 11:52:44 +0000 (12:52 +0100)]
xmlpool: fix typo in comment
s/otions/options/, and while here let's give the full path to xmlpool.h
since `../` won't be true in the generated file.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Kenneth Graunke [Thu, 27 Jun 2019 06:56:45 +0000 (23:56 -0700)]
iris: Also properly restore INTERFACE_DESCRIPTOR_DATA buffer object
We were at least cleaning up this reference, but we were failing to
pin it in iris_restore_compute_saved_bos.
Kenneth Graunke [Thu, 27 Jun 2019 06:38:59 +0000 (23:38 -0700)]
iris: Fix resource tracking for CS thread ID buffer
Today, we stream the compute shader thread IDs simply because they're
(annoyingly) relative to dynamic state base address. We could upload
them once at compile time, but we'd need a separate non-streaming
uploader for IRIS_MEMZONE_DYNAMIC, and I'm not sure it's worth it.
stream_state pins the buffer for use in the current batch, but also
returns a reference to the pipe_resource. We dropped this reference
on the floor, leaking a reference basically every time we dispatched
a compute shader after switching to a new one.
The reason it returns a reference is so that we can hold on to it and
re-pin it in iris_restore_compute_saved_bos, which we were also failing
to do. So if we actually filled up a batch with repeated dispatches to
the same compute shader, and flushed, then continued dispatching, we
would fail to pin it and likely GPU hang.
Kenneth Graunke [Thu, 27 Jun 2019 06:33:40 +0000 (23:33 -0700)]
iris: Only bother with thread ID upload if doing MEDIA_CURBE_LOAD
We were unconditionally uploading the new data, but then conditionally
using it with MEDIA_CURBE_LOAD. If we're not going to emit the command,
there's no point in uploading the data.
Kenneth Graunke [Thu, 27 Jun 2019 00:14:58 +0000 (17:14 -0700)]
iris: Do MEDIA_CURBE_LOAD when IRIS_DIRTY_CS is set, not constants
We only push the compute shader thread IDs, not any actual constant
buffer data. So we should track the compute shader variant changing,
not constbuf changes.
Kenneth Graunke [Thu, 27 Jun 2019 06:24:56 +0000 (23:24 -0700)]
iris: Drop UBO range stuff from iris_restore_compute_saved_bos
Compute doesn't use UBO ranges (annoyingly), so this is dead code.
Kenneth Graunke [Thu, 27 Jun 2019 00:35:45 +0000 (17:35 -0700)]
iris: Properly align interface descriptor data addresses
MEDIA_INTERFACE_DESCRIPTOR's Interface Descriptor Data Start Address
field's docs say: "This bit specifies the 64-byte aligned address..."
And we were doing 32. Superfluous thread ID uploading was apparently
saving us from GPU hangs in most cases.
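The difference between 32-byte and 64-byte alignment mentioned above can be
sketched as follows. The helper names are hypothetical and the offsets are
made-up example values, not anything taken from iris.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a, where a is a power of two. */
static uint32_t align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* True if v is a multiple of the power-of-two a. */
static int is_aligned(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v & (a - 1)) == 0;
}
```

An offset such as 96 already satisfies a 32-byte alignment request, so
`align_u32(96, 32)` leaves it unchanged, yet it is not 64-byte aligned;
`align_u32(96, 64)` moves it to 128.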
Andrii Simiklit [Tue, 25 Jun 2019 14:42:43 +0000 (17:42 +0300)]
mesa: use a correct function return type
v2: standard 'bool' can be used
( Eric Engestrom <eric.engestrom@intel.com> )
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Tomeu Vizoso [Tue, 25 Jun 2019 07:20:51 +0000 (09:20 +0200)]
panfrost/decode: Mention the address of a few descriptors
When the fault_pointer field in the header is set, we can get some idea
of which descriptor the HW isn't happy with if we know their addresses.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Tue, 25 Jun 2019 06:41:06 +0000 (08:41 +0200)]
panfrost/decode: Wait for a job to finish before dumping
Then we can get some information back about any exception that might
have happened.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Tue, 25 Jun 2019 06:22:30 +0000 (08:22 +0200)]
panfrost/decode: Decode exception status
Arm's kernel driver mentions how to decode this field, which makes it a bit
clearer what happened.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Tue, 25 Jun 2019 06:21:15 +0000 (08:21 +0200)]
panfrost/decode: Print AFBC struct when appropriate
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Samuel Pitoiset [Wed, 26 Jun 2019 14:35:44 +0000 (16:35 +0200)]
radv: only export clip/cull distances if PS reads them
The only exception is the GS copy shader which emits them
unconditionally.
Totals from affected shaders:
SGPRS: 71320 -> 71008 (-0.44 %)
VGPRS: 54372 -> 54240 (-0.24 %)
Code Size: 2952628 -> 2941368 (-0.38 %) bytes
Max Waves: 9689 -> 9723 (0.35 %)
This helps Dota2, Doom, GTAV and Hitman 2.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 26 Jun 2019 14:24:10 +0000 (16:24 +0200)]
radv: fix FMASK expand if layerCount is VK_REMAINING_ARRAY_LAYERS
This doesn't fix anything known, but it's likely going to
break if layerCount is ~0U.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
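A minimal sketch of why a raw layerCount of ~0U is a hazard and how it is
usually resolved before use. The function name and parameters are
hypothetical, not radv's; the local macro just mirrors the value of
VK_REMAINING_ARRAY_LAYERS so the example does not need the Vulkan headers.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as VK_REMAINING_ARRAY_LAYERS; defined locally so the sketch
 * builds without Vulkan headers. */
#define REMAINING_ARRAY_LAYERS (~0U)

/* Turn a subresource-range layer count into a concrete number of layers.
 * Using the raw value as a loop bound would iterate ~4 billion times. */
static uint32_t resolve_layer_count(uint32_t total_layers,
                                    uint32_t base_layer,
                                    uint32_t layer_count)
{
   assert(base_layer <= total_layers);
   if (layer_count == REMAINING_ARRAY_LAYERS)
      return total_layers - base_layer;
   return layer_count;
}
```

For an image with 6 layers and a range starting at layer 2, the sentinel
resolves to 4 layers, while an explicit count is passed through unchanged.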
Kenneth Graunke [Wed, 26 Jun 2019 02:34:45 +0000 (19:34 -0700)]
iris: Disable loop unrolling in GLSL IR.
Leave it to NIR instead, like i965 does. Thanks to Tim Arceri for
noticing that I'd left this enabled by accident.
shader-db results on Skylake:
total instructions in shared programs: 15522628 -> 15521642 (<.01%)
instructions in affected programs: 94008 -> 93022 (-1.05%)
helped: 34
HURT: 33
helped stats (abs) min: 12 max: 48 x̄: 33.82 x̃: 42
helped stats (rel) min: 0.06% max: 22.14% x̄: 9.86% x̃: 10.89%
HURT stats (abs) min: 1 max: 16 x̄: 4.97 x̃: 3
HURT stats (rel) min: 0.82% max: 3.77% x̄: 1.73% x̃: 1.53%
95% mean confidence interval for instructions value: -20.08 -9.35
95% mean confidence interval for instructions %-change: -5.95% -2.36%
Instructions are helped.
total cycles in shared programs: 367105221 -> 367074230 (<.01%)
cycles in affected programs: 10017660 -> 9986669 (-0.31%)
helped: 266
HURT: 184
helped stats (abs) min: 1 max: 9556 x̄: 151.35 x̃: 12
helped stats (rel) min: 0.08% max: 59.91% x̄: 4.66% x̃: 1.67%
HURT stats (abs) min: 1 max: 1716 x̄: 50.37 x̃: 6
HURT stats (rel) min: <.01% max: 24.40% x̄: 2.42% x̃: 0.85%
95% mean confidence interval for cycles value: -133.90 -3.84
95% mean confidence interval for cycles %-change: -2.44% -1.10%
Cycles are helped.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Kenneth Graunke [Wed, 26 Jun 2019 04:00:46 +0000 (21:00 -0700)]
st/mesa: Set EmitNoIndirectSampler if GLSLVersion < 400.
This patch changes the code which sets EmitNoIndirectSampler to check
the core profile GLSL version, rather than the ARB_gpu_shader5 extension
enable. st/mesa exposes ARB_gpu_shader5 if GLSLVersion (in core
profiles) or GLSLVersionCompat (in compat profiles) >= 400.
The Intel drivers do not currently expose ARB_gpu_shader5 in compat
profiles. But the backend can absolutely handle indirect samplers.
Looking at the core profile version number should be a good indication
of what the driver supports.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Kenneth Graunke [Thu, 27 Jun 2019 03:16:10 +0000 (20:16 -0700)]
iris: Delete dead ice->state.streamout_strides field.
Nothing uses this, it must be a remnant from an earlier approach.
Caio Marcelo de Oliveira Filho [Wed, 12 Jun 2019 23:48:21 +0000 (16:48 -0700)]
nir/algebraic: Add helpers and a rule involving wrapping
The helpers are needed so we can use the syntax `instr(cond)` in the
algebraic rules. Add a simple rule for dropping a mul-div pair of
the same value when wrapping is guaranteed not to happen.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
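Why the mul-div simplification above needs the no-wrap guarantee can be shown
with plain unsigned arithmetic. This is an illustrative C sketch, not the NIR
rule itself; the helper names are hypothetical and a 32-bit `unsigned int` is
assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a * b does not fit in 32 bits, i.e. the multiply would wrap. */
static bool umul_wraps(uint32_t a, uint32_t b)
{
   return b != 0 && a > UINT32_MAX / b;
}

/* Computes (a * b) / b with wrapping 32-bit arithmetic. This equals a
 * only when the multiply did not wrap. */
static uint32_t mul_then_div(uint32_t a, uint32_t b)
{
   assert(b != 0);
   uint32_t prod = a * b;
   return prod / b;
}
```

For a = 6, b = 7 the product fits and the result is 6, so replacing the pair
with `a` is safe. For a = 0x80000000, b = 2 the product wraps to 0, the
division yields 0 rather than `a`, and the replacement would change the
result.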
Caio Marcelo de Oliveira Filho [Sat, 18 May 2019 05:52:42 +0000 (22:52 -0700)]
spirv: Implement NoSignedWrap and NoUnsignedWrap decorations
When handling the specified ALU operations, check for the decorations
and set nir_alu_instr no_signed_wrap and no_unsigned_wrap flags accordingly.
v2: Add a glsl_base_type_is_unsigned_integer() helper. (Karol)
v3: Rename helper to glsl_base_type_is_uint().
v4: Use two flags, so we don't need the helper anymore. (Connor)
v5: Pass alu directly to handle function. (Jason)
Reviewed-by: Karol Herbst <kherbst@redhat.com> [v3]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Fri, 17 May 2019 20:46:38 +0000 (13:46 -0700)]
nir: Add no-wrapping bits to nir_alu_instr
They indicate the operation does not cause overflow or underflow.
This is motivated by SPIR-V decorations NoSignedWrap and
NoUnsignedWrap.
Change the storage of `exact` to be a single bit, so they pack
together.
v2: Handle no_wrap in nir_instr_set. (Karol)
v3: Use two separate flags, since the NIR SSA values and certain
instructions are typeless, so just no_wrap would be insufficient
to know which one was referred to. (Connor)
v4: Don't use nir_instr_set to propagate the flags. Unlike `exact`,
consider the instructions different if the flags have different
values. Fix hashing/comparing. (Jason)
Reviewed-by: Karol Herbst <kherbst@redhat.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
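The flag layout and comparison policy described in the commit above might
look roughly like the sketch below. The struct and function are hypothetical
simplifications (the real nir_alu_instr has many more fields), and the
comparison only models the v4 note: differing wrap flags make two
instructions different, while `exact` is left out of the comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the per-instruction flags. Each flag is a
 * single bit so the three can share storage. */
struct alu_flags {
   bool exact : 1;
   bool no_signed_wrap : 1;
   bool no_unsigned_wrap : 1;
};

/* Two otherwise identical instructions are treated as the same only if
 * both wrap flags agree. `exact` is deliberately not compared here. */
static bool alu_flags_match(struct alu_flags a, struct alu_flags b)
{
   return a.no_signed_wrap == b.no_signed_wrap &&
          a.no_unsigned_wrap == b.no_unsigned_wrap;
}
```

Keeping signed and unsigned wrap as separate bits follows the v3 note: the
values are typeless, so a single flag could not say which interpretation the
guarantee applies to.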
Dylan Baker [Wed, 26 Jun 2019 20:48:06 +0000 (13:48 -0700)]
docs: add news item and link release notes for 19.0.8
This is an emergency release due to a critical bug.
Dylan Baker [Wed, 26 Jun 2019 20:42:45 +0000 (13:42 -0700)]
docs: Add mesa 19.0.8 sha256 sums