mesa.git
5 years agovulkan/wsi/wayland: implement acquire timeout
Lionel Landwerlin [Fri, 26 Jul 2019 09:08:13 +0000 (12:08 +0300)]
vulkan/wsi/wayland: implement acquire timeout

v2: Eric's nits

v3: Reuse timespec utils (Daniel)
    Deal with ppoll being interrupted by a signal (Daniel)

v4: Remove unnecessary time check

v5: Deal with EAGAIN from wl_display_prepare_read_queue() (Daniel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years agoutil: add a timespec helper
Lionel Landwerlin [Mon, 29 Jul 2019 09:54:04 +0000 (12:54 +0300)]
util: add a timespec helper

Copied from Weston, upon Daniel's suggestion

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years agointel: replace large stack buffer with heap allocation
Eric Engestrom [Thu, 18 Oct 2018 16:19:56 +0000 (17:19 +0100)]
intel: replace large stack buffer with heap allocation

For now, this keeps the "100 bytes" allocation; we can try to figure out
the correct size as a follow up.

Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoradv/gfx10: do not use the fast depth or stencil clear bytes path
Samuel Pitoiset [Mon, 29 Jul 2019 12:15:23 +0000 (14:15 +0200)]
radv/gfx10: do not use the fast depth or stencil clear bytes path

It causes issues on GFX10.

This fixes rendering issues with vkmark and Wreckfest at least.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
5 years agoac: do not crash when the buffer data format is invalid
Samuel Pitoiset [Mon, 29 Jul 2019 10:03:46 +0000 (12:03 +0200)]
ac: do not crash when the buffer data format is invalid

This might happen when a pipeline doesn't define the vertex input
state, so the buffer data format is 0 (aka INVALID).

This fixes crashes when compiling some shaders on GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac/nir: fix txf_ms with an offset
Rhys Perry [Fri, 19 Jul 2019 14:20:44 +0000 (15:20 +0100)]
ac/nir: fix txf_ms with an offset

Seems to fix some hair artifacts in Max Payne 3:
https://github.com/daniel-schuermann/mesa/issues/76

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver')
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: Delete unused local variables in optimization loop
Connor Abbott [Wed, 26 Jun 2019 12:03:31 +0000 (14:03 +0200)]
radv: Delete unused local variables in optimization loop

Totals from affected shaders:
SGPRS: 376 -> 376 (0.00 %)
VGPRS: 620 -> 560 (-9.68 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 292 -> 292 (0.00 %) dwords per thread
Code Size: 20024 -> 20144 (0.60 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 25 -> 25 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir/find_array_copies: Handle wildcards and overlapping copies
Connor Abbott [Tue, 18 Jun 2019 10:12:49 +0000 (12:12 +0200)]
nir/find_array_copies: Handle wildcards and overlapping copies

This commit rewrites opt_find_array_copies to be able to handle
an array copy sequence with other intervening operations in between. In
particular, this handles the case where we OpLoad an array of structs
and then OpStore it, which generates code like:

foo[0].a = bar[0].a
foo[0].b = bar[0].b
foo[1].a = bar[1].a
foo[1].b = bar[1].b
...

that wasn't recognized by the previous pass.

In order to correctly handle copying arrays of arrays, and in particular
to correctly handle copies involving wildcards, we need to use a tree
structure similar to lower_vars_to_ssa so that we can walk all the
partial array copies invalidated by a particular write, including
ones where one of the common indices is a wildcard. I actually think
that when factoring in the needed hashing/comparing code, a hash table
based approach wouldn't be a lot smaller anyways.

All of the changes come from tessellation control shaders in Strange
Brigade, where we're able to remove the DXVK-inserted copy at the
beginning of the shader. These are the result for radv:

Totals from affected shaders:
SGPRS: 4576 -> 4576 (0.00 %)
VGPRS: 13784 -> 5560 (-59.66 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread
Code Size: 329940 -> 263268 (-20.21 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 330 -> 898 (172.12 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: Print array deref indices as decimal
Connor Abbott [Wed, 26 Jun 2019 11:41:20 +0000 (13:41 +0200)]
nir: Print array deref indices as decimal

We print the size as decimal too, and using hex without a leading "0x"
was very confusing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agolima/gpir/sched: Handle more special ops in can_use_complex()
Connor Abbott [Sat, 27 Jul 2019 18:24:32 +0000 (20:24 +0200)]
lima/gpir/sched: Handle more special ops in can_use_complex()

We were missing handling for a few other ops that rearrange their
sources somehow in codegen, namely complex2 and select.

This should fix spec@glsl-1.10@execution@built-in-functions@vs-asin-vec3
and possibly other random regressions from the new scheduler which were
supposed to be fixed in the commit right after.

Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/gp: Clean up lima_program_optimize_vs_nir() a little
Connor Abbott [Sat, 27 Jul 2019 17:38:53 +0000 (19:38 +0200)]
lima/gp: Clean up lima_program_optimize_vs_nir() a little

Remove an unnecessary nir_lower_regs_to_ssa as that should be done by
the state tracker, and add a missing DCE pass after running copy
propagation in order to remove the dead copies. This shouldn't fix
anything but the second part will reduce shader sizes.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/gpir/sched: Don't try to spill when something else has succeeded
Connor Abbott [Sat, 27 Jul 2019 16:31:09 +0000 (18:31 +0200)]
lima/gpir/sched: Don't try to spill when something else has succeeded

In try_node(), we assume that the node we pick can still be scheduled
successfully after speculatively trying all the other nodes. Normally we
always undo every node after speculating it, so that when we finally
schedule best_node the scheduler state is exactly the same and it
succeeds. However, we also try to spill nodes, which can change the
state and in a corner case that can make scheduling best_node fail. In
particular, the following sequence of events happened with piglit
shaders@glsl-vs-if-nested: a partially-ready node N was spilled and a
register store node S, which is a use of N, was created and then later
the other uses of N were scheduled, so that S is now ready and N is
partially ready. First we try to schedule S and succeed, then we try to
schedule another node M, which fails, so we try to spill the remaining
uses of N. This succeeds, but scheduling M still fails so that best_node
is still S. However since one of the uses of N is one cycle ago, and
therefore we inserted a read dependent on S one cycle ago when spilling
N, S can no longer be scheduled as read-after-write latency is three
cycles.

While we could ad-hoc try to catch cases like this, or (the best option
but very complicated) treat the spill as speculative and roll it back if
we decide not to schedule the node, a simpler solution is to just
give up on spilling if we've already successfully speculatively
scheduled another node. We'd give up a few cases where we discover that
by spilling even harder we could schedule a more desirable node, but
that seems like it would be pretty rare in practice. With this we
guarantee that nothing has been touched after best_node was successfully
scheduled. We also cut down on pointless spilling, since if we already
scheduled a node it's unlikely that spilling harder will let us schedule
an even better node, and hence any spilling at this point is probably
useless.

While we're here, clean up the code around spilling by flattening the
two if's and getting rid of the second unnecessary check for INT_MIN.

Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler")
Acked-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonv50/ir: don't consider the main compute function as taking arguments
Ilia Mirkin [Fri, 26 Jul 2019 05:18:23 +0000 (01:18 -0400)]
nv50/ir: don't consider the main compute function as taking arguments

With OpenCL, kernels can take arguments and return values (?). However
in practice, there is no more TGSI compute implementation, and even if
there were, it would probably have named functions and no explicit main.

This improves RA considerably for compute shaders, since temps are not
kept around as return values.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agonv50/ir: handle insn not being there for definition of CVT arg
Ilia Mirkin [Fri, 26 Jul 2019 05:01:45 +0000 (01:01 -0400)]
nv50/ir: handle insn not being there for definition of CVT arg

This can happen if it's e.g. a uniform or a function argument.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
5 years agonouveau: flip DEBUG -> !NDEBUG
Ilia Mirkin [Fri, 26 Jul 2019 03:23:08 +0000 (23:23 -0400)]
nouveau: flip DEBUG -> !NDEBUG

The meson conversion chose to change the meaning of DEBUG to "used for
debugging" to be "used for expensive things for debugging", primarily
for nir_validate. Flip things over so that we get nice things with
optimizations enabled.

While we're at it, also kill off nouveau_statebuf.h which is unused (and
has a mention of DEBUG which is how I found it).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agonvc0: allow a non-user buffer to be bound at position 0
Ilia Mirkin [Fri, 26 Jul 2019 03:27:56 +0000 (23:27 -0400)]
nvc0: allow a non-user buffer to be bound at position 0

Previously the code only handled it for positions 1 and up (as would be
for UBO's in GL). It's not a lot of trouble to handle this, and vl or
vdpau want this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
5 years agonv50,nvc0: update sampler/view bind functions to accept NULL array
Ilia Mirkin [Fri, 26 Jul 2019 03:26:46 +0000 (23:26 -0400)]
nv50,nvc0: update sampler/view bind functions to accept NULL array

Apparently vl (or vdpau) wants to pass that in now. Handle it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
5 years agogallium/vl: fix compute tgsi shaders to not process undefined components
Ilia Mirkin [Fri, 26 Jul 2019 03:18:18 +0000 (23:18 -0400)]
gallium/vl: fix compute tgsi shaders to not process undefined components

This caused nouveau's function handling logic to think that the MAIN
function was due to receive external parameters, and cascaded some
failures after that. Instead avoid having the undefined components in
the first place.

Fixes: f6ac0b5d71 (gallium/auxiliary/vl: Add compute shader to support video compositor render)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agopan/midgard: Introduce invert field
Alyssa Rosenzweig [Fri, 26 Jul 2019 18:15:31 +0000 (11:15 -0700)]
pan/midgard: Introduce invert field

This will enable us to fuse inverts in various ways. Marginal hurt:

total instructions in shared programs: 3610 -> 3611 (0.03%)
instructions in affected programs: 67 -> 68 (1.49%)
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Tag SSA/reg
Alyssa Rosenzweig [Fri, 26 Jul 2019 18:30:06 +0000 (11:30 -0700)]
pan/midgard: Tag SSA/reg

Rather than putting registers after SSA in the MIR indexing, put them
side-by-side, shifted 1, using the bottom bit as the SSA/reg select.
This will allow us to generate SSA temps in the compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoradeon/vcn: enable rate control for hevc encoding
Boyuan Zhang [Mon, 17 Jun 2019 19:02:32 +0000 (15:02 -0400)]
radeon/vcn: enable rate control for hevc encoding

Set cu_qp_delta_enable_flag on when rate control is enabled, and set it
off when rate control is disabled (e.g. constant qp).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org
V2: fix typo and add bugzilla info

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/uvd: enable rate control for hevc encoding
Boyuan Zhang [Mon, 17 Jun 2019 19:00:53 +0000 (15:00 -0400)]
radeon/uvd: enable rate control for hevc encoding

Set cu_qp_delta_enable_flag on when rate control is enabled, and set it
off when rate control is disabled (e.g. constant qp).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org
V2: fix typo and add bugzilla info

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: fix poc for hevc encode
Boyuan Zhang [Wed, 29 May 2019 18:25:38 +0000 (14:25 -0400)]
radeon/vcn: fix poc for hevc encode

MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.

Also use poc value passed from st instead of calculation
in slice header encoding.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org
V2: Fix typo

V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb
should be power of 2 according to spec.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/uvd: fix poc for hevc encode
Boyuan Zhang [Wed, 29 May 2019 18:25:07 +0000 (14:25 -0400)]
radeon/uvd: fix poc for hevc encode

MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.

Also use poc value passed from st instead of calculation
in slice header encoding.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org
V2: Fix typo

V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb
should be power of 2 according to spec.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
5 years agonir: Optimize umod lowering
Sagar Ghuge [Mon, 22 Jul 2019 23:30:56 +0000 (16:30 -0700)]
nir: Optimize umod lowering

We don't have calculate final quotient in order to calculate unsigned
modulo result.  Once we are done with error correction we have partial
result which can be used to find out modulo operation result

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agopan/midgard: Improve scheduling
Alyssa Rosenzweig [Fri, 26 Jul 2019 17:28:46 +0000 (10:28 -0700)]
pan/midgard: Improve scheduling

Make scalar scheduling onto vector units more aggressive (it can only
help while we schedule strictly in order). Also, allow imov on VLUT.

total bundles in shared programs: 2176 -> 2117 (-2.71%)
bundles in affected programs: 901 -> 842 (-6.55%)
helped: 24
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 2.46 x̃: 2
helped stats (rel) min: 2.08% max: 20.00% x̄: 8.68% x̃: 5.94%
95% mean confidence interval for bundles value: -3.93 -0.99
95% mean confidence interval for bundles %-change: -10.92% -6.45%
Bundles are helped.

total quadwords in shared programs: 3605 -> 3566 (-1.08%)
quadwords in affected programs: 1984 -> 1945 (-1.97%)
helped: 28
HURT: 5
helped stats (abs) min: 1 max: 3 x̄: 1.68 x̃: 2
helped stats (rel) min: 1.02% max: 14.29% x̄: 5.12% x̃: 2.94%
HURT stats (abs)   min: 1 max: 3 x̄: 1.60 x̃: 1
HURT stats (rel)   min: 0.57% max: 9.09% x̄: 6.40% x̃: 9.09%
95% mean confidence interval for quadwords value: -1.67 -0.69
95% mean confidence interval for quadwords %-change: -5.37% -1.37%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Specialize mod checking by type when checking constants
Alyssa Rosenzweig [Fri, 26 Jul 2019 15:50:22 +0000 (08:50 -0700)]
pan/midgard: Specialize mod checking by type when checking constants

Fixes inlining of integer constants.

total quadwords in shared programs: 3585 -> 3568 (-0.47%)
quadwords in affected programs: 625 -> 608 (-2.72%)
helped: 13
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1
helped stats (rel) min: 1.27% max: 9.52% x̄: 3.84% x̃: 2.94%
95% mean confidence interval for quadwords value: -1.60 -1.02
95% mean confidence interval for quadwords %-change: -5.60% -2.07%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Use more aggressive writeout criteria
Alyssa Rosenzweig [Fri, 26 Jul 2019 15:30:22 +0000 (08:30 -0700)]
pan/midgard: Use more aggressive writeout criteria

We loosen the requirement of "no dependencies" to simply be "no
non-pipelined dependencies", so we check for what could be pipelined.

total bundles in shared programs: 2176 -> 2156 (-0.92%)
bundles in affected programs: 779 -> 759 (-2.57%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 20.00% x̄: 6.47% x̃: 2.78%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -9.44% -3.50%
Bundles are helped.

total quadwords in shared programs: 3605 -> 3585 (-0.55%)
quadwords in affected programs: 1391 -> 1371 (-1.44%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.84% x̃: 1.64%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.73% -1.94%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Pipeline non-SSA registers
Alyssa Rosenzweig [Fri, 26 Jul 2019 16:37:58 +0000 (09:37 -0700)]
pan/midgard: Pipeline non-SSA registers

Rather than bailing if we see something that's not SSA, do out the
analysis to check if we can pipeline and do so if we can.

total registers in shared programs: 392 -> 391 (-0.26%)
registers in affected programs: 3 -> 2 (-33.33%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add mir_mask_of_read_components helper
Alyssa Rosenzweig [Fri, 26 Jul 2019 16:37:28 +0000 (09:37 -0700)]
pan/midgard: Add mir_mask_of_read_components helper

This facilitates analysis of vec4 registers (after going out-of-SSA).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add mir_is_written_before helper
Alyssa Rosenzweig [Fri, 26 Jul 2019 16:20:52 +0000 (09:20 -0700)]
pan/midgard: Add mir_is_written_before helper

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Obey fragment writeout criteria
Alyssa Rosenzweig [Tue, 16 Jul 2019 22:57:19 +0000 (15:57 -0700)]
pan/midgard: Obey fragment writeout criteria

Rather than always emitting an extra move for fragments, check the
actual criteria and emit accordingly. (This was lost during the RA
improvements at the end of May).

total bundles in shared programs: 2210 -> 2176 (-1.54%)
bundles in affected programs: 501 -> 467 (-6.79%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.59% max: 33.33% x̄: 13.13% x̃: 12.50%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -16.06% -10.21%
Bundles are helped.

total quadwords in shared programs: 3639 -> 3605 (-0.93%)
quadwords in affected programs: 795 -> 761 (-4.28%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.96% max: 33.33% x̄: 11.22% x̃: 8.33%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -14.31% -8.13%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add post-RA move elimination
Alyssa Rosenzweig [Thu, 25 Jul 2019 22:34:13 +0000 (15:34 -0700)]
pan/midgard: Add post-RA move elimination

Think of this pass as register coalescing part 2. After RA runs, but
before scheduling, we scan for code of the form:

   mov rN, rN

and delete the move, since it's totally redundant. This pass helps
already, but it'd of course be much more effective paired with
register coalescing to encourage moves in general to end up in this
form. Nevertheless, even by itself:

total instructions in shared programs: 3665 -> 3613 (-1.42%)
instructions in affected programs: 2046 -> 1994 (-2.54%)
helped: 52
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 8.02% x̃: 4.00%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -10.26% -5.79%
Instructions are helped.

total bundles in shared programs: 2256 -> 2213 (-1.91%)
bundles in affected programs: 1154 -> 1111 (-3.73%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 25.00% x̄: 9.10% x̃: 5.56%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -11.60% -6.60%
Bundles are helped.

total quadwords in shared programs: 3689 -> 3642 (-1.27%)
quadwords in affected programs: 2025 -> 1978 (-2.32%)
helped: 47
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 7.86% x̃: 3.85%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -10.30% -5.42%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Share mir_nontrivial_outmod
Alyssa Rosenzweig [Thu, 25 Jul 2019 22:33:56 +0000 (15:33 -0700)]
pan/midgard: Share mir_nontrivial_outmod

To be used with redundant move elimination.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Implement texture RA
Alyssa Rosenzweig [Thu, 25 Jul 2019 15:44:53 +0000 (08:44 -0700)]
pan/midgard: Implement texture RA

total instructions in shared programs: 3916 -> 3665 (-6.41%)
instructions in affected programs: 1405 -> 1154 (-17.86%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.17 x̃: 3
helped stats (rel) min: 3.00% max: 28.57% x̄: 20.11% x̃: 21.74%
95% mean confidence interval for instructions value: -9.35 -4.99
95% mean confidence interval for instructions %-change: -22.75% -17.46%
Instructions are helped.

total bundles in shared programs: 2472 -> 2256 (-8.74%)
bundles in affected programs: 906 -> 690 (-23.84%)
helped: 32
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 6.75 x̃: 3
helped stats (rel) min: 5.56% max: 32.26% x̄: 20.83% x̃: 16.67%
95% mean confidence interval for bundles value: -9.09 -4.41
95% mean confidence interval for bundles %-change: -23.77% -17.89%
Bundles are helped.

total quadwords in shared programs: 3965 -> 3689 (-6.96%)
quadwords in affected programs: 1568 -> 1292 (-17.60%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.89 x̃: 3
helped stats (rel) min: 2.08% max: 28.57% x̄: 19.87% x̃: 20.00%
95% mean confidence interval for quadwords value: -10.38 -5.39
95% mean confidence interval for quadwords %-change: -22.57% -17.17%
Quadwords are helped.

total registers in shared programs: 411 -> 392 (-4.62%)
registers in affected programs: 76 -> 57 (-25.00%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.27 x̃: 1
helped stats (rel) min: 9.09% max: 50.00% x̄: 30.97% x̃: 33.33%
95% mean confidence interval for registers value: -1.52 -1.01
95% mean confidence interval for registers %-change: -39.12% -22.82%
Registers are helped.

total threads in shared programs: 426 -> 432 (1.41%)
threads in affected programs: 6 -> 12 (100.00%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix backwards blend color load
Alyssa Rosenzweig [Fri, 26 Jul 2019 15:15:50 +0000 (08:15 -0700)]
pan/midgard: Fix backwards blend color load

The source and destination were incorrectly flipped in the move, but
some details of our internal regalloc made this function anyway. Now
that we're changing the regalloc, we need to fix this to avoid
regressing blend shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix scheduling mishap
Alyssa Rosenzweig [Thu, 25 Jul 2019 21:53:20 +0000 (14:53 -0700)]
pan/midgard: Fix scheduling mishap

We shouldn't try to schedule onto a vmul if the last unit was a smul;
that would force a break ("traveling back in time").

total bundles in shared programs: 2519 -> 2472 (-1.87%)
bundles in affected programs: 791 -> 744 (-5.94%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 2.35 x̃: 1
helped stats (rel) min: 1.52% max: 11.76% x̄: 7.94% x̃: 7.69%
95% mean confidence interval for bundles value: -3.47 -1.23
95% mean confidence interval for bundles %-change: -9.36% -6.51%
Bundles are helped.

total quadwords in shared programs: 4028 -> 3965 (-1.56%)
quadwords in affected programs: 1223 -> 1160 (-5.15%)
helped: 17
HURT: 0
helped stats (abs) min: 1 max: 17 x̄: 3.71 x̃: 2
helped stats (rel) min: 2.97% max: 10.64% x̄: 6.97% x̃: 7.14%
95% mean confidence interval for quadwords value: -5.71 -1.70
95% mean confidence interval for quadwords %-change: -8.03% -5.91%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix vector->scalar swizzles
Alyssa Rosenzweig [Fri, 26 Jul 2019 13:30:16 +0000 (06:30 -0700)]
pan/midgard: Fix vector->scalar swizzles

The swizzle should be taken on the masked component, rather than
unconditionally X.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add dead move elimination pass
Alyssa Rosenzweig [Thu, 25 Jul 2019 21:43:32 +0000 (14:43 -0700)]
pan/midgard: Add dead move elimination pass

This is a special case of DCE designed to run after the out-of-ssa pass
to cleanup special register lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Move DCE into its own file
Alyssa Rosenzweig [Thu, 25 Jul 2019 21:33:58 +0000 (14:33 -0700)]
pan/midgard: Move DCE into its own file

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add mir_rewrite_dst_tag helper
Alyssa Rosenzweig [Thu, 25 Jul 2019 19:28:38 +0000 (12:28 -0700)]
pan/midgard: Add mir_rewrite_dst_tag helper

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix flipped register bias fields
Alyssa Rosenzweig [Thu, 25 Jul 2019 17:55:09 +0000 (10:55 -0700)]
pan/midgard: Fix flipped register bias fields

We mixed up component_lo and full, which made it appear that we had
less freedom in RA than we actually do. Fix this to fix some
disassemblies as well as prepare for RA with the bias field.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Update RA for cubemap coords
Alyssa Rosenzweig [Thu, 25 Jul 2019 14:09:40 +0000 (07:09 -0700)]
pan/midgard: Update RA for cubemap coords

Following the RA work, we apply the same technique to eliminate the move
to r27 when loading cubemaps.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoanv+tu+radv: delete unusable dev_icd.json
Eric Engestrom [Wed, 10 Jul 2019 15:22:29 +0000 (16:22 +0100)]
anv+tu+radv: delete unusable dev_icd.json

As per previous commit, Meson doesn't support using uninstalled libs,
they're simply not ready until `ninja install` is ran, so delete them.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> # for anv
Reviewed-by: Eric Anholt <eric@anholt.net> # for tu
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> # for radv
5 years agodocs: fix intel_icd.json path
Eric Engestrom [Fri, 21 Jun 2019 15:53:17 +0000 (16:53 +0100)]
docs: fix intel_icd.json path

Meson doesn't support using uninstalled libs, they're simply not ready
until `ninja install` is ran, at which point one might as well use the
proper icd.json file in the install folder.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agovulkan/wsi/x11: Increase the effective min. images for mailbox.
Bas Nieuwenhuizen [Mon, 20 May 2019 20:58:32 +0000 (22:58 +0200)]
vulkan/wsi/x11: Increase the effective min. images for mailbox.

We need 5 images:
1) CPU work
2) GPU work
3) idle
4) queued for flip
5) presenting

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovulkan/wsi/x11: Wait for GPU work before present with mailbox.
Bas Nieuwenhuizen [Mon, 20 May 2019 01:10:46 +0000 (03:10 +0200)]
vulkan/wsi/x11: Wait for GPU work before present with mailbox.

Otherwise the wait only happens at flip time, which messes with
keeping idle buffers around if the GPU work makes the image miss
the next flip.

I decided not to use the wait fences as those are still xshm fences,
so that means we'd still have to wait in the application. Just doing
it before presenting makes things simpler.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovulkan/wsi/x11: Allow using thread present-only.
Bas Nieuwenhuizen [Mon, 20 May 2019 00:59:00 +0000 (02:59 +0200)]
vulkan/wsi/x11: Allow using thread present-only.

This allows doing a potential long blocking operation before present.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovulkan/wsi: Use one fence per image.
Bas Nieuwenhuizen [Mon, 20 May 2019 00:51:51 +0000 (02:51 +0200)]
vulkan/wsi: Use one fence per image.

Much easier to work with if we want to use them in the WS-specific
WSI implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agospirv: propagate access qualifiers through ssa & pointer
Lionel Landwerlin [Thu, 16 May 2019 12:06:27 +0000 (13:06 +0100)]
spirv: propagate access qualifiers through ssa & pointer

Not only variables can be flagged as NonUniformEXT but also
expressions. We're currently ignoring it in an expression such as :

   imageLoad(data[nonuniformEXT(rIndex)], 0)

The associated SPIRV :

   OpDecorate %69 NonUniformEXT
   ...
   %69 = OpLoad %61 %68

This changes propagates access qualifiers through ssa & pointers so
that when it hits a OpLoad/OpStore style instructions, qualifiers are
not forgotten.

Fixes failure the following tests :

   dEQP-VK.descriptor_indexing.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8ed583fe523703 ("spirv: Handle the NonUniformEXT decoration")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agospirv: wrap push ssa/pointer values
Lionel Landwerlin [Mon, 1 Jul 2019 11:57:54 +0000 (14:57 +0300)]
spirv: wrap push ssa/pointer values

This refactor allows for common code to apply decoration on all
ssa/pointer values. In particular this will allow to propagage access
qualifiers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir: add access to image_deref intrinsics
Lionel Landwerlin [Thu, 16 May 2019 12:03:39 +0000 (13:03 +0100)]
nir: add access to image_deref intrinsics

SPIRV added the ability to access variables and have expressions non
dynamically uniform and because spirv_to_nir generates deref
instructions, we'll need to have that access there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomain: unreference ATIFragmentShader program before creating new one
Yevhenii Kolesnikov [Thu, 25 Jul 2019 15:15:24 +0000 (18:15 +0300)]
main: unreference ATIFragmentShader program before creating new one

Old program was overwritten without release of memory.

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agostate_tracker: Add destroying routine for feedback and select stages
Yevhenii Kolesnikov [Wed, 24 Jul 2019 10:03:16 +0000 (13:03 +0300)]
state_tracker: Add destroying routine for feedback and select stages

Fixes leaking memory on iris.

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agov3d: fix glDrawTransformFeedback{Instanced}()
Iago Toral Quiroga [Wed, 24 Jul 2019 08:14:33 +0000 (10:14 +0200)]
v3d: fix glDrawTransformFeedback{Instanced}()

This needs to take the vertex count from the provided transform
feedback buffer.

v2:
 - don't take the vertex count from the underlying buffer, instead,
   take it from a v3d subclass of pipe_stream_output_target (Eric).

Fixes piglit tests:
spec/ext_transform_feedback2/draw-auto
spec/ext_transform_feedback2/draw-auto instanced

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: subclass pipe_streamout_output_target to record TF vertices written
Iago Toral Quiroga [Wed, 24 Jul 2019 07:59:25 +0000 (09:59 +0200)]
v3d: subclass pipe_streamout_output_target to record TF vertices written

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: refactor v3d_tf_statistics_record slightly
Iago Toral Quiroga [Tue, 23 Jul 2019 09:28:52 +0000 (11:28 +0200)]
v3d: refactor v3d_tf_statistics_record slightly

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoRevert "panfrost: Don't DIY point size/coord fields"
Alyssa Rosenzweig [Thu, 25 Jul 2019 20:16:45 +0000 (13:16 -0700)]
Revert "panfrost: Don't DIY point size/coord fields"

This reverts commit 4508f43eed5a4528f0e8ca9d1cfcdc78857043e0, which
broke a bunch of dEQP tests (e.g. in
dEQP-GLES2.functional.draw.draw_arrays.*)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoanv: Disable transform feedback on gen7
Jason Ekstrand [Thu, 25 Jul 2019 19:54:20 +0000 (14:54 -0500)]
anv: Disable transform feedback on gen7

It's totally implementable, it's just that the plumbing is a bit
different and we never hooked it up.  Don't advertise a broken feature.

Fixes: 36ee2fd61c8 "anv: Implement the basic form of VK_EXT_transform_feedback"
5 years agomesa: Fix GetTextureImage error reporting, again
Pierre-Eric Pelloux-Prayer [Tue, 23 Jul 2019 23:39:31 +0000 (16:39 -0700)]
mesa: Fix GetTextureImage error reporting, again

Iago Toral Quiroga fixed this in commit 94f740e3fce0cb26e4d90cb9de75b,
but it recently regressed in 0d8826f723cd8868b5271f17f18a1ab4548a1199.
Quoting Iago's original commit message for the fix:

GetTex*Image should return INVALID_ENUM if target is not valid, however,
GetTextureImage does not receive a target, and instead should return
INVALID_OPERATION if the effective target is not valid.  From the
OpenGL 4.6 core profile spec, section 8.11 Texture Queries:

   "An INVALID_OPERATION error is generated by GetTextureImage if the
    effective target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D,
    TEXTURE_1D_ARRAY, TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY,
    TEXTURE_RECTANGLE, or TEXTURE_CUBE_MAP (for GetTextureImage only)."

Note that this differs from the original ARB_direct_state_access spec.

However, the EXT_direct_state_access version does take a target
parameter, so it should continue reporting INVALID_ENUM.

Fixes KHR-GL45.direct_state_access.textures_image_query_errors.

Fixes: 0d8826f723c ("mesa: refactor get_texture_image to remove duplicate code")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Use gen_mi_builder to handle CS ALU operations.
Kenneth Graunke [Mon, 1 Apr 2019 22:23:51 +0000 (15:23 -0700)]
iris: Use gen_mi_builder to handle CS ALU operations.

In a few cases, we switch to MI_MATH instead of MI_PREDICATE,
just because we were already doing math and it's easier to chain
together.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agointel/mi: Add a unit test for gen_mi_store_if().
Kenneth Graunke [Wed, 17 Jul 2019 00:22:01 +0000 (17:22 -0700)]
intel/mi: Add a unit test for gen_mi_store_if().

This tests that predicated stores work.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agointel/mi: Add a new gen_mi_store_if() helper.
Kenneth Graunke [Mon, 1 Apr 2019 23:33:08 +0000 (16:33 -0700)]
intel/mi: Add a new gen_mi_store_if() helper.

This performs predicated MI_STORE_REGISTER_MEM commands, assuming that
the condition is already loaded into MI_PREDICATE_DATA.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agointel/mi: Add gen_mi_nz() and gen_mi_z() helpers.
Kenneth Graunke [Mon, 1 Apr 2019 23:01:50 +0000 (16:01 -0700)]
intel/mi: Add gen_mi_nz() and gen_mi_z() helpers.

These provide comparisons against zero.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agointel/mi: Add a gen_mi_ior() to go with gen_mi_iand()
Kenneth Graunke [Wed, 10 Jul 2019 19:05:23 +0000 (12:05 -0700)]
intel/mi: Add a gen_mi_ior() to go with gen_mi_iand()

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agointel/mi: Optimize away LOAD_REGISTER_REG from a register to itself
Kenneth Graunke [Mon, 1 Apr 2019 23:00:25 +0000 (16:00 -0700)]
intel/mi: Optimize away LOAD_REGISTER_REG from a register to itself

We might want to resolve something to be in a particular register,
so we can access it outside of the gen_mi framework...but it may already
be in that register, at which point there's no work to do.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Make iris_query.c a genxml-compiled file.
Kenneth Graunke [Mon, 1 Apr 2019 18:16:22 +0000 (11:16 -0700)]
iris: Make iris_query.c a genxml-compiled file.

This will let us use Jason's new MI-builder shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Move iris_resolve_conditional_render to the vtable.
Kenneth Graunke [Mon, 1 Apr 2019 18:46:36 +0000 (11:46 -0700)]
iris: Move iris_resolve_conditional_render to the vtable.

It's going to be in genxml code shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Refactor genxml macros and inlines into iris_genx_macros.h.
Kenneth Graunke [Fri, 12 Jul 2019 07:50:19 +0000 (00:50 -0700)]
iris: Refactor genxml macros and inlines into iris_genx_macros.h.

This will let us put the genxml boilerplate in one place, before we
expand genxml to more files shortly.  Like i965/genX_boilerplate.h.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Make an iris_genx_protos.h header for prototypes.
Kenneth Graunke [Mon, 1 Apr 2019 18:40:33 +0000 (11:40 -0700)]
iris: Make an iris_genx_protos.h header for prototypes.

This lets us specify the prototypes once, instead of cut and pasting
them per generation.  isl uses a similar approach (isl_genX_priv.h).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoradeonsi: fix DAL hang due to incorrect DCC offset on Raven
Marek Olšák [Wed, 24 Jul 2019 02:32:02 +0000 (22:32 -0400)]
radeonsi: fix DAL hang due to incorrect DCC offset on Raven

Set the correct relative offset.

Fixes: f8b6c5a "radeonsi: rewrite si_get_opaque_metadata, also for gfx10 support"
5 years agoanv: Disable subgroup arithmetic on gen7
Jason Ekstrand [Thu, 25 Jul 2019 05:00:37 +0000 (00:00 -0500)]
anv: Disable subgroup arithmetic on gen7

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agogitlab-ci: Add a shader-db run using v3d on drm-shim.
Eric Anholt [Wed, 10 Apr 2019 22:59:12 +0000 (15:59 -0700)]
gitlab-ci: Add a shader-db run using v3d on drm-shim.

This provides significant compiler coverage during CI at a fairly low
cost in CPU time (~17s per thread for 4 threads on
gst-gitlab-htz-runner3).

I'm leaving wget in the docker image, as once this is in master I'm
planning on having an automatic shader-db comparison between master
and the branch included in the artifacts.  I also haven't done
freedreno yet, because it has some races when run in multithreaded
mode that I'm still tracking down.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agogitlab-ci: Only keep the build logs as artifacts.
Eric Anholt [Wed, 24 Jul 2019 16:27:48 +0000 (09:27 -0700)]
gitlab-ci: Only keep the build logs as artifacts.

On a build failure, we were tarring up the whole ccache directory,
build.ninja, build products, etc.  This was over 400MB compressed on a
recent early meson-main build failure, which fd.o then has to hang on
to for 4 weeks.  The build logs are probably the interesting part, are
potentially useful regardless ("how did CI's build flags differ from
mine?"), and are <500k uncompressed on my personal meson build.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
5 years agogitlab-ci: Always set libdir to lib/
Eric Anholt [Tue, 23 Jul 2019 18:12:07 +0000 (11:12 -0700)]
gitlab-ci: Always set libdir to lib/

I introduced libdir for cross-builds so we could point at the
resulting drivers without per-arch dependencies, but I'd rather not
have to type x86_64-linux-whatever for non-cross-builds either.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agofreedreno: Add support for drm-shim.
Eric Anholt [Mon, 13 May 2019 21:03:07 +0000 (14:03 -0700)]
freedreno: Add support for drm-shim.

I'm using this for shader-db analysis on x86_64 systems.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agov3d: Introduce a DRM shim for calling out to the simulator.
Eric Anholt [Wed, 26 Sep 2018 02:15:45 +0000 (19:15 -0700)]
v3d: Introduce a DRM shim for calling out to the simulator.

The goal is to enable testing of parts of drivers without depending on any
particular kernel version or hardware being present.

Simply set LD_PRELOAD=$PREFIX/lib/libv3d_drm_shim.so in your environment,
and we'll fake a /dev/dri/renderD128 (or whatever the next available node
is) using v3dv3.  That node can then be used with the surfaceless or gbm
EGL platforms.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
5 years agoglsl: report no function instead of empty candidate list
Erik Faye-Lund [Thu, 13 Jun 2019 10:03:27 +0000 (12:03 +0200)]
glsl: report no function instead of empty candidate list

When generating the error message for a missing function error where
all available overloads were missing due to a too low GLSL version, we
used to report something like this:

---8<---
0:224(14): error: no matching function for call to
           `textureCubeLod(samplerCube, vec3, float)'; candidates are:
0:224(14): error: type mismatch
---8<---

This is a pretty confusing error message, and can throw people off when
debugging. So let's instead check if any overload is available before we
decide what to print. This allow us to report something like this
instead:

---8<---
0:224(14): error: no function with name 'textureCubeLod'
0:224(14): error: type mismatch
---8<---

This is arguably easier to understand for programmers, and doesn't send
you on a wild goose chase to figure out what argument is wrong just
because you stopped reading the message prematurely. I'm of course
referring to a friend, not me. For sure. I would never do that.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoradv: Set correct metadata size for GFX9+.
Bas Nieuwenhuizen [Thu, 25 Jul 2019 14:53:34 +0000 (16:53 +0200)]
radv: Set correct metadata size for GFX9+.

Without correct size, radeonsi assumes the metadata is incorrect,
which can and will cause issues.

Since the metadata is really incorrect without the size, let us
fix that.

Fixes: e43cc3e3afc "radv/gfx9: handle GFX9 opaque metadata"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoanv: report HOST_ALLOCATION as supported for images
Arcady Goldmints-Orlov [Tue, 23 Jul 2019 19:36:48 +0000 (14:36 -0500)]
anv: report HOST_ALLOCATION as supported for images

Report VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT as
supported for images. It was being shown supported for buffers, but not
images.

Fixes: 69cc6272fbc1 ("anv: Implement VK_EXT_external_memory_host")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoradv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB
Samuel Pitoiset [Wed, 24 Jul 2019 14:50:38 +0000 (16:50 +0200)]
radv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB

This fixes
dEQP-VK.rasterization.primitive_size.points.point_size_*

This also fixes some black squares with the Sascha SSAO demo.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: use L2 for DMA copy/fill operations
Samuel Pitoiset [Thu, 25 Jul 2019 13:38:51 +0000 (15:38 +0200)]
radv/gfx10: use L2 for DMA copy/fill operations

It's coherent and faster. GFX7-GFX9 should also support this but
for now only uses L2 for GFX10 because it's untested on previous gens.

This fixes dEQP-VK.memory.pipeline_barrier.transfer_*

This also fixes some missing geometry in Dawn Of War III because
VBOs weren't updated correctly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agopan/midgard: Optimize varying projection
Alyssa Rosenzweig [Wed, 24 Jul 2019 22:37:24 +0000 (15:37 -0700)]
pan/midgard: Optimize varying projection

We add a new opt pass fusing perspective projection with varyings. Minor
win..? We don't combine non-varying projections, since if we're too
agressive, the extra load/store traffic will hurt us so it's not really
a win in practice.

total instructions in shared programs: 3915 -> 3913 (-0.05%)
instructions in affected programs: 76 -> 74 (-2.63%)
helped: 1
HURT: 0

total bundles in shared programs: 2520 -> 2519 (-0.04%)
bundles in affected programs: 46 -> 45 (-2.17%)
helped: 1
HURT: 0

total quadwords in shared programs: 4027 -> 4025 (-0.05%)
quadwords in affected programs: 80 -> 78 (-2.50%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add perspective projection recombine pass
Alyssa Rosenzweig [Wed, 24 Jul 2019 03:02:56 +0000 (20:02 -0700)]
pan/midgard: Add perspective projection recombine pass

We don't use it yet, since it's actually a shader-db regression. This is
primarily helpful as an intermediate step for attaching projection to
varyings.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Force perspective ops to use vec4
Alyssa Rosenzweig [Wed, 24 Jul 2019 22:36:46 +0000 (15:36 -0700)]
pan/midgard: Force perspective ops to use vec4

It doesn't make sense to use them with anything less.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add R27-only op handling
Alyssa Rosenzweig [Wed, 24 Jul 2019 21:58:46 +0000 (14:58 -0700)]
pan/midgard: Add R27-only op handling

We use a special conflicting register class.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add OP_R27_ONLY helper
Alyssa Rosenzweig [Wed, 24 Jul 2019 21:52:57 +0000 (14:52 -0700)]
pan/midgard: Add OP_R27_ONLY helper

While load/store ops like st_vary can take an argument in either
r26/r27, ops like those for perspective projection must specifically
take their argument in r27.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Enable RA for st_vary
Alyssa Rosenzweig [Wed, 24 Jul 2019 19:54:59 +0000 (12:54 -0700)]
pan/midgard: Enable RA for st_vary

Now that all the piping is in place to do so without regressions, we
flip on automatic register allocation for varyings. Hooray!

total instructions in shared programs: 4025 -> 3915 (-2.73%)
instructions in affected programs: 1667 -> 1557 (-6.60%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 1.77 x̃: 2
helped stats (rel) min: 0.93% max: 20.00% x̄: 10.80% x̃: 10.64%
95% mean confidence interval for instructions value: -1.89 -1.66
95% mean confidence interval for instructions %-change: -12.50% -9.11%
Instructions are helped.

total bundles in shared programs: 2683 -> 2520 (-6.08%)
bundles in affected programs: 1066 -> 903 (-15.29%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 2.63 x̃: 3
helped stats (rel) min: 2.94% max: 42.86% x̄: 23.85% x̃: 22.50%
95% mean confidence interval for bundles value: -2.83 -2.43
95% mean confidence interval for bundles %-change: -27.73% -19.97%
Bundles are helped.

total quadwords in shared programs: 4192 -> 4027 (-3.94%)
quadwords in affected programs: 1584 -> 1419 (-10.42%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.66 x̃: 3
helped stats (rel) min: 1.85% max: 30.00% x̄: 16.49% x̃: 16.52%
95% mean confidence interval for quadwords value: -2.87 -2.46
95% mean confidence interval for quadwords %-change: -19.14% -13.84%
Quadwords are helped.

total registers in shared programs: 433 -> 411 (-5.08%)
registers in affected programs: 67 -> 45 (-32.84%)
helped: 23
HURT: 1
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 25.00% max: 50.00% x̄: 41.30% x̃: 50.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 14.29% max: 14.29% x̄: 14.29% x̃: 14.29%
95% mean confidence interval for registers value: -1.09 -0.74
95% mean confidence interval for registers %-change: -45.45% -32.52%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Remove check for `class`
Alyssa Rosenzweig [Wed, 24 Jul 2019 21:10:12 +0000 (14:10 -0700)]
pan/midgard: Remove check for `class`

Fixes classes defaulting to vec4 in some cases.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Move uniforms to special registers
Alyssa Rosenzweig [Wed, 24 Jul 2019 20:29:36 +0000 (13:29 -0700)]
pan/midgard: Move uniforms to special registers

The load/store pipes can't take a uniform register in, so an explicit
move is necessary here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Emit st_vary registers in install_registers
Alyssa Rosenzweig [Wed, 24 Jul 2019 19:53:58 +0000 (12:53 -0700)]
pan/midgard: Emit st_vary registers in install_registers

Now that we have its registers handled normally like the rest of the IR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add mir_lower_special_reads helper
Alyssa Rosenzweig [Wed, 24 Jul 2019 19:53:08 +0000 (12:53 -0700)]
pan/midgard: Add mir_lower_special_reads helper

Given the constraints on special registers, we add a helper for lowering
these by inserting moves (copies) where needed to satsify the ISA
constraints.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add emit_explicit_constant helper
Alyssa Rosenzweig [Wed, 24 Jul 2019 19:52:27 +0000 (12:52 -0700)]
pan/midgard: Add emit_explicit_constant helper

We generalize the constant emission helper used in fragment writeout as
we'll also need it for vertex outputs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add mir_rewrite_index_src_tag
Alyssa Rosenzweig [Wed, 24 Jul 2019 19:51:51 +0000 (12:51 -0700)]
pan/midgard: Add mir_rewrite_index_src_tag

Specialized version of a rewrite that only rewrites a certain type of
instruction.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add class check
Alyssa Rosenzweig [Wed, 24 Jul 2019 18:33:26 +0000 (11:33 -0700)]
pan/midgard: Add class check

This ensures the rules for accessing special register classes are
satisfied. This is asserted as a prepass should have lowered offending
uses to something satisfying these rules. Special register classes are
*not* work registers and cannot be used for RMW operations; they are
essentially 1-way pipes straight into/from fixed-function logic in the
shader cores.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Implement class spilling
Alyssa Rosenzweig [Wed, 24 Jul 2019 18:16:48 +0000 (11:16 -0700)]
pan/midgard: Implement class spilling

We reuse the same register spilling mechanism as for work->memory to
spill special->work registers, e.g. to allow writing out more than 2
vec4 varyings (without better scheduling anyway).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Extend liveness analysis to st_vary
Alyssa Rosenzweig [Wed, 24 Jul 2019 18:16:15 +0000 (11:16 -0700)]
pan/midgard: Extend liveness analysis to st_vary

These can consume sources now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Implement load/store register classing
Alyssa Rosenzweig [Wed, 24 Jul 2019 17:03:24 +0000 (10:03 -0700)]
pan/midgard: Implement load/store register classing

This does not yet support special->work spilling, nor does it support
multiclass breakup. These corner cases will be handled in succeeding
commits.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Allocate special register classes
Alyssa Rosenzweig [Wed, 24 Jul 2019 15:31:46 +0000 (08:31 -0700)]
pan/midgard: Allocate special register classes

We'll want to also handle load/store and texture registers in our RA
loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Move copy propagation into its own file
Alyssa Rosenzweig [Wed, 24 Jul 2019 14:23:19 +0000 (07:23 -0700)]
pan/midgard: Move copy propagation into its own file

We also expose some utilities it uses as general MIR helpers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>