mesa.git
4 years ago radeonsi: fix assertion and other failures in si_emit_graphics_shader_pointers
Marek Olšák [Tue, 14 Jan 2020 02:09:35 +0000 (21:09 -0500)]
radeonsi: fix assertion and other failures in si_emit_graphics_shader_pointers

The assertion was failing.

Fixes: 363b4027fcb - radeonsi: put up to 5 VBO descriptors into user SGPRs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years ago nir/algebraic: a & ~(a >> 31) -> imax(a, 0)
Rhys Perry [Tue, 10 Dec 2019 16:20:56 +0000 (16:20 +0000)]
nir/algebraic: a & ~(a >> 31) -> imax(a, 0)

Found in some Doom shaders

Totals from affected shaders:
SGPRS: 30056 -> 30064 (0.03 %)
VGPRS: 28024 -> 28024 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 4278648 -> 4270852 (-0.18 %) bytes
Max Waves: 1476 -> 1476 (0.00 %)
Instructions: 835287 -> 833338 (-0.23 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089>
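The rewrite above relies on a >> 31 yielding an all-ones mask for negative 32-bit values and zero otherwise. A minimal sketch in plain C (not NIR; the function names are made up for illustration, and an arithmetic right shift on negative values is assumed, which is implementation-defined in C but is what mainstream compilers do):

```c
#include <assert.h>
#include <stdint.h>

/* Original pattern: a & ~(a >> 31).
 * Assumes >> on a negative int32_t is an arithmetic shift. */
static int32_t and_not_sign(int32_t a)
{
    int32_t sign_mask = a >> 31;   /* 0 for a >= 0, all ones for a < 0 */
    return a & ~sign_mask;         /* keeps a when non-negative, else 0 */
}

/* Replacement pattern: imax(a, 0). */
static int32_t imax_zero(int32_t a)
{
    return a > 0 ? a : 0;
}

/* Asserts that both forms agree on a handful of boundary values. */
static void check_equivalence(void)
{
    static const int32_t samples[] = {
        INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX
    };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(and_not_sign(samples[i]) == imax_zero(samples[i]));
}
```

Calling check_equivalence() exercises the sign boundary; it is a spot check, not a proof over all inputs.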

4 years ago etnaviv: Fix assert when try to accumulate an invalid fd
Marco Felsch [Thu, 5 Dec 2019 16:04:11 +0000 (17:04 +0100)]
etnaviv: Fix assert when try to accumulate an invalid fd

Check if it is a valid fd before merging it to the context's fd.

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381>
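The guard described above can be sketched as follows. This is not the etnaviv code: struct fence_ctx, merge_fds_stub() and accumulate_fence_fd() are hypothetical names, and the stub only stands in for a real sync-file merge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context; in_fence_fd is -1 when no fence is pending. */
struct fence_ctx {
    int in_fence_fd;
};

/* Hypothetical stand-in for a real fence merge: this sketch simply keeps
 * the newest fd instead of creating a merged fence. */
static int merge_fds_stub(int current_fd, int new_fd)
{
    (void)current_fd;
    return new_fd;
}

/* Merge fd into the context only if it is a valid descriptor.
 * Returns true when a merge happened. */
static bool accumulate_fence_fd(struct fence_ctx *ctx, int fd)
{
    assert(ctx != NULL);
    if (fd < 0)
        return false;   /* invalid fd: leave the context untouched */
    ctx->in_fence_fd = merge_fds_stub(ctx->in_fence_fd, fd);
    return true;
}
```

Checking before the merge keeps an invalid descriptor from ever reaching the code path that asserts on it.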

4 years ago pan/midgard: Fix midgard_compile.h includes
Afonso Bordado [Tue, 31 Dec 2019 15:08:49 +0000 (15:08 +0000)]
pan/midgard: Fix midgard_compile.h includes

We now use enum mali_format, which is defined in panfrost-job.h.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>

4 years ago anv: only use VkSamplerCreateInfo::compareOp if enabled
Lionel Landwerlin [Tue, 14 Jan 2020 14:10:21 +0000 (16:10 +0200)]
anv: only use VkSamplerCreateInfo::compareOp if enabled

The spec says nothing about the validity of the compareOp field when
compareEnable is false.

v2: use vulkan enum to pick default value (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
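A rough sketch of the idea, using stand-in types rather than the real Vulkan headers: enum compare_op, struct sampler_info and effective_compare_op() are hypothetical, and the choice of NEVER as the fallback is an assumption, not something the message above states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Vulkan compare-op enum and the two
 * relevant fields of the sampler create info. */
enum compare_op {
    COMPARE_OP_NEVER = 0,
    COMPARE_OP_LESS = 1,
    COMPARE_OP_ALWAYS = 7
};

struct sampler_info {
    bool compare_enable;
    enum compare_op compare_op;  /* may be garbage if compare_enable is false */
};

/* Read compare_op only when comparison is enabled; otherwise fall back to
 * a fixed default (NEVER is assumed here). */
static enum compare_op effective_compare_op(const struct sampler_info *info)
{
    assert(info != NULL);
    return info->compare_enable ? info->compare_op : COMPARE_OP_NEVER;
}
```

With this shape, an application that leaves compareOp uninitialized while compareEnable is false cannot feed an out-of-range value into the driver's translation tables.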

4 years ago nir/sink,nir/move: move/sink nir_op_mov
Rhys Perry [Mon, 14 Oct 2019 16:15:04 +0000 (17:15 +0100)]
nir/sink,nir/move: move/sink nir_op_mov

This can uncover opportunities to move other instructions. It can also
increase register usage, but that doesn't seem to happen in practice.

This optimizes a pattern of a load_per_vertex_input followed by several
moves and then a store_output in a different block.

v2: add nir_move_copies to make it optional

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Acked-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>

4 years ago nir/sink,nir/move: move/sink load_per_vertex_input
Rhys Perry [Mon, 14 Oct 2019 16:15:37 +0000 (17:15 +0100)]
nir/sink,nir/move: move/sink load_per_vertex_input

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>

4 years ago gitlab-ci: Consolidate container and build stages for LAVA
Tomeu Vizoso [Tue, 17 Dec 2019 10:50:14 +0000 (11:50 +0100)]
gitlab-ci: Consolidate container and build stages for LAVA

Use the normal build job to also prepare the artifacts for LAVA jobs.

For that, the build container needs to also build the test suites,
kernel, ramdisk, etc.

Then the build job will place the just-built Mesa in the ramdisk and the
test job can generate a LAVA job and point to those artifacts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295>

4 years ago aco: add integer min/max to can_swap_operands
Rhys Perry [Fri, 13 Dec 2019 16:17:21 +0000 (16:17 +0000)]
aco: add integer min/max to can_swap_operands

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: improve readfirstlane after uniform LDS loads
Rhys Perry [Wed, 11 Dec 2019 16:10:26 +0000 (16:10 +0000)]
aco: improve readfirstlane after uniform LDS loads

Totals from affected shaders:
SGPRS: 976 -> 968 (-0.82 %)
VGPRS: 580 -> 584 (0.69 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 106032 -> 103076 (-2.79 %) bytes
Max Waves: 237 -> 237 (0.00 %)
Instructions: 19452 -> 18740 (-3.66 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: replace extract_vector with copies
Rhys Perry [Mon, 9 Dec 2019 21:20:10 +0000 (21:20 +0000)]
aco: replace extract_vector with copies

Helps a small number of small shaders with situations like this:
a = p_create_vector ...
b = p_extract_vector a, 3
and copy propagation can't be done

Totals from affected shaders:
SGPRS: 14304 -> 14416 (0.78 %)
VGPRS: 8716 -> 6592 (-24.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 184664 -> 176888 (-4.21 %) bytes
Max Waves: 6260 -> 6260 (0.00 %)
Instructions: 35561 -> 33617 (-5.47 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: allow input modifiers on v_cndmask_b32
Rhys Perry [Tue, 10 Dec 2019 19:03:42 +0000 (19:03 +0000)]
aco: allow input modifiers on v_cndmask_b32

Totals from affected shaders:
SGPRS: 594099 -> 594019 (-0.01 %)
VGPRS: 441016 -> 441124 (0.02 %)
Spilled SGPRs: 101 -> 101 (0.00 %)
Spilled VGPRs: 18 -> 18 (0.00 %)
Code Size: 30266652 -> 30125256 (-0.47 %) bytes
Max Waves: 67044 -> 67057 (0.02 %)
Instructions: 5753097 -> 5726607 (-0.46 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: don't move literal to reg when making an instruction VOP3 on GFX10
Rhys Perry [Wed, 4 Dec 2019 19:28:43 +0000 (19:28 +0000)]
aco: don't move literal to reg when making an instruction VOP3 on GFX10

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 163398 -> 163398 (0.00 %)
VGPRS: 143820 -> 143820 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13065744 -> 13044308 (-0.16 %) bytes
Max Waves: 18921 -> 18921 (0.00 %)
Instructions: 2514644 -> 2509285 (-0.21 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: add min(-max(), ) and max(-min(), ) optimization
Rhys Perry [Fri, 22 Nov 2019 20:32:11 +0000 (20:32 +0000)]
aco: add min(-max(), ) and max(-min(), ) optimization

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: improve clamp optimization
Rhys Perry [Fri, 22 Nov 2019 17:50:29 +0000 (17:50 +0000)]
aco: improve clamp optimization

Not sure why it checked the use count; it doesn't apply the constants.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 269409 -> 269745 (0.12 %)
VGPRS: 238120 -> 238132 (0.01 %)
Spilled SGPRs: 305 -> 305 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 22908584 -> 22904672 (-0.02 %) bytes
Max Waves: 20217 -> 20217 (0.00 %)
Instructions: 4275312 -> 4263869 (-0.27 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 155409 -> 155233 (-0.11 %)
VGPRS: 153072 -> 153072 (0.00 %)
Spilled SGPRs: 269 -> 269 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 14650824 -> 14650396 (-0.00 %) bytes
Max Waves: 9609 -> 9609 (0.00 %)
Instructions: 2762802 -> 2755517 (-0.26 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: fix clamp optimization
Rhys Perry [Fri, 22 Nov 2019 20:58:59 +0000 (20:58 +0000)]
aco: fix clamp optimization

We can't do the optimization if there are neg/abs in-between.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: improve creation of v_madmk_f32/v_madak_f32
Rhys Perry [Fri, 22 Nov 2019 15:18:38 +0000 (15:18 +0000)]
aco: improve creation of v_madmk_f32/v_madak_f32

Using needs_vop3 check was flawed because it would only combine the
literal if the first operand is the literal. If the second or third
operand is the literal, then needs_vop3 will be true and the literal will
not be combined.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 782051 -> 782051 (0.00 %)
VGPRS: 630048 -> 630048 (0.00 %)
Spilled SGPRs: 195 -> 195 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 54743740 -> 54585548 (-0.29 %) bytes
Max Waves: 67340 -> 67340 (0.00 %)
Instructions: 10182030 -> 10182030 (0.00 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 701990 -> 699590 (-0.34 %)
VGPRS: 566632 -> 566784 (0.03 %)
Spilled SGPRs: 218 -> 218 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 49173564 -> 49007856 (-0.34 %) bytes
Max Waves: 59650 -> 59612 (-0.06 %)
Instructions: 9315135 -> 9293330 (-0.23 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: take advantage of GFX10's constant bus limit and VOP3 literals
Rhys Perry [Wed, 20 Nov 2019 16:42:17 +0000 (16:42 +0000)]
aco: take advantage of GFX10's constant bus limit and VOP3 literals

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 2397159 -> 2392494 (-0.19 %)
VGPRS: 1756036 -> 1753920 (-0.12 %)
Spilled SGPRs: 461 -> 470 (1.95 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 110287304 -> 109946304 (-0.31 %) bytes
Max Waves: 318341 -> 318475 (0.04 %)
Instructions: 21019327 -> 20533618 (-2.31 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
Max Waves: 0 -> 0 (0.00 %)
Instructions: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: allow an extra SGPR with multiple uses to be applied to VOP3
Rhys Perry [Fri, 22 Nov 2019 13:47:54 +0000 (13:47 +0000)]
aco: allow an extra SGPR with multiple uses to be applied to VOP3

This is in a separate patch from the apply_sgprs() rewrite so that the
rewrite can be more easily tested.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 3056 -> 3056 (0.00 %)
VGPRS: 1632 -> 1632 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156468 -> 156304 (-0.10 %) bytes
Max Waves: 288 -> 288 (0.00 %)
Instructions: 29510 -> 29469 (-0.14 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 1616 -> 1616 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156132 -> 155968 (-0.11 %) bytes
Max Waves: 289 -> 289 (0.00 %)
Instructions: 29426 -> 29385 (-0.14 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: allow applying two sgprs to an instruction
Rhys Perry [Fri, 22 Nov 2019 13:47:08 +0000 (13:47 +0000)]
aco: allow applying two sgprs to an instruction

We could create VALU instructions which read two sgprs, but only if isel
created an instruction which already read one of them.

This change is in a separate patch from the apply_sgprs() rewrite so that
we can test whether the rewrite itself affected anything.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1756 -> 1708 (-2.73 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 312 -> 300 (-3.85 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1784 -> 1736 (-2.69 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 319 -> 307 (-3.76 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: follow through temporary when merging tests into constant comparisons
Rhys Perry [Fri, 22 Nov 2019 14:17:27 +0000 (14:17 +0000)]
aco: follow through temporary when merging tests into constant comparisons

This can happen with v_mov_b32(s_mov_b32(literal))

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77488 -> 76928 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14426 -> 14332 (-0.65 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77512 -> 76952 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14432 -> 14338 (-0.65 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: be more careful with literals in combine_salu_{n2,lshl_add}
Rhys Perry [Fri, 22 Nov 2019 14:34:24 +0000 (14:34 +0000)]
aco: be more careful with literals in combine_salu_{n2,lshl_add}

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: add check_vop3_operands()
Rhys Perry [Fri, 22 Nov 2019 14:50:41 +0000 (14:50 +0000)]
aco: add check_vop3_operands()

This will be useful when taking advantage of GFX10 features.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: rewrite apply_sgprs()
Rhys Perry [Fri, 22 Nov 2019 14:55:25 +0000 (14:55 +0000)]
aco: rewrite apply_sgprs()

This will make it easier to apply two different sgprs (for GFX10) or apply
the same sgpr twice (just remove the break).

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: rewrite literal combining
Rhys Perry [Fri, 22 Nov 2019 13:43:39 +0000 (13:43 +0000)]
aco: rewrite literal combining

Should make taking advantage of GFX10's increased constant bus limit and
VOP3 literals easier.

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: improve can_use_VOP3()
Rhys Perry [Fri, 22 Nov 2019 15:00:04 +0000 (15:00 +0000)]
aco: improve can_use_VOP3()

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: combine two sgprs into a VALU if they're the same
Rhys Perry [Wed, 20 Nov 2019 16:31:43 +0000 (16:31 +0000)]
aco: combine two sgprs into a VALU if they're the same

This was supposed to be done before, but it wasn't done correctly or
everywhere.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 784680 -> 786128 (0.18 %)
VGPRS: 574012 -> 573892 (-0.02 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45477088 -> 45478172 (0.00 %) bytes
Max Waves: 81294 -> 81277 (-0.02 %)
Instructions: 8657970 -> 8622483 (-0.41 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 780664 -> 782072 (0.18 %)
VGPRS: 573880 -> 573760 (-0.02 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45445244 -> 45448340 (0.01 %) bytes
Max Waves: 81178 -> 81161 (-0.02 %)
Instructions: 8649902 -> 8614918 (-0.40 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: apply literals to split mads
Rhys Perry [Wed, 20 Nov 2019 19:09:25 +0000 (19:09 +0000)]
aco: apply literals to split mads

Removing the return is also needed to apply literals to mads (which can be
done on GFX10).

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 368787 -> 367555 (-0.33 %)
VGPRS: 312436 -> 312448 (0.00 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26113388 -> 26098260 (-0.06 %) bytes
Max Waves: 35982 -> 35982 (0.00 %)
Instructions: 5038670 -> 5028941 (-0.19 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 369843 -> 368659 (-0.32 %)
VGPRS: 317224 -> 317196 (-0.01 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26310540 -> 26295156 (-0.06 %) bytes
Max Waves: 36324 -> 36326 (0.01 %)
Instructions: 5073957 -> 5064164 (-0.19 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago aco: update IR validator
Rhys Perry [Mon, 25 Nov 2019 16:12:44 +0000 (16:12 +0000)]
aco: update IR validator

GFX10 increased the constant bus limit and allowed literals on VOP3

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>

4 years ago nir/lower_gs_intrinsics: add option for per-stream counts
Rhys Perry [Tue, 15 Oct 2019 15:46:02 +0000 (16:46 +0100)]
nir/lower_gs_intrinsics: add option for per-stream counts

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>

4 years ago nir/divergence: handle load_primitive_id in GS
Rhys Perry [Mon, 14 Oct 2019 16:03:07 +0000 (17:03 +0100)]
nir/divergence: handle load_primitive_id in GS

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323>

4 years ago mesa/st: use float literals
Erik Faye-Lund [Tue, 24 Sep 2019 13:51:13 +0000 (15:51 +0200)]
mesa/st: use float literals

This removes a warning on MSVC.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
4 years ago gallium: fix a warning
Erik Faye-Lund [Mon, 23 Sep 2019 11:36:45 +0000 (13:36 +0200)]
gallium: fix a warning

On some platforms (like Win64), unsigned long is 32-bit, so the first
cast doesn't do anything, and the compiler complains about an implicit
cast to a smaller type. So let's cast to an uintptr_t instead first,
as that's large enough on all platforms.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
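As an illustration of the cast order described above (not the actual gallium code; pointer_to_key() is a hypothetical helper), going through uintptr_t first makes the later narrowing explicit on LLP64 platforms such as Win64, where unsigned long is only 32 bits wide:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Derive a 32-bit key from a pointer. The pointer is first converted to
 * uintptr_t, which is wide enough to hold it on every mainstream platform,
 * and only then narrowed on purpose. */
static uint32_t pointer_to_key(const void *ptr)
{
    assert(sizeof(uintptr_t) >= sizeof(void *));
    uintptr_t bits = (uintptr_t)ptr;   /* lossless pointer-to-integer step */
    return (uint32_t)bits;             /* explicit, intentional truncation */
}
```

Splitting the conversion into two explicit steps documents that dropping the upper bits is intended, which is what silences the implicit-truncation warning.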
4 years ago st/wgl: eliminate implicit cast warning
Erik Faye-Lund [Fri, 20 Sep 2019 14:07:47 +0000 (16:07 +0200)]
st/wgl: eliminate implicit cast warning

I get warnings on MSVC for these implicit casts. Let's use explicit
casts instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
4 years ago util: initialize float-array with float-literals
Erik Faye-Lund [Fri, 20 Sep 2019 14:04:06 +0000 (16:04 +0200)]
util: initialize float-array with float-literals

We currently initialize this float-array with double-literals. Some
compilers generate warnings for this, so let's switch these to
float-literals instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
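For illustration only (identity_row and sum_row() are made-up names, not the array touched by this change), the difference comes down to the f suffix: 1.0 is a double literal that has to be converted to float, while 1.0f already has the element type.

```c
#include <assert.h>
#include <stddef.h>

/* Float literals (f suffix) match the element type, so no implicit
 * double-to-float conversion is involved in the initializer. */
static const float identity_row[4] = { 1.0f, 0.0f, 0.0f, 0.0f };

/* Sum n floats, keeping the accumulator in float as well. */
static float sum_row(const float *row, size_t n)
{
    assert(row != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += row[i];
    return total;
}
```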
4 years ago anv: Implement Gen12 workaround for non pipelined state
Lionel Landwerlin [Mon, 13 Jan 2020 15:50:36 +0000 (17:50 +0200)]
anv: Implement Gen12 workaround for non pipelined state

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>

4 years ago iris: Implement Gen12 workaround for non pipelined state
Lionel Landwerlin [Mon, 13 Jan 2020 15:50:06 +0000 (17:50 +0200)]
iris: Implement Gen12 workaround for non pipelined state

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>

4 years ago lima: add new findings to texture descriptor
Vasily Khoruzhick [Sun, 12 Jan 2020 06:23:36 +0000 (22:23 -0800)]
lima: add new findings to texture descriptor

The lower 8 bits of unknown_1_3 seem to be min_lod, the remaining 4 bits
plus miplevels are max_lod, and min_mipfilter seems to be the lod bias.
All are fixed-point values with a 4-bit integer part and a 4-bit fraction;
lod_bias also has a sign bit.

Blob also seems to do some magic with lod_bias if min filter is nearest --
it adds 0.5 to lod_bias in this case. Same story when all filters are
nearest and mipmapping is enabled, but in this case it subtracts 1/16
from lod_bias.

Fixes 134 dEQP tests in dEQP-GLES2.functional.texture.*

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
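A minimal sketch of the unsigned 4.4 fixed-point encoding mentioned above, assuming round-to-nearest and clamping to the representable range. The helper names are hypothetical and this is not the lima code; the sign bit of lod_bias is deliberately left out, since the message does not say how it is encoded.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a LOD as unsigned fixed point with 4 integer and 4 fraction bits.
 * One LSB is 1/16, so the representable range is 0 .. 15.9375. */
static uint8_t lod_to_fixed_4_4(float lod)
{
    assert(lod == lod);                   /* reject NaN */
    if (lod < 0.0f)
        lod = 0.0f;
    if (lod > 15.9375f)
        lod = 15.9375f;                   /* 255 / 16 */
    return (uint8_t)(lod * 16.0f + 0.5f); /* scale by 2^4, round to nearest */
}

/* Decode the 4.4 fixed-point value back to a float. */
static float fixed_4_4_to_lod(uint8_t bits)
{
    return (float)bits / 16.0f;
}
```

In this representation the 1/16 adjustment mentioned above is exactly one least-significant bit, and the 0.5 adjustment is eight of them.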

4 years ago intel: Use similar brand strings to the Windows drivers
Kenneth Graunke [Tue, 17 Dec 2019 08:51:20 +0000 (00:51 -0800)]
intel: Use similar brand strings to the Windows drivers

This updates our product name strings to match the ones reported
by the Windows driver, which is typically the marketing name.

We retain a platform abbreviation and GT level in parentheses so that
we're able to distinguish similar parts more easily, helping us better
understand at a glance which GPU a bug reporter has.

Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>

4 years ago iris: Simplify iris_get_renderer_string()
Kenneth Graunke [Tue, 17 Dec 2019 10:57:55 +0000 (02:57 -0800)]
iris: Simplify iris_get_renderer_string()

We use gen_get_device_name() instead of PCI ID list munging.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>

4 years ago i965: Simplify brw_get_renderer_string()
Kenneth Graunke [Tue, 17 Dec 2019 09:00:14 +0000 (01:00 -0800)]
i965: Simplify brw_get_renderer_string()

This stops using driGetRendererString() in favor of a simple snprintf().
This should have the same functionality on 64-bit systems, but drops
a "x86/MMX/SSE2" suffix on 32-bit systems.  (People shouldn't be using
the GL_RENDERER string to check for CPU features...)

We also use gen_get_device_name() instead of PCI ID list munging.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
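Roughly, the snprintf() approach looks like the sketch below. format_renderer() is a hypothetical helper and the "Mesa %s" format string is an assumption made for illustration; the device name would come from gen_get_device_name() in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build a renderer string from a device name. Returns the length the full
 * string needs, as snprintf() does, so callers can detect truncation. */
static int format_renderer(char *buf, size_t size, const char *device_name)
{
    assert(buf != NULL && size > 0 && device_name != NULL);
    return snprintf(buf, size, "Mesa %s", device_name);
}
```

Because snprintf() always NUL-terminates when size is non-zero, a name that is too long is truncated rather than overflowing the buffer.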

4 years ago Revert "nir: assert that nir_lower_tex runs after lowering derefs"
Kenneth Graunke [Tue, 14 Jan 2020 01:35:35 +0000 (17:35 -0800)]
Revert "nir: assert that nir_lower_tex runs after lowering derefs"

This reverts commit 4cda61f11e922fb5914ae73d22cc0c495abf0377 for now,
as it appears to break i965 CI (32,000+ failures).  Rob and I suspect
we need to do the equivalent of 1c6a2efa06e9bb5914f4557118930fc61065a467
on i965 - we are doing nir_lower_tex and brw_nir_lower_resources in the
wrong order and that's likely triggering this condition.  Once we fix
that, we should put this patch back.

4 years ago zink: fixup initialization of operand_mask / num_extra_operands
Erik Faye-Lund [Thu, 19 Dec 2019 09:17:14 +0000 (10:17 +0100)]
zink: fixup initialization of operand_mask / num_extra_operands

This doesn't change behavior, but makes the code a bit easier to read.
Both values are zero, but I somehow swapped the logical meaning of them
when initializing.

4 years ago mesa: Fix detection of invalidating both depth and stencil.
Eric Anholt [Mon, 13 Jan 2020 21:06:01 +0000 (13:06 -0800)]
mesa: Fix detection of invalidating both depth and stencil.

Fixes an extra 1024x1024x4 MSAA Z/S store on WebGL fishtank on cheza.

Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: db2ae5112106 ("mesa: Skip partial InvalidateFramebuffer of packed depth/stencil.")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370>

4 years ago mesa/st: lower samplers before nir_lower_tex
Rob Clark [Mon, 13 Jan 2020 19:36:19 +0000 (11:36 -0800)]
mesa/st: lower samplers before nir_lower_tex

Fixes incorrect lowering of YUV samplers when there are non-yuv
samplers.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>

4 years ago nir: assert that nir_lower_tex runs after lowering derefs
Rob Clark [Mon, 13 Jan 2020 19:34:53 +0000 (11:34 -0800)]
nir: assert that nir_lower_tex runs after lowering derefs

It isn't going to do the right thing, because texture_index/
sampler_index defaults to zero.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>

4 years ago i965: support EXT_EGL_image_storage
Gurchetan Singh [Thu, 15 Aug 2019 01:09:28 +0000 (18:09 -0700)]
i965: support EXT_EGL_image_storage

i965 can support this.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years ago i965: refactor intel_image_target_texture_2d
Gurchetan Singh [Thu, 7 Nov 2019 02:02:37 +0000 (18:02 -0800)]
i965: refactor intel_image_target_texture_2d

intel_image_target_texture_tex_storage can reuse much of this
code.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years ago i965: track if image is created by a dmabuf
Gurchetan Singh [Wed, 21 Aug 2019 22:07:28 +0000 (15:07 -0700)]
i965: track if image is created by a dmabuf

Will be used by EXT_EGL_image_storage later.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years ago dri_util: add driImageFormatToSizedInternalGLFormat function
Gurchetan Singh [Thu, 7 Nov 2019 01:18:13 +0000 (17:18 -0800)]
dri_util: add driImageFormatToSizedInternalGLFormat function

This is needed to implement the EXT_EGL_image_storage spec:

"If <target> is GL_TEXTURE_2D, then the resultant texture must have a
sized internal format which is colorspace and size compatible with the
dma-buf.  If the GL is unable to determine such a format, the error
INVALID_OPERATION is generated."

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years ago glapi / teximage: implement EGLImageTargetTexStorageEXT
Gurchetan Singh [Wed, 14 Aug 2019 22:16:04 +0000 (15:16 -0700)]
glapi / teximage: implement EGLImageTargetTexStorageEXT

Check various parts of the EXT_EGL_image_storage spec, and add a
new vfunc for drivers implementing it.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years ago teximage: split out helper from EGLImageTargetTexture2DOES
Gurchetan Singh [Wed, 14 Aug 2019 01:16:41 +0000 (18:16 -0700)]
teximage: split out helper from EGLImageTargetTexture2DOES

The major differences between EXT_EGL_image_storage and
EGLImageTargetTexture2DOES are:

(1) The texture target is made immutable
(2) EXT_EGL_image_storage supports non-2D targets.

We can reuse EGLImageTargetTexture2D and FreeTextureImageBuffer
for (1) pretty easily.  For (2), let's just not support the
complicated targets.  Let's reuse aspects of the
EGLImageTargetTexture2DOES implementation.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years ago anv: Memset array properties
Jason Ekstrand [Mon, 13 Jan 2020 19:49:57 +0000 (13:49 -0600)]
anv: Memset array properties

This is probably better than possibly leaving those bytes uninitialized
even if the app will theoretically not use them.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
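The pattern can be sketched as follows, with a hypothetical properties struct instead of the real Vulkan ones (struct driver_props, PROP_NAME_LEN and fill_props() are illustrative names only):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PROP_NAME_LEN 16   /* hypothetical fixed array size */

/* Hypothetical properties struct with fixed-size array members. */
struct driver_props {
    char name[PROP_NAME_LEN];
    unsigned char uuid[8];
};

/* Zero the whole arrays first, then write the meaningful prefix, so the
 * bytes after the terminator never hold uninitialized data. */
static void fill_props(struct driver_props *p, const char *name)
{
    assert(p != NULL && name != NULL);
    memset(p->name, 0, sizeof(p->name));
    memset(p->uuid, 0, sizeof(p->uuid));
    snprintf(p->name, sizeof(p->name), "%s", name);
}
```

Even if the caller only reads up to the terminator, zeroing the tail keeps tools such as valgrind quiet when the struct is copied or compared as a whole.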

4 years ago anv: Don't over-advertise descriptor indexing features
Jason Ekstrand [Mon, 13 Jan 2020 18:55:41 +0000 (12:55 -0600)]
anv: Don't over-advertise descriptor indexing features

We should only advertise sub-features if we advertise the extension.

Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>

4 years ago intel/blorp: Fill out all the dwords of MI_ATOMIC
Jason Ekstrand [Fri, 10 Jan 2020 21:30:02 +0000 (15:30 -0600)]
intel/blorp: Fill out all the dwords of MI_ATOMIC

This makes us valgrind clean again.

Fixes: 9175c7058efb "intel/blorp: Make blorp update the clear color..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366>

4 years agogitlab-ci: Upgrade kernel for LAVA jobs to v5.5-rc5
Tomeu Vizoso [Mon, 13 Jan 2020 10:47:58 +0000 (11:47 +0100)]
gitlab-ci: Upgrade kernel for LAVA jobs to v5.5-rc5

Some fixes got in that should prevent hangs in lima jobs.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363>

4 years agoaco: fix unconditional demote_to_helper
Daniel Schürmann [Fri, 10 Jan 2020 16:19:40 +0000 (17:19 +0100)]
aco: fix unconditional demote_to_helper

This patch fixes an out-of-bounds access on p_exit_early
and binds the exec register to the correct operand.

Fixes: 2ea9e59e8d976ec77800d2a20645087b96d1e241 ('aco: move s_andn2_b64 instructions out of the p_discard_if')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>

4 years agoradeonsi: don't enable VBOs in user SGPRs if compute-based culling can be used
Marek Olšák [Thu, 9 Jan 2020 21:41:13 +0000 (16:41 -0500)]
radeonsi: don't enable VBOs in user SGPRs if compute-based culling can be used

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi: put up to 5 VBO descriptors into user SGPRs
Marek Olšák [Tue, 7 Jan 2020 23:23:53 +0000 (18:23 -0500)]
radeonsi: put up to 5 VBO descriptors into user SGPRs

gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs

We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.

Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
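
The per-generation limits listed above can be written as a small helper. This is a sketch under assumptions: the generation is passed as a plain number (6 to 10), both function names are hypothetical, and the remainder is assumed to still be fetched from the descriptor list in memory.

```c
#include <assert.h>

/* Hypothetical encoding: 6..10 stand for gfx6..gfx10. */
static unsigned
num_vbos_in_user_sgprs(unsigned gfx_level)
{
   assert(gfx_level >= 6 && gfx_level <= 10);
   return gfx_level >= 9 ? 5 : 1;
}

/* Assumption: descriptors beyond the user-SGPR budget are still read
 * from the descriptor list in memory. */
static unsigned
num_vbos_from_memory(unsigned gfx_level, unsigned num_vertex_elements)
{
   unsigned in_sgprs = num_vbos_in_user_sgprs(gfx_level);
   return num_vertex_elements > in_sgprs ? num_vertex_elements - in_sgprs : 0;
}
```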
4 years agoac,radeonsi: increase the maximum number of shader args and return values
Marek Olšák [Wed, 8 Jan 2020 20:52:44 +0000 (15:52 -0500)]
ac,radeonsi: increase the maximum number of shader args and return values

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi: simplify si_set_vertex_buffers
Marek Olšák [Wed, 8 Jan 2020 00:45:01 +0000 (19:45 -0500)]
radeonsi: simplify si_set_vertex_buffers

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi: don't allow draw calls with uninitialized VS inputs
Marek Olšák [Tue, 7 Jan 2020 23:16:59 +0000 (18:16 -0500)]
radeonsi: don't allow draw calls with uninitialized VS inputs

These always hang, because vertex buffer descriptors are not set up.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi: add si_context::num_vertex_elements
Marek Olšák [Tue, 7 Jan 2020 23:10:38 +0000 (18:10 -0500)]
radeonsi: add si_context::num_vertex_elements

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size
Marek Olšák [Tue, 7 Jan 2020 23:06:14 +0000 (18:06 -0500)]
radeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoanv: set stencil layout for input attachments
Lionel Landwerlin [Tue, 26 Nov 2019 15:53:09 +0000 (17:53 +0200)]
anv: set stencil layout for input attachments

If an input attachment has a stencil format, we need to set this.

v2: Fish out VkAttachmentReferenceStencilLayoutKHR from
    VkAttachmentReference2KHR::pNext (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: c1c346f16673 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>

4 years agoanv: Drop an unused variable
Jason Ekstrand [Mon, 13 Jan 2020 18:20:48 +0000 (12:20 -0600)]
anv: Drop an unused variable

4 years agonir/lower_atomics_to_ssbo: Also lower barriers
Jason Ekstrand [Wed, 8 Jan 2020 01:22:13 +0000 (19:22 -0600)]
nir/lower_atomics_to_ssbo: Also lower barriers

This is more correct for a pass which is supposed to completely lower
away atomic counters.  It also lets us stop supporting atomic counter
barriers in most of the drivers.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agonir: Rename nir_intrinsic_barrier to control_barrier
Jason Ekstrand [Tue, 7 Jan 2020 20:54:26 +0000 (14:54 -0600)]
nir: Rename nir_intrinsic_barrier to control_barrier

This is a more explicit name now that we don't want it to be doing any
memory barrier stuff for us.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agointel/nir: Stop adding redundant barriers
Jason Ekstrand [Tue, 7 Jan 2020 20:58:45 +0000 (14:58 -0600)]
intel/nir: Stop adding redundant barriers

Now that both GLSL and SPIR-V are adding shared and tcs_patch barriers
(as appropriate) prior to the nir_intrinsic_barrier, we don't need to do
it ourselves in the back-end.  This reverts commit
26e950a5de01564e3b5f2148ae994454ae5205fe.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agonir/glsl: Emit memory barriers as part of barrier()
Jason Ekstrand [Tue, 7 Jan 2020 20:40:53 +0000 (14:40 -0600)]
nir/glsl: Emit memory barriers as part of barrier()

The GLSL barrier() intrinsic does an implicit shared memory barrier in
compute shaders and an implicit TCS patch output barrier in tessellation
control shaders.  We'd like NIR's barrier intrinsic to just be a control
flow barrier and not have memory implications.  To satisfy this, we need
to add an extra memory barrier in front of each nir_intrinsic_barrier.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
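
The lowering described above can be sketched as a toy emitter. The enums and `emit_glsl_barrier` are hypothetical stand-ins rather than NIR API; the only point shown is the ordering, with the stage-specific memory barrier placed in front of the control barrier.

```c
#include <assert.h>
#include <stddef.h>

enum example_stage { STAGE_COMPUTE, STAGE_TESS_CTRL };

enum example_intrinsic {
   INTRIN_MEMORY_BARRIER_SHARED,
   INTRIN_MEMORY_BARRIER_TCS_PATCH,
   INTRIN_CONTROL_BARRIER,
};

/* Writes the intrinsics for one GLSL barrier() call into out and
 * returns how many were written. */
static size_t
emit_glsl_barrier(enum example_stage stage, enum example_intrinsic out[2])
{
   assert(out != NULL);

   size_t n = 0;
   if (stage == STAGE_COMPUTE)
      out[n++] = INTRIN_MEMORY_BARRIER_SHARED;
   else if (stage == STAGE_TESS_CTRL)
      out[n++] = INTRIN_MEMORY_BARRIER_TCS_PATCH;

   /* The control barrier itself carries no memory semantics. */
   out[n++] = INTRIN_CONTROL_BARRIER;
   return n;
}
```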

4 years agospirv: Add output memory semantics to OpControlBarrier in TCS
Jason Ekstrand [Tue, 7 Jan 2020 18:01:13 +0000 (12:01 -0600)]
spirv: Add output memory semantics to OpControlBarrier in TCS

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agospirv: Add a workaround for OpControlBarrier on old GLSLang
Jason Ekstrand [Tue, 7 Jan 2020 17:35:54 +0000 (11:35 -0600)]
spirv: Add a workaround for OpControlBarrier on old GLSLang

As per the Vulkan memory model, the proper translation of GLSL barrier()
is an OpControlBarrier with a scope of Workgroup and semantics of
Acquire, Release, and WorkgroupMemory.  Older versions of GLSLang gave
an OpControlBarrier with semantics of None so we need to patch it up on
those versions.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
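
A sketch of such a fix-up, assuming the caller already knows the module came from an affected GLSLang version. The numeric values follow the SPIR-V specification (AcquireRelease is the combined acquire and release bit), while the enum names and `fixup_control_barrier` are hypothetical, and the real condition in the SPIR-V front-end may be narrower.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by SPIR-V; the local names are illustrative. */
enum {
   EX_SCOPE_WORKGROUP = 2,
   EX_SEMANTICS_NONE = 0x0,
   EX_SEMANTICS_ACQUIRE_RELEASE = 0x8,
   EX_SEMANTICS_WORKGROUP_MEMORY = 0x100,
};

/* Patch up an OpControlBarrier that was emitted with semantics of None. */
static void
fixup_control_barrier(bool from_old_glslang, uint32_t *memory_scope,
                      uint32_t *semantics)
{
   assert(memory_scope != NULL && semantics != NULL);

   if (from_old_glslang && *semantics == EX_SEMANTICS_NONE) {
      *memory_scope = EX_SCOPE_WORKGROUP;
      *semantics = EX_SEMANTICS_ACQUIRE_RELEASE |
                   EX_SEMANTICS_WORKGROUP_MEMORY;
   }
}
```

Barriers that already carry explicit semantics are left untouched, so modules from newer GLSLang versions are unaffected.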

4 years agonir: Add a new memory_barrier_tcs_patch intrinsic
Jason Ekstrand [Tue, 7 Jan 2020 20:18:56 +0000 (14:18 -0600)]
nir: Add a new memory_barrier_tcs_patch intrinsic

Right now, it's implemented as a no-op for everyone.  For most drivers,
it's a switch case in the NIR -> whatever which just breaks.  For ir3,
they already have code to delete tessellation barriers so we just add a
case to also delete memory_barrier_tcs_patch.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agollvmpipe: No-op implement more barriers
Jason Ekstrand [Wed, 8 Jan 2020 01:21:37 +0000 (19:21 -0600)]
llvmpipe: No-op implement more barriers

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agonir: Handle barriers with more granularity in combine_stores
Jason Ekstrand [Tue, 7 Jan 2020 20:13:43 +0000 (14:13 -0600)]
nir: Handle barriers with more granularity in combine_stores

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agonir: Handle more barriers in dead_write and copy_prop
Jason Ekstrand [Tue, 7 Jan 2020 20:11:55 +0000 (14:11 -0600)]
nir: Handle more barriers in dead_write and copy_prop

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agointel/vec4: Support scoped_memory_barrier
Jason Ekstrand [Tue, 7 Jan 2020 22:14:56 +0000 (16:14 -0600)]
intel/vec4: Support scoped_memory_barrier

Fixes: 06aecb14c0476 "anv: Implement VK_KHR_vulkan_memory_model"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>

4 years agolima: Add stencil support
Andreas Baierl [Tue, 7 Jan 2020 16:06:46 +0000 (17:06 +0100)]
lima: Add stencil support

This re-enables and fixes support for the stencil buffer.

It fixes 365 stencil-related deqp tests. All tests that use INCR, INCR_WRAP,
DECR and DECR_WRAP as a stencil op still fail, but they also fail with the
blob, so we may ignore that for now.
We still have dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
failing, which is strange because it's the only one out of the
depth_stencil_clear.* set.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
4 years agolima/parser: Make rsw alpha blend parsing more readable
Andreas Baierl [Mon, 13 Jan 2020 07:58:09 +0000 (08:58 +0100)]
lima/parser: Make rsw alpha blend parsing more readable

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
4 years agopanfrost: Remove unneeded phi nodes
Boris Brezillon [Mon, 6 Jan 2020 13:31:38 +0000 (14:31 +0100)]
panfrost: Remove unneeded phi nodes

Add a pass to remove unneeded phi nodes as done in other drivers.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>

4 years agoaco: check if multiplication/clamp is live when applying output modifier
Rhys Perry [Thu, 2 Jan 2020 17:05:30 +0000 (17:05 +0000)]
aco: check if multiplication/clamp is live when applying output modifier

It's possible that a multiplication/clamp is dead code and the single use
is from a different user.

Fixes portal rendering in Path of Exile when global illumination is
enabled.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: disable add combining for ds_swizzle_b32
Rhys Perry [Mon, 16 Dec 2019 13:58:16 +0000 (13:58 +0000)]
aco: disable add combining for ds_swizzle_b32

ds_bpermute_b32/ds_permute_b32 are fine, I think

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: don't DCE atomics with return values
Rhys Perry [Mon, 16 Dec 2019 13:30:10 +0000 (13:30 +0000)]
aco: don't DCE atomics with return values

We don't create atomics with definitions if they are not used in NIR, but
our own DCE can remove the uses if an export turns out to be null.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: set exec_potentially_empty for demotes
Rhys Perry [Mon, 16 Dec 2019 11:29:08 +0000 (11:29 +0000)]
aco: set exec_potentially_empty for demotes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: better handle neg/abs of sgprs
Rhys Perry [Fri, 13 Dec 2019 16:59:54 +0000 (16:59 +0000)]
aco: better handle neg/abs of sgprs

isel/label_instruction currently doesn't create these but we should
probably check anyway.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: check usesModifiers() when identifying a neg/abs
Rhys Perry [Wed, 11 Dec 2019 19:41:22 +0000 (19:41 +0000)]
aco: check usesModifiers() when identifying a neg/abs

This was fine because a literal used to mean that it didn't use modifiers,
but now VOP3 can take a literal on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: handle omod successors with the constant in the first operand
Rhys Perry [Wed, 11 Dec 2019 15:54:18 +0000 (15:54 +0000)]
aco: handle omod successors with the constant in the first operand

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: handle VOP3 modifiers when combining a constant comparison's NaN test
Rhys Perry [Wed, 11 Dec 2019 16:57:11 +0000 (16:57 +0000)]
aco: handle VOP3 modifiers when combining a constant comparison's NaN test

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: fix uninitialized data in the binary
Rhys Perry [Tue, 24 Sep 2019 16:21:51 +0000 (17:21 +0100)]
aco: fix uninitialized data in the binary

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: fix imageSize()/textureSize() with large buffers on GFX8
Rhys Perry [Mon, 9 Dec 2019 18:00:55 +0000 (18:00 +0000)]
aco: fix imageSize()/textureSize() with large buffers on GFX8

Tested on Navi by using dEQP-VK.image.image_size.buffer.* and the GFX8
path with the size multiplied by the stride.
dEQP-VK.image.image_size.buffer.* was also run with the tests modified to
use a 96bit format.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agoaco: set vm for pos0 exports on GFX10
Rhys Perry [Mon, 9 Dec 2019 13:38:47 +0000 (13:38 +0000)]
aco: set vm for pos0 exports on GFX10

RADV's LLVM backend and radeonsi do the same thing.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>

4 years agopanfrost: Fix headers and gpu_headers memory leak
Daniel Ogorchock [Tue, 7 Jan 2020 16:07:37 +0000 (10:07 -0600)]
panfrost: Fix headers and gpu_headers memory leak

The per-batch headers/gpu_headers dynarrays need to be freed during the
batch cleanup to prevent leaking.

Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>

4 years agopanfrost: Fix panfrost_bo_access memory leak
Daniel Ogorchock [Mon, 6 Jan 2020 23:33:49 +0000 (17:33 -0600)]
panfrost: Fix panfrost_bo_access memory leak

The bo access needs to be freed prior to removing it from its hash
table. This prevents leaking them over time.

Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>
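
The ordering described above, sketched with a toy fixed-size table instead of the real hash table. All names here are hypothetical; the point is only that the table stores a pointer, so the entry has to be freed before the slot is cleared.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 8

/* Hypothetical per-BO bookkeeping record. */
struct example_access {
   int last_writer;
};

/* Toy table indexed directly by key; stands in for a hash table. */
struct example_table {
   struct example_access *slots[TABLE_SIZE];
};

static struct example_access *
table_insert(struct example_table *t, unsigned key)
{
   assert(key < TABLE_SIZE && t->slots[key] == NULL);
   t->slots[key] = calloc(1, sizeof(struct example_access));
   return t->slots[key];
}

/* Free the entry first, then drop it from the table. Clearing the slot
 * without the free() would leak the allocation. */
static void
table_remove(struct example_table *t, unsigned key)
{
   assert(key < TABLE_SIZE);
   free(t->slots[key]);
   t->slots[key] = NULL;
}
```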

4 years agoradv/gfx10: improve performance for TES using PrimID but not exporting it
Samuel Pitoiset [Wed, 8 Jan 2020 07:55:16 +0000 (08:55 +0100)]
radv/gfx10: improve performance for TES using PrimID but not exporting it

This field is for the primitive ID export to the fragment shader.
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: add support for NGG passthrough mode
Samuel Pitoiset [Thu, 9 Jan 2020 07:24:11 +0000 (08:24 +0100)]
radv/gfx10: add support for NGG passthrough mode

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: do not declare LDS for NGG if useless
Samuel Pitoiset [Wed, 8 Jan 2020 07:39:10 +0000 (08:39 +0100)]
radv/gfx10: do not declare LDS for NGG if useless

Only needed for NGG without passthrough mode or for NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: determine if a pipeline is eligible for NGG passthrough
Samuel Pitoiset [Thu, 9 Jan 2020 07:23:12 +0000 (08:23 +0100)]
radv/gfx10: determine if a pipeline is eligible for NGG passthrough

It can't be enabled for geometry shaders, for NGG streamout and
for vertex shaders that export the primitive ID. NGG passthrough
requires that LDS isn't used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
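
The eligibility conditions above amount to a simple predicate. This is a sketch with a hypothetical `example_pipeline_info` struct, not the RADV data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the pipeline properties named in the message. */
struct example_pipeline_info {
   bool has_geometry_shader;
   bool uses_ngg_streamout;
   bool vs_exports_prim_id;
};

/* Passthrough requires that LDS isn't used, which rules out all three
 * cases listed in the commit message. */
static bool
ngg_passthrough_eligible(const struct example_pipeline_info *info)
{
   assert(info != NULL);
   return !info->has_geometry_shader &&
          !info->uses_ngg_streamout &&
          !info->vs_exports_prim_id;
}
```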
4 years agoradv/gfx10: disable vertex grouping
Samuel Pitoiset [Tue, 7 Jan 2020 16:01:39 +0000 (17:01 +0100)]
radv/gfx10: disable vertex grouping

RadeonSI and AMDVLK do that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agonvc0: treat all draws without color0 broadcast as MRT
Ilia Mirkin [Sun, 12 Jan 2020 05:07:05 +0000 (00:07 -0500)]
nvc0: treat all draws without color0 broadcast as MRT

Per the semi-recently-released NVIDIA docs, when this bit is not
enabled, then the result for RT[0] will be used. So if e.g. only a
single RT is drawn to and it's not RT[2], the results will not be
visible. Fixes
GTF-GL45.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline
which was failing due to a frag shader outputting only to location=2.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
4 years agogm107/ir: avoid combining geometry shader stores at 0x60
Ilia Mirkin [Tue, 7 Jan 2020 02:54:26 +0000 (21:54 -0500)]
gm107/ir: avoid combining geometry shader stores at 0x60

This corresponds to gl_PrimitiveID and gl_Layer. When both of these are
stored in a single AST.64 or AST.128 operation, then it appears as
though the whole store fails. Fixes the recently extended
glsl-1.50-transform-feedback-builtins piglit, and also
gtf30.GL3Tests.transform_feedback.transform_feedback_builtins.

The issue was reproduced on GM107 and GP108 but not GK208 nor GK104.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>