mesa.git
Ben Skeggs [Sat, 6 Jun 2020 23:52:41 +0000 (09:52 +1000)]
nvc0: use NVIDIA headers for GP100- compute QMD

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:39 +0000 (09:52 +1000)]
nvc0: use NVIDIA headers for GK104->GM2xx compute QMD

v2:
- add header debug_printf(), and indent the output
v3:
- rename one of the helper macros

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:37 +0000 (09:52 +1000)]
nvir/gv100: enable support for tu1xx

SM75 has a bunch more stuff, but is otherwise backwards-compatible
with SM70 SASS.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:35 +0000 (09:52 +1000)]
nvir/gv100: initial support

v2:
- add TargetGV100::isBarrierRequired() for OP_BREV
- use NV50_IR_SUBOP_LOP3_LUT() convenience macro where it makes sense
- separated out nir_lower_idiv into its own commit
- make use of the shared function to generate compiler options
- disable lower_fpow, nir's lowering is broken
v3:
- use replaceCvt() instead of custom NEG/ABS/SAT lowering
v4:
- remove WAR from peephole, not needed now we're using replaceCvt()

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:33 +0000 (09:52 +1000)]
nvir/nir/gm107: switch off lower_extract_word

We can use PRMT here.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:31 +0000 (09:52 +1000)]
nvir/nir/gm107: switch off lower_extract_byte

We can use PRMT here.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:29 +0000 (09:52 +1000)]
nvir/nir/gm107: turn on nir_lower_extract64

About to disable lowering for extract_byte/word in favour of a better
local implementation, but still need lowering for 64-bit versions.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:27 +0000 (09:52 +1000)]
nvir/nir/gm107: split nir shader compiler options from gf100

We can enable some more things here vs earlier GPUs.

v2:
- make use of the shared function to generate compiler options

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:25 +0000 (09:52 +1000)]
nvir/gm107: separate out header for sched data calculator

SM70 code emitter will want to reuse this.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:23 +0000 (09:52 +1000)]
nvir/gm107: replace SHR+AND+AND with PRMT+PRMT in PFETCH lowering

This is more SM70-friendly.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:21 +0000 (09:52 +1000)]
nvir/gm107: implement OP_PERMT

PFETCH lowering will be changed to use this as it's more SM70-friendly,
and this will also allow us to implement extract_byte/word opcodes.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
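
For readers unfamiliar with byte permutes, here is a minimal C model of
the idea behind PRMT. It is a sketch, not the nvir implementation. It
assumes the generic semantics of CUDA's `__byte_perm`, where each
selector nibble picks one of the eight bytes of two 32-bit sources. The
name `byte_perm` and its argument order are hypothetical, and the extra
PRMT modes and sign replication are not modeled.

```c
#include <stdint.h>

/* Build a 32-bit result from the 8 bytes of {y, x}: x supplies bytes
 * 0-3 and y supplies bytes 4-7. The low 3 bits of each selector nibble
 * pick the source byte for the corresponding result byte. */
static uint32_t byte_perm(uint32_t x, uint32_t y, uint32_t sel)
{
    uint64_t pool = ((uint64_t)y << 32) | x;
    uint32_t result = 0;

    for (unsigned i = 0; i < 4; i++) {
        unsigned idx = (sel >> (4 * i)) & 0x7;
        uint32_t byte = (uint32_t)(pool >> (8 * idx)) & 0xff;
        result |= byte << (8 * i);
    }
    return result;
}
```

With the second source set to zero, selector 0x4441 returns byte 1 of
the first source zero-extended, which is the shape of an extract_byte;
selector 0x4432 does the same for the upper 16-bit word.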

Ben Skeggs [Sun, 7 Jun 2020 20:23:50 +0000 (06:23 +1000)]
nvir/nir: use nir_lower_idiv

NIR provides a common implementation of this so we don't need to use a
hand-written built-in library.

v2:
- use idiv_precise instead

Especially important on SM70 where we don't have an assembler.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:17 +0000 (09:52 +1000)]
nvir/nir: nir expects the shift amount to wrap, rather than clamp

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
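
The difference between wrapping and clamping a shift amount can be
shown with a small C model for 32-bit values. This is only a sketch:
`shl_wrap` and `shl_clamp` are hypothetical names, not nvir functions.

```c
#include <stdint.h>

/* Wrapping: only the low 5 bits of the shift amount are used. */
static uint32_t shl_wrap(uint32_t x, uint32_t n)
{
    return x << (n & 31u);
}

/* Clamping: a shift amount of 32 or more shifts every bit out. */
static uint32_t shl_clamp(uint32_t x, uint32_t n)
{
    return n >= 32u ? 0u : x << n;
}
```

The two agree for shift amounts below 32 and differ above: a shift by
33 gives `x << 1` when wrapping and 0 when clamping.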

Ben Skeggs [Sat, 6 Jun 2020 23:52:15 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_uror

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:13 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_urol

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
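
As a reference for what the rotate opcodes compute, a minimal C sketch
of 32-bit rotate left and right follows. The function names are
hypothetical, and the rotate amount is assumed to wrap modulo 32.

```c
#include <stdint.h>

/* Rotate left: bits shifted out of the top re-enter at the bottom. */
static uint32_t urol32(uint32_t x, uint32_t n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate right: bits shifted out of the bottom re-enter at the top. */
static uint32_t uror32(uint32_t x, uint32_t n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

Masking the complementary shift with 31 keeps a rotate by zero well
defined in C, since shifting a 32-bit value by 32 is undefined.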

Ben Skeggs [Sat, 6 Jun 2020 23:52:11 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_i16

v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:10 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_u16

v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:08 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_i8

v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:06 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_u8

v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
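
The four extract opcodes differ only in element size and in whether the
result is zero- or sign-extended. A minimal C reference model, assuming
32-bit sources and an element index counted from the least significant
end (the function names mirror the NIR opcodes but are hypothetical):

```c
#include <stdint.h>

/* Zero-extended byte `i` (0..3) of x. */
static uint32_t extract_u8(uint32_t x, unsigned i)
{
    return (x >> (8u * i)) & 0xffu;
}

/* Sign-extended byte `i` (0..3) of x. */
static int32_t extract_i8(uint32_t x, unsigned i)
{
    int32_t v = (int32_t)extract_u8(x, i);
    return (v ^ 0x80) - 0x80;
}

/* Zero-extended 16-bit word `i` (0..1) of x. */
static uint32_t extract_u16(uint32_t x, unsigned i)
{
    return (x >> (16u * i)) & 0xffffu;
}

/* Sign-extended 16-bit word `i` (0..1) of x. */
static int32_t extract_i16(uint32_t x, unsigned i)
{
    int32_t v = (int32_t)extract_u16(x, i);
    return (v ^ 0x8000) - 0x8000;
}
```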

Ben Skeggs [Sat, 6 Jun 2020 23:52:19 +0000 (09:52 +1000)]
nvir/nir: turn on lower_rotate

This isn't implemented, and won't be for GPUs that don't support SHF.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sun, 7 Jun 2020 22:43:54 +0000 (08:43 +1000)]
nvir/nir: flesh out options

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:04 +0000 (09:52 +1000)]
nvir/nir: move nir options to codegen

These seem to make more sense living with the compiler.

v2:
- use a shared function to generate the per-chipset structs
- remove nir.h include from header, not needed

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:02 +0000 (09:52 +1000)]
nvir/nir: fix fragment program output when using MRT

v2:
- use BITFIELD64_BIT()

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Karol Herbst [Fri, 15 May 2020 09:14:12 +0000 (11:14 +0200)]
nvir/nir: use component helpers instead of insn->num_components

We have nir_intrinsic_dest_components and nir_intrinsic_src_components
which handle all the corner cases.

Fixes a bunch of regressions like front_face stuff.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Mon, 8 Jun 2020 23:52:47 +0000 (09:52 +1000)]
nvir: run replaceZero() before replaceCvt()

replaceCvt() will miss some cases otherwise.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:52:00 +0000 (09:52 +1000)]
nvir: add constant folding for OP_PERMT

Important for SM70 INSBF/EXTBF lowering, as these can often be
eliminated completely.

v2:
- skip CF when subOp is set

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:51:58 +0000 (09:51 +1000)]
nvir: introduce OP_FINAL

Required to support SM70 GS.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:51:56 +0000 (09:51 +1000)]
nvir: introduce OP_SGXT

Required for SM70 EXTBF lowering.

v2:
- added constant folding

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
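
To illustrate why a sign-extend operation helps with EXTBF lowering,
here is a hedged C sketch. It assumes SGXT sign-extends the low `width`
bits of its source; the exact hardware operand encoding is not modeled,
the handling of a zero width is an assumption, and both function names
are hypothetical.

```c
#include <stdint.h>

/* Sign-extend the low `width` bits of x to 32 bits. Unsigned
 * arithmetic is used throughout so wrap-around is well defined. */
static uint32_t sgxt(uint32_t x, unsigned width)
{
    if (width == 0)
        return 0;            /* assumption: empty field yields 0 */
    if (width >= 32)
        return x;

    uint32_t sign = 1u << (width - 1);
    uint32_t low = x & ((sign << 1) - 1u);
    return (low ^ sign) - sign;
}

/* A signed bitfield extract is then a shift followed by sgxt().
 * `offset` must be in 0..31. */
static uint32_t extbf_signed(uint32_t x, unsigned offset, unsigned width)
{
    return sgxt(x >> offset, width);
}
```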

Ben Skeggs [Sat, 6 Jun 2020 23:51:55 +0000 (09:51 +1000)]
nvir: introduce OP_BMSK

This replaces the existing implementation without adding lowering for
earlier GPUs.  The existing code isn't at all correct, and it can't be
hit anyway.

Will be required to support SM70 lowering passes.

v2:
- fixup source selection

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:51:53 +0000 (09:51 +1000)]
nvir: introduce OP_SHF

We already use a hack from NVC0LegalizeSSA::handleShift() on GK110 and
newer which encodes SHF into the existing SHL/SHR opcodes, but there
are a couple of problems with it:

- LO/HI are swapped in one of the directions, which is very confusing.
- The initial SM70 code will emit this from NIR->NVIR, and using the
  existing encodings will confuse the optimisation passes.

As I want to limit the impact on other GPUs from the initial bring-up
of Volta/Turing, let's add an explicit representation of SHF in the IR.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
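
For context, a funnel shift treats two 32-bit registers as one 64-bit
value and returns 32 bits of the shifted result. The C model below is a
sketch of that idea only: it assumes wrap-style shift amounts, does not
model other SHF modes, and uses hypothetical function names with an
explicit (lo, hi) argument order.

```c
#include <stdint.h>

/* Shift {hi, lo} left by n (mod 32) and return the upper 32 bits. */
static uint32_t funnel_shl(uint32_t lo, uint32_t hi, uint32_t n)
{
    uint64_t v = ((uint64_t)hi << 32) | lo;
    return (uint32_t)((v << (n & 31u)) >> 32);
}

/* Shift {hi, lo} right by n (mod 32) and return the lower 32 bits. */
static uint32_t funnel_shr(uint32_t lo, uint32_t hi, uint32_t n)
{
    uint64_t v = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(v >> (n & 31u));
}
```

Passing the same value as both halves turns a funnel shift into a
rotate, which is why rotates are cheap on GPUs that have SHF.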

Ben Skeggs [Sat, 6 Jun 2020 23:51:51 +0000 (09:51 +1000)]
nvir: introduce OP_BREV with lowering to EXTBF_REV for current GPUs

SM70 has this instruction, but no BFE.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
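
A reference model of a 32-bit bit reversal in C, assuming that is what
OP_BREV denotes. The function name is hypothetical and the loop is
written for clarity rather than speed.

```c
#include <stdint.h>

/* Reverse the bit order of x: bit 0 becomes bit 31, and so on. */
static uint32_t brev32(uint32_t x)
{
    uint32_t r = 0;

    for (unsigned i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}
```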

Ben Skeggs [Sat, 6 Jun 2020 23:51:49 +0000 (09:51 +1000)]
nvir: introduce OP_WARPSYNC

Will be required to support SM70.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Ben Skeggs [Sat, 6 Jun 2020 23:51:45 +0000 (09:51 +1000)]
nvir: introduce OP_LOP3_LUT

Will be required to support SM70, but is also available on earlier GPUs.

v2:
- add convenience macro suggested by Karol

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
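
A three-input logic op with a lookup table evaluates an arbitrary
boolean function of three operands using an 8-bit truth table. The C
model below is a sketch under that assumption; `lop3` is a hypothetical
name, and the convenience macro mentioned above is not reproduced here.

```c
#include <stdint.h>

/* For every bit position, form a 3-bit index from (a, b, c) and look
 * up the result bit in the 8-bit truth table `lut`. */
static uint32_t lop3(uint32_t a, uint32_t b, uint32_t c, uint8_t lut)
{
    uint32_t r = 0;

    for (unsigned bit = 0; bit < 32; bit++) {
        unsigned idx = (((a >> bit) & 1u) << 2) |
                       (((b >> bit) & 1u) << 1) |
                        ((c >> bit) & 1u);
        r |= (uint32_t)((lut >> idx) & 1u) << bit;
    }
    return r;
}
```

With this index layout, the table for any expression can be derived by
applying the expression to the constants 0xF0, 0xCC and 0xAA in place
of a, b and c. For example, `a & b & c` gives 0x80 and `a ^ b ^ c`
gives 0x96.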

Ben Skeggs [Sat, 6 Jun 2020 23:51:40 +0000 (09:51 +1000)]
nvir: bump max encoding size of instructions

SM70 SASS is encoded into 16 bytes.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>

Erik Faye-Lund [Tue, 9 Jun 2020 19:25:26 +0000 (21:25 +0200)]
gallium/hud: do not specify potentially invalid depth-range

Setting the depth-scale to 1 while leaving the depth-translation at 0
means our near-plane is at -1 in OpenGL semantics, which is
out-of-range on some drivers. In particular, Zink has this limitation.

But since we'll only pass a zero z in here anyway, we might as well
multiply it by zero, and get the same result. This avoids the problem.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5408>
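
The argument can be checked with the usual viewport depth transform,
window z = NDC z * scale + translate. The helper below is a hypothetical
sketch, not code from the HUD, and assumes OpenGL's NDC z range of
[-1, 1].

```c
/* Map an NDC depth value to window depth. */
static float viewport_z(float z_ndc, float scale, float translate)
{
    return z_ndc * scale + translate;
}
```

With scale 1 and translate 0 the near plane (NDC z = -1) lands at -1,
which is the out-of-range value. With scale 0 the whole range collapses
to 0, and an input z of 0 maps to 0 either way.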

Dave Airlie [Wed, 13 May 2020 03:37:39 +0000 (13:37 +1000)]
draw: add disk caching for draw shaders

This adds the cache search/insert and compile skipping for cached
objects to the VS/GS/TES/TCS stages in draw.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Wed, 13 May 2020 03:37:19 +0000 (13:37 +1000)]
llvmpipe: hook draw disk cache up

Connect the draw callbacks into the llvmpipe code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Wed, 13 May 2020 03:36:55 +0000 (13:36 +1000)]
draw: add disk cache callbacks for draw shaders

This provides a set of hooks from the driver that draw can
use to access the disk cache for the draw shaders.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Wed, 13 May 2020 00:49:51 +0000 (10:49 +1000)]
llvmpipe/cs: add shader caching

As with fragment shaders, skip the compilation step if the shader is
already in the cache.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Wed, 13 May 2020 00:45:37 +0000 (10:45 +1000)]
llvmpipe/fs: add caching support

Serialize and check if the object is in the cache. If there is a
cached object, skip the compilation code once we've constructed
the function interface.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
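
The search-then-compile flow can be sketched with a toy in-memory cache
in C. Everything here is hypothetical and only illustrates the control
flow: the names, the fixed-size table and the string key stand in for
the real serialized key and the on-disk cache, and `toy_compile` stands
in for code generation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS   8
#define TOY_KEY_MAX 64

/* One cached "compiled object", keyed by the serialized shader key. */
struct toy_entry {
    bool used;
    char key[TOY_KEY_MAX];
    int  object;
};

static struct toy_entry toy_cache[TOY_SLOTS];
static unsigned toy_compiles; /* how many times we really compiled */

/* Stand-in for the expensive code generation step. */
static int toy_compile(const char *key)
{
    toy_compiles++;
    return (int)strlen(key);
}

/* Search the cache first; compile and insert only on a miss. */
static int toy_get_shader(const char *key)
{
    struct toy_entry *free_slot = NULL;

    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (toy_cache[i].used && strcmp(toy_cache[i].key, key) == 0)
            return toy_cache[i].object;      /* hit: skip compilation */
        if (!toy_cache[i].used && !free_slot)
            free_slot = &toy_cache[i];
    }

    int object = toy_compile(key);
    if (free_slot && strlen(key) < TOY_KEY_MAX) {
        strcpy(free_slot->key, key);
        free_slot->object = object;
        free_slot->used = true;
    }
    return object;
}
```

Requesting the same key twice compiles once; the second request is
served from the table.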

Dave Airlie [Thu, 14 May 2020 05:23:48 +0000 (15:23 +1000)]
gallivm: don't cache shaders that use fetch functions.

This needs to be reworked, but it's a bit messy as we have to store
all the fetch pointers to be added as globals later once gallivm
has been initialised further. For now just refuse to cache shaders
that hit these paths (mainly ETC1 and BPTC).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Tue, 21 Apr 2020 03:14:20 +0000 (13:14 +1000)]
llvmpipe: add infrastructure for disk cache support

This hooks up the gallium API and adds the APIs needed
for shader stages to search and add things to the cache.

It also adds cache stats debug printing.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Wed, 13 May 2020 00:43:56 +0000 (10:43 +1000)]
gallivm: add cache interface to mcjit

MCJIT uses an ObjectCache object to implement the cache. This creates
an instance of it and adds it to the MCJIT instances; it stores the
cached object for later use by the outer layers.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Fri, 15 May 2020 00:11:56 +0000 (10:11 +1000)]
gallivm: skip operations if we have a cached object.

If the object is loaded from the cache, a bunch of gallivm/llvm
interactions can be skipped.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Tue, 12 May 2020 23:30:44 +0000 (09:30 +1000)]
gallivm: add support for a cache object

This plumbs the cache object into the gallivm API, nothing uses
it yet.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Fri, 15 May 2020 00:05:55 +0000 (10:05 +1000)]
gallivm: rework debug printf hook to use global mapping.

Cached shaders require relinking, so hardcoding the pointer
can't work. This switches the printf code over to the new,
proper API.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Fri, 15 May 2020 00:03:32 +0000 (10:03 +1000)]
gallivm: rework coroutine malloc/free callouts.

When using cached shaders we have to relink the shader with
external symbols when it's loaded. However, the way gallivm currently
does function calls hardcodes the function pointer into the shader.

LLVM has a mechanism for doing this properly using global mappings;
this switches the coroutine alloc/free code to use a global mapping.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Thu, 14 May 2020 23:59:34 +0000 (09:59 +1000)]
llvmpipe/draw: drop variant number from function names.

When we use an object cache for the MCJIT we can have identical
cache entries from the same shader variant in different shaders,
but the JIT objcache uses the function name to relink things,
so it has to be consistent. Just drop the variants from the
function names.

Note the modules still have the variant info.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Thu, 21 May 2020 03:21:51 +0000 (13:21 +1000)]
llvmpipe/cs: overhaul cs variant key state.

This just realigns it with the fs state, and fixes some issues
where shaders weren't getting cached correctly.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Dave Airlie [Mon, 8 Jun 2020 02:30:27 +0000 (12:30 +1000)]
util/disk_cache: add fallback for disk_cache_get_function_identifier

Otherwise drivers need to have an ifdef on Windows; it is hopefully
easier to fix here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>

Christian Gmeiner [Tue, 9 Jun 2020 17:05:21 +0000 (19:05 +0200)]
ci: fix possible spuriously run of jobs

Need to list arm_test-base here as well, or jobs using this
template may spuriously run if the arm_test-base job fails or
is cancelled.

Suggested-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5405>

Marek Olšák [Tue, 9 Jun 2020 08:55:19 +0000 (04:55 -0400)]
ac/surface: cache DCC retile maps (v2)

This reduces overhead when resizing windows or when allocating
similar image sizes over and over again.

v2: optimize the memory footprint of the cache

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>

Marek Olšák [Tue, 9 Jun 2020 07:19:04 +0000 (03:19 -0400)]
ac/surface: add a wrapper structure to hold ADDR_HANDLE

and more things in the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>

Marek Olšák [Tue, 9 Jun 2020 07:06:22 +0000 (03:06 -0400)]
amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>

Marek Olšák [Tue, 9 Jun 2020 06:40:20 +0000 (02:40 -0400)]
amd/addrlib: don't recompute DCC info for every ComputeDccAddrFromCoord call

This decreases the DCC retile map overhead from 23% to 18%.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>

Marek Olšák [Tue, 9 Jun 2020 06:08:21 +0000 (02:08 -0400)]
ac/surface: don't recompute the DCC retile map for imported textures

The retile map is not used in this case, and the retile map computation
takes 39% of CPU time when resizing a window.

This brings it down to 23%.

The dcc_retile_use_uint16 setting has to be derived from DCC sizes.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>

Rhys Perry [Thu, 21 May 2020 19:21:37 +0000 (20:21 +0100)]
aco: fix moving sub-dword values out of a register for a fixed definition

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>

Rhys Perry [Fri, 15 May 2020 14:25:44 +0000 (15:25 +0100)]
aco: use Info::definition_size instead of definition's regclass

16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.

No shader-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>

Rhys Perry [Mon, 18 May 2020 14:37:33 +0000 (15:37 +0100)]
aco: add Info::{operand_size,definition_size}

No shader-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>

Rhys Perry [Tue, 12 May 2020 14:08:05 +0000 (15:08 +0100)]
aco: prefer 4-byte aligned definitions

shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
CodeSize: 811984 -> 806224 (-0.71%)
Instrs: 155733 -> 155939 (+0.13%); split: -0.04%, +0.18%
Cycles: 1982568 -> 1984400 (+0.09%); split: -0.06%, +0.15%
VMEM: 7187 -> 7121 (-0.92%); split: +0.86%, -1.78%
SMEM: 1770 -> 1769 (-0.06%)
VClause: 1475 -> 1476 (+0.07%)
Copies: 12406 -> 12606 (+1.61%); split: -0.46%, +2.07%
Branches: 5901 -> 5900 (-0.02%); split: -0.25%, +0.24%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
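
The percentages in the shader-db totals are plain relative changes and
can be reproduced from the before/after values. A small sketch, with a
hypothetical helper name:

```c
/* Percentage change from `before` to `after`. */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For example, CodeSize 811984 -> 806224 is a difference of -5760, or
about -0.71%, and Copies 12406 -> 12606 is +200, or about +1.61%.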

Rhys Perry [Mon, 11 May 2020 16:49:40 +0000 (17:49 +0100)]
aco: allow reading/writing upper halves/bytes when possible

Use SDWA, opsel or a different opcode to achieve this.

shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>

Rhys Perry [Tue, 9 Jun 2020 19:41:49 +0000 (20:41 +0100)]
aco: p_extract_vector in 64-bit u2f16/i2f16

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>

Rhys Perry [Wed, 3 Jun 2020 10:27:55 +0000 (11:27 +0100)]
aco: validate instructions reading/writing upper halves/bytes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>

Icecream95 [Sat, 6 Jun 2020 10:32:04 +0000 (22:32 +1200)]
panfrost: Add writes_stencil to the EARLY_Z disable list

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 03:36:22 +0000 (15:36 +1200)]
pan/mdg: Print writeout sources in mir_print_instruction

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 05:25:08 +0000 (17:25 +1200)]
pan/mdg: Add new depth store lowering

This uses the new nir_intrinsic_store_combined_output_pan intrinsic,
which can write depth, stencil and color in a single instruction. If
there are no color writes, the "depth RT" is written to.

Fixes the dEQP GLES3 depth write tests, as well as the piglit tests
fragdepth_gles2, glsl-1.10-fragdepth and, when modified to not rely
on depth/stencil reload, glsl-fs-shader-stencil-export.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 03:41:51 +0000 (15:41 +1200)]
pan/mdg: Add depth/stencil support to emit_fragment_store

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 03:39:22 +0000 (15:39 +1200)]
pan/mdg: Move search_var to earlier in midgard_compile.c

It will be needed by the new zs lowering.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 03:21:21 +0000 (15:21 +1200)]
pan/mdg: Add new depth writeout code

We schedule depth writeout to smul and stencil to vlut, so scheduling
to smul has to be disabled in these cases.

When only writing stencil, scheduling to smul is still disabled to
prevent stencil writeout from being scheduled there.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 03:08:06 +0000 (15:08 +1200)]
pan/mdg: Replace writeout booleans with a single value

A single value is easier to deal with than three separate booleans.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 02:26:49 +0000 (14:26 +1200)]
nir: Replace the zs_output_pan intrinsic with combined_output_pan

Depth and stencil writes are combined with color writes, so we need
this intrinsic which has sources for color, RT, depth and stencil.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 02:42:18 +0000 (14:42 +1200)]
pan/mdg: Remove writeout case from bytemask_of_read_components

By setting the swizzle for the fragment color, and setting qmask to ~0
for branches, the special case for writeout branches can be removed
from mir_bytemask_of_read_components_index.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

Icecream95 [Sat, 6 Jun 2020 02:59:31 +0000 (14:59 +1200)]
pan/mdg: Remove old depth writeout code

We need to be able to do color writeout at the same time as depth
writeout. The old code can't do that, so needs to be removed.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

3 years agopan/mdg: Remove old zs store lowering
Icecream95 [Sat, 6 Jun 2020 02:24:08 +0000 (14:24 +1200)]
pan/mdg: Remove old zs store lowering

It is broken when there are also color writes, and will be
replaced with a new lowering which takes that into account.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

3 years agopan/mdg: Move r1.w writeout to branch->dest
Icecream95 [Fri, 5 Jun 2020 12:24:22 +0000 (00:24 +1200)]
pan/mdg: Move r1.w writeout to branch->dest

There will need to be sources for depth and stencil writeout, so
something has to be moved to the dest of the writeout branch.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

3 years agopan/mdg: Add a macro for printing instruction source information
Icecream95 [Fri, 5 Jun 2020 12:20:52 +0000 (00:20 +1200)]
pan/mdg: Add a macro for printing instruction source information

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>

3 years agonir: Remove nir_intrinsic_output_u8_as_fp16_pan
Alyssa Rosenzweig [Mon, 25 May 2020 16:45:13 +0000 (12:45 -0400)]
nir: Remove nir_intrinsic_output_u8_as_fp16_pan

Now unused in favour of nir_intrinsic_load_output, happily.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5287>

3 years agoac/surface: fix epitch when modifying surf_pitch
Pierre-Eric Pelloux-Prayer [Wed, 3 Jun 2020 16:20:15 +0000 (18:20 +0200)]
ac/surface: fix epitch when modifying surf_pitch

This is needed; otherwise UYVY files can render incorrectly.
The align(..., 256 / surf->bpe) constraint comes from addrlib.
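
As a rough illustration of that constraint (not the actual ac/surface code),
the sketch below pads a pitch so that each row is a multiple of 256 bytes.
It assumes the pitch is counted in elements and that bpe is the number of
bytes per element; align_pot() and pad_pitch_to_256_bytes() are hypothetical
names.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (must be a power of two). */
static uint32_t align_pot(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: pad a pitch given in elements so that one row
 * occupies a multiple of 256 bytes, i.e. align(pitch, 256 / bpe). */
static uint32_t pad_pitch_to_256_bytes(uint32_t pitch_elems, uint32_t bpe)
{
    assert(bpe != 0 && 256 % bpe == 0);
    return align_pot(pitch_elems, 256 / bpe);
}
```

With 2 bytes per element the pitch is rounded up to a multiple of 128
elements, with 4 bytes per element to a multiple of 64.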

Fixes: 69aadc49331 ("radeonsi: fix surf_pitch for subsampled surface")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5314>

3 years agoac/surface: set SCANOUT if surf->is_displayable
Pierre-Eric Pelloux-Prayer [Tue, 26 May 2020 07:53:27 +0000 (09:53 +0200)]
ac/surface: set SCANOUT if surf->is_displayable

Fixes: ba10fb3f7f4 ("radeonsi: preserve the scanout flag for shared resources on gfx9 and gfx10")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5314>

3 years agozink: only report device-local memory as video-memory
Erik Faye-Lund [Tue, 9 Jun 2020 19:54:23 +0000 (21:54 +0200)]
zink: only report device-local memory as video-memory

While the definition of "video memory" isn't super clear, I think it's
pretty reasonable to assume host-memory isn't meant to be included. So
let's only count dedicated memory here.
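
A minimal sketch of the idea, not the zink code itself: sum only the heaps
flagged as device-local. struct heap_info and HEAP_DEVICE_LOCAL_BIT are
stand-ins for Vulkan's VkMemoryHeap and VK_MEMORY_HEAP_DEVICE_LOCAL_BIT so
that the example builds without the Vulkan headers, and the result is
converted to MiB purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_MEMORY_HEAP_DEVICE_LOCAL_BIT. */
#define HEAP_DEVICE_LOCAL_BIT 0x1u

/* Stand-in for VkMemoryHeap: heap size in bytes plus property flags. */
struct heap_info {
    uint64_t size;
    uint32_t flags;
};

/* Hypothetical helper: total size of device-local heaps, in MiB.
 * Host-visible-only heaps are skipped. */
static uint64_t video_memory_mib(const struct heap_info *heaps, size_t count)
{
    uint64_t bytes = 0;

    assert(heaps != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (heaps[i].flags & HEAP_DEVICE_LOCAL_BIT)
            bytes += heaps[i].size;
    }
    return bytes >> 20;
}
```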

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3107
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5409>

3 years agoac/nir: fix integer comparisons with pointers
Samuel Pitoiset [Tue, 9 Jun 2020 06:36:17 +0000 (08:36 +0200)]
ac/nir: fix integer comparisons with pointers

If we get a comparison between a pointer and an integer, LLVM
complains if the operands aren't of the same type.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3085
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5397>

3 years agoradeonsi/ngg: try GS multi-cycling mode if default mode failed
Pierre-Eric Pelloux-Prayer [Tue, 9 Jun 2020 10:24:41 +0000 (12:24 +0200)]
radeonsi/ngg: try GS multi-cycling mode if default mode failed

If gsprim_lds_size is larger than target_lds_size then gfx10_ngg_calculate_subgroup_info
will fail.

This commit adds logic to try multi-cycling mode in this case, because it
uses less memory.
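
The fallback can be pictured roughly as follows. This is only a sketch under
stated assumptions: the LDS requirement of each mode is passed in as a plain
number, and choose_gs_mode() and enum gs_mode are hypothetical names, not
radeonsi functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum gs_mode {
    GS_MODE_DEFAULT,
    GS_MODE_MULTI_CYCLING,
};

/* Hypothetical: pick the first mode whose LDS requirement fits the budget.
 * lds_default and lds_multi are the sizes each mode would need; how they
 * are computed is outside this sketch. */
static bool choose_gs_mode(uint32_t lds_default, uint32_t lds_multi,
                           uint32_t lds_budget, enum gs_mode *mode)
{
    assert(mode != NULL);

    if (lds_default <= lds_budget) {
        *mode = GS_MODE_DEFAULT;
        return true;
    }
    /* Default mode does not fit: retry with the smaller footprint. */
    if (lds_multi <= lds_budget) {
        *mode = GS_MODE_MULTI_CYCLING;
        return true;
    }
    return false; /* neither mode fits; the caller must not use *mode */
}
```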

This fixes glsl-1.50-gs-max-output when using NGG.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5401>

3 years agoradeonsi: add return value to gfx10_ngg_calculate_subgroup_info
Pierre-Eric Pelloux-Prayer [Tue, 9 Jun 2020 10:23:04 +0000 (12:23 +0200)]
radeonsi: add return value to gfx10_ngg_calculate_subgroup_info

gfx10_ngg_calculate_subgroup_info uses assert to detect invalid configuration,
but if asserts are disabled it will continue its execution.

This commit adds a boolean return value to let the caller know that something
went wrong and that the results mustn't be used.
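
The difference between the two approaches, sketched with a hypothetical
compute_layout() function (the names and the size check are assumptions,
not the radeonsi code): an assert() disappears when NDEBUG is defined, while
an explicit return value is always checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: validate a configuration and report failure to the caller
 * instead of relying only on assert(), which compiles to nothing when
 * NDEBUG is defined. */
static bool compute_layout(uint32_t needed, uint32_t budget, uint32_t *out_size)
{
    assert(out_size != NULL);

    if (needed > budget)
        return false; /* invalid configuration: leave *out_size untouched */

    *out_size = needed;
    return true;
}
```

A caller would then test the result before using the output, for example
`if (!compute_layout(needed, budget, &size)) return false;`.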

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3103
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5401>

3 years agoglsl: fix crash on glsl macro redefinition
Andrii Simiklit [Wed, 3 Jun 2020 15:59:02 +0000 (18:59 +0300)]
glsl: fix crash on glsl macro redefinition

If a shader contains two identical macro definitions, the first with trailing
spaces and the second without:
`#define A 1   `
`#define A 1`
the parser crashes.
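
For illustration only, the comparison that should treat those two
definitions as equal can be sketched on plain strings. The real preprocessor
works on token lists rather than strings, so trimmed_len() and
same_replacement() are hypothetical helpers, not glcpp functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Length of s without trailing spaces and tabs. */
static size_t trimmed_len(const char *s)
{
    size_t n;

    assert(s != NULL);
    n = strlen(s);
    while (n > 0 && (s[n - 1] == ' ' || s[n - 1] == '\t'))
        n--;
    return n;
}

/* Hypothetical: two replacement lists match if they are identical once
 * trailing whitespace is ignored. Leading whitespace still counts. */
static bool same_replacement(const char *a, const char *b)
{
    size_t la = trimmed_len(a);
    size_t lb = trimmed_len(b);

    return la == lb && memcmp(a, b, la) == 0;
}
```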

Fixes: 0346ad37741b11d640c1c4970b275c1f0c7f9e75 ("glsl: ignore trailing whitespace when define redefined")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5312>

3 years agoanv/allocator: Compare to start_offset in state_pool_free_no_vg
Jason Ekstrand [Tue, 9 Jun 2020 01:37:51 +0000 (20:37 -0500)]
anv/allocator: Compare to start_offset in state_pool_free_no_vg

In d11e4738a86ec, we started using a start_offset to allow us to
allocate pools where the base address isn't at the start of the pool.
This is useful for binding table pools which want to be relative to
surface state base address (more or less), among other things.  However,
we had a bug where, if you have a negative offset, everything returned
to the pool would end up being returned to the "back" of the pool.  This
isn't what we want for binding tables in the softpin world.  This was
causing us to never actually re-use any binding table blocks.  How this
passed CTS, I have no idea.

Closes: #3100
Fixes: d11e4738a86ec "anv/allocator: Add a start_offset to anv_state_pool"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5395>

3 years agopanfrost: Ensure we have ro before using it
Alyssa Rosenzweig [Tue, 9 Jun 2020 20:04:37 +0000 (16:04 -0400)]
panfrost: Ensure we have ro before using it

Even though the resource requested has a BIND_SCANOUT or related tag,
this does not mean that we have a render-only driver.

This can trivially happen when one requests such a resource from GBM while
using the panfrost fd (and hence panfrost_dri.so).

Forward port of !3000

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Closes: #2664
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5410>

3 years agoradv/aco: enable shaderInt8 and VK_KHR_shader_float16_int8 on GFX6-GFX7
Samuel Pitoiset [Thu, 4 Jun 2020 09:33:28 +0000 (11:33 +0200)]
radv/aco: enable shaderInt8 and VK_KHR_shader_float16_int8 on GFX6-GFX7

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoradv/aco: enable shaderInt16 on GFX6-GFX7
Samuel Pitoiset [Thu, 7 May 2020 08:54:12 +0000 (10:54 +0200)]
radv/aco: enable shaderInt16 on GFX6-GFX7

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoradv/aco: enable 8-bit/16-bit storage on GFX6-GFX7
Samuel Pitoiset [Tue, 5 May 2020 07:07:33 +0000 (09:07 +0200)]
radv/aco: enable 8-bit/16-bit storage on GFX6-GFX7

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: remove unnecessary split- and create_vector instructions for subdword loads
Daniel Schürmann [Mon, 25 May 2020 09:51:27 +0000 (11:51 +0200)]
aco: remove unnecessary split- and create_vector instructions for subdword loads

This helps GFX6/7 by removing unnecessary shuffle code.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: fix alignment of vectors with 4 elements
Samuel Pitoiset [Mon, 25 May 2020 16:33:18 +0000 (18:33 +0200)]
aco: fix alignment of vectors with 4 elements

I think this case was just missing.

This fixes a bunch of 16-bit storage related CTS failures like
dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: implement 8-bit/16-bit conversions on GFX6-GFX7
Samuel Pitoiset [Thu, 7 May 2020 08:55:28 +0000 (10:55 +0200)]
aco: implement 8-bit/16-bit conversions on GFX6-GFX7

Use v_bfe to implement small bitsize conversions because the
compiler probably optimizes this better.
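
What a bit-field extract does for such conversions can be modelled in C as
below. This is a sketch of the general operation, assumed to be comparable to
the unsigned and signed v_bfe variants; bfe_u32() and bfe_i32() are
hypothetical names and this is not ACO code.

```c
#include <assert.h>
#include <stdint.h>

/* Extract width bits starting at bit offset, zero-extended. */
static uint32_t bfe_u32(uint32_t src, unsigned offset, unsigned width)
{
    assert(width >= 1 && width <= 32 && offset <= 32 - width);
    uint32_t mask = (width < 32) ? ((1u << width) - 1u) : ~0u;
    return (src >> offset) & mask;
}

/* Extract width bits starting at bit offset, sign-extended. */
static int32_t bfe_i32(uint32_t src, unsigned offset, unsigned width)
{
    uint32_t field = bfe_u32(src, offset, width);
    uint32_t sign = 1u << (width - 1);

    /* XOR then subtract moves the field's top bit into the sign position. */
    return (int32_t)((field ^ sign) - sign);
}
```

Converting the low byte of a register to 32 bits is then bfe_u32(x, 0, 8)
for an unsigned source and bfe_i32(x, 0, 8) for a signed one.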

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: optimize packing of 16bit subdword registers on GFX6/7
Daniel Schürmann [Mon, 11 May 2020 15:42:37 +0000 (16:42 +0100)]
aco: optimize packing of 16bit subdword registers on GFX6/7

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: skip partial copies on first iteration when lowering to hw
Daniel Schürmann [Fri, 5 Jun 2020 20:21:02 +0000 (21:21 +0100)]
aco: skip partial copies on first iteration when lowering to hw

Helps some Detroit: Become Human shaders.

Totals from affected shaders: (VEGA)
Code Size: 47693912 -> 47670212 (-0.05 %) bytes
Instructions: 9183788 -> 9177863 (-0.06 %)
Copies: 910052 -> 904127 (-0.65 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: coalesce copies more aggressively when lowering to hw
Daniel Schürmann [Thu, 7 May 2020 17:15:59 +0000 (18:15 +0100)]
aco: coalesce copies more aggressively when lowering to hw

Helps some Detroit: Become Human shaders.

Totals from affected shaders: (VEGA)
Code Size: 9880420 -> 9879088 (-0.01 %) bytes
Instructions: 1918553 -> 1918220 (-0.02 %)
Copies: 177783 -> 177450 (-0.19 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7
Daniel Schürmann [Wed, 27 May 2020 17:31:33 +0000 (18:31 +0100)]
aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7

This is needed to lower some corner cases correctly,
in case the same operand occurs multiple times:
e.g. v0 = p_create_vector(v0[0:8], v0[0:8], v0[0:8], v0[0:8])

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: adjust GFX6 subdword lowering workarounds for 8bit
Daniel Schürmann [Wed, 27 May 2020 10:08:31 +0000 (11:08 +0100)]
aco: adjust GFX6 subdword lowering workarounds for 8bit

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: Workarounds subdword lowering on GFX6/7
Daniel Schürmann [Sat, 16 May 2020 16:30:21 +0000 (17:30 +0100)]
aco: Workarounds subdword lowering on GFX6/7

As there are no SDWA instructions, we need to take care not to overwrite
the upper bits of other copy_operation's operands.
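
The hazard can be illustrated in plain C: when only full 32-bit operations
are available, writing a sub-dword field means masking so that the
neighbouring bits survive. insert_bits() is a hypothetical helper, not the
lowering ACO emits.

```c
#include <assert.h>
#include <stdint.h>

/* Write the low width bits of src into dst at bit offset, leaving all
 * other bits of dst unchanged. */
static uint32_t insert_bits(uint32_t dst, uint32_t src,
                            unsigned offset, unsigned width)
{
    assert(width >= 1 && width < 32 && offset <= 32 - width);
    uint32_t mask = ((1u << width) - 1u) << offset;
    return (dst & ~mask) | ((src << offset) & mask);
}
```

A plain 32-bit move would instead replace the whole register, clobbering
whatever another copy had already placed in the other half.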

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: use full-register instructions to implement subdword packing on GFX6/7
Daniel Schürmann [Wed, 6 May 2020 10:58:02 +0000 (11:58 +0100)]
aco: use full-register instructions to implement subdword packing on GFX6/7

On GFX6/7, there are no SDWA instructions.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>

3 years agoaco: simplify statistics collection for copies
Daniel Schürmann [Fri, 5 Jun 2020 20:05:31 +0000 (21:05 +0100)]
aco: simplify statistics collection for copies

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>