mesa.git
4 years agoaco: skip unused channels at the start when fetching vertices
Rhys Perry [Fri, 13 Dec 2019 13:23:27 +0000 (13:23 +0000)]
aco: skip unused channels at the start when fetching vertices

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 161320 -> 161224 (-0.06 %)
VGPRS: 153968 -> 149408 (-2.96 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4331496 -> 4331308 (-0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27814 -> 28594 (2.80 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 161504 -> 161408 (-0.06 %)
VGPRS: 153836 -> 149440 (-2.86 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4327572 -> 4327604 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27837 -> 28618 (2.81 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>

4 years agoaco: rework vertex fetching a bit
Rhys Perry [Mon, 9 Dec 2019 12:18:51 +0000 (12:18 +0000)]
aco: rework vertex fetching a bit

This will make it easier to skip unused channels at the start and to split
unaligned loads on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>

4 years agoamd/common,radv: move vertex_format_table to ac_shader_util.{h,c}
Rhys Perry [Tue, 14 Jan 2020 13:01:53 +0000 (13:01 +0000)]
amd/common,radv: move vertex_format_table to ac_shader_util.{h,c}

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>

4 years agogallium/swr: fix tessellation state save/restore
Jan Zielinski [Tue, 28 Jan 2020 11:09:11 +0000 (12:09 +0100)]
gallium/swr: fix tessellation state save/restore

Tessellation state should be saved with TCS/TES state
when binding new state and restored if old state
is set again.

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3596>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3596>

4 years agolima: disable early-z if fragment shader uses discard
Vasily Khoruzhick [Sun, 26 Jan 2020 18:30:17 +0000 (10:30 -0800)]
lima: disable early-z if fragment shader uses discard

We have to disable early-z if fragment shader uses discard,
otherwise we'll get misrendering.

Reported-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3570>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3570>

4 years agolima: ppir: always create move and update ld_tex successors for all blocks
Vasily Khoruzhick [Sat, 25 Jan 2020 21:31:53 +0000 (13:31 -0800)]
lima: ppir: always create move and update ld_tex successors for all blocks

Always create a mov for ld_tex since we can't rely on
ppir_node_has_single_src_succ() if we have multiple blocks. And since
ld_tex successor can be in a different block we have to update their
ppir_src as well.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>

4 years agolima: ppir: don't delete root ld_tex nodes without successors in current block
Vasily Khoruzhick [Sat, 25 Jan 2020 19:40:37 +0000 (11:40 -0800)]
lima: ppir: don't delete root ld_tex nodes without successors in current block

We don't clone ld_tex nodes into each block anymore, so ld_tex may have
successors in another block.

Fixes: c8554f849e41 ("lima/ppir: don't clone texture loads")
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>

4 years agofreedreno/drm: fix invalid-cmdstream-size with older kernels
Rob Clark [Tue, 19 Nov 2019 17:43:22 +0000 (09:43 -0800)]
freedreno/drm: fix invalid-cmdstream-size with older kernels

A cmdstream of size zero is invalid.  But this can appear in various
places where we emit a pointer to state.  This doesn't show up with
newer kernels (newer than v5.0) which use "softpin", but on earlier
kernels can result in:

  [drm:msm_ioctl_gem_submit [msm]] *ERROR* invalid cmdstream size: 0

Since the pointer value doesn't matter in these cases, the easy solution
is just to not emit a cmds table entry in this case.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805>

4 years agoRevert "winsys/amdgpu: Re-use amdgpu_screen_winsys when possible"
Marek Olšák [Mon, 27 Jan 2020 22:40:38 +0000 (17:40 -0500)]
Revert "winsys/amdgpu: Re-use amdgpu_screen_winsys when possible"

This reverts commit b60f5cbc15a99ddd9251bce40eae7d84c3a1c373.

This fixes dmesg errors and X freezes:
[   29.543096] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer
[   29.543103] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer

4 years agoRevert "winsys/amdgpu: Close KMS handles for other DRM file descriptions"
Marek Olšák [Mon, 27 Jan 2020 22:40:32 +0000 (17:40 -0500)]
Revert "winsys/amdgpu: Close KMS handles for other DRM file descriptions"

This reverts commit 552028c013cc1d49a2b61ebe0fc3a3781a9ba826.

Required by the next reverted commit.

4 years agoanv: Insert holes for non-existant XFB varyings
Jason Ekstrand [Wed, 22 Jan 2020 20:26:24 +0000 (14:26 -0600)]
anv: Insert holes for non-existant XFB varyings

Thanks to optimizations, it's possible for varyings to get deleted but
still leave the variable there for nir_gather_xfb_info to find.  If we
get into this case, insert a hole.

Fixes: 36ee2fd61c8 "anv: Implement the basic form of..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>

4 years agointel/genxml: Make SO_DECL::"Hole Flag" a Boolean
Jason Ekstrand [Wed, 22 Jan 2020 20:25:08 +0000 (14:25 -0600)]
intel/genxml: Make SO_DECL::"Hole Flag" a Boolean

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>

4 years agointel/compiler: Clear accumulator register before EOT
Sagar Ghuge [Wed, 15 Jan 2020 00:12:31 +0000 (16:12 -0800)]
intel/compiler: Clear accumulator register before EOT

v2: (Francisco Jerez)
- Drop vec4 changes.
- Handle explicit acc0 operand and implicit one.
- Make sure instruction is SIMD16, prediction is off and default mask
  control set to true.

v3: (Francisco Jerez)
- Clear accumulator only when it's written.
- Use BRW_MASK_DISABLE instead of true.
- Use correct width for brw_acc_reg().
- Fix last_inst_offset.

v4: (Francisco Jerez)
- Don't check for last instruction for accummulator write.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3376>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3376>

4 years agopan/midgard: Remove float_bitcast
Alyssa Rosenzweig [Mon, 27 Jan 2020 18:37:36 +0000 (13:37 -0500)]
pan/midgard: Remove float_bitcast

Now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3588>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3588>

4 years agoradv: do not allow sparse resources with multi-planar formats
Samuel Pitoiset [Mon, 27 Jan 2020 14:17:25 +0000 (15:17 +0100)]
radv: do not allow sparse resources with multi-planar formats

It's unsupported.

Fixes some fails or hangs with
dEQP-VK.sparse_resources.image_sparse_binding.*

Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3581>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3581>

4 years agopanfrost/midgard: Prettify embedded constant prints
Boris Brezillon [Mon, 20 Jan 2020 17:40:43 +0000 (18:40 +0100)]
panfrost/midgard: Prettify embedded constant prints

Until now, embedded constants were printed as all 32 bits integer or
floats, but the compiler can pack constant from different types if
severa instructions with different reg_mode and native type refer to
the constant register. Let's implement something smarter so users don't
have to do a manual conversion when looking at a trace.

Note that 8-bit constants are not decoded yet, as we're not sure how
the writemask is encoded in that case.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>

4 years agopanfrost/midgard: Add a condense_writemask() helper
Boris Brezillon [Fri, 24 Jan 2020 08:22:48 +0000 (09:22 +0100)]
panfrost/midgard: Add a condense_writemask() helper

This way we can convert an 8-bit writemask (Midgard specific
representation) into the more common 1-bit/component representation.

8-bit mode is not supported yet, as we're not sure how the writemask is
encoded for this mode.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>

4 years agoaco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc
Rhys Perry [Fri, 24 Jan 2020 17:37:11 +0000 (17:37 +0000)]
aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be74090696 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>

4 years agoaco: always add sgprs to sgpr_ids when choosing literals
Rhys Perry [Thu, 23 Jan 2020 20:03:40 +0000 (20:03 +0000)]
aco: always add sgprs to sgpr_ids when choosing literals

Even if it's a literal, we should add this to sgpr_ids.
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be74090696 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>

4 years agoaco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb
Rhys Perry [Thu, 23 Jan 2020 19:34:06 +0000 (19:34 +0000)]
aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>

4 years agoaco: fix WaR check for >64-bit FLAT/GLOBAL instructions
Rhys Perry [Thu, 23 Jan 2020 19:30:29 +0000 (19:30 +0000)]
aco: fix WaR check for >64-bit FLAT/GLOBAL instructions

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 5986e0019 ('aco: improve WAR hazard workaround with >64bit stores')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>

4 years agopan/midgard: Handle tag 0x4 as texture
Alyssa Rosenzweig [Mon, 27 Jan 2020 13:34:49 +0000 (08:34 -0500)]
pan/midgard: Handle tag 0x4 as texture

Used for barriers which work as texture ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>

4 years agopan/midgard: Validate barriers use a barrier tag
Alyssa Rosenzweig [Fri, 24 Jan 2020 02:20:16 +0000 (21:20 -0500)]
pan/midgard: Validate barriers use a barrier tag

...and that non-barriers don't use a barrier tag. It's not clear what
the difference means quite yet, though.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>

4 years agopan/midgard: Disassemble barrier instructions
Alyssa Rosenzweig [Fri, 24 Jan 2020 02:17:07 +0000 (21:17 -0500)]
pan/midgard: Disassemble barrier instructions

We don't need to print all the usual texture noise; just the relevant
fields and the rest can be guarded to zero.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>

4 years agopan/midgard: Record TEXTURE_OP_BARRIER
Alyssa Rosenzweig [Fri, 24 Jan 2020 01:54:14 +0000 (20:54 -0500)]
pan/midgard: Record TEXTURE_OP_BARRIER

It's 0x0B for whatever reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>

4 years agopan/decode: Drop MFBD compute shader stuff
Alyssa Rosenzweig [Wed, 22 Jan 2020 13:51:19 +0000 (08:51 -0500)]
pan/decode: Drop MFBD compute shader stuff

This is triggering all sorts of failures in pandecode and is only mostly
spurious. Let's not overwhelm ourselves with this yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>

4 years agopanfrost: Don't copy uniforms when the size is zero
Icecream95 [Fri, 24 Jan 2020 06:45:17 +0000 (19:45 +1300)]
panfrost: Don't copy uniforms when the size is zero

This fixes a crash when using Gallium HUD with QuakeSpasm when gamma
correction shaders (a QuakeSpasm feature, not part of Mesa) are used.

Reviewd-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3549>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3549>

4 years agoradv/winsys: set IB flags prior to submit in the sysmem path
Florian Will [Mon, 27 Jan 2020 09:30:21 +0000 (10:30 +0100)]
radv/winsys: set IB flags prior to submit in the sysmem path

This fixes missing scene objects in ZUSI 3 + dxvk. Index / vertex buffer
upload using thousands of CopyBuffer commands in one huge Vulkan command
buffer, mixed with lots of render pass begin/end and draw calls, failed
for some of the buffers.

radv divides the huge command buffer into 3 IBs, and they had random
flags set because the field was uninitialized. Maybe IBs got discarded
if they had the PREAMBLE bit set.

Signed-off-by: Florian Will <florian.will@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3577>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3577>

4 years agodocs: document AMD_DEBUG variable
Pierre-Eric Pelloux-Prayer [Tue, 21 Jan 2020 17:56:03 +0000 (18:56 +0100)]
docs: document AMD_DEBUG variable

See https://gitlab.freedesktop.org/mesa/mesa/issues/2022

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>

4 years agoradeonsi: move AMD_DEBUG tests to AMD_TEST
Pierre-Eric Pelloux-Prayer [Tue, 21 Jan 2020 17:46:28 +0000 (18:46 +0100)]
radeonsi: move AMD_DEBUG tests to AMD_TEST

AMD_DEBUG env var is stored in a 64 bits int and has 64 different values.
This commit makes some space by moving the test* special values to AMD_TEST.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>

4 years agogallivm/nir: add missing break for isub.
Dave Airlie [Sun, 26 Jan 2020 21:21:17 +0000 (07:21 +1000)]
gallivm/nir: add missing break for isub.

Pointed out by coverity scan.

Fixes: 3adf74f2ef55 ("gallivm: pick integer builders for alu instructions.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3571>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3571>

4 years agoisl: add gen12 comment about CCS for linear tiling
Lionel Landwerlin [Fri, 24 Jan 2020 21:45:41 +0000 (23:45 +0200)]
isl: add gen12 comment about CCS for linear tiling

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>

4 years agoisl: drop CCS row pitch requirement for linear surfaces
Lionel Landwerlin [Fri, 24 Jan 2020 13:34:36 +0000 (15:34 +0200)]
isl: drop CCS row pitch requirement for linear surfaces

We were applying row pitch constraint of CCS surfaces to linear
surfaces. But CCS is only supported in linear tiling under some
condition (more on that in the following commit). So let's drop that
requirement for now.

Fixes a bunch of crucible assert where the byte size of a linear image
is expected to be similar to the byte size of buffer for the same
extent in the following category :

   func.miptree.r8g8b8a8-unorm.aspect-color.view-2d.*download-copy-with-draw.*

v2: Move restriction to isl_calc_tiled_min_row_pitch()

v3: Move restrinction to isl_calc_row_pitch_alignment() (Jason)

v4: Update message (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 07e16221d975 ("isl: Round up some pitches to 512B for Gen12's CCS")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>

4 years agointel: Implement Gen12 workaround for array textures of size 1
Lionel Landwerlin [Mon, 13 Jan 2020 13:11:25 +0000 (15:11 +0200)]
intel: Implement Gen12 workaround for array textures of size 1

Gen12 does not support RENDER_SURFACE_STATE::SurfaceArray = true &&
RENDER_SURFACE_STATE::Depth = 0. SurfaceArray can only be set to true
if Depth >= 1.

We workaround this limitation by adding the max(value, 1) snippet in
the shaders on the 3 components for texture array sizes.

Tested on Gen9 with the following Vulkan CTS tests :
dEQP-VK.image.image_size.2d_array.*

v2: Drop debug print (Tapani)
    Switch to GEN:BUG instead of Wa_

v3: Fix dEQP-VK.image.image_size.1d_array.* cases (Lionel)

v4: Fix dEQP-VK.glsl.texture_functions.query.texturesize.* cases
    (Missing tex_op handling) (Lionel)

v5: Missing break statement (Lionel)

v6: Fixup comment (Tapani)

v7: Fixup comment again (Tapani)

v8: Don't use sample_dim as index (Jason)
    Rename pass
    Simplify control flow

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v7)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362>

4 years agointel/isl: Allow CCS_E on more formats
Jason Ekstrand [Wed, 7 Mar 2018 00:35:47 +0000 (16:35 -0800)]
intel/isl: Allow CCS_E on more formats

Now that BLORP supports copies on everything except R11G11B10_FLOAT,
we should be able to support CCS_E those formats.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>

4 years agointel/blorp: Add support for CCS_E copies with UNORM formats
Jason Ekstrand [Wed, 7 Mar 2018 00:35:30 +0000 (16:35 -0800)]
intel/blorp: Add support for CCS_E copies with UNORM formats

Some of the smaller bit-size formats which support CCS_E don't have a
UINT representative in their compression class.  However, we should be
able to use UNORM just fine and still get bit-exact copies.  We just
have to do a conversion to/from UNORM when we bitcast.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>

4 years agolima/ppir: fix src read mask swizzling
Erico Nunes [Mon, 20 Jan 2020 00:29:40 +0000 (01:29 +0100)]
lima/ppir: fix src read mask swizzling

The src mask can't be calculated from the dest write_mask.
Instead, it must be calculated from the swizzled operators of the src.
Otherwise, liveness calculation may report incorrect live components for
non-ssa registers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>

4 years agolima/ppir: split ppir_op_undef into undef and dummy again
Erico Nunes [Tue, 21 Jan 2020 23:42:14 +0000 (00:42 +0100)]
lima/ppir: split ppir_op_undef into undef and dummy again

Those were renamed/merged some time ago but it turns out that
ppir_op_undef can't be shared.
It was being used for undefined ssa operations and for read-before-write
operations that may happen to e.g. uninitialized registers (non-ssa)
inside a loop.
We really don't want to reserve a register for the undef ssa case, but
we must reserve and allocate register for the unitialized register case
because when it happens inside a loop it may need to hold its value
across iterations.

This dummy node might be eliminated with a code refactor in ppir in case
we are able to emit the write and allocate the ppir_reg before we emit
the read. But a major refactor we need this to keep this code to avoid
apparent regressions with the new liveness analysis implementation.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>

4 years agolima/ppir: fix ssa undef emit
Erico Nunes [Tue, 21 Jan 2020 23:37:22 +0000 (00:37 +0100)]
lima/ppir: fix ssa undef emit

The ssa doesn't need to be manually added to block->comp->reg_list.
Doing so actually causes other registers to be marked as undef=true
later.

This patch alone fixes a few deqp tests that have undefs.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>

4 years agolima/ppir: handle write to dead registers in ppir
Erico Nunes [Mon, 20 Jan 2020 00:33:07 +0000 (01:33 +0100)]
lima/ppir: handle write to dead registers in ppir

nir can output writes to dead registers when expanding vec4 operations
to non-ssa registers. In that case, some components of the vec4 may be
assigned but never read. These are also not currently removed by a nir
dead code elimination pass as they are not ssa.
In order to prevent regalloc from allocating a live register for this
operation, an interference must be assigned to it during liveness
analysis.

This workaround may be removed in the future if the assignments to dead
components can be removed earlier in ppir or nir.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>

4 years agoradeonsi: fix a regression since the addition of si_shader_llvm_vs.c
Marek Olšák [Fri, 24 Jan 2020 22:12:10 +0000 (17:12 -0500)]
radeonsi: fix a regression since the addition of si_shader_llvm_vs.c

Fixes: cd5b99c541d241d - radeonsi: move VS shader code into si_shader_llvm_vs.c
Closes: #2416
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>

4 years agoradeonsi: make screen available to shader part compilation
Marek Olšák [Fri, 24 Jan 2020 21:28:54 +0000 (16:28 -0500)]
radeonsi: make screen available to shader part compilation

to fix a crash in is_multi_part_shader.

Fixes: 1a0890dcf30 - radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>

4 years agoanv: Rework CCS memory handling on TGL-LP
Jason Ekstrand [Wed, 22 Jan 2020 21:29:51 +0000 (15:29 -0600)]
anv: Rework CCS memory handling on TGL-LP

The previous way we were attempting to handle AUX tables on TGL-LP was
very GL-like.  We used the same aux table management code that's shared
with iris and we updated the table on image create/destroy.  The problem
with this is that Vulkan allows multiple VkImage objects to be bound to
the same memory location simultaneously and the app can ping-pong back
and forth between them in the same command buffer.  Because the AUX
table contains format-specific data, we cannot support this ping-pong
behavior with only CPU updates of the AUX table.

The new mechanism switches things around a bit and instead makes the aux
data part of the BO.  At BO creation time, a bit of space is appended to
the end of the BO for AUX data and the AUX table is updated in bulk for
the entire BO.  The problem here, of course, is that we can't insert the
format-specific data into the AUX table at BO create time.

Fortunately, Vulkan has a requirement that every TILING_OPTIMAL image
must be initialized prior to use by transitioning the image from
VK_IMAGE_LAYOUT_UNDEFINED to something else.  When doing the above
described ping-pong behavior, the app has to do such an initialization
transition every time it corrupts the underlying memory of the VkImage
by using it as something else.  We can hook into this initialization and
use it to update the AUX-TT entries from the command streamer.  This way
the AUX table gets its format information, apps get aliasing support,
and everyone is happy.

One side-effect of this is that we disallow CCS on shared buffers.
We'll need to fix this for modifiers on the scanout path but that's a
task for another patch.  We should be able to do it with dedicated
allocations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agoanv: Make anv_vma_alloc/free a lot dumber
Jason Ekstrand [Wed, 22 Jan 2020 22:40:13 +0000 (16:40 -0600)]
anv: Make anv_vma_alloc/free a lot dumber

All they do now is take a size, align, and flags and figure out which
heap to allocate in.  All of the actual code to deal with the BO is in
anv_allocator.c.  We want to leave anv_vma_alloc/free in anv_device.c
because it deals with API-exposed heaps so it still makes sense to have
it there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agoanv: Make AUX table invalidate a PIPE_* bit
Jason Ekstrand [Wed, 22 Jan 2020 18:39:51 +0000 (12:39 -0600)]
anv: Make AUX table invalidate a PIPE_* bit

This commit moves it in with all the other cache invalidation operations
as if it were done by PIPE_CONTROL even though it's a pair of register
writes.  This means we only have to write the GFX_AUX_TABLE_BASE_ADDR
register once at device initialization instead of every invalidate.
Invalidates are now a single LRI instead of two.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agoanv: Add another align_down helper
Jason Ekstrand [Wed, 22 Jan 2020 17:58:44 +0000 (11:58 -0600)]
anv: Add another align_down helper

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agoisl: Add a helper for calculating subimage memory ranges
Jason Ekstrand [Wed, 22 Jan 2020 17:40:00 +0000 (11:40 -0600)]
isl: Add a helper for calculating subimage memory ranges

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agoanv: Delete a redundant calculation
Jason Ekstrand [Tue, 21 Jan 2020 21:58:32 +0000 (15:58 -0600)]
anv: Delete a redundant calculation

We compute the same thing with the same variable name at the top of the
function.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agointel/aux-map: Factor out some useful helpers
Jason Ekstrand [Tue, 21 Jan 2020 20:23:41 +0000 (14:23 -0600)]
intel/aux-map: Factor out some useful helpers

This breaks add_mapping() into three pieces:

    1. get_aux_entry() adds AUX-TT pages as needed and returns the
       L1 entry index, L1 entry address, and L1 entry map.

    2. gen_aux_map_format_bits_for_isl_surf() computes the format-
       specific information that goes in the AUX-TT entry.

    3. add_mapping() is a lot dumber function that now just adds the
       requested mapping with the requested format bits.

This lets us break out some additional helpers in the API which we want
to use for more direct AUX-TT management in ANV.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agointel/aux-map: Add some #defines
Jason Ekstrand [Tue, 21 Jan 2020 20:14:20 +0000 (14:14 -0600)]
intel/aux-map: Add some #defines

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>

4 years agoradeonsi: expose shader cache stats to the HUD
Marek Olšák [Thu, 16 Jan 2020 21:50:06 +0000 (16:50 -0500)]
radeonsi: expose shader cache stats to the HUD

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>

4 years agoradeonsi: print shader cache stats with AMD_DEBUG=cache_stats
Marek Olšák [Thu, 16 Jan 2020 02:17:51 +0000 (21:17 -0500)]
radeonsi: print shader cache stats with AMD_DEBUG=cache_stats

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>

4 years agoradeonsi: restructure si_shader_cache_load_shader
Marek Olšák [Thu, 16 Jan 2020 01:49:06 +0000 (20:49 -0500)]
radeonsi: restructure si_shader_cache_load_shader

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>

4 years agoradeonsi: use the live shader cache
Marek Olšák [Sat, 30 Nov 2019 02:25:07 +0000 (21:25 -0500)]
radeonsi: use the live shader cache

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>

4 years agogallium/util: add a cache of live shaders for shader CSO deduplication
Marek Olšák [Sat, 30 Nov 2019 02:01:19 +0000 (21:01 -0500)]
gallium/util: add a cache of live shaders for shader CSO deduplication

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>

4 years agoutil/simple_mtx: add a missing include to get ASSERTED
Marek Olšák [Sat, 30 Nov 2019 02:02:39 +0000 (21:02 -0500)]
util/simple_mtx: add a missing include to get ASSERTED

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>

4 years agointel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT
Caio Marcelo de Oliveira Filho [Fri, 24 Jan 2020 18:55:28 +0000 (10:55 -0800)]
intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT

Fixes: 58907568ec5 ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>

4 years agoanv: Always initialize target_stencil_layout
Caio Marcelo de Oliveira Filho [Fri, 24 Jan 2020 19:11:20 +0000 (11:11 -0800)]
anv: Always initialize target_stencil_layout

Pass down stencil data from the subpass attachment like we do
elsewhere.  Only stencil attachments will make use of it.

Fixes warnings like

    ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’:
    ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     4656 |       att_state->current_stencil_layout = target_stencil_layout;
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>

4 years agoanv: Replace aux_surface.isl.size_B checks with aux_usage checks
Jason Ekstrand [Tue, 21 Jan 2020 23:21:47 +0000 (17:21 -0600)]
anv: Replace aux_surface.isl.size_B checks with aux_usage checks

Now that aux_usage has a unified meaning, aux_usage == NONE if and only
if aux_surface.isl.size_B > 0.  In most of these cases, the question
we're asking is "does have compression?" and not "have we allocated an
aux surface for compression?".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>

4 years agoanv: Rework the meaning of anv_image::planes[]::aux_usage
Jason Ekstrand [Tue, 21 Jan 2020 23:13:30 +0000 (17:13 -0600)]
anv: Rework the meaning of anv_image::planes[]::aux_usage

Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant
CCS_D.  This sort-of made sense before we had anv_layout_to_aux_usage
but now that we have that helper.  However, in our more modern aux
tracking model, all aux usage goes through anv_layout_to_* and we're
better off making the meaning of anv_image::planes[]::aux_usage be
AUX_USAGE_NONE if and only if there is no compression.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>

4 years agoradv: print NIR shaders after lowering FS inputs/outputs
Samuel Pitoiset [Fri, 24 Jan 2020 15:59:32 +0000 (16:59 +0100)]
radv: print NIR shaders after lowering FS inputs/outputs

This is confusing otherwise.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>

4 years agointel/isl: Add a hack for the Gen12 A0 texture buffer bug
Jason Ekstrand [Fri, 24 Jan 2020 04:21:03 +0000 (22:21 -0600)]
intel/isl: Add a hack for the Gen12 A0 texture buffer bug

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>

4 years agointel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s)
Jason Ekstrand [Fri, 24 Jan 2020 04:20:33 +0000 (22:20 -0600)]
intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>

4 years agointel/disasm: Properly disassemble indirect SENDs
Jason Ekstrand [Fri, 24 Jan 2020 03:57:03 +0000 (21:57 -0600)]
intel/disasm: Properly disassemble indirect SENDs

Instead of emitting g[a0]UD for the indirect descriptor, emit a0<0>UD.
This is more correct because there is no GRF involved.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>

4 years agointel/fs: Don't unnecessarily fall back to indirect sends on Gen12
Jason Ekstrand [Thu, 23 Jan 2020 04:54:20 +0000 (22:54 -0600)]
intel/fs: Don't unnecessarily fall back to indirect sends on Gen12

The instruction encoding for SENDS changed on Gen12 and it now supports
embedding the entire extended message descriptor in the instruction if
it's an immediate.  Stop falling back to doing an indirect SEND just
because we had something in [15:12] of ex_desc.ud.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>

4 years agoanv: Improve BTI change cache flushing
Jason Ekstrand [Thu, 23 Jan 2020 04:37:10 +0000 (22:37 -0600)]
anv: Improve BTI change cache flushing

This commit makes two changes:

 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly
    for the flush at the end of cmd_buffer_begin_subpass.

 2. Because BLORP ops such as vkCmdClearAttachments may come in the
    middle of a render pass, we have to also flag the need for a cache
    flush after the blorp op.

Fixes: 185630c6bc97 "anv/blorp: Do the gen11 BTI flush"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>

4 years agopanfrost: Fix 32-bit warning for `indices`
Alyssa Rosenzweig [Fri, 24 Jan 2020 13:26:38 +0000 (08:26 -0500)]
panfrost: Fix 32-bit warning for `indices`

../src/gallium/drivers/panfrost/pan_context.c: In function ‘panfrost_draw_vbo’:
../src/gallium/drivers/panfrost/pan_context.c:1551:70: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
                 ctx->payloads[PIPE_SHADER_FRAGMENT].prefix.indices = (u64) NULL;
                                                                      ^

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95 <ixn@keemail.me>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>

4 years agopan/decode: Remove SHORT_SLIDE indirection
Alyssa Rosenzweig [Fri, 24 Jan 2020 13:25:08 +0000 (08:25 -0500)]
pan/decode: Remove SHORT_SLIDE indirection

../src/panfrost/pandecode/decode.c: In function ‘pandecode_compute_fbd’:
../src/panfrost/pandecode/decode.c:789:35: warning: taking address of packed member of ‘struct mali_compute_fbd’ may result in an unaligned pointer value [-Waddress-of-packed-member]
  789 |         pandecode_u32_slide(num, s->unknown ## num, ARRAY_SIZE(s->unknown ## num))
      |                                  ~^~~~~~~~~
../src/panfrost/pandecode/decode.c:800:9: note: in expansion of macro ‘SHORT_SLIDE’
  800 |         SHORT_SLIDE(1);
      |         ^~~~~~~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>

4 years agopan/midgard: Remove pack_color define
Alyssa Rosenzweig [Thu, 23 Jan 2020 20:37:35 +0000 (15:37 -0500)]
pan/midgard: Remove pack_color define

Unused at the moment.

../src/panfrost/midgard/midgard_compile.c:124:29: warning: ‘m_pack_colour’ defined but not used [-Wunused-function]
  124 |  static midgard_instruction m_##name(unsigned ssa, unsigned address) { \
      |                             ^~
../src/panfrost/midgard/midgard_compile.c:145:22: note: in expansion of macro ‘M_LOAD_STORE’
  145 | #define M_LOAD(name) M_LOAD_STORE(name, false)
      |                      ^~~~~~~~~~~~
../src/panfrost/midgard/midgard_compile.c:213:1: note: in expansion of macro ‘M_LOAD’
  213 | M_LOAD(pack_colour);
      | ^~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>

4 years agopan/decode: Remove last_size
Alyssa Rosenzweig [Thu, 23 Jan 2020 20:32:58 +0000 (15:32 -0500)]
pan/decode: Remove last_size

Fixes ../src/panfrost/pandecode/decode.c: In function ‘pandecode_jc’:
../src/panfrost/pandecode/decode.c:2859:14: warning: variable ‘last_size’ set but not used [-Wunused-but-set-variable]
 2859 |         bool last_size;

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>

4 years agopanfrost: Don't use implicit mali_exception_status enum
Alyssa Rosenzweig [Thu, 23 Jan 2020 20:28:09 +0000 (15:28 -0500)]
panfrost: Don't use implicit mali_exception_status enum

Fixes ../src/panfrost/pandecode/public.h:53:33: warning: ‘enum mali_exception_access’ declared inside parameter list will not be visible outside of this definition or declaration

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>

4 years agoradv: enable ACO support for GFX6
Samuel Pitoiset [Fri, 10 Jan 2020 13:53:51 +0000 (14:53 +0100)]
radv: enable ACO support for GFX6

CTS should pass, as well as Crucible and the few number of Piglit tests.

List of game benchmarks tested:
- Dawn of War 3
- Serious Sam 2017
- Shadow of The Tomb Raider
- The Talos Principle
- Thrones of Britannia
- Total Warhammer 2
- Total War: Three Kingdoms

Note that F12017 hangs with or without ACO on GFX6 at the moment.

My whole pipelinedb (~30 games) doesn't trigger any compiler crashes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2401
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>

4 years agoaco: copy the literal offset of SMEM instructions to a temporary
Samuel Pitoiset [Wed, 22 Jan 2020 15:59:34 +0000 (16:59 +0100)]
aco: copy the literal offset of SMEM instructions to a temporary

GFX6 only supports up to 8-bit for the literal offset, so make sure
it's copied to a temporary SGPR before emitting a SMEM instruction.
The optimizer will propagate the literal offset if possible anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>

4 years agoaco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6
Samuel Pitoiset [Tue, 21 Jan 2020 15:49:22 +0000 (16:49 +0100)]
aco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6

It's required to insert 1 wait state if the dst VGPR of any v_interp_*
is followed by a read with v_readfirstlane or v_readlane to fix GPU
hangs on GFX6. Note that v_writelane_* is apparently not affected.
This hazard isn't documented anywhere but AMD confirmed it.

This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>

4 years agoaco: fix a hardware bug for MRTZ exports on GFX6
Samuel Pitoiset [Thu, 23 Jan 2020 16:51:09 +0000 (17:51 +0100)]
aco: fix a hardware bug for MRTZ exports on GFX6

GFX6 (except OLAND and HAINAN) has a bug that it only looks at
the X writemask component.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>

4 years agoturnip: Implement vkCmdCopyQueryPoolResults for occlusion queries
Brian Ho [Mon, 6 Jan 2020 22:11:50 +0000 (17:11 -0500)]
turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries

Use CP_COND_EXEC and CP_COND_WRITE to conditionally copy the results
of a query to a buffer based off the query's availability.

Fixes: #2238
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoturnip: Implement vkCmdResetQueryPool
Brian Ho [Fri, 3 Jan 2020 16:33:06 +0000 (11:33 -0500)]
turnip: Implement vkCmdResetQueryPool

Clears the available bit for each requested query on the GPU.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoturnip: Implement vkGetQueryPoolResults for occlusion queries
Brian Ho [Fri, 3 Jan 2020 00:24:29 +0000 (19:24 -0500)]
turnip: Implement vkGetQueryPoolResults for occlusion queries

Implements fetching the results of a query pool with the
VK_QUERY_RESULT_WAIT_BIT, VK_QUERY_RESULT_WITH_AVAILABILITY_BIT,
and VK_QUERY_RESULT_PARTIAL_BIT flags.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoturnip: Update query availability on render pass end
Brian Ho [Thu, 16 Jan 2020 17:15:45 +0000 (12:15 -0500)]
turnip: Update query availability on render pass end

Unlike on an immidiate-mode renderer, Turnip only renders tiles on
vkCmdEndRenderPass. As such, we need to track all queries that were
active in a given render pass and defer setting the available bit
on those queries until after all tiles have rendered.

This commit adds a draw_epilogue_cs to tu_cmd_buffer that is
executed as an IB at the end of tu_CmdEndRenderPass. We then emit
packets to this command stream that update the availability bit of a
given query in tu_CmdEndQuery.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoturnip: Implement vkCmdEndQuery for occlusion queries
Brian Ho [Thu, 2 Jan 2020 22:50:41 +0000 (17:50 -0500)]
turnip: Implement vkCmdEndQuery for occlusion queries

Mostly a translation of freedreno's implementation of glEndQuery for
GL_SAMPLES_PASSED query objects with a slight modification to set the
availability bit of the query bo (slot->available) if the query was
not ended inside a render pass.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoturnip: Implement vkCmdBeginQuery for occlusion queries
Brian Ho [Thu, 2 Jan 2020 20:15:27 +0000 (15:15 -0500)]
turnip: Implement vkCmdBeginQuery for occlusion queries

Mostly a translation of freedreno's implementation of glBeginQuery for
GL_SAMPLES_PASSED query objects with special logic for handling tiled
render passes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoturnip: Implement vkCreateQueryPool for occlusion queries
Brian Ho [Thu, 2 Jan 2020 19:46:57 +0000 (14:46 -0500)]
turnip: Implement vkCreateQueryPool for occlusion queries

General structure is inspired by anv's implementation in genX_query.c.
We define a packed struct that tracks sample count at the beginning of
the query and at the end; the result of the occlusion query is then
slot->end - slot->begin.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoturnip: Update tu_query_pool with turnip-specific fields
Brian Ho [Thu, 2 Jan 2020 19:42:14 +0000 (14:42 -0500)]
turnip: Update tu_query_pool with turnip-specific fields

tu_query_pool was forked from radv_query_pool, but we will need a
different set of fields to implement queries in turnip.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>

4 years agoanv: Allow HiZ in read-only depth layouts
Jason Ekstrand [Wed, 20 Nov 2019 00:20:57 +0000 (18:20 -0600)]
anv: Allow HiZ in read-only depth layouts

This improves the performance of Aztec Ruins by 5% on ICL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>

4 years agoanv: Add a usage parameter to anv_layout_to_aux_usage
Jason Ekstrand [Tue, 19 Nov 2019 23:51:20 +0000 (17:51 -0600)]
anv: Add a usage parameter to anv_layout_to_aux_usage

Most places we actually know the usage and can provide it.  There are
two exceptions to this:

 1. We pass 0 into get_blorp_surf_for_anv_image when we use
    ANV_IMAGE_LAYOUT_EXPLICIT_AUX because anv_layout_to_aux_usage is
    never actually called so it doesn't matter.

 2. We pass 0 into anv_layout_to_aux_usage in transition_color_buffer.
    However, the coming commits which will begin using the usage
    parameter only care about depth.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>

4 years agoanv: Use isl_aux_state for HiZ resolves
Jason Ekstrand [Mon, 6 Jan 2020 18:49:51 +0000 (12:49 -0600)]
anv: Use isl_aux_state for HiZ resolves

Rather than looking at the aux usage, we look at the isl_aux_state which
provides us with more detailed information.  This commit adds a couple
helpers to isl which let us quickly determine if we have valid depth/hiz
on the initial layout and if we need valid depth/hiz for the final
layout.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>

4 years agoanv: Add a layout_to_aux_state helper
Jason Ekstrand [Mon, 6 Jan 2020 17:23:43 +0000 (11:23 -0600)]
anv: Add a layout_to_aux_state helper

This new helper maps VkImageLayout enums to isl_aux_state enums which
are the hardware's concept of image layouts.  We can then use the aux
state to get the fast clear type and the aux usage.  This should yield
no functional change in driver behavior.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>

4 years agoanv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves
Jason Ekstrand [Mon, 6 Jan 2020 17:28:59 +0000 (11:28 -0600)]
anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves

As of 52ad1712ed62, TRANSFER_SRC_OPTIMAL and SHADER_READ_ONLY_OPTIMAL
are now identical for depth buffers so there's no reason why we need to
use the "wrong" layout.  Technically, according to Vulkan, blits and
MSAA resolves are transfer ops so we should use the transfer layout now
that we can.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>

4 years agointel/blorp: resize src and dst surfaces separately
Jason Ekstrand [Wed, 15 Jan 2020 20:08:17 +0000 (14:08 -0600)]
intel/blorp: resize src and dst surfaces separately

When copying to an RGB surface, we treat it as an R only one of three
times the width, which may end up being larger than the maximum size
supported by the hardware and so it hits the shrink path. This forced
both source and destination surfaces to be shrunk, even though it's not
necessary for the former, and may even hit some assertions in some
cases, such as the surface being compressed.

Fixes several tests under dEQP-VK.api.copy_and_blit.core.image_to_image.dimensions.*

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>

4 years agoaco: combine MRTZ (depth, stencil, sample mask) exports
Samuel Pitoiset [Thu, 23 Jan 2020 16:50:25 +0000 (17:50 +0100)]
aco: combine MRTZ (depth, stencil, sample mask) exports

Instead of emitting up to 3 for each different components (depth,
stencil and sample mask). This is needed to fix a hw bug on GFX6.

Totals from affected shaders:
SGPRS: 34728 -> 35056 (0.94 %)
VGPRS: 26440 -> 26476 (0.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1346088 -> 1344180 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3922 -> 3915 (-0.18 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>

4 years agoaco/gfx10: Fix VcmpxExecWARHazard mitigation.
Timur Kristóf [Fri, 24 Jan 2020 14:17:44 +0000 (15:17 +0100)]
aco/gfx10: Fix VcmpxExecWARHazard mitigation.

The SOPP instruction shouldn't have a definition, and its block
should be set to -1 in order to prevent it from being recognized
as a branch.
Also fix a typo in the readme.

Fixes: d6dfce02d074d615a3b88a3fccd8ee8c7e13c010
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>

4 years agoaco: Transform uniform bitwise instructions to 32-bit if possible.
Timur Kristóf [Thu, 16 Jan 2020 18:32:31 +0000 (19:32 +0100)]
aco: Transform uniform bitwise instructions to 32-bit if possible.

This allows removing superfluous s_cselect instructions
that come from turning booleans into 64-bit vector condition.

v2 by Daniel Schürmann:
- Make the code massively simpler
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
- Eliminate extra moves by not always using the SCC definition
- Use s_absdiff_i32 for uniform XOR
- Skip the transformation for uncommon or invalid instructions

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>

4 years agoetnaviv: update Android build files
Martin Fuzzey [Fri, 17 Jan 2020 16:37:03 +0000 (17:37 +0100)]
etnaviv: update Android build files

etnaviv no longer builds on Android, fix this.

Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447>

4 years agoaco: use nir_move_copies
Rhys Perry [Tue, 14 Jan 2020 11:42:11 +0000 (11:42 +0000)]
aco: use nir_move_copies

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>

4 years agoradv/aco: use ACO for GS copy shaders
Rhys Perry [Fri, 15 Nov 2019 12:42:46 +0000 (12:42 +0000)]
radv/aco: use ACO for GS copy shaders

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>

4 years agoaco: implement GS copy shaders
Rhys Perry [Fri, 15 Nov 2019 11:31:03 +0000 (11:31 +0000)]
aco: implement GS copy shaders

v5: rebase on float_controls changes
v7: rebase after shader args MR and load/store vectorizer MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>

4 years agoaco: remove needs_instance_id
Rhys Perry [Fri, 15 Nov 2019 11:47:10 +0000 (11:47 +0000)]
aco: remove needs_instance_id

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>

4 years agoaco: explicitly mark end blocks for exports
Rhys Perry [Fri, 15 Nov 2019 11:43:19 +0000 (11:43 +0000)]
aco: explicitly mark end blocks for exports

For GS copy shaders, whether we want to do exports is conditional. By
explicitly marking the end blocks, we can mark an IF's then branch as an
export block and ensure that's where the assembler inserts null exports.

v6: only fixup exports in the end block, like before
v8: simplify some code

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>

4 years agoradv/aco: allow ACO for GS
Rhys Perry [Mon, 14 Oct 2019 16:45:09 +0000 (17:45 +0100)]
radv/aco: allow ACO for GS

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>

4 years agoaco: implement GS on GFX7-8
Rhys Perry [Thu, 14 Nov 2019 20:14:01 +0000 (20:14 +0000)]
aco: implement GS on GFX7-8

GS is the same on GFX6, but GFX6 isn't fully supported yet.

v4: fix regclass
v7: rebase after shader args MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>