mesa.git
4 years agopan/midgard: Add mir_upper_override helper
Alyssa Rosenzweig [Wed, 13 Nov 2019 02:22:53 +0000 (21:22 -0500)]
pan/midgard: Add mir_upper_override helper

Checks if we should emit a dest_override=upper, given a mask.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Support loads from R11G11B10 in a blend shader
Alyssa Rosenzweig [Mon, 16 Dec 2019 19:42:17 +0000 (14:42 -0500)]
pan/midgard: Support loads from R11G11B10 in a blend shader

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Enable lower_(un)pack_* lowering
Alyssa Rosenzweig [Tue, 24 Dec 2019 19:01:33 +0000 (14:01 -0500)]
pan/midgard: Enable lower_(un)pack_* lowering

These show up in some blend shaders. Let's use the shared lowering and
remove our own.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Increase PIPE_SHADER_CAP_MAX_OUTPUTS to 16
Tomeu Vizoso [Thu, 19 Dec 2019 15:01:15 +0000 (16:01 +0100)]
panfrost: Increase PIPE_SHADER_CAP_MAX_OUTPUTS to 16

GL ES 3.0 requires it to be higher, and stuff seems to work just fine.

Fixes: dEQP-GLES3.functional.implementation_limits.max_vertex_output_components
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
4 years agopanfrost: Handle Z24_UNORM_S8_UINT as MALI_Z32_UNORM
Tomeu Vizoso [Thu, 19 Dec 2019 11:51:06 +0000 (12:51 +0100)]
panfrost: Handle Z24_UNORM_S8_UINT as MALI_Z32_UNORM

Fixes dEQP-GLES3.functional.texture.format.sized.2d.depth24_stencil8_pot

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
4 years agopan/midgard: Implement shadow cubemaps
Alyssa Rosenzweig [Fri, 20 Dec 2019 22:25:05 +0000 (17:25 -0500)]
pan/midgard: Implement shadow cubemaps

We need to reshuffle to sync up the shadow coordinate temporary with the
cubemap coordinate temporary. Once that's in place, it's simple enough
(we load the shadow coordinate into .z like 2D).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Generalize temp coordinate to non-2D
Alyssa Rosenzweig [Fri, 20 Dec 2019 22:01:29 +0000 (17:01 -0500)]
pan/midgard: Generalize temp coordinate to non-2D

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Do witchcraft on texture offsets
Alyssa Rosenzweig [Fri, 20 Dec 2019 18:48:24 +0000 (13:48 -0500)]
pan/midgard: Do witchcraft on texture offsets

My latest divination spell has uncovered a pattern in the aether.
Although the swizzle is unaligned, its format is otherwise standard.
Document this, removing the old incorrect understanding of the swizzle
(which coincided on common special swizzles only).

Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetchoffset.sampler2d_fixed_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Fix fallthrough from offset to comparator
Alyssa Rosenzweig [Fri, 20 Dec 2019 17:58:10 +0000 (12:58 -0500)]
pan/midgard: Fix fallthrough from offset to comparator

Fixes: ccbc9a4e678 ("pan/midgard: Implement textureOffset for 2D textures")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Expand swizzle for texelFetch
Alyssa Rosenzweig [Fri, 20 Dec 2019 17:38:24 +0000 (12:38 -0500)]
pan/midgard: Expand swizzle for texelFetch

We zero the extra components anyway. Fixes
dEQP-GLES3.functional.shaders.texture_functions.texelfetch.sampler2d_fixed_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Clamp LOD register swizzle
Alyssa Rosenzweig [Fri, 20 Dec 2019 17:34:20 +0000 (12:34 -0500)]
pan/midgard: Clamp LOD register swizzle

Fixes register allocation failures with textureLodOffset.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Extend IS_VEC4_ONLY to arguments
Alyssa Rosenzweig [Sun, 22 Dec 2019 19:55:46 +0000 (14:55 -0500)]
pan/midgard: Extend IS_VEC4_ONLY to arguments

I think both need to be aligned at least for ld_cubemap_coords.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Bounds check lcra_restrict_range
Alyssa Rosenzweig [Mon, 23 Dec 2019 20:49:18 +0000 (15:49 -0500)]
pan/midgard: Bounds check lcra_restrict_range

We may call it with sentinel values (~0 in particular) corresponding to
unused arguments; ignore these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agofreedreno/ir3: fix flat shading again
Rob Clark [Fri, 20 Dec 2019 21:06:11 +0000 (13:06 -0800)]
freedreno/ir3: fix flat shading again

These days `ctx->inputs` is the split scalar input components and
`ir->inputs` is the full vecN.  This got fixed in the load_input case,
but the load_interpolated_input case was missed.

Fixes: bdf6b7018ce ("freedreno/ir3: re-work shader inputs/outputs")
Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agopan/midgard: Fix disassembler cycle/quadword counting
Alyssa Rosenzweig [Mon, 23 Dec 2019 17:24:03 +0000 (12:24 -0500)]
pan/midgard: Fix disassembler cycle/quadword counting

Due to the succeeding break we would fall into some off-by-one errors.
These should be resolved now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/decode: Append 0:0 spills:fills to blobber-db
Alyssa Rosenzweig [Mon, 23 Dec 2019 16:50:28 +0000 (11:50 -0500)]
pan/decode: Append 0:0 spills:fills to blobber-db

At the moment there's no need to actually count these but we do need a
placeholder for report.py to be happy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/decode: Prefix blobberdb with MESA_SHADER_*
Alyssa Rosenzweig [Mon, 23 Dec 2019 16:49:09 +0000 (11:49 -0500)]
pan/decode: Prefix blobberdb with MESA_SHADER_*

We use these prefixes in panfrost shader-db and they need to match for
shader-db to be happpy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/decode: Skip COMPUTE in blobber-db
Alyssa Rosenzweig [Mon, 23 Dec 2019 16:48:23 +0000 (11:48 -0500)]
pan/decode: Skip COMPUTE in blobber-db

The blob uses COMPUTE jobs for some internal purposes. These are
essentially free but panfrost doesn't use them, so it messes up the
numbering. Just filter them out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Decode shader types in pantrace shader-db
Alyssa Rosenzweig [Mon, 23 Dec 2019 16:40:40 +0000 (11:40 -0500)]
panfrost: Decode shader types in pantrace shader-db

We see some COMPUTE jobs that were mistakenly identified as VERTEX.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agoanv: Properly advertise sampledImageIntegerSampleCounts
Jason Ekstrand [Tue, 24 Dec 2019 04:19:29 +0000 (22:19 -0600)]
anv: Properly advertise sampledImageIntegerSampleCounts

We support the same set of samples for integer color formats as for
non-integer.  We've been advertising it wrong since before the initial
Vulkan 1.0 release. :-(

Fixes: d68974530371 "vk/0.210.0: Rework device features and limits"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoAndroid: Fix build issue without LLVM
Roman Stratiienko [Thu, 5 Dec 2019 16:32:02 +0000 (18:32 +0200)]
Android: Fix build issue without LLVM

Some of the latest changes are causing the following build error on Android:

```
external/mesa3d/src/gallium/auxiliary/nir/nir_to_tgsi_info.c:403:6:
error: redefinition of 'nir_tgsi_scan_shader'
void nir_tgsi_scan_shader(const struct nir_shader *nir,
     ^
external/mesa3d/src/gallium/auxiliary/nir/nir_to_tgsi_info.h:37:20:
note: previous definition is here
static inline void nir_tgsi_scan_shader(const struct nir_shader *nir,
                   ^
```

Include nir_to_tgsi_info.c and nir_to_tgsi_info.h into the build
only if LLVM is enabled.

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2978>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2978>

4 years agoiris: Avoid replacing backing storage for buffers with no contents
Kenneth Graunke [Mon, 4 Nov 2019 08:21:06 +0000 (00:21 -0800)]
iris: Avoid replacing backing storage for buffers with no contents

We might get asked to pitch the storage on a buffer that already has
no meaningful contents.  In this case, the existing buffer is as good
as a new one.

4 years agoiris: Fix shader recompile debug printing
Kenneth Graunke [Sun, 22 Dec 2019 23:43:51 +0000 (15:43 -0800)]
iris: Fix shader recompile debug printing

I was passing iris keys to brw_debug_key_recompile, leading to out of
bounds memory reads.

Fixes: 2e654db27a1 ("iris: Create smaller program keys without legacy features")
4 years agoiris: Make helper functions to turn iris shader keys into brw keys.
Kenneth Graunke [Sun, 22 Dec 2019 23:33:17 +0000 (15:33 -0800)]
iris: Make helper functions to turn iris shader keys into brw keys.

We'll need to use these in recompile debugging in the next commit.

Fixes: 2e654db27a1 ("iris: Create smaller program keys without legacy features")
4 years agoswr: Fix build with llvm-10.0.
Vinson Lee [Sat, 14 Dec 2019 04:47:51 +0000 (20:47 -0800)]
swr: Fix build with llvm-10.0.

Fix build error after llvm-10 commit 5d986953c8b9 ("[IR] Split out
target specific intrinsic enums into separate headers").

../src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp:78:37: error: ‘x86_bmi_bextr_32’ is not a member of ‘llvm::Intrinsic’
         {"meta.intrinsic.BEXTR_32", Intrinsic::x86_bmi_bextr_32},
                                     ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
4 years agotravis: autodetect python version instead of hard-coding it
Eric Engestrom [Sat, 21 Dec 2019 19:28:07 +0000 (19:28 +0000)]
travis: autodetect python version instead of hard-coding it

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agoetnaviv: tgsi: Fix gl_FrontFacing support
Marek Vasut [Sun, 17 Nov 2019 13:53:54 +0000 (14:53 +0100)]
etnaviv: tgsi: Fix gl_FrontFacing support

The GPU presents the state of the hardware front_face in internal
register 0 (i0), the range of which is 0.0f..1.0f.

This patch assigns the fragment shader input to this internal register.
Moreover, based on the internal front_ccw state, the value of the i0
register is inverted accordingly using SET.EQ/SEQ.NE instruction before
being further processed in the shader. This mimics the operation of the
NIR compiler.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2868>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2868>

4 years agou_vbuf: Return true in u_vbuf_get_caps if nb of vbufs is below minimum
Paul Cercueil [Fri, 22 Nov 2019 22:23:36 +0000 (23:23 +0100)]
u_vbuf: Return true in u_vbuf_get_caps if nb of vbufs is below minimum

Return true in u_vbuf_get_caps if the number of vertex buffers is below
the minimum required for proper OpenGL 2.0.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>

4 years agou_vbuf: Regard non-constant vbufs with non-instance elements as free
Paul Cercueil [Tue, 19 Nov 2019 21:10:10 +0000 (22:10 +0100)]
u_vbuf: Regard non-constant vbufs with non-instance elements as free

In the case of unroll_indices, we can regard all non-constant
vertex buffers with only non-instance vertex elements as incompatible
and thus free.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>

4 years agou_vbuf: use single vertex buffer if it's not possible to have multiple
Wladimir J. van der Laan [Thu, 3 Oct 2013 10:32:12 +0000 (12:32 +0200)]
u_vbuf: use single vertex buffer if it's not possible to have multiple

Put CONST, VERTEX and INSTANCE attributes into one vertex buffer if
necessary due to hardware constraints.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>

4 years agou_vbuf: Only create driver CSO if no incompatible elements
Paul Cercueil [Tue, 19 Nov 2019 20:59:07 +0000 (21:59 +0100)]
u_vbuf: Only create driver CSO if no incompatible elements

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>

4 years agou_vbuf: Mark vbufs incompatible if more were requested than HW supports
Paul Cercueil [Tue, 19 Nov 2019 20:58:17 +0000 (21:58 +0100)]
u_vbuf: Mark vbufs incompatible if more were requested than HW supports

More vertex buffers are used than the hardware supports.  In
principle, we only need to make sure that less vertex buffers are
used, and mark some of the latter vertex buffers as incompatible.
For now, mark all vertex buffers as incompatible.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>

4 years agou_vbuf: add logic to use a limited number of vbufs
Wladimir J. van der Laan [Sat, 11 Jun 2016 19:21:52 +0000 (21:21 +0200)]
u_vbuf: add logic to use a limited number of vbufs

Make it possible to limit the number of vertex buffers as there exist
GPUs with less then 32 supported vertex buffers.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>

4 years agogallium: add PIPE_CAP_MAX_VERTEX_BUFFERS
Christian Gmeiner [Sat, 11 Jun 2016 19:21:51 +0000 (21:21 +0200)]
gallium: add PIPE_CAP_MAX_VERTEX_BUFFERS

Add PIPE_CAP_MAX_VERTEX_BUFFERS param, which defaults to 16.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>

4 years ago.mailmap: use correct email address
David Heidelberg [Sat, 21 Dec 2019 01:53:10 +0000 (02:53 +0100)]
.mailmap: use correct email address

Signed-off-by: David Heidelberg <david@ixit.cz>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3190>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3190>

4 years agokmsro: Extend to include ingenic-drm
Paul Cercueil [Mon, 11 Nov 2019 01:01:52 +0000 (02:01 +0100)]
kmsro: Extend to include ingenic-drm

This enables Mesa to work with Ingenic SoCs through the use of the
ingenic-drm modesetting driver along with the render-only drivers,
such as Etnaviv on the JZ4770 SoC.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agokmsro: Add "mcde" entry point
Stephan Gerhold [Mon, 4 Nov 2019 21:48:49 +0000 (22:48 +0100)]
kmsro: Add "mcde" entry point

ST-Ericsson Ux500 boards use a Mali 400 GPU together with MCDE
("Multi Channel Display Engine"), which is supported by the "mcde"
DRM driver.

Adding an entry point for it in kmsro seems to be enough to make
Lima work - at least kmscube is working correctly.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3139>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3139>

4 years agoaco: fix vgpr alloc granule with wave32
Rhys Perry [Tue, 3 Dec 2019 14:21:16 +0000 (14:21 +0000)]
aco: fix vgpr alloc granule with wave32

We still need to increase the number of physical vgprs

Totals from affected shaders:
SGPRS: 671976 -> 675288 (0.49 %)
VGPRS: 550112 -> 562596 (2.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 27621660 -> 27606532 (-0.05 %) bytes
Max Waves: 81083 -> 87833 (8.32 %)
Instructions: 5391560 -> 5389031 (-0.05 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: improve jump threading with wave32
Rhys Perry [Tue, 3 Dec 2019 14:10:45 +0000 (14:10 +0000)]
aco: improve jump threading with wave32

Totals from affected shaders:
SGPRS: 748746 -> 748746 (0.00 %)
VGPRS: 636984 -> 636984 (0.00 %)
Spilled SGPRs: 387 -> 387 (0.00 %)
Spilled VGPRs: 15 -> 15 (0.00 %)
Code Size: 61138824 -> 60928620 (-0.34 %) bytes
Max Waves: 48602 -> 48602 (0.00 %)
Instructions: 11967660 -> 11915084 (-0.44 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco/wave32: fix comparison optimizations
Rhys Perry [Tue, 3 Dec 2019 13:37:49 +0000 (13:37 +0000)]
aco/wave32: fix comparison optimizations

Previously, they weren't done in wave32.

Totals from affected shaders:
SGPRS: 507726 -> 508006 (0.06 %)
VGPRS: 450340 -> 450268 (-0.02 %)
Spilled SGPRs: 298 -> 298 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 39689708 -> 39384488 (-0.77 %) bytes
Max Waves: 39631 -> 39636 (0.01 %)
Instructions: 7865919 -> 7793650 (-0.92 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agonv50ir/nir: support vec8 and vec16
Karol Herbst [Sat, 9 Mar 2019 17:20:38 +0000 (18:20 +0100)]
nv50ir/nir: support vec8 and vec16

Signed-off-by: Karol Herbst <kherbst@redhat.com>
4 years agonir+vtn: vec8+vec16 support
Rob Clark [Sat, 9 Mar 2019 16:17:55 +0000 (17:17 +0100)]
nir+vtn: vec8+vec16 support

This introduces new vec8 and vec16 instructions (which are the only
instructions taking more than 4 sources), in order to construct 8 and 16
component vectors.

In order to avoid fixing up the non-autogenerated nir_build_alu() sites
and making them pass 16 src args for the benefit of the two instructions
that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() is has
nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is
used for the > 4 src args case).

v2 (Karol Herbst):
  use nir_build_alu2 for vec8 and vec16
  use python's array multiplication syntax
  add nir_op_vec helper
  simplify nir_vec
  nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert
  use nir_build_alu for opcodes with <= 4 sources
v3 (Karol Herbst):
  fix nir_serialize
v4 (Dave Airlie):
  fix serialization of glsl_type
  handle vec8/16 in lowering of bools
v5 (Karol Herbst):
  fix load store vectorizer

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoaco: use NIR_MAX_VEC_COMPONENTS instead of 4
Karol Herbst [Sat, 9 Nov 2019 21:39:36 +0000 (22:39 +0100)]
aco: use NIR_MAX_VEC_COMPONENTS instead of 4

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agonir/serialize: cast swizzle before shifting
Karol Herbst [Wed, 11 Dec 2019 15:01:15 +0000 (16:01 +0100)]
nir/serialize: cast swizzle before shifting

fixes undefined behaviour with enabled vec16

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agollvmpipe: switch to NIR by default
Dave Airlie [Tue, 3 Dec 2019 05:23:45 +0000 (15:23 +1000)]
llvmpipe: switch to NIR by default

Add LP_DEBUG=tgsi_ir (tgsi already taken) to fallback to TGSI paths.

Disable NIR_VALIDATE in CI (Michel/Eric acked)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>

4 years agogallivm/nir: wrap idiv to avoid divide by 0 (v2)
Dave Airlie [Fri, 13 Dec 2019 03:09:42 +0000 (13:09 +1000)]
gallivm/nir: wrap idiv to avoid divide by 0 (v2)

This code is taken from the TGSI paths, and should fix the regression
seens with GLES2

v2: use the udiv path which has d3d10 defined return.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>

4 years agoac/surface: fix an assertion failure on gfx9 in CMASK computation
Marek Olšák [Fri, 20 Dec 2019 21:19:54 +0000 (16:19 -0500)]
ac/surface: fix an assertion failure on gfx9 in CMASK computation

addrlib only allows the 2D resource type with CMASK.

Fixes: 69ea473eeb9 "amd/addrlib: update to the latest version"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187>

4 years agopan/midgard: Optimize comparisions with similar operations
Afonso Bordado [Tue, 10 Dec 2019 13:18:00 +0000 (13:18 +0000)]
pan/midgard: Optimize comparisions with similar operations

Optimizes comparisions by removing the invert flag on operands
which we can prove to be equal without the invert.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036>

4 years agolima: set shader caps to optimize control flow
Erico Nunes [Thu, 19 Dec 2019 21:51:07 +0000 (22:51 +0100)]
lima: set shader caps to optimize control flow

With these new caps, nir is able to unroll loops and optimize
conditionals much more efficiently in both gpit and ppir.
panfrost and vc4 were used as reference for the values.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>

4 years agolima/ppir: remove assert on ppir_emit_tex unsupported feature
Erico Nunes [Thu, 19 Dec 2019 21:49:49 +0000 (22:49 +0100)]
lima/ppir: remove assert on ppir_emit_tex unsupported feature

This assert causes testing tools such as shaderdb to abort on some test
cases. This is an unsupported feature and not a compiler bug. The
compilation error is already propagated correctly, so we can remove the
assert to allow testing tools to run to completion.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>

4 years agolima/ppir: fix lod bias src
Erico Nunes [Fri, 20 Dec 2019 18:20:58 +0000 (19:20 +0100)]
lima/ppir: fix lod bias src

ppir has some code that operates on all ppir_src variables, and for that
uses ppir_node_get_src.
lod bias support introduced a separate ppir_src that is inaccessible by
that function, causing it to be missed by the compiler in some routines.
Ultimately this caused, in some cases, a bug in const lowering:

  .../pp/lower.c:42: ppir_lower_const: Assertion `src != NULL' failed.

This fix moves the ppir_srcs in ppir_load_texture_node together so they
don't get missed.

Fixes: 721d82cf061 lima/ppir: add lod-bias support
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>

4 years agolima: Fix dump file creation
Andreas Baierl [Fri, 20 Dec 2019 10:30:05 +0000 (11:30 +0100)]
lima: Fix dump file creation

Otherwise lima_dump_file_next() always opens a new file and creates the
dumps regardless of what the environment variables say.

Fixes d71cd245d74 ('lima: Rotate dump files after each finished pp frame')

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>

4 years agoradeon/vcn2: enable rate control for hevc encoding
Pierre-Eric Pelloux-Prayer [Tue, 17 Dec 2019 09:41:39 +0000 (10:41 +0100)]
radeon/vcn2: enable rate control for hevc encoding

Based on b0626c1f306 ("radeon/vcn: enable rate control for hevc encoding").

Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2225
Fixes: 587b9c5dae6 ("radeon/vcn: implement vcn 2.0 encode")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3134>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3134>

4 years agoradv: rely on pipeline layout when creating push descriptors with template
Samuel Pitoiset [Fri, 20 Dec 2019 12:30:28 +0000 (13:30 +0100)]
radv: rely on pipeline layout when creating push descriptors with template

descriptorSetLayout should be ignored for push descriptors. While
we are it, also ignore pipelineBindPoint.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2210
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180>

4 years agoetnaviv: Replace bitwise OR with logical OR
Marek Vasut [Mon, 18 Nov 2019 18:12:49 +0000 (19:12 +0100)]
etnaviv: Replace bitwise OR with logical OR

The test here is testing whether either variable is non-zero.
While currently the test works fine, it's fragile. Replace it
with logical OR to avoid the fragility.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: update resource status after flushing
Christian Gmeiner [Fri, 29 Nov 2019 08:44:43 +0000 (09:44 +0100)]
etnaviv: update resource status after flushing

Currently piglit spec@arb_occlusion_query@occlusion_query_conform
spins for ever as the resource status is never reset. See
etna_hw_get_query_result(..) for more details.

Fixes: 1456aa61cc5 ("etnaviv: Rework resource status tracking")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
4 years agointel: limit shader geometry on BDW GT1
Ross Zwisler [Thu, 19 Dec 2019 02:56:24 +0000 (19:56 -0700)]
intel: limit shader geometry on BDW GT1

Similar to the SKL GT1 fix introduced here:

https://gitlab.freedesktop.org/asimiklit/mesa/commit/b1ba7ffdbd54fdb5da18d086c7b7a830e06a1cff

we need to limit the .urb.max_entries[MESA_SHADER_GEOMETRY] on BDW GT1
to address failures in these two tests:

dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d
dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_2d_array

The value 690 was found via bisection.  691 is the actual max on the
hardware I'm using, but 690 seemed like a nice round number.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173>

4 years agopan/midgard: Lower txd with lower_tex
Alyssa Rosenzweig [Thu, 19 Dec 2019 16:12:50 +0000 (11:12 -0500)]
pan/midgard: Lower txd with lower_tex

This is a hack since we do have native gradient stuff, but for the
moment I'm more interested in conformance and the lowered code is good
enough. Fixes
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2d_fixed_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>

4 years agopan/midgard: Fix crash with txs
Alyssa Rosenzweig [Thu, 19 Dec 2019 16:12:25 +0000 (11:12 -0500)]
pan/midgard: Fix crash with txs

This regressed since we implemented RECT textures natively, oops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>

4 years agopan/midgard: Implement textureOffset for 2D textures
Alyssa Rosenzweig [Thu, 19 Dec 2019 15:35:18 +0000 (10:35 -0500)]
pan/midgard: Implement textureOffset for 2D textures

Fixes dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler2d_fixed_fragment.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>

4 years agoradv: ignore pColorBlendState if rasterization is disabled
Samuel Pitoiset [Thu, 19 Dec 2019 14:03:41 +0000 (15:03 +0100)]
radv: ignore pColorBlendState if rasterization is disabled

Or if the subpass has no color attachments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>

4 years agoradv: tidy up radv_pipeline_init_blend_state()
Samuel Pitoiset [Thu, 19 Dec 2019 14:52:13 +0000 (15:52 +0100)]
radv: tidy up radv_pipeline_init_blend_state()

This is needed for the next commit because pColorBlendState can
actually be NULL but some fields might have to be initialized
(eg. alpha to coverage with no color attachments).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>

4 years agoradv: ignore pDepthStencilState if rasterization is disabled
Samuel Pitoiset [Thu, 19 Dec 2019 13:26:45 +0000 (14:26 +0100)]
radv: ignore pDepthStencilState if rasterization is disabled

Or if the subpass has no depth stencil attachment.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>

4 years agoradv: ignore pTessellationState if the pipeline doesn't use tess
Samuel Pitoiset [Thu, 19 Dec 2019 13:18:24 +0000 (14:18 +0100)]
radv: ignore pTessellationState if the pipeline doesn't use tess

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>

4 years agoradv: ignore pMultisampleState if rasterization is disabled
Samuel Pitoiset [Thu, 19 Dec 2019 13:12:21 +0000 (14:12 +0100)]
radv: ignore pMultisampleState if rasterization is disabled

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>

4 years agoradv: init a default multisample state for the resolve FS path
Samuel Pitoiset [Thu, 19 Dec 2019 13:05:27 +0000 (14:05 +0100)]
radv: init a default multisample state for the resolve FS path

pMultisampleState must be a valid pointer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>

4 years agospirv: Implement SPV_KHR_non_semantic_info
Caio Marcelo de Oliveira Filho [Mon, 16 Sep 2019 21:07:00 +0000 (14:07 -0700)]
spirv: Implement SPV_KHR_non_semantic_info

Do nothing for OpExtInst from extended instruction sets that name
start with "NonSemantic.".

Since they can be used within the "preamble" to annotate global
decorations, also don't stop iterating when one of them is found.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154>

4 years agoturnip: disable B8G8R8 vertex formats
Jonathan Marek [Thu, 19 Dec 2019 17:43:42 +0000 (12:43 -0500)]
turnip: disable B8G8R8 vertex formats

Looks like swap doesn't work as expected on these, disable them.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>

4 years agoutil/format: add missing vulkan formats
Jonathan Marek [Thu, 19 Dec 2019 17:35:40 +0000 (12:35 -0500)]
util/format: add missing vulkan formats

Add some missing vulkan formats to util/format, this solves all the missing
pipe format cases for the formats that turnip supports.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>

4 years agoturnip: minor warning fixes
Jonathan Marek [Thu, 19 Dec 2019 23:03:09 +0000 (18:03 -0500)]
turnip: minor warning fixes

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177>

4 years agolima: Rotate dump files after each finished pp frame
Andreas Baierl [Thu, 19 Dec 2019 21:17:45 +0000 (22:17 +0100)]
lima: Rotate dump files after each finished pp frame

This rotates the dump files like the mali-syscall-tracker does.

After each finished pp frame a new file is generated. They are
numbered like lima.dump.0000, lima.dump.0001 ...
The filename and path can be given with the new environment
variable LIMA_DUMP_FILE.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>

4 years agolima: drop suballocator
Vasily Khoruzhick [Sun, 24 Nov 2019 22:37:32 +0000 (14:37 -0800)]
lima: drop suballocator

Since we're using a separate per-draw BO for GP outputs we don't
need suballocator anymore.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>

4 years agolima: use single BO for GP outputs
Vasily Khoruzhick [Sun, 24 Nov 2019 22:34:44 +0000 (14:34 -0800)]
lima: use single BO for GP outputs

Varyings, gl_Position and gl_PointSize are all GP outputs, so we
can use a single BO for them all. Also that allows us to get rid
of suballocator.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>

4 years agonir: fix assign_io_var_locations for vertex inputs
Jonathan Marek [Sun, 15 Dec 2019 23:50:29 +0000 (18:50 -0500)]
nir: fix assign_io_var_locations for vertex inputs

Also fixes fragment inputs using the wrong "base" value (which was working
only because FRAG_RESULT_DATA0 is less than VARYING_SLOT_VAR0)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108>

4 years agoturnip: implement secondary command buffers
Jonathan Marek [Thu, 12 Dec 2019 22:06:14 +0000 (17:06 -0500)]
turnip: implement secondary command buffers

Uses a new "tu_cs_add_entries" function because tu_cs_emit_call doesn't
work inside draw_cs (which is already called by tu_cs_emit_call).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>

4 years agoturnip: compute gmem offsets at renderpass creation time
Jonathan Marek [Mon, 16 Dec 2019 21:42:35 +0000 (16:42 -0500)]
turnip: compute gmem offsets at renderpass creation time

This makes it easier to implement secondary command buffers, since we no
longer need to know the render area to set the gmem offsets for input
attachments and CmdClearAttachments.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>

4 years agoturnip: emit_compute_driver_params fixes
Jonathan Marek [Thu, 19 Dec 2019 03:38:09 +0000 (22:38 -0500)]
turnip: emit_compute_driver_params fixes

Offset was wrong, it is in vec4 not dwords.

There's a hole between DP_NUM_WORK_GROUPS_Z and DP_LOCAL_GROUP_SIZE_X so
use the IR3 enums.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>

4 years agoturnip: emit base instance vs driver param
Jonathan Marek [Thu, 19 Dec 2019 03:35:51 +0000 (22:35 -0500)]
turnip: emit base instance vs driver param

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>

4 years agofreedreno/ir3: support load_base_instance
Jonathan Marek [Sun, 17 Nov 2019 17:17:47 +0000 (12:17 -0500)]
freedreno/ir3: support load_base_instance

Not supported by hardware, uses same mechanism as base vertex.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>

4 years agofreedreno/registers: document vertex/instance id offset bits
Jonathan Marek [Thu, 19 Dec 2019 03:19:15 +0000 (22:19 -0500)]
freedreno/registers: document vertex/instance id offset bits

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>

4 years agost/mesa: release tgsi tokens for shader states
Neha Bhende [Thu, 19 Dec 2019 19:11:49 +0000 (00:41 +0530)]
st/mesa: release tgsi tokens for shader states

Since we are using st_common_variant while creating variant for vertext
program, we can release tokens created in st_create_vp_variant which
are already stored in respective states.
This fix memory leak found with piglit tests

Fixes bc99b22a305b ('st/mesa: use a separate VS variant for the draw module')

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
4 years agoRevert "nir/lower_double_ops: relax lower mod()"
Juan A. Suarez Romero [Thu, 19 Dec 2019 19:01:16 +0000 (20:01 +0100)]
Revert "nir/lower_double_ops: relax lower mod()"

This reverts commit 8172b1fa03fe74165728bfb182c98a3e62193d2b.

This commit was done taking in account Vulkan spec, but did not realize
it was affecting OpenGL too.

Closes: #2252
4 years agofreedreno/a6xx: Set up multisample sysmem MRTs correctly
Kristian H. Kristensen [Fri, 22 Nov 2019 06:48:32 +0000 (22:48 -0800)]
freedreno/a6xx: Set up multisample sysmem MRTs correctly

We had an extra factor of num_samples in the stride.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Rewrite compressed blits in a helper function
Kristian H. Kristensen [Tue, 26 Nov 2019 19:38:47 +0000 (11:38 -0800)]
freedreno/a6xx: Rewrite compressed blits in a helper function

Similar to how we handle zs blits.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Move handle_rgba_blit() up
Kristian H. Kristensen [Tue, 26 Nov 2019 18:57:56 +0000 (10:57 -0800)]
freedreno/a6xx: Move handle_rgba_blit() up

If we move this function up, we don't have to forward declare it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Handle srgb blits on the blitter
Kristian H. Kristensen [Thu, 21 Nov 2019 19:35:48 +0000 (11:35 -0800)]
freedreno/a6xx: Handle srgb blits on the blitter

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Use A6XX_SP_2D_SRC_FORMAT_MASK macro
Kristian H. Kristensen [Thu, 21 Nov 2019 19:40:20 +0000 (11:40 -0800)]
freedreno/a6xx: Use A6XX_SP_2D_SRC_FORMAT_MASK macro

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBX
Kristian H. Kristensen [Thu, 12 Dec 2019 01:05:57 +0000 (17:05 -0800)]
freedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBX

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Use blitter for resolve blits
Kristian H. Kristensen [Thu, 21 Nov 2019 17:14:40 +0000 (09:14 -0800)]
freedreno/a6xx: Use blitter for resolve blits

We have a SAMPLES_AVERAGE bit that does what we need for resolving
multisample buffers - let's use it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Add fd_resource_swap() helper
Kristian H. Kristensen [Thu, 28 Nov 2019 00:06:59 +0000 (16:06 -0800)]
freedreno/a6xx: Add fd_resource_swap() helper

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Pick blitter swap based on resource tiling
Kristian H. Kristensen [Tue, 26 Nov 2019 05:50:57 +0000 (21:50 -0800)]
freedreno/a6xx: Pick blitter swap based on resource tiling

The linear levels in a tiled resource are stored in the canonical
swap, WZYX.  We need to pick the swap based on whether or not the
resource is tiled, not whether the the level in question is tiled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Program sampler swap based on resource tiling
Kristian H. Kristensen [Tue, 26 Nov 2019 00:40:37 +0000 (16:40 -0800)]
freedreno/a6xx: Program sampler swap based on resource tiling

It doesn't matter whether or not the level in question is linear.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno: Add debug flag for forcing linear layouts
Kristian H. Kristensen [Tue, 26 Nov 2019 05:14:31 +0000 (21:14 -0800)]
freedreno: Add debug flag for forcing linear layouts

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/a6xx: Make DEBUG_BLIT_FALLBACK only dump fallbacks
Kristian H. Kristensen [Thu, 21 Nov 2019 19:41:15 +0000 (11:41 -0800)]
freedreno/a6xx: Make DEBUG_BLIT_FALLBACK only dump fallbacks

Use new macro, DEBUG_BLIT, for dumping all blits.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>

4 years agofreedreno/ir3: fix vertex shader sysvals with pre_assign_inputs
Jonathan Marek [Thu, 19 Dec 2019 15:40:35 +0000 (10:40 -0500)]
freedreno/ir3: fix vertex shader sysvals with pre_assign_inputs

The first pre_assign_inputs loop doesn't pre-assign sysvals, so skip the
second part for sysvals.

The sysvals don't need to be pre-assigned since the state for those isn't
shared between binning / nonbinning shaders.

Fixes assert failures in cases where the sysvals didn't end up in the same
registers for binning / nonbinning.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168>

4 years agost/va: Convert interlaced NV12 to progressive
Thong Thai [Wed, 18 Dec 2019 22:02:02 +0000 (17:02 -0500)]
st/va: Convert interlaced NV12 to progressive

In vlVaDeriveImage, convert interlaced NV12 buffers to progressive.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1193
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3157>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3157>

4 years agopan/midgard: Add uniform/work heuristic
Alyssa Rosenzweig [Wed, 18 Dec 2019 00:05:35 +0000 (19:05 -0500)]
pan/midgard: Add uniform/work heuristic

Uniform/work registers are partitioned on a shader-by-shader basis as
determined by the compiler. We add a simple heuristic here running
before scheduling that prioritizes mitigating spilling at all costs.

A more sophisticated heuristic should run *after* scheduling, doing a
dry run of the register allocator itself to determine spilling. Fitting
this into our current scheduling model is difficult, so while this
heuristic does hurt some shaders, overall the results are acceptable:

total instructions in shared programs: 50065 -> 38747 (-22.61%)
instructions in affected programs: 37187 -> 25869 (-30.44%)
helped: 59
HURT: 77
helped stats (abs) min: 1 max: 757 x̄: 198.46 x̃: 151
helped stats (rel) min: 0.48% max: 62.89% x̄: 32.95% x̃: 36.27%
HURT stats (abs)   min: 1 max: 9 x̄: 5.08 x̃: 6
HURT stats (rel)   min: 0.92% max: 14.29% x̄: 6.71% x̃: 4.60%
95% mean confidence interval for instructions value: -111.15 -55.29
95% mean confidence interval for instructions %-change: -14.33% -6.67%
Instructions are helped.

total bundles in shared programs: 30606 -> 19157 (-37.41%)
bundles in affected programs: 23907 -> 12458 (-47.89%)
helped: 58
HURT: 74
helped stats (abs) min: 6 max: 757 x̄: 203.09 x̃: 152
helped stats (rel) min: 5.19% max: 77.00% x̄: 49.38% x̃: 53.79%
HURT stats (abs)   min: 1 max: 9 x̄: 4.46 x̃: 5
HURT stats (rel)   min: 1.85% max: 26.32% x̄: 11.70% x̃: 9.57%
95% mean confidence interval for bundles value: -115.46 -58.01
95% mean confidence interval for bundles %-change: -20.87% -9.41%
Bundles are helped.

total quadwords in shared programs: 31305 -> 32027 (2.31%)
quadwords in affected programs: 20471 -> 21193 (3.53%)
helped: 0
HURT: 133
HURT stats (abs)   min: 1 max: 9 x̄: 5.43 x̃: 5
HURT stats (rel)   min: 0.76% max: 15.15% x̄: 5.47% x̃: 4.65%
95% mean confidence interval for quadwords value: 5.00 5.86
95% mean confidence interval for quadwords %-change: 4.85% 6.08%
Quadwords are HURT.

total registers in shared programs: 2256 -> 2545 (12.81%)
registers in affected programs: 708 -> 997 (40.82%)
helped: 0
HURT: 95
HURT stats (abs)   min: 1 max: 8 x̄: 3.04 x̃: 3
HURT stats (rel)   min: 12.50% max: 100.00% x̄: 39.41% x̃: 37.50%
95% mean confidence interval for registers value: 2.64 3.45
95% mean confidence interval for registers %-change: 34.62% 44.19%
Registers are HURT.

total threads in shared programs: 1776 -> 1709 (-3.77%)
threads in affected programs: 134 -> 67 (-50.00%)
helped: 0
HURT: 67
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

total spills in shared programs: 3868 -> 2 (-99.95%)
spills in affected programs: 3868 -> 2 (-99.95%)
helped: 60
HURT: 0

total fills in shared programs: 6456 -> 4 (-99.94%)
fills in affected programs: 6456 -> 4 (-99.94%)
helped: 60
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>

4 years agoac: declare an enum for the OOB select field on GFX10
Samuel Pitoiset [Wed, 18 Dec 2019 13:23:26 +0000 (14:23 +0100)]
ac: declare an enum for the OOB select field on GFX10

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>

4 years agoradv/gfx10: fix the out-of-bounds check for vertex descriptors
Samuel Pitoiset [Wed, 18 Dec 2019 12:29:39 +0000 (13:29 +0100)]
radv/gfx10: fix the out-of-bounds check for vertex descriptors

When stride is 0, it should check against the offset not the index.

This fixes black character models with Beat Saber and missing snow
with Dragon Quest.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2233
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1975
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>

4 years agonir/lower_double_ops: relax lower mod()
Juan A. Suarez Romero [Thu, 28 Nov 2019 16:58:45 +0000 (16:58 +0000)]
nir/lower_double_ops: relax lower mod()

Currently when lowering mod() we add an extra instruction so if
mod(a,b) == b then 0 is returned instead of b, as mathematically
mod(a,b) is in the interval [0, b).

But Vulkan spec has relaxed this restriction, and allows the result to
be in the interval [0, b].

This commit takes this in account to remove the extra instruction
required to return 0 instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>