mesa.git
4 years agoac: fix the return value in cull_bbox when bbox culling is disabled
Marek Olšák [Thu, 12 Dec 2019 22:01:39 +0000 (17:01 -0500)]
ac: fix the return value in cull_bbox when bbox culling is disabled

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>

4 years agoac: fix ac_get_i1_sgpr_mask for Wave32
Marek Olšák [Thu, 12 Dec 2019 22:00:51 +0000 (17:00 -0500)]
ac: fix ac_get_i1_sgpr_mask for Wave32

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>

4 years agopanfrost: Remove asserts in panfrost_pack_work_groups_compute
Alyssa Rosenzweig [Thu, 12 Dec 2019 16:30:20 +0000 (11:30 -0500)]
panfrost: Remove asserts in panfrost_pack_work_groups_compute

It's a hot routine and these are exceedingly unlikely to break.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>

4 years agopanfrost: Pack invocation_shifts manually instead of a bit field
Alyssa Rosenzweig [Thu, 12 Dec 2019 16:28:08 +0000 (11:28 -0500)]
panfrost: Pack invocation_shifts manually instead of a bit field

gcc generates exceptionally bad code for panfrost_pack_work_groups_fused
otherwise ... although that routine is somehow still hot ...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>

4 years agoanv: Export VK_KHR_buffer_device_address only when really supported
Iván Briano [Fri, 13 Dec 2019 00:09:00 +0000 (16:09 -0800)]
anv: Export VK_KHR_buffer_device_address only when really supported

Fixes: 1b6991ba1d8 ("anv: Implement VK_KHR_buffer_device_address")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>

4 years agoanv: Export filter_minmax support only when it's really supported
Iván Briano [Fri, 13 Dec 2019 00:07:19 +0000 (16:07 -0800)]
anv: Export filter_minmax support only when it's really supported

Fixes: bea4d4c78c3 ("anv: add VK_EXT_sampler_filter_minmax support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>

4 years agofreedreno/ir3: lower mul_2x32_64
Jonathan Marek [Sun, 15 Dec 2019 19:18:13 +0000 (14:18 -0500)]
freedreno/ir3: lower mul_2x32_64

lower_mul_2x32_64 generates mul_high opcodes, and lower_mul_high is done by
nir_lower_alu, so call nir_lower_alu after nir_opt_algebraic.
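
As a rough illustration of why the ordering matters, here is a minimal
sketch of how a 32x32->64 multiply can be rebuilt from a low multiply
plus a "mul high", and how the "mul high" itself can be rebuilt from
16-bit pieces. The helper names are hypothetical and this is plain C,
not the NIR passes themselves.

```c
#include <stdint.h>

/* Hypothetical helper: high 32 bits of an unsigned 32x32 multiply,
 * computed only with 32-bit operations on 16-bit halves. */
static uint32_t umul_high_sketch(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xffff, a_hi = a >> 16;
    uint32_t b_lo = b & 0xffff, b_hi = b >> 16;

    uint32_t lo_lo = a_lo * b_lo;
    uint32_t hi_lo = a_hi * b_lo;
    uint32_t lo_hi = a_lo * b_hi;
    uint32_t hi_hi = a_hi * b_hi;

    /* Sum of the middle partial products; this cannot overflow 32 bits. */
    uint32_t cross = (lo_lo >> 16) + (hi_lo & 0xffff) + lo_hi;

    return hi_hi + (hi_lo >> 16) + (cross >> 16);
}

/* Hypothetical helper: 64-bit product assembled from the low 32-bit
 * multiply and the high half computed above. */
static uint64_t umul_2x32_64_sketch(uint32_t a, uint32_t b)
{
    uint32_t lo = a * b; /* wraps to the low 32 bits */
    uint32_t hi = umul_high_sketch(a, b);

    return ((uint64_t)hi << 32) | lo;
}
```

If the second step (lowering the "mul high") ran before the first one
had produced it, the high multiply would be left behind unlowered.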

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agoturnip: implement CmdFillBuffer/CmdUpdateBuffer
Jonathan Marek [Mon, 16 Dec 2019 15:00:20 +0000 (10:00 -0500)]
turnip: implement CmdFillBuffer/CmdUpdateBuffer

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agoturnip: don't require src image to be set for clear blits
Jonathan Marek [Mon, 16 Dec 2019 14:59:48 +0000 (09:59 -0500)]
turnip: don't require src image to be set for clear blits

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agoturnip: use common blit path for buffer copy
Jonathan Marek [Mon, 16 Dec 2019 14:56:06 +0000 (09:56 -0500)]
turnip: use common blit path for buffer copy

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agoturnip: use single substream cs
Jonathan Marek [Mon, 16 Dec 2019 14:49:39 +0000 (09:49 -0500)]
turnip: use single substream cs

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agopanfrost: Remove fbd_type enum
Alyssa Rosenzweig [Mon, 16 Dec 2019 17:05:45 +0000 (12:05 -0500)]
panfrost: Remove fbd_type enum

Just use the MALI_MFBD tag directly; it's clean.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>

4 years agoci: Reinstate Panfrost CI
Alyssa Rosenzweig [Mon, 16 Dec 2019 17:48:56 +0000 (12:48 -0500)]
ci: Reinstate Panfrost CI

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>

4 years agopanfrost: Fix FBD issue
Alyssa Rosenzweig [Mon, 16 Dec 2019 16:46:32 +0000 (11:46 -0500)]
panfrost: Fix FBD issue

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: b0e915b4e65 ("panfrost: Emit SFBD/MFBD after a batch, instead of before")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>

4 years agovulkan/wsi: error out when image fence doesn't signal
Lionel Landwerlin [Thu, 12 Dec 2019 15:51:26 +0000 (17:51 +0200)]
vulkan/wsi: error out when image fence doesn't signal

If for some reason the fence associated with an image doesn't signal,
we're likely in a device lost scenario and should report that error.

We can't really wait for a given amount of time because we could get a
timeout and that is not a valid error to report for vkQueuePresentKHR,
so just wait forever.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/830
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years agoanv: drop unused parameter from apply layout pass
Lionel Landwerlin [Fri, 6 Dec 2019 11:00:58 +0000 (13:00 +0200)]
anv: drop unused parameter from apply layout pass

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agoanv: constify pipeline layout in nir passes
Lionel Landwerlin [Fri, 6 Dec 2019 10:57:10 +0000 (12:57 +0200)]
anv: constify pipeline layout in nir passes

I was hoping this would expose potential issues, but it found nothing.
It is still probably a good idea.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agopan/midgard: Set r1.w magic
Alyssa Rosenzweig [Tue, 3 Dec 2019 15:51:38 +0000 (10:51 -0500)]
pan/midgard: Set r1.w magic

I'm honestly unsure what this is for, but it's needed on MFBD systems
for unknown reasons, at least when MRT is actually in use and then
sometimes without MRT (it fixes a blend shader issue on T760?)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
4 years agopan/midgard: Fix liveness analysis with multiple epilogues
Alyssa Rosenzweig [Tue, 3 Dec 2019 15:37:01 +0000 (10:37 -0500)]
pan/midgard: Fix liveness analysis with multiple epilogues

Epilogues are special fixed-function blocks, so they need special
handling for liveness analysis to work completely. This in turn fixes
RA issues for many shaders using MRT.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
4 years agopan/midgard: Writeout per render target
Alyssa Rosenzweig [Sat, 23 Nov 2019 21:08:02 +0000 (16:08 -0500)]
pan/midgard: Writeout per render target

The flow is considerably more complicated. Instead of one writeout loop
like usual, we have a separate write loop for each render target. This
requires some scheduling shenanigans to get right.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
4 years agopan/midgard: Add schedule barrier after fragment writeout
Alyssa Rosenzweig [Sat, 23 Nov 2019 17:43:55 +0000 (12:43 -0500)]
pan/midgard: Add schedule barrier after fragment writeout

This is a branch, like discard, so we need a barrier to make it safe.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
4 years agopanfrost: Pass blend RT number through
Alyssa Rosenzweig [Sun, 24 Nov 2019 02:44:16 +0000 (21:44 -0500)]
panfrost: Pass blend RT number through

We have to key the blend shader for the render target number due to
writeout silliness.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
4 years agogallium: refuse to create buffers larger than UINT32_MAX
Pierre-Eric Pelloux-Prayer [Thu, 5 Dec 2019 09:07:52 +0000 (10:07 +0100)]
gallium: refuse to create buffers larger than UINT32_MAX

pipe_resource.width0 is 32 bits and hardware support for bigger buffers
is limited (e.g. AMD hardware doesn't support buffer shader resources
bigger than 4GB).
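
A minimal sketch of the kind of check this implies, assuming the
requested size arrives as a 64-bit value. The function name is
hypothetical and not taken from the gallium source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a buffer size is only representable in a
 * 32-bit width field if it does not exceed UINT32_MAX. */
static bool buffer_size_fits_width0(uint64_t size_in_bytes)
{
    return size_in_bytes <= UINT32_MAX;
}
```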

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2053
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948>

4 years agoradeonsi: disable dcc for 2x MSAA surface and bpe < 4
Pierre-Eric Pelloux-Prayer [Fri, 13 Dec 2019 16:38:27 +0000 (17:38 +0100)]
radeonsi: disable dcc for 2x MSAA surface and bpe < 4

This fixes a series of dEQP tests on Raven platforms:
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rgba4
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rgb5_a1
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rgb565
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rg8
  - dEQP-GLES3.functional.fbo.msaa.2_samples.r16f

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090>

4 years agov3d: expose OES_geometry_shader
Iago Toral Quiroga [Mon, 16 Sep 2019 12:30:03 +0000 (14:30 +0200)]
v3d: expose OES_geometry_shader

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: support precompiling geometry shaders
Iago Toral Quiroga [Mon, 11 Nov 2019 10:46:41 +0000 (11:46 +0100)]
v3d: support precompiling geometry shaders

At present, this is only relevant for shader-db.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: disable lowering of indirect inputs
Iago Toral Quiroga [Tue, 5 Nov 2019 11:25:35 +0000 (12:25 +0100)]
v3d: disable lowering of indirect inputs

V3D can do indirect inputs so we don't need it. Also, the lowering
produces horrible if-ladder code that is particularly bad for geometry
shaders where inputs are always arrays and shader bodies usually have
a loop indexing into them.

This fixes a couple of geometry shader tests in CTS that would fail to
register allocate otherwise.

There are no changes in shader-db.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: fix primitive queries for geometry shaders
Iago Toral Quiroga [Wed, 30 Oct 2019 13:19:30 +0000 (14:19 +0100)]
v3d: fix primitive queries for geometry shaders

With geometry shaders the number of emitted primitives is decided
at run time, so we cannot precompute it in the CPU and we need to
use the PRIMITIVE_COUNTS_FEEDBACK commands to have the GPU provide
the number like we do for the number of primitives written to
transform feedback. This may have a performance impact though, since
it requires a sync wait for the draw to complete, so we only do
it when geometry shaders are present.

v2: remove '> 0' comparison for pointer type (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: handle writes to gl_Layer from geometry shaders
Iago Toral Quiroga [Tue, 29 Oct 2019 09:12:28 +0000 (10:12 +0100)]
v3d: handle writes to gl_Layer from geometry shaders

When geometry shaders write a value to gl_Layer that doesn't correspond to
an existing layer in the target framebuffer, the rendering behavior is
undefined according to the spec. However, there are CTS tests that trigger
this scenario on purpose, probably to ensure that nothing terrible happens.

For V3D, this situation is problematic because the binner uses the layer
index to select the offset to write into the tile state data, and we only
allocate tile state for MAX2(num_layers, 1), so we want to make sure we
don't produce values that would lead to out-of-bounds writes. The simulator
has an assert to catch this. Although we haven't observed issues on actual
hardware, it is probably best to play it safe.
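
A minimal sketch of one way to keep the layer index inside the
allocated tile state. Falling back to layer 0 for out-of-range values
is an assumption made for illustration, and the function name is
hypothetical; the message above only says that out-of-bounds values
must be avoided.

```c
#include <stdint.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: tile state exists for MAX2(num_layers, 1)
 * layers, so any index at or beyond that is replaced with 0
 * (assumed fallback). */
static uint32_t sanitize_layer_index(uint32_t layer, uint32_t num_layers)
{
    uint32_t allocated_layers = MAX2(num_layers, 1);

    return layer < allocated_layers ? layer : 0;
}
```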

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: move layer rendering to a separate helper
Iago Toral Quiroga [Thu, 31 Oct 2019 09:46:58 +0000 (10:46 +0100)]
v3d: move layer rendering to a separate helper

This helps with reducing nesting level after adding the loop
to handle layered rendering.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: support rendering to multi-layered framebuffers
Iago Toral Quiroga [Tue, 29 Oct 2019 09:27:23 +0000 (10:27 +0100)]
v3d: support rendering to multi-layered framebuffers

When doing layered rendering the binning stage will prepare per-tile
lists for each layer in the framebuffer, so we need to make sure
we allocate enough space for them.

We also need to emit the NUMBER_OF_LAYERS packet. This is required
even when the number of layers is only 1, otherwise the simulator
detects buffer overflows in the tile_state BO during some CTS test
cases involving layered FBOs.

When rendering, we need to emit commands for each layer of the
framebuffer separately and make sure we address the correct layers for
each one.

v2: fixed typo in comment (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: do not limit new CL space allocations with branch to 4096 bytes
Iago Toral Quiroga [Wed, 30 Oct 2019 10:33:44 +0000 (11:33 +0100)]
v3d: do not limit new CL space allocations with branch to 4096 bytes

For layered rendering we need to emit per-layer rendering command
lists, so we can end up requiring a fairly large buffer for this
if the number of layers is large enough.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: remove obsolete assertion
Iago Toral Quiroga [Tue, 29 Oct 2019 08:32:05 +0000 (09:32 +0100)]
v3d: remove obsolete assertion

OES_geometry_shader introduced the concept of layered framebuffers.
Removing this assertion gets a bunch of CTS tests to pass. We will
also need layered images to implement layered rendering with geometry
shaders.

v2: fix typo in commit message (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: support transform feedback with geometry shaders
Iago Toral Quiroga [Tue, 15 Oct 2019 06:32:47 +0000 (08:32 +0200)]
v3d: support transform feedback with geometry shaders

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: save geometry shader state for blitting
Iago Toral Quiroga [Fri, 11 Oct 2019 10:01:32 +0000 (12:01 +0200)]
v3d: save geometry shader state for blitting

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: predicate geometry shader outputs inside non-uniform control flow
Iago Toral Quiroga [Fri, 11 Oct 2019 09:40:38 +0000 (11:40 +0200)]
v3d: predicate geometry shader outputs inside non-uniform control flow

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: don't try to render if shaders failed to compile
Iago Toral Quiroga [Wed, 9 Oct 2019 09:13:00 +0000 (11:13 +0200)]
v3d: don't try to render if shaders failed to compile

This is the same thing we do in the compute path to avoid crashes
at draw time.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: add support for adjacency primitives
Iago Toral Quiroga [Wed, 9 Oct 2019 08:26:16 +0000 (10:26 +0200)]
v3d: add support for adjacency primitives

v2: remove obsolete comment (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: we always have at least one output segment
Iago Toral Quiroga [Tue, 8 Oct 2019 11:48:00 +0000 (13:48 +0200)]
v3d: we always have at least one output segment

If we program an output size of 0 the simulator asserts. This was
not a problem until now because our VS would always have to
emit fixed function outputs. However, now that it can be paired
with a GS we can end up with a VS shader that no longer emits
any outputs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: compute appropriate VPM memory configuration for geometry shader workloads
Iago Toral Quiroga [Thu, 7 Nov 2019 14:40:45 +0000 (15:40 +0100)]
v3d: compute appropriate VPM memory configuration for geometry shader workloads

Geometry shaders can output many vertices and thus have higher VPM memory
pressure as a result. It is possible that too wide geometry shader dispatches
exceed the maximum available VPM output allocated, in which case we need
to reduce the dispatch width until we can fit the VPM memory requirements.
Supported dispatch widths for geometry shaders are 16, 8, 4, 1.
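
The width reduction described above can be sketched as follows. This
assumes, purely for illustration, that the VPM output requirement
scales linearly with the dispatch width; the real computation involves
more inputs. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: try the supported widths from widest to
 * narrowest and return the first one whose (assumed linear) VPM
 * output requirement fits. Returns 0 if even width 1 does not fit. */
static uint32_t pick_gs_dispatch_width(uint32_t sectors_per_lane,
                                       uint32_t max_output_sectors)
{
    static const uint32_t widths[] = { 16, 8, 4, 1 };

    for (size_t i = 0; i < sizeof(widths) / sizeof(widths[0]); i++) {
        uint64_t needed = (uint64_t)widths[i] * sectors_per_lane;

        if (needed <= max_output_sectors)
            return widths[i];
    }

    return 0;
}
```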

There is a limit on the number of VPM output sectors that can be used by a
geometry shader, which we can meet by lowering the dispatch width at compile
time. However, at draw time we need to revisit this number and, together with
other elements that can contribute to total VPM memory requirements, decide
on a configuration that can fit the program into the available VPM memory.
Ideally, we also want to aim for not using more than half of the available
memory so that we can run a pair of bin and render programs in parallel.

v2: fixed language in comment and typo in commit log. (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: add 1-way SIMD packing definition
Iago Toral Quiroga [Mon, 28 Oct 2019 13:23:51 +0000 (14:23 +0100)]
v3d: add 1-way SIMD packing definition

According to the documentation, the 1-way dispatch width is only supported
with geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: implement geometry shader instancing
Iago Toral Quiroga [Thu, 3 Oct 2019 15:08:47 +0000 (17:08 +0200)]
v3d: implement geometry shader instancing

v2:
 - Remove unused field uses_iid from v3d_gs_prog_data (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: emit geometry shader state commands
Iago Toral Quiroga [Tue, 24 Sep 2019 08:34:55 +0000 (10:34 +0200)]
v3d: emit geometry shader state commands

This is good enough to get basic GS workloads working; later patches will
improve this by adding instancing support, proper SIMD configuration, etc.

Notice that most of the TESSELLATION_GEOMETRY_SHADER_PARAMS fields are only
relevant when tessellation shaders are present. We do not support tessellation
yet, but we still need to fill in this tessellation state with default values
since our packing functions require some of these to have non-zero values.

v2:
 - Add a comment in the code explaining why we fill in
   tessellation fields (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: fix packet descriptions for geometry and tessellation shaders
Iago Toral Quiroga [Thu, 19 Sep 2019 11:36:30 +0000 (13:36 +0200)]
v3d: fix packet descriptions for geometry and tessellation shaders

Every code address starts at bit 3 (addresses must be 64-bit aligned),
with the first 3 bits used to specify threading and NaN propagation
parameters for the shader program.
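
A minimal sketch of that layout, assuming a 32-bit packed word and
treating the three low bits as an opaque flags value. The function
name and the flag encoding are hypothetical; only the "address starts
at bit 3" rule comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: an 8-byte-aligned code address has its low
 * 3 bits clear, so those bits can carry per-shader flags. */
static uint32_t pack_shader_code_address(uint32_t address, uint32_t flags)
{
    assert((address & 0x7) == 0); /* must be 64-bit aligned */
    assert(flags <= 0x7);         /* only 3 bits available */

    return address | flags;
}
```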

We generally skip "reserved" bits, however, doing this when the
reserved field is the last in a struct and it is large enough can
make us compute incorrect (smaller) struct sizes which can
lead to corrupt CLs. In particular, the "Tess/Geom Common Params"
struct has a reserved field at the end that is 8-bit, so if we
don't include this we compute a packet size that is 1 byte smaller
than it should be, making the next packet we emit start 1 byte
earlier and therefore leading to incorrect CL data from that point
forward.

The name of one of the fields was not correct.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: add initial compiler plumbing for geometry shaders
Iago Toral Quiroga [Mon, 28 Oct 2019 12:24:44 +0000 (13:24 +0100)]
v3d: add initial compiler plumbing for geometry shaders

Most of the relevant work happens in the v3d_nir_lower_io. Since
geometry shaders can write any number of output vertices, this pass
injects a few variables into the shader code to keep track of things
like the number of vertices emitted or the offsets into the VPM
of the current vertex output, etc. This is also where we handle
EmitVertex() and EmitPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure
with a 32-bit general header, then another 32-bit header slot for
each output vertex, and finally the actual vertex data.
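
The layout can be sketched as slot offsets, counted in 32-bit units.
This assumes the per-vertex headers are sized for the maximum number
of output vertices and that every vertex uses the same number of data
slots; both are assumptions, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Slot 0 holds the general header; vertex headers follow it. */
static uint32_t gs_vertex_header_slot(uint32_t vertex)
{
    return 1 + vertex;
}

/* Vertex data starts after the general header and all vertex headers. */
static uint32_t gs_vertex_data_slot(uint32_t vertex, uint32_t max_vertices,
                                    uint32_t slots_per_vertex)
{
    assert(vertex < max_vertices);

    return 1 + max_vertices + vertex * slots_per_vertex;
}
```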

When vertex shaders are paired with geometry shaders we also need
to consider the following:
  - Only geometry shaders emit fixed function outputs.
  - The coordinate shader used for the vertex stage during binning must
    not drop varyings other than those used by transform feedback, since
    these may be read by the binning GS.

v2:
 - Use MAX3 instead of a chain of MAX2 (Alejandro).
 - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
 - Update comment in IO lowering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: remove unused variable
Iago Toral Quiroga [Mon, 28 Oct 2019 09:29:15 +0000 (10:29 +0100)]
v3d: remove unused variable

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: enable debug options for geometry shader dumps
Iago Toral Quiroga [Tue, 24 Sep 2019 08:33:30 +0000 (10:33 +0200)]
v3d: enable debug options for geometry shader dumps

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: add debug assert
Iago Toral Quiroga [Mon, 23 Sep 2019 09:14:52 +0000 (11:14 +0200)]
v3d: add debug assert

While lowering vpm outputs we look for the NIR variables matching
particular store output instructions and we expect to find a match,
so assert on that.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agov3d: add missing plumbing for VPM load instructions
Iago Toral Quiroga [Tue, 24 Sep 2019 08:43:41 +0000 (10:43 +0200)]
v3d: add missing plumbing for VPM load instructions

We will need to use LDVPMG_IN specifically to read VPM inputs
in geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agoturnip: Lower usub_borrow.
Eric Anholt [Wed, 27 Nov 2019 18:43:54 +0000 (10:43 -0800)]
turnip: Lower usub_borrow.

Fixes dEQP-VK.glsl.builtin.function.integer.usubborrow.uvec2_mediump_fragment.
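
For reference, the borrow of an unsigned subtraction can be expressed
with a single comparison. A minimal sketch in plain C with
hypothetical names, not the NIR lowering itself:

```c
#include <stdint.h>

/* Borrow is 1 exactly when x - y would wrap below zero. */
static uint32_t usub_borrow_sketch(uint32_t x, uint32_t y)
{
    return x < y ? 1u : 0u;
}

/* The difference itself simply wraps modulo 2^32. */
static uint32_t usub_diff_sketch(uint32_t x, uint32_t y)
{
    return x - y;
}
```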

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>

4 years agointel/fs: Lower 64-bit MOVs after lower_load_payload()
Caio Marcelo de Oliveira Filho [Thu, 12 Dec 2019 21:25:33 +0000 (13:25 -0800)]
intel/fs: Lower 64-bit MOVs after lower_load_payload()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>

4 years agoamd/common: Always use addrlib for HTILE tc-compat.
Bas Nieuwenhuizen [Thu, 12 Dec 2019 11:10:58 +0000 (12:10 +0100)]
amd/common: Always use addrlib for HTILE tc-compat.

Even without depth+stencil addrlib can (correctly!) decide to
disable tc compatible HTILE.

One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.
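
The arithmetic behind this example can be sketched as follows,
assuming an 8x8 pixel micro tile (an assumption not stated above) and
hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes in one micro tile, assuming 8x8 pixels per tile. */
static uint32_t micro_tile_bytes(uint32_t bytes_per_pixel, uint32_t samples)
{
    return 8 * 8 * bytes_per_pixel * samples;
}

/* A tile that does not fit in one row has to be split. */
static bool needs_tile_split(uint32_t tile_bytes, uint32_t row_size)
{
    return tile_bytes > row_size;
}
```

With 32-bit depth (4 bytes) and 8 samples this gives 8 * 8 * 4 * 8 =
2048 bytes, which is larger than the 1024 byte row size.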

On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>

4 years agoamd/common: Fix tcCompatible degradation on Stoney.
Bas Nieuwenhuizen [Wed, 11 Dec 2019 15:04:58 +0000 (16:04 +0100)]
amd/common: Fix tcCompatible degradation on Stoney.

addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.

This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>

4 years agodocs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe...
Denis Pauk [Sat, 14 Dec 2019 07:58:25 +0000 (09:58 +0200)]
docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe, swr

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
CC: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Matt Turner <mattst88@gmail.com>
4 years agogallium/swr: Enable support bptc format.
Denis Pauk [Sat, 14 Dec 2019 07:54:48 +0000 (09:54 +0200)]
gallium/swr: Enable support bptc format.

Reuse Code from:
f69bc797e1 gallium/auxiliary: Add helper support for bptc format compress/decompress

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>
4 years agofreedreno/a6xx: fix OUT_REG() vs growable cmdstream
Rob Clark [Sat, 14 Dec 2019 17:09:08 +0000 (09:09 -0800)]
freedreno/a6xx: fix OUT_REG() vs growable cmdstream

BEGIN_RING() could decide we can't fit the next packet in the current
cmdstream segment, and grow a new segment.  So we need to grab ring->cur
*after* BEGIN_RING(), otherwise we are writing cmdstream past the end of
the previous segment.
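
A minimal sketch of the ordering issue, using a hypothetical growable
ring rather than the real freedreno structures or macros. The only
point it illustrates is that the write pointer must be read after the
reservation step, because that step may switch segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable command stream; not the freedreno API. */
struct ring_sketch {
    uint32_t *start;       /* first dword of the current segment */
    uint32_t *cur;         /* next dword to write */
    uint32_t *end;         /* one past the last dword of the segment */
    size_t segment_dwords; /* capacity of each segment */
    unsigned grow_count;   /* number of times a new segment was started */
};

static void ring_new_segment(struct ring_sketch *ring)
{
    ring->start = malloc(ring->segment_dwords * sizeof(uint32_t));
    assert(ring->start != NULL);
    ring->cur = ring->start;
    ring->end = ring->start + ring->segment_dwords;
}

static void ring_init(struct ring_sketch *ring, size_t segment_dwords)
{
    ring->segment_dwords = segment_dwords;
    ring->grow_count = 0;
    ring_new_segment(ring);
}

static void ring_fini(struct ring_sketch *ring)
{
    free(ring->start);
    ring->start = ring->cur = ring->end = NULL;
}

/* Stand-in for BEGIN_RING(): may move to a fresh segment. */
static void begin_ring_sketch(struct ring_sketch *ring, size_t ndwords)
{
    assert(ndwords <= ring->segment_dwords);

    if ((size_t)(ring->end - ring->cur) < ndwords) {
        /* A real cmdstream keeps the old segment for submission;
         * the sketch simply drops it. */
        free(ring->start);
        ring_new_segment(ring);
        ring->grow_count++;
    }
}

static void emit_packet_sketch(struct ring_sketch *ring,
                               const uint32_t *dwords, size_t ndwords)
{
    begin_ring_sketch(ring, ndwords);

    /* Read cur only now: a pointer taken before the call above could
     * point into the previous segment. */
    uint32_t *p = ring->cur;

    for (size_t i = 0; i < ndwords; i++)
        *p++ = dwords[i];

    ring->cur = p;
}
```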

Fixes: bdd98b892f3 ("freedreno: New struct packing macros")
Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agolima: split draw calls on 64k vertices
Erico Nunes [Sat, 9 Nov 2019 12:50:52 +0000 (13:50 +0100)]
lima: split draw calls on 64k vertices

The Mali400 only supports draws with up to 64k vertices per command.
To handle this, break the draw_vbo call into multiple commands.
Indexed drawing is left to a separate code path.
This implementation was ported from vc4_draw_vbo.
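
The splitting idea can be sketched for the simplest case of
independent primitives (for example triangle lists), where each chunk
must end on a primitive boundary. This is a simplified illustration
with hypothetical names; strips, fans and the exact hardware limit are
not modelled, and the limit is passed in as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Largest chunk that fits the limit and ends on a primitive boundary. */
static uint32_t next_draw_chunk(uint32_t remaining, uint32_t max_verts,
                                uint32_t verts_per_prim)
{
    assert(verts_per_prim > 0 && max_verts >= verts_per_prim);

    uint32_t chunk = remaining < max_verts ? remaining : max_verts;

    return chunk - (chunk % verts_per_prim);
}

/* Number of draw commands needed; trailing vertices that do not form
 * a complete primitive are ignored. */
static uint32_t count_draw_chunks(uint32_t count, uint32_t max_verts,
                                  uint32_t verts_per_prim)
{
    uint32_t start = 0;
    uint32_t chunks = 0;

    while (count - start >= verts_per_prim) {
        start += next_draw_chunk(count - start, max_verts, verts_per_prim);
        chunks++;
    }

    return chunks;
}
```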

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years agovc4: move the draw splitting routine to shared code
Erico Nunes [Tue, 12 Nov 2019 13:56:33 +0000 (14:56 +0100)]
vc4: move the draw splitting routine to shared code

This can also be useful for other hardware which has similar limitations
on vertex count per single draw.
The Mali400 has a similar limitation and can reuse this.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years agolima: refactor indexed draw indices upload
Erico Nunes [Sat, 9 Nov 2019 12:50:07 +0000 (13:50 +0100)]
lima: refactor indexed draw indices upload

As of this commit this is just a refactor in preparation to enable
support for more than 64k vertices.
To support splitting the draw_vbo call, indices shouldn't be re-uploaded
every time.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years agolima: allocate separate bo to store varyings
Erico Nunes [Wed, 23 Oct 2019 22:27:22 +0000 (00:27 +0200)]
lima: allocate separate bo to store varyings

The current strategy using the suballocator with fixed size doesn't
scale and causes some programs with large number of vertices (like some
glmark2 scenes) to crash.
Change it to dynamically allocate a separate bo to accommodate an
arbitrary number of vertices.
This also fixes the buffer read/write flags for gp.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years agogallium/util: add alignment parameter to util_upload_index_buffer
Erico Nunes [Sat, 7 Dec 2019 03:38:03 +0000 (04:38 +0100)]
gallium/util: add alignment parameter to util_upload_index_buffer

At least on Mali Utgard, index buffers need to be aligned on 0x40.
To avoid duplicating this, add an alignment parameter.
Keep the previous default for the other existing users.
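
What an alignment parameter of 0x40 means for an upload offset can be
sketched like this, assuming power-of-two alignments. The helper name
is hypothetical and this is not the util_upload_index_buffer code.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static uint32_t align_up_pot(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    return (value + alignment - 1) & ~(alignment - 1);
}
```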

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years agodrirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version
Kenneth Graunke [Fri, 13 Dec 2019 05:26:12 +0000 (21:26 -0800)]
drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version

This gets it running on i965 with Mesa master.  (The game won't start
without GL 3.3 compatibility, but uses 1.20 with GL_EXT_gpu_shader4
for shaders.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>

4 years agost/glsl_to_nir: fix SSO validation regression
Timothy Arceri [Fri, 13 Dec 2019 10:58:28 +0000 (21:58 +1100)]
st/glsl_to_nir: fix SSO validation regression

Fixes: b77907edb554 ("st/glsl_to_nir: use nir based program resource list builder")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2216
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agoci: Remove T760/T860 from CI temporarily
Alyssa Rosenzweig [Fri, 13 Dec 2019 22:13:14 +0000 (17:13 -0500)]
ci: Remove T760/T860 from CI temporarily

I feel really bad about this but this one test is flaking. I don't want
to do a mass revert (and bisection is extremely difficult with
nondeterministic/Heisenbugs), but it's Friday night and master needs to
pass. This commit should be reverted asap (once the flake is solved)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agoiris: Implement WA for push constants.
Rafael Antognolli [Tue, 19 Nov 2019 23:00:06 +0000 (15:00 -0800)]
iris: Implement WA for push constants.

v2: Apply WA to gen11+ instead of gen12+ (Jordan).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
4 years agolima/parser: Add texture descriptor parser
Andreas Baierl [Mon, 9 Dec 2019 11:42:30 +0000 (12:42 +0100)]
lima/parser: Add texture descriptor parser

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>

4 years agolima/parser: Add RSW parsing
Andreas Baierl [Fri, 6 Dec 2019 08:30:14 +0000 (09:30 +0100)]
lima/parser: Add RSW parsing

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>

4 years agolima/parser: Some fixes and cleanups
Andreas Baierl [Thu, 5 Dec 2019 16:39:01 +0000 (17:39 +0100)]
lima/parser: Some fixes and cleanups

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>

4 years agovulkan/overlay: Update docs.
Rafael Antognolli [Thu, 12 Dec 2019 21:55:11 +0000 (13:55 -0800)]
vulkan/overlay: Update docs.

Add a mention of the overlay control socket.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agovulkan/overlay: Add basic overlay control script.
Rafael Antognolli [Thu, 12 Dec 2019 14:45:51 +0000 (06:45 -0800)]
vulkan/overlay: Add basic overlay control script.

This can be used to start/stop statistics capturing from the command
line.

v3:
 - Install script (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agovulkan/overlay: Add a command to start capturing data to a file.
Rafael Antognolli [Fri, 6 Dec 2019 20:18:19 +0000 (12:18 -0800)]
vulkan/overlay: Add a command to start capturing data to a file.

By default, if an output_file is specified, the overlay layer will start
capturing data immediately. After this commit, when a control socket is
used, the capture starts disabled by default, and is only enabled when a
command ":capture=1;" is received.

When the capture is enabled, we might have already accumulated some
stats. To avoid capturing such noise, we discard and reset the fps and
stats, updating the display and capturing only data from that point on.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
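
The ":capture=1;" command quoted in the commit message above can be illustrated with a minimal sketch. It assumes that each command starts with ':' and ends with ';', which is inferred only from that one quoted example. The helper names format_command, parse_commands and demo are hypothetical, and a local socket pair stands in for the real control socket, whose name and address are not given in this log.

```python
from __future__ import annotations

import socket


def format_command(name: str, value: str) -> bytes:
    """Build a command in the ':name=value;' shape quoted in the commit message."""
    return f":{name}={value};".encode("ascii")


def parse_commands(data: bytes) -> list[tuple[str, str]]:
    """Split a received buffer into (name, value) pairs.

    Assumed framing: each command starts with ':' and ends with ';'.
    Chunks that do not contain ':' and '=' are ignored.
    """
    commands = []
    for chunk in data.decode("ascii", errors="replace").split(";"):
        start = chunk.find(":")
        if start == -1:
            continue
        name, sep, value = chunk[start + 1:].partition("=")
        if sep and name:
            commands.append((name, value))
    return commands


def demo() -> list[tuple[str, str]]:
    """Send one capture command across a socket pair and parse it on the other end."""
    # The pair is a stand-in: 'client' plays the control script,
    # 'layer' plays the overlay layer reading from its control socket.
    client, layer = socket.socketpair()
    try:
        client.sendall(format_command("capture", "1"))
        client.shutdown(socket.SHUT_WR)  # signal end of input to the reader
        received = b""
        while True:
            block = layer.recv(4096)
            if not block:
                break
            received += block
        return parse_commands(received)
    finally:
        client.close()
        layer.close()
```

Calling demo() sends a single capture command and returns the parsed pairs. A real client would connect to whatever socket the overlay layer is configured to listen on instead of using a socket pair.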
4 years agovulkan/overlay: Add support for a control socket.
Rafael Antognolli [Fri, 6 Dec 2019 21:44:19 +0000 (13:44 -0800)]
vulkan/overlay: Add support for a control socket.

Add support for a socket from which the overlay layer can receive
commands. This control socket can be useful to allow setting options
once the application is already running. For instance, triggering the
capture of fps data at a certain point.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agovulkan/overlay: Add a control socket.
Rafael Antognolli [Fri, 6 Dec 2019 22:38:07 +0000 (14:38 -0800)]
vulkan/overlay: Add a control socket.

v2: Use a socket instead of named pipe.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoutil/os_socket: Add socket related functions.
Rafael Antognolli [Wed, 11 Dec 2019 23:01:11 +0000 (15:01 -0800)]
util/os_socket: Add socket related functions.

v3:
 - Add os_socket.c/h into Makefile.sources (Lionel)
 - Add empty non-linux implementation to public functions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: drop unused #include
Eric Engestrom [Fri, 13 Dec 2019 17:16:17 +0000 (17:16 +0000)]
anv: drop unused #include

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agoutil/simple_mtx: don't set the canary when it can't be checked
Eric Engestrom [Fri, 13 Dec 2019 17:12:48 +0000 (17:12 +0000)]
util/simple_mtx: don't set the canary when it can't be checked

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agointel/compiler: replace `0` pointer with `NULL`
Eric Engestrom [Fri, 13 Dec 2019 17:01:39 +0000 (17:01 +0000)]
intel/compiler: replace `0` pointer with `NULL`

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agointel/compiler: add ASSERTED annotation to avoid "unused variable" warning
Eric Engestrom [Fri, 13 Dec 2019 17:01:17 +0000 (17:01 +0000)]
intel/compiler: add ASSERTED annotation to avoid "unused variable" warning

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agoiris: Alphabetize source files after iris_perf.c was added
Kenneth Graunke [Fri, 13 Dec 2019 19:03:03 +0000 (11:03 -0800)]
iris: Alphabetize source files after iris_perf.c was added

4 years agofreedreno/ir3: add iterator macros
Rob Clark [Thu, 12 Dec 2019 23:30:49 +0000 (15:30 -0800)]
freedreno/ir3: add iterator macros

So many open-coded list iterators were getting annoying.

Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/ir3: add scheduler traces
Rob Clark [Fri, 22 Nov 2019 19:13:19 +0000 (11:13 -0800)]
freedreno/ir3: add scheduler traces

Add some infrastructure to trace scheduler decisions.  The next patch
will add some more traces, just splitting this out to reduce clutter.

Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/ir3: add last-baryf shaderdb stat
Rob Clark [Wed, 11 Dec 2019 23:52:32 +0000 (15:52 -0800)]
freedreno/ir3: add last-baryf shaderdb stat

Sometimes scheduler changes that are a win in terms of instruction count
and/or register pressure are worse in real life, because they keep varying
storage locked for too long.  Add a shader-db stat to give this more
visibility.

Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agonir/opt_peephole_select: remove unused variables
Alejandro Piñeiro [Fri, 13 Dec 2019 13:57:37 +0000 (14:57 +0100)]
nir/opt_peephole_select: remove unused variables

To avoid "unused variable" warnings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
4 years agopanfrost: Report GPU name in es2_info
Alyssa Rosenzweig [Mon, 9 Dec 2019 21:02:17 +0000 (16:02 -0500)]
panfrost: Report GPU name in es2_info

We can prettify the ID.

Closes #2093

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Add panfrost_model_name helper
Alyssa Rosenzweig [Mon, 9 Dec 2019 21:02:03 +0000 (16:02 -0500)]
panfrost: Add panfrost_model_name helper

This gives us a string representation of a GPU ID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Move property queries to _encoder
Alyssa Rosenzweig [Mon, 9 Dec 2019 20:54:09 +0000 (15:54 -0500)]
panfrost: Move property queries to _encoder

We'll want these in non-Gallium devices.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Move nir_undef_to_zero to Midgard compiler
Alyssa Rosenzweig [Mon, 9 Dec 2019 20:41:52 +0000 (15:41 -0500)]
panfrost: Move nir_undef_to_zero to Midgard compiler

Nothing Gallium about it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopandecode: Add cast
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:22:06 +0000 (10:22 -0500)]
pandecode: Add cast

Fixes minor coverity warning about the format specifier.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Pass size to panfrost_batch_get_scratchpad
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:02:15 +0000 (11:02 -0500)]
panfrost: Pass size to panfrost_batch_get_scratchpad

We'll compute the size with the new scratchpad helpers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Calculate maximum stack_size per batch
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:18:47 +0000 (11:18 -0500)]
panfrost: Calculate maximum stack_size per batch

We'll need this so we can allocate a stack for the batch large enough
for all the jobs within it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
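
The per-batch maximum described in the commit above amounts to keeping a running maximum as jobs are added. The following is a rough sketch of that idea only, not the driver's code (which is C). The Batch class, its stack_size field, add_job and batch_stack_size are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    # Largest per-job stack size seen so far in this batch (hypothetical field).
    stack_size: int = 0

    def add_job(self, job_stack_size: int) -> None:
        # Keep a running maximum so a single allocation can serve every job.
        self.stack_size = max(self.stack_size, job_stack_size)


def batch_stack_size(job_stack_sizes) -> int:
    """Return the stack size a batch needs to cover all of the given jobs."""
    batch = Batch()
    for size in job_stack_sizes:
        batch.add_job(size)
    return batch.stack_size
```

Because the maximum is only known once every job has been added, any allocation sized from it has to happen after the batch is complete.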
4 years agopan/midgard: Handle misc. cppcheck warnings
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:13:24 +0000 (10:13 -0500)]
pan/midgard: Handle misc. cppcheck warnings

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Remove unused ld/st packing helpers
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:09:16 +0000 (10:09 -0500)]
pan/midgard: Remove unused ld/st packing helpers

Identified by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Handle minor cppcheck issues
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:07:44 +0000 (10:07 -0500)]
panfrost: Handle minor cppcheck issues

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Emit SFBD/MFBD after a batch, instead of before
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:00:42 +0000 (11:00 -0500)]
panfrost: Emit SFBD/MFBD after a batch, instead of before

The size of the scratchpad (as well as some tiler details) depends on the
contents of the batch, so we need to defer filling out the FBD until
after all draws are queued.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Route stack_size from compiler
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:18:21 +0000 (11:18 -0500)]
panfrost: Route stack_size from compiler

We'll need it in pan_context.c

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agoetnaviv: add missing vs_needs_z_div handling to NIR backend
Jonathan Marek [Sun, 8 Dec 2019 23:16:34 +0000 (18:16 -0500)]
etnaviv: add missing vs_needs_z_div handling to NIR backend

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: add missing formats
Jonathan Marek [Sun, 8 Dec 2019 16:52:32 +0000 (11:52 -0500)]
etnaviv: add missing formats

Add missing texture/render formats supported by hardware.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: remove swizzle from format table
Jonathan Marek [Sun, 8 Dec 2019 16:20:46 +0000 (11:20 -0500)]
etnaviv: remove swizzle from format table

The only format that needs swizzle is R8 emulated with L8, so we can get
rid of the SWIZ(X, Y, Z, W) everywhere.

Note: R8G8 also had a swizzle, but it wasn't necessary.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: disable integer vertex formats on pre-HALTI2 hardware
Jonathan Marek [Sun, 8 Dec 2019 16:54:31 +0000 (11:54 -0500)]
etnaviv: disable integer vertex formats on pre-HALTI2 hardware

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: update INT_FILTER choice for GLES3 formats
Jonathan Marek [Sun, 20 Oct 2019 06:10:43 +0000 (02:10 -0400)]
etnaviv: update INT_FILTER choice for GLES3 formats

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>