Jonathan Marek [Tue, 17 Dec 2019 23:15:45 +0000 (18:15 -0500)]
turnip: don't set SP_FS_CTRL_REG0_VARYING if only fragcoord is used
Fixes artifacts in the subpasses demo, which has a shader using fragcoord
without any varyings. It looks like setting this bit when there are no
varyings can cause weirdness in some cases (without this change, if the
previous shader had <= 8 varyings it would work, but with 9 varyings it
would have artifacts).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
Jonathan Marek [Tue, 17 Dec 2019 22:59:45 +0000 (17:59 -0500)]
turnip: add cache invalidate to fix input attachment cases
Fixes artifacts in the subpasses demo.
Workaround texture cache with input attachments from GMEM by adding a cache
invalidate between subpasses.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
Lionel Landwerlin [Wed, 18 Dec 2019 15:48:26 +0000 (17:48 +0200)]
loader: fix close on uninitialized file descriptor value
Using a drm syscall layer faking a kernel driver :
==581460== Conditional jump or move depends on uninitialised value(s)
==581460== by 0x48A4C2B: close (drm-hooks.cpp:185)
==581460== by 0x5A815F1: dri3_alloc_render_buffer (loader_dri3_helper.c:1469)
==581460== by 0x5A82050: dri3_get_buffer (loader_dri3_helper.c:1827)
==581460== by 0x5A82662: loader_dri3_get_buffers (loader_dri3_helper.c:2028)
==581460== by 0x6C78109: intel_update_image_buffers (brw_context.c:1870)
==581460== by 0x6C77805: intel_update_renderbuffers (brw_context.c:1499)
==581460== by 0x6C7789D: intel_prepare_render (brw_context.c:1520)
==581460== by 0x6C773D4: intelMakeCurrent (brw_context.c:1341)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 069fdd5f9fac ("egl/x11: Support DRI3 v1.1")
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152>
Connor Abbott [Tue, 17 Dec 2019 12:02:56 +0000 (13:02 +0100)]
freedreno: Fix CP_MEM_TO_REG flag definitions
These actually mean something completely different, at least on A5xx
and A6xx. The only other usage of the old flags on something older than
A6xx was a typo, so I don't know if it was always this way, but at the
same time it means that we don't have to worry too much about that.
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
Connor Abbott [Mon, 16 Dec 2019 16:45:02 +0000 (17:45 +0100)]
freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTE
Similar to the existing usage for CP_COND_WRITE5, this makes it clear
what each of the magic parameters are for.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
Connor Abbott [Mon, 16 Dec 2019 16:17:38 +0000 (17:17 +0100)]
a6xx: Add more CP packets
And add fields uncovered by looking at the firmware. I think this covers
all the memory, register, and scratch manipulation opcodes that exist on
A6xx, plus one additional nice find for Vulkan and describing a
previously unknown opcode and documenting CP_WAIT_REG_MEM.
Note that the bits for the CP_REG_TO_MEM count, as well as the formula
for computing the actual count for both CP_REG_TO_MEM and CP_MEM_TO_REG,
are changed because the A630 SQE firmware actually does something
different. I haven't investigated older microcodes to see whether this
extends back to A5xx and A4xx, but the only non-A6xx uses of this
field result in the same bit-pattern when using the A6xx bit range and
formula, so it should be safe to change the definition universally.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
Bas Nieuwenhuizen [Wed, 18 Dec 2019 09:21:40 +0000 (10:21 +0100)]
radv: Limit workgroup size to 1024.
Fixes a hang with geekbench.
The existence of RX 580 and NAVI10 results shows that the generations
before and after this do not have the issue. (They show up on the
website). So this is likely a GFX9 only issue.
This is not something weird like LDS size since none of the shaders
seem to use LDS.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
Dylan Baker [Wed, 18 Dec 2019 19:25:32 +0000 (11:25 -0800)]
docs: Add release notes, news, and update calendar for 19.2.8
Dylan Baker [Wed, 18 Dec 2019 19:22:52 +0000 (11:22 -0800)]
docs/relnotes/19.2.8: Add SHA256 sum
Dylan Baker [Wed, 18 Dec 2019 19:01:53 +0000 (11:01 -0800)]
docs: add relnotes for 19.2.8
Dylan Baker [Wed, 18 Dec 2019 18:58:54 +0000 (10:58 -0800)]
docs: Add release notes, update calendar, and add news for 19.3.1
Dylan Baker [Wed, 18 Dec 2019 18:56:33 +0000 (10:56 -0800)]
dcos: add releanse notes for 19.3.1
Lionel Landwerlin [Mon, 16 Dec 2019 14:11:40 +0000 (16:11 +0200)]
i965/iris/perf: factor out frequency register capture
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
Jonathan Marek [Tue, 17 Dec 2019 21:03:07 +0000 (16:03 -0500)]
freedreno/ir3: update prefetch input_offset when packing inlocs
If the input location changes then prefetch input_offset needs to change.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141>
Eric Anholt [Tue, 17 Dec 2019 19:19:07 +0000 (11:19 -0800)]
ci: Fix caselist results archiving after parallel-deqp-runner rename.
Noticed while reviewing some lava parallel-deqp-runner changes.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3138>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3138>
Kristian H. Kristensen [Mon, 16 Dec 2019 20:59:16 +0000 (12:59 -0800)]
freedreno/a6xx: Document the CP_SET_DRAW_STATE enable bits
There are bits for binning, gmem and sysmem.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>
Caio Marcelo de Oliveira Filho [Mon, 16 Dec 2019 23:12:14 +0000 (15:12 -0800)]
anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT)
For the sake of our testing infrastructure, disable this extension
for TGL until we can sort out a hang in Vulkan CTS.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Caio Marcelo de Oliveira Filho [Mon, 16 Dec 2019 22:43:53 +0000 (14:43 -0800)]
intel/vec4: Fix lowering of multiplication by 16-bit constant
Existing code was ignoring whether the type of the immediate source
was signed or not. If the source was signed, it would ignore small
negative values but it also would wrongly accept values between
INT16_MAX and UINT16_MAX, causing the atual value to later be
reinterpreted as a negative number (under 16-bits).
Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for older
platforms that don't support MUL with 32x32 types and use vec4.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Mon, 16 Dec 2019 21:37:41 +0000 (13:37 -0800)]
intel/fs: Fix lowering of dword multiplication by 16-bit constant
Existing code was ignoring whether the type of the immediate source
was signed or not. If the source was signed, it would ignore small
negative values but it also would wrongly accept values between
INT16_MAX and UINT16_MAX, causing the atual value to later be
reinterpreted as a negative number (under 16-bits).
Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for platforms
that don't support MUL with 32x32 types, including ICL and TGL.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2186
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alyssa Rosenzweig [Mon, 16 Dec 2019 22:14:04 +0000 (17:14 -0500)]
pan/midgard: Set Z to shadow comparator for 2D
We still need to generalize for other types of (non-2D / array) shadow
samplers, but this is enough for sampler2DShadow to work with initial
dEQP tests passing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Alyssa Rosenzweig [Mon, 16 Dec 2019 22:13:46 +0000 (17:13 -0500)]
pan/midgard: Set .shadow for shadow samplers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Alyssa Rosenzweig [Mon, 16 Dec 2019 22:02:36 +0000 (17:02 -0500)]
pan/midgard: Hoist temporary coordinate for cubemaps
We'll reuse some of this code for shadow samplers, which are represented
by a distinct source in NIR.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Alyssa Rosenzweig [Mon, 16 Dec 2019 21:53:52 +0000 (16:53 -0500)]
pan/midgard: Use a reg temporary for mutiple writes
Bug in texelfetch implementation from inspection.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Alyssa Rosenzweig [Mon, 16 Dec 2019 21:45:28 +0000 (16:45 -0500)]
panfrost: Handle empty shaders
I didn't realize this was in spec, but it fixes a crash in shaderdb.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Alyssa Rosenzweig [Mon, 16 Dec 2019 23:05:21 +0000 (18:05 -0500)]
panfrost: Let precompile imply shaderdb
This cuts down the number of random environmental variables we need
flying around; now PAN_MESA_DEBUG=precompile is sufficient and
MIDGARD_MESA_DEBUG=shaderdb will be implied.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Alyssa Rosenzweig [Fri, 13 Dec 2019 20:13:02 +0000 (15:13 -0500)]
panfrost: Add PAN_MESA_DEBUG=precompile for shader-db
We would like to use run.c for shader-db runs (rather than capturing in
real-time, which is limiting).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Lionel Landwerlin [Mon, 16 Dec 2019 15:58:41 +0000 (17:58 +0200)]
mesa: avoid triggering assert in implementation
When tearing down a GL context with an active performance query, the
implementation can be confused by a query marked active when it's
being deleted.
This shouldn't happen in the implementation because the context will
already be idle.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2235
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
Samuel Pitoiset [Tue, 17 Dec 2019 09:01:50 +0000 (10:01 +0100)]
radv/gfx10: fix ngg_get_ordered_id
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
Neil Armstrong [Tue, 17 Dec 2019 10:34:47 +0000 (11:34 +0100)]
ci: Remove T820 from CI temporarily
Our lab will have continuous programmed power cuts until the 6th January 2020,
so it's safer to disable the T820 CI running on the BayLibre kernelCI lab
to avoid breaking CI.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3135>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3135>
Tapani Pälli [Fri, 15 Nov 2019 07:18:10 +0000 (09:18 +0200)]
i965: expose MESA_FORMAT_B8G8R8X8_SRGB visual
Patch adds BGRX sRGB visuals, required format translation information
to the __DRI_IMAGE_FOURCC_SXRGB8888 format and makes all BGRX visuals
sRGB capable just like is done with BGRA.
squashed patches from Yevhenii Kolesnikov:
dri: Add __DRI_IMAGE_FOURCC_SXRGB8888 conversion
i965: force visuals without alpha bits to use sRGB
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1501
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
Tapani Pälli [Fri, 15 Nov 2019 07:12:15 +0000 (09:12 +0200)]
dri: add __DRI_IMAGE_FORMAT_SXRGB8
Add format definition and required plumbing to create images.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
Gert Wollny [Mon, 16 Dec 2019 20:48:09 +0000 (21:48 +0100)]
virgl: Increase the shader transfer buffer by doubling the size
With only linearly increasing the size of the shader transfer buffer
the transfer of very large shaders may fail, so with each attempt double
the size of the buffer.
CTS:
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.48
for VTK-GL-CTS
b5dcfb9c5 and newer
virglrenderer bug:
https://gitlab.freedesktop.org/virgl/virglrenderer/issues/150
Fixes: a8987b88ff1db4ac00720a9b56c4bc3aeb666537
virgl: add driver for virtio-gpu 3D (v2)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
Eric Anholt [Mon, 16 Dec 2019 23:41:16 +0000 (15:41 -0800)]
turnip: Fix support for immutable samplers.
We were setting up the hardware sampler state when updating a combined
image sampler, but never looking at the immutable sampler for in the
separate case.
Fixes failures in
dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_immutable.fragment.*
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127>
Jonathan Marek [Mon, 16 Dec 2019 16:06:36 +0000 (11:06 -0500)]
turnip: don't set LRZ enable at end of renderpass
Fixes hanging with cases that use more than one renderpass.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122>
Jonathan Marek [Sun, 15 Dec 2019 18:43:39 +0000 (13:43 -0500)]
freedreno/ir3: lower pack/unpack ops
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>
Jonathan Marek [Sun, 15 Dec 2019 18:43:07 +0000 (13:43 -0500)]
nir: add option to lower half packing opcodes
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>
Eric Anholt [Sat, 14 Dec 2019 06:05:11 +0000 (22:05 -0800)]
turnip: Add support for descriptor arrays.
I had a bigger rework I was working on, but this is simple and gets tests
passing.
Fixes 36 failures in
dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_mutable.fragment.*
(now all passing)
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>
Eric Anholt [Sat, 14 Dec 2019 00:52:47 +0000 (16:52 -0800)]
turnip: Drop unused variable.
We really need -Werror in CI.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>
Alyssa Rosenzweig [Fri, 13 Dec 2019 19:12:48 +0000 (14:12 -0500)]
panfrost: Don't double-create scratchpad
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 4f7fddbd716 ("panfrost: Pass size to panfrost_batch_get_scratchpad")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>
Alyssa Rosenzweig [Fri, 13 Dec 2019 17:41:54 +0000 (12:41 -0500)]
panfrost: Simplify sampler upload condition
Makes it more obvious what's going on.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>
Icecream95 [Wed, 11 Dec 2019 08:08:41 +0000 (21:08 +1300)]
gallium/auxiliary: Handle count == 0 in u_vbuf_get_minmax_index_mapped
This makes u_vbuf_get_minmax_index_mapped return min = 0 / max = 0
when info->count == 0.
That should never happen anyway, but this commit makes it at least
return a sane value that callers expect, and also allows us - and
GCC - to assume count != 0 for optimization purposes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
Icecream95 [Wed, 11 Dec 2019 01:22:19 +0000 (14:22 +1300)]
gallium/auxiliary: Reduce conversions in u_vbuf_get_minmax_index_mapped
With this patch, GCC generates vectorized code that does the comparisons
without converting the indices to 32-bit first.
This optimization makes the aforementioned function almost twice as fast
for ARM NEON, and should speed up vectorised code on other platforms.
Without vectorisation, the function is still a percent or two faster,
but slightly larger.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
Marek Olšák [Wed, 4 Sep 2019 02:38:38 +0000 (22:38 -0400)]
amd/addrlib: update to the latest version
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Jonathan Marek [Sun, 15 Dec 2019 17:00:21 +0000 (12:00 -0500)]
turnip: remove duplicate A6XX_SP_CS_CONFIG_NIBO
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
Jonathan Marek [Sun, 15 Dec 2019 16:57:40 +0000 (11:57 -0500)]
turnip: change emit_ibo to be like emit_textures
Adds missing alignment and error checking.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
Jonathan Marek [Sun, 15 Dec 2019 15:51:39 +0000 (10:51 -0500)]
turnip: fix emit_ibo
Based on the GL driver:
-Compute needs different opcode (this fixes a GPU hang problem)
-REG_A6XX_SP_IBO_LO/REG_A6XX_SP_CS_IBO_LO were swapped
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
Jonathan Marek [Sun, 15 Dec 2019 15:46:03 +0000 (10:46 -0500)]
turnip: remove compute emit_border_color
Current tu6_emit_border_color doesn't work for compute and there's no
example from the GL driver to base it on, so replace it with a finishme.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
Jonathan Marek [Sun, 15 Dec 2019 15:42:23 +0000 (10:42 -0500)]
turnip: fix emit_textures for compute shaders
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
Rafael Antognolli [Mon, 16 Dec 2019 19:07:42 +0000 (11:07 -0800)]
utils/os_socket: Define ssize_t on windows.
Fixes: ef5266ebd50 ("util/os_socket: Add socket related functions.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Sat, 14 Dec 2019 05:56:50 +0000 (00:56 -0500)]
radeonsi/gfx10: fix ngg_get_ordered_id
This could have caused issues with NGG streamout.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Thu, 12 Dec 2019 22:13:23 +0000 (17:13 -0500)]
radeonsi: reset more fields in si_llvm_context_set_ir to fix reusing ctx
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Fri, 13 Dec 2019 02:02:13 +0000 (21:02 -0500)]
radeonsi: fix determining whether the VS prolog is needed
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Fri, 13 Dec 2019 02:09:50 +0000 (21:09 -0500)]
radeonsi: allow generating VS prologs with 0 inputs
If "ls_vgpr_fix" is set, we use a prolog, but it can have 0 inputs.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Fri, 13 Dec 2019 01:17:41 +0000 (20:17 -0500)]
radeonsi/gfx10: don't insert NGG streamout atomics if they are never used
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Thu, 12 Dec 2019 22:11:40 +0000 (17:11 -0500)]
radeonsi: don't wrap the VS prolog in if (ES thread) .. endif
We can execute it unconditionally and the values computed for disabled
threads won't be used anyway.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Fri, 6 Dec 2019 02:29:32 +0000 (21:29 -0500)]
radeonsi: set is_monolithic for VS prologs when the shader is really monolithic
This fixes a bug with NGG that is probably harmless.
Basically, !is_monolithic makes the VS prolog emit
llvm.amdgcn.init.exec.from.input, which sets the EXEC mask to only enable
ES threads. In the NGG non-GS case, the GS threads <= ES threads, so it was
never an issue.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Fri, 6 Dec 2019 00:37:59 +0000 (19:37 -0500)]
radeonsi: disallow compute-based culling if polygon mode is enabled
Polygon mode can generate thick points or lines.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Thu, 5 Dec 2019 01:27:46 +0000 (20:27 -0500)]
radeonsi: deduplicate ES and GS thread enablement code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Thu, 12 Dec 2019 22:01:39 +0000 (17:01 -0500)]
ac: fix the return value in cull_bbox when bbox culling is disabled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Marek Olšák [Thu, 12 Dec 2019 22:00:51 +0000 (17:00 -0500)]
ac: fix ac_get_i1_sgpr_mask for Wave32
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Alyssa Rosenzweig [Thu, 12 Dec 2019 16:30:20 +0000 (11:30 -0500)]
panfrost: Remove asserts in panfrost_pack_work_groups_compute
It's a hot routine and these are exceedingly unlikely to break.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>
Alyssa Rosenzweig [Thu, 12 Dec 2019 16:28:08 +0000 (11:28 -0500)]
panfrost: Pack invocation_shifts manually instead of a bit field
gcc generates exceptionally bad code for panfrost_pack_work_groups_fused
otherwise ... although that routine is somehow still hot ...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>
Iván Briano [Fri, 13 Dec 2019 00:09:00 +0000 (16:09 -0800)]
anv: Export VK_KHR_buffer_device_address only when really supported
Fixes: 1b6991ba1d8 ("anv: Implement VK_KHR_buffer_device_address")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
Iván Briano [Fri, 13 Dec 2019 00:07:19 +0000 (16:07 -0800)]
anv: Export filter_minmax support only when it's really supported
Fixes: bea4d4c78c3 ("anv: add VK_EXT_sampler_filter_minmax support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
Jonathan Marek [Sun, 15 Dec 2019 19:18:13 +0000 (14:18 -0500)]
freedreno/ir3: lower mul_2x32_64
lower_mul_2x32_64 generates mul_high opcodes, and lower_mul_high is done by
nir_lower_alu, so call nir_lower_alu after nir_opt_algebraic.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Mon, 16 Dec 2019 15:00:20 +0000 (10:00 -0500)]
turnip: implement CmdFillBuffer/CmdUpdateBuffer
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Mon, 16 Dec 2019 14:59:48 +0000 (09:59 -0500)]
turnip: don't require src image to be set for clear blits
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Mon, 16 Dec 2019 14:56:06 +0000 (09:56 -0500)]
turnip: use common blit path for buffer copy
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Mon, 16 Dec 2019 14:49:39 +0000 (09:49 -0500)]
turnip: use single substream cs
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Alyssa Rosenzweig [Mon, 16 Dec 2019 17:05:45 +0000 (12:05 -0500)]
panfrost: Remove fbd_type enum
Just use the MALI_MFBD tag directly; it's clean.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
Alyssa Rosenzweig [Mon, 16 Dec 2019 17:48:56 +0000 (12:48 -0500)]
ci: Reinstate Panfrost CI
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
Alyssa Rosenzweig [Mon, 16 Dec 2019 16:46:32 +0000 (11:46 -0500)]
panfrost: Fix FBD issue
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: b0e915b4e65 ("panfrost: Emit SFBD/MFBD after a batch, instead of before")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
Lionel Landwerlin [Thu, 12 Dec 2019 15:51:26 +0000 (17:51 +0200)]
vulkan/wsi: error out when image fence doesn't signal
If for some reason the fence associated with an image doesn't signal,
we're likely in a device lost scenario, we should report that error.
We can't really wait for a given amount of time because we could get a
timeout and that is not a valid error to report for vkQueuePresentKHR,
so just wait forever.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/830
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Fri, 6 Dec 2019 11:00:58 +0000 (13:00 +0200)]
anv: drop unused parameter from apply layout pass
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Fri, 6 Dec 2019 10:57:10 +0000 (12:57 +0200)]
anv: constify pipeline layout in nir passes
Was hoping to find potential issues but nothing. Still probably a good
idea.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alyssa Rosenzweig [Tue, 3 Dec 2019 15:51:38 +0000 (10:51 -0500)]
pan/midgard: Set r1.w magic
I'm honestly unsure what this is for, but it's needed on MFBD systems
for unknown reasons, at least when MRT is actually in use and then
sometimes without MRT (it fixes a blend shader issue on T760?)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
Alyssa Rosenzweig [Tue, 3 Dec 2019 15:37:01 +0000 (10:37 -0500)]
pan/midgard: Fix liveness analysis with multiple epilogues
Epilogues are special fixed-function blocks, so they need special
handling for liveness analysis to work completely. This in turns fixes
RA issues for many shaders using MRT.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
Alyssa Rosenzweig [Sat, 23 Nov 2019 21:08:02 +0000 (16:08 -0500)]
pan/midgard: Writeout per render target
The flow is considerably more complicated. Instead of one writeout loop
like usual, we have a separate write loop for each render target. This
requires some scheduling shenanigans to get right.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
Alyssa Rosenzweig [Sat, 23 Nov 2019 17:43:55 +0000 (12:43 -0500)]
pan/midgard: Add schedule barrier after fragment writeout
This is a branch, like discard, so we need a barrier to make it safe.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
Alyssa Rosenzweig [Sun, 24 Nov 2019 02:44:16 +0000 (21:44 -0500)]
panfrost: Pass blend RT number through
We have to key the blend shader for the render target number due to
writeout silliness.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
Pierre-Eric Pelloux-Prayer [Thu, 5 Dec 2019 09:07:52 +0000 (10:07 +0100)]
gallium: refuse to create buffers larger than UINT32_MAX
pipe_resource.width0 is 32 bits and hardware support for bigger buffer is
limited (eg: AMD hardware doesn't support buffer shader resources bigger
than 4GB).
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2053
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948>
Pierre-Eric Pelloux-Prayer [Fri, 13 Dec 2019 16:38:27 +0000 (17:38 +0100)]
radeonsi: disable dcc for 2x MSAA surface and bpe < 4
This fixes a series of dEQP tests on Raven platforms:
- dEQP-GLES3.functional.fbo.msaa.2_samples.rgba4
- dEQP-GLES3.functional.fbo.msaa.2_samples.rgb5_a1
- dEQP-GLES3.functional.fbo.msaa.2_samples.rgb565
- dEQP-GLES3.functional.fbo.msaa.2_samples.rg8
- dEQP-GLES3.functional.fbo.msaa.2_samples.r16f
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090>
Iago Toral Quiroga [Mon, 16 Sep 2019 12:30:03 +0000 (14:30 +0200)]
v3d: expose OES_geometry_shader
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Mon, 11 Nov 2019 10:46:41 +0000 (11:46 +0100)]
v3d: support precompiling geometry shaders
At present, this is only relevant for shader-db.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Tue, 5 Nov 2019 11:25:35 +0000 (12:25 +0100)]
v3d: disable lowering of indirect inputs
V3D can do indirect inputs so we don't need it. Also, the lowering
produces horrible if-ladder code that is particularly bad for geometry
shaders where inputs are always arrays and shader bodies usually have
a loop indexing into them.
This fixes a couple of geometry shader tests in CTS that would fail to
register allocate otherwise.
There are no changes in shader-db.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Wed, 30 Oct 2019 13:19:30 +0000 (14:19 +0100)]
v3d: fix primitive queries for geometry shaders
With geometry shaders the number of emitted primitived is decided
at run time, so we cannot precompute it in the CPU and we need to
use the PRIMITIVE_COUNTS_FEEDBACK commands to have the GPU provide
the number like we do for the number of primitives written to
transform feedback. This may have a performance impact though, since
it requires a sync wait for the draw to complete, so we only do
it when geometry shaders are present.
v2: remove '> 0' comparison for ponter type (Alejandro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Tue, 29 Oct 2019 09:12:28 +0000 (10:12 +0100)]
v3d: handle writes to gl_Layer from geometry shaders
When geometry shaders write a value to gl_Layer that doesn't correspond to
an existing layer in the target framebuffer the rendering behavior is
undefined according to the spec, however, there are CTS tests that trigger
this scenario on purpose, probably to ensure that nothing terrible happens.
For V3D, this situation is problematic because the binner uses the layer
index to select the offset to write into the tile state data, and we only
allocate tile state for MAX2(num_layers, 1), so we want to make sure we
don't produce values that would lead to out of bounds writes. The simulator
has an assert to catch this, although we haven't observed issues in actual
hardware it is probably best to play safe.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Thu, 31 Oct 2019 09:46:58 +0000 (10:46 +0100)]
v3d: move layer rendering to a separate helper
This helps with reducing nesting level after adding the loop
to handle layered rendering.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Tue, 29 Oct 2019 09:27:23 +0000 (10:27 +0100)]
v3d: support rendering to multi-layered framebuffers
When doing layered rendering the binning stage will prepare per-tile
lists for each layer in the framebuffer, so we need to make sure
we allocate enough space for them .
We also need to emit the NUMBER_OF_LAYERS packet. This is required
even when the number of layers is only 1, otherwise the simulator
detects buffer overflows in the tile_state BO during some CTS test
cases involving layered FBOs.
When rendering, we need to emit commands for each layer of the
framebuffer separately and make sure we address the correct layers for
each one.
v2: fixed typo in comment (Alejandro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Wed, 30 Oct 2019 10:33:44 +0000 (11:33 +0100)]
v3d: do not limit new CL space allocations with branch to 4096 bytes
For layered rendering we need to emit per layer rendering commands
lists so we we can end up requiring a fairly large buffer for this
if the number of layers is large enough.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Tue, 29 Oct 2019 08:32:05 +0000 (09:32 +0100)]
v3d: remove obsolete assertion
OES_geometry_shader introduced the concept of layered framebuffers.
Removing this assertion gets a bunch of CTS tests to pass. We will
also need layered images to implement layered rendering with geometry
shaders.
v2: fix typo in commit message (Alejandro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Tue, 15 Oct 2019 06:32:47 +0000 (08:32 +0200)]
v3d: support transform feedback with geometry shaders
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Fri, 11 Oct 2019 10:01:32 +0000 (12:01 +0200)]
v3d: save geometry shader state for blitting
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Fri, 11 Oct 2019 09:40:38 +0000 (11:40 +0200)]
v3d: predicate geometry shader outputs inside non-uniform control flow
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Wed, 9 Oct 2019 09:13:00 +0000 (11:13 +0200)]
v3d: don't try to render if shaders failed to compile
This is the same we do in the compute path to avoid crashes
at draw time.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Wed, 9 Oct 2019 08:26:16 +0000 (10:26 +0200)]
v3d: add support for adjacency primitives
v2: remove obsolete comment (Alejandro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Tue, 8 Oct 2019 11:48:00 +0000 (13:48 +0200)]
v3d: we always have at least one output segment
If we program an output size of 0 the simulator asserts. This was
not a problem until now because our VS would always have to
emit fixed function outputs, however, now that it can be paired
with a GS we can end up with a VS shader that no longer emits
any outputs.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Thu, 7 Nov 2019 14:40:45 +0000 (15:40 +0100)]
v3d: compute appropriate VPM memory configuration for geometry shader workloads
Geometry shaders can output many vertices and thus have higher VPM memory
pressure as a result. It is possible that too wide geometry shader dispatches
exceed the maximum available VPM output allocated, in which case we need
to reduce the dispatch width until we can fit the VPM memory requirements.
Supported dispatch widths for geometry shaders are 16, 8, 4, 1.
There is a limit in the number of VPM output sectors that can be used by a
geometry shader that we can meet by lowering the dispatch width at compile
time, however, at draw time we need to revisit this number and, together with
other elements that can contribute to total VPM memory requirements, decide
on a configuration that can fit the program into the available VPM memory.
Ideally, we also want to aim for not using more than half of the available
memory so we that we can run a pair of bin and render programs in parallel.
v2: fixed language in comment and typo in commit log. (Alejandro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Mon, 28 Oct 2019 13:23:51 +0000 (14:23 +0100)]
v3d: add 1-way SIMD packing definition
According to the documentation, the 1-way dispatch width is only supported
with geometry shaders.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Iago Toral Quiroga [Thu, 3 Oct 2019 15:08:47 +0000 (17:08 +0200)]
v3d: implement geometry shader instancing
v2:
- Remove unused field uses_iid from v3d_gs_prog_data (Alejandro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>