Sonny Jiang [Fri, 29 Nov 2019 23:04:54 +0000 (18:04 -0500)]
radeonsi: use compute shader for clear 12-byte buffer
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Tue, 10 Dec 2019 03:35:57 +0000 (22:35 -0500)]
st/mesa: release the draw shader properly to fix driver crashes (iris)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 4 Dec 2019 22:27:13 +0000 (17:27 -0500)]
draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Thu, 28 Nov 2019 03:47:56 +0000 (22:47 -0500)]
st/mesa: don't generate VS TGSI if NIR is enabled
it's no longer needed
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 28 Nov 2019 03:37:35 +0000 (22:37 -0500)]
st/mesa: remove struct st_vp_variant in favor of st_common_variant
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 28 Nov 2019 03:30:22 +0000 (22:30 -0500)]
st/mesa: remove st_vp_variant::num_inputs
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 28 Nov 2019 03:25:00 +0000 (22:25 -0500)]
st/mesa: use a separate VS variant for the draw module
instead of keeping the IR indefinitely in st_vp_variant.
This trivially fixes Selection/Feedback/RasterPos for NIR.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 27 Nov 2019 23:01:15 +0000 (18:01 -0500)]
st/mesa: support shader images for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 27 Nov 2019 22:26:51 +0000 (17:26 -0500)]
st/mesa: support SSBOs for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 27 Nov 2019 20:10:33 +0000 (15:10 -0500)]
st/mesa: support samplers for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 27 Nov 2019 20:10:27 +0000 (15:10 -0500)]
st/mesa: save currently bound vertex samplers and sampler views in st_context
for st_draw_feedback.c
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 27 Nov 2019 01:55:13 +0000 (20:55 -0500)]
st/mesa: support UBOs for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 27 Nov 2019 22:10:45 +0000 (17:10 -0500)]
gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe
This is already used in st_draw_feedback.c, because it uses shaders
generated for drivers.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Wed, 27 Nov 2019 20:04:14 +0000 (15:04 -0500)]
llvmpipe: implement TEX_LZ and TXF_LZ opcodes
gallivm receives these opcodes anyway because st_draw_feedback.c uses
shaders that were assembled for drivers, not llvmpipe.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Gurchetan Singh [Thu, 5 Dec 2019 02:03:19 +0000 (18:03 -0800)]
drirc: set allow_higher_compat_version for Faster Than Light
With 781a78 ("mesa: enable ARB_direct_state_access in compat for
GL3.1+), it's possible to have DSA with GL3.1+.
FTL creates a GL3.1 compat context, but fails the
_mesa_has_geometry_shaders(..) check in frame_buffer_texture.
Bump the compat version to pass the check.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Roland Scheidegger [Mon, 9 Dec 2019 17:49:13 +0000 (18:49 +0100)]
util/atomic: Fix p_atomic_add for unlocked and msvc paths
Braces mismatch (flagged by CI, untested).
Fixes: 385d13f26d2 "util/atomic: Add a _return variant of p_atomic_add"
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Anholt [Mon, 9 Dec 2019 19:55:21 +0000 (11:55 -0800)]
freedreno: Track the set of UBOs to be uploaded in UBO analysis.
We were iterating over the entire 32-entry array each time, when we
can just use a bitset to know that we're only uploading from the first
entry normally.
Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL
fishtank.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Eric Anholt [Mon, 9 Dec 2019 19:19:14 +0000 (11:19 -0800)]
freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off.
The default is to not throw GL errors when drawing with mapped
buffers, but we were forcing it on for unclear reasons. Internally we
keep all our buffers mapped anyway, so it should be a no-op other than
reducing CPU overhead (.23% in a perf report for WebGL fishtank)
Reviewed-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Mon, 9 Dec 2019 21:08:33 +0000 (13:08 -0800)]
freedreno/fdperf: use drmOpen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
Alyssa Rosenzweig [Fri, 6 Dec 2019 21:45:57 +0000 (16:45 -0500)]
gallium/util: Support POLYGON in u_stream_outputs_for_vertices
u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is
trivial to support as a special case directly (since we have the number
of vertices directly).
Fixes aborts in Panfrost in apps using GL_POLYGON.
Fixes: e881aa8c12c ("gallium/util: Add u_stream_outputs_for_vertices helper")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Revewied-by: Eric Anholt <eric@anholt.net>
Anuj Phogat [Wed, 4 Dec 2019 23:21:20 +0000 (15:21 -0800)]
intel: Add pci-ids for Jasper Lake
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Anuj Phogat [Wed, 4 Dec 2019 23:19:18 +0000 (15:19 -0800)]
intel: Add device info for 1x4x6 Jasper Lake
Also removing the FIXME comments after matching the numbers with
updated documentation.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Vasily Khoruzhick [Sun, 8 Dec 2019 20:06:43 +0000 (12:06 -0800)]
lima: expose tiled format modifier in query_dmabuf_modifiers()
Fixes: 8c12f4e5f24f ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Vasily Khoruzhick [Sun, 8 Dec 2019 20:03:42 +0000 (12:03 -0800)]
lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()
Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID
in resource_from_handle() and we don't have RO.
Fixes: 8c12f4e5f24f ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Jonathan Marek [Wed, 20 Nov 2019 03:19:46 +0000 (22:19 -0500)]
turnip: add hw binning
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Samuel Pitoiset [Thu, 5 Dec 2019 17:04:32 +0000 (18:04 +0100)]
radv: do not use VK_TRUE/VK_FALSE
For consistency.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Tue, 3 Dec 2019 05:54:56 +0000 (15:54 +1000)]
gallivm: add bitfield reverse and ufind_msb
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Roland Scheidegger [Sat, 7 Dec 2019 03:37:17 +0000 (04:37 +0100)]
gallium/scons: fix graw_gdi build
Fixes: 44a6b0107b37 (gallivm: add nir->llvm translation (v2))
Reviewed-by: Dave Airlie <Airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Daniel Schürmann [Thu, 5 Dec 2019 18:27:16 +0000 (19:27 +0100)]
aco: propagate temporaries into expanded vectors
Gives a very slight decrease in code size:
Totals from affected shaders:
Code Size:
1708488 ->
1702768 (-0.33 %) bytes
Max Waves: 2858 -> 2855 (-0.10 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Thu, 5 Dec 2019 18:17:52 +0000 (19:17 +0100)]
aco: improve readfirstlane after uniform ssbo loads on GFX7
pipeline-db changes for GFX7:
80310 shaders in 40472 tests
Totals:
SGPRS:
3655900 ->
3643916 (-0.33 %)
VGPRS:
2678324 ->
2686324 (0.30 %)
Spilled SGPRs: 1730 -> 1634 (-5.55 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread
Code Size:
136106120 ->
135457616 (-0.48 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601014 -> 600206 (-0.13 %)
Totals from affected shaders:
SGPRS: 307832 -> 295848 (-3.89 %)
VGPRS: 267864 -> 275864 (2.99 %)
Spilled SGPRs: 770 -> 674 (-12.47 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 16 -> 12 (-25.00 %) dwords per thread
Code Size:
22007488 ->
21358984 (-2.95 %) bytes
LDS: 65 -> 65 (0.00 %) blocks
Max Waves: 28668 -> 27860 (-2.82 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Thu, 5 Dec 2019 17:32:52 +0000 (18:32 +0100)]
aco: use soffset for MUBUF instructions on SI/CI
pipeline-db changes for GFX7:
80310 shaders in 40472 tests
Totals:
SGPRS:
3655300 ->
3655900 (0.02 %)
VGPRS:
2677732 ->
2678324 (0.02 %)
Spilled SGPRs: 1730 -> 1730 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 15540 -> 15540 (0.00 %) dwords per thread
Code Size:
136488364 ->
136106120 (-0.28 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601039 -> 601014 (-0.00 %)
Totals from affected shaders:
SGPRS: 316312 -> 316912 (0.19 %)
VGPRS: 273844 -> 274436 (0.22 %)
Spilled SGPRs: 770 -> 770 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 16 -> 16 (0.00 %) dwords per thread
Code Size:
22724904 ->
22342660 (-1.68 %) bytes
LDS: 114 -> 114 (0.00 %) blocks
Max Waves: 30861 -> 30836 (-0.08 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Fri, 15 Nov 2019 14:37:13 +0000 (15:37 +0100)]
radv: Enable ACO on GFX7 (Sea Islands)
This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used.
Note that shader_ballot works correctly, but performance seems inferior.
To enable shader_ballot use RADV_PERFTEST=shader_ballot.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 4 Dec 2019 12:41:37 +0000 (13:41 +0100)]
aco: return to loop_active mask at continue_or_break blocks
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Mon, 2 Dec 2019 16:58:12 +0000 (17:58 +0100)]
radv: disable Youngblood app profile if ACO is used
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Thu, 21 Nov 2019 09:23:13 +0000 (10:23 +0100)]
aco: implement exclusive scan for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 20 Nov 2019 17:51:39 +0000 (18:51 +0100)]
aco: implement inclusive_scan for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 20 Nov 2019 15:53:42 +0000 (16:53 +0100)]
aco: implement (clustered) reductions for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 20 Nov 2019 17:57:23 +0000 (18:57 +0100)]
aco: don't use a scalar temporary for reductions on GFX10
This patch also adds the scalar temporary for scans on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Mon, 18 Nov 2019 17:44:51 +0000 (18:44 +0100)]
aco: flush denorms after fmin/fmax on pre-GFX9
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Mon, 18 Nov 2019 10:15:06 +0000 (11:15 +0100)]
radv: only flush scalar cache for SSBO writes with ACO on GFX8+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Fri, 15 Nov 2019 15:29:32 +0000 (16:29 +0100)]
aco: disable disassembly for SI/CI due to lack of support by LLVM
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Fri, 8 Nov 2019 12:37:15 +0000 (13:37 +0100)]
aco: implement 64bit ine/ieq for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Fri, 15 Nov 2019 07:20:06 +0000 (08:20 +0100)]
aco: implement 64bit i2b for SI /CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Thu, 14 Nov 2019 07:09:32 +0000 (08:09 +0100)]
aco: make 1/2*PI a literal constant on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Fri, 8 Nov 2019 10:45:13 +0000 (11:45 +0100)]
aco: implement 64bit VGPR shifts for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Thu, 7 Nov 2019 17:02:33 +0000 (18:02 +0100)]
aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 4 Dec 2019 09:43:14 +0000 (10:43 +0100)]
aco: fix disassembly of writelane instructions.
ACO writes an unused 3rd operand for internal usage
which makes LLVM recoginize it as illegal instruction.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 6 Nov 2019 17:09:33 +0000 (18:09 +0100)]
aco: recognize SI/CI SMRD hazards
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 6 Nov 2019 13:01:26 +0000 (14:01 +0100)]
aco: implement quad swizzles for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 6 Nov 2019 11:40:14 +0000 (12:40 +0100)]
aco: move buffer_store data to VGPR if needed
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 6 Nov 2019 09:35:57 +0000 (10:35 +0100)]
aco: implement nir_op_isign on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 6 Nov 2019 09:13:50 +0000 (10:13 +0100)]
aco: only use scalar loads for readonly buffers on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 6 Nov 2019 09:12:26 +0000 (10:12 +0100)]
aco: implement nir_op_fquantize2f16 for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Tue, 5 Nov 2019 14:27:59 +0000 (15:27 +0100)]
aco: fix SMEM offsets for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Tue, 5 Nov 2019 14:24:12 +0000 (15:24 +0100)]
aco: SI/CI - fix sampler aniso
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Dave Airlie [Wed, 10 Jul 2019 04:59:46 +0000 (14:59 +1000)]
aco: handle gfx7 int8/10 clamping on exports
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Mon, 4 Nov 2019 17:02:47 +0000 (18:02 +0100)]
aco: Initial GFX7 Support
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Mon, 18 Nov 2019 09:33:40 +0000 (10:33 +0100)]
aco: refactor visit_store_fs_output() to use the Builder
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Jason Ekstrand [Sat, 7 Dec 2019 00:26:59 +0000 (18:26 -0600)]
anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world. However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.
Fixes: bc612536eb2f "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Sat, 7 Dec 2019 00:11:14 +0000 (18:11 -0600)]
anv: Re-capture all batch and state buffers
When we moved from allocating BOs directly to using the BO cache, we
lost the EXEC_OBJECT_CAPTURE flag on all our state buffers.
Fixes: 3119b96bdf57 "anv: Allocate block pool BOs from the cache"
Fixes: ee77938733cd "anv: Allocate batch and fence buffers from..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Mon, 25 Nov 2019 16:27:02 +0000 (10:27 -0600)]
anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Anholt [Thu, 5 Dec 2019 05:40:07 +0000 (21:40 -0800)]
freedreno: Enable texture upload memory throttling.
Fixes oom-killer during streaming-texture-upload, which I found while
trying to enable piglit in CI.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Fritz Koenig [Thu, 5 Dec 2019 00:16:43 +0000 (16:16 -0800)]
freedreno: reorder format check
With the addition of the planar formats helper, the
planar formats no longer have a valid block.bits field.
Calling util_format_get_blocksize therefore asserts.
Reorder the check to see if the format is supported
before doing the query to get the blocksize.
Fixes: 20f132e5eff2d ("gallium/util: add planar format layouts and helpers")
Signed-off-by: Fritz Koenig <frkoenig@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Nanley Chery [Fri, 15 Nov 2019 17:17:23 +0000 (09:17 -0800)]
iris: Fix import of multi-planar surfaces with modifiers
Multi-planar surfaces are allowed to have modifiers. Don't require
DRM_FORMAT_MOD_INVALID in order to create a surface for each plane
defined by the format.
Fixes: 246eebba4a8 ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Nanley Chery [Fri, 15 Nov 2019 22:10:38 +0000 (14:10 -0800)]
gallium: Store the image format in winsys_handle
This format will be used to properly handle planar images with modifiers
in iris.
Fixes: 246eebba4a8 ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Nanley Chery [Thu, 14 Nov 2019 21:59:58 +0000 (13:59 -0800)]
gallium/dri2: Fix creation of multi-planar modifier images
The commit noted below assumed and enforced that DRM_MOD_INVALID was the
only valid modifier for multi-planar imported images. Due to that, it
required that modifier on multi-planar images to:
1. Allow multiple planes.
2. Perform YUV format lowering and extent adjustments.
3. Use buffer_index to correctly map the given planes.
Fix these issues by removing or updating the code built on that
assumption.
Fixes: 2066966c106 ("gallium/dri2: Support creating multi-planar modifier images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Thu, 5 Dec 2019 23:30:26 +0000 (15:30 -0800)]
meson: Include iris in default gallium-drivers for x86/x86_64
We build i965 by default on x86/x86_64 platforms; let's build iris too.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Jason Ekstrand [Wed, 4 Dec 2019 19:19:23 +0000 (13:19 -0600)]
anv: Use BO fences/semaphores for AcquireNextImage
Instead of doing a dummy submit on the command buffer for the fence or a
dummy semaphore and trusting in implicit sync, this commit moves us to
take advantage of implicit sync and just use the WSI image BO as the
fence. Both semaphores and fences require a tiny bit of extra plumbing
to do this but the result is that we can get rid of a bunch of the extra
synchronization we're doing today.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 4 Dec 2019 19:01:35 +0000 (13:01 -0600)]
anv: Add a fence_reset_reset_temporary helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Thu, 21 Nov 2019 12:10:32 +0000 (06:10 -0600)]
anv: Use submit-time implicit sync instead of allocate-time
In
83b943cc2f24, we started making all VkDeviceMemory BOs resident all
the time. One unfortunate side-effect of this is that every
vkQueueSubmit sets EXEC_OBJECT_WRITE on every WSI memory object which
means that X server or Wayland compositor, instead of waiting on the
last vkQueueSubmit to actually write the buffer, now waits on the last
vkQueueSubmit to from that driver instance relative to whenever the
compositor's GL driver instance calls execbuf. This potentially leads
to a lot of extra synchronization that we didn't intend to have.
Instead, this commit makes it so that we leave WSI memory objects with
EXEC_OBJECT_ASYNC most of the time and only unset EXEC_OBJECT_ASYNC and
set EXEC_OBJECT_WRITE in the dummy execbuf that we do as part of
vkQueuePresent. This should hopefully result in tighter integration
with the compositor, lower latency, and better performance.
Testing with DOOM 2016, this seems to reduce latency by at least a frame
if not two and makes the game much more responsive. Testing was,
however, subjective, so we don't have any hard data on that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Thu, 21 Nov 2019 12:00:14 +0000 (06:00 -0600)]
anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags
Otherwise, we're trusting in the execbuf_add_bo which sets
EXEC_OBJECT_WRITE to to always be the first one that gets called. This
is likely true for fences but it seems somewhat fragile.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 4 Dec 2019 18:47:31 +0000 (12:47 -0600)]
vulkan/wsi: Add a hooks for signaling semaphores and fences
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Thu, 21 Nov 2019 11:47:10 +0000 (05:47 -0600)]
vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit
This lets us treat the implicit synchronization that we need for X11 and
Wayland like a semaphore. Instead of trusting the driver to somehow
figure out when that memory object needs to be signaled, we provide an
explicit point where the driver can set EXEC_OBJECT_WRITE and signal the
dma_fence on the BO. Without this, we have to somehow track inside the
driver when WSI buffers are actually used to avoid extra synchronization
dependencies.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Urja Rannikko [Fri, 6 Dec 2019 02:47:50 +0000 (02:47 +0000)]
panfrost: free spill cost table in mir_spill_register
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Urja Rannikko [Fri, 6 Dec 2019 02:41:31 +0000 (02:41 +0000)]
panfrost: add lcra_free() to free lcra state
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Urja Rannikko [Fri, 6 Dec 2019 01:20:34 +0000 (01:20 +0000)]
panfrost: free allocations in schedule_block
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Urja Rannikko [Wed, 4 Dec 2019 14:20:48 +0000 (14:20 +0000)]
panfrost: free last_read/write tables in mir_create_dependency_graph
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Thu, 5 Dec 2019 14:06:53 +0000 (09:06 -0500)]
panfrost: Rename SET_VALUE to WRITE_VALUE
See
https://lists.freedesktop.org/archives/dri-devel/2019-December/247601.html
Write value emphasises that it's just a generic write primitive.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 4 Dec 2019 13:59:29 +0000 (08:59 -0500)]
panfrost: Update SET_VALUE with information from igt
It's not a tiler specific initialization; it's a generic GPU-side write
primitive that may be used for tiler reset on midgard.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Samuel Pitoiset [Wed, 13 Nov 2019 10:03:52 +0000 (11:03 +0100)]
gitlab-ci: add a job that runs Vulkan CTS with RADV conditionally
Only Polaris10 is tested at the moment, and I disabled a TON of
tests to keep a CTS run within 5 minutes because my local runner
is a bit slow. A full CTS run takes more than 1h, which means it
will hit the timeout.
RADV CI can only be triggered manually on personal branches to
avoid breaking the world because one runner is definitely not
enough. This will allow us to test it until it's stable enough
to be enabled by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Tue, 19 Nov 2019 13:46:53 +0000 (14:46 +0100)]
gitlab-ci: build RADV in meson-testing
This requires to bump LLVM to 8 because it's the minimum supported
version by RADV.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Thu, 14 Nov 2019 11:09:44 +0000 (12:09 +0100)]
gitlab-ci: configure the Vulkan ICD export with VK_DRIVER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Samuel Pitoiset [Tue, 19 Nov 2019 07:39:00 +0000 (08:39 +0100)]
gitlab-ci: allow to run dEQP Vulkan with DEQP_VER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:30:27 +0000 (09:30 +0100)]
gitlab-ci: add a new base test job for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:26:00 +0000 (09:26 +0100)]
gitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:24:27 +0000 (09:24 +0100)]
gitlab-ci: build cts_runner in the x86 test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:23:18 +0000 (09:23 +0100)]
gitlab-ci: add a new job that builds a base test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:15:12 +0000 (09:15 +0100)]
gitlab-ci: add a gl suffix to the x86 test image and all test jobs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Fri, 15 Nov 2019 11:05:15 +0000 (12:05 +0100)]
gitlab-ci: rename build-deqp.sh to build-deqp-gl.sh
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Michel Dänzer [Thu, 26 Sep 2019 07:27:27 +0000 (09:27 +0200)]
gitlab-ci: Overhaul job run policy
Use new rules: instead of only:
For container stage jobs:
* In the main Mesa project, run them by default.
* In merge requests, run them by default if any files affecting pipeline
results are changed.
* In all other cases (in particular branches in personal projects),
don't run them by default but allow triggering them manually.
build & test stage jobs are left at the default (when: on_success), so
they will run automatically once all their dependencies are satisified.
(Using the same rules as above would require these jobs to be manually
triggered as well, which is only possible once all dependency jobs have
passed) Please be considerate of CI runner resources and cancel unneeded
jobs on personal branches with no corresponding merge requests (this can
be done before the jobs start running).
In summary: No more special branch names. Unnecessary job runs are
avoided by default, but jobs which don't run by default can be triggered
manually.
v2:
* Split out LAVA changes to separate commit
* Clarify commit log a little, in particular WRT build/test stage jobs
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> # v1
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> # v1
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> # v1
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Fri, 6 Dec 2019 08:39:40 +0000 (09:39 +0100)]
gitlab-ci: Use the common run policy for LAVA jobs as well again
Having different policies could have some weird results, e.g. changes
only touching documentation (where the intention is not to run the
pipeline by default) would still create a pipeline with the LAVA jobs
running by default.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Jonathan Marek [Fri, 15 Nov 2019 17:42:44 +0000 (12:42 -0500)]
turnip: implement border color
Fixes the deqp fails in:
dEQP-VK.pipeline.sampler.*border*
(minus 1d array/d24 cases which fail for other reasons)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jonathan Marek [Fri, 15 Nov 2019 20:12:25 +0000 (15:12 -0500)]
turnip: improve emit_textures
Two things:
* Texture/sampler pointers aligned to the size of texture/sampler state
* Returning errors instead of crashing on OOM
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jonathan Marek [Fri, 15 Nov 2019 20:15:53 +0000 (15:15 -0500)]
turnip: add function to allocate aligned memory in a substream cs
To use with texture states that need alignment (texconst, sampler, border)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Timothy Arceri [Thu, 5 Dec 2019 04:01:14 +0000 (15:01 +1100)]
glsl/nir: iterate the system values list when adding varyings
Iterate the system values list when adding varyings to the program
resource list in the NIR linker. This is needed to avoid CTS
regressions when using the NIR to build the GLSL resource list in
an upcoming series. Presumably it also fixes a bug with the current
ARB_gl_spirv support.
Fixes: ffdb44d3a0a2 ("nir/linker: Add inputs/outputs to the program resource list")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Dave Airlie [Mon, 2 Dec 2019 05:01:06 +0000 (15:01 +1000)]
llvmpipe: enable support for primitives generated outside streamout
This enables the draw support when the queries are enabled.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dave Airlie [Mon, 2 Dec 2019 04:37:42 +0000 (14:37 +1000)]
draw: add support for collecting primitives generated outside streamout
GL/gallium require gathering primitives generated outside streamout
stats. This introduces the draw interfaces to enabling collecting this.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dave Airlie [Mon, 2 Dec 2019 04:58:56 +0000 (14:58 +1000)]
llvmpipe: disable occlusion queries when requested by state tracker
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dave Airlie [Mon, 2 Dec 2019 04:58:09 +0000 (14:58 +1000)]
llvmpipe: add queries disabled flag
This flag is set when the state tracker request queries
be disabled for meta operations.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Kenneth Graunke [Tue, 3 Dec 2019 21:51:55 +0000 (13:51 -0800)]
main: Change u_mmAllocMem align2 from bytes (old API) to bits (new API)
The main and Gallium implementations were recently merged, and the
align2 parameter in the Gallium one is in bits. execmem.c expected
bytes still. This led to every call here asserting.
Fixes: b6fd679a9e("mesa/main/util: moving gallium u_mm to util, remove main/mm")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>