Jason Ekstrand [Sat, 7 Dec 2019 00:26:59 +0000 (18:26 -0600)]
anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world. However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.
Fixes: bc612536eb2f "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Sat, 7 Dec 2019 00:11:14 +0000 (18:11 -0600)]
anv: Re-capture all batch and state buffers
When we moved from allocating BOs directly to using the BO cache, we
lost the EXEC_OBJECT_CAPTURE flag on all our state buffers.
Fixes: 3119b96bdf57 "anv: Allocate block pool BOs from the cache"
Fixes: ee77938733cd "anv: Allocate batch and fence buffers from..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Mon, 25 Nov 2019 16:27:02 +0000 (10:27 -0600)]
anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Anholt [Thu, 5 Dec 2019 05:40:07 +0000 (21:40 -0800)]
freedreno: Enable texture upload memory throttling.
Fixes oom-killer during streaming-texture-upload, which I found while
trying to enable piglit in CI.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Fritz Koenig [Thu, 5 Dec 2019 00:16:43 +0000 (16:16 -0800)]
freedreno: reorder format check
With the addition of the planar formats helper, the
planar formats no longer have a valid block.bits field.
Calling util_format_get_blocksize therefore asserts.
Reorder the check to see if the format is supported
before doing the query to get the blocksize.
Fixes: 20f132e5eff2d ("gallium/util: add planar format layouts and helpers")
Signed-off-by: Fritz Koenig <frkoenig@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
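The reordering described above can be sketched as follows. This is only an illustration under stated assumptions: `sketch_fmt`, `sketch_blocksize` and `sketch_checked_blocksize` are hypothetical names, not the gallium or freedreno helpers, and the table layout is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format description; not the real util_format tables. */
struct sketch_fmt {
    bool supported;      /* does the driver support this format? */
    unsigned block_bits; /* 0 models a planar format with no single block */
};

/* Stand-in for a blocksize query that asserts on planar formats. */
static unsigned sketch_blocksize(const struct sketch_fmt *f)
{
    assert(f->block_bits != 0);
    return f->block_bits / 8;
}

/* Check support first, so an unsupported planar format never reaches
 * the asserting query. Returns 0 when the format is not supported. */
static unsigned sketch_checked_blocksize(const struct sketch_fmt *f)
{
    if (!f->supported)
        return 0;
    return sketch_blocksize(f);
}
```

With the checks in the opposite order, the planar case would hit the assert before the support test had a chance to reject it.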
Nanley Chery [Fri, 15 Nov 2019 17:17:23 +0000 (09:17 -0800)]
iris: Fix import of multi-planar surfaces with modifiers
Multi-planar surfaces are allowed to have modifiers. Don't require
DRM_FORMAT_MOD_INVALID in order to create a surface for each plane
defined by the format.
Fixes: 246eebba4a8 ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Nanley Chery [Fri, 15 Nov 2019 22:10:38 +0000 (14:10 -0800)]
gallium: Store the image format in winsys_handle
This format will be used to properly handle planar images with modifiers
in iris.
Fixes: 246eebba4a8 ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Nanley Chery [Thu, 14 Nov 2019 21:59:58 +0000 (13:59 -0800)]
gallium/dri2: Fix creation of multi-planar modifier images
The commit noted below assumed and enforced that DRM_FORMAT_MOD_INVALID was the
only valid modifier for multi-planar imported images. Due to that, it
required that modifier on multi-planar images to:
1. Allow multiple planes.
2. Perform YUV format lowering and extent adjustments.
3. Use buffer_index to correctly map the given planes.
Fix these issues by removing or updating the code built on that
assumption.
Fixes: 2066966c106 ("gallium/dri2: Support creating multi-planar modifier images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Thu, 5 Dec 2019 23:30:26 +0000 (15:30 -0800)]
meson: Include iris in default gallium-drivers for x86/x86_64
We build i965 by default on x86/x86_64 platforms; let's build iris too.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Jason Ekstrand [Wed, 4 Dec 2019 19:19:23 +0000 (13:19 -0600)]
anv: Use BO fences/semaphores for AcquireNextImage
Instead of doing a dummy submit on the command buffer for the fence or a
dummy semaphore and trusting in implicit sync, this commit moves us to
take advantage of implicit sync and just use the WSI image BO as the
fence. Both semaphores and fences require a tiny bit of extra plumbing
to do this but the result is that we can get rid of a bunch of the extra
synchronization we're doing today.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 4 Dec 2019 19:01:35 +0000 (13:01 -0600)]
anv: Add a fence_reset_reset_temporary helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Thu, 21 Nov 2019 12:10:32 +0000 (06:10 -0600)]
anv: Use submit-time implicit sync instead of allocate-time
In 83b943cc2f24, we started making all VkDeviceMemory BOs resident all
the time. One unfortunate side-effect of this is that every
vkQueueSubmit sets EXEC_OBJECT_WRITE on every WSI memory object which
means that the X server or Wayland compositor, instead of waiting on the
last vkQueueSubmit to actually write the buffer, now waits on the last
vkQueueSubmit from that driver instance relative to whenever the
compositor's GL driver instance calls execbuf. This potentially leads
to a lot of extra synchronization that we didn't intend to have.
Instead, this commit makes it so that we leave WSI memory objects with
EXEC_OBJECT_ASYNC most of the time and only unset EXEC_OBJECT_ASYNC and
set EXEC_OBJECT_WRITE in the dummy execbuf that we do as part of
vkQueuePresent. This should hopefully result in tighter integration
with the compositor, lower latency, and better performance.
Testing with DOOM 2016, this seems to reduce latency by at least a frame
if not two and makes the game much more responsive. Testing was,
however, subjective, so we don't have any hard data on that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
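A minimal sketch of the flag selection described above, assuming placeholder bit values. `SKETCH_OBJECT_WRITE` and `SKETCH_OBJECT_ASYNC` stand in for EXEC_OBJECT_WRITE and EXEC_OBJECT_ASYNC and do not use the real kernel values; `sketch_wsi_bo_flags` is a hypothetical helper, not a function in anv.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits; the real values live in the i915 uapi header. */
#define SKETCH_OBJECT_WRITE (1u << 0)
#define SKETCH_OBJECT_ASYNC (1u << 1)

/* Pick the flags for a WSI memory object for one execbuf call. */
static uint32_t sketch_wsi_bo_flags(bool is_present_submit)
{
    if (is_present_submit) {
        /* Dummy execbuf done as part of present: take part in implicit
         * sync and mark the BO as written. */
        return SKETCH_OBJECT_WRITE;
    }
    /* Every other submit: opt out of implicit sync for this BO. */
    return SKETCH_OBJECT_ASYNC;
}
```

The point of the design is that the compositor only ever waits on the one submit that is flagged as a write.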
Jason Ekstrand [Thu, 21 Nov 2019 12:00:14 +0000 (06:00 -0600)]
anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags
Otherwise, we're trusting in the execbuf_add_bo which sets
EXEC_OBJECT_WRITE to always be the first one that gets called. This
is likely true for fences but it seems somewhat fragile.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
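The fragility mentioned above can be illustrated with a small list that merges the write flag into an existing entry. Everything here is an assumption made for the example: `sketch_execbuf`, `sketch_add_bo` and the flag value are hypothetical and do not mirror the anv structures.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_OBJECT_WRITE (1u << 0) /* placeholder, not the uapi value */
#define SKETCH_MAX_BOS 16

struct sketch_execbuf {
    uint32_t handles[SKETCH_MAX_BOS];
    uint32_t flags[SKETCH_MAX_BOS];
    unsigned count;
};

/* Add a BO, or merge into its existing entry.
 * Returns the entry index, or -1 if the list is full. */
static int sketch_add_bo(struct sketch_execbuf *eb, uint32_t handle,
                         uint32_t extra_flags)
{
    for (unsigned i = 0; i < eb->count; i++) {
        if (eb->handles[i] == handle) {
            /* Already listed: still honour a write request made by a
             * later caller instead of relying on call order. */
            eb->flags[i] |= extra_flags & SKETCH_OBJECT_WRITE;
            return (int)i;
        }
    }
    if (eb->count == SKETCH_MAX_BOS)
        return -1;
    eb->handles[eb->count] = handle;
    eb->flags[eb->count] = extra_flags;
    return (int)eb->count++;
}
```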
Jason Ekstrand [Wed, 4 Dec 2019 18:47:31 +0000 (12:47 -0600)]
vulkan/wsi: Add hooks for signaling semaphores and fences
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Thu, 21 Nov 2019 11:47:10 +0000 (05:47 -0600)]
vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit
This lets us treat the implicit synchronization that we need for X11 and
Wayland like a semaphore. Instead of trusting the driver to somehow
figure out when that memory object needs to be signaled, we provide an
explicit point where the driver can set EXEC_OBJECT_WRITE and signal the
dma_fence on the BO. Without this, we have to somehow track inside the
driver when WSI buffers are actually used to avoid extra synchronization
dependencies.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Urja Rannikko [Fri, 6 Dec 2019 02:47:50 +0000 (02:47 +0000)]
panfrost: free spill cost table in mir_spill_register
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Urja Rannikko [Fri, 6 Dec 2019 02:41:31 +0000 (02:41 +0000)]
panfrost: add lcra_free() to free lcra state
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Urja Rannikko [Fri, 6 Dec 2019 01:20:34 +0000 (01:20 +0000)]
panfrost: free allocations in schedule_block
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Urja Rannikko [Wed, 4 Dec 2019 14:20:48 +0000 (14:20 +0000)]
panfrost: free last_read/write tables in mir_create_dependency_graph
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Thu, 5 Dec 2019 14:06:53 +0000 (09:06 -0500)]
panfrost: Rename SET_VALUE to WRITE_VALUE
See https://lists.freedesktop.org/archives/dri-devel/2019-December/247601.html
Write value emphasises that it's just a generic write primitive.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 4 Dec 2019 13:59:29 +0000 (08:59 -0500)]
panfrost: Update SET_VALUE with information from igt
It's not a tiler specific initialization; it's a generic GPU-side write
primitive that may be used for tiler reset on midgard.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Samuel Pitoiset [Wed, 13 Nov 2019 10:03:52 +0000 (11:03 +0100)]
gitlab-ci: add a job that runs Vulkan CTS with RADV conditionally
Only Polaris10 is tested at the moment, and I disabled a TON of
tests to keep a CTS run within 5 minutes because my local runner
is a bit slow. A full CTS run takes more than 1h, which means it
will hit the timeout.
RADV CI can only be triggered manually on personal branches to
avoid breaking the world because one runner is definitely not
enough. This will allow us to test it until it's stable enough
to be enabled by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Tue, 19 Nov 2019 13:46:53 +0000 (14:46 +0100)]
gitlab-ci: build RADV in meson-testing
This requires bumping LLVM to 8 because it's the minimum supported
version by RADV.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Thu, 14 Nov 2019 11:09:44 +0000 (12:09 +0100)]
gitlab-ci: configure the Vulkan ICD export with VK_DRIVER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Samuel Pitoiset [Tue, 19 Nov 2019 07:39:00 +0000 (08:39 +0100)]
gitlab-ci: allow to run dEQP Vulkan with DEQP_VER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:30:27 +0000 (09:30 +0100)]
gitlab-ci: add a new base test job for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:26:00 +0000 (09:26 +0100)]
gitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:24:27 +0000 (09:24 +0100)]
gitlab-ci: build cts_runner in the x86 test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:23:18 +0000 (09:23 +0100)]
gitlab-ci: add a new job that builds a base test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Mon, 18 Nov 2019 08:15:12 +0000 (09:15 +0100)]
gitlab-ci: add a gl suffix to the x86 test image and all test jobs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Samuel Pitoiset [Fri, 15 Nov 2019 11:05:15 +0000 (12:05 +0100)]
gitlab-ci: rename build-deqp.sh to build-deqp-gl.sh
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Michel Dänzer [Thu, 26 Sep 2019 07:27:27 +0000 (09:27 +0200)]
gitlab-ci: Overhaul job run policy
Use new rules: instead of only:
For container stage jobs:
* In the main Mesa project, run them by default.
* In merge requests, run them by default if any files affecting pipeline
results are changed.
* In all other cases (in particular branches in personal projects),
don't run them by default but allow triggering them manually.
build & test stage jobs are left at the default (when: on_success), so
they will run automatically once all their dependencies are satisfied.
(Using the same rules as above would require these jobs to be manually
triggered as well, which is only possible once all dependency jobs have
passed.) Please be considerate of CI runner resources and cancel unneeded
jobs on personal branches with no corresponding merge requests (this can
be done before the jobs start running).
In summary: No more special branch names. Unnecessary job runs are
avoided by default, but jobs which don't run by default can be triggered
manually.
v2:
* Split out LAVA changes to separate commit
* Clarify commit log a little, in particular WRT build/test stage jobs
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> # v1
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> # v1
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> # v1
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Fri, 6 Dec 2019 08:39:40 +0000 (09:39 +0100)]
gitlab-ci: Use the common run policy for LAVA jobs as well again
Having different policies could have some weird results, e.g. changes
only touching documentation (where the intention is not to run the
pipeline by default) would still create a pipeline with the LAVA jobs
running by default.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Jonathan Marek [Fri, 15 Nov 2019 17:42:44 +0000 (12:42 -0500)]
turnip: implement border color
Fixes the deqp fails in:
dEQP-VK.pipeline.sampler.*border*
(minus 1d array/d24 cases which fail for other reasons)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jonathan Marek [Fri, 15 Nov 2019 20:12:25 +0000 (15:12 -0500)]
turnip: improve emit_textures
Two things:
* Texture/sampler pointers aligned to the size of texture/sampler state
* Returning errors instead of crashing on OOM
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jonathan Marek [Fri, 15 Nov 2019 20:15:53 +0000 (15:15 -0500)]
turnip: add function to allocate aligned memory in a substream cs
To use with texture states that need alignment (texconst, sampler, border)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
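As a rough illustration of aligned allocation from a sub-stream, here is a bump allocator measured in dwords. The names `sketch_substream`, `sketch_align_up` and `sketch_alloc_aligned` are hypothetical and the layout is an assumption; turnip's actual command-stream structures are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Round offset up to a multiple of align (align must be a power of two). */
static uint32_t sketch_align_up(uint32_t offset, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical sub-stream: a bump allocator measured in dwords. */
struct sketch_substream {
    uint32_t cur;  /* next free dword */
    uint32_t size; /* total dwords available */
};

/* Reserve count dwords starting at a multiple of align.
 * Returns the start offset, or UINT32_MAX when there is no room, so
 * the caller can report an error instead of crashing. */
static uint32_t sketch_alloc_aligned(struct sketch_substream *s,
                                     uint32_t count, uint32_t align)
{
    uint32_t start = sketch_align_up(s->cur, align);
    if (start > s->size || count > s->size - start)
        return UINT32_MAX;
    s->cur = start + count;
    return start;
}
```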
Timothy Arceri [Thu, 5 Dec 2019 04:01:14 +0000 (15:01 +1100)]
glsl/nir: iterate the system values list when adding varyings
Iterate the system values list when adding varyings to the program
resource list in the NIR linker. This is needed to avoid CTS
regressions when using the NIR to build the GLSL resource list in
an upcoming series. Presumably it also fixes a bug with the current
ARB_gl_spirv support.
Fixes: ffdb44d3a0a2 ("nir/linker: Add inputs/outputs to the program resource list")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Dave Airlie [Mon, 2 Dec 2019 05:01:06 +0000 (15:01 +1000)]
llvmpipe: enable support for primitives generated outside streamout
This enables the draw support when the queries are enabled.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dave Airlie [Mon, 2 Dec 2019 04:37:42 +0000 (14:37 +1000)]
draw: add support for collecting primitives generated outside streamout
GL/gallium require gathering primitives generated outside streamout
stats. This introduces the draw interfaces to enable collecting this.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dave Airlie [Mon, 2 Dec 2019 04:58:56 +0000 (14:58 +1000)]
llvmpipe: disable occlusion queries when requested by state tracker
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dave Airlie [Mon, 2 Dec 2019 04:58:09 +0000 (14:58 +1000)]
llvmpipe: add queries disabled flag
This flag is set when the state tracker requests that queries
be disabled for meta operations.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Kenneth Graunke [Tue, 3 Dec 2019 21:51:55 +0000 (13:51 -0800)]
main: Change u_mmAllocMem align2 from bytes (old API) to bits (new API)
The main and Gallium implementations were recently merged, and the
align2 parameter in the Gallium one is in bits. execmem.c expected
bytes still. This led to every call here asserting.
Fixes: b6fd679a9e ("mesa/main/util: moving gallium u_mm to util, remove main/mm")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
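To make the unit change concrete, the sketch below converts a power-of-two byte alignment into its log2 form. It assumes that "bits" in the message above means the base-2 logarithm of the alignment; `sketch_align_bytes_to_log2` is a hypothetical helper, not part of u_mm.

```c
#include <assert.h>

/* Convert a power-of-two alignment in bytes to its log2 ("bits") form. */
static unsigned sketch_align_bytes_to_log2(unsigned align_bytes)
{
    unsigned shift = 0;

    /* Only powers of two have an exact log2 representation. */
    assert(align_bytes != 0 && (align_bytes & (align_bytes - 1)) == 0);
    while ((1u << shift) < align_bytes)
        shift++;
    return shift;
}
```

Under that assumption, a caller that used to pass 32 for a 32-byte alignment would now pass 5.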
Eric Anholt [Thu, 5 Dec 2019 00:13:38 +0000 (16:13 -0800)]
ci: Disable egl_ext_device_drm tests in piglit.
If the runner has a HW device that would be supported, even without
/dev/dri forwarded into the container, it will be enumerated and the tests
on llvmpipe fail with (for example):
libEGL warning: Not allowed to force software rendering when API explicitly selects a hardware device.
libEGL warning: MESA-LOADER: failed to open i965 (search paths /builds/anholt/mesa/install/lib/dri)
Given that we can't necessarily control the DRI devices present on the
runners (particularly for developers bringing their own runners to reduce
the demands on fd.o's shared resources), just skip these tests in CI.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Jason Ekstrand [Thu, 5 Dec 2019 17:49:18 +0000 (11:49 -0600)]
util/atomic: Add p_atomic_add_return for the unlocked path
Fixes: 385d13f26d2 "util/atomic: Add a _return variant of p_atomic_add"
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jason Ekstrand [Mon, 2 Dec 2019 22:28:58 +0000 (16:28 -0600)]
anv: Implement VK_KHR_buffer_device_address
The primary difference between the KHR and EXT versions of the extension
is that the KHR provides the address at AllocateMemory time for replay
so we can replay it safely without moving to a sparse address model.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 26 Jun 2019 23:02:19 +0000 (18:02 -0500)]
anv: Use a pNext loop in AllocateMemory
This function has a lot of possible extensions and some of them we can
easily handle on-the-fly so it's easier to just have a loop than to find
each structure manually.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
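The pNext loop pattern can be sketched without the Vulkan headers. The structure types and enum values below are hypothetical stand-ins (real extension structs share the same leading sType/pNext pair), and `sketch_parse_chain` is not the anv function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical structure-type tags, not real VkStructureType values. */
enum sketch_stype {
    SKETCH_STYPE_DEDICATED = 1,
    SKETCH_STYPE_IMPORT_FD = 2,
    SKETCH_STYPE_UNKNOWN = 99,
};

/* Every chained struct starts with the same two fields. */
struct sketch_base {
    enum sketch_stype sType;
    const void *pNext;
};

struct sketch_alloc_opts {
    bool dedicated;
    bool import_fd;
};

/* Walk the chain once and handle each known struct as it is seen. */
static struct sketch_alloc_opts sketch_parse_chain(const void *pNext)
{
    struct sketch_alloc_opts opts = { false, false };

    for (const struct sketch_base *ext = pNext; ext != NULL;
         ext = ext->pNext) {
        switch (ext->sType) {
        case SKETCH_STYPE_DEDICATED:
            opts.dedicated = true;
            break;
        case SKETCH_STYPE_IMPORT_FD:
            opts.import_fd = true;
            break;
        default:
            /* Unknown structs are skipped. */
            break;
        }
    }
    return opts;
}
```

A single loop scales better than one lookup per structure type when many extensions can appear in the chain.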
Jason Ekstrand [Mon, 2 Dec 2019 22:03:56 +0000 (16:03 -0600)]
anv: Add allocator support for client-visible addresses
When a BO is flagged as having a client visible address, we put it in
its own heap. We also support the client explicitly specifying an
address in said heap. If an address collision happens, we return false
from anv_vma_alloc which turns into a VK_ERROR_OUT_OF_DEVICE_MEMORY.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 26 Jun 2019 19:32:31 +0000 (14:32 -0500)]
util/vma: Add a function to allocate a particular address range
This new function lets you request to remove a specific address range
from the allocator. On success it returns true; on failure it returns
false and leaves the allocator unmodified. It doesn't need to return an
offset because, if it succeeds, the offset passed in is the allocated
offset.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
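A toy version of allocating a caller-chosen range from a list of free holes, including the hole-splitting step. This is a sketch under assumptions: `sketch_heap` is a fixed-size array of holes invented for the example, and `sketch_heap_alloc_addr` is a hypothetical name rather than the util/vma entry point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_HOLES 8

struct sketch_hole { uint64_t offset, size; };

/* Hypothetical heap: an unordered array of free holes. */
struct sketch_heap {
    struct sketch_hole holes[SKETCH_MAX_HOLES];
    unsigned count;
};

/* Carve [offset, offset + size) out of the heap. Returns true on
 * success; on failure returns false and leaves the heap unmodified. */
static bool sketch_heap_alloc_addr(struct sketch_heap *h, uint64_t offset,
                                   uint64_t size)
{
    if (size == 0 || offset + size < offset)
        return false;

    for (unsigned i = 0; i < h->count; i++) {
        struct sketch_hole *hole = &h->holes[i];
        uint64_t hole_end = hole->offset + hole->size;

        if (offset < hole->offset || offset + size > hole_end)
            continue;

        uint64_t before = offset - hole->offset;
        uint64_t after = hole_end - (offset + size);

        if (before && after) {
            /* Range sits in the middle: split into two holes. */
            if (h->count == SKETCH_MAX_HOLES)
                return false;
            hole->size = before;
            h->holes[h->count++] =
                (struct sketch_hole){ offset + size, after };
        } else if (before) {
            hole->size = before;
        } else if (after) {
            hole->offset = offset + size;
            hole->size = after;
        } else {
            /* Exact fit: drop the hole by moving the last one here. */
            *hole = h->holes[--h->count];
        }
        return true;
    }
    return false;
}
```

No offset is returned because a successful call always allocates exactly the offset that was passed in.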
Jason Ekstrand [Wed, 26 Jun 2019 19:31:57 +0000 (14:31 -0500)]
util/vma: Factor out the hole splitting part of util_vma_heap_alloc
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 21:22:38 +0000 (15:22 -0600)]
anv: Add an explicit_address parameter to anv_device_alloc_bo
We already have a mechanism for specifying that we want a fixed address
provided by the driver internals. We're about to let the client start
specifying addresses in some very special scenarios as well so we want
to pass this through to the allocation function.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 20:44:33 +0000 (14:44 -0600)]
anv: Stop advertising two heaps just for the VF cache WA
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 20:51:30 +0000 (14:51 -0600)]
anv: Set up VMA heaps independently from memory heaps
Our VMA allocations are really independent from the memory heaps we
expose via the API. The only thing that really matters is the GTT size
so we can make the high heap the right size.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 20:38:45 +0000 (14:38 -0600)]
anv: Stop tracking VMA allocations
util_vma_heap_alloc will already return 0 if it doesn't have enough
space. The only thing the vma_*_available tracking was doing was
preventing us from allocating too much on any given heap. Now that
we're tracking that in the heap itself, we can drop these.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 20:37:56 +0000 (14:37 -0600)]
anv: Disallow allocating above heap sizes
We're already tracking the amount of memory used in each heap. This
commit just makes us start rejecting memory allocations if the heap
would grow too large.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
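One way to reject allocations that would overflow a heap is to add atomically, inspect the new total, and roll back on failure. The sketch below uses C11 atomics rather than Mesa's p_atomic helpers, and `sketch_mem_heap` and `sketch_heap_reserve` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_mem_heap {
    uint64_t size;         /* advertised heap size in bytes */
    _Atomic uint64_t used; /* bytes currently allocated */
};

/* Try to account for bytes in the heap; fail if it would grow too large. */
static bool sketch_heap_reserve(struct sketch_mem_heap *heap, uint64_t bytes)
{
    /* atomic_fetch_add returns the old value, so add bytes again to get
     * the total after this allocation. */
    uint64_t used_after = atomic_fetch_add(&heap->used, bytes) + bytes;

    if (used_after > heap->size) {
        /* Over the limit: undo the accounting and reject. */
        atomic_fetch_sub(&heap->used, bytes);
        return false;
    }
    return true;
}
```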
Jason Ekstrand [Mon, 2 Dec 2019 20:36:39 +0000 (14:36 -0600)]
util/atomic: Add a _return variant of p_atomic_add
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 19:51:59 +0000 (13:51 -0600)]
anv: Don't leak when set_tiling fails
Fixes: a44744e01d73 "anv: Require a dedicated allocation for..."
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 26 Nov 2019 03:55:51 +0000 (21:55 -0600)]
anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 18:32:16 +0000 (12:32 -0600)]
anv: Apply cache flushes after setting index/draw VBs
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 18:14:45 +0000 (12:14 -0600)]
anv: Always invalidate the VF cache in BeginCommandBuffer
I think the reason why we only do this for primaries is that we didn't
expect to have blorp calls in secondaries. However, you are allowed to
have a full render pass in a secondary command buffer so resolves and
clears can end up in there. We should just always invalidate.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 25 Nov 2019 18:42:42 +0000 (12:42 -0600)]
blorp: Pass the VB size to the VF cache workaround
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 25 Nov 2019 18:06:20 +0000 (12:06 -0600)]
anv: Add a has_softpin boolean
This separates "has" from "use" which will make the next commit a bit
cleaner.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 2 Dec 2019 18:02:12 +0000 (12:02 -0600)]
anv: Drop bo_flags from anv_bo_pool
In ee77938733cd, we started using the BO cache for anv_bo_pool and
stopped using the bo_flags parameter. However, we never dropped it from
the struct or the init function.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Michel Dänzer [Wed, 20 Nov 2019 10:15:04 +0000 (11:15 +0100)]
glsl/tests: Use splitlines() instead of strip()
strip() removes leading and trailing newlines, but leaves newlines
between multiple lines in the string. This could cause failures when
comparing the output of cross-compiled Windows binaries (producing
Windows-style newlines) to the expected output with Unix-style newlines.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Mauro Rossi [Sat, 16 Nov 2019 17:39:31 +0000 (18:39 +0100)]
android: radeonsi: fix build after vl refactoring (v2)
The vl functions moved from radeonsi to gallium/auxiliary/vl have left
the Android build of radeonsi in a broken state.
libmesa_galliumvl static is needed to build radeonsi;
gallium_dri building rules are reworked to avoid multiple symbols
and the libmesa_galliumvl static dependency is needed in radeonsi.
Here is the changelog:
- android: gallium/auxiliary: add libmesa_galliumvl static
- android: gallium_dri: move libmesa_gallium to static to prevent multiple symbols
- android: radeonsi: fix build after vl refactoring
Fixes the following building error:
external/mesa/src/gallium/drivers/radeonsi/si_uvd.c:47:
error: undefined reference to 'vl_video_buffer_create_as_resource'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)
Fixes: 86e60bc ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tapani Pälli [Mon, 2 Dec 2019 14:54:30 +0000 (16:54 +0200)]
intel/compiler: force simd8 when dual src blending on gen8
Patch introduces option to force simd8 and uses it as a workaround for
dual source blending issues seen with skqp (skia testsuite) on gen8.
Fixes following Piglit test on gen8 platforms:
arb_blend_func_extended-dual-src-blending-issue-1917
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1917
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tapani Pälli [Wed, 4 Dec 2019 06:04:21 +0000 (08:04 +0200)]
intel/compiler: add newline to limit_dispatch_width message
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Anholt [Wed, 27 Nov 2019 04:37:19 +0000 (20:37 -0800)]
turnip: Add support for compute shaders.
Since compute shares the FS state with graphics, we have to re-upload the
pipeline state when switching between compute dispatch and graphics draws.
We could potentially expose graphics and compute as separate queues and
then we wouldn't need pipeline state management, but the closed driver
exposes a single queue and consistency with them is probably good.
So far I'm emitting texture/ibo state as IBs that we jump to. This is
kind of silly when we could just emit it directly in our CS, but that's a
refactor we can do later.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Wed, 4 Dec 2019 20:22:55 +0000 (12:22 -0800)]
turnip: Move pipeline BO list adding to BindPipeline.
We only need to do it once when we bind, rather than having to check at
every draw call.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Wed, 4 Dec 2019 20:21:50 +0000 (12:21 -0800)]
turnip: Sanity check that we're adding valid BOs to the list.
I tripped over this during CS enabling when my program BO wasn't set up.
Easier to debug this way than the kernel telling us a 0 handle is invalid.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Wed, 4 Dec 2019 21:13:16 +0000 (13:13 -0800)]
turnip: Add a helper function for getting tu_buffer iovas.
Easier than remembering to add all 3 offsets.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Wed, 27 Nov 2019 00:42:24 +0000 (16:42 -0800)]
turnip: Refactor the graphics pipeline create implementation.
The loop over the pipelines to create (and the failure handling) was
noisy, and the stub for compute setup looked nicer to me.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Mon, 2 Dec 2019 22:32:53 +0000 (14:32 -0800)]
turnip: Add basic SSBO support.
This is enough to pass
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.fragment.single_descriptor.*
with fragmentStoresAndAtomics set, and thus to be able to start working on
compute. I haven't enabled that flag yet, because it also implies image
load/store support, which I haven't filled in.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Tue, 3 Dec 2019 00:44:52 +0000 (16:44 -0800)]
turnip: Reuse tu6_stage2opcode() more.
A bit of cleanup for adding more stages later.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Wed, 4 Dec 2019 22:15:42 +0000 (14:15 -0800)]
turnip: Drop redefinition of VALIDREG now that it's in ir3.h.
Fixes: 937b9055698b ("freedreno/ir3: fix neverball assert in case of unused VS inputs")
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Eric Anholt [Tue, 26 Nov 2019 19:15:12 +0000 (11:15 -0800)]
turnip: Fix unused variable warnings.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Timothy Arceri [Tue, 3 Dec 2019 13:24:35 +0000 (00:24 +1100)]
glsl: make use of active_shader_mask when building resource list
This allows us to avoid walking the entire IR looking for used
uniforms.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Timothy Arceri [Tue, 3 Dec 2019 13:14:03 +0000 (00:14 +1100)]
glsl: don't set uniform block as used when it's not
The spec requires unused uniform block to be set as active in the
program resource list. To support this we tell opt dead code not to
remove them. However we can mark them as unused internally and
avoid unnecessary state changes.
This change is also required for the following clean-up patch.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Timothy Arceri [Tue, 3 Dec 2019 04:04:14 +0000 (15:04 +1100)]
glsl: move calculate_array_size_and_stride() to link_uniforms.cpp
This is where all the other uniform values are populated so it
makes much more sense here. Moving it will also allow us to better
share code between the NIR and GLSL IR resource list builders.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Ian Romanick [Tue, 19 Nov 2019 03:53:57 +0000 (19:53 -0800)]
anv: Fix error message format string
See also 246261f0addf
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
CID: 1455892
Fixes: 246261f0add ("anv: prepare the driver for delayed submissions")
Ian Romanick [Tue, 19 Nov 2019 03:42:22 +0000 (19:42 -0800)]
mesa: Silence unused parameter warning
Unused since e4da8b9c331 ("mesa/compiler: rework tear down of
builtin/types").
src/mesa/main/context.c: In function ‘_mesa_free_context_data’:
src/mesa/main/context.c:1321:54: warning: unused parameter ‘destroy_compiler_types’ [-Wunused-parameter]
1321 | _mesa_free_context_data(struct gl_context *ctx, bool destroy_compiler_types)
| ^
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Ian Romanick [Tue, 19 Nov 2019 03:33:06 +0000 (19:33 -0800)]
mesa: Silence 'left shift of negative value' warning in BPTC compression code
src/util/format/../../mesa/main/texcompress_bptc_tmp.h:830:31: warning: left shift of negative value [-Wshift-negative-value]
830 | value |= (~(int32_t) 0) << n_bits;
| ^~
v2: Rewrite to just shift left then shift right. Based on conversation
with Neil in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2792#note_320272,
this should be fine.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> [v1]
Reviewed-by: Neil Roberts <nroberts@igalia.com>
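The "shift left then shift right" approach from v2 can be sketched as a standalone sign-extension helper. This is an illustration, not the code in texcompress_bptc_tmp.h: `sketch_sign_extend` is a hypothetical name, and the result relies on the unsigned-to-signed conversion and the signed right shift behaving arithmetically, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low n_bits of value (1 <= n_bits <= 32). */
static int32_t sketch_sign_extend(uint32_t value, unsigned n_bits)
{
    unsigned shift;

    assert(n_bits >= 1 && n_bits <= 32);
    shift = 32 - n_bits;

    /* The left shift happens on an unsigned value, so no negative value
     * is ever shifted left. The right shift on the signed result then
     * replicates the sign bit. */
    return (int32_t)(value << shift) >> shift;
}
```

This avoids building a mask with `(~(int32_t) 0) << n_bits`, which is what triggered the warning.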
Ian Romanick [Tue, 19 Nov 2019 03:16:23 +0000 (19:16 -0800)]
intel/compiler: Fix 'comparison is always true' warning
Without looking at the assembly or something, I'm not sure what the
compiler does here. The brw_reg_type enum is marked packed, so I'm
guessing that it gets represented as a uint8_t. That's the only reason I
could think that comparing with -1 would be always true.
This patch adds the same cast that exists in brw_hw_type_to_reg_type.
It might be better to add a #define outside the enum for
BRW_REGISTER_TYPE_INVALID as (enum brw_reg_type)-1.
src/intel/compiler/brw_eu_compact.c: In function ‘has_immediate’:
src/intel/compiler/brw_eu_compact.c:1515:20: warning: comparison is always true due to limited range of data type [-Wtype-limits]
1515 | return *type != -1;
| ^~
src/intel/compiler/brw_eu_compact.c:1518:20: warning: comparison is always true due to limited range of data type [-Wtype-limits]
1518 | return *type != -1;
| ^~
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
CID: 1455194
Fixes: 12d3b11908e ("intel/compiler: Add instruction compaction support on Gen12")
Cc: @mattst88
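The warning and the cast-based fix can be illustrated with a small stand-alone sketch. All names here (`sketch_reg_type`, `SKETCH_TYPE_INVALID`, `sketch_type_is_valid`) are hypothetical stand-ins, not the driver's identifiers, and the packed attribute is a GCC/Clang extension. The sketch assumes the packed enum is stored in a single unsigned byte, so a plain `-1` can never compare equal to it, whereas `-1` cast to the enum type can.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a packed register-type enum. */
enum __attribute__((packed)) sketch_reg_type {
   SKETCH_TYPE_F = 0,
   SKETCH_TYPE_D = 1,
};

/* Invalid marker defined outside the enum, as the message suggests. */
#define SKETCH_TYPE_INVALID ((enum sketch_reg_type)-1)

static bool
sketch_type_is_valid(enum sketch_reg_type type)
{
   /* Comparing against the cast value keeps both operands in the
    * enum's own range, so the test is no longer always true. */
   return type != SKETCH_TYPE_INVALID;
}
```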
Dylan Baker [Wed, 4 Dec 2019 22:40:10 +0000 (14:40 -0800)]
docs: Update mesa 19.3 release calendar
Dylan Baker [Wed, 4 Dec 2019 22:38:48 +0000 (14:38 -0800)]
docs: update calendar, add news item and link release notes for 19.2.7
Dylan Baker [Wed, 4 Dec 2019 22:36:13 +0000 (14:36 -0800)]
docs: Add SHA256 sums for 19.2.7
Dylan Baker [Wed, 4 Dec 2019 21:47:44 +0000 (13:47 -0800)]
docs: Add release notes for 19.2.7
Jonathan Marek [Wed, 4 Dec 2019 19:29:58 +0000 (14:29 -0500)]
turnip: allow writes to draw_cs outside of render pass
This is for state commands like CmdSetViewport that can be used outside of
a renderpass. Accumulating those into draw_cs outside of the renderpass
should have the desired effect.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Rob Clark [Wed, 4 Dec 2019 00:28:26 +0000 (16:28 -0800)]
nir/lower_clip: Fix incorrect driver loc for clipdist outputs
Somehow adjusting maxloc based on existing outputs got lost, resulting
in the clipdist varying clobbering the position varying. This produced a
shader with no position output in freedreno/ir3, which triggers GPU
hangs in neverball.
Fixes: d0f746b6458 ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
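The idea of adjusting the next free location based on existing outputs can be sketched roughly as follows. This is not the NIR code: `sketch_next_free_loc` is a hypothetical helper, and it assumes that each existing output occupies exactly one driver location.

```c
#include <assert.h>

/* Hypothetical helper: given the driver locations already used by
 * existing outputs, return the first location past all of them. */
static int
sketch_next_free_loc(const int *driver_locations, int count)
{
   int maxloc = -1;

   for (int i = 0; i < count; i++) {
      if (driver_locations[i] > maxloc)
         maxloc = driver_locations[i];
   }

   /* New outputs (e.g. clip distances) must start after maxloc so
    * they cannot overwrite an existing output such as position. */
   return maxloc + 1;
}
```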
Rob Clark [Tue, 3 Dec 2019 21:44:35 +0000 (13:44 -0800)]
freedreno/ir3: fix neverball assert in case of unused VS inputs
The logic to ensure VS and BS inputs are aligned wasn't accounting for
unused inputs in VS. This *usually* doesn't happen, but it seems it
can in the case of ARB programs?
Fixes assert:
```
fd6_program_create: Assertion `bs->inputs[i].regid == vs->inputs[i].regid' failed.
```
Fixes: 882d53d8e36 ("freedreno/ir3+a6xx: same VBO state for draw/binning")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Rob Clark [Wed, 4 Dec 2019 18:15:39 +0000 (10:15 -0800)]
freedreno/ir3: remove store_output lowered to store_shared_ir3
Fixes crashes that were unnoticed in CI because debug_assert() was not
enabled (but become real crashes after the next patch):
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_highp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_lowp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_mediump_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_highp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_lowp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_mediump_geometry
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Rafael Antognolli [Tue, 3 Dec 2019 19:15:38 +0000 (11:15 -0800)]
iris: Add restriction to 3DSTATE_CONSTANT_ packets.
The following programming note shows up in all 3DSTATE_CONSTANT_*
packets:
"The sum of all four read length fields must be less than or equal to
the size of 64."
The backend compiler should guarantee this for us, so let's just add a
check here.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
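A check of the quoted restriction might look like the sketch below. The function and macro names are hypothetical, and the sketch assumes the four read lengths are available as a plain array. The real driver code may express this as an assert at packet emission time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limit taken from the quoted programming note. */
#define SKETCH_MAX_TOTAL_READ_LENGTH 64

/* Hypothetical check: true if the four read length fields of one
 * 3DSTATE_CONSTANT_* packet satisfy the restriction. */
static bool
sketch_read_lengths_ok(const uint32_t read_length[4])
{
   uint32_t total = 0;

   for (int i = 0; i < 4; i++)
      total += read_length[i];

   return total <= SKETCH_MAX_TOTAL_READ_LENGTH;
}
```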
Rafael Antognolli [Tue, 26 Nov 2019 17:42:06 +0000 (09:42 -0800)]
anv: Use 3DSTATE_CONSTANT_ALL when possible.
Use this new instruction introduced in Gen12. The instruction itself is
smaller, and it also allows us to emit a single instruction to all
stages that have the same push constant buffers (e.g. when they don't
have constant buffers).
There's one restriction to use this instruction, though: the length
field is only 5 bits long, so we need to check whether we can use it,
and fall back to the old 3DSTATE_CONSTANT_XS if that field is >= 32.
v2:
- Rebased on top of the latest changes from Jason.
- Added review suggestions by Caio.
- Removed struct push_bos and merged some code into
anv_nir_compute_push_layout().
v3:
- Remove code churn due to gen8+ workaround in
anv_nir_compute_push_layout(). This code has been removed in an earlier
commit, and implemented in cmd_buffer_emit_push_constant().
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
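The packet selection described above can be sketched as a simple predicate. A 5-bit field can hold values 0 through 31, so any length of 32 or more forces the fallback. The function name and the array-of-lengths interface are hypothetical and only illustrate the check, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: true if every push range length fits in
 * the 5-bit length field of 3DSTATE_CONSTANT_ALL (values 0..31). */
static bool
sketch_can_use_constant_all(const uint32_t *lengths, int count)
{
   for (int i = 0; i < count; i++) {
      if (lengths[i] >= 32)
         return false; /* caller falls back to 3DSTATE_CONSTANT_XS */
   }
   return true;
}
```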
Rafael Antognolli [Tue, 26 Nov 2019 21:07:41 +0000 (13:07 -0800)]
anv: Move code for emitting push constants into its own function.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Rafael Antognolli [Tue, 26 Nov 2019 21:05:06 +0000 (13:05 -0800)]
anv: Add get_push_range_address() helper.
Add a helper function to get the push range address. Once we have a
separate function for emitting gen12 push constants, we can use this
helper and avoid duplicating code.
v3: Do not add range->start to the address in gen7 (Caio).
v4: Do not drop range->start from gen7 (Caio, Jason).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Rafael Antognolli [Mon, 2 Dec 2019 21:41:32 +0000 (13:41 -0800)]
anv: Move gen8+ push constant packet workaround.
Store push_ranges in ascending order, and only "shift" them to the end
of the array during state packet emission.
We don't need this workaround with the new 3DSTATE_CONSTANT_ALL packet.
So instead of applying the workaround here just for GEN < 12 (which
requires an extra loop through all the ranges to figure out if we
should shift them or not), we simply move the whole logic to the state
emission code. At that point, in a later commit, we are already looping
through all of the ranges anyway to check which packet we will be using,
so we might as well implement the workaround there, where it is going to
be used.
v3: Move gen8+ workaround to the state emission code (Caio).
v4: Add explanation of why we moved the workaround (Caio).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
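The "shift" at emission time can be pictured with the sketch below. It assumes four buffer slots per packet and uses hypothetical names (`sketch_range`, `sketch_shift_ranges_to_end`). The ranges stay stored in ascending order at the front of the array and are only moved to the last slots when the packet contents are built.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_SLOTS 4 /* assumed number of buffer slots per packet */

/* Hypothetical, simplified push range. */
struct sketch_range {
   unsigned start;
   unsigned length;
};

/* Copy the `count` used ranges from the front of `in` into the last
 * slots of `out`, leaving the unused leading slots zeroed. */
static void
sketch_shift_ranges_to_end(const struct sketch_range *in, int count,
                           struct sketch_range out[SKETCH_SLOTS])
{
   assert(count >= 0 && count <= SKETCH_SLOTS);

   memset(out, 0, sizeof(struct sketch_range) * SKETCH_SLOTS);

   const int shift = SKETCH_SLOTS - count;
   for (int i = 0; i < count; i++)
      out[i + shift] = in[i];
}
```

With two used ranges, slots 0 and 1 of the output stay empty and the ranges land in slots 2 and 3, still in ascending order.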
Rafael Antognolli [Mon, 23 Sep 2019 20:25:01 +0000 (13:25 -0700)]
iris: Use 3DSTATE_CONSTANT_ALL when possible.
Use this new instruction introduced in Gen12. The instruction itself is
smaller, and it also allows us to emit a single instruction to all
stages that have the same push constant buffers (e.g. when they don't
have constant buffers).
There's one restriction to use this instruction, though: the length
field is only 5 bits long, so we need to check whether we can use it,
and fall back to the old 3DSTATE_CONSTANT_XS if that field is >= 32.
v2 (Suggestions from Caio):
- use max_length instead of large_buffers.
- remove UNUSED and use #if GEN_GEN >= 12 instead.
- inline "buffers" and drop BITSET_RANGE() usage.
- add assert(n <= max_pointers)
- move emit to outside of the loop.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Rafael Antognolli [Mon, 23 Sep 2019 17:15:52 +0000 (10:15 -0700)]
iris: Rework push constants emitting code.
Split into a function the logic to gather the push constant buffers,
which now stores them in struct push_bos. Another function is added to
emit the packet, using data from the push_bos struct.
This will be useful when adding a new function for emitting push
constants for newer platforms.
v2 (Suggestions from Caio):
- rename 'n' -> 'buffer_count'
- remove large_buffers (for now)
- initialize push_bos
- remove assert
- change for() condition (i <= 3 -> i < 4)
v3:
- Add comment about size limit.
- Rework "shift" logic and 'for' loop.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Rafael Antognolli [Mon, 11 Jun 2018 18:29:14 +0000 (11:29 -0700)]
intel/blorp: Use 3DSTATE_CONSTANT_ALL to setup push constants.
In blorp, all the push constants are disabled, so we only need to emit a
single 3DSTATE_CONSTANT_ALL with the bitmask for stage update
appropriately set.
v2: Update comment (Caio).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Rafael Antognolli [Wed, 13 Jun 2018 16:49:07 +0000 (09:49 -0700)]
intel/aubinator: Decode 3DSTATE_CONSTANT_ALL.
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Rafael Antognolli [Thu, 7 Jun 2018 22:25:24 +0000 (15:25 -0700)]
intel/genxml: Add 3DSTATE_CONSTANT_ALL packet.
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>