mesa.git
4 years agogallivm: add bitfield reverse and ufind_msb
Dave Airlie [Tue, 3 Dec 2019 05:54:56 +0000 (15:54 +1000)]
gallivm: add bitfield reverse and ufind_msb

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
4 years agogallium/scons: fix graw_gdi build
Roland Scheidegger [Sat, 7 Dec 2019 03:37:17 +0000 (04:37 +0100)]
gallium/scons: fix graw_gdi build

Fixes: 44a6b0107b37 (gallivm: add nir->llvm translation (v2))
Reviewed-by: Dave Airlie <Airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
4 years agoaco: propagate temporaries into expanded vectors
Daniel Schürmann [Thu, 5 Dec 2019 18:27:16 +0000 (19:27 +0100)]
aco: propagate temporaries into expanded vectors

Gives a very slight decrease in code size:
Totals from affected shaders:
Code Size: 1708488 -> 1702768 (-0.33 %) bytes
Max Waves: 2858 -> 2855 (-0.10 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: improve readfirstlane after uniform ssbo loads on GFX7
Daniel Schürmann [Thu, 5 Dec 2019 18:17:52 +0000 (19:17 +0100)]
aco: improve readfirstlane after uniform ssbo loads on GFX7

pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655900 -> 3643916 (-0.33 %)
VGPRS: 2678324 -> 2686324 (0.30 %)
Spilled SGPRs: 1730 -> 1634 (-5.55 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread
Code Size: 136106120 -> 135457616 (-0.48 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601014 -> 600206 (-0.13 %)

Totals from affected shaders:
SGPRS: 307832 -> 295848 (-3.89 %)
VGPRS: 267864 -> 275864 (2.99 %)
Spilled SGPRs: 770 -> 674 (-12.47 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 16 -> 12 (-25.00 %) dwords per thread
Code Size: 22007488 -> 21358984 (-2.95 %) bytes
LDS: 65 -> 65 (0.00 %) blocks
Max Waves: 28668 -> 27860 (-2.82 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: use soffset for MUBUF instructions on SI/CI
Daniel Schürmann [Thu, 5 Dec 2019 17:32:52 +0000 (18:32 +0100)]
aco: use soffset for MUBUF instructions on SI/CI

pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655300 -> 3655900 (0.02 %)
VGPRS: 2677732 -> 2678324 (0.02 %)
Spilled SGPRs: 1730 -> 1730 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 15540 -> 15540 (0.00 %) dwords per thread
Code Size: 136488364 -> 136106120 (-0.28 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601039 -> 601014 (-0.00 %)

Totals from affected shaders:
SGPRS: 316312 -> 316912 (0.19 %)
VGPRS: 273844 -> 274436 (0.22 %)
Spilled SGPRs: 770 -> 770 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 16 -> 16 (0.00 %) dwords per thread
Code Size: 22724904 -> 22342660 (-1.68 %) bytes
LDS: 114 -> 114 (0.00 %) blocks
Max Waves: 30861 -> 30836 (-0.08 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoradv: Enable ACO on GFX7 (Sea Islands)
Daniel Schürmann [Fri, 15 Nov 2019 14:37:13 +0000 (15:37 +0100)]
radv: Enable ACO on GFX7 (Sea Islands)

This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used.
Note that shader_ballot works correctly, but performance seems inferior.
To enable shader_ballot use RADV_PERFTEST=shader_ballot.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: return to loop_active mask at continue_or_break blocks
Daniel Schürmann [Wed, 4 Dec 2019 12:41:37 +0000 (13:41 +0100)]
aco: return to loop_active mask at continue_or_break blocks

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoradv: disable Youngblood app profile if ACO is used
Daniel Schürmann [Mon, 2 Dec 2019 16:58:12 +0000 (17:58 +0100)]
radv: disable Youngblood app profile if ACO is used

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement exclusive scan for SI/CI
Daniel Schürmann [Thu, 21 Nov 2019 09:23:13 +0000 (10:23 +0100)]
aco: implement exclusive scan for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement inclusive_scan for SI/CI
Daniel Schürmann [Wed, 20 Nov 2019 17:51:39 +0000 (18:51 +0100)]
aco: implement inclusive_scan for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement (clustered) reductions for SI/CI
Daniel Schürmann [Wed, 20 Nov 2019 15:53:42 +0000 (16:53 +0100)]
aco: implement (clustered) reductions for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: don't use a scalar temporary for reductions on GFX10
Daniel Schürmann [Wed, 20 Nov 2019 17:57:23 +0000 (18:57 +0100)]
aco: don't use a scalar temporary for reductions on GFX10

This patch also adds the scalar temporary for scans on SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: flush denorms after fmin/fmax on pre-GFX9
Daniel Schürmann [Mon, 18 Nov 2019 17:44:51 +0000 (18:44 +0100)]
aco: flush denorms after fmin/fmax on pre-GFX9

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoradv: only flush scalar cache for SSBO writes with ACO on GFX8+
Daniel Schürmann [Mon, 18 Nov 2019 10:15:06 +0000 (11:15 +0100)]
radv: only flush scalar cache for SSBO writes with ACO on GFX8+

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: disable disassembly for SI/CI due to lack of support by LLVM
Daniel Schürmann [Fri, 15 Nov 2019 15:29:32 +0000 (16:29 +0100)]
aco: disable disassembly for SI/CI due to lack of support by LLVM

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement 64bit ine/ieq for SI/CI
Daniel Schürmann [Fri, 8 Nov 2019 12:37:15 +0000 (13:37 +0100)]
aco: implement 64bit ine/ieq for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement 64bit i2b for SI /CI
Daniel Schürmann [Fri, 15 Nov 2019 07:20:06 +0000 (08:20 +0100)]
aco: implement 64bit i2b for SI /CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: make 1/2*PI a literal constant on SI/CI
Daniel Schürmann [Thu, 14 Nov 2019 07:09:32 +0000 (08:09 +0100)]
aco: make 1/2*PI a literal constant on SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement 64bit VGPR shifts for SI/CI
Daniel Schürmann [Fri, 8 Nov 2019 10:45:13 +0000 (11:45 +0100)]
aco: implement 64bit VGPR shifts for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
Daniel Schürmann [Thu, 7 Nov 2019 17:02:33 +0000 (18:02 +0100)]
aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: fix disassembly of writelane instructions.
Daniel Schürmann [Wed, 4 Dec 2019 09:43:14 +0000 (10:43 +0100)]
aco: fix disassembly of writelane instructions.

ACO writes an unused 3rd operand for internal usage
which makes LLVM recoginize it as illegal instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: recognize SI/CI SMRD hazards
Daniel Schürmann [Wed, 6 Nov 2019 17:09:33 +0000 (18:09 +0100)]
aco: recognize SI/CI SMRD hazards

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement quad swizzles for SI/CI
Daniel Schürmann [Wed, 6 Nov 2019 13:01:26 +0000 (14:01 +0100)]
aco: implement quad swizzles for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: move buffer_store data to VGPR if needed
Daniel Schürmann [Wed, 6 Nov 2019 11:40:14 +0000 (12:40 +0100)]
aco: move buffer_store data to VGPR if needed

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement nir_op_isign on SI/CI
Daniel Schürmann [Wed, 6 Nov 2019 09:35:57 +0000 (10:35 +0100)]
aco: implement nir_op_isign on SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: only use scalar loads for readonly buffers on SI/CI
Daniel Schürmann [Wed, 6 Nov 2019 09:13:50 +0000 (10:13 +0100)]
aco: only use scalar loads for readonly buffers on SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement nir_op_fquantize2f16 for SI/CI
Daniel Schürmann [Wed, 6 Nov 2019 09:12:26 +0000 (10:12 +0100)]
aco: implement nir_op_fquantize2f16 for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: fix SMEM offsets for SI/CI
Daniel Schürmann [Tue, 5 Nov 2019 14:27:59 +0000 (15:27 +0100)]
aco: fix SMEM offsets for SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: SI/CI - fix sampler aniso
Daniel Schürmann [Tue, 5 Nov 2019 14:24:12 +0000 (15:24 +0100)]
aco: SI/CI - fix sampler aniso

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: handle gfx7 int8/10 clamping on exports
Dave Airlie [Wed, 10 Jul 2019 04:59:46 +0000 (14:59 +1000)]
aco: handle gfx7 int8/10 clamping on exports

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: Initial GFX7 Support
Daniel Schürmann [Mon, 4 Nov 2019 17:02:47 +0000 (18:02 +0100)]
aco: Initial GFX7 Support

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: refactor visit_store_fs_output() to use the Builder
Daniel Schürmann [Mon, 18 Nov 2019 09:33:40 +0000 (10:33 +0100)]
aco: refactor visit_store_fs_output() to use the Builder

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoanv: Re-emit all compute state on pipeline switch
Jason Ekstrand [Sat, 7 Dec 2019 00:26:59 +0000 (18:26 -0600)]
anv: Re-emit all compute state on pipeline switch

It's a very odd case to hit in the real world.  However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.

Fixes: bc612536eb2f "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agoanv: Re-capture all batch and state buffers
Jason Ekstrand [Sat, 7 Dec 2019 00:11:14 +0000 (18:11 -0600)]
anv: Re-capture all batch and state buffers

When we moved from allocating BOs directly to using the BO cache, we
lost the EXEC_OBJECT_CAPTURE flag on all our state buffers.

Fixes: 3119b96bdf57 "anv: Allocate block pool BOs from the cache"
Fixes: ee77938733cd "anv: Allocate batch and fence buffers from..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agoanv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers
Jason Ekstrand [Mon, 25 Nov 2019 16:27:02 +0000 (10:27 -0600)]
anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agofreedreno: Enable texture upload memory throttling.
Eric Anholt [Thu, 5 Dec 2019 05:40:07 +0000 (21:40 -0800)]
freedreno: Enable texture upload memory throttling.

Fixes oom-killer during streaming-texture-upload, which I found while
trying to enable piglit in CI.

Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: reorder format check
Fritz Koenig [Thu, 5 Dec 2019 00:16:43 +0000 (16:16 -0800)]
freedreno: reorder format check

With the addition of the planar formats helper, the
planar formats no longer have a valid block.bits field.
Calling util_format_get_blocksize therefore asserts.

Reorder the check to see if the format is supported
before doing the query to get the blocksize.

Fixes: 20f132e5eff2d ("gallium/util: add planar format layouts and helpers")
Signed-off-by: Fritz Koenig <frkoenig@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agoiris: Fix import of multi-planar surfaces with modifiers
Nanley Chery [Fri, 15 Nov 2019 17:17:23 +0000 (09:17 -0800)]
iris: Fix import of multi-planar surfaces with modifiers

Multi-planar surfaces are allowed to have modifiers. Don't require
DRM_FORMAT_MOD_INVALID in order to create a surface for each plane
defined by the format.

Fixes: 246eebba4a8 ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agogallium: Store the image format in winsys_handle
Nanley Chery [Fri, 15 Nov 2019 22:10:38 +0000 (14:10 -0800)]
gallium: Store the image format in winsys_handle

This format will be used to properly handle planar images with modifiers
in iris.

Fixes: 246eebba4a8 ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agogallium/dri2: Fix creation of multi-planar modifier images
Nanley Chery [Thu, 14 Nov 2019 21:59:58 +0000 (13:59 -0800)]
gallium/dri2: Fix creation of multi-planar modifier images

The commit noted below assumed and enforced that DRM_MOD_INVALID was the
only valid modifier for multi-planar imported images. Due to that, it
required that modifier on multi-planar images to:

   1. Allow multiple planes.
   2. Perform YUV format lowering and extent adjustments.
   3. Use buffer_index to correctly map the given planes.

Fix these issues by removing or updating the code built on that
assumption.

Fixes: 2066966c106 ("gallium/dri2: Support creating multi-planar modifier images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agomeson: Include iris in default gallium-drivers for x86/x86_64
Kenneth Graunke [Thu, 5 Dec 2019 23:30:26 +0000 (15:30 -0800)]
meson: Include iris in default gallium-drivers for x86/x86_64

We build i965 by default on x86/x86_64 platforms; let's build iris too.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
4 years agoanv: Use BO fences/semaphores for AcquireNextImage
Jason Ekstrand [Wed, 4 Dec 2019 19:19:23 +0000 (13:19 -0600)]
anv: Use BO fences/semaphores for AcquireNextImage

Instead of doing a dummy submit on the command buffer for the fence or a
dummy semaphore and trusting in implicit sync, this commit moves us to
take advantage of implicit sync and just use the WSI image BO as the
fence.  Both semaphores and fences require a tiny bit of extra plumbing
to do this but the result is that we can get rid of a bunch of the extra
synchronization we're doing today.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Add a fence_reset_reset_temporary helper
Jason Ekstrand [Wed, 4 Dec 2019 19:01:35 +0000 (13:01 -0600)]
anv: Add a fence_reset_reset_temporary helper

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Use submit-time implicit sync instead of allocate-time
Jason Ekstrand [Thu, 21 Nov 2019 12:10:32 +0000 (06:10 -0600)]
anv: Use submit-time implicit sync instead of allocate-time

In 83b943cc2f24, we started making all VkDeviceMemory BOs resident all
the time.  One unfortunate side-effect of this is that every
vkQueueSubmit sets EXEC_OBJECT_WRITE on every WSI memory object which
means that X server or Wayland compositor, instead of waiting on the
last vkQueueSubmit to actually write the buffer, now waits on the last
vkQueueSubmit to from that driver instance relative to whenever the
compositor's GL driver instance calls execbuf.  This potentially leads
to a lot of extra synchronization that we didn't intend to have.

Instead, this commit makes it so that we leave WSI memory objects with
EXEC_OBJECT_ASYNC most of the time and only unset EXEC_OBJECT_ASYNC and
set EXEC_OBJECT_WRITE in the dummy execbuf that we do as part of
vkQueuePresent.  This should hopefully result in tighter integration
with the compositor, lower latency, and better performance.

Testing with DOOM 2016, this seems to reduce latency by at least a frame
if not two and makes the game much more responsive.  Testing was,
however, subjective, so we don't have any hard data on that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags
Jason Ekstrand [Thu, 21 Nov 2019 12:00:14 +0000 (06:00 -0600)]
anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags

Otherwise, we're trusting in the execbuf_add_bo which sets
EXEC_OBJECT_WRITE to to always be the first one that gets called.  This
is likely true for fences but it seems somewhat fragile.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agovulkan/wsi: Add a hooks for signaling semaphores and fences
Jason Ekstrand [Wed, 4 Dec 2019 18:47:31 +0000 (12:47 -0600)]
vulkan/wsi: Add a hooks for signaling semaphores and fences

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agovulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit
Jason Ekstrand [Thu, 21 Nov 2019 11:47:10 +0000 (05:47 -0600)]
vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit

This lets us treat the implicit synchronization that we need for X11 and
Wayland like a semaphore.  Instead of trusting the driver to somehow
figure out when that memory object needs to be signaled, we provide an
explicit point where the driver can set EXEC_OBJECT_WRITE and signal the
dma_fence on the BO.  Without this, we have to somehow track inside the
driver when WSI buffers are actually used to avoid extra synchronization
dependencies.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agopanfrost: free spill cost table in mir_spill_register
Urja Rannikko [Fri, 6 Dec 2019 02:47:50 +0000 (02:47 +0000)]
panfrost: free spill cost table in mir_spill_register

Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: add lcra_free() to free lcra state
Urja Rannikko [Fri, 6 Dec 2019 02:41:31 +0000 (02:41 +0000)]
panfrost: add lcra_free() to free lcra state

Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: free allocations in schedule_block
Urja Rannikko [Fri, 6 Dec 2019 01:20:34 +0000 (01:20 +0000)]
panfrost: free allocations in schedule_block

Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: free last_read/write tables in mir_create_dependency_graph
Urja Rannikko [Wed, 4 Dec 2019 14:20:48 +0000 (14:20 +0000)]
panfrost: free last_read/write tables in mir_create_dependency_graph

Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Rename SET_VALUE to WRITE_VALUE
Alyssa Rosenzweig [Thu, 5 Dec 2019 14:06:53 +0000 (09:06 -0500)]
panfrost: Rename SET_VALUE to WRITE_VALUE

See
https://lists.freedesktop.org/archives/dri-devel/2019-December/247601.html

Write value emphasises that it's just a generic write primitive.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Update SET_VALUE with information from igt
Alyssa Rosenzweig [Wed, 4 Dec 2019 13:59:29 +0000 (08:59 -0500)]
panfrost: Update SET_VALUE with information from igt

It's not a tiler specific initialization; it's a generic GPU-side write
primitive that may be used for tiler reset on midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agogitlab-ci: add a job that runs Vulkan CTS with RADV conditionally
Samuel Pitoiset [Wed, 13 Nov 2019 10:03:52 +0000 (11:03 +0100)]
gitlab-ci: add a job that runs Vulkan CTS with RADV conditionally

Only Polaris10 is tested at the moment, and I disabled a TON of
tests to keep a CTS run within 5 minutes because my local runner
is a bit slow. A full CTS run takes more than 1h, which means it
will hit the timeout.

RADV CI can only be triggered manually on personal branches to
avoid breaking the world because one runner is definitely not
enough. This will allow us to test it until it's stable enough
to be enabled by default.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: build RADV in meson-testing
Samuel Pitoiset [Tue, 19 Nov 2019 13:46:53 +0000 (14:46 +0100)]
gitlab-ci: build RADV in meson-testing

This requires to bump LLVM to 8 because it's the minimum supported
version by RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: configure the Vulkan ICD export with VK_DRIVER
Samuel Pitoiset [Thu, 14 Nov 2019 11:09:44 +0000 (12:09 +0100)]
gitlab-ci: configure the Vulkan ICD export with VK_DRIVER

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agogitlab-ci: allow to run dEQP Vulkan with DEQP_VER
Samuel Pitoiset [Tue, 19 Nov 2019 07:39:00 +0000 (08:39 +0100)]
gitlab-ci: allow to run dEQP Vulkan with DEQP_VER

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: add a new base test job for VK
Samuel Pitoiset [Mon, 18 Nov 2019 08:30:27 +0000 (09:30 +0100)]
gitlab-ci: add a new base test job for VK

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK
Samuel Pitoiset [Mon, 18 Nov 2019 08:26:00 +0000 (09:26 +0100)]
gitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: build cts_runner in the x86 test image for VK
Samuel Pitoiset [Mon, 18 Nov 2019 08:24:27 +0000 (09:24 +0100)]
gitlab-ci: build cts_runner in the x86 test image for VK

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: add a new job that builds a base test image for VK
Samuel Pitoiset [Mon, 18 Nov 2019 08:23:18 +0000 (09:23 +0100)]
gitlab-ci: add a new job that builds a base test image for VK

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: add a gl suffix to the x86 test image and all test jobs
Samuel Pitoiset [Mon, 18 Nov 2019 08:15:12 +0000 (09:15 +0100)]
gitlab-ci: add a gl suffix to the x86 test image and all test jobs

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: rename build-deqp.sh to build-deqp-gl.sh
Samuel Pitoiset [Fri, 15 Nov 2019 11:05:15 +0000 (12:05 +0100)]
gitlab-ci: rename build-deqp.sh to build-deqp-gl.sh

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agogitlab-ci: Overhaul job run policy
Michel Dänzer [Thu, 26 Sep 2019 07:27:27 +0000 (09:27 +0200)]
gitlab-ci: Overhaul job run policy

Use new rules: instead of only:

For container stage jobs:

* In the main Mesa project, run them by default.

* In merge requests, run them by default if any files affecting pipeline
  results are changed.

* In all other cases (in particular branches in personal projects),
  don't run them by default but allow triggering them manually.

build & test stage jobs are left at the default (when: on_success), so
they will run automatically once all their dependencies are satisified.
(Using the same rules as above would require these jobs to be manually
triggered as well, which is only possible once all dependency jobs have
passed) Please be considerate of CI runner resources and cancel unneeded
jobs on personal branches with no corresponding merge requests (this can
be done before the jobs start running).

In summary: No more special branch names. Unnecessary job runs are
avoided by default, but jobs which don't run by default can be triggered
manually.

v2:
* Split out LAVA changes to separate commit
* Clarify commit log a little, in particular WRT build/test stage jobs

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> # v1
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> # v1
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> # v1
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agogitlab-ci: Use the common run policy for LAVA jobs as well again
Michel Dänzer [Fri, 6 Dec 2019 08:39:40 +0000 (09:39 +0100)]
gitlab-ci: Use the common run policy for LAVA jobs as well again

Having different policies could have some weird results, e.g. changes
only touching documentation (where the intention is not to run the
pipeline by default) would still create a pipeline with the LAVA jobs
running by default.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agoturnip: implement border color
Jonathan Marek [Fri, 15 Nov 2019 17:42:44 +0000 (12:42 -0500)]
turnip: implement border color

Fixes the deqp fails in:
dEQP-VK.pipeline.sampler.*border*
(minus 1d array/d24 cases which fail for other reasons)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
4 years agoturnip: improve emit_textures
Jonathan Marek [Fri, 15 Nov 2019 20:12:25 +0000 (15:12 -0500)]
turnip: improve emit_textures

Two things:
* Texture/sampler pointers aligned to the size of texture/sampler state
* Returning errors instead of crashing on OOM

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
4 years agoturnip: add function to allocate aligned memory in a substream cs
Jonathan Marek [Fri, 15 Nov 2019 20:15:53 +0000 (15:15 -0500)]
turnip: add function to allocate aligned memory in a substream cs

To use with texture states that need alignment (texconst, sampler, border)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
4 years agoglsl/nir: iterate the system values list when adding varyings
Timothy Arceri [Thu, 5 Dec 2019 04:01:14 +0000 (15:01 +1100)]
glsl/nir: iterate the system values list when adding varyings

Iterate the system values list when adding varyings to the program
resource list in the NIR linker. This is needed to avoid CTS
regressions when using the NIR to build the GLSL resource list in
an upcoming series. Presumably it also fixes a bug with the current
ARB_gl_spirv support.

Fixes: ffdb44d3a0a2 ("nir/linker: Add inputs/outputs to the program resource list")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agollvmpipe: enable support for primitives generated outside streamout
Dave Airlie [Mon, 2 Dec 2019 05:01:06 +0000 (15:01 +1000)]
llvmpipe: enable support for primitives generated outside streamout

This enables the draw support when the queries are enabled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agodraw: add support for collecting primitives generated outside streamout
Dave Airlie [Mon, 2 Dec 2019 04:37:42 +0000 (14:37 +1000)]
draw: add support for collecting primitives generated outside streamout

GL/gallium require gathering primitives generated outside streamout
stats. This introduces the draw interfaces to enabling collecting this.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agollvmpipe: disable occlusion queries when requested by state tracker
Dave Airlie [Mon, 2 Dec 2019 04:58:56 +0000 (14:58 +1000)]
llvmpipe: disable occlusion queries when requested by state tracker

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agollvmpipe: add queries disabled flag
Dave Airlie [Mon, 2 Dec 2019 04:58:09 +0000 (14:58 +1000)]
llvmpipe: add queries disabled flag

This flag is set when the state tracker request queries
be disabled for meta operations.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agomain: Change u_mmAllocMem align2 from bytes (old API) to bits (new API)
Kenneth Graunke [Tue, 3 Dec 2019 21:51:55 +0000 (13:51 -0800)]
main: Change u_mmAllocMem align2 from bytes (old API) to bits (new API)

The main and Gallium implementations were recently merged, and the
align2 parameter in the Gallium one is in bits.  execmem.c expected
bytes still.  This led to every call here asserting.

Fixes: b6fd679a9e("mesa/main/util: moving gallium u_mm to util, remove main/mm")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
4 years agoci: Disable egl_ext_device_drm tests in piglit.
Eric Anholt [Thu, 5 Dec 2019 00:13:38 +0000 (16:13 -0800)]
ci: Disable egl_ext_device_drm tests in piglit.

If the runner has a HW device that would be supported, even without
/dev/dri forwarded into the container, it will be enumerated and the tests
on llvmpipe fail with (for example):

libEGL warning: Not allowed to force software rendering when API explicitly selects a hardware device.
libEGL warning: MESA-LOADER: failed to open i965 (search paths /builds/anholt/mesa/install/lib/dri)

Given that we can't necessarily control the DRI devices present on the
runners (particularly for developers bringing their own runners to reduce
the demands on fd.o's shared resources), just skip these tests in CI.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years agoutil/atomic: Add p_atomic_add_return for the unlocked path
Jason Ekstrand [Thu, 5 Dec 2019 17:49:18 +0000 (11:49 -0600)]
util/atomic: Add p_atomic_add_return for the unlocked path

Fixes: 385d13f26d2 "util/atomic: Add a _return variant of p_atomic_add"
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
4 years agoanv: Implement VK_KHR_buffer_device_address
Jason Ekstrand [Mon, 2 Dec 2019 22:28:58 +0000 (16:28 -0600)]
anv: Implement VK_KHR_buffer_device_address

The primary difference between the KHR and EXT versions of the extension
is that the KHR provides the address at AllocateMemory time for replay
so we can replay it safely without moving to a sparse address model.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Use a pNext loop in AllocateMemory
Jason Ekstrand [Wed, 26 Jun 2019 23:02:19 +0000 (18:02 -0500)]
anv: Use a pNext loop in AllocateMemory

This function has a lot of possible extensions and some of them we can
easily handle on-the-fly so it's easier to just have a loop than to find
each structure manually.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Add allocator support for client-visible addresses
Jason Ekstrand [Mon, 2 Dec 2019 22:03:56 +0000 (16:03 -0600)]
anv: Add allocator support for client-visible addresses

When a BO is flagged as having a client visible address, we put it in
its own heap.  We also support the client explicitly specifying an
address in said heap.  If an address collision happens, we return false
from anv_vma_alloc which turns into a VK_ERROR_OUT_OF_DEVICE_MEMORY.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoutil/vma: Add a function to allocate a particular address range
Jason Ekstrand [Wed, 26 Jun 2019 19:32:31 +0000 (14:32 -0500)]
util/vma: Add a function to allocate a particular address range

This new function lets you request to remove a specific address range
from the allocator.  It returns true on success and leaves the allocator
unmodified and returns false on failure.  It doesn't need to return an
offset because, if it succeeds, the offset passed in is the allocated
offset.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoutil/vma: Factor out the hole splitting part of util_vma_heap_alloc
Jason Ekstrand [Wed, 26 Jun 2019 19:31:57 +0000 (14:31 -0500)]
util/vma: Factor out the hole splitting part of util_vma_heap_alloc

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Add an explicit_address parameter to anv_device_alloc_bo
Jason Ekstrand [Mon, 2 Dec 2019 21:22:38 +0000 (15:22 -0600)]
anv: Add an explicit_address parameter to anv_device_alloc_bo

We already have a mechanism for specifying that we want a fixed address
provided by the driver internals.  We're about to let the client start
specifying addresses in some very special scenarios as well so we want
to pass this through to the allocation function.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Stop advertising two heaps just for the VF cache WA
Jason Ekstrand [Mon, 2 Dec 2019 20:44:33 +0000 (14:44 -0600)]
anv: Stop advertising two heaps just for the VF cache WA

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Set up VMA heaps independently from memory heaps
Jason Ekstrand [Mon, 2 Dec 2019 20:51:30 +0000 (14:51 -0600)]
anv: Set up VMA heaps independently from memory heaps

Our VMA allocations are really independent from the memory heaps we
expose via the API.  The only thing that really matters is the GTT size
so we can make the high heap the right size.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Stop tracking VMA allocations
Jason Ekstrand [Mon, 2 Dec 2019 20:38:45 +0000 (14:38 -0600)]
anv: Stop tracking VMA allocations

util_vma_heap_alloc will already return 0 if it doesn't have enough
space.  The only thing the vma_*_available tracking was doing was
preventing us from allocating too much on any given heap.  Now that
we're tracking that in the heap itself, we can drop these.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Disallow allocating above heap sizes
Jason Ekstrand [Mon, 2 Dec 2019 20:37:56 +0000 (14:37 -0600)]
anv: Disallow allocating above heap sizes

We're already tracking the amount of memory used in each heap.  This
commit just makes us start rejecting memory allocations if the heap
would grow too large.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoutil/atomic: Add a _return variant of p_atomic_add
Jason Ekstrand [Mon, 2 Dec 2019 20:36:39 +0000 (14:36 -0600)]
util/atomic: Add a _return variant of p_atomic_add

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Don't leak when set_tiling fails
Jason Ekstrand [Mon, 2 Dec 2019 19:51:59 +0000 (13:51 -0600)]
anv: Don't leak when set_tiling fails

Fixes: a44744e01d73 "anv: Require a dedicated allocation for..."
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA
Jason Ekstrand [Tue, 26 Nov 2019 03:55:51 +0000 (21:55 -0600)]
anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Apply cache flushes after setting index/draw VBs
Jason Ekstrand [Mon, 2 Dec 2019 18:32:16 +0000 (12:32 -0600)]
anv: Apply cache flushes after setting index/draw VBs

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Always invalidate the VF cache in BeginCommandBuffer
Jason Ekstrand [Mon, 2 Dec 2019 18:14:45 +0000 (12:14 -0600)]
anv: Always invalidate the VF cache in BeginCommandBuffer

I think the reason why we only do this for primaries is that we didn't
expect to have blorp calls in secondaries.  However, you are allowed to
have a full render pass in a secondary command buffer so resolves and
clears can end up in there.  We should just always invalidate.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoblorp: Pass the VB size to the VF cache workaround
Jason Ekstrand [Mon, 25 Nov 2019 18:42:42 +0000 (12:42 -0600)]
blorp: Pass the VB size to the VF cache workaround

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Add a has_softpin boolean
Jason Ekstrand [Mon, 25 Nov 2019 18:06:20 +0000 (12:06 -0600)]
anv: Add a has_softpin boolean

This separates "has" from "use" which will make the next commit a bit
cleaner.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: Drop bo_flags from anv_bo_pool
Jason Ekstrand [Mon, 2 Dec 2019 18:02:12 +0000 (12:02 -0600)]
anv: Drop bo_flags from anv_bo_pool

In ee77938733cd, we started using the BO cache for anv_bo_pool and
stopped using the bo_flags parameter.  However, we never dropped it from
the struct or the init function.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoglsl/tests: Use splitlines() instead of strip()
Michel Dänzer [Wed, 20 Nov 2019 10:15:04 +0000 (11:15 +0100)]
glsl/tests: Use splitlines() instead of strip()

strip() removes leading and trailing newlines, but leaves newlines
between multiple lines in the string. This could cause failures when
comparing the output of cross-compiled Windows binaries (producing
Windows-style newlines) to the expected output with Unix-style newlines.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
4 years agoandroid: radeonsi: fix build after vl refactoring (v2)
Mauro Rossi [Sat, 16 Nov 2019 17:39:31 +0000 (18:39 +0100)]
android: radeonsi: fix build after vl refactoring (v2)

vl functions moved from radeonsi to gallium/auxiliary/vl have left
android build of radeonsi in broken state.

libmesa_galliumvl static is need to build readeonsi,
gallium_dri building rules are reworked to avoid multiple symbols
and libmesa_galliumvl static dependency is needed in radeonsi.

Here is the changelog:
- android: gallium/auxiliary: add libmesa_galliumvl static
- android: gallium_dri: move libmesa_gallium to static to prevent multiple symbols
- android: radeonsi: fix build after vl refactoring

Fixes the following building error:

external/mesa/src/gallium/drivers/radeonsi/si_uvd.c:47:
error: undefined reference to 'vl_video_buffer_create_as_resource'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 86e60bc ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agointel/compiler: force simd8 when dual src blending on gen8
Tapani Pälli [Mon, 2 Dec 2019 14:54:30 +0000 (16:54 +0200)]
intel/compiler: force simd8 when dual src blending on gen8

Patch introduces option to force simd8 and uses it as a workaround for
dual source blending issues seen with skqp (skia testsuite) on gen8.

Fixes following Piglit test on gen8 platforms:
   arb_blend_func_extended-dual-src-blending-issue-1917

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1917
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
c: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agointel/compiler: add newline to limit_dispatch_width message
Tapani Pälli [Wed, 4 Dec 2019 06:04:21 +0000 (08:04 +0200)]
intel/compiler: add newline to limit_dispatch_width message

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoturnip: Add support for compute shaders.
Eric Anholt [Wed, 27 Nov 2019 04:37:19 +0000 (20:37 -0800)]
turnip: Add support for compute shaders.

Since compute shares the FS state with graphics, we have to re-upload the
pipeline state when switching between compute dispatch and graphics draws.
We could potentially expose graphics and compute as separate queues and
then we wouldn't need pipeline state management, but the closed driver
exposes a single queue and consistency with them is probably good.

So far I'm emitting texture/ibo state as IBs that we jump to.  This is
kind of silly when we could just emit it directly in our CS, but that's a
refactor we can do later.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
4 years agoturnip: Move pipeline BO list adding to BindPipeline.
Eric Anholt [Wed, 4 Dec 2019 20:22:55 +0000 (12:22 -0800)]
turnip: Move pipeline BO list adding to BindPipeline.

We only need to do it once when we bind, rather than having to check at
every draw call.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>