Jason Ekstrand [Thu, 16 Nov 2017 20:49:27 +0000 (12:49 -0800)]
vulkan/wsi: Initialize individual WSI interfaces in wsi_device_init
Now that we have anv_device_init/finish functions, there's no reason to
have the individual driver do any more work than that.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 20:38:26 +0000 (12:38 -0800)]
vulkan/wsi: Drop some unneeded cruft from the API
This drops the unneeded callbacks struct as well as the queue_get_family
callback we were using before we'd pulled QueuePresent inside.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 20:26:26 +0000 (12:26 -0800)]
vulkan/wsi: Add wrappers for all of the surface queries
This lets us move wsi_interface to wsi_common_private.h
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 20:05:35 +0000 (12:05 -0800)]
vulkan/wsi: Drop the can_handle_different_gpu parameter from get_support
Both anv and radv can handle prime now.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 18:46:26 +0000 (10:46 -0800)]
vulkan/wsi: Move wsi_swapchain to wsi_common_private.h
The drivers no longer poke at this directly.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 18:44:41 +0000 (10:44 -0800)]
vulkan/wsi: Add a helper for AcquireNextImage
Unfortunately, due to the fact that AcquireNextImage does not take a
queue, the ANV trick for triggering the fence won't work in general. We
leave dealing with the fence up to the caller for now.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Dave Airlie [Thu, 16 Nov 2017 02:02:04 +0000 (12:02 +1000)]
vulkan/wsi: move swapchain create/destroy to common code
v2 (Jason Ekstrand):
- Rebase
- Alter the names of the helpers to better match the vulkan entrypoints
- Use the helpers in anv
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 17:30:16 +0000 (09:30 -0800)]
vulkan/wsi: Move prime blitting into queue_present
This lets us save a QueueSubmit and it also makes prime a lot less
X11-specific. Also, it means we only have to wait on the semaphores once
instead of on every blit.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Thu, 16 Nov 2017 17:56:37 +0000 (09:56 -0800)]
vulkan/wsi: Move get_images into common code
This moves bits out of all four corners (anv, radv, x11, wayland) and
into the wsi common code. We also switch to using an outarray to ensure
we get our return code right.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
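A minimal sketch of the enumeration contract that "getting the return code right" refers to, assuming the usual Vulkan two-call convention. The names sketch_result and sketch_get_images are hypothetical stand-ins, not Mesa's outarray helper or the Vulkan definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a "success" and an "incomplete" result code. */
enum sketch_result { SKETCH_SUCCESS = 0, SKETCH_INCOMPLETE = 1 };

/* Two-call enumeration:
 *   - out == NULL: only report how many handles are available.
 *   - out != NULL: copy min(*count, available) handles, store the number
 *     written in *count, and report truncation to the caller. */
static enum sketch_result
sketch_get_images(const uint64_t *src, uint32_t available,
                  uint32_t *count, uint64_t *out)
{
   assert(count != NULL);

   if (out == NULL) {
      *count = available;
      return SKETCH_SUCCESS;
   }

   uint32_t n = *count < available ? *count : available;
   for (uint32_t i = 0; i < n; i++)
      out[i] = src[i];
   *count = n;

   return n < available ? SKETCH_INCOMPLETE : SKETCH_SUCCESS;
}
```

Centralizing this logic in one helper means every caller reports truncation the same way instead of each backend re-deriving the return code.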
Jason Ekstrand [Thu, 16 Nov 2017 19:56:00 +0000 (11:56 -0800)]
anv/wsi: Enable prime support
Now that we're using the same common code as radv, we get prime support
for free. Just enable it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 17:46:50 +0000 (09:46 -0800)]
anv/wsi: Use the common QueuePresent code
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 17:18:48 +0000 (09:18 -0800)]
vulkan/wsi: Set a proper pWaitDstStageMask on the dummy submit
Neither Mesa driver really cares, but we should set it nonetheless for
the sake of correctness.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 17:07:58 +0000 (09:07 -0800)]
vulkan/wsi: Only wait on semaphores on the first swapchain
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 16:59:21 +0000 (08:59 -0800)]
vulkan/wsi: Refactor result handling in queue_present
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Dave Airlie [Thu, 16 Nov 2017 01:52:39 +0000 (11:52 +1000)]
radv/wsi: Move the guts of QueuePresent to wsi common
v2 (Jason Ekstrand):
- Better commit message
- Rebase
- Re-indent to follow wsi_common style
- Drop the unneeded _swapchain from the newly added helper
- Make the clone more true to the original (as per the rebase)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 17:39:50 +0000 (09:39 -0800)]
vulkan/wsi: Add a WSI_FROM_HANDLE macro
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Dave Airlie [Thu, 16 Nov 2017 01:03:22 +0000 (11:03 +1000)]
radv/wsi: drop allocate memory special case
Just check if image has scanout flag set
v2 (Jason Ekstrand):
- Rebase
- Also drop the now unused radv_mem_flag_bits enum
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 06:30:20 +0000 (22:30 -0800)]
vulkan/wsi: Do image creation in common code
This uses the mock extension created in a previous commit to tell the
driver that the image it's just been asked to create is, in fact, a
window system image with whatever assumptions that implies. There was a
lot of redundant code between the two drivers to do basically exactly
the same thing.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 03:04:10 +0000 (19:04 -0800)]
vulkan/wsi: Implement prime in a completely generic way
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Sat, 18 Nov 2017 23:30:34 +0000 (15:30 -0800)]
radv: Move wsi initialization later in physical_device_init
We need it to happen after memory type setup so that we can query memory
types in wsi_device_init.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 16:27:01 +0000 (08:27 -0800)]
radv/image: Implement the wsi "extension"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 06:11:46 +0000 (22:11 -0800)]
anv/image: Implement the wsi "extension"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Tue, 28 Nov 2017 16:49:29 +0000 (08:49 -0800)]
anv: Require a dedicated allocation for modified images
This lets us set the BO tiling when we allocate the memory. This is
required for GL to work properly.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Tue, 28 Nov 2017 17:28:12 +0000 (09:28 -0800)]
anv/image: Add a drm_format_mod field
At the moment, this is always initialized to DRM_FORMAT_MOD_INVALID.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Tue, 28 Nov 2017 02:43:43 +0000 (18:43 -0800)]
radv: Implement VK_EXT_external_memory_dma_buf
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Tue, 28 Nov 2017 02:33:44 +0000 (18:33 -0800)]
anv: Implement VK_EXT_external_memory_dma_buf
This is a modified version of the patch originally sent by Chad Versace.
The primary difference is that this version claims that OPAQUE_FD and
DMA_BUF are compatible handle types.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 06:11:23 +0000 (22:11 -0800)]
vulkan/wsi: Add a mock image creation extension
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 04:08:53 +0000 (20:08 -0800)]
vulkan/wsi: Add wsi_swapchain_init/finish functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 02:50:44 +0000 (18:50 -0800)]
vulkan/wsi: Add a wsi_device_init function
This gives the opportunity to collect some function pointers if we'd
like which will be very useful in future.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Thu, 16 Nov 2017 03:59:01 +0000 (19:59 -0800)]
vulkan/wsi/x11: Handle the geometry check earlier in create_swapchain
This fixes a potential leak if allocating the swapchain fails. Since
geometry checking and bit-depth fetching is self-contained, it makes
sense to just do it first so we can delete the geometry reply.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Daniel Stone [Thu, 20 Jul 2017 10:51:48 +0000 (11:51 +0100)]
vulkan/wsi: Add a wsi_image structure
This is used to hold information about the allocated image, rather than
an ever-growing function argument list.
v2 (Jason Ekstrand):
- Rename wsi_image_base to wsi_image
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Dave Airlie [Wed, 15 Nov 2017 22:46:41 +0000 (08:46 +1000)]
vulkan/wsi: use function ptr definitions from the spec.
This just seems cleaner, and we may expand this in future.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 31 Oct 2017 16:57:54 +0000 (09:57 -0700)]
i965: Emit CS stall before MEDIA_VFE_STATE.
This fixes hangs on GFXBench 5's Aztec Ruins benchmark.
Unfortunately, it regresses OglCSCloth performance by about 10%. There
are some ideas for fixing that.
The Vulkan driver already emits this stall.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Tue, 31 Oct 2017 17:02:02 +0000 (10:02 -0700)]
i965: Move PIPE_CONTROL defines and prototypes to brw_pipe_control.h.
We need to be able to emit PIPE_CONTROLs from genX_state_upload.c,
which can't safely include brw_defines.h because it conflicts with
genxml. Move all the PIPE_CONTROL related stuff together into a
separate header.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Tue, 5 Sep 2017 22:46:58 +0000 (15:46 -0700)]
spirv: Replace unreachable with vtn_fail
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Jason Ekstrand [Thu, 17 Aug 2017 00:38:13 +0000 (17:38 -0700)]
spirv: Replace assert with vtn_assert
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Jason Ekstrand [Wed, 16 Aug 2017 23:15:23 +0000 (16:15 -0700)]
spirv: Add vtn_fail and vtn_assert helpers
These helpers are much nicer than just using assert because they don't
kill your process. Instead, they longjmp back to spirv_to_nir(), which cleans
up all the temporary memory and nicely returns NULL. While crashing is
completely OK in the Vulkan world, it's not considered to be quite so
nice in GL. This should help us to make SPIR-V parsing much more
robust. The one downside here is that vtn_assert is not compiled out in
release builds like assert() is so it isn't free.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
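A rough sketch of the setjmp/longjmp pattern described above, under stated assumptions: sketch_builder, sketch_fail, sketch_assert and sketch_parse are hypothetical names for illustration and are not the actual vtn_* implementation.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser state; not the real vtn_builder. */
struct sketch_builder {
   jmp_buf fail_jump;   /* where sketch_fail() returns to */
   uint32_t *scratch;   /* temporary memory owned by the builder */
};

/* Abort parsing without killing the process. */
static void
sketch_fail(struct sketch_builder *b)
{
   longjmp(b->fail_jump, 1);
}

/* Unlike assert(), this check stays in release builds. */
#define sketch_assert(b, cond) \
   do { if (!(cond)) sketch_fail(b); } while (0)

static void
sketch_parse_words(struct sketch_builder *b, const uint32_t *words, size_t count)
{
   sketch_assert(b, count >= 1);
   sketch_assert(b, words[0] == 0x07230203u); /* SPIR-V magic number */
   for (size_t i = 0; i < count; i++)
      b->scratch[i] = words[i];
}

/* Returns a heap copy of the words, or NULL on malformed input. */
static uint32_t *
sketch_parse(const uint32_t *words, size_t count)
{
   struct sketch_builder b;

   b.scratch = malloc((count ? count : 1) * sizeof(uint32_t));
   if (b.scratch == NULL)
      return NULL;

   if (setjmp(b.fail_jump)) {
      /* Reached through sketch_fail(): clean up and report failure. */
      free(b.scratch);
      return NULL;
   }

   sketch_parse_words(&b, words, count);
   return b.scratch;
}
```

The caller only has to check for NULL; all cleanup lives at the single setjmp site rather than at every failing check.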
Jason Ekstrand [Thu, 17 Aug 2017 00:16:45 +0000 (17:16 -0700)]
util: Add a NORETURN macro
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Jason Ekstrand [Wed, 16 Aug 2017 23:38:56 +0000 (16:38 -0700)]
spirv: Do something useful with OpSource
We may as well log the source language and file name.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Jason Ekstrand [Wed, 16 Aug 2017 23:04:08 +0000 (16:04 -0700)]
spirv: Rework logging
This commit reworks the way that logging works in SPIR-V to provide
richer and more detailed logging infrastructure. This commit contains
several improvements over the old mechanism:
1) Log messages are now more detailed. They contain the SPIR-V byte
offset as well as source language information from OpSource and
OpLine.
2) There is now a logging callback mechanism so that errors can get
propagated to the client through debug callback extensions.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Jason Ekstrand [Wed, 16 Aug 2017 23:10:23 +0000 (16:10 -0700)]
spirv: Re-arrange vtn_builder initialization
This simply moves allocating the vtn_builder and initializing it to the
very beginning before we even parse the header.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Jason Ekstrand [Wed, 16 Aug 2017 15:43:08 +0000 (08:43 -0700)]
spirv: Parent the nir_shader to the builder while building
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Rob Clark [Mon, 4 Dec 2017 16:01:52 +0000 (11:01 -0500)]
freedreno: mark stencil buffer valid too in case of z32x24s8
The separate stencil buffer was not also getting marked as valid if
written by a draw/clear, resulting in gmem2mem getting skipped. Move
this into fd_batch_resource_used() which also handles the separate
stencil case.
Also fix restore_buffers typo.
Fixes: 4ab6ab80365 freedreno: avoid mem2gmem for invalidated buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 4 Dec 2017 14:02:07 +0000 (09:02 -0500)]
freedreno: remove use of u_transfer
Freedreno doesn't treat buffers and images differently, so its use was
kind of pointless.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Eric Engestrom [Mon, 4 Dec 2017 13:40:54 +0000 (08:40 -0500)]
freedreno: add -Wno-packed-bitfield-compat for meson build
Otherwise there is a huge amount of spam from instr-a2xx.h. gcc has no way to know
that freedreno was never built with such an old gcc version to care
about the bugs in old gcc ;-)
Reported-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
[added commit message]
Signed-off-by: Rob Clark <robdclark@gmail.com>
Samuel Iglesias Gonsálvez [Thu, 9 Nov 2017 10:15:03 +0000 (11:15 +0100)]
glsl: don't run intrastage array validation when the interface type is not an array
We validate that the interface block array type's definition matches.
However, previously, the function could be called if a non-array
interface block has different type definitions -for example, when the
precision qualifier differs in a GLSL ES shader, we would create two
different types-, and it would return invalid as both definitions are
non-arrays.
We fix this by specifying that at least one definition should be an
array to call the validation.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Samuel Iglesias Gonsálvez [Thu, 9 Nov 2017 08:58:25 +0000 (09:58 +0100)]
glsl/es: precision qualifier doesn't need to match in UBOs
They might mismatch due to the two shaders using different GLSL
versions, and that's ok in desktop GL. In ES, precision qualifiers
don't need to match.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Pierre Moreau [Sun, 3 Dec 2017 20:28:57 +0000 (21:28 +0100)]
nvc0/ir: Properly lower 64-bit shifts when the shift value is >32
Fixes: 61d7676df77 "nvc0/ir: add support for 64-bit shift lowering on SM20/SM30"
Fixes fs-shift-scalar-by-scalar.shader_test from piglit for the current
set-up:
uniform int64_t ival -0x7dfcfefbdf6536ff # bit pattern: 0x82030104209ac901
uniform uint64_t uval 0x1400000085010203
uniform int shl 36
uniform int shr 36
uniform int64_t iexpected_shl 0x09ac901000000000
uniform int64_t iexpected_shr -0x7dfcff0 # bit pattern: 0xfffffffff8203010
uniform uint64_t uexpected_shl 0x5010203000000000
uniform uint64_t uexpected_shr 0x0000000001400000
draw rect ortho 12 0 4 4
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
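For illustration, a generic sketch of lowering 64-bit shifts onto 32-bit halves, including the branch for shift amounts of 32 or more. This is not the nvc0 code generator's lowering; pair32, shl64, shr64 and sar32 are hypothetical names. sar32 exists only because right-shifting a negative signed integer is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves (hypothetical helper type). */
struct pair32 { uint32_t lo, hi; };

static struct pair32 split64(uint64_t v)
{
   struct pair32 p = { (uint32_t)v, (uint32_t)(v >> 32) };
   return p;
}

static uint64_t join64(struct pair32 p)
{
   return ((uint64_t)p.hi << 32) | p.lo;
}

/* Portable 32-bit arithmetic shift right, s in [0, 31]. */
static uint32_t sar32(uint32_t x, unsigned s)
{
   return (x & 0x80000000u) ? ~(~x >> s) : (x >> s);
}

static struct pair32 shl64(struct pair32 v, unsigned s)
{
   struct pair32 r;
   assert(s < 64);
   if (s == 0) {
      r = v;
   } else if (s < 32) {
      r.hi = (v.hi << s) | (v.lo >> (32 - s));
      r.lo = v.lo << s;
   } else {
      /* Large shift: the low half moves entirely into the high half. */
      r.hi = v.lo << (s - 32);
      r.lo = 0;
   }
   return r;
}

static struct pair32 shr64(struct pair32 v, unsigned s, int is_signed)
{
   struct pair32 r;
   assert(s < 64);
   if (s == 0) {
      r = v;
   } else if (s < 32) {
      r.lo = (v.lo >> s) | (v.hi << (32 - s));
      r.hi = is_signed ? sar32(v.hi, s) : (v.hi >> s);
   } else {
      /* Large shift: the high half moves into the low half and the
       * new high half is zero or sign fill. */
      r.lo = is_signed ? sar32(v.hi, s - 32) : (v.hi >> (s - 32));
      r.hi = is_signed ? sar32(v.hi, 31) : 0;
   }
   return r;
}
```

Run against the uniforms listed in the test above with a shift of 36, this sketch reproduces the four expected bit patterns.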
Fabian Bieler [Thu, 23 Nov 2017 20:48:00 +0000 (13:48 -0700)]
glsl: Match order of gl_LightSourceParameters elements.
spotExponent and spotCosCutoff were swapped in the
gl_builtin_uniform_element struct.
Now the order matches across gl_builtin_uniform_element,
glsl_struct_field and the spec.
Reviewed-by: Brian Paul <brianp@vmware.com>
Fabian Bieler [Thu, 23 Nov 2017 20:48:00 +0000 (13:48 -0700)]
glsl: Fix gl_NormalScale.
GLSL shaders can access the normal scale factor with the built-in
gl_NormalScale. Mesa's modelspace lighting optimization uses a different
normal scale factor than defined in the spec. We have to take care not
to use this factor for gl_NormalScale.
Mesa already defines two separate states: state.normalScale and
state.internal.normalScale. The first is used by the glsl compiler
while the latter is used by the fixed function T&L pipeline. Previously
the only difference was some component swizzling. With this commit
state.normalScale always uses the normal scale factor for eyespace
lighting.
Reviewed-by: Brian Paul <brianp@vmware.com>
Timothy Arceri [Fri, 10 Nov 2017 10:33:37 +0000 (21:33 +1100)]
st/glsl_to_nir/radeonsi: enable gs support for nir backend
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 8 Nov 2017 03:20:23 +0000 (14:20 +1100)]
ac: add si_nir_load_input_gs() to the abi
V2: make use of driver_location and don't expose NIR to the ABI.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 10 Nov 2017 02:55:48 +0000 (13:55 +1100)]
ac: move build_varying_gather_values() to ac_llvm_build.h and expose
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 10 Nov 2017 02:47:50 +0000 (13:47 +1100)]
ac: add basic nir -> llvm type helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 7 Nov 2017 10:13:37 +0000 (21:13 +1100)]
radeonsi: create si_llvm_load_input_gs()
This creates a common function that can be shared by the tgsi
and nir backends.
v2: use LLVMBuildBitCast() directly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 7 Nov 2017 10:42:10 +0000 (21:42 +1100)]
radeonsi: pass llvm type to lds_load()
v2: use LLVMBuildBitCast() directly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 7 Nov 2017 10:41:27 +0000 (21:41 +1100)]
radeonsi: add llvm_type_is_64bit() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 7 Nov 2017 10:27:18 +0000 (21:27 +1100)]
radeonsi: pass llvm type to si_llvm_emit_fetch_64bit()
v2: use LLVMBuildBitCast() directly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 9 Nov 2017 02:34:37 +0000 (13:34 +1100)]
radeonsi: add nir support for gs epilogue
v2: add emit_gs_epilogue() helper function to reduce duplication.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Mon, 6 Nov 2017 11:35:50 +0000 (22:35 +1100)]
radeonsi: add nir support for es epilogue
v2: make use of existing si_tgsi_emit_epilogue()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 2 Nov 2017 04:02:55 +0000 (15:02 +1100)]
radeonsi: add nir support for ls epilogue
v2: make use of existing si_tgsi_emit_epilogue()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 22 Nov 2017 06:37:32 +0000 (17:37 +1100)]
st/glsl_to_nir: add gs support to st_nir_assign_var_locations()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 15 Nov 2017 03:36:22 +0000 (14:36 +1100)]
st/glsl_to_nir: use nir_lower_io_arrays_to_elements() to lower arrays
This pass is more fully featured: it supports geom and tess shaders.
It also supports interpolation intrinsics.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 15 Nov 2017 03:30:22 +0000 (14:30 +1100)]
nir: allow builtin arrays to be lowered
Gallium's NIR drivers expect this to be done.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 15 Nov 2017 03:28:01 +0000 (14:28 +1100)]
nir: add array lowering function that assumes there are no indirects
The gallium glsl->nir pass currently lowers away all indirects on both inputs
and outputs. This function allows us to lower vs inputs and fs outputs and also
lower things one stage at a time as we don't need to worry about indirects
on the other side of the shader's interface.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Mon, 30 Oct 2017 00:58:52 +0000 (11:58 +1100)]
radv: enable nir varying array splitting
Acked-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Mon, 20 Nov 2017 06:20:35 +0000 (17:20 +1100)]
st/glsl_to_nir: enable NIR link time opts
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 17 Nov 2017 06:04:22 +0000 (17:04 +1100)]
radeonsi/nir: add support for packed inputs
Because NIR can create non vec4 variables when implementing component
packing we need to make sure not to reprocess the same slot again.
Also we can drop the fs_attr_idx counter and just use driver_location.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 17 Nov 2017 06:45:32 +0000 (17:45 +1100)]
st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts()
NIR component packing will be inserted between these calls and the
calling of st_glsl_to_nir_post_opts().
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 14 Nov 2017 01:56:20 +0000 (12:56 +1100)]
st/glsl_to_nir: call some lowering passes earlier
This is required so that we can enable NIR linking optimisations.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 23:13:58 +0000 (10:13 +1100)]
st/glsl_to_nir: add basic NIR opt loop helper
We need to be able to do these NIR opts in the state tracker
rather than the driver in order for the NIR linking opts to
be useful.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 23:06:47 +0000 (10:06 +1100)]
st/glsl_to_nir: make st_glsl_to_nir() static
Here we also move the extern C functions to the bottom of the file.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 22:15:54 +0000 (09:15 +1100)]
st/glsl_to_nir: split the st_glsl_to_nir() function in two
We want to be able to generate NIR then apply NIR optimisations.
Once the optimisations are done we can then apply the new post opt
function which assigns uniforms etc based on the optimised IR.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 22:03:45 +0000 (09:03 +1100)]
st/glsl_to_nir: create set_st_program() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Thu, 9 Nov 2017 06:32:08 +0000 (17:32 +1100)]
st/glsl: move nir linking loop to new function st_link_nir()
This will allow us to refactor linking and include some nir link
time optimisations.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 17 Nov 2017 03:27:27 +0000 (14:27 +1100)]
nir: fix support for scalar arrays in nir_lower_io_types()
This was just recreating the same vector type we already had and
hitting an assert for scalars.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 21 Nov 2017 00:53:58 +0000 (11:53 +1100)]
st/glsl_to_nir: add st_nir_assign_var_locations() helper
This avoids packed varyings being assigned different driver locations.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Wed, 18 Oct 2017 02:48:41 +0000 (13:48 +1100)]
radv: enable nir component packing
SaschaWillems Vulkan demo tessellation:
~4000fps -> ~4600fps
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Wed, 18 Oct 2017 08:40:06 +0000 (19:40 +1100)]
nir: add varying component packing helpers
v2: update shader info input/output masks when packing components
v3: make sure interpolation loc matches, this is required for the
radeonsi NIR backend.
v4:
33dca36f4f28 fixed nir_gather_info to update outputs_read
correctly; make sure we also adjust this correctly when
packing components.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
Timothy Arceri [Mon, 23 Oct 2017 04:51:29 +0000 (15:51 +1100)]
nir: add varying array splitting pass
V2:
- fix matrix support, non-array matrices were being skipped in v1
v3:
- handle lowering of tcs output loads correctly
- correctly mark indirect locations for either in or out not both
when processing a stage.
- use nir_src_copy() when lowering stores.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Rob Clark [Sun, 3 Dec 2017 16:50:09 +0000 (11:50 -0500)]
freedreno/ir3: relax barriers
Instructions with no barrier_class can move wrt. an EVERYTHING barrier.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 3 Dec 2017 16:48:56 +0000 (11:48 -0500)]
freedreno/ir3: all mem instructions have WAR hazard
It isn't just load instructions that have write-after-read hazard.
Fixes stk gaussian blur compute shaders.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 29 Nov 2017 20:06:39 +0000 (15:06 -0500)]
freedreno: add debug option to force emulated indirect
Useful mostly for debugging indirect draw.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 29 Nov 2017 14:04:08 +0000 (09:04 -0500)]
freedreno: also mark draw-indirect buffer as read
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 20 Nov 2017 20:33:54 +0000 (15:33 -0500)]
freedreno: small cleanups
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 19 Nov 2017 21:45:04 +0000 (16:45 -0500)]
freedreno: avoid unnecessary batch flush
In some cases we can end up trying to add a write dependency on ourself,
which shouldn't trigger a flush.
Avoids an extra couple flushes per frame in stk.
Signed-off-by: Rob Clark <robdclark@gmail.com>
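A small sketch of the self-dependency check described above, assuming each resource remembers the batch that last wrote it. sketch_batch, sketch_resource and sketch_resource_written are hypothetical names, not the freedreno data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batch; flushes counts how often it was flushed. */
struct sketch_batch { int flushes; };

/* Hypothetical resource that remembers its pending writer. */
struct sketch_resource { struct sketch_batch *write_batch; };

static void sketch_flush(struct sketch_batch *b)
{
   b->flushes++;
}

/* Record that batch writes rsc. */
static void
sketch_resource_written(struct sketch_batch *batch, struct sketch_resource *rsc)
{
   assert(batch != NULL && rsc != NULL);

   /* Depending on ourself is not a real dependency: nothing to flush. */
   if (rsc->write_batch == batch)
      return;

   /* A different batch has a pending write, so it has to go first. */
   if (rsc->write_batch != NULL)
      sketch_flush(rsc->write_batch);

   rsc->write_batch = batch;
}
```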
Rob Clark [Sun, 19 Nov 2017 17:50:50 +0000 (12:50 -0500)]
freedreno: avoid mem2gmem for invalidated buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 19 Nov 2017 16:42:25 +0000 (11:42 -0500)]
freedreno: deferred flush support
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 19 Nov 2017 15:36:19 +0000 (10:36 -0500)]
freedreno: rework fence tracking
ctx->last_fence isn't such a terribly clever idea, if batches can be
flushed out of order. Instead, each batch now holds a fence, which is
created before the batch is flushed (useful for next patch), that later
gets populated after the batch is actually flushed.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 20 Nov 2017 14:52:04 +0000 (09:52 -0500)]
freedreno: proper locking for iterating dependent batches
In transfer_map(), when we need to flush batches that read from a
resource, we should be holding screen->lock to guard against race
conditions. Somehow deferred flush seems to make this existing
race more obvious.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 22 Nov 2017 14:45:28 +0000 (09:45 -0500)]
freedreno/a5xx: correct max_indices for indirect draws
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jason Ekstrand [Thu, 19 Oct 2017 00:28:19 +0000 (17:28 -0700)]
spirv: Convert the supported_extensions struct to spirv_options
This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jason Ekstrand [Thu, 19 Oct 2017 17:11:22 +0000 (10:11 -0700)]
spirv: Only emit functions which are actually used
Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint. This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
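One way to picture "only emit what the entrypoint references" is a reachability walk over the call graph. The sketch below assumes a hypothetical adjacency-matrix call graph (sketch_module); it is not the spirv_to_nir data model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FUNCS 16

/* Hypothetical call graph: calls[f][g] is true when f calls g. */
struct sketch_module {
   size_t num_funcs;
   bool calls[SKETCH_MAX_FUNCS][SKETCH_MAX_FUNCS];
};

/* Mark every function reachable from entry using an explicit worklist.
 * Each function is pushed at most once, so the worklist cannot overflow. */
static void
sketch_mark_used(const struct sketch_module *m, size_t entry, bool *used)
{
   size_t worklist[SKETCH_MAX_FUNCS];
   size_t top = 0;

   assert(m->num_funcs <= SKETCH_MAX_FUNCS && entry < m->num_funcs);

   for (size_t i = 0; i < m->num_funcs; i++)
      used[i] = false;

   used[entry] = true;
   worklist[top++] = entry;

   while (top > 0) {
      size_t f = worklist[--top];
      for (size_t g = 0; g < m->num_funcs; g++) {
         if (m->calls[f][g] && !used[g]) {
            used[g] = true;
            worklist[top++] = g;
         }
      }
   }
}
```

Only functions with used[i] set would then be emitted; helpers that belong to other entrypoints in the same module are skipped.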
Jason Ekstrand [Thu, 19 Oct 2017 16:56:22 +0000 (09:56 -0700)]
spirv: Drop the impl field from vtn_builder
We have a nir_builder and it has an impl field.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jordan Justen [Fri, 1 Dec 2017 01:48:57 +0000 (17:48 -0800)]
i965: Serialize nir later in the linking process
Fixes MESA_GLSL=cache_fb with piglit
tests/spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test
Fixes: 0610a624a12 i965/link: Serialize program to nir after linking for shader cache
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marc Dietrich [Wed, 29 Nov 2017 21:25:05 +0000 (22:25 +0100)]
configure: avoid testing for negative compiler options
gcc seems to always accept unsupported negative compiler warning options:
echo "int i;" | gcc -c -xc -Wno-bob - # no error
echo "int i;" | gcc -c -xc -Walice - # unsupported compiler option
Inverting the options fixes the tests.
V2: fix options in meson build
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Eric Anholt [Wed, 29 Nov 2017 00:17:16 +0000 (16:17 -0800)]
broadcom/vc4: Use a single-entry cached last_hindex value.
Since almost all BOs will be in one CL at a time, this cache will almost
always hit except for the first usage of the BO in each CL.
This didn't show up as statistically significant on the minetest trace
(n=340), but if I lop off the throttled lobe of the bimodal distribution,
it very clearly does (0.74731% +/- 0.162093%, n=269).
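A sketch of a single-entry index cache of the kind described above, assuming each BO remembers the slot it occupied in the last handle list that used it. sketch_bo, sketch_cl and sketch_hindex are hypothetical names, and the searches counter exists only to make the slow path visible.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_HANDLES 64

/* Hypothetical buffer object. */
struct sketch_bo {
   uint32_t handle;
   uint32_t last_hindex;   /* slot this BO had in the last list that used it */
};

/* Hypothetical per-CL handle table. */
struct sketch_cl {
   uint32_t handles[SKETCH_MAX_HANDLES];
   uint32_t count;
   uint32_t searches;      /* number of slow-path scans, for illustration */
};

/* Return the index of bo's handle in cl, appending it if needed. */
static uint32_t
sketch_hindex(struct sketch_cl *cl, struct sketch_bo *bo)
{
   uint32_t i = bo->last_hindex;

   /* Fast path: the cached slot still holds this BO's handle. */
   if (i < cl->count && cl->handles[i] == bo->handle)
      return i;

   /* Slow path: linear search, then append if the handle is missing. */
   cl->searches++;
   for (i = 0; i < cl->count; i++) {
      if (cl->handles[i] == bo->handle)
         break;
   }
   if (i == cl->count) {
      assert(cl->count < SKETCH_MAX_HANDLES);
      cl->handles[cl->count++] = bo->handle;
   }

   bo->last_hindex = i;
   return i;
}
```

The cached index is validated against the table before it is trusted, so a stale value from an earlier CL only costs one extra scan.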
Eric Anholt [Sat, 25 Nov 2017 06:34:12 +0000 (22:34 -0800)]
broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.
No significant difference in the minetest replay, but it should reduce
overhead by not requiring that we write quad indices to index buffers that
we repeatedly re-upload (and making the draw packet smaller, as well).
Over the course of the series the actual game seems to be up by 1-2 fps.
Eric Anholt [Sat, 25 Nov 2017 06:20:21 +0000 (22:20 -0800)]
broadcom/vc4: Use the new enum functionality of the XML to decode better.
Eric Anholt [Sat, 25 Nov 2017 06:15:28 +0000 (22:15 -0800)]
broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.
Now that there's only one user of it, it's pretty obvious how to avoid
emitting redundant ones. This should save a bunch of kernel validation
overhead.
No statistically significant difference on the minetest trace I was looking
at (n=169), but the maximum FPS is up by .3%