mesa.git
5 years agovulkan: Update the XML and headers to 1.1.93
Jason Ekstrand [Mon, 19 Nov 2018 15:37:38 +0000 (09:37 -0600)]
vulkan: Update the XML and headers to 1.1.93

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agoradv: remove useless sync after CmdClear{Color,DepthStencil}Image()
Samuel Pitoiset [Wed, 21 Nov 2018 10:34:42 +0000 (11:34 +0100)]
radv: remove useless sync after CmdClear{Color,DepthStencil}Image()

'post_flush' is only set to NULL for the normal clear path
(ie. only vkCmdClearColorImage() and vkCmdClearDepthStencilImage()
are affected commands).

Because these two operations have to be externally synchronized
with VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT,
it's useless to set those flags internally.

VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2 will be superseded
by RADV_CMD_FLAG_INV_GLOBAL_L2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agovulkan: Allow storage images in the WSI.
Bas Nieuwenhuizen [Tue, 20 Nov 2018 20:57:27 +0000 (21:57 +0100)]
vulkan: Allow storage images in the WSI.

Since apps also have to follow the ImageFormatProperties query,
we can disallow formats that don't allow image stores (for AMD
that would be SRGB formats).

Note that this only affects anything if the app actually decides
to use the flag.

Had someone ask for this on IRC and at least on the AMD side we
can support it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agost/nine: Remove thread_submit warning
Axel Davy [Sat, 10 Nov 2018 15:57:31 +0000 (16:57 +0100)]
st/nine: Remove thread_submit warning

thread_submit can be useful even without DRI_PRIME,
as it can help avoid missed pageflips.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
5 years agost/nine: Allow 'triple buffering' with thread_submit
Axel Davy [Sat, 10 Nov 2018 10:42:39 +0000 (11:42 +0100)]
st/nine: Allow 'triple buffering' with thread_submit

The path allowing triple buffering behaviour wasn't implemented
yet for thread_submit

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
5 years agovirgl: add assert and missing function parameter
Robert Foss [Tue, 20 Nov 2018 15:38:27 +0000 (16:38 +0100)]
virgl: add assert and missing function parameter

Verify the pipe_fd_type to be of PIPE_FD_TYPE_NATIVE_SYNC.

Fixes: d1a1c21e7621b5177feb "virgl: native fence fd support"
Suggested-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agor600: clean up the GS ring buffers when the context is destroyed
Gert Wollny [Fri, 16 Nov 2018 11:48:08 +0000 (12:48 +0100)]
r600: clean up the GS ring buffers when the context is destroyed

This fixes two memory leaks reported by ASAN:

Direct leak of 248 byte(s) in 1 object(s) allocated from:
   in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
   in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578
   in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600
   in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265
   in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:725
   in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291
   in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1482

Direct leak of 248 byte(s) in 1 object(s) allocated from:
   in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
   in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578
   in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600
   in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265
   in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:722
   in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291
   in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1489

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Fixes: 1371d65a7fbd695d3516861fe733685569d890d0
  r600g: initial support for geometry shaders on evergreen (v2)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoradv: only sync CP DMA for transfer operations or bottom pipe
Samuel Pitoiset [Tue, 20 Nov 2018 15:41:23 +0000 (16:41 +0100)]
radv: only sync CP DMA for transfer operations or bottom pipe

CP DMA can only be busy when the driver copies buffers. The
only affected Vulkan commands are vkCmdCopyBuffer() and
vkCmdUpdateBuffer() (because we fall back to a copy depending on
a threshold). Clear operations are currently not concerned
because the driver always syncs after the last DMA operation.

Per the spec, these two operations have to be externally
synchronized with VK_PIPELINE_STAGE_TRANSFER_BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: ignore subpass self-dependencies
Samuel Pitoiset [Tue, 20 Nov 2018 12:48:34 +0000 (13:48 +0100)]
radv: ignore subpass self-dependencies

Unnecessary as they allow the app to call vkCmdPipelineBarrier()
inside the render pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoRevert "nir/builder: Assert that intN_t immediates fit"
Iago Toral Quiroga [Tue, 20 Nov 2018 08:24:28 +0000 (09:24 +0100)]
Revert "nir/builder: Assert that intN_t immediates fit"

This reverts commit 1f29f4db1e867357a119c0c7c34fb54dc27fb682.

For this to work the compiler must ensure that it never puts
the values that arrive to this helper into unsigned variables
at any point in its processing, since that would not apply sign
extension to the value and it would break the expectations here.
Unfortunately, we use uint64_t extensively to pass and copy
things around, so sometimes we get to this helper with values
that are not properly sign extended to 64-bit. Here is an example
for an 8-bit value that comes from a switch case:

(gdb) p /x x
$1 = 0xffffffd6

The value seems to have been properly sign extended to 32-bit at
some point, but then copied into a uint64_t, which won't apply
sign extension, breaking the expectations of the assertion.
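
A minimal sketch of the failure mode described above. The helper names are hypothetical, and `fits_in_intN()` only stands in for the reverted assertion; the actual check in nir_builder may be written differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the reverted check: does the 64-bit value x
 * hold a correctly sign-extended bit_size-bit integer?
 * bit_size is assumed to be in the range [1, 64]. */
static inline bool
fits_in_intN(unsigned bit_size, int64_t x)
{
   if (bit_size >= 64)
      return true;
   int64_t min = -(INT64_C(1) << (bit_size - 1));
   int64_t max = (INT64_C(1) << (bit_size - 1)) - 1;
   return x >= min && x <= max;
}

/* An 8-bit value that is sign extended to 32 bits and then carried
 * around in unsigned storage, as in the gdb output above. */
static inline uint64_t
widen_through_unsigned(int8_t v)
{
   uint32_t as_u32 = (uint32_t)(int32_t)v; /* -42 becomes 0xffffffd6 */
   return as_u32;                          /* zero extended to 64 bits */
}
```

With v = -42 the widened value is 0x00000000ffffffd6, which is far outside the 8-bit signed range, so a range check on the 64-bit value fails even though the original 8-bit value was valid.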

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir/from_ssa: fix bit-size of temporary register
Iago Toral Quiroga [Mon, 19 Nov 2018 12:58:06 +0000 (13:58 +0100)]
nir/from_ssa: fix bit-size of temporary register

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agomesa: Remove unneeded bitfield widths from the VAO.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Remove unneeded bitfield widths from the VAO.

With the current VAO layout we do not need to make these
fields a bitfield. We get a tight struct layout with this change
for VAO attributes.

v2: Change unsigned char -> GLubyte.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agomesa: Factor out struct gl_vertex_format.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Factor out struct gl_vertex_format.

Factor out struct gl_vertex_format from array attributes.
The data type is supposed to describe the type of a vertex
element. At this current stage the data type is only used
with the VAO, but actually is useful in various other places.
Due to the bitfields being used, special care needs to be
taken for the glGet code paths.

v2: Change unsigned char -> GLubyte.
    Use struct assignment for struct gl_vertex_format.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agotnl: Use gl_array_attribute::_ElementSize.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
tnl: Use gl_array_attribute::_ElementSize.

Instead of open coding the size computation, use the
already available gl_array_attribute::_ElementSize value.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agonouveau: Use gl_array_attribute::_ElementSize.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
nouveau: Use gl_array_attribute::_ElementSize.

Instead of open coding the size computation, use the
already available gl_array_attribute::_ElementSize value.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agomesa: Unify glEdgeFlagPointer data type.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Unify glEdgeFlagPointer data type.

Use GL_UNSIGNED_BYTE as initialization data type
for the edge flag vertex attribute array. The same datatype
is used in the glEdgeFlagPointer function when setting the
array pointer.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agomesa: Work with bitmasks when en/dis-abling VAO arrays.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Work with bitmasks when en/dis-abling VAO arrays.

For enabling or disabling VAO arrays it is now possible to
change a set of arrays with a single call without the need to
iterate the attributes.
Make use of this technique in the vao module.
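
A minimal sketch of the bitmask technique, assuming a reduced, hypothetical `struct vao_sketch` with one bit per attribute. The struct and function names are illustrative and are not the Mesa API.

```c
#include <stdint.h>

/* Hypothetical, reduced stand-in for a VAO: one bit per attribute. */
struct vao_sketch {
   uint32_t enabled; /* bit i set => attribute array i is enabled */
};

/* Enable every attribute in attrib_bits with a single update.
 * Returns the bits that actually changed state. */
static inline uint32_t
vao_enable_attribs(struct vao_sketch *vao, uint32_t attrib_bits)
{
   uint32_t changed = attrib_bits & ~vao->enabled;
   vao->enabled |= attrib_bits;
   return changed;
}

/* Disable every attribute in attrib_bits with a single update.
 * Returns the bits that actually changed state. */
static inline uint32_t
vao_disable_attribs(struct vao_sketch *vao, uint32_t attrib_bits)
{
   uint32_t changed = attrib_bits & vao->enabled;
   vao->enabled &= ~attrib_bits;
   return changed;
}
```

Returning the changed bits lets a caller skip state invalidation when the call was a no-op.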

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agomesa: Remove gl_array_attributes::Enabled.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Remove gl_array_attributes::Enabled.

Now that all users go via the VAO Enabled bitfield,
get rid of the Enabled boolean.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agomesa: Use gl_vertex_array_object::Enabled for glGet.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Use gl_vertex_array_object::Enabled for glGet.

Instead of using gl_array_attributes::Enabled use the
much more compact representation stored in
gl_vertex_array_object::Enabled using the corresponding bits.
Keep the glGet changes in a separate patch at least for review.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agomesa: Use the gl_vertex_array_object::Enabled bitfield.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Use the gl_vertex_array_object::Enabled bitfield.

Instead of using gl_array_attributes::Enabled use the
much more compact representation stored in
gl_vertex_array_object::Enabled using the corresponding bits.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agomesa: Rename gl_vertex_array_object::_Enabled -> Enabled.
Mathias Fröhlich [Sat, 17 Nov 2018 06:13:11 +0000 (07:13 +0100)]
mesa: Rename gl_vertex_array_object::_Enabled -> Enabled.

Mark the bitfield value, which was derived until now, as the
primary value by removing the underscore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agoradeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSED
Marek Olšák [Tue, 13 Nov 2018 21:19:42 +0000 (16:19 -0500)]
radeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSED

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102597

Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TS
Marek Olšák [Tue, 13 Nov 2018 21:16:51 +0000 (16:16 -0500)]
radeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TS

There are no writes.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agost/mesa: pin driver threads to a fixed CCX when glthread is enabled
Marek Olšák [Tue, 13 Nov 2018 00:09:25 +0000 (19:09 -0500)]
st/mesa: pin driver threads to a fixed CCX when glthread is enabled

radeonsi has 3 driver threads (glthread, gallium, winsys), other drivers
may have 2 (glthread, gallium), so it makes sense to pin them to a random
CCX and keep that irrespective of the app thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agost/mesa: regularly re-pin driver threads to the CCX where the app thread is
Marek Olšák [Mon, 12 Nov 2018 23:10:59 +0000 (18:10 -0500)]
st/mesa: regularly re-pin driver threads to the CCX where the app thread is

This is used when glthread is disabled.

Mesa pretty much chases the app thread on the CPU.
The performance is the same as pinning the app thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agodrirc: enable glthread for Talos Principle
Marek Olšák [Sat, 10 Nov 2018 06:22:32 +0000 (01:22 -0500)]
drirc: enable glthread for Talos Principle

Ryzen 1700X, Vega 56, 1600x900, 4xAA: improvement +4.4%

Immediate mode was needed.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agomesa/glthread: enable immediate mode
Marek Olšák [Sat, 10 Nov 2018 06:18:30 +0000 (01:18 -0500)]
mesa/glthread: enable immediate mode

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agomesa/glthread: pass the function name to _mesa_glthread_restore_dispatch
Marek Olšák [Sat, 10 Nov 2018 06:17:13 +0000 (01:17 -0500)]
mesa/glthread: pass the function name to _mesa_glthread_restore_dispatch

If you insert printf there, you'll know why glthread was disabled.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agogallium/u_tests: fix MSVC build by using old-style zero initializers
Marek Olšák [Wed, 21 Nov 2018 00:06:22 +0000 (19:06 -0500)]
gallium/u_tests: fix MSVC build by using old-style zero initializers

5 years agoi965: Do NIR shader cloning in the caller.
Kenneth Graunke [Fri, 9 Nov 2018 06:10:03 +0000 (22:10 -0800)]
i965: Do NIR shader cloning in the caller.

This moves nir_shader_clone() to the driver-specific compile function,
rather than the shared src/intel/compiler code.  This allows i965 to do
key-specific passes before calling brw_compile_*.  Vulkan should not
need this cloning as it doesn't compile multiple variants.

We do need to continue cloning in the compute shader code because we
lower various things in NIR based on the SIMD width.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
5 years agoi965: Use a 'nir' temporary rather than poking at brw_program
Kenneth Graunke [Fri, 9 Nov 2018 05:53:16 +0000 (21:53 -0800)]
i965: Use a 'nir' temporary rather than poking at brw_program

It's shorter and will also be useful when I adjust cloning soon.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
5 years agogallium/u_tests: add a compute shader test that clears an image
Marek Olšák [Wed, 14 Nov 2018 21:41:33 +0000 (16:41 -0500)]
gallium/u_tests: add a compute shader test that clears an image

5 years agoac: handle cast derefs
Dave Airlie [Mon, 19 Nov 2018 04:16:16 +0000 (14:16 +1000)]
ac: handle cast derefs

Just give back the same value for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: handle loading from shared pointers
Dave Airlie [Mon, 19 Nov 2018 03:48:37 +0000 (13:48 +1000)]
radv: handle loading from shared pointers

We won't have a var to load from, so don't try to do the processing
required if we don't need it.

This avoids crashes in:
dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.workgroup_two_buffers

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac: avoid casting pointers on bcsel and stores
Dave Airlie [Mon, 19 Nov 2018 03:00:36 +0000 (13:00 +1000)]
ac: avoid casting pointers on bcsel and stores

For variable pointers we really don't want to cast the pointers to int
without a good reason, just add a wrapper for bcsel loading and result
storing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agomeson: Add tests to suites
Dylan Baker [Mon, 19 Nov 2018 21:44:15 +0000 (13:44 -0800)]
meson: Add tests to suites

Meson test has a concept of suites, which allow tests to be grouped
together. This allows for a subset of tests to be run only (say only
the tests for nir). A test can be added to more than one suite, but for
the most part I've only added a test to a single suite, though I've
added a compiler group that includes nir, glsl, and glcpp tests.

To use this you'll need to invoke meson test directly, instead of ninja
test (which always runs all targets). It can be invoked as:
`meson test -C builddir --suite $suitename` (meson test has additional
options that are pretty useful).

Tested-By: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoi965/batch: avoid reverting batch buffer if saved state is an empty
Andrii Simiklit [Mon, 5 Nov 2018 07:48:26 +0000 (09:48 +0200)]
i965/batch: avoid reverting batch buffer if saved state is an empty

There's no point reverting to the last saved point if that save point is
the empty batch, we will just repeat ourselves.

v2: Merge with new commits, changes were minimized, added the 'fixes' tag
v3: Added in to patch series
v4: Fixed the regression which was introduced by this patch
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630
Reported-by: Mark Janes <mark.a.janes@intel.com>
    The solution provided by: Jordan Justen <jordan.l.justen@intel.com>

CC: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 3faf56ffbdeb "intel: Add an interface for saving/restoring
                     the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630 (fixed in v4)
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agotravis: adding missing x11-xcb for meson+vulkan
Emil Velikov [Fri, 7 Sep 2018 13:58:56 +0000 (14:58 +0100)]
travis: adding missing x11-xcb for meson+vulkan

Required by the x11 WSI

Fixes: df82012b2cb ("travis: add meson build for vulkan drivers.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoglx: make xf86vidmode mandatory for direct rendering
Emil Velikov [Fri, 16 Nov 2018 11:15:37 +0000 (11:15 +0000)]
glx: make xf86vidmode mandatory for direct rendering

Currently we detect the module and if missing, the glXGetMsc* API is
effectively a stub, always returning false.

This is what effectively has been happening with our meson build :-(

Thus users have no chance of using it - they cannot even distinguish
if the failure is due to a misconfigured build.

There's no reason for keeping xf86vidmode optional - it has been
available in all distributions for years.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: a47c525f3281a2753180e "meson: build glx"
5 years agotravis: drop unneeded x11proto-xf86vidmode-dev
Emil Velikov [Fri, 16 Nov 2018 11:10:57 +0000 (11:10 +0000)]
travis: drop unneeded x11proto-xf86vidmode-dev

The only place where the package is needed is for building the DRI
based libGL library.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoac/nir: fix intrinsic name string size in visit_image_atomic()
Samuel Pitoiset [Tue, 20 Nov 2018 09:01:01 +0000 (10:01 +0100)]
ac/nir: fix intrinsic name string size in visit_image_atomic()

Fixes an assertion in SoTTR.

Fixes: dd0172e865 ("radv: Use structured intrinsics instead of indexing workaround for GFX9.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: Use structured intrinsics instead of indexing workaround for GFX9.
Bas Nieuwenhuizen [Mon, 12 Nov 2018 21:42:36 +0000 (22:42 +0100)]
radv: Use structured intrinsics instead of indexing workaround for GFX9.

These force the index to be used in the instruction so we don't need the
workaround.

Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoi965: Allow only one slot of clip distances to be set on Gen4-5.
Kenneth Graunke [Sat, 27 Oct 2018 18:20:28 +0000 (11:20 -0700)]
i965: Allow only one slot of clip distances to be set on Gen4-5.

The existing backend code assumed that if VARYING_SLOT_CLIP_DIST0
was written, then VARYING_SLOT_CLIP_DIST1 would be as well.  That's
true with the current lowering, but not necessary if there are 4 or
fewer clip distances.  Separate out the checks to allow this.
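
A small sketch of why the second slot is not always needed, assuming each clip distance slot is a vec4 holding four distances. The helper name is hypothetical.

```c
/* Each clip distance varying slot is a vec4, i.e. it holds four distances. */
#define CLIP_DISTS_PER_SLOT 4u

/* Number of vec4 slots needed for num_clip_distances values (0..8). */
static inline unsigned
clip_dist_slots_needed(unsigned num_clip_distances)
{
   return (num_clip_distances + CLIP_DISTS_PER_SLOT - 1) / CLIP_DISTS_PER_SLOT;
}
```

With four or fewer distances only the first slot is written, so a backend must check each slot separately instead of inferring the second from the first.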

The new NIR-based lowering will trigger this case, which would have
caused backend validation errors (src is null) without this patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: Make nir_lower_clip_vs optionally work with variables.
Kenneth Graunke [Mon, 22 May 2017 02:26:15 +0000 (19:26 -0700)]
nir: Make nir_lower_clip_vs optionally work with variables.

The way nir_lower_clip_vs() works with store_output intrinsics makes a
ton of assumptions about the driver_location field.

In i965 and iris, I'd rather do this lowering early and work with
variables.  v3d may want to switch to that as well, and ir3 could too,
but I'm not sure exactly what would need updating.  For now, handle
both methods.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.
Kenneth Graunke [Mon, 22 May 2017 02:13:21 +0000 (19:13 -0700)]
nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.

I'll want the variables in the next patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: Inline lower_clip_vs() into nir_lower_clip_vs().
Kenneth Graunke [Mon, 22 May 2017 02:29:48 +0000 (19:29 -0700)]
nir: Inline lower_clip_vs() into nir_lower_clip_vs().

It's now called exactly once, and there's not really any distinction.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: Use nir_shader_get_entrypoint in nir_lower_clip_vs().
Kenneth Graunke [Mon, 22 May 2017 02:26:03 +0000 (19:26 -0700)]
nir: Use nir_shader_get_entrypoint in nir_lower_clip_vs().

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: handle shared pointers in lowering indirect derefs.
Dave Airlie [Mon, 19 Nov 2018 03:54:33 +0000 (13:54 +1000)]
nir: handle shared pointers in lowering indirect derefs.

Check if the base ends up with no variable, and continue
if we see that case outside the loop.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: move getting deref from var after we check deref type.
Dave Airlie [Mon, 19 Nov 2018 03:51:48 +0000 (13:51 +1000)]
nir: move getting deref from var after we check deref type.

I posted a load of hacks before to do this, Jason suggested this,
just check the deref mode, not the variable mode and delay getting
the variable until we know the type.

Avoids crashes when derefing shared memory pointers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv/vtn: handle variable pointers without offset lowering
Dave Airlie [Wed, 4 Jul 2018 06:21:49 +0000 (16:21 +1000)]
spirv/vtn: handle variable pointers without offset lowering

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/fs,vec4: Fix a compiler warning
Jason Ekstrand [Fri, 16 Nov 2018 15:23:56 +0000 (09:23 -0600)]
intel/fs,vec4: Fix a compiler warning

../src/intel/compiler/brw_fs_nir.cpp:3534:46: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
       assert(nir_intrinsic_write_mask(instr) ==
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
              (1 << instr->num_components) - 1);
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This was caused by 6339aba775ecdc which added these completely valid
checks.  However clang likes to complain about signedness mismatches.
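
One common way to avoid this class of warning is to keep both sides of the comparison unsigned. The sketch below uses a hypothetical helper to illustrate the idea; the change actually made by this commit is not shown here and may differ.

```c
#include <stdbool.h>

/* Hypothetical check in the spirit of the assertion quoted above.
 * Using 1u keeps the shifted mask unsigned, so both operands of ==
 * have the same signedness. */
static inline bool
write_mask_is_full(unsigned write_mask, unsigned num_components)
{
   return write_mask == (1u << num_components) - 1;
}
```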

Fixes: 6339aba775ecdc "intel/compiler: Lower SSBO and shared..."
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
5 years agointel,nir: Move gl_LocalInvocationID lowering to nir_lower_system_values
Jason Ekstrand [Thu, 15 Nov 2018 16:25:46 +0000 (10:25 -0600)]
intel,nir: Move gl_LocalInvocationID lowering to nir_lower_system_values

It's not at all intel-specific; the formula is dictated by OpenGL and
Vulkan.  The only intel-specific thing is that we need the lowering.  As
a nice side-effect, the new version is variable-group-size ready.
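
A sketch of the formula in plain C, assuming the flat index is defined as index = z * size.x * size.y + y * size.x + x. The type and function names are hypothetical; the NIR pass emits the equivalent arithmetic as shader instructions.

```c
#include <stdint.h>

struct uvec3 { uint32_t x, y, z; };

/* Recover the 3D local invocation ID from the flat local invocation
 * index, given the workgroup size. All size components must be nonzero. */
static inline struct uvec3
local_invocation_id(uint32_t index, struct uvec3 size)
{
   struct uvec3 id;
   id.x = index % size.x;
   id.y = (index / size.x) % size.y;
   id.z = (index / (size.x * size.y)) % size.z;
   return id;
}
```

Because the size is an ordinary argument rather than a compile-time constant, the same arithmetic works when the workgroup size is only known at dispatch time.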

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
5 years agogbm: add missing comma between strings
Eric Engestrom [Sun, 18 Nov 2018 15:17:13 +0000 (15:17 +0000)]
gbm: add missing comma between strings
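
A sketch of the bug class, using hypothetical string lists rather than the actual gbm code. In C, adjacent string literals are concatenated, so a missing comma silently merges two array entries into one.

```c
/* Hypothetical lists, only to illustrate the bug class. */
static const char *const names_missing_comma[] = {
   "alpha"   /* no comma: merged with the next literal into "alphabeta" */
   "beta",
   "gamma",
};

static const char *const names_fixed[] = {
   "alpha",
   "beta",
   "gamma",
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))
```

The broken list compiles without error but has two entries instead of three.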

Fixes: d971a4230d54069c996bc "loader: Factor out the common driver
                              opening logic from each loader."
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradv: implement fast HTILE clears for depth or stencil only on GFX9
Samuel Pitoiset [Mon, 12 Nov 2018 16:57:12 +0000 (17:57 +0100)]
radv: implement fast HTILE clears for depth or stencil only on GFX9

This allows fast clears of the depth part (or the stencil part)
of a depth+stencil surface when HTILE is enabled. I didn't test
on GFX8, so it's disabled currently.

This gives a very nice boost, for example when clearing the depth
aspect of a 4096x4096 D32_SFLOAT_S8_UINT image (18x faster).

BEFORE: 235 us
AFTER: 13 us

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: rewrite the condition that checks allowed depth/stencil values
Samuel Pitoiset [Mon, 12 Nov 2018 16:57:11 +0000 (17:57 +0100)]
radv: rewrite the condition that checks allowed depth/stencil values

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: check allowed fast HTILE clears a bit earlier
Samuel Pitoiset [Mon, 12 Nov 2018 16:57:10 +0000 (17:57 +0100)]
radv: check allowed fast HTILE clears a bit earlier

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add radv_is_fast_clear_{depth,stencil}_allowed() helpers
Samuel Pitoiset [Mon, 12 Nov 2018 16:57:09 +0000 (17:57 +0100)]
radv: add radv_is_fast_clear_{depth,stencil}_allowed() helpers

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add radv_get_htile_fast_clear_value() helper
Samuel Pitoiset [Mon, 12 Nov 2018 16:57:08 +0000 (17:57 +0100)]
radv: add radv_get_htile_fast_clear_value() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: remove unnecessary goto in the fast clear paths
Samuel Pitoiset [Mon, 12 Nov 2018 16:57:07 +0000 (17:57 +0100)]
radv: remove unnecessary goto in the fast clear paths

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/winsys: remove the max IBs per submit limit for the sysmem path
Samuel Pitoiset [Thu, 15 Nov 2018 10:29:54 +0000 (11:29 +0100)]
radv/winsys: remove the max IBs per submit limit for the sysmem path

This path will be eventually improved later but as it's only
used on SI (or with RADV_DEBUG=noibs), I'm not sure if that
matters much.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/winsys: remove the max IBs per submit limit for the fallback path
Samuel Pitoiset [Thu, 15 Nov 2018 10:29:53 +0000 (11:29 +0100)]
radv/winsys: remove the max IBs per submit limit for the fallback path

The chained submission is the fastest path and it should now
be used more often than before. This removes some EOP events.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoetnaviv: use dummy RT buffer when rendering without color buffer
Lucas Stach [Wed, 14 Nov 2018 13:51:49 +0000 (14:51 +0100)]
etnaviv: use dummy RT buffer when rendering without color buffer

At least GC2000 seems to push some dirt from the PE color cache into
the last bound render target when drawing depth only. Newer cores
seem to behave properly and don't do this, but I have found no way
to fix it on GC2000. Flushes and stalls don't seem to make any
difference.

In order to stop the core from pushing the dirt into a precious real
render target, plug in dummy buffer when rendering without a color
buffer.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
5 years agovirgl: fix vtest regression since fencing changes.
Dave Airlie [Mon, 19 Nov 2018 05:47:40 +0000 (15:47 +1000)]
virgl: fix vtest regression since fencing changes.

The in_fence_fd needs to be initialised to -1.
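
A sketch of why -1 is the right initial value, assuming a reduced, hypothetical command buffer struct. Only the field name `in_fence_fd` comes from the commit.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced command buffer. */
struct cmd_buf_sketch {
   int in_fence_fd; /* -1 means "no fence attached" */
};

static inline void
cmd_buf_init(struct cmd_buf_sketch *cbuf)
{
   memset(cbuf, 0, sizeof(*cbuf));
   /* 0 is a valid file descriptor, so zero-filling is not enough. */
   cbuf->in_fence_fd = -1;
}

static inline bool
cmd_buf_has_in_fence(const struct cmd_buf_sketch *cbuf)
{
   return cbuf->in_fence_fd >= 0;
}
```

A zero-initialized struct would otherwise look as if it carried a fence on file descriptor 0.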

Fixes: d1a1c21e7 (virgl: native fence fd support)
Reviewed-by: Robert Foss <robert.foss@collabora.com>
5 years agoradv: always clear the FCE predicate after DCC/FMASK/CMASK decompressions
Samuel Pitoiset [Fri, 16 Nov 2018 12:40:10 +0000 (13:40 +0100)]
radv: always clear the FCE predicate after DCC/FMASK/CMASK decompressions

DCC and FMASK also imply a fast-clear eliminate, so it should be
safe to reset the predicate unconditionally. We still only skip
FMASK or CMASK decompressions for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: tidy up radv_set_dcc_need_cmask_elim_pred()
Samuel Pitoiset [Fri, 16 Nov 2018 12:40:09 +0000 (13:40 +0100)]
radv: tidy up radv_set_dcc_need_cmask_elim_pred()

This is just a small cleanup.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi: fix an out-of-bounds read reported by ASAN
Nicolai Hähnle [Fri, 16 Nov 2018 16:20:26 +0000 (17:20 +0100)]
radeonsi: fix an out-of-bounds read reported by ASAN

We read 4 values out of sample_locs_8x, so make sure the array is
big enough.
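
A rough sketch of the idea, not the radeonsi code: the table values,
LOCS_READ_COUNT and the helper names below are made up for
illustration. The table is padded so that a reader which always
fetches four entries stays in bounds, and a compile-time check guards
the size.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))
#define LOCS_READ_COUNT 4   /* the consumer always reads four dwords */

/* Hypothetical table: two meaningful entries, padded with zeros so that
 * reading LOCS_READ_COUNT entries never runs past the end. */
static const uint32_t locs_8x_sketch[] = {
   0x11111111u, 0x22222222u,
   0, 0,
};

/* Fail the build, rather than ASAN at run time, if the table shrinks. */
_Static_assert(ARRAY_LEN(locs_8x_sketch) >= LOCS_READ_COUNT,
               "table too small for the reader");

/* Stand-in for the consumer: touches exactly LOCS_READ_COUNT entries. */
static uint32_t
sum_locs_sketch(const uint32_t *locs)
{
   uint32_t sum = 0;
   for (size_t i = 0; i < LOCS_READ_COUNT; i++)
      sum += locs[i];
   return sum;
}
```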

Fixes: ac76aeef20 ("radeonsi: switch back to standard DX sample positions")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agor600: Only set context streamout strides info from the shader that has outputs
Gert Wollny [Mon, 19 Nov 2018 06:56:09 +0000 (07:56 +0100)]
r600: Only set context streamout strides info from the shader that has outputs

With 5d517a streamout info is only attached to the shader for which the
transform feedback is actually recorded, but the driver set the context info
with each state submitted, thereby always using the info data that was
attached to the vertex shader.

Pass the streamout stride info to the context only from the shader that
actually has outputs. (Thanks to Marek Olšák for pointing me in the right
direction)

Fixes regression with: dEQP-GLES31.functional.tessellation.invariance.*
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108734
Fixes: 5d517a599b1eabd1d5696bf31e26f16568d35770
  st/mesa: Don't record garbage streamout information in the non-SSO case.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoi965:use FRAMEBUFFER_UNSUPPORTED instead of FRAMEBUFFER_INCOMPLETE_DIMENSIONS
Gert Wollny [Mon, 19 Nov 2018 09:36:26 +0000 (10:36 +0100)]
i965:use FRAMEBUFFER_UNSUPPORTED instead of FRAMEBUFFER_INCOMPLETE_DIMENSIONS

FRAMEBUFFER_INCOMPLETE_DIMENSIONS is not supported for GLES 3.0 and later and
not defined for Desktop OpenGL. Instead use FRAMEBUFFER_UNSUPPORTED, as
was done before.

Thanks to Iago Toral and Andrey Simiklit for pointing out the problem and the
details.

Fixes: ebcde3454552adc6d3fea8af2207aafaba857796
   i965: be more specific about FBO completeness errors
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
5 years agovirgl: Use file descriptor instead of un-allocated object
Gert Wollny [Mon, 19 Nov 2018 09:56:23 +0000 (10:56 +0100)]
virgl: Use file descriptor instead of un-allocated object

The structure qdws is not allocated at this point, nor is the
file descriptor set to its member. Use the fd directly instead.

Fixes: d1a1c21e7621b5177febf191fcd3d3b8ef69dc96
    virgl: native fence fd support

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
5 years agoi965: Add support for and expose EXT_texture_sRGB_R8
Gert Wollny [Thu, 15 Nov 2018 18:01:24 +0000 (19:01 +0100)]
i965: Add support for and expose EXT_texture_sRGB_R8

Emulate MESA_FORMAT_R_SRGB8 by using L8_UNORM_SRGB. This is possible
because component swizzling is handled based on the mesa format and,
hence, an r001 swizzle can be used to correct the components.
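
To illustrate the swizzle argument (a sketch only; the enum and
function names are hypothetical, not i965 ones): a luminance fetch
returns (L, L, L, 1), and applying an r001 swizzle on top yields
(L, 0, 0, 1), which is what a single-channel red format is expected
to return.

```c
/* Hypothetical swizzle selectors: pick a source channel or a constant. */
enum swz_sketch {
   SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE
};

static void
apply_swizzle_sketch(const float in[4], const enum swz_sketch swz[4],
                     float out[4])
{
   for (int i = 0; i < 4; i++) {
      switch (swz[i]) {
      case SWZ_ZERO: out[i] = 0.0f; break;
      case SWZ_ONE:  out[i] = 1.0f; break;
      default:       out[i] = in[swz[i]]; break;   /* SWZ_X..SWZ_W */
      }
   }
}

/* The "r001" swizzle: keep red, force green/blue to 0 and alpha to 1. */
static const enum swz_sketch swz_r001_sketch[4] = {
   SWZ_X, SWZ_ZERO, SWZ_ZERO, SWZ_ONE
};
```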

Enables and makes pass (tested on Kabylake)

  dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoi965: Force zero swizzles for unused components in GL_RED and GL_RG
Gert Wollny [Thu, 15 Nov 2018 18:01:23 +0000 (19:01 +0100)]
i965: Force zero swizzles for unused components in GL_RED and GL_RG

This makes it possible to use a hardware luminance format as RED format.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoi965: be more specific about FBO completeness errors
Gert Wollny [Thu, 15 Nov 2018 18:01:22 +0000 (19:01 +0100)]
i965: be more specific about FBO completeness errors

The driver was returning GL_FRAMEBUFFER_UNSUPPORTED for all cases of an
incomplete fbo, be a bit more specific about this following the description
of glCheckFramebufferStatus.

This helps to keep dEQP happy when adding EXT_texture_sRGB_R8 support.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoi965: Correct L8_UNORM_SRGB table entry
Gert Wollny [Thu, 15 Nov 2018 18:01:21 +0000 (19:01 +0100)]
i965: Correct L8_UNORM_SRGB table entry

As the name says, the format is an sRGB format.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agovirgl: Clean up fences commit
Robert Foss [Fri, 16 Nov 2018 13:53:31 +0000 (14:53 +0100)]
virgl: Clean up fences commit

Remove a dead variable, an int->bool conversion and some
whitespace changes.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoi915: Delete swizzling detection logic.
Kenneth Graunke [Fri, 16 Nov 2018 15:40:55 +0000 (07:40 -0800)]
i915: Delete swizzling detection logic.

This is all leftover from the i965 split.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agonv50/ir/ra: enforce max register requirement, and change spill order
Ilia Mirkin [Sun, 11 Nov 2018 20:52:33 +0000 (15:52 -0500)]
nv50/ir/ra: enforce max register requirement, and change spill order

On nv50, certain operations must happen on regs below 64, due to
encoding requirements. First of all, we add infrastructure to enforce
this. Secondly we change the spill order to first spill RIG nodes that
are unconstrained, followed by ones that are.
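
The ordering can be sketched as a sort, assuming a simplified node
type (rig_node_sketch, max_reg and the comparator are illustrative
only; the real allocator also weighs other spill costs): nodes that
may live in any register come first, nodes restricted to regs below
64 come last.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical interference-graph node; not the real nv50_ir type. */
struct rig_node_sketch {
   int id;
   int max_reg;   /* highest register index the value may be assigned */
};

#define MAX_REG_CONSTRAINED 63   /* "regs below 64" */

static bool
node_is_constrained(const struct rig_node_sketch *n)
{
   return n->max_reg <= MAX_REG_CONSTRAINED;
}

/* qsort comparator: unconstrained nodes first, then constrained ones.
 * Ties are broken by id so the resulting order is deterministic. */
static int
spill_order_cmp(const void *pa, const void *pb)
{
   const struct rig_node_sketch *a = pa, *b = pb;
   int ca = node_is_constrained(a), cb = node_is_constrained(b);

   if (ca != cb)
      return ca - cb;
   return (a->id > b->id) - (a->id < b->id);
}

static void
sort_spill_candidates(struct rig_node_sketch *nodes, size_t count)
{
   qsort(nodes, count, sizeof(nodes[0]), spill_order_cmp);
}
```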

This makes the gamecube logo shadertoy compile properly. Curiously, if
we adjust the spill order so that we first spill the constrained RIG
nodes instead, the RA also succeeds. However it seems more logical to
first spill the unconstrained ones.

While we're at it, drop the nv50 max register to reserve r127 as the
zero register of last resort (r63 is preferred).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Karol Herbst <kherbst@redhat.com>
5 years agonv50/ir/ra: improve condition for short regs, unify with cond for 16-bit
Ilia Mirkin [Sun, 11 Nov 2018 15:55:55 +0000 (10:55 -0500)]
nv50/ir/ra: improve condition for short regs, unify with cond for 16-bit

Instead of the size restriction existing in two places, and potentially
being applied twice, we move this together. Ops with 16-bit register
addresses can only take a short reg, and ops with immediates can only
take a short reg.

Of course we leave the immediate 0 in place since we know that it will
be replaced by r63/r127 down the line, so don't treat zeroes as an
immediate.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agonv50/ir: delete MINMAX instruction that is no longer in the BB
Ilia Mirkin [Sun, 11 Nov 2018 07:19:36 +0000 (02:19 -0500)]
nv50/ir: delete MINMAX instruction that is no longer in the BB

We removed the op from the BB, but it was still listed in its sources'
uses. This could trip up some logic down the line which analyzes all the
uses of an l-value, e.g. spilling.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agoegl: Print the actual message to the console from _eglError().
Eric Anholt [Fri, 16 Nov 2018 01:31:28 +0000 (17:31 -0800)]
egl: Print the actual message to the console from _eglError().

Previously we would print errors on the console like:

   libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize

When we had everything we needed for:

   libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize: DRI2: failed to find EGLDevice

(for a gbm error in my case)
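
A small sketch of the formatting change, with a hypothetical helper
(format_egl_error_sketch is not the libEGL function): the optional
detail message is appended after the entry point name when one is
available.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical formatter: append ": <msg>" only when a detail message
 * was supplied. Returns what snprintf() returns. */
static int
format_egl_error_sketch(char *buf, size_t len, unsigned code,
                        const char *name, const char *func,
                        const char *msg)
{
   if (msg != NULL && msg[0] != '\0')
      return snprintf(buf, len, "EGL user error 0x%x (%s) in %s: %s",
                      code, name, func, msg);

   return snprintf(buf, len, "EGL user error 0x%x (%s) in %s",
                   code, name, func);
}
```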

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoloader: Factor out the common driver opening logic from each loader.
Eric Anholt [Thu, 15 Nov 2018 21:54:49 +0000 (13:54 -0800)]
loader: Factor out the common driver opening logic from each loader.

I copied the code from egl_dri2.c, but the functionality was equivalent
between all the loaders other than their particular environment variables.

v2: Drop the logging function equivalent to loader_default_logger()
    (requested by Eric, Emil).  Move the SCons workaround across.  Drop
    the now-unused driGetDriverExtensions() declaration that was lost in a
    rebase.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
5 years agoloader: Stop using a local definition for an in-tree header
Eric Anholt [Thu, 15 Nov 2018 21:50:48 +0000 (13:50 -0800)]
loader: Stop using a local definition for an in-tree header

I need other types from the header now, and "gl.h is big" is not a good
reason to duplicate definitions.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoegl: Move loader_set_logger() up to egl_dri2.c.
Eric Anholt [Thu, 15 Nov 2018 22:47:30 +0000 (14:47 -0800)]
egl: Move loader_set_logger() up to egl_dri2.c.

Everyone needs to call it, and platform_x11 forgot to.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoglx: Move DRI extensions pointer loading to driOpenDriver().
Eric Anholt [Thu, 15 Nov 2018 22:22:16 +0000 (14:22 -0800)]
glx: Move DRI extensions pointer loading to driOpenDriver().

The only thing you do with a dri driver handle is get the extensions
pointer, so just fold it in to simplify the callers.

v2: Add the declaration of driGetDriverExtensions() that got lost in a
    rebase.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
5 years agoglx: Remove an old DEFAULT_DRIVER_DIR default.
Eric Anholt [Thu, 15 Nov 2018 21:41:34 +0000 (13:41 -0800)]
glx: Remove an old DEFAULT_DRIVER_DIR default.

You can tell by "Mesa/configs/default" how old this is.  Your build system
really has to provide the DEFAULT_DRIVER_DIR, or other loaders will break.

v2: Move the bad (non-prefix-dependent) define to the SConscript to avoid
    breaking it.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
5 years agoradv: enable primitive binning by default
Samuel Pitoiset [Thu, 15 Nov 2018 08:58:52 +0000 (09:58 +0100)]
radv: enable primitive binning by default

After doing a bunch of benchmarks, primitive binning helps
some games like The Talos Principle (+5%) or Serious Sam 2017
(+3%). For other titles, either it doesn't change anything or
it hurts very little (less than 1%).

This only affects GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add a debug option for disabling primitive binning
Samuel Pitoiset [Wed, 14 Nov 2018 16:23:12 +0000 (17:23 +0100)]
radv: add a debug option for disabling primitive binning

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agovirgl: native fence fd support
Robert Foss [Mon, 29 Aug 2016 23:13:45 +0000 (23:13 +0000)]
virgl: native fence fd support

Following the support for fences on the virtio driver add support
for native fence on virgl. This was somewhat based on the freedreno one.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agointel/aub_viewer: Print blend states properly
Lionel Landwerlin [Fri, 9 Nov 2018 16:49:13 +0000 (16:49 +0000)]
intel/aub_viewer: Print blend states properly

Identical fix to:

commit 70de31d0c106f58d6b7e6d5b79b8d90c1c112a3b
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Fri Aug 24 16:05:08 2018 -0500

    intel/batch_decoder: Print blend states properly

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
5 years agointel/aub_viewer: fix dynamic state printing
Lionel Landwerlin [Fri, 9 Nov 2018 16:49:12 +0000 (16:49 +0000)]
intel/aub_viewer: fix dynamic state printing

Identical fix to:

commit cbd4bc1346f7397242e157bb66099b950a8c5643
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Fri Aug 24 16:04:03 2018 -0500

    intel/batch_decoder: Fix dynamic state printing

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
5 years agointel/aubinator: fix ring buffer pointer
Lionel Landwerlin [Fri, 9 Nov 2018 16:49:11 +0000 (16:49 +0000)]
intel/aubinator: fix ring buffer pointer

We can only start parsing commands from the head pointer. This was
working fine up to now because we only dealt with a "made up" ring
buffer (generated by aub_write) which always had its head at 0.
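
As a sketch of the head/tail handling (assumptions: head and tail are
given as dword indices rather than the byte offsets a real ring
register would hold, and ring_copy_pending_sketch is a made-up name):
parsing starts at head, wraps at the ring length and stops at tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the dwords between head and tail into out[], wrapping at the end
 * of the ring. Returns the number of dwords copied (at most out_cap). */
static size_t
ring_copy_pending_sketch(const uint32_t *ring, uint32_t ring_len_dw,
                         uint32_t head, uint32_t tail,
                         uint32_t *out, size_t out_cap)
{
   size_t n = 0;

   assert(ring_len_dw > 0 && head < ring_len_dw && tail < ring_len_dw);

   /* Starting at 0 instead of head would only be right for a ring whose
    * head happens to be 0. */
   for (uint32_t i = head; i != tail && n < out_cap;
        i = (i + 1) % ring_len_dw)
      out[n++] = ring[i];

   return n;
}
```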

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
5 years agointel/decoders: read ring buffer length
Lionel Landwerlin [Fri, 9 Nov 2018 16:49:10 +0000 (16:49 +0000)]
intel/decoders: read ring buffer length

Use this value to limit reading the ring buffer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
5 years agoegl/dri: fix error value with unknown drm format
Lionel Landwerlin [Tue, 13 Nov 2018 14:10:45 +0000 (14:10 +0000)]
egl/dri: fix error value with unknown drm format

According to the EGL_EXT_image_dma_buf_import spec, creating an EGL
image with a DRM format not supported should yield the BAD_MATCH
error:

"
       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
         attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
         is generated.
"

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 20de7f9f226401 ("egl/dri2: support for creating images out of dma buffers")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
5 years agogbm: Clarify acceptable formats for gbm_bo
Daniel Stone [Thu, 1 Nov 2018 11:30:36 +0000 (11:30 +0000)]
gbm: Clarify acceptable formats for gbm_bo

gbm_bo_create() was presumably meant to originally accept gbm_bo_format
enums, but it's accepted GBM_FORMAT_* tokens since the dawn of time.
This is good, since gbm_bo_format is rarely used and covers a lot less
ground than GBM_FORMAT_*.

Change the documentation to refer to both; this involves removing a 'see
also' for gbm_bo_format, since we can't also use \sa to refer to a
family of anonymous #defines.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoRevert "radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT"
Connor Abbott [Wed, 17 Oct 2018 14:57:01 +0000 (16:57 +0200)]
Revert "radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT"

This reverts commit 647c2b90e96a9ab8571baf958a7c67c1e816911a. There was
one recently-introduced bug in ac for dvec3 loads, but the other test
failures were actually bugs in the tests. See
https://github.com/KhronosGroup/VK-GL-CTS/commit/9429e621c48848d224e35f30a1ae45a4a079922c

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agovc4: Don't return a vc4 BO handle on a renderonly screen.
Eric Anholt [Thu, 25 Oct 2018 16:17:17 +0000 (09:17 -0700)]
vc4: Don't return a vc4 BO handle on a renderonly screen.

The handles exported need to be on the KMS device's fd, anything else is
failure.  Also, this code is assuming that the scanout resource has been
created already, so assert it.

5 years agovc4: Make sure we make ro scanout resources for create_with_modifiers.
Eric Anholt [Thu, 25 Oct 2018 16:12:50 +0000 (09:12 -0700)]
vc4: Make sure we make ro scanout resources for create_with_modifiers.

The DRI3 create_with_modifiers paths don't set tmpl.bind to SCANOUT or
SHARED, with the theory that given that you've got modifiers, that's all
you need.  However, we were looking at the tmpl.bind for setting up the
KMS handle in the renderonly case, so we'd end up trying to use vc4's
handle on the hx8357d fd.

Fixes: 84ed8b67c56b ("vc4: Set shareable BOs as T tiled if possible")
5 years agoi965: Fix calculation of layers array length for isl_view
Danylo Piliaiev [Thu, 15 Nov 2018 10:03:31 +0000 (12:03 +0200)]
i965: Fix calculation of layers array length for isl_view

Handle all cases in calculation of layers count for isl_view
taking into account texture view and image unit.
st_convert_image was taken as a reference.

When u->Layered is true the whole level is taken with respect to
the image view. Otherwise only one layer is taken.
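
The rule can be sketched roughly as follows. This is a simplification
with hypothetical names and parameters; it ignores 3D textures and
non-immutable textures, which the real code has to handle.

```c
#include <assert.h>
#include <stdbool.h>

struct layer_range_sketch {
   unsigned base_layer;
   unsigned num_layers;
};

/* layered:         the image unit binds the whole level
 * unit_layer:      layer selected by the image unit when not layered
 * view_min_layer:  first layer exposed by the texture view
 * view_num_layers: number of layers exposed by the texture view */
static struct layer_range_sketch
image_layer_range_sketch(bool layered, unsigned unit_layer,
                         unsigned view_min_layer,
                         unsigned view_num_layers)
{
   struct layer_range_sketch r;

   assert(view_num_layers > 0);

   if (layered) {
      /* Whole level, as seen through the view. */
      r.base_layer = view_min_layer;
      r.num_layers = view_num_layers;
   } else {
      /* Exactly one layer, offset into the view. */
      assert(unit_layer < view_num_layers);
      r.base_layer = view_min_layer + unit_layer;
      r.num_layers = 1;
   }
   return r;
}
```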

v3: (Józef Kucia and Ilia Mirkin)
    - Rewrote patch by taking st_convert_image as a reference
    - Removed now unused get_image_num_layers function
    - Changed commit message

v4: (Jason Ekstrand)
    - Added assert

Fixes: 5a8c8903
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107856

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/compiler: Lower SSBO and shared loads/stores in NIR
Jason Ekstrand [Tue, 13 Nov 2018 00:48:10 +0000 (18:48 -0600)]
intel/compiler: Lower SSBO and shared loads/stores in NIR

We have a bunch of code to do this in the back-end compiler but it's
fairly specific to typed surface messages and the way we emit them.
This breaks it out into NIR where it's easier to do things a bit more
generally.  It also means we can easily share the code between the vec4
and FS back-ends if we wish.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agonir: Add alignment parameters to SSBO, UBO, and shared access
Jason Ekstrand [Tue, 13 Nov 2018 15:45:03 +0000 (09:45 -0600)]
nir: Add alignment parameters to SSBO, UBO, and shared access

This also changes spirv_to_nir and glsl_to_nir to set them.  The one
place that doesn't set them is shared memory access lowering in
nir_lower_io.  That will have to be updated before any consumers of it
can effectively use these new alignments.
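
One way a consumer might use such parameters, sketched under the
assumption that the alignment is expressed as a pair (align_mul,
align_offset) meaning "address % align_mul == align_offset" with
align_mul a power of two. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power-of-two alignment guaranteed for an address that
 * satisfies address % align_mul == align_offset. */
static uint32_t
guaranteed_align_sketch(uint32_t align_mul, uint32_t align_offset)
{
   assert(align_mul != 0 && (align_mul & (align_mul - 1)) == 0);
   assert(align_offset < align_mul);

   if (align_offset == 0)
      return align_mul;

   /* Lowest set bit of the offset bounds the provable alignment. */
   return align_offset & (~align_offset + 1);
}
```

For example, an access known to sit 4 bytes past a 16-byte boundary
can only be assumed to be 4-byte aligned.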

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
5 years agonir/lower_io: Add shared to get_io_offset_src
Jason Ekstrand [Wed, 14 Nov 2018 21:36:38 +0000 (15:36 -0600)]
nir/lower_io: Add shared to get_io_offset_src

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>