mesa.git
Gert Wollny [Tue, 4 Sep 2018 07:36:53 +0000 (09:36 +0200)]
winsys/virgl/vtest: Correct off-by-one error in resource allocation

The resource bo array must already be extended when the target index is
equal to the current size of the array.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Gert Wollny [Mon, 3 Sep 2018 08:14:07 +0000 (10:14 +0200)]
winsys/virgl: Initialize value to silence valgrind

Silences:

  Conditional jump or move depends on uninitialised value(s)
  at 0xB72F2C0: virgl_drm_winsys_create (virgl_drm_winsys.c:854)
  by 0xB72F2C0: virgl_drm_screen_create (virgl_drm_winsys.c:926)
  by 0xB21C885: pipe_virgl_create_screen (drm_helper.h:275)
  by 0xB7201F0: pipe_loader_create_screen (pipe_loader.c:137)
  by 0xB639C91: dri2_init_screen (dri2.c:2112)
  by 0xB634F68: driCreateNewScreen2 (dri_util.c:153)
  by 0x63023E6: dri3_create_screen (dri3_glx.c:893)
  by 0x62D35BD: AllocAndFetchScreenConfigs (glxext.c:820)
  by 0x62D35BD: __glXInitialize (glxext.c:946)
  by 0x62CECB3: GetGLXPrivScreenConfig (glxcmds.c:174)
  by 0x62CF69C: glXQueryExtensionsString (glxcmds.c:1304)
  by 0x60AA7D9: ??? (in /usr/lib/x86_64-linux-gnu/libwaffle-1.so.0.5.2)
  by 0x4F81450: wfl_checked_display_connect (piglit-util-waffle.h:74)
  by 0x4F829E0: piglit_wfl_framework_init (piglit_wfl_framework.c:627)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Gert Wollny [Mon, 3 Sep 2018 08:05:44 +0000 (10:05 +0200)]
winsys/virgl: correct resource and handle allocation (v2)

Fixes crash with
  piglit/bin/map_buffer_range-invalidate CopyBufferSubData \
                               increment-offset -auto -fbo

* Resize the resource storage already when the count is equal to the
  allocated size, fixes:

  Invalid write of size 8
  at 0xB72E4CF: virgl_drm_add_res (virgl_drm_winsys.c:629)
  by 0xB72E4CF: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe07d30 is 0 bytes after a block of size 4,096 alloc'd
  at 0x4C31B25: calloc (in
       /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0xB72DAAF: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:567)

* Also resize the space allocated for the handles, fixes:

  Invalid write of size 4
  at 0xB72E4F0: virgl_drm_add_res (virgl_drm_winsys.c:631)
  by 0xB72E4F0: virgl_drm_emit_res (virgl_drm_winsys.c:663)
  by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776)
  by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585)
  by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940)
  by 0x109A1E: upload (invalidate.c:169)
  by 0x109C2F: piglit_display (invalidate.c:215)
  by 0x4F80FBE: run_test (piglit_fbo_framework.c:52)
  by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229)
  by 0x10949D: main (invalidate.c:47)
  Address 0xbe08570 is 0 bytes after a block of size 2,048 alloc'd
  at 0x4C2FB0F: malloc (
    in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0xB72DAC8: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:572)

Fixes: 4b15b5e803e ("virgl: resize resource bo allocation if we need to.")
v2: - Use REALLOC macro and avoid memory leak when re-allocation fails
    - add Fixes tag (both Emil Velikov)
    - reorder commit message

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
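
The two fixes above come down to one growth rule: extend both arrays as soon as the next index equals the allocated size, and keep the old pointers if reallocation fails. A minimal sketch in C, assuming hypothetical names (`struct cmd_buf`, `cmd_buf_add_res`) and plain `realloc` in place of Mesa's REALLOC macro; it is not the actual virgl winsys code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a command buffer's resource tracking. */
struct cmd_buf {
   void **res_bo;        /* resource pointers */
   uint32_t *res_hlist;  /* one handle per resource */
   unsigned cres;        /* number of entries in use */
   unsigned nres;        /* number of entries allocated */
};

/* Append a resource and its handle. Returns 0 on success, -1 on failure.
 * On failure the existing arrays stay valid and owned by cbuf (no leak). */
static int cmd_buf_add_res(struct cmd_buf *cbuf, void *res, uint32_t handle)
{
   /* Grow when the next index equals the allocated size, not one later. */
   if (cbuf->cres >= cbuf->nres) {
      unsigned new_nres = cbuf->nres ? cbuf->nres * 2 : 4;

      void **new_bo = realloc(cbuf->res_bo, new_nres * sizeof(*new_bo));
      if (!new_bo)
         return -1;
      cbuf->res_bo = new_bo;

      /* The handle list must grow together with the resource list. */
      uint32_t *new_hlist = realloc(cbuf->res_hlist,
                                    new_nres * sizeof(*new_hlist));
      if (!new_hlist)
         return -1;
      cbuf->res_hlist = new_hlist;

      cbuf->nres = new_nres;
   }

   assert(cbuf->cres < cbuf->nres);
   cbuf->res_bo[cbuf->cres] = res;
   cbuf->res_hlist[cbuf->cres] = handle;
   cbuf->cres++;
   return 0;
}
```

Assigning the result of `realloc` to a temporary first is what avoids the leak: the original block is still reachable when the call returns NULL.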
Tomeu Vizoso [Tue, 17 Jul 2018 11:13:21 +0000 (13:13 +0200)]
virgl: use hw-atomics instead of in-ssbo ones

Emulating atomics on top of ssbos can lead to too small max SSBO count,
so let's use the hw-atomics mechanism to expose atomic buffers instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Wed, 29 Aug 2018 07:07:26 +0000 (09:07 +0200)]
virgl: update minor differences to upstream header

virgl_protocol.h is considered to have its upstream in the
virglrenderer repository, and somehow these minor differences have
crept in.

Let's sync with the upstream to avoid this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Wed, 29 Aug 2018 14:11:14 +0000 (16:11 +0200)]
gallium: add PIPE_CAP_MAX_COMBINED_HW_ATOMIC_COUNTER{S,_BUFFERS}

This moves the evergreen-specific max-sizes out as a driver-cap, so
other drivers with less strict requirements also can use hw-atomics.

Remove ssbo_atomic as it's no longer needed.

We should now be able to use hw-atomics for some stages and not for
others, if needed.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Thu, 30 Aug 2018 09:04:17 +0000 (11:04 +0200)]
gallium: add PIPE_CAP_MAX_COMBINED_SHADER_BUFFERS

This gets rid of a r600 specific hack in the state-tracker, and prepares
for other drivers to be able to use hw-atomics.

While we're at it, clean up some indentation in the various drivers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Wed, 29 Aug 2018 13:48:29 +0000 (15:48 +0200)]
st/mesa: simplify MaxAtomicBufferSize-logic

MaxAtomicCounters has already been assigned in the loop above in the
ssbo_atomic = true case, so this will calculate the same value as the
default.

While we're at it, fixup indentation on the MaxAtomicBufferBindings
assign.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Wed, 29 Aug 2018 13:35:11 +0000 (15:35 +0200)]
st/mesa: clean up atomic vs ssbo code

This makes the code a bit easier to follow; we first set up
MaxShaderStorageBlocks, then we either set up a dedicated
MaxAtomicBuffers, or we split MaxShaderStorageBlocks in two.

While we're at it, also make the SSBO-splitting code tolerate the
hypothetical case of having an odd number of SSBOs without incorrectly
dropping the last SSBO.

This has the nice result that the SSBOs and atomic buffers are dealt
with almost completely orthogonally, easing some upcoming patches.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
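
The odd-count case described above can be illustrated with a small sketch. The helper name `split_ssbos` is hypothetical and the state tracker's actual field handling is not reproduced here; the point is only that the second half is computed by subtraction rather than by halving again:

```c
#include <assert.h>

/* Hypothetical helper: split a total SSBO budget between atomic buffers
 * and regular shader storage blocks. */
static void split_ssbos(unsigned total, unsigned *atomic_buffers,
                        unsigned *storage_blocks)
{
   assert(atomic_buffers && storage_blocks);
   *atomic_buffers = total / 2;
   /* Subtract instead of halving again, so an odd total keeps its
    * last SSBO rather than dropping it. */
   *storage_blocks = total - *atomic_buffers;
}
```

With a total of 17 this yields 8 atomic buffers and 9 storage blocks, whereas halving both sides would account for only 16.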
Erik Faye-Lund [Wed, 29 Aug 2018 13:24:26 +0000 (15:24 +0200)]
st/mesa: use real bool for can_ubo

We're doing full c99 now, so there's no point in using the old boolean
type.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Marek Olšák [Sat, 1 Sep 2018 07:10:27 +0000 (03:10 -0400)]
gallium/u_threaded: increase batch size to increase performance

This reduces mutex overhead.

radeonsi: +4.4% performance with piglit/drawoverhead, DrawElements, Ryzen X1700
iris_dri.so: +14% with piglit/drawoverhead, DrawArrays, i7 7700HQ.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Mon, 20 Aug 2018 18:52:52 +0000 (14:52 -0400)]
st/vdpau: silence an uninitialized-variable warning

Marek Olšák [Tue, 21 Aug 2018 01:33:24 +0000 (21:33 -0400)]
st/mesa: help fix stencil border color for GL_DEPTH_STENCIL textures

GL_STENCIL_INDEX uses GL_INTENSITY for the border color, which is nicer
to hardware that doesn't read the stencil border value from the X channel.

This fixes a bunch of dEQP tests on Vega & Raven.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Ernestas Kulik [Thu, 30 Aug 2018 16:02:44 +0000 (19:02 +0300)]
glsl_to_tgsi: Fix potential leak

Reported by Coverity: arr_live_ranges is freed in a different branch
than the one in which it was allocated.

Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Ernestas Kulik [Thu, 30 Aug 2018 16:02:45 +0000 (19:02 +0300)]
u_vbuf: Fix leak

Reported by Coverity: data is heap-allocated, but only freed in the
info->index_size != 0 branch.

Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
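
The leak pattern Coverity flagged here (a heap buffer released in only one of two branches) is usually avoided by freeing once, after the branches rejoin. A minimal sketch with a hypothetical function `process`; it does not reproduce the u_vbuf code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function: works on a temporary heap copy of `src`.
 * Returns the byte sum when `indexed` is set, otherwise the count,
 * or -1 on allocation failure. */
static int process(const unsigned char *src, size_t count, int indexed)
{
   assert(src);
   unsigned char *data = malloc(count ? count : 1);
   if (!data)
      return -1;
   memcpy(data, src, count);

   int result = 0;
   if (indexed) {
      for (size_t i = 0; i < count; i++)
         result += data[i];
   } else {
      result = (int)count;
   }

   /* Freed once, after both branches, so neither path can leak. */
   free(data);
   return result;
}
```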
Eric Anholt [Sat, 11 Aug 2018 01:15:45 +0000 (18:15 -0700)]
freedreno: Drop a bunch of duplicated gallium PIPE_CAP default code.

Now that we have the util function for the default values, we can get rid
of the boilerplate.

v2: Rebase on new gallium caps

Reviewed-by: Rob Clark <robdclark@gmail.com> (v1)
Eric Anholt [Sat, 11 Aug 2018 01:04:40 +0000 (18:04 -0700)]
v3d: Drop a bunch of duplicated gallium PIPE_CAP default code.

Now that we have the util function for the default values, we can get rid
of the boilerplate.

v2: Rebase on new gallium caps

Eric Anholt [Sat, 11 Aug 2018 01:02:02 +0000 (18:02 -0700)]
vc4: Drop a bunch of duplicated gallium PIPE_CAP default code.

Now that we have the util function for the default values, we can get rid
of the boilerplate.

v2: drop GLSL level in favor of defaults.
v3: Rebase on new gallium caps

Eric Anholt [Fri, 10 Aug 2018 23:57:31 +0000 (16:57 -0700)]
gallium: Add a helper for implementing PIPE_CAP_* default values.

One of the pains of implementing a gallium driver is filling in a million
pipe caps you don't know about yet when you're just starting out.  One of
the pains of working on gallium is copy-and-pasting your new PIPE_CAP into
each driver.  We can fix both of these by having each driver call into the
default helper from their default case, so that both sides can ignore each
other until they need to.

v2: fix i915g build, revert swr change to avoid breaking scons build
    (https://travis-ci.org/anholt/mesa/jobs/419739857)
v3: Rebase on 3 new gallium caps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
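
The delegation pattern described above can be sketched as follows. The enum and both function names are hypothetical placeholders, not the real PIPE_CAP list or the real helper; only the shape of the switch (driver-specific cases, then a default case that calls the shared helper) is taken from the message:

```c
#include <assert.h>

/* Hypothetical cap enum standing in for the real PIPE_CAP list. */
enum cap {
   CAP_MAX_TEXTURE_SIZE,
   CAP_HAS_FEATURE_A,
   CAP_HAS_FEATURE_B,
};

/* Shared helper: a conservative default for every known cap. */
static int default_get_param(enum cap param)
{
   switch (param) {
   case CAP_MAX_TEXTURE_SIZE:
      return 2048;
   case CAP_HAS_FEATURE_A:
   case CAP_HAS_FEATURE_B:
      return 0;
   }
   assert(!"unknown cap");
   return 0;
}

/* A driver lists only the caps it cares about ... */
static int my_driver_get_param(enum cap param)
{
   switch (param) {
   case CAP_HAS_FEATURE_A:
      return 1;
   default:
      /* ... and defers everything else to the shared helper. */
      return default_get_param(param);
   }
}
```

Adding a new cap then means touching the enum and the shared helper only; drivers pick it up through their default case until they need a different value.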
Jason Ekstrand [Mon, 3 Sep 2018 18:20:54 +0000 (13:20 -0500)]
intel/compiler: Remove redundant nir_remove_dead_variables call

As of 07a2098a708a2, brw_nir_optimize calls nir_remove_dead_variables as
the last optimization.  Doing it again is just pointless.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Thu, 23 Aug 2018 13:34:19 +0000 (14:34 +0100)]
intel: compiler: remove dead local variables at optimization pass

We're hitting an assert in gfxbench because one of the local variables
is a sampler (according to Jason this isn't valid):

testfw_app: ../src/compiler/nir_types.cpp:551: void glsl_get_natural_size_align_bytes(const glsl_type*, unsigned int*, unsigned int*): Assertion `!"type does not have a natural size"' failed.

Since this particular variable isn't used, it can be eliminated by
removing unused local variables at the end of the optimization loop.
This makes sense also for valid local variables.

v2: Move additional local variable removal out of optimization loop,
    but before large constant removal (Jason/Lionel)

v3: Move the removal at the end of brw_nir_optimize()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107806
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Andrii Simiklit [Mon, 20 Aug 2018 16:20:59 +0000 (19:20 +0300)]
intel/decoder: fix the possible out of bounds group_iter

The "gen_group_get_length" function can return a negative value,
which can lead to an out-of-bounds group_iter.

v2: printing of "unknown command type" was added
v3: just the asserts are added

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
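
To illustrate why a negative length is dangerous for an iterator, here is a minimal sketch. The packet layout and the names `packet_length` and `count_packets` are invented for the example and are unrelated to the real decoder; the commit itself settled on asserts (v3), whereas this sketch stops iterating instead:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical length decoder: returns the packet length in dwords,
 * or -1 when the header type is not recognized. */
static int packet_length(uint32_t header)
{
   uint32_t type = header >> 29;
   if (type != 3)
      return -1;
   return (int)(header & 0xff) + 2;
}

/* Count well-formed packets, stopping at the first invalid length. */
static unsigned count_packets(const uint32_t *buf, unsigned ndwords)
{
   unsigned pos = 0, n = 0;
   assert(buf);
   while (pos < ndwords) {
      int len = packet_length(buf[pos]);
      /* A negative or oversized length would move the cursor
       * outside the buffer, so bail out before advancing. */
      if (len <= 0 || (unsigned)len > ndwords - pos)
         break;
      pos += (unsigned)len;
      n++;
   }
   return n;
}
```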
Bas Nieuwenhuizen [Mon, 3 Sep 2018 00:34:04 +0000 (02:34 +0200)]
radv: Fix CMASK dimensions.

Mirrors

1e40f694831 "ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI"

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 3 Sep 2018 00:30:48 +0000 (02:30 +0200)]
radv: Use a lower max offchip buffer count.

No clue what gets fixed by this but both radeonsi and amdvlk do it.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 3 Sep 2018 00:19:25 +0000 (02:19 +0200)]
radv: Add VEGA20 support.

Just mirror the radeonsi bits. Since this is just adding the extra
switch entries for new HW I think this should be fine for stable.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Dave Airlie [Fri, 31 Aug 2018 05:55:15 +0000 (15:55 +1000)]
radv: don't expose linear depth surfaces on SI/CIK/VI either.

ac_surface.c: gfx6_compute_surface says
/* DB doesn't support linear layouts. */

Now if we expose linear depth and create a linear depth image
and use CmdCopyImage to copy into it, we can't map the underlying
memory and read it linearly which I think should work.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Mauro Rossi [Wed, 15 Aug 2018 12:46:25 +0000 (14:46 +0200)]
egl/android: do not indent HAVE_DRM_GRALLOC preprocessor directive

Fixes: 3f7bca44d9 ("egl/android: #ifdef out flink name support")
Fixes: c7bb82136b ("egl/android: Add DRM node probing and filtering")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Jason Ekstrand [Sat, 1 Sep 2018 14:11:17 +0000 (09:11 -0500)]
anv/blorp: Fix a comment as per Nanley's review feedback

This accidentally didn't make it into 62378c5e9e5

Jason Ekstrand [Thu, 30 Aug 2018 17:05:06 +0000 (12:05 -0500)]
anv/blorp: Do more flushing around HiZ clears

We make the flush after a HiZ clear unconditional and add a flush/stall
before the clear as well.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Ian Romanick [Tue, 19 Jun 2018 00:02:58 +0000 (17:02 -0700)]
i965/vec4: Clamp indirect tes input array reads with 0x0fffffff

Page 190 of "Volume 7: 3D Media GPGPU Engine (Haswell)" says the valid
range of the offset is [0, 0FFFFFFFh].

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
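
The clamp itself is simple arithmetic. The sketch below shows it on the CPU with a hypothetical helper name; the driver emits the equivalent GPU instruction instead, which is not shown:

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound of the valid offset range quoted above. */
#define MAX_INDIRECT_OFFSET 0x0fffffffu

/* Clamp an unsigned offset into [0, 0x0fffffff]. */
static uint32_t clamp_indirect_offset(uint32_t offset)
{
   uint32_t clamped =
      offset > MAX_INDIRECT_OFFSET ? MAX_INDIRECT_OFFSET : offset;
   assert(clamped <= MAX_INDIRECT_OFFSET);
   return clamped;
}
```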
Ian Romanick [Sat, 16 Jun 2018 02:39:56 +0000 (19:39 -0700)]
i965/vec4: Correctly handle uniform sources in generate_tes_add_indirect_urb_offset

Fixes failure in the new piglit test
tes-patch-input-array-vec2-index-invalid-rd.shader_test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Andres Gomez [Thu, 30 Aug 2018 15:03:04 +0000 (18:03 +0300)]
docs: update calendar to extend the 18.1 cycle by one more release

Due to having 2 additional RCs for 18.2.

Cc: Dylan Baker <dylan.c.baker@intel.com>
Cc: Juan A. Suarez <jasuarez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Rodrigo Vivi [Thu, 30 Aug 2018 21:39:27 +0000 (14:39 -0700)]
intel: Introducing Amber Lake platform

Amber Lake uses the same gen graphics as Kaby Lake, including an ID
that was previously marked as reserved on Kaby Lake, but that
has now moved to the AML page.

This follows the ids and approach used on kernel's commit
e364672477a1 ("drm/i915/aml: Introducing Amber Lake platform")

Reported-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Rodrigo Vivi [Thu, 30 Aug 2018 21:32:57 +0000 (14:32 -0700)]
intel: aubinator: Adding missed platforms to the error message.

Many new platforms got added to gen_device_name_to_pci_device_id()
but the error message inside aubinator didn't reflect those
changes. Sync it to the same order to be sure that we are not
missing any now.

Cc: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Nanley Chery [Wed, 22 Aug 2018 17:43:32 +0000 (10:43 -0700)]
i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9

According to internal docs, some gen9 platforms have a pixel shader push
constant synchronization issue. Although not listed among said
platforms, this issue seems to be present on the GeminiLake 2x6's we've
tested.

We consider the available workarounds to be too detrimental on
performance. Instead, we mitigate the issue by applying part of one of
the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch
(as suggested by Ken).

Fixes ext_framebuffer_multisample-accuracy piglit test failures with the
following options:
* 6 depth_draw small depthstencil
* 8 stencil_draw small depthstencil
* 6 stencil_draw small depthstencil
* 8 depth_resolve small
* 6 stencil_resolve small depthstencil
* 4 stencil_draw small depthstencil
* 16 stencil_draw small depthstencil
* 16 depth_draw small depthstencil
* 2 stencil_resolve small depthstencil
* 6 stencil_draw small
* all_samples stencil_draw small
* 2 depth_draw small depthstencil
* all_samples depth_draw small depthstencil
* all_samples stencil_resolve small
* 4 depth_draw small depthstencil
* all_samples depth_draw small
* all_samples stencil_draw small depthstencil
* 4 stencil_resolve small depthstencil
* 4 depth_resolve small depthstencil
* all_samples stencil_resolve small depthstencil

v2: Include more platforms in WA (Ken).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93355
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Christian Gmeiner [Thu, 9 Aug 2018 05:12:24 +0000 (07:12 +0200)]
imx: make use of loader_open_render_node(..) helper

Gets rid of hard-coded gpu device path.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Christian Gmeiner [Thu, 2 Aug 2018 18:04:45 +0000 (20:04 +0200)]
tegra: make use of loader_open_render_node(..) helper

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Christian Gmeiner [Thu, 9 Aug 2018 05:12:22 +0000 (07:12 +0200)]
loader: add loader_open_render_node(..)

This helper is almost a 1:1 copy of tegra_open_render_node().

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Christian Gmeiner [Fri, 10 Aug 2018 09:51:25 +0000 (11:51 +0200)]
tegra: fix memory leak

Fixes: 1755f608f52 ("tegra: Initial support")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Daniel Stone [Fri, 31 Aug 2018 16:34:03 +0000 (17:34 +0100)]
st/dri: Don't expose sRGB formats to clients

Though the SARGB8888 format is used internally through its FourCC value,
it is not a real format as defined by drm_fourcc.h; it cannot be used
with KMS or other interfaces expecting drm_fourcc.h format codes.

Ensure we don't advertise it through the dmabuf format/modifier query
interfaces, preventing us from tripping over an assert.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 8c1b9882b2e0 ("egl/dri2: Guard against invalid fourcc formats")
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Samuel Pitoiset [Thu, 30 Aug 2018 09:43:47 +0000 (11:43 +0200)]
radv: add missing support for protected memory properties

Fixes Vulkan CTS CL#2849. Similar to the ANV driver.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 30 Aug 2018 08:33:37 +0000 (10:33 +0200)]
radv: remove dead code in scan_shader_output_decl()

Never used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 30 Aug 2018 08:33:00 +0000 (10:33 +0200)]
radv: remove radv_shader_context::num_output_{clips,culls}

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 30 Aug 2018 08:30:54 +0000 (10:30 +0200)]
radv: adjust the cull dist mask in scan_shader_output_decl()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 30 Aug 2018 08:12:03 +0000 (10:12 +0200)]
radv: get length of the clip/cull distances array from usage mask

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 30 Aug 2018 08:01:26 +0000 (10:01 +0200)]
radv: do not recompute the output usage mask for clipdist twice

The shader info pass takes care of this now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 30 Aug 2018 07:43:29 +0000 (09:43 +0200)]
radv: gather the output usage mask for clip/cull distances correctly

It's a special case because both are combined into a single array.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
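
A rough sketch of what a usage mask for the combined array might look like, and how the array length can be recovered from it. This assumes the clip and cull distances are packed back to back into two 4-component slots with at most 8 entries in total; the function names are hypothetical and the radv shader info pass is not reproduced:

```c
#include <assert.h>

/* Assumed layout: up to 8 distances packed into two 4-component slots.
 * Each mask has one bit per used component of its slot. */
static void clip_cull_usage_masks(unsigned clip_size, unsigned cull_size,
                                  unsigned *mask0, unsigned *mask1)
{
   unsigned total = clip_size + cull_size;
   assert(total <= 8);
   unsigned in_slot0 = total < 4 ? total : 4;
   unsigned in_slot1 = total - in_slot0;
   *mask0 = (1u << in_slot0) - 1;
   *mask1 = (1u << in_slot1) - 1;
}

/* Recover the combined array length by counting set bits in both masks. */
static unsigned clip_cull_length(unsigned mask0, unsigned mask1)
{
   unsigned n = 0;
   for (unsigned m = mask0 | (mask1 << 4); m; m >>= 1)
      n += m & 1;
   return n;
}
```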
Samuel Pitoiset [Thu, 30 Aug 2018 07:35:41 +0000 (09:35 +0200)]
radv: add set_output_usage_mask() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Wed, 29 Aug 2018 20:13:52 +0000 (22:13 +0200)]
radv: fix passing clip/cull distances from VS to PS

CTS doesn't test input clip/cull distances for the fragment
shader stage, which explains why this was totally broken. I
wrote a simple test locally that works now.

This fixes a crash with GTA V and DXVK.

Note that we are exporting unused parameters from the vertex
shader now, but this can't be optimized easily because we don't
keep the fragment shader info...

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107477
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Juan A. Suarez Romero [Thu, 30 Aug 2018 08:14:49 +0000 (10:14 +0200)]
egl/wayland: do not leak wl_buffer when it is locked

If the color buffer is locked, do not set its wayland buffer to NULL;
otherwise it cannot be freed later.

Rather, flag it in order to destroy it later on the release event.

v2: instruct release event to unlock only or free wl_buffer too (Daniel)

This also fixes dEQP-EGL.functional.swap_buffers_with_damage.* tests.

CC: Daniel Stone <daniel@fooishbar.org>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Dave Airlie [Wed, 29 Aug 2018 03:52:15 +0000 (13:52 +1000)]
ac/radeonsi: fix CIK copy max size

While adding transfer queues to radv, I started writing some tests;
the first test I wrote fell over copying a buffer larger than this
limit.

Checked AMDVLK and found the correct limit.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Fri, 31 Aug 2018 00:12:06 +0000 (01:12 +0100)]
radeonsi: fix regression in indirect input swizzles.

This fixes:
tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3.shader_test
since I reworked the 64-bit swizzles.

Fixes: bb17ae49ee2 (gallivm: allow to pass two swizzles into fetches.)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Thu, 30 Aug 2018 23:27:44 +0000 (00:27 +0100)]
radeonsi: fix tess/gs fetches for new swizzle.

I have piglit results from my machine, but I must have messed up,
and not built mesa in between properly.

Fixes: bb17ae49ee2 (gallivm: allow to pass two swizzles into fetches.)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Thu, 30 Aug 2018 19:14:46 +0000 (15:14 -0400)]
mesa: ignore VAO IDs equal to 0 in glDeleteVertexArrays

This fixes a firefox crash.

Fixes: 781a78914c798dc64005b37c6ca1224ce06803fc
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
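
The behaviour being restored is that a delete call silently skips the value zero (and unused names) instead of trying to look it up. A minimal sketch with a hypothetical object table; it does not use Mesa's hash tables or the real entry point:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 16  /* hypothetical table size */

/* live[id] is true while object `id` exists. */
static bool live[MAX_IDS];

/* Delete `n` objects and return how many were actually removed.
 * ID 0 and unknown IDs are silently skipped. */
static unsigned delete_objects(int n, const unsigned *ids)
{
   unsigned deleted = 0;
   assert(n >= 0);
   for (int i = 0; i < n; i++) {
      if (ids[i] == 0)
         continue;  /* zero never names an object */
      if (ids[i] < MAX_IDS && live[ids[i]]) {
         live[ids[i]] = false;
         deleted++;
      }
   }
   return deleted;
}
```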
Kenneth Graunke [Thu, 30 Aug 2018 18:19:51 +0000 (11:19 -0700)]
Revert "intel/tools/aubwrite: Always use physical addresses for traces."

This reverts commit f8cfc7766016d0ff7d52953e7a992b1e77c521d0.

This appears to break intel_dump_gpu for Gen9 systems - I can load them
in the simulator, but nothing happens.  Reverting the patch makes the
simulator properly execute our commands and shaders again.

Jason Ekstrand [Thu, 30 Aug 2018 17:50:31 +0000 (12:50 -0500)]
intel/nir: Lowering image loads and stores trashes all metadata

This fixes the GL_ARB_fragment_shader_interlock piglit test on gen8
platforms where the lack of metadata dirtying was causing another pass
to accidentally delete a much needed loop.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107745
Fixes: 37f7983bcca1 "intel/compiler: Do image load/store lowering..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Tue, 28 Aug 2018 20:25:23 +0000 (15:25 -0500)]
i965/screen: Allow modifiers on sRGB formats

This effectively reverts a26693493570a9d0f0fba1be617e01ee7bfff4db which
was a misguided attempt at protecting intel_query_dma_buf_modifiers from
invalid formats.  Unfortunately, in some internal EGL cases, we can get
an SRGB format validly in this function.  Rejecting such formats caused
us to not allow CCS in some cases where we should have been allowing it.
This regressed the performance of some SynMark tests as well as GfxBench
ALU2, Tessellation and Manhattan 3.0 tests.

There's some question of whether or not we really should be using SRGB
"fourcc" formats that aren't actually in drm_fourcc.h but there's not
much harm in allowing them through here.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107223
Fixes: a26693493570 "i965/screen: Return false for unsupported..."
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jason Ekstrand [Tue, 28 Aug 2018 21:43:57 +0000 (16:43 -0500)]
egl/dri2: Guard against invalid fourcc formats

We already reject attempts to import images with invalid fourcc formats
but don't really guard the queries all that well.  This makes us error
out in any calls to eglQueryDmaBufModifiersEXT if the given format is
not a valid fourcc format.  We also add an assert to ensure that drivers
don't advertise any non-fourcc formats.

Cc: mesa-stable@lists.freedesktop.org
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jason Ekstrand [Tue, 28 Aug 2018 21:31:22 +0000 (16:31 -0500)]
egl/dri2: Add a helper for the number of planes for a FOURCC format

This also serves as a convenient "is this a fourcc format" check as well
which we'll take advantage of in the next commit.

Cc: mesa-stable@lists.freedesktop.org
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
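
A plane-count lookup doubles as a validity check when it returns 0 for anything it does not recognize. The sketch below assumes a hypothetical helper name and covers only four formats; the `FOURCC` macro mirrors the byte packing of `fourcc_code()` in drm_fourcc.h, and the real helper's format list is not reproduced:

```c
#include <assert.h>
#include <stdint.h>

/* Same byte packing as fourcc_code() in drm_fourcc.h. */
#define FOURCC(a, b, c, d)                                   \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) |                   \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical helper: plane count for a few formats, 0 if unknown. */
static unsigned fourcc_num_planes(uint32_t format)
{
   switch (format) {
   case FOURCC('X', 'R', '2', '4'): /* XRGB8888 */
   case FOURCC('A', 'R', '2', '4'): /* ARGB8888 */
      return 1;
   case FOURCC('N', 'V', '1', '2'): /* NV12 */
      return 2;
   case FOURCC('Y', 'U', '1', '2'): /* YUV420 */
      return 3;
   default:
      return 0; /* not a known fourcc format */
   }
}
```

A caller can then treat `fourcc_num_planes(format) == 0` as "reject this format" without a separate validation table.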
Jason Ekstrand [Thu, 30 Aug 2018 00:47:19 +0000 (19:47 -0500)]
radv/meta: Set num_components on image_store intrinsics

Now that image load/store intrinsics are variable-width, we need to set
num_components accordingly.  In 15d39f474b890, both glsl_to_nir and
spirv_to_nir were updated to properly set num_components but radv meta
was left behind.

Fixes: 15d39f474b890 "nir: Make image load/store intrinsics..."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agogallivm: Detect VSX separately from Altivec
Vicki Pfau [Sun, 19 Aug 2018 21:17:01 +0000 (14:17 -0700)]
gallivm: Detect VSX separately from Altivec

Previously gallivm would attempt to use VSX instructions on all systems
where it detected that Altivec is supported; however, VSX was added to
POWER long after Altivec, causing lots of crashes on older POWER/PPC
hardware, e.g. PPC Macs. By detecting VSX separately from Altivec we can
automatically disable it on hardware that supports Altivec but not VSX.
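
The idea can be sketched as two independent capability flags. The bit
values and the names sketch_cpu_caps and sketch_detect below are
assumptions made for illustration; they are not the kernel's hwcap
bits or gallivm's actual detection code.

```c
#include <stdbool.h>

/* Hypothetical capability bits for this sketch, not a real ABI. */
#define SKETCH_CAP_ALTIVEC (1u << 0)
#define SKETCH_CAP_VSX     (1u << 1)

struct sketch_cpu_caps {
   bool has_altivec;
   bool has_vsx;
};

static struct sketch_cpu_caps
sketch_detect(unsigned hw_bits)
{
   struct sketch_cpu_caps caps;

   caps.has_altivec = (hw_bits & SKETCH_CAP_ALTIVEC) != 0;
   /* VSX is tested on its own bit and never inferred from Altivec. */
   caps.has_vsx = (hw_bits & SKETCH_CAP_VSX) != 0;
   return caps;
}
```

With this split, an Altivec-only machine reports has_altivec as true
and has_vsx as false, so VSX code paths can be skipped there.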

Signed-off-by: Vicki Pfau <vi@endrift.com>
5 years agonv50: bump compat glsl level to same as core
Ilia Mirkin [Sun, 26 Aug 2018 21:47:12 +0000 (17:47 -0400)]
nv50: bump compat glsl level to same as core

Passes the compat piglits. I'm sure that there will be odd issues that
aren't caught by them, but at least it should basically work.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years agonvc0: bump compat GLSL version to match core
Ilia Mirkin [Sun, 26 Aug 2018 18:21:39 +0000 (14:21 -0400)]
nvc0: bump compat GLSL version to match core

This passes the handful of tests in piglit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years agoglsl: avoid lowering texcoord array except in simple cases
Ilia Mirkin [Sun, 26 Aug 2018 17:48:10 +0000 (13:48 -0400)]
glsl: avoid lowering texcoord array except in simple cases

With compat creeping up to geometry and tess shaders, lowering texcoord
accesses/writes becomes more complicated. Since it's an optimization
anyways, just avoid the complication for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agodocs: update calendar 18.2.0-rc5 is out, extend to 18.2.0-rc6
Andres Gomez [Thu, 30 Aug 2018 00:32:02 +0000 (03:32 +0300)]
docs: update calendar 18.2.0-rc5 is out, extend to 18.2.0-rc6

Signed-off-by: Andres Gomez <agomez@igalia.com>
5 years agost/mesa, gallium: add a workaround for No Mans Sky
Timothy Arceri [Wed, 29 Aug 2018 05:48:47 +0000 (15:48 +1000)]
st/mesa, gallium: add a workaround for No Mans Sky

The spec seems clear this is not allowed but the Nvidia binary
driver forces apps to add layout qualifiers so this works around
the issue for No Mans Sky until the CTS can be sorted out.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoglsl: add a mechanism to allow layout qualifiers on function params
Timothy Arceri [Wed, 29 Aug 2018 05:48:46 +0000 (15:48 +1000)]
glsl: add a mechanism to allow layout qualifiers on function params

The spec is quite clear this is not allowed:

    From Section 4.4. (Layout Qualifiers) of the GLSL 4.60 spec:

       "Layout qualifiers can appear in several forms of declaration.
       They can appear as part of an interface block definition or
       block member, as shown in the grammar in the previous section.
       They can also appear with just an interface-qualifier to establish
       layouts of other declarations made with that qualifier:

          layout-qualifier interface-qualifier ;

       Or, they can appear with an individual variable declared with
       an interface qualifier:

          layout-qualifier interface-qualifier declaration ;"

    From Section 4.10 (Memory Qualifiers) of the GLSL 4.60 spec:

       "Layout qualifiers cannot be used on formal function parameters,
       and layout qualification is not included in parameter matching."

However, on the Nvidia binary driver shaders actually fail to compile
if image function params don't have a layout qualifier. This results
in applications such as No Mans Sky using layout qualifiers on params.

I've submitted a CTS test to expose this problem in the Nvidia driver
but until that is resolved this patch will help Mesa drivers work
around the issue.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoglsl: skip stringification in preprocessor if in unreachable branch
Timothy Arceri [Wed, 29 Aug 2018 01:36:51 +0000 (11:36 +1000)]
glsl: skip stringification in preprocessor if in unreachable branch

This fixes compilation of some "No Mans Sky" shaders where the stringification
happens in branches intended for DX12.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoradv: Add missing checks in radv_get_image_format_properties.
Bas Nieuwenhuizen [Wed, 29 Aug 2018 15:04:25 +0000 (17:04 +0200)]
radv: Add missing checks in radv_get_image_format_properties.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agogallivm: allow to pass two swizzles into fetches.
Dave Airlie [Mon, 27 Aug 2018 01:03:41 +0000 (02:03 +0100)]
gallivm: allow to pass two swizzles into fetches.

This hijacks the top 16 bits of the swizzle to pass in the swizzle
for the second channel.

This fixes handling .yx swizzles of 64-bit values.

This should fix up radeonsi and llvmpipe.
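
A minimal sketch of the packing described above, assuming the first
swizzle lives in the low 16 bits and the second in the high 16 bits of
one 32-bit value. The function names are hypothetical and are not the
gallivm API.

```c
#include <stdint.h>

/* Pack two channel swizzles into a single 32-bit value. */
static uint32_t
sketch_pack_swizzles(unsigned first, unsigned second)
{
   return (uint32_t)(first & 0xffffu) |
          ((uint32_t)(second & 0xffffu) << 16);
}

/* Low 16 bits: swizzle for the first 32-bit half of the value. */
static unsigned
sketch_first_swizzle(uint32_t packed)
{
   return packed & 0xffffu;
}

/* High 16 bits: swizzle for the second 32-bit half of the value. */
static unsigned
sketch_second_swizzle(uint32_t packed)
{
   return packed >> 16;
}
```

Under this layout, callers that only ever pass a small single swizzle
leave the top bits at zero, so the extra field does not disturb them.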

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107524
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: enable radeonsi_zerovram for No Mans Sky
Timothy Arceri [Fri, 24 Aug 2018 11:06:19 +0000 (21:06 +1000)]
radeonsi: enable radeonsi_zerovram for No Mans Sky

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: add radeonsi_zerovram driconfig option
Timothy Arceri [Fri, 24 Aug 2018 11:06:18 +0000 (21:06 +1000)]
radeonsi: add radeonsi_zerovram driconfig option

More and more games seem to require this so let's make it a config
option.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: enable GL 4.5 in compat profile
Timothy Arceri [Fri, 24 Aug 2018 11:06:17 +0000 (21:06 +1000)]
radeonsi: enable GL 4.5 in compat profile

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: enable ARB_direct_state_access in compat for GL3.1+
Timothy Arceri [Wed, 29 Aug 2018 02:40:12 +0000 (12:40 +1000)]
mesa: enable ARB_direct_state_access in compat for GL3.1+

We could enable it for lower versions of GL but this allows us
to just use the existing version/extension checks that are already
used by the core profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: add a thorough clear/copy_buffer benchmark
Marek Olšák [Thu, 2 Aug 2018 22:15:48 +0000 (18:15 -0400)]
radeonsi: add a thorough clear/copy_buffer benchmark

5 years agoradeonsi: let internal compute dispatches tune WAVES_PER_SH
Marek Olšák [Thu, 2 Aug 2018 20:37:17 +0000 (16:37 -0400)]
radeonsi: let internal compute dispatches tune WAVES_PER_SH

5 years agoradeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI
Marek Olšák [Wed, 25 Jul 2018 05:37:21 +0000 (01:37 -0400)]
radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI

5 years agoradeonsi: add SI_QUERY_TIME_ELAPSED_SDMA_SI for measuring DMA on SI
Marek Olšák [Tue, 21 Aug 2018 04:46:53 +0000 (00:46 -0400)]
radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA_SI for measuring DMA on SI

DMA on SI doesn't support the timestamp packet, so it's emulated.

5 years agoradeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performance
Marek Olšák [Tue, 24 Jul 2018 17:14:29 +0000 (13:14 -0400)]
radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performance

5 years agoradeonsi: add flag L2_STREAM for minimal cache usage
Marek Olšák [Fri, 3 Aug 2018 00:33:06 +0000 (20:33 -0400)]
radeonsi: add flag L2_STREAM for minimal cache usage

5 years agogallium: add TGSI_MEMORY_STREAM_CACHE_POLICY
Marek Olšák [Wed, 25 Jul 2018 04:41:48 +0000 (00:41 -0400)]
gallium: add TGSI_MEMORY_STREAM_CACHE_POLICY

For internal radeonsi shaders.

5 years agointel/compiler: Remove surface_idx from brw_image_param
Jason Ekstrand [Fri, 17 Aug 2018 14:15:56 +0000 (09:15 -0500)]
intel/compiler: Remove surface_idx from brw_image_param

Now that the drivers are lowering to surface indices themselves, we no
longer need to push the surface index into the shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel: Use TXS for image_size when we have a typed surface
Jason Ekstrand [Thu, 16 Aug 2018 16:01:24 +0000 (11:01 -0500)]
intel: Use TXS for image_size when we have a typed surface

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoanv,i965: Lower away image derefs in the driver
Jason Ekstrand [Thu, 16 Aug 2018 21:23:10 +0000 (16:23 -0500)]
anv,i965: Lower away image derefs in the driver

Previously, the back-end compiler turned image access into magic uniform
reads and there was a complex contract between back-end compiler and
driver about setting up and filling out those params.  As of this
commit, both drivers now lower image_deref_load_param_intel intrinsics
to load_uniform intrinsics controlled by the driver and lower the other
image_deref_* intrinsics to image_* intrinsics which take an actual
binding table index.  There are still "magic" uniforms but they are now
added and controlled entirely by the driver and that contract no longer
spans components.

This also has the side-effect of making most image use compile-time
binding table indices.  Previously, all image access pulled the binding
table index from a uniform.  Part of the reason for this was that the
magic uniforms made it difficult to decouple binding table indices from
the uniforms and, since they are indexed completely differently
(especially in Vulkan), it was hard to pull them apart.  Now that the
driver is handling both, it's trivial to decouple the two and provide
actual binding table indices.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166872 -> 15164293 (-0.02%)
    instructions in affected programs: 115834 -> 113255 (-2.23%)
    helped: 191
    HURT: 0

    total cycles in shared programs: 571311495 -> 571196465 (-0.02%)
    cycles in affected programs: 4757115 -> 4642085 (-2.42%)
    helped: 73
    HURT: 67

    total spills in shared programs: 10951 -> 10926 (-0.23%)
    spills in affected programs: 742 -> 717 (-3.37%)
    helped: 7
    HURT: 0

    total fills in shared programs: 22226 -> 22201 (-0.11%)
    fills in affected programs: 1146 -> 1121 (-2.18%)
    helped: 7
    HURT: 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Add handle/index-based image intrinsics
Jason Ekstrand [Thu, 16 Aug 2018 20:11:44 +0000 (15:11 -0500)]
nir: Add handle/index-based image intrinsics

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Use a bitfield for image access qualifiers
Jason Ekstrand [Thu, 16 Aug 2018 20:11:12 +0000 (15:11 -0500)]
nir: Use a bitfield for image access qualifiers

This commit expands the current memory access enum to contain the extra
two bits provided for images.  We choose to follow the SPIR-V convention
of NonReadable and NonWriteable because readonly implies that you *can*
read so readonly + writeonly doesn't make as much sense as NonReadable +
NonWriteable.
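
As a rough sketch of the convention, assuming one bit per qualifier:
the enum names and bit positions below are hypothetical and are not
the values used in NIR.

```c
#include <stdbool.h>

/* Hypothetical access bits; names and positions are illustrative. */
enum sketch_access {
   SKETCH_ACCESS_COHERENT      = 1 << 0,
   SKETCH_ACCESS_RESTRICT      = 1 << 1,
   SKETCH_ACCESS_VOLATILE      = 1 << 2,
   SKETCH_ACCESS_NON_READABLE  = 1 << 3,
   SKETCH_ACCESS_NON_WRITEABLE = 1 << 4,
};

/* Translate GLSL-style qualifiers into the negative-sense bits. */
static unsigned
sketch_access_from_glsl(bool readonly, bool writeonly)
{
   unsigned access = 0;

   if (readonly)
      access |= SKETCH_ACCESS_NON_WRITEABLE;
   if (writeonly)
      access |= SKETCH_ACCESS_NON_READABLE;
   return access;
}

static bool
sketch_can_read(unsigned access)
{
   return !(access & SKETCH_ACCESS_NON_READABLE);
}

static bool
sketch_can_write(unsigned access)
{
   return !(access & SKETCH_ACCESS_NON_WRITEABLE);
}
```

With the negative sense, setting both bits simply means "neither
readable nor writeable", which composes without contradiction.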

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl/link,i965: Make ImageAccess four-state
Jason Ekstrand [Thu, 16 Aug 2018 19:31:28 +0000 (14:31 -0500)]
glsl/link,i965: Make ImageAccess four-state

The GLSL spec allows you to set both the "readonly" and "writeonly"
qualifiers on images to indicate that it can only be used with
imageSize.  However, we had no way of representing this in the linked
shader and flagged it as GL_READ_ONLY.  This is good from a "does it use
this buffer?" perspective but not from a format and access lowering
perspective.  By using GL_NONE if "readonly" and "writeonly" are
both set, we can detect this case in the driver and handle it correctly.
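
The four states can be sketched as a small mapping. The function name
sketch_image_access is hypothetical and the SKETCH_GL_* macros merely
mirror the standard GL enum values; this is not the linker's code.

```c
#include <stdbool.h>

/* Local copies of the standard GL enum values used for access. */
#define SKETCH_GL_NONE       0
#define SKETCH_GL_READ_ONLY  0x88B8
#define SKETCH_GL_WRITE_ONLY 0x88B9
#define SKETCH_GL_READ_WRITE 0x88BA

/* Map the two GLSL qualifiers onto one of four access states. */
static unsigned
sketch_image_access(bool readonly, bool writeonly)
{
   if (readonly && writeonly)
      return SKETCH_GL_NONE; /* only usable with imageSize */
   if (readonly)
      return SKETCH_GL_READ_ONLY;
   if (writeonly)
      return SKETCH_GL_WRITE_ONLY;
   return SKETCH_GL_READ_WRITE;
}
```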

Nothing currently relies on the type of surface in the "readonly" +
"writeonly" case but that's about to change.  i965 is the only driver
which uses the ImageAccess field and gl_bindless_image::access is
currently unused.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/compiler: Use two components for 1D array image sizes
Jason Ekstrand [Thu, 16 Aug 2018 15:16:41 +0000 (10:16 -0500)]
intel/compiler: Use two components for 1D array image sizes

Having the array length component stored in .z was a small convenience
for the ISL image param filling code and an annoyance in the NIR
lowering code.  The only convenience of treating 1D arrays like 2D
arrays in the lowering code is in the address calculation code so let's
put all the complexity there as well.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoisl: Use the view array length for the image size
Jason Ekstrand [Thu, 16 Aug 2018 15:12:16 +0000 (10:12 -0500)]
isl: Use the view array length for the image size

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/compiler: Do image load/store lowering to NIR
Jason Ekstrand [Sat, 27 Jan 2018 21:19:57 +0000 (13:19 -0800)]
intel/compiler: Do image load/store lowering to NIR

This commit moves our storage image format conversion codegen into NIR
instead of doing it in the back-end.  This has the advantage of letting
us run it through NIR's optimizer which is pretty effective at shrinking
things down.  In the common case of rgba8, the number of instructions
emitted after NIR is done with it is half of what it was with the
lowering happening in the back-end.  On the downside, the back-end's
lowering is able to directly use predicates and the NIR lowering has to
use IFs.
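
For a sense of what the rgba8 conversion computes, here is a plain C
sketch of unpacking four 8-bit unorm channels from one 32-bit word,
assuming red sits in the lowest byte. It illustrates the arithmetic
only; the commit emits the equivalent as NIR instructions, and the
name sketch_unpack_rgba8_unorm is hypothetical.

```c
#include <stdint.h>

/* Unpack a 32-bit rgba8 unorm texel into four floats in [0, 1]. */
static void
sketch_unpack_rgba8_unorm(uint32_t packed, float out[4])
{
   for (int i = 0; i < 4; i++) {
      /* Channel i occupies bits [8*i, 8*i + 7]. */
      uint32_t chan = (packed >> (8 * i)) & 0xffu;

      /* unorm: divide by the maximum representable value. */
      out[i] = (float)chan / 255.0f;
   }
}
```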

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166910 -> 15166872 (<.01%)
    instructions in affected programs: 5895 -> 5857 (-0.64%)
    helped: 15
    HURT: 0

Clearly, we don't have that much image_load_store happening in the
shaders in shader-db....

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/types: Add a wrapper for coordinate_components
Jason Ekstrand [Thu, 16 Aug 2018 15:22:32 +0000 (10:22 -0500)]
nir/types: Add a wrapper for coordinate_components

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoanv/pipeline: Remove dead image loads in lower_input_attachments
Jason Ekstrand [Wed, 15 Aug 2018 19:04:25 +0000 (14:04 -0500)]
anv/pipeline: Remove dead image loads in lower_input_attachments

Dead code will get rid of them eventually but it's better if they're
just gone so we guarantee they won't trip up later passes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Make image load/store intrinsics variable-width
Jason Ekstrand [Tue, 14 Aug 2018 19:03:05 +0000 (14:03 -0500)]
nir: Make image load/store intrinsics variable-width

Instead of requiring 4 components, this allows them to potentially use
fewer.  Both the SPIR-V and GLSL paths still generate vec4 intrinsics so
drivers which assume 4 components should be safe.  However, we want to
be able to shrink them for i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/format_convert: Fix a bitmask in unpack_11f11f10f
Jason Ekstrand [Thu, 16 Aug 2018 14:21:10 +0000 (09:21 -0500)]
nir/format_convert: Fix a bitmask in unpack_11f11f10f

Fixes: 4e337b42f9a2 "nir/format_convert: Add pack/unpack for R11F_G11F_B10F"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/format_convert: Rename pack_r11g11b10f to pack_11f11f10f
Jason Ekstrand [Mon, 13 Aug 2018 22:31:19 +0000 (17:31 -0500)]
nir/format_convert: Rename pack_r11g11b10f to pack_11f11f10f

This matches the unpack function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/format_convert: Add [us]norm conversion helpers
Jason Ekstrand [Mon, 13 Aug 2018 21:13:50 +0000 (16:13 -0500)]
nir/format_convert: Add [us]norm conversion helpers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/format_convert: Rename nir_format_bitcast_uint_vec
Jason Ekstrand [Mon, 13 Aug 2018 19:57:22 +0000 (14:57 -0500)]
nir/format_convert: Rename nir_format_bitcast_uint_vec

We have a name for that, it's called a uvec.  This just makes the
function name a bit shorter.  While we're here, we also add an assert
for one of the assumptions this function makes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/format_convert: Add vec mask and sign-extend helpers
Jason Ekstrand [Mon, 13 Aug 2018 17:04:25 +0000 (12:04 -0500)]
nir/format_convert: Add vec mask and sign-extend helpers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/format_convert: Add support for unpacking signed integers
Jason Ekstrand [Mon, 13 Aug 2018 16:41:41 +0000 (11:41 -0500)]
nir/format_convert: Add support for unpacking signed integers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/opcodes: Make unpack_half_2x16_split_* variable-width
Jason Ekstrand [Wed, 15 Aug 2018 16:58:50 +0000 (11:58 -0500)]
nir/opcodes: Make unpack_half_2x16_split_* variable-width

There is nothing inherent about these opcodes that requires them to only
take scalars.  It's very convenient if we let them take vectors as well.
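
What a vector-wide version computes can be sketched in C as a
per-component loop, assuming split_x reads the low 16 bits and split_y
the high 16 bits of each 32-bit source component. The sketch_* names
are hypothetical and this is not the NIR opcode definition.

```c
#include <stdint.h>
#include <string.h>

/* Convert an IEEE 754 binary16 bit pattern to a float. */
static float
sketch_half_to_float(uint16_t h)
{
   uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
   uint32_t exp = (h >> 10) & 0x1fu;
   uint32_t mant = h & 0x3ffu;
   uint32_t bits;
   float f;

   if (exp == 0x1f) {
      bits = sign | 0x7f800000u | (mant << 13); /* inf or NaN */
   } else if (exp != 0) {
      bits = sign | ((exp + 112) << 23) | (mant << 13); /* rebias */
   } else if (mant == 0) {
      bits = sign; /* signed zero */
   } else {
      /* Subnormal half: shift until the implicit bit appears. */
      unsigned shift = 0;
      while (!(mant & 0x400u)) {
         mant <<= 1;
         shift++;
      }
      bits = sign | ((113 - shift) << 23) | ((mant & 0x3ffu) << 13);
   }
   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Per-component unpack of the low half of each source word. */
static void
sketch_unpack_half_split_x(const uint32_t *src, float *dst, unsigned n)
{
   for (unsigned i = 0; i < n; i++)
      dst[i] = sketch_half_to_float((uint16_t)(src[i] & 0xffffu));
}

/* Per-component unpack of the high half of each source word. */
static void
sketch_unpack_half_split_y(const uint32_t *src, float *dst, unsigned n)
{
   for (unsigned i = 0; i < n; i++)
      dst[i] = sketch_half_to_float((uint16_t)(src[i] >> 16));
}
```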

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>