mesa.git
Samuel Pitoiset [Tue, 7 Nov 2017 09:02:32 +0000 (10:02 +0100)]
radv: force enable LLVM sisched for The Talos Principle

It seems safe and it improves performance by +4% (73->76).

A drirc-based solution is not what we want for now; keep it
simple and improve later if it's really needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 10 Nov 2017 08:34:46 +0000 (09:34 +0100)]
radv: add nosisched debug option

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Alejandro Piñeiro [Tue, 14 Nov 2017 07:32:18 +0000 (08:32 +0100)]
spirv: fix typo on DO NOT EDIT header

Introduced in commit 157c9a13414b524ce98ea0ea07fce819efc1ba65

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jon Turney [Mon, 13 Nov 2017 10:13:54 +0000 (10:13 +0000)]
meson: if dep_dl is an empty list, it's not a dependency object

It's ok to use an empty list for dependencies:, but it's not ok to try to
use the found() method of it.

See also https://github.com/mesonbuild/meson/issues/2324

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Bas Nieuwenhuizen [Mon, 13 Nov 2017 22:26:32 +0000 (23:26 +0100)]
radv: Free temporary syncobj after waiting on it.

Otherwise we leak it.

Fixes: eaa56eab6da "radv: initial support for shared semaphores (v2)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 13 Nov 2017 22:18:19 +0000 (23:18 +0100)]
radv: Free syncobj with multiple imports.

Otherwise we can leak the old syncobj.

Fixes: eaa56eab6da "radv: initial support for shared semaphores (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Jason Ekstrand [Fri, 3 Nov 2017 23:11:54 +0000 (16:11 -0700)]
i965: Track the depth and render caches separately

Previously, we just had one hash set for tracking depth and render
caches called brw_context::render_cache.  This is less than ideal
because the depth and render caches are separate and we can't track
moves between the depth and the render caches.  This limitation led
to some unnecessary flushing around the depth cache.  There are cases
(mostly with BLORP) where we can end up touching a depth or stencil
buffer through the render cache.  To guard against this, blorp would
unconditionally do a render_cache_set_check_flush on its destination
which meant that if you did any rendering (including a BLORP operation)
to a given surface and then used it as a blorp destination, you would
end up flushing it out of the render cache before rendering into it.

Things get worse when you dig into the depth/stencil state code for
regular GL draw calls.  Because we may end up rendering to a depth
or stencil buffer via BLORP, we did a render_cache_set_check_flush on
all depth and stencil buffers in brw_emit_depthbuffer to ensure that
they got flushed out of the render cache prior to using them for depth
or stencil testing.  However, because we also need to track dirtiness
for depth and stencil so that we can implement depth and stencil
texturing correctly, we were adding all depth and stencil buffers to the
render cache set in brw_postdraw_set_buffers_need_resolve.  This meant
that, if anything caused 3DSTATE_DEPTH_BUFFER to get re-emitted
(currently _NEW_BUFFERS, BRW_NEW_BATCH, and BRW_NEW_BLORP), we would
almost always do a full pipeline stall and render/depth cache flush.

The root cause of both of these problems is that we can't tell the
difference between the render and depth caches in our tracking.  This
commit splits our cache tracking into two sets, one for render and one
for depth, and properly handles transitioning between the two.  We still
flush all the caches whenever anything needs to be flushed.  The idea is
that if we're going to take the hit of a flush and stall, we may as well
flush everything in the hopes that we can avoid a flush by something
else later.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
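
A minimal sketch of the two-set tracking described in the i965 commit above,
written in Python for brevity rather than the driver's C. The class and method
names (CacheTracker, flush_for_render, flush_for_depth) are hypothetical and
only loosely echo the helpers mentioned in this log. The real flush emits
hardware commands; here it is reduced to a counter.

```python
class CacheTracker:
    """Toy model: track which buffers may be dirty in the render or depth cache."""

    def __init__(self):
        self.render_set = set()
        self.depth_set = set()
        self.flush_count = 0  # stands in for a pipeline stall plus cache flush

    def _flush_all(self):
        # As the commit message says, once a flush is needed everything is flushed.
        self.render_set.clear()
        self.depth_set.clear()
        self.flush_count += 1

    def flush_for_render(self, bo):
        # Only a buffer that may still be dirty in the *depth* cache forces a flush.
        if bo in self.depth_set:
            self._flush_all()

    def flush_for_depth(self, bo):
        # Only a buffer that may still be dirty in the *render* cache forces a flush.
        if bo in self.render_set:
            self._flush_all()

    def add_render(self, bo):
        self.render_set.add(bo)

    def add_depth(self, bo):
        self.depth_set.add(bo)
```

With a single combined set, using the same depth buffer for two consecutive
draws would look like a hazard. With two sets, a flush is only triggered when
a buffer actually moves from one cache to the other.
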
Jason Ekstrand [Fri, 3 Nov 2017 23:03:52 +0000 (16:03 -0700)]
i965/blorp: Add more destination flushing

Right now we just always flush the destination for render and aren't
particularly careful about depth or stencil.  Soon, flush_for_render
isn't going to do the same thing as flush_for_depth and we may be doing
a good deal less depth flushing so we should be a bit more precise.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 3 Nov 2017 23:01:28 +0000 (16:01 -0700)]
i965: Add more precise cache tracking helpers

In theory, this will let us track the depth and render caches
separately.  Right now, they're just wrappers around
brw_render_cache_set_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 3 Nov 2017 22:57:47 +0000 (15:57 -0700)]
i965: Add stencil buffers to cache set regardless of stencil texturing

We may access them as a texture using blorp regardless of whether or not
stencil texturing is enabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Jason Ekstrand [Tue, 14 Nov 2017 04:13:09 +0000 (20:13 -0800)]
i965: Switch over to fully external-or-not MOCS scheme

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 3 Nov 2017 22:26:17 +0000 (15:26 -0700)]
i965: Use PTE MOCS for all external buffers

We were already using PTE for all render targets in case one happened to
get scanned out.  However, this still wasn't 100% correct because there
are still possibly cases where we may want to texture from an external
buffer even though we don't know the caching mode.  This can happen, for
instance, on buffers imported from another GPU via prime.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101691
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 3 Nov 2017 22:20:08 +0000 (15:20 -0700)]
intel/blorp: Make the MOCS setting part of blorp_address

This makes our MOCS settings significantly more flexible.

Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 3 Nov 2017 22:18:45 +0000 (15:18 -0700)]
anv/blorp: Add a device parameter to blorp_surf_for_anv_image

Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 3 Nov 2017 21:31:51 +0000 (14:31 -0700)]
intel/blorp: Use mocs.tex for depth stencil

Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Thu, 26 Oct 2017 21:54:29 +0000 (14:54 -0700)]
intel/tools/error: Decode compute shaders.

This is a bit more annoying than your average shader - we need to look
at MEDIA_INTERFACE_DESCRIPTOR_LOAD in the batch buffer, then hop over
to the dynamic state buffer to read the INTERFACE_DESCRIPTOR_DATA, then
hop over to the instruction buffer to decode the program.

Now that we store all the buffers before decoding, we can actually do
this fairly easily.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Kenneth Graunke [Sun, 12 Nov 2017 08:12:04 +0000 (00:12 -0800)]
intel/tools/error: Use do-while for field iterator loops.

while loops skip the first field of the instruction/structure, which
is not what the code intended.  It works out because the field we're
looking for doesn't happen to be first, but we ought to do it right
regardless.

Found while writing the next patch, where Kernel Start Pointer is
the first field of INTERFACE_DESCRIPTOR_DATA.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
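
The while versus do-while difference in the commit above can be illustrated
with a toy iterator. This is a sketch in Python, not the tool's C code:
FieldIter and both visit functions are hypothetical, and the only assumption
carried over from the commit message is that the iterator already points at
the first field when the loop starts.

```python
class FieldIter:
    """Toy iterator positioned on the first field as soon as it is created.

    Assumes a non-empty list of field names.
    """

    def __init__(self, fields):
        self._fields = fields
        self._pos = 0

    @property
    def name(self):
        return self._fields[self._pos]

    def advance(self):
        """Move to the next field; return False when there is none."""
        if self._pos + 1 >= len(self._fields):
            return False
        self._pos += 1
        return True


def visit_while(fields):
    # Buggy shape: advancing before the first visit skips fields[0].
    it, seen = FieldIter(fields), []
    while it.advance():
        seen.append(it.name)
    return seen


def visit_do_while(fields):
    # Fixed shape: Python has no do-while, so visit first and advance last.
    it, seen = FieldIter(fields), []
    while True:
        seen.append(it.name)
        if not it.advance():
            break
    return seen
```

The bug stays hidden as long as the field of interest is not the first one,
which matches the situation the commit message describes.
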
Kenneth Graunke [Sun, 12 Nov 2017 07:26:19 +0000 (23:26 -0800)]
intel/tools/error: Decode shaders while decoding batch commands.

This makes aubinator_error_decode's shader dumping work like aubinator.
Instead of printing them after the fact, it prints them right inside the
3DSTATE_VS/HS/DS/GS/PS packet that references them.  This saves you the
effort of cross-referencing things and jumping back and forth.

It also reduces a bunch of book-keeping, and eliminates the limitation
that we could only handle 4096 programs.  That code was also broken and
failed to print any shaders if there were under 4096 programs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Kenneth Graunke [Sun, 12 Nov 2017 06:37:35 +0000 (22:37 -0800)]
intel/tools/error: Save error state sections and decode them later.

This lets us complete parsing and storing of each buffer's data before
we begin decoding the batchbuffer.  This makes it possible to inspect
the state buffer and program buffer, so we can properly decode any
indirect state or shader programs.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Sun, 12 Nov 2017 06:51:07 +0000 (22:51 -0800)]
intel/tools/error: Fix null termination of ring name string.

Ported from intel_error_decode.  We don't want to run off the end.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Sun, 12 Nov 2017 06:20:23 +0000 (22:20 -0800)]
intel/tools/error: Drop unused MAX_RINGS #define.

Dead code.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Sun, 12 Nov 2017 06:09:16 +0000 (22:09 -0800)]
intel/tools/error: Refactor buffer matching, add more buffers.

Based on a similar patch to intel_error_decode by Chris Wilson.

While we're de-duplicating the gtt_offset calculation, we can simplify
it to assume two hex digits are there - the kernel has done this since
v4.6, and we already require error states from v4.10.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Sun, 12 Nov 2017 06:04:01 +0000 (22:04 -0800)]
intel/tools/error: Only decode a few sections of error states.

These three are the only ones we can reasonably decode with genxml.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Sun, 12 Nov 2017 06:17:30 +0000 (22:17 -0800)]
intel/tools/error: Drop unused parameters from decode() helper.

Also change count from a pointer into a value.  We were supposed to
be resetting it to 0 (and failed to), but that's gone since we dropped
the pre-ascii85 handling.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Sun, 12 Nov 2017 05:55:27 +0000 (21:55 -0800)]
intel/tools/error: Drop support for non-ascii85 encoded error states.

Error state files used to look like:

   render ring --- gtt_offset = 0x0e8f6000
   00000000 :  69040000
   00000004 :  79090000
   ...
   00007ffc :  00000000
   --- ringbuffer = 0x00001000

There were thousands of lines between sections.  The file format changed
with Kernel 4.10, and now has a single ascii85-encoded line following
each section heading.  This is much easier to parse.

There are a bunch of bugs in our handling of the old style format,
where we'd decode the wrong data, at the wrong time.  Fixing all of
these is going to be a giant pain.  It's also a lot of extra code
complexity.  In order to properly decode indirect state, or compute
shaders, we'll also need to parse data in advance of decoding, which
is going to be a giant pain with this ad-hoc "decode everywhere!"
mentality.  So, let's just drop support for the older file format.

This unfortunately requires an error state generated by Kernel 4.10 or
later.  That's probably not the end of the world, as we encourage users
to upgrade to the latest kernel when encountering GPU hangs anyway.  It
might be a giant pain for people with LTS kernels, though...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Sun, 12 Nov 2017 04:57:42 +0000 (20:57 -0800)]
intel/tools/error: Do ascii85 decode first.

The dashes "---" may occur within an ascii85 block, but only an ascii85
block starts with ':' or '~'.

Ported from Chris Wilson's intel-gpu-tools commit:
bceec7e1d8a160226b783c6344eae8cbf4ece144

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
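
The ordering issue in the commit above can be sketched as a small line
classifier. This is an illustrative Python sketch, not the tool's parser:
classify_line and its return labels are hypothetical, and the only facts taken
from the commit message are that an ascii85 block starts with ':' or '~' and
that "---" may also appear inside such a block.

```python
def classify_line(line):
    """Return 'ascii85', 'section', or 'other' for one error-state line."""
    # Test for the ascii85 prefix first: the payload itself may contain '---',
    # so checking for a section heading first would misclassify it.
    if line.startswith((":", "~")):
        return "ascii85"
    if "---" in line:
        return "section"
    return "other"
```

Swapping the two checks would make an encoded payload that happens to contain
three dashes look like a new section heading.
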
Alexander von Gluck IV [Sun, 12 Nov 2017 21:24:15 +0000 (15:24 -0600)]
c11/haiku: Define missing timespec_get on Haiku

Reviewed-by: Brian Paul <brianp@vmware.com>
Alexander von Gluck IV [Sun, 12 Nov 2017 04:20:03 +0000 (22:20 -0600)]
egl/haiku: Correct invalid void* conversion in calloc

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Mon, 13 Nov 2017 19:16:28 +0000 (11:16 -0800)]
meson: Remove build_by_default from amd code

This is the same logic as the previous two patches.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Mon, 13 Nov 2017 19:14:47 +0000 (11:14 -0800)]
meson: Don't build intel shared components by default

It's a neat idea, and still useful in some cases, but the intel common
code is used by i965 and anvil only, so this is a little clearer.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Fri, 10 Nov 2017 17:17:08 +0000 (09:17 -0800)]
meson: don't use build_by_default for specific gallium drivers

Using build_by_default : false is convenient for dependencies that can
be pulled in by various diverse components of the build system, but the
gallium hardware/software drivers and state trackers do not fit that
description. Instead, these should be guarded using the variable that tracks
whether that driver should be enabled.

This leaves a few helper libraries: trace, rbug, etc, and the generic
winsys bits as `build_by_default : false` because there are a large
number of gallium components that pull them in.

v2: - remove build_by_default from winsys convenience libs as well.
v3: - Always put drivers before winsys for consistency

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
Dave Airlie [Mon, 13 Nov 2017 06:28:35 +0000 (16:28 +1000)]
r600/shader: handle bitfield extract semantics properly.

Fixes:
tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldExtract.shader_test

Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 13 Nov 2017 06:58:35 +0000 (16:58 +1000)]
r600: handle bitfieldInsert corner case.

This handles the bits >= 32 corner case in bitfieldInsert.

Fixes:
tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldInsert.shader_test.

Signed-off-by: Dave Airlie <airlied@redhat.com>
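
To show why bits >= 32 is a corner case for bitfieldInsert, here is a small
Python model of 32-bit arithmetic. It is only a sketch: it assumes a shifter
that uses the low 5 bits of the shift amount, which is common but is an
assumption here, and it does not reproduce what the r600 patch emits. All
function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shl32(value, amount):
    """32-bit left shift using only the low 5 bits of the amount (assumed)."""
    return (value << (amount & 31)) & MASK32


def bitfield_insert_naive(base, insert, offset, bits):
    # mask = ((1 << bits) - 1) << offset; with bits == 32 the shift wraps,
    # the mask becomes 0 and the result is just 'base'.
    mask = shl32((shl32(1, bits) - 1) & MASK32, offset)
    return (base & ~mask & MASK32) | (shl32(insert, offset) & mask)


def bitfield_insert(base, insert, offset, bits):
    # Handle the corner case explicitly: all 32 bits come from 'insert'.
    if bits >= 32:
        return insert & MASK32
    return bitfield_insert_naive(base, insert, offset, bits)
```

For ordinary widths both versions agree; they differ only when the full
32-bit width is requested.
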
Dave Airlie [Mon, 13 Nov 2017 04:10:25 +0000 (14:10 +1000)]
r600: add gs tri strip adjacency fix.

Like
radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation

evergreen hw suffers from the same problem, so rotate the
geometry inputs to fix this.

This fixes:
./bin/glsl-1.50-geometry-primitive-types GL_TRIANGLE_STRIP_ADJACENCY
on evergreen.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 13 Nov 2017 05:40:15 +0000 (15:40 +1000)]
r600: fix isoline tess factor component swapping.

As per radeonsi, the tess factor components for isolines
are reversed.

Fixes: tests/spec/arb_tessellation_shader/execution/isoline.shader_test
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 13 Nov 2017 03:05:25 +0000 (13:05 +1000)]
r600/shader: reserve first register of vertex shader.

r0 as an input to vertex shaders contains things like vertexid, so
we need to reserve it even if we have no inputs.

This fixes a bunch of tessellation piglits.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Sat, 11 Nov 2017 18:30:35 +0000 (10:30 -0800)]
meson: Move -Dvulkan-drivers handling higher in the file

The window-system auto-detection code (specifically for glx) relies on
with_any_vk being available.  This fixes the Vulkan-only build.  Also,
this puts it up near the handling of -Ddri-drivers and -Dgallium-drivers
which seems to make a bit more sense.

Fixes: 118a7f044191d4ab15ac9 "meson: add support for xlib glx"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Jason Ekstrand [Sat, 11 Nov 2017 18:28:54 +0000 (10:28 -0800)]
meson: Stop requiring platforms for Vulkan

It should be perfectly valid to build a completely headless Vulkan
driver.  We don't need to require a platform.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Dave Airlie [Fri, 10 Nov 2017 03:46:19 +0000 (13:46 +1000)]
r600: don't emit atomic save if we have no atomic counters.

Otherwise we end up emitting the fence.

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Adam Jackson [Thu, 9 Nov 2017 21:57:31 +0000 (16:57 -0500)]
glx/dri3: Fix passing renderType into glXCreateContext

Without this, trying to create a GLX_RGBA_FLOAT_TYPE_ARB context would
fail, because GLX_RGBA_TYPE would be a mismatch with the fbconfig.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Adam Jackson [Thu, 9 Nov 2017 21:57:30 +0000 (16:57 -0500)]
glx/drisw: Fix glXMakeCurrent(dpy, None, ctx)

This is perfectly legal in GL 3.0+.

Fixes piglit/glx-create-context-current-no-framebuffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Adam Jackson [Thu, 9 Nov 2017 21:57:29 +0000 (16:57 -0500)]
glx: Lower GLX opcode lookup into SendMakeCurrentRequest

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Jason Ekstrand [Sun, 12 Nov 2017 05:43:46 +0000 (21:43 -0800)]
aubinator: Don't skip the first field in each subgroup

The previous iteration algorithm would advance the field pointer right
after we advance the group.  This meant that you would end up
skipping the first field of the group.  In the common case, where the
only field is a struct (e.g. 3DSTATE_VERTEX_BUFFERS), it would get
skipped entirely.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sun, 12 Nov 2017 23:40:43 +0000 (15:40 -0800)]
intel/genxml: Delete empty groups

They serve no purpose other than to just fill empty space in the packet
so each dword has something.  Just disallowing empty groups is a bit
easier on some of the tools.  This does not change the generated packing
headers in any way.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 8 Nov 2017 22:24:57 +0000 (14:24 -0800)]
anv: Don't crash on invalid heap sizes when the PCI ID is overridden

Alex Smith [Tue, 7 Nov 2017 10:52:48 +0000 (10:52 +0000)]
nir/spirv: tg4 requires a sampler

Gather operations in both GLSL and SPIR-V require a sampler. Fixes
gathers returning garbage when using separate texture/samplers (on AMD,
was using an invalid sampler descriptor).

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alex Smith [Mon, 6 Nov 2017 10:37:05 +0000 (10:37 +0000)]
spirv: Use correct type for sampled images

We should use the result type of the OpSampledImage opcode, rather than
the type of the underlying image/samplers.

This resolves an issue when using separate images and shadow samplers
with glslang. Example:

    layout (...) uniform samplerShadow s0;
    layout (...) uniform texture2D res0;
    ...
    float result = textureLod(sampler2DShadow(res0, s0), uv, 0);

For this, for the combined OpSampledImage, the type of the base image
was being used (which does not have the Depth flag set, whereas the
result type does), therefore it was not being recognised as a shadow
sampler. This led to the wrong LLVM intrinsics being emitted by RADV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alejandro Piñeiro [Fri, 13 Oct 2017 14:17:14 +0000 (16:17 +0200)]
spirv: add DO NOT EDIT warning on generated spirv_info.c

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Thomas Hellstrom [Tue, 19 Sep 2017 17:41:22 +0000 (19:41 +0200)]
loader/dri3: Improve dri3 thread-safety

It turned out that with recent changes that call into dri3 from glFinish(),
different threads can end up waiting for X events simultaneously,
causing deadlocks since they steal events from each other and update the dri3
counters behind each other's backs.

This patch intends to improve on that. It allows at most one thread at a
time to wait on events for a single drawable. If another thread intends to
do the same, it's put to sleep until the first thread finishes waiting, and
then it rechecks counters and optionally retries the waiting. Threads that
poll for X events never pull X events off the event queue if there are
other threads waiting for events on that drawable. Counters in the
dri3 drawable structure are protected by a mutex. Finally, the mutex we
introduce is never held while waiting for the X server to avoid
unnecessary stalls.

This does not make dri3 drawables completely thread-safe but at least it's a
first step.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102358
Fixes: d5ba75f8881 "st/dri2 Plumb the flush_swapbuffer functionality through to dri3"
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
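
The waiting scheme described in the dri3 commit above can be sketched with a
condition variable. This is a Python model under stated assumptions, not the
loader's C code: Drawable, wait_until and the fetch_event callback are
hypothetical, and fetch_event stands in for a blocking wait on the X server
that returns how far a counter advanced.

```python
import threading


class Drawable:
    """Toy drawable: at most one thread at a time blocks waiting for events."""

    def __init__(self, fetch_event):
        self._cond = threading.Condition()  # protects counter and waiter flag
        self._has_waiter = False
        self.counter = 0
        self._fetch_event = fetch_event     # hypothetical blocking call

    def wait_until(self, target):
        with self._cond:
            while self.counter < target:
                if self._has_waiter:
                    # Another thread is already waiting for events: sleep,
                    # then recheck the counter when it wakes us up.
                    self._cond.wait()
                    continue
                self._has_waiter = True
                self._cond.release()        # never hold the lock while blocking
                try:
                    advanced = self._fetch_event()
                finally:
                    self._cond.acquire()
                    self._has_waiter = False
                self.counter += advanced
                self._cond.notify_all()
            return self.counter
```

Because the counter is only read and written with the lock held, and only one
fetch is in flight at a time, threads no longer consume events behind each
other's backs in this model.
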
Juan A. Suarez Romero [Tue, 7 Nov 2017 15:53:43 +0000 (16:53 +0100)]
etnaviv: automake,meson: include common_3d.xml.h in the sources lists

v2: include the file also in the meson.build (Eric Engestrom).

Fixes: f1e1c60ff6 ("etnaviv: Update from rnndb")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tapani Pälli [Tue, 31 Oct 2017 08:57:42 +0000 (10:57 +0200)]
egl: EXT_pixel_format_float plumbing

This patch adds support for the new surface attribute, component type, and
the ability to match configs against it. Currently no configs with a
floating-point type are exposed.

With this change, the following dEQP tests start to pass:

   dEQP-EGL.functional.choose_config.color_component_type_ext.dont_care
   dEQP-EGL.functional.choose_config.color_component_type_ext.fixed
   dEQP-EGL.functional.choose_config.color_component_type_ext.float

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Samuel Pitoiset [Fri, 10 Nov 2017 08:18:05 +0000 (09:18 +0100)]
radv: add unlikely() around radv_save_descriptors()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 10 Nov 2017 08:18:04 +0000 (09:18 +0100)]
radv: optimize calling radv_cmd_buffer_trace_emit()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 10 Nov 2017 08:18:03 +0000 (09:18 +0100)]
radv: optimize calling radv_save_pipeline()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 10 Nov 2017 08:18:02 +0000 (09:18 +0100)]
radv: use vk_zalloc instead of vk_alloc+memset

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 10 Nov 2017 08:18:01 +0000 (09:18 +0100)]
radv: remove unnecessary memset() in radv_AllocateCommandBuffers()

This should not be needed: if the allocation fails, an error is
returned and the host should handle it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 10 Nov 2017 08:18:00 +0000 (09:18 +0100)]
radv: remove useless initializations in radv_create_cmd_buffer()

There is a memset() above.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 10 Nov 2017 08:17:59 +0000 (09:17 +0100)]
radv: remove useless memset() in radv_CreateFence()

All radv_fence fields are initialized here.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 10 Nov 2017 08:17:58 +0000 (09:17 +0100)]
radv: use vk_error() everywhere an error is returned

For consistency; it might also help with debugging.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 8 Nov 2017 11:52:32 +0000 (12:52 +0100)]
radv: make radv_emit_framebuffer_state() static

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 8 Nov 2017 11:52:31 +0000 (12:52 +0100)]
radv: do not emit the framebuffer when restoring a pass

Instead just dirty RADV_CMD_DIRTY_FRAMEBUFFER and it will be
re-emitted if necessary before the next draw.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
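
The dirty-flag approach in the commit above can be sketched as follows. This
is a toy Python model, not RADV code: CmdBuffer, restore_pass and draw are
hypothetical names, and the string "framebuffer" stands in for the
RADV_CMD_DIRTY_FRAMEBUFFER bit.

```python
class CmdBuffer:
    """Toy command buffer with lazily emitted framebuffer state."""

    def __init__(self):
        self.dirty = set()
        self.emitted = []  # record of what would be written to the command stream

    def restore_pass(self):
        # Do not emit here; only remember that the state must be re-emitted.
        self.dirty.add("framebuffer")

    def draw(self):
        # Emit the framebuffer state only if something marked it dirty.
        if "framebuffer" in self.dirty:
            self.emitted.append("framebuffer")
            self.dirty.discard("framebuffer")
        self.emitted.append("draw")
```

Restoring a pass several times before the next draw then costs one emission
instead of one per restore, and none at all if no draw follows.
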
Samuel Pitoiset [Wed, 8 Nov 2017 11:12:31 +0000 (12:12 +0100)]
radv: prefetch VBO descriptors at the right place

Just after the vertex shader.

This seems to give a minor boost for, at least, Serious Sam
Fusion 2017 and Dawn of War 3. I don't see any real impact
with The Talos Principle.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 8 Nov 2017 11:12:30 +0000 (12:12 +0100)]
radv: add radv_emit_prefetch_TC_L2_async() helper

Will be used for VBO descriptors prefetching.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 8 Nov 2017 11:12:29 +0000 (12:12 +0100)]
radv: rename radv_emit_shaders_prefetch() to radv_emit_prefetch()

For consistency because this function will also prefetch VBO
descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Iago Toral Quiroga [Fri, 3 Nov 2017 09:46:30 +0000 (10:46 +0100)]
glsl/linker: use without_array() to retrieve type

This is what we do in the condition too, so it makes sense.

v2: Only compute without_array() once (Ilia).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Dave Airlie [Mon, 6 Nov 2017 02:03:43 +0000 (02:03 +0000)]
radv: emit esgs ring size in one place.

This register is the same on all gpus so far, so emit it in one
place and, for the pre-gfx9 gpus, also set the value at pipeline
creation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 6 Nov 2017 02:00:34 +0000 (02:00 +0000)]
radv: move calculating vs out info regs into pipeline.

This moves some calculations of register values into pipeline
construction, which saves looking at outinfo during cmd buffer emit.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Rob Clark [Sat, 11 Nov 2017 15:50:20 +0000 (10:50 -0500)]
freedreno/a5xx: fix SSBO emit for non-zero offset

Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 11 Nov 2017 14:55:00 +0000 (09:55 -0500)]
freedreno/a5xx: remove obsolete comment

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: don't create split/fo if only writing .x
Rob Clark [Fri, 10 Nov 2017 17:54:49 +0000 (12:54 -0500)]
freedreno/ir3: don't create split/fo if only writing .x

In case an instruction only writes one register, and it is .x, we can
skip the extra level of fanout indirection.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/a5xx: indirect grids
Rob Clark [Fri, 10 Nov 2017 17:53:13 +0000 (12:53 -0500)]
freedreno/a5xx: indirect grids

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/a5xx: add global size compute cap
Rob Clark [Thu, 9 Nov 2017 21:57:05 +0000 (16:57 -0500)]
freedreno/a5xx: add global size compute cap

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: turn on std430 packing
Rob Clark [Sun, 5 Nov 2017 14:15:08 +0000 (09:15 -0500)]
freedreno/ir3: turn on std430 packing

Seems to fix dEQP compute-related tests, and matches what i965 does, so
perhaps there is some assumption that std430 packing is on by default
somewhere in NIR?

7 years agofreedreno/a5xx: image support
Rob Clark [Sat, 4 Nov 2017 16:52:43 +0000 (12:52 -0400)]
freedreno/a5xx: image support

7 years agofreedreno/ir3: moar better scheduler
Rob Clark [Tue, 7 Nov 2017 20:12:03 +0000 (15:12 -0500)]
freedreno/ir3: moar better scheduler

Add a new pass that inserts additional dependencies, rather than simply
relying on SSA srcs added in the nir->ir3 frontend.  This makes it
easier to deal with barriers, but the additional false deps also let us
deal properly with ensuring a write depends on all previous reads.

Since conversion to barrier instructions is lossy (ie. just knowing the
instruction doesn't tell us enough about what other instructions the
barrier applies to), use barrier_class/barrier_conflict fields in the
ir3_instruction to retain this information.

This could probably be relaxed somewhat by considering *which* array/
buffer/image variable is being referenced.  Ie. a write to buffer A
can overtake a read from buffer B, if B is not coherent.  (right?)
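
The dependency pass described above can be sketched roughly as follows.
This is a minimal illustration and not the actual ir3 code: the struct
layout, the BAR_* bits, MAX_DEPS and add_false_deps() are all
hypothetical names. Only the idea of a barrier_class/barrier_conflict
pair per instruction comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical barrier classes, one bit per kind of memory access. */
enum barrier_class {
    BAR_BUFFER_R = 1 << 0,
    BAR_BUFFER_W = 1 << 1,
    BAR_IMAGE_R  = 1 << 2,
    BAR_IMAGE_W  = 1 << 3,
};

#define MAX_DEPS 8 /* fixed size keeps the sketch short */

/* Hypothetical, stripped-down stand-in for ir3_instruction. */
struct instr {
    unsigned barrier_class;    /* what kind of access this instruction is */
    unsigned barrier_conflict; /* classes it must not be reordered against */
    const struct instr *deps[MAX_DEPS]; /* false dependencies */
    unsigned deps_count;
};

/* Two instructions conflict if either one's class is in the other's
 * conflict mask. */
static bool conflicts(const struct instr *a, const struct instr *b)
{
    return (a->barrier_class & b->barrier_conflict) ||
           (b->barrier_class & a->barrier_conflict);
}

/* Give every instruction a false dep on each earlier instruction it
 * conflicts with, so a scheduler cannot move it ahead of them. */
static void add_false_deps(struct instr *instrs, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        for (size_t j = 0; j < i; j++) {
            if (conflicts(&instrs[i], &instrs[j])) {
                assert(instrs[i].deps_count < MAX_DEPS);
                instrs[i].deps[instrs[i].deps_count++] = &instrs[j];
            }
        }
    }
}
```

With a buffer read followed by a buffer write, the write picks up a
false dep on the read (the write-after-read case), while an unrelated
image read in between is left free to move.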

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: move macros
Rob Clark [Thu, 9 Nov 2017 19:36:06 +0000 (14:36 -0500)]
freedreno/ir3: move macros

I want to add a growable array to ir3_instruction, so we can append
false dependencies for purposes of scheduling barriers, atomics, and
dealing with write after read hazards.
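
A growable array of this kind might look like the sketch below. It is
an assumption-laden illustration rather than the macros being moved:
struct node, node_add_dep() and the doubling policy are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical instruction carrying a growable list of false deps. */
struct node {
    struct node **deps;   /* heap-allocated array, NULL while empty */
    unsigned deps_count;  /* entries in use */
    unsigned deps_max;    /* entries allocated */
};

/* Append dep to node's list, doubling the allocation when it is full.
 * Returns 0 on success and -1 if the allocation fails. */
static int node_add_dep(struct node *node, struct node *dep)
{
    if (node->deps_count == node->deps_max) {
        unsigned new_max = node->deps_max ? node->deps_max * 2 : 4;
        struct node **grown = realloc(node->deps, new_max * sizeof(*grown));

        if (!grown)
            return -1; /* the old array is still valid and unchanged */

        node->deps = grown;
        node->deps_max = new_max;
    }

    node->deps[node->deps_count++] = dep;
    return 0;
}
```

The caller owns the array and releases it with free(node->deps).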

Just code motion preparing for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: image support
Rob Clark [Thu, 9 Nov 2017 15:57:55 +0000 (10:57 -0500)]
freedreno/ir3: image support

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: shared variable support
Rob Clark [Thu, 9 Nov 2017 15:56:43 +0000 (10:56 -0500)]
freedreno/ir3: shared variable support

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: some SSBO cleanups/fixes
Rob Clark [Thu, 9 Nov 2017 15:48:52 +0000 (10:48 -0500)]
freedreno/ir3: some SSBO cleanups/fixes

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: split out INSTR4F instructions
Rob Clark [Wed, 8 Nov 2017 23:08:16 +0000 (18:08 -0500)]
freedreno/ir3: split out INSTR4F instructions

Atomic instructions take a different number of src args depending on
the .g or .l variant, so split these out into different helpers with an
INSTR*F() helper macro that lets you specify the instruction flag.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: cat6 encoding fixes
Rob Clark [Wed, 8 Nov 2017 22:51:40 +0000 (17:51 -0500)]
freedreno/ir3: cat6 encoding fixes

Instruction encoding/decoding fixes needed for images, shared variables,
etc.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: add barriers
Rob Clark [Tue, 31 Oct 2017 15:23:15 +0000 (11:23 -0400)]
freedreno/ir3: add barriers

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: invert is_same_type_mov() logic
Rob Clark [Tue, 31 Oct 2017 16:21:51 +0000 (12:21 -0400)]
freedreno/ir3: invert is_same_type_mov() logic

Some instructions (like barriers) have no dst, which causes problems
with dereferencing a NULL dst.  Flip the logic around to reject opc's
that can't be a type of move first, to filter out those instructions.
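
The reordering can be illustrated with the sketch below. The opcodes,
types and the "relative" flag are hypothetical stand-ins, not the real
ir3 definitions. The point is only that the opcode check runs before
anything dereferences dst.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes and types, for illustration only. */
enum opc  { OPC_MOV, OPC_ABSNEG_F, OPC_ADD_F, OPC_BAR };
enum type { TYPE_F32, TYPE_U32 };

struct reg {
    bool relative; /* hypothetical flag that disqualifies a plain move */
};

struct instr {
    enum opc opc;
    enum type src_type, dst_type;
    struct reg *dst; /* NULL for instructions such as barriers */
};

static bool is_same_type_mov(const struct instr *instr)
{
    /* Reject opcodes that cannot be a move first, so instructions
     * without a dst never reach the dereference below. */
    switch (instr->opc) {
    case OPC_MOV:
        if (instr->src_type != instr->dst_type)
            return false; /* a converting mov is not same-type */
        break;
    case OPC_ABSNEG_F:
        break;
    default:
        return false;
    }

    /* Only move-like opcodes get here, and those have a dst. */
    return !instr->dst->relative;
}
```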

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: add cat7 instructions
Rob Clark [Mon, 30 Oct 2017 23:24:59 +0000 (19:24 -0400)]
freedreno/ir3: add cat7 instructions

Needed for memory and execution barriers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: add SSBO get_buffer_size() support
Rob Clark [Mon, 30 Oct 2017 17:23:37 +0000 (13:23 -0400)]
freedreno/ir3: add SSBO get_buffer_size() support

Somehow I overlooked this when adding initial SSBO support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: extract helper for common consts
Rob Clark [Mon, 30 Oct 2017 17:20:17 +0000 (13:20 -0400)]
freedreno/ir3: extract helper for common consts

User consts and driver consts such as UBO addresses and immediates are
handled the same for all shader stages, so split out a shared helper for
these, to make it easier to add more.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno: add image view state tracking
Rob Clark [Sat, 4 Nov 2017 15:14:09 +0000 (11:14 -0400)]
freedreno: add image view state tracking

It is unfortunate that image state isn't a real CSO, since (at least for
a4xx/a5xx) it is a combination of sampler and "SSBO" image state, and it
would be useful to pre-compute the state block "register" values rather
than doing it at emit time.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno: update generated headers
Rob Clark [Tue, 31 Oct 2017 13:15:08 +0000 (09:15 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agomesa/st/nir: assign driver_location for images
Rob Clark [Fri, 3 Nov 2017 16:47:51 +0000 (12:47 -0400)]
mesa/st/nir: assign driver_location for images

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agost/program: fix compute shader nir references
Rob Clark [Mon, 30 Oct 2017 13:56:43 +0000 (09:56 -0400)]
st/program: fix compute shader nir references

In case the IR is NIR, the driver takes a reference to the nir_shader.
Also, because there are no variants, we need to clone the shader,
instead of sharing the reference with gl_program, which would result
in a double free in _mesa_delete_program().
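
The ownership problem can be sketched as below, using a hypothetical
struct shader in place of nir_shader and hypothetical create/clone/
destroy helpers. Each owner gets its own copy, so each can free its
copy exactly once.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for nir_shader. */
struct shader {
    size_t size;
    unsigned char *code;
};

/* Allocate a shader holding a private copy of code. */
static struct shader *shader_create(const unsigned char *code, size_t size)
{
    struct shader *s = malloc(sizeof(*s));

    if (!s)
        return NULL;

    s->code = malloc(size);
    if (!s->code) {
        free(s);
        return NULL;
    }

    memcpy(s->code, code, size);
    s->size = size;
    return s;
}

/* Deep copy: the clone shares no memory with src. */
static struct shader *shader_clone(const struct shader *src)
{
    return shader_create(src->code, src->size);
}

static void shader_destroy(struct shader *s)
{
    if (!s)
        return;
    free(s->code);
    free(s);
}
```

Had the driver been handed the same pointer that the program object
keeps, both destroy paths would free the same memory.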

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agofreedreno/ir3: rename ir3_compile -> ir3_context
Rob Clark [Fri, 10 Nov 2017 14:04:52 +0000 (09:04 -0500)]
freedreno/ir3: rename ir3_compile -> ir3_context

Having both an ir3_compile (which was really context for compiling a
single shader variant) and ir3_compiler (which is the compiler object
that compiles all variants, ie. basically holds the RA regset) is a
bit confusing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agointel/tools: Fix detection of enabled shader stages.
Kenneth Graunke [Fri, 10 Nov 2017 23:36:22 +0000 (15:36 -0800)]
intel/tools: Fix detection of enabled shader stages.

We renamed "Function Enable" to "Enable", which broke our detection
of whether shaders are enabled or not.  So, we'd see a bunch of HS/DS
packets with program offsets of 0, and think that was a valid TCS/TES.

Fixes: c032cae9ff77e (genxml: Rename "Function Enable" to "Enable".)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agost/atifs: remove unrequired initialisation of gl_program fields
Timothy Arceri [Fri, 10 Nov 2017 08:49:29 +0000 (19:49 +1100)]
st/atifs: remove unrequired initialisation of gl_program fields

As far as I can tell, these fields are only used to query ARB
program info and are not related to ATI_fragment_shader.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Miklós Máté <mtmkls@gmail.com>
7 years agoac: add emit_vertex to the abi
Timothy Arceri [Mon, 6 Nov 2017 06:45:34 +0000 (17:45 +1100)]
ac: add emit_vertex to the abi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: rework gs_vtx_offset handling
Timothy Arceri [Thu, 9 Nov 2017 03:43:34 +0000 (14:43 +1100)]
radeonsi: rework gs_vtx_offset handling

This simplifies things a bit and will enable it to work with the
common NIR -> LLVM code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agonir: add streams to nir data
Timothy Arceri [Tue, 7 Nov 2017 02:56:08 +0000 (13:56 +1100)]
nir: add streams to nir data

This will be used by gallium drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/dri: fix deadlock when waiting on android fences
Marek Olšák [Fri, 10 Nov 2017 18:08:50 +0000 (19:08 +0100)]
st/dri: fix deadlock when waiting on android fences

Android fences can't be deferred, because st/dri calls fence_finish
with ctx = NULL, so the driver can't flush u_threaded_context.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agomeson: Guard freedreno build with with_gallium_freedreno.
Rob Clark [Sat, 11 Nov 2017 01:09:01 +0000 (17:09 -0800)]
meson: Guard freedreno build with with_gallium_freedreno.

This prevents build failures when libdrm_freedreno is unavailable,
which started happening after the ir3_compiler build was enabled.

(Patch by Rob, commit message by Ken).

Fixes: fecd04a66ae ("freedreno/ir3: fix standalone compiler meson build")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agodocs: update calendar, add news item and link release notes for 17.2.5
Andres Gomez [Fri, 10 Nov 2017 13:40:06 +0000 (15:40 +0200)]
docs: update calendar, add news item and link release notes for 17.2.5

Signed-off-by: Andres Gomez <agomez@igalia.com>
7 years agodocs: add sha256 checksums for 17.2.5
Andres Gomez [Fri, 10 Nov 2017 23:23:24 +0000 (01:23 +0200)]
docs: add sha256 checksums for 17.2.5

Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 96ad27f8fcf3979c577c052f725e2a80035295aa)