mesa.git
7 years agoautotools: Set C++ visibility flags on Intel
Dylan Baker [Thu, 9 Nov 2017 21:49:52 +0000 (13:49 -0800)]
autotools: Set C++ visibility flags on Intel

These flags are set for C sources, but not C++. This causes symbol
visibility leaks from the C++ parts of the Intel compiler.

Fixes: 700bebb958e93f4d ("i965: Move the back-end compiler to src/intel/compiler")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agodocs/releasing: improve the pre-announce template and examples
Andres Gomez [Thu, 9 Nov 2017 14:58:23 +0000 (16:58 +0200)]
docs/releasing: improve the pre-announce template and examples

v2: Choose a proper rejection example (Emil).

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
7 years agodocs/releasing: drop manually exported variables during smoke test
Andres Gomez [Thu, 9 Nov 2017 14:58:22 +0000 (16:58 +0200)]
docs/releasing: drop manually exported variables during smoke test

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
7 years agodocs/releasing: drop custom LLVM_CONFIG if previously manually set
Andres Gomez [Thu, 9 Nov 2017 14:58:21 +0000 (16:58 +0200)]
docs/releasing: drop custom LLVM_CONFIG if previously manually set

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
7 years agost/dri: fix android fence regression
Marek Olšák [Thu, 9 Nov 2017 19:12:07 +0000 (20:12 +0100)]
st/dri: fix android fence regression

Fixes piglit - egl_khr_fence_sync/android_native tests.
Broken by 884a0b2a9e55d4c1ca39475b50d9af598d7d7280.

Introduce state-tracker flush flags, analogous to the pipe ones. Use
the former when with stapi->flush().

Fixes: 884a0b2a9e5 ("st/dri: use stapi flush instead of pipe flush
when creating fences")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoutil/u_thread: fix compilation on Mac OS
Nicolai Hähnle [Fri, 10 Nov 2017 11:36:16 +0000 (12:36 +0100)]
util/u_thread: fix compilation on Mac OS

Apparently, it doesn't have pthread barriers.

p_config.h (which was originally used to guard this code) uses the
__APPLE__ macro to detect Mac OS.

Fixes: f0d3a4de75 ("util: move pipe_barrier into src/util and rename to util_barrier")
Cc: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoutil/u_queue: handle OS_TIMEOUT_INFINITE in util_queue_fence_wait_timeout
Nicolai Hähnle [Fri, 10 Nov 2017 09:40:41 +0000 (10:40 +0100)]
util/u_queue: handle OS_TIMEOUT_INFINITE in util_queue_fence_wait_timeout

Fixes e.g. piglit/bin/bufferstorage-persistent read -auto

Fixes: e6dbc804a87a ("winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_threaded: fix end_query regression
Nicolai Hähnle [Fri, 10 Nov 2017 08:59:08 +0000 (09:59 +0100)]
gallium/u_threaded: fix end_query regression

Ouch...

Fixes: 244536d3d6b4 ("gallium/u_threaded: avoid syncs for get_query_result")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103653
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoswr: Fixed an uncommon freed-memory access during state validation
Bruce Cherniak [Thu, 9 Nov 2017 00:39:37 +0000 (18:39 -0600)]
swr: Fixed an uncommon freed-memory access during state validation

State validation is performed during clear and draw calls.  Validation
during clear was still accessing vertex buffer state.  When the currently
set vertex buffers are client arrays, this could lead to accessing freed
memory.  Such is the case with the VMD application.

Previously, vertex buffer validation depended on a dirty bit or the
draw info indicating an indexed draw.  This required special handling for
clears.  But, vertex buffer validation still occurred which was unnecessary
and wrong.

Now, only minimal validation is performed during clear, deferring the
remainder to the next draw.  And, by setting the dirty bit in swr_draw_vbo
for indexed draws, vertex buffer validation is only dependent upon a
single dirty bit.

This fixes a bug exposed by the VMD application when changing models.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
7 years agofreedreno/ir3: fix standalone compiler meson build
Rob Clark [Tue, 31 Oct 2017 15:24:06 +0000 (11:24 -0400)]
freedreno/ir3: fix standalone compiler meson build

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: correct # of dest components for intrinsics
Rob Clark [Thu, 9 Nov 2017 15:50:44 +0000 (10:50 -0500)]
freedreno/ir3: correct # of dest components for intrinsics

Don't rely on intr->num_components having a valid value.  It doesn't
seem to anymore for non-vectorized intrinsics.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agofreedreno/ir3: remove bogus assert
Rob Clark [Tue, 31 Oct 2017 16:34:23 +0000 (12:34 -0400)]
freedreno/ir3: remove bogus assert

The ssbo atomic instructions are not vectorized.  So num_components is
not expected to be valid.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agonir: handle get_buffer_size in nir_lower_atomics_to_ssbo
Rob Clark [Sun, 5 Nov 2017 18:53:50 +0000 (13:53 -0500)]
nir: handle get_buffer_size in nir_lower_atomics_to_ssbo

Overlooked initially, be we need to remap the SSBO index for this as
well.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoanv/meson: Generate dev_icd.json
Chad Versace [Wed, 1 Nov 2017 20:47:55 +0000 (13:47 -0700)]
anv/meson: Generate dev_icd.json

I tested this in a setup where the builddir was outside of the srcdir.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
7 years agoanv: Fix architecture in intel_icd.{arch}.json
Chad Versace [Thu, 9 Nov 2017 23:02:13 +0000 (15:02 -0800)]
anv: Fix architecture in intel_icd.{arch}.json

Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.

See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
7 years agoradv: Fix architecture in radeon_icd.{arch}.json
Chad Versace [Thu, 9 Nov 2017 22:55:37 +0000 (14:55 -0800)]
radv: Fix architecture in radeon_icd.{arch}.json

Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.

See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoanv: Refactor anv_GetImageSubresourceLayout()
Chad Versace [Tue, 7 Nov 2017 03:35:43 +0000 (19:35 -0800)]
anv: Refactor anv_GetImageSubresourceLayout()

Its helper function, anv_surface_get_subresource_layout(), was not very
helpful. So fold it into the main function.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv/image: Refactor choice of isl_tiling_flags_t
Chad Versace [Tue, 7 Nov 2017 02:51:53 +0000 (18:51 -0800)]
anv/image: Refactor choice of isl_tiling_flags_t

Instead of choosing the tiling flags inside make_surface(), which is
called once per aspect in a loop, and which chooses the same tiling for
each aspect, choose the tiling flags exactly once before entering the
aspect loop.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Refactor anv_get_format_plane() - explicit unsupported
Chad Versace [Fri, 3 Nov 2017 23:42:16 +0000 (16:42 -0700)]
anv: Refactor anv_get_format_plane() - explicit unsupported

The same local variable, 'plane_format', was returned on success *and*
failure. Be more explicit in distinguishing the two cases: return
'plane_format' on success and return 'unsupported' on failure.

This simplifies the diff in upcoming patches for
VK_EXT_image_drm_format_modifier.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Remove anv_physical_device_get_format_properties()
Chad Versace [Fri, 3 Nov 2017 23:35:12 +0000 (16:35 -0700)]
anv: Remove anv_physical_device_get_format_properties()

Fold its body into its sole caller,
anv_GetPhysicalDeviceFormatProperties().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Simplify anv_physical_device_get_format_properties()
Chad Versace [Fri, 3 Nov 2017 19:44:34 +0000 (12:44 -0700)]
anv: Simplify anv_physical_device_get_format_properties()

Now that get_image_format_properties() returns the correct
VkFormatFeatureFlags, we can remove the unneeded if-branch and some
local variables.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Simplify anv_get_image_format_properties()
Chad Versace [Fri, 3 Nov 2017 23:26:51 +0000 (16:26 -0700)]
anv: Simplify anv_get_image_format_properties()

Now that get_image_format_features() has a VkImageTiling parameter, we
can bypass anv_physical_device_get_format_properties() and call
get_image_format_features() directly.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Rename get_image_format_properties()
Chad Versace [Fri, 3 Nov 2017 21:50:00 +0000 (14:50 -0700)]
anv: Rename get_image_format_properties()

The name is misleading. It looks like vkGetPhysicalDeviceImageFormatProperties(),
but it actually implement vkGetPhysicalDeviceFormatProperties. Let's
rename it to what it actually does, get_image_format_features(), because it
returns VkFormatFeatureFlags.

For consistency, also rename get_buffer_format_properties() to
get_buffer_format_features().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Fix get_image_format_properties() - YCbCr
Chad Versace [Fri, 3 Nov 2017 19:32:00 +0000 (12:32 -0700)]
anv: Fix get_image_format_properties() - YCbCr

Teach it to calculate the format features for YCbCr.

The goal (which is completed in this patch) is to incrementally fix
get_image_format_properties() to return a correct result.  Previously,
it returned incorrect VkFormatFeatureFlags which the caller needed clean
up.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Fix get_image_format_properties() - 3-channel formats
Chad Versace [Fri, 3 Nov 2017 19:25:51 +0000 (12:25 -0700)]
anv: Fix get_image_format_properties() - 3-channel formats

Teach it to calculate the format features for 3-channel formats.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Refactor get_image_format_properties() - Reduce params
Chad Versace [Fri, 3 Nov 2017 19:20:21 +0000 (12:20 -0700)]
anv: Refactor get_image_format_properties() - Reduce params

Replace parameters 'enum isl_format' and 'struct anv_format_plane' with
new parameter 'const struct anv_format *'.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Refactor get_image_format_properties() - base_isl_format
Chad Versace [Fri, 3 Nov 2017 19:19:36 +0000 (12:19 -0700)]
anv: Refactor get_image_format_properties() - base_isl_format

Rename parameter 'base' to 'base_isl_format'.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Refactor get_image_format_properties() - plane_format
Chad Versace [Fri, 3 Nov 2017 01:28:02 +0000 (18:28 -0700)]
anv: Refactor get_image_format_properties() - plane_format

Rename parameter 'format' to 'plane_format'.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Refactor get_image_format_properties() - ASTC
Chad Versace [Fri, 3 Nov 2017 01:26:23 +0000 (18:26 -0700)]
anv: Refactor get_image_format_properties() - ASTC

Teach it to calculate the format features for ASTC.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

v2: New commit message

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Refactor get_image_format_properties() - depthstencil (v2)
Chad Versace [Fri, 3 Nov 2017 01:24:08 +0000 (18:24 -0700)]
anv: Refactor get_image_format_properties() - depthstencil (v2)

Teach it to calculate the features of depthstencil formats.

The goal is to incrementally fix get_image_format_properties() to return
a correct result.  Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.

v2: New commit message

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Better types for 'aspect' function params
Chad Versace [Fri, 3 Nov 2017 00:26:51 +0000 (17:26 -0700)]
anv: Better types for 'aspect' function params

Some functions have a comment that says "Exactly one bit must be in
'aspect'". So change the type of their 'aspect' parameter from
VkImageAspectFlags to VkImageAspectFlagBits.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Refactor get_buffer_format_properties()
Chad Versace [Thu, 2 Nov 2017 23:55:55 +0000 (16:55 -0700)]
anv: Refactor get_buffer_format_properties()

Make it a stand-alone function. Pre-patch, for some formats the function
returned incorrect VkFormatFeatureFlags which were cleaned up by the
caller.

This prepares for a cleaner implementation of
VK_EXT_image_drm_format_modifier.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agobroadcom/vc4: Fix simulator mode for the MADVISE usage.
Eric Anholt [Thu, 9 Nov 2017 23:51:56 +0000 (15:51 -0800)]
broadcom/vc4: Fix simulator mode for the MADVISE usage.

7 years agomesa: enable ARB_texture_buffer_* extensions in the Compatibility profile
Marek Olšák [Thu, 19 Oct 2017 20:22:15 +0000 (22:22 +0200)]
mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile

We already have piglit tests testing alpha, luminance, and intensity
formats. They were skipped by piglit until now.

Additionally, I'm enabling one ARB_texture_buffer_range piglit test to run
with the compat profile.

i965 behavior is unchanged except that it doesn't expose TBOs in the Compat
profile. Not sure how that affects the GL version override.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years agodocs: update r600 atomic counter status.
Dave Airlie [Thu, 2 Nov 2017 01:04:53 +0000 (11:04 +1000)]
docs: update r600 atomic counter status.

Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agor600: add support for hw atomic counters. (v3)
Dave Airlie [Thu, 2 Nov 2017 00:26:51 +0000 (10:26 +1000)]
r600: add support for hw atomic counters. (v3)

This adds support for the evergreen/cayman atomic counters.

These are implemented using GDS append/consume counters. The values
for each counter are loaded before drawing and saved after each draw
using special CP packets.

v2: move hw atomic assignment into driver.
v3: fix messing up caps (Gert Wollny), only store ranges in driver,
drop buffers.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
7 years agost/mesa: add support for hw atomics to glsl->tgsi. (v5)
Dave Airlie [Wed, 1 Nov 2017 19:51:36 +0000 (05:51 +1000)]
st/mesa: add support for hw atomics to glsl->tgsi. (v5)

This adds support for creating the hw atomic tgsi from
the glsl codepaths.

v2: drop the atomic index and move to backend.
v3: drop buffer decls. (Marek)
v4: fix off by one (Gert)
v5: fix off by one the other way (Dave)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agost/mesa: setup hw atomic limits. (v1.1)
Dave Airlie [Wed, 1 Nov 2017 19:49:40 +0000 (05:49 +1000)]
st/mesa: setup hw atomic limits. (v1.1)

HW atomics need to use caps to set some limits, and some
other limits may also need limiting.

This fixes things up to work for evergreen hw, it may need
more changes in the future if other hw wants to use this path.

v1.1: fix indent.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agost/mesa: start adding support for hw atomics atom. (v2)
Dave Airlie [Wed, 1 Nov 2017 04:30:13 +0000 (14:30 +1000)]
st/mesa: start adding support for hw atomics atom. (v2)

This adds a new atom that calls the new driver API to
bind buffers containing hw atomics.

v2: fixup bindings for sparse buffers. (mareko/nha)
don't bind buffer atomics when hw atomics are enabled.
use NewAtomicBuffer (mareko)

Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agomesa/program: add hw atomic counter file
Dave Airlie [Wed, 1 Nov 2017 04:22:04 +0000 (14:22 +1000)]
mesa/program: add hw atomic counter file

This is needed for the GLSL->TGSI translation for hw atomic counters.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agogallium: add hw atomic buffer binding API.
Dave Airlie [Wed, 1 Nov 2017 04:17:49 +0000 (14:17 +1000)]
gallium: add hw atomic buffer binding API.

This API binds atomic buffers for all bound shaders (as per the
GL semantics).

This is needed to support cross shader hw atomic counters.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agogallium/tgsi: start adding hw atomics (v3.2)
Dave Airlie [Wed, 1 Nov 2017 04:05:19 +0000 (14:05 +1000)]
gallium/tgsi: start adding hw atomics (v3.2)

This adds support for a hw atomic counters to TGSI.

A new register file for storing atomic counters is added,
along with a new atomic counter semantic, along with docs
for both.

v2: drop semantic, move hw counter to backend,
Ilia pointed out SSO would have busted my plan, and he
was right.
v3: drop BUFFER decls. (Marek)
v3.1: minor fixups for whitespace, set ureg error
if we overflow the hw atomic limits. (nha)
v3.2: fix some docs inconsistencies (Ilia)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agogallium: add CAPs to support HW atomic counters. (v3)
Dave Airlie [Tue, 8 Aug 2017 03:13:03 +0000 (13:13 +1000)]
gallium: add CAPs to support HW atomic counters. (v3)

This looks like an evergreen specific feature, but with atomic
counters AMD have hw specific counters they use instead of operating
on buffers directly. These are separate to the buffer atomics,
so require different limits and code paths.

I've left the CAP for atomic type extensible in case someone
else has a variant on this sort of thing (freedreno maybe?)
and needs to change it.

This adds all the CAPs required to add support for those atomic
counters, along with a related CAP for limiting the number of
output resources.

I'd like to land this and the st patch then I can start to
upstream the evergreen support for these and other GL4.x features.

v2: drop the ATOMIC_COUNTER_MODE cap, just use the return
from the HW counters. If 0 we use the current mode.
v3: fix some rebase errors (Gert Wollny)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agor600/query: drop rest of vi workaround code.
Dave Airlie [Thu, 9 Nov 2017 05:48:18 +0000 (15:48 +1000)]
r600/query: drop rest of vi workaround code.

This isn't needed in r600 anymore.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agodocs: Fix GL_MESA_program_debug enums
Roland Scheidegger [Tue, 7 Nov 2017 00:43:51 +0000 (01:43 +0100)]
docs: Fix GL_MESA_program_debug enums

13b303ff9265b89bdd9100e32f905e9cdadfad81 added the actual enums but
didn't remove the already existing XXXX ones. (And also duplicated
the "fragment" names instead of using the "vertex" names.)

Fixes: 13b303ff9265b89bdd91 "docs: Update the list of used MESA GL enums."
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agost/mesa: remove 'struct' keyword on function parameter
Brian Paul [Thu, 9 Nov 2017 17:48:42 +0000 (10:48 -0700)]
st/mesa: remove 'struct' keyword on function parameter

st_src_reg is a class, not a struct.  Simply remove 'struct' to silence
a MSVC compiler warning (class vs. struct mismatch).

Reviewed-by; Charmaine Lee <charmainel@vmware.com>

7 years agothreads: fix MinGW build breakage
Brian Paul [Thu, 9 Nov 2017 16:43:37 +0000 (09:43 -0700)]
threads: fix MinGW build breakage

Fixes: f1a364878431c8 ("threads: update for late C11 changes")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
7 years agomesa: s/GLint/gl_buffer_index/ for _ColorDrawBufferIndexes
Brian Paul [Thu, 9 Nov 2017 05:18:40 +0000 (22:18 -0700)]
mesa: s/GLint/gl_buffer_index/ for _ColorDrawBufferIndexes

Also fix local variable declarations and replace -1 with BUFFER_NONE.
No Piglit changes.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
7 years agomesa: s/GLint/gl_buffer_index/ for _ColorReadBufferIndex
Brian Paul [Thu, 9 Nov 2017 05:09:59 +0000 (22:09 -0700)]
mesa: s/GLint/gl_buffer_index/ for _ColorReadBufferIndex

BUFFER_NONE is -1 so no reason for GLint.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
7 years agomesa: minor reformatting, add const to gl_external_samplers()
Brian Paul [Thu, 9 Nov 2017 05:07:08 +0000 (22:07 -0700)]
mesa: minor reformatting, add const to gl_external_samplers()

This function should probably be moved elsewhere, too.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
7 years agost/mesa: whitespace clean-up in st_mesa_to_tgsi.c
Brian Paul [Mon, 6 Nov 2017 16:28:02 +0000 (09:28 -0700)]
st/mesa: whitespace clean-up in st_mesa_to_tgsi.c

Remove trailing whitespace, fix indentation, wrap lines to 78 columns, etc.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
7 years agomeson: implement default driver arguments
Dylan Baker [Mon, 30 Oct 2017 17:17:22 +0000 (10:17 -0700)]
meson: implement default driver arguments

This allows drivers to be set by OS/arch in a sane manner.

v2: - set _drivers to a list of drivers instead of manually assigning
      each with_*
v3: - Use "auto" instead of "default", which matches the value of other
      automatically configured options.
    - Set vulkan drivers as well
    - Add error message if no automatic drivers are known for a given
      arch/OS combo
    - use not(darwin or windows) instead of (linux or *bsd), which is
      probably more accurate (that way Solaris and other *nix systems
      aren't excluded)
    - rename softpipe to swrast, as swrast is the actual option name

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
7 years agoi965: Pretend there are 4 subslices for compute shader threads on Gen9+.
Kenneth Graunke [Wed, 8 Nov 2017 18:56:00 +0000 (10:56 -0800)]
i965: Pretend there are 4 subslices for compute shader threads on Gen9+.

Similar to what we did for pixel shader threads - see gen_device_info.c.

We don't want to bump the actual Maximum Number of Threads though, so
we adjust it here.  For pixel shaders, we don't use max_wm_threads, so
we could just bump it globally.

Supposedly fixes Piglit tests:
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec3-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec4-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-u64vec4-uint64_t

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
7 years agomeson: Add script to use VERSION file for getting version
Dylan Baker [Wed, 1 Nov 2017 18:54:10 +0000 (11:54 -0700)]
meson: Add script to use VERSION file for getting version

Meson has up until this point set it's version in the root meson.build
script, while the other build systems read the VERSION file. This is
just "one more thing" to duplicate between meson and every other build
system. This script is a simple "read, strip, print" sort of deal to
allow meson to read the VERSION file.

I chose to implement this in python since python is portable, and to
keep the meson.build script clean. This is also complicated by the fact
that the project() call *must* be the first non-comment,non-blank in the
toplevel meson.build script.

v2: - Move from scripts/ to bin/
    - use python explicitly to run the scripts to support windows

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
7 years agobroadcom/vc4: Mark BOs as purgeable when they enter the BO cache
Boris Brezillon [Tue, 26 Sep 2017 11:37:40 +0000 (13:37 +0200)]
broadcom/vc4: Mark BOs as purgeable when they enter the BO cache

This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all
BOs placed in the mesa BO cache as purgeable so that the system can
reclaim this memory under memory pressure.

v2:
- Removed BOs from the cache when they've been purged by the kernel
- Check whether the madvise ioctl is supported or not before using it

v3: Don't walk the whole list when we find a busy BO (by anholt, acked by
    Boris)

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years agodrm-uapi: Update vc4 header from drm-next
Boris Brezillon [Tue, 26 Sep 2017 07:35:53 +0000 (09:35 +0200)]
drm-uapi: Update vc4 header from drm-next

Taken from drm-next d65d31388a23 ("Merge tag
'drm-misc-next-fixes-2017-11-07' of
git://anongit.freedesktop.org/drm/drm-misc into drm-next")

v2: Add the NOTSUPP definition from the final drm-next version, not the
    commit (anholt).

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
7 years agomeson: Enable VC4's NEON assembly support.
Eric Anholt [Wed, 8 Nov 2017 22:07:38 +0000 (14:07 -0800)]
meson: Enable VC4's NEON assembly support.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agomeson: Always link libgallium_dri.so against dep_thread.
Eric Anholt [Wed, 8 Nov 2017 22:00:51 +0000 (14:00 -0800)]
meson: Always link libgallium_dri.so against dep_thread.

Somehow on my cross build the -pthread is getting lost.  All the other
deps seem to work out fine.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agomeson: Drop stale comment about making valgrind conditional.
Eric Anholt [Wed, 8 Nov 2017 19:54:24 +0000 (11:54 -0800)]
meson: Drop stale comment about making valgrind conditional.

It was fixed in 5c2ff5773a707519f6a773126f201c4e1e8a42d7.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agomeson: Leave dep_llvm empty if !with_llvm
Eric Anholt [Wed, 8 Nov 2017 19:52:09 +0000 (11:52 -0800)]
meson: Leave dep_llvm empty if !with_llvm

The gallium auxiliary build would link against llvm, for the gallivm code
that it didn't build.  This broke the build on my armhf cross, where
libLLVM-3.9.so is not multiarch and thus points to x86-64 libs.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoRevert "glx: Implement GLX_EXT_no_config_context (v2)"
Adam Jackson [Thu, 9 Nov 2017 16:41:14 +0000 (11:41 -0500)]
Revert "glx: Implement GLX_EXT_no_config_context (v2)"

Pushed ahead of things actually working.

This reverts commit 5293b96b160b904c0e53cbce93679c3aa090f846.

7 years agoradeonsi: pack r600_surface better
Marek Olšák [Tue, 7 Nov 2017 18:00:20 +0000 (19:00 +0100)]
radeonsi: pack r600_surface better

160 -> 136 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: pack r600_texture better
Marek Olšák [Tue, 7 Nov 2017 17:53:37 +0000 (18:53 +0100)]
radeonsi: pack r600_texture better

1752 -> 1736 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: clean up r600_surface
Marek Olšák [Tue, 7 Nov 2017 17:53:59 +0000 (18:53 +0100)]
radeonsi: clean up r600_surface

216 -> 160 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: remove r600_texture::non_disp_tiling
Marek Olšák [Tue, 7 Nov 2017 17:42:53 +0000 (18:42 +0100)]
radeonsi: remove r600_texture::non_disp_tiling

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: remove DBG_NO_DISCARD_RANGE
Marek Olšák [Tue, 7 Nov 2017 17:42:23 +0000 (18:42 +0100)]
radeonsi: remove DBG_NO_DISCARD_RANGE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoglx: Implement GLX_EXT_no_config_context (v2)
Adam Jackson [Tue, 7 Nov 2017 16:36:53 +0000 (11:36 -0500)]
glx: Implement GLX_EXT_no_config_context (v2)

This more or less ports EGL_KHR_no_config_context to GLX.

v2: Enable the extension only for those backends that support it.

Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
7 years agoglx: Prepare the DRI backends for GLX_EXT_no_config_context
Adam Jackson [Tue, 7 Nov 2017 16:36:52 +0000 (11:36 -0500)]
glx: Prepare the DRI backends for GLX_EXT_no_config_context

This should be safe as these backends already support the EGL version of
this extension. DRI1 is not affected because it does not support
GLX_ARB_create_context anyway. DRI-Windows is not prepared to implement
this as there's no equivalent WGL extension, and wglCreateContextAttribs
seems to really want the HDC's pixel format to be set.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years agoglx: Relax validate_renderType_against_config for EXT_no_config_context
Adam Jackson [Tue, 7 Nov 2017 16:36:51 +0000 (11:36 -0500)]
glx: Relax validate_renderType_against_config for EXT_no_config_context

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
7 years agoanv: fix build failure
Nicolai Hähnle [Thu, 9 Nov 2017 13:49:19 +0000 (14:49 +0100)]
anv: fix build failure

Fixes: e3a8013de8ca ("util/u_queue: add util_queue_fence_wait_timeout")
7 years agomesa: flush and wait after creating a fallback texture
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:04 +0000 (17:39 +0200)]
mesa: flush and wait after creating a fallback texture

Fixes non-deterministic failures in
dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_sync.images.texture_source.teximage2d_render
and others in dEQP-EGL.functional.sharing.gles2.multithread.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: increase MaxServerWaitTimeout
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:03 +0000 (17:39 +0200)]
mesa: increase MaxServerWaitTimeout

The current value was introduced in commit a27180d0d8666, which claims
that it represents ~1.11 years. However, it is interpreted in nanoseconds,
so it actually only represents ~9.8 hours. That seems a bit short.

Use the largest value consistent with both int32 and int64. It
corresponds to ~292 years in nanoseconds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: remove redundant flushes from st_flush
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:02 +0000 (17:39 +0200)]
st/mesa: remove redundant flushes from st_flush

st_flush should flush state tracker-internal state and the pipe, but
not mesa/main state. Of the four callers:

- glFlush/glFinish already call FLUSH_{VERTICES,STATE}.
- st_vdpau doesn't need to call them.
- st_manager will now call them explicitly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/dri: use stapi flush instead of pipe flush when creating fences
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:37 +0000 (17:38 +0200)]
st/dri: use stapi flush instead of pipe flush when creating fences

There may be pending operations (e.g. vertices) that need to be flushed
by the state tracker.

Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: use a threaded context even for debug contexts
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:05 +0000 (17:39 +0200)]
radeonsi: use a threaded context even for debug contexts

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: record and dump time of flush
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:05 +0000 (17:39 +0200)]
radeonsi: record and dump time of flush

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoddebug: optionally handle transfer commands like draws
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:01 +0000 (17:39 +0200)]
ddebug: optionally handle transfer commands like draws

Transfer commands can have associated GPU operations.

Enabled by passing GALLIUM_DDEBUG=transfers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoddebug: dump context and before/after times of draws
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:01 +0000 (17:39 +0200)]
ddebug: dump context and before/after times of draws

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoddebug: generalize print_named_xxx via a PRINT_NAMED macro
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:00 +0000 (17:39 +0200)]
ddebug: generalize print_named_xxx via a PRINT_NAMED macro

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoddebug: rewrite to always use a threaded approach
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:59 +0000 (17:38 +0200)]
ddebug: rewrite to always use a threaded approach

This patch has multiple goals:

1. Off-load the writing of records in 'always' mode to another thread
   for performance.

2. Allow using ddebug with threaded contexts. This really forces us to
   move some of the "after_draw" handling into another thread.

3. Simplify the different modes of ddebug, both in the code and in
   the user interface, i.e. GALLIUM_DDEBUG. In particular, there's
   no 'pipelined' anymore, since we're always pipelined; and 'noflush'
   is replaced by 'flush', since we no longer flush by default.

4. Fix the fences in pipelining mode. They previously relied on writes
   via pipe_context::clear_buffer. However, on radeonsi, those could
   (quite reasonably) end up in the SDMA buffer. So we use the newly
   added PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE fences instead.

5. Improve pipelined mode overall, using the finer grained information
   provided by the new fences.

Overall, the result is that pipelined mode should be more useful, and
using ddebug in default mode is much less invasive, in the sense that
it changes the overall driver behavior less (which is kind of crucial
for a driver debugging tool).

An example of the new hang debug output:

  Gallium debugger active.
  Hang detection timeout is 1000ms.
  GPU hang detected, collecting information...

  Draw #   driver  prev BOP  TOP  BOP  dump file
  -------------------------------------------------------------
  2          YES      YES    YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000000
  3          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000001
  4          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000002
  5          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000003

  Done.

We can see that there were almost certainly 4 draws in flight when
the hang happened: the top-of-pipe fence was signaled for all 4 draws,
the bottom-of-pipe fence for none of them. In virtually all cases,
we'd expect the first draw in the list to be at fault, but due to the
GPU parallelism, it's possible (though highly unlikely) that one of
the later draws causes a component to get stuck in a way that prevents
the earlier draws from making progress as well.

(In the above example, there were actually only 3 draws truly in flight:
the last draw is a blit that waits for the earlier draws; however, its
top-of-pipe fence is emitted before the cache flush and wait, and so
the fact that the draw hasn't truly started yet can only be seen from a
closer inspection of GPU state.)

Acked-by: Marek Olšák <marek.olsak@amd.com>
7 years agoddebug: use an atomic increment when numbering files
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:58 +0000 (17:38 +0200)]
ddebug: use an atomic increment when numbering files

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agodd/util: extract dd_get_debug_filename_and_mkdir
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:58 +0000 (17:38 +0200)]
dd/util: extract dd_get_debug_filename_and_mkdir

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_dump: add and use util_dump_transfer_usage
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:57 +0000 (17:38 +0200)]
gallium/u_dump: add and use util_dump_transfer_usage

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_dump: add util_dump_ns
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:56 +0000 (17:38 +0200)]
gallium/u_dump: add util_dump_ns

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_dump: export util_dump_ptr
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:55 +0000 (17:38 +0200)]
gallium/u_dump: export util_dump_ptr

Change format to %p while we're at it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:54 +0000 (17:38 +0200)]
radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE

v2: use uncached system memory for the fence, and use the CPU to
    clear it so we never read garbage when checking the fence

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: document some subtle details of fence_finish & fence_server_sync
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:53 +0000 (17:38 +0200)]
radeonsi: document some subtle details of fence_finish & fence_server_sync

v2: remove the change to si_fence_server_sync, we'll handle that more
    robustly

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium: add pipe_context::callback
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:53 +0000 (17:38 +0200)]
gallium: add pipe_context::callback

For running post-draw operations inside the driver thread. ddebug will
use it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_threaded: implement pipe_context::set_log_context
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:52 +0000 (17:38 +0200)]
gallium/u_threaded: implement pipe_context::set_log_context

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_threaded: avoid syncs for get_query_result
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:51 +0000 (17:38 +0200)]
gallium/u_threaded: avoid syncs for get_query_result

Queries should still get marked as flushed when flushes are executed
asynchronously in the driver thread.

To this end, the management of the unflushed_queries list is moved into
the driver thread.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_threaded: implement asynchronous flushes
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:50 +0000 (17:38 +0200)]
gallium/u_threaded: implement asynchronous flushes

This requires out-of-band creation of fences, and will be signaled to
the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag.

v2:
- remove an incorrect assertion
- handle fence_server_sync for unsubmitted fences by
  relying on the improved cs_add_fence_dependency
- only implement asynchronous flushes on amdgpu

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/u_threaded: mark queries flushed only for non-deferred flushes
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:50 +0000 (17:38 +0200)]
gallium/u_threaded: mark queries flushed only for non-deferred flushes

The driver uses (and must use) the flushed flag of queries as a hint that
it does not have to check for synchronization with currently queued up
commands. Deferred flushes do not actually flush queued up commands, so
we must not set the flushed flag for them.

Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: move fence functions to si_fence.c
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:49 +0000 (17:38 +0200)]
radeonsi: move fence functions to si_fence.c

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences
Nicolai Hähnle [Thu, 9 Nov 2017 13:00:22 +0000 (14:00 +0100)]
winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences

The idea is to fix the following interleaving of operations
that can arise from deferred fences:

 Thread 1 / Context 1          Thread 2 / Context 2
 --------------------          --------------------
 f = deferred flush
 <------- application-side synchronization ------->
                               fence_server_sync(f)
                               ...
                               flush()
 flush()

We will now stall in fence_server_sync until the flush of context 1
has completed.

This scenario was unlikely to occur previously, because applications
seem to be doing

 Thread 1 / Context 1          Thread 2 / Context 2
 --------------------          --------------------
 f = glFenceSync()
 glFlush()
 <------- application-side synchronization ------->
                               glWaitSync(f)

... and indeed they probably *have* to use this ordering to avoid
deadlocks in the GLX model, where all GL operations conceptually
go through a single connection to the X server. However, it's less
clear whether applications have to do this with other WSI (i.e. EGL).
Besides, even this sequence of GL commands can be translated into
the Gallium-level sequence outlined above when Gallium threading
and asynchronous flushes are used. So it makes sense to be more
robust.

As a side effect, we no longer busy-wait on submission_in_progress.

We won't enable asynchronous flushes on radeon, but add a
cs_add_fence_dependency stub anyway to document the potential
issue.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium: add PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE bits
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:48 +0000 (17:38 +0200)]
gallium: add PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE bits

These bits are intended to be used by the ddebug hang detection and are
named in analogy to the Vulkan stage bits (and the corresponding Radeon
pipeline event).

Hang detection needs fences on the granularity of individual commands,
which nothing else really covers. The closest alternative would have
been PIPE_QUERY_GPU_FINISHED, but (a) queries are a per-context object
and we really want a per-screen object, (b) queries don't offer a
wait with timeout, and (c) in any case, PIPE_QUERY_GPU_FINISHED is
meant to imply that GPU caches are flushed, which the new bits
explicitly aren't.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:47 +0000 (17:38 +0200)]
gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH

Also document some subtleties of pipe_context::flush.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoutil/u_queue: add util_queue_fence_wait_timeout
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:46 +0000 (17:38 +0200)]
util/u_queue: add util_queue_fence_wait_timeout

v2:
- style fixes
- fix missing timeout handling in futex path

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agothreads: update for late C11 changes
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:45 +0000 (17:38 +0200)]
threads: update for late C11 changes

C11 threads were changed to use struct timespec instead of xtime, and
thrd_sleep got a second argument.

See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and
http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock}

Note that cnd_timedwait is spec'd to be relative to TIME_UTC / CLOCK_REALTIME.

v2: Fix Windows build errors. Tested with a default Appveyor config
    that uses Visual Studio 2013. Judging from Brian's email and
    random internet sources, Visual Studio 2015 does have timespec
    and timespec_get, hence the _MSC_VER-based guard which I have
    not tested.

Cc: Jose Fonseca <jfonseca@vmware.com>
Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
7 years agogallium: remove unused and deprecated u_time.h
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:44 +0000 (17:38 +0200)]
gallium: remove unused and deprecated u_time.h

Cc: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoutil: move os_time.[ch] to src/util
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:44 +0000 (17:38 +0200)]
util: move os_time.[ch] to src/util

Reviewed-by: Marek Olšák <marek.olsak@amd.com>