Andres Gomez [Thu, 9 Nov 2017 14:58:23 +0000 (16:58 +0200)]
docs/releasing: improve the pre-announce template and examples
v2: Choose a proper rejection example (Emil).
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Andres Gomez [Thu, 9 Nov 2017 14:58:22 +0000 (16:58 +0200)]
docs/releasing: drop manually exported variables during smoke test
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Andres Gomez [Thu, 9 Nov 2017 14:58:21 +0000 (16:58 +0200)]
docs/releasing: drop custom LLVM_CONFIG if previously manually set
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Marek Olšák [Thu, 9 Nov 2017 19:12:07 +0000 (20:12 +0100)]
st/dri: fix android fence regression
Fixes piglit - egl_khr_fence_sync/android_native tests.
Broken by 884a0b2a9e55d4c1ca39475b50d9af598d7d7280.
Introduce state-tracker flush flags, analogous to the pipe ones. Use
the former when calling stapi->flush().
Fixes: 884a0b2a9e5 ("st/dri: use stapi flush instead of pipe flush
when creating fences")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Nicolai Hähnle [Fri, 10 Nov 2017 11:36:16 +0000 (12:36 +0100)]
util/u_thread: fix compilation on Mac OS
Apparently, it doesn't have pthread barriers.
p_config.h (which was originally used to guard this code) uses the
__APPLE__ macro to detect Mac OS.
Fixes: f0d3a4de75 ("util: move pipe_barrier into src/util and rename to util_barrier")
Cc: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
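As an illustration of what a platform without native barriers can fall back to, here is a minimal sketch of a barrier built from a lock and a condition variable. It is written in Python for brevity; the class name FallbackBarrier and the helper run_workers are hypothetical and this is not the Mesa code.

```python
import threading


class FallbackBarrier:
    """Hypothetical barrier built from a lock plus a condition variable."""

    def __init__(self, count):
        self.count = count        # threads required to open the barrier
        self.waiting = 0          # threads that have arrived this round
        self.generation = 0       # bumped each time the barrier opens
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            generation = self.generation
            self.waiting += 1
            if self.waiting == self.count:
                # Last arrival: open the barrier and start a new round.
                self.waiting = 0
                self.generation += 1
                self.cond.notify_all()
                return
            # Sleep until the last arrival bumps the generation counter.
            while generation == self.generation:
                self.cond.wait()


def run_workers(num_threads):
    """Each worker logs phase 1, waits at the barrier, then logs phase 2."""
    barrier = FallbackBarrier(num_threads)
    log = []
    log_lock = threading.Lock()

    def worker():
        with log_lock:
            log.append(1)
        barrier.wait()
        with log_lock:
            log.append(2)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

The generation counter guards against spurious wakeups and lets the same object be reused for several rounds.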
Nicolai Hähnle [Fri, 10 Nov 2017 09:40:41 +0000 (10:40 +0100)]
util/u_queue: handle OS_TIMEOUT_INFINITE in util_queue_fence_wait_timeout
Fixes e.g. piglit/bin/bufferstorage-persistent read -auto
Fixes: e6dbc804a87a ("winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
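The idea of treating an "infinite" timeout as its own case, rather than feeding it into deadline arithmetic, can be sketched as follows. This is a Python illustration only: the Fence class and the TIMEOUT_INFINITE sentinel are hypothetical stand-ins, not the util_queue API.

```python
import threading

# Hypothetical sentinel standing in for OS_TIMEOUT_INFINITE.
TIMEOUT_INFINITE = None


class Fence:
    """Hypothetical fence that can be waited on with or without a deadline."""

    def __init__(self):
        self.signalled = False
        self.cond = threading.Condition()

    def signal(self):
        with self.cond:
            self.signalled = True
            self.cond.notify_all()

    def wait_timeout(self, timeout_s):
        """Return True if the fence signalled, False if the wait timed out."""
        with self.cond:
            if timeout_s is TIMEOUT_INFINITE:
                # Infinite wait: no deadline is computed at all.
                while not self.signalled:
                    self.cond.wait()
                return True
            # Finite wait: give up once timeout_s seconds have elapsed.
            return self.cond.wait_for(lambda: self.signalled, timeout=timeout_s)
```

A caller that passes the sentinel blocks until signal() runs; any other value bounds the wait.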
Nicolai Hähnle [Fri, 10 Nov 2017 08:59:08 +0000 (09:59 +0100)]
gallium/u_threaded: fix end_query regression
Ouch...
Fixes: 244536d3d6b4 ("gallium/u_threaded: avoid syncs for get_query_result")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103653
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bruce Cherniak [Thu, 9 Nov 2017 00:39:37 +0000 (18:39 -0600)]
swr: Fixed an uncommon freed-memory access during state validation
State validation is performed during clear and draw calls. Validation
during clear was still accessing vertex buffer state. When the currently
set vertex buffers are client arrays, this could lead to accessing freed
memory. Such is the case with the VMD application.
Previously, vertex buffer validation depended on a dirty bit or the
draw info indicating an indexed draw. This required special handling for
clears. But vertex buffer validation still occurred, which was unnecessary
and wrong.
Now, only minimal validation is performed during clear, deferring the
remainder to the next draw. And, by setting the dirty bit in swr_draw_vbo
for indexed draws, vertex buffer validation is only dependent upon a
single dirty bit.
This fixes a bug exposed by the VMD application when changing models.
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
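The scheme described above (minimal validation on clear, the rest deferred to the next draw, one dirty bit deciding vertex buffer validation) can be sketched roughly like this. It is a Python illustration under assumed names: the flag values, the Context class and the log strings are hypothetical, not swr code.

```python
# Hypothetical dirty flags; the real driver has its own set.
DIRTY_VERTEX_BUFFERS = 1 << 0
DIRTY_FRAMEBUFFER = 1 << 1


class Context:
    def __init__(self):
        self.dirty = DIRTY_VERTEX_BUFFERS | DIRTY_FRAMEBUFFER
        self.validated = []  # record of what each call validated

    def clear(self):
        # Minimal validation: only the framebuffer. Vertex buffer state is
        # never touched here, so stale client arrays cannot be dereferenced.
        if self.dirty & DIRTY_FRAMEBUFFER:
            self.validated.append("clear:framebuffer")
            self.dirty &= ~DIRTY_FRAMEBUFFER

    def draw(self, indexed=False):
        # Indexed draws simply set the dirty bit, so a single bit decides
        # whether vertex buffers are validated.
        if indexed:
            self.dirty |= DIRTY_VERTEX_BUFFERS
        if self.dirty & DIRTY_FRAMEBUFFER:
            self.validated.append("draw:framebuffer")
            self.dirty &= ~DIRTY_FRAMEBUFFER
        if self.dirty & DIRTY_VERTEX_BUFFERS:
            self.validated.append("draw:vertex_buffers")
            self.dirty &= ~DIRTY_VERTEX_BUFFERS
```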
Rob Clark [Tue, 31 Oct 2017 15:24:06 +0000 (11:24 -0400)]
freedreno/ir3: fix standalone compiler meson build
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 9 Nov 2017 15:50:44 +0000 (10:50 -0500)]
freedreno/ir3: correct # of dest components for intrinsics
Don't rely on intr->num_components having a valid value. It doesn't
seem to anymore for non-vectorized intrinsics.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 31 Oct 2017 16:34:23 +0000 (12:34 -0400)]
freedreno/ir3: remove bogus assert
The ssbo atomic instructions are not vectorized. So num_components is
not expected to be valid.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 5 Nov 2017 18:53:50 +0000 (13:53 -0500)]
nir: handle get_buffer_size in nir_lower_atomics_to_ssbo
Overlooked initially, but we need to remap the SSBO index for this as
well.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chad Versace [Wed, 1 Nov 2017 20:47:55 +0000 (13:47 -0700)]
anv/meson: Generate dev_icd.json
I tested this in a setup where the builddir was outside of the srcdir.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Chad Versace [Thu, 9 Nov 2017 23:02:13 +0000 (15:02 -0800)]
anv: Fix architecture in intel_icd.{arch}.json
Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.
See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Chad Versace [Thu, 9 Nov 2017 22:55:37 +0000 (14:55 -0800)]
radv: Fix architecture in radeon_icd.{arch}.json
Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.
See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Chad Versace [Tue, 7 Nov 2017 03:35:43 +0000 (19:35 -0800)]
anv: Refactor anv_GetImageSubresourceLayout()
Its helper function, anv_surface_get_subresource_layout(), was not very
helpful. So fold it into the main function.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Tue, 7 Nov 2017 02:51:53 +0000 (18:51 -0800)]
anv/image: Refactor choice of isl_tiling_flags_t
Instead of choosing the tiling flags inside make_surface(), which is
called once per aspect in a loop, and which chooses the same tiling for
each aspect, choose the tiling flags exactly once before entering the
aspect loop.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
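The refactor is an instance of hoisting a loop-invariant decision out of the loop. A minimal Python sketch, with hypothetical function names and placeholder tiling values (not the anv code):

```python
def choose_tiling_flags(image_info):
    """Hypothetical stand-in for the tiling decision; counts its own calls."""
    choose_tiling_flags.calls += 1
    return "linear" if image_info.get("linear") else "any"


choose_tiling_flags.calls = 0


def make_surface(aspect, tiling_flags):
    # The surface only consumes the flags; it no longer chooses them.
    return {"aspect": aspect, "tiling": tiling_flags}


def create_image(image_info, aspects):
    # The choice does not depend on the aspect, so make it exactly once
    # before entering the per-aspect loop.
    tiling_flags = choose_tiling_flags(image_info)
    return [make_surface(aspect, tiling_flags) for aspect in aspects]
```

Every surface of the image then shares the same flags by construction, instead of by each loop iteration happening to reach the same answer.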
Chad Versace [Fri, 3 Nov 2017 23:42:16 +0000 (16:42 -0700)]
anv: Refactor anv_get_format_plane() - explicit unsupported
The same local variable, 'plane_format', was returned on success *and*
failure. Be more explicit in distinguishing the two cases: return
'plane_format' on success and return 'unsupported' on failure.
This simplifies the diff in upcoming patches for
VK_EXT_image_drm_format_modifier.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 23:35:12 +0000 (16:35 -0700)]
anv: Remove anv_physical_device_get_format_properties()
Fold its body into its sole caller,
anv_GetPhysicalDeviceFormatProperties().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 19:44:34 +0000 (12:44 -0700)]
anv: Simplify anv_physical_device_get_format_properties()
Now that get_image_format_properties() returns the correct
VkFormatFeatureFlags, we can remove the unneeded if-branch and some
local variables.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 23:26:51 +0000 (16:26 -0700)]
anv: Simplify anv_get_image_format_properties()
Now that get_image_format_features() has a VkImageTiling parameter, we
can bypass anv_physical_device_get_format_properties() and call
get_image_format_features() directly.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 21:50:00 +0000 (14:50 -0700)]
anv: Rename get_image_format_properties()
The name is misleading. It looks like vkGetPhysicalDeviceImageFormatProperties(),
but it actually implements vkGetPhysicalDeviceFormatProperties. Let's
rename it to what it actually does, get_image_format_features(), because it
returns VkFormatFeatureFlags.
For consistency, also rename get_buffer_format_properties() to
get_buffer_format_features().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 19:32:00 +0000 (12:32 -0700)]
anv: Fix get_image_format_properties() - YCbCr
Teach it to calculate the format features for YCbCr.
The goal (which is completed in this patch) is to incrementally fix
get_image_format_properties() to return a correct result. Previously,
it returned incorrect VkFormatFeatureFlags which the caller needed to clean
up.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 19:25:51 +0000 (12:25 -0700)]
anv: Fix get_image_format_properties() - 3-channel formats
Teach it to calculate the format features for 3-channel formats.
The goal is to incrementally fix get_image_format_properties() to return
a correct result. Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 19:20:21 +0000 (12:20 -0700)]
anv: Refactor get_image_format_properties() - Reduce params
Replace parameters 'enum isl_format' and 'struct anv_format_plane' with
new parameter 'const struct anv_format *'.
The goal is to incrementally fix get_image_format_properties() to return
a correct result. Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Chad Versace [Fri, 3 Nov 2017 19:19:36 +0000 (12:19 -0700)]
anv: Refactor get_image_format_properties() - base_isl_format
Rename parameter 'base' to 'base_isl_format'.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Chad Versace [Fri, 3 Nov 2017 01:28:02 +0000 (18:28 -0700)]
anv: Refactor get_image_format_properties() - plane_format
Rename parameter 'format' to 'plane_format'.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 01:26:23 +0000 (18:26 -0700)]
anv: Refactor get_image_format_properties() - ASTC
Teach it to calculate the format features for ASTC.
The goal is to incrementally fix get_image_format_properties() to return
a correct result. Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.
v2: New commit message
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 01:24:08 +0000 (18:24 -0700)]
anv: Refactor get_image_format_properties() - depthstencil (v2)
Teach it to calculate the features of depthstencil formats.
The goal is to incrementally fix get_image_format_properties() to return
a correct result. Currently, it returns incorrect VkFormatFeatureFlags
which the caller must clean up.
v2: New commit message
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Fri, 3 Nov 2017 00:26:51 +0000 (17:26 -0700)]
anv: Better types for 'aspect' function params
Some functions have a comment that says "Exactly one bit must be in
'aspect'". So change the type of their 'aspect' parameter from
VkImageAspectFlags to VkImageAspectFlagBits.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
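The "exactly one bit" contract that the narrower type documents can also be checked at run time. A small Python sketch: the constants mirror the usual Vulkan aspect bit values but are used here purely for illustration, and is_single_aspect is a hypothetical helper.

```python
# Illustrative aspect bits (assumed values, for demonstration only).
ASPECT_COLOR = 0x1
ASPECT_DEPTH = 0x2
ASPECT_STENCIL = 0x4


def is_single_aspect(aspect_mask):
    """True when exactly one bit is set in aspect_mask."""
    # x & (x - 1) clears the lowest set bit; the result is zero only when
    # at most one bit was set, and the != 0 test excludes the empty mask.
    return aspect_mask != 0 and (aspect_mask & (aspect_mask - 1)) == 0
```

A flags-typed parameter would accept ASPECT_DEPTH | ASPECT_STENCIL; a bits-typed one signals to the reader that only a single value like ASPECT_DEPTH is valid.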
Chad Versace [Thu, 2 Nov 2017 23:55:55 +0000 (16:55 -0700)]
anv: Refactor get_buffer_format_properties()
Make it a stand-alone function. Pre-patch, for some formats the function
returned incorrect VkFormatFeatureFlags which were cleaned up by the
caller.
This prepares for a cleaner implementation of
VK_EXT_image_drm_format_modifier.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Eric Anholt [Thu, 9 Nov 2017 23:51:56 +0000 (15:51 -0800)]
broadcom/vc4: Fix simulator mode for the MADVISE usage.
Marek Olšák [Thu, 19 Oct 2017 20:22:15 +0000 (22:22 +0200)]
mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile
We already have piglit tests testing alpha, luminance, and intensity
formats. They were skipped by piglit until now.
Additionally, I'm enabling one ARB_texture_buffer_range piglit test to run
with the compat profile.
i965 behavior is unchanged except that it doesn't expose TBOs in the Compat
profile. Not sure how that affects the GL version override.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dave Airlie [Thu, 2 Nov 2017 01:04:53 +0000 (11:04 +1000)]
docs: update r600 atomic counter status.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 2 Nov 2017 00:26:51 +0000 (10:26 +1000)]
r600: add support for hw atomic counters. (v3)
This adds support for the evergreen/cayman atomic counters.
These are implemented using GDS append/consume counters. The values
for each counter are loaded before drawing and saved after each draw
using special CP packets.
v2: move hw atomic assignment into driver.
v3: fix messing up caps (Gert Wollny), only store ranges in driver,
drop buffers.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Dave Airlie [Wed, 1 Nov 2017 19:51:36 +0000 (05:51 +1000)]
st/mesa: add support for hw atomics to glsl->tgsi. (v5)
This adds support for creating the hw atomic tgsi from
the glsl codepaths.
v2: drop the atomic index and move to backend.
v3: drop buffer decls. (Marek)
v4: fix off by one (Gert)
v5: fix off by one the other way (Dave)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 1 Nov 2017 19:49:40 +0000 (05:49 +1000)]
st/mesa: setup hw atomic limits. (v1.1)
HW atomics need to use caps to set some limits, and some
other limits may also need adjusting.
This fixes things up to work for evergreen hw, it may need
more changes in the future if other hw wants to use this path.
v1.1: fix indent.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 1 Nov 2017 04:30:13 +0000 (14:30 +1000)]
st/mesa: start adding support for hw atomics atom. (v2)
This adds a new atom that calls the new driver API to
bind buffers containing hw atomics.
v2: fixup bindings for sparse buffers. (mareko/nha)
don't bind buffer atomics when hw atomics are enabled.
use NewAtomicBuffer (mareko)
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 1 Nov 2017 04:22:04 +0000 (14:22 +1000)]
mesa/program: add hw atomic counter file
This is needed for the GLSL->TGSI translation for hw atomic counters.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 1 Nov 2017 04:17:49 +0000 (14:17 +1000)]
gallium: add hw atomic buffer binding API.
This API binds atomic buffers for all bound shaders (as per the
GL semantics).
This is needed to support cross shader hw atomic counters.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 1 Nov 2017 04:05:19 +0000 (14:05 +1000)]
gallium/tgsi: start adding hw atomics (v3.2)
This adds support for hw atomic counters to TGSI.
A new register file for storing atomic counters is added,
along with a new atomic counter semantic, along with docs
for both.
v2: drop semantic, move hw counter to backend,
Ilia pointed out SSO would have busted my plan, and he
was right.
v3: drop BUFFER decls. (Marek)
v3.1: minor fixups for whitespace, set ureg error
if we overflow the hw atomic limits. (nha)
v3.2: fix some docs inconsistencies (Ilia)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 8 Aug 2017 03:13:03 +0000 (13:13 +1000)]
gallium: add CAPs to support HW atomic counters. (v3)
This looks like an evergreen specific feature, but with atomic
counters AMD have hw specific counters they use instead of operating
on buffers directly. These are separate to the buffer atomics,
so require different limits and code paths.
I've left the CAP for atomic type extensible in case someone
else has a variant on this sort of thing (freedreno maybe?)
and needs to change it.
This adds all the CAPs required to add support for those atomic
counters, along with a related CAP for limiting the number of
output resources.
I'd like to land this and the st patch then I can start to
upstream the evergreen support for these and other GL4.x features.
v2: drop the ATOMIC_COUNTER_MODE cap, just use the return
from the HW counters. If 0 we use the current mode.
v3: fix some rebase errors (Gert Wollny)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 9 Nov 2017 05:48:18 +0000 (15:48 +1000)]
r600/query: drop rest of vi workaround code.
This isn't needed in r600 anymore.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Roland Scheidegger [Tue, 7 Nov 2017 00:43:51 +0000 (01:43 +0100)]
docs: Fix GL_MESA_program_debug enums
13b303ff9265b89bdd9100e32f905e9cdadfad81 added the actual enums but
didn't remove the already existing XXXX ones. (And also duplicated
the "fragment" names instead of using the "vertex" names.)
Fixes: 13b303ff9265b89bdd91 ("docs: Update the list of used MESA GL enums.")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Thu, 9 Nov 2017 17:48:42 +0000 (10:48 -0700)]
st/mesa: remove 'struct' keyword on function parameter
st_src_reg is a class, not a struct. Simply remove 'struct' to silence
a MSVC compiler warning (class vs. struct mismatch).
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 9 Nov 2017 16:43:37 +0000 (09:43 -0700)]
threads: fix MinGW build breakage
Fixes: f1a364878431c8 ("threads: update for late C11 changes")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 9 Nov 2017 05:18:40 +0000 (22:18 -0700)]
mesa: s/GLint/gl_buffer_index/ for _ColorDrawBufferIndexes
Also fix local variable declarations and replace -1 with BUFFER_NONE.
No Piglit changes.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 9 Nov 2017 05:09:59 +0000 (22:09 -0700)]
mesa: s/GLint/gl_buffer_index/ for _ColorReadBufferIndex
BUFFER_NONE is -1 so no reason for GLint.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 9 Nov 2017 05:07:08 +0000 (22:07 -0700)]
mesa: minor reformatting, add const to gl_external_samplers()
This function should probably be moved elsewhere, too.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Mon, 6 Nov 2017 16:28:02 +0000 (09:28 -0700)]
st/mesa: whitespace clean-up in st_mesa_to_tgsi.c
Remove trailing whitespace, fix indentation, wrap lines to 78 columns, etc.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Dylan Baker [Mon, 30 Oct 2017 17:17:22 +0000 (10:17 -0700)]
meson: implement default driver arguments
This allows drivers to be set by OS/arch in a sane manner.
v2: - set _drivers to a list of drivers instead of manually assigning
each with_*
v3: - Use "auto" instead of "default", which matches the value of other
automatically configured options.
- Set vulkan drivers as well
- Add error message if no automatic drivers are known for a given
arch/OS combo
- use not(darwin or windows) instead of (linux or *bsd), which is
probably more accurate (that way Solaris and other *nix systems
aren't excluded)
- rename softpipe to swrast, as swrast is the actual option name
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Kenneth Graunke [Wed, 8 Nov 2017 18:56:00 +0000 (10:56 -0800)]
i965: Pretend there are 4 subslices for compute shader threads on Gen9+.
Similar to what we did for pixel shader threads - see gen_device_info.c.
We don't want to bump the actual Maximum Number of Threads though, so
we adjust it here. For pixel shaders, we don't use max_wm_threads, so
we could just bump it globally.
Supposedly fixes Piglit tests:
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec3-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec4-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-u64vec4-uint64_t
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Dylan Baker [Wed, 1 Nov 2017 18:54:10 +0000 (11:54 -0700)]
meson: Add script to use VERSION file for getting version
Meson has up until this point set its version in the root meson.build
script, while the other build systems read the VERSION file. This is
just "one more thing" to duplicate between meson and every other build
system. This script is a simple "read, strip, print" sort of deal to
allow meson to read the VERSION file.
I chose to implement this in python since python is portable, and to
keep the meson.build script clean. This is also complicated by the fact
that the project() call *must* be the first non-comment, non-blank line in the
toplevel meson.build script.
v2: - Move from scripts/ to bin/
- use python explicitly to run the scripts to support windows
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
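A "read, strip, print" helper of the kind described above could look roughly like this. It is a sketch, not the script added by the commit: the function names are hypothetical, and the demo writes a throwaway file with a made-up version string so that it runs anywhere.

```python
import os
import tempfile


def read_version(path):
    """Read a VERSION-style file and return its contents, whitespace stripped."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read().strip()


def demo():
    # The file name matches the commit message; its content is invented.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "VERSION")
        with open(path, "w", encoding="utf-8") as f:
            f.write("1.2.3-devel\n")
        return read_version(path)
```

A standalone script would end with print(read_version(...)) so that the build system can capture the version from standard output.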
Boris Brezillon [Tue, 26 Sep 2017 11:37:40 +0000 (13:37 +0200)]
broadcom/vc4: Mark BOs as purgeable when they enter the BO cache
This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all
BOs placed in the mesa BO cache as purgeable so that the system can
reclaim this memory under memory pressure.
v2:
- Removed BOs from the cache when they've been purged by the kernel
- Check whether the madvise ioctl is supported or not before using it
v3: Don't walk the whole list when we find a busy BO (by anholt, acked by
Boris)
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
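The cache behaviour described in this commit (mark buffers purgeable on entry, drop the ones the kernel has purged when reusing) can be sketched as below. This is a Python model only: FakeKernel, BOCache and their methods are hypothetical and merely imitate the shape of a madvise-style interface, not DRM_IOCTL_VC4_GEM_MADVISE itself.

```python
class FakeKernel:
    """Hypothetical stand-in for the kernel side of a madvise-style ioctl."""

    def __init__(self):
        self.purgeable = set()
        self.purged = set()

    def madvise_dontneed(self, bo):
        self.purgeable.add(bo)

    def madvise_willneed(self, bo):
        """Return True if the buffer's backing storage was retained."""
        self.purgeable.discard(bo)
        return bo not in self.purged

    def reclaim_all(self):
        """Simulate memory pressure: purge every purgeable buffer."""
        self.purged |= self.purgeable
        self.purgeable.clear()


class BOCache:
    def __init__(self, kernel):
        self.kernel = kernel
        self.free_list = []

    def put(self, bo):
        # Buffers entering the cache become candidates for reclaim.
        self.kernel.madvise_dontneed(bo)
        self.free_list.append(bo)

    def get(self):
        while self.free_list:
            bo = self.free_list.pop(0)
            if self.kernel.madvise_willneed(bo):
                return bo
            # The kernel purged this buffer: forget it and try the next one.
        return None
```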
Boris Brezillon [Tue, 26 Sep 2017 07:35:53 +0000 (09:35 +0200)]
drm-uapi: Update vc4 header from drm-next
Taken from drm-next d65d31388a23 ("Merge tag
'drm-misc-next-fixes-2017-11-07' of
git://anongit.freedesktop.org/drm/drm-misc into drm-next")
v2: Add the NOTSUPP definition from the final drm-next version, not the
commit (anholt).
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Eric Anholt [Wed, 8 Nov 2017 22:07:38 +0000 (14:07 -0800)]
meson: Enable VC4's NEON assembly support.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Eric Anholt [Wed, 8 Nov 2017 22:00:51 +0000 (14:00 -0800)]
meson: Always link libgallium_dri.so against dep_thread.
Somehow on my cross build the -pthread is getting lost. All the other
deps seem to work out fine.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Eric Anholt [Wed, 8 Nov 2017 19:54:24 +0000 (11:54 -0800)]
meson: Drop stale comment about making valgrind conditional.
It was fixed in 5c2ff5773a707519f6a773126f201c4e1e8a42d7.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Eric Anholt [Wed, 8 Nov 2017 19:52:09 +0000 (11:52 -0800)]
meson: Leave dep_llvm empty if !with_llvm
The gallium auxiliary build would link against llvm, for the gallivm code
that it didn't build. This broke the build on my armhf cross, where
libLLVM-3.9.so is not multiarch and thus points to x86-64 libs.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Adam Jackson [Thu, 9 Nov 2017 16:41:14 +0000 (11:41 -0500)]
Revert "glx: Implement GLX_EXT_no_config_context (v2)"
Pushed ahead of things actually working.
This reverts commit 5293b96b160b904c0e53cbce93679c3aa090f846.
Marek Olšák [Tue, 7 Nov 2017 18:00:20 +0000 (19:00 +0100)]
radeonsi: pack r600_surface better
160 -> 136 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 7 Nov 2017 17:53:37 +0000 (18:53 +0100)]
radeonsi: pack r600_texture better
1752 -> 1736 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 7 Nov 2017 17:53:59 +0000 (18:53 +0100)]
radeonsi: clean up r600_surface
216 -> 160 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 7 Nov 2017 17:42:53 +0000 (18:42 +0100)]
radeonsi: remove r600_texture::non_disp_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 7 Nov 2017 17:42:23 +0000 (18:42 +0100)]
radeonsi: remove DBG_NO_DISCARD_RANGE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Adam Jackson [Tue, 7 Nov 2017 16:36:53 +0000 (11:36 -0500)]
glx: Implement GLX_EXT_no_config_context (v2)
This more or less ports EGL_KHR_no_config_context to GLX.
v2: Enable the extension only for those backends that support it.
Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Adam Jackson [Tue, 7 Nov 2017 16:36:52 +0000 (11:36 -0500)]
glx: Prepare the DRI backends for GLX_EXT_no_config_context
This should be safe as these backends already support the EGL version of
this extension. DRI1 is not affected because it does not support
GLX_ARB_create_context anyway. DRI-Windows is not prepared to implement
this as there's no equivalent WGL extension, and wglCreateContextAttribs
seems to really want the HDC's pixel format to be set.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Adam Jackson [Tue, 7 Nov 2017 16:36:51 +0000 (11:36 -0500)]
glx: Relax validate_renderType_against_config for EXT_no_config_context
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Nicolai Hähnle [Thu, 9 Nov 2017 13:49:19 +0000 (14:49 +0100)]
anv: fix build failure
Fixes: e3a8013de8ca ("util/u_queue: add util_queue_fence_wait_timeout")
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:04 +0000 (17:39 +0200)]
mesa: flush and wait after creating a fallback texture
Fixes non-deterministic failures in
dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_sync.images.texture_source.teximage2d_render
and others in dEQP-EGL.functional.sharing.gles2.multithread.*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:03 +0000 (17:39 +0200)]
mesa: increase MaxServerWaitTimeout
The current value was introduced in commit a27180d0d8666, which claims
that it represents ~1.11 years. However, it is interpreted in nanoseconds,
so it actually only represents ~9.8 hours. That seems a bit short.
Use the largest value representable as both int64 and uint64. It
corresponds to ~292 years in nanoseconds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
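The figures in this message can be checked with a little arithmetic. The sketch below assumes that the original "~1.11 years" estimate read the value as microseconds; the commit does not say so, but the factor of roughly 1000 between the two readings suggests it. The raw value is derived from that assumption, not copied from the source tree.

```python
NS_PER_S = 10**9
US_PER_S = 10**6
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def as_years(value, units_per_second):
    return value / units_per_second / SECONDS_PER_YEAR


def as_hours(value, units_per_second):
    return value / units_per_second / 3600


# A raw value that means ~1.11 years when read as microseconds (assumption)...
raw = round(1.11 * SECONDS_PER_YEAR * US_PER_S)
# ...is only a few hours when the same number is read as nanoseconds.
hours_if_ns = as_hours(raw, NS_PER_S)

# The largest signed 64-bit value, read as nanoseconds, in years.
INT64_MAX = 2**63 - 1
years_int64_max = as_years(INT64_MAX, NS_PER_S)
```

Under that assumption hours_if_ns comes out a little under ten hours and years_int64_max a little over 292 years, in line with the numbers quoted above.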
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:02 +0000 (17:39 +0200)]
st/mesa: remove redundant flushes from st_flush
st_flush should flush state tracker-internal state and the pipe, but
not mesa/main state. Of the four callers:
- glFlush/glFinish already call FLUSH_{VERTICES,STATE}.
- st_vdpau doesn't need to call them.
- st_manager will now call them explicitly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:37 +0000 (17:38 +0200)]
st/dri: use stapi flush instead of pipe flush when creating fences
There may be pending operations (e.g. vertices) that need to be flushed
by the state tracker.
Found by inspection.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:05 +0000 (17:39 +0200)]
radeonsi: use a threaded context even for debug contexts
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:05 +0000 (17:39 +0200)]
radeonsi: record and dump time of flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:01 +0000 (17:39 +0200)]
ddebug: optionally handle transfer commands like draws
Transfer commands can have associated GPU operations.
Enabled by passing GALLIUM_DDEBUG=transfers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:01 +0000 (17:39 +0200)]
ddebug: dump context and before/after times of draws
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:39:00 +0000 (17:39 +0200)]
ddebug: generalize print_named_xxx via a PRINT_NAMED macro
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:59 +0000 (17:38 +0200)]
ddebug: rewrite to always use a threaded approach
This patch has multiple goals:
1. Off-load the writing of records in 'always' mode to another thread
for performance.
2. Allow using ddebug with threaded contexts. This really forces us to
move some of the "after_draw" handling into another thread.
3. Simplify the different modes of ddebug, both in the code and in
the user interface, i.e. GALLIUM_DDEBUG. In particular, there's
no 'pipelined' anymore, since we're always pipelined; and 'noflush'
is replaced by 'flush', since we no longer flush by default.
4. Fix the fences in pipelining mode. They previously relied on writes
via pipe_context::clear_buffer. However, on radeonsi, those could
(quite reasonably) end up in the SDMA buffer. So we use the newly
added PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE fences instead.
5. Improve pipelined mode overall, using the finer grained information
provided by the new fences.
Overall, the result is that pipelined mode should be more useful, and
using ddebug in default mode is much less invasive, in the sense that
it changes the overall driver behavior less (which is kind of crucial
for a driver debugging tool).
An example of the new hang debug output:
    Gallium debugger active.
    Hang detection timeout is 1000ms.
    GPU hang detected, collecting information...

    Draw #   driver  prev BOP  TOP  BOP  dump file
    -------------------------------------------------------------
    2        YES     YES       YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000000
    3        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000001
    4        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000002
    5        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000003

    Done.
We can see that there were almost certainly 4 draws in flight when
the hang happened: the top-of-pipe fence was signaled for all 4 draws,
the bottom-of-pipe fence for none of them. In virtually all cases,
we'd expect the first draw in the list to be at fault, but due to the
GPU parallelism, it's possible (though highly unlikely) that one of
the later draws causes a component to get stuck in a way that prevents
the earlier draws from making progress as well.
(In the above example, there were actually only 3 draws truly in flight:
the last draw is a blit that waits for the earlier draws; however, its
top-of-pipe fence is emitted before the cache flush and wait, and so
the fact that the draw hasn't truly started yet can only be seen from a
closer inspection of GPU state.)
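The reading of the table above can be sketched in C as follows. This is only an illustration: struct draw_status, draw_in_flight and first_suspect are hypothetical names, not ddebug's actual data structures, and the sketch looks only at the TOP and BOP columns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record mirroring one row of the hang report; this is not
 * ddebug's real data structure. */
struct draw_status {
   unsigned draw;      /* draw call number */
   bool top_signaled;  /* top-of-pipe fence signaled: the draw was started */
   bool bop_signaled;  /* bottom-of-pipe fence signaled: the draw finished */
};

/* A draw counts as "in flight" when it has started but not finished. */
static bool draw_in_flight(const struct draw_status *d)
{
   assert(d != NULL);
   return d->top_signaled && !d->bop_signaled;
}

/* Return the index of the first in-flight draw, which is the most likely
 * culprit, or -1 if no draw is in flight. */
static int first_suspect(const struct draw_status *draws, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (draw_in_flight(&draws[i]))
         return (int)i;
   }
   return -1;
}
```

As the parenthetical above notes, a draw can have its top-of-pipe fence signaled without having truly started, so a check like this can over-count the draws in flight.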
Acked-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:58 +0000 (17:38 +0200)]
ddebug: use an atomic increment when numbering files
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
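A minimal sketch of numbering dump files with an atomic increment, so that concurrent threads never receive the same index. It uses C11 <stdatomic.h> rather than Mesa's own atomic helpers, and example_dump_index and example_next_dump_name are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical shared counter; objects with static storage duration start
 * at zero. */
static atomic_uint example_dump_index;

/* Write "<prefix>_<8-digit index>" into buf and return what snprintf
 * returns. Each call claims a distinct index, even when several threads
 * call it at the same time. */
static int example_next_dump_name(char *buf, size_t size, const char *prefix)
{
   assert(buf != NULL && prefix != NULL);
   /* atomic_fetch_add returns the value held before the increment. */
   unsigned index = atomic_fetch_add(&example_dump_index, 1);
   return snprintf(buf, size, "%s_%08u", prefix, index);
}
```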
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:58 +0000 (17:38 +0200)]
dd/util: extract dd_get_debug_filename_and_mkdir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:57 +0000 (17:38 +0200)]
gallium/u_dump: add and use util_dump_transfer_usage
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:56 +0000 (17:38 +0200)]
gallium/u_dump: add util_dump_ns
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:55 +0000 (17:38 +0200)]
gallium/u_dump: export util_dump_ptr
Change format to %p while we're at it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:54 +0000 (17:38 +0200)]
radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE
v2: use uncached system memory for the fence, and use the CPU to
clear it so we never read garbage when checking the fence
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:53 +0000 (17:38 +0200)]
radeonsi: document some subtle details of fence_finish & fence_server_sync
v2: remove the change to si_fence_server_sync, we'll handle that more
robustly
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:53 +0000 (17:38 +0200)]
gallium: add pipe_context::callback
For running post-draw operations inside the driver thread. ddebug will
use it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:52 +0000 (17:38 +0200)]
gallium/u_threaded: implement pipe_context::set_log_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:51 +0000 (17:38 +0200)]
gallium/u_threaded: avoid syncs for get_query_result
Queries should still get marked as flushed when flushes are executed
asynchronously in the driver thread.
To this end, the management of the unflushed_queries list is moved into
the driver thread.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:50 +0000 (17:38 +0200)]
gallium/u_threaded: implement asynchronous flushes
This requires out-of-band creation of fences, which is signaled to
the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag.
v2:
- remove an incorrect assertion
- handle fence_server_sync for unsubmitted fences by
relying on the improved cs_add_fence_dependency
- only implement asynchronous flushes on amdgpu
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:50 +0000 (17:38 +0200)]
gallium/u_threaded: mark queries flushed only for non-deferred flushes
The driver uses (and must use) the flushed flag of queries as a hint that
it does not have to check for synchronization with currently queued up
commands. Deferred flushes do not actually flush queued up commands, so
we must not set the flushed flag for them.
Found by inspection.
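The rule can be sketched as follows. This is an illustration under stated assumptions, not the u_threaded_context code: EXAMPLE_FLUSH_DEFERRED, struct example_query and example_mark_flushed are hypothetical stand-ins, and the real flush flags are defined in Gallium's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value, for illustration only. */
#define EXAMPLE_FLUSH_DEFERRED (1u << 1)

/* Hypothetical query object carrying just the "flushed" hint. */
struct example_query {
   bool flushed;
};

/* Mark queries as flushed, but only when the flush really submits the
 * queued-up commands. A deferred flush submits nothing, so the hint must
 * stay unset. */
static void example_mark_flushed(struct example_query *queries, size_t count,
                                 unsigned flush_flags)
{
   assert(queries != NULL || count == 0);

   if (flush_flags & EXAMPLE_FLUSH_DEFERRED)
      return;

   for (size_t i = 0; i < count; i++)
      queries[i].flushed = true;
}
```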
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:49 +0000 (17:38 +0200)]
radeonsi: move fence functions to si_fence.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 9 Nov 2017 13:00:22 +0000 (14:00 +0100)]
winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences
The idea is to fix the following interleaving of operations
that can arise from deferred fences:
Thread 1 / Context 1          Thread 2 / Context 2
--------------------          --------------------
f = deferred flush
<------- application-side synchronization ------->
                              fence_server_sync(f)
                              ...
                              flush()
flush()
We will now stall in fence_server_sync until the flush of context 1
has completed.
This scenario was unlikely to occur previously, because applications
seem to be doing
Thread 1 / Context 1          Thread 2 / Context 2
--------------------          --------------------
f = glFenceSync()
glFlush()
<------- application-side synchronization ------->
                              glWaitSync(f)
... and indeed they probably *have* to use this ordering to avoid
deadlocks in the GLX model, where all GL operations conceptually
go through a single connection to the X server. However, it's less
clear whether applications have to do this with other WSI (e.g. EGL).
Besides, even this sequence of GL commands can be translated into
the Gallium-level sequence outlined above when Gallium threading
and asynchronous flushes are used. So it makes sense to be more
robust.
As a side effect, we no longer busy-wait on submission_in_progress.
We won't enable asynchronous flushes on radeon, but add a
cs_add_fence_dependency stub anyway to document the potential
issue.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:48 +0000 (17:38 +0200)]
gallium: add PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE bits
These bits are intended to be used by the ddebug hang detection and are
named in analogy to the Vulkan stage bits (and the corresponding Radeon
pipeline event).
Hang detection needs fences on the granularity of individual commands,
which nothing else really covers. The closest alternative would have
been PIPE_QUERY_GPU_FINISHED, but (a) queries are a per-context object
and we really want a per-screen object, (b) queries don't offer a
wait with timeout, and (c) in any case, PIPE_QUERY_GPU_FINISHED is
meant to imply that GPU caches are flushed, which the new bits
explicitly aren't.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:47 +0000 (17:38 +0200)]
gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH
Also document some subtleties of pipe_context::flush.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:46 +0000 (17:38 +0200)]
util/u_queue: add util_queue_fence_wait_timeout
v2:
- style fixes
- fix missing timeout handling in futex path
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:45 +0000 (17:38 +0200)]
threads: update for late C11 changes
C11 threads were changed to use struct timespec instead of xtime, and
thrd_sleep got a second argument.
See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and
http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock}
Note that cnd_timedwait is spec'd to be relative to TIME_UTC / CLOCK_REALTIME.
v2: Fix Windows build errors. Tested with a default Appveyor config
that uses Visual Studio 2013. Judging from Brian's email and
random internet sources, Visual Studio 2015 does have timespec
and timespec_get, hence the _MSC_VER-based guard which I have
not tested.
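A minimal sketch of building a time point in the struct timespec form used by the final C11 interface, assuming a C11 <time.h> that provides timespec_get. example_deadline_from_now is a hypothetical helper, not part of this patch. It produces an absolute TIME_UTC deadline of the kind cnd_timedwait and mtx_timedlock expect. thrd_sleep, by contrast, takes a duration plus a second argument that receives the remaining time.

```c
#include <assert.h>
#include <time.h>

/* Fill *ts with an absolute TIME_UTC time point timeout_ms milliseconds
 * from now. timeout_ms must be non-negative. Returns 0 on success and -1
 * if the current time could not be read. */
static int example_deadline_from_now(struct timespec *ts, long timeout_ms)
{
   assert(ts != NULL && timeout_ms >= 0);

   /* timespec_get returns its base argument on success and 0 on failure. */
   if (timespec_get(ts, TIME_UTC) != TIME_UTC)
      return -1;

   ts->tv_sec += timeout_ms / 1000;
   ts->tv_nsec += (timeout_ms % 1000) * 1000000L;

   /* Normalize so that tv_nsec stays below one second. */
   if (ts->tv_nsec >= 1000000000L) {
      ts->tv_sec += 1;
      ts->tv_nsec -= 1000000000L;
   }
   return 0;
}
```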
Cc: Jose Fonseca <jfonseca@vmware.com>
Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:44 +0000 (17:38 +0200)]
gallium: remove unused and deprecated u_time.h
Cc: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:44 +0000 (17:38 +0200)]
util: move os_time.[ch] to src/util
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 22 Oct 2017 15:38:43 +0000 (17:38 +0200)]
radeonsi: always use async compiles when creating shader/compute states
With Gallium threaded contexts, creating shader/compute states is
effectively a screen operation, so we should not use context state.
In particular, this allows us to avoid using the context's LLVM
TargetMachine.
This isn't an issue yet because u_threaded_context filters out non-async
debug callbacks, and we disable threaded contexts for debug contexts.
However, we may want to change that in the future.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>