mesa.git
7 years agosvga: specify include path for git_sha1.h for out-of-src builds
Brian Paul [Tue, 4 Apr 2017 19:13:33 +0000 (13:13 -0600)]
svga: specify include path for git_sha1.h for out-of-src builds

If we're doing an out-of-src build, we need to specify the #include
patch to find git_sha1.h

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
7 years agost/wgl: pseudo-implementation of WGL_EXT_swap_control
Brian Paul [Tue, 4 Apr 2017 19:11:44 +0000 (13:11 -0600)]
st/wgl: pseudo-implementation of WGL_EXT_swap_control

This implementation is based on querying the time just before swap/present
and doing a Sleep() if needed.  There is no sync to vblank or actual
coordination with the GPU.  This isn't perfect, but basically works.

We've had some request for this functionality, and it sounds like there
are some Windows GL apps that refuse to start if the driver doesn't
advertise this extension.

Note: NVIDIA's Windows OpenGL driver advertises the WGL_EXT_swap_control
string both with wglGetExtensionsStringEXT() and with
glGetString(GL_EXTENSIONS).  We're only advertising it with the former at
this time.

Tested with asst. Mesa demos, Google Earth, Lightsmark, etc.

VMware bug 1591534.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
7 years agosvga: Fix out-of-sync backing surface
Charmaine Lee [Tue, 4 Apr 2017 19:02:45 +0000 (13:02 -0600)]
svga: Fix out-of-sync backing surface

When a backing surface is reused, it is possible that
the original surface has been changed. So before the backing surface
is bound again, we need to sync up the surface.
This patch creates a new helper function svga_texture_copy_handle_resource()
to sync up the backing surface resource.

This patch, together with the backing surface dirty bit fix, fixes
the rendering corruption in NobelClinicianViewer when rotating the model.

Also tested with MTT glretrace, piglit, Cinebench, Turbine.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agosvga: add a reset flag to svga_propagate_surface()
Charmaine Lee [Wed, 22 Mar 2017 19:45:11 +0000 (12:45 -0700)]
svga: add a reset flag to svga_propagate_surface()

The reset flag specifies if the dirty bit needs to be reset
after the surface is propagated to the texture. This is used
to make sure that the dirty bit is not reset and stay unset
before the surface is unbound.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agosvga: add the has_backed_views flag
Charmaine Lee [Wed, 22 Mar 2017 17:46:54 +0000 (10:46 -0700)]
svga: add the has_backed_views flag

The new has_backed_views flag specifies if any of the render target
views or depth stencil view is a backing surface view.
The flag is used in svga_propagate_rendertargets() so it can return early
if there is no surface to propagate.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agosvga: only destroy render target view from a context that created it
Charmaine Lee [Thu, 3 Nov 2016 17:35:55 +0000 (10:35 -0700)]
svga: only destroy render target view from a context that created it

A texture can be destroyed from a different context from which it is
created, but destroying the render target view from a different context
will cause svga device errors. Similar to shader resource view,
this patch skips destroying render target view or depth stencil view
from a non-parent context.

Fixes driver errors running NobelClinician Viewer application.

Tested with NobelClinician Viewer, MTT piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agosvga: disable rasterization if rasterizer_discard is set or FS undefined
Charmaine Lee [Tue, 4 Apr 2017 18:47:49 +0000 (12:47 -0600)]
svga: disable rasterization if rasterizer_discard is set or FS undefined

With this patch, rasterization will be disabled if the
rasterizer_discard flag is set or the fragment shader
is undefined due to missing position output from the
vertex/geometry shader.

Tested with piglit test glsl-1.50-geometry-primitive-id-restart.
Also tested with full MTT glretrace and piglit.

v2: As suggested by Roland, to properly disable rasterization, besides
    setting FS to NULL, we will also need to disable depth and stencil test.

v3: As suggested by Brian, set SVGA_NEW_DEPTH_STENCIL_ALPHA dirty bit
    in svga_bind_rasterizer_state() if the rasterizer_discard flag is
    changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agosvga: do not emulate wide points in GS when doing transform feedback
Charmaine Lee [Wed, 15 Mar 2017 22:18:14 +0000 (15:18 -0700)]
svga: do not emulate wide points in GS when doing transform feedback

Emulating wide points in geometry shader when doing transform feedback
is problematic. This patch disables the emulation.

Tested with piglit test ext_transform_feedback-points.
Also tested with MTT glretrace, mesa demos pointblast and spriteblast.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoanv/query: Use snooping on !LLC platforms
Jason Ekstrand [Thu, 6 Apr 2017 20:34:38 +0000 (13:34 -0700)]
anv/query: Use snooping on !LLC platforms

Commit b2c97bc789198427043cd902bc76e194e7e81c7d which made us start
using a busy-wait for individual query results also messed up cache
flushing on !LLC platforms.  For one thing, I forgot the mfence after
the clflush so memory access wasn't properly getting fenced.  More
importantly, however, was that we were clflushing the whole query range
and then waiting for individual queries and then trying to read the
results without clflushing again.  Getting the clflushing both correct
and efficient is very subtle and painful.  Instead, let's side-step the
problem by just snooping.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agoanv: provide anv_gem_busy() stub for the tests
Emil Velikov [Thu, 6 Apr 2017 12:01:26 +0000 (13:01 +0100)]
anv: provide anv_gem_busy() stub for the tests

Otherwise linking way fail.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100600
Fixes: f195d40eca4 ("anv/device: Add a helper for querying whether a BO is busy")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
7 years agogallium/util: tweak backtrace format with libunwind
Rob Clark [Tue, 4 Apr 2017 17:01:56 +0000 (13:01 -0400)]
gallium/util: tweak backtrace format with libunwind

To work with addr2line.sh we also need the relative offset within the
DSO.  And addr2line.sh gets confused by the leading stackframe number.

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agogallium/util: cache symbol lookup with libunwind
Rob Clark [Tue, 4 Apr 2017 13:52:57 +0000 (09:52 -0400)]
gallium/util: cache symbol lookup with libunwind

Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agogallium/util: fix missing limit check in libunwind backtrace
Rob Clark [Tue, 4 Apr 2017 12:53:57 +0000 (08:53 -0400)]
gallium/util: fix missing limit check in libunwind backtrace

Fixes: 70c272004f ("gallium/util: libunwind support")
Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agomesa: fix renderbuffer leak
Timothy Arceri [Thu, 6 Apr 2017 21:55:17 +0000 (07:55 +1000)]
mesa: fix renderbuffer leak

We don't need to call _mesa_reference_renderbuffer() for the first
assignment as refCount starts at 1. For swrast we work around the
fact we will indirectly call _mesa_reference_renderbuffer() by
resetting refCount to 0.

Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
7 years agoanv/blorp: sample input attachments with resolves on BDW
Samuel Iglesias Gonsálvez [Tue, 21 Mar 2017 06:16:27 +0000 (07:16 +0100)]
anv/blorp: sample input attachments with resolves on BDW

On Broadwell we still need to do a resolve between the subpass
that writes and the subpass that reads when there is a
self-dependency because HW could not see fast-clears and works
on the render cache as if there was regular non-fast-clear surface.

Fixes 16 tests on BDW:

dEQP-VK.renderpass.formats.*.input.clear.store.self_dep*

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoradv: don't call radeon_check_space in radv_BindDescriptorSets
Fredrik Höglund [Sat, 1 Apr 2017 13:03:09 +0000 (15:03 +0200)]
radv: don't call radeon_check_space in radv_BindDescriptorSets

This appears to be a leftover from an earlier version of this function.
Nothing is emitted into the CS.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: implement VK_KHR_descriptor_update_template
Fredrik Höglund [Wed, 29 Mar 2017 17:19:47 +0000 (19:19 +0200)]
radv: implement VK_KHR_descriptor_update_template

All offsets and strides are precomputed by
radv_CreateDescriptorUpdateTemplateKHR and stored in the template.

v2: Move the new struct declarations from radv_descriptor_set.h
    to radv_private.h (Bas)

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: implement VK_KHR_push_descriptor
Fredrik Höglund [Wed, 29 Mar 2017 16:12:44 +0000 (18:12 +0200)]
radv: implement VK_KHR_push_descriptor

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: replace an assertion with a conditional
Fredrik Höglund [Wed, 29 Mar 2017 16:11:56 +0000 (18:11 +0200)]
radv: replace an assertion with a conditional

Replace the !binding_layout->immutable_samplers assertion in
radv_update_descriptor_sets with a conditional.

The Vulkan specification does not say that it is illegal to update
a sampler descriptor when it is immutable; only that pImageInfo is
ignored.

This change is also needed for push descriptors, because valid
descriptors must be pushed for all bindings accessed by shaders,
including immutable sampler descriptors.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: refactor radv_UpdateDescriptorSets
Fredrik Höglund [Wed, 29 Mar 2017 16:08:06 +0000 (18:08 +0200)]
radv: refactor radv_UpdateDescriptorSets

Move the implementation into a separate function that takes a
cmd_buffer and a dstSetOverride parameter.

When cmd_buffer is not NULL, radv_update_descriptor_sets calls
cs_add_buffer directly instead of updating the buffer list.

This will be used to implement VK_KHR_push_descriptor.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agogallium/radeon: fix typo in radeon_winsys.h
Samuel Pitoiset [Thu, 6 Apr 2017 12:27:42 +0000 (14:27 +0200)]
gallium/radeon: fix typo in radeon_winsys.h

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa/main: simplify _mesa_IsRenderbuffer()
Samuel Pitoiset [Thu, 6 Apr 2017 16:05:36 +0000 (18:05 +0200)]
mesa/main: simplify _mesa_IsRenderbuffer()

_mesa_lookup_renderbuffer() already checks if 'id' is non-zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years agomesa: stop abstracting texture object hashtable locking
Timothy Arceri [Thu, 6 Apr 2017 04:43:32 +0000 (14:43 +1000)]
mesa: stop abstracting texture object hashtable locking

This doesn't do anything useful so just remove it.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years agomesa: stop abstracting buffer object hashtable locking
Timothy Arceri [Thu, 6 Apr 2017 01:00:15 +0000 (11:00 +1000)]
mesa: stop abstracting buffer object hashtable locking

This doesn't do anything useful so just remove it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years agoi965/blorp: Bump the batch space estimate
Jason Ekstrand [Wed, 5 Apr 2017 20:41:56 +0000 (13:41 -0700)]
i965/blorp: Bump the batch space estimate

Commit f938354362655a378d474c5f79c52cea9852ab91 recently increased the
alignment on vertex buffer data from 32 to 64.  This caused us to
consume a bit more batch than we were before and we now go over the
estimate by a small amount on certain blits on gen8+.  This commit bumps
then gen8 batch estimate by a bit to compensate.  Haswell and older
still seems to be well within the limit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100582
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years agointel/aubinator: Stop searching after a custom handler is found
Jordan Justen [Tue, 28 Mar 2017 20:36:11 +0000 (13:36 -0700)]
intel/aubinator: Stop searching after a custom handler is found

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agointel/gen_decoder: return -1 for unknown command formats
Jordan Justen [Tue, 28 Mar 2017 19:03:37 +0000 (12:03 -0700)]
intel/gen_decoder: return -1 for unknown command formats

Decoding with aubinator encountered a command of 0xffffffff. With the
previous code, it caused aubinator to jump 255 + 2 dwords to start
decoding again.

Instead we can attempt to detect the known instruction formats. If the
format is not recognized, then we can advance just 1 dword.

v2:
 * Update aubinator_error_decode
 * Actually convert the length variable returned into a *signed* integer
   in aubinator.c, intel_batchbuffer.c and aubinator_error_decode.c.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agointel/gen_decoder: Fix length for Media State/Object commands
Jordan Justen [Tue, 28 Mar 2017 18:55:26 +0000 (11:55 -0700)]
intel/gen_decoder: Fix length for Media State/Object commands

From BDW PRM, Volume 6: Command Stream Programming, 'Render Command
Header Format'.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agointel/aubinator_error_decode: Fix structure decode data
Jordan Justen [Thu, 6 Apr 2017 05:34:42 +0000 (22:34 -0700)]
intel/aubinator_error_decode: Fix structure decode data

The call to gen_print_group should provide a pointer to the beginning
of the the structure data, not the start of the batch data.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agost/pbo: select the right swizzle for instance IDs
Nicolai Hähnle [Thu, 6 Apr 2017 14:44:11 +0000 (16:44 +0200)]
st/pbo: select the right swizzle for instance IDs

The system value only has an X component, and radeonsi started
checking that in debug builds.

Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 4cf29427770f ("radeonsi: support 64-bit system values")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoanv/query: Busy-wait for available query entries
Jason Ekstrand [Tue, 4 Apr 2017 21:36:46 +0000 (14:36 -0700)]
anv/query: Busy-wait for available query entries

Before, we were just looking at whether or not the user wanted us to
wait and waiting on the BO.  Some clients, such as the Serious engine,
use a single query pool for hundreds of individual query results where
the writes for those queries may be split across several command
buffers.  In this scenario, the individual query we're looking for may
become available long before the BO is idle so waiting on the query pool
BO to be finished is wasteful. This commit makes us instead busy-loop on
each query until it's available.

This significantly reduces pipeline bubbles and improves performance of
The Talos Principle on medium settings (where the GPU isn't overloaded
with drawing) by around 20% on my SkyLake gt4.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
7 years agoanv/device: Add a helper for querying whether a BO is busy
Jason Ekstrand [Wed, 5 Apr 2017 16:55:15 +0000 (09:55 -0700)]
anv/device: Add a helper for querying whether a BO is busy

This is a bit more efficient than using GEM_WAIT with a timeout of 0.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agoswr: [rasterizer core] SIMD16 Frontend WIP
Tim Rowley [Wed, 29 Mar 2017 17:58:18 +0000 (12:58 -0500)]
swr: [rasterizer core] SIMD16 Frontend WIP

Implement widened binner for SIMD16

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] Enable 8x2 backend
Tim Rowley [Wed, 29 Mar 2017 20:06:28 +0000 (15:06 -0500)]
swr: [rasterizer core] Enable 8x2 backend

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer codegen] remove copy of mako
Tim Rowley [Tue, 28 Mar 2017 21:58:58 +0000 (16:58 -0500)]
swr: [rasterizer codegen] remove copy of mako

mako is already a mesa build requirement, extra copy not needed.

Tested building against mesa build baseline (mako-0.8.0).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core/memory] Move intrinics to _simd functions
Tim Rowley [Tue, 28 Mar 2017 21:18:21 +0000 (16:18 -0500)]
swr: [rasterizer core/memory] Move intrinics to _simd functions

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] Programmable sample position support
Tim Rowley [Tue, 28 Mar 2017 20:32:04 +0000 (15:32 -0500)]
swr: [rasterizer core] Programmable sample position support

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [configure.ac/scons] require c++14
Tim Rowley [Wed, 29 Mar 2017 19:25:12 +0000 (14:25 -0500)]
swr: [configure.ac/scons] require c++14

New C++ features used by upcoming swr changes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] Fix center sample pattern
Tim Rowley [Tue, 28 Mar 2017 18:29:22 +0000 (13:29 -0500)]
swr: [rasterizer core] Fix center sample pattern

Fix long hidden bug in rasterizer handling of center sample pattern.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core/memory] Fix missing avx512 storetile
Tim Rowley [Tue, 28 Mar 2017 16:43:09 +0000 (11:43 -0500)]
swr: [rasterizer core/memory] Fix missing avx512 storetile

Fix pre-processor macro handing to eliminate silently missing
implementation for AVX512.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] SIMD16 Frontend WIP
Tim Rowley [Sun, 26 Mar 2017 20:46:42 +0000 (15:46 -0500)]
swr: [rasterizer core] SIMD16 Frontend WIP

Implement widened VS output for SIMD16

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agomesa: use internal function when deleting buffers
Timothy Arceri [Tue, 4 Apr 2017 06:38:35 +0000 (16:38 +1000)]
mesa: use internal function when deleting buffers

This avoids validation and looking up the buffer target for a second time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agomesa: rework bind_buffer_object()
Timothy Arceri [Tue, 4 Apr 2017 05:40:06 +0000 (15:40 +1000)]
mesa: rework bind_buffer_object()

This allows internal users to pass buffer objects directly and
allows for KHR_no_error support to be more easily added.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agomesa: small texstate tidy up
Timothy Arceri [Tue, 4 Apr 2017 02:39:31 +0000 (12:39 +1000)]
mesa: small texstate tidy up

Possibly more efficient, either way it makes the code easier to
follow.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agomesa: tidy up renderbuffer RefCount initialisation
Timothy Arceri [Wed, 5 Apr 2017 10:03:47 +0000 (20:03 +1000)]
mesa: tidy up renderbuffer RefCount initialisation

42aaa548 changed the renderbuffer initialisation of RefCount from
1 to 0.

This is inconsitent with how we use RefCount elsewhere. Also every
driver implementation of NewRenderbuffer() calls
_mesa_init_renderbuffer() so its safe to set it there.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoRevert "etnaviv: Cannot render to rb-swapped formats"
Christian Gmeiner [Sun, 26 Mar 2017 09:30:29 +0000 (11:30 +0200)]
Revert "etnaviv: Cannot render to rb-swapped formats"

This reverts commit 658568941d5e232d690e1ffbcddbd6ea9685693a.

With the help of shader variants we can render to rb-swapped
formats now. Fixes about 60 piglits.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: add support for rb swap
Christian Gmeiner [Tue, 21 Mar 2017 19:00:31 +0000 (20:00 +0100)]
etnaviv: add support for rb swap

If we render to rb swapped format we will create a shader variant doing
the involved swizzing in the pixel shader.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: adapt shader-db output for variant support
Christian Gmeiner [Sun, 19 Mar 2017 21:27:55 +0000 (22:27 +0100)]
etnaviv: adapt shader-db output for variant support

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
7 years agoetnaviv: bring back shader-db traces
Christian Gmeiner [Sun, 19 Mar 2017 21:22:18 +0000 (22:22 +0100)]
etnaviv: bring back shader-db traces

If shader-db run, create a standard variant immediately
(as otherwise nothing will trigger the shader to be
actually compiled).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
7 years agoetnaviv: add etna_shader_key and generate variants if needed
Christian Gmeiner [Sun, 19 Mar 2017 16:19:10 +0000 (17:19 +0100)]
etnaviv: add etna_shader_key and generate variants if needed

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: pass a preallocated variant to compiler
Christian Gmeiner [Sun, 19 Mar 2017 15:32:40 +0000 (16:32 +0100)]
etnaviv: pass a preallocated variant to compiler

In the long run the compiler needs to know the specifc variant
'key' in order to compile appropriate assembly. With this commit
the variant knows its shader and we are able pass the preallocated
variant into etna_compile_shader(..). This saves us from passing
extra ptrs everywhere.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
7 years agoetnaviv: make specs const
Christian Gmeiner [Sun, 19 Mar 2017 15:31:47 +0000 (16:31 +0100)]
etnaviv: make specs const

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: add struct etna_shader_state
Christian Gmeiner [Tue, 14 Mar 2017 21:25:17 +0000 (22:25 +0100)]
etnaviv: add struct etna_shader_state

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
7 years agoetnaviv: add basic shader variant support
Christian Gmeiner [Tue, 14 Mar 2017 18:57:15 +0000 (19:57 +0100)]
etnaviv: add basic shader variant support

This commit adds some basic infrastructure to handle shader
variants. We are still creating exactly one shader variant
for each shader.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
7 years agoetnaviv: s/etna_shader/etna_shader_variant
Christian Gmeiner [Sun, 12 Feb 2017 16:11:44 +0000 (17:11 +0100)]
etnaviv: s/etna_shader/etna_shader_variant

Prep work to add shader variant support.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agoetnaviv: remove not needed forward declarations
Christian Gmeiner [Sun, 12 Feb 2017 15:39:57 +0000 (16:39 +0100)]
etnaviv: remove not needed forward declarations

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years agogallium/util: honour LIBUNWIND_CFLAGS
Emil Velikov [Wed, 5 Apr 2017 12:21:26 +0000 (13:21 +0100)]
gallium/util: honour LIBUNWIND_CFLAGS

Fixes: 70c272004f72 ("gallium/util: libunwind support")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agotravis: Add radeonsi to continuous integration
Rhys Kidd [Wed, 5 Apr 2017 03:56:04 +0000 (23:56 -0400)]
travis: Add radeonsi to continuous integration

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agotravis: Add radv vulkan driver to continuous integration
Rhys Kidd [Wed, 5 Apr 2017 03:58:59 +0000 (23:58 -0400)]
travis: Add radv vulkan driver to continuous integration

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoanv: provide required gem stubs for the tests
Emil Velikov [Wed, 5 Apr 2017 16:52:51 +0000 (17:52 +0100)]
anv: provide required gem stubs for the tests

Introduce stubs to anv_gem_stub.c that match the anv_gem.c ones.
Otherwise we may get link-time errors, when building the tests.

v2: Introduce all the missing stubs at once.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100574
Fixes: c964f0e485d ("anv: Query the kernel for reset status")
Fixes: 651ec926fc1 ("anv: Add support for 48-bit addresses")
Fixes: 060a6434eca ("anv: Advertise larger heap sizes")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
I've intentionally kept the order the same identical to the anv_gem.c.
This way we can easily grep & diff in the future ;-)

7 years agoconfigure.ac: pthread-stubs is not a thing on GNU/kFreeBSD
Emil Velikov [Wed, 5 Apr 2017 16:21:29 +0000 (17:21 +0100)]
configure.ac: pthread-stubs is not a thing on GNU/kFreeBSD

As mentioned on the xcb mailing list, the platform uses the GLIBC
forwarding mechanism.

https://lists.freedesktop.org/archives/xcb/2016-November/010896.html

Cc: Andreas Boll <andreas.boll.dev@gmail.com>
Reported-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agost/clover: Fix build after shrink of pipe_box
Aaron Watry [Wed, 5 Apr 2017 01:16:02 +0000 (20:16 -0500)]
st/clover: Fix build after shrink of pipe_box

Fixes: 3dfe61e ("gallium: decrease the size of pipe_box - 24 -> 16 bytes")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100569
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
7 years agoradeonsi: add new polaris10 pci id
Alex Deucher [Wed, 5 Apr 2017 13:40:53 +0000 (09:40 -0400)]
radeonsi: add new polaris10 pci id

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 years agoradeonsi: enable ARB_shader_ballot
Nicolai Hähnle [Thu, 30 Mar 2017 09:19:39 +0000 (11:19 +0200)]
radeonsi: enable ARB_shader_ballot

Require LLVM 5.0 or later because LLVM 4.0 is easily fooled into
putting the lane select of llvm.amdgcn.readlane into a VGPR and then
fails to continue to compile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: optimization barriers to work around LLVM deficiencies
Nicolai Hähnle [Fri, 31 Mar 2017 16:42:51 +0000 (18:42 +0200)]
radeonsi: optimization barriers to work around LLVM deficiencies

Notably, llvm.amdgcn.readfirstlane and llvm.amdgcn.icmp may be hoisted
out of loops or if/else branches in cases like

  if (cond) {
    v = readFirstInvocationARB(x);
    ... use v ...
  } else {
    v = readFirstInvocationARB(x);
    ... use v ...
  }
===>
  v = readFirstInvocationARB(x);
  if (cond) {
    ... use v ...
  } else {
    ... use v ...
  }

The optimization barrier is a heavy hammer to stop that until LLVM
is taught the semantics of the intrinsic properly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: strengthen emit_optimization_barrier
Nicolai Hähnle [Fri, 31 Mar 2017 11:04:34 +0000 (13:04 +0200)]
radeonsi: strengthen emit_optimization_barrier

LLVM will lift inline assembly out of if-else-blocks if both paths have
the same inline assembly. Prevent this by adding an irrelevant unique
text to the assembly.

This requires the LLVM assembly parser to be initialized.

Furthermore, allow forcing subsequent computations to happen after the
optimization barrier by defining a data dependency.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: emit TGSI_OPCODE_READ_*
Nicolai Hähnle [Thu, 30 Mar 2017 12:15:27 +0000 (14:15 +0200)]
radeonsi: emit TGSI_OPCODE_READ_*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: emit TGSI_OPCODE_BALLOT
Nicolai Hähnle [Thu, 30 Mar 2017 10:15:19 +0000 (12:15 +0200)]
radeonsi: emit TGSI_OPCODE_BALLOT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: implement TGSI_SEMANTIC_SUBGROUP_*
Nicolai Hähnle [Thu, 30 Mar 2017 12:15:10 +0000 (14:15 +0200)]
radeonsi: implement TGSI_SEMANTIC_SUBGROUP_*

64-bit system values are stored as v2i32 to simplify the fetch logic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: support 64-bit system values
Nicolai Hähnle [Fri, 31 Mar 2017 11:02:34 +0000 (13:02 +0200)]
radeonsi: support 64-bit system values

For simplicitly, always store system values as 32-bit values or arrays
of 32-bit values. 64-bit values are unpacked and packed accordingly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: bump RADEON_LLVM_MAX_SYSTEM_VALUES
Nicolai Hähnle [Thu, 30 Mar 2017 12:14:27 +0000 (14:14 +0200)]
radeonsi: bump RADEON_LLVM_MAX_SYSTEM_VALUES

ARB_shader_ballot introduces 7 new system values that can be used
in all shader stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: enable ARB_shader_ballot
Nicolai Hähnle [Thu, 30 Mar 2017 09:16:27 +0000 (11:16 +0200)]
st/mesa: enable ARB_shader_ballot

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/glsl_to_tgsi: implement ARB_shader_ballot system variables
Nicolai Hähnle [Thu, 30 Mar 2017 12:14:45 +0000 (14:14 +0200)]
st/glsl_to_tgsi: implement ARB_shader_ballot system variables

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/glsl_to_tgsi: implement ARB_shader_ballot builtin functions
Nicolai Hähnle [Thu, 30 Mar 2017 10:15:51 +0000 (12:15 +0200)]
st/glsl_to_tgsi: implement ARB_shader_ballot builtin functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agotgsi: add SUBGROUP_* semantics
Ilia Mirkin [Thu, 9 Feb 2017 23:48:18 +0000 (18:48 -0500)]
tgsi: add SUBGROUP_* semantics

v2: add documentation (Nicolai)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agotgsi: add BALLOT/READ_* opcodes
Ilia Mirkin [Thu, 9 Feb 2017 23:38:17 +0000 (18:38 -0500)]
tgsi: add BALLOT/READ_* opcodes

v2 (Nicolai):
- BALLOT isn't per-channel
- expand the documentation (also for VOTE_*)

v3:
- only BALLOT returns a 64-bit lanemask (Boyan)
- relax the requirement on READ_INVOC: the invocation number to read
  from must be uniform within a sub-group. This matches the
  GL_ARB_shader_ballot spect (and the v_readlane instruction of AMD
  GCN)

v4:
- hopefully really fix the doc of VOTE_* returns (Ilia)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
7 years agogallium: add PIPE_CAP_TGSI_BALLOT
Nicolai Hähnle [Thu, 30 Mar 2017 09:16:09 +0000 (11:16 +0200)]
gallium: add PIPE_CAP_TGSI_BALLOT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add gl_SubGroup*ARB builtins
Nicolai Hähnle [Thu, 30 Mar 2017 10:10:00 +0000 (12:10 +0200)]
glsl: add gl_SubGroup*ARB builtins

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add ARB_shader_ballot builtin functions
Nicolai Hähnle [Thu, 30 Mar 2017 09:18:47 +0000 (11:18 +0200)]
glsl: add ARB_shader_ballot builtin functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add ARB_shader_ballot operations
Nicolai Hähnle [Thu, 30 Mar 2017 09:18:30 +0000 (11:18 +0200)]
glsl: add ARB_shader_ballot operations

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add ARB_shader_ballot enable
Nicolai Hähnle [Thu, 30 Mar 2017 09:17:47 +0000 (11:17 +0200)]
glsl: add ARB_shader_ballot enable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: add GL_ARB_shader_ballot boilerplate
Nicolai Hähnle [Thu, 30 Mar 2017 09:14:01 +0000 (11:14 +0200)]
mesa: add GL_ARB_shader_ballot boilerplate

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoswr: automake: add gen_common.py to the tarball
Emil Velikov [Tue, 4 Apr 2017 17:18:45 +0000 (18:18 +0100)]
swr: automake: add gen_common.py to the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agointel: genxml: automake: include gen_bits_header.py in the tarball
Emil Velikov [Tue, 4 Apr 2017 16:54:28 +0000 (17:54 +0100)]
intel: genxml: automake: include gen_bits_header.py in the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agointel: genxml: automake: polish automake rules
Emil Velikov [Tue, 4 Apr 2017 16:10:51 +0000 (17:10 +0100)]
intel: genxml: automake: polish automake rules

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoamd/addrlib: automake: add all headers to the tarball
Emil Velikov [Tue, 4 Apr 2017 16:47:09 +0000 (17:47 +0100)]
amd/addrlib: automake: add all headers to the tarball

Fixes: 7f160efcde4 ("amd/addrlib: import gfx9 support")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoradeonsi: enable ARB_sparse_buffer
Nicolai Hähnle [Thu, 2 Feb 2017 20:11:05 +0000 (21:11 +0100)]
radeonsi: enable ARB_sparse_buffer

v2:
- fill in DRM version requirement
- disable on SI due to CP DMA faults

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: disable SDMA clears and copies for sparse buffers
Nicolai Hähnle [Fri, 24 Mar 2017 22:30:55 +0000 (23:30 +0100)]
radeonsi: disable SDMA clears and copies for sparse buffers

VM faults cannot be disabled for SDMA on <= VI.

We could still use SDMA by asking the winsys about which parts of the
buffers are committed. This is left as a potential future improvement.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/radeon: implement pipe->resource_commit
Nicolai Hähnle [Wed, 8 Feb 2017 10:07:19 +0000 (11:07 +0100)]
gallium/radeon: implement pipe->resource_commit

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/radeon: transfers and invalidation for sparse buffers
Nicolai Hähnle [Tue, 7 Feb 2017 17:24:59 +0000 (18:24 +0100)]
gallium/radeon: transfers and invalidation for sparse buffers

Sparse buffers can never be mapped by the CPU.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/radeon: implement sparse buffer creation
Nicolai Hähnle [Tue, 7 Feb 2017 17:03:55 +0000 (18:03 +0100)]
gallium/radeon: implement sparse buffer creation

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: sparse buffer debugging helpers
Nicolai Hähnle [Mon, 13 Feb 2017 11:51:12 +0000 (12:51 +0100)]
winsys/amdgpu: sparse buffer debugging helpers

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: take fences when freeing a backing buffer
Nicolai Hähnle [Tue, 7 Feb 2017 16:58:39 +0000 (17:58 +0100)]
winsys/amdgpu: take fences when freeing a backing buffer

We never add fences to backing buffers during submit. When we free a
backing buffer, it must inherit the sparse buffer's fences, so that it
doesn't get re-used prematurely via the cache.

v2:
- remove pipe_mutex_*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: add sparse buffers to CS
Nicolai Hähnle [Tue, 7 Feb 2017 16:11:00 +0000 (17:11 +0100)]
winsys/amdgpu: add sparse buffers to CS

... and implement the corresponding fence handling.

v2:
- add missing bit in amdgpu_bo_is_referenced_by_cs_with_usage
- remove pipe_mutex_*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: sparse buffer creation / destruction / commitment
Nicolai Hähnle [Tue, 7 Feb 2017 16:04:49 +0000 (17:04 +0100)]
winsys/amdgpu: sparse buffer creation / destruction / commitment

This is the bulk of the buffer allocation logic. It is fairly simple and
stupid. We'll probably want to use e.g. interval trees at some point to
keep track of commitments, but Mesa doesn't have an implementation of those
yet.

v2:
- remove pipe_mutex_*
- fix total_backing_pages accounting
- simplify by using the new VA_OP_CLEAR/REPLACE kernel interface

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: add sparse buffer data structures
Nicolai Hähnle [Tue, 7 Feb 2017 16:03:59 +0000 (17:03 +0100)]
winsys/amdgpu: add sparse buffer data structures

v2:
- remove pipe_mutex_*
- use a simple page commitment array

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: extend amdgpu_add_fence to allow adding multiple fences
Nicolai Hähnle [Tue, 7 Feb 2017 16:53:49 +0000 (17:53 +0100)]
winsys/amdgpu: extend amdgpu_add_fence to allow adding multiple fences

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: build handles and flags list late on submit thread
Nicolai Hähnle [Tue, 7 Feb 2017 16:35:02 +0000 (17:35 +0100)]
winsys/amdgpu: build handles and flags list late on submit thread

This probably has only minor performance effects, but it simplifies some
subsequent code slightly.

Ideally, it could also be used to simplify the handling of slab buffers
in the same way, but unfortunately that's not possible as long as we need
indices for relocations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: share common code in amdgpu_add_fence_dependencies
Nicolai Hähnle [Tue, 7 Feb 2017 16:08:07 +0000 (17:08 +0100)]
winsys/amdgpu: share common code in amdgpu_add_fence_dependencies

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: extract amdgpu_do_add_real_buffer
Nicolai Hähnle [Tue, 7 Feb 2017 16:06:06 +0000 (17:06 +0100)]
winsys/amdgpu: extract amdgpu_do_add_real_buffer

We will use it for delayed adding of sparse buffers' backing buffers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>