Jason Ekstrand [Wed, 5 Apr 2017 20:41:56 +0000 (13:41 -0700)]
i965/blorp: Bump the batch space estimate
Commit
f938354362655a378d474c5f79c52cea9852ab91 recently increased the
alignment on vertex buffer data from 32 to 64. This caused us to
consume a bit more batch than we were before and we now go over the
estimate by a small amount on certain blits on gen8+. This commit bumps
then gen8 batch estimate by a bit to compensate. Haswell and older
still seems to be well within the limit.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100582
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jordan Justen [Tue, 28 Mar 2017 20:36:11 +0000 (13:36 -0700)]
intel/aubinator: Stop searching after a custom handler is found
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jordan Justen [Tue, 28 Mar 2017 19:03:37 +0000 (12:03 -0700)]
intel/gen_decoder: return -1 for unknown command formats
Decoding with aubinator encountered a command of 0xffffffff. With the
previous code, it caused aubinator to jump 255 + 2 dwords to start
decoding again.
Instead we can attempt to detect the known instruction formats. If the
format is not recognized, then we can advance just 1 dword.
v2:
* Update aubinator_error_decode
* Actually convert the length variable returned into a *signed* integer
in aubinator.c, intel_batchbuffer.c and aubinator_error_decode.c.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jordan Justen [Tue, 28 Mar 2017 18:55:26 +0000 (11:55 -0700)]
intel/gen_decoder: Fix length for Media State/Object commands
From BDW PRM, Volume 6: Command Stream Programming, 'Render Command
Header Format'.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jordan Justen [Thu, 6 Apr 2017 05:34:42 +0000 (22:34 -0700)]
intel/aubinator_error_decode: Fix structure decode data
The call to gen_print_group should provide a pointer to the beginning
of the the structure data, not the start of the batch data.
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Nicolai Hähnle [Thu, 6 Apr 2017 14:44:11 +0000 (16:44 +0200)]
st/pbo: select the right swizzle for instance IDs
The system value only has an X component, and radeonsi started
checking that in debug builds.
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 4cf29427770f ("radeonsi: support 64-bit system values")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Tue, 4 Apr 2017 21:36:46 +0000 (14:36 -0700)]
anv/query: Busy-wait for available query entries
Before, we were just looking at whether or not the user wanted us to
wait and waiting on the BO. Some clients, such as the Serious engine,
use a single query pool for hundreds of individual query results where
the writes for those queries may be split across several command
buffers. In this scenario, the individual query we're looking for may
become available long before the BO is idle so waiting on the query pool
BO to be finished is wasteful. This commit makes us instead busy-loop on
each query until it's available.
This significantly reduces pipeline bubbles and improves performance of
The Talos Principle on medium settings (where the GPU isn't overloaded
with drawing) by around 20% on my SkyLake gt4.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Jason Ekstrand [Wed, 5 Apr 2017 16:55:15 +0000 (09:55 -0700)]
anv/device: Add a helper for querying whether a BO is busy
This is a bit more efficient than using GEM_WAIT with a timeout of 0.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tim Rowley [Wed, 29 Mar 2017 17:58:18 +0000 (12:58 -0500)]
swr: [rasterizer core] SIMD16 Frontend WIP
Implement widened binner for SIMD16
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 29 Mar 2017 20:06:28 +0000 (15:06 -0500)]
swr: [rasterizer core] Enable 8x2 backend
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 28 Mar 2017 21:58:58 +0000 (16:58 -0500)]
swr: [rasterizer codegen] remove copy of mako
mako is already a mesa build requirement, extra copy not needed.
Tested building against mesa build baseline (mako-0.8.0).
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 28 Mar 2017 21:18:21 +0000 (16:18 -0500)]
swr: [rasterizer core/memory] Move intrinics to _simd functions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 28 Mar 2017 20:32:04 +0000 (15:32 -0500)]
swr: [rasterizer core] Programmable sample position support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 29 Mar 2017 19:25:12 +0000 (14:25 -0500)]
swr: [configure.ac/scons] require c++14
New C++ features used by upcoming swr changes.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 28 Mar 2017 18:29:22 +0000 (13:29 -0500)]
swr: [rasterizer core] Fix center sample pattern
Fix long hidden bug in rasterizer handling of center sample pattern.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 28 Mar 2017 16:43:09 +0000 (11:43 -0500)]
swr: [rasterizer core/memory] Fix missing avx512 storetile
Fix pre-processor macro handing to eliminate silently missing
implementation for AVX512.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Sun, 26 Mar 2017 20:46:42 +0000 (15:46 -0500)]
swr: [rasterizer core] SIMD16 Frontend WIP
Implement widened VS output for SIMD16
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Timothy Arceri [Tue, 4 Apr 2017 06:38:35 +0000 (16:38 +1000)]
mesa: use internal function when deleting buffers
This avoids validation and looking up the buffer target for a second time.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 4 Apr 2017 05:40:06 +0000 (15:40 +1000)]
mesa: rework bind_buffer_object()
This allows internal users to pass buffer objects directly and
allows for KHR_no_error support to be more easily added.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 4 Apr 2017 02:39:31 +0000 (12:39 +1000)]
mesa: small texstate tidy up
Possibly more efficient, either way it makes the code easier to
follow.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Wed, 5 Apr 2017 10:03:47 +0000 (20:03 +1000)]
mesa: tidy up renderbuffer RefCount initialisation
42aaa548 changed the renderbuffer initialisation of RefCount from
1 to 0.
This is inconsitent with how we use RefCount elsewhere. Also every
driver implementation of NewRenderbuffer() calls
_mesa_init_renderbuffer() so its safe to set it there.
Reviewed-by: Brian Paul <brianp@vmware.com>
Christian Gmeiner [Sun, 26 Mar 2017 09:30:29 +0000 (11:30 +0200)]
Revert "etnaviv: Cannot render to rb-swapped formats"
This reverts commit
658568941d5e232d690e1ffbcddbd6ea9685693a.
With the help of shader variants we can render to rb-swapped
formats now. Fixes about 60 piglits.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Christian Gmeiner [Tue, 21 Mar 2017 19:00:31 +0000 (20:00 +0100)]
etnaviv: add support for rb swap
If we render to rb swapped format we will create a shader variant doing
the involved swizzing in the pixel shader.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Christian Gmeiner [Sun, 19 Mar 2017 21:27:55 +0000 (22:27 +0100)]
etnaviv: adapt shader-db output for variant support
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Christian Gmeiner [Sun, 19 Mar 2017 21:22:18 +0000 (22:22 +0100)]
etnaviv: bring back shader-db traces
If shader-db run, create a standard variant immediately
(as otherwise nothing will trigger the shader to be
actually compiled).
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Christian Gmeiner [Sun, 19 Mar 2017 16:19:10 +0000 (17:19 +0100)]
etnaviv: add etna_shader_key and generate variants if needed
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Christian Gmeiner [Sun, 19 Mar 2017 15:32:40 +0000 (16:32 +0100)]
etnaviv: pass a preallocated variant to compiler
In the long run the compiler needs to know the specifc variant
'key' in order to compile appropriate assembly. With this commit
the variant knows its shader and we are able pass the preallocated
variant into etna_compile_shader(..). This saves us from passing
extra ptrs everywhere.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Christian Gmeiner [Sun, 19 Mar 2017 15:31:47 +0000 (16:31 +0100)]
etnaviv: make specs const
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Christian Gmeiner [Tue, 14 Mar 2017 21:25:17 +0000 (22:25 +0100)]
etnaviv: add struct etna_shader_state
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Christian Gmeiner [Tue, 14 Mar 2017 18:57:15 +0000 (19:57 +0100)]
etnaviv: add basic shader variant support
This commit adds some basic infrastructure to handle shader
variants. We are still creating exactly one shader variant
for each shader.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Christian Gmeiner [Sun, 12 Feb 2017 16:11:44 +0000 (17:11 +0100)]
etnaviv: s/etna_shader/etna_shader_variant
Prep work to add shader variant support.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Christian Gmeiner [Sun, 12 Feb 2017 15:39:57 +0000 (16:39 +0100)]
etnaviv: remove not needed forward declarations
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Emil Velikov [Wed, 5 Apr 2017 12:21:26 +0000 (13:21 +0100)]
gallium/util: honour LIBUNWIND_CFLAGS
Fixes: 70c272004f72 ("gallium/util: libunwind support")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Rhys Kidd [Wed, 5 Apr 2017 03:56:04 +0000 (23:56 -0400)]
travis: Add radeonsi to continuous integration
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Rhys Kidd [Wed, 5 Apr 2017 03:58:59 +0000 (23:58 -0400)]
travis: Add radv vulkan driver to continuous integration
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 5 Apr 2017 16:52:51 +0000 (17:52 +0100)]
anv: provide required gem stubs for the tests
Introduce stubs to anv_gem_stub.c that match the anv_gem.c ones.
Otherwise we may get link-time errors, when building the tests.
v2: Introduce all the missing stubs at once.
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100574
Fixes: c964f0e485d ("anv: Query the kernel for reset status")
Fixes: 651ec926fc1 ("anv: Add support for 48-bit addresses")
Fixes: 060a6434eca ("anv: Advertise larger heap sizes")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
I've intentionally kept the order the same identical to the anv_gem.c.
This way we can easily grep & diff in the future ;-)
Emil Velikov [Wed, 5 Apr 2017 16:21:29 +0000 (17:21 +0100)]
configure.ac: pthread-stubs is not a thing on GNU/kFreeBSD
As mentioned on the xcb mailing list, the platform uses the GLIBC
forwarding mechanism.
https://lists.freedesktop.org/archives/xcb/2016-November/010896.html
Cc: Andreas Boll <andreas.boll.dev@gmail.com>
Reported-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Aaron Watry [Wed, 5 Apr 2017 01:16:02 +0000 (20:16 -0500)]
st/clover: Fix build after shrink of pipe_box
Fixes: 3dfe61e ("gallium: decrease the size of pipe_box - 24 -> 16 bytes")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100569
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Alex Deucher [Wed, 5 Apr 2017 13:40:53 +0000 (09:40 -0400)]
radeonsi: add new polaris10 pci id
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 09:19:39 +0000 (11:19 +0200)]
radeonsi: enable ARB_shader_ballot
Require LLVM 5.0 or later because LLVM 4.0 is easily fooled into
putting the lane select of llvm.amdgcn.readlane into a VGPR and then
fails to continue to compile.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 31 Mar 2017 16:42:51 +0000 (18:42 +0200)]
radeonsi: optimization barriers to work around LLVM deficiencies
Notably, llvm.amdgcn.readfirstlane and llvm.amdgcn.icmp may be hoisted
out of loops or if/else branches in cases like
if (cond) {
v = readFirstInvocationARB(x);
... use v ...
} else {
v = readFirstInvocationARB(x);
... use v ...
}
===>
v = readFirstInvocationARB(x);
if (cond) {
... use v ...
} else {
... use v ...
}
The optimization barrier is a heavy hammer to stop that until LLVM
is taught the semantics of the intrinsic properly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 31 Mar 2017 11:04:34 +0000 (13:04 +0200)]
radeonsi: strengthen emit_optimization_barrier
LLVM will lift inline assembly out of if-else-blocks if both paths have
the same inline assembly. Prevent this by adding an irrelevant unique
text to the assembly.
This requires the LLVM assembly parser to be initialized.
Furthermore, allow forcing subsequent computations to happen after the
optimization barrier by defining a data dependency.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 12:15:27 +0000 (14:15 +0200)]
radeonsi: emit TGSI_OPCODE_READ_*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 10:15:19 +0000 (12:15 +0200)]
radeonsi: emit TGSI_OPCODE_BALLOT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 12:15:10 +0000 (14:15 +0200)]
radeonsi: implement TGSI_SEMANTIC_SUBGROUP_*
64-bit system values are stored as v2i32 to simplify the fetch logic.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 31 Mar 2017 11:02:34 +0000 (13:02 +0200)]
radeonsi: support 64-bit system values
For simplicitly, always store system values as 32-bit values or arrays
of 32-bit values. 64-bit values are unpacked and packed accordingly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 12:14:27 +0000 (14:14 +0200)]
radeonsi: bump RADEON_LLVM_MAX_SYSTEM_VALUES
ARB_shader_ballot introduces 7 new system values that can be used
in all shader stages.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 09:16:27 +0000 (11:16 +0200)]
st/mesa: enable ARB_shader_ballot
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 12:14:45 +0000 (14:14 +0200)]
st/glsl_to_tgsi: implement ARB_shader_ballot system variables
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 10:15:51 +0000 (12:15 +0200)]
st/glsl_to_tgsi: implement ARB_shader_ballot builtin functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Thu, 9 Feb 2017 23:48:18 +0000 (18:48 -0500)]
tgsi: add SUBGROUP_* semantics
v2: add documentation (Nicolai)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Thu, 9 Feb 2017 23:38:17 +0000 (18:38 -0500)]
tgsi: add BALLOT/READ_* opcodes
v2 (Nicolai):
- BALLOT isn't per-channel
- expand the documentation (also for VOTE_*)
v3:
- only BALLOT returns a 64-bit lanemask (Boyan)
- relax the requirement on READ_INVOC: the invocation number to read
from must be uniform within a sub-group. This matches the
GL_ARB_shader_ballot spect (and the v_readlane instruction of AMD
GCN)
v4:
- hopefully really fix the doc of VOTE_* returns (Ilia)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Nicolai Hähnle [Thu, 30 Mar 2017 09:16:09 +0000 (11:16 +0200)]
gallium: add PIPE_CAP_TGSI_BALLOT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 10:10:00 +0000 (12:10 +0200)]
glsl: add gl_SubGroup*ARB builtins
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 09:18:47 +0000 (11:18 +0200)]
glsl: add ARB_shader_ballot builtin functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 09:18:30 +0000 (11:18 +0200)]
glsl: add ARB_shader_ballot operations
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 09:17:47 +0000 (11:17 +0200)]
glsl: add ARB_shader_ballot enable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 30 Mar 2017 09:14:01 +0000 (11:14 +0200)]
mesa: add GL_ARB_shader_ballot boilerplate
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Tue, 4 Apr 2017 17:18:45 +0000 (18:18 +0100)]
swr: automake: add gen_common.py to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Tue, 4 Apr 2017 16:54:28 +0000 (17:54 +0100)]
intel: genxml: automake: include gen_bits_header.py in the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Tue, 4 Apr 2017 16:10:51 +0000 (17:10 +0100)]
intel: genxml: automake: polish automake rules
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Tue, 4 Apr 2017 16:47:09 +0000 (17:47 +0100)]
amd/addrlib: automake: add all headers to the tarball
Fixes: 7f160efcde4 ("amd/addrlib: import gfx9 support")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Nicolai Hähnle [Thu, 2 Feb 2017 20:11:05 +0000 (21:11 +0100)]
radeonsi: enable ARB_sparse_buffer
v2:
- fill in DRM version requirement
- disable on SI due to CP DMA faults
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 24 Mar 2017 22:30:55 +0000 (23:30 +0100)]
radeonsi: disable SDMA clears and copies for sparse buffers
VM faults cannot be disabled for SDMA on <= VI.
We could still use SDMA by asking the winsys about which parts of the
buffers are committed. This is left as a potential future improvement.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 8 Feb 2017 10:07:19 +0000 (11:07 +0100)]
gallium/radeon: implement pipe->resource_commit
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 17:24:59 +0000 (18:24 +0100)]
gallium/radeon: transfers and invalidation for sparse buffers
Sparse buffers can never be mapped by the CPU.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 17:03:55 +0000 (18:03 +0100)]
gallium/radeon: implement sparse buffer creation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 13 Feb 2017 11:51:12 +0000 (12:51 +0100)]
winsys/amdgpu: sparse buffer debugging helpers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:58:39 +0000 (17:58 +0100)]
winsys/amdgpu: take fences when freeing a backing buffer
We never add fences to backing buffers during submit. When we free a
backing buffer, it must inherit the sparse buffer's fences, so that it
doesn't get re-used prematurely via the cache.
v2:
- remove pipe_mutex_*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:11:00 +0000 (17:11 +0100)]
winsys/amdgpu: add sparse buffers to CS
... and implement the corresponding fence handling.
v2:
- add missing bit in amdgpu_bo_is_referenced_by_cs_with_usage
- remove pipe_mutex_*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:04:49 +0000 (17:04 +0100)]
winsys/amdgpu: sparse buffer creation / destruction / commitment
This is the bulk of the buffer allocation logic. It is fairly simple and
stupid. We'll probably want to use e.g. interval trees at some point to
keep track of commitments, but Mesa doesn't have an implementation of those
yet.
v2:
- remove pipe_mutex_*
- fix total_backing_pages accounting
- simplify by using the new VA_OP_CLEAR/REPLACE kernel interface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:03:59 +0000 (17:03 +0100)]
winsys/amdgpu: add sparse buffer data structures
v2:
- remove pipe_mutex_*
- use a simple page commitment array
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:53:49 +0000 (17:53 +0100)]
winsys/amdgpu: extend amdgpu_add_fence to allow adding multiple fences
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:35:02 +0000 (17:35 +0100)]
winsys/amdgpu: build handles and flags list late on submit thread
This probably has only minor performance effects, but it simplifies some
subsequent code slightly.
Ideally, it could also be used to simplify the handling of slab buffers
in the same way, but unfortunately that's not possible as long as we need
indices for relocations.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:08:07 +0000 (17:08 +0100)]
winsys/amdgpu: share common code in amdgpu_add_fence_dependencies
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:06:06 +0000 (17:06 +0100)]
winsys/amdgpu: extract amdgpu_do_add_real_buffer
We will use it for delayed adding of sparse buffers' backing buffers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 16:00:10 +0000 (17:00 +0100)]
winsys/radeon: sparse buffers will not be supported
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 7 Feb 2017 15:59:54 +0000 (16:59 +0100)]
radeon/winsys: add sparse buffer interface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 2 Feb 2017 22:32:48 +0000 (23:32 +0100)]
st/mesa: plumbing for sparse buffers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 2 Feb 2017 20:11:29 +0000 (21:11 +0100)]
st/mesa: enable ARB_sparse_buffer when supported
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 23 Mar 2017 17:43:07 +0000 (18:43 +0100)]
trace: add resource_commit pass-through
v2: fix return type to bool (Marek)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 17 Feb 2017 13:55:49 +0000 (14:55 +0100)]
ddebug: add resource_commit pass-through
v2: fix return type to bool (Marek)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 2 Feb 2017 20:10:44 +0000 (21:10 +0100)]
gallium: add sparse buffer interface and capability
v2:
- explain the resource_commit interface in more detail
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 2 Feb 2017 22:24:35 +0000 (23:24 +0100)]
mesa: implement sparse buffer commitment
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 2 Feb 2017 20:50:32 +0000 (21:50 +0100)]
mesa: implement sparse storage buffer allocation
v2:
- spec quote and style (Ian)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 2 Feb 2017 19:56:48 +0000 (20:56 +0100)]
mesa: implement SPARSE_BUFFER_PAGE_SIZE_ARB
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 2 Feb 2017 19:47:31 +0000 (20:47 +0100)]
mesa: Add GL_ARB_sparse_buffer boilerplate
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 4 Apr 2017 14:35:27 +0000 (16:35 +0200)]
configure.ac: require libdrm_amdgpu 2.4.77
The sparse buffer implementation requires amdgpu_bo_va_op_raw.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Matt Turner [Wed, 5 Apr 2017 00:49:35 +0000 (10:49 +1000)]
mesa: Replace program locks with atomic inc/dec.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Jason Ekstrand [Fri, 17 Mar 2017 23:16:06 +0000 (16:16 -0700)]
anv: Advertise larger heap sizes
Instead of just advertising the aperture size, we do something more
intelligent. On systems with a full 48-bit PPGTT, we can address 100%
of the available system RAM from the GPU. In order to keep clients from
burning 100% of your available RAM for graphics resources, we have a
nice little heuristic (which has received exactly zero tuning) to keep
things under a reasonable level of control.
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
Jason Ekstrand [Sat, 18 Mar 2017 00:31:44 +0000 (17:31 -0700)]
anv: Add support for 48-bit addresses
This commit adds support for using the full 48-bit address space on
Broadwell and newer hardware. Thanks to certain limitations, not all
objects can be placed above the 32-bit boundary. In particular, general
and state base address need to live within 32 bits. (See also
Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order
to handle this, we add a supports_48bit_address field to anv_bo and only
set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit
for all client-allocated memory objects but leave it false for
driver-allocated objects. While this is more conservative than needed,
all driver allocations should easily fit in the first 32 bits of address
space and keeps things simple because we don't have to think about
whether or not any given one of our allocation data structures will be
used in a 48-bit-unsafe way.
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
Jason Ekstrand [Thu, 30 Mar 2017 18:48:05 +0000 (11:48 -0700)]
anv: Replace anv_bo::is_winsys_bo with a uint32_t flags
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
Jason Ekstrand [Fri, 31 Mar 2017 22:23:35 +0000 (15:23 -0700)]
i965/blorp: Align vertex buffers to 64B
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Fri, 31 Mar 2017 22:21:04 +0000 (15:21 -0700)]
anv/blorp: Align vertex buffers to 64B
This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Mon, 27 Mar 2017 23:03:57 +0000 (16:03 -0700)]
anv: Query the kernel for reset status
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible. In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueQaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering. In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 27 Mar 2017 23:01:42 +0000 (16:01 -0700)]
anv: Check for device loss at the end of WaitForFences
It's possible that the device could have been lost while we were
waiting. We should let the user know if this has happened.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 3 Apr 2017 19:25:15 +0000 (12:25 -0700)]
anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex
When the shader does not set one of these values, they are supposed to
get a default value of 0. We have hardware bits in 3DSTATE_CLIP for
this but haven't been setting them. This fixes the intermittent failure
of dEQP-VK.geometry.layered.3d.render_to_default_layer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Wed, 29 Mar 2017 22:16:15 +0000 (15:16 -0700)]
i965/fs: Always provide a default LOD of 0 for TXS and TXL
We already provide a default LOD for textureQueryLevels and texture() on
non-fragment stages. However, there are more cases where one is needed
such as textureSize(gsampler2DMS*) in SPIR-V. Instead of trying to list
out all of the cases one at a time, just provide the default for all TXS
and TXL operations. This fixes a shader validation error in the new
Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Thu, 23 Feb 2017 23:04:52 +0000 (15:04 -0800)]
mesa: Require mipmap completeness for glCopyImageSubData(), sometimes.
This patch makes glCopyImageSubData require mipmap completeness when the
texture object's built-in sampler object has a mipmapping MinFilter.
Fixes (on i965):
dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Vinson Lee [Tue, 4 Apr 2017 21:52:39 +0000 (14:52 -0700)]
libgl-xlib: Link with libunwind.
Fix linking error.
CXXLD libGL.la
../../../../src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In function `debug_backtrace_capture':
src/gallium/auxiliary/util/u_debug_stack.c:59: undefined reference to `_Ux86_64_getcontext'
src/gallium/auxiliary/util/u_debug_stack.c:60: undefined reference to `_ULx86_64_init_local'
src/gallium/auxiliary/util/u_debug_stack.c:62: undefined reference to `_ULx86_64_step'
src/gallium/auxiliary/util/u_debug_stack.c:71: undefined reference to `_ULx86_64_get_proc_info'
src/gallium/auxiliary/util/u_debug_stack.c:73: undefined reference to `_ULx86_64_get_proc_name'
src/gallium/auxiliary/util/u_debug_stack.c:65: undefined reference to `_ULx86_64_step'
Fixes: 70c272004f72 ("gallium/util: libunwind support")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100562
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>