Rhys Perry [Sat, 20 Oct 2018 13:54:10 +0000 (14:54 +0100)]
glsl_to_tgsi: don't create 64-bit integer MAD/FMA
TGSI has no I64MAD/U64MAD opcode.
Fixes: 278580729a5 ('st/glsl_to_tgsi: add support for 64-bit integers')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Tue, 7 Nov 2017 01:01:40 +0000 (02:01 +0100)]
radeonsi: add support for Raven2 (v2)
v2: fix enabling primitive binning
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Wed, 25 Apr 2018 23:21:06 +0000 (19:21 -0400)]
radeonsi: clean up decompress flags in fast color clear
Marek Olšák [Sat, 27 Oct 2018 02:07:29 +0000 (22:07 -0400)]
radeonsi/gfx9: set optimal OVERWRITE_COMBINER_WATERMARK
Marek Olšák [Thu, 25 Oct 2018 19:33:00 +0000 (15:33 -0400)]
gallium: rework PIPE_HANDLE_USAGE_* flags
Only radeonsi uses them, so adjust them to match its needs.
Danylo Piliaiev [Fri, 20 Jul 2018 09:54:42 +0000 (12:54 +0300)]
anv: Disable dual source blending when shader doesn't support it on gen8+
Dual source blending behaviour is undefined when shader doesn't
have second color output.
"If SRC1 is included in a src/dst blend factor and
a DualSource RT Write message is not used, results
are UNDEFINED. (This reflects the same restriction in DX APIs,
where undefined results are produced if “o1” is not written
by a PS – there are no default values defined)."
Dismissing fragment in such situation leads to a hang on gen8+
if depth test in enabled.
Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.
v2 (Jason Ekstrand):
- Apply the workaround to each individual entry
- Emit a warning through debug_report
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Danylo Piliaiev [Mon, 2 Jul 2018 14:04:23 +0000 (17:04 +0300)]
i965: Disable dual source blending when shader doesn't support it on gen8+
Dual source blending behaviour is undefined when shader doesn't
have second color output, dismissing fragment in such situation
leads to a hang on gen8+ if depth test in enabled.
Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.
v2 (Kenneth Graunke):
- Listen to BRW_NEW_FS_PROG_DATA in 3DSTATE_PS_BLEND
- Also whack BLEND_STATE[] to keep the two in sync, since we're not
sure exactly which copy of the redundant info the hardware will use.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107088
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Mon, 29 Oct 2018 22:29:10 +0000 (15:29 -0700)]
i965: Respect GL_TEXTURE_SRGB_DECODE_EXT in GenerateMipmaps()
Apparently, we're supposed to look at the texture object's built-in
sampler object's sRGB decode setting in order to decide whether to
decode/downsample/re-encode, or simply downsample as-is. Previously,
I had always done the decoding/encoding.
Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Andrii Simiklit [Wed, 12 Sep 2018 16:05:45 +0000 (19:05 +0300)]
i965/batch: don't ignore the 'brw_new_batch' call for a 'new batch'
If we restore the 'new batch' using 'intel_batchbuffer_reset_to_saved'
function we must restore the default state of the batch using
'brw_new_batch' function because the 'intel_batchbuffer_flush'
function will not do it for the 'new batch' again.
At least the following fields of the batch
'state_base_address_emitted','aperture_space', 'state_used'
should be restored to default values to avoid:
1. the aperture_space overflow
2. the missed STATE_BASE_ADDRESS commad in the batch
3. the memory overconsumption of the 'statebuffer'
due to uncleared 'state_used' field.
etc.
v2: merge with new commits, changes was minimized, added the 'fixes' tag
v3: added in to patch series
Fixes: 3faf56ffbdeb "intel: Add an interface for saving/restoring
the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Andrii Simiklit [Wed, 12 Sep 2018 16:05:44 +0000 (19:05 +0300)]
i965/batch: avoid reverting batch buffer if saved state is an empty
There's no point reverting to the last saved point if that save point is
the empty batch, we will just repeat ourselves.
CC: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 3faf56ffbdeb "intel: Add an interface for saving/restoring
the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Engestrom [Sun, 28 Oct 2018 18:20:20 +0000 (18:20 +0000)]
egl: add messages to a few assert() and turn a couple into unreachable()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Sun, 28 Oct 2018 18:04:01 +0000 (18:04 +0000)]
util: s/0/NULL/ for pointer
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Sun, 28 Oct 2018 17:58:05 +0000 (17:58 +0000)]
i965: add missing case to fix -Wswitch
While at it, turn "unreachable" assert() into unreachable().
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Sun, 28 Oct 2018 17:52:14 +0000 (17:52 +0000)]
mesa: fix struct/class mismatch
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Sun, 28 Oct 2018 17:50:47 +0000 (17:50 +0000)]
mesa: fix memcpy() and memset(0) of non-trivial structs
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Sun, 28 Oct 2018 18:10:35 +0000 (18:10 +0000)]
nouveau: remove unused class member
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Eric Engestrom [Sun, 28 Oct 2018 17:30:26 +0000 (17:30 +0000)]
scons: drop unused HAVE_STDINT_H macro
This was required back when MSVC didn't support C99 and was missing this
header, but since MSVC 2013 (or maybe earlier?) this isn't it does and
this code isn't doing anything anymore.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Tue, 30 Oct 2018 12:05:14 +0000 (12:05 +0000)]
aub_viewer: show vertex buffer pitch
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Tue, 30 Oct 2018 11:26:46 +0000 (11:26 +0000)]
meson: add note about intel tools build options
Fixes: ea83a1d304dc97d1d155a "intel: tools: import ImGui"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Fri, 26 Oct 2018 12:11:14 +0000 (13:11 +0100)]
vl: drop left-over variable
Fixes: 6ccc435e7ad92bb0ba77d "pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Anholt [Mon, 29 Oct 2018 23:05:18 +0000 (16:05 -0700)]
vc4: Fix unused variable warning.
Fixes: bb84fa146f22 ("util: use C99 declaration in the for-loop hash_table_foreach() macro")
Eric Anholt [Wed, 26 Sep 2018 16:22:51 +0000 (09:22 -0700)]
v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE
We were doing this late after nir_lower_io, but we can just reuse the core
code. By doing it at this stage, we won't even set up the VS attributes
as inputs, reducing our VPM size.
Eric Anholt [Fri, 28 Sep 2018 19:40:32 +0000 (12:40 -0700)]
v3d: Only add output slot tracking for the current varying slot.
We always emit 4 slots per slot because things like color output and
position processing in the epilogue will potentially look up more values
than the variable declaration had. However, when we get a .location_frac
!= 0, we don't want to overwrite components of the following
.driver_location.
Eric Anholt [Tue, 18 Sep 2018 17:34:11 +0000 (10:34 -0700)]
v3d: Use nir_lower_io_to_scalar_early to DCE unused VS input components.
This lets us trim unused trailing components in the vertex attributes,
reducing the size of our VPM allocations.
Eric Anholt [Tue, 18 Sep 2018 18:56:22 +0000 (11:56 -0700)]
v3d: Don't rely on sorting input vars for VPM read setup.
For supporting scalar VPM i/o at the NIR level, we need to do a pass over
the vars to figure out how big each attribute is after DCE. Once we've
done that, we can just walk over c->vattr_sizes[] instead of bothering
with vars.
Eric Anholt [Tue, 18 Sep 2018 18:40:54 +0000 (11:40 -0700)]
v3d: Split out NIR input setup between FS and VPM.
They don't share much code, and I'm about to rewrite the remaining shared
code for the VPM case.
Eric Anholt [Tue, 18 Sep 2018 17:35:34 +0000 (10:35 -0700)]
nir: Allow using nir_lower_io_to_scalar_early on VS input vars.
This will be used on V3D to cut down the size of the VS inputs in the VPM
(memory area for sharing data between shader stages).
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Jason Ekstrand [Mon, 10 Sep 2018 17:10:17 +0000 (12:10 -0500)]
anv: Bump the advertised patch version to 90
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Emil Velikov [Wed, 24 Oct 2018 17:53:11 +0000 (18:53 +0100)]
m4: add Werror when checking for compiler flags
Seemingly that at some point clang started accepting _any_ flags,
whereas previously it would error out.
These days, you can give it -Whamsandwich and it will succeed, while
at the same time throwing an annoying warning.
Add -Werror so that everything gets flagged and set accordingly.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108082
Cc: Vinson Lee <vlee@freedesktop.org>
Repored-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Dylan Baker [Tue, 23 Oct 2018 16:30:52 +0000 (09:30 -0700)]
docs/calendar: Add 18.3 plan and expand 18.2
Emil will be helping out with 18.3, while Juan finalises 18.2
v2: [Emil] add Emil for 18.3, fix typos
CC: Emil Velikov <emil.velikov@collabora.com>
CC: Juan A. Romero Suarez <jasuarez@igalia.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Emil Velikov [Mon, 22 Oct 2018 16:43:03 +0000 (17:43 +0100)]
vulkan/wsi: use the drmGetDevice2() API
On older kernels, the drmGetDevice() call will wake up all the GPUs
on the system, while fetching the PCI revision.
Use the 2 version of the API and pass flags == 0, so we don't fetch the
device PCI revision, since we don't need that information.
Fixes: baa38c144f6 ("vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Sat, 22 Sep 2018 14:46:26 +0000 (09:46 -0500)]
spirv: Pass SSA values through functions
Previously, we would create temporary variables and fill them out.
Instead, we create as many function parameters as we need and pass them
through as SSA defs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Mauro Rossi [Mon, 29 Oct 2018 11:59:56 +0000 (12:59 +0100)]
android: i965/tiled_memcpy: fix build for x86 generic target
x86 32 bit generic target does not enable ARCH_X86_HAVE_SSE4_1
for this reason all Android library modules using SSE4_1 in mesa
are built conditionally to ARCH_X86_HAVE_SSE4_1
The same approach is now applied to libmesa_intel_tiled_memcpy_sse41
in order to avoid the following building errors:
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:574:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val = _mm_stream_load_si128((__m128i *)src);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:578:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:579:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:580:15:
error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:581:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int'
__m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 errors generated.
Fixes: 11b1afdc92 ("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Toni Lönnberg [Mon, 29 Oct 2018 14:05:10 +0000 (16:05 +0200)]
intel: tools: Add handling for video pipe
Preliminary work for adding handling of different pipes to gen_decoder. We
need to be able to distinguish between different pipes in order to decode
the packets correctly due to opcode re-use.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Toni Lönnberg [Mon, 29 Oct 2018 11:56:44 +0000 (13:56 +0200)]
intel/decoder: Use 'DWord Length' and 'bias' fields for packet length.
Use the 'DWord Length' and 'bias' fields from the instruction definition to
parse the packet length from the command stream when possible. The hardcoded
mechanism is used whenever an instruction doesn't have this field.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Marek Olšák [Mon, 6 Aug 2018 02:50:54 +0000 (22:50 -0400)]
mesa: expose EXT_texture_compression_s3tc on GLES
The spec was modified to support GLES.
Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Michał Janiszewski [Mon, 29 Oct 2018 21:51:00 +0000 (15:51 -0600)]
mesa: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Michał Janiszewski [Mon, 29 Oct 2018 21:51:00 +0000 (15:51 -0600)]
glx: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Michał Janiszewski [Mon, 29 Oct 2018 21:51:00 +0000 (15:51 -0600)]
svga: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Michał Janiszewski [Mon, 29 Oct 2018 21:51:00 +0000 (15:51 -0600)]
glsl: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Eric Engestrom [Sun, 28 Oct 2018 16:46:21 +0000 (16:46 +0000)]
intel/batch-decoder: remove never-used function
This function was there when the file was introduced in commit
38f10d5a03542c60a589 "intel: tools: add aubinator viewer", but was
never actually used.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Tue, 9 Oct 2018 13:31:34 +0000 (14:31 +0100)]
st/dri: remove leftover local variable
Left over from the cleanup in
6ccc435e7ad92bb0ba77d "pipe-loader: move dup(fd)
within pipe_loader_drm_probe_fd"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Vadym Shovkoplias [Wed, 24 Oct 2018 10:28:23 +0000 (13:28 +0300)]
glsl/linker: Fix out variables linking during single stage
Since out variables are copied from shader objects instruction
streams to linked shader instruction steam it should be cloned
at first to keep source instruction steam unaltered.
Fixes: 966a797e433 ("glsl/linker: Link all out vars from a shader
objects on a single stage")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
Marek Olšák [Mon, 29 Oct 2018 21:22:03 +0000 (17:22 -0400)]
ac: fix ac_build_fdiv for f64
trivial
Fixes: a5f35aa742c
Brian Paul [Mon, 29 Oct 2018 17:15:09 +0000 (11:15 -0600)]
nir: fix yet another MSVC build break
Trivial.
Eric Engestrom [Sun, 28 Oct 2018 13:30:36 +0000 (13:30 +0000)]
vulkan/wsi: simplify meson file tracking
Meson already automatically tracks included headers, so there's no need
to add them everywhere; cleans up the code a bit.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Tue, 2 Oct 2018 13:58:29 +0000 (14:58 +0100)]
clover: add missing meson build dependency
Fixes: 42ea0631f108d82554339 "meson: build clover"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Tue, 2 Oct 2018 13:57:20 +0000 (14:57 +0100)]
svga: add missing meson build dependency
Fixes: a537231b226280bc1e5b7 "meson: build svga driver on linux"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Sun, 28 Oct 2018 13:11:21 +0000 (13:11 +0000)]
radv: add missing meson build dependency
Fixes: 9d40ec2cf6ec6d3d9d78 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Tue, 2 Oct 2018 13:54:17 +0000 (14:54 +0100)]
anv: add missing meson build dependency
Fixes: e4538b93f5d5177318f2 "anv: Implement VK_KHR_driver_properties"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Samuel Pitoiset [Fri, 5 Oct 2018 16:04:56 +0000 (18:04 +0200)]
radv: implement VK_EXT_transform_feedback
This implementation should work and potential bugs can be
fixed during the release candidates window anyway.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 5 Oct 2018 15:54:49 +0000 (17:54 +0200)]
radv: add multiple streams support for the GS copy shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 5 Oct 2018 15:54:22 +0000 (17:54 +0200)]
radv: emit stream outputs for vertex and tessellation stages
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 5 Oct 2018 15:51:22 +0000 (17:51 +0200)]
radv: declare streamout SGPRs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 5 Oct 2018 15:45:58 +0000 (17:45 +0200)]
radv: gather stream output info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Mon, 10 Sep 2018 13:36:58 +0000 (15:36 +0200)]
radv: allow to emit a vertex to a specified stream
This is required for GS multiple streams support.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 11 Sep 2018 12:39:42 +0000 (14:39 +0200)]
radv: allow to use up to 4 GSVS ring buffers
For all streams. We basically just need to update the
base address and compute a stride for every stream.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 11 Sep 2018 09:25:14 +0000 (11:25 +0200)]
radv: adjust the number of output components per stream
Same as the previous patch, except that is only the number of
components.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 11 Sep 2018 09:21:31 +0000 (11:21 +0200)]
radv: adjust the GSVS ring sizes based on the number of components
For multiple streams support we have to set the different ring
buffer sizes correctly. This relies on the number of output
components per stream.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 13 Sep 2018 13:39:43 +0000 (15:39 +0200)]
radv: gather which GS stream is used for every outputs
To only emit outputs for the given stream.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 11 Sep 2018 09:08:49 +0000 (11:08 +0200)]
radv: gather the number of output components per stream
This will be also used for splitting the GS->VS ring buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 11 Sep 2018 09:08:23 +0000 (11:08 +0200)]
radv: gather the number of streams used by geometry shaders
This will be used for splitting the GS->VS ring buffer. The
stream ID is always 0 for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Fri, 5 Oct 2018 14:13:25 +0000 (09:13 -0500)]
nir: Add a pass for gathering transform feedback info
This is different from the GL_ARB_spirv pass because it generates a much
simpler data structure that isn't tied to OpenGL and mtypes.h.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Jason Ekstrand [Mon, 29 Oct 2018 14:42:21 +0000 (09:42 -0500)]
vulkan: Update the XML and headers to 1.1.90
This doesn't include any new features but it does include an XML and
header typo fix for modifiers.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Mon, 29 Oct 2018 11:13:13 +0000 (12:13 +0100)]
radv: remove wrong comment in calculate_gs_ring_sizes() about streams
The computation seems correct compared to RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Rob Clark [Sun, 28 Oct 2018 14:45:06 +0000 (10:45 -0400)]
freedreno: don't flush when new and old pfb is identical
In the 'inorder' case (ie. FD_MESA_DEBUG=inorder, or old kernel), if the
u_blitter clear path is used (a3xx, a4xx, and some fallback cases on
newer gens), util_blitter_restore_fb_state() will set_framebuffer_state()
to something that is identical to the current fb state, which triggers
an unnecessary flush, and then eventually an assert:
(gdb) bt
#0 0x0000007fbf24a078 in kill () from /lib64/libc.so.6
#1 0x0000007fbe061278 in _debug_assert_fail (expr=0x7fbe93a820 "!batch->flushed", file=0x7fbe93a628 "../src/gallium/drivers/freedreno/freedreno_batch.c", line=491, function=0x7fbe93a990 <__func__.17380> "fd_batch_check_size") at ../src/gallium/auxiliary/util/u_debug.c:322
#2 0x0000007fbe1ccb8c in fd_batch_check_size (batch=0x55556d5a70) at ../src/gallium/drivers/freedreno/freedreno_batch.c:491
#3 0x0000007fbe1d0e08 in fd_clear (pctx=0x55555c61e0, buffers=5, color=0x55556e388c, depth=1, stencil=0) at ../src/gallium/drivers/freedreno/freedreno_draw.c:463
#4 0x0000007fbe57afa4 in st_Clear (ctx=0x55556e17b0, mask=18) at ../src/mesa/state_tracker/st_cb_clear.c:452
The assert was introduced in
4b847b38ae3, so from a functionality
standpoint this patch fixes that commit. But it should also avoid an
unnecessary flush in the 'inorder' case, fixing a performance bug.
Fixes: 4b847b38ae3 freedreno: make fd_batch a one-shot thing
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 27 Oct 2018 18:20:22 +0000 (14:20 -0400)]
freedreno: dependency tracking for z/s depends on ZSA state
ZSA state can change whether depth or stencil is enabled
This plus previous patch fix stk, and various things w/
FD_MESA_DEBUG=inorder
Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 27 Oct 2018 18:07:09 +0000 (14:07 -0400)]
freedreno: mark all state dirty after switching batch
The problem isn't directly with
ec717fc629 but rather that commit
exposes the problem. When we switch batch we cannot assume previous
state is clean so we should mark all state dirty.
Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jason Ekstrand [Tue, 16 Oct 2018 21:59:37 +0000 (16:59 -0500)]
anv: Use absolute timeouts in wait_for_bo_fences
We were previously using relative timeouts and decrementing the
user-provided timeout as we waited. Instead, this commit refactors
things to use absolute timeouts throughout. This should fix a subtle
bug in the waitAll case where we aren't decrementing the timeout after a
successful GPU wait. Since pthread_cond_timedwait already takes an
absolute timeout, it's also significantly simpler.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 26 Oct 2018 18:36:01 +0000 (13:36 -0500)]
anv: Flag semaphore BOs as external
It probably doesn't actually break anything but it does cause some
assertions in debug builds.
Fixes: 7a89a0d9edae6 "anv: Use separate MOCS settings for external BOs"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 2 Oct 2018 22:29:33 +0000 (17:29 -0500)]
anv: Improve the asserts in anv_buffer_get_range
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Rob Clark [Fri, 26 Oct 2018 18:34:04 +0000 (14:34 -0400)]
freedreno/a6xx: inline draw_impl()
Now that it is just called once per draw (instead of once for binning
and once for draw), let's just inline it. If nothing else, it makes
perf-annotate easier to look at.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 26 Oct 2018 17:50:58 +0000 (13:50 -0400)]
freedreno/a6xx: small cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 26 Oct 2018 17:48:38 +0000 (13:48 -0400)]
freedreno/a6xx: move where we handle dirty vbo state
Historically this wasn't in fdN_emit_state(), because prior to addition
of blitter in a5xx, fdN_emit_state() was also used in the clear path.
These days that is only true for a2xx (a3xx and a4xx use u_blitter). So
the reason for it not to be in fd6_emit_state() no longer exists.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 4 Oct 2018 12:05:49 +0000 (08:05 -0400)]
freedreno: avoid no-op flushes by re-using last-fence
Noticed that with webgl (in chromium, at least) we end up generating a
lot of no-op submits just to get a fence. Tracking the last fence and
returning that if there is no rendering since last flush avoids this.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Wed, 24 Oct 2018 19:02:00 +0000 (12:02 -0700)]
freedreno/a6xx: Move stencil/depth/alpha state to IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Kristian H. Kristensen [Thu, 25 Oct 2018 20:46:24 +0000 (13:46 -0700)]
freedreno/a6xx: Move stencil mask emit to FD_DIRTY_ZSA group
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Kristian H. Kristensen [Thu, 25 Oct 2018 20:35:15 +0000 (13:35 -0700)]
freedreno/a6xx: Rename FD6_GROUP_ZSA ro FD6_GROUP_LRZ
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Kristian H. Kristensen [Mon, 22 Oct 2018 16:35:39 +0000 (09:35 -0700)]
freedreno/a6xx: Move rasterizer state to state object
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Kristian H. Kristensen [Wed, 17 Oct 2018 21:18:56 +0000 (14:18 -0700)]
freedreno/a6xx: Fix set_blit_scissor helper
The scissor maxx/maxy are non-inclusive, so don't subtract one from
framebuffer width and height.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Kristian H. Kristensen [Tue, 16 Oct 2018 21:50:58 +0000 (14:50 -0700)]
freedreno/a2xx: Squash a compiler warning
We get a warning here for assigning a const char * pointer to
char *swizzle in struct ir2_src_register. The constructor strdups a 4
byte string here, so just memcpy to that instead.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Kristian H. Kristensen [Tue, 16 Oct 2018 21:28:57 +0000 (14:28 -0700)]
freedreno/a6xx: Use fd6_emit_ib from a6xx
Move it to a header and use it where possible to avoid vfunc call.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Rob Clark [Sun, 21 Oct 2018 14:22:11 +0000 (10:22 -0400)]
freedreno: import libdrm_freedreno + redesign submit
In the pursuit of lowering driver overhead, it became clear that some
amount of redesign of how libdrm_freedreno constructs the submit ioctl
would be needed. In particular, as the gallium driver is starting to
make heavier use of CP_SET_DRAW_STATE state groups/objects, the over-
head of tracking cmd buffers and relocs becomes too much. And for
"streaming" state, which isn't ever reused (like uniform uploads) the
overhead of allocating/freeing ringbuffer[1] objects is too high.
This redesign makes two main changes:
1) Introduces a fd_submit object for tracking bos and cmds table
for the submit ioctl, making ringbuffer objects more light-
weight. This was previously done in the ringbuffer. But we
have many ringbuffer instances involved in a submit (gmem +
draw + potentially 1000's of state-group rbs), and only need
a single bos and cmds table. (Reloc table is still per-rb)
The submit is also a convenient place for a slab allocator for
ringbuffer objects. Other options would have required locking
because, while we can guarantee allocations will only happen on
a single thread, free's could happen either on the application
thread or the flush_queue thread. With the slab allocator in
the submit object, any frees that happen on the flush_queue
thread happen after we know that the application thread is done
with the submit.
2) Introduce a new "softpin" msm_ringbuffer_sp implementation that
does not use relocs and only has cmds table entries for IB1 (ie.
the cmdstream buffers that kernel needs to CP_INDIRECT_BUFFER
to from the RB). To do this properly will require some updates
on the kernel side, so whether you get the softpin or legacy
submit/ringbuffer implementation at runtime depends on your
kernel version.
To make all these changes in libdrm would basically require adding a
libdrm_freedreno2, so this is a good point to just pull the libdrm code
into mesa. Plus it allows for using mesa's hashtable, slab allocator,
etc. And it lets us have asserts enabled for debug mesa buids but
omitted for release builds. And it makes life easier if further API
changes become necessary.
At this point I haven't tried to pull in the kgsl backend. Although
I left the level of vfunc indirection which would make it possible
to have other backends. (And this was convenient to keep to allow
for the "softpin" ringbuffer to coexist.)
NOTE: if bisecting a build error takes you here, try a clean build.
There are a bunch of ways things can go wrong if you still have
libdrm_freedreno cflags.
[1] "ringbuffer" is probably a bad name, the only level of cmdstream
buffer that is actually a ring is RB managed by kernel. User-
space cmdstream is all IB1/IB2 and state-groups.
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jason Ekstrand [Tue, 16 Oct 2018 19:58:41 +0000 (14:58 -0500)]
Revert "anv/skylake: disable ForceThreadDispatchEnable"
This reverts commit
0fa9e6d7b304f6a8064ed78a4b9c557e1026e7e5. The real
issue appears to have been that HiZ ops don't like having WM thread
dispatch force-enabled. The previous commit fixes that problem so we
can go back to using the ForceThreadDispatchEnable bit even on SKL+.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 16 Oct 2018 19:58:18 +0000 (14:58 -0500)]
blorp: Emit a dummy 3DSTATE_WM prior to 3DSTATE_WM_HZ_OP
Cc: mesa-stable@lists.freedesktop.org
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Axel Davy [Sun, 16 Sep 2018 15:43:56 +0000 (17:43 +0200)]
st/nine: Handle window resize when a presentation buffer is used
Usually when a window is resized, the app calls d3d to resize the back
buffer to the window size. In some cases, it is not done,
and it expects the output resizes to the window size, even if
the back buffer size is unchanged.
This patch introduces the behaviour when a presentation buffer
is used.
ID3DPresent_GetWindowInfo is a function available with
D3DPresent v1.0, and thus we don't need to check if the
function is available.
The function had been introduced to implement this very
feature.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 16 Sep 2018 15:24:45 +0000 (17:24 +0200)]
d3dadapter: Fix wrong naming in header file
GetWindowInfo used to be GetWindowSize before gallium
nine was merged. A left-over remained...
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 14 Oct 2018 21:31:07 +0000 (23:31 +0200)]
st/nine: Reduce MaxSimultaneousTextures to 8
Windows drivers don't set this flag (which affects ff) to more than 8.
Do the same in case some games check for 8.
v2: Remove any dependence on MaxSimultaneousTextures. For non-ff
the number of textures is 16 when the device is able of vs/ps3.
Add this requirement of 16 textures to the driver requirements.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 14 Oct 2018 20:02:06 +0000 (22:02 +0200)]
st/nine: Enable shadow mapping for ps 1.X
We didn't implement shadow textures for ps 1.X,
assuming the case couldn't happen...
Well it does.
Fixes: https://github.com/iXit/Mesa-3D/issues/261
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sat, 13 Oct 2018 21:33:47 +0000 (23:33 +0200)]
st/nine: Do not set unused states for stateblocks
A lot of these states are used only for the context,
and are unused for stateblocks (which just uses the
changed.* fields instead for a lot of them).
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sat, 13 Oct 2018 21:21:36 +0000 (23:21 +0200)]
st/nine: Fix aliasing states for stateblocks
If NINE_STATE_FF_MATERIAL is set, the stateblock will upload
its recorded materials matrix.
If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded.
These flags could be set by a NineDevice9_SetTransform call
or by setting some states related to ff, but that shouldn't trigger
these stateblock behaviours.
We don't need to follow the context states dirtied by render states.
NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock
updates of transformation matrices, NINE_STATE_FF is too broad.
These two changes avoid setting the two mentionned states when we
shouldn't.
Fixes: https://github.com/iXit/Mesa-3D/issues/320
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sat, 13 Oct 2018 20:45:16 +0000 (22:45 +0200)]
st/nine: Never update device changed.* fields
The device state changed.* field are never used.
These fields are used only for stateblocks.
Avoid setting them at all for clarity.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 23 Sep 2018 14:22:01 +0000 (16:22 +0200)]
st/nine: Capture also default matrices for D3DSBT_ALL
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 23 Sep 2018 14:45:30 +0000 (16:45 +0200)]
st/nine: Mark transform matrices dirty for D3DSBT_ALL
D3DSBT_ALL stateblocks capture the transform matrices.
Fixes some d3d test programs not displaying properly.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 23 Sep 2018 20:28:07 +0000 (22:28 +0200)]
st/nine: Don't update unused world matrices
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.
Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sat, 13 Oct 2018 20:28:34 +0000 (22:28 +0200)]
st/nine: Remove two unused states.
NINE_STATE_MATERIAL was used incorrectly at one location.
Replace it with the correct state.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sat, 13 Oct 2018 20:35:22 +0000 (22:35 +0200)]
st/nine: Remove commented nine_context_apply_stateblock
At some point the project was to adapt the
commented version to csmt.
The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Brian Paul [Fri, 26 Oct 2018 18:34:09 +0000 (12:34 -0600)]
nir: Fix array initializer
Empty initializer is not standard C. This fixes MSVC build.
Trivial.
Jason Ekstrand [Fri, 26 Oct 2018 13:32:39 +0000 (08:32 -0500)]
anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost
This lets us get rid of a bunch of duplicated error messages.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jason Ekstrand [Fri, 26 Oct 2018 13:24:49 +0000 (08:24 -0500)]
anv/util: Split a vk_errorv helper out of vk_errorf
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>