mesa.git
5 years agoaco: fix opcode for s_mul_hi_i32
Rhys Perry [Sat, 21 Sep 2019 13:10:38 +0000 (14:10 +0100)]
aco: fix opcode for s_mul_hi_i32

Fixes dEQP-VK.glsl.builtin.function.integer.imulextended.*_compute

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: fix v_subrev_co_u32_e64 opcode
Rhys Perry [Tue, 24 Sep 2019 14:45:48 +0000 (15:45 +0100)]
aco: fix v_subrev_co_u32_e64 opcode

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: fix GFX9 opcode for v_xad_u32
Rhys Perry [Tue, 24 Sep 2019 13:34:28 +0000 (14:34 +0100)]
aco: fix GFX9 opcode for v_xad_u32

Fixes various dEQP-VK.image.store.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: implement 64-bit ineg
Rhys Perry [Wed, 25 Sep 2019 11:16:34 +0000 (12:16 +0100)]
aco: implement 64-bit ineg

We currently lower them, but nir_opt_algebraic() can add new ones because
lower_sub=true.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: run nir_lower_int64() before nir_lower_idiv()
Rhys Perry [Tue, 24 Sep 2019 14:15:26 +0000 (15:15 +0100)]
aco: run nir_lower_int64() before nir_lower_idiv()

nir_lower_idiv() asserts on 64-bit integers.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agonir: Fix overlapping vars in nir_assign_io_var_locations()
Connor Abbott [Tue, 24 Sep 2019 15:29:53 +0000 (17:29 +0200)]
nir: Fix overlapping vars in nir_assign_io_var_locations()

When handling two variables with overlapping locations, we process the
one with lower location first, and then extend the location ->
driver_location map to guarantee that it's contiguous for the second
variable too. But the loop had the wrong bound, so we weren't extending
the whole map, which could lead to problems later such as an incorrect
num_inputs. The loop index i is an index into the slots of the variable,
so we need to stop at the final slot of the variable (var_size) instead
of the number of unassigned slots.

This fixes
spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-interleave-range
on radeonsi NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoclover: eliminate "ignoring attributes on template argument" warning
Karol Herbst [Fri, 20 Sep 2019 11:08:50 +0000 (13:08 +0200)]
clover: eliminate "ignoring attributes on template argument" warning

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
5 years agoclover/codegen: remove unused get_symbol_offsets function
Karol Herbst [Fri, 20 Sep 2019 10:45:11 +0000 (12:45 +0200)]
clover/codegen: remove unused get_symbol_offsets function

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
5 years agoclover/llvm: remove harmful std::move call
Karol Herbst [Fri, 20 Sep 2019 10:43:10 +0000 (12:43 +0200)]
clover/llvm: remove harmful std::move call

both clang and gcc warn with:
"moving a local object in a return statement prevents copy elision"

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
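
A minimal sketch of why the std::move call was harmful. The Tracer type
and the make_* helpers are hypothetical and are not taken from clover;
they only count move constructions so the difference becomes visible.

```cpp
#include <utility>

// Hypothetical tracer type (not from clover): counts move constructions.
struct Tracer {
    static int moves;
    Tracer() = default;
    Tracer(const Tracer &) = default;
    Tracer(Tracer &&) noexcept { ++moves; }
};
int Tracer::moves = 0;

// Plain return of a local: the compiler may construct the result in place
// (copy elision), and otherwise falls back to an implicit move.
Tracer make_plain()
{
    Tracer t;
    return t;
}

// Explicit std::move: the returned expression is no longer the name of a
// local, so elision is not permitted and a move construction always runs.
// This is the pattern clang and gcc warn about.
Tracer make_with_move()
{
    Tracer t;
    return std::move(t);
}

// Resets the counter, calls the factory once and reports how many move
// constructions were needed to produce the result.
int count_moves(Tracer (*make)())
{
    Tracer::moves = 0;
    Tracer result = make();
    (void)result;
    return Tracer::moves;
}
```

count_moves(make_with_move) is always at least 1, while
count_moves(make_plain) is typically 0 on compilers that perform the
elision.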
5 years agoiris: disable aux on first get_param if not created with aux
Tapani Pälli [Mon, 2 Sep 2019 10:02:33 +0000 (13:02 +0300)]
iris: disable aux on first get_param if not created with aux

This moves the fix from commit 361f3d19f1f to happen in get_param
(used now instead of get_handle by st/dri). This fixes artifacts
seen with Xorg and CCS_E.

Fixes: fc12fd05f56 "iris: Implement pipe_screen::resource_get_param"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: correct bitcast-helpers
Erik Faye-Lund [Tue, 24 Sep 2019 14:57:03 +0000 (16:57 +0200)]
glsl: correct bitcast-helpers

Without this, we'll incorrectly round off huge values to the nearest
representable double instead of keeping them at their exact value, as
we're supposed to.

Found by inspecting compiler-warnings.
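
To illustrate the difference between a value conversion and a bitcast,
here is a small sketch. The helper names are made up for this example
and are not the ones used in the GLSL compiler.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(uint64_t),
              "sketch assumes a 64-bit double");

// Value conversion: the integer is rounded to the nearest representable
// double, so large values lose their low bits.
inline double convert_u64_to_double(uint64_t u)
{
    return static_cast<double>(u);
}

// Bitcast: the 64 bits are copied unchanged into the double.
inline double bitcast_u64_to_double(uint64_t u)
{
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// Inverse bitcast: recover the original 64 bits from the double.
inline uint64_t bitcast_double_to_u64(double d)
{
    uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}
```

With u = 2^53 + 1, the bitcast pair round-trips to exactly u, while the
value conversion rounds to 2^53 and the original value is lost.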

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 85faf5082f ("glsl: Add 64-bit integer support for constant expressions")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agolima/ppir: add support for indirect load of uniforms and varyings
Vasily Khoruzhick [Sat, 14 Sep 2019 18:00:16 +0000 (11:00 -0700)]
lima/ppir: add support for indirect load of uniforms and varyings

Utgard PP supports indirect load of uniforms and varyings, so let's
enable it.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/ppir: add node dependency types
Vasily Khoruzhick [Sat, 14 Sep 2019 18:01:03 +0000 (11:01 -0700)]
lima/ppir: add node dependency types

Currently we add dependencies in 3 cases:
1) One node consumes value produced by another node
2) Sequence dependencies
3) Write after read dependencies

2) and 3) only affect scheduler decisions, since we can still use a pipeline
register if we have only one dependency of type 1).

Add 3 dependency types and mark dependencies as we add them.
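
A rough sketch of the idea, assuming one tag per dependency edge. The
enum values, the Dep struct and the helper below are hypothetical and do
not reflect the actual ppir data structures.

```cpp
#include <vector>

// Hypothetical tags for the three cases listed above.
enum class DepType {
    Src,            // 1) successor consumes the value produced by pred
    Sequence,       // 2) ordering-only dependency
    WriteAfterRead  // 3) write-after-read dependency
};

// One dependency edge: succ depends on pred.
struct Dep {
    int pred;
    int succ;
    DepType type;
};

// A node's result can stay in a pipeline register when exactly one
// successor consumes it. Sequence and write-after-read edges are ignored
// here because they only constrain scheduling order.
inline bool can_use_pipeline_register(const std::vector<Dep> &deps, int node)
{
    int consumers = 0;
    for (const Dep &d : deps) {
        if (d.pred == node && d.type == DepType::Src)
            ++consumers;
    }
    return consumers == 1;
}
```

Without the type tag, the sequence and write-after-read edges would be
counted as consumers too and would needlessly rule out the pipeline
register.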

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/ppir: don't attempt to clone tex coords if it's not varying
Vasily Khoruzhick [Tue, 24 Sep 2019 04:20:07 +0000 (21:20 -0700)]
lima/ppir: don't attempt to clone tex coords if it's not varying

It makes no sense to clone texture coords if it's not varying, moreover
we don't support cloning ALU nodes.

Fixes: 1c1890fa7077 ("lima/ppir: clone uniforms and load_coords into each successor")
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoradeonsi/nir: lower load constants to scalar
Timothy Arceri [Fri, 20 Sep 2019 06:54:31 +0000 (16:54 +1000)]
radeonsi/nir: lower load constants to scalar

We call nir_lower_load_const_to_scalar in the state tracker's linker,
however some later passes can reintroduce constant vectors. Here
we lower these to scalar and perform optimisations. The Intel
drivers do a similar call in their backend.

shader-db results VEGA 64:

Totals from affected shaders:
SGPRS: 152168 -> 151976 (-0.13 %)
VGPRS: 135224 -> 135112 (-0.08 %)
Spilled SGPRs: 4027 -> 4163 (3.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 10670028 -> 10654776 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13122 -> 13135 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoturnip: use image tile_mode for gmem configuration
Jonathan Marek [Tue, 24 Sep 2019 18:39:55 +0000 (14:39 -0400)]
turnip: use image tile_mode for gmem configuration

Fixes at least this deqp test:
dEQP-VK.api.smoke.triangle

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoturnip: fix binning shader compilation
Jonathan Marek [Tue, 24 Sep 2019 18:36:53 +0000 (14:36 -0400)]
turnip: fix binning shader compilation

ir3 segfaults if nonbinning is NULL for the binning pass shader.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir/opt_remove_phis: handle phis with no sources
Rhys Perry [Mon, 23 Sep 2019 13:48:22 +0000 (14:48 +0100)]
nir/opt_remove_phis: handle phis with no sources

This can happen with loops with unreachable exits which are later
optimized away.

Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradeonsi: fix VAAPI segfault due to various bugs
Michel Dänzer [Thu, 19 Sep 2019 00:18:39 +0000 (20:18 -0400)]
radeonsi: fix VAAPI segfault due to various bugs

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111236

5 years agogallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH
Marek Olšák [Tue, 17 Sep 2019 22:22:08 +0000 (18:22 -0400)]
gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH

because vl doesn't call flush_resource and I wasn't able to find
all places where flush_resource needs to be called.

This fixes corrupted / unflushed surfaces with fullscreen videos on Raven.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
5 years agoradeonsi: initialize displayable DCC using the retile blit to prevent hangs
Marek Olšák [Tue, 17 Sep 2019 02:31:48 +0000 (22:31 -0400)]
radeonsi: initialize displayable DCC using the retile blit to prevent hangs

Cc: 19.2 <mesa-stable@lists.freedesktop.org>

5 years agonir/opt_large_constants: Handle store writemasks
Connor Abbott [Tue, 24 Sep 2019 10:43:29 +0000 (12:43 +0200)]
nir/opt_large_constants: Handle store writemasks

This fixes some piglit tests on radeonsi NIR where a varying is
initialized to a constant array in the vertex shader. Varying packing
after nir_lower_io_to_temporaries creates writemasked stores which
persist after pulling the constant initialization down into the fragment
shader.

While we're here, rewrite handle_constant_store() to do the loop over
components outside the switch, so that we don't have to duplicate the
writemask checking for every bitsize.
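
The restructuring can be sketched roughly as follows: one loop over the
components with a single writemask check, and the switch on bit size
inside it. The function name and signature are invented for this
example and are not those of handle_constant_store().

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: writes the components selected by writemask from
// src into the packed destination buffer and leaves the others untouched.
inline void store_masked_components(uint8_t *dst, const uint64_t *src,
                                    unsigned num_components,
                                    unsigned bit_size, unsigned writemask)
{
    const unsigned bytes = bit_size / 8;

    for (unsigned i = 0; i < num_components; i++) {
        // The writemask is checked once here instead of once per bit size.
        if (!(writemask & (1u << i)))
            continue;

        uint8_t *slot = dst + i * bytes;
        switch (bit_size) {
        case 8: {
            const uint8_t v = static_cast<uint8_t>(src[i]);
            std::memcpy(slot, &v, sizeof v);
            break;
        }
        case 16: {
            const uint16_t v = static_cast<uint16_t>(src[i]);
            std::memcpy(slot, &v, sizeof v);
            break;
        }
        case 32: {
            const uint32_t v = static_cast<uint32_t>(src[i]);
            std::memcpy(slot, &v, sizeof v);
            break;
        }
        case 64: {
            const uint64_t v = src[i];
            std::memcpy(slot, &v, sizeof v);
            break;
        }
        default:
            break; // unsupported bit size: nothing is written
        }
    }
}
```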

Fixes: 1235850522c ("nir: Add a large constants optimization pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomeson: split more compiler options to their own line
Eric Engestrom [Mon, 23 Sep 2019 17:53:22 +0000 (18:53 +0100)]
meson: split more compiler options to their own line

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agomeson: drop -Wno-foo bug workaround for Meson < 0.46
Eric Engestrom [Mon, 23 Sep 2019 17:37:01 +0000 (18:37 +0100)]
meson: drop -Wno-foo bug workaround for Meson < 0.46

This was a workaround for a bug in Meson that was fixed in 0.46 [1].

[1] https://github.com/mesonbuild/meson/pull/2284

Fixes: f7b6a8d12fdc446e3251 ("meson: bump required version to 0.46")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoradv: fix s/load/store/ copy-paste typo
Eric Engestrom [Mon, 23 Sep 2019 16:48:55 +0000 (17:48 +0100)]
radv: fix s/load/store/ copy-paste typo

Fixes: cdc6efddf918bc07d30d ("radv: implement all depth/stencil resolve modes using graphics")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonouveau: add idep_nir_headers as dep for libnouveau
Stephen Barber [Tue, 24 Sep 2019 00:51:43 +0000 (17:51 -0700)]
nouveau: add idep_nir_headers as dep for libnouveau

Fixes a compilation error when building libnouveau:

In file included from ../src/gallium/drivers/nouveau/nv50/nv50_program.c:25:
../src/compiler/nir/nir.h:1115:10: fatal error: nir_intrinsics.h: No such file or directory
 #include "nir_intrinsics.h"
           ^~~~~~~~~~~~~~~~~~
           compilation terminated.

Fixes: f014ae3c7cce504afe5d ("nouveau: add support for nir")
Signed-off-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agoradv: Add workaround for hang in The Surge 2.
Bas Nieuwenhuizen [Tue, 24 Sep 2019 00:53:21 +0000 (02:53 +0200)]
radv: Add workaround for hang in The Surge 2.

Released today and hangs on RADV. We don't have the root cause yet,
but this should unblock people playing the game.

No drirc because the radv debugflags are not usable from drirc and
I want this backported.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoi965/fs: set rounding mode when emitting the flrp instruction
Andres Gomez [Mon, 23 Sep 2019 22:37:57 +0000 (01:37 +0300)]
i965/fs: set rounding mode when emitting the flrp instruction

flrp was missed when the rounding mode was added for other
instructions.

Fixes: ba1e25e1aa6 ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions")
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agoi965/fs: add a comment about how the rounding mode in fmul is set
Andres Gomez [Mon, 23 Sep 2019 22:16:11 +0000 (01:16 +0300)]
i965/fs: add a comment about how the rounding mode in fmul is set

After
1711bf6cf2d ("intel/fs: Generate better code for fsign multiplied by a value"),
the conflict resolution for setting the rounding mode after the
fused fmul and fsign optimization is non-obvious.

Basically, the optimization doesn't really result in a MUL, or any
other operation which would need to have the rounding mode set. Hence,
we set it just before the actual MUL in the treatment of fmul.

Fixes: ba1e25e1aa6 ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions")
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agobin/get-pick-list.sh: sha1 commits can be smaller than 8 chars
Juan A. Suarez Romero [Tue, 10 Sep 2019 08:30:43 +0000 (10:30 +0200)]
bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars

The script only handles commits with "Fixes: <sha1>" where <sha1> is
at least 8 chars long. But <sha1> can be shorter, like 7 chars.

This commit relaxes the restriction to handle a <sha1> of 4 or more chars.

Fixes: 533fead4236 ("bin/get-pick-list.sh: tweak the commit sha matching pattern")
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years agolima/gpir: Fix 64-bit shift in scheduler spilling
Connor Abbott [Wed, 18 Sep 2019 17:47:28 +0000 (00:47 +0700)]
lima/gpir: Fix 64-bit shift in scheduler spilling

There are 64 physical registers so the shift must be 64 bits.
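
A small sketch of the pitfall, assuming the live registers are tracked
as one bit per register in a 64-bit mask. The helper names are
hypothetical and are not taken from the gpir scheduler.

```cpp
#include <cstdint>

// From the commit message: there are 64 physical registers.
constexpr unsigned kNumPhysRegs = 64;

// The constant must be 64 bits wide before shifting. A plain "1 << reg"
// shifts a 32-bit int, which is undefined for reg >= 32.
inline uint64_t reg_bit(unsigned reg)
{
    return uint64_t{1} << reg;
}

// Tests whether the given register is set in a liveness mask.
inline bool reg_is_live(uint64_t live_mask, unsigned reg)
{
    return (live_mask & reg_bit(reg)) != 0;
}
```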

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Don't emit movs when translating from NIR
Connor Abbott [Wed, 18 Sep 2019 11:13:08 +0000 (18:13 +0700)]
lima/gpir: Don't emit movs when translating from NIR

The scheduler doesn't expect them. To do this, I had to refactor the
registration part of gpir_node_create_dest() to be separate from
creating and inserting the node, since the last two now aren't done when
handling moves. This adds more code but creates the possibility of
automatically inserting input dependencies when inserting nodes, similar
to what's done in NIR with the use-def lists (this isn't done yet).

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Fix postlog2 fixup handling
Connor Abbott [Wed, 18 Sep 2019 05:29:45 +0000 (12:29 +0700)]
lima/gpir: Fix postlog2 fixup handling

We guarantee that a complex1 op is always used by postlog2 directly by
rewriting the postlog2 op to be a move when there would be a move
inserted between them. But we weren't doing this in all circumstances
where there might be a move. Move the logic to place_move() so that it
always happens. Fixes a few log tests that happened to start failing due
to changes in the register allocator leading to a different scheduling
order.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Use registers for values live in multiple blocks
Connor Abbott [Fri, 13 Sep 2019 06:23:56 +0000 (13:23 +0700)]
lima/gpir: Use registers for values live in multiple blocks

This commit adds the framework for cross-basic-block register
allocation. Like ARM's compiler, we assume that the value registers
aren't usable across branches, which means we have to use physical
registers to store any value that crosses a basic block. There are three
parts to this:

1. When translating from NIR, we rely on the NIR out-of-ssa pass to
coalesce values into registers. We insert store_reg instructions for
values used in more than one basic block, and load_reg instructions for
values not defined in the same basic block (or defined after their use,
for loops). So by the time we've translated out of NIR we've already
split things into values (which are only used in the same basic block)
and registers (which are only used in different basic blocks than where
they're defined).

2. We allocate the registers at the same time that we allocate the
values, before the final scheduler. Unlike the values, where the
assigned color is fake, we assign the actual physical index & component
to physregs at this stage. load_reg and store_reg are treated as moves
in the allocator and when creating write-after-read dependencies.

3. Finally, in the main scheduler we have to avoid overwriting existing
live physregs when spilling. First, we have to tell the scheduler which
physical registers are live at the end of each block, to avoid
overwriting those. If a register is only live at the beginning, we can
reuse it for spilling after the last original use in the final program
happens, i.e. before any original use is scheduled, but we have to be
careful to add the proper dependencies so that the spill write is
scheduled before the original reads. To handle this we repurpose
reg_link for uses to be used by the scheduler.

A few register-related things copied over from NIR or from other
drivers can be dropped.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Support branch instructions
Connor Abbott [Wed, 28 Aug 2019 08:57:35 +0000 (10:57 +0200)]
lima/gpir: Support branch instructions

Because branch conditions have to be in the pass slot, there is no
unconditional branch, and realistically the pass slot has to contain a
move when branching (there's nothing it does that would be useful for
operating on booleans, so we can't use it for anything when computing
the branch condition), we put the branch instruction in the pass slot
and at codegen time turn it into a move of the branch condition. This
means that it doesn't have to be special-cased like store instructions
are in the scheduler. Because of this decision we can remove the
half-implemented BRANCH codegen slot. Finally, we (ab)use the existing
schedule_first mechanism to make sure that branches are always last in
the basic block.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Only try to place actual children
Connor Abbott [Tue, 10 Sep 2019 14:11:42 +0000 (21:11 +0700)]
lima/gpir: Only try to place actual children

When picking a node to be scheduled, we try to schedule its children as
well. But we shouldn't try to schedule nodes which only have a fake
dependency on the original node, since this isn't the point of
scheduling children at the same time and can break some expectations of
the rest of the code.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Fix compiler warning
Connor Abbott [Fri, 13 Sep 2019 06:18:29 +0000 (13:18 +0700)]
lima/gpir: Fix compiler warning

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoglx: Implement GLX_EXT_no_config_context
Adam Jackson [Tue, 14 Nov 2017 20:13:06 +0000 (15:13 -0500)]
glx: Implement GLX_EXT_no_config_context

This is the GLX counterpart to EGL_KHR_no_config_context. Contexts may
now be created without reference to an fbconfig, in which case it is
treated as compatible with any fbconfig (and thus any GLX drawable).

Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoglx: Lift sending the MakeCurrent request to top-level code
Adam Jackson [Tue, 14 Nov 2017 20:13:04 +0000 (15:13 -0500)]
glx: Lift sending the MakeCurrent request to top-level code

Somewhat terrifyingly, we never sent this for direct contexts, which
means the server never knew the context/drawable bindings. To handle
this sanely, pull the request code up out of the indirect backend, and
rewrite the context switch path to call it as appropriate.  This
attempts to preserve the existing behavior of not calling unbind() on
the context if its refcount would not drop to zero.

Of course, you can't just do this indiscriminately, because this is GLX
and extant X servers have bugs and everything is terrible. To wit:

- For 1.20.x prior to 1.20.6, you can bind a direct context once, but
the second time you try to modify the context's binding you will get
GLXBadContextTag. This includes unbinding the context. And "deleting"
the context will leak memory, because it will still appear to be
current.

- For 1.19 and earlier, glXMakeCurrent(dpy, None, ctx) should be legal
for GL 3.0+ contexts, but the server will throw BadMatch.

To guard against this, we only send the request for indirect contexts
unless the server is known good, and only mention one context at a time
in such a request; if switching between contexts, we first unbind the
old, and then bind the new. Note that the second VendorRelease() version
is to catch XFree86 4.x and Xorg [67].x, which almost certainly have the
above bugs. Other servers might report different version numbers here,
but we can't do direct rendering against them, so this should be safe.

Fixes glx-make-context, glx-multi-window-single-context and
glx-query-drawable-glx_fbconfig_id-window. Sufficiently old piglit will
regress on glx-make-glxdrawable-current (throwing BadMatch), which is
fixed by mesa/piglit!116.

5 years agoglx: Move vertex array protocol state into the indirect backend
Adam Jackson [Tue, 14 Nov 2017 20:13:03 +0000 (15:13 -0500)]
glx: Move vertex array protocol state into the indirect backend

Only relevant for indirect contexts, so let's get that code out of the
common path.

5 years agointel: Increase Gen11 compute shader scratch IDs to 64.
Kenneth Graunke [Fri, 23 Aug 2019 00:32:25 +0000 (17:32 -0700)]
intel: Increase Gen11 compute shader scratch IDs to 64.

From the MEDIA_VFE_STATE docs:

   "Starting with this configuration, the Maximum Number of Threads must
    be set to (#EU * 8) for GPGPU dispatches.

    Although there are only 7 threads per EU in the configuration, the
    FFTID is calculated as if there are 8 threads per EU, which in turn
    requires a larger amount of Scratch Space to be allocated by the
    driver."

It's pretty clear that we need to increase this for scratch address
calculations, because the FFTID has a certain bit-pattern.  The quote
above seems to indicate that we should increase the actual thread count
programmed in MEDIA_VFE_STATE as well, but we think the intention is to
only bump the scratch space.

Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.

Fixes: 5ac804bd9ac ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoRevert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM"
Kenneth Graunke [Mon, 23 Sep 2019 23:30:29 +0000 (16:30 -0700)]
Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM"

This reverts commit 729de1488f49033bc181b8123af5658228a51bf1.

It turns out that, although the register is in the logical context,
it isn't whitelisted, so we can't actually write it from userspace
batch buffers.  The write just becomes a noop, which is why we saw
no performance changes.

I manually whitelisted it, and still observed no performance gains, but
it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments
on the iris driver.  So we might need to fix something before enabling
this.  To prevent it randomly getting turned on should the kernel ever
whitelist this register, we revert the patch for now.

5 years agoutil/rb_tree: Replace useless ifs with asserts
Jason Ekstrand [Mon, 23 Sep 2019 17:24:12 +0000 (12:24 -0500)]
util/rb_tree: Replace useless ifs with asserts

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agobroadcom/genxml: Stop manually scrubbing 'α' -> "alpha"
Kenneth Graunke [Fri, 20 Sep 2019 22:43:12 +0000 (15:43 -0700)]
broadcom/genxml: Stop manually scrubbing 'α' -> "alpha"

'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agointel/genxml: Stop manually scrubbing 'α' -> "alpha"
Kenneth Graunke [Thu, 22 Aug 2019 20:29:31 +0000 (13:29 -0700)]
intel/genxml: Stop manually scrubbing 'α' -> "alpha"

'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agofreedreno/a6xx: do streamout only in binning pass
Rob Clark [Fri, 20 Sep 2019 21:58:49 +0000 (14:58 -0700)]
freedreno/a6xx: do streamout only in binning pass

Use VPC_SO_OVERRIDE to control whether we do streamout in binning or
draw pass.  Normally we want to do streamout in binning pass, except
when there is a single tile and the binning pass is skipped.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: fix binning pass vs. xfb
Rob Clark [Fri, 20 Sep 2019 20:50:21 +0000 (13:50 -0700)]
freedreno/a6xx: fix binning pass vs. xfb

We could be doing streamout from the binning pass.  In this case we want to
use the full VS which doesn't have (potentially streamed out) varyings
stripped out.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: un-open-code PC_PRIMITIVE_CNTL_1.PSIZE
Rob Clark [Thu, 19 Sep 2019 18:30:01 +0000 (11:30 -0700)]
freedreno/a6xx: un-open-code PC_PRIMITIVE_CNTL_1.PSIZE

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoac/nir: force unnormalized coordinates for RECT
Marek Olšák [Wed, 18 Sep 2019 23:41:18 +0000 (19:41 -0400)]
ac/nir: force unnormalized coordinates for RECT

This fixes VAAPI.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agoac/nir: port Z compare value clamping from radeonsi
Marek Olšák [Wed, 18 Sep 2019 19:33:45 +0000 (15:33 -0400)]
ac/nir: port Z compare value clamping from radeonsi

This fixes some dEQP tests.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agotgsi_to_nir: fix 2-component system values like tess_level_inner_default
Marek Olšák [Wed, 18 Sep 2019 19:12:30 +0000 (15:12 -0400)]
tgsi_to_nir: fix 2-component system values like tess_level_inner_default

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: fix masked out image loads
Marek Olšák [Wed, 18 Sep 2019 21:34:13 +0000 (17:34 -0400)]
tgsi_to_nir: fix masked out image loads

This caused a failure in NIR validation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: define 8-byte size and alignment for bindless variables
Marek Olšák [Wed, 18 Sep 2019 19:25:15 +0000 (15:25 -0400)]
nir: define 8-byte size and alignment for bindless variables

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonir: don't add bindless variables to num_textures and num_images
Marek Olšák [Wed, 18 Sep 2019 19:19:29 +0000 (15:19 -0400)]
nir: don't add bindless variables to num_textures and num_images

It confuses radeonsi.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agoamd: remove all PCI IDs supported by amdgpu
Marek Olšák [Wed, 18 Sep 2019 21:12:30 +0000 (17:12 -0400)]
amd: remove all PCI IDs supported by amdgpu

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoloader: always map the "amdgpu" kernel driver name to radeonsi (v2)
Jiang, Sonny [Tue, 3 Sep 2019 22:33:57 +0000 (22:33 +0000)]
loader: always map the "amdgpu" kernel driver name to radeonsi (v2)

v2: cleanup

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoac: stop using PCI IDs for chip identification
Marek Olšák [Wed, 18 Sep 2019 21:07:31 +0000 (17:07 -0400)]
ac: stop using PCI IDs for chip identification

PCI IDs for amdgpu will be removed from Mesa.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir
Marek Olšák [Wed, 18 Sep 2019 21:05:09 +0000 (17:05 -0400)]
ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoamd: add more PCI IDs for Navi14
Marek Olšák [Mon, 23 Sep 2019 19:08:38 +0000 (15:08 -0400)]
amd: add more PCI IDs for Navi14

trivial and urgent

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
5 years agomeson: split compiler warnings one per line
Eric Engestrom [Mon, 23 Sep 2019 16:20:32 +0000 (17:20 +0100)]
meson: split compiler warnings one per line

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agonir/repair_ssa: Replace the unreachable check with the phi builder
Jason Ekstrand [Mon, 9 Sep 2019 18:38:37 +0000 (13:38 -0500)]
nir/repair_ssa: Replace the unreachable check with the phi builder

In a3268599f3c9, I attempted to fix nir_repair_ssa for unreachable
blocks.  However, that commit missed the possibility that the use is in
a block which, itself, is unreachable.  In this case, we can end up in
an infinite loop trying to replace a def with itself.  Even though a
no-op replacement is a fine operation, it keeps extending the end of the
uses list as we're walking it.  Instead of explicitly checking for the
group of conditions, just check if the phi builder gives us a different
def.  That's guaranteed to be 100% reliable and, while it lacks symmetry
with the is_valid checks, should be more reliable.

Fixes: a3268599 "nir/repair_ssa: Repair dominance for unreachable..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoaco: only emit waitcnt on loop continues if there was some load or export
Daniel Schürmann [Thu, 19 Sep 2019 16:48:01 +0000 (18:48 +0200)]
aco: only emit waitcnt on loop continues if there was some load or export

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
5 years agonv50/ir/nir: comparison of integer expressions of different signedness warning
Karol Herbst [Fri, 20 Sep 2019 17:47:14 +0000 (19:47 +0200)]
nv50/ir/nir: comparison of integer expressions of different signedness warning

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
5 years agonv50/ir: fix unnecessary parentheses warning
Karol Herbst [Fri, 20 Sep 2019 17:45:22 +0000 (19:45 +0200)]
nv50/ir: fix unnecessary parentheses warning

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
5 years agolima: remove partial clear support from pipe->clear()
Erico Nunes [Thu, 19 Sep 2019 19:08:05 +0000 (21:08 +0200)]
lima: remove partial clear support from pipe->clear()

pipe->clear() is not called for partial clears, which mesa emulates by
drawing a quad.
Furthermore, drivers should not use rasterizer state information for
scissor information (which was being used to handle the partial clears).
So, remove the partial clear support since it was not supposed to be
handled by pipe->clear() anyway.
This fixes issues with clearing after switching to different sized
framebuffers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agodEQP-GLES2.functional.buffer.write.use.index_array.* are passing now.
Boris Brezillon [Wed, 18 Sep 2019 13:23:09 +0000 (15:23 +0200)]
dEQP-GLES2.functional.buffer.write.use.index_array.* are passing now.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agopanfrost: Fix indexed draws
Boris Brezillon [Wed, 18 Sep 2019 13:22:24 +0000 (15:22 +0200)]
panfrost: Fix indexed draws

->padded_count should be large enough to cover all vertices pointed to
by the index array. Use the local vertex_count variable that contains
the updated vertex_count value for the indexed draw case.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoclover/nir: fix compilation with g++-5.5 and maybe earlier
Karol Herbst [Sun, 22 Sep 2019 13:27:33 +0000 (15:27 +0200)]
clover/nir: fix compilation with g++-5.5 and maybe earlier

fixes "sorry, unimplemented: non-trivial designated initializers not supported"

Fixes: deb04adf2ae ("clover: add support for passing kernels as nir to the driver")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agost/mesa: Bail on incomplete attachments in discard_framebuffer
Kenneth Graunke [Fri, 20 Sep 2019 21:33:51 +0000 (14:33 -0700)]
st/mesa: Bail on incomplete attachments in discard_framebuffer

Incomplete attachments don't have an associated pipe_surface, so
this would crash.

Fixes a WebGL conformance test that uses incomplete attachments:
https://www.khronos.org/registry/webgl/sdk/tests/conformance2/renderbuffers/invalidate-framebuffer.html?webglVersion=2&quiet=0&quick=1

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111756
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
5 years agolima: implement BO cache
Vasily Khoruzhick [Sun, 8 Sep 2019 02:33:07 +0000 (19:33 -0700)]
lima: implement BO cache

Allocating BOs is expensive, so we should avoid doing that by caching
freed BOs.

The BO cache is modelled after the one in the v3d driver and works as
follows:

- in lima_bo_create(): check whether a matching BO is in the cache and
  return it if there is one; otherwise allocate a new BO.
- in lima_bo_unreference() (renamed from lima_bo_free()): put the BO in
  the cache instead of freeing it, and remove all stale BOs from the cache.
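
The two steps above can be sketched roughly as follows. This is a minimal
illustration, not the lima code: every toy_* name, the fixed slot array, the
caller-advanced clock and the STALE_AFTER limit are assumptions made for the
example, and reference counting is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SLOTS 8   /* hypothetical cache capacity */
#define STALE_AFTER 2   /* hypothetical age limit, in clock ticks */

struct toy_bo {
   size_t size;
   uint64_t freed_at;   /* tick at which the BO was put in the cache */
};

struct toy_bo_cache {
   struct toy_bo *slots[CACHE_SLOTS];
   uint64_t now;           /* clock, advanced by the caller */
   unsigned allocations;   /* number of real (expensive) allocations */
};

/* Return a cached BO of the requested size if there is one,
 * otherwise allocate a new one. */
static struct toy_bo *
toy_bo_create(struct toy_bo_cache *cache, size_t size)
{
   for (int i = 0; i < CACHE_SLOTS; i++) {
      struct toy_bo *bo = cache->slots[i];
      if (bo && bo->size == size) {
         cache->slots[i] = NULL;
         return bo;
      }
   }

   struct toy_bo *bo = calloc(1, sizeof(*bo));
   if (!bo)
      return NULL;
   bo->size = size;
   cache->allocations++;
   return bo;
}

/* Put the BO in the cache instead of freeing it, and drop every
 * cached BO that has been sitting there for too long. */
static void
toy_bo_unreference(struct toy_bo_cache *cache, struct toy_bo *bo)
{
   int free_slot = -1;

   for (int i = 0; i < CACHE_SLOTS; i++) {
      struct toy_bo *old = cache->slots[i];
      if (old && cache->now - old->freed_at > STALE_AFTER) {
         free(old);
         cache->slots[i] = NULL;
      }
      if (!cache->slots[i] && free_slot < 0)
         free_slot = i;
   }

   if (free_slot < 0) {
      /* Cache is full: fall back to really freeing the BO. */
      free(bo);
      return;
   }

   bo->freed_at = cache->now;
   cache->slots[free_slot] = bo;
}
```

With this sketch, creating a BO, unreferencing it and creating another BO of
the same size hands back the same object without a second allocation.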

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima: use 0 to poll if BO is busy in lima_bo_wait()
Vasily Khoruzhick [Sun, 8 Sep 2019 02:30:39 +0000 (19:30 -0700)]
lima: use 0 to poll if BO is busy in lima_bo_wait()

os_time_get_absolute_timeout(0) returns the current time, while the
kernel driver expects 0 as the value to poll the BO status and return
immediately. Fix it by setting abs_timeout to 0 if timeout_ns is 0.
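
A minimal sketch of the special case described above. toy_abs_timeout() is a
hypothetical stand-in, not the lima or os_time code: it takes the current
time as a parameter instead of reading a clock, and the saturation on
overflow is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a relative timeout into the absolute value
 * handed to the kernel. A relative timeout of 0 must stay 0, because 0
 * is what the kernel interprets as "poll and return immediately". */
static uint64_t
toy_abs_timeout(uint64_t now_ns, uint64_t timeout_ns)
{
   if (timeout_ns == 0)
      return 0;

   /* Assumed behaviour: saturate instead of wrapping around. */
   if (timeout_ns > UINT64_MAX - now_ns)
      return UINT64_MAX;

   return now_ns + timeout_ns;
}
```

Without the first check, a poll request made at time 1000 would produce an
absolute timeout of 1000 rather than 0.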

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima: move damage bound build to resource
Qiang Yu [Sun, 25 Aug 2019 11:04:01 +0000 (19:04 +0800)]
lima: move damage bound build to resource

Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agolima: don't use damage system when full damage
Qiang Yu [Sun, 25 Aug 2019 09:24:26 +0000 (17:24 +0800)]
lima: don't use damage system when full damage

Sometimes Weston sets a full damage region. It is
more efficient to use the cached PP stream instead
of dynamically creating one.

Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agolima: implement EGL_KHR_partial_update
Qiang Yu [Sun, 30 Jun 2019 13:44:12 +0000 (21:44 +0800)]
lima: implement EGL_KHR_partial_update

This extension sets a damage region for each
buffer swap, which can be used to reduce buffer
reload cost by feeding only the damage region's
tile buffer addresses to the PP.

Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agolima: fix PLBU viewport configuration
Icenowy Zheng [Sun, 22 Sep 2019 01:37:38 +0000 (09:37 +0800)]
lima: fix PLBU viewport configuration

The PLBU expects the viewport's 4 borders' coordinates, however
currently we're feeding the coordinate of the left-bottom point and the
size to it, which leads to misrendering when the left-bottom point is
not (0,0).

Change the macros for the viewport PLBU command, and the data fed to
it. The code to calculate the 4 borders is ported from Panfrost.
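
The difference between the two representations can be sketched as below.
This is only an illustration: the toy_* names are hypothetical, and the real
code is ported from Panfrost and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical container for the four border coordinates. */
struct toy_viewport_borders {
   float left, right, bottom, top;
};

/* Convert a viewport given as its left-bottom point plus size into the
 * coordinates of its four borders. */
static struct toy_viewport_borders
toy_viewport_to_borders(float x, float y, float width, float height)
{
   struct toy_viewport_borders b = {
      .left = x,
      .right = x + width,
      .bottom = y,
      .top = y + height,
   };
   return b;
}
```

When the left-bottom point is (0,0), the right and top borders equal the
width and height, which is why feeding the size only misrenders for
viewports with a non-zero origin.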

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agoamd: Build aco only if radv is enabled
Bas Nieuwenhuizen [Fri, 20 Sep 2019 20:22:13 +0000 (22:22 +0200)]
amd: Build aco only if radv is enabled

ACO depends on C++14, but radeonsi/radv with LLVM 8,9 do not. Let us
only require it for RADV, since that is the only user.

Fixes: a70a9987181 "radv/aco: Setup alternate path in RADV to support the experimental ACO compiler"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agonvc0: expose spirv support
Karol Herbst [Fri, 10 May 2019 07:28:15 +0000 (09:28 +0200)]
nvc0: expose spirv support

required for OpenCL

v2: adjust to changes in previous commits
v3: properly convert to NIR in nvc0_cp_state_create

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> (v1)
5 years agoclover: add support for passing kernels as nir to the driver
Karol Herbst [Tue, 6 Aug 2019 18:35:48 +0000 (20:35 +0200)]
clover: add support for passing kernels as nir to the driver

v2: minor formatting fixes
v3: call glsl_type_singleton_init_or_ref and glsl_type_singleton_decref
v4: capitalize and punctuate comments
    fix text_executable -> text_intermediate in TODO
    make glsl_type_singleton wrapper static
v5: rewrite how we run the nir passes
v6: fix unhandled case switch warning in st/mesa

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v4)
5 years agoclover: prepare supporting multiple IRs
Karol Herbst [Fri, 10 May 2019 07:27:06 +0000 (09:27 +0200)]
clover: prepare supporting multiple IRs

v2: rework arguments to compiler::compile_program
    add assert to device::ir_format
v3: remove PIPE_SHADER_IR_SPIRV
    change title

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v2)
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
5 years agoclover: add support for drivers having no proper binary format
Karol Herbst [Fri, 10 May 2019 07:24:42 +0000 (09:24 +0200)]
clover: add support for drivers having no proper binary format

Most drivers actually have no binary format and just store the IR directly
as a single entry point blob.

v2: add a cap to switch between single or multi entry point binaries
v3: remove the entry_point field
v4: remove PIPE_CAP_MULTI_ENTRY_POINT_BINARIES
v5: remove supports_multiple_entry_points

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
5 years agoclover/functional: add id_equals helper
Karol Herbst [Tue, 30 Jul 2019 11:36:37 +0000 (13:36 +0200)]
clover/functional: add id_equals helper

v2: pass argument by value

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
5 years agorename pipe_llvm_program_header to pipe_binary_program_header
Karol Herbst [Sat, 11 May 2019 12:26:06 +0000 (14:26 +0200)]
rename pipe_llvm_program_header to pipe_binary_program_header

We want to use it for other formats as well, so give it a more generic name

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
5 years agogallium: add blob field to pipe_llvm_program_header
Karol Herbst [Fri, 10 May 2019 07:22:25 +0000 (09:22 +0200)]
gallium: add blob field to pipe_llvm_program_header

makes it easier to consume an IR_NATIVE binary

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
5 years agoclover/llvm: Add functions for compiling from source to SPIR-V
Pierre Moreau [Sat, 10 Feb 2018 20:44:45 +0000 (21:44 +0100)]
clover/llvm: Add functions for compiling from source to SPIR-V

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
5 years agoclover/llvm: Add options for dumping SPIR-V binaries
Pierre Moreau [Sat, 10 Feb 2018 20:41:19 +0000 (21:41 +0100)]
clover/llvm: Add options for dumping SPIR-V binaries

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
5 years agoclover/spirv: Add functions for parsing arguments, linking programs, etc.
Pierre Moreau [Tue, 2 Apr 2019 21:32:23 +0000 (23:32 +0200)]
clover/spirv: Add functions for parsing arguments, linking programs, etc.

v2 (Karol Herbst):
  silence warnings about unhandled enum values
v3 (Karol Herbst):
  added back array size parsing (needed for structs passed by value)

Acked-by: Francisco Jerez <currojerez@riseup.net> (v2)
5 years agoclover/spirv: Add functions for validating SPIR-V binaries
Pierre Moreau [Sat, 10 Feb 2018 20:40:10 +0000 (21:40 +0100)]
clover/spirv: Add functions for validating SPIR-V binaries

Changes since:
* v12:
  - remove autotools (Karol Herbst)
  - Remove the callback in format_validation_msg. (Francisco Jerez)
  - Removed is_binary_spirv. (Francisco Jerez)
  - Pass a string reference to is_valid_spirv instead of the
    notification callback. (Francisco Jerez)
* v11: Fix compilation error introduced in v11.
* v10:
  - Reuse format_validation_msg in is_valid_spirv.
  - Remove LVL2STR macro in format_validation_msg.
* v9: Add `clover_cpp_std` to the overrides of the `libclspirv` target
      in Meson.
* v7: Add DEFINES to libclspirv and libclover, in autotools, as they
      would otherwise never know whether CLOVER_ALLOW_SPIRV has been
      defined (Dave Airlie)
* v6: Update the dependency name (meson) and the libs variable
      (Makefile) due to the replacement of llvm-spirv by the new
      official SPIRV-LLVM-Translator.
* v5: Changed to match the updated “clover/llvm: Allow translating from
      SPIR-V to LLVM IR” in the v6.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
5 years agomeson: Check for SPIRV-Tools and llvm-spirv
Pierre Moreau [Sun, 21 Jan 2018 18:10:58 +0000 (19:10 +0100)]
meson: Check for SPIRV-Tools and llvm-spirv

Changes since:
* v12 (Karol Herbst):
  - rename CLOVER_ALLOW_SPIRV to HAVE_CLOVER_SPIRV
* v11 (Karol Herbst):
  - only set new defines for clover to speed up recompilation
  - remove autotools
* v10:
  - Add a new flag (`--enable-opencl-spirv` for autotools, and
    `-Dopencl-spirv=true` for meson) for enabling SPIR-V support in
    clover, and never automagically enable it without that flag. (Dylan Baker)
  - When enabling the SPIR-V support, the SPIRV-Tools and
    SPIRV-LLVM-Translator libraries are now required dependencies.
* v7:
  - Properly align LLVMSPIRVLib comment (Dylan Baker)
  - Only define CLOVER_ALLOW_SPIRV when **both** dependencies are found:
    autotools was only requiring one or the other.
* v6: Replace the llvm-spirv repository by the new official
      SPIRV-LLVM-Translator.
* v4: Add a comment saying where to find llvm-spirv (Karol Herbst).
* v3:
  - make SPIRV-Tools and llvm-spirv optional (Francisco Jerez);
  - bump requirement for llvm-spirv to version 0.2
* v2:
  - Bump the required version of SPIRV-Tools to the latest release;
  - Add a dependency on llvm-spirv.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v10)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
5 years agoisl: Drop WaDisableSamplerL2BypassForTextureCompressedFormats on Gen11
Kenneth Graunke [Tue, 17 Sep 2019 06:29:48 +0000 (23:29 -0700)]
isl: Drop WaDisableSamplerL2BypassForTextureCompressedFormats on Gen11

Gen11 doesn't require us to bypass the L2 cache for BC* images anymore.

The documentation is a bit hard to follow on this point, but the Windows
driver clearly only applies this workaround on Gen9, and their commit
history indicates that this was an intentional change to drop the
workaround for Gen11+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agogallium/osmesa: Fix the inability to set no context as current.
Hal Gentz [Sun, 15 Sep 2019 21:29:50 +0000 (15:29 -0600)]
gallium/osmesa: Fix the inability to set no context as current.

Currently there is no way to make no context current w/gallium + osmesa.
The non-gallium version of osmesa does this if the context and buffer
passed to `OSMesaMakeCurrent` are both null. This small change makes it
so that this is also the case with the gallium version.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agolibgbm: Wire up getCapability for the image loader
Adam Jackson [Tue, 17 Sep 2019 18:23:28 +0000 (14:23 -0400)]
libgbm: Wire up getCapability for the image loader

5 years agoegl/surfaceless: Add FP16 format support
Adam Jackson [Tue, 10 Sep 2019 15:44:24 +0000 (11:44 -0400)]
egl/surfaceless: Add FP16 format support

Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
5 years agoegl/wayland: Implement getCapability for the dri2 and image loaders
Adam Jackson [Tue, 10 Sep 2019 15:53:11 +0000 (11:53 -0400)]
egl/wayland: Implement getCapability for the dri2 and image loaders

Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
5 years agoegl/wayland: Add FP16 format support
Adam Jackson [Fri, 30 Aug 2019 19:35:22 +0000 (15:35 -0400)]
egl/wayland: Add FP16 format support

Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
5 years agoegl/wayland: Reindent the format table
Adam Jackson [Fri, 30 Aug 2019 19:27:23 +0000 (15:27 -0400)]
egl/wayland: Reindent the format table

No idea how these ended up with 3-then-2-space indents.

Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
5 years agoanv: Advertise VK_KHR_shader_subgroup_extended_types
Jason Ekstrand [Thu, 18 Apr 2019 19:17:50 +0000 (14:17 -0500)]
anv: Advertise VK_KHR_shader_subgroup_extended_types

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
5 years agointel/fs: Do 8-bit subgroup scan operations in 16 bits
Jason Ekstrand [Tue, 4 Jun 2019 16:39:25 +0000 (11:39 -0500)]
intel/fs: Do 8-bit subgroup scan operations in 16 bits

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
5 years agointel/fs: Allow CLUSTER_BROADCAST to do type conversion
Jason Ekstrand [Tue, 4 Jun 2019 16:45:50 +0000 (11:45 -0500)]
intel/fs: Allow CLUSTER_BROADCAST to do type conversion

We can't really handle it in the little-core 64-bit case but it's not
really needed there.  Where we really want this is for when we need to
do 16 -> 8-bit conversions.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
5 years agointel/fs: Allow UB, B, and HF types in brw_nir_reduction_op_identity
Jason Ekstrand [Sat, 27 Apr 2019 09:31:31 +0000 (04:31 -0500)]
intel/fs: Allow UB, B, and HF types in brw_nir_reduction_op_identity

Because byte immediates aren't a thing on GEN hardware, we return a
signed or unsigned word immediate in the byte case.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
5 years agointel/fs: don't forget the stride at generate_shuffle
Paulo Zanoni [Tue, 17 Sep 2019 23:46:33 +0000 (16:46 -0700)]
intel/fs: don't forget the stride at generate_shuffle

During generate_shuffle(), when we use byte sized registers we end up
with a destination stride of 2. We don't take the stride into
consideration when selecting the group offset for the last MOV
operation, which means we end up moving things to the wrong place,
leaving the last few channels untouched. Take the destination stride
into consideration so we don't miss the last channels.

v2: Assert this is not necessary for the IVB special case (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>