mesa.git
5 years agopan/decode: Add missing format specifier
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:08:20 +0000 (17:08 -0700)]
pan/decode: Add missing format specifier

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/decode: Use portable format specifier for 64-bit
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:03:25 +0000 (17:03 -0700)]
pan/decode: Use portable format specifier for 64-bit

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/decode: Use %zu instead of %d
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:02:43 +0000 (17:02 -0700)]
pan/decode: Use %zu instead of %d

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
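
The three format-specifier fixes above follow standard C rules: a uint64_t is printed with the PRIx64/PRIu64 macros from <inttypes.h>, and a size_t with %zu. A minimal sketch of both follows. The helper name and the field labels are hypothetical and are not taken from pandecode.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from pandecode): formats a 64-bit address and a
 * byte count into buf and returns what snprintf returns. */
static int
format_record(char *buf, size_t len, uint64_t gpu_va, size_t size)
{
   assert(buf != NULL);

   /* PRIx64 expands to the correct length modifier on every platform,
    * unlike a hard-coded %lx or %llx. %zu is the specifier for size_t. */
   return snprintf(buf, len, "va=0x%" PRIx64 " size=%zu", gpu_va, size);
}
```

Passing a size_t to %d, or a uint64_t to %lx on a platform where long is 32 bits, is undefined behaviour, which is why compilers warn about it.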
5 years agopan/decode: Fix uninitialized variables
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:00:09 +0000 (17:00 -0700)]
pan/decode: Fix uninitialized variables

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agodocs: update calendar, add news item and link release notes for 19.1.6
Juan A. Suarez Romero [Tue, 3 Sep 2019 11:06:56 +0000 (13:06 +0200)]
docs: update calendar, add news item and link release notes for 19.1.6

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years agodocs: add sha256 checksums for 19.1.6
Juan A. Suarez Romero [Tue, 3 Sep 2019 11:04:25 +0000 (13:04 +0200)]
docs: add sha256 checksums for 19.1.6

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 4ec2325dd07a768f2b52ea788ee76085586b2469)

5 years agodocs: add release notes for 19.1.6
Juan A. Suarez Romero [Tue, 3 Sep 2019 10:02:19 +0000 (12:02 +0200)]
docs: add release notes for 19.1.6

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 85c8f88a49aa7c8aa866faed90a4a63330c15b8b)

5 years agovulkan/overlay: bounce image back to present layout
Lionel Landwerlin [Wed, 21 Aug 2019 11:47:25 +0000 (13:47 +0200)]
vulkan/overlay: bounce image back to present layout

Once we write the overlay to an image to be presented, we must not
forget to put it back into present layout.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agobroadcom/vc4: Expand width of dst surface
Zhaowei Yuan [Tue, 3 Sep 2019 02:58:59 +0000 (10:58 +0800)]
broadcom/vc4: Expand width of dst surface

Four bytes of src_surf are packed into one 32-bit value and stored
into dst_surf, and dst_surf is read in Z-order, so its width must be
aligned to a multiple of 8 (4x2) before being divided by 2.

Signed-off-by: Zhaowei Yuan <zhaowei.yuan@samsung.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
5 years agoswr: Fix make_unique build error.
Vinson Lee [Thu, 29 Aug 2019 23:44:09 +0000 (16:44 -0700)]
swr: Fix make_unique build error.

swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope
    ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func)));
                                            ^~~~~~~~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
5 years agoloader: include limits.h for PATH_MAX
nia [Sat, 31 Aug 2019 17:10:07 +0000 (18:10 +0100)]
loader: include limits.h for PATH_MAX

This is needed to build on illumos.

The location of the PATH_MAX definition in limits.h seems to be fairly standard:
https://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
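
As a rough illustration of the point above, code that sizes a buffer with PATH_MAX should include <limits.h> itself rather than rely on another header pulling it in. The function name, the "_dri.so" suffix used here, and the 4096 fallback are assumptions made for the sketch, not details of the loader change.

```c
#include <assert.h>
#include <limits.h>   /* POSIX location of PATH_MAX */
#include <stdio.h>

/* Assumed fallback for platforms that leave PATH_MAX undefined. */
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical helper: joins a directory and a driver name into out.
 * Returns 1 on success and 0 if the result would not fit. */
static int
build_driver_path(char out[PATH_MAX], const char *dir, const char *name)
{
   assert(out && dir && name);

   int n = snprintf(out, PATH_MAX, "%s/%s_dri.so", dir, name);
   return n >= 0 && n < PATH_MAX;
}
```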
5 years agoutil: only allow _BitScanReverse64 on 64-bit cpus
Erik Faye-Lund [Wed, 14 Aug 2019 20:29:24 +0000 (22:29 +0200)]
util: only allow _BitScanReverse64 on 64-bit cpus

While the documentation for _BitScanReverse64 on MSDN says that it's
available on ARM, this isn't true. It's only available on ARM64. So
let's match reality.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
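
A sketch of the kind of guard this implies: use _BitScanReverse64 only on 64-bit MSVC targets and fall back elsewhere. The function name and the exact preprocessor conditions are assumptions for illustration; the real check in Mesa's util code may differ.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical helper: index of the most significant set bit of v.
 * v must be non-zero. */
static unsigned
last_bit64(uint64_t v)
{
   assert(v != 0);
#if defined(_MSC_VER) && (defined(_M_AMD64) || defined(_M_ARM64))
   /* The 64-bit intrinsic exists only on 64-bit targets. */
   unsigned long index;
   _BitScanReverse64(&index, v);
   return (unsigned)index;
#elif defined(__GNUC__)
   return 63u - (unsigned)__builtin_clzll(v);
#else
   /* Portable fallback, used for example by 32-bit MSVC builds. */
   unsigned index = 0;
   while (v >>= 1)
      index++;
   return index;
#endif
}
```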
5 years agomesa/x86: improve SSE-checks for MSVC
Erik Faye-Lund [Thu, 15 Aug 2019 19:53:36 +0000 (21:53 +0200)]
mesa/x86: improve SSE-checks for MSVC

This enables some more SSE optimizations on MSVC builds.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoutil: do not assume MSVC implies SSE
Erik Faye-Lund [Wed, 14 Aug 2019 20:28:12 +0000 (22:28 +0200)]
util: do not assume MSVC implies SSE

This is not true for MSVC on ARM.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoutil: fix SSE-version needed for double opcodes
Erik Faye-Lund [Sun, 1 Sep 2019 08:05:12 +0000 (10:05 +0200)]
util: fix SSE-version needed for double opcodes

This code generates CVTSD2SI, which requires SSE2. So let's fix the
required SSE-version.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5de29ae (util: try to use SSE instructions with MSVC and 32-bit gcc)
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agomesa/main: remove unused include
Erik Faye-Lund [Thu, 15 Aug 2019 19:08:59 +0000 (21:08 +0200)]
mesa/main: remove unused include

This has been unused since 183db3a6455 ("glsl: move half<->float
convertion to util"), Oct 10 2015. Let's drop needlessly including it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir: do not assume that the result of fexp2(a) is always an integral
Samuel Pitoiset [Tue, 27 Aug 2019 09:35:00 +0000 (11:35 +0200)]
nir: do not assume that the result of fexp2(a) is always an integral

It's only correct when 'a' is an integer greater than or equal to 0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
Fixes: 5544b2cbbd2 ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
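
The rule can be modelled with a much simplified range record. The sketch below is not the NIR implementation; the enum, struct, and function names are hypothetical. It only shows the condition: 2^a is integral when a is integral and non-negative, since 2^-1 is 0.5 and 2^0.5 is about 1.41.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a value-range lattice. */
enum value_range {
   RANGE_UNKNOWN,
   RANGE_LT_ZERO,
   RANGE_GE_ZERO,
};

struct result_range {
   enum value_range range;
   bool is_integral;
};

/* Range of fexp2(a) given the range of a. */
static struct result_range
analyze_fexp2(struct result_range src)
{
   struct result_range r;

   /* 2^a is never negative. */
   r.range = RANGE_GE_ZERO;

   /* Integral only when the source is integral AND known to be >= 0. */
   r.is_integral = src.is_integral && src.range == RANGE_GE_ZERO;

   return r;
}
```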
5 years agoegl: fix platform selection
Lionel Landwerlin [Sun, 1 Sep 2019 14:22:24 +0000 (17:22 +0300)]
egl: fix platform selection

Add missing "device" platform

v2: Add the missing platform (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jean Hertel <jean.hertel@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529
Fixes: d6edccee8d ("egl: add EGL_platform_device support")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agoiris: Lessen texture cache hack flush for blits/copies on Icelake.
Kenneth Graunke [Fri, 21 Jun 2019 03:18:11 +0000 (20:18 -0700)]
iris: Lessen texture cache hack flush for blits/copies on Icelake.

Lionel found actual documentation for this at long last.  Apparently
it actually is a sampler cache limitation that was mostly fixed on
Icelake.  Unfortunately, it seems there are still issues with ASTC
and non-ASTC sampler views.  Still, we can lessen the flush condition
from "format mismatch" to "ASTC mismatch", which eliminates most of
the flushing here.

We also update the documentation to refer to the workaround name.

5 years agoutil: Define strchrnul on macOS.
Vinson Lee [Fri, 30 Aug 2019 06:56:17 +0000 (23:56 -0700)]
util: Define strchrnul on macOS.

strchrnul is not available on macOS.

pipe_loader.c:141:14: error: implicit declaration of function 'strchrnul' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
      next = strchrnul(library_paths, ':');
             ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
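
strchrnul is a GNU extension that behaves like strchr except that it returns a pointer to the terminating NUL, instead of NULL, when the character is not found. A minimal fallback along those lines might look like the following. The name fallback_strchrnul is made up here to avoid clashing with a libc that already provides the function; it is not necessarily how the commit defines it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fallback for platforms without strchrnul. */
static char *
fallback_strchrnul(const char *s, int c)
{
   assert(s != NULL);

   char *result = strchr(s, c);
   if (!result)
      result = (char *)s + strlen(s);   /* point at the terminating NUL */

   return result;
}
```

This suits loops that split a ':'-separated path list, because the final element needs no special case.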
5 years agogallium/auxiliary/indices: consistently apply start only to input
Erik Faye-Lund [Wed, 17 Jul 2019 08:21:08 +0000 (10:21 +0200)]
gallium/auxiliary/indices: consistently apply start only to input

The majority of these only apply the start argument to the input, but a
few of them also apply it to the output array. util_primconvert, the only
user of this argument, does not expect a non-zero start to be applied to
the output; if it is, it will write outside of allocated memory, leading
to VRAM corruption.

The reason this doesn't seem to have been noticed before is that no
driver currently uses util_primconvert to convert a primitive type to
itself, which is the case where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index buffers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 28f3f8d413f ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <robdclark@chromium.org>
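
The intended convention can be sketched with a hand-written translate function. The generated u_indices code is more general; the name and signature below are simplified assumptions. The point is that start offsets reads from the input only, while writes always begin at out[0].

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit to 16-bit index translation. */
static void
translate_ubyte_to_ushort(const uint8_t *in, unsigned start,
                          unsigned out_nr, uint16_t *out)
{
   assert(in && out);

   for (unsigned i = 0; i < out_nr; i++) {
      /* start is applied to the input only; writing out[start + i]
       * instead would run past a buffer sized for out_nr entries. */
      out[i] = in[start + i];
   }
}
```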
5 years agotravis: Fail build if any command in if statement fails.
Vinson Lee [Fri, 30 Aug 2019 06:15:29 +0000 (23:15 -0700)]
travis: Fail build if any command in if statement fails.

Travis is checking the exit code of the entire if statement.

Fixes: 64ffc289be89 ("travis: add MacOS Scons build")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agoswr: Fix build with llvm-9.0 again.
Vinson Lee [Mon, 26 Aug 2019 23:16:26 +0000 (16:16 -0700)]
swr: Fix build with llvm-9.0 again.

Commit 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer
core and swr") unintentionally removed changes for llvm-9.0.

Fixes: 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr")
Fixes: 5dd9ad157005 ("swr/rasterizer: Better implementation of scatter")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
5 years agopan/midgard: Use shared psiz clamp pass
Alyssa Rosenzweig [Mon, 26 Aug 2019 19:14:11 +0000 (12:14 -0700)]
pan/midgard: Use shared psiz clamp pass

We already had a perfectly cromulent pass for this, but one landed in
common NIR code so let's switch and lighten our tree.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Remove mir_opt_post_move_eliminate
Alyssa Rosenzweig [Fri, 30 Aug 2019 20:49:33 +0000 (13:49 -0700)]
pan/midgard: Remove mir_opt_post_move_eliminate

This optimization depended on RA running before scheduling. It therefore
no longer applies and is now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Schedule before RA
Alyssa Rosenzweig [Fri, 30 Aug 2019 19:56:55 +0000 (12:56 -0700)]
pan/midgard: Schedule before RA

This is a tradeoff.

Scheduling before RA means we don't do RA on what-will-become pipeline
registers. Importantly, it means the scheduler is able to reorder
instructions, as registers have not been decided yet.

Unfortunately, it also complicates register spilling, since the spills
themselves won't get bundled optimally and we can only spill twice per
ALU bundle (only one spill per bundle allowed here). It also prevents us
from eliminating dead moves introduced by register allocation, as they
are not dead before RA. The shader-db regressions are from poor spilling
choices introduced by the new bundling requirements. These could be
solved by the combination of a post-scheduler (to combine adjacent
spills into bundles) with a VLIW-aware spill cost calculation.
Nevertheless, the change is small enough that I feel it's worth it to
eat a tiny shader-db regression for the sake of flexibility.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Handle fragment writeout in RA
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:06:33 +0000 (11:06 -0700)]
pan/midgard: Handle fragment writeout in RA

Rather than using a pile of hacks and awkward constructs in MIR to
ensure the writeout parameter gets written into r0, let's add a
dedicated shadow register class for writeout (interfering with work
register r0) so we can express the writeout condition succinctly and
directly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Do not propagate swizzles into writeout
Alyssa Rosenzweig [Fri, 30 Aug 2019 21:35:01 +0000 (14:35 -0700)]
pan/midgard: Do not propagate swizzles into writeout

There's no slot for it; you'll end up writing into the void and
clobbering stuff. Don't do it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix misc. RA issues
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:04:52 +0000 (11:04 -0700)]
pan/midgard: Fix misc. RA issues

When running the register allocator after scheduling, the MIR looks a
little different, so we need to extend the RA to handle a few of these
extra cases correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Print MIR by the bundle
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:03:44 +0000 (11:03 -0700)]
pan/midgard: Print MIR by the bundle

After scheduling, we still have valid MIR, but we have additional
bundling annotations which we would like to keep for debugging, so print these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Print branches in MIR
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:02:52 +0000 (11:02 -0700)]
pan/midgard: Print branches in MIR

Rather than a vague "br.??" line, annotate the branch with its target
type (useful for disambiguating discards) and whether it was inverted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Remove texture_index
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:01:57 +0000 (11:01 -0700)]
pan/midgard: Remove texture_index

This is deadcode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Cleanup fragment writeout branch
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:01:15 +0000 (11:01 -0700)]
pan/midgard: Cleanup fragment writeout branch

I'm not sure if this is strictly necessary but it makes debugging easier
and minimizes the diff with the experimental scheduler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add scheduling barriers
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:53:13 +0000 (10:53 -0700)]
pan/midgard: Add scheduling barriers

Scheduling occurs on a per-block basis, strongly assuming that a given
block contains at most a single branch. This does not always map to the
source NIR control flow, particularly when discard intrinsics are
involved. The solution is to allow scheduling barriers, which will
terminate a block early in code generation and open a new block.

To facilitate this, we need to move some post-block processing to a new
pass, rather than relying hackily on the current_block pointer.

This allows us to cleanup some logic analyzing branches in other parts
of the driver as well, now that the MIR is much more well-formed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Track shader quadword count while scheduling
Alyssa Rosenzweig [Fri, 30 Aug 2019 20:57:20 +0000 (13:57 -0700)]
pan/midgard: Track shader quadword count while scheduling

This allows multiblock blend shaders to compute constant colour offsets
correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Allow NULL argument in mir_has_arg
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:48:41 +0000 (10:48 -0700)]
pan/midgard: Allow NULL argument in mir_has_arg

It's sometimes convenient to call this with no instruction specified. By
definition, a missing instruction cannot reference any argument, so
let's check for NULL and short-circuit to false.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Improve mir_mask_of_read_components
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:45:57 +0000 (10:45 -0700)]
pan/midgard: Improve mir_mask_of_read_components

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Extend mir_special_index to writeout
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:45:08 +0000 (10:45 -0700)]
pan/midgard: Extend mir_special_index to writeout

The branch has the writeout specified in its source list, making this
special even if it's not explicitly part of r0.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: csel_swizzle with mir get swizzle
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:44:42 +0000 (10:44 -0700)]
pan/midgard: csel_swizzle with mir get swizzle

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add mir_insert_instruction*scheduled helpers
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:46:17 +0000 (10:46 -0700)]
pan/midgard: Add mir_insert_instruction*scheduled helpers

In order to run register allocation after scheduling, it is sometimes
necessary to be able to insert instructions into an already-scheduled
program. This is suboptimal, since it forces us to do a worst-case
scheduling, but it is nevertheless required for correct handling of
spills/fills. Let's add helpers to insert instructions as standalone
bundles for use in spilling code.

These helpers are minimal -- they *only* work on load/store ops or
moves. They should not be used for anything but register spilling; any
other instructions should be added prior to the schedule.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Track csel swizzle
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:42:05 +0000 (10:42 -0700)]
pan/midgard: Track csel swizzle

While it doesn't matter with an unconditional move to the conditional
register (r31), when we try to elide that move we'll need to track the
swizzle explicitly, and there is no slot for that yet since ALU ops are
normally binary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Ensure fragment writeout is in the final block
Alyssa Rosenzweig [Tue, 27 Aug 2019 19:20:06 +0000 (12:20 -0700)]
pan/midgard: Ensure fragment writeout is in the final block

This ensures the block only has exactly one branch, which makes
scheduling happy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Document Midgard scheduling requirements
Alyssa Rosenzweig [Mon, 26 Aug 2019 22:28:56 +0000 (15:28 -0700)]
pan/midgard: Document Midgard scheduling requirements

Oh boy. Midgard scheduling is crazy... These are all just the
requirements, not even the algorithm yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Include condition in branch->src[0]
Alyssa Rosenzweig [Mon, 26 Aug 2019 20:59:29 +0000 (13:59 -0700)]
pan/midgard: Include condition in branch->src[0]

This will allow us to reference the condition while scheduling.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add post-schedule iteration helpers
Alyssa Rosenzweig [Tue, 27 Aug 2019 21:26:27 +0000 (14:26 -0700)]
pan/midgard: Add post-schedule iteration helpers

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix corner case in RA
Alyssa Rosenzweig [Tue, 27 Aug 2019 20:15:12 +0000 (13:15 -0700)]
pan/midgard: Fix corner case in RA

It doesn't really matter but... meh.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add OP_IS_CSEL_V helper
Alyssa Rosenzweig [Tue, 27 Aug 2019 22:51:13 +0000 (15:51 -0700)]
pan/midgard: Add OP_IS_CSEL_V helper

..to distinguish from scalar csel.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Expose mir_get/set_swizzle
Alyssa Rosenzweig [Tue, 27 Aug 2019 22:50:55 +0000 (15:50 -0700)]
pan/midgard: Expose mir_get/set_swizzle

The scheduler would like to use these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Extract instruction sizing helper
Alyssa Rosenzweig [Mon, 26 Aug 2019 22:06:38 +0000 (15:06 -0700)]
pan/midgard: Extract instruction sizing helper

The scheduler shouldn't need to worry about this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Factor out mir_is_scalar
Alyssa Rosenzweig [Mon, 26 Aug 2019 21:49:49 +0000 (14:49 -0700)]
pan/midgard: Factor out mir_is_scalar

This helper doesn't need to be in the giant loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Count shader-db stats by bundled instructions
Alyssa Rosenzweig [Fri, 30 Aug 2019 20:08:16 +0000 (13:08 -0700)]
pan/midgard: Count shader-db stats by bundled instructions

This does not affect shaders in any way. Rather, it makes the shader-db
instruction count recorded in the compiler accurate with the in-order
scheduler, matching up with what we calculate from pandecode.

Though shaders are the same, instruction counts cannot be compared
across this commit for this reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agofreedreno/ir3: Link directly to Sethi-Ullman paper
Alyssa Rosenzweig [Tue, 27 Aug 2019 17:38:34 +0000 (10:38 -0700)]
freedreno/ir3: Link directly to Sethi-Ullman paper

Allow a direct link to the PDF itself from the authors themselves,
rather than a paywall splash page.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Rob Clark <robdclark@chromium.org>
5 years agoRevert "glx: Unset the direct_support bit for GLX_EXT_import_context"
Adam Jackson [Thu, 29 Aug 2019 16:15:22 +0000 (12:15 -0400)]
Revert "glx: Unset the direct_support bit for GLX_EXT_import_context"

The GLX extension strings are independent of any context, so abusing the
direct_support bit to control this extension's visibility is wrong.

This reverts commit 079d0717fc896bc8086b037d0ed22642274986c7.

Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
5 years agopanfrost: Add transient BOs to job batches
Boris Brezillon [Fri, 30 Aug 2019 13:38:56 +0000 (15:38 +0200)]
panfrost: Add transient BOs to job batches

Memory allocated through panfrost_allocate_transient() is likely to
come from the transient pool. Let's add the BO backing the allocated
memory region to the job batch so the kernel can retain this BO while
jobs are executed.

In practice that has never been a problem because the transient pool
is never shrunk, and even if it were, we still control the lifetime of
the job, so there's no reason for this BO to be freed before the GPU is
done executing the batch. But it still makes sense to add the BO for
debugging purposes.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: protect access to shared bo cache and transient pool
Rohan Garg [Fri, 30 Aug 2019 16:00:13 +0000 (18:00 +0200)]
panfrost: protect access to shared bo cache and transient pool

Both the BO cache and the transient pool are shared across
contexts. Protect access to these with mutexes.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
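
The locking pattern described above can be sketched as a small cache guarded by one mutex. This is illustrative only: the structure layout, function names, and the use of POSIX pthread mutexes are assumptions and do not reflect the actual panfrost data structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define CACHE_SLOTS 8

/* Hypothetical buffer object and screen-wide cache shared by contexts. */
struct bo {
   size_t size;
};

struct bo_cache {
   pthread_mutex_t lock;
   struct bo *slots[CACHE_SLOTS];
};

static void
bo_cache_init(struct bo_cache *cache)
{
   assert(cache != NULL);
   pthread_mutex_init(&cache->lock, NULL);
   for (int i = 0; i < CACHE_SLOTS; i++)
      cache->slots[i] = NULL;
}

/* Returns 1 if the BO was stored, 0 if the cache is full. */
static int
bo_cache_put(struct bo_cache *cache, struct bo *bo)
{
   int stored = 0;

   pthread_mutex_lock(&cache->lock);
   for (int i = 0; i < CACHE_SLOTS; i++) {
      if (!cache->slots[i]) {
         cache->slots[i] = bo;
         stored = 1;
         break;
      }
   }
   pthread_mutex_unlock(&cache->lock);

   return stored;
}

/* Removes and returns a cached BO of at least size bytes, or NULL. */
static struct bo *
bo_cache_fetch(struct bo_cache *cache, size_t size)
{
   struct bo *found = NULL;

   pthread_mutex_lock(&cache->lock);
   for (int i = 0; i < CACHE_SLOTS; i++) {
      if (cache->slots[i] && cache->slots[i]->size >= size) {
         found = cache->slots[i];
         cache->slots[i] = NULL;
         break;
      }
   }
   pthread_mutex_unlock(&cache->lock);

   return found;
}
```

Every read or write of the shared slots happens between lock and unlock, so two contexts cannot both claim the same BO.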
5 years agopanfrost: Jobs must be per context, not per screen
Rohan Garg [Fri, 30 Aug 2019 16:00:12 +0000 (18:00 +0200)]
panfrost: Jobs must be per context, not per screen

Jobs _must_ only be shared within the same context; tracking
last_job in the screen causes use-after-free issues
and memory corruption.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agost/mesa: Allow zero as [level|layer]_override
Lepton Wu [Fri, 30 Aug 2019 17:30:53 +0000 (17:30 +0000)]
st/mesa: Allow zero as [level|layer]_override

This fixes two dEQP tests for virgl:

dEQP-EGL.functional.image.create.gles2_cubemap_positive_x_rgba_texture
dEQP-EGL.functional.image.render_multiple_contexts.gles2_cubemap_positive_x_rgba8_texture

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agofreedreno/a3xx: fix sysmem <-> gmem tiles transfer
Khaled Emara [Sun, 25 Aug 2019 21:49:10 +0000 (23:49 +0200)]
freedreno/a3xx: fix sysmem <-> gmem tiles transfer

Tiling mode was missing from fd3_emit_gmem_restore_tex().
emit_gmem2mem_surf() used LINEAR exclusively.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/a3xx: fix texture tiling parameters
Khaled Emara [Sun, 25 Aug 2019 21:39:02 +0000 (23:39 +0200)]
freedreno/a3xx: fix texture tiling parameters

* Fix 2D/2DArray/3D tiling parameters:
  There is a bottom threshold for width and height.
* Re-enable tiling for Cubemap, after setting the right parameters.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agogitlab-ci: Use new needs: keyword
Michel Dänzer [Tue, 27 Aug 2019 09:57:13 +0000 (11:57 +0200)]
gitlab-ci: Use new needs: keyword

This way, the test jobs can start running before all build+test jobs
have finished, once the meson-main job has.

Idea suggested by Daniel Stone on IRC.

See https://docs.gitlab.com/ce/ci/directed_acyclic_graph/ and
https://docs.gitlab.com/ce/ci/yaml/README.html#needs for details.

v2:
* Improve commit log (Daniel Stone, Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogitlab-ci: Move up meson-main job definition
Michel Dänzer [Wed, 28 Aug 2019 10:01:02 +0000 (12:01 +0200)]
gitlab-ci: Move up meson-main job definition

In order to increase the chance of it running early.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agobroadcom/v3d: Allow importing linear BOs with arbitrary offset/stride.
Dave Stevenson [Wed, 22 May 2019 16:12:56 +0000 (17:12 +0100)]
broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride.

Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride." for v3d.

Allows YUV buffers with a single buffer and plane offsets to be
passed in.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoswr/rasterizer: Fix GS attributes processing
Jan Zielinski [Fri, 2 Aug 2019 09:59:03 +0000 (11:59 +0200)]
swr/rasterizer: Fix GS attributes processing

Input to GS is just a set of attributes, so remove explicit setup of
'position' which is meaningless for GS input processing.

Reviewed-by: Alok Hota <alok.hota@intel.com>
5 years agoradv: keep a pointer to a NIR shader into radv_shader_context
Samuel Pitoiset [Wed, 28 Aug 2019 15:08:29 +0000 (17:08 +0200)]
radv: keep a pointer to a NIR shader into radv_shader_context

This avoids multiple copies for nothing and it's more elegant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: move setting can_discard to ac_fill_shader_info()
Samuel Pitoiset [Wed, 28 Aug 2019 14:52:30 +0000 (16:52 +0200)]
radv: move setting can_discard to ac_fill_shader_info()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: replace ac_nir_build_if by ac_build_ifcc
Samuel Pitoiset [Thu, 29 Aug 2019 11:32:10 +0000 (13:32 +0200)]
radv: replace ac_nir_build_if by ac_build_ifcc

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: remove radv_init_llvm_target() helper
Samuel Pitoiset [Thu, 29 Aug 2019 09:49:03 +0000 (11:49 +0200)]
radv: remove radv_init_llvm_target() helper

RADV no longer uses specific LLVM options compared to the common code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: remove useless ac_llvm_util.h include from the WSI code
Samuel Pitoiset [Thu, 29 Aug 2019 09:46:46 +0000 (11:46 +0200)]
radv: remove useless ac_llvm_util.h include from the WSI code

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: remove unused shader_info parameter in ac_compile_llvm_module()
Samuel Pitoiset [Fri, 26 Jul 2019 12:48:23 +0000 (14:48 +0200)]
radv: remove unused shader_info parameter in ac_compile_llvm_module()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: remove some unused fields from radv_shader_context
Samuel Pitoiset [Wed, 28 Aug 2019 14:46:15 +0000 (16:46 +0200)]
radv: remove some unused fields from radv_shader_context

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: move lowering PS inputs/outputs at the right place
Samuel Pitoiset [Thu, 29 Aug 2019 09:16:44 +0000 (11:16 +0200)]
radv: move lowering PS inputs/outputs at the right place

At shader creation, just after NIR linking.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: gather info about PS inputs in the shader info pass
Samuel Pitoiset [Thu, 29 Aug 2019 09:12:25 +0000 (11:12 +0200)]
radv: gather info about PS inputs in the shader info pass

It's the right place to do that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoac: drop now useless lookup_interp_param from ABI
Samuel Pitoiset [Wed, 31 Jul 2019 07:57:47 +0000 (09:57 +0200)]
ac: drop now useless lookup_interp_param from ABI

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoac: import linear/perspective PS input parameters from radv/radeonsi
Samuel Pitoiset [Wed, 31 Jul 2019 07:54:48 +0000 (09:54 +0200)]
ac: import linear/perspective PS input parameters from radv/radeonsi

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoutil: Add unreachable() definition for clang compiler.
Krzysztof Raszkowski [Fri, 30 Aug 2019 05:50:21 +0000 (05:50 +0000)]
util: Add unreachable() definition for clang compiler.

Without an unreachable() definition, clang throws return-type errors
in many places in the Mesa code.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
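
To show why the definition matters, here is a sketch of such a macro and a function that depends on it. The macro name sketch_unreachable, the compiler checks, and the example enum are assumptions for illustration; the actual definition in Mesa's util headers may be selected differently.

```c
#include <assert.h>

/* Hypothetical macro: abort in debug builds, and tell the compiler that
 * control never continues past this point. */
#if defined(__GNUC__) || defined(__clang__)
#define sketch_unreachable(str) \
   do { assert(!str); __builtin_unreachable(); } while (0)
#elif defined(_MSC_VER)
#define sketch_unreachable(str) \
   do { assert(!str); __assume(0); } while (0)
#else
#define sketch_unreachable(str) assert(!str)
#endif

enum axis { AXIS_X, AXIS_Y };

static const char *
axis_name(enum axis a)
{
   switch (a) {
   case AXIS_X: return "x";
   case AXIS_Y: return "y";
   }

   /* With only a plain assert here, the compiler still sees a path that
    * falls off the end of a non-void function and reports -Wreturn-type. */
   sketch_unreachable("invalid axis");
}
```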
5 years agoegl/android: Enable HAL_PIXEL_FORMAT_RGBA_FP16 format
Nataraj Deshpande [Wed, 28 Aug 2019 21:18:43 +0000 (14:18 -0700)]
egl/android: Enable HAL_PIXEL_FORMAT_RGBA_FP16 format

The patch adds support for the 64-bit HAL_PIXEL_FORMAT_RGBA_FP16
format on the Android platform.

Fixes android.graphics.cts.BitmapColorSpaceTest#test16bitHardware
which failed in egl due to "Unsupported native buffer format 0x16"
on chromebooks.

Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agogallivm: disable accurate cube corner for integer textures.
Dave Airlie [Thu, 29 Aug 2019 19:50:26 +0000 (05:50 +1000)]
gallivm: disable accurate cube corner for integer textures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111511
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoglsl: replace 'x + (-x)' with constant 0
Pierre-Eric Pelloux-Prayer [Wed, 28 Aug 2019 08:56:52 +0000 (10:56 +0200)]
glsl: replace 'x + (-x)' with constant 0

This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:

   value -= value

with value being undefined.
In this case LLVM replaces the operation with an assignment to NaN.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: add JPEG decode support for VCN 2.0 devices
Thong Thai [Mon, 19 Aug 2019 18:31:08 +0000 (14:31 -0400)]
radeonsi: add JPEG decode support for VCN 2.0 devices

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
5 years agoRevert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"
Thong Thai [Wed, 28 Aug 2019 21:02:26 +0000 (17:02 -0400)]
Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"

This reverts commit 5a2e65be89d97ed5d7263f0296ea69ae8517187b.

Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL
still needs to be emitted by the UMD, or else the driver will hang.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agonir/range-analysis: Add a lot more assertions about the contents of tables
Ian Romanick [Mon, 12 Aug 2019 22:40:20 +0000 (15:40 -0700)]
nir/range-analysis: Add a lot more assertions about the contents of tables

v2: Update several of the comments.  Drop some redundant uses of
ASSERT_UNION_OF_OTHERS_MATCHES_UNKNOWN_*_SOURCE source.  Suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/range-analysis: Range tracking for fpow
Ian Romanick [Fri, 9 Aug 2019 19:48:27 +0000 (12:48 -0700)]
nir/range-analysis: Range tracking for fpow

One shader from Metro Last Light and the rest from Rochard.  In the
Rochard cases, something like:

    min(1.0, max(pow(saturate(x), y), z))

was transformed to

    saturate(max(pow(saturate(x), y), z))

because the result of the pow must be >= 0.

The Metro Last Light case was similar.  An instance of

    min(pow(abs(x), y), 1.0)

became

    saturate(pow(abs(x), y))
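
The reasoning behind both rewrites can be sketched in scalar C. This is
an illustration, not the NIR pattern: p stands in for a pow() result that
range analysis has shown to be >= 0, and min_f, max_f, saturate_f,
clamp_upper_only and clamp_saturate are hypothetical names.

```c
/* Hypothetical scalar helpers. */
float min_f(float a, float b) { return a < b ? a : b; }
float max_f(float a, float b) { return a > b ? a : b; }
float saturate_f(float v) { return min_f(max_f(v, 0.0f), 1.0f); }

/* Original form: only the upper bound is clamped. */
float clamp_upper_only(float p, float z)
{
   return min_f(1.0f, max_f(p, z));
}

/* Rewritten form: a full saturate. The extra lower clamp is a no-op
 * whenever p >= 0, because max(p, z) >= p >= 0. */
float clamp_saturate(float p, float z)
{
   return saturate_f(max_f(p, z));
}
```

For p >= 0 the two forms agree. For a negative p with a negative z they
differ, which is why the rewrite needs the range information.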

v2: Fix some comments.  Suggested by Caio.

v3: Fix setting is_integral when the exponent might be negative.  See
also Mesa MR !1778.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280670 -> 16280659 (<.01%)
instructions in affected programs: 1130 -> 1119 (-0.97%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.72% max: 1.43% x̄: 1.03% x̃: 0.97%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.19% -0.86%
Instructions are helped.

total cycles in shared programs: 367168430 -> 367168270 (<.01%)
cycles in affected programs: 10281 -> 10121 (-1.56%)
helped: 10
HURT: 1
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 1.31% max: 2.43% x̄: 1.79% x̃: 1.70%
HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel)   min: 3.10% max: 3.10% x̄: 3.10% x̃: 3.10%
95% mean confidence interval for cycles value: -20.06 -9.04
95% mean confidence interval for cycles %-change: -2.36% -0.32%
Cycles are helped.

5 years agonir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel
Ian Romanick [Tue, 13 Aug 2019 01:44:56 +0000 (18:44 -0700)]
nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel

I discovered this while looking at a shader that was hurt by some other
work I'm doing.  When I examined the changes, I was confused that one
instance of a comparison that was used in a discard_if was (incorrectly)
eliminated, while another instance used by a bcsel was (correctly) not
eliminated.  I had to use NIR_PRINT=true to see exactly where things
went wrong.

A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and
Strike Suit Zero were impacted.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280659 -> 16281075 (<.01%)
instructions in affected programs: 21042 -> 21458 (1.98%)
helped: 0
HURT: 136
HURT stats (abs)   min: 1 max: 9 x̄: 3.06 x̃: 3
HURT stats (rel)   min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
95% mean confidence interval for instructions value: 2.93 3.19
95% mean confidence interval for instructions %-change: 2.08% 2.37%
Instructions are HURT.

total cycles in shared programs: 367168270 -> 367170313 (<.01%)
cycles in affected programs: 172020 -> 174063 (1.19%)
helped: 14
HURT: 111
helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
HURT stats (abs)   min: 2 max: 584 x̄: 21.08 x̃: 5
HURT stats (rel)   min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
95% mean confidence interval for cycles value: 5.41 27.28
95% mean confidence interval for cycles %-change: 0.64% 1.81%
Cycles are HURT.

5 years agonir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)
Ian Romanick [Mon, 12 Aug 2019 19:08:40 +0000 (12:08 -0700)]
nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)

Found by inspection.  I tried really, really hard to make a test case
that would trigger this problem, but I was unsuccessful.  It's very hard
to get an instruction to produce a ne_zero result without ne_zero
sources.  The most plausible way is using bcsel.  That proves
problematic because bcsel interprets its sources as integers, so it
cannot currently be used to "clean" values for floating point
instructions.

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
5 years agonir/range-analysis: Adjust result range of multiplication to account for flush-to...
Ian Romanick [Fri, 9 Aug 2019 17:55:49 +0000 (10:55 -0700)]
nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero

Fixes piglit tests (new in piglit!110):

    - fs-underflow-fma-compare-zero.shader_test
    - fs-underflow-mul-compare-zero.shader_test

v2: Add back part of comment accidentally deleted.  Noticed by
Caio. Remove is_not_zero function as it is no longer used.
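
A rough C illustration of why a product of two non-zero values cannot be
assumed non-zero. It shows ordinary float underflow; on hardware that
flushes denormals to zero, more inputs are affected (any product that
would land in the denormal range). product_is_zero is a hypothetical
helper and is not part of the range analysis code.

```c
#include <stdbool.h>

/* Returns true when a * b, rounded to float, is exactly zero. */
bool product_is_zero(float a, float b)
{
   volatile float va = a, vb = b;   /* prevent constant folding */
   volatile float p = va * vb;      /* force rounding to float */
   return p == 0.0f;
}
```

Both 1e-30f operands are non-zero, yet their product is far below the
smallest representable float and rounds to zero.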

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: fa116ce357b ("nir/range-analysis: Range tracking for ffma and flrp")
Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
All Gen7+ platforms** had similar results. (Ice Lake shown)
total instructions in shared programs: 16278465 -> 16279492 (<.01%)
instructions in affected programs: 16765 -> 17792 (6.13%)
helped: 0
HURT: 23
HURT stats (abs)   min: 7 max: 275 x̄: 44.65 x̃: 8
HURT stats (rel)   min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62%
95% mean confidence interval for instructions value: 9.57 79.74
95% mean confidence interval for instructions %-change: 1.85% 6.61%
Instructions are HURT.

total cycles in shared programs: 367135159 -> 367154270 (<.01%)
cycles in affected programs: 279306 -> 298417 (6.84%)
helped: 0
HURT: 23
HURT stats (abs)   min: 13 max: 6029 x̄: 830.91 x̃: 54
HURT stats (rel)   min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49%
95% mean confidence interval for cycles value: 100.89 1560.94
95% mean confidence interval for cycles %-change: 0.94% 13.71%
Cycles are HURT.

total spills in shared programs: 8870 -> 8869 (-0.01%)
spills in affected programs: 19 -> 18 (-5.26%)
helped: 1
HURT: 0

total fills in shared programs: 21904 -> 21901 (-0.01%)
fills in affected programs: 81 -> 78 (-3.70%)
helped: 1
HURT: 0

LOST:   0
GAINED: 1

** On Broadwell, a shader was hurt for spills / fills instead of
   helped.

No changes on any earlier platforms.

5 years agonir/range-analysis: Adjust result range of exp2 to account for flush-to-zero
Ian Romanick [Wed, 7 Aug 2019 15:56:22 +0000 (08:56 -0700)]
nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero

Fixes piglit tests (new in piglit!110):

    - fs-underflow-exp2-compare-zero.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Most of the shaders affected are, unsurprisingly, in Unigine Heaven.

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16278207 -> 16278465 (<.01%)
instructions in affected programs: 11374 -> 11632 (2.27%)
helped: 0
HURT: 58
HURT stats (abs)   min: 2 max: 13 x̄: 4.45 x̃: 4
HURT stats (rel)   min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82%
95% mean confidence interval for instructions value: 3.77 5.13
95% mean confidence interval for instructions %-change: 2.19% 2.64%
Instructions are HURT.

total cycles in shared programs: 367134284 -> 367135159 (<.01%)
cycles in affected programs: 81207 -> 82082 (1.08%)
helped: 17
HURT: 36
helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6
helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78%
HURT stats (abs)   min: 4 max: 235 x̄: 66.97 x̃: 16
HURT stats (rel)   min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09%
95% mean confidence interval for cycles value: -20.36 53.38
95% mean confidence interval for cycles %-change: -1.08% 4.67%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier platforms.

5 years agonir/algebraic: Clean up value range analysis-based optimizations
Ian Romanick [Thu, 8 Aug 2019 23:48:14 +0000 (16:48 -0700)]
nir/algebraic: Clean up value range analysis-based optimizations

Fix the a / b ordering in some compares.  Delete duplicate patterns.
Add a table explaining things.  While I was cleaning this up, I managed
to confuse myself.  The table helped sort that out.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/algebraic: Mark some value range analysis-based optimizations imprecise
Ian Romanick [Wed, 7 Aug 2019 15:54:04 +0000 (08:54 -0700)]
nir/algebraic: Mark some value range analysis-based optimizations imprecise

This didn't fix bug #111308, but it was found while trying to find the
actual cause of that bug.

Fixes piglit tests (new in piglit!110):

    - fs-fract-of-NaN.shader_test
    - fs-lt-nan-tautology.shader_test
    - fs-ge-nan-tautology.shader_test
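
The NaN behaviour these test names point at can be sketched in C. The
assumption here is that the affected optimizations fold a comparison to a
constant when the operand ranges seem to decide it, for example treating
a >= b as always true when a > 0 and b <= 0. ge_f and lt_f are
hypothetical helpers; volatile only prevents compile-time folding.

```c
#include <math.h>
#include <stdbool.h>

/* Evaluates a >= b at run time. */
bool ge_f(float a, float b)
{
   volatile float va = a, vb = b;
   return va >= vb;
}

/* Evaluates a < b at run time. */
bool lt_f(float a, float b)
{
   volatile float va = a, vb = b;
   return va < vb;
}
```

With ordinary values ge_f(1.0f, -1.0f) holds, but any ordered comparison
with a NaN operand is false, so a comparison folded to "true" gives a
different answer once NaN is involved.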

No shader-db changes on any Intel platform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: b77070e293c ("nir/algebraic: Use value range analysis to eliminate tautological compares")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Fix partial fast clear checks to account for miplevel.
Kenneth Graunke [Thu, 29 Aug 2019 01:05:57 +0000 (18:05 -0700)]
iris: Fix partial fast clear checks to account for miplevel.

We enabled fast clears at level > 0, but didn't minify the dimensions
when comparing the box size, so we always thought it was a partial
clear and as a result never actually enabled any.
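
A minimal sketch of the check described above, assuming the usual minify
rule (halve per level, never below 1) and a clear box that starts at the
origin. minify_dim and covers_whole_level are hypothetical names, not the
functions used in iris.

```c
#include <stdbool.h>

/* Size of one dimension at a given miplevel. */
unsigned minify_dim(unsigned base, unsigned level)
{
   unsigned v = base >> level;
   return v > 1 ? v : 1;
}

/* A clear is full only if the box covers the whole level, so the box is
 * compared against the minified size rather than the level-0 size. */
bool covers_whole_level(unsigned base_w, unsigned base_h, unsigned level,
                        unsigned box_w, unsigned box_h)
{
   return box_w == minify_dim(base_w, level) &&
          box_h == minify_dim(base_h, level);
}
```

For a 256x256 texture, a 64x64 box is a full clear at level 2 but only a
partial clear when compared against the level-0 dimensions.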

This eliminates some slow clears in Civilization VI, but they are mostly
during initialization and not the main rendering.

Thanks to Dan Walsh for noticing we had too many slow clears.

Fixes: 393f659ed83 ("iris: Enable fast clears on other miplevels and layers than 0.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agopanfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job()
Rohan Garg [Thu, 29 Aug 2019 12:53:10 +0000 (14:53 +0200)]
panfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job()

is_scanout is not used anywhere and can be inferred within
panfrost_drm_submit_vs_fs_job() if required.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agoiris: Actually describe bo_reuse driconf option
Kenneth Graunke [Thu, 29 Aug 2019 16:39:46 +0000 (09:39 -0700)]
iris: Actually describe bo_reuse driconf option

Otherwise it doesn't exist and can't be parsed, so everything dies at
screen init time.

Fixes: 6dc4ddc5f81 ("iris: use driconf for 'bo_reuse' parameter")
5 years agopanfrost/ci: Print only regressions
Tomeu Vizoso [Thu, 29 Aug 2019 12:44:17 +0000 (14:44 +0200)]
panfrost/ci: Print only regressions

Some functionality has been added to deqp-volt to only print
regressions, so update our version of it and use the new options.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agogallivm: use fallback code for mul_hi with llvm >= 7.0
Roland Scheidegger [Wed, 28 Aug 2019 19:35:45 +0000 (21:35 +0200)]
gallivm: use fallback code for mul_hi with llvm >= 7.0

LLVM 7.0 ditched the pmulu intrinsics.
This is only a trivial patch to use the fallback code instead.
It'll likely produce atrocious code since the pattern doesn't match what
llvm itself uses in its autoupgrade paths, hence the pattern won't be
recognized.
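
For reference, a scalar C illustration of what mul_hi computes: the high
32 bits of a 32x32-bit multiply, obtained by widening, multiplying and
shifting. This shows the operation's meaning only; it is not the vector
LLVM IR that gallivm emits, and umul_hi32 is a hypothetical name.

```c
#include <stdint.h>

/* High 32 bits of the full 64-bit product of two unsigned 32-bit values. */
uint32_t umul_hi32(uint32_t a, uint32_t b)
{
   uint64_t wide = (uint64_t)a * (uint64_t)b;   /* widen before multiplying */
   return (uint32_t)(wide >> 32);
}
```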

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=111496

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv/gfx10: compute the LDS size for exporting PrimID for VS
Samuel Pitoiset [Thu, 29 Aug 2019 07:18:54 +0000 (09:18 +0200)]
radv/gfx10: compute the LDS size for exporting PrimID for VS

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoswr/rasterizer: Enable ARB_fragment_layer_viewport
Jan Zielinski [Fri, 2 Aug 2019 10:28:13 +0000 (12:28 +0200)]
swr/rasterizer: Enable ARB_fragment_layer_viewport

Added loading of the gl_Layer and gl_ViewportIndex variables
into the Pixel Shader context.

Reviewed-by: Alok Hota <alok.hota@intel.com>
5 years agoiris: use driconf for 'bo_reuse' parameter
Tapani Pälli [Wed, 28 Aug 2019 11:46:16 +0000 (14:46 +0300)]
iris: use driconf for 'bo_reuse' parameter

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoi965: initialize bo_reuse when creating brw_bufmgr
Tapani Pälli [Wed, 28 Aug 2019 11:29:53 +0000 (14:29 +0300)]
i965: initialize bo_reuse when creating brw_bufmgr

Fixes a possible data race spotted while debugging other EGL-related
failures, where glFinish and eglCreateContext run at the same time:

  ==11558== Possible data race during read of size 1 at 0x5E78CD0 by thread #23
  ==11558== Locks held: 1, at address 0x5E77CA8
  ==11558==    at 0x61B71D4: bo_alloc_internal (brw_bufmgr.c:639)
  ==11558==    by 0x61B7328: brw_bo_alloc (brw_bufmgr.c:669)
  ==11558==    by 0x61EF975: recreate_growing_buffer (intel_batchbuffer.c:231)
  ==11558==    by 0x61EFAAE: intel_batchbuffer_reset (intel_batchbuffer.c:255)
  ==11558==    by 0x61EFB85: intel_batchbuffer_reset_and_clear_render_cache (intel_batchbuffer.c:280)
  ==11558==    by 0x61F0507: brw_new_batch (intel_batchbuffer.c:551)
  ==11558==    by 0x61F12C1: _intel_batchbuffer_flush_fence (intel_batchbuffer.c:888)
  ==11558==    by 0x61BDD6B: intel_glFlush (brw_context.c:296)
  ==11558==    by 0x61BDDB9: intel_finish (brw_context.c:307)
  ==11558==    by 0x623831B: _mesa_Finish (context.c:1906)
  ==11558==    by 0x46D556: deqp::egl::GLES2ThreadTest::Operation::execute(tcu::ThreadUtil::Thread&)
  ==11558==    by 0x721502: tcu::ThreadUtil::Thread::run()
  ==11558==
  ==11558== This conflicts with a previous write of size 1 by thread #26
  ==11558== Locks held: 1, at address 0x5D09878
  ==11558==    at 0x61B98A9: brw_bufmgr_enable_reuse (brw_bufmgr.c:1541)
  ==11558==    by 0x61BF09D: brw_process_driconf_options (brw_context.c:854)
  ==11558==    by 0x61BF6CA: brwCreateContext (brw_context.c:993)
  ==11558==    by 0x621181F: driCreateContextAttribs (dri_util.c:473)
  ==11558==    by 0x53FE87B: dri2_create_context (egl_dri2.c:1388)
  ==11558==    by 0x53EE7BE: eglCreateContext (eglapi.c:807)
  ==11558==    by 0x5C8AB9: eglw::FuncPtrLibrary::createContext(void*, void*, void*, int const*) const
  ==11558==    by 0x46E027: deqp::egl::GLES2ThreadTest::CreateContext::exec(tcu::ThreadUtil::Thread&)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Don't auto-flush/dirty on transfer unmap for coherent buffers
Kenneth Graunke [Thu, 29 Aug 2019 00:50:13 +0000 (17:50 -0700)]
iris: Don't auto-flush/dirty on transfer unmap for coherent buffers

When u_upload_mgr fills up a buffer, it unmaps and destroys it.  Our
unmap function was automatically performing the equivalent of a
FlushMappedBufferRange call in this case.  Because the buffer mapping
is persistent and coherent, we don't actually do any flushing when we
do the rest of the writes to the buffer - we were just doing one final
one at the end.  But we would be using the uploaded contents on the
GPU the whole time.

This certainly shouldn't be necessary for streaming buffers, and if
such flushing and dirtying is necessary for coherent buffers, this is
wildly insufficient.

Drops a small number of constant packets and PIPE_CONTROL flushes from
most benchmarks that I've looked at.  Doesn't seem to make much of an
impact on performance, however.

Thanks to Felix Degrood for noticing that we were emitting more
3DSTATE_CONSTANT_* packets than we needed to.

5 years agost/nine: Properly initialize GLSL types for NIR shaders.
Timur Kristóf [Wed, 28 Aug 2019 23:23:37 +0000 (01:23 +0200)]
st/nine: Properly initialize GLSL types for NIR shaders.

NIR shaders use GLSL types (note: these live outside libglsl), and
nine needs to properly initialize these just like the other state
trackers. This fixes an assertion failure when TTN is used.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
5 years agofreedreno/ir3: do better job of marking convergence points
Rob Clark [Sat, 29 Jun 2019 11:54:23 +0000 (04:54 -0700)]
freedreno/ir3: do better job of marking convergence points

Fixes:
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_vertex
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_fragment

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>