mesa.git
2 years agoetnaviv: fix a null pointer dereference
Andrii Simiklit [Fri, 19 Jul 2019 14:39:07 +0000 (17:39 +0300)]
etnaviv: fix a null pointer dereference

This issue was found by cppcheck

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2 years agoac/nir: Lower large indirect variables to scratch
Connor Abbott [Wed, 12 Jun 2019 16:58:13 +0000 (18:58 +0200)]
ac/nir: Lower large indirect variables to scratch

results from radeonsi NIR:

Totals from affected shaders:
SGPRS: 704 -> 464 (-34.09 %)
VGPRS: 2056 -> 672 (-67.32 %)
Spilled SGPRs: 24 -> 0 (-100.00 %)
Spilled VGPRs: 28406 -> 0 (-100.00 %)
Private memory VGPRs: 0 -> 3182 (0.00 %)
Scratch size: 1064 -> 3228 (203.38 %) dwords per thread
Code Size: 935260 -> 40180 (-95.70 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 28 -> 70 (150.00 %)
Wait states: 0 -> 0 (0.00 %)

results from radv:

Totals from affected shaders:
SGPRS: 80 -> 48 (-40.00 %)
VGPRS: 204 -> 108 (-47.06 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 256 (0.00 %) dwords per thread
Code Size: 15792 -> 9504 (-39.82 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1 -> 2 (100.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2 years agodrirc: Add discard workaround for Divinity: Original Sin EE
Timothy Arceri [Fri, 2 Aug 2019 00:05:20 +0000 (10:05 +1000)]
drirc: Add discard workaround for Divinity: Original Sin EE

This adds an additional work around for the game to fix the blocky
shadows as reported in bug 105282

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105282

2 years agolima/ppir: simplify load uni/temp op lowering and scheduling
Erico Nunes [Sun, 21 Jul 2019 23:27:11 +0000 (01:27 +0200)]
lima/ppir: simplify load uni/temp op lowering and scheduling

The load uniform/temporary operations output only to a pipeline
register, which must be consumed by another op in the same instruction
later.
The current implementation delays the decision of who will consume this
result to until the scheduling step. If the consumer node is not able to
use the pipeline register, a mov node may have to be created, during the
scheduler step.

As part of the ppir scheduler simplification, and now that the ppir
scheduler supports pipeline register dependencies, this can be
simplified by always creating a single mov node outputting to a normal
register that can be used directly by all consumers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2 years agolima/ppir: simplify select op lowering and scheduling
Erico Nunes [Sun, 21 Jul 2019 13:07:50 +0000 (15:07 +0200)]
lima/ppir: simplify select op lowering and scheduling

The select operation relies on the select condition coming from the
result of the the alu scalar mult slot, in the same instruction.
The current implementation creates a mov node to be the predecessor of
select, and then relies on an exception during scheduling to ensure that
both ops are inserted in the same instruction.

Now that the ppir scheduler supports pipeline register dependencies,
this can be simplified by making the mov explicitly output to the fmul
pipeline register, and the scheduler can place it without an exception.

Since the select condition can only be placed in the scalar mult slot,
differently than a regular mov, define a separate op for it.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2 years agolima/ppir: support pipeline registers in scheduler
Erico Nunes [Sun, 21 Jul 2019 11:30:34 +0000 (13:30 +0200)]
lima/ppir: support pipeline registers in scheduler

The ppir scheduler grew to be rather complicated and containing many
exceptions as it also has to take care of inserting additional nodes
when it is mandatory for nodes to be in the same instruction.
As such, the lima lowering and scheduling process can be difficult to
understand and maintain.
The ppir lowering step created nodes hoping that the scheduler would
notice the exception and do the right thing.

This proposal adds a simple refactor to the scheduler so that it places
nodes with pipeline registers in the same instruction.
With the scheduler handling this in a general way, it is possible to
create same-instruction dependencies by using pipeline registers during
the lowering stage.
This is simpler to maintain because now we can make these dependencies
explicit in a single place (lowering), and we can drop exceptions from
scheduling.
Reducing the complexity of the scheduler is also useful as preparatory
work to support control flow in ppir.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2 years agodocs: fix "empty array" meson syntax
Eric Engestrom [Wed, 19 Jun 2019 12:29:49 +0000 (13:29 +0100)]
docs: fix "empty array" meson syntax

On recent versions of Meson (0.47+) these are synonymous, but we still
support older versions than that, so let's use the correct syntax to
avoid confusing users of old Meson versions.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2 years agoegl: drop unnecessary function deref
Eric Engestrom [Tue, 27 Nov 2018 12:48:42 +0000 (12:48 +0000)]
egl: drop unnecessary function deref

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2 years agoglx: drop unnecessary pointer deref for function calls
Eric Engestrom [Thu, 22 Nov 2018 19:47:28 +0000 (19:47 +0000)]
glx: drop unnecessary pointer deref for function calls

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2 years agointroduce c11_compat.h to provide C11 things in C99
Eric Engestrom [Tue, 23 Jul 2019 12:44:33 +0000 (13:44 +0100)]
introduce c11_compat.h to provide C11 things in C99

Right now, all it does is provide the new standard `static_assert()` name.

Fixes: fbf7c38da35afe7f1de0 ("egl/wayland: use bitset.h for `formats` bit set")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Bhushan Shah <bshah@kde.org>
2 years agotravis: add MacOS Scons build
Eric Engestrom [Tue, 30 Jul 2019 15:18:10 +0000 (16:18 +0100)]
travis: add MacOS Scons build

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2 years agosymbols-check: fix `nm` invocation on MacOS
Eric Engestrom [Sat, 3 Aug 2019 17:08:38 +0000 (18:08 +0100)]
symbols-check: fix `nm` invocation on MacOS

According to Mac OSX's man page [1], this is how we should get the list
of exported symbols:
  nm -g -P foo.dylib

-g to only show the exported symbols
-P to show it in a "portable" format, ie. readable by a script

Since this is supported by GNU nm as well, let's use that everywhere,
although some care needs to be taken as there are some differences in
the output.

[1] https://www.unix.com/man-page/osx/1/nm/

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2 years agosymbols-check: discard platform symbols early
Eric Engestrom [Sat, 3 Aug 2019 17:21:26 +0000 (18:21 +0100)]
symbols-check: discard platform symbols early

(as the comment there already claimed)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2 years agosymbols-check: skip test if we can't get the symbols list
Eric Engestrom [Sat, 3 Aug 2019 23:31:05 +0000 (00:31 +0100)]
symbols-check: skip test if we can't get the symbols list

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2 years agolima/ppir: move alu vec to scalar lowering into NIR
Vasily Khoruzhick [Sat, 3 Aug 2019 16:59:27 +0000 (09:59 -0700)]
lima/ppir: move alu vec to scalar lowering into NIR

Utgard PP is vec4, but some operations are scalar, utilize
NIR vec to scalar lowering pass and indicate operations that we
want to lower.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2 years agoiris: Fix handling of SIMD32 fragment shaders
Jason Ekstrand [Sat, 3 Aug 2019 16:37:34 +0000 (11:37 -0500)]
iris: Fix handling of SIMD32 fragment shaders

The brw_wm_prog_data_dispatch_grf_start_reg and _prog_offset helpers
read the _NPixelDispatchEnable fields from 3DSTATE_PS to figure out
which bits to pull out of the prog data and stuff where.  Therefore,
they need to be called with the final set of _NPixelDispatchEnable bits
after we've done the workaround for SIMD32 and 16x MSAA.  Otherwise, if
you end up with a somewhat odd combination of enables, the GRF start reg
and KSP data ends up in the wrong slots.  In particular, running
SIMD32-only is broken but several other combinations are as well.

Fixes: 5445c176e27ba "iris: Disable SIMD32 when using a 16x MSAA..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2 years agomesa: Rename GLX_USE_TLS to USE_ELF_TLS.
Bas Nieuwenhuizen [Sat, 3 Aug 2019 16:44:44 +0000 (18:44 +0200)]
mesa: Rename GLX_USE_TLS to USE_ELF_TLS.

These days it is not GLX only and it does not work with all TLS
implementations.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2 years agomeson: Do not use GLX_USE_TLS on Android.
Bas Nieuwenhuizen [Sat, 3 Aug 2019 16:13:52 +0000 (18:13 +0200)]
meson: Do not use GLX_USE_TLS on Android.

The asm code expects a specific kind of implementation, but Android
uses something different (emutls).

Turns out mesa has a fallback with pthread_getspecific, with an
optimizaiton if only a single thread is used. emutls also uses
getspecific, so lets just use the optimized mesa implementation.

Fixes: 20294dceebc "mesa: Enable asm unconditionally, now that gen_matypes is gone."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2 years agoetnaviv: s/boolean/bool
Christian Gmeiner [Sat, 3 Aug 2019 10:04:09 +0000 (12:04 +0200)]
etnaviv: s/boolean/bool

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
2 years agolima/ppir: Add gl_FrontFace handling
Andreas Baierl [Thu, 1 Aug 2019 09:35:28 +0000 (11:35 +0200)]
lima/ppir: Add gl_FrontFace handling

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2 years agointel/nir: Add 1-bit opcodes to brw_cmod_for_nir_comparison_op
Jason Ekstrand [Fri, 2 Aug 2019 20:21:14 +0000 (15:21 -0500)]
intel/nir: Add 1-bit opcodes to brw_cmod_for_nir_comparison_op

Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agointel/nir: Add a common nir comparison -> cmod helper
Jason Ekstrand [Fri, 2 Aug 2019 20:19:16 +0000 (15:19 -0500)]
intel/nir: Add a common nir comparison -> cmod helper

We already had one in the vec4 code, we just had move it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoutil: fix pointer type on NetBSD
Eric Engestrom [Fri, 2 Aug 2019 23:32:28 +0000 (00:32 +0100)]
util: fix pointer type on NetBSD

NetBSD expects a `void *` argument [1] as the printf-style arguments to
the formatting string, so we need to cast the `const` away.

[1] https://netbsd.gw.com/cgi-bin/man-cgi?pthread_setname_np++NetBSD-current

Suggested-by: Kamil Rytarowski <n54@gmx.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agomeson: remove unused field
Eric Engestrom [Mon, 22 Jul 2019 14:55:06 +0000 (15:55 +0100)]
meson: remove unused field

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2 years agomeson: replace last uses of libxmlconfig with idep_xmlconfig
Eric Engestrom [Mon, 22 Jul 2019 14:25:28 +0000 (15:25 +0100)]
meson: replace last uses of libxmlconfig with idep_xmlconfig

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2 years agomeson: drop unused dep_{thread,dl}
Eric Engestrom [Tue, 23 Jul 2019 10:25:53 +0000 (11:25 +0100)]
meson: drop unused dep_{thread,dl}

Unused as of last commit.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2 years agomeson: replace libmesa_util with idep_mesautil
Eric Engestrom [Mon, 22 Jul 2019 13:50:15 +0000 (14:50 +0100)]
meson: replace libmesa_util with idep_mesautil

This automates the include_directories and dependencies tracking so that
all users of libmesa_util don't need to add them manually.

Next commit will remove the ones that were only added for that reason.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2 years agopan/midgard: Print texture outmod
Alyssa Rosenzweig [Fri, 2 Aug 2019 21:26:33 +0000 (14:26 -0700)]
pan/midgard: Print texture outmod

I have no idea who thought this was a good idea.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Promote all 16 uniforms
Alyssa Rosenzweig [Fri, 2 Aug 2019 15:46:44 +0000 (08:46 -0700)]
pan/midgard: Promote all 16 uniforms

Now that register spilling is in place, this is reasonable. It turns out
for some shaders, it's actually better to cap at 8 work registers and
extra >8 uniform reigsters and tolerate the spilling, since the extra
resulting threads make up for the spillage. So incidentally, the shader
that spills here is in -bterrain, which jumps from 19fps to 21fps as a
result of this change.

total instructions in shared programs: 3513 -> 3448 (-1.85%)
instructions in affected programs: 776 -> 711 (-8.38%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 3.25 x̃: 2
helped stats (rel) min: 3.57% max: 16.00% x̄: 8.37% x̃: 7.19%
95% mean confidence interval for instructions value: -4.28 -2.22
95% mean confidence interval for instructions %-change: -10.02% -6.73%
Instructions are helped.

total bundles in shared programs: 2067 -> 2024 (-2.08%)
bundles in affected programs: 515 -> 472 (-8.35%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 6 x̄: 2.37 x̃: 2
helped stats (rel) min: 2.13% max: 17.86% x̄: 10.19% x̃: 11.11%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for bundles value: -3.01 -1.29
95% mean confidence interval for bundles %-change: -12.13% -6.91%
Bundles are helped.

total quadwords in shared programs: 3468 -> 3426 (-1.21%)
quadwords in affected programs: 764 -> 722 (-5.50%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.26 x̃: 2
helped stats (rel) min: 1.41% max: 12.50% x̄: 6.76% x̃: 7.14%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.08% max: 1.08% x̄: 1.08% x̃: 1.08%
95% mean confidence interval for quadwords value: -2.83 -1.37
95% mean confidence interval for quadwords %-change: -8.08% -4.65%
Quadwords are helped.

total registers in shared programs: 383 -> 360 (-6.01%)
registers in affected programs: 112 -> 89 (-20.54%)
helped: 19
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 1.21 x̃: 1
helped stats (rel) min: 12.50% max: 27.27% x̄: 20.63% x̃: 20.00%
95% mean confidence interval for registers value: -1.47 -0.95
95% mean confidence interval for registers %-change: -22.39% -18.87%
Registers are helped.

total threads in shared programs: 432 -> 451 (4.40%)
threads in affected programs: 19 -> 38 (100.00%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.73 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.41 2.04
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are [helped].

total loops in shared programs: 4 -> 4 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 0 -> 4
spills in affected programs: 0 -> 4
helped: 0
HURT: 2

total fills in shared programs: 0 -> 7
fills in affected programs: 0 -> 7
helped: 0
HURT: 2

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Break mir_spill_register into its function
Alyssa Rosenzweig [Fri, 2 Aug 2019 16:06:13 +0000 (09:06 -0700)]
pan/midgard: Break mir_spill_register into its function

No functional changes, just breaks out a megamonster function and fixes
the indentation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Switch sources to an array for trinary sources
Alyssa Rosenzweig [Fri, 2 Aug 2019 22:25:02 +0000 (15:25 -0700)]
pan/midgard: Switch sources to an array for trinary sources

We need three independent sources to support indirect SSBO writes (as
well as textures with both LOD/bias and offsets). Now is a good time to
make sources just an array so we don't have to rewrite a ton of code if
we ever needed a fourth source for some reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Remove "r27-only" register class
Alyssa Rosenzweig [Thu, 1 Aug 2019 21:28:34 +0000 (14:28 -0700)]
pan/midgard: Remove "r27-only" register class

As far as I know, there's no such thing as a load/store op that only
takes its argument in r27. We just need to set the appropriate arg_1
field in the RA to specify other registers if we want them.

To facilitate this, various RA-related changes are needed across the
compiler ; this should also fix indirect offsets which were implicitly
interpreted as "r27-only" despite not even passing through RA yet. One
ripple effect change is switching the move insertion point and adjusting
the liveness analysis accordingly, so while this was intended as a
purely functional change, there are some shader-db changes:

total instructions in shared programs: 3511 -> 3498 (-0.37%)
instructions in affected programs: 563 -> 550 (-2.31%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
helped stats (rel) min: 0.93% max: 5.00% x̄: 2.58% x̃: 2.33%
95% mean confidence interval for instructions value: -1.27 -0.90
95% mean confidence interval for instructions %-change: -3.23% -1.93%
Instructions are helped.

total bundles in shared programs: 2067 -> 2067 (0.00%)
bundles in affected programs: 398 -> 398 (0.00%)
helped: 7
HURT: 4
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.54% max: 10.00% x̄: 5.04% x̃: 5.56%
HURT stats (abs)   min: 1 max: 2 x̄: 1.75 x̃: 2
HURT stats (rel)   min: 2.13% max: 4.26% x̄: 3.72% x̃: 4.26%
95% mean confidence interval for bundles value: -0.95 0.95
95% mean confidence interval for bundles %-change: -5.21% 1.50%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 3464 -> 3454 (-0.29%)
quadwords in affected programs: 1199 -> 1189 (-0.83%)
helped: 18
HURT: 4
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.03% max: 5.26% x̄: 2.44% x̃: 1.79%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 2.56% max: 2.82% x̄: 2.63% x̃: 2.56%
95% mean confidence interval for quadwords value: -0.98 0.07
Inconclusive result (value mean confidence interval includes 0).

total registers in shared programs: 383 -> 373 (-2.61%)
registers in affected programs: 56 -> 46 (-17.86%)
helped: 12
HURT: 2
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 9.09% max: 33.33% x̄: 29.58% x̃: 33.33%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 20.00% max: 50.00% x̄: 35.00% x̃: 35.00%
95% mean confidence interval for registers value: -1.13 -0.29
95% mean confidence interval for registers %-change: -35.07% -5.63%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Handle get/set_swizzle for load/store arguments
Alyssa Rosenzweig [Fri, 2 Aug 2019 18:22:56 +0000 (11:22 -0700)]
pan/midgard: Handle get/set_swizzle for load/store arguments

Load/store's  main "argument 0" already has its swizzle handled
correctly (for stores, that is). But the tinier arguments, the compact
ones with a component select but not a full swizzle, those are not yet
handled. Let's do something about that!

2 years agopan/midgard: Fix block successors
Alyssa Rosenzweig [Fri, 2 Aug 2019 20:48:27 +0000 (13:48 -0700)]
pan/midgard: Fix block successors

Rather than an ersatz thing that sort of looks like successors but is in
fact just the source order traversal with some backward jumps hacked in
for loops... construct an actual flow graph so we can do analysis
sanely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Add helper to pack load/store registers
Alyssa Rosenzweig [Thu, 1 Aug 2019 21:14:43 +0000 (14:14 -0700)]
pan/midgard: Add helper to pack load/store registers

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Decode register/component in load/store argument
Alyssa Rosenzweig [Thu, 1 Aug 2019 21:06:19 +0000 (14:06 -0700)]
pan/midgard: Decode register/component in load/store argument

3-bits out of 8 down!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Fix REGISTER_OFFSET
Alyssa Rosenzweig [Thu, 1 Aug 2019 21:06:02 +0000 (14:06 -0700)]
pan/midgard: Fix REGISTER_OFFSET

r27 isn't the special one, usually.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Split ld/st unknown to arg_1/arg_2 fields
Alyssa Rosenzweig [Thu, 1 Aug 2019 20:29:01 +0000 (13:29 -0700)]
pan/midgard: Split ld/st unknown to arg_1/arg_2 fields

The 16-bit field can be decomposed to two independent 8-bit fields, each
representing a single (additional) argument to the load/store op,
generally used for encoding registers. Addressable registers here are
substantially limited compared to the main register in a load/store op.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agoradv: Expose VK_KHR_imageless_framebuffer.
Bas Nieuwenhuizen [Fri, 2 Aug 2019 12:45:49 +0000 (14:45 +0200)]
radv: Expose VK_KHR_imageless_framebuffer.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2 years agoradv: Implement VK_KHR_imageless_framebuffer.
Bas Nieuwenhuizen [Fri, 2 Aug 2019 12:42:50 +0000 (14:42 +0200)]
radv: Implement VK_KHR_imageless_framebuffer.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2 years agoradv: Store image view also outside framebuffer.
Bas Nieuwenhuizen [Fri, 2 Aug 2019 13:22:37 +0000 (15:22 +0200)]
radv: Store image view also outside framebuffer.

So we can use it with imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2 years agoradv: Store color/depth surface info in attachment info instead of framebuffer.
Bas Nieuwenhuizen [Mon, 22 Jul 2019 00:31:27 +0000 (02:31 +0200)]
radv: Store color/depth surface info in attachment info instead of framebuffer.

That way we can use it for imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2 years agopanfrost: Allocate polygon lists on-demand
Alyssa Rosenzweig [Fri, 2 Aug 2019 17:18:48 +0000 (19:18 +0200)]
panfrost: Allocate polygon lists on-demand

Rather than alloacting a huge (64MB) polygon list on context creation
and sharing it across framebuffers, we instead allocate polygon lists as
BOs (which consistently hit the cache) sized appropriately; for about a
month, we've known how to calculate the polygon list size so this has
only recently become possible.

The good news is we can render to truly massive framebuffers without
crashing and, more importantly, we eliminate the 64MB upfront overhead.
If a list that size isn't actually needed, it's not allocated.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2 years agopanfrost: Handle the bo == NULL case in panfrost_bo_[un]reference()
Boris Brezillon [Fri, 2 Aug 2019 17:18:47 +0000 (19:18 +0200)]
panfrost: Handle the bo == NULL case in panfrost_bo_[un]reference()

Allows us to pass BOs without checking if they're NULL or not.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopanfrost: Get rid of the skippable param in attach_vt_framebuffer()
Boris Brezillon [Fri, 2 Aug 2019 17:18:46 +0000 (19:18 +0200)]
panfrost: Get rid of the skippable param in attach_vt_framebuffer()

The only user of this function always passes true.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopanfrost: Don't emit a new FB desc when setting a new FB state
Boris Brezillon [Fri, 2 Aug 2019 17:18:45 +0000 (19:18 +0200)]
panfrost: Don't emit a new FB desc when setting a new FB state

The FB desc will be emitted/attached on the first draw targetting
this new FB.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopanfrost: Bail out early when doing a wallpaper blit
Boris Brezillon [Fri, 2 Aug 2019 17:18:44 +0000 (19:18 +0200)]
panfrost: Bail out early when doing a wallpaper blit

The wallpaper blit is a bit special in that the operation is targetting
the current FB, but the u_blitter logic creates a new surface for it
which makes util_framebuffer_state_equal() return false. In that case
we don't want a new FB descriptor to be emitted/attached, so let's just
copy the new state into ctx->pipe_framebuffer and exit the function.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopanfrost: Bail out early when new and current FB states are equal
Boris Brezillon [Fri, 2 Aug 2019 17:18:43 +0000 (19:18 +0200)]
panfrost: Bail out early when new and current FB states are equal

If the current FB matches the new one there's nothing to be done in
panfrost_set_framebuffer_state(). By bailing out early in that case we
avoid emitting new FB descriptors (the old ones are still valid).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopanfrost: Delay FB descriptor allocation
Boris Brezillon [Fri, 2 Aug 2019 17:18:42 +0000 (19:18 +0200)]
panfrost: Delay FB descriptor allocation

No need to emit SFBD/MFBD at frame invalidation. They can be emitted
when the framebuffer is attached, which saves us a potential FB desc
re-allocation if a new FB is bound after the swap.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopanfrost: Remove job from ctx->jobs at submission time
Boris Brezillon [Fri, 2 Aug 2019 17:18:41 +0000 (19:18 +0200)]
panfrost: Remove job from ctx->jobs at submission time

This guarantees that new draws targetting the same framebuffer will
get a new job instance.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopanfrost: Make ctx->job useful
Boris Brezillon [Fri, 2 Aug 2019 17:18:40 +0000 (19:18 +0200)]
panfrost: Make ctx->job useful

ctx->job is supposed to serve as a cache to avoid an hash table lookup
everytime we access the job attached to the currently bound FB, except
it was never assigned to anything but NULL.

Fix that by adding the missing assignment in panfrost_get_job_for_fbo().
Also add a missing NULL assignment in the ->set_framebuffer_state()
path.

While at it, add extra assert()s to make sure ctx->job is consistent.

Fixes: 59c9623d0a75 ("panfrost: Import job data structures from v3d")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agoac/nir,radv: Optimize bounds check for 64 bit CAS.
Bas Nieuwenhuizen [Fri, 2 Aug 2019 10:40:17 +0000 (12:40 +0200)]
ac/nir,radv: Optimize bounds check for 64 bit CAS.

When the application does not ask for robust buffer access.

Only implemented the check in radv.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2 years agogallivm: fix issue with AtomicCmpXchg wrapper on llvm 3.5-3.8
Roland Scheidegger [Tue, 30 Jul 2019 21:35:49 +0000 (23:35 +0200)]
gallivm: fix issue with AtomicCmpXchg wrapper on llvm 3.5-3.8

These versions still need wrapper but already have both success and
failure ordering.
(Compile tested on llvm 3.3, 3.7, 3.8.)

v2: don't duplicate whole function (suggested by Brian).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111102

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2 years agoutil: Handle differences in pthread_setname_np
Matt Turner [Wed, 31 Jul 2019 21:44:39 +0000 (14:44 -0700)]
util: Handle differences in pthread_setname_np

There are a lot of unfortunate differences in the implementation of this
function. NetBSD and Mac OS X in particular require different arguments.

https://stackoverflow.com/questions/2369738/how-to-set-the-name-of-a-thread-in-linux-pthreads/7989973#7989973
provides for a good overview of the differences.

Fixes: 9c411e020d1 ("util: Drop preprocessor guards for glibc-2.12")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111264
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
[Eric: use DETECT_OS_* instead of PIPE_OS_*]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoutil/os_time: use detect_os.h to uncouple from gallium
Eric Engestrom [Thu, 1 Aug 2019 21:38:00 +0000 (22:38 +0100)]
util/os_time: use detect_os.h to uncouple from gallium

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoutil/u_debug: use detect_os.h
Eric Engestrom [Thu, 1 Aug 2019 21:36:30 +0000 (22:36 +0100)]
util/u_debug: use detect_os.h

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoutil/os_misc: use detect_os.h to start uncoupling from gallium
Eric Engestrom [Thu, 1 Aug 2019 21:33:05 +0000 (22:33 +0100)]
util/os_misc: use detect_os.h to start uncoupling from gallium

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoutil/os_memory: use detect_os.h to uncouple it from gallium
Eric Engestrom [Thu, 1 Aug 2019 15:55:39 +0000 (16:55 +0100)]
util/os_memory: use detect_os.h to uncouple it from gallium

While at it, remove p_compiler.h as well as it is unused.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agogallium: deduplicate os detection logic by using detect_os.h
Eric Engestrom [Thu, 1 Aug 2019 13:58:52 +0000 (14:58 +0100)]
gallium: deduplicate os detection logic by using detect_os.h

This allows us to avoid having to rename all the PIPE_OS_* at once while
still making sure PIPE_OS_* and DETECT_OS_* are always in sync.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agogallium/utils: drop PIPE_SUBSYSTEM_WINDOWS_USER
Eric Engestrom [Thu, 1 Aug 2019 21:49:05 +0000 (22:49 +0100)]
gallium/utils: drop PIPE_SUBSYSTEM_WINDOWS_USER

This is basically just an alias for PIPE_OS_WINDOWS.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoscons: rename PIPE_SUBSYSTEM_EMBEDDED to EMBEDDED_DEVICE
Eric Engestrom [Thu, 1 Aug 2019 20:45:25 +0000 (21:45 +0100)]
scons: rename PIPE_SUBSYSTEM_EMBEDDED to EMBEDDED_DEVICE

It has nothing to do with the PIPE_SUBSYSTEM_* stuff from gallium.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agogallium: remove never-used PIPE_SUBSYSTEM_DRI
Eric Engestrom [Thu, 1 Aug 2019 14:02:15 +0000 (15:02 +0100)]
gallium: remove never-used PIPE_SUBSYSTEM_DRI

PIPE_SUBSYSTEM_DRI was introduced in dacfef158943665fc0d1 ("gallium: New
configuration header.") 11 years ago, and was never used.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoutil: fix typo in comment
Eric Engestrom [Thu, 1 Aug 2019 14:01:54 +0000 (15:01 +0100)]
util: fix typo in comment

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agoutil: introduce detect_os.h
Eric Engestrom [Thu, 1 Aug 2019 13:48:26 +0000 (14:48 +0100)]
util: introduce detect_os.h

Mostly copied from src/gallium/include/pipe/p_config.h, so I kept its
copyright and authorship.

Other than the obvious rename, the big difference is that these are
always defined, to be used as `#if DETECT_OS_LINUX`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2 years agofreedreno/batch: fix dependency loop detection
Rob Clark [Wed, 31 Jul 2019 19:30:24 +0000 (12:30 -0700)]
freedreno/batch: fix dependency loop detection

We can have a scenario like:

  A -> B
  A -> C -> B

When adding the A->C dependency, it doesn't really matter that C depends
on something that A depends on, that isn't a necessary condition for a
dependency loop.

Instead what we want to know is that nothing C depends on, directly or
indirectly, depends on A.  We can detect this by recursively OR'ing the
dependents_mask of C and all it's dependencies.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno/a6xx: add missing flush/invalidates for blit
Rob Clark [Tue, 30 Jul 2019 16:34:53 +0000 (09:34 -0700)]
freedreno/a6xx: add missing flush/invalidates for blit

Various things we were missing for multiple blits in a single batch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno/a6xx: skip tiles with no geometry
Rob Clark [Sat, 27 Jul 2019 16:00:37 +0000 (09:00 -0700)]
freedreno/a6xx: skip tiles with no geometry

If no clear, and no geometry according to VSC_STATE[pipe] we can skip
the tile entirely.  If there is a fast-clear, we can't skip restore
(clear) or resolve IBs, but we can still skip draw IB.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno/a6xx: VSC overflow detection/handling
Rob Clark [Fri, 26 Jul 2019 16:55:14 +0000 (09:55 -0700)]
freedreno/a6xx: VSC overflow detection/handling

Check VSC_SIZE/VSC_SIZE2 regs from cmdstream to detect overflow, and
skip use of VSC visibility stream when overflow is detected, to avoid
GPU hangs.  This is done w/ introduction of some CP_REG_TEST/
CP_COND_REG_EXEC packet pairs.

In addition, eventually (after a frame or two) detect the condition and
resize the VSC buffers until overflow no longer happens.

Note that this significantly reduces the initial size of the VSC
buffers, backing out a previous hack to make them 16x larger than
what should be typically required (the previous "solution" for
VSC overflow).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno/a6xx: remove USE/IGNORE_VISIBILITY draw patching
Rob Clark [Fri, 26 Jul 2019 16:05:58 +0000 (09:05 -0700)]
freedreno/a6xx: remove USE/IGNORE_VISIBILITY draw patching

Seems this isn't needed anymore on a6xx to control whether visibility
stream is used.  And it would be hard to deal with if it was, for
disabling use of VSC stream in draw pass.  So just remove it and
simplify things.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno/a6xx: cleanup "blit_mem"
Rob Clark [Thu, 25 Jul 2019 23:34:25 +0000 (16:34 -0700)]
freedreno/a6xx: cleanup "blit_mem"

Rename to "control_mem", and switch to using a struct to manage the
layout, rather than just ad-hoc hard-coded offsets.

For recovering from VSC stream overflow, we'll need to add more, but
best to clean it up first.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno: refresh tile debug
Rob Clark [Wed, 24 Jul 2019 21:28:10 +0000 (14:28 -0700)]
freedreno: refresh tile debug

Fix some #ifdef'd bitrot, and get rid of #ifdef so it doesn't bitrot
again.

And add a prints for per-tile state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno: update registers
Rob Clark [Thu, 25 Jul 2019 22:25:22 +0000 (15:25 -0700)]
freedreno: update registers

Pull in some updates of VSC regs

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno/gmem: small cleanup
Rob Clark [Thu, 25 Jul 2019 20:40:02 +0000 (13:40 -0700)]
freedreno/gmem: small cleanup

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno/drm: convert ring_pool to child_pool
Rob Clark [Mon, 29 Jul 2019 18:48:06 +0000 (11:48 -0700)]
freedreno/drm: convert ring_pool to child_pool

Worth another couple percent at driver2

Signed-off-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno/drm: remove idx_lock
Rob Clark [Mon, 29 Jul 2019 17:27:18 +0000 (10:27 -0700)]
freedreno/drm: remove idx_lock

Since it ends up contended, it is a bit of a bottleneck for workloads
with high driver overhead.  Worth nearly +10% at gfxbench driver2.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno/batch: always update last_fence
Rob Clark [Sun, 28 Jul 2019 17:04:25 +0000 (10:04 -0700)]
freedreno/batch: always update last_fence

Not all flush paths come thru fd_context_flush(), so we should also set
last_fence in the batch flush path.  This avoids some no-op flushes just
to get a fence.  For example when pctx->flush_resource() triggers a
flush.

We should probably keep the last_fence update in fd_context_flush() as
well to handle deferred flush case.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agofreedreno: drop unused fd_fence_ref param
Rob Clark [Tue, 30 Jul 2019 15:12:46 +0000 (08:12 -0700)]
freedreno: drop unused fd_fence_ref param

The pscreen param was just there to satisfy pipe_screen::fence_reference

But some of the internal uses passed NULL for screen.  Which is a bit
ugly.  Instead drop the param and add a shim function to plug into the
screen.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agopan/midgard: Print invert modifier
Alyssa Rosenzweig [Fri, 26 Jul 2019 21:59:00 +0000 (14:59 -0700)]
pan/midgard: Print invert modifier

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Flip conditionals
Alyssa Rosenzweig [Fri, 26 Jul 2019 21:25:25 +0000 (14:25 -0700)]
pan/midgard: Flip conditionals

We would like to flip ops to have a constant in the second place to
enable inlining of the constant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Add bitwise src/invert fusing
Alyssa Rosenzweig [Fri, 26 Jul 2019 20:32:54 +0000 (13:32 -0700)]
pan/midgard: Add bitwise src/invert fusing

De Morgan's Laws and some special ops basically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Add .not propagation pass
Alyssa Rosenzweig [Fri, 26 Jul 2019 20:14:55 +0000 (13:14 -0700)]
pan/midgard: Add .not propagation pass

Essentially .pos propagation but for bitwise.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agopan/midgard: Fuse invert into bitwise ops
Alyssa Rosenzweig [Fri, 26 Jul 2019 20:08:54 +0000 (13:08 -0700)]
pan/midgard: Fuse invert into bitwise ops

We use the new invert flag to produce ops like inand.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2 years agofreedreno: a2xx: implement texture tiling
Jonathan Marek [Thu, 1 Aug 2019 18:41:44 +0000 (14:41 -0400)]
freedreno: a2xx: implement texture tiling

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering pass
Jonathan Marek [Thu, 1 Aug 2019 19:52:58 +0000 (15:52 -0400)]
freedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering pass

nir_lower_alu_to_scalar can now be used to only lower certain ops, so we
don't need the custom pass. And we can lower fall_equal/fany_nequal with
lower_vector_cmp instead.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno: a2xx: fix HW binning for batches with >256K vertices
Jonathan Marek [Thu, 1 Aug 2019 19:22:47 +0000 (15:22 -0400)]
freedreno: a2xx: fix HW binning for batches with >256K vertices

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno: a2xx: fix fneg/fabs/fsat opcodes
Jonathan Marek [Thu, 1 Aug 2019 18:43:12 +0000 (14:43 -0400)]
freedreno: a2xx: fix fneg/fabs/fsat opcodes

Previously we would get a fmov with modifiers, but now that mov has no type
these opcodes need to be supported.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno: a2xx: fix order of NIR opts
Jonathan Marek [Thu, 1 Aug 2019 18:38:18 +0000 (14:38 -0400)]
freedreno: a2xx: fix order of NIR opts

int_to_float needs to come after bool_to_float, and lower_to_source_mods
needs to come after both, since they don't deal wih source mods.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno: a2xx: fix non-etc1 cubemaps
Jonathan Marek [Thu, 1 Aug 2019 18:36:41 +0000 (14:36 -0400)]
freedreno: a2xx: fix non-etc1 cubemaps

Not sure how this happened, but apparently all cubemaps need swapped XY.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno: a2xx: fix fast clear not being used for Z24X8 buffers
Jonathan Marek [Thu, 1 Aug 2019 16:50:03 +0000 (12:50 -0400)]
freedreno: a2xx: fix fast clear not being used for Z24X8 buffers

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agofreedreno: align renderonly scanout buffers
Jonathan Marek [Thu, 1 Aug 2019 16:42:33 +0000 (12:42 -0400)]
freedreno: align renderonly scanout buffers

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2 years agogitlab-ci: just build all the tools
Eric Engestrom [Thu, 1 Aug 2019 22:32:34 +0000 (23:32 +0100)]
gitlab-ci: just build all the tools

This line was mistakenly added while there is already a `-D tools=all`
a few lines below.

Fixes: f60defa72d5d20d99e3a ("gitlab-ci: Add a shader-db run using v3d on drm-shim.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2 years agoi965/clear: clear_value better precision
Sergii Romantsov [Fri, 12 Jul 2019 13:46:45 +0000 (16:46 +0300)]
i965/clear: clear_value better precision

Test-case with depth-clear 0.5 and format
MESA_FORMAT_Z24_UNORM_X8_UINT fails due inconsistent
clear-value of 0.4999997.
Maybe its better to improve?

CC: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 0ae9ce0f29ea (i965/clear: Quantize the depth clear value based on the format)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111113
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2 years agoradv: fix image_has_{cmask,fmask}() helpers
Samuel Pitoiset [Fri, 2 Aug 2019 11:55:01 +0000 (13:55 +0200)]
radv: fix image_has_{cmask,fmask}() helpers

The driver should now rely on cmask_offset because CMASK can be
disabled by the driver for some reasons (eg. mipmaps). Apply the
same change for FMASK, although it should be useless.

Fixes: ad1bc8621df ("radv: remove radv_get_image_fmask_info()")
Fixes: 10d08da52c6 ("radv/gfx10: add missing dcc_tile_swizzle tweak")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2 years agoradv: remove radv_get_image_fmask_info()
Samuel Pitoiset [Thu, 1 Aug 2019 15:59:56 +0000 (17:59 +0200)]
radv: remove radv_get_image_fmask_info()

It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2 years agoradv/gfx10: add missing dcc_tile_swizzle tweak
Samuel Pitoiset [Thu, 1 Aug 2019 13:45:11 +0000 (15:45 +0200)]
radv/gfx10: add missing dcc_tile_swizzle tweak

Fixes: c90f46700dd ("radv/gfx10: mask DCC tile swizzle by alignment")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2 years agoradv: remove radv_get_image_cmask_info()
Samuel Pitoiset [Thu, 1 Aug 2019 15:59:55 +0000 (17:59 +0200)]
radv: remove radv_get_image_cmask_info()

It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2 years agoradv: only account for tile_swizzle for color surfaces with DCC
Samuel Pitoiset [Thu, 1 Aug 2019 13:45:10 +0000 (15:45 +0200)]
radv: only account for tile_swizzle for color surfaces with DCC

It's 0 for depth surfaces with TC compat HTILE enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2 years agoradv: Enable VK_KHR_shader_atomic_int64
Bas Nieuwenhuizen [Fri, 2 Aug 2019 00:16:23 +0000 (02:16 +0200)]
radv: Enable VK_KHR_shader_atomic_int64

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2 years agoac/nir: Implement LLVM9 64-bit buffer compare & exchange.
Bas Nieuwenhuizen [Fri, 2 Aug 2019 10:01:34 +0000 (12:01 +0200)]
ac/nir: Implement LLVM9 64-bit buffer compare & exchange.

LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this
extracts the ptr, does a bound check and then uses a cmpxchg LLVM
instruction.

Not ideal, but the earliest release we're going to get a proper
intrinsic is LLVM 10.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2 years agoRevert "ac/nir: handle negate modifier"
Connor Abbott [Fri, 2 Aug 2019 09:14:50 +0000 (11:14 +0200)]
Revert "ac/nir: handle negate modifier"

This reverts commit bfea7e4d2965269bff8f1f6449cb99c312fd7384.