mesa.git
4 years agoaco: add vmem/smem score statistic
Rhys Perry [Wed, 4 Dec 2019 14:41:18 +0000 (14:41 +0000)]
aco: add vmem/smem score statistic

This isn't perfect (for example, changes might not be too meaningful when
comparing shaders with different control flow) but it should be useful for
evaluating scheduler changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2965>

4 years agoaco: add various compiler statistics
Rhys Perry [Wed, 4 Dec 2019 15:19:56 +0000 (15:19 +0000)]
aco: add various compiler statistics

Adds these statistics:
- hash of code and constant data
- number of instructions
- number of copies from pseudo-instructions
- number of branches
- estimate of cycles spent not waiting in s_waitcnt
- number of vmem/smem "clauses"
- sgpr/vgpr usage before scheduling

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2965>

4 years agoradv: add code for exposing compiler statistics
Rhys Perry [Wed, 4 Dec 2019 14:46:31 +0000 (14:46 +0000)]
radv: add code for exposing compiler statistics

Statistics will be added to ACO in later commits.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2965>

4 years agoEGL: Add eglSetDamageRegionKHR to GLVND dispatch list
Daniel Stone [Wed, 1 Apr 2020 11:43:51 +0000 (12:43 +0100)]
EGL: Add eglSetDamageRegionKHR to GLVND dispatch list

This was missed in the original conversion, which added support for
eglSetDamageRegionKHR to local EGL exports, but forgot to generate
updated dispatch for GLVND.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Fixes: 9827547313c7 ("egl/android: support for EGL_KHR_partial_update")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4403>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4403>

4 years agodocs: update calendar, add news item, and link releases notes for 20.0.4
Eric Engestrom [Fri, 3 Apr 2020 11:12:10 +0000 (13:12 +0200)]
docs: update calendar, add news item, and link releases notes for 20.0.4

Note that the next 20.0.x releases numbers have been shifted as this was
not one of the planned releases.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4428>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4428>

4 years agodocs/relnotes: add sha256sum for 20.0.4
Eric Engestrom [Fri, 3 Apr 2020 10:28:20 +0000 (12:28 +0200)]
docs/relnotes: add sha256sum for 20.0.4

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4428>

4 years agodocs: add release notes for 20.0.4
Eric Engestrom [Fri, 3 Apr 2020 09:24:56 +0000 (11:24 +0200)]
docs: add release notes for 20.0.4

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4428>

4 years agoutil/xmlconfig: fix sha1 comparison code
Pierre-Eric Pelloux-Prayer [Fri, 3 Apr 2020 07:25:05 +0000 (09:25 +0200)]
util/xmlconfig: fix sha1 comparison code

Fixes: 8f48e7b1e99 ("util/xmlconfig: add new sha1 application attribute")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2730
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4426>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4426>

4 years agoradv/llvm: enable 16-bit storage features on GFX6-GFX7
Samuel Pitoiset [Wed, 29 Jan 2020 13:40:17 +0000 (14:40 +0100)]
radv/llvm: enable 16-bit storage features on GFX6-GFX7

Should allow to play Doom Eternal on GFX6-GFX7 because the
driver now supports storageBuffer16BitAccess.

It's now supported and all CTS tests pass.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/857
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>

4 years agoac/nir: split 16-bit SSBO stores on GFX6
Samuel Pitoiset [Thu, 26 Mar 2020 13:14:45 +0000 (14:14 +0100)]
ac/nir: split 16-bit SSBO stores on GFX6

Due to possible alignment issues, make sure to split stores of
16-bit vectors.

Doom Eternal requires storageBuffer16BitAccess.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>

4 years agoac/nir: split 16-bit load/store to global memory on GFX6
Samuel Pitoiset [Thu, 26 Mar 2020 13:14:27 +0000 (14:14 +0100)]
ac/nir: split 16-bit load/store to global memory on GFX6

Due to possible alignment issues, make sure to split loads/stores
of 16-bit vectors.

Doom Eternal requires storageBuffer16BitAccess.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>

4 years agoradv/llvm: enable 8-bit storage features on GFX6-GFX7
Samuel Pitoiset [Wed, 29 Jan 2020 09:45:40 +0000 (10:45 +0100)]
radv/llvm: enable 8-bit storage features on GFX6-GFX7

It's now supported and all CTS tests pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>

4 years agoac/nir: split 8-bit SSBO stores on GFX6
Samuel Pitoiset [Wed, 29 Jan 2020 13:38:55 +0000 (14:38 +0100)]
ac/nir: split 8-bit SSBO stores on GFX6

Due to possible alignment issues, make sure to split stores of
8-bit vectors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>

4 years agoac/nir: split 8-bit load/store to global memory on GFX6
Samuel Pitoiset [Wed, 29 Jan 2020 13:37:49 +0000 (14:37 +0100)]
ac/nir: split 8-bit load/store to global memory on GFX6

Due to possible alignment issues, make sure to split loads/stores
of 8-bit vectors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>

4 years agoaco: always optimize v_mad to v_madak in presence of literals
Samuel Pitoiset [Wed, 1 Apr 2020 16:09:43 +0000 (18:09 +0200)]
aco: always optimize v_mad to v_madak in presence of literals

v_mad and v_madak are both 64-bit instructions, so it doesn't
increase code size to always apply a 32-bit literal instead of
using v_mad and a sgpr which contains that literal.

Found with some Youngblood shaders but help some other games.

vkpipeline-db (VEGA10):
Totals from affected shaders:
SGPRS: 46168 -> 46016 (-0.33 %)
VGPRS: 45576 -> 45564 (-0.03 %)
Code Size: 5187208 -> 5179584 (-0.15 %) bytes
Max Waves: 3297 -> 3297 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410>

4 years agoglsl/lower_precision: Use vector.back() instead of vector.end()[-1]
Neil Roberts [Thu, 2 Apr 2020 14:25:18 +0000 (16:25 +0200)]
glsl/lower_precision: Use vector.back() instead of vector.end()[-1]

The use of vector.end()[-1] seems to generate warnings in Coverity about
not allowing a negative argument to a parameter. The intention with the
code snippet is just to access the last element of the vector. The
vector.back() call acheives the same thing, is clearer and will
hopefully fix the Coverity warning.

I’m not exactly sure why Coverity thinks the array index can’t be
negative. cplusplus.com says that vector::end() returns a random access
iterator and that the type of the array index operator argument to that
should be the difference type for the container. It then also says that
difference_type for a vector is "a signed integral type".

Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agoclover: fix build with single library clang build
Karol Herbst [Thu, 2 Apr 2020 11:00:14 +0000 (13:00 +0200)]
clover: fix build with single library clang build

Closes: #2560
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4417>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4417>

4 years agoradv: Filter extensions not whitelisted for Android
Drew Davenport [Tue, 10 Mar 2020 20:14:33 +0000 (14:14 -0600)]
radv: Filter extensions not whitelisted for Android

Android enforces through CTS a whitelist of Vulkan extensions that are
allowed in each Android version. When building radv for Android, disable
extensions that are unknown to the version of Android for which
radv is being built.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4398>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4398>

4 years agost/vdpau: make query test for 2D support
Ilia Mirkin [Sat, 7 Mar 2020 23:49:01 +0000 (18:49 -0500)]
st/vdpau: make query test for 2D support

The 3D check has been there since the dawn of time, but I see no reason
for it, most likely a typo. When the surfaces are actually created, they
use the 2D resource type (as expected).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4108>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4108>

4 years agost/vdpau: avoid asserting on new VDP_YCBCR_* formats
Ilia Mirkin [Sat, 7 Mar 2020 21:18:26 +0000 (16:18 -0500)]
st/vdpau: avoid asserting on new VDP_YCBCR_* formats

Depending on user's vdpau headers, not all of those defines may exist.
Eventually we may want a private copy of these, but this is simple
enough for now.

Fixes asserts when running vdpauinfo which supports these recently added
formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4108>

4 years agonir/from_ssa: Only chain movs when a src is also a dest
Jason Ekstrand [Wed, 1 Apr 2020 20:05:18 +0000 (15:05 -0500)]
nir/from_ssa: Only chain movs when a src is also a dest

The algorithm we use for resolving parallel copy instructions plays this
little shell game with the values.  The reason for this is that it lets
us handle cases where, for instance we have a -> b and b -> a and we
need to use a temporary to do a swap.  One result of this algorithm is
that it tends to emit a lot of mov chains which are typcially really bad
for GPUs where a mov is far from free.  For instance, it's likely to
turn this:

    r16 = ssa_0; r17 = ssa_0; r18 = ssa_0; r15 = ssa_0

into this:

    r15 = mov ssa_0
    r18 = mov r15
    r17 = mov r18
    r16 = mov r17

which, if it's the only thing in a block (this is common for phis) is
impossible for a scheduler to fix because of the dependencies and you
end up with significant stalling.  If, on the other hand, we only do the
chaining in the actual case where we need to free up a so that it can be
used as a destination, we can emit this:

    r15 = mov ssa_0
    r18 = mov ssa_0
    r17 = mov ssa_0
    r16 = mov ssa_0

which is far nicer to the scheduler.  On Intel, our copy propagation
pass will undo the chain for us so this has no shader-db impact.
However, for less intelligent back-ends, it's probably a lot better.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4412>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4412>

4 years agofreedreno: Rename RB_DONE_TS
Connor Abbott [Thu, 5 Mar 2020 16:35:55 +0000 (17:35 +0100)]
freedreno: Rename RB_DONE_TS

This makes the various cache_flush implementations make more sense.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4065>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4065>

4 years agofreedreno: Cleanup event names
Connor Abbott [Thu, 5 Mar 2020 16:10:47 +0000 (17:10 +0100)]
freedreno: Cleanup event names

It turns out that every *_TS event, i.e. every event which requires a
seqno pointer, also allows generating an interrupt in the kernel, at
least since a3xx. And furthermore these interrupts are named by the kgsl
kernel driver and already in envytools. Therefore it's possible to map
out what the *_TS events are with 100% certainty, given access to the
hardware, by sending a CP_EVENT_WRITE with bit 31 set, unmasking all
interrupts in the kernel, and logging which ones get hit. I've done this
for a6xx, and I've also looked at the a5xx firmware, and the list of TS
interrupts is the same as a6xx, so I have a pretty good idea of what the
a5xx events are. I also fixed a few related things along the way:

- VIZQUERY_END overlaps with WT_DONE_TS, but VIZQUERY_START was also a
mess, with neither VIZQUERY_START nor HLSQ_FLUSH using variants. I added
what seems like reasonable variants, based on the existing comment
and the fact that HLSQ_FLUSH is only used in Mesa with a3xx and a4xx.
- CACHE_FLUSH_AND_INVALIDATE seems to come straight from R600, and I
have no idea if it's actually valid with a2xx, but given that RB_DONE_TS
exists in the interrupt mask since a3xx, I guessed that RB_DONE_TS
hasn't changed position since then and put it down as a3xx+ and limited
CACHE_FLUSH_AND_INVALIDATE to a2xx. Someone with the relevant hardware
should be able to confirm.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4065>

4 years agogallivm: fix stream id fetch
Roland Scheidegger [Thu, 2 Apr 2020 05:09:50 +0000 (07:09 +0200)]
gallivm: fix stream id fetch

Fetching the stream id directly can crash since bld->immediates may not
exist (if there's too many immediates or we use the array due to indirect
accesses). So just call emit_fetch_immediate instead.

v2: fix the swizzle

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4416>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4416>

4 years agogallivm: switch the mask6/mask7 cases for signed rgtc formats
Roland Scheidegger [Thu, 2 Apr 2020 02:19:51 +0000 (04:19 +0200)]
gallivm: switch the mask6/mask7 cases for signed rgtc formats

This fixes some regressions where -1.0/1.0 results got flipped, but it's still
broken in some cases.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4416>

4 years agogallivm: fix rgtc2 format
Roland Scheidegger [Thu, 2 Apr 2020 00:51:03 +0000 (02:51 +0200)]
gallivm: fix rgtc2 format

In some cases, there can be garbage in the upper bits after the channel
decode - for dxt5 this didn't matter (as the upper bits are shifted out
anyway) but for rgtc2 formats it does.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4416>

4 years agoanv/image: Use align_u64 for image offsets
Jason Ekstrand [Wed, 1 Apr 2020 22:24:10 +0000 (17:24 -0500)]
anv/image: Use align_u64 for image offsets

The ALIGN functions in util/u_math.h work on uintptr_t whose size
changes depending on your platform.  Use ones which take an explicit
64-bit type instead to avoid 32-bit platform issues.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>

4 years agogallium: enable EGL_EXT_image_dma_buf_import_modifiers unconditionally
Adam Jackson [Tue, 24 Mar 2020 09:55:26 +0000 (10:55 +0100)]
gallium: enable EGL_EXT_image_dma_buf_import_modifiers unconditionally

This is a re-do of [1].

Enable EGL_EXT_image_dma_buf_import_modifiers with
EXT_image_dma_buf_import. This allows users to use queryDmaBufFormats to
query the list of supported formats even if modifiers are not supported.

With this change, queryDmaBufModifiers always returns zero modifiers. A
compositor survey reveals that this should be fine: wlroots [2],
Weston [3], Mutter [4] [5], kwin [6] and xorg-xserver [7] seem to all
support this case gracefully.

Tested with Sway and wlroots by running weston-info and checking the
list of formats advertised by zwp_linux_dmabuf_v1. Also ran weston-simple-egl
and checked zwp_linux_dmabuf_v1 was used instead of wl_drm.

[1]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1812
[2]: https://github.com/swaywm/wlroots/blob/8707a9b7ecbba0321804604d9ea954a46ecced21/render/egl.c#L629
[3]: https://gitlab.freedesktop.org/wayland/weston/-/blob/786490cb53439624fd3c20b9e19d3ea5ec316c00/libweston/renderer-gl/gl-renderer.c#L2337
[4]: https://gitlab.gnome.org/GNOME/mutter/-/blob/f0df07cba3ca308b47c9aefcc8112e8880fd9950/src/wayland/meta-wayland-dma-buf.c#L486
[5]: https://gitlab.gnome.org/GNOME/mutter/-/blob/0a6034ef3a745c25ab63c2ca8d4ae08bc5e09d88/src/backends/native/meta-renderer-native.c#L399
[6]: https://cgit.kde.org/kwin.git/tree/platformsupport/scenes/opengl/egl_dmabuf.cpp?id=9b7ab4d16a8ee0cb35108362ee5aa046f4ae20b7#n473
[7]: https://gitlab.freedesktop.org/xorg/xserver/-/blob/26004df63c25061586a967f3586795a75280acc2/glamor/glamor_egl.c#L682

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4298>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4298>

4 years agodriconf: whilelist more games for glthread
Marek Olšák [Wed, 1 Apr 2020 10:22:03 +0000 (06:22 -0400)]
driconf: whilelist more games for glthread

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4402>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4402>

4 years agotracie: Switch to using shutil.move for cross filesystem moves
Rohan Garg [Mon, 30 Mar 2020 17:12:00 +0000 (19:12 +0200)]
tracie: Switch to using shutil.move for cross filesystem moves

When running tracie in a docker container, renaming files from
inside the container to a bind-mounted folder on the host causes
a invalid cross-device link due to os.rename limitations.

Switching to shutil allows us to overcome this.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4377>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4377>

4 years agowgl: do not create screen from DllMain
Erik Faye-Lund [Sun, 26 May 2019 08:42:58 +0000 (10:42 +0200)]
wgl: do not create screen from DllMain

There's a lot of operations that aren't allowed from DllMain, so we
shouldn't create a driver-screen from there. So let's instead delay this
until it's needed from a normal function call.

See https://docs.microsoft.com/en-us/windows/win32/dlls/dllmain for
details about what is allowed and isn't from DllMain.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4307>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4307>

4 years agowgl: move screen-init to a helper
Erik Faye-Lund [Sun, 26 May 2019 08:42:51 +0000 (10:42 +0200)]
wgl: move screen-init to a helper

This will be useful in the next commit.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4307>

4 years agowgl: drop unused member
Erik Faye-Lund [Sun, 26 May 2019 08:42:39 +0000 (10:42 +0200)]
wgl: drop unused member

While we're at it, drop trying to re-calculate the max-size from the
max-level. It's not accurate on any drivers where the max-size isn't a
power of two anyway.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4307>

4 years agowgl: drop pointless debug_printf
Erik Faye-Lund [Sun, 26 May 2019 08:42:32 +0000 (10:42 +0200)]
wgl: drop pointless debug_printf

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4307>

4 years agoradeonsi: dump shader stats when hitting the live cache
Pierre-Eric Pelloux-Prayer [Fri, 27 Mar 2020 12:45:08 +0000 (13:45 +0100)]
radeonsi: dump shader stats when hitting the live cache

With the introduction of the live shader cache, when a shader is
fetched from the cache no stats are printed for shaderdb.
So in a sequence like this: vs1, fs1, vs1, fs2, shaderdb may see
3 or 4 lines, depending on the threads being used.
If one run produces 3 lines while the other produces 4 lines, it
would compare vs1 stats with fs2 stats.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4355>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4355>

4 years agogallium/util: let shader live cache users know if a hit occured
Pierre-Eric Pelloux-Prayer [Wed, 1 Apr 2020 08:47:14 +0000 (10:47 +0200)]
gallium/util: let shader live cache users know if a hit occured

This will be used in next commit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4355>

4 years agoglsl_to_nir: remove dead code
Timothy Arceri [Thu, 2 Apr 2020 00:33:48 +0000 (11:33 +1100)]
glsl_to_nir: remove dead code

This code was made unused by the changes described in be2990d8fbcd.

NIR based Gallium drivers switched to the NIR based lowering in
efa4fc0ebd96.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4415>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4415>

4 years agoanv/pipeline: allow more than 16 FS inputs
Juan A. Suarez Romero [Tue, 31 Mar 2020 10:59:20 +0000 (10:59 +0000)]
anv/pipeline: allow more than 16 FS inputs

A fragment shader can have more than 16 inputs, so SBE emission should
deal with all of them.

This fixes dEQP-VK.pipeline.max_varyings.*

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>

4 years agointel/compiler: store the FS inputs in WM prog data
Juan A. Suarez Romero [Tue, 31 Mar 2020 10:45:26 +0000 (10:45 +0000)]
intel/compiler: store the FS inputs in WM prog data

Store the fragment shader inputs in the program data so we can use them
later when required without needing the NIR shader.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>

4 years agoanv: use urb_setup_attribs in SBE
Juan A. Suarez Romero [Wed, 11 Mar 2020 10:35:13 +0000 (10:35 +0000)]
anv: use urb_setup_attribs in SBE

Avoid looping over all VARYING_SLOT_MAX urb_setup arrray entries.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>

4 years agodocs: update calendar, add news item, and link releases notes for 20.0.3
Eric Engestrom [Wed, 1 Apr 2020 22:04:19 +0000 (00:04 +0200)]
docs: update calendar, add news item, and link releases notes for 20.0.3

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4413>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4413>

4 years agodocs/relnotes: add sha256sum for 20.0.3
Eric Engestrom [Wed, 1 Apr 2020 21:40:37 +0000 (23:40 +0200)]
docs/relnotes: add sha256sum for 20.0.3

(cherry picked from commit a68048153260fe33f2ec5df48f772f4d1ceaed03)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4413>

4 years agodocs: add release notes for 20.0.3
Eric Engestrom [Wed, 1 Apr 2020 21:24:57 +0000 (23:24 +0200)]
docs: add release notes for 20.0.3

(cherry picked from commit b04ae1f964c977035d9c8fd4144424387e0d868e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4413>

4 years agogallium/llvmpipe: add an optimised 32-bit memset
Dave Airlie [Tue, 31 Mar 2020 22:44:08 +0000 (08:44 +1000)]
gallium/llvmpipe: add an optimised 32-bit memset

This might have other users beyond filling/clearing buffers,

increase a fullscreen 4k gears from 68->74 fps on my Ryzen
since gears is really just a clear benchmark, and this helps
clearing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4394>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4394>

4 years agonir: place aligned members after bitfields in shader_info.tess
Mark Janes [Tue, 31 Mar 2020 23:41:28 +0000 (16:41 -0700)]
nir: place aligned members after bitfields in shader_info.tess

The placement of new shader_info.tess members unnecessarily wastes
space by interspersing 64bit members between bitfields.

Fixes: f1dd81ae104 ("nir: Collect if shader uses cross-invocation or indirect I/O.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>

4 years agonir: check shader type before writing to shaderinfo.tess union
Mark Janes [Tue, 31 Mar 2020 23:40:57 +0000 (16:40 -0700)]
nir: check shader type before writing to shaderinfo.tess union

If the shader is not a tesselation shader, then writing to the tess
member of the shaderinfo union will overwrite other members and crash.

Closes: #2722
Fixes: f1dd81ae104 ("nir: Collect if shader uses cross-invocation or indirect I/O.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>

4 years agoanv: Do not sample from 3d depth image with HiZ
Danylo Piliaiev [Wed, 1 Apr 2020 15:41:03 +0000 (18:41 +0300)]
anv: Do not sample from 3d depth image with HiZ

For Gen8-11, there are some restrictions around sampling from HiZ.
The Skylake PRM docs for RENDER_SURFACE_STATE::AuxiliarySurfaceMode
say:

    "If this field is set to AUX_HIZ, Number of Multisamples must
    be MULTISAMPLECOUNT_1, and Surface Type cannot be SURFTYPE_3D."

Fixes: dEQP-VK.geometry.layered.3d.*.readback
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2720
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4409>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4409>

4 years agogallium/swr: Re-enable scratch space for client-memory buffers
Krzysztof Raszkowski [Wed, 1 Apr 2020 15:02:06 +0000 (17:02 +0200)]
gallium/swr: Re-enable scratch space for client-memory buffers

Commit 7d33203b446cdfa11c2aaea18caf05b120a16283 fixed race condition
in freeing scratch memory mechanism but that approach creates
performance regression in some cases. This change revert previous
changes and fix freeing scratch memory mechanism.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4406>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4406>

4 years agogallium/swr: Fix array stride problem.
Krzysztof Raszkowski [Wed, 1 Apr 2020 14:57:20 +0000 (16:57 +0200)]
gallium/swr: Fix array stride problem.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4405>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4405>

4 years agoci: Consistently use -j4 across x86 build jobs and -j8 on ARM.
Eric Anholt [Tue, 11 Feb 2020 23:44:56 +0000 (15:44 -0800)]
ci: Consistently use -j4 across x86 build jobs and -j8 on ARM.

Our shared runners are set up for concurrent jobs ~= CPUs / 4 (x86) or 8
(ARM).  If you use more build processes than that, then jobs may be
fighting each other for shared system resources, possibly to the point of
failure (we've seen one of the runners OOM on some jobs before, though I'm
not sure if this was the cause).

To try to systematically prevent the problem, we make a ninja wrapper in
the containers that passes the -j flags, and set MAKEFLAGS in the
container builds.  This doesn't cover make in non-container builds, but I
believe we don't have any of those.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3782>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3782>

4 years agoaco: only break SMEM clauses if XNACK is enabled (mostly APUs)
Samuel Pitoiset [Fri, 27 Mar 2020 14:16:39 +0000 (15:16 +0100)]
aco: only break SMEM clauses if XNACK is enabled (mostly APUs)

According to LLVM, it seems only required for APUs like RAVEN, but
we still ensure that SMEM stores are in their own clause.

pipeline-db (VEGA10):
Totals from affected shaders:
SGPRS: 1775364 -> 1775364 (0.00 %)
VGPRS: 1287176 -> 1287176 (0.00 %)
Spilled SGPRs: 725 -> 725 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 65386620 -> 65107460 (-0.43 %) bytes
Max Waves: 287099 -> 287099 (0.00 %)

pipeline-db (POLARIS10):
Totals from affected shaders:
SGPRS: 1797743 -> 1797743 (0.00 %)
VGPRS: 1271108 -> 1271108 (0.00 %)
Spilled SGPRs: 730 -> 730 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 64046244 -> 63782324 (-0.41 %) bytes
Max Waves: 254875 -> 254875 (0.00 %)

This only affects GFX6-GFX9 chips because the compiler uses a
different pass for GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4349>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4349>

4 years agoRevert "spirv: Implement OpCopyObject and OpCopyLogical as blind copies"
Jason Ekstrand [Wed, 1 Apr 2020 17:40:34 +0000 (12:40 -0500)]
Revert "spirv: Implement OpCopyObject and OpCopyLogical as blind copies"

This reverts commit 7a53e67816ed9baf7d825ed60ee59f0c05f9df48.

4 years agoloader: fallback to kernel name, if PCI fails
Emil Velikov [Thu, 5 Mar 2020 14:41:25 +0000 (14:41 +0000)]
loader: fallback to kernel name, if PCI fails

Currently, if the PCI machinery fails, we return a NULL driver name.
In the past this has resulted in various workarounds.

To avoid those, fallback to loader_get_kernel_driver_name(). It's not
perfect, yet perfectly reasonable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agoloader: move "using driver..." message to loader_get_kernel_driver_name
Emil Velikov [Thu, 5 Mar 2020 14:30:51 +0000 (14:30 +0000)]
loader: move "using driver..." message to  loader_get_kernel_driver_name

Move the message to the function which fetches the name.

While here use the same DEBUG/WARNING approach like in the PCI case. The
current method spam a tad much, plus isn't consistent.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agoloader: simplify codeflow in drm_get_pci_id_for_fd
Emil Velikov [Thu, 5 Mar 2020 13:50:46 +0000 (13:50 +0000)]
loader: simplify codeflow in drm_get_pci_id_for_fd

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agoloader: simplify loader_get_user_preferred_fd()
Emil Velikov [Tue, 19 Feb 2019 15:30:40 +0000 (15:30 +0000)]
loader: simplify loader_get_user_preferred_fd()

Reoder the function a bit to make the code-flow more obvious and short.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agoloader: use a maximum of 64 drmDevices
Emil Velikov [Tue, 19 Feb 2019 15:30:38 +0000 (15:30 +0000)]
loader: use a maximum of 64 drmDevices

Currently that's the hard-coded maximum in the kernel, even though the
libdrm API allows for more. Latter is done with extendability in mind.

Allocate 64 pointers^Wdevices on stack for now. Making for shorter and
ever-so-slightly faster code.

v2: Use single MAX_DRM_DEVICES #define (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agoRevert "egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure"
Emil Velikov [Thu, 5 Mar 2020 13:05:36 +0000 (13:05 +0000)]
Revert "egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure"

This reverts commit 1b87f4058de84d7a0bb4ead0c4f4b024d4cce8fb.

dlclose() of the handle is perfectly reasonable, a follow-up NULL
assignment is missing.

As-is this causes a leak for nearly every platform, since they call
dri2_load_driver* initially, followed by a second swrast fallback call.

Some platforms even loop through the existing drivers probing.

Revert the commit and add the NULL check.

Fixes: 1b87f4058de ("egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agoegl/drm: reinstate (kms_)swrast support
Emil Velikov [Wed, 4 Mar 2020 18:33:09 +0000 (18:33 +0000)]
egl/drm: reinstate (kms_)swrast support

With earlier commit we've added a generic LIBGL_ALWAYS_SOFTWARE handling
yet did not consider that the existing codebase unconditionally errors
out when set. That was fixed with a latter commit, while the fix itself
added erroneous restriction for egl/drm.

As mentioned in the report - the feature was working for ages. It was a
Gnome developer who added kms_swrast support for gbm in the first place.

Admittedly kms_swrast is somewhat in the middle between traditional
swrast and HW drivers, regardless - reinstate support.

Fixes: 47273d7312c ("egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/165
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agoglx: set the loader_logger early and for everyone
Emil Velikov [Wed, 4 Mar 2020 17:52:04 +0000 (17:52 +0000)]
glx: set the loader_logger early and for everyone

Currently we set the logger only for DRI3. Even though it's used nearly
everywhere. For platforms where we don't the function is effectively a
no-op.

With this in place, LIBGL_DEBUG=verbose works across the board.

Fixes: d971a4230d5 ("loader: Factor out the common driver opening logic from each loader.")
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agomeson: glx: drop with_glx == dri check
Emil Velikov [Wed, 4 Mar 2020 17:46:54 +0000 (17:46 +0000)]
meson: glx: drop with_glx == dri check

We can get into src/glx only with with_glx == dri. Thus there's no point
in the secondary, nested, check - it's always true.

Cc: Dylan Baker <dylan@pnwbakers>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>

4 years agomesa/main: remove unused macro
Erik Faye-Lund [Mon, 25 Feb 2019 12:30:30 +0000 (13:30 +0100)]
mesa/main: remove unused macro

This macro is no longer used, so let's get rid of it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_TEXTURE_EXTERNAL
Erik Faye-Lund [Mon, 25 Feb 2019 12:20:35 +0000 (13:20 +0100)]
mesa/main: clean up extension-check for GL_TEXTURE_EXTERNAL

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_RASTERIZER_DISCARD
Erik Faye-Lund [Mon, 25 Feb 2019 12:01:58 +0000 (13:01 +0100)]
mesa/main: clean up extension-check for GL_RASTERIZER_DISCARD

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_TEXTURE_CUBE_MAP_SEAMLESS
Erik Faye-Lund [Mon, 25 Feb 2019 12:00:47 +0000 (13:00 +0100)]
mesa/main: clean up extension-check for GL_TEXTURE_CUBE_MAP_SEAMLESS

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_FRAGMENT_SHADER_ATI
Erik Faye-Lund [Mon, 25 Feb 2019 12:00:07 +0000 (13:00 +0100)]
mesa/main: clean up extension-check for GL_FRAGMENT_SHADER_ATI

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for AMD_depth_clamp_separate
Erik Faye-Lund [Mon, 25 Feb 2019 11:58:35 +0000 (12:58 +0100)]
mesa/main: clean up extension-check for AMD_depth_clamp_separate

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_DEPTH_BOUNDS_TEST
Erik Faye-Lund [Mon, 25 Feb 2019 11:57:22 +0000 (12:57 +0100)]
mesa/main: clean up extension-check for GL_DEPTH_BOUNDS_TEST

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_STENCIL_TEST_TWO_SIDE
Erik Faye-Lund [Mon, 25 Feb 2019 11:54:13 +0000 (12:54 +0100)]
mesa/main: clean up extension-check for GL_STENCIL_TEST_TWO_SIDE

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_TEXTURE_RECTANGLE
Erik Faye-Lund [Mon, 25 Feb 2019 11:52:37 +0000 (12:52 +0100)]
mesa/main: clean up extension-check for GL_TEXTURE_RECTANGLE

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_VERTEX_PROGRAM_POINT_SIZE
Erik Faye-Lund [Mon, 25 Feb 2019 11:50:08 +0000 (12:50 +0100)]
mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_POINT_SIZE

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_VERTEX_PROGRAM_TWO_SIDE
Erik Faye-Lund [Mon, 25 Feb 2019 11:44:55 +0000 (12:44 +0100)]
mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_TWO_SIDE

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean up extension-check for GL_VERTEX_PROGRAM
Erik Faye-Lund [Mon, 25 Feb 2019 11:25:04 +0000 (12:25 +0100)]
mesa/main: clean up extension-check for GL_VERTEX_PROGRAM

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: clean-up extension-checks for point-sprites
Erik Faye-Lund [Mon, 25 Feb 2019 10:52:59 +0000 (11:52 +0100)]
mesa/main: clean-up extension-checks for point-sprites

This is the only user of the CHECK_EXTENSION2 macro, so let's remove
that while we're at it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agomesa/main: correct extension-checks for GL_BLACKHOLE_RENDER_INTEL
Erik Faye-Lund [Wed, 19 Feb 2020 11:12:27 +0000 (12:12 +0100)]
mesa/main: correct extension-checks for GL_BLACKHOLE_RENDER_INTEL

KHR_blend_equation_advanced_coherent isn't exposed on OpenGL ES 1.x
nor OpenGL versions prior to 30, so we shouldn't allow to query its
enum-states there either.

Fixes: 74ec39f66d5 ("mesa: add INTEL_blackhole_render")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/329>

4 years agoloader: Warn when we fail to open a device node due to permissions.
Eric Anholt [Wed, 5 Feb 2020 01:31:28 +0000 (17:31 -0800)]
loader: Warn when we fail to open a device node due to permissions.

This is definitely not the first time I've debugged why I'm getting swrast
on a device only to find out I'm not a member of the render node's group.

This does mean that you'll get a warning print even without EGL_LOG_LEVEL
set.  This may be an issue if we expect people outside of the DRI node's
group to actually be using swrast instead of getting their permissions
fixed.  Right now surfaceless throws a "libEGL warning: No hardware driver
found, falling back to software rendering" in that case anyway, so this is
just more informative.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3703>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3703>

4 years agosvga: Treat forced coherent maps as maps of persistent memory
Thomas Hellstrom [Tue, 31 Mar 2020 07:10:17 +0000 (09:10 +0200)]
svga: Treat forced coherent maps as maps of persistent memory

A previous commit made sure we sent a BindGBSurface command at map time
rather than at unmap time for persistent memory. To be consistent, do the
same for forced coherent maps. This makes it possible to avoid the
explicit UpdateGBSurface at unmap time for discard maps and to instead rely
on the kernel's dirty-tracking mechanism at the cost of an additional flush.

Tested with SVGA_FORCE_COHERENT=1, piglit run quick. No regressions.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4399>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4399>

4 years agosvga, winsys/svga: Fix persistent memory discard maps
Thomas Hellstrom [Sat, 28 Mar 2020 17:02:25 +0000 (18:02 +0100)]
svga, winsys/svga: Fix persistent memory discard maps

The kernel driver requires immediate notification using a
BindGBSurface command when a graphics coherent memory resource changes
backing MOB, so that it can start dirty-tracking the new MOB.
Since we always use graphics coherent memory for persistent memory, enqueue
and flush a BindGBSurface commmand at map time rather than at unmap time.
Since we're dealing with persistent memory, It's OK to flush while mapped.

This fixes an issue with gnome-shell / Wayland which uses persistent
memory together with discard maps when we advertise ARB_buffer_storage.
XWayland clients will render incorrectly.

Fixes: 71b43490dd ("svga: Support ARB_buffer_storage")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4399>

4 years agopan/bi: Fix outmod/roundmode flip
Alyssa Rosenzweig [Wed, 1 Apr 2020 02:18:03 +0000 (22:18 -0400)]
pan/bi: Fix outmod/roundmode flip

I misread the disassembler, the fields are in the other order.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bi: Handle fmov class ops
Alyssa Rosenzweig [Wed, 1 Apr 2020 02:16:55 +0000 (22:16 -0400)]
pan/bi: Handle fmov class ops

We need to lower them to something reasonable (ideally, the modifier
would be attached but we need to do something for the case it's not). We
specifically have to lower pre-sched as well, but we can do the lower
literally at schedule time for now (if this proves annoying, we can move
it earlier, but I want to leave room for modifier-aware copyprop should
that prove interesting).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bi: Fix unused port swapping
Alyssa Rosenzweig [Tue, 31 Mar 2020 17:37:12 +0000 (13:37 -0400)]
pan/bi: Fix unused port swapping

Fixes INSTR_INVALID_ENC

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bi: Add cmdline option for verbose disassembly
Alyssa Rosenzweig [Tue, 31 Mar 2020 17:08:16 +0000 (13:08 -0400)]
pan/bi: Add cmdline option for verbose disassembly

Useful for debugging packing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bi: Don't set the back-to-back bit yet
Alyssa Rosenzweig [Tue, 31 Mar 2020 17:05:02 +0000 (13:05 -0400)]
pan/bi: Don't set the back-to-back bit yet

This is bad for performance but we can't assume it's true without some
analysis, which we presently don't do. Leave it for future work and
don't break the present.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bi: Use STAGE srcs for scheduler nops
Alyssa Rosenzweig [Tue, 31 Mar 2020 17:04:18 +0000 (13:04 -0400)]
pan/bi: Use STAGE srcs for scheduler nops

..rather than using port 0 for the source, which may or may not actually
exist.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bi: Fix writes_component for VECTOR
Alyssa Rosenzweig [Tue, 31 Mar 2020 16:55:31 +0000 (12:55 -0400)]
pan/bi: Fix writes_component for VECTOR

I'm not convinced this is the best way and it's sort of a hack, but it
fixes RA for st_vary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bit: Wire through I/O
Alyssa Rosenzweig [Tue, 31 Mar 2020 16:20:18 +0000 (12:20 -0400)]
pan/bit: Wire through I/O

We'd like to wire in attributes and uniforms as inputs and look at the
varying as output for automatic testing on-device, building up a test
framework for us.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agopan/bit: Add `run` mode to the cmdline
Alyssa Rosenzweig [Tue, 31 Mar 2020 01:21:49 +0000 (21:21 -0400)]
pan/bit: Add `run` mode to the cmdline

This emulates the functionality of shader_runner (built for kbase) using
the bifrost testing infrastructure so it runs on mainline. Ideally this
will let us test shaders from the assembler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>

4 years agoappveyor: Remove Meson job.
Jose Fonseca [Sat, 28 Mar 2020 10:42:58 +0000 (10:42 +0000)]
appveyor: Remove Meson job.

Appveyor Meson fails misteriously some times, e.g.,
- https://ci.appveyor.com/project/mesa3d/mesa/builds/31780753/job/w8b28iahboxq4na2
- https://ci.appveyor.com/project/mesa3d/mesa/builds/31857376
and now that we have msvc coverage on gitlab ci this is no longer necessary.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4392>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4392>

4 years agofreedreno/log: fix build error
Rob Clark [Tue, 31 Mar 2020 15:31:00 +0000 (08:31 -0700)]
freedreno/log: fix build error

It seems some versions of gcc are less clever about const initializers:

```
../src/gallium/drivers/freedreno/freedreno_log.c:58:33: error: initializer element is not constant
 const unsigned msgs_per_chunk = bo_size / sizeof(uint64_t);
                                 ^~~~~~~
```

See https://gitlab.freedesktop.org/mesa/mesa/-/issues/2713

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4390>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4390>

4 years agonir/algebraic: Remove a redundant fabs pattern
Ian Romanick [Tue, 24 Mar 2020 18:18:02 +0000 (11:18 -0700)]
nir/algebraic: Remove a redundant fabs pattern

Made redundant by 5544b2cbbd2 ("nir/algebraic: Use value range analysis
to eliminate useless unary ops").

No shader-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>

4 years agonir/algebraic: Use value range analysis to convert fmax to fsat
Ian Romanick [Tue, 11 Feb 2020 02:23:54 +0000 (18:23 -0800)]
nir/algebraic: Use value range analysis to convert fmax to fsat

This is conceptually similar to the 1-fsat(a) <=> fsat(1-a) rearragement
done in:

3b747909419 ("nir/algebraic: Recognize open-coded flrp(a, b, fsat(c))")

2d259713b7 ("nir/algebraic: Commute 1-fsat(a) to fsat(1-a) for all
non-fmul instructions").

Note: this helps the Aztex Ruins shader that was hurt for spills and
fills on Braodwell in the previous commit, but it does not fix the
spills or fills. :(

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14528985 -> 14526116 (-0.02%)
instructions in affected programs: 477300 -> 474431 (-0.60%)
helped: 2332
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 1.23 x̃: 1
helped stats (rel) min: 0.07% max: 8.89% x̄: 0.88% x̃: 0.64%
95% mean confidence interval for instructions value: -1.27 -1.19
95% mean confidence interval for instructions %-change: -0.92% -0.85%
Instructions are helped.

total cycles in shared programs: 203723684 -> 203692984 (-0.02%)
cycles in affected programs: 4878847 -> 4848147 (-0.63%)
helped: 1764
HURT: 324
helped stats (abs) min: 1 max: 706 x̄: 22.94 x̃: 17
helped stats (rel) min: <.01% max: 17.75% x̄: 1.94% x̃: 1.66%
HURT stats (abs)   min: 1 max: 400 x̄: 30.15 x̃: 10
HURT stats (rel)   min: <.01% max: 17.76% x̄: 1.91% x̃: 0.69%
95% mean confidence interval for cycles value: -16.55 -12.86
95% mean confidence interval for cycles %-change: -1.44% -1.24%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>

4 years agonir/algebraic: Distribute source modifiers into instructions
Ian Romanick [Mon, 15 Jul 2019 22:55:00 +0000 (15:55 -0700)]
nir/algebraic: Distribute source modifiers into instructions

There are three main classes of cases that are helped by this change:

1. When the negation is applied to a value being type converted (e.g.,
   float(-x)).  This could possibly also be handled with more clever
   code generation.

2. When the negation is applied to a phi node source (e.g., x = -(...);
   at the end of a basic block).  This was the original case that caught
   my attention while looking at shader-db dumps.

3. When the negation is applied to the source of an instruction that
   cannot have source modifiers.  This includes texture instructions and
   math box instructions on pre-Gen7 platforms (see more details below).

In many these cases the negation can be propagated into the instructions
that generate the value (e.g., -(a*b) = (-a)*b).

In addition to the operations implemtned in this patch, I also tried:

 - frcp - Helped 6 or fewer shaders on Gen7+, and hurt just as many on
   pre-Gen7.  On Gen6 and earlier, frcp is a math box instruction, and
   math box instructions cannot have source modifiers.

   I suspect this is why so many more shaders are helped on Gen6 than on
   Gen5 or Gen7.  Gen6 supports OpenGL 3.3, so a lot more shaders
   compile on it.  A lot of these shaders may have things like cos(-x)
   or rcp(-x) that could result in an explicit negation instruction.

 - bcsel - Hurt a few shaders with none helped.  bcsel operates on
   integer sources, so the fabs or fneg cannot be a source modifier in
   the bcsel itself.

 - Integer instructions - No changes on any Intel platform.

Some notes about the shader-db results below.

 - On Tiger Lake, a single Deus Ex fragment shader is hurt for both
   spills and fills.

 - On Haswell, a different Deus Ex fragment shader is hurt for both
   spills and fills.

 - On GM45, the "LOST: 1" and "GAINED: 1" is a single Left4Dead 2
   (very high graphics settings, lol) fragment shader that upgrades
   from SIMD8 to SIMD16.

v2: Add support for fsign.  Add some patterns that remove redundant
negations and redundant absolute value rather than trying to push them
down the tree.

Tiger Lake
total instructions in shared programs: 17611333 -> 17586465 (-0.14%)
instructions in affected programs: 3033734 -> 3008866 (-0.82%)
helped: 10310
HURT: 632
helped stats (abs) min: 1 max: 35 x̄: 2.61 x̃: 1
helped stats (rel) min: 0.04% max: 16.67% x̄: 1.43% x̃: 1.01%
HURT stats (abs)   min: 1 max: 47 x̄: 3.21 x̃: 2
HURT stats (rel)   min: 0.04% max: 5.08% x̄: 0.88% x̃: 0.63%
95% mean confidence interval for instructions value: -2.33 -2.21
95% mean confidence interval for instructions %-change: -1.32% -1.27%
Instructions are helped.

total cycles in shared programs: 338365223 -> 338262252 (-0.03%)
cycles in affected programs: 125291811 -> 125188840 (-0.08%)
helped: 5224
HURT: 2031
helped stats (abs) min: 1 max: 5670 x̄: 46.73 x̃: 12
helped stats (rel) min: <.01% max: 34.78% x̄: 1.91% x̃: 0.97%
HURT stats (abs)   min: 1 max: 2882 x̄: 69.50 x̃: 14
HURT stats (rel)   min: <.01% max: 44.93% x̄: 2.35% x̃: 0.74%
95% mean confidence interval for cycles value: -18.71 -9.68
95% mean confidence interval for cycles %-change: -0.80% -0.63%
Cycles are helped.

total spills in shared programs: 8942 -> 8946 (0.04%)
spills in affected programs: 8 -> 12 (50.00%)
helped: 0
HURT: 1

total fills in shared programs: 9399 -> 9401 (0.02%)
fills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

Ice Lake
total instructions in shared programs: 16124348 -> 16102258 (-0.14%)
instructions in affected programs: 2830928 -> 2808838 (-0.78%)
helped: 11294
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.07% max: 17.65% x̄: 1.32% x̃: 0.93%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.45% max: 4.00% x̄: 3.72% x̃: 3.72%
95% mean confidence interval for instructions value: -1.99 -1.93
95% mean confidence interval for instructions %-change: -1.34% -1.29%
Instructions are helped.

total cycles in shared programs: 335393932 -> 335325794 (-0.02%)
cycles in affected programs: 123834609 -> 123766471 (-0.06%)
helped: 5034
HURT: 2128
helped stats (abs) min: 1 max: 3256 x̄: 43.39 x̃: 11
helped stats (rel) min: <.01% max: 35.79% x̄: 1.98% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2634 x̄: 70.63 x̃: 16
HURT stats (rel)   min: <.01% max: 49.49% x̄: 2.73% x̃: 0.62%
95% mean confidence interval for cycles value: -13.66 -5.37
95% mean confidence interval for cycles %-change: -0.69% -0.48%
Cycles are helped.

LOST:   0
GAINED: 2

Skylake
total instructions in shared programs: 14949240 -> 14927930 (-0.14%)
instructions in affected programs: 2594756 -> 2573446 (-0.82%)
helped: 11000
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.91
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 324829346 -> 324821596 (<.01%)
cycles in affected programs: 121566087 -> 121558337 (<.01%)
helped: 4611
HURT: 2147
helped stats (abs) min: 1 max: 3715 x̄: 33.29 x̃: 10
helped stats (rel) min: <.01% max: 36.08% x̄: 1.94% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2551 x̄: 67.88 x̃: 16
HURT stats (rel)   min: <.01% max: 53.79% x̄: 3.69% x̃: 0.89%
95% mean confidence interval for cycles value: -4.25 1.96
95% mean confidence interval for cycles %-change: -0.28% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14971203 -> 14949957 (-0.14%)
instructions in affected programs: 2635699 -> 2614453 (-0.81%)
helped: 10982
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.93 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.90
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 336215033 -> 336086458 (-0.04%)
cycles in affected programs: 127383198 -> 127254623 (-0.10%)
helped: 4884
HURT: 1963
helped stats (abs) min: 1 max: 25696 x̄: 51.78 x̃: 12
helped stats (rel) min: <.01% max: 58.28% x̄: 2.00% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3401 x̄: 63.33 x̃: 16
HURT stats (rel)   min: <.01% max: 39.95% x̄: 2.20% x̃: 0.70%
95% mean confidence interval for cycles value: -29.99 -7.57
95% mean confidence interval for cycles %-change: -0.89% -0.71%
Cycles are helped.

total fills in shared programs: 24905 -> 24901 (-0.02%)
fills in affected programs: 117 -> 113 (-3.42%)
helped: 4
HURT: 0

LOST:   0
GAINED: 16

Haswell
total instructions in shared programs: 13148927 -> 13131528 (-0.13%)
instructions in affected programs: 2220941 -> 2203542 (-0.78%)
helped: 8017
HURT: 4
helped stats (abs) min: 1 max: 12 x̄: 2.17 x̃: 1
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.40% x̃: 0.93%
HURT stats (abs)   min: 1 max: 7 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 4.76% x̄: 2.73% x̃: 2.91%
95% mean confidence interval for instructions value: -2.21 -2.13
95% mean confidence interval for instructions %-change: -1.43% -1.37%
Instructions are helped.

total cycles in shared programs: 321221791 -> 321079870 (-0.04%)
cycles in affected programs: 126886055 -> 126744134 (-0.11%)
helped: 4674
HURT: 1729
helped stats (abs) min: 1 max: 23654 x̄: 56.47 x̃: 16
helped stats (rel) min: <.01% max: 53.22% x̄: 2.13% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3694 x̄: 70.58 x̃: 18
HURT stats (rel)   min: <.01% max: 63.06% x̄: 2.48% x̃: 0.90%
95% mean confidence interval for cycles value: -33.31 -11.02
95% mean confidence interval for cycles %-change: -0.99% -0.78%
Cycles are helped.

total spills in shared programs: 19872 -> 19874 (0.01%)
spills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

total fills in shared programs: 20941 -> 20941 (0.00%)
fills in affected programs: 62 -> 62 (0.00%)
helped: 1
HURT: 1

LOST:   0
GAINED: 8

Ivy Bridge
total instructions in shared programs: 11875553 -> 11853839 (-0.18%)
instructions in affected programs: 1553112 -> 1531398 (-1.40%)
helped: 7304
HURT: 3
helped stats (abs) min: 1 max: 16 x̄: 2.97 x̃: 2
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.62% x̃: 1.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.05% max: 3.33% x̄: 2.44% x̃: 2.94%
95% mean confidence interval for instructions value: -3.04 -2.90
95% mean confidence interval for instructions %-change: -1.65% -1.59%
Instructions are helped.

total cycles in shared programs: 178246425 -> 178184484 (-0.03%)
cycles in affected programs: 13702146 -> 13640205 (-0.45%)
helped: 4409
HURT: 1566
helped stats (abs) min: 1 max: 531 x̄: 24.52 x̃: 13
helped stats (rel) min: <.01% max: 38.67% x̄: 2.14% x̃: 1.02%
HURT stats (abs)   min: 1 max: 356 x̄: 29.48 x̃: 10
HURT stats (rel)   min: <.01% max: 64.73% x̄: 1.87% x̃: 0.70%
95% mean confidence interval for cycles value: -11.60 -9.14
95% mean confidence interval for cycles %-change: -1.19% -0.99%
Cycles are helped.

LOST:   0
GAINED: 10

Sandy Bridge
total instructions in shared programs: 10695740 -> 10667483 (-0.26%)
instructions in affected programs: 2337607 -> 2309350 (-1.21%)
helped: 10720
HURT: 1
helped stats (abs) min: 1 max: 49 x̄: 2.64 x̃: 2
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.54% x̃: 1.13%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.04% max: 1.04% x̄: 1.04% x̃: 1.04%
95% mean confidence interval for instructions value: -2.69 -2.58
95% mean confidence interval for instructions %-change: -1.57% -1.51%
Instructions are helped.

total cycles in shared programs: 153478839 -> 153416223 (-0.04%)
cycles in affected programs: 22050900 -> 21988284 (-0.28%)
helped: 5342
HURT: 2200
helped stats (abs) min: 1 max: 1020 x̄: 20.34 x̃: 16
helped stats (rel) min: <.01% max: 24.05% x̄: 1.51% x̃: 0.86%
HURT stats (abs)   min: 1 max: 335 x̄: 20.93 x̃: 6
HURT stats (rel)   min: <.01% max: 20.18% x̄: 1.03% x̃: 0.30%
95% mean confidence interval for cycles value: -9.18 -7.42
95% mean confidence interval for cycles %-change: -0.82% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 8114882 -> 8105574 (-0.11%)
instructions in affected programs: 1232504 -> 1223196 (-0.76%)
helped: 4109
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.27 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 0.99% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.94% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.31 -2.21
95% mean confidence interval for instructions %-change: -1.01% -0.96%
Instructions are helped.

total cycles in shared programs: 188504036 -> 188466296 (-0.02%)
cycles in affected programs: 31203798 -> 31166058 (-0.12%)
helped: 3447
HURT: 36
helped stats (abs) min: 2 max: 92 x̄: 11.03 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.21% x̃: 0.13%
HURT stats (abs)   min: 2 max: 30 x̄: 7.33 x̃: 6
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.18% x̃: 0.10%
95% mean confidence interval for cycles value: -11.16 -10.51
95% mean confidence interval for cycles %-change: -0.22% -0.20%
Cycles are helped.

LOST:   0
GAINED: 1

GM45
total instructions in shared programs: 4989697 -> 4984531 (-0.10%)
instructions in affected programs: 703952 -> 698786 (-0.73%)
helped: 2493
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 1.03% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.95% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.13 -2.01
95% mean confidence interval for instructions %-change: -1.07% -0.99%
Instructions are helped.

total cycles in shared programs: 128929136 -> 128903886 (-0.02%)
cycles in affected programs: 21583096 -> 21557846 (-0.12%)
helped: 2214
HURT: 17
helped stats (abs) min: 2 max: 92 x̄: 11.44 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.24% x̃: 0.13%
HURT stats (abs)   min: 2 max: 8 x̄: 4.24 x̃: 4
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.20% x̃: 0.09%
95% mean confidence interval for cycles value: -11.75 -10.88
95% mean confidence interval for cycles %-change: -0.25% -0.22%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>

4 years agonir/algebraic: Change the default cursor location when replacing a unary op
Ian Romanick [Mon, 23 Mar 2020 21:25:24 +0000 (14:25 -0700)]
nir/algebraic: Change the default cursor location when replacing a unary op

If the expression tree that is being replaced has a unary operation at
its root, set the cursor (location where new instructions are inserted)
at the source instruction instead.

This doesn't do much now because there are very few patterns that have a
unary operation as the root.  Almost all of the patterns that do have a
unary operation as the root have inot.  All of the shaders that are
affected by this commit have expression trees with an inot at the root.

This change prevents some significant, spurious caused by the next
commit.  There is further explanation in the large comment added in
the code.

I also considered a couple other options that may still be worth exploring.

1. Add some mark-up to the search pattern to denote where new
   instructions should be added.  I considered using "@" to denote the
   cursor location.  For example,

    (('fneg', ('fadd@', a, b)), ...)

2. To prevent other kinds of unintended code motion, add the ability to
   name expressions in the search pattern so that they can be reused in
   the replacement.  For example,

   (('bcsel', ('ige', ('find_lsb=b', a), 0), ('find_lsb', a), -1), b),

   An alternative would be to add some kind of CSE at the time of
   inserting the replacements.  Create a new instruction, then check to
   see if it already exists.  That option might be better overall.

Over the years I know Matt has heard me complain, "I added a pattern
that just deleted an instruction, but it added a bunch of spills!"  This
was always in large, complex shaders that are very hard to analyze.  I
always blamed these cases on the scheduler being dumb.  I am now very
suspicious that unintended code motion was the real problem.

All Gen4+ Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611405 -> 17611333 (<.01%)
instructions in affected programs: 18613 -> 18541 (-0.39%)
helped: 41
HURT: 13
helped stats (abs) min: 1 max: 18 x̄: 4.46 x̃: 4
helped stats (rel) min: 0.27% max: 5.68% x̄: 1.29% x̃: 1.34%
HURT stats (abs)   min: 1 max: 20 x̄: 8.54 x̃: 7
HURT stats (rel)   min: 0.30% max: 4.20% x̄: 2.15% x̃: 2.38%
95% mean confidence interval for instructions value: -3.29 0.63
95% mean confidence interval for instructions %-change: -0.95% 0.02%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 338366118 -> 338365223 (<.01%)
cycles in affected programs: 257889 -> 256994 (-0.35%)
helped: 42
HURT: 15
helped stats (abs) min: 2 max: 120 x̄: 39.38 x̃: 34
helped stats (rel) min: 0.04% max: 2.55% x̄: 0.86% x̃: 0.76%
HURT stats (abs)   min: 6 max: 204 x̄: 50.60 x̃: 34
HURT stats (rel)   min: 0.11% max: 4.75% x̄: 1.12% x̃: 0.56%
95% mean confidence interval for cycles value: -30.39 -1.02
95% mean confidence interval for cycles %-change: -0.66% -0.02%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>

4 years agointel/vec4: Allow late copy propagation on vec4
Ian Romanick [Mon, 10 Jun 2019 23:45:08 +0000 (16:45 -0700)]
intel/vec4: Allow late copy propagation on vec4

This change incurs a small amount of hurt now, but it enables a lot of
benefit on vec4 shaders on the next commit.  nir_opt_algebraic_late
converts dph, dot3, etc. to dhp_replicated, dot_replicated3, etc.  In
the process, it introduces extra moves.  If the original NIR contained

        vec1 32 ssa_45 = fdot4 ssa_51, ssa_44
        vec1 32 ssa_46 = fneg ssa_45

nir_opt_algebraic_late will produce

        vec4 32 ssa_18 = fdot_replicated4 ssa_1, ssa_15
        vec1 32 ssa_19 = mov ssa_18.x
        vec1 32 ssa_17 = fneg ssa_19

The algebraic pass added in the next commit can't see through the move
to know that the fneg applies to a fdot_replicated4.

Haswell, Ivy Bridge, and Sandybridge had similar results. (Haswell shown)
total cycles in shared programs: 187077604 -> 187079858 (<.01%)
cycles in affected programs: 350132 -> 352386 (0.64%)
helped: 174
HURT: 194
helped stats (abs) min: 2 max: 124 x̄: 23.60 x̃: 16
helped stats (rel) min: 0.12% max: 15.88% x̄: 4.98% x̃: 3.86%
HURT stats (abs)   min: 2 max: 164 x̄: 32.78 x̃: 16
HURT stats (rel)   min: 0.17% max: 22.82% x̄: 6.46% x̃: 0.86%
95% mean confidence interval for cycles value: 2.04 10.21
95% mean confidence interval for cycles %-change: 0.17% 1.93%
Cycles are HURT.

No shader-db changes on any other Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>

4 years agonir: fix crash in varying packing on interface mismatch
Timothy Arceri [Fri, 27 Mar 2020 15:17:54 +0000 (02:17 +1100)]
nir: fix crash in varying packing on interface mismatch

For example when the outputs are scalars but the inputs are struct
members.

Fixes: 26aa460940f6 ("nir: rewrite varying component packing")
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>

4 years agofreedreno/turnip: Use the NIR info to decide if we need helper invocations.
Eric Anholt [Tue, 24 Mar 2020 16:59:16 +0000 (09:59 -0700)]
freedreno/turnip: Use the NIR info to decide if we need helper invocations.

We had an approximation that was assuming any ddx or tex instruction
needed helper invocations, but that's not true for texelFetch() or
textureSize().  It also meant that we were setting PIXLOD on vertex and
compute shaders doing texturing, which doesn't really make sense.

shader-db (with a hack to log pixlod):
total pixlod in shared programs: 582 -> 573 (-1.55%)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2681
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4308>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4308>

4 years agofreedreno: Drop an unnecessary include marked "this should go away"
Eric Anholt [Mon, 23 Mar 2020 21:19:48 +0000 (14:19 -0700)]
freedreno: Drop an unnecessary include marked "this should go away"

It came in with the initial import, and doesn't seem to be necessary any
more.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4289>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4289>

4 years agofreedreno/ir3: fix android build
Rob Clark [Mon, 30 Mar 2020 22:53:48 +0000 (15:53 -0700)]
freedreno/ir3: fix android build

Fixes: e5339fe4a47 ("Move compiler.h and imports.h/c from src/mesa/main into src/util")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>

4 years agoutil: move ALIGN/ROUND_DOWN_TO to u_math.h
Rob Clark [Mon, 30 Mar 2020 22:52:30 +0000 (15:52 -0700)]
util: move ALIGN/ROUND_DOWN_TO to u_math.h

These are less mesa specific than the rest of macros.h, and would be
nice to use outside of mesa.  Prep for next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>

4 years agospirv: Implement OpCopyObject and OpCopyLogical as blind copies
Jason Ekstrand [Mon, 30 Mar 2020 16:25:07 +0000 (11:25 -0500)]
spirv: Implement OpCopyObject and OpCopyLogical as blind copies

Because the types etc. are required to logically match, we can just
copy-propagate the guts of the vtn_value.  This was causing issues with
some new CTS tests that are doing an OpCopyObject of a sampler which is
a special-cased type in spirv_to_nir.  Of course, this is only a partial
solution.  Ideally, we've got a bit of work to do to make all the
composite stuff able to handle all types including images, sampler, and
combined image/samplers but this gets some CTS tests passing.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>