mesa.git
5 years agollvmpipe: introduce compute shader context
Dave Airlie [Tue, 27 Aug 2019 03:19:00 +0000 (13:19 +1000)]
llvmpipe: introduce compute shader context

The compute shader will need it's own context like the frag shader
has, this just introduces the framework struct and allocates/frees
for it in the right places.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agogallivm: add barrier support for compute shaders.
Dave Airlie [Tue, 27 Aug 2019 02:50:35 +0000 (12:50 +1000)]
gallivm: add barrier support for compute shaders.

When the code is executing an hits a barrier, it will suspend
the coroutine and return control to the coroutine dispatcher.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agollvmpipe: add compute threadpool + mutex
Dave Airlie [Tue, 27 Aug 2019 02:45:39 +0000 (12:45 +1000)]
llvmpipe: add compute threadpool + mutex

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
In order to efficiently run a number of compute blocks, use
a threadpool that just allows for jobs with unique sequential
ids to be dispatched.

5 years agogallivm: add support for compute shared memory
Dave Airlie [Sun, 21 Jul 2019 22:29:42 +0000 (08:29 +1000)]
gallivm: add support for compute shared memory

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agogallivm: add new compute related intrinsics
Dave Airlie [Sun, 21 Jul 2019 22:27:27 +0000 (08:27 +1000)]
gallivm: add new compute related intrinsics

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agollvmpipe: reogranise jit pointer ordering
Dave Airlie [Wed, 26 Jun 2019 00:12:28 +0000 (10:12 +1000)]
llvmpipe: reogranise jit pointer ordering

In order to share the texture/image/sampler code with compute
shaders we need to reorg them to be at the front of context
same as draw does for vs/gs sharing.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agogallivm: add coroutine pass manager support
Dave Airlie [Tue, 25 Jun 2019 21:37:20 +0000 (07:37 +1000)]
gallivm: add coroutine pass manager support

coroutines require a proper pass manager, so add the passes
to the correct places

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agogallivm: add coroutine support files to gallivm.
Dave Airlie [Tue, 25 Jun 2019 21:36:40 +0000 (07:36 +1000)]
gallivm: add coroutine support files to gallivm.

These wrap the coroutine intrinsics and also add some higher
level wrappers around coroutine begin, end and suspend procedures

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agogallivm/flow: add counter reset for loops
Dave Airlie [Tue, 25 Jun 2019 21:35:36 +0000 (07:35 +1000)]
gallivm/flow: add counter reset for loops

This allows the counter value to be forced to a certain value

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agollvmpipe: enable fb no attach
Dave Airlie [Wed, 19 Jun 2019 19:38:19 +0000 (05:38 +1000)]
llvmpipe: enable fb no attach

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoiris: Report correct number of planes for planar images
Kenneth Graunke [Wed, 28 Aug 2019 19:52:23 +0000 (12:52 -0700)]
iris: Report correct number of planes for planar images

We were only handling the modifiers case and not counting the number of
planes in actual planar images.

Fixes Piglit's ext_image_dma_buf_import-export.

Fixes: fc12fd05f56 ("iris: Implement pipe_screen::resource_get_param")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agoteximage: ensure that Tex*SubImage* checks format
Ilia Mirkin [Mon, 2 Sep 2019 22:50:01 +0000 (18:50 -0400)]
teximage: ensure that Tex*SubImage* checks format

We were previously not doing at least some of the checks. This uses the
same logic that is used in glTexImage*.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agogallium/hud: add CPU usage support for DragonFly/NetBSD/OpenBSD
Jan Beich [Sat, 31 Aug 2019 18:32:16 +0000 (18:32 +0000)]
gallium/hud: add CPU usage support for DragonFly/NetBSD/OpenBSD

Each BSD has slightly different sysctl for retrieving per-CPU times.
FreeBSD returns long while NetBSD returns uint64_t. On OpenBSD return
type differs between summation and per-CPU times. DragonFly is
compatible with FreeBSD.

Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
5 years agolima: Return fence unconditionally
Roman Stratiienko [Mon, 2 Sep 2019 14:46:22 +0000 (17:46 +0300)]
lima: Return fence unconditionally

Based on the vc4 implementation.
Fixes Android RenderEngine::flush() routine:
android.googlesource.com/platform/frameworks/native/+/refs/tags/android-o-mr1-iot-release-smart-clock-fcs/services/surfaceflinger/RenderEngine/RenderEngine.cpp#225

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/ppir: clone uniforms and load_coords into each successor
Vasily Khoruzhick [Wed, 28 Aug 2019 06:02:12 +0000 (23:02 -0700)]
lima/ppir: clone uniforms and load_coords into each successor

Try more aggressive approach with cloning uniform and coord loads.

Uniform load can be inserted into any instruction, so let's do that. ARM site
claim that penalty for cache miss is one clock, so we don't lose anything if
we merge it into instruction that uses the result. As side effect we can also
pipeline it and thus decrease reg pressure.

Do the same for varyings that hold texture coords, but for different reason:
looks like there's a special path for coords that increases precision if
varying that holds it is pipelined. If we don't pipeline it and load coords
from a register its precision is fp16 and thus only 10 bits which is not enough
to accurately sample textures of size 1024 or larger.

Since instruction can hold only one uniform load and one varying load,
node_to_instr now creates a move using helper introduced in previous commit if
slot is already taken. As side effect of this change we can also try to
pipeline texture loads and create a move if attempt fails.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/ppir: don't assume that load coords gets value from register
Vasily Khoruzhick [Sun, 1 Sep 2019 17:21:32 +0000 (10:21 -0700)]
lima/ppir: don't assume that load coords gets value from register

It can load value from varying directly as well. Also load_regs is the
only op that has a source, so add src_num field to load node and set it
accordingly.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/ppir: add common helper for creating movs
Vasily Khoruzhick [Wed, 28 Aug 2019 05:22:01 +0000 (22:22 -0700)]
lima/ppir: add common helper for creating movs

Introduce common helper for creating movs to avoid code duplication

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agonir: fix memleak in error path
Eric Engestrom [Mon, 26 Aug 2019 14:33:31 +0000 (15:33 +0100)]
nir: fix memleak in error path

Fixes: 2cf59861a8128a91bfdd ("nir: Add partial redundancy elimination for compares")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agofreedreno/drm-shim: fix mem leak
Eric Engestrom [Mon, 26 Aug 2019 14:42:32 +0000 (15:42 +0100)]
freedreno/drm-shim: fix mem leak

Fixes: 494ecef6b42198ab6c3e ("freedreno: Add support for drm-shim.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoanv: fix format string in error message
Eric Engestrom [Mon, 26 Aug 2019 14:32:36 +0000 (15:32 +0100)]
anv: fix format string in error message

Fixes: 9775894f102535a79186 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoutil/os_file: fix double-close()
Eric Engestrom [Mon, 26 Aug 2019 14:30:54 +0000 (15:30 +0100)]
util/os_file: fix double-close()

Fixes: 955c63d3643f30d7db0c ("util/os_file: resize buffer to what was actually needed")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agoegl: fix deadlock in malloc error path
Eric Engestrom [Mon, 26 Aug 2019 14:29:49 +0000 (15:29 +0100)]
egl: fix deadlock in malloc error path

Fixes: cb0980e69aa921af7086 ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agottn: fix 64-bit shift on 32-bit `1`
Eric Engestrom [Mon, 26 Aug 2019 14:52:33 +0000 (15:52 +0100)]
ttn: fix 64-bit shift on 32-bit `1`

Fixes: 4d0b2c7aaac3cf3de5af ("ttn: Update shader->info as we generate code.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/ir3: use uniform base
Rob Clark [Thu, 8 Aug 2019 21:31:50 +0000 (14:31 -0700)]
freedreno/ir3: use uniform base

When lowering from ubo, use the constant base field in the load_uniform
instruction for the constant part of the offset.  Doesn't change much
for constant indexing, but this will help for indirect indexing because
constant-folding can't completely clean up the result.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agofreedreno/drm: fix 64b iova shifts
Rob Clark [Thu, 29 Aug 2019 18:35:17 +0000 (11:35 -0700)]
freedreno/drm: fix 64b iova shifts

Should shift before splitting 64b iova into dwords

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: remove unused constant_fold_state
Rob Clark [Thu, 8 Aug 2019 20:37:49 +0000 (13:37 -0700)]
nir: remove unused constant_fold_state

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agofreedreno: Fix the type of single-component scaled vertex attrs.
Eric Anholt [Wed, 28 Aug 2019 20:31:07 +0000 (13:31 -0700)]
freedreno: Fix the type of single-component scaled vertex attrs.

This looks like clear copy-and-pasteos, and fixes:

dEQP-GLES2.functional.draw.random.40

(on A307 and A630, both tested in the new CI farm)

Reviewed-by: Rob Clark <robdclark@chromium.org>
5 years agoradeonsi/nir: Remove uniform variable scanning
Connor Abbott [Tue, 27 Aug 2019 11:36:11 +0000 (13:36 +0200)]
radeonsi/nir: Remove uniform variable scanning

We can get all the information we need from NIR. It's slightly less
accurate, but radeonsi doesn't use the extra information. The old code
also overcounted atomic counters, which led to problems when everything
was used at once.

Fixes KHR-GL45.compute_shader.resources-max.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agottn: Fill out more info fields
Connor Abbott [Tue, 27 Aug 2019 09:34:35 +0000 (11:34 +0200)]
ttn: Fill out more info fields

We'll use these in radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agonir: Fix num_ssbos when lowering atomic counters
Connor Abbott [Tue, 27 Aug 2019 08:54:12 +0000 (10:54 +0200)]
nir: Fix num_ssbos when lowering atomic counters

Otherwise it's impossible to know the maximum SSBO index for both
internal TGSI shaders from TTN (which don't have any notion of atomic
counters and no offset) as well as shaders from GLSL.

I fixed everything I could find while grepping for num_ssbos and
num_abos, which hopefully is everything (iris was the only user I could
find that uses it in a meaningful way).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoac/nir: Fix gather4 integer wa with unnormalized coordinates
Connor Abbott [Thu, 6 Jun 2019 14:28:48 +0000 (16:28 +0200)]
ac/nir: Fix gather4 integer wa with unnormalized coordinates

This adds a bit of unneccesary code on radeonsi, since whether
unnormalized coordinates are used is known at compile time with GL, but
I wasn't sure if it was worth the few instructions to plumb everything
through, especially for something so rare -- my shader-db doesn't have
any instances where this changes anything.

Fixes CTS tests I created at
https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac/nir: Rewrite gather4 integer workaround based on radeonsi
Connor Abbott [Thu, 6 Jun 2019 10:16:18 +0000 (12:16 +0200)]
ac/nir: Rewrite gather4 integer workaround based on radeonsi

The workaround was originally written based on amdgpu-pro traces, but
since then radeonsi has got its own slightly different version. Use the
radeonsi version instead, to be consistent and because it'll be slightly
more convenient for handling unnormalized coordinates.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoegl: warn user if they set an invalid EGL_PLATFORM
Eric Engestrom [Mon, 2 Sep 2019 09:09:58 +0000 (10:09 +0100)]
egl: warn user if they set an invalid EGL_PLATFORM

Technically, the user might have set EGL_DISPLAY instead of
EGL_PLATFORM, but since the former is deprecated let's just mention the
latter in the warning message.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agopanfrost: Remove panfrost_upload
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:38:27 +0000 (17:38 -0700)]
panfrost: Remove panfrost_upload

This routine was made obsolete over a series of reworks of memory
allocation; Tomeu's changes to shader memory allocation finally made
this unused as cppcheck noted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopanfrost: Fix misc. issues flagged by cppcheck
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:37:22 +0000 (17:37 -0700)]
panfrost: Fix misc. issues flagged by cppcheck

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopanfrost: Mark (1 << 31) as unsigned
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:34:13 +0000 (17:34 -0700)]
panfrost: Mark (1 << 31) as unsigned

I was not aware this incurred undefined behaviour; thank you cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Remove mir_rewrite_index_*_tag
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:32:30 +0000 (17:32 -0700)]
pan/midgard: Remove mir_rewrite_index_*_tag

These helpers are unused, as flagged by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Remove mir_print_bundle
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:31:36 +0000 (17:31 -0700)]
pan/midgard: Remove mir_print_bundle

In practice, the new post-schedule print is just as useful.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Remove cppwrap.cpp
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:30:31 +0000 (17:30 -0700)]
pan/midgard: Remove cppwrap.cpp

It has not been used in a long time; I forgot this file even existed.
Flagged by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Fix cppcheck issues
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:29:17 +0000 (17:29 -0700)]
pan/midgard: Fix cppcheck issues

Miscellaneous minor issues flagged by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/midgard: Correct issues in disassemble.c
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:16:17 +0000 (17:16 -0700)]
pan/midgard: Correct issues in disassemble.c

cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/decode: Add missing format specifier
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:08:20 +0000 (17:08 -0700)]
pan/decode: Add missing format specifier

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/decode: Use portable format specifier for 64-bit
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:03:25 +0000 (17:03 -0700)]
pan/decode: Use portable format specifier for 64-bit

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/decode: Use %zu instead of %d
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:02:43 +0000 (17:02 -0700)]
pan/decode: Use %zu instead of %d

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agopan/decode: Fix uninitialized variables
Alyssa Rosenzweig [Sat, 31 Aug 2019 00:00:09 +0000 (17:00 -0700)]
pan/decode: Fix uninitialized variables

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years agodocs: update calendar, add news item and link release notes for 19.1.6
Juan A. Suarez Romero [Tue, 3 Sep 2019 11:06:56 +0000 (13:06 +0200)]
docs: update calendar, add news item and link release notes for 19.1.6

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years agodocs: add sha256 checksums for 19.1.6
Juan A. Suarez Romero [Tue, 3 Sep 2019 11:04:25 +0000 (13:04 +0200)]
docs: add sha256 checksums for 19.1.6

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 4ec2325dd07a768f2b52ea788ee76085586b2469)

5 years agodocs: add release notes for 19.1.6
Juan A. Suarez Romero [Tue, 3 Sep 2019 10:02:19 +0000 (12:02 +0200)]
docs: add release notes for 19.1.6

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 85c8f88a49aa7c8aa866faed90a4a63330c15b8b)

5 years agovulkan/overlay: bounce image back to present layout
Lionel Landwerlin [Wed, 21 Aug 2019 11:47:25 +0000 (13:47 +0200)]
vulkan/overlay: bounce image back to present layout

Once we write the overlay to an image to be presented, we must not
forget to put it back into present layout.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agobroadcom/vc4: Expand width of dst surface
Zhaowei Yuan [Tue, 3 Sep 2019 02:58:59 +0000 (10:58 +0800)]
broadcom/vc4: Expand width of dst surface

Four bytes of src_surf will be compressed into a 32-bits data and
stored into dst_surf, and dst_surf is read as z-order, so its width
must be aligned to multiples of 8(4x2) before divided by 2.

Signed-off-by: Zhaowei Yuan <zhaowei.yuan@samsung.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
5 years agoswr: Fix make_unique build error.
Vinson Lee [Thu, 29 Aug 2019 23:44:09 +0000 (16:44 -0700)]
swr: Fix make_unique build error.

swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope
    ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func)));
                                            ^~~~~~~~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
5 years agoloader: include limits.h for PATH_MAX
nia [Sat, 31 Aug 2019 17:10:07 +0000 (18:10 +0100)]
loader: include limits.h for PATH_MAX

This is needed to build on illumos.

The location of the PATH_MAX definition in limits.h seems to be fairly standard:
https://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agoutil: only allow _BitScanReverse64 on 64-bit cpus
Erik Faye-Lund [Wed, 14 Aug 2019 20:29:24 +0000 (22:29 +0200)]
util: only allow _BitScanReverse64 on 64-bit cpus

While the documentation for _BitScanReverse64 on MSDN says that it's
available on ARM, this isn't true. It's only available on ARM64. So
let's match reality.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
5 years agomesa/x86: improve SSE-checks for MSVC
Erik Faye-Lund [Thu, 15 Aug 2019 19:53:36 +0000 (21:53 +0200)]
mesa/x86: improve SSE-checks for MSVC

This enables some more SSE optimizations on MSVC builds.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoutil: do not assume MSVC implies SSE
Erik Faye-Lund [Wed, 14 Aug 2019 20:28:12 +0000 (22:28 +0200)]
util: do not assume MSVC implies SSE

This is not true for MSVC on ARM.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoutil: fix SSE-version needed for double opcodes
Erik Faye-Lund [Sun, 1 Sep 2019 08:05:12 +0000 (10:05 +0200)]
util: fix SSE-version needed for double opcodes

This code generates CVTSD2SI, which requires SSE2. So let's fix the
required SSE-version.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5de29ae (util: try to use SSE instructions with MSVC and 32-bit gcc)
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agomesa/main: remove unused include
Erik Faye-Lund [Thu, 15 Aug 2019 19:08:59 +0000 (21:08 +0200)]
mesa/main: remove unused include

This has been unused since 183db3a6455 ("glsl: move half<->float
convertion to util"), Oct 10 2015. Let's drop needlessly including it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir: do not assume that the result of fexp2(a) is always an integral
Samuel Pitoiset [Tue, 27 Aug 2019 09:35:00 +0000 (11:35 +0200)]
nir: do not assume that the result of fexp2(a) is always an integral

It's only correct when 'a' is an integral greater or equal to 0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
Fixes: 5544b2cbbd2 ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoegl: fix platform selection
Lionel Landwerlin [Sun, 1 Sep 2019 14:22:24 +0000 (17:22 +0300)]
egl: fix platform selection

Add missing "device" platform

v2: Add the missing platform (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jean Hertel <jean.hertel@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529
Fixes: d6edccee8d ("egl: add EGL_platform_device support")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agoiris: Lessen texture cache hack flush for blits/copies on Icelake.
Kenneth Graunke [Fri, 21 Jun 2019 03:18:11 +0000 (20:18 -0700)]
iris: Lessen texture cache hack flush for blits/copies on Icelake.

Lionel found actual documentation for this at long last.  Apparently
it actually is a sampler cache limitation that was mostly fixed on
Icelake.  Unfortunately, it seems there are still issues with ASTC
and non-ASTC sampler views.  Still, we can lessen the flush condition
from "format mismatch" to "ASTC mismatch", which eliminates most of
the flushing here.

We also update the documentation to refer to the workaround name.

5 years agoutil: Define strchrnul on macOS.
Vinson Lee [Fri, 30 Aug 2019 06:56:17 +0000 (23:56 -0700)]
util: Define strchrnul on macOS.

strchrnul is not available on macOS.

pipe_loader.c:141:14: error: implicit declaration of function 'strchrnul' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
      next = strchrnul(library_paths, ':');
             ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agogallium/auxiliary/indices: consistently apply start only to input
Erik Faye-Lund [Wed, 17 Jul 2019 08:21:08 +0000 (10:21 +0200)]
gallium/auxiliary/indices: consistently apply start only to input

The majority of these only apply the start argument to the input, but a
few of them also does for the output-array. util_primconvert, the only
user of this argument expects this pass a non-zero start-argument does
not expect this to be applied to the output; if it is, it will write
outside of allocated memory, leading to VRAM corruption.

The reason this doesn't seem to have been noticed before, is that no
driver currently use util_primconvert to convert a primitive-type to
itself, which is the cases where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index-buffers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 28f3f8d413f ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <robdclark@chromium.org>
5 years agotravis: Fail build if any command in if statement fails.
Vinson Lee [Fri, 30 Aug 2019 06:15:29 +0000 (23:15 -0700)]
travis: Fail build if any command in if statement fails.

Travis is checking the exit code of the entire if statement.

Fixes: 64ffc289be89 ("travis: add MacOS Scons build")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agoswr: Fix build with llvm-9.0 again.
Vinson Lee [Mon, 26 Aug 2019 23:16:26 +0000 (16:16 -0700)]
swr: Fix build with llvm-9.0 again.

Commit 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer
core and swr") unintentionally removed changes for llvm-9.0.

Fixes: 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr")
Fixes: 5dd9ad157005 ("swr/rasterizer: Better implementation of scatter")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
5 years agopan/midgard: Use shared psiz clamp pass
Alyssa Rosenzweig [Mon, 26 Aug 2019 19:14:11 +0000 (12:14 -0700)]
pan/midgard: Use shared psiz clamp pass

We already had a perfectly cromulent pass for this, but one landed in
common NIR code so let's switch and lighten our tree.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Remove mir_opt_post_move_eliminate
Alyssa Rosenzweig [Fri, 30 Aug 2019 20:49:33 +0000 (13:49 -0700)]
pan/midgard: Remove mir_opt_post_move_eliminate

This optimization depended on RA running before scheduling. It therefore
no longer applies and is now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Schedule before RA
Alyssa Rosenzweig [Fri, 30 Aug 2019 19:56:55 +0000 (12:56 -0700)]
pan/midgard: Schedule before RA

This is a tradeoff.

Scheduling before RA means we don't do RA on what-will-become pipeline
registers. Importantly, it means the scheduler is able to reorder
instructions, as registers have not been decided yet.

Unfortunately, it also complicates register spilling, since the spills
themselves won't get bundled optimally and we can only spill twice per
ALU bundle (only one spill per bundle allowed here). It also prevents us
from eliminating dead moves introduced by register allocation, as they
are not dead before RA. The shader-db regressions are from poor spilling
choices introduced by the new bundling requirements. These could be
solved by the combination of a post-scheduler (to combine adjacent
spills into bundles) with a VLIW-aware spill cost calculation.
Nevertheless, the change is small enough that I feel it's worth it to
eat a tiny shader-db regression for the sake of flexibility.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Handle fragment writeout in RA
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:06:33 +0000 (11:06 -0700)]
pan/midgard: Handle fragment writeout in RA

Rather than using a pile of hacks and awkward constructs in MIR to
ensure the writeout parameter gets written into r0, let's add a
dedicated shadow register class for writeout (interfering with work
register r0) so we can express the writeout condition succintly and
directly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Do not propagate swizzles into writeout
Alyssa Rosenzweig [Fri, 30 Aug 2019 21:35:01 +0000 (14:35 -0700)]
pan/midgard: Do not propagate swizzles into writeout

There's no slot for it; you'll end up writing into the void and
clobbering stuff. Don't. do it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix misc. RA issues
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:04:52 +0000 (11:04 -0700)]
pan/midgard: Fix misc. RA issues

When running the register allocator after scheduling, the MIR looks a
little different, so we need to extend the RA to handle a few of these
extra cases correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Print MIR by the bundle
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:03:44 +0000 (11:03 -0700)]
pan/midgard: Print MIR by the bundle

After scheduling, we still have valid MIR, but we have additional
bundling annotations which we would like to keep debug, so print these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Print branches in MIR
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:02:52 +0000 (11:02 -0700)]
pan/midgard: Print branches in MIR

Rather than a vague "br.??" line, annotate the branch with its target
type (useful for disambiguating discards) and whether it was inverted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Remove texture_index
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:01:57 +0000 (11:01 -0700)]
pan/midgard: Remove texture_index

This is deadcode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Cleanup fragment writeout branch
Alyssa Rosenzweig [Fri, 30 Aug 2019 18:01:15 +0000 (11:01 -0700)]
pan/midgard: Cleanup fragment writeout branch

I'm not sure if this is strictly necessary but it makes debugging easier
and minimizes the diff with the experimental scheduler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add scheduling barriers
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:53:13 +0000 (10:53 -0700)]
pan/midgard: Add scheduling barriers

Scheduling occurs on a per-block basis, strongly assuming that a given
block contains at most a single branch. This does not always map to the
source NIR control flow, particularly when discard intrinsics are
involved. The solution is to allow scheduling barriers, which will
terminate a block early in code generation and open a new block.

To facilitate this, we need to move some post-block processing to a new
pass, rather than relying hackily on the current_block pointer.

This allows us to cleanup some logic analyzing branches in other parts
of the driver us well, now that the MIR is much more well-formed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Track shader quadword count while scheduling
Alyssa Rosenzweig [Fri, 30 Aug 2019 20:57:20 +0000 (13:57 -0700)]
pan/midgard: Track shader quadword count while scheduling

This allow multiblock blend shaders to compute constant colour offsets
correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Allow NULL argument in mir_has_arg
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:48:41 +0000 (10:48 -0700)]
pan/midgard: Allow NULL argument in mir_has_arg

It's sometimes convenient to call this with no instruction specified. By
definition, a missing instruction cannot reference any argument, so
let's check for NULL and shortciruit to false.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Improve mir_mask_of_read_components
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:45:57 +0000 (10:45 -0700)]
pan/midgard: Improve mir_mask_of_read_components

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Extend mir_special_index to writeout
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:45:08 +0000 (10:45 -0700)]
pan/midgard: Extend mir_special_index to writeout

The branch has the writeout specified in its source list, making this
special even if it's not explicitly part of r0.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: csel_swizzle with mir get swizzle
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:44:42 +0000 (10:44 -0700)]
pan/midgard: csel_swizzle with mir get swizzle

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add mir_insert_instruction*scheduled helpers
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:46:17 +0000 (10:46 -0700)]
pan/midgard: Add mir_insert_instruction*scheduled helpers

In order to run register allocation after scheduling, it is sometimes
necessary to be able to insert instructions into an already-scheduled
program. This is suboptimal, since it forces us to do a worst-case
scheduling, but it is nevertheless required for correct handling of
spills/fills. Let's add helpers to insert instructions as standalone
bundles for use in spilling code.

These helpers are minimal -- they *only* work on load/store ops or
moves. They should not be used for anything but register spilling; any
other instructions should be added prior to the schedule.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Track csel swizzle
Alyssa Rosenzweig [Fri, 30 Aug 2019 17:42:05 +0000 (10:42 -0700)]
pan/midgard: Track csel swizzle

While it doesn't matter with an unconditional move to the conditional
register (r31), when we try to elide that move we'll need to track the
swizzle explicitly, and there is no slot for that yet since ALU ops are
normally binary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Ensure fragment writeout is in the final block
Alyssa Rosenzweig [Tue, 27 Aug 2019 19:20:06 +0000 (12:20 -0700)]
pan/midgard: Ensure fragment writeout is in the final block

This ensures the block only has exactly one branch, which makes
scheduling happy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Document Midgard scheduling requirements
Alyssa Rosenzweig [Mon, 26 Aug 2019 22:28:56 +0000 (15:28 -0700)]
pan/midgard: Document Midgard scheduling requirements

Oh boy. Midgard scheduling is crazy... These are all just the
requirements, not even the algorithm yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Include condition in branch->src[0]
Alyssa Rosenzweig [Mon, 26 Aug 2019 20:59:29 +0000 (13:59 -0700)]
pan/midgard: Include condition in branch->src[0]

This will allow us to reference the condition while scheduling.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add post-schedule iteration helpers
Alyssa Rosenzweig [Tue, 27 Aug 2019 21:26:27 +0000 (14:26 -0700)]
pan/midgard: Add post-schedule iteration helpers

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix corner case in RA
Alyssa Rosenzweig [Tue, 27 Aug 2019 20:15:12 +0000 (13:15 -0700)]
pan/midgard: Fix corner case in RA

It doesn't really matter but... meh.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add OP_IS_CSEL_V helper
Alyssa Rosenzweig [Tue, 27 Aug 2019 22:51:13 +0000 (15:51 -0700)]
pan/midgard: Add OP_IS_CSEL_V helper

..to distinguish from scalar csel.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Expose mir_get/set_swizzle
Alyssa Rosenzweig [Tue, 27 Aug 2019 22:50:55 +0000 (15:50 -0700)]
pan/midgard: Expose mir_get/set_swizzle

The scheduler would like to use these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Extract instruction sizing helper
Alyssa Rosenzweig [Mon, 26 Aug 2019 22:06:38 +0000 (15:06 -0700)]
pan/midgard: Extract instruction sizing helper

The scheduler shouldn't need to worry about this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Factor out mir_is_scalar
Alyssa Rosenzweig [Mon, 26 Aug 2019 21:49:49 +0000 (14:49 -0700)]
pan/midgard: Factor out mir_is_scalar

This helper doesn't need to be in the giant loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Count shader-db stats by bundled instructions
Alyssa Rosenzweig [Fri, 30 Aug 2019 20:08:16 +0000 (13:08 -0700)]
pan/midgard: Count shader-db stats by bundled instructions

This does not affect shaders in any way. Rather, it makes the shader-db
instruction count recorded in the compiler accurate with the in-order
scheduler, matching up with what we calculate from pandecode.

Though shaders are the same, instruction counts cannot be compared
across this commit for this reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agofreedreno/ir3: Link directly to Sethi-Ullman paper
Alyssa Rosenzweig [Tue, 27 Aug 2019 17:38:34 +0000 (10:38 -0700)]
freedreno/ir3: Link directly to Sethi-Ullman paper

Allow a direct link to the PDF itself from the authors themselves,
rather than a paywall splash page.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Rob Clark <robdclark@chromium.org>
5 years agoRevert "glx: Unset the direct_support bit for GLX_EXT_import_context"
Adam Jackson [Thu, 29 Aug 2019 16:15:22 +0000 (12:15 -0400)]
Revert "glx: Unset the direct_support bit for GLX_EXT_import_context"

The GLX extension strings are independent of any context, so abusing the
direct_support bit to control this extension's visibility is wrong.

This reverts commit 079d0717fc896bc8086b037d0ed22642274986c7.

Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
5 years agopanfrost: Add transient BOs to job batches
Boris Brezillon [Fri, 30 Aug 2019 13:38:56 +0000 (15:38 +0200)]
panfrost: Add transient BOs to job batches

Memory allocated through panfrost_allocate_transient() is likely to
come from the transient pool. Let's add the BO backing the allocated
memory region to the job batch so the kernel can retain this BO while
jobs are executed.

In practice that has never been a problem because the transient pool
is never shrinked, and even if it was, we still control the lifetime of
the job, so there's no reason for this BO to be freed before the GPU is
done executing the batch. But it still make sense to add the BO for
debugging purpose.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: protect access to shared bo cache and transient pool
Rohan Garg [Fri, 30 Aug 2019 16:00:13 +0000 (18:00 +0200)]
panfrost: protect access to shared bo cache and transient pool

Both the BO cache and the transient pool are shared across
context's. Protect access to these with mutexes.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agopanfrost: Jobs must be per context, not per screen
Rohan Garg [Fri, 30 Aug 2019 16:00:12 +0000 (18:00 +0200)]
panfrost: Jobs must be per context, not per screen

Jobs _must_ only be shared across the same context, having
the last_job tracked in a screen causes use-after-free issues
and memory corruptions.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agost/mesa: Allow zero as [level|layer]_override
Lepton Wu [Fri, 30 Aug 2019 17:30:53 +0000 (17:30 +0000)]
st/mesa: Allow zero as [level|layer]_override

This fix two dEQP tests for virgl:

dEQP-EGL.functional.image.create.gles2_cubemap_positive_x_rgba_texture
dEQP-EGL.functional.image.render_multiple_contexts.gles2_cubemap_positive_x_rgba8_texture

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agofreedreno/a3xx: fix sysmem <-> gmem tiles transfer
Khaled Emara [Sun, 25 Aug 2019 21:49:10 +0000 (23:49 +0200)]
freedreno/a3xx: fix sysmem <-> gmem tiles transfer

Tiling mode was missing from fd3_emit_gmem_restore_tex().
emit_gmem2mem_surf() used LINEAR exclusiveley.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/a3xx: fix texture tiling parameters
Khaled Emara [Sun, 25 Aug 2019 21:39:02 +0000 (23:39 +0200)]
freedreno/a3xx: fix texture tiling parameters

* Fix 2D/2DArray/3D tiling parameters:
  There is a bottom threshold for width and height.
* Renable tiling for Cubemap, after setting the right parameters.

Reviewed-by: Rob Clark <robdclark@gmail.com>