Connor Abbott [Sun, 8 Sep 2019 16:48:35 +0000 (18:48 +0200)]
lima/gpir: Disallow moves for schedule_first nodes
The entire point of schedule_first is that the node has to be scheduled
as soon as possible without any moves because it doesn't produce a
proper floating-point value, or its value changes depending on where you
read it. We were still introducing a move for preexp2 in some cases
though, even if it got scheduled as soon as possible, which broke some
exp() tests. Fix that.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Connor Abbott [Sat, 7 Sep 2019 14:40:14 +0000 (16:40 +0200)]
lima/gpir: Fix fake dep handling for schedule_first nodes
The whole point of schedule_first nodes is that they need to be
scheduled as soon as possible, so if a schedule_first node is the
successor in a fake dependency that prevents it from being scheduled
after its parent, that can cause problems. We need to add these fake
dependencies to the parent as well, and we need to guarantee that the
pre-RA scheduler puts schedule_first nodes right before their parents in
order to prevent this from adding cycles to the dependency graph.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Connor Abbott [Mon, 2 Sep 2019 20:31:00 +0000 (22:31 +0200)]
lima/gpir: Fix schedule_first insertion logic
The idea was to make sure schedule_first nodes were always first in the
ready list. I made sure they were inserted first, but not that other
nodes wouldn't later be scheduled ahead of them. Fixes
spec@glsl-1.10@execution@built-in-functions@vs-exp-float and probably
others.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Connor Abbott [Mon, 2 Sep 2019 07:48:54 +0000 (09:48 +0200)]
lima/gpir: Ignore unscheduled successors in can_use_complex()
The point of the function is to avoid creating a complex move which is
used by certain slots in the next instruction, but unscheduled
successors will never be in the next instruction. Found while debugging
a crash that the previous commit fixed.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Connor Abbott [Sun, 1 Sep 2019 17:33:06 +0000 (19:33 +0200)]
lima/gpir: Do all lowerings before rsched
The scheduler assumes that load nodes are always duplicated so that they
can always be scheduled eventually and therefore they never need to be
spilled. But some lowerings were running after the pre-RA scheduler,
whereas duplication has to happen before then since it's needed for the
scheduler to do a better job reducing register pressure. This meant
that lowerings were introducing multiple uses of a load instruction,
which broke the scheduler's expectation and resulted in infinite loops
in situations where the only nodes available to spill were load nodes.
Spilling load nodes would be silly, so we want to fix the lowerings
rather than the scheduler. Just do all lowerings before the pre-RA
scheduler, which also helps with reducing pressure since the scheduler
can more accurately compute the pressure.
Fixes lima/mesa#104.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Mauro Rossi [Sun, 8 Sep 2019 15:35:22 +0000 (17:35 +0200)]
android: anv: libmesa_vulkan_common: add libmesa_util static dependency
Change needed to fix the following building error:
In file included from external/mesa/src/intel/vulkan/anv_device.c:43:
external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found
^~~~~~~~~~~~~~~~~~~
1 error generated.
Fixes: 4dcb1ff ("anv: add support for driconf")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Boris Brezillon [Thu, 5 Sep 2019 19:41:33 +0000 (21:41 +0200)]
panfrost: Rename pan_bo_cache.c into pan_bo.c
So we can move all the BO logic into this file instead of having it
spread over pan_resource.c, pan_drm.c and pan_bo_cache.c.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Boris Brezillon [Thu, 5 Sep 2019 19:41:32 +0000 (21:41 +0200)]
panfrost: Get rid of the now unused SLAB allocator
The last users have been converted to use plain BOs. Let's get rid of
this abstraction. We can always consider adding it back if we need it
at some point.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Boris Brezillon [Thu, 5 Sep 2019 19:41:31 +0000 (21:41 +0200)]
panfrost: Get rid of unused panfrost_context fields
Some fields in panfrost_context are unused (probably leftovers from
previous refactor). Let's get rid of them.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Boris Brezillon [Thu, 5 Sep 2019 19:41:30 +0000 (21:41 +0200)]
panfrost: Convert ctx->{scratchpad, tiler_heap, tiler_dummy} to plain BOs
ctx->{scratchpad,tiler_heap,tiler_dummy} are allocated using
panfrost_drm_allocate_slab() but they never any of the SLAB-based
allocation logic. Let's convert those fields to plain BOs.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Boris Brezillon [Thu, 5 Sep 2019 19:41:29 +0000 (21:41 +0200)]
panfrost: Make transient allocation rely on the BO cache
Right now, the transient memory allocator implements its own BO caching
mechanism, which is not really needed since we already have a generic
BO cache. Let's simplify things a bit.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Boris Brezillon [Thu, 5 Sep 2019 19:41:28 +0000 (21:41 +0200)]
panfrost: Stop passing a ctx to functions being passed a batch
The context can be retrieved from batch->ctx.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Boris Brezillon [Thu, 5 Sep 2019 19:41:27 +0000 (21:41 +0200)]
panfrost: Pass a batch to panfrost_drm_submit_vs_fs_batch()
Given the function name it makes more sense to pass it a job batch
directly.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Boris Brezillon [Thu, 5 Sep 2019 19:41:26 +0000 (21:41 +0200)]
panfrost: s/job/batch/
What we currently call a job is actually a batch containing several jobs
all attached to a rendering operation targeting a specific FBO.
Let's rename structs, functions, variables and fields to reflect this
fact.
Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Heinrich Fink [Tue, 30 Jul 2019 13:58:20 +0000 (15:58 +0200)]
egl: Add GL_MESA_EGL_sync support
This commit follow OES_EGL_sync to universially enable use of EGL sync
objects with desktop OpenGL contexts.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Heinrich Fink [Tue, 30 Jul 2019 14:14:07 +0000 (16:14 +0200)]
headers: Add GL_MESA_EGL_sync token to GL
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Heinrich Fink [Tue, 30 Jul 2019 14:12:20 +0000 (16:12 +0200)]
registry: update gl.xml with GL_MESA_EGL_sync token
As added by upstream GL registry changes
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Heinrich Fink [Mon, 29 Jul 2019 14:47:20 +0000 (16:47 +0200)]
specs: Add GL_MESA_EGL_sync
Adds GL_MESA_EGL_sync as defined in upstream OpenGL registry
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tapani Pälli [Fri, 6 Sep 2019 05:05:02 +0000 (08:05 +0300)]
android: fix linking issues with liblog
Fixes Android build errors observed in Intel CI.
Fixes: f9f7cbc1aa3 "util: android logging support"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Kenneth Graunke [Thu, 5 Sep 2019 08:52:17 +0000 (01:52 -0700)]
iris: Support the disable_throttling=true driconf option.
Jason Ekstrand [Fri, 30 Aug 2019 16:35:26 +0000 (11:35 -0500)]
nir/dead_cf: Repair SSA if the pass makes progress
The dead_cf pass calls into the CF manipulation helpers which attempt to
keep NIR's SSA form sane. However, when the only break is removed from
a loop, dominance gets messed up anyway because the CF SSA clean-up code
only looks at phis and doesn't consider the case of code becoming
unreachable. One solution to this would be to put the loop into LCSSA
form before we modify any of its contents. Another (and the approach
taken by this pass) is to just run the repair_ssa pass afterwards
because the CF manipulation helpers are smart enough to keep all the
use/def stuff sane; they just don't always preserve dominance
properties.
While we're here, we clean up some bogus indentation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Fri, 30 Aug 2019 19:16:07 +0000 (14:16 -0500)]
nir/repair_ssa: Insert deref casts when needed
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Mon, 2 Sep 2019 17:54:31 +0000 (12:54 -0500)]
nir/repair_ssa: Repair dominance for unreachable blocks
NIR currently assumes that unreachable blocks are trivially dominated by
everything. However, when considering well-formed SSA, there is no path
from any block to an unreachable block. Therefore, we can break any
use-def chains where the use is in an unreachable block. This removes
any dependencies on code created by uses in unreachable blocks and lets
DCE do a better job of cleaning it up.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Mon, 2 Sep 2019 17:53:16 +0000 (12:53 -0500)]
nir: Add a block_is_unreachable helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Fri, 30 Aug 2019 18:55:02 +0000 (13:55 -0500)]
nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Fri, 30 Aug 2019 18:21:00 +0000 (13:21 -0500)]
nir: Handle complex derefs in nir_split_array_vars
We already bail and don't split the vars but we were passing a NULL to
_mesa_hash_table_search which is not allowed.
Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Jason Ekstrand [Sat, 3 Feb 2018 17:12:15 +0000 (09:12 -0800)]
intel/blorp: Use wide formats for nicely aligned stencil clears
In the case where the stencil clear is nicely aligned, we can clear
stencil much more efficiently by mapping it as a wide format (say
RGBA32_UINT) and blasting out the stencil clear value with a repclear.
On Unigine Heaven, this makes one stencil clear go from non-trivial to
unnoticeable when looking at per-draw timings.
In order for this change to work properly, ANV needs to do a bit more
flushing around depth and stencil clears. i965 and iris already have
the cache tracking logic to handle this so no changes are required
there.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sun, 1 Sep 2019 14:07:38 +0000 (09:07 -0500)]
intel/blorp: Expose surf_fake_interleaved_msaa internally
Jason Ekstrand [Sat, 3 Feb 2018 19:46:04 +0000 (11:46 -0800)]
intel/blorp: Expose surf_retile_w_to_y internally
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sat, 31 Aug 2019 04:57:52 +0000 (23:57 -0500)]
blorp: Memset surface info to zero when initializing it
This isn't known to fix any current bugs but it does prevent a
regression in a subsequent commit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sat, 31 Aug 2019 19:11:49 +0000 (14:11 -0500)]
intel/tools: Decode PS kernels on SNB
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sat, 31 Aug 2019 19:02:15 +0000 (14:02 -0500)]
intel/tools: Decode 3DSTATE_BINDING_TABLE_POINTERS on SNB
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Rhys Perry [Fri, 6 Sep 2019 20:38:57 +0000 (21:38 +0100)]
nir/lower_io_to_vector: don't merge compact varyings
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 02bc4aabb48 ('nir/lower_io_to_vector: allow FS outputs to be vectorized')
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Engestrom [Thu, 29 Aug 2019 23:23:01 +0000 (00:23 +0100)]
drirc: override minImageCount=2 for gfxbench
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110765
Fixes: 4689e98fe884d9412b72 ("vulkan/wsi: Set X11 minImageCount to 3.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Thu, 29 Aug 2019 22:52:52 +0000 (23:52 +0100)]
radv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Fri, 30 Aug 2019 16:03:12 +0000 (17:03 +0100)]
amd: move adaptive sync to performance section, as it is defined in xmlpool
Fixes: 3844ed8d44677588bc29 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2ab2483e5a55 ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Thu, 29 Aug 2019 22:55:29 +0000 (23:55 +0100)]
anv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Thu, 29 Aug 2019 22:49:29 +0000 (23:49 +0100)]
wsi: add minImageCount override
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Wed, 24 Apr 2019 15:42:25 +0000 (16:42 +0100)]
anv: add support for driconf
No option is supported yet, this is just the boilerplate.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Tue, 3 Sep 2019 21:40:32 +0000 (22:40 +0100)]
gallivm: drop LLVM<3.3 code paths as no build system allows that
Suggested-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Eric Engestrom [Tue, 27 Aug 2019 23:58:18 +0000 (00:58 +0100)]
meson/scons/android: drop now-unused HAVE_LLVM
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:36:37 +0000 (00:36 +0100)]
llvmpipe: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:36:45 +0000 (00:36 +0100)]
clover: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:36:25 +0000 (00:36 +0100)]
gallivm: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:07:00 +0000 (00:07 +0100)]
clover: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:06:21 +0000 (00:06 +0100)]
gallivm: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:06:45 +0000 (00:06 +0100)]
swr: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:06:03 +0000 (00:06 +0100)]
amd: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 22:59:14 +0000 (23:59 +0100)]
svga: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 22:58:57 +0000 (23:58 +0100)]
r600: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 22:56:55 +0000 (23:56 +0100)]
aux/draw: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 23:56:24 +0000 (00:56 +0100)]
meson/scons/android: add LLVM_AVAILABLE binary flag
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Eric Engestrom [Tue, 27 Aug 2019 22:54:52 +0000 (23:54 +0100)]
gallivm: replace `0x` version print with actual version string
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Jordan Justen [Wed, 13 Dec 2017 04:24:57 +0000 (20:24 -0800)]
anv,iris: L3ALLOC register replaces L3CNTLREG for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Anuj Phogat [Sat, 5 Jan 2019 00:04:07 +0000 (16:04 -0800)]
intel/gen12: Add L3 configurations
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rhys Perry [Thu, 5 Sep 2019 19:51:30 +0000 (20:51 +0100)]
util: include u_endian.h in u_math.h
u_endian.h needs to be included, otherwise PIPE_ARCH_BIG_ENDIAN might not
be defined on big-endian architectures and the endian conversion macros
will be incorrect.
I don't think anything is broken because of this, I just noticed this when
looking at the file.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Tue, 3 Sep 2019 15:00:23 +0000 (10:00 -0500)]
anv: Bump maxComputeWorkgroupSize
Fixes: 9a129510f56f "anv: Bump maxComputeWorkgroupInvocations"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Kenneth Graunke [Sat, 31 Aug 2019 00:00:22 +0000 (17:00 -0700)]
intel: Stop redirecting state cache to command streamer cache section
This bit redirects the state cache from the unified/RO sections of the
L3 cache to the "CS command buffer" section of the cache, which would
be set up via TCCNTLREG. The documentation says:
"Additionaly, this redirection should be enabled only if there is a
non-zero allocation for the CS command buffer section."
We don't allocate any cache to the CS command buffer section, so
enabling this redirection effectively disabled the state cache.
The Windows driver only sets up that section when using POSH, which
we do not currently use. So, leave it unallocated and disable the
redirection to get a functional state cache again.
Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%,
and Car Chase by 2%.
Kenneth Graunke [Tue, 3 Sep 2019 22:34:54 +0000 (15:34 -0700)]
iris: Invalidate state/texture/constant caches after STATE_BASE_ADDRESS
Jason pointed out that the caches likely refer to offsets from dynamic
and surface state base addresses, so when we change those, we need to
invalidate the caches.
Comment borrowed from src/intel/vulkan/genX_cmd_buffer.c.
Kristian H. Kristensen [Thu, 5 Sep 2019 22:12:23 +0000 (15:12 -0700)]
freedreno/a6xx: Implement primitive count queries on GPU
The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or
PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or
tessellation, since these stages add primitives at runtime. Use the
WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and
implement a hw query for this.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Thu, 5 Sep 2019 22:07:55 +0000 (15:07 -0700)]
freedreno/a6xx: Let the GPU track streamout offsets
The GPU writes out streamout offsets as it goes to the FLUSH_BASE
pointer. We use that value with CP_MEM_TO_REG when appending to the
stream so that we don't have to track the offsets with the CPU in the
driver. This ensures that streamout continues to work once we enable
geometry and tessellation shader stages that add geometry.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Roland Scheidegger [Fri, 6 Sep 2019 02:12:06 +0000 (04:12 +0200)]
llvmpipe: fix CALLOC vs. free mismatches
Should fix some issues we're seeing. And use REALLOC instead of realloc.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Samuel Pitoiset [Thu, 5 Sep 2019 10:19:22 +0000 (12:19 +0200)]
radv/gfx10: determine the number of vertices per primitive for TES
This doesn't fix anything known but it's correct now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Rhys Perry [Fri, 17 May 2019 14:04:39 +0000 (15:04 +0100)]
nir/lower_io_to_vector: add flat mode
This has lower_io_to_vector try to turn variables into arrays of 4-sized
vectors when possible and fall back to the old approach when that isn't
possible.
This is so that lower_io_to_vector can guarantee that only one variable is
used for each fragment shader output.
v2: handle dual-source blending
v3: don't try to merge structs and non-32-bit types in get_flat_type()
v3: fix per-vertex inputs
v3: fix and cleanup location advancement in get_flat_type() and it's
calling code
v4: prioritize the original mode over the flat mode
v4: don't create flat variables to merge only one variable
v5: don't skip an entire slot when encountering structs in the old mode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Rhys Perry [Fri, 17 May 2019 10:53:32 +0000 (11:53 +0100)]
nir/lower_io_to_vector: allow FS outputs to be vectorized
v2: handle dual-source blending
v3: use a higher MAX_SLOTS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Samuel Pitoiset [Fri, 6 Sep 2019 08:34:35 +0000 (10:34 +0200)]
radv/gfx10: make use the output usage mask when exporting NGG GS params
It shouldn't matter much because output varyings should have been
compacted during NIR shader linking but it mirrors what the driver
does when emitting NGG GS vertex parameters.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 6 Sep 2019 08:32:13 +0000 (10:32 +0200)]
radv/gfx10: account for the subpass view for the NGG GS storage
If the fragment shader needs the layer index, we have to allocate
one more dword in the NGG GS storage. Found by inspection. This
doesn't fix anything known.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tomeu Vizoso [Fri, 6 Sep 2019 14:17:26 +0000 (16:17 +0200)]
panfrost/ci: Increase timeouts
Sometimes LAVA jobs will timeout due to transient issues, and the Gitlab
job will fail in that case. Increase the timeouts to reduce the
likeliness of that happening and reduce false positives.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tomeu Vizoso [Fri, 6 Sep 2019 13:56:01 +0000 (15:56 +0200)]
panfrost/ci: Use special runner for LAVA jobs
So repositories don't need to be specially configured with a token to
access LAVA, store this token in a bind volume for a special runner.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tomeu Vizoso [Mon, 2 Sep 2019 06:33:11 +0000 (08:33 +0200)]
panfrost/ci: Re-add support for armhf
Now that Volt supports armhf, build again images and submit to LAVA for
RK3288.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Samuel Pitoiset [Tue, 3 Sep 2019 16:20:07 +0000 (18:20 +0200)]
radv: calculate esgs_itemsize in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 16:16:33 +0000 (18:16 +0200)]
radv: calculate the GSVS vertex size in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 16:12:33 +0000 (18:12 +0200)]
radv: gather primitive ID in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 16:09:00 +0000 (18:09 +0200)]
radv: gather layer in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 16:05:25 +0000 (18:05 +0200)]
radv: gather viewport in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 16:04:43 +0000 (18:04 +0200)]
radv: gather pointsize in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 15:55:02 +0000 (17:55 +0200)]
radv: gather clip/cull distances in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 15:48:07 +0000 (17:48 +0200)]
radv: move ac_fill_shader_info() to radv_nir_shader_info_pass()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 3 Sep 2019 15:39:23 +0000 (17:39 +0200)]
radv: merge radv_shader_variant_info into radv_shader_info
Having two different structs is useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Zhu, James [Wed, 4 Sep 2019 17:59:39 +0000 (17:59 +0000)]
radeon: Fix mjpeg issue for ARCTURUS
ARCTURUS mjpeg is using direct register access.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Leo Liu [Wed, 4 Sep 2019 17:27:02 +0000 (13:27 -0400)]
radeon/vcn: add RENOIR VCN decode support
It has same VCN2.x block as Navi1x
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Danylo Piliaiev [Thu, 22 Aug 2019 10:32:50 +0000 (13:32 +0300)]
glsl: Fix unroll of do{} while(false) like loops
For loops which condition is false on the first iteration
iteration count was falsely calculated under the assumption
that loop's condition is true until it becomes false, meaning
it's true at least one time.
Now such loops are reported as having 0 iteration.
Similar to the fix
e71fc7f2 done in NIR.
Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Timur Kristóf [Wed, 4 Sep 2019 13:56:09 +0000 (16:56 +0300)]
tgsi_to_nir: Remove dependency on libglsl.
This commit removes the GLSL dependency in TTN by manually recording
the textures used and calling nir_lower_samplers
instead of its GL counterpart.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Timur Kristóf [Wed, 28 Aug 2019 20:34:14 +0000 (22:34 +0200)]
nir: Carve out nir_lower_samplers from GLSL code.
Lowering samplers is needed to produce NIR that can actually be
consumed by some gallium drivers, so it doesn't make sense to
to keep it only in the GLSL code.
This commit introduces nir_lower_samplers to compiler/nir,
while maintains the GL-specific function too.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Gert Wollny [Tue, 3 Sep 2019 17:24:09 +0000 (19:24 +0200)]
radeonsi: Release storage for smda_uploads when the context is destroyed
This fixes a memory leak in the flush code:
Direct leak of 128 byte(s) in 1 object(s) allocated from:
#0 in __interceptor_realloc .../gcc-8.3.0/libsanitizer/asan/asan_malloc_linux.cc:105
#1 in si_buffer_do_flush_region src/gallium/drivers/radeonsi/si_buffer.c:573
#2 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:608
#3 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:597
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Mauro Rossi [Sun, 14 Jul 2019 08:53:19 +0000 (10:53 +0200)]
android: mesa: revert "Enable asm unconditionally"
This patch partially reverts
20294dc ("mesa: Enable asm unconditionally, ...")
Android makefile build logic needs to disable assembler optimization
in 32bit builds to avoid text relocations for libglapi.so shared
Fixes the following build error with Android x86 32bit target:
[ 0% 4/477] target SharedLib: libglapi (out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so)
FAILED: out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so
...
prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: warning: shared library text segment is not shareable
prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: error: treating warnings as errors
clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)
Fixes: 20294dc ("mesa: Enable asm unconditionally, now that gen_matypes is gone.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Samuel Pitoiset [Tue, 27 Aug 2019 07:01:02 +0000 (09:01 +0200)]
radv/gfx10: always set ballot_mask_bits to 64
The codegen handles it and it adds the correct casts. This fixes
a bunch of LLVM validation errors when enabling Wave32 for compute.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Caio Marcelo de Oliveira Filho [Wed, 28 Aug 2019 01:32:07 +0000 (18:32 -0700)]
nir/lower_explicit_io: Handle 1 bit loads and stores
Load a 32-bit value then convert to 1-bit. Convert 1-bit to 32-bit
value, then Store it.
These cases started to appear when we changed Anvil to use derefs for
shared memory.
v2: Use `bit_size` in a couple of places we were missing. (Jason)
Reassign `value` instead of `src[0]`. (Jason)
Fixes: 024a46a4079 ("anv: use derefs for shared memory access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Mon, 2 Sep 2019 03:12:07 +0000 (22:12 -0500)]
Revert "intel/fs: Move the scalar-region conversion to the generator."
This reverts commit
c0504569eac5e5c305e9f0c240e248aca9d8891f. Now that
we're doing interpolation lowering in NIR, we can continue to stride the
FS input registers directly in the brw_fs_nir code like we did before.
This fixes SIMD32 fragment shaders which broke because lower_simd_width
depended on the 0 stride to split PLN instructions correctly.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Jason Ekstrand [Mon, 2 Sep 2019 02:57:05 +0000 (21:57 -0500)]
intel/fs: Fix FB write inst groups
This commit does two things. First, it simplifies the way we compute
the FB write group bit. There's no reason to use a ternary because
inst->group / 16 can only be 0 or 1. Second, it fixes an order-of-
operations bug where the ternary wasn't selecting between (1 << 11) and
0 but between (1 << 11) and 0 | brw_dp_write_desc(...).
Fixes: 0d9648416 "intel/compiler: Use generic SEND for Gen7+ FB writes"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vasily Khoruzhick [Thu, 29 Aug 2019 06:09:38 +0000 (23:09 -0700)]
lima/ppir: don't lower phis to scalar
Utgard PP is vec4 architecture, so lowering phis to scalars
increases instruction count and potentially interferes with
spilling.
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Jonathan Marek [Thu, 5 Sep 2019 02:34:23 +0000 (22:34 -0400)]
freedreno/a2xx: formats update
For render formats, update fd2_pipe2color to only work with HW supported
render formats, and remove the format whitelist is_format_supported. This
patch enables float render formats (which work).
For vertex/texture formats, use a generic function which translates using
the bitsize of the channels. Since we fake support for some vertex formats,
check for these in is_format_supported to avoid enabling them as sampler
formats.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Jonathan Marek [Wed, 4 Sep 2019 19:23:27 +0000 (15:23 -0400)]
freedreno/a2xx: fix depth gmem restore
Use fd_gmem_restore_format() to avoid trying to use unsupported Z24S8/Z16
render formats for gmem restore.
Also apply this change to gmem2mem so it doesn't depend on fd2_pipe2color
working with depth formats.
gmem2mem/mem2gmem also doesn't need to use the swap/swizzle, since dst/src
formats are the same.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Jonathan Marek [Thu, 5 Sep 2019 21:21:54 +0000 (17:21 -0400)]
freedreno/a2xx: implement polygon offset
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.polygon_offset.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Thu, 5 Sep 2019 02:36:00 +0000 (22:36 -0400)]
freedreno/a2xx: fix SRC_ALPHA_SATURATE for alpha blend function
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.fragment_ops.*src_alpha_saturate*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Thu, 5 Sep 2019 15:25:07 +0000 (11:25 -0400)]
freedreno/a2xx: ir2: update register state in scalar insert
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Jonathan Marek [Thu, 5 Sep 2019 15:23:53 +0000 (11:23 -0400)]
freedreno/a2xx: ir2: fix incorrect instruction reordering
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Jonathan Marek [Thu, 5 Sep 2019 15:21:16 +0000 (11:21 -0400)]
freedreno/a2xx: ir2: check opcode on the right instruction in export cp
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Thu, 5 Sep 2019 15:19:21 +0000 (11:19 -0400)]
freedreno/a2xx: ir2: fix saturate in cp
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jonathan Marek [Thu, 5 Sep 2019 15:18:45 +0000 (11:18 -0400)]
freedreno/a2xx: ir2: set lower_fdph
The fdph opcode is not supported.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>