Caio Marcelo de Oliveira Filho [Tue, 28 Apr 2020 21:03:47 +0000 (14:03 -0700)]
iris: Implement ARB_compute_variable_group_size
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4794>
Caio Marcelo de Oliveira Filho [Tue, 28 Apr 2020 20:28:02 +0000 (13:28 -0700)]
intel: Let drivers call brw_nir_lower_cs_intrinsics()
The motivating factor is: this lowering may cause
nir_intrinsic_load_local_group_size intrinsics to be added to the
shader, and by moving this around we make possible for the drivers to
lower that intrinsic by themselves.
Iris will do just that in a later patch for implementing variable
group size.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4794>
Caio Marcelo de Oliveira Filho [Tue, 28 Apr 2020 20:09:27 +0000 (13:09 -0700)]
intel/fs: Add and use a new load_simd_width_intel intrinsic
Intrinsic to get the SIMD width, which not always the same as subgroup
size. Starting with a small scope (Intel), but we can rename it later
to generalize if this turns out useful for other drivers.
Change brw_nir_lower_cs_intrinsics() to use this intrinsic instead of
a width will be passed as argument. The pass also used to optimized
load_subgroup_id for the case that the workgroup fitted into a single
thread (it will be constant zero). This optimization moved together
with lowering of the SIMD.
This is a preparation for letting the drivers call it before the
brw_compile_cs() step.
No shader-db changes in BDW, SKL, ICL and TGL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4794>
Caio Marcelo de Oliveira Filho [Tue, 28 Apr 2020 16:47:45 +0000 (09:47 -0700)]
intel/fs: Add an option to lower variable group size in backend
Adding this since Iris will handle variable group size parameters by
itself.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4794>
Caio Marcelo de Oliveira Filho [Wed, 29 Apr 2020 04:04:04 +0000 (21:04 -0700)]
intel/fs: Clean up variable group size handling in backend
Just use the information from NIR shader_info.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4794>
Kenneth Graunke [Mon, 5 Aug 2019 20:18:39 +0000 (13:18 -0700)]
iris: Implement PIPE_FLUSH_DEFERRED support.
(Co-authored with Chris Wilson.)
Frequently, games create fences and later check them with a timeout of
0 to see if that work has completed yet. They do not want the work to
be flushed immediately upon fence creation.
This is what PIPE_FLUSH_DEFERRED does - it inhibits the flush at fence
creation time, but still guarantees that a flush will occur later on
once fence_finish() is called.
Since syncpts can only occur at batch boundaries, when deferring a
flush, we have to wait for the syncpt at the end of the batch being
constructed. This is later than desired, but safe if blocking. To
avoid extra delays, we additionally insert a PIPE_CONTROL to write an
availability bit at the exact point of the fence. We can poll this
on the CPU, allowing us to check whether the fence has gone by, even
if the batch hasn't completed. It can also let us skip kernel calls.
Improves performance in Bioshock Infinite by 10% on Icelake GT2 on
-ForceCompatLevel=5 settings. Thanks to Felix Degrood and Mark Janes
for helping notice the extraneous stalls and batches, Marek Olšák for
adding deferred flush support to Gallium to solve this issue, and
Chris Wilson for reworking a lot of the internals of this work.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Wed, 29 Apr 2020 20:53:50 +0000 (13:53 -0700)]
iris: Detect DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT kernel support
We will use this for implementing deferred flushes in the next commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Wed, 29 Apr 2020 20:47:57 +0000 (13:47 -0700)]
intel: Move anv_gem_supports_syncobj_wait to common code.
This will let me use this in iris.
We leave the existing anv function for anv_gem_stubs.c faking, but
move the contents to a helper in a new src/intel/common/gen_gem.c file.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Sat, 15 Feb 2020 00:40:07 +0000 (16:40 -0800)]
iris: Flush any current work in iris_fence_await before adding deps
Receiving a fence_server_sync (iris_fence_await) means that any future
work needs to wait for the fence. But previous work doesn't need to.
So flush it now, to avoid delaying it arbitrarily.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Chris Wilson [Wed, 19 Feb 2020 23:45:59 +0000 (23:45 +0000)]
iris: Store a seqno for each batch in the fence
In the next patch, we will introduce deferred fences where we will need
to flush a fence later. To do this, we need to know which batch requires
flushing, so keep a 1:1 mapping between seqno[] and the associated
batch.
It's also substantially less confusing to have a 1:1 mapping.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Chris Wilson [Wed, 19 Feb 2020 21:24:40 +0000 (21:24 +0000)]
iris: Convert fences to using lightweight seqno
By using the breadcrumbs we inject into the batch, we can build a
lightweight fence - that can be evaluated in userspace without having to
check in the kernel. In order to pass the fences between processes, and
to wait efficiently, we continue to track the syncobj for each batch and
use that as a terminator for the fence, and for passing coarse
scheduling decisions to the kernel on execbuf.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Chris Wilson [Tue, 18 Feb 2020 22:22:51 +0000 (22:22 +0000)]
iris: Place a seqno at the end of every batch
We can use seqno as a basic for fast userspace fences: where we can
check a value directly to test for fence completion without having to
query using the kernel. To do so we need to write a breadcrumb from the
batch and track those writes as the basis for our lightweight fences.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Fri, 1 May 2020 17:57:15 +0000 (10:57 -0700)]
iris: Destroy transfer slab after batches
Batches are going to have an uploader in the next commit, so destroying
batches will destroy uploaders, which will unmap transfers, which will
return things to the slab allocator. So we need to reorder destroying
the slab allocator to the end to avoid crashing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Fri, 1 May 2020 17:17:15 +0000 (10:17 -0700)]
iris: Give up on not passing ice to iris_init_batch
We're going to need it to create a uploader in the batch soon. We still
avoid storing it, to maintain the charade of separation, and make people
think twice about fetching random fields from there and intertwining
things even worse.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Wed, 29 Apr 2020 06:24:38 +0000 (23:24 -0700)]
iris: Rename iris_syncpt to iris_syncobj for clarity.
This is just a refcounted wrapper around a drm_syncobj. There is
enough terminology going on in the area of synchronization (sync
objects, sync files, ...) that I'd rather not invent our own.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Wed, 29 Apr 2020 22:16:04 +0000 (15:16 -0700)]
anv: Include linux/sync_file.h instead of cut and pasting contents
Linux 4.7 has been out for a long time, this is probably safe to
depend on at this point, rather than cut and pasting the contents.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Kenneth Graunke [Wed, 29 Apr 2020 06:30:02 +0000 (23:30 -0700)]
iris: Include linux/sync_file.h instead of cut and pasting contents
Lets us drop some cut and pasted kernel header contents.
Linux 4.7 came out 4 years before we the first officially supported
release of this driver; iris won't run on kernels older than 4.16,
and 4.18.11+ is strongly recommended. So I suspect it's safe to
assume that a kernel header from 4.7 will exist at build time.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Alyssa Rosenzweig [Fri, 1 May 2020 18:04:05 +0000 (14:04 -0400)]
panfrost: Update dEQP expectation list
These tests were recently fixed.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4852>
Alyssa Rosenzweig [Thu, 30 Apr 2020 00:27:16 +0000 (20:27 -0400)]
pan/mdg: Enable nir_opt_algebraic_distribute_src_mods
Helps cleanup some issues otherwise missed by the new source mod
handling. (Noticed a double negative)
total instructions in shared programs: 3606 -> 3605 (-0.03%)
instructions in affected programs: 41 -> 40 (-2.44%)
helped: 1
HURT: 0
total bundles in shared programs: 1883 -> 1883 (0.00%)
bundles in affected programs: 0 -> 0
helped: 0
HURT: 0
total quadwords in shared programs: 3296 -> 3324 (0.85%)
quadwords in affected programs: 596 -> 624 (4.70%)
helped: 0
HURT: 2
total registers in shared programs: 337 -> 336 (-0.30%)
registers in affected programs: 6 -> 5 (-16.67%)
helped: 1
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4852>
Alyssa Rosenzweig [Thu, 30 Apr 2020 17:13:24 +0000 (13:13 -0400)]
pan/mdg: Drop `opt` in name of midgard_opt_cull_dead_branch
It's necessary for conformance - not an optimization.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4852>
Alyssa Rosenzweig [Thu, 30 Apr 2020 17:11:52 +0000 (13:11 -0400)]
pan/mdg: Drop forever todo
Not much to be done.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4852>
Alyssa Rosenzweig [Thu, 30 Apr 2020 17:06:56 +0000 (13:06 -0400)]
pan/mdg: Move constant switch opts to algebraic pass
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4852>
Alyssa Rosenzweig [Thu, 30 Apr 2020 18:33:11 +0000 (14:33 -0400)]
pan/mdg: Rename .one to .sat_signed
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4852>
Alyssa Rosenzweig [Thu, 30 Apr 2020 19:41:10 +0000 (15:41 -0400)]
pan/mdg: Ingest actual isub ops
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4852>
Jose Fonseca [Fri, 1 May 2020 18:12:38 +0000 (19:12 +0100)]
glthread: Add GLAPIENTRY to _mesa_marshal_MultiDrawArrays.
Fixes MSVC build. Trivial.
Fixes: 2840bc3065b9e991b2c5880a2ee02e2458a758c4
Caio Marcelo de Oliveira Filho [Thu, 30 Apr 2020 22:01:27 +0000 (15:01 -0700)]
intel/dev: Bail when INTEL_DEVID_OVERRIDE is not valid
Avoids surprises where you set an OVERRIDE but it gets ignored and the
system PCI ID is used.
Also fixes the bug that the error of invalid platform name being
printed too early, even when the passed platform was a PCI ID (which
is also supported).
For the case where euid != uid, a warning was added but the behavior
wasn't changed: it is still going to fallback to system PCI ID.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4841>
D Scott Phillips [Thu, 30 Apr 2020 23:12:07 +0000 (23:12 +0000)]
anv,iris: Fix input vertex max for tcs on gen12
gen12 does away with the single patch dispatch mode for tcs, and
increases some limits so that 8_patch mode can always work. Make the
necessary changes so we don't try to fall back to single patch mode.
Fixes KHR-GL46.tessellation_shader.single.max_patch_vertices and others
Fixes: 44754279ace7 ("intel/fs/gen12: Use TCS 8_PATCH mode.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4843>
Eric Anholt [Wed, 15 Apr 2020 19:07:16 +0000 (12:07 -0700)]
freedreno/ir3: Set the FS .msaa flag to true during precompiles.
If you're going out of your way to do per-sample interpolation, you are
almost surely going to be doing so to an MSAA framebuffer. Should reduce
recompiles with MSAA enabled.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Tue, 14 Apr 2020 22:40:43 +0000 (15:40 -0700)]
freedreno: Immediately compile a default variant of shaders.
Now that we normalize our keys fairly well, build a variant at shader
state creation time so that hopefully you don't have to call the compiler
at draw time (as is now the case with glmark2 ES and most of the humus GL
demos).
Fixes: #2782
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Wed, 22 Apr 2020 19:22:30 +0000 (12:22 -0700)]
freedreno/ir3: Set up outputs for multi-slot varyings.
Necessary to avoid compiler assertion failures in:
dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Wed, 22 Apr 2020 19:20:19 +0000 (12:20 -0700)]
freedreno/ir3: Stop initializing regid of so->outputs during setup.
It's unused and overwritten by ir3_compile_shader_nir().
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Tue, 14 Apr 2020 23:34:00 +0000 (16:34 -0700)]
freedreno/ir3: Improve shader key normalization.
We can remove a bunch of conditional code at key comparison time by
computing a bitmask of used key bits at ir3_shader creation time. This
also gives us a nice place to put additional key simplification to reduce
how many variants we create (like skipping rastflat if we don't read
colors in the FS, or skipping vclamp_color if we don't write colors).
It does mean walking the whole key to AND it, but the key is just 28 bytes
so far so that seems pretty fine.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Tue, 14 Apr 2020 22:31:20 +0000 (15:31 -0700)]
freedreno: Emit debug messages when doing draw-time recompiles of shaders.
Right now that's "always" unless you have shaderdb set.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Wed, 15 Apr 2020 18:34:16 +0000 (11:34 -0700)]
freedreno/ir3: Remove unused half precision shader key flag.
The code using it was removed in
4af86bd0b933 ("freedreno/ir3: remove
half-precision output")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Wed, 15 Apr 2020 21:21:07 +0000 (14:21 -0700)]
freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled.
We weren't filling in the tess mode of the key, or setting has_gs on GS
shaders, resulting in assertion failures when NIR intrinsics didn't get
lowered.
We have to make a guess at prim mode for TCS, but it should be better to
have some shader-db coverage than none, and it will avoid these failures
happening when we start precompiling shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Tue, 21 Apr 2020 22:30:49 +0000 (15:30 -0700)]
freedreno/ir3: Skip tess epilogue if the program is missing stores.
Some of the negative API tests make shaders for tess stages that don't do
all the stores they need to. Once we start precompiling (or doing
shader-db of tess), we need to at least not segfault when generating them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Wed, 15 Apr 2020 18:17:10 +0000 (11:17 -0700)]
freedreno: Stop doing binning shaders other than the VS in shader-db.
ir3_cache.c only ever asks for binning variants for VS.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Tue, 21 Apr 2020 20:26:14 +0000 (13:26 -0700)]
freedreno/ir3: Fix register allocation assertion failures.
We were failing to tell the allocator about the restriction that scalar
texture instructions (allocated as scalar regs) couldn't be allocated such
that the start of the full unwritemasked vector started before r0. There
was a patch in select_reg_callback on a6xx that tried to work around that,
but you could still end up backed into a corner you shouldn't be because
we didn't tell the RA what it needed.
Fixes compiler assertion failures on a300-a400's blit_z shader, used for
Z32F gmem blits.
Looks like as a result we get tighter register allocation but more nops:
instructions in affected programs: 757945 -> 760356 (0.32%)
nops in affected programs: 317983 -> 320468 (0.78%)
non-nops in affected programs: 27525 -> 27451 (-0.27%)
mov in affected programs: 3098 -> 3023 (-2.42%)
dwords in affected programs: 109664 -> 110656 (0.90%)
last-baryf in affected programs: 112701 -> 112847 (0.13%)
full in affected programs: 4326 -> 4011 (-7.28%)
sstall in affected programs: 120550 -> 120836 (0.24%)
(ss) in affected programs: 13939 -> 13918 (-0.15%)
(sy) in affected programs: 3006 -> 2786 (-7.32%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Kristian H. Kristensen [Tue, 28 Apr 2020 21:29:55 +0000 (14:29 -0700)]
freedreno/ir3: Drop hack to clean up split vars
When the GS lowering was working on store_output intrinsics, we had to
clean up the split vars to avoid getting confused. Now that we shadow
the output vars instead, there's no confusion and we can drop this
hack.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Kristian H. Kristensen [Tue, 28 Apr 2020 19:52:42 +0000 (12:52 -0700)]
freedreno/ir3: Lower GS builtins before lowering IO
We mostly got away with replacing a store_output with a store_var, but
for complex types like structs, that doesn't work. Once the IO has
been lowered from vars to intrinsic, we've lost the deref chains and
can't properly shadow the outputs.
This commits moves the GS lowering up so we do it before the output
variables get lowered to store_output. This way the pass works much
like nir_lower_io_to_temporaries() and cleanly shadows the outputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Kristian H. Kristensen [Tue, 28 Apr 2020 19:34:15 +0000 (12:34 -0700)]
freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass
This pass lowers per-vertex input intrinsics to load_shared_ir3. This
was open coded in the TCS and GS lowering passes before - this way we
can share it. Furthermore, we'll need to run the rest of the GS
lowering earlier (before lowering IO) so we need to split off this
part that operates on the IO intrinsics first.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Kristian H. Kristensen [Tue, 28 Apr 2020 19:29:46 +0000 (12:29 -0700)]
freedreno/ir3: Rename ir3_nir_lower_to_explicit_io
We rename it to ir3_nir_lower_to_explicit_output, since it only
handles output and we'll add a lowering pass for input next.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Kristian H. Kristensen [Fri, 1 May 2020 06:24:27 +0000 (23:24 -0700)]
freedreno/ir3: Pass stream output info to ir3_shader_from_nir
We need shader->stream_output filled out when we layout the push
constants in ir3_setup_const_state(). Otherwise
const_state->offsets.tfbo ends up as ~0, which doesn't work.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Fri, 1 May 2020 00:31:02 +0000 (17:31 -0700)]
freedreno/ir3: Fix the a3xx TF outputs stores.
We were trying to deref the vector-collected outputs[] array before it's
been set up, but we want the per-component outputs anyway.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
Eric Anholt [Fri, 1 May 2020 00:30:02 +0000 (17:30 -0700)]
freedreno/ir3: Set up the block predecessors for a3xx TF
Fixes a segfault in ir3_legalize.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
D Scott Phillips [Thu, 30 Apr 2020 17:38:33 +0000 (17:38 +0000)]
intel/fs: Update location of Render Target Array Index for gen12
Render Target Array Index has moved from R0.0[26:16] to
R1.1[26:16] on gen12.
Fixes dEQP-VK.multiview.input_attachments.*
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4836>
Tomeu Vizoso [Fri, 1 May 2020 09:59:13 +0000 (11:59 +0200)]
pan/decode: Properly print tripped zeroes
The "%" got lost.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Fixes: 6148d1be4bb5 ("panfrost: Fix size of bifrost sampler descriptor")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Fri, 1 May 2020 09:37:56 +0000 (11:37 +0200)]
panfrost: Add Bifrost texture trampoline BO to batch
Fixes: d3eb23adb50c ("panfrost: Emit sampler descriptor on bifrost")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:06:37 +0000 (17:06 -0400)]
pan/bi: Lower for now sincos
Will be implemented later.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Fri, 1 May 2020 06:29:21 +0000 (08:29 +0200)]
panfrost: mali_attr_meta.unknown1 is zero on Bifrost
For unknown1 reasons :)
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Fri, 1 May 2020 05:36:31 +0000 (07:36 +0200)]
panfrost: GPUs newer than G-71 don't have swizzles...
for attributes and varyings.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Thu, 30 Apr 2020 08:48:02 +0000 (10:48 +0200)]
pan/decode: Trace to stderr with PANDECODE_DUMP_FILE=stderr
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Alyssa Rosenzweig [Thu, 30 Apr 2020 08:32:11 +0000 (10:32 +0200)]
panfrost: Update Bifrost fields in mali_shader_meta
Not much is known currently about these fields and their values, but
this gets things going in the scenarios we have been testing with so
far.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Thu, 30 Apr 2020 07:29:10 +0000 (09:29 +0200)]
pan/bi: Print shaders only if BIFROST_MESA_DEBUG=shaders
Similar to how it's done in the Midgard compiler.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Alyssa Rosenzweig [Thu, 30 Apr 2020 07:27:36 +0000 (09:27 +0200)]
pan/bi: Enable lower_mediump_outputs NIR pass
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Mon, 27 Apr 2020 15:09:39 +0000 (17:09 +0200)]
panfrost: Add a bit more info about some tiler fields
Has been observed that after the job chain has completed, those fields
become populated.
tiler_heap_next_start contains an address inside the tiler heap, a bit
before the value that the GPU writes to tiler_heap_free.
used_hierarchy_mask contains a hex value that corresponds to values
observed as hierarchy masks.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Mon, 27 Apr 2020 14:49:22 +0000 (16:49 +0200)]
panfrost: Create additional BO for the checksum of imported BOs (Bifrost)
Similar to what the blob does. My reason for doing this was mainly so
traces weren't as different, which makes it more work to spot
relevant differences.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Tomeu Vizoso [Fri, 24 Apr 2020 09:30:03 +0000 (11:30 +0200)]
panfrost: Split bit out of format.unk3
On Bifrost traces, we can observe that this bit is always enabled.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
Samuel Pitoiset [Fri, 1 May 2020 14:06:43 +0000 (16:06 +0200)]
ci: add lists of expected failures & skipped tests for RAVEN with ACO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4848>
Andres Gomez [Thu, 30 Apr 2020 21:20:33 +0000 (00:20 +0300)]
scripts: remove unittest.mock dependency when not used
Found upon inspection.
Signed-off-by: Andres Gomez <agomez@igalia.com
Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4840>
Samuel Pitoiset [Thu, 30 Apr 2020 09:40:42 +0000 (11:40 +0200)]
ci: fix reporting the number of unexpected/flakes
`wc -l $file` returns the number of lines and the filename.
Fixes: b8c66aeb934 ("ci: Clean up some excessive use of pipes in dEQP results processing.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4829>
Michel Dänzer [Wed, 29 Apr 2020 15:56:26 +0000 (17:56 +0200)]
gitlab-ci: Use YAML anchor for llvmpipe paths in virgl rules
Instead of duplicating them.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4808>
Rob Clark [Thu, 16 Apr 2020 17:14:06 +0000 (10:14 -0700)]
freedreno: we don't need aligned vbo's
This gets rid of the last reason that mesa/st would use `u_vbuf` on
a6xx.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4812>
Rob Clark [Thu, 16 Apr 2020 17:13:24 +0000 (10:13 -0700)]
freedreno/a6xx: add some more formats
u_vbuf was translating these for us.. which isn't really necessary.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4812>
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:49:31 +0000 (16:49 -0400)]
pan/decode: Don't crash on missing payload
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 22:48:53 +0000 (18:48 -0400)]
panfrost: Fix tiled texture "stride"s on Bifrost
They're not real strides like linear textures but the hw does use them
so we do have to get it right annoyingly.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:20:08 +0000 (17:20 -0400)]
panfrost: Fix norm coords on bifrost sampler
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:06:45 +0000 (17:06 -0400)]
panfrost: Fix sampler wrap/filter field orders
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:01:33 +0000 (17:01 -0400)]
panfrost: Fix size of bifrost sampler descriptor
Should be 32-bytes, it looks like.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:17:04 +0000 (16:17 -0400)]
panfrost: Fix texture field size
I have to imagine this was UB.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 22:15:23 +0000 (18:15 -0400)]
pan/bit: Add round tests
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:32:56 +0000 (17:32 -0400)]
pan/bit: Interpret ROUND
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:31:38 +0000 (17:31 -0400)]
pan/bit: Add framework forinterpreting double vs float
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 22:15:09 +0000 (18:15 -0400)]
pan/bi: Pack round opcodes (FMA, either 16 or 32)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:10:55 +0000 (16:10 -0400)]
pan/bi: Pipe multiple textures through
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:08:01 +0000 (16:08 -0400)]
pan/bi: Add texture indices to IR
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>
Rob Clark [Thu, 30 Apr 2020 23:00:21 +0000 (16:00 -0700)]
freedreno/a6xx: fix LRZ hang
In detecting the case where we actually do need to re-emit LRZ state
(due to new batch), we were checking `ctx->last.dirty` to detect when
we cannot trust previous state. But this is cleared before we check
it.
Move where it is cleared to the end of the draw_vbo() path.
Fixes: dfa702e94b9 ("freedreno/a6xx: limit LRZ state emit")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4842>
Eric Anholt [Thu, 9 Apr 2020 17:45:24 +0000 (10:45 -0700)]
freedreno/ir3: Leave bools as 1-bit, storing them in full regs.
If use NIR's 1-bit bool representation , we get exactly the bool behavior
the hardware provides: CMPS produces true or false, AND/OR/XOR work as
intended without extra absnegs, and we can pass those half values directly
to other CMPS. We emit an absneg for b2b1 ("turn a memory load into a
1-bit NIR boolean"), but we would have done so for the ir3_n2b() on the
use of that value anyway. The most awkward bit is that inot(a@1) is now a
sub(1, a), but we can encode the 1 as an immediate so it's fine.
No significant changes to GL_TIME_ELAPSED on my set of traces (n=21).
instructions in affected programs:
1570638 ->
1548702 (-1.40%)
nops in affected programs: 624053 -> 611381 (-2.03%)
non-nops in affected programs: 959061 -> 949797 (-0.97%)
mov in affected programs: 5258 -> 5252 (-0.11%)
cov in affected programs: 15099 -> 15902 (5.32%)
dwords in affected programs: 469600 -> 452768 (-3.58%)
last-baryf in affected programs: 162211 -> 154726 (-4.61%)
full in affected programs: 4881 -> 4797 (-1.72%)
sstall in affected programs: 173953 -> 174545 (0.34%)
(ss) in affected programs: 10922 -> 10934 (0.11%)
(sy) in affected programs: 728 -> 745 (2.34%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>
Eric Anholt [Sat, 11 Apr 2020 04:54:58 +0000 (21:54 -0700)]
freedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops.
It's set by ir3_put_dst() immediately after.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>
Marek Olšák [Tue, 28 Apr 2020 16:51:22 +0000 (12:51 -0400)]
radeonsi: revert an accidental change in si_clear_buffer
The change was in:
7b0b085c94347cb9c94d88e11a64a6c341d95477
Fixes: 7b0b085c943 ("radeonsi: drop the negation from fmask_is_not_identity")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Tue, 28 Apr 2020 15:19:23 +0000 (11:19 -0400)]
radeonsi: fix si_compute_clear_render_target with render condition enabled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Sun, 26 Apr 2020 14:50:24 +0000 (10:50 -0400)]
radeonsi: add a workaround to fix KHR-GL45.texture_view.view_classes on gfx9
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Sun, 26 Apr 2020 12:38:54 +0000 (08:38 -0400)]
radeonsi: implement and use compute-based DCC decompression on gfx9-10
DCC_DECOMPRESS doesn't work. Instead of trying to figure out why,
use a compute blit where the load is compressed and the store is
uncompressed.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Sun, 26 Apr 2020 11:45:34 +0000 (07:45 -0400)]
radeonsi: add SI_IMAGE_ACCESS_DCC_OFF to ignore DCC for shader images
A shader-based DCC decompress pass will use this.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Mon, 27 Apr 2020 01:36:59 +0000 (21:36 -0400)]
radeonsi: bind shader images after DCC is disabled for image stores
This prevents an infinite recursion with a compute-based DCC decompression
when it restores shader images.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Sun, 26 Apr 2020 11:22:52 +0000 (07:22 -0400)]
radeonsi: clean up and deduplicate code around internal compute dispatches
In addition to the cleanup, there are these changes in behavior:
- clear_render_target waits for idle after the dispatch and then flushes
L0-L1 caches (this was missing)
- sL0 is no longer invalidated before the dispatch, because src resources
don't use it
- sL0 is no longer invalidated after the dispatch if dst is an image
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Sun, 26 Apr 2020 05:23:11 +0000 (01:23 -0400)]
radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size
Rounding down the size fixes:
KHR-GL45.enhanced_layouts.ssb_member_invalid_offset_alignment
Fixes: 03e2adc990d239119619f22599204c1b37b83134
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Sun, 26 Apr 2020 12:55:08 +0000 (08:55 -0400)]
tgsi_to_nir: handle TGSI_OPCODE_BARRIER
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Sun, 26 Apr 2020 12:37:42 +0000 (08:37 -0400)]
tgsi_to_nir: handle TGSI_SEMANTIC_BLOCK_SIZE
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
Marek Olšák [Fri, 6 Mar 2020 21:56:54 +0000 (16:56 -0500)]
glthread: upload non-VBO vertices and indices for non-Indirect non-IBM draws
This is basically the same thing u_vbuf does.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Sat, 21 Mar 2020 06:58:51 +0000 (02:58 -0400)]
glthread: handle gl{Push,Pop}ClientAttrib{DefaultEXT} for glthread states
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Tue, 24 Mar 2020 03:35:58 +0000 (23:35 -0400)]
glthread: handle POS vs GENERIC0 aliasing
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Sat, 21 Mar 2020 06:24:28 +0000 (02:24 -0400)]
glthread: initialize VAOs properly
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Sat, 7 Mar 2020 00:00:03 +0000 (19:00 -0500)]
glthread: track primitive restart state
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Thu, 5 Mar 2020 00:24:34 +0000 (19:24 -0500)]
glthread: track instance divisor changes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Fri, 28 Feb 2020 02:58:35 +0000 (21:58 -0500)]
glthread: track pointers and strides for Pointer & EXT_dsa attrib functions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Sun, 22 Mar 2020 18:45:14 +0000 (14:45 -0400)]
glthread: don't use atomics for refcounting to decrease overhead on AMD Zen
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Fri, 6 Mar 2020 02:50:17 +0000 (21:50 -0500)]
glthread: do glBufferSubData as unsynchronized upload + GPU copy
1. glthread has a private upload buffer (as struct gl_buffer_object *)
2. the new function glInternalBufferSubDataCopyMESA is used to execute the copy
(the source buffer parameter type is struct gl_buffer_object * as GLintptr)
Now glthread can handle arbitrary glBufferSubData sizes without syncing.
This is a good exercise for uploading data outside of the driver thread.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Sat, 7 Mar 2020 01:37:57 +0000 (20:37 -0500)]
mesa: add _mesa_InternalBind{ElementBuffer,VertexBuffers} for glthread
Uploaded non-VBO user data will be set via these functions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>
Marek Olšák [Sat, 7 Mar 2020 01:19:11 +0000 (20:19 -0500)]
mesa: add glInternalBufferSubDataCopyMESA for glthread
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>