mesa.git
4 years agofreedreno/ir3: Set the FS .msaa flag to true during precompiles.
Eric Anholt [Wed, 15 Apr 2020 19:07:16 +0000 (12:07 -0700)]
freedreno/ir3: Set the FS .msaa flag to true during precompiles.

If you're going out of your way to do per-sample interpolation, you are
almost surely going to be doing so to an MSAA framebuffer.  Should reduce
recompiles with MSAA enabled.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno: Immediately compile a default variant of shaders.
Eric Anholt [Tue, 14 Apr 2020 22:40:43 +0000 (15:40 -0700)]
freedreno: Immediately compile a default variant of shaders.

Now that we normalize our keys fairly well, build a variant at shader
state creation time so that hopefully you don't have to call the compiler
at draw time (as is now the case with glmark2 ES and most of the humus GL
demos).

Fixes: #2782
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Set up outputs for multi-slot varyings.
Eric Anholt [Wed, 22 Apr 2020 19:22:30 +0000 (12:22 -0700)]
freedreno/ir3: Set up outputs for multi-slot varyings.

Necessary to avoid compiler assertion failures in:

dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Stop initializing regid of so->outputs during setup.
Eric Anholt [Wed, 22 Apr 2020 19:20:19 +0000 (12:20 -0700)]
freedreno/ir3: Stop initializing regid of so->outputs during setup.

It's unused and overwritten by ir3_compile_shader_nir().

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Improve shader key normalization.
Eric Anholt [Tue, 14 Apr 2020 23:34:00 +0000 (16:34 -0700)]
freedreno/ir3: Improve shader key normalization.

We can remove a bunch of conditional code at key comparison time by
computing a bitmask of used key bits at ir3_shader creation time.  This
also gives us a nice place to put additional key simplification to reduce
how many variants we create (like skipping rastflat if we don't read
colors in the FS, or skipping vclamp_color if we don't write colors).

It does mean walking the whole key to AND it, but the key is just 28 bytes
so far so that seems pretty fine.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno: Emit debug messages when doing draw-time recompiles of shaders.
Eric Anholt [Tue, 14 Apr 2020 22:31:20 +0000 (15:31 -0700)]
freedreno: Emit debug messages when doing draw-time recompiles of shaders.

Right now that's "always" unless you have shaderdb set.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Remove unused half precision shader key flag.
Eric Anholt [Wed, 15 Apr 2020 18:34:16 +0000 (11:34 -0700)]
freedreno/ir3: Remove unused half precision shader key flag.

The code using it was removed in 4af86bd0b933 ("freedreno/ir3: remove
half-precision output")

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno: Fix assertion failures on GS/tess shaders with shader-db enabled.
Eric Anholt [Wed, 15 Apr 2020 21:21:07 +0000 (14:21 -0700)]
freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled.

We weren't filling in the tess mode of the key, or setting has_gs on GS
shaders, resulting in assertion failures when NIR intrinsics didn't get
lowered.

We have to make a guess at prim mode for TCS, but it should be better to
have some shader-db coverage than none, and it will avoid these failures
happening when we start precompiling shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Skip tess epilogue if the program is missing stores.
Eric Anholt [Tue, 21 Apr 2020 22:30:49 +0000 (15:30 -0700)]
freedreno/ir3: Skip tess epilogue if the program is missing stores.

Some of the negative API tests make shaders for tess stages that don't do
all the stores they need to.  Once we start precompiling (or doing
shader-db of tess), we need to at least not segfault when generating them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno: Stop doing binning shaders other than the VS in shader-db.
Eric Anholt [Wed, 15 Apr 2020 18:17:10 +0000 (11:17 -0700)]
freedreno: Stop doing binning shaders other than the VS in shader-db.

ir3_cache.c only ever asks for binning variants for VS.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Fix register allocation assertion failures.
Eric Anholt [Tue, 21 Apr 2020 20:26:14 +0000 (13:26 -0700)]
freedreno/ir3: Fix register allocation assertion failures.

We were failing to tell the allocator about the restriction that scalar
texture instructions (allocated as scalar regs) couldn't be allocated such
that the start of the full unwritemasked vector started before r0.  There
was a patch in select_reg_callback on a6xx that tried to work around that,
but you could still end up backed into a corner you shouldn't be because
we didn't tell the RA what it needed.

Fixes compiler assertion failures on a300-a400's blit_z shader, used for
Z32F gmem blits.

Looks like as a result we get tighter register allocation but more nops:

instructions in affected programs: 757945 -> 760356 (0.32%)
nops in affected programs: 317983 -> 320468 (0.78%)
non-nops in affected programs: 27525 -> 27451 (-0.27%)
mov in affected programs: 3098 -> 3023 (-2.42%)
dwords in affected programs: 109664 -> 110656 (0.90%)
last-baryf in affected programs: 112701 -> 112847 (0.13%)
full in affected programs: 4326 -> 4011 (-7.28%)
sstall in affected programs: 120550 -> 120836 (0.24%)
(ss) in affected programs: 13939 -> 13918 (-0.15%)
(sy) in affected programs: 3006 -> 2786 (-7.32%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Drop hack to clean up split vars
Kristian H. Kristensen [Tue, 28 Apr 2020 21:29:55 +0000 (14:29 -0700)]
freedreno/ir3: Drop hack to clean up split vars

When the GS lowering was working on store_output intrinsics, we had to
clean up the split vars to avoid getting confused.  Now that we shadow
the output vars instead, there's no confusion and we can drop this
hack.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Lower GS builtins before lowering IO
Kristian H. Kristensen [Tue, 28 Apr 2020 19:52:42 +0000 (12:52 -0700)]
freedreno/ir3: Lower GS builtins before lowering IO

We mostly got away with replacing a store_output with a store_var, but
for complex types like structs, that doesn't work. Once the IO has
been lowered from vars to intrinsic, we've lost the deref chains and
can't properly shadow the outputs.

This commits moves the GS lowering up so we do it before the output
variables get lowered to store_output.  This way the pass works much
like nir_lower_io_to_temporaries() and cleanly shadows the outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass
Kristian H. Kristensen [Tue, 28 Apr 2020 19:34:15 +0000 (12:34 -0700)]
freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass

This pass lowers per-vertex input intrinsics to load_shared_ir3. This
was open coded in the TCS and GS lowering passes before - this way we
can share it. Furthermore, we'll need to run the rest of the GS
lowering earlier (before lowering IO) so we need to split off this
part that operates on the IO intrinsics first.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Rename ir3_nir_lower_to_explicit_io
Kristian H. Kristensen [Tue, 28 Apr 2020 19:29:46 +0000 (12:29 -0700)]
freedreno/ir3: Rename ir3_nir_lower_to_explicit_io

We rename it to ir3_nir_lower_to_explicit_output, since it only
handles output and we'll add a lowering pass for input next.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Pass stream output info to ir3_shader_from_nir
Kristian H. Kristensen [Fri, 1 May 2020 06:24:27 +0000 (23:24 -0700)]
freedreno/ir3: Pass stream output info to ir3_shader_from_nir

We need shader->stream_output filled out when we layout the push
constants in ir3_setup_const_state(). Otherwise
const_state->offsets.tfbo ends up as ~0, which doesn't work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Fix the a3xx TF outputs stores.
Eric Anholt [Fri, 1 May 2020 00:31:02 +0000 (17:31 -0700)]
freedreno/ir3: Fix the a3xx TF outputs stores.

We were trying to deref the vector-collected outputs[] array before it's
been set up, but we want the per-component outputs anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agofreedreno/ir3: Set up the block predecessors for a3xx TF
Eric Anholt [Fri, 1 May 2020 00:30:02 +0000 (17:30 -0700)]
freedreno/ir3: Set up the block predecessors for a3xx TF

Fixes a segfault in ir3_legalize.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>

4 years agointel/fs: Update location of Render Target Array Index for gen12
D Scott Phillips [Thu, 30 Apr 2020 17:38:33 +0000 (17:38 +0000)]
intel/fs: Update location of Render Target Array Index for gen12

Render Target Array Index has moved from R0.0[26:16] to
R1.1[26:16] on gen12.

Fixes dEQP-VK.multiview.input_attachments.*

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4836>

4 years agopan/decode: Properly print tripped zeroes
Tomeu Vizoso [Fri, 1 May 2020 09:59:13 +0000 (11:59 +0200)]
pan/decode: Properly print tripped zeroes

The "%" got lost.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Fixes: 6148d1be4bb5 ("panfrost: Fix size of bifrost sampler descriptor")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopanfrost: Add Bifrost texture trampoline BO to batch
Tomeu Vizoso [Fri, 1 May 2020 09:37:56 +0000 (11:37 +0200)]
panfrost: Add Bifrost texture trampoline BO to batch

Fixes: d3eb23adb50c ("panfrost: Emit sampler descriptor on bifrost")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopan/bi: Lower for now sincos
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:06:37 +0000 (17:06 -0400)]
pan/bi: Lower for now sincos

Will be implemented later.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopanfrost: mali_attr_meta.unknown1 is zero on Bifrost
Tomeu Vizoso [Fri, 1 May 2020 06:29:21 +0000 (08:29 +0200)]
panfrost: mali_attr_meta.unknown1 is zero on Bifrost

For unknown1 reasons :)

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopanfrost: GPUs newer than G-71 don't have swizzles...
Tomeu Vizoso [Fri, 1 May 2020 05:36:31 +0000 (07:36 +0200)]
panfrost: GPUs newer than G-71 don't have swizzles...

for attributes and varyings.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopan/decode: Trace to stderr with PANDECODE_DUMP_FILE=stderr
Tomeu Vizoso [Thu, 30 Apr 2020 08:48:02 +0000 (10:48 +0200)]
pan/decode: Trace to stderr with PANDECODE_DUMP_FILE=stderr

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopanfrost: Update Bifrost fields in mali_shader_meta
Alyssa Rosenzweig [Thu, 30 Apr 2020 08:32:11 +0000 (10:32 +0200)]
panfrost: Update Bifrost fields in mali_shader_meta

Not much is known currently about these fields and their values, but
this gets things going in the scenarios we have been testing with so
far.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopan/bi: Print shaders only if BIFROST_MESA_DEBUG=shaders
Tomeu Vizoso [Thu, 30 Apr 2020 07:29:10 +0000 (09:29 +0200)]
pan/bi: Print shaders only if BIFROST_MESA_DEBUG=shaders

Similar to how it's done in the Midgard compiler.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopan/bi: Enable lower_mediump_outputs NIR pass
Alyssa Rosenzweig [Thu, 30 Apr 2020 07:27:36 +0000 (09:27 +0200)]
pan/bi: Enable lower_mediump_outputs NIR pass

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopanfrost: Add a bit more info about some tiler fields
Tomeu Vizoso [Mon, 27 Apr 2020 15:09:39 +0000 (17:09 +0200)]
panfrost: Add a bit more info about some tiler fields

Has been observed that after the job chain has completed, those fields
become populated.

tiler_heap_next_start contains an address inside the tiler heap, a bit
before the value that the GPU writes to tiler_heap_free.

used_hierarchy_mask contains a hex value that corresponds to values
observed as hierarchy masks.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopanfrost: Create additional BO for the checksum of imported BOs (Bifrost)
Tomeu Vizoso [Mon, 27 Apr 2020 14:49:22 +0000 (16:49 +0200)]
panfrost: Create additional BO for the checksum of imported BOs (Bifrost)

Similar to what the blob does. My reason for doing this was mainly so
traces weren't as different, which makes it more work to spot
relevant differences.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agopanfrost: Split bit out of format.unk3
Tomeu Vizoso [Fri, 24 Apr 2020 09:30:03 +0000 (11:30 +0200)]
panfrost: Split bit out of format.unk3

On Bifrost traces, we can observe that this bit is always enabled.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>

4 years agoci: add lists of expected failures & skipped tests for RAVEN with ACO
Samuel Pitoiset [Fri, 1 May 2020 14:06:43 +0000 (16:06 +0200)]
ci: add lists of expected failures & skipped tests for RAVEN with ACO

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4848>

4 years agoscripts: remove unittest.mock dependency when not used
Andres Gomez [Thu, 30 Apr 2020 21:20:33 +0000 (00:20 +0300)]
scripts: remove unittest.mock dependency when not used

Found upon inspection.

Signed-off-by: Andres Gomez <agomez@igalia.com
Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4840>

4 years agoci: fix reporting the number of unexpected/flakes
Samuel Pitoiset [Thu, 30 Apr 2020 09:40:42 +0000 (11:40 +0200)]
ci: fix reporting the number of unexpected/flakes

`wc -l $file` returns the number of lines and the filename.

Fixes: b8c66aeb934 ("ci: Clean up some excessive use of pipes in dEQP results processing.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4829>

4 years agogitlab-ci: Use YAML anchor for llvmpipe paths in virgl rules
Michel Dänzer [Wed, 29 Apr 2020 15:56:26 +0000 (17:56 +0200)]
gitlab-ci: Use YAML anchor for llvmpipe paths in virgl rules

Instead of duplicating them.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4808>

4 years agofreedreno: we don't need aligned vbo's
Rob Clark [Thu, 16 Apr 2020 17:14:06 +0000 (10:14 -0700)]
freedreno: we don't need aligned vbo's

This gets rid of the last reason that mesa/st would use `u_vbuf` on
a6xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4812>

4 years agofreedreno/a6xx: add some more formats
Rob Clark [Thu, 16 Apr 2020 17:13:24 +0000 (10:13 -0700)]
freedreno/a6xx: add some more formats

u_vbuf was translating these for us.. which isn't really necessary.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4812>

4 years agopan/decode: Don't crash on missing payload
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:49:31 +0000 (16:49 -0400)]
pan/decode: Don't crash on missing payload

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopanfrost: Fix tiled texture "stride"s on Bifrost
Alyssa Rosenzweig [Thu, 30 Apr 2020 22:48:53 +0000 (18:48 -0400)]
panfrost: Fix tiled texture "stride"s on Bifrost

They're not real strides like linear textures but the hw does use them
so we do have to get it right annoyingly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopanfrost: Fix norm coords on bifrost sampler
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:20:08 +0000 (17:20 -0400)]
panfrost: Fix norm coords on bifrost sampler

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopanfrost: Fix sampler wrap/filter field orders
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:06:45 +0000 (17:06 -0400)]
panfrost: Fix sampler wrap/filter field orders

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopanfrost: Fix size of bifrost sampler descriptor
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:01:33 +0000 (17:01 -0400)]
panfrost: Fix size of bifrost sampler descriptor

Should be 32-bytes, it looks like.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopanfrost: Fix texture field size
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:17:04 +0000 (16:17 -0400)]
panfrost: Fix texture field size

I have to imagine this was UB.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopan/bit: Add round tests
Alyssa Rosenzweig [Thu, 30 Apr 2020 22:15:23 +0000 (18:15 -0400)]
pan/bit: Add round tests

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopan/bit: Interpret ROUND
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:32:56 +0000 (17:32 -0400)]
pan/bit: Interpret ROUND

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopan/bit: Add framework forinterpreting double vs float
Alyssa Rosenzweig [Thu, 30 Apr 2020 21:31:38 +0000 (17:31 -0400)]
pan/bit: Add framework forinterpreting double vs float

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopan/bi: Pack round opcodes (FMA, either 16 or 32)
Alyssa Rosenzweig [Thu, 30 Apr 2020 22:15:09 +0000 (18:15 -0400)]
pan/bi: Pack round opcodes (FMA, either 16 or 32)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopan/bi: Pipe multiple textures through
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:10:55 +0000 (16:10 -0400)]
pan/bi: Pipe multiple textures through

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agopan/bi: Add texture indices to IR
Alyssa Rosenzweig [Thu, 30 Apr 2020 20:08:01 +0000 (16:08 -0400)]
pan/bi: Add texture indices to IR

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4844>

4 years agofreedreno/a6xx: fix LRZ hang
Rob Clark [Thu, 30 Apr 2020 23:00:21 +0000 (16:00 -0700)]
freedreno/a6xx: fix LRZ hang

In detecting the case where we actually do need to re-emit LRZ state
(due to new batch), we were checking `ctx->last.dirty` to detect when
we cannot trust previous state.  But this is cleared before we check
it.

Move where it is cleared to the end of the draw_vbo() path.

Fixes: dfa702e94b9 ("freedreno/a6xx: limit LRZ state emit")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4842>

4 years agofreedreno/ir3: Leave bools as 1-bit, storing them in full regs.
Eric Anholt [Thu, 9 Apr 2020 17:45:24 +0000 (10:45 -0700)]
freedreno/ir3: Leave bools as 1-bit, storing them in full regs.

If use NIR's 1-bit bool representation , we get exactly the bool behavior
the hardware provides: CMPS produces true or false, AND/OR/XOR work as
intended without extra absnegs, and we can pass those half values directly
to other CMPS.  We emit an absneg for b2b1 ("turn a memory load into a
1-bit NIR boolean"), but we would have done so for the ir3_n2b() on the
use of that value anyway.  The most awkward bit is that inot(a@1) is now a
sub(1, a), but we can encode the 1 as an immediate so it's fine.

No significant changes to GL_TIME_ELAPSED on my set of traces (n=21).

instructions in affected programs: 1570638 -> 1548702 (-1.40%)
nops in affected programs: 624053 -> 611381 (-2.03%)
non-nops in affected programs: 959061 -> 949797 (-0.97%)
mov in affected programs: 5258 -> 5252 (-0.11%)
cov in affected programs: 15099 -> 15902 (5.32%)
dwords in affected programs: 469600 -> 452768 (-3.58%)
last-baryf in affected programs: 162211 -> 154726 (-4.61%)
full in affected programs: 4881 -> 4797 (-1.72%)
sstall in affected programs: 173953 -> 174545 (0.34%)
(ss) in affected programs: 10922 -> 10934 (0.11%)
(sy) in affected programs: 728 -> 745 (2.34%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>

4 years agofreedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops.
Eric Anholt [Sat, 11 Apr 2020 04:54:58 +0000 (21:54 -0700)]
freedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops.

It's set by ir3_put_dst() immediately after.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>

4 years agoradeonsi: revert an accidental change in si_clear_buffer
Marek Olšák [Tue, 28 Apr 2020 16:51:22 +0000 (12:51 -0400)]
radeonsi: revert an accidental change in si_clear_buffer

The change was in: 7b0b085c94347cb9c94d88e11a64a6c341d95477

Fixes: 7b0b085c943 ("radeonsi: drop the negation from fmask_is_not_identity")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoradeonsi: fix si_compute_clear_render_target with render condition enabled
Marek Olšák [Tue, 28 Apr 2020 15:19:23 +0000 (11:19 -0400)]
radeonsi: fix si_compute_clear_render_target with render condition enabled

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoradeonsi: add a workaround to fix KHR-GL45.texture_view.view_classes on gfx9
Marek Olšák [Sun, 26 Apr 2020 14:50:24 +0000 (10:50 -0400)]
radeonsi: add a workaround to fix KHR-GL45.texture_view.view_classes on gfx9

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoradeonsi: implement and use compute-based DCC decompression on gfx9-10
Marek Olšák [Sun, 26 Apr 2020 12:38:54 +0000 (08:38 -0400)]
radeonsi: implement and use compute-based DCC decompression on gfx9-10

DCC_DECOMPRESS doesn't work. Instead of trying to figure out why,
use a compute blit where the load is compressed and the store is
uncompressed.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoradeonsi: add SI_IMAGE_ACCESS_DCC_OFF to ignore DCC for shader images
Marek Olšák [Sun, 26 Apr 2020 11:45:34 +0000 (07:45 -0400)]
radeonsi: add SI_IMAGE_ACCESS_DCC_OFF to ignore DCC for shader images

A shader-based DCC decompress pass will use this.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoradeonsi: bind shader images after DCC is disabled for image stores
Marek Olšák [Mon, 27 Apr 2020 01:36:59 +0000 (21:36 -0400)]
radeonsi: bind shader images after DCC is disabled for image stores

This prevents an infinite recursion with a compute-based DCC decompression
when it restores shader images.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoradeonsi: clean up and deduplicate code around internal compute dispatches
Marek Olšák [Sun, 26 Apr 2020 11:22:52 +0000 (07:22 -0400)]
radeonsi: clean up and deduplicate code around internal compute dispatches

In addition to the cleanup, there are these changes in behavior:
- clear_render_target waits for idle after the dispatch and then flushes
  L0-L1 caches (this was missing)
- sL0 is no longer invalidated before the dispatch, because src resources
  don't use it
- sL0 is no longer invalidated after the dispatch if dst is an image

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoradeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size
Marek Olšák [Sun, 26 Apr 2020 05:23:11 +0000 (01:23 -0400)]
radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size

Rounding down the size fixes:
    KHR-GL45.enhanced_layouts.ssb_member_invalid_offset_alignment

Fixes: 03e2adc990d239119619f22599204c1b37b83134
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agotgsi_to_nir: handle TGSI_OPCODE_BARRIER
Marek Olšák [Sun, 26 Apr 2020 12:55:08 +0000 (08:55 -0400)]
tgsi_to_nir: handle TGSI_OPCODE_BARRIER

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agotgsi_to_nir: handle TGSI_SEMANTIC_BLOCK_SIZE
Marek Olšák [Sun, 26 Apr 2020 12:37:42 +0000 (08:37 -0400)]
tgsi_to_nir: handle TGSI_SEMANTIC_BLOCK_SIZE

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>

4 years agoglthread: upload non-VBO vertices and indices for non-Indirect non-IBM draws
Marek Olšák [Fri, 6 Mar 2020 21:56:54 +0000 (16:56 -0500)]
glthread: upload non-VBO vertices and indices for non-Indirect non-IBM draws

This is basically the same thing u_vbuf does.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: handle gl{Push,Pop}ClientAttrib{DefaultEXT} for glthread states
Marek Olšák [Sat, 21 Mar 2020 06:58:51 +0000 (02:58 -0400)]
glthread: handle gl{Push,Pop}ClientAttrib{DefaultEXT} for glthread states

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: handle POS vs GENERIC0 aliasing
Marek Olšák [Tue, 24 Mar 2020 03:35:58 +0000 (23:35 -0400)]
glthread: handle POS vs GENERIC0 aliasing

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: initialize VAOs properly
Marek Olšák [Sat, 21 Mar 2020 06:24:28 +0000 (02:24 -0400)]
glthread: initialize VAOs properly

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: track primitive restart state
Marek Olšák [Sat, 7 Mar 2020 00:00:03 +0000 (19:00 -0500)]
glthread: track primitive restart state

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: track instance divisor changes
Marek Olšák [Thu, 5 Mar 2020 00:24:34 +0000 (19:24 -0500)]
glthread: track instance divisor changes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: track pointers and strides for Pointer & EXT_dsa attrib functions
Marek Olšák [Fri, 28 Feb 2020 02:58:35 +0000 (21:58 -0500)]
glthread: track pointers and strides for Pointer & EXT_dsa attrib functions

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: don't use atomics for refcounting to decrease overhead on AMD Zen
Marek Olšák [Sun, 22 Mar 2020 18:45:14 +0000 (14:45 -0400)]
glthread: don't use atomics for refcounting to decrease overhead on AMD Zen

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agoglthread: do glBufferSubData as unsynchronized upload + GPU copy
Marek Olšák [Fri, 6 Mar 2020 02:50:17 +0000 (21:50 -0500)]
glthread: do glBufferSubData as unsynchronized upload + GPU copy

1. glthread has a private upload buffer (as struct gl_buffer_object *)
2. the new function glInternalBufferSubDataCopyMESA is used to execute the copy
   (the source buffer parameter type is struct gl_buffer_object * as GLintptr)

Now glthread can handle arbitrary glBufferSubData sizes without syncing.

This is a good exercise for uploading data outside of the driver thread.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: add _mesa_InternalBind{ElementBuffer,VertexBuffers} for glthread
Marek Olšák [Sat, 7 Mar 2020 01:37:57 +0000 (20:37 -0500)]
mesa: add _mesa_InternalBind{ElementBuffer,VertexBuffers} for glthread

Uploaded non-VBO user data will be set via these functions.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: add glInternalBufferSubDataCopyMESA for glthread
Marek Olšák [Sat, 7 Mar 2020 01:19:11 +0000 (20:19 -0500)]
mesa: add glInternalBufferSubDataCopyMESA for glthread

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: inline vbo_context inside gl_context to remove vbo_context dereferences
Marek Olšák [Fri, 27 Mar 2020 11:57:07 +0000 (07:57 -0400)]
mesa: inline vbo_context inside gl_context to remove vbo_context dereferences

The number of lines in the disassembly of vbo_exec_api.c.o decreased
by 4.5%, which roughly corresponds to a decrease in instructions
for immediate mode thanks to the removal of ctx->vbo_context dereferences.

It increases performance in one Viewperf11 subtest by 2.8%.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa,st/mesa: add a fast path for non-static VAOs
Marek Olšák [Fri, 27 Mar 2020 09:07:02 +0000 (05:07 -0400)]
mesa,st/mesa: add a fast path for non-static VAOs

Skip most of _mesa_update_vao_derived_arrays if the VAO is not static.
Drivers need a separate codepath for this.

This increases performance by 7% with glthread and the game "torcs".

The reason is that glthread uploads vertices and sets vertex buffers
every draw call, so the overhead is very noticable. glthread doesn't
hide the overhead, because the driver thread is the busiest thread.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: don't update shaders on fixed-func state changes if user shaders are bound
Marek Olšák [Sun, 22 Mar 2020 05:07:54 +0000 (01:07 -0400)]
mesa: don't update shaders on fixed-func state changes if user shaders are bound

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: don't set unnecessary program flags in _mesa_update_state
Marek Olšák [Sun, 22 Mar 2020 04:47:23 +0000 (00:47 -0400)]
mesa: don't set unnecessary program flags in _mesa_update_state

_NEW_PROGRAM is already set.
_NEW_FRAG_CLAMP is not used by the fixed-func fragment shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: set _NEW_FRAG_CLAMP only when needed
Mathias Fröhlich [Thu, 27 Feb 2020 07:13:07 +0000 (08:13 +0100)]
mesa: set _NEW_FRAG_CLAMP only when needed

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: don't call _mesa_update_state for _mesa_get_clamp_fragment_color
Marek Olšák [Sun, 22 Mar 2020 04:30:40 +0000 (00:30 -0400)]
mesa: don't call _mesa_update_state for _mesa_get_clamp_fragment_color

It's not needed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agost/mesa: Move _NEW_FRAG_CLAMP to NewFragClamp driver flag.
Mathias Fröhlich [Mon, 12 Aug 2019 10:16:16 +0000 (12:16 +0200)]
st/mesa: Move _NEW_FRAG_CLAMP to NewFragClamp driver flag.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agomesa: optimize glPush/PopClientAttrib by removing malloc overhead
Marek Olšák [Sun, 22 Mar 2020 02:15:20 +0000 (22:15 -0400)]
mesa: optimize glPush/PopClientAttrib by removing malloc overhead

just declare all structures needed by the stack in gl_context.

This improves performance by 5.6% in the game "torcs". FPS: 101.01 -> 106.73

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4314>

4 years agofreedreno/a6xx: don't set SP_FS_CTRL_REG0.VARYING for fragcoord
Rob Clark [Thu, 30 Apr 2020 20:54:49 +0000 (13:54 -0700)]
freedreno/a6xx: don't set SP_FS_CTRL_REG0.VARYING for fragcoord

Similar change to 5785bcc8a0ff9c5072c647337bf73f696c63cbe6.  It appears
on a6xx and in fact this could cause varying corruption before the FS
had a chance to consume the varyings from varying storage.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2838
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4838>

4 years agoiris: don't assert on unfinished aux import in copy paths
Lionel Landwerlin [Tue, 21 Apr 2020 09:25:44 +0000 (12:25 +0300)]
iris: don't assert on unfinished aux import in copy paths

After a resource is created the first command using it could be a copy
command.

In iris_state we finish the import on surface/view creation but we
don't do that for copies.

v2: Move finish call to gallium entrypoints (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2725
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4657>

4 years agofreedreno: sync registers with envytools
Rob Clark [Thu, 30 Apr 2020 17:12:28 +0000 (10:12 -0700)]
freedreno: sync registers with envytools

Pull in the `SP_xS_BRANCH_COND` regs to keep the mesa and envytools
copies from getting out of sync.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: more OUT_REG()
Rob Clark [Wed, 22 Apr 2020 22:41:12 +0000 (15:41 -0700)]
freedreno/a6xx: more OUT_REG()

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno: scissor vs disabled scissor micro-opt
Rob Clark [Wed, 22 Apr 2020 22:26:02 +0000 (15:26 -0700)]
freedreno: scissor vs disabled scissor micro-opt

We don't need to deref and check rast state every time scissor changes,
only when rast state changes.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: convert const emit to OUT_PKT()
Rob Clark [Wed, 29 Apr 2020 17:09:28 +0000 (10:09 -0700)]
freedreno/a6xx: convert const emit to OUT_PKT()

This is another hot packet.  This splits out each of the four cases
(geom vs frag, and indirect vs inline) intentionally, to avoid some
parity bit calc.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/ir3: inline const emit
Rob Clark [Wed, 22 Apr 2020 18:51:42 +0000 (11:51 -0700)]
freedreno/ir3: inline const emit

Drop vfunc callbacks for per-gen packet emit, and instead have a header
that is #include'd once per gen.

We'll end up with multiple copies of some of this, but since we never
have multiple gen's of adreno on a single device, only one copy will be
paged in (and hopefully in the I-cache for hot-paths)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: split out const emit
Rob Clark [Wed, 22 Apr 2020 18:20:25 +0000 (11:20 -0700)]
freedreno/a6xx: split out const emit

In order to inline the const emit and drop the per-gen vfuncs to emit
the correct sort of packet, we should consolidate all of the entry-
points to const emit in one object file, otherwise we'll end up with
multiple copies per gen.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: convert draw packet to OUT_PKT()
Rob Clark [Wed, 29 Apr 2020 17:01:23 +0000 (10:01 -0700)]
freedreno/a6xx: convert draw packet to OUT_PKT()

This is one of the hotter pkt7 packets, since it is guaranteed to happen
on every draw.  Switch to OUT_PKT() for less driver overhead in the draw
path.

Slight bit of cheating for using CP_DRAW_INDX_OFFSET_0 for the first
dword in all cases.  Possibly *gen_header.py* could be more clever
and use typedef's in the cases of bitsets like vgt_draw_initiator.
But this works out because it is always the first dword.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: add OUT_PKT()
Rob Clark [Wed, 29 Apr 2020 16:58:38 +0000 (09:58 -0700)]
freedreno/a6xx: add OUT_PKT()

Similar to OUT_REG(), this has the benefits of:

1. No more messing up pkt size
2. Detects errors of mixing up the order of dwords in the packet
3. Optimizes to more efficient code

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: skip unnecessary MRT blend state
Rob Clark [Tue, 21 Apr 2020 16:05:55 +0000 (09:05 -0700)]
freedreno/a6xx: skip unnecessary MRT blend state

To lower CP overhead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: combine sample mask into blend state
Rob Clark [Thu, 16 Apr 2020 22:25:27 +0000 (15:25 -0700)]
freedreno/a6xx: combine sample mask into blend state

This gets rid of one lone register we used to emit directly in IB2
whenever blend state changes, at the expense of needing blend state
variants when sample-mask changes.  I think typically sample-mask
should not change frequently, so this seems like a fair trade-off.

To further limit the # of variants, we ignore sample-mask bits that
are not relavant for the current # of samples.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: move blend-color to stateobj
Rob Clark [Thu, 16 Apr 2020 22:18:08 +0000 (15:18 -0700)]
freedreno/a6xx: move blend-color to stateobj

To reduce CP overhead for draws skipped in a bin.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: limit LRZ state emit
Rob Clark [Thu, 16 Apr 2020 21:52:29 +0000 (14:52 -0700)]
freedreno/a6xx: limit LRZ state emit

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: limit PROG_FB_RAST state emit
Rob Clark [Thu, 16 Apr 2020 21:33:50 +0000 (14:33 -0700)]
freedreno/a6xx: limit PROG_FB_RAST state emit

The dependency on RASTERIZER state is only when rasterizer_discard
changes.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: move scissor state to stateobj
Rob Clark [Thu, 16 Apr 2020 21:13:39 +0000 (14:13 -0700)]
freedreno/a6xx: move scissor state to stateobj

To reduce CP overhead for draws skipped in a given tile.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: move const state to single stateobj
Rob Clark [Thu, 16 Apr 2020 19:55:35 +0000 (12:55 -0700)]
freedreno/a6xx: move const state to single stateobj

In practice, we end up updating all the shader stages at the same time.
So collapse this into a single group.

Reduces CP overhead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: avoid unnecessary clearing VS DP state
Rob Clark [Thu, 16 Apr 2020 17:18:29 +0000 (10:18 -0700)]
freedreno/a6xx: avoid unnecessary clearing VS DP state

If there is no (potentially unflushed) VS driver-param state, we don't
need to emit a DISABLE on each frame.  So avoid that to reduce CP
overhead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>

4 years agofreedreno/a6xx: small query cleanup
Rob Clark [Tue, 21 Apr 2020 16:06:28 +0000 (09:06 -0700)]
freedreno/a6xx: small query cleanup

Don't open-code `fd6_event_write()`

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>