git.libre-soc.org Git - mesa.git/log

ci/bare-metal: Include a timestamp in our serial reads.

gitlab CI doesn't include timestamps in its logs by default, but it's
really useful for finding delays in our CI so stuff one in on the lines
coming in from serial and being output to the gitlab log. The artifacts
file is still the raw serial output.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>

ci/bare-metal: Fix detection of "POWER_GOOD not seen in time" fails

We were only reading from the CPU serial, not EC, so we'd never notice
these sources of job timeouts. I couldn't find a cleaner solution, so I
spawned two threads to do the blocking reads from our serial line fifos
and merge them together in a single queue to read.

Closes: #3470
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>

ci/bare-metal: Use re.search() instead re.match() for our line matching.

match() looks for the start of the line to match our regex, while search
just looks for the regex anywhere in the line. I messed this up when
converting our greps in shell to python, which was part of breaking the
POWER_GOOD flake detection. Most of our matches worked, but let's
consistently use this one so we don't mess this up in the future.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>

amd/common: Fix various non-critical integer overflows

The result of 0xf << 28 is a signed integer and hence overflows into the sign
bit. In practice compilers did the right thing here, since the intent of the
code was unsigned arithmetic anyway.

Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6568>

aco: Fix integer overflows when emitting parallel copies during RA

32-bit shifts were accidentally used before this change despite the intended
output being 64 bits.

This was observed when compiling Dolphin's ubershaders.

Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6568>

radv: Fix various non-critical integer overflows

The result of 0xf << 28 is a signed integer and hence overflows into the sign
bit. In practice compilers did the right thing here, since the intent of the
code was unsigned arithmetic anyway.

These conditions were observed in:
* dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.1d.format.r4g4b4a4_unorm_pack16.count_8.size.512x1
* dEQP-VK.binding_model.descriptorset_random.sets32.noarray.ubolimitlow.sbolimitlow.sampledimglow.outimgonly.noiub.nouab.frag.ialimithigh.0

Cc: mesa-stable
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6568>

aco: remove omod_success/clamp_success

This simplifies the optimizer and should make SDWA optimizations easier.

No fossil-db changes on Navi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6293>

aco: fix mad splitting after applying output modifiers

Previously, this wasn't done because the mad label wasn't passed to the
new definition.

fossil-db (Navi):
Totals from 5770 (4.24% of 135946) affected shaders:
SGPRs: 391920 -> 391872 (-0.01%)
VGPRs: 349084 -> 348424 (-0.19%); split: -0.20%, +0.01%
CodeSize: 34639636 -> 34637496 (-0.01%); split: -0.02%, +0.01%
MaxWaves: 58828 -> 58862 (+0.06%)
Instrs: 6723436 -> 6723297 (-0.00%); split: -0.02%, +0.02%
Cycles: 197594168 -> 197591968 (-0.00%); split: -0.02%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6293>

radv: remove descriptor_indexing fails from expected fails

I think these should be fixed since
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6451. They also
pass for me on polaris and navi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6586>

anv: Set alignments on UBO/SSBO root derefs

This doesn't really do anything for us today. One day, I suppose we
could use it to do something with wide loads with non-uniform offsets.
The big reason to do this is to get better testing to make sure that NIR
doesn't blow up on the deref paths.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

spirv: Drop the OpenCL type layout code

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

clover/nir: Use lower_vars_to_explicit for uniform and global

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

spirv: Stop counting inputs in entry_point_wrapper

nir_shader::num_inputs isn't supposed to be a count of how many input
variables we have. It's a size of the lowered input space.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

clover: Use args.size() to compute new var locations

This is better than using num_uniforms as it guarantees what we want: a
mapping from nir_variable to the args vector.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir: Allow uniform in nir_lower_vars_to_explicit_types

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir: Allow var_mem_global in nir_lower_vars_to_explicit_types

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

spirv: Propagate alignments to deref chains via casts

This commit propagates the alignment information provided either through
the Alignment decoration on pointers or via the alignment mem operands
to OpLoad, OpStore, and OpCopyMemory to the NIR deref chain. It does so
by wrapping the deref in a cast. NIR should be able to clean up most
unnecessary casts only leaving us with the useful alignment information.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

spirv: Add pointer helper vars to OpCopyMemory

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir/opt_deref: Remove restrictive alignment information from casts

If we have a cast deref with alignment information and we can get equal
or better alignment information from something further up the deref
chain, we can strip the alignment information from the cast and allow
other optimizations to potentially eliminate the cast.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir/opt_deref: Don't remove casts with alignment information

Generally, if a cast has alignment information, that information is
useful and we don't want to loose it.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir/lower_io: Apply alignments from derefs when available

If the deref has no explicit alignment in the chain, we assume component
alignment which is what we currently assume for all derefs today.  This
should be correct for all APIs in the sense that we can usually assume
at least component alignment.  However, for some APIs such as OpenCL, we
could potentially make larger alignment assumptions.  The intention is
that those will be handled via alignment-increasing casts.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir: Add a helper for getting the alignment of a deref

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir: Handle all array stride cases in nir_deref_instr_array_stride

This renames it to drop the ptr_as and makes it handle all of the stride
cases. There's a bit of a tricky bit in here around Booleans but we
currently use 32-bit for those always.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir: Add alignment information to cast derefs

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir/glsl: Add an explicit_alignment field to glsl_type

When creating explicit type, the alignment information is lost, thus
forcing explicit type users to recalculate the alignment using the same
size_align() function. Let's add a new field to cache this information.

Only structs, matrices, and vectors have and explicit alignment.  Arrays
alignment is implicitly set to its element alignment and matrices are
required to have an alignment that matches that of its vector columns.
the concept of alignment simply doesn't apply to other types.

We make the strategic choice to not allow explicit alignments on
scalars.  This is for a couple of reasons:

1. There are no cases today where we use explicit types where we want
    any other alignment for scalars than natural alignment.

2. Vectors don't have a component alignment that's separate from the
    explicit_alignment so it's impossible to get an explicitly aligned
    scalar type which is the component of the explicitly aligned vector
    type.

This choice may cause problems if we ever want to use explicitly laid
out types for things like varyings where we sometimes want vec4
alignment of scalars.  We can deal with that when the time comes.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

compiler/types: Make booleans 32-bit for cl_size/align

OpenCL doesn't mandate a size and this is consistent with the rest of
the glsl_type system. While we're here, we also clean ::cl_size() up a
bit and use a new explicit_type_scalar_byte_size() helper.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir: Expose the packed attribute attached to glsl_type objects

This should help code calculating field offsets to get it right when
the structure is marked packed.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir/glsl: Consider block interfaces as structs when it comes to size/align calculation

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

glsl: Propagate packed info in get_explicit_type_for_size_align()

Right now, when calling get_explicit_type_for_size_align() on a packed
struct, the packed attribute is lost and field offsets are wrong.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

spirv: Propagate packed information to glsl_type

We need to parse the CPacked decoration early enough to apply it when
calculating field offsets and creating the struct type.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

spirv: Don't accept CPacked decoration on struct members

CPacked decoration is only allowed on struct definitions, not struct
members.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

clover: Call nir_lower_mem_constant_vars

Fixes: 26a4c8f375e "clover/nir: Use nir_var_mem_constant for..."
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

nir: Don't bail too early in lower_mem_constant_vars

If there were no constant variables, we would bail out entirely.
However, we may still have constant input pointers coming in from the
client.

Fixes: 4360a8a2b3fce "nir/lower_io: Add support for nir_var_mem_constant"
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>

intel/nir: Stop using nir_lower_vars_to_scratch

Instead, we do a limited indirect deref lowering and then use
nir_lower_vars_to_explicit_types and nir_lower_explicit_io to lower it
as if it were SSBO or global memory access. Among other things, this
should enable pointer arithmetic on local variables. Fun!

The only shader-db change from this change on ICL was a few tiny cycle
count changes in 7 Aztec Ruins compute shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>

nir/lower_indirect_derefs: Add a threshold

Instead of always lowering everything, we add a threshold such that if
the total indirected array size (AoA size) is above that threshold, it
won't lower. It's assumed that the driver will sort things out somehow
by, for instance, lowering to scratch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>

intel/compiler: Handle all indirect lowering choices in brw_nir.c

Since everything flows through NIR and we're doing all of our indirect
deref lowering there now, there's no reason to keep making those
decisions in brw_compiler and stuffing them in the GLSL compiler
structs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>

zink: generically handle matrix types

there's a bunch of glsl 1.10 tests for this

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6268>

gallium/util: use uint sampler for stencil-reads

Some drivers can't use float-samplers to read integer textures, so let's
make sure the stenicil-sampler has the right type.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6512>

radeonsi: optimize out the loop in si_get_ps_input_cntl

Use a remap table from a semantic to an index instead of searching
for the correct index.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: replace TGSI_SEMANTIC with VARYING_SLOT and FRAG_RESULT

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: replace TGSI_INTERPOLATE with INTERP_MODE

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

compiler: add INTERP_MODE_COLOR for radeonsi

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: remove si_shader_selector::type

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: change PIPE_SHADER to MESA_SHADER (si_dump_descriptors)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: precompute si_*_descriptors_idx in si_shader_selector

It helps remove one use of sel->type.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_dump_disassembly)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: remove unused si_shader_context::type

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: change PIPE_SHADER to MESA_SHADER (si_get_shader_part)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: change PIPE_SHADER to MESA_SHADER (si_compile_llvm)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: change PIPE_SHADER to MESA_SHADER (debug flags)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_context::type)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_selector::type)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: simplify handling color interp modes in si_emit_spi_map

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: don't execute LDS stores for TCS outputs that are never read

This is a per-component version of the previous mechanism.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>

radeonsi: don't lower indirect IO in GLSL

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>

radeonsi: remove in/out/uniform variables from NIR after lowering IO

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>

radeonsi: lower IO intrinsics - complete rewrite of input/output scanning

Input and output info is gathered from intrinsics. nir_variables are
ignored (and we'll remove them anyway).

This is a prerequisite for ACO, but also makes the IR prettier.
The ac_nir_to_llvm change has to be in this commit.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>

ac/nir: handle all lowered IO intrinsics

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>

radeonsi: clean up code for loading VS inputs

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>

radeonsi: get color interpolation info from shader_info

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>

radeonsi: don't crash if input_usage_mask is 0 for a VS input

This will start happening with the lowered IO intrinstics and new scanning
code.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6445>

freedreno: fence_server_sync() fixes

Two potential problems, batch re-ordering doesn't really play nicely
with fence_server_sync(), so when we switch away from one batch, detect
the case that we need to sync, and if so flush.  The alternative of
trying to track that later batches depend on an earlier batch that had
an in-fence is hairy, and the normal use-case would be to sync at the
beginning of the frame.

But this brings up the second problem, which is that typically we'll get
told to sync on an in-fence before the first draw, which means before
mesa/st flushes down the framebuffer state to the driver.  Which means
we don't yet have the correct batch to attach the fence to.  So we need
to track the in-fence on the context, and transfer it to the batch
before draws, etc.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6575>

freedreno: Fix missing rsc->seqno updates

There were a couple paths where we weren't getting valid seqno's, which
are supposed to be updated whenever the backing bo is set/changed. So
wrap that up in a helper to make it harder to mess up.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6575>

docs: shift 20.2 rc dates by two weeks to match reality

The release candidates have slipped by a couple of weeks, so let's fix
the dates in the calendar.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6574>

docs: update calendar and link releases notes for 20.1.7

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6573>

docs: add release notes for 20.1.7

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6573>

iris: Re-emit push constants if we have a varying workgroup size

Fixes: 33c61eb2f10526 "iris: Implement ARB_compute_variable_group_size"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570>

intel/nir: Lower load_num_work_groups to 32-bit if needed

For OpenCL-style kernels, this builtin is 64-bit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570>

intel/fs: Use a single untyped surface read for load_num_work_groups

There's no good reason to split this into three. Sure, CS indirects are
only guaranteed by the spec to be DWORD aligned, but that's all untyped
surface reads require anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570>

intel/fs: Don't copy-propagate stride=0 sources into ddx/ddy

This can come up if, for instance, the shader does a derivative of a
uniform or flat input. Ideally, NIR would use divergence analysis to
get rid of the derivative in this case but it doesn't right now. This
fixes a crash in F1 2017.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6564>

st/mesa: fix lowered IO - don't call st_nir_assign_vs_in_locations twice

If IO is lowered, the second call is a no-op, which breaks:
spec@!opengl 1.1@gl-1.1-color-material-unused-normal-array

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6540>

nir: fix a bug in is_dual_slot in nir_io_add_const_offset_to_base

Fixes: 01ab308edc "nir: update IO semantics in nir_io_add_const_offset_to_base"
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6540>

iris: Patch constant data pointers into shaders

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

anv: Patch constant data pointers into shaders with using softpin

When we have softpin, we know the address of the shader constant data at
shader upload time because it's sitting at the end of the shader. This
commit changes ANV to use patch constants to embed the address in the
shader patch the right address in at upload time. This allows us to
avoid having to set up a UBO binding on-the-fly for shader constants.

This commit uses an A64 message but it's quite possible that we could
also use an A32 message and make the dataport do the 64-bit add for us.
However, load_global is what we have right now so it was easier to just
use that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

nir/builder: Add load/store_global helpers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

anv: Properly cache brw_stage_prog_data::relocs

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

intel/fs: Add support for a new load_reloc_const intrinsic

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

intel/eu: Add a mechanism for emitting relocatable constant MOVs

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

intel/eu: Include brw_compiler.h in brw_eu.h

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

anv: Stop storing the shader constant data side-band

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

intel/fs,vec4: Stuff the constant data from NIR in the end of the program

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

intel/eu: Add some new helpers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

iris: Use gen_disassemble

This one doesn't require the program size and so it won't mess up if we
have a bunch of constant data at the end.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

intel/compiler: Get rid of struct gen_disasm

It's just a container around a devinfo. The one useful purpose it did
serve is that gen_disasm_create initialized the compaction table
singletons. Now that those no longer exist, this isn't necessary.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

intel/compiler: Get rid of the global compaction table pointers

With discrete GPUs, it's going to be possible to have GPUs from two
different hardware generations in the machine at the same time. Global
singletons like this aren't going to fly. Have a struct containing the
pointers which gets initialized once per shader disassemble instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>

spirv: Deal with glslang not setting NonUniform on constructors.

Especially a problem for OpImage/OpSampledImage. Note that the problem
doesn't seem to be propagation through glslang, but only in emitting
the SPIR-V. So it is fine if we are somewhat lossy in handling this, as
long as direct Op(Sampled)Image -> texture op chains work.

Fixes: af81486a8cd "spirv: Simplify our handling of NonUniform"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3406
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6451>

spirv: Deal with glslang bug not setting the decoration for stores.

Fixes: af81486a8cd "spirv: Simplify our handling of NonUniform"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6451>

radv: Avoid deadlock on bo_list.

With the kernel timeline sysncobj changes, the kernel submits do
not necessarily happen in global vkQueueSubmit order. Which should
be fine, we added the appropriate waits for that. (See
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT in the winsys)

However, all kernel submissions take a lock on the bo_list mutex,
and since we do the wait in the winsys, we wait while having the
bo_list mutex held. This means that as soon as a wait and a signal
submission are out of order we have a deadlock on the bo_list mutex
and the wait.

Solution is to use a shared reader lock during the kernel submission,
as we only need read access for the submission.

Fixes: 6bc5ce7a91d "radv: Add timeline syncobj for timeline semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3446
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6478>

radv: Fix threading issue with submission refcounts.

If decrement == 0 then:

- it isn't safe to access the submission
- even if it is, checking that the result of the atomic_sub is 0
doesn't given an unique owner anymore.

So skip it. The submission always starts out with refcount >= 1,
so first one to decrement to 0 still get dibs on executing it.

Fixes: 4aa75bb3bdd "radv: Add wait-before-submit support for timelines."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6478>

intel/tools: Disassemble WAIT's argument as a destination

WAIT takes a notification register as a destination and a src0 argument.
Since the same notification register is specified in both fields, we
treat it as a special case and disassemble it only once.

If we disassemble it as if it is a source register, its scalar region
will be printed as <0,1,0>. This causes difficulties round-tripping
through the assembler <-> disassembler because that is not an acceptable
destination region. If we instead disassemble the destination, we
instead get a <1> region which is an acceptable and equivalent region
for source and destination.

The test .asm files are regenerated by round-tripping them through the
assembler/disassembler. Note that the <0> region in the tests was a
harmless mistake: the compiler translated it to a <0,1,0> source region
and a <1> destination region, since <0> isn't valid.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6543>

gallium/tgsi_exec: Fix up NumOutputs counting

We can get duplicate declarations for an index (for example dvec3 + float
packed into 2 vec4s, the second one won't pack into the first's array
decl), and we'd end up stepping by the wrong amount in GS vtx/prim emit.

Fixes vs-gs-fs-double, sso-vs-gs-fs-array-interleave piglit tests.

Fixes: 49155c3264d0 ("draw/tgsi: fix geometry shader input/output swizzling")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

gallium/tgsi_exec: Add missing DFLR opcode support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

nir/clone: Add a helper for cloning most instruction types

@anholt needed it for nir_to_tgsi, and the desire comes up frequently.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

nir/opt_vectorize: Add a callback for filtering of vectorizing.

For NIR-to-TGSI, we don't want to revectorize 64-bit ops that we split to
scalar beyond vec2 width. We even have some ops that we would rather
retain as scalar due to TGSI opcodes being scalar, or having more unusual
requirements.

This could be used to do the vectorize_vec2_16bit filtering, but that
shader compiler option is also used in algebraic so leave it in place for
now.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

nir: Add simplistic lowering for bany_equal/ball_inequal.

It would be nice if we could do swizzling of an expression on the
replacement side so that we could have a single ieq/ine of the vector
after CSE. However, if you do want vector operations, nir_opt_vectorize()
does just fine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

gallium/ureg: Set the next shader stage from the shader info.

Saves a loop over the linked shaders in glsl_to_tgsi which the GLSL linker
has already done for us.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

gallium/tgsi: Add a helper for initializing ureg from a shader_info.

This moves a bunch of code from glsl_to_tgsi that will be reused by
tgsi-to-nir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

gallium/tgsi: Add some missing opcodes to tgsi_ureg.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>

gallium/tgsi: Add support for PRIMITIVEID as a system value.

NIR always represents this as a system value, so for NIR-to-TGSI we need
this support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>