nir: get ffma support from NIR options for nir_lower_flrp This also fixes the inverted last parameter of nir_lower_flrp in most drivers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
nir: rename nir_op_fne to nir_op_fneu It was always fneu but naming it fne causes confusion from time to time. So lets rename it. Later we also want to add other unordered and fne, this is a smaller preparation for that. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
nir: Add an LOD parameter to image_*_size The OpenCL image_width/height/depth functions have variants which can take an LOD parameter. More importantly, LLVM-SPIRV-Translator always generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero. Given that over half the hardware out there has an LOD field for image size queries (based on a rudimentary scan through their NIR -> whatever code), we may as well just add the source to the NIR intrinsic. If this is ever a problem for anyone, the lowering is pretty trivial. I've also added asserts to everyone's drivers that should alert them if they ever see an LOD other than zero. This will never happen with GL or Vulkan so there's no need for panic. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>
broadcom/compiler: Enable PER_QUAD for UBO and SSBO loads. Helper invocations need to be able to read from UBOs since those values can be used for flow control, but writes from helper invocations need to be dropped. Fixes CTS tests: dEQP-VK.glsl.derivate.*.uniform_loop.* Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6356>
broadcom/compiler: support nir_intrinsic_load_sample_id This adds support for the intrinsic as well as the vir_SAMPID instruction that corresponds to it in vir. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6356>
nir: Add goto_if jump instruction Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401>
nir: Add nir_foreach_shader_in/out_variable helpers Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
v3d/compiler: handle compact varyings We are going to need this in Vulkan because the SPIR-V compiler defines clip distances as a single compact array of scalars, so our compiler needs to know what to do with them. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6022>
v3d: Retry with the fallback scheduler when RA fails v3d_compile is now split out into a helper function that gets called a second time if compilation fails the first time with the result reporting the register allocation failed. The second time it is run with the fallback scheduler to try and increase the chances of successfully allocating the registers. v2: Add a performance debug message when using the fallback scheduler. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5953>
v3d: Changed v3d_compile:failed to an enum Instead of just having a bool status for the failure, there is now an enum so that the compilation can report a more detailed status. Currently this is only used to report whether the failure was due to failed register allocation. The “failed” bool doesn’t seem to actually have been used anywhere so this doesn’t really change a lot. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5953>
v3d/compiler: Lower geometry output store base into offset src When generating the VPM write instruction for geometry shader outputs, emit_store_output_gs ends up adding the base and offset arguments together with an ADD instruction. The addition was done at the VIR level after scheduling so it always ends up right next to the corresponding stvpm instruction. Most of the time the offset is constant but nothing does any constant folding at the VIR level. This patch makes it instead fold the addition into the offset at the NIR level in v3d_nir_lower_io so that the NIR-level constant folding can get rid of the addition most of the time. v2: Use nir_iadd_imm to simplify the code. (Eric Anholt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5825>
v3d/compiler: Fix sorting the gs and fs inputs ntq_setup_fs_inputs and ntq_setup_gs_inputs sort the inputs according to the driver location. This input array is then used to calculate the VPM offset for the outputs in the previous stage. However, it wasn’t taking into account variables that are packed into a single varying slot. In that case they would have the same driver_location and are distinguished by location_frac. This patch makes it additionally sort by location_frac when the driver locations are equal. This can happen when the compiler packs varyings that are sized less than vec4. Without this fix, when the VPM is used to transmit data free-form between the stages (such as VS->GS) then it would end up writing to inconsistent locations. Fixes dEQP tests such as: dEQP-GLES31.functional.primitive_bounding_box.lines.global_state. vertex_geometry_fragment.default_framebuffer_bbox_equal Fixes: 5d578c27cec ("v3d: add initial compiler plumbing for geometry shaders") Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5787>
v3d: Handle the line width intrinsics Adds new QUNIFORMs to store the line widths. v2: Also handle the aa_line_width intrinsic Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
v3d: Implement the line coord intrinsic The line coord intrinsic is loaded from the implicit varying stored in the same slot as the point coord when drawing lines. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
v3d/compiler: fix image size for 1D arrays Reviewed by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5692>
v3d: Enable PIPE_CAP_TGSI_TEXCOORD. Dave wants to drop the !TEXCOORD path from NIR, and it's easy enough to do. Untested. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2952>
v3d: Use stvpmd for non-uniform offsets in GS The offset for the VPM write for storing outputs from the geometry shader isn’t necessarily uniform across all the lanes. This can happen if some of the lanes don’t emit some of the vertices. In that case the offset for the subsequent vertices will be different in each lane. In that case we need to use the stvpmd instruction instead of stvpmv because it will scatter the values out. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3150 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5621>
v3d: don't use intr->num_components for non-vectorized intrinsics Squashed-in-fix-from: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
v3d: Ask the state tracker to lower image accesses off of derefs. This saves a bunch of hassle in handling derefs in the backend, and would be needed for reasonable handling of dynamic indexing of image arrays. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
nir/lower_atomics_to_ssbo: Also lower barriers This is more correct for a pass which is supposed to completely lower away atomic counters. It also lets us stop supporting atomic counter barriers in most of the drivers. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>