mesa.git
Daniel Schürmann [Tue, 10 Mar 2020 10:50:41 +0000 (11:50 +0100)]
aco: improve register assignment when live-range splits are necessary

When finding a good place for a register, we can ignore
killed operands.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>

Daniel Schürmann [Tue, 10 Mar 2020 09:00:32 +0000 (10:00 +0100)]
aco: improve hashing for value numbering

Improved hashing greatly reduces the number of collisions
and thus speeds up lookups in the hash table.
The hash function now uses Murmur3 written by Austin Appleby.

This patch also pre-reserves space for the hashmap to avoid rehashing.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
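
As a rough illustration of the idea described above (not the code from this
patch), a Murmur3-style 32-bit hash over a sequence of words, combined with a
pre-reserved hash map, could look like the sketch below. The helper names
murmur3_words and make_value_table are hypothetical; only the mixing constants
are taken from the public-domain MurmurHash3 reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Rotate a 32-bit value left by r bits (0 < r < 32).
inline uint32_t rotl32(uint32_t x, int r)
{
   assert(r > 0 && r < 32);
   return (x << r) | (x >> (32 - r));
}

// MurmurHash3 32-bit finalizer: spreads every input bit over the whole word.
inline uint32_t fmix32(uint32_t h)
{
   h ^= h >> 16;
   h *= 0x85ebca6bu;
   h ^= h >> 13;
   h *= 0xc2b2ae35u;
   h ^= h >> 16;
   return h;
}

// Murmur3-style hash over 32-bit words (hypothetical helper for this sketch).
inline uint32_t murmur3_words(const std::vector<uint32_t>& words, uint32_t seed = 0)
{
   uint32_t h = seed;
   for (uint32_t k : words) {
      k *= 0xcc9e2d51u;
      k = rotl32(k, 15);
      k *= 0x1b873593u;
      h ^= k;
      h = rotl32(h, 13);
      h = h * 5u + 0xe6546b64u;
   }
   h ^= static_cast<uint32_t>(words.size() * sizeof(uint32_t)); // length in bytes
   return fmix32(h);
}

// Reserving space up front avoids rehashing while the table fills.
inline std::unordered_map<uint32_t, uint32_t> make_value_table(std::size_t expected_entries)
{
   std::unordered_map<uint32_t, uint32_t> table;
   table.reserve(expected_entries);
   return table;
}
```

The reserve() call only pays off when the expected number of entries can be
estimated before the pass starts, for example from the instruction count.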

Daniel Schürmann [Mon, 30 Mar 2020 16:25:00 +0000 (17:25 +0100)]
aco: add explicit padding for all Instruction sub-structs

This patch also adds static_asserts on the size of Instructions
to ensure no internal padding is present.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
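
A minimal sketch of the technique, assuming a made-up struct (ExampleInstr and
its fields are hypothetical, not ACO's real layout): every byte the compiler
would otherwise insert silently is declared as a named padding field, and a
static_assert pins the resulting size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical instruction sub-struct; field names are illustrative only.
struct ExampleInstr {
   uint16_t opcode;
   uint8_t num_operands;
   uint8_t padding0;    // explicit: would otherwise be hidden compiler padding
   uint32_t offset;
   uint8_t glc;
   uint8_t padding1[3]; // explicit tail padding up to a multiple of 4 bytes
};

// If a field is added without adjusting the padding, compilation fails here.
static_assert(sizeof(ExampleInstr) == 12, "unexpected padding in ExampleInstr");
static_assert(offsetof(ExampleInstr, offset) == 4, "offset must follow padding0");

// With no hidden padding, a bytewise comparison of zero-initialized
// structs is well defined.
inline bool same_bytes(const ExampleInstr& a, const ExampleInstr& b)
{
   return std::memcmp(&a, &b, sizeof(ExampleInstr)) == 0;
}
```

One plausible benefit, stated here as an assumption rather than as the patch's
rationale, is that raw bytes can then be hashed or compared without reading
indeterminate padding.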

Daniel Schürmann [Wed, 11 Mar 2020 12:12:08 +0000 (13:12 +0100)]
aco: guarantee that Temp fits in 4 bytes

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>

Jonathan Marek [Fri, 13 Mar 2020 15:57:23 +0000 (11:57 -0400)]
turnip: new clear/blit implementation with shader path fallback

The shader path is used to implement the following cases:
* stencil aspect mask on D24S8 (for image_to_buffer,buffer_to_image)
* clear/copy msaa destination (2D engine can't have msaa dest)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Wed, 8 Apr 2020 14:56:30 +0000 (10:56 -0400)]
turnip: add vk_format_is_snorm/is_float

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Wed, 8 Apr 2020 14:56:16 +0000 (10:56 -0400)]
turnip: rework format helpers

* Take tile_mode as input directly
* Rename tu6_format_gmem to tu6_base_format, since its use may not be
  limited to GMEM
* Add new helpers that return the correct tile_mode for the image level
  as part of the format.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Wed, 8 Apr 2020 03:25:12 +0000 (23:25 -0400)]
turnip: use dirty bits for dynamic viewport/scissor state

CmdClearAttachments shader path will overwrite this state, so it needs to
be re-emitted with dirty bits in that case.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
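
The dirty-bit pattern mentioned above can be sketched as follows. All names
(DirtyBit, CmdState, the three functions) are hypothetical and only model the
bookkeeping, not turnip's actual command stream code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical dirty flags; the names are illustrative, not the driver's.
enum DirtyBit : uint32_t {
   DIRTY_VIEWPORT = 1u << 0,
   DIRTY_SCISSOR = 1u << 1,
};

struct CmdState {
   uint32_t dirty = 0;
   int viewport_emits = 0; // counts how often viewport state was (re-)emitted
};

// vkCmdSetViewport-like entry point: remember that the state changed.
inline void set_viewport(CmdState& cmd) { cmd.dirty |= DIRTY_VIEWPORT; }

// A shader-path clear clobbers the hardware viewport/scissor, so the
// application's dynamic state must be emitted again before the next draw.
inline void clear_attachments_shader_path(CmdState& cmd)
{
   cmd.dirty |= DIRTY_VIEWPORT | DIRTY_SCISSOR;
}

// At draw time, emit only what is dirty, then clear the bits.
inline void draw(CmdState& cmd)
{
   if (cmd.dirty & DIRTY_VIEWPORT)
      cmd.viewport_emits++;
   cmd.dirty = 0;
}
```

Deferring the emission to draw time means a clear followed by several state
changes still results in a single re-emit.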

Jonathan Marek [Wed, 8 Apr 2020 02:23:27 +0000 (22:23 -0400)]
turnip: save attachment samples in renderpass state

This is needed to be able to know the number of samples during
CmdClearAttachments which can be used while the framebuffer is unknown.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Wed, 8 Apr 2020 02:20:10 +0000 (22:20 -0400)]
turnip: disable 8x msaa

Not everything supports 8x msaa, and the blob doesn't support it at all.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Wed, 8 Apr 2020 01:39:40 +0000 (21:39 -0400)]
turnip: fix nir validate failure from push constant lowering

Fixes newly added checks in nir validate failing.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Tue, 18 Feb 2020 13:54:15 +0000 (08:54 -0500)]
turnip: split up gmem/tile alignment

Note: the x1/y1 align in tu6_emit_blit_scissor was broken

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Fri, 13 Mar 2020 15:47:15 +0000 (11:47 -0400)]
turnip: RB_CCU_CNTL fixes

* Correct bypass value for a618
* Bypass value for blitter
* Don't set RB_CCU_CNTL again unnecessarily in tu6_emit_binning_pass

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Fri, 13 Mar 2020 14:20:23 +0000 (10:20 -0400)]
freedreno/a6xx: set bypass RB_CCU_CNTL value for blitter

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Jonathan Marek [Fri, 13 Mar 2020 14:09:11 +0000 (10:09 -0400)]
freedreno/registers: add RB_CCU_CNTL bitfields

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>

Samuel Pitoiset [Thu, 9 Apr 2020 09:37:27 +0000 (11:37 +0200)]
radv: allow TC-compat HTILE with GENERAL outside of render loops

This gives +8% with Wolfenstein Youngblood on my Vega64, and
according to someone else, it also improves performance with Doom
2016 and Wolfenstein 2 (and probably other id Tech games).

This improvement is because Youngblood uses GENERAL for the main
depth-only pass and TC-compat HTILE is now enabled with GENERAL if
we know that we are outside of a render loop. This obviously also
reduces the number of HTILE decompressions from/to GENERAL.

Note that Youngblood violates the Vulkan spec regarding render loops
because they are only allowed with input attachments. Expect possible
rendering issues if apps use render loops in the wrong way (i.e.
without input attachments) because HTILE might not be coherent if
a depth-stencil texture is sampled and rendered in the same draw.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2704
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4391>

Samuel Pitoiset [Tue, 31 Mar 2020 08:35:00 +0000 (10:35 +0200)]
radv: only enable TC-compat HTILE for images readable by a shader

If no texture fetches happen it's useless to enable TC-compat HTILE.

Because the driver currently doesn't support TC-compat HTILE for
storage images we don't have to check.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4497>

Samuel Pitoiset [Sun, 5 Apr 2020 07:42:50 +0000 (09:42 +0200)]
radv: only expose fp16 control features for chips with double rate fp16

This disables all fp16 shader control features on GFX8 because only
GFX9+ supports double rate packed math.

This improves consistency regarding other AMD Vulkan drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>

Samuel Pitoiset [Sun, 5 Apr 2020 07:33:43 +0000 (09:33 +0200)]
radv: only expose storageInputOutput16 for chips with double rate fp16

This feature allows using both 16-bit integers and 16-bit floats
as inputs/outputs.

This disables storageInputOutput16 on GFX8 because only GFX9+ supports
double rate packed math.

This improves consistency regarding other AMD Vulkan drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>

Samuel Pitoiset [Sun, 5 Apr 2020 07:25:18 +0000 (09:25 +0200)]
radv: only expose shaderFloat16 for chips with double rate fp16

This disables shaderFloat16 on GFX8 because only GFX9+ supports
double rate packed math.

This improves consistency regarding other AMD Vulkan drivers and
it makes no sense to enable that feature without packed math.

This also reduces performance with Wolfenstein Youngblood if
fp16 is forced enabled on GFX8, while it's similar on GFX9.

We might re-introduce that feature in the future with ACO support
if it ends up being faster and correct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>

Samuel Pitoiset [Sun, 5 Apr 2020 07:23:16 +0000 (09:23 +0200)]
ac,radv: add ac_gpu_info::has_double_rate_fp16

Only GFX9+ supports double rate packed math instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
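
A minimal sketch of how such a capability flag can be derived once and then
consulted by feature queries. The enum, the structs and both functions are
hypothetical stand-ins; only the rule "GFX9 and newer" comes from the message
above.

```cpp
#include <cassert>

// Hypothetical generation enum, ordered so that comparisons work.
enum ChipClass { GFX6, GFX7, GFX8, GFX9, GFX10 };

struct GpuInfo {
   ChipClass chip_class;
   bool has_double_rate_fp16;
};

inline GpuInfo query_gpu_info(ChipClass chip_class)
{
   GpuInfo info{};
   info.chip_class = chip_class;
   // Only GFX9+ has double rate packed math.
   info.has_double_rate_fp16 = chip_class >= GFX9;
   return info;
}

// Feature bits a driver might derive from the flag instead of repeating
// the generation check at every call site.
struct Fp16Features {
   bool shader_float16;
   bool storage_input_output16;
};

inline Fp16Features fp16_features(const GpuInfo& info)
{
   return {info.has_double_rate_fp16, info.has_double_rate_fp16};
}
```

Keeping the generation check in one place means the exposed features cannot
drift apart if the rule changes later.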

Jonathan Marek [Wed, 8 Apr 2020 23:43:24 +0000 (19:43 -0400)]
turnip: use buffer size instead of bo size for VFD_FETCH_SIZE

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>

Jonathan Marek [Wed, 18 Mar 2020 02:28:38 +0000 (22:28 -0400)]
turnip: improve vertex input handling

Emit vertexBindingDescriptionCount bindings, instead of one per attribute.

Verified with dEQP-VK.pipeline.vertex_input.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>

James Zhu [Mon, 6 Apr 2020 20:34:01 +0000 (20:34 +0000)]
radeonsi: fix Segmentation fault during vaapi enc test

Fix Segmentation fault during vaapi enc test on Arcturus.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4472>

Bas Nieuwenhuizen [Wed, 8 Apr 2020 11:53:47 +0000 (13:53 +0200)]
radv: Use correct buffer count with variable descriptor set sizes.

Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.iublimitlow.frag.ialimitlow.0

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2607
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4489>

Bas Nieuwenhuizen [Wed, 8 Apr 2020 10:51:34 +0000 (12:51 +0200)]
radv: Whitespace fixup.

Review comment that I addressed, but forgot to git add before amending ...

From https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4334

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4488>

Samuel Iglesias Gonsálvez [Wed, 8 Apr 2020 08:57:28 +0000 (10:57 +0200)]
radv: set sparseAddressSpaceSize to RADV_MAX_MEMORY_ALLOCATION_SIZE

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>

Samuel Iglesias Gonsálvez [Wed, 8 Apr 2020 08:55:37 +0000 (10:55 +0200)]
radv: check buffer size in vkCreateBuffer()

Fixes:

   dEQP-VK.api.buffer.basic.size_max_uint64

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>

Bas Nieuwenhuizen [Tue, 7 Apr 2020 20:23:09 +0000 (22:23 +0200)]
radv: Consider maximum sample distances for entire grid.

The other pixels in the grid might have samples with a larger
distance than the (0,0) pixel.

Fixes dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.samples_8_packed
when CTS is compiled with clang.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4480>
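
The fix described above amounts to taking the maximum over every pixel of the
sample grid. A small sketch, assuming the distance is the largest absolute x
or y offset (that metric and all names here are assumptions of this example):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Offset of one sample from the pixel centre, in subpixel units.
struct SampleOffset {
   int x;
   int y;
};

// samples[pixel][sample]: one list of sample offsets per pixel of the grid.
inline int max_sample_distance(const std::vector<std::vector<SampleOffset>>& samples)
{
   int max_dist = 0;
   for (const auto& pixel : samples) {     // every pixel, not just pixel (0,0)
      for (const SampleOffset& s : pixel)
         max_dist = std::max({max_dist, std::abs(s.x), std::abs(s.y)});
   }
   return max_dist;
}
```

Looking only at the first pixel underestimates the distance whenever another
pixel of the grid holds a sample that lies further out.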

Samuel Pitoiset [Fri, 13 Mar 2020 10:23:07 +0000 (11:23 +0100)]
radv: enable lowering of GS intrinsics for the LLVM backend

This replaces emit_vertex with:
   if (vertex_count < max_vertices) {
      emit_vertex_with_counter vertex_count ...
      vertex_count += 1
   }

Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.

pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)

No pipeline-db changes on other generations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
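
The guarded emit from the pseudocode in this message can be modelled on the
CPU as below. GsState and lowered_emit_vertex are hypothetical names used only
to show the effect of the guard; this is not the NIR pass itself.

```cpp
#include <cassert>
#include <vector>

struct GsState {
   unsigned max_vertices = 0;     // declared output vertex limit of the shader
   unsigned vertex_count = 0;     // counter introduced by the lowering
   std::vector<unsigned> emitted; // counter value passed with each emitted vertex
};

// Lowered form of emit_vertex: emits beyond max_vertices are dropped.
inline void lowered_emit_vertex(GsState& gs)
{
   if (gs.vertex_count < gs.max_vertices) {
      gs.emitted.push_back(gs.vertex_count); // emit_vertex_with_counter vertex_count
      gs.vertex_count += 1;
   }
}
```

Calling lowered_emit_vertex() more often than max_vertices leaves the counter
and the output unchanged after the limit is reached.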

Samuel Pitoiset [Tue, 31 Mar 2020 13:26:00 +0000 (15:26 +0200)]
radv: remove radv_layout_has_htile() helper

The goal of this function was to return whether a depth-stencil image
has HTILE, in comparison to radv_layout_is_htile_compressed() which
is used to know whether a depth-stencil image has HTILE compressed.

These two functions are actually similar and they have never been
used for what they were supposed to. Remove radv_layout_has_htile()
in favour of radv_layout_is_htile_compressed() for now. If it's
needed in the future, I will re-introduce this concept properly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>

Samuel Pitoiset [Tue, 31 Mar 2020 13:22:01 +0000 (15:22 +0200)]
radv: cleanup creating the decompress/resummarize pipelines

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>

Samuel Pitoiset [Tue, 31 Mar 2020 13:14:37 +0000 (15:14 +0200)]
radv: rename extra graphics pipeline decompress/resummarize fields

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>

Samuel Pitoiset [Tue, 31 Mar 2020 13:09:58 +0000 (15:09 +0200)]
radv: rename decompress/resummarize depth/stencil functions

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>

Jonathan Marek [Wed, 8 Apr 2020 01:13:31 +0000 (21:13 -0400)]
turnip: fix compute shaders crashing after geometry shader change

Fixes: 1af71bee734da7d8 ("turnip: Set has_gs in ir3_shader_key")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4483>

Timothy Arceri [Tue, 7 Apr 2020 13:33:55 +0000 (23:33 +1000)]
nir: make opt_if_loop_terminator() less strict

nir_cf_{extract,reinsert}() can't stitch a block together
if the block we are extracting ends in a jump, but other jumps
nested in further ifs should be fine to move.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4477>

Timothy Arceri [Tue, 7 Apr 2020 01:28:32 +0000 (11:28 +1000)]
radeonsi: don't lower constant arrays to uniforms in GLSL IR

This re-enables the change made in 2f5783bc2b82 which was
incorrectly disabled by 3e1dd99adca5.

For radeonsi, we will prefer the NIR pass as it'll generate better code
(some index calculation and a single load vs. a load, then index
calculation, then another load) and oftentimes NIR optimization can kick
in and make all the access indices constant.

Fixes: 3e1dd99adca5 ("radeonsi: Remove a bunch of default handling of pipe caps.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4474>

Dominik Behr [Tue, 22 Oct 2019 01:13:08 +0000 (18:13 -0700)]
meson: fix debug build on Android

debug_stack functions are implemented in another file for Android.
Also add backtrace library dependency.

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsber@google.com>
Signed-off-by: Dominik Behr <dbehr@chromium.org>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2435>

Bas Nieuwenhuizen [Thu, 26 Mar 2020 14:19:37 +0000 (15:19 +0100)]
radv: Store 64-bit availability bools if requested.

Fixes dEQP-VK.query_pool.*.reset_before_copy.* on RAVEN.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2296
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4334>
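
As an illustration of what the title describes, the availability value has to
use the same element width as the query results. The sketch below uses
hypothetical flag constants and a hypothetical helper; it is modelled on the
Vulkan query result flags but is not the radv implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical flag bits for this sketch (modelled on VkQueryResultFlagBits).
constexpr uint32_t RESULT_64_BIT = 0x1;
constexpr uint32_t RESULT_WITH_AVAILABILITY = 0x4;

// Writes the availability value to dst and returns the number of bytes
// written. The element width follows the 64-bit flag.
inline std::size_t write_availability(uint8_t* dst, bool available, uint32_t flags)
{
   if (!(flags & RESULT_WITH_AVAILABILITY))
      return 0;
   if (flags & RESULT_64_BIT) {
      uint64_t v = available ? 1 : 0;
      std::memcpy(dst, &v, sizeof(v));
      return sizeof(v);
   }
   uint32_t v = available ? 1 : 0;
   std::memcpy(dst, &v, sizeof(v));
   return sizeof(v);
}
```

Writing only 32 bits when 64-bit results were requested would leave the upper
half of the availability value holding whatever was in the buffer before.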

Vinson Lee [Tue, 7 Apr 2020 01:21:54 +0000 (18:21 -0700)]
gallivm: Add missing header for powf.

Fix build error after llvm-11 commit 3a29393b4709 ("Remove
math.h/cmath include from DataTypes.h").

src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c: In function ‘lp_build_linear_to_srgb’:
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: error: implicit declaration of function ‘powf’ [-Werror=implicit-function-declaration]
  194 |                                  exp2f_c * powf(coeff_f, 1.0f / exp_f));
      |                                            ^~~~
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: warning: incompatible implicit declaration of built-in function ‘powf’
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:78:1: note: include ‘<math.h>’ or provide a declaration of ‘powf’
   77 | #include "lp_bld_format.h"
  +++ |+#include <math.h>
   78 |

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4473>

Kristian H. Kristensen [Tue, 7 Apr 2020 16:34:42 +0000 (09:34 -0700)]
turnip: Drop dep_llvm from dependencies

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>

Kristian H. Kristensen [Tue, 7 Apr 2020 16:04:00 +0000 (09:04 -0700)]
turnip: Make Android platform build

We still don't have a way to keep this from breaking, but I don't
think this ever built.  Let's call it progress.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>

Kristian H. Kristensen [Tue, 7 Apr 2020 15:57:10 +0000 (08:57 -0700)]
turnip: Stub out VK_KHR_external_{fence,semaphore}_fd

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>

Kristian H. Kristensen [Tue, 7 Apr 2020 15:45:03 +0000 (08:45 -0700)]
turnip: Add missing VKAPI_ATTR annotations

Make sure the types match.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>

Rohan Garg [Tue, 7 Apr 2020 11:17:07 +0000 (13:17 +0200)]
tracie: Reformat code to fix indentation

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4435>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4435>

Rohan Garg [Wed, 1 Apr 2020 16:06:49 +0000 (18:06 +0200)]
tracie: Print results in a machine readable format

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4435>

Eric Anholt [Tue, 25 Feb 2020 22:03:36 +0000 (14:03 -0800)]
freedreno/a6xx: Set a level's pitch based on minified level0 pitch, not width0.

Found from piglit fbo-generatemipmaps failures, then tracked down with the
texturator test.  The piece that really revealed things was finding that
1024x1 linear RGBA8 on the older blob drivers would have a pitch of 5120
instead of 4096, and the following levels minified that pitch.

Fixes ~124 piglit tests (~8.5% of piglit failures) on cheza.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
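
To make the difference concrete, here is a sketch of deriving a level's pitch
from the padded level-0 pitch. The helper names and the 64-byte alignment are
placeholders chosen for this example, not the hardware's actual rule; only the
5120 versus 4096 figures come from the message above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Same idea as Mesa's u_minify(): halve value once per level, never below 1.
inline uint32_t minify(uint32_t value, unsigned levels)
{
   assert(levels < 32);
   return std::max<uint32_t>(1, value >> levels);
}

// Round value up to a multiple of alignment (which must be a power of two).
inline uint32_t align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

// Pitch of a mip level, derived from the already padded level-0 pitch
// instead of from width0 times bytes per pixel.
inline uint32_t level_pitch(uint32_t pitch0, unsigned level, uint32_t pitch_align)
{
   return align_pot(minify(pitch0, level), pitch_align);
}
```

For the 1024x1 RGBA8 case, level 1 comes out as 2560 bytes when starting from
the 5120-byte level-0 pitch, but as 2048 bytes when starting from
1024 * 4 = 4096.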

Eric Anholt [Wed, 26 Feb 2020 19:27:04 +0000 (11:27 -0800)]
freedreno: Add the outline of a test for a6xx texture layout.

Trying to work out texture layout by remembering what things looked like
in texturator is hard.  Instead, let's use texture layouts from tracing
the blob as a source of truth to make sure that we pick the same layouts
they do (and don't break known-good ones).  More testcases will be added
as I fix layout bugs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>

Eric Anholt [Tue, 25 Feb 2020 22:42:33 +0000 (14:42 -0800)]
freedreno/a6xx: Drop the "alignment" layout temporary.

It's just 1 for !3d, which means that the align we're doing in that case
is pointless.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>

Eric Anholt [Tue, 25 Feb 2020 22:40:37 +0000 (14:40 -0800)]
freedreno/a6xx: Remove the "aligned_height" temporary.

Now that we're not incrementally minifying height, we can just modify it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>

Eric Anholt [Thu, 27 Feb 2020 20:35:32 +0000 (12:35 -0800)]
freedreno/a6xx: Sink the per-level size temps inside the loop.

u_minify(n, 1) is no cheaper than u_minify(n, level), and this makes the
logic a lot simpler to follow.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>

Michel Dänzer [Fri, 3 Apr 2020 09:46:12 +0000 (11:46 +0200)]
gitlab-ci: Run merge request pipelines automatically only for Marge Bot

MR pipelines not triggered by Marge Bot can still be triggered manually.

Motivation: The main & forked Mesa project CI pipelines combined are
currently generating over 1 TB of egress traffic per week. ~80% of this
is from pre-merge pipelines. Assuming this corresponds to 4 pre-merge
and one post-merge pipeline per MR on average, this change could
potentially eliminate up to ~60% of the overall traffic (by preventing
3 of the 4 pre-merge pipelines from running automatically).

(Of course, this could be subverted if all jobs of the other pipelines
were triggered manually anyway... In most cases, manually triggering
just a few jobs should suffice)

v2:
* $GITLAB_USER_NAME was the wrong variable, $GITLAB_USER_LOGIN should
  do the trick.

Suggested-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>
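
The ~60% estimate in the motivation follows from simple arithmetic, which the
sketch below spells out. It assumes, as the message does, that all pre-merge
pipelines of an MR generate roughly the same traffic; saved_percent is a
hypothetical helper name.

```cpp
#include <cassert>

// Percentage of total traffic saved if `skipped` of the `pre_merge_runs`
// pre-merge pipelines per MR no longer run automatically.
constexpr unsigned saved_percent(unsigned pre_merge_share_percent,
                                 unsigned pre_merge_runs, unsigned skipped)
{
   return pre_merge_share_percent * skipped / pre_merge_runs;
}

// 80% pre-merge share, 3 of 4 pre-merge pipelines prevented.
static_assert(saved_percent(80, 4, 3) == 60, "matches the ~60% estimate");
```

The figure is an upper bound, since manually triggered jobs in the skipped
pipelines still cost traffic.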

Michel Dänzer [Fri, 3 Apr 2020 10:50:11 +0000 (12:50 +0200)]
gitlab-ci: Don't require triggering build/test jobs manually

Let them run automatically once all their dependencies have passed.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>

Michel Dänzer [Fri, 3 Apr 2020 13:59:31 +0000 (15:59 +0200)]
gitlab-ci/lava: Add needs: for container image to test jobs (again)

Without this, the test jobs could spuriously run after the container
job failed or was cancelled, even if the build job didn't run at all.

(I already did this in 94cfe590703018cf3d34a0c1f8667064919bf843, but it
got dropped accidentally in 22d976454f4e50142116f4544c0bbf11134ce991)

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>

Michel Dänzer [Fri, 3 Apr 2020 09:17:48 +0000 (11:17 +0200)]
gitlab-ci: Rename "paths" YAML anchor to "all_paths"

To avoid confusion with `paths:` elements.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>

Caio Marcelo de Oliveira Filho [Tue, 27 Mar 2018 17:10:34 +0000 (10:10 -0700)]
anv/gen12: Lower VK_KHR_multiview using Primitive Replication

Identify if view_index is used only for position calculation, and use
Primitive Replication to implement Multiview in Gen12.  This feature
allows storing per-view position information in a single execution of
the shader, treating position as an array.

The shader is transformed by adding a for-loop around it that has an
iteration per active view (in the view_mask).  Stores to the position
now store into the position array for the current index in the loop,
and load_view_index() will return the view index corresponding to the
current index in the loop.

The feature is controlled by setting the environment variable
ANV_PRIMITIVE_REPLICATION_MAX_VIEWS, which defaults to 2 if unset.
For pipelines with view counts larger than that, the regular
instancing will be used instead of Primitive Replication.  To disable
it completely set the variable to 0.

v2: Don't assume position is set in vertex shader; remove only stores
    for position; don't apply optimizations since other passes will
    do; clone shader body without extract/reinsert; don't use
    last_block (potentially stale). (Jason)

    Fix view_index immediate to contain the view index, not its order.
    Check for maximum number of views supported.
    Add guard for gen12.

v3: Clone the entire shader function and change it before reinsert;
    disable optimization when shader has memory writes. (Jason)

    Use a single environment variable with _DEBUG on the name.

v4: Change to use new nir_deref_instr.
    When removing stores, look for mode nir_var_shader_out instead
    of walking the list of outputs.
    Ensure unused derefs are removed in the non-position part of the
    shader.
    Remove dead control flow when identifying whether primitive
    replication can be used.

v5: Consider all the active shaders (including fragment) when deciding
    that Primitive Replication can be used.
    Change environment variable to ANV_PRIMITIVE_REPLICATION.
    Squash the emission of 3DSTATE_PRIMITIVE_REPLICATION into this patch.
    Disable Prim Rep in blorp_exec_3d.

v6: Use a loop around the shader, instead of manually unrolling, since
    the regular unroll pass will kick in.
    Document that we don't expect to see copy_deref or load_deref
    involving the position variable.
    Recover use_primitive_replication value when loading pipeline from
    the cache.
    Set VARYING_SLOT_LAYER to 0 in the shader.  Earlier versions were
    relying on ForceZeroRTAIndexEnable but that might not be
    sufficient.
    Disable Prim Rep in cmd_buffer_so_memcpy.

v7: Don't use Primitive Replication if position is not set, fall back
    to instancing; change environment variable to be
    ANV_PRIMITIVE_REPLICATION_MAX_VIEWS and default it to 2 based on
    experiments.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
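
A CPU-side sketch of two pieces of the description above: choosing between
Primitive Replication and instancing from the view count, and the mapping from
loop iteration to view index. All function names are hypothetical; the default
of 2 and the meaning of 0 are taken from the message, everything else is an
assumption of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Parses the value of ANV_PRIMITIVE_REPLICATION_MAX_VIEWS as passed in by
// the caller: 2 if unset, 0 disables the feature.
inline unsigned parse_max_views(const char* value)
{
   if (!value)
      return 2; // default when the variable is unset
   return static_cast<unsigned>(std::strtoul(value, nullptr, 10));
}

// Number of active views, i.e. set bits in the view mask.
inline unsigned count_views(uint32_t view_mask)
{
   unsigned n = 0;
   for (; view_mask; view_mask &= view_mask - 1)
      n++;
   return n;
}

// Pipelines with more views than the limit fall back to instancing.
inline bool use_primitive_replication(uint32_t view_mask, unsigned max_views)
{
   unsigned n = count_views(view_mask);
   return max_views != 0 && n != 0 && n <= max_views;
}

// One loop iteration per active view. Iteration i stores to position[i],
// while load_view_index() yields the view index itself, not its order.
inline std::vector<unsigned> view_of_iteration(uint32_t view_mask)
{
   std::vector<unsigned> views;
   for (unsigned view = 0; view < 32; view++) {
      if (view_mask & (1u << view))
         views.push_back(view);
   }
   return views;
}
```

With a view mask of 0x5, the loop runs twice and the second iteration sees
view index 2 while writing to position slot 1.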

Caio Marcelo de Oliveira Filho [Fri, 21 Sep 2018 23:07:38 +0000 (16:07 -0700)]
intel/fs: Allow multiple slots for position

Change brw_compute_vue_map() to also take the number of pos slots.  If
more than one slot is used, the VARYING_SLOT_POS is treated as an
array.

When using Primitive Replication, instead of a single position, the
VUE must contain an array of positions.  Padding might be
necessary (after clip distance) to ensure the rest of the attributes
start aligned.

v2: Add note about array in the commit message and assert that
    pos_slots >= 1 to make clear 0 is invalid. (Jason)
    Move padding to be after the clip distance.

v3: Apply the correct offset when gathering the sources from outputs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
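
The slot arithmetic described in this commit can be illustrated with a
minimal sketch. It is a simplified model, not the code in
brw_compute_vue_map(): the helper name first_generic_slot, the single
header slot before the position, and the padding to an even slot index
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a Mesa function: returns the index of the
 * first slot left for the remaining attributes in a simplified VUE
 * layout. */
static int
first_generic_slot(int pos_slots, bool has_clip_dist0, bool has_clip_dist1)
{
   /* The commit asserts the same precondition: 0 slots is invalid. */
   assert(pos_slots >= 1);

   int slot = 0;

   slot += 1;                      /* assumed: one header slot first */
   slot += pos_slots;              /* position treated as an array */
   slot += has_clip_dist0 ? 1 : 0; /* clip distances follow position */
   slot += has_clip_dist1 ? 1 : 0;

   /* Assumed alignment rule: pad after the clip distances so that the
    * remaining attributes start at an even slot index. */
   slot += slot % 2;

   return slot;
}
```

Under these assumptions, one position slot without clip distances needs
no padding, while two position slots push the next attribute from slot
3 to slot 4.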

4 years agointel/gen12: Add XML description for 3DSTATE_PRIMITIVE_REPLICATION
Caio Marcelo de Oliveira Filho [Sat, 12 Oct 2019 00:04:36 +0000 (17:04 -0700)]
intel/gen12: Add XML description for 3DSTATE_PRIMITIVE_REPLICATION

v2: Use groups for the 16-element arrays "Viewport Offset"
    and "RTAI Offset". (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>

4 years agonir: Add per_view attribute to nir_variable
Caio Marcelo de Oliveira Filho [Tue, 11 Feb 2020 22:41:05 +0000 (14:41 -0800)]
nir: Add per_view attribute to nir_variable

If a nir_variable is tagged with per_view, it must be an array with
size corresponding to the number of views.  For slot-tracking, it is
considered to take just the slot for a single element -- drivers will
take care of expanding this appropriately.

This will be used to implement the ability of having per-view position
in a vertex shader in Intel platforms.

Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
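
As a rough illustration of the slot-tracking rule above, the sketch
below counts slots for a variable. The struct var_info and the function
tracked_slots are hypothetical stand-ins and do not reflect the real
nir_variable layout or any NIR helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of a variable. */
struct var_info {
   unsigned array_len;         /* number of array elements */
   unsigned slots_per_element; /* slots taken by one element */
   bool per_view;              /* tagged as per-view */
};

static unsigned
tracked_slots(const struct var_info *var, unsigned num_views)
{
   if (var->per_view) {
      /* A per-view variable must be an array sized to the view count. */
      assert(var->array_len == num_views);
      /* Only one element is counted; the driver expands it later. */
      return var->slots_per_element;
   }

   /* Ordinary arrays are counted in full. */
   return var->array_len * var->slots_per_element;
}
```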

4 years agomesa: add support for NV_pixel_buffer_object
Simon Ser [Thu, 2 Apr 2020 20:53:41 +0000 (22:53 +0200)]
mesa: add support for NV_pixel_buffer_object

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4422>

4 years agoturnip: implement timestamp query
Jonathan Marek [Tue, 3 Mar 2020 01:52:15 +0000 (20:52 -0500)]
turnip: implement timestamp query

Passes tests in:
dEQP-VK.pipeline.timestamp.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4027>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4027>

4 years agoturnip: Enable geometryShader device feature
Brian Ho [Thu, 2 Apr 2020 18:01:54 +0000 (11:01 -0700)]
turnip: Enable geometryShader device feature

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Enable geometry shaders for CP_DRAWs
Brian Ho [Wed, 1 Apr 2020 20:36:31 +0000 (13:36 -0700)]
turnip: Enable geometry shaders for CP_DRAWs

Enable geometry shading on draw if the pipeline has a geometry
stage.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Populate tu_pipeline.active_stages
Brian Ho [Mon, 6 Apr 2020 20:38:04 +0000 (13:38 -0700)]
turnip: Populate tu_pipeline.active_stages

This can be used to determine if the pipeline has a specific shader
stage (e.g. geometry shader).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
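
A stage query of the kind this commit enables could look like the
sketch below. The struct, the enum and pipeline_has_stage are
hypothetical; the bit values mirror the Vulkan shader stage flags, but
whether tu_pipeline.active_stages stores exactly those flags is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stage bits, using the same values as the Vulkan flags. */
enum stage_bit {
   STAGE_VERTEX_BIT   = 0x01,
   STAGE_GEOMETRY_BIT = 0x08,
   STAGE_FRAGMENT_BIT = 0x10,
};

/* Hypothetical stand-in for the pipeline object. */
struct pipeline {
   uint32_t active_stages; /* OR of the stage bits present */
};

static bool
pipeline_has_stage(const struct pipeline *pipeline, uint32_t stage_bit)
{
   assert(pipeline != NULL);
   return (pipeline->active_stages & stage_bit) != 0;
}
```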

4 years agoturnip: Update maxGeometryShaderInvocations to match blob
Brian Ho [Fri, 3 Apr 2020 14:59:47 +0000 (07:59 -0700)]
turnip: Update maxGeometryShaderInvocations to match blob

Geometry shaders support an invocations parameter up to a limit
defined by maxGeometryShaderInvocations. This was set to 127, but
executing with invocations > 32 causes a crash. As it turns out, the
blob only advertises a max of 32 invocations, so we set that in
turnip as well.

Fixes dEQP-VK.geometry.instanced.draw_*_instances_{127, 64}_geometry_invocations

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Selectively configure GRAS_LAYER_CNTL
Brian Ho [Fri, 3 Apr 2020 14:57:25 +0000 (07:57 -0700)]
turnip: Selectively configure GRAS_LAYER_CNTL

One of the features of geometry shaders is the ability to render to
different layers by assigning to the gl_Layer (Layer in SPIR-V)
builtin.

While we have already plumbed the layer regid to the geometry shader,
we also need to configure GRAS_LAYER_CNTL to actually use layered
rendering.
In addition, gmem does not support layered rendering, so we need to
force sysmem.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Set up REG_A6XX_SP_GS_CONFIG
Brian Ho [Wed, 1 Apr 2020 20:26:17 +0000 (13:26 -0700)]
turnip: Set up REG_A6XX_SP_GS_CONFIG

Updates GS_CONFIG and HLSQ_GS_CNTL registers to match those emitted
by the blob and fd.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Configure VFD_CONTROL with gsheader and primitiveid
Brian Ho [Wed, 1 Apr 2020 20:21:26 +0000 (13:21 -0700)]
turnip: Configure VFD_CONTROL with gsheader and primitiveid

This commit updates VFD_CONTROL to use the GS header and primitive
ID sysvals if a geometry shader stage is present in the pipeline.
Like in the case of VPC, the code here is adapted from fd6_program.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Configure VPC for geometry shaders
Brian Ho [Wed, 1 Apr 2020 18:50:55 +0000 (11:50 -0700)]
turnip: Configure VPC for geometry shaders

This commit updates tu6_emit_vpc to selectively emit GS-specific
configuration. Most of this is repurposed from fd6_program.c.

This also moves `link_geometry_stages` into ir3_nir_lower_tess.c
so it can be shared between fd and tu.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Emit geometry shader obj and related consts
Brian Ho [Wed, 1 Apr 2020 18:09:48 +0000 (11:09 -0700)]
turnip: Emit geometry shader obj and related consts

Like with other shader types, we need to emit the geometry shader
object and the consts it uses. In addition, we need to emit
additional geometry-specific consts that link primitive/vertex stride
between the vs and gs. In conjunction with the gsheader, these are
used by the vs to determine where to stlw outputs and used by the gs
to determine where to ldlw those outputs from.

FD emits these consts in the draw call because in GL, you can mix
and match shaders in different programs. In Vulkan, however, we
compile and link the shaders at pipeline creation, so we can emit
these in the pipeline IB instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoturnip: Set has_gs in ir3_shader_key
Brian Ho [Wed, 1 Apr 2020 16:01:52 +0000 (09:01 -0700)]
turnip: Set has_gs in ir3_shader_key

The ir3 compiler only lowers the VS and GS for geometry shading if
the corresponding has_gs key is set in the shader key. Without it,
GS-specific intrinsics like load_per_vertex_input won't get lowered
and the GS header will be initialized with invalid values.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>

4 years agoradv: Print shader stage before disassembly.
Timur Kristóf [Wed, 1 Apr 2020 11:43:50 +0000 (13:43 +0200)]
radv: Print shader stage before disassembly.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco: Print shader stage in aco_print_program.
Timur Kristóf [Wed, 1 Apr 2020 11:39:25 +0000 (13:39 +0200)]
aco: Print shader stage in aco_print_program.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoradv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.
Timur Kristóf [Tue, 31 Mar 2020 08:41:01 +0000 (10:41 +0200)]
radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.
Timur Kristóf [Wed, 1 Apr 2020 13:55:41 +0000 (15:55 +0200)]
aco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.

It is recommended to do this as quickly as possible.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco/ngg: Schedule position exports of NGG VS/TES.
Timur Kristóf [Wed, 1 Apr 2020 14:02:13 +0000 (16:02 +0200)]
aco/ngg: Schedule position exports of NGG VS/TES.

Similarly to the HW VS stage, the HW NGG GS stage also
benefits from executing these exports as early as possible.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco/ngg: Implement NGG VS and TES.
Timur Kristóf [Wed, 1 Apr 2020 10:29:30 +0000 (12:29 +0200)]
aco/ngg: Implement NGG VS and TES.

When NGG is used, vertex and tess eval shaders are executed on the
hardware NGG geometry stage. There is a series of steps they
must perform:

* Request GS space using GS_ALLOC_REQ
* Export the primitive
* Finally, export the normal VS outputs

In this commit, two modes are implemented:

* "late" which matches what the RADV LLVM backend currently does
* "early" which is an optimized version as seen in radeonsi

Vulkan doesn't allow the shader to write the edge flags, so we can
currently always use the "early" mode.

Exporting the primitive ID is also supported by having the GS threads
write it into LDS and reading it from LDS in the ES threads.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco/ngg: Setup NGG VS and TES stages.
Timur Kristóf [Wed, 1 Apr 2020 10:18:50 +0000 (12:18 +0200)]
aco/ngg: Setup NGG VS and TES stages.

ngg_vertex_gs and ngg_tess_eval_gs work very similarly to
vertex_vs and tess_eval_vs, but they run on the HW NGG GS stage.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco/ngg: Fix exports for NGG VS and TES.
Timur Kristóf [Wed, 1 Apr 2020 10:14:59 +0000 (12:14 +0200)]
aco/ngg: Fix exports for NGG VS and TES.

The exports in NGG VS and TES work just like VS exports,
so the assembler needs to fix these too in the same manner.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco/ngg: Initialize exec mask for NGG VS and TES.
Timur Kristóf [Wed, 1 Apr 2020 10:14:00 +0000 (12:14 +0200)]
aco/ngg: Initialize exec mask for NGG VS and TES.

They behave like merged ESGS shaders, so the exec mask needs
to be manually initialized for these NGG shaders too.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco/ngg: Add new stage for hw_ngg_gs.
Timur Kristóf [Mon, 27 Jan 2020 07:17:47 +0000 (08:17 +0100)]
aco/ngg: Add new stage for hw_ngg_gs.

This is needed to distinguish between NGG and legacy.
Otherwise, vertex_geometry_gs and ngg_vertex_geometry_gs
have the same value, which we want to avoid.

Also, there is no such thing as ngg_vertex_tess_control_hs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco: Treat s_setprio as a scheduling barrier.
Timur Kristóf [Wed, 1 Apr 2020 13:38:43 +0000 (15:38 +0200)]
aco: Treat s_setprio as a scheduling barrier.

We want to execute instructions after s_setprio in the given
priority, so we must prevent the scheduler from scheduling beyond
s_setprio, otherwise some instructions could be executed in a
different priority.

Rename hazard_fail_memtime to hazard_fail_unreorderable and include
s_setprio in the list of unreorderable opcodes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
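
The barrier behaviour described above can be sketched as a scan that
stops at the first unreorderable opcode. This is not the ACO scheduler:
the opcode enum, check_candidate and schedulable_window are
hypothetical, and treating s_memtime as the other unreorderable opcode
is an assumption based on the old name hazard_fail_memtime.

```c
#include <assert.h>

/* Hypothetical, heavily reduced opcode list. */
enum opcode { OP_S_MEMTIME, OP_S_SETPRIO, OP_V_ADD_F32, OP_COUNT };

enum hazard_result { hazard_success, hazard_fail_unreorderable };

static enum hazard_result
check_candidate(enum opcode op)
{
   assert(op < OP_COUNT);

   switch (op) {
   case OP_S_MEMTIME: /* assumed member of the unreorderable list */
   case OP_S_SETPRIO: /* added by this commit */
      return hazard_fail_unreorderable;
   default:
      return hazard_success;
   }
}

/* Counts how many instructions after `start` may be considered for
 * reordering before an unreorderable opcode ends the search. */
static int
schedulable_window(const enum opcode *instrs, int count, int start)
{
   int window = 0;

   for (int i = start + 1; i < count; i++) {
      if (check_candidate(instrs[i]) == hazard_fail_unreorderable)
         break;
      window++;
   }

   return window;
}
```

In this model nothing is ever considered across s_setprio, so
instructions stay on the side of the priority change where they were
emitted.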

4 years agoaco: Extract merged_wave_info_to_mask to its own function.
Timur Kristóf [Tue, 31 Mar 2020 08:49:52 +0000 (10:49 +0200)]
aco: Extract merged_wave_info_to_mask to its own function.

Currently we only use this at the beginning of merged shader parts,
but we are going to need to use it with some NGG code as well.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
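
For readers unfamiliar with the idea, converting a per-part thread
count into a lane mask can be sketched as follows. The function name
wave_info_to_mask and the field layout (one 8-bit thread count per
shader part) are assumptions for illustration; the real helper emits
ACO IR instead of computing the value on the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU-side model: builds a mask with one bit set for each
 * active thread of the given shader part. */
static uint64_t
wave_info_to_mask(uint32_t merged_wave_info, unsigned part)
{
   assert(part < 4);

   /* Assumed layout: an 8-bit thread count per part. */
   unsigned count = (merged_wave_info >> (part * 8)) & 0xff;

   /* Shifting a 64-bit value by 64 is undefined in C, so a full wave
    * is handled separately. */
   if (count >= 64)
      return UINT64_MAX;

   return (UINT64_C(1) << count) - 1;
}
```

The resulting mask is the kind of value that would be written to exec
when a merged shader part starts.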

4 years agoaco: Print block_kind_export_end.
Timur Kristóf [Mon, 27 Jan 2020 07:16:29 +0000 (08:16 +0100)]
aco: Print block_kind_export_end.

Useful when debugging issues with exports.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco: Extract uniform if handling to separate functions.
Timur Kristóf [Wed, 22 Jan 2020 17:21:43 +0000 (18:21 +0100)]
aco: Extract uniform if handling to separate functions.

Currently we only use this for uniform ifs that come from NIR,
but we are going to need to use it with some NGG parts as well.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>

4 years agoaco: Fix crash in insert_wait_states.
Timur Kristóf [Mon, 6 Apr 2020 14:34:45 +0000 (16:34 +0200)]
aco: Fix crash in insert_wait_states.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4465>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4465>

4 years agopan/bit: Wire up add/add op+test
Alyssa Rosenzweig [Mon, 6 Apr 2020 18:15:37 +0000 (14:15 -0400)]
pan/bit: Wire up add/add op+test

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bit: Add fmin/max16 tests
Alyssa Rosenzweig [Mon, 6 Apr 2020 19:17:03 +0000 (15:17 -0400)]
pan/bit: Add fmin/max16 tests

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bit: Enable more debug for `run`
Alyssa Rosenzweig [Mon, 6 Apr 2020 18:06:59 +0000 (14:06 -0400)]
pan/bit: Enable more debug for `run`

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bit: Add min/max support to interpreter
Alyssa Rosenzweig [Mon, 6 Apr 2020 17:08:44 +0000 (13:08 -0400)]
pan/bit: Add min/max support to interpreter

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bit: Unify test frontends
Alyssa Rosenzweig [Mon, 6 Apr 2020 17:03:06 +0000 (13:03 -0400)]
pan/bit: Unify test frontends

Random.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bi: Force ADD scheduling for MINMAX
Alyssa Rosenzweig [Mon, 6 Apr 2020 18:16:17 +0000 (14:16 -0400)]
pan/bi: Force ADD scheduling for MINMAX

Might be GPU version specific.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bi: Fix incorrect abs flip in fma/fadd16
Alyssa Rosenzweig [Mon, 6 Apr 2020 18:13:13 +0000 (14:13 -0400)]
pan/bi: Fix incorrect abs flip in fma/fadd16

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bi: Set BI_MODS for MINMAX
Alyssa Rosenzweig [Mon, 6 Apr 2020 18:06:43 +0000 (14:06 -0400)]
pan/bi: Set BI_MODS for MINMAX

We support it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bi: Add ADD add/min/max fp32 packing
Alyssa Rosenzweig [Mon, 6 Apr 2020 17:48:26 +0000 (13:48 -0400)]
pan/bi: Add ADD add/min/max fp32 packing

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bi: Structify ADD unit add/min/max
Alyssa Rosenzweig [Mon, 6 Apr 2020 17:48:06 +0000 (13:48 -0400)]
pan/bi: Structify ADD unit add/min/max

..since it's missing for FMA

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bi: Implement min/max on FMA
Alyssa Rosenzweig [Mon, 6 Apr 2020 17:27:12 +0000 (13:27 -0400)]
pan/bi: Implement min/max on FMA

Unfortunately, while this looks fine to the disasm, it's raising
INSTR_INVALID_ENC on my g31 board here. Looks like it might be ADD only
on newer Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bit: Add special unit test
Alyssa Rosenzweig [Mon, 6 Apr 2020 13:51:56 +0000 (09:51 -0400)]
pan/bit: Add special unit test

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bit: Add special op interpreting
Alyssa Rosenzweig [Mon, 6 Apr 2020 13:39:41 +0000 (09:39 -0400)]
pan/bit: Add special op interpreting

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>

4 years agopan/bi: Add fp16 support for frcp/frsq
Alyssa Rosenzweig [Mon, 6 Apr 2020 14:36:10 +0000 (10:36 -0400)]
pan/bi: Add fp16 support for frcp/frsq

More ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4470>