Connor Abbott [Thu, 19 Mar 2020 13:15:26 +0000 (14:15 +0100)]
ir3: Plumb through bindless support
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Fri, 20 Mar 2020 14:25:59 +0000 (15:25 +0100)]
ir3: LDC also has a destination
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Fri, 20 Mar 2020 14:25:11 +0000 (15:25 +0100)]
ir3: Also don't propagate immediate offset with LDC
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Wed, 18 Mar 2020 17:06:41 +0000 (18:06 +0100)]
ir3: Plumb through support for a1.x
This will need to be used in some cases for the upcoming bindless
support, plus ldc.k instructions which push data from a UBO to const
registers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Fri, 6 Mar 2020 17:06:06 +0000 (18:06 +0100)]
ir3: Add bindless instruction encoding
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Fri, 6 Mar 2020 10:29:54 +0000 (11:29 +0100)]
freedreno/a6xx: Add registers for the bindless model
In Vulkan, descriptors for samplers, SSBO's, etc. are collected into
descriptor sets, and shaders can use multiple descriptor sets. At
command-recording time, users can swap out only some of the descriptor
sets, and the driver is supposed to do the minimum amount necessary to
update any internal binding tables, knowing that only some of the
descriptors have changed.
With the old binding model, focused on GL, where there are separate
tables for each type of resource, we can do somewhat better than now by
preserving descriptors from lower descriptor sets when switching higher
descriptor sets. However we still have to copy around descriptors before
each draw.
At least for a6xx, qualcomm went further, essentially copying the Vulkan
binding model as an alternate way to load resources. There's an array of
registers (actually an array for compute and one for everything else),
where each register holds a pointer to a descriptor set that can contain
various different descriptor types. The descriptors are padded out to 16
dwords, so that every instruction can use an index instead of a dword
offset. It's called "bindless", I think, because it can also be used to
implement the old GL bindless extensions (presumably it allows more
samplers and textures than the old model).
This commit adds the register and cmdstream parts. Next up will be the
instruction encoding.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Fri, 6 Mar 2020 10:27:46 +0000 (11:27 +0100)]
freedreno/a6xx: Add UBO size field
Verified with the vulkan blob, which uses ldc and UBO descriptors, and
turnip will too soon.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Wed, 18 Mar 2020 12:12:31 +0000 (13:12 +0100)]
tu: ir3: Emit push constants directly
Carve out some space at the beginning for push constants, and push them
directly, rather than remapping them to a UBO and then relying on the
UBO pushing code. Remapping to a UBO is easy now, where there's a single
table of UBO's, but with the bindless model it'll be a lot harder. I
haven't removed all the code to move the remaining UBO's over by 1,
though, because it's going to all get rewritten with bindless anyways.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Connor Abbott [Thu, 26 Mar 2020 14:35:11 +0000 (15:35 +0100)]
tu: Dump out shader assembly when requested
We don't use the ir3 variant machinery, so we have to do this ourselves.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
Daniel Schürmann [Tue, 7 Apr 2020 16:15:35 +0000 (17:15 +0100)]
aco: RA - move all std::function objects into proper functions
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Tue, 7 Apr 2020 15:46:58 +0000 (16:46 +0100)]
aco: move all needed helper containers to ra_ctx
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Wed, 11 Mar 2020 10:02:20 +0000 (11:02 +0100)]
aco: change live_out variables to std::unordered_set
Improves performance of live_var_analysis for larger shaders
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Wed, 11 Mar 2020 09:47:07 +0000 (10:47 +0100)]
aco: change some std::map to std::unordered_map in register_allocation
This improves compile times slightly for larger shaders
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Wed, 11 Mar 2020 07:38:48 +0000 (08:38 +0100)]
aco: refactor try_remove_trivial_phi() in RA
Minor refactoring to avoid some pointer chasing.
This patch also changes the live_out argument to be
passed by reference to avoid an unnecessary copy.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Tue, 10 Mar 2020 12:39:42 +0000 (13:39 +0100)]
aco: improve speed of live_var_analysis
by merging live_sgprs and live_vgprs sets.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Tue, 10 Mar 2020 11:41:02 +0000 (12:41 +0100)]
aco: during RA only insert into renames table if a variable got renamed
This improves the speed of register allocation.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Tue, 10 Mar 2020 10:47:30 +0000 (11:47 +0100)]
aco: replace assignment hashmap by std::vector in register allocation
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Tue, 10 Mar 2020 10:50:41 +0000 (11:50 +0100)]
aco: improve register assignment when live-range splits are necessary
When finding a good place for a register, we can ignore
killed operands.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Tue, 10 Mar 2020 09:00:32 +0000 (10:00 +0100)]
aco: improve hashing for value numbering
An improved hashing greatly reduces the number of collisions,
and thus, increases the speed for lookups in the hash table.
The hash function now uses Murmur3 written by Austin Appleby.
This patch also pre-reserves space for the hashmap to avoid rehashing.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Mon, 30 Mar 2020 16:25:00 +0000 (17:25 +0100)]
aco: add explicit padding for all Instruction sub-structs
This patch also adds static_asserts on the size of Instructions
to ensure no internal padding is present.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Daniel Schürmann [Wed, 11 Mar 2020 12:12:08 +0000 (13:12 +0100)]
aco: guarantee that Temp fits in 4 bytes
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
Jonathan Marek [Fri, 13 Mar 2020 15:57:23 +0000 (11:57 -0400)]
turnip: new clear/blit implementation with shader path fallback
The shader path is used to implement the following cases:
* stencil aspect mask on D24S8 (for image_to_buffer,buffer_to_image)
* clear/copy msaa destination (2D engine can't have msaa dest)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Wed, 8 Apr 2020 14:56:30 +0000 (10:56 -0400)]
turnip: add vk_format_is_snorm/is_float
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Wed, 8 Apr 2020 14:56:16 +0000 (10:56 -0400)]
turnip: rework format helpers
* Take tile_mode as input directly
* tu6_format_gmem to tu6_base_format, use may not be limited to GMEM
* Add new helpers that will return the correct tile_mode as for image level
as part of the format.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Wed, 8 Apr 2020 03:25:12 +0000 (23:25 -0400)]
turnip: use dirty bits for dynamic viewport/scissor state
CmdClearAttachments shader path will overwrite this state, so it needs to
be re-emitted with dirty bits in that case.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Wed, 8 Apr 2020 02:23:27 +0000 (22:23 -0400)]
turnip: save attachment samples in renderpass state
This is needed to be able to know the number of samples during
CmdClearAttachments which can be used while the framebuffer is unknown.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Wed, 8 Apr 2020 02:20:10 +0000 (22:20 -0400)]
turnip: disable 8x msaa
Not everything supports 8x msaa, and the blob doesn't support it at all.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Wed, 8 Apr 2020 01:39:40 +0000 (21:39 -0400)]
turnip: fix nir validate failure from push constant lowering
Fixes newly added checks in nir validate failing.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Tue, 18 Feb 2020 13:54:15 +0000 (08:54 -0500)]
turnip: split up gmem/tile alignment
Note: the x1/y1 align in tu6_emit_blit_scissor was broken
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Fri, 13 Mar 2020 15:47:15 +0000 (11:47 -0400)]
turnip: RB_CCU_CNTL fixes
* Correct bypass value for a618
* Bypass value for blitter
* Don't set RB_CCU_CNTL again unnecessarily in tu6_emit_binning_pass
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Fri, 13 Mar 2020 14:20:23 +0000 (10:20 -0400)]
freedreno/a6xx: set bypass RB_CCU_CNTL value for blitter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Jonathan Marek [Fri, 13 Mar 2020 14:09:11 +0000 (10:09 -0400)]
freedreno/registers: add RB_CCU_CNTL bitfields
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Samuel Pitoiset [Thu, 9 Apr 2020 09:37:27 +0000 (11:37 +0200)]
radv: allow TC-compat HTILE with GENERAL outside of render loops
This gives +8% with Wolfeinstein Youngblood on my Vega64, and
according to someone else, it also improves performance with Doom
2016 and Wolfenstein 2 (and probably other ID Tech games).
This improvement is because Youngblood uses GENERAL for the main
depth-only pass and TC-compat HTILE is now enabled with GENERAL if
we know that we are outside of a render loop. This obviously also
reduces the number of HTILE decompressions from/to GENERAL.
Note that Youngblood violates the Vulkan spec regarding render loops
because they are only allowed with input attachments. Expect possible
rendering issues if apps use render loops with the wrong way (ie.
without input attachmens) because HTILE might not be coherent if
a depth-stencil texture is sampled and rendered in the same draw.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2704
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4391>
Samuel Pitoiset [Tue, 31 Mar 2020 08:35:00 +0000 (10:35 +0200)]
radv: only enable TC-compat HTILE for images readable by a shader
If no texture fetches happen it's useless to enable TC-compat HTILE.
Because the driver currently doesn't support TC-compat HTILE for
storage images we don't have to check.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4497>
Samuel Pitoiset [Sun, 5 Apr 2020 07:42:50 +0000 (09:42 +0200)]
radv: only expose fp16 control features for chips with double rate fp16
This disables all fp16 shader control features on GFX8 because only
GFX9+ supports double rate packed math.
This improves consistency regarding other AMD Vulkan drivers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Samuel Pitoiset [Sun, 5 Apr 2020 07:33:43 +0000 (09:33 +0200)]
radv: only expose storageInputOutput16 for chips with double rate fp16
This feature allows to use both 16-bit integers and 16-bit floats
as inputs/outputs.
This disables storageInputOutput16 on GFX8 because only GFX9+ supports
double rate packed math.
This improves consistency regarding other AMD Vulkan drivers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Samuel Pitoiset [Sun, 5 Apr 2020 07:25:18 +0000 (09:25 +0200)]
radv: only expose shaderFloat16 for chips with double rate fp16
This disables shaderFloat16 on GFX8 because only GFX9+ supports
double rate packed math.
This improves consistency regarding other AMD Vulkan drivers and
it makes no sense to enable that feature without packed math.
This also reduces performance with Wolfeinstein Youngblood if
fp16 is forced enabled on GFX8, while it's similar on GFX9.
We might re-introduce that feature in the future with ACO support
if it ends up being faster and correct.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Samuel Pitoiset [Sun, 5 Apr 2020 07:23:16 +0000 (09:23 +0200)]
ac,radv: add ac_gpu_info::has_double_rate_fp16
Only GFX9+ support double rate packed math instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
Jonathan Marek [Wed, 8 Apr 2020 23:43:24 +0000 (19:43 -0400)]
turnip: use buffer size instead of bo size for VFD_FETCH_SIZE
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>
Jonathan Marek [Wed, 18 Mar 2020 02:28:38 +0000 (22:28 -0400)]
turnip: improve vertex input handling
Emit vertexBindingDescriptionCount bindings, instead of one per attribute.
Verified with dEQP-VK.pipeline.vertex_input.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4224>
James Zhu [Mon, 6 Apr 2020 20:34:01 +0000 (20:34 +0000)]
radeonsi: fix Segmentation fault during vaapi enc test
Fix Segmentation fault during vaapi enc test on Arcturus.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4472>
Bas Nieuwenhuizen [Wed, 8 Apr 2020 11:53:47 +0000 (13:53 +0200)]
radv: Use correct buffer count with variable descriptor set sizes.
Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.iublimitlow.frag.ialimitlow.0
CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2607
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4489>
Bas Nieuwenhuizen [Wed, 8 Apr 2020 10:51:34 +0000 (12:51 +0200)]
radv: Whitespace fixup.
Review comment that I did, but forgot to git add before amending ...
From https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4334
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4488>
Samuel Iglesias Gonsálvez [Wed, 8 Apr 2020 08:57:28 +0000 (10:57 +0200)]
radv: set sparseAddressSpaceSize to RADV_MAX_MEMORY_ALLOCATION_SIZE
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>
Samuel Iglesias Gonsálvez [Wed, 8 Apr 2020 08:55:37 +0000 (10:55 +0200)]
radv: check buffer size in vkCreateBuffer()
Fixes:
dEQP-VK.api.buffer.basic.size_max_uint64
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4487>
Bas Nieuwenhuizen [Tue, 7 Apr 2020 20:23:09 +0000 (22:23 +0200)]
radv: Consider maximum sample distances for entire grid.
The other pixels in the grid might have samples with a larger
distance than the (0,0) pixel.
Fixes dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.samples_8_packed
when CTS is compiled with clang.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4480>
Samuel Pitoiset [Fri, 13 Mar 2020 10:23:07 +0000 (11:23 +0100)]
radv: enable lowering of GS intrinsics for the LLVM backend
This replaces emit_vertex with:
if (vertex_count < max_vertices) {
emit_vertex_with_counter vertex_count ...
vertex_count += 1
}
Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.
pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)
No pipeline-db changes on other generations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
Samuel Pitoiset [Tue, 31 Mar 2020 13:26:00 +0000 (15:26 +0200)]
radv: remove radv_layout_has_htile() helper
The goal of this function was to return whether a depth-stencil image
has HTILE, in comparison to radv_layout_is_htile_compressed() which
is used to know whether a depth-stencil image has HTILE compressed.
These two functions are actually similar and they have never been
used for what they were supposed to. Remove radv_layout_has_htile()
in favour of radv_layout_is_htile_compressed() for now. If it's
needed in the future, I will re-introduce this concept properly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
Samuel Pitoiset [Tue, 31 Mar 2020 13:22:01 +0000 (15:22 +0200)]
radv: cleanup creating the decompress/resummarize pipelines
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
Samuel Pitoiset [Tue, 31 Mar 2020 13:14:37 +0000 (15:14 +0200)]
radv: rename extra graphics pipeline decompress/resummarize fields
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
Samuel Pitoiset [Tue, 31 Mar 2020 13:09:58 +0000 (15:09 +0200)]
radv: rename decompress/resummarize depth/stencil functions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4389>
Jonathan Marek [Wed, 8 Apr 2020 01:13:31 +0000 (21:13 -0400)]
turnip: fix compute shaders crashing after geometry shader change
Fixes: 1af71bee734da7d8 ("turnip: Set has_gs in ir3_shader_key")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4483>
Timothy Arceri [Tue, 7 Apr 2020 13:33:55 +0000 (23:33 +1000)]
nir: make opt_if_loop_terminator() less strict
nir_cf_{extract,reinsert}() can't stitch a block together
if the block we are extracting ends in a jump but other jumps
nested in further ifs should be fine to move.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4477>
Timothy Arceri [Tue, 7 Apr 2020 01:28:32 +0000 (11:28 +1000)]
radeonsi: don't lower constant arrays to uniforms in GLSL IR
This re-enables the change made in
2f5783bc2b82 which was
incorrectly disabled by
3e1dd99adca5.
For radeonsi, we will prefer the NIR pass as it'll generate better code
(some index calculation and a single load vs. a load, then index
calculation, then another load) and oftentimes NIR optimization can kick
in and make all the access indices constant.
Fixes: 3e1dd99adca5 ("radeonsi: Remove a bunch of default handling of pipe caps.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4474>
Dominik Behr [Tue, 22 Oct 2019 01:13:08 +0000 (18:13 -0700)]
meson: fix debug build on Android
debug_stack functions are implemented in another file for Android.
Also add backtrace library dependency.
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-bu: Kristian H. Kristensen <hoegsber@google.com>
Signed-off-by: Dominik Behr <dbehr@chromium.org>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2435>
Bas Nieuwenhuizen [Thu, 26 Mar 2020 14:19:37 +0000 (15:19 +0100)]
radv: Store 64-bit availability bools if requested.
Fixes dEQP-VK.query_pool.*.reset_before_copy.* on RAVEN.
CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2296
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4334>
Vinson Lee [Tue, 7 Apr 2020 01:21:54 +0000 (18:21 -0700)]
gallivm: Add missing header for powf.
Fix build error after llvm-11 commit
3a29393b4709 ("Remove
math.h/cmath include from DataTypes.h").
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c: In function ‘lp_build_linear_to_srgb’:
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: error: implicit declaration of function ‘powf’ [-Werror=implicit-function-declaration]
194 | exp2f_c * powf(coeff_f, 1.0f / exp_f));
| ^~~~
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: warning: incompatible implicit declaration of built-in function ‘powf’
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:78:1: note: include ‘<math.h>’ or provide a declaration of ‘powf’
77 | #include "lp_bld_format.h"
+++ |+#include <math.h>
78 |
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4473>
Kristian H. Kristensen [Tue, 7 Apr 2020 16:34:42 +0000 (09:34 -0700)]
turnip: Drop dep_llvm from dependencies
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>
Kristian H. Kristensen [Tue, 7 Apr 2020 16:04:00 +0000 (09:04 -0700)]
turnip: Make Android platform build
We still don't have a way to keep this from breaking, but I don't
think this ever built. Let's call it progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>
Kristian H. Kristensen [Tue, 7 Apr 2020 15:57:10 +0000 (08:57 -0700)]
turnip: Stub out VK_KHR_external_{fence,semaphore}_fd
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>
Kristian H. Kristensen [Tue, 7 Apr 2020 15:45:03 +0000 (08:45 -0700)]
turnip: Add missing VKAPI_ATTR annotations
Make sure the types match.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4478>
Rohan Garg [Tue, 7 Apr 2020 11:17:07 +0000 (13:17 +0200)]
tracie: Reformat code to fix indentation
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4435>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4435>
Rohan Garg [Wed, 1 Apr 2020 16:06:49 +0000 (18:06 +0200)]
tracie: Print results in a machine readable format
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4435>
Eric Anholt [Tue, 25 Feb 2020 22:03:36 +0000 (14:03 -0800)]
freedreno/a6xx: Set a level's pitch based on minified level0 pitch, not width0.
Found from piglit fbo-generatemipmaps failures, then tracked down with the
texturator test. The piece that really revealed things was finding that
1024x1 linear RGBA8 on the older blob drivers would have a pitch of 5120
instead of 4096, and the following levels minified that pitch.
Fixes ~124 piglit tests (~8.5% of piglit failures) on cheza.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
Eric Anholt [Wed, 26 Feb 2020 19:27:04 +0000 (11:27 -0800)]
freedreno: Add the outline of a test for a6xx texture layout.
Trying to work out texture layout by remembering what things looked like
in texturator is hard. Instead, let's use texture layouts from tracing
the blob as a source of truth to make sure that we pick the same layouts
they do (and don't break known-good ones). More testcases will be added
as I fix layout bugs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
Eric Anholt [Tue, 25 Feb 2020 22:42:33 +0000 (14:42 -0800)]
freedreno/a6xx: Drop the "alignment" layout temporary.
It's just 1 for !3d, which means that the align we're doing in that case
is pointless.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
Eric Anholt [Tue, 25 Feb 2020 22:40:37 +0000 (14:40 -0800)]
freedreno/a6xx: Remove the "aligned_height" temporary.
Now that we're not incrementally minifying height, we can just modify it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
Eric Anholt [Thu, 27 Feb 2020 20:35:32 +0000 (12:35 -0800)]
freedreno/a6xx: Sink the per-level size temps inside the loop.
u_minify(n, 1) is no cheaper than u_minify(n, level), and this makes the
logic a lot simpler to follow.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3987>
Michel Dänzer [Fri, 3 Apr 2020 09:46:12 +0000 (11:46 +0200)]
gitlab-ci: Run merge request pipelines automatically only for Marge Bot
MR pipelines not triggered by Marge Bot can still be triggered manually.
Motivation: The main & forked Mesa project CI pipelines combined are
currently generating over 1 TB of egress traffic per week. ~80% of this
is from pre-merge pipelines. Assuming this corresponds to 4 pre-merge
and one post-merge pipeline per MR on average, this change could
potentially eliminate up to ~60% of the overall traffic (by preventing
3 of the 4 pre-merge pipelines from running automatically).
(Of course, this could be subverted if all jobs of the other pipelines
were triggered manually anyway... In most cases, manually triggering
just a few jobs should suffice)
v2:
* $GITLAB_USER_NAME was the wrong variable, $GITLAB_USER_LOGIN should
do the trick.
Suggested-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>
Michel Dänzer [Fri, 3 Apr 2020 10:50:11 +0000 (12:50 +0200)]
gitlab-ci: Don't require triggering build/test jobs manually
Let them run automatically once all their dependencies have passed.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>
Michel Dänzer [Fri, 3 Apr 2020 13:59:31 +0000 (15:59 +0200)]
gitlab-ci/lava: Add needs: for container image to test jobs (again)
Without this, the test jobs could spuriously run after the container
job failed or was cancelled, even if the build job didn't run at all.
(I already did this in
94cfe590703018cf3d34a0c1f8667064919bf843, but it
got dropped accidentally in
22d976454f4e50142116f4544c0bbf11134ce991)
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>
Michel Dänzer [Fri, 3 Apr 2020 09:17:48 +0000 (11:17 +0200)]
gitlab-ci: Rename "paths" YAML anchor to "all_paths"
To avoid confusion with `paths:` elements.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432>
Caio Marcelo de Oliveira Filho [Tue, 27 Mar 2018 17:10:34 +0000 (10:10 -0700)]
anv/gen12: Lower VK_KHR_multiview using Primitive Replication
Identify if view_index is used only for position calculation, and use
Primitive Replication to implement Multiview in Gen12. This feature
allows storing per-view position information in a single execution of
the shader, treating position as an array.
The shader is transformed by adding a for-loop around it, that have an
iteration per active view (in the view_mask). Stores to the position
now store into the position array for the current index in the loop,
and load_view_index() will return the view index corresponding to the
current index in the loop.
The feature is controlled by setting the environment variable
ANV_PRIMITIVE_REPLICATION_MAX_VIEWS, which defaults to 2 if unset.
For pipelines with view counts larger than that, the regular
instancing will be used instead of Primitive Replication. To disable
it completely set the variable to 0.
v2: Don't assume position is set in vertex shader; remove only stores
for position; don't apply optimizations since other passes will
do; clone shader body without extract/reinsert; don't use
last_block (potentially stale). (Jason)
Fix view_index immediate to contain the view index, not its order.
Check for maximum number of views supported.
Add guard for gen12.
v3: Clone the entire shader function and change it before reinsert;
disable optimization when shader has memory writes. (Jason)
Use a single environment variable with _DEBUG on the name.
v4: Change to use new nir_deref_instr.
When removing stores, look for mode nir_var_shader_out instead
of the walking the list of outputs.
Ensure unused derefs are removed in the non-position part of the
shader.
Remove dead control flow when identifying if can use or not
primitive replication.
v5: Consider all the active shaders (including fragment) when deciding
that Primitive Replication can be used.
Change environment variable to ANV_PRIMITIVE_REPLICATION.
Squash the emission of 3DSTATE_PRIMITIVE_REPLICATION into this patch.
Disable Prim Rep in blorp_exec_3d.
v6: Use a loop around the shader, instead of manually unrolling, since
the regular unroll pass will kick in.
Document that we don't expect to see copy_deref or load_deref
involving the position variable.
Recover use_primitive_replication value when loading pipeline from
the cache.
Set VARYING_SLOT_LAYER to 0 in the shader. Earlier versions were
relying on ForceZeroRTAIndexEnable but that might not be
sufficient.
Disable Prim Rep in cmd_buffer_so_memcpy.
v7: Don't use Primitive Replication if position is not set, fallback
to instancing; change environment variable to be
ANV_PRIMITVE_REPLICATION_MAX_VIEWS and default it to 2 based on
experiments.
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Caio Marcelo de Oliveira Filho [Fri, 21 Sep 2018 23:07:38 +0000 (16:07 -0700)]
intel/fs: Allow multiple slots for position
Change brw_compute_vue_map() to also take the number of pos slots. If
more than one slot is used, the VARYING_SLOT_POS is treated as an
array.
When using Primitive Replication, instead of a single position, the
VUE must contain an array of positions. Padding might be
necessary (after clip distance) to ensure rest of attributes start
aligned.
v2: Add note about array in the commit message and assert that
pos_slots >= 1 to make clear 0 is invalid. (Jason)
Move padding to be after the clip distance.
v3: Apply the correct offset when gathering the sources from outputs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Caio Marcelo de Oliveira Filho [Sat, 12 Oct 2019 00:04:36 +0000 (17:04 -0700)]
intel/gen12: Add XML description for 3DSTATE_PRIMITIVE_REPLICATION
v2: Use groups for the 16-element arrays "Viewport Offset"
and "RTAI Offset". (Ken)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Caio Marcelo de Oliveira Filho [Tue, 11 Feb 2020 22:41:05 +0000 (14:41 -0800)]
nir: Add per_view attribute to nir_variable
If a nir_variable is tagged with per_view, it must be an array with
size corresponding to the number of views. For slot-tracking, it is
considered to take just the slot for a single element -- drivers will
take care of expanding this appropriately.
This will be used to implement the ability of having per-view position
in a vertex shader in Intel platforms.
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Simon Ser [Thu, 2 Apr 2020 20:53:41 +0000 (22:53 +0200)]
mesa: add support for NV_pixel_buffer_object
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4422>
Jonathan Marek [Tue, 3 Mar 2020 01:52:15 +0000 (20:52 -0500)]
turnip: implement timestamp query
Passes tests in:
dEQP-VK.pipeline.timestamp.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4027>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4027>
Brian Ho [Thu, 2 Apr 2020 18:01:54 +0000 (11:01 -0700)]
turnip: Enable geometryShader device feature
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Wed, 1 Apr 2020 20:36:31 +0000 (13:36 -0700)]
turnip: Enable geometry shaders for CP_DRAWs
Enable geometry shading on draw if the pipeline has a geometry
stage.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Mon, 6 Apr 2020 20:38:04 +0000 (13:38 -0700)]
turnip: Populate tu_pipeline.active_stages
This can be used to determine if the pipeline has a specific shader
stage (e.g. geometry shader).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Fri, 3 Apr 2020 14:59:47 +0000 (07:59 -0700)]
turnip: Update maxGeometryShaderInvocations to match blob
Geometry shaders support an invocations parameter up to a limit
defined by maxGeometryShaderInvocations. This was set to 127, but
executing with invocations > 32 causes a crash. As it turns out, the
blob only advertises a max of 32 invocations, so we set that in
turnip as well.
Fixes dEQP-VK.geometry.instanced.draw_*_instances_{127, 64}_geometry_invocations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Fri, 3 Apr 2020 14:57:25 +0000 (07:57 -0700)]
turnip: Selectively configure GRAS_LAYER_CNTL
One of the features of geometry shaders is the ability to render to
different layers by assigning to the gl_Layer (Layer in SPIR-V)
builtin.
While have already plumbed the layer regid to the geometry shader,
we also need to GRAS_LAYER_CNTL to actually use layered rendering.
In addition, gmem does not support layered rendering, so we need to
force sysmem.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Wed, 1 Apr 2020 20:26:17 +0000 (13:26 -0700)]
turnip: Set up REG_A6XX_SP_GS_CONFIG
Updates GS_CONFIG and HLSQ_GS_CNTL registers to match those emitted
by the blob and fd.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Wed, 1 Apr 2020 20:21:26 +0000 (13:21 -0700)]
turnip: Configure VFD_CONTROL with gsheader and primitiveid
This commit updates VFD_CONTROL to use the GS header and primitive
ID sysvals if a geometry shader stage is present in the pipeline.
Like in the case of VPC, the code here is adapted from fd6_program.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Wed, 1 Apr 2020 18:50:55 +0000 (11:50 -0700)]
turnip: Configure VPC for geometry shaders
This commit updates tu6_emit_vpc to selectively emit GS-specifc
configuration. Most of this is repurposed from fd6_program.c.
This also refactors `link_geometry_stages` to ir3_nir_lower_tess.c
so it can be shared between fd and tu.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Wed, 1 Apr 2020 18:09:48 +0000 (11:09 -0700)]
turnip: Emit geometry shader obj and related consts
Like with other shader types, we need to emit the geometry shader
object and the consts it uses. In addition, we need to emit
additional geometry-specific consts that link primitive/vertex stride
between the vs and gs. In conjunction with the gsheader, these are
used by the vs to determine where to stlw outputs and used by the gs
to determine where to ldlw those outputs from.
FD emits these consts in the draw call because in GL, you can mix
and match shaders in different programs. In Vulkan, however, we
compile and link the shaders at pipeline creation, so we can emit
these in the pipeline IB instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Brian Ho [Wed, 1 Apr 2020 16:01:52 +0000 (09:01 -0700)]
turnip: Set has_gs in ir3_shader_key
The ir3 compiler only lowers the VS and GS for geometry shading if
the corresponding has_gs key is set in the shader key. Without it,
GS-specific intrinsics like load_per_vertex_input won't get lowered
and the GS header will be initialized with invalid values.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Timur Kristóf [Wed, 1 Apr 2020 11:43:50 +0000 (13:43 +0200)]
radv: Print shader stage before disassembly.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 11:39:25 +0000 (13:39 +0200)]
aco: Print shader stage in aco_print_program.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Tue, 31 Mar 2020 08:41:01 +0000 (10:41 +0200)]
radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 13:55:41 +0000 (15:55 +0200)]
aco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.
It is recommended to do this as quickly as possible.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 14:02:13 +0000 (16:02 +0200)]
aco/ngg: Schedule position exports of NGG VS/TES.
Similarly to the HW VS stage, the HW NGG GS stage also
benefits from executing these exports as early as possible.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 10:29:30 +0000 (12:29 +0200)]
aco/ngg: Implement NGG VS and TES.
When NGG is used, vertex and tess eval shaders are executed on the
hardware NGG geometry stage. There is a series of steps they
must perform:
* Request GS space using GS_ALLOC_REQ
* Export the primitive
* Finally, export the normal VS outputs
In this commit, two modes are implemented:
* "late" which matches what the RADV LLVM backend currently does
* "early" which is an optimized version as seen in radeonsi
Vulkan doesn't allow the shader to write the edge flags, so we can
currently always use the "early" mode.
Exporting the primitive ID is also supported by having the GS threads
write that into LDS and reading them from LDS in the ES threads.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 10:18:50 +0000 (12:18 +0200)]
aco/ngg: Setup NGG VS and TES stages.
ngg_vertex_gs and ngg_tess_eval_gs work very similarly to
vertex_vs and tess_eval_vs, but they run on the HW NGG GS stage.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 10:14:59 +0000 (12:14 +0200)]
aco/ngg: Fix exports for NGG VS and TES.
The exports in NGG VS and TES work just like VS exports,
so the assembler needs to fix these too in the same manner.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 10:14:00 +0000 (12:14 +0200)]
aco/ngg: Initialize exec mask for NGG VS and TES.
They behave like merged ESGS shaders, so the exec mask needs
to be manually initialized for these NGG shaders too.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Mon, 27 Jan 2020 07:17:47 +0000 (08:17 +0100)]
aco/ngg: Add new stage for hw_ngg_gs.
This is needed to distinguish between NGG and legacy.
Otherwise, vertex_geometry_gs and ngg_vertex_geometry_gs
have the same value, which we want to avoid.
Also, there is no such thing as ngg_vertex_tess_control_hs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Wed, 1 Apr 2020 13:38:43 +0000 (15:38 +0200)]
aco: Treat s_setprio as a scheduling barrier.
We want to execute instructions after s_setprio in the given
priority, so we must prevent the scheduler from scheduling beyond
s_setprio, otherwise some instructions could be executed in a
different priority.
Rename hazard_fail_memtime to hazard_fail_unreorderable and include
s_setprio in the list of unreorderable opcodes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
Timur Kristóf [Tue, 31 Mar 2020 08:49:52 +0000 (10:49 +0200)]
aco: Extract merged_wave_info_to_mask to its own function.
Currently we only use this at the beginning of merged shader parts,
but we are going to need to use it with some NGG code as well.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>