radv: fix updating bound fast ds clear values with different aspects
author Samuel Pitoiset <samuel.pitoiset@gmail.com>
Mon, 21 Oct 2019 20:17:43 +0000 (22:17 +0200)
committer Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tue, 22 Oct 2019 09:16:13 +0000 (11:16 +0200)
On GFX9, the driver is able to do an optimized fast depth/stencil
clear with only one aspect (i.e. clear the stencil part of a
depth/stencil image). When this happens, the driver should only
update the clear values of the given aspect.
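
For illustration only, the intended behavior can be modeled outside the
driver. The sketch below is not radv code: struct db_clear_regs,
float_bits() and update_bound_clear_regs() are hypothetical names, and
the two struct fields merely stand in for the DB_STENCIL_CLEAR and
DB_DEPTH_CLEAR context registers. It assumes the aspect mask is exactly
depth, stencil, or both.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the Vulkan aspect bits (same numeric values). */
enum {
    ASPECT_DEPTH   = 0x2, /* VK_IMAGE_ASPECT_DEPTH_BIT */
    ASPECT_STENCIL = 0x4, /* VK_IMAGE_ASPECT_STENCIL_BIT */
};

/* Toy model of the two consecutive context registers (not real hardware state). */
struct db_clear_regs {
    uint32_t stencil_clear; /* models DB_STENCIL_CLEAR */
    uint32_t depth_clear;   /* models DB_DEPTH_CLEAR, stored as raw float bits */
};

struct ds_clear_value {
    float depth;
    uint32_t stencil;
};

/* Reinterpret a float as its 32-bit pattern, similar in spirit to fui(). */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Write only the register(s) that belong to the cleared aspect(s). */
static void update_bound_clear_regs(struct db_clear_regs *regs,
                                    uint32_t aspects,
                                    struct ds_clear_value value)
{
    if (aspects == (ASPECT_DEPTH | ASPECT_STENCIL)) {
        regs->stencil_clear = value.stencil;
        regs->depth_clear = float_bits(value.depth);
    } else if (aspects == ASPECT_DEPTH) {
        regs->depth_clear = float_bits(value.depth);
    } else {
        assert(aspects == ASPECT_STENCIL);
        regs->stencil_clear = value.stencil;
    }
}
```

In this model, a stencil-only update after a full depth/stencil clear
leaves depth_clear untouched, which is the property the fix is meant
to restore.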

Note that it's currently only supported on GFX9 but I have some
local patches that extend this optimized path for other gens.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
src/amd/vulkan/radv_cmd_buffer.c

index 01a0787dcf5bcd7fe7b992c49334adc0569b96ef..6a9002eaa442e80d253437e34785e7baeda63555 100644
@@ -1553,9 +1553,19 @@ radv_update_bound_fast_clear_ds(struct radv_cmd_buffer *cmd_buffer,
        if (cmd_buffer->state.attachments[att_idx].iview->image != image)
                return;
 
-       radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 2);
-       radeon_emit(cs, ds_clear_value.stencil);
-       radeon_emit(cs, fui(ds_clear_value.depth));
+       if (aspects == (VK_IMAGE_ASPECT_DEPTH_BIT |
+                       VK_IMAGE_ASPECT_STENCIL_BIT)) {
+               radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 2);
+               radeon_emit(cs, ds_clear_value.stencil);
+               radeon_emit(cs, fui(ds_clear_value.depth));
+       } else if (aspects == VK_IMAGE_ASPECT_DEPTH_BIT) {
+               radeon_set_context_reg_seq(cs, R_02802C_DB_DEPTH_CLEAR, 1);
+               radeon_emit(cs, fui(ds_clear_value.depth));
+       } else {
+               assert(aspects == VK_IMAGE_ASPECT_STENCIL_BIT);
+               radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 1);
+               radeon_emit(cs, ds_clear_value.stencil);
+       }
 
        /* Update the ZRANGE_PRECISION value for the TC-compat bug. This is
         * only needed when clearing Z to 0.0.