ac/llvm: load 1 byte at a time if unaligned on gfx10
authorPierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tue, 16 Jun 2020 12:46:08 +0000 (14:46 +0200)
committerPierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fri, 19 Jun 2020 07:20:16 +0000 (09:20 +0200)
If buffer or stride is unaligned we use the same trick as on gfx6:
load 1 byte at a time and recompose the output if needed.
This change fixes lots of deqp/glcts tests:
  - dEQP-GLES2.functional.draw.random.1, 10, ...
  - dEQP-GLES2.functional.vertex_arrays.multiple_attributes.stride.3_float2_0_float2_0_float2_17, ...
  - dEQP-GLES2.functional.vertex_arrays.single_attribute.first.byte_first24_offset1_stride2_quads256, ...
  - dEQP-GLES2.functional.vertex_arrays.single_attribute.strides.buffer_0_17_byte2_vec4_dynamic_draw_quads_1, ...
  - dEQP-GLES31.functional.draw_indirect.random.14, ...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5502>

src/amd/llvm/ac_llvm_build.c

index 69b1deaa8b275597ded358c78b076e43c29bb343..77681834ffae7c31889110bc5b80118cf90d4e83 100644 (file)
@@ -1651,7 +1651,7 @@ ac_build_opencoded_load_format(struct ac_llvm_context *ctx,
        }
 
        int log_recombine = 0;
-       if (ctx->chip_class == GFX6 && !known_aligned) {
+       if ((ctx->chip_class == GFX6 || ctx->chip_class == GFX10) && !known_aligned) {
                /* Avoid alignment restrictions by loading one byte at a time. */
                load_num_channels <<= load_log_size;
                log_recombine = load_log_size;