i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data
authorJordan Justen <jordan.l.justen@intel.com>
Wed, 23 Dec 2015 07:42:52 +0000 (23:42 -0800)
committerJordan Justen <jordan.l.justen@intel.com>
Wed, 23 Dec 2015 07:54:02 +0000 (23:54 -0800)
The BDW PRM Vol2a: Command Reference: Instructions, section MEDIA_CURBE_LOAD,
says that 'CURBE Total Data Length' and 'CURBE Data Start Address' are
64-byte aligned. This is different from previous gens, that were 32-byte
aligned.

v2 (Jordan):
  - CURBE Data Start Address is also 64-byte aligned.
    - The call to brw_state_batch should also use 64-byte alignment.
      - Improve PRM reference.

v3:
 * New patch from Jordan. Always align base and size to 64 bytes.

Fixes the following SSBO CTS tests on BDW:
ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs

And many other CS CTS tests as reported by Marta Lofstedt.

(Commit message is from Iago, but in v3, code is from Jordan.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
src/mesa/drivers/dri/i965/gen7_cs_state.c

index 1fde69cf78efff34ef8a1d7086235c2876efef29..a025bb9dd664045f260ec87f48f7dc1d78854f87 100644 (file)
@@ -141,7 +141,7 @@ brw_upload_cs_state(struct brw_context *brw)
       BEGIN_BATCH(4);
       OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2));
       OUT_BATCH(0);
-      OUT_BATCH(reg_aligned_constant_size * threads);
+      OUT_BATCH(ALIGN(reg_aligned_constant_size * threads, 64));
       OUT_BATCH(stage_state->push_const_offset);
       ADVANCE_BATCH();
    }
@@ -249,8 +249,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
 
       param = (gl_constant_value*)
          brw_state_batch(brw, type,
-                         reg_aligned_constant_size * threads,
-                         32, &stage_state->push_const_offset);
+                         ALIGN(reg_aligned_constant_size * threads, 64),
+                         64, &stage_state->push_const_offset);
       assert(param);
 
       STATIC_ASSERT(sizeof(gl_constant_value) == sizeof(float));