See Ivy Bridge PRM, Volume 2, Part 2, 1.8.4 INTERFACE_DESCRIPTOR_DATA:
DWORD 5, bits 20:16: "This field indicates how much shared local
memory the thread group requires. The amount is specified in 4k
blocks, but only powers of 2 are allowed: 0, 4k, 8k, 16k, 32k and 64k
per half-slice."
For Haswell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA:
DWORD 5, bits 20:16: With text identical to the Ivy Bridge PRM.
For Broadwell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA:
DWORD 6, bits 20:16: With text identical to the Ivy Bridge PRM.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
ctx->Const.MaxComputeWorkGroupSize[1] = max_invocations;
ctx->Const.MaxComputeWorkGroupSize[2] = max_invocations;
ctx->Const.MaxComputeWorkGroupInvocations = max_invocations;
- ctx->Const.MaxComputeSharedMemorySize = 32768;
+ ctx->Const.MaxComputeSharedMemorySize = 64 * 1024;
}
/**