For fragment shaders, we can always use a SIMD8 program. Therefore, if
we detect spilling with a SIMD16 program, then it is better to skip
generating a SIMD16 program to only rely on a SIMD8 program.
Unfortunately, this doesn't work for compute shaders. For a compute
shader, we may be required to use SIMD16 if the local workgroup size
is bigger than a certain size. For example, on gen7, if the local
workgroup size is larger than 512, then a SIMD16 program is required.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
* SIMD8. There's probably actually some intermediate point where
* SIMD16 with a couple of spills is still better.
*/
- if (dispatch_width == 16) {
+ if (dispatch_width == 16 && min_dispatch_width <= 8) {
fail("Failure to register allocate. Reduce number of "
"live scalar values to avoid this.");
} else {
bool spilled_any_registers;
const unsigned dispatch_width; /**< 8 or 16 */
+ unsigned min_dispatch_width;
int shader_time_index;
unreachable("unhandled shader stage");
}
+ if (stage == MESA_SHADER_COMPUTE) {
+ const brw_cs_prog_data *cs_prog_data =
+ (const brw_cs_prog_data *) prog_data;
+ unsigned size = cs_prog_data->local_size[0] *
+ cs_prog_data->local_size[1] *
+ cs_prog_data->local_size[2];
+ size = DIV_ROUND_UP(size, devinfo->max_cs_threads);
+ min_dispatch_width = size > 16 ? 32 : (size > 8 ? 16 : 8);
+ } else {
+ min_dispatch_width = 8;
+ }
+
this->prog_data = this->stage_prog_data;
this->failed = false;