radeonsi/compute: Clamp COMPUTE_TMPRING_SIZE.WAVES to num_cu * 32