radeonsi: use ready fences on all shaders, not just optimized ones
There's a race condition between si_shader_select_with_key and
si_bind_XX_shader:
Thread 1 Thread 2
-------- --------
si_shader_select_with_key
begin compiling the first
variant
(guarded by sel->mutex)
si_bind_XX_shader
select first_variant by default
as state->current
si_shader_select_with_key
match state->current and early-out
Since thread 2 never takes sel->mutex, it may go on rendering without a
PM4 for that shader, for example.
The solution taken by this patch is to broaden the scope of
shader->optimized_ready to a fence shader->ready that applies to
all shaders. This does not hurt the fast path (if anything it makes
it faster, because we don't explicitly check is_optimized).
It will also allow reducing the scope of sel->mutex locks, but this is
deferred to a later commit for better bisectability.
Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render
Reviewed-by: Marek Olšák <marek.olsak@amd.com>