iris: Implement buffer-local memory barrier based on cache coherency matrix.
This takes advantage of the previously introduced cache tracking
infrastructure in order to define a multi-purpose barrier operation
that allows the caller to order memory operations with respect to
previous operations performed on the same buffer from any other cache
domain.
v2: Assorted CPU overhead micro-optimizations (Francisco).
v3: Use C99 designated initializers (Ken).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>