The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT_BIT, keeps
appending data, and calls glFlushMappedBufferRange() after each
append. We were invalidating the VF cache each time it flushed a new
range, which resulted in a ton of VF cache invalidates.

If the contents of the destination in the target range are undefined
(never even possibly written), this patch makes us assume that the
range is likely not in the cache, so cache invalidations are not
required. If the destination range is defined, we continue flushing
the cache, as we may need to expunge stale data.

This eliminates 88% of the VF cache invalidates on Manhattan 3.0.

Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU
frequency locked to 700 MHz by 0.376724% +/- 0.0989183% (n=10).
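The decision described above can be sketched in isolation as follows.
This is a simplified model, not the driver code: struct valid_range,
the valid_range_* helpers, flush_bits_for_region() and the FLUSH_*
values are hypothetical stand-ins for the valid-range tracking and
PIPE_CONTROL bits that the patch touches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the resource's valid-range tracking: one
 * half-open interval [start, end) of bytes that may have been written. */
struct valid_range {
   unsigned start;
   unsigned end;
};

/* Hypothetical flag values; the real PIPE_CONTROL bits differ. */
#define FLUSH_RENDER_TARGET (1u << 0)
#define FLUSH_HISTORY       (1u << 1)

/* An empty range has start > end, so it intersects nothing. */
static void
valid_range_init_empty(struct valid_range *r)
{
   r->start = ~0u;
   r->end = 0;
}

/* True if [start, end) overlaps the bytes that may have been written. */
static bool
valid_range_intersects(const struct valid_range *r,
                       unsigned start, unsigned end)
{
   unsigned lo = start > r->start ? start : r->start;
   unsigned hi = end < r->end ? end : r->end;
   return lo < hi;
}

/* Grow the tracked interval so that it covers [start, end). */
static void
valid_range_add(struct valid_range *r, unsigned start, unsigned end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Only request history (cache) flushes when the destination could
 * already hold data that a cache might have read. */
static uint32_t
flush_bits_for_region(bool staging, bool dest_had_defined_contents)
{
   uint32_t bits = 0;

   if (staging)
      bits |= FLUSH_RENDER_TARGET;
   if (dest_had_defined_contents)
      bits |= FLUSH_HISTORY;

   return bits;
}
```

In this model the intersection test has to run at map time, before the
mapped range is added to the valid range. Otherwise a write mapping
would always appear to target defined contents and the history flush
would never be skipped.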
xfer->box = *box;
*ptransfer = xfer;
+ map->dest_had_defined_contents =
+ util_ranges_intersect(&res->valid_buffer_range, box->x,
+ box->x + box->width);
+
if (usage & PIPE_TRANSFER_WRITE)
util_range_add(&res->valid_buffer_range, box->x, box->x + box->width);
uint32_t history_flush = 0;
if (res->base.target == PIPE_BUFFER) {
- history_flush |= iris_flush_bits_for_history(res) |
- (map->staging ? PIPE_CONTROL_RENDER_TARGET_FLUSH : 0);
+ if (map->staging)
+ history_flush |= PIPE_CONTROL_RENDER_TARGET_FLUSH;
+
+ if (map->dest_had_defined_contents)
+ history_flush |= iris_flush_bits_for_history(res);
+
+ util_range_add(&res->valid_buffer_range, box->x, box->x + box->width);
}
if (history_flush & ~PIPE_CONTROL_CS_STALL) {
struct blorp_context *blorp;
struct iris_batch *batch;
+ bool dest_had_defined_contents;
+
void (*unmap)(struct iris_transfer *);
};