In order to use the Observability Architecture effectively, we'll need
to take snapshots of the OA counters via MI_REPORT_PERF_COUNT at the
start and end of each batch.
Experimentation reveals that we need to flush before and after each
MI_REPORT_PERF_COUNT to get working values. For simplicitly, I chose to
use intel_batchbuffer_emit_mi_flush(), which unfortunately expands to
triple pipe controls on Sandybridge.
We may want to start computing per-generation reserved batch space to
avoid the insanity of Sandybridge's PIPE_CONTROL cost. That said, much
of this cost existed before I rewrote the query object support to use
hardware contexts, so it's at least not entirely new.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
* - Any state emitted by vtbl->finish_batch():
* - Gen4-5 record ending occlusion query values (4 * 4 = 16 bytes)
* - Disabling OA counters on Gen6+ (3 DWords = 12 bytes)
+ * - Ending MI_REPORT_PERF_COUNT on Gen5+, plus associated PIPE_CONTROLs:
+ * - Two sets of PIPE_CONTROLs, which become 3 PIPE_CONTROLs each on SNB,
+ * which are 4 DWords each ==> 2 * 3 * 4 * 4 = 96 bytes
+ * - 3 DWords for MI_REPORT_PERF_COUNT itself on Gen6+. ==> 12 bytes.
+ * On Ironlake, it's 6 DWords, but we have some slack due to the lack of
+ * Sandybridge PIPE_CONTROL madness.
*/
-#define BATCH_RESERVED 36
+#define BATCH_RESERVED 146
struct intel_batchbuffer;