i965: Optimize batchbuffer macros.
authorMatt Turner <mattst88@gmail.com>
Thu, 9 Jul 2015 02:00:48 +0000 (19:00 -0700)
committerMatt Turner <mattst88@gmail.com>
Wed, 15 Jul 2015 20:09:22 +0000 (13:09 -0700)
commitf11c6f09cf36909ff399353b20195a31cf0f1907
tree7cb18d467c3646f37add23a0e63cb8940bb14a57
parent131573df7aea0b10e97d9d5db0d26d89f8dfef54
i965: Optimize batchbuffer macros.

Previously OUT_BATCH was just a macro around an inline function which
does

   brw->batch.map[brw->batch.used++] = dword;

When making consecutive calls to intel_batchbuffer_emit_dword() the
compiler isn't able to recognize that we're writing consecutive memory
locations or that it doesn't need to write batch.used back to memory
each time.

We can avoid both of these problems by making a local pointer to the
next location in the batch in BEGIN_BATCH().

Cuts 18k from the .text size.

   text     data      bss      dec      hex  filename
4946956   195152    26192  5168300   4edcac  i965_dri.so before
4928956   195152    26192  5150300   4e965c  i965_dri.so after

This series (including commit c0433948) improves performance of Synmark
OglBatch7 by 8.01389% +/- 0.63922% (n=83) on Ivybridge.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
src/mesa/drivers/dri/i965/brw_context.h
src/mesa/drivers/dri/i965/brw_draw_upload.c
src/mesa/drivers/dri/i965/brw_urb.c
src/mesa/drivers/dri/i965/intel_batchbuffer.c
src/mesa/drivers/dri/i965/intel_batchbuffer.h
src/mesa/drivers/dri/i965/intel_blit.c