glx/xlib: configurable strict/non-strict buffer size invalidate
Introduce a new configuration option XMESA_STRICT_INVALIDATE to switch
between swapbuffers-based and glViewport-based buffer invalidation.
Default strict invalidate to false, ie glViewport-based invalidation,
aka ST_MANAGER_BROKEN_INVALIDATE.
This means we will not call XGetGeometry after every swapbuffers,
which allows swapbuffers to remain asynchronous. For apps running at
100fps with synchronous swapping, a 10% boost is typical. For gears,
I see closer to 20% speedup.
Note that the work of copying data on swapbuffers doesn't disappear -
this change just allows the X server to execute the PutImage
asynchronously without us effectively blocked until its completion.
This applies even to llvmpipe's threaded rasterization as the
swapbuffers operation was a large part of the serial component of an
llvmpipe frame.
The downside of this is correctness - applications which don't call
glViewport on window resizes will get incorrect rendering, unless
XMESA_STRICT_INVALIDATE is set.
The ultimate solution would be to have per-frame but asynchronous
invalidation. Xcb almost looks as if it could provide this, but the
API doesn't seem to quite be there.