intel: Fix glCopyTexSubImage on buffers whose width >= 32kbytes
When possible, glCopyTexSubImage calls are performed using the
hardware blitter. However, according to the Ivy Bridge PRM, Vol1
Part4, section 1.2.1.2 (Graphics Data Size Limitations):
    The BLT engine is capable of transferring very large quantities of
    graphics data. Any graphics data read from and written to the
    destination is permitted to represent a number of pixels that
    occupies up to 65,536 scan lines and up to 32,768 bytes per scan
    line at the destination. The maximum number of pixels that may be
    represented per scan line’s worth of graphics data depends on the
    color depth.
With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels. Formats with fewer bytes per
pixel have proportionally larger limits.
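
The width limit follows directly from dividing the per-scan-line byte
limit by the pixel size. A minimal sketch of that arithmetic, where the
macro and helper names are hypothetical and not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Per-scan-line byte limit quoted from the PRM text above. */
#define BLT_MAX_BYTES_PER_SCANLINE 32768u

/* Hypothetical helper: widest row, in pixels, that fits within the
 * limit for a format with the given number of bytes per pixel. */
static uint32_t blt_max_width_pixels(uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return BLT_MAX_BYTES_PER_SCANLINE / bytes_per_pixel;
}
```

Under this sketch, blt_max_width_pixels(16) gives 2048, matching the
RGBA32F figure, and a 4-byte-per-pixel format gives 8192.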
To make matters worse, if the pitch of the buffer is 32k or greater,
intel_copy_texsubimage's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).
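
To illustrate why a pitch of 32k does not survive the call: the largest
value a signed 16-bit integer can hold is 32767. The check below is a
hypothetical helper written for this message, not driver code. Note that
in C the result of converting an out-of-range value to a signed 16-bit
type is implementation-defined, so the sketch tests the range rather
than relying on any particular wrapped value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: can a pitch in bytes be passed through a signed
 * 16-bit parameter without changing its value? */
static bool pitch_fits_int16(int32_t pitch_bytes)
{
    return pitch_bytes >= INT16_MIN && pitch_bytes <= INT16_MAX;
}
```

A pitch of 32767 passes this check, while 32768 (for example, 2048
RGBA32F pixels) does not.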
We can conveniently sidestep both problems by not using the blitter
when the miptree's pitch is >= 32k.
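
The shape of that guard can be sketched as follows. This is an
illustration under assumed names (the enum, the constant, and the
function are all hypothetical), not the code in this patch; it only
shows the decision being made on the pitch in bytes.

```c
#include <stdint.h>

/* Hypothetical names for the two ways a copy could be carried out. */
enum copy_path {
    COPY_PATH_BLITTER,   /* hardware blitter */
    COPY_PATH_FALLBACK   /* any non-blitter path */
};

/* 32k bytes: the threshold used in the description above. */
#define EXAMPLE_BLT_PITCH_LIMIT 32768u

/* Hypothetical decision helper: refuse the blitter once the pitch
 * reaches the limit, so the caller takes another path instead. */
static enum copy_path choose_copy_path(uint32_t pitch_bytes)
{
    if (pitch_bytes >= EXAMPLE_BLT_PITCH_LIMIT)
        return COPY_PATH_FALLBACK;
    return COPY_PATH_BLITTER;
}
```

Rejecting the blitter before any pitch value is narrowed to 16 bits
means both the hardware limit and the parameter overflow are avoided by
a single comparison.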
Fixes gles3conform "framebuffer_blit_functionality_magnifying_blit"
tests when the buffer width is equal to 8192.
Note: this is very similar to the recent patch "intel: Fix ReadPixels
on buffers whose width >= 32kbytes" except that it applies to
glCopyTexSubImage instead of glReadPixels. In a future patch it would
be nice to refactor the code so that (a) overflow is avoided, and (b)
intelEmitCopyBlit is responsible for checking whether the blitter can
handle the width, so that all callers of intelEmitCopyBlit work
properly, rather than just these two.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>