i965: Add support for accelerated CopyTexSubImage.
There were hacks in EmitCopyBlit before to adjust offsets so that y=0 after
the offsets had been adjusted for a negative pitch. It appears that those
hacks were due to an unclear and surprising aspect of the hardware: inverting
the pitch results in the blit into the specified rectangle being inverted,
without the user needing to adjust y and base offset.
Tested with piglit copytexsubimage test on 915GM and GM965. Should fix
serious performance issues with ETQW and other applications.