i965/fs: Run SIMD and logical send lowering after the optimization loop.
There are two reasons why this is useful:
- It avoids introducing, early in optimization, the partial writes
that the SIMD lowering pass emits to zip and unzip register
regions, which can make subsequent optimization less effective.
- It substantially reduces the burden on the compiler when a large
fraction of the instructions in the program need to be split (e.g.
during SIMD32 builds). Individual halves of split instructions
will be optimized identically (if they can still be optimized at
all), so splitting up front can double the number of instructions
the optimizer has to deal with, which causes the compilation time
to explode in some cases due to the worse-than-linear runtime
behaviour of the back-end.
It seems helpful to re-run a few optimization passes in cases where
any of the lowering passes was able to make progress.
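The resulting ordering can be sketched roughly as follows. This is only
a minimal illustration, not the actual back-end code: the names Pass,
run_all and optimize, and the three pass lists, are hypothetical
stand-ins that model each pass as a callable returning whether it made
progress.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a compiler pass: returns true if it changed
// the program.
using Pass = std::function<bool()>;

// Run every pass in the list once and report whether any made progress.
static bool run_all(const std::vector<Pass> &passes)
{
   bool progress = false;
   for (const Pass &pass : passes)
      progress |= pass();
   return progress;
}

// Assumed ordering: optimize to a fixed point first, lower afterwards,
// and re-run a few cleanup passes only if lowering changed anything.
static void optimize(const std::vector<Pass> &opt_loop,
                     const std::vector<Pass> &lowering,
                     const std::vector<Pass> &cleanup)
{
   // Main optimization loop: repeat until no pass makes progress.
   while (run_all(opt_loop)) {
   }

   // Lowering runs once, after the loop, so the loop never sees the
   // split instructions or the partial writes they introduce.
   if (run_all(lowering))
      run_all(cleanup);
}
```

In this sketch the cleanup list is not iterated to a fixed point; it is
run once, and only when lowering reported progress.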
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>