i965: Add a pass for the FS to reduce vector expressions down to scalar.
This is a step towards implementing a GLSL IR backend for the 965
fragment shader. Because it has downsides with the current codegen,
it is hidden under the environment variable INTEL_NEW_FS.
This results in an increase in instruction count at the moment (1444
-> 1752 for glsl-fs-raytrace, 345 -> 359 on my demo), because dot
products are turned into a series of multiplies and adds instead of a
custom expansion of MULs and MACs, and by not splitting the variable
types up we don't get tree grafting and thus there are extra moves of
temporary storage. However, register count drops for the non-GLSL
path (64 -> 56 on my demo shader) because the register allocator sees
all the sub-operations.