rs6000: Init V4SF vector without converting SP to DP
Move V4SF to V4SI, initialize the vector as V4SI, and move the result
back to V4SF.  This generates a better instruction sequence on Power9:
lfs + xxpermdi + xvcvdpsp + vmrgew
=>
lwz + (sldi + or) + mtvsrdd
With the follow-up patch, this can be optimized further to:
lwz + rldimi + mtvsrdd
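For illustration, source of roughly the following shape is the kind of
input that reaches a V4SF vector init.  This is only a sketch: the
function name is hypothetical, it relies on the GCC generic vector
extension, and the actual testcase is not shown in this message.

```c
/* Hypothetical example; init_v4sf is an illustrative name, not taken
   from the patch.  */
typedef float v4sf __attribute__ ((vector_size (16)));

v4sf
init_v4sf (const float *p)
{
  /* Four scalar single-precision loads feed one V4SF constructor.  */
  return (v4sf) { p[0], p[1], p[2], p[3] };
}
```

The exact code generated for such a function depends on the target
options and optimization level.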
The point is to use lwz to avoid converting single-precision to
double-precision on load, and to pack the four 32-bit values directly
into one 128-bit register.
gcc/ChangeLog:
2020-07-13 Xionghu Luo <luoxhu@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_expand_vector_init):
Move V4SF to V4SI, init vector like V4SI and move back to V4SF.