+an analysis counting instructions and D-Cache loads shows
+that, whilst the initial idea for `pfmvis` was to fill in the
+remaining mantissa and high exponent bits to complete a full FP64,
+the cost of doing so is:
+
+* 1x32 flis
+* 1x32 fishmv
+* 1x64 pfishmv
+
+which is QTY 8 bytes, actually *more* than just `fld`,
+which is only QTY 6 bytes. the only technical reason for using
+this sequence is therefore to avoid D-Cache entirely, just like
+the 5-instruction sequence that writes a 64-bit GPR only from
+immediates (li, oris, rldicl, li, oris).
+
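the immediates-only GPR idea mentioned above can be sketched in Python as a minimal model. note the assumptions: the sketch uses the commonly seen `lis, ori, sldi, oris, ori` form rather than the exact mnemonics listed in the text, and the names `build_gpr_from_immediates` and `bits_to_fp64` are hypothetical helpers for illustration only. it shows a 64-bit value being assembled from four 16-bit immediates with no memory (D-Cache) access.

```python
import struct

MASK64 = (1 << 64) - 1


def build_gpr_from_immediates(value):
    """Model a five-step, immediates-only build of a 64-bit register value.

    Assumed sequence (not taken from the text): lis, ori, sldi 32, oris, ori.
    """
    # Split the 64-bit target into four 16-bit immediates, high to low.
    halves = [(value >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]

    r = halves[0] << 16       # lis: immediate placed in bits 16..31
                              # (sign-extension is ignored here, since those
                              # upper bits are shifted out by the sldi below)
    r |= halves[1]            # ori: OR in the next 16 bits
    r = (r << 32) & MASK64    # sldi 32: move the 32 bits built so far to the top
    r |= halves[2] << 16      # oris: OR immediate into bits 16..31
    r |= halves[3]            # ori: OR in the lowest 16 bits
    return r


def bits_to_fp64(bits):
    """Reinterpret a 64-bit integer bit pattern as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

for example, `build_gpr_from_immediates(0x400921FB54442D18)` reproduces the FP64 bit pattern of pi using only immediates, at the price of five instructions instead of one load.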