In addition, the vast majority of GPR <-> FPR data-transfers are as part
of a FP <-> Integer conversion sequence, therefore reducing the number
-of instructions required to the minimum seems necessary.
+of instructions required is a priority.
Therefore, we are proposing adding:
-* FPR load-immediate equivalent partially to `BF16`
+* FPR load-immediate instructions, one equivalent to `BF16`, the
+ other increasing accuracy to `FP32`
* FPR <-> GPR data-transfer instructions that just copy bits without conversion
* FPR <-> GPR combined data-transfer/conversion instructions that do
Integer <-> FP conversions
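
The distinction between the three kinds of instruction listed above can be sketched in plain C. This is only an illustration of the intended semantics, not the proposed encoding: all function names are hypothetical, and the assumption that the second load-immediate supplies the low 16 bits of an `FP32` pattern is ours, not stated in the text.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a 16-bit immediate treated as BF16, i.e. the
   upper 16 bits of an IEEE 754 binary32 pattern, low 16 bits zero. */
float bf16_imm_to_f32(uint16_t imm)
{
    uint32_t bits = (uint32_t)imm << 16;
    float f;
    memcpy(&f, &bits, sizeof f);   /* reinterpret the bits, no conversion */
    return f;
}

/* Hypothetical model (assumption): a second immediate fills in the
   low 16 bits, giving a full FP32 constant from two 16-bit halves. */
float f32_from_halves(uint16_t hi, uint16_t lo)
{
    uint32_t bits = ((uint32_t)hi << 16) | lo;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* FPR to GPR data transfer that just copies bits. */
uint64_t fpr_to_gpr_bits(double f)
{
    uint64_t g;
    memcpy(&g, &f, sizeof g);
    return g;
}

/* FPR to GPR combined transfer and conversion (C semantics:
   truncate toward zero; out-of-range input is undefined in C). */
int64_t fpr_to_gpr_convert(double f)
{
    return (int64_t)f;
}
```

For the input `1.0`, the bit-copy yields `0x3FF0000000000000` while the conversion yields `1`, which is why the two cases need separate instructions.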
* **JavaScript** - modulo wrapping with Inf/NaN converted to 0
The assembly listings in the [[int_fp_mv/appendix]] show how costly
-some of these language-specific conversions are: Javascript is 32
-scalar instructions, including seven branch instructions.
+some of these language-specific conversions are: JavaScript, the
+worst case, is 32 scalar instructions including seven branch instructions.
# Proposed New Scalar Instructions