From: lkcl
Date: Wed, 19 Apr 2023 16:38:48 +0000 (+0100)
Subject: (no commit message)
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=1a3ace9e18f618e3e15d4430ded7580d5d05a47e;p=libreriscv.git

---

diff --git a/openpower/sv/rfc/ls013.mdwn b/openpower/sv/rfc/ls013.mdwn
index 6341e572e..2d479fa89 100644
--- a/openpower/sv/rfc/ls013.mdwn
+++ b/openpower/sv/rfc/ls013.mdwn
@@ -19,7 +19,7 @@
 **Books and Section affected**:
 
 ```
-    Book I Fixed-Point Instructions
+    Book I Fixed-Point and Floating-Point Instructions
     Appendix E Power ISA sorted by opcode
     Appendix F Power ISA sorted by version
     Appendix G Power ISA sorted by Compliancy Subset
@@ -39,7 +39,7 @@
 **Impact on processor**:
 
 ```
-    Addition of new GPR-based instructions
+    Addition of new GPR-based and FPR-based instructions
 ```
 
 **Impact on software**:
@@ -65,56 +65,12 @@ TODO
    SVP64 tree reduction needs a single instruction to work properly.
 2. if you implement any of the FP min/max modes, the rest are not much more
    hardware.
-3. TODO(lkcl): fill out: that using VSX may have different meaning (SVP64/VSX)
-   so it is *really* crucial to have SVP64/SFFS ops.
+3. SVP64/VSX may have different meaning from SVP64/SFFS,
+   so it is *really* crucial to have SVP64/SFFS ops even if "equivalent" to VSX.
 4. FP min/max are rather complex to implement in software, the most commonly
    used FP max function `fmax` from glibc compiled for SFFS is 32 (!)
    instructions.
 
-https://gcc.godbolt.org/z/6xba61To6
-
-```
-    fmax(double, double):
-        fcmpu 0,1,2
-        fmr 0,1
-        cror 30,1,2
-        beq 7,.L12
-        blt 0,.L13
-        stfd 1,-16(1)
-        lis 9,0x8
-        li 8,-1
-        sldi 9,9,32
-        rldicr 8,8,0,11
-        ori 2,2,0
-        ld 10,-16(1)
-        xor 10,10,9
-        sldi 10,10,1
-        cmpld 0,10,8
-        bgt 0,.L5
-        stfd 2,-16(1)
-        ori 2,2,0
-        ld 10,-16(1)
-        xor 9,10,9
-        sldi 9,9,1
-        cmpld 0,9,8
-        ble 0,.L6
-.L5:
-        fadd 1,0,2
-        blr
-.L13:
-        fmr 1,2
-        blr
-.L6:
-        fcmpu 0,2,2
-        fmr 1,2
-        bnulr 0
-.L12:
-        fmr 1,0
-        blr
-        .long 0
-        .byte 0,9,0,0,0,0,0,0
-```
-
 **Changes**
 
 Add the following entries to:
@@ -375,5 +331,52 @@ Add a new field to Book I 1.6.2 Word Instruction Fields:
 | X | I | # | 3.2B | min | Minimum |
 | X | I | # | 3.2B | max | Maximum |
 
+## fmax instruction count
+
+32 instructions are required in SFFS to emulate fmac.
+
+
+```
+    fmax(double, double):
+        fcmpu 0,1,2
+        fmr 0,1
+        cror 30,1,2
+        beq 7,.L12
+        blt 0,.L13
+        stfd 1,-16(1)
+        lis 9,0x8
+        li 8,-1
+        sldi 9,9,32
+        rldicr 8,8,0,11
+        ori 2,2,0
+        ld 10,-16(1)
+        xor 10,10,9
+        sldi 10,10,1
+        cmpld 0,10,8
+        bgt 0,.L5
+        stfd 2,-16(1)
+        ori 2,2,0
+        ld 10,-16(1)
+        xor 9,10,9
+        sldi 9,9,1
+        cmpld 0,9,8
+        ble 0,.L6
+.L5:
+        fadd 1,0,2
+        blr
+.L13:
+        fmr 1,2
+        blr
+.L6:
+        fcmpu 0,2,2
+        fmr 1,2
+        bnulr 0
+.L12:
+        fmr 1,0
+        blr
+        .long 0
+        .byte 0,9,0,0,0,0,0,0
+```
+
 
 [[!tag opf_rfc]]
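
Reviewer note: the control flow of the `fmax` listing moved by this patch can be sketched in C, which may help explain why the software version costs so many instructions. This is a minimal sketch read off the assembly, not the glibc source; the names `sw_fmax` and `is_signalling_nan` are hypothetical. The constants are the ones the listing builds: `lis 9,0x8` followed by `sldi 9,9,32` gives `0x0008000000000000` (the binary64 quiet bit), and `li 8,-1` followed by `rldicr 8,8,0,11` gives `0xFFF0000000000000`.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if x is a signalling NaN (IEEE 754 binary64).
 * Mirrors the xor / sldi / cmpld sequence in the listing. */
static int is_signalling_nan(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);           /* stfd + ld round trip via memory */
    bits ^= UINT64_C(0x0008000000000000);     /* flip the quiet bit */
    /* Shift out the sign; only a NaN whose quiet bit was clear exceeds this. */
    return (bits << 1) > UINT64_C(0xFFF0000000000000);
}

/* Hypothetical software fmax following the branch structure of the listing. */
double sw_fmax(double x, double y)
{
    if (isgreaterequal(x, y))                 /* fcmpu + cror + beq -> .L12 */
        return x;
    if (isless(x, y))                         /* blt -> .L13 */
        return y;
    /* Unordered: at least one operand is a NaN. */
    if (is_signalling_nan(x) || is_signalling_nan(y))
        return x + y;                         /* .L5: fadd yields a quiet NaN */
    return isnan(y) ? x : y;                  /* .L6: fcmpu 0,2,2 + bnulr */
}
```

Each operand's signalling-NaN test needs a store to the stack, a reload into a GPR, and three integer operations, which accounts for most of the instruction count; a single FPR-based min/max instruction avoids that round trip.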