X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=openpower%2Fsv%2Frfc%2Fls011.mdwn;h=67bb26ec0f2362ce0b07372de777a9462dd69cc8;hb=a175a4756f5b4e1350d2099c3f0aae4954b337ed;hp=5fa2f850eb970611dd96e277ef0f04c31773e904;hpb=f088bb60b86b027bd08dadbe8b8d9e45ada2ac1f;p=libreriscv.git

diff --git a/openpower/sv/rfc/ls011.mdwn b/openpower/sv/rfc/ls011.mdwn
index 5fa2f850e..67bb26ec0 100644
--- a/openpower/sv/rfc/ls011.mdwn
+++ b/openpower/sv/rfc/ls011.mdwn
@@ -1,5 +1,275 @@
-TODO
+# RFC ls011 LD/ST-Update-PostIncrement
 
-*
+* Funded by NLnet under the Privacy and Enhanced Trust Programme, EU
+  Horizon2020 Grant 825310, and NGI0 Entrust No 101069594
+*
+*
+*
+*
+
+**Severity**: Major
+
+**Status**: New
+
+**Date**: 21 Apr 2023.
+
+**Target**: v3.2B
+
+**Source**: v3.0B
+
+**Books and Section affected**:
+
+```
+    Chapter 2 Book I, new Fixed-Point Load / Store Sections 3.3.2 3.3.3
+    Chapter 4 Book I, new Floating-Point Load / Store Sections 4.6.2 4.6.3
+```
+
+**Summary**
+
+```
+    TODO
+```
+
+**Submitter**: Luke Leighton (Libre-SOC)
+
+**Requester**: Libre-SOC
+
+**Impact on processor**:
+
+```
+    Addition of new Load/Store Fixed and Floating Point instructions
+```
+
+**Impact on software**:
+
+```
+    Requires support for new instructions in assembler, debuggers, and related tools.
+    Reduces instructions in hot-loops
+```
+
+**Keywords**:
+
+```
+
+```
+
+**Motivation**
+
+Moving the update of RA to *after* the Memory operation saves on instruction count
+both outside and inside hot-loops. strncpy may be reduced to 11 Vector instructions,
+3 of which are the zeroing loop, 5 of which are the copy. Percentage-wise, LD/ST
+Update Post-Increment represents a 20% reduction in instruction count.
+
+**Notes and Observations**:
+
+These types of instructions are already present in x86 (sort-of).
+
+* x86 chose that store should be pre-indexed and load should be post-indexed
+* Power ISA chose everything to be pre-indexed
+* Motorola 68000 (decades old) has both pre- and post-indexed addressing
+
+
+
+
+
+**Changes**
+
+Add the following entries to:
+
+* New Load/Store Sections
+* Appendices
+
+[[!tag opf_rfc]]
+
+--------
+
+\newpage{}
+
+TODO (key stub notes below)
+
+
+
+The LD/ST-Immediate-Post-Increment instructions are all Primary
+Opcode: there are 13 of these. LD/ST-Indexed-Post-Increment
+are all effectively 9-bit XO and consequently may easily
+fit into one single Primary Opcode. EXT2xx is recommended.
+
+One alternative idea is that bit 31 could be allocated (retrospectively)
+to Post-Increment. Although it may be too late for Scalar Power ISA
+it **may** be possible to consider for SVP64Single and/or SVP64-Vector,
+but this risks creating a non-Orthogonal ISA.
+
+
+
+```
+# LD/ST-Postincrement
+lbzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
+lbzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+lhzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
+lhzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+lhaup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
+lhaupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+lwzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
+lwzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+lwaupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+ldup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
+ldupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+stbup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
+stbupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+sthup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
+sthupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+stwup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
+stwupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+stdup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
+stdupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+
+# FP LD/ST-Postincrement
+lfdup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
+lfsup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
+lfdupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+lfsupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
+stfdup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
+stfsup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
+stfdupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+stfsupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+
+# LD/ST-Shifted-Postincrement
+lbzupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lhzupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lhaupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lwzupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lwaupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+ldupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+stbupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+sthupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+stwupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+stdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+
+# FP LD/ST-Shifted-Postincrement
+lfdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lfsupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+stfdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+stfsupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+
+```
+
+# Example
+
+Here is an annotated example where the pseudo-code changes to
+just use `RA` as the address, otherwise remaining the same.
+No actual change to the Effective Address computation itself
+occurs, in any of the Post-Update instructions.
+
+**Load Byte and Zero with Post-Update**
+
+D-Form
+
+* lbzup RT,D(RA)
+
+Pseudo-code:
+
+```
+    EA <- (RA)                           # EA just RA
+    RT <- ([0] * (XLEN-8)) || MEM(EA, 1) # then load
+    RA <- (RA) + EXTS(D)                 # then update RA after
+```
+
+Special Registers Altered:
+
+```
+    None
+```
+
+whereas the existing pseudo-code for `lbzu` is:
+
+```
+    EA <- (RA) + EXTS(D)                 # EA includes D
+    RT <- ([0] * (XLEN-8)) || MEM(EA, 1) # load from RA+D
+    RA <- EA                             # and update RA
+```
+-----
+
+\newpage{}
+
+# Fixed-Point Load with Post-Update
+
+Add the following additional Section to Fixed-Point Load: Book I 3.3.2.1
+
+[[!inline pages="openpower/isa/pifixedload" raw=yes ]]
+
+-----
+
+\newpage{}
+
+# Fixed-Point Store Post-Update
+
+Add the following as a new section in Fixed-Point Store, Book I
+
+[[!inline pages="openpower/isa/pifixedstore" raw=yes ]]
+
+-----
+
+\newpage{}
+
+# Floating-Point Load Post-Update
+
+Add the following as a new section in Floating-Point Load, Book I 4.6.2
+
+[[!inline pages="openpower/isa/fpload" raw=yes ]]
+
+-----
+
+\newpage{}
+
+# Floating-Point Store Post-Update
+
+Add the following as a new section in Floating-Point Store, Book I 4.6.3
+
+[[!inline pages="openpower/isa/fpstore" raw=yes ]]
+
+-----
+
+\newpage{}
+
+# Fixed-Point Load Shifted Post-Update
+
+Add the following as a new section in Fixed-Point Load: Book I
+
+[[!inline pages="openpower/isa/pifixedloadshift" raw=yes ]]
+
+-----
+
+\newpage{}
+
+# Fixed-Point Store Shifted Post-Update
+
+Add the following as a new section in Fixed-Point Store: Book I
+
+[[!inline pages="openpower/isa/pifixedstoreshift" raw=yes ]]
+
+-----
+
+\newpage{}
+
+# Floating-Point Load Shifted Post-Update
+
+Add the following as a new section in Floating-Point Load: Book I
+
+[[!inline pages="openpower/isa/pifploadshift" raw=yes ]]
+
+-----
+
+\newpage{}
+
+# Floating-Point Store Shifted Post-Update
+
+Add the following as a new section in Floating-Point Store: Book I
+
+[[!inline pages="openpower/isa/pifpstoreshift" raw=yes ]]
+
+\newpage{}
+[[!inline pages="openpower/isa/fixedload" raw=yes ]]
+\newpage{}
+[[!inline pages="openpower/isa/fixedstore" raw=yes ]]
 
 [[!tag opf_rfc]]
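
The difference between the existing `lbzu` pseudo-code and the proposed `lbzup` pseudo-code in the Example section of the patch can be modelled in a few lines of Python. This is only a minimal sketch under stated assumptions, not part of the RFC: the register file is a plain dictionary, memory is a `bytes` object, `D` is assumed to be a 16-bit signed immediate as in existing D-Form loads, and invalid forms (such as `RA=0` or `RA=RT` for `lbzu`) are not modelled. The helper names `exts`, `read_pre` and `read_post` are hypothetical.

```python
def exts(value, bits=16):
    """Sign-extend a `bits`-wide immediate (models EXTS(D); width is an assumption)."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def lbzu(regs, mem, rt, ra, d):
    """Existing pre-update form: EA <- (RA) + EXTS(D); load; RA <- EA."""
    ea = regs[ra] + exts(d)
    regs[rt] = mem[ea]          # one byte, zero-extended
    regs[ra] = ea


def lbzup(regs, mem, rt, ra, d):
    """Proposed post-update form: EA <- (RA); load; RA <- (RA) + EXTS(D)."""
    ea = regs[ra]
    regs[rt] = mem[ea]          # one byte, zero-extended
    regs[ra] = ea + exts(d)     # update happens after the memory access


def read_pre(mem, base, count):
    """Read `count` bytes with lbzu: the pointer must be biased by -1 first."""
    regs = {3: base - 1, 4: 0}  # extra setup step outside the loop
    out = []
    for _ in range(count):
        lbzu(regs, mem, rt=4, ra=3, d=1)
        out.append(regs[4])
    return out, regs[3]


def read_post(mem, base, count):
    """Read `count` bytes with lbzup: the pointer starts at the first element."""
    regs = {3: base, 4: 0}      # no bias needed
    out = []
    for _ in range(count):
        lbzup(regs, mem, rt=4, ra=3, d=1)
        out.append(regs[4])
    return out, regs[3]
```

In this model both loops read the same bytes. The pre-update loop needs the pointer adjusted before entry and leaves `RA` pointing at the last element read, while the post-update loop starts at the first element and leaves `RA` pointing one past the last element. This is one way to picture the setup step that the Motivation section of the patch says is saved outside hot-loops.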