ha! noticed that LD-ST-Update-Shifted-Postinc is far less than

author Luke Kenneth Casson Leighton <lkcl@lkcl.net>

Thu, 13 Apr 2023 11:32:09 +0000 (12:32 +0100)

committer Luke Kenneth Casson Leighton <lkcl@lkcl.net>

Thu, 13 Apr 2023 11:32:09 +0000 (12:32 +0100)
diff --git a/openpower/sv/rfc/ls012.mdwn b/openpower/sv/rfc/ls012.mdwn
index 74e63f16d2b526aa7d69a336741e730fa8d7b9b8..68dd582c643c6fa2aba40fc8b71100622169e58f 100644
--- a/openpower/sv/rfc/ls012.mdwn
+++ b/openpower/sv/rfc/ls012.mdwn
@@ -80,8 +80,8 @@ Audio/Visual, High-Performance Compute, GPU workloads and DSP.
  * 4 - INT<->FP mv [[ls006]]
  * 19 - GPR LD/ST-PostIncrement-Update (saves hugely in hot-loops) [[ls011]]
  * ~12 - FPR LD/ST-PostIncrement-Update (ditto) [[ls011]]
-* 19 - GPR LD/ST-Shifted-PostIncrement-Update (saves hugely in hot-loops) [[ls011]]
-* ~12 - FPR LD/ST-Shifted-PostIncrement-Update (ditto) [[ls011]]
+* 11 - GPR LD/ST-Shifted-PostIncrement-Update (saves hugely in hot-loops) [[ls011]]
+* ~4 - FPR LD/ST-Shifted-PostIncrement-Update (ditto) [[ls011]]
  * 26 - GPR LD/ST-Shifted (again saves hugely in hot-loops) [[ls004]]
  * 11 - FPR LD/ST-Shifted (ditto) [[ls004]]
  * 2 - Float-Load-Immediate (always saves one LD L1/2/3 D-Cache op) [[ls002]]
@@ -381,11 +381,10 @@ Power ISA up a level.
  Where things begin to get more than a little hairy is if both
  Post-Increment *and* Shifted are included.  If SVP64 keeps one
  single bit (/pi) dedicated in the `RM.Mode` field then this
-problem ges away, at the cost of reducing SVP64's effectiveness,
-but at least a stunning **24** Primary Opcodes (there are only
-32 in EXT2xx) would not disappear overnight.
-Mostly the Post-Increment-and-Shifted set are included to illustrate
-the options and have a formal record of the evluation, for Due Diligence.
+problem ges away, at the cost of reducing SVP64's effectiveness.
+However again, given that even the Shifted-Post-Increment
+instructions are all 9-bit XO it is not outside the realm of
+possibility to include them in EXT2xx.
  
  ## Shift-and-add (and LD/ST Indexed-Shift)
  
@@ -401,12 +400,6 @@ Being a 10-bit XO it would be somewhat punitive to place these in EXT2xx
  when their whole purpose and value is to reduce binary size in Address
  offset computation, thus they are best placed in EXT0xx.
  
-Also included because it is important to see the quantity of instructions:
-LD/ST-Indexed-Shifted.  Across Update variants, Byte-reverse variants,
-Arithmetic and FP, the total is a slightly-eye-watering **37**
-instructions, only ameliorated by the fact that they are all 9-bit XO.
-When it comes to Shifted-Postincrement the number of Primary Opcodes
-needed in EXT2xx comes to 24 which is most of them.
  The upside as far as adding them is concerned is that existing hardware
  will already have amalgamated pipelines with very few actual back-end
  (Micro-Coded) internal operations (likely just two: one load, one store).
@@ -416,9 +409,13 @@ is not hard.
  *(Readers unfamiliar with Micro-coding should look at the Microwatt VHDL
  source code)*
  
-When it comes to LD/ST-Shifted-Postincrement the sheer number particularly
-Primary Opcodes needed in EXT2xx makes for a compelling case to prioritise
-Shift-and-Add.
+Also included because it is important to see the quantity of instructions:
+LD/ST-Indexed-Shifted.  Across Update variants, Byte-reverse variants,
+Arithmetic and FP, the total is a slightly-eye-watering **37**
+instructions, only ameliorated by the fact that they are all 9-bit XO.
+Even when adding the Post-Increment-Shifted group it is still only
+52 9-bit XO instructions, which is not unreasonable to consider (in
+EXT2xx).
  
  \newpage{}
  
diff --git a/openpower/sv/rfc/ls012/optable.csv b/openpower/sv/rfc/ls012/optable.csv
index dfcf382f4d1dc48ecbc380407abf6b7f5f4e2449..c98a6266af9282261c01834203d4072d6ad86b09 100644
--- a/openpower/sv/rfc/ls012/optable.csv
+++ b/openpower/sv/rfc/ls012/optable.csv
@@ -29,34 +29,21 @@ stfsup,    ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
  stfdupx,   ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
  stfsupx,   ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
  # LD/ST-Shifted-Postincrement
-lbzusp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
-lbzuspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-lhzusp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
-lhzuspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-lhausp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
-lhauspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-lwzusp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
-lwzuspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-lwauspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-ldusp,     ls011, low, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
-lduspx,    ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-stbusp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
-stbuspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
-sthusp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
-sthuspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
-stwusp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
-stwuspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
-stdusp,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
-stduspx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+lbzuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lhzuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lhauspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lwzuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lwauspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lduspx,    ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+stbuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+sthuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+stwuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+stduspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
  # FP LD/ST-Shifted-Postincrement
-lfdups,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
-lfsups,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
-lfdupsx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-lsdupsx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
-stfdups,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
-stfsups,    ls011, low, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
-stfdupsx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
-stfsupsx,   ls011, med, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
+lfdupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+lsdupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
+stfdupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
+stfsupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
  # LD/ST-Index-Shifted (w/Update)
  lbzsx,    ls004, high, 9, yes, EXT0xx, no, ls004, 2R1W
  lbzusx,    ls004, high, 9, yes, EXT0xx, no, ls004, 2R2W
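
The instruction counts quoted in the ls012.mdwn hunks can be cross-checked against the rows this commit adds to optable.csv. The following is a minimal sketch, not part of the commit: the rows are copied from the `+` lines above, while the helper name `count_by_section` and the reading of 52 as 37 + 11 + 4 are assumptions made for illustration only.

```python
# Sketch only: tally the Shifted-Postincrement rows added by this commit.
# ADDED_ROWS is copied from the "+" lines (plus the two "#" section
# comments) of the optable.csv hunk; nothing is read from the repository.
ADDED_ROWS = """\
# LD/ST-Shifted-Postincrement
lbzuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lhzuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lhauspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lwzuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lwauspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lduspx,    ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
stbuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
sthuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
stwuspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
stduspx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
# FP LD/ST-Shifted-Postincrement
lfdupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lsdupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
stfdupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
stfsupsx,   ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
"""


def count_by_section(text):
    """Group mnemonics under the most recent '#' comment line.

    Hypothetical helper: returns {section title: [mnemonic, ...]}.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            current = line.lstrip("#").strip()
            sections[current] = []
            continue
        # The first CSV field is the instruction mnemonic.
        sections[current].append(line.split(",")[0].strip())
    return sections


sections = count_by_section(ADDED_ROWS)
counts = {title: len(names) for title, names in sections.items()}

# Figures quoted in the ls012.mdwn hunks of the same commit.
INDEXED_SHIFTED = 37   # "a slightly-eye-watering **37** instructions"
GPR_SHIFTED_PI = 11    # "11 - GPR LD/ST-Shifted-PostIncrement-Update"
FPR_SHIFTED_PI = 4     # "~4 - FPR LD/ST-Shifted-PostIncrement-Update"

# Assumed reading: 52 is the sum of the three figures above.
prose_total = INDEXED_SHIFTED + GPR_SHIFTED_PI + FPR_SHIFTED_PI

for title, n in counts.items():
    print(f"{title}: {n} rows")
print(f"total from prose figures: {prose_total}")
```

Run against the rows shown in this patch, the tally gives 10 GPR rows and 4 FP rows, whereas the Markdown text quotes 11 and ~4. The sum of the prose figures does come to the 52 stated in the new paragraph, so the one-row difference in the GPR group may be worth checking against the full optable.csv.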