From b147819006d863a57ffa481e7204a8a10bb7bd5d Mon Sep 17 00:00:00 2001
From: lkcl
Date: Fri, 19 May 2023 20:17:17 +0100
Subject: [PATCH]

---
 openpower/sv/remap/appendix.mdwn | 80 ++++++++++++++++----------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/openpower/sv/remap/appendix.mdwn b/openpower/sv/remap/appendix.mdwn
index c6694be6e..39a9ec8c9 100644
--- a/openpower/sv/remap/appendix.mdwn
+++ b/openpower/sv/remap/appendix.mdwn
@@ -1,43 +1,3 @@
-## Example Usage
-
-* `svshape` to set the type of reordering to be applied to an
- otherwise usual `0..VL-1` hardware for-loop
-* `svremap` to set which registers a given reordering is to apply to
- (RA, RT etc)
-* `sv.{instruction}` where any Vectorised register marked by `svremap`
- will have its ordering REMAPPED according to the schedule set
- by `svshape`.
-
-The following illustrative example multiplies a 3x4 and a 5x3
-matrix to create
-a 5x4 result:
-
-```
- svshape 5,4,3,0,0 # Outer Product 5x4 by 4x3
- svremap 15,1,2,3,0,0,0,0 # link Schedule to registers
- sv.fmadds *0,*32,*64,*0 # 60 FMACs get executed here
-```
-
-* svshape sets up the four SVSHAPE SPRS for a Matrix Schedule
-* svremap activates four out of five registers RA RB RC RT RS (15)
-* svremap requests:
- - RA to use SVSHAPE1
- - RB to use SVSHAPE2
- - RC to use SVSHAPE3
- - RT to use SVSHAPE0
- - RS Remapping to not be activated
-* sv.fmadds has vectors RT=0, RA=32, RB=64, RC=0
-* With REMAP being active each register's element index is
- *independently* transformed using the specified SHAPEs.
-
-Thus the Vector Loop is arranged such that the use of
-the multiply-and-accumulate instruction executes precisely the required
-Schedule to perform an in-place in-registers Outer Product
-Matrix Multiply with no
-need to perform additional Transpose or register copy instructions.
-The example above may be executed as a unit test and demo,
-[here](https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_svp64_matrix.py;h=c15479db9a36055166b6b023c7495f9ca3637333;hb=a17a252e474d5d5bf34026c25a19682e3f2015c3#l94)
-
 ## REMAP Matrix pseudocode
 
 The algorithm below shows how REMAP works more clearly, and may be
@@ -535,6 +495,46 @@ used even there.
 SVSTATE[46-bit] <- 1
 ```
 
+## Example Matrix Usage
+
+* `svshape` to set the type of reordering to be applied to an
+ otherwise usual `0..VL-1` hardware for-loop
+* `svremap` to set which registers a given reordering is to apply to
+ (RA, RT etc)
+* `sv.{instruction}` where any Vectorised register marked by `svremap`
+ will have its ordering REMAPPED according to the schedule set
+ by `svshape`.
+
+The following illustrative example multiplies a 3x4 and a 5x3
+matrix to create
+a 5x4 result:
+
+```
+ svshape 5,4,3,0,0 # Outer Product 5x4 by 4x3
+ svremap 15,1,2,3,0,0,0,0 # link Schedule to registers
+ sv.fmadds *0,*32,*64,*0 # 60 FMACs get executed here
+```
+
+* svshape sets up the four SVSHAPE SPRS for a Matrix Schedule
+* svremap activates four out of five registers RA RB RC RT RS (15)
+* svremap requests:
+ - RA to use SVSHAPE1
+ - RB to use SVSHAPE2
+ - RC to use SVSHAPE3
+ - RT to use SVSHAPE0
+ - RS Remapping to not be activated
+* sv.fmadds has vectors RT=0, RA=32, RB=64, RC=0
+* With REMAP being active each register's element index is
+ *independently* transformed using the specified SHAPEs.
+
+Thus the Vector Loop is arranged such that the use of
+the multiply-and-accumulate instruction executes precisely the required
+Schedule to perform an in-place in-registers Outer Product
+Matrix Multiply with no
+need to perform additional Transpose or register copy instructions.
+The example above may be executed as a unit test and demo,
+[here](https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_svp64_matrix.py;h=c15479db9a36055166b6b023c7495f9ca3637333;hb=a17a252e474d5d5bf34026c25a19682e3f2015c3#l94)
+
 [[!tag standards]]
 
 ---------
-- 
2.30.2