(no commit message)

author lkcl <lkcl@web>

Thu, 24 Dec 2020 21:23:13 +0000 (21:23 +0000)

committer IkiWiki <ikiwiki.info>

Thu, 24 Dec 2020 21:23:13 +0000 (21:23 +0000)
diff --git a/openpower/sv/overview.mdwn b/openpower/sv/overview.mdwn
index 816800c757cbf0de831d0b20bc4db362faccf0d6..2a30d6323f0430791a1ead25712a270c0b3e4cc1 100644
--- a/openpower/sv/overview.mdwn
+++ b/openpower/sv/overview.mdwn
@@ -1,5 +1,7 @@
  # SV Overview
  
+[[! toc]]
+
  This document provides a crash-course overview as to why SV exists, and how it works.
  
  SIMD, the primary method for easy parallelism of the past 30 years in Computer Architectures, is [known to be harmful](https://www.sigarch.org/simd-instructions-considered-harmful/). SIMD provides
@@ -209,7 +211,7 @@ inner part.  Predication is still taken from the VL index, however it is applied
          if (rs1.isvec)  { irs1 += 1; }
          if (rs2.isvec)  { irs2 += 1; }
  
-# Swizzle
+# Swizzle <a name="subvl"></a>
  
  Swizzle is particularly important for 3D work.  It allows in-place reordering of XYZW, ARGB etc. and access of sub-portions of the same in arbitrary order *without* requiring timeconsuming scalar mv instructions (scalar due to the convoluted offsets).  With somewhere around 10% of operations in 3D Shaders involving swizzle this is a huge saving and reduces pressure on register files.
  
@@ -225,4 +227,6 @@ In SV given the percentage of operations that also involve initislisation to 0.0
      elif remap == 5:
            ireg[rd+s] <= 1.0
  
-Note that a value of 6 (and 7) will leave the target subvector element untouched. This is equivalent to a predicate mask which is built-in, in immediate form, into the [[sv/mv.swizzle]] operation.  mv.swizzle is rare in that it is one of the few instructions needed to be added that are never going to be part of a Scalar ISA.  Even in High Performance Compute workloads it is unusual: it is only because SV is targetted at 3D and Video that it is being considered.    
+Note that a value of 6 (and 7) will leave the target subvector element untouched. This is equivalent to a predicate mask which is built-in, in immediate form, into the [[sv/mv.swizzle]] operation.  mv.swizzle is rare in that it is one of the few instructions needed to be added that are never going to be part of a Scalar ISA.  Even in High Performance Compute workloads it is unusual: it is only because SV is targetted at 3D and Video that it is being considered.
+
+Some 3D GPU ISAs also allow for two-operand subvector swizzles.  These are sufficiently unusual, and the immediate opcode space they require so large, that SV adds only mv.swizzle.
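
Reviewer note (not part of the patch): the swizzle behaviour described in the hunk above can be modelled with a minimal Python sketch. Only two points come from the text itself: a selector of 5 writes 1.0, and selectors 6 and 7 leave the target element untouched. The rest is assumed for illustration: selectors 0-3 are taken to pick source elements X, Y, Z, W, and selector 4 is taken to write 0.0. The name `mv_swizzle` and its argument layout are hypothetical and are not the SV encoding.

```python
def mv_swizzle(dest, src, remap):
    """Model a subvector swizzle-move; returns the new destination subvector.

    dest  -- current destination subvector (up to 4 elements)
    src   -- source subvector (up to 4 elements)
    remap -- one selector (0..7) per destination element
    """
    out = list(dest)
    for s, sel in enumerate(remap):
        if sel < 4:
            # ASSUMPTION: selectors 0..3 pick source elements X, Y, Z, W
            out[s] = src[sel]
        elif sel == 4:
            # ASSUMPTION: selector 4 initialises the element to 0.0
            out[s] = 0.0
        elif sel == 5:
            # from the text: selector 5 writes the constant 1.0
            out[s] = 1.0
        # from the text: selectors 6 and 7 leave the element untouched,
        # acting as a built-in predicate mask in immediate form
    return out


# W and Z are moved into the first two slots, the third slot is masked
# out (left untouched), and the fourth slot is set to the constant 1.0.
result = mv_swizzle([9.0, 9.0, 9.0, 9.0], [1.0, 2.0, 3.0, 4.0], [3, 2, 6, 5])
```

Because the "untouched" selectors sit in the immediate, one such move can reorder, set constants and mask in a single step, which is the saving over scalar mv instructions that the text describes.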