From aa8af14278fd96c9874da62baf6868b24c43571f Mon Sep 17 00:00:00 2001
From: lkcl
Date: Thu, 24 Dec 2020 21:23:13 +0000
Subject: [PATCH]

---
 openpower/sv/overview.mdwn | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/openpower/sv/overview.mdwn b/openpower/sv/overview.mdwn
index 816800c75..2a30d6323 100644
--- a/openpower/sv/overview.mdwn
+++ b/openpower/sv/overview.mdwn
@@ -1,5 +1,7 @@
 # SV Overview
 
+[[! toc]]
+
 This document provides a crash-course overview as to why SV exists, and how it works.
 
 SIMD, the primary method for easy parallelism of the past 30 years in Computer Architectures, is [known to be harmful](https://www.sigarch.org/simd-instructions-considered-harmful/). SIMD provides
@@ -209,7 +211,7 @@ inner part. Predication is still taken from the VL index, however it is applied
         if (rs1.isvec) { irs1 += 1; }
         if (rs2.isvec) { irs2 += 1; }
 
-# Swizzle
+# Swizzle
 
 Swizzle is particularly important for 3D work. It allows in-place reordering of XYZW, ARGB etc. and access of sub-portions of the same in arbitrary order *without* requiring timeconsuming scalar mv instructions (scalar due to the convoluted offsets). With somewhere around 10% of operations in 3D Shaders involving swizzle this is a huge saving and reduces pressure on register files.
 
@@ -225,4 +227,6 @@ In SV given the percentage of operations that also involve initislisation to 0.0
         elif remap == 5:
             ireg[rd+s] <= 1.0
 
-Note that a value of 6 (and 7) will leave the target subvector element untouched. This is equivalent to a predicate mask which is built-in, in immediate form, into the [[sv/mv.swizzle]] operation. mv.swizzle is rare in that it is one of the few instructions needed to be added that are never going to be part of a Scalar ISA. Even in High Performance Compute workloads it is unusual: it is only because SV is targetted at 3D and Video that it is being considered.
+Note that a value of 6 (and 7) will leave the target subvector element untouched. This is equivalent to a predicate mask which is built-in, in immediate form, into the [[sv/mv.swizzle]] operation. mv.swizzle is rare in that it is one of the few instructions needed to be added that are never going to be part of a Scalar ISA. Even in High Performance Compute workloads it is unusual: it is only because SV is targetted at 3D and Video that it is being considered.
+
+Some 3D GPU ISAs also allow for two-operand subvector swizzles. These are sufficiently unusual, and the immediate opcode space required so large, that the tradeoff balance was decided in SV to only add mv.swizzle.
-- 
2.30.2