From ecbe96541e1c961d90b8016101672831433777e3 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Fri, 18 Dec 2020 19:32:16 +0000
Subject: [PATCH]

---
 .../architecture/dynamic_simd/logicops.mdwn   | 35 ++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/3d_gpu/architecture/dynamic_simd/logicops.mdwn b/3d_gpu/architecture/dynamic_simd/logicops.mdwn
index 76b867614..f64f91f12 100644
--- a/3d_gpu/architecture/dynamic_simd/logicops.mdwn
+++ b/3d_gpu/architecture/dynamic_simd/logicops.mdwn
@@ -12,8 +12,41 @@ These are not the same as bitwise operations equivalent to:
 
 they are instead SIMD versions of:
 
-    result = 0 # initial value
+    result = 0 # initial value (single bit)
     for i in range(64):
         result = result or a[i]
 
 
+# bool (some operator) as an example
+instead of the above single 64 bit bool result, dynamic partitioned SIMD must return a batch of results. if the subdivision is 2x32 it is:
+
+    result[0] = 0 # initial value for low word
+    result[1] = 0 # initial value for hi word
+    for i in range(32):
+        result[0] = result[0] or a[i]
+        result[1] = result[1] or a[i+32]
+
+and likewise by the time 8x8 is reached:
+
+    for j in range(8):
+        result[j] = 0 # initial value for each byte
+        for i in range(8):
+            result[j] = result[j] or a[i+j*8]
+
+now the question becomes: what to do when the Signal is dynamically partitionable? how do we merge all of the combinations, 1x64 2x32 4x16 8x8 into the same statically-allocated hardware?
+
+the first thing is to define some conventions: that the answer (result) will always be 8 bit (not 1 bit) and that, rather than just one bit being set if some are set, all 8 bits are clear or all 8 bits are set.
+
+likewise, when configured as 2x32 the result is subdivided into two 4 bit halves: the first half is all zero if all the first 32 bits are zero, and all ones if any one bit in the first 32 bits is set.
+
+    result[0] = 0 # initial value for low word
+    result[4] = 0 # initial value for hi word
+    for i in range(32):
+        result[0] = result[0] or a[i]
+        result[4] = result[4] or a[i+32]
+    if result[0]:
+        result[1:4] = 0b111 # copy bit 0 into bits 1-3
+    if result[4]:
+        result[5:8] = 0b111 # copy bit 4 into bits 5-7
+
+thus we have a convention where
-- 
2.30.2
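
The 8-bit result convention described in the patch can be sketched in plain Python as a software model. This is only an illustration under stated assumptions, not the hardware implementation: the function name `partitioned_bool` and the `lane_bits` parameter are hypothetical, the input is modelled as a Python integer rather than a Signal, and the 4x16 and 8x8 layouts (2 result bits and 1 result bit per lane) are extrapolated from the 1x64 and 2x32 cases that the text spells out.

```python
def partitioned_bool(a, lane_bits):
    """Hypothetical model: OR-reduce each lane of a 64-bit value.

    Returns an 8-bit result in which every lane owns (lane_bits // 8)
    result bits. Those bits are all set if any bit of the lane is set,
    and all clear otherwise.
    """
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("lane_bits must be 8, 16, 32 or 64")
    bits_per_lane = lane_bits // 8        # result bits owned by one lane
    lane_mask = (1 << lane_bits) - 1      # selects one lane of the input
    fill = (1 << bits_per_lane) - 1       # all-ones pattern for one lane
    result = 0
    for lane in range(64 // lane_bits):
        chunk = (a >> (lane * lane_bits)) & lane_mask
        if chunk:                         # any bit set in this lane
            result |= fill << (lane * bits_per_lane)
    return result


# 1x64: a single set bit fills all 8 result bits
print(hex(partitioned_bool(1, 64)))
# 2x32: a bit in the high word fills only result bits 4-7
print(hex(partitioned_bool(1 << 32, 32)))
```

Keeping the result width fixed at 8 bits for every partitioning is what lets one model (and, by assumption, one piece of statically-allocated hardware) cover all four subdivisions.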