they are instead SIMD versions of:
- result = 0 # initial value
+ result = 0 # initial value (single bit)
for i in range(64):
    result = result or a[i]
+# bool (some operator) as an example
+instead of the above single 64 bit bool result, dynamic partitioned SIMD must return a batch of results. if the subdivision is 2x32 it is:
+
+ result[0] = 0 # initial value for low word
+ result[1] = 0 # initial value for hi word
+ for i in range(32):
+     result[0] = result[0] or a[i]
+     result[1] = result[1] or a[i+32]
+
+and likewise by the time 8x8 is reached:
+
+ for j in range(8):
+     result[j] = 0 # initial value for each byte
+     for i in range(8):
+         result[j] = result[j] or a[i+j*8]
+
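the three cases above differ only in the number of partitions, so they can be folded into one loop. a minimal sketch in plain python (not nmigen), where `partitioned_any` and its `lanes` parameter are hypothetical names used only for illustration, and `a` is assumed to be a list of 64 bits:

```python
def partitioned_any(a, lanes):
    """OR-reduce the bits of `a` separately within `lanes` equal partitions.

    hypothetical helper for illustration: `a` is a list of 0/1 values,
    `lanes` is 1, 2, 4 or 8 for the 1x64, 2x32, 4x16 and 8x8 cases.
    """
    width = len(a) // lanes          # input bits per partition
    result = [0] * lanes             # one single-bit answer per partition
    for j in range(lanes):
        for i in range(width):
            result[j] = result[j] or a[i + j * width]
    return result


a = [0] * 64
a[40] = 1                            # one bit set, in the upper 32 bits

print(partitioned_any(a, 1))         # 1x64
print(partitioned_any(a, 2))         # 2x32
print(partitioned_any(a, 8))         # 8x8
```

note that the length of the answer changes with the partitioning (1, 2, 4 or 8 entries), which is awkward for hardware whose width has to be fixed in advance.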
+now the question becomes: what to do when the Signal is dynamically partitionable? how do we merge all of the combinations, 1x64 2x32 4x16 8x8 into the same statically-allocated hardware?
+
+the first thing is to define some conventions: that the answer (result) will always be 8 bit (not 1 bit) and that, rather than just one bit being set if some input bits are set, either all 8 bits are clear or all 8 bits are set.
+
+likewise, when configured as 2x32 the result is subdivided into two 4 bit halves: the first half is all zero if all of the first 32 bits are zero, and all ones if any bit in the first 32 bits is set. the second half does the same for the upper 32 bits.
+
+ result[0] = 0 # initial value for low word
+ result[4] = 0 # initial value for hi word
+ for i in range(32):
+     result[0] = result[0] or a[i]
+     result[4] = result[4] or a[i+32]
+ if result[0]:
+     result[1:4] = 1 # set bits 1..3, filling the low half
+ if result[4]:
+     result[5:8] = 1 # set bits 5..7, filling the hi half
+
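the same fill rule can be sketched for any partitioning. this is plain python rather than nmigen, `partitioned_any_filled` is a hypothetical name, and the 4x16 and 8x8 behaviour (each partition owning 2 bits and 1 bit of the result respectively) is an assumption extrapolated from the 1x64 and 2x32 cases described above:

```python
def partitioned_any_filled(a, lanes):
    """OR-reduce each partition of `a`, returning a fixed 8-bit answer.

    hypothetical helper for illustration. each partition owns 8 // lanes
    bits of the result, and all of those bits carry the same value.
    """
    width = len(a) // lanes          # input bits per partition
    span = 8 // lanes                # result bits owned by each partition
    result = [0] * 8                 # answer is always 8 bits wide
    for j in range(lanes):
        flag = 0
        for i in range(width):
            flag = flag or a[i + j * width]
        for k in range(span):        # fill every bit this partition owns
            result[j * span + k] = flag
    return result


a = [0] * 64
a[40] = 1                            # one bit set, in the upper 32 bits

print(partitioned_any_filled(a, 1))  # 1x64: all 8 bits set
print(partitioned_any_filled(a, 2))  # 2x32: only the hi 4 bits set
```

the answer now has the same width whatever the partitioning, which is what lets one piece of statically-allocated hardware serve all of the combinations.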
+thus we have a convention where