is it reasonable for the complexity to bleed over into LE Mode
in order to "compensate" by making BE easier but LE more difficult.
+It is actually possible to check what POWER9/10 do here,
+using `vaddubm` and `vadduhm` (or other suitable instructions):
+
+* start from a source register of zero
+* perform an 8-bit add of 0x05 (`vaddubm`)
+* using the result of the previous calculation, add a *16 bit*
+ value where only the upper half is set (0x0a00) (`vadduhm`)
+
+The expected result for every single element, regardless of BE or LE
+order, will be `0x0a05`, **not** `0x00a5` in BE Mode.
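
The arithmetic of the steps above can be sketched in software. This is only a rough model, not something run on POWER9/10 hardware: it treats the 128-bit register as a plain integer and applies lane-wise modulo adds in the spirit of `vaddubm` (8-bit lanes) and `vadduhm` (16-bit lanes). The helper names `lane_add`, `splat` and `run_experiment` are hypothetical, and the sketch assumes the byte-wise add places 0x05 in the low-order byte of each 16-bit lane.

```python
REG_BITS = 128  # width of one vector register


def lane_add(a, b, lane_bits):
    """Add two 128-bit values lane by lane, modulo 2**lane_bits per lane."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, REG_BITS, lane_bits):
        lane = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        # Discard any carry so it cannot leak into the neighbouring lane.
        result |= (lane & lane_mask) << shift
    return result


def splat(value, lane_bits):
    """Replicate value into every lane of a 128-bit register."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, REG_BITS, lane_bits):
        result |= (value & lane_mask) << shift
    return result


def run_experiment():
    """Model the three steps and return the eight 16-bit lanes."""
    reg = 0                                      # start from zero
    # Assumption: 0x05 goes into the low-order byte of each 16-bit lane.
    reg = lane_add(reg, splat(0x0005, 16), 8)    # byte-wise add
    reg = lane_add(reg, splat(0x0A00, 16), 16)   # halfword-wise add
    return [(reg >> shift) & 0xFFFF for shift in range(0, REG_BITS, 16)]


if __name__ == "__main__":
    print([format(lane, "#06x") for lane in run_experiment()])
```

Because the model works on register values rather than on bytes in memory, it has no notion of BE or LE at all, which is the behaviour the text attributes to the hardware. A real check would still need the two instructions executed on POWER9/10 in both modes.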
+
+This will be down to the Architects having chosen a **fixed**
+bit-ordering and **fixed** byte ordering in the underlying
+hardware without revealing that fact.