Hand-written SIMD strncpy assembly routines are, to be blunt about it, a total nightmare. 240 instructions is not uncommon, and the worst thing about them is that they cannot cope with detecting a page fault condition.
+# Data-dependent fail-first
+
+This is a minor variant on the CR-based predicate-result mode. Where pred-result continues with independent element testing, data-dependent fail-first *stops* at the first failure:
+
+    for i in range(VL):
+        # predication test: skip all masked-out elements
+        if predicate_masked_out(i): continue
+        result = op(iregs[RA+i], iregs[RB+i])
+        CRnew = analyse(result) # calculates eq/lt/gt
+        # now test CR, similar to branch
+        if CRnew[BO[0:1]] != BO[2]:
+            VL = i # truncate: only successes allowed
+            break
+        # test passed: store result (and CR?)
+        iregs[RT+i] = result
+        if Rc == 1: crregs[offs+i] = CRnew
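
The loop above can be tried out with a small runnable model. This is only a sketch under stated assumptions, not the specification's behaviour: `analyse` is assumed to be a signed compare against zero, a CR field is modelled as a plain dict, and the hypothetical `test_bit` / `test_value` parameters stand in for `BO[0:1]` and `BO[2]`. The CR field is always recorded, as if Rc=1.

```python
from typing import Callable, Dict, List, Optional, Tuple

def analyse(result: int) -> Dict[str, bool]:
    # Assumption: the CR field is derived from a signed compare with zero.
    return {"lt": result < 0, "gt": result > 0, "eq": result == 0}

def fail_first(
    op: Callable[[int, int], int],
    ra: List[int],
    rb: List[int],
    vl: int,
    test_bit: str,             # hypothetical stand-in for BO[0:1]
    test_value: bool,          # hypothetical stand-in for BO[2]
    mask: Optional[List[bool]] = None,
) -> Tuple[List[Optional[int]], List[Optional[Dict[str, bool]]], int]:
    """Return (results, CR fields, new VL) for a fail-first vector op."""
    results: List[Optional[int]] = [None] * vl
    crs: List[Optional[Dict[str, bool]]] = [None] * vl
    for i in range(vl):
        # Predication test: masked-out elements are skipped, not tested.
        if mask is not None and not mask[i]:
            continue
        result = op(ra[i], rb[i])
        cr = analyse(result)
        # Fail-first: the first element failing the CR test truncates VL
        # and ends the loop; the failing result is not stored.
        if cr[test_bit] != test_value:
            return results[:i], crs[:i], i
        results[i] = result
        crs[i] = cr
    return results, crs, vl

# strncpy-flavoured usage: keep elements until the first zero byte.
src = [104, 105, 0, 33]
zeros = [0, 0, 0, 0]
copied, cr_fields, new_vl = fail_first(
    lambda a, b: a + b, src, zeros, 4, test_bit="eq", test_value=False
)
```

Here the third element produces zero, so its `eq` flag fails the test and VL is truncated to 2: only the two successful elements remain. Masked-out elements leave `None` placeholders in this model, which is purely an artefact of the sketch.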