*three* widths involved:
* operation width (lb=8, lh=16, lw=32, ld=64)
-* src elelent width override (8/16/32/default)
+* src element width override (8/16/32/default)
* destination element width override (8/16/32/default)
Some care is therefore needed to express and make clear the transformations,
which are expressly in this order:
+* Calculate the Effective Address from RA at full width,
+ but (on Indexed Load) allow source elwidth overrides on RB
* Load at the operation width (lb/lh/lw/ld) as usual
* byte-reversal as usual
* Non-saturated mode:
- - zero-extension or truncation from operation width to source elwidth
- - zero/truncation to dest elwidth
+ - zero-extension or truncation from operation width to dest elwidth
+ - place result in destination at dest elwidth
* Saturated mode:
- - Sign-extension or truncation from operation width to source width
+ - Sign-extension or truncation from operation width to dest width
- signed/unsigned saturation down to dest elwidth
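
The two width-adjustment paths listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the specification's definition: the names `adjust_wid` and `clamp` are borrowed from the pseudocode in this page, but their bodies, the `sign_extend` helper, and the `signed` parameter are hypothetical. It assumes that signed saturation sign-extends the loaded value first, and that unsigned saturation treats it as unsigned.

```python
def adjust_wid(value, op_width, elwidth):
    """Non-saturated mode: zero-extend or truncate to elwidth bits (assumed semantics)."""
    value &= (1 << op_width) - 1         # keep only the bits actually loaded
    return value & ((1 << elwidth) - 1)  # truncates; zero-extension leaves value unchanged


def sign_extend(value, width):
    """Hypothetical helper: interpret the low `width` bits as two's complement."""
    value &= (1 << width) - 1
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit


def clamp(value, op_width, elwidth, signed=True):
    """Saturated mode: saturate an op_width-bit value down to elwidth bits.

    The `signed` flag is an assumption made for this sketch; the pseudocode
    in this page calls clamp() with three arguments only.
    """
    if signed:
        value = sign_extend(value, op_width)
        lo, hi = -(1 << (elwidth - 1)), (1 << (elwidth - 1)) - 1
    else:
        value &= (1 << op_width) - 1
        lo, hi = 0, (1 << elwidth) - 1
    return max(lo, min(hi, value))


# A halfword load (lh, op_width=16) with dest elwidth overridden to 8:
truncated = adjust_wid(0x1234, 16, 8)  # low byte only
saturated = clamp(0x1234, 16, 8)       # pinned at the signed 8-bit maximum
```

With these assumptions, truncation keeps the low byte of the loaded halfword, whereas saturation pins an out-of-range value to the nearest representable limit at dest elwidth.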
In order to respect OpenPOWER v3.0B Scalar behaviour the memory side
# check saturation.
if svpctx.saturation_mode:
- ... saturation adjustment...
+ # ... saturation adjustment...
+ memread = clamp(memread, op_width, svctx.dest_elwidth)
else:
- # truncate/extend to over-ridden source width.
- memread = adjust_wid(memread, op_width, svctx.src_elwidth)
+ # truncate/extend to over-ridden dest width.
+ memread = adjust_wid(memread, op_width, svctx.dest_elwidth)
# takes care of inserting memory-read (now correctly byteswapped)
# into regfile underlying LE-defined order, into the right place
# within the NEON-like register, respecting destination element
# bitwidth, and the element index (j)
- set_polymorphed_reg(RT, svctx.dest_bitwidth, j, memread)
+ set_polymorphed_reg(RT, svctx.dest_elwidth, j, memread)
# increments both src and dest element indices (no predication here)
i++;