From 84c345f2926b0ce994dda92ece4d4f588f1a71ba Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 20 Aug 2022 02:26:52 +0100
Subject: [PATCH]

---
 openpower/sv/ldst.mdwn | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/openpower/sv/ldst.mdwn b/openpower/sv/ldst.mdwn
index 765f43a7b..b5aa548e8 100644
--- a/openpower/sv/ldst.mdwn
+++ b/openpower/sv/ldst.mdwn
@@ -386,19 +386,21 @@ others like it provide an explicit operation width. There are therefore
 *three* widths involved:

 * operation width (lb=8, lh=16, lw=32, ld=64)
-* src elelent width override (8/16/32/default)
+* src element width override (8/16/32/default)
 * destination element width override (8/16/32/default)

 Some care is therefore needed to express and make clear the transformations,
 which are expressly in this order:

+* Calculate the Effective Address from RA at full width
+  but (on Indexed Load) allow srcwidth overrides on RB
 * Load at the operation width (lb/lh/lw/ld) as usual
 * byte-reversal as usual
 * Non-saturated mode:
-  - zero-extension or truncation from operation width to source elwidth
-  - zero/truncation to dest elwidth
+  - zero-extension or truncation from operation width to dest elwidth
+  - place result in destination at dest elwidth
 * Saturated mode:
-  - Sign-extension or truncation from operation width to source width
+  - Sign-extension or truncation from operation width to dest width
   - signed/unsigned saturation down to dest elwidth

 In order to respect OpenPOWER v3.0B Scalar behaviour the memory side
@@ -466,16 +468,17 @@ and other modes have all been removed, for clarity and simplicity:

         # check saturation.
         if svpctx.saturation_mode:
-            ... saturation adjustment...
+            # ... saturation adjustment...
+            memread = clamp(memread, op_width, svctx.dest_elwidth)
         else:
-            # truncate/extend to over-ridden source width.
-            memread = adjust_wid(memread, op_width, svctx.src_elwidth)
+            # truncate/extend to over-ridden dest width.
+            memread = adjust_wid(memread, op_width, svctx.dest_elwidth)

         # takes care of inserting memory-read (now correctly byteswapped)
         # into regfile underlying LE-defined order, into the right place
         # within the NEON-like register, respecting destination element
         # bitwidth, and the element index (j)
-        set_polymorphed_reg(RT, svctx.dest_bitwidth, j, memread)
+        set_polymorphed_reg(RT, svctx.dest_elwidth, j, memread)

         # increments both src and dest element indices (no predication here)
         i++;
-- 
2.30.2
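
As a reading aid for the transformation order this patch corrects, here is a minimal Python sketch of what the two helpers named in the pseudocode, `adjust_wid` and `clamp`, might do. The patch only shows their call sites, so the bodies below are assumptions rather than the specification's definitions. The `signed` keyword, the `mask` and `sign_extend` helpers, and the choice to return results as unsigned bit patterns are all hypothetical.

```python
def mask(width):
    """All-ones bit mask of the given width (hypothetical helper)."""
    return (1 << width) - 1


def sign_extend(value, width):
    """Interpret the low `width` bits of value as a two's complement
    signed integer (hypothetical helper)."""
    value &= mask(width)
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit


def adjust_wid(value, op_width, dest_elwidth):
    """Non-saturated mode (assumed behaviour): zero-extend or truncate
    an op_width-bit memory read to dest_elwidth bits."""
    value &= mask(op_width)            # memory read is treated as unsigned
    return value & mask(dest_elwidth)  # widening needs no work on Python ints


def clamp(value, op_width, dest_elwidth, signed=True):
    """Saturated mode (assumed behaviour): saturate an op_width-bit
    memory read down to dest_elwidth bits.

    The `signed` flag is an assumption of this sketch; the patch calls
    clamp() with three arguments only.
    """
    if signed:
        v = sign_extend(value, op_width)
        lo = -(1 << (dest_elwidth - 1))
        hi = (1 << (dest_elwidth - 1)) - 1
        v = max(lo, min(hi, v))
        return v & mask(dest_elwidth)  # return as an unsigned bit pattern
    v = value & mask(op_width)
    return min(v, mask(dest_elwidth))


# A 16-bit load (lh) placed into an 8-bit destination element.
halfword = 0x1234
truncated = adjust_wid(halfword, 16, 8)               # keeps the low byte
sat_unsigned = clamp(halfword, 16, 8, signed=False)   # pins to 8-bit max
sat_signed = clamp(halfword, 16, 8)                   # pins to signed 8-bit max
```

Under these assumptions the two modes give different results for the same halfword: non-saturated mode keeps only the low byte, while saturated mode pins the value to the largest number the destination element can hold. In both cases the width passed to the helper is the destination element width, which is the change the patch makes to the pseudocode.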