From: Jacob Lifshay <programmerjake@gmail.com>
Date: Thu, 6 Oct 2022 01:33:55 +0000 (-0700)
Subject: fix x86 sh[lr]d, *not* sh[lr]q. if you like AT&T form, it's sh[lr]dq.
X-Git-Tag: opf_rfc_ls005_v1~146
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=b2c4e1cda8831245c4d1b8a98f34174ab5132c8c;p=libreriscv.git

fix x86 sh[lr]d, *not* sh[lr]q. if you like AT&T form, it's sh[lr]dq.

for an example, see https://gcc.godbolt.org/z/ME4bE7Mdv
---
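
Reviewer note (not part of the commit message): as a rough illustration of
why the double-precision shifts `shld`/`shrd` are the right comparison for
big-integer shift, here is a minimal Python sketch of their 64-bit
semantics. The helper names `shld64`, `shrd64` and `bigint_shl` are
hypothetical and chosen only for this note; the little-endian limb layout is
an assumption, and x86 flag results are not modelled.

```python
MASK64 = (1 << 64) - 1


def shld64(dest, src, count):
    """Model of 64-bit SHLD: shift dest left, filling from the top bits of src."""
    dest &= MASK64
    src &= MASK64
    count &= 63  # with a 64-bit operand the count is masked to 6 bits
    if count == 0:
        return dest  # a zero count leaves the destination unchanged
    return ((dest << count) | (src >> (64 - count))) & MASK64


def shrd64(dest, src, count):
    """Model of 64-bit SHRD: shift dest right, filling from the low bits of src."""
    dest &= MASK64
    src &= MASK64
    count &= 63
    if count == 0:
        return dest
    return ((dest >> count) | (src << (64 - count))) & MASK64


def bigint_shl(limbs, count):
    """Shift a little-endian list of 64-bit limbs left by 0..63 bits.

    Bits shifted out of the most significant limb are dropped.
    """
    out = []
    lower = 0  # the limb below the least significant one is treated as zero
    for limb in limbs:
        out.append(shld64(limb, lower, count))
        lower = limb
    return out
```

Each output limb is built from two adjacent input limbs plus the shift
amount, which is the pattern the wiki text refers to when it compares the
proposed instructions to `shld` and `shrd`.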

diff --git a/openpower/sv/biginteger.mdwn b/openpower/sv/biginteger.mdwn
index 5e4a4f770..d4dc5a202 100644
--- a/openpower/sv/biginteger.mdwn
+++ b/openpower/sv/biginteger.mdwn
@@ -28,9 +28,9 @@ Dynamic SIMD ALUs for maximum performance and effectiveness.
 Covered in [[biginteger/analysis]] the summary is that standard `adde`
 is sufficient for SVP64 Vectorisation of big-integer addition (and `subfe`
 for subtraction) but that big-integer shift, multiply and divide require an
-extra 3-in 2-out instructions, similar to Intel's 
-[shlq](https://www.felixcloutier.com/x86/shld)
-and [shrq](https://www.felixcloutier.com/x86/shrd),
+extra 3-in 2-out instructions, similar to Intel's
+[shld](https://www.felixcloutier.com/x86/shld)
+and [shrd](https://www.felixcloutier.com/x86/shrd),
 `mulx` and `divq`, to be efficient.
 The same instruction (`maddedu`) is used in both
 big-divide and big-multiply because 'maddedu''s primary