arch-gcn3: Fix V_MAD_I32_I24 sign extension
author Michael LeBeane <Michael.Lebeane@amd.com>
Wed, 9 May 2018 21:02:17 +0000 (17:02 -0400)
committer Anthony Gutierrez <anthony.gutierrez@amd.com>
Mon, 22 Jun 2020 16:14:35 +0000 (16:14 +0000)
We are not sign-extending the 24-bit operands we extract from the
sources for V_MAD_I32_I24, so negative operands are multiplied as
large positive values.

This fixes the rnn_fwdBwd 64 1 1 lstm pte assertion failure.
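
The effect of the missing sign extension can be illustrated with a
small standalone sketch. It does not use the gem5 bits() and sext<>()
helpers; low24, sext24, madNoSext and madSext are hypothetical
stand-ins written for this example, and the arithmetic is done in
64 bits rather than truncated to the 32-bit destination register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: keep only bits 23..0 of a 32-bit lane value.
constexpr uint32_t low24(uint32_t v) { return v & 0xFFFFFFu; }

// Hypothetical stand-in: interpret a 24-bit field as a signed value.
// If bit 23 is set, the field represents (field - 2^24).
constexpr int64_t sext24(uint32_t field)
{
    return (field & 0x800000u) ? static_cast<int64_t>(field) - 0x1000000
                               : static_cast<int64_t>(field);
}

// Multiply-add with the operands treated as unsigned 24-bit values
// (no sign extension).
constexpr int64_t madNoSext(uint32_t s0, uint32_t s1, int32_t s2)
{
    return static_cast<int64_t>(low24(s0)) *
           static_cast<int64_t>(low24(s1)) + s2;
}

// Multiply-add with the operands sign-extended from 24 bits first.
constexpr int64_t madSext(uint32_t s0, uint32_t s1, int32_t s2)
{
    return sext24(low24(s0)) * sext24(low24(s1)) + s2;
}

// Example: src0 = -2 (0xFFFFFFFE), src1 = 3, src2 = 10.
inline void example()
{
    assert(madSext(0xFFFFFFFEu, 3u, 10) == 4);           // -2 * 3 + 10
    assert(madNoSext(0xFFFFFFFEu, 3u, 10) == 50331652);  // 16777214 * 3 + 10
}
```

With a negative src0, the unextended version multiplies 0xFFFFFE as a
positive number, which is how a small negative operand can turn into a
very large result.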

Change-Id: I2516e5715227cbd822e6a62630674f64f7a109e0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29928
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
src/arch/gcn3/insts/instructions.cc

index 32719ad27e065ac750f4b0a69bb747a23127ad93..0256d469bbdfa0e888d3a4ac9c12aa4d39411b93 100644
@@ -27446,8 +27446,8 @@ namespace Gcn3ISA
 
         for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) {
             if (wf->execMask(lane)) {
-                vdst[lane] = bits(src0[lane], 23, 0) * bits(src1[lane], 23, 0)
-                    + src2[lane];
+                vdst[lane] = sext<24>(bits(src0[lane], 23, 0))
+                    * sext<24>(bits(src1[lane], 23, 0)) + src2[lane];
             }
         }