From eec0baf1d512a3435396142c8d414da7ce02dff5 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Sat, 7 Apr 2018 12:44:02 +0100
Subject: [PATCH] add reference to hwacha ISA comparison

---
 simple_v_extension.mdwn | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/simple_v_extension.mdwn b/simple_v_extension.mdwn
index 6f9f89cf6..563df85bd 100644
--- a/simple_v_extension.mdwn
+++ b/simple_v_extension.mdwn
@@ -492,6 +492,23 @@ existing non-Simple-V implementation.
 i say that despite really *really* wanting IEEE 704 FP Half-precision to
 end up somewhere in RISC-V in some fashion, for optimising 3D Graphics.  *sigh*.
 
+## TODO: instructions (based on Hwacha) V-Ext duplication analysis
+
+This is partly speculative due to lack of access to an up-to-date
+V-Ext Spec (V2.3-draft RVV 0.4-Draft at the time of writing). However,
+basing an analysis instead on Hwacha, a cursory examination shows over
+**85%** duplication of V-Ext operand-related instructions when
+compared to Simple-V on a standard RV64G base. Even Vector Fetch
+is analogous to a "zero-overhead loop".
+
+Exceptions are:
+
+* Vector Indexed Memory Instructions (non-contiguous)
+* Vector Atomic Memory Instructions
+* Some of the Vector Arithmetic ops: FMIN, FMAX, FSQRT, MADD, MSUB,
+  VSRL, VSRA, VEIDX, VFIRST, VSGNJN, VFSGNJX and potentially more.
+* Consensual Jump
+
 ## TODO: sort
 
 > I suspect that the "hardware loop" in question is actually a zero-overhead
@@ -540,3 +557,5 @@ translates effectively to:
 * B-Extension discussion
 * Broadcom VideoCore-IV
   Figure 2 P17 and Section 3 on P16.
+* Hwacha
+* Hwacha
-- 
2.30.2