Luke Kenneth Casson Leighton [Thu, 18 Oct 2018 22:12:00 +0000 (23:12 +0100)]
typedef on sv_reg_t to reg_t (and signed variant)
still working on redirecting everything through a planned class
that can be polymorphic overloaded... eventually
Luke Kenneth Casson Leighton [Thu, 18 Oct 2018 22:11:08 +0000 (23:11 +0100)]
use unsigned long shift on sv csr setting
Luke Kenneth Casson Leighton [Thu, 18 Oct 2018 17:14:36 +0000 (18:14 +0100)]
forgot to set clroffset
Luke Kenneth Casson Leighton [Wed, 17 Oct 2018 09:56:30 +0000 (10:56 +0100)]
allow 4 CSR entries to be set at a time, on RV64
Luke Kenneth Casson Leighton [Wed, 17 Oct 2018 00:58:15 +0000 (01:58 +0100)]
minor alteration to CSRRWI SETVL / SETMVL to offset immediate by 1
allows CSRRWI to make maximum use of only 5-bit immediate
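a minimal sketch of the offset-by-one idea, assuming the 5-bit CSRRWI immediate n is taken to mean a vector length of n+1 (the helper name vl_from_uimm5 is hypothetical, not from the tree):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: map the 5-bit CSRRWI immediate (0..31) onto a
// requested vector length of 1..32, so no encoding is spent on "VL = 0".
inline uint64_t vl_from_uimm5(uint32_t uimm) {
  return static_cast<uint64_t>(uimm & 0x1f) + 1;
}
```

without the offset the same 5 bits would only reach 0..31.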
Luke Kenneth Casson Leighton [Tue, 16 Oct 2018 22:40:26 +0000 (23:40 +0100)]
shuffle CSR offsets around, offset VL and MVL by one
VL and MVL now span from 1 to XLEN rather than 0 to XLEN-1
also making room for M-Mode and S-Mode CSRs
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 11:56:24 +0000 (12:56 +0100)]
fix compiler warnings on printfs
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 11:55:38 +0000 (12:55 +0100)]
fix annoying printf warning on fp compiles
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 09:26:14 +0000 (10:26 +0100)]
whoops deref null pointer
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 08:58:22 +0000 (09:58 +0100)]
c_beqz sv operational
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 07:56:51 +0000 (08:56 +0100)]
put RVC_SP at back of cintpatterns list
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 06:49:26 +0000 (07:49 +0100)]
add rvc_sp redirection/offset overload
also found weird bug where RVC_FRS1/2 were not being detected,
how it was not found earlier is a mystery, code should not
have compiled!
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 05:52:02 +0000 (06:52 +0100)]
need to check whether SP (reg 2) is used, without redirection
Luke Kenneth Casson Leighton [Mon, 15 Oct 2018 05:43:24 +0000 (06:43 +0100)]
add overload/redirection for WRITE_REG
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 21:29:34 +0000 (22:29 +0100)]
move design to separate document
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 21:29:22 +0000 (22:29 +0100)]
drop all lui from restriction on parallelism
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 12:30:48 +0000 (13:30 +0100)]
disable jal in sv
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:41:04 +0000 (06:41 +0100)]
rv_xxx convert c_xxx
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:31:27 +0000 (06:31 +0100)]
rv_add in lh/sh
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:28:16 +0000 (06:28 +0100)]
rv_add in store
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:26:45 +0000 (06:26 +0100)]
blt and use of rv_add
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:23:38 +0000 (06:23 +0100)]
add rv_ge
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:21:42 +0000 (06:21 +0100)]
add rv_eq and rv_ne
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:20:05 +0000 (06:20 +0100)]
add rv_eq and rv_ne
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:16:54 +0000 (06:16 +0100)]
add rv_gt headers
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:16:03 +0000 (06:16 +0100)]
add rv_sr
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:06:43 +0000 (06:06 +0100)]
add shiftright
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 05:02:59 +0000 (06:02 +0100)]
add shiftleft and lessthan
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:50:24 +0000 (05:50 +0100)]
missed a mul
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:49:05 +0000 (05:49 +0100)]
replace % operator with rv_rem
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:46:37 +0000 (05:46 +0100)]
replace ^ operator with rv_xor
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:45:40 +0000 (05:45 +0100)]
replace | operator with rv_or
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:44:36 +0000 (05:44 +0100)]
replace & operator with rv_and
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:42:39 +0000 (05:42 +0100)]
replace operator * with rv_mul
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:35:03 +0000 (05:35 +0100)]
add rv_div (signed and unsigned) to replace operator /
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:24:18 +0000 (05:24 +0100)]
redirect subtract through rv_sub
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:22:43 +0000 (05:22 +0100)]
redirect add to rv_add
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:22:18 +0000 (05:22 +0100)]
redirect add to rv_add
Luke Kenneth Casson Leighton [Sun, 14 Oct 2018 04:06:46 +0000 (05:06 +0100)]
bit of a mess: attempted to create a complete arithmetic overload
had to back most of it out, and left in a change to the amo* functions
passing in a 2nd parameter to the higher-order-function
Luke Kenneth Casson Leighton [Sat, 13 Oct 2018 13:43:06 +0000 (14:43 +0100)]
rename _zext_xlen
Luke Kenneth Casson Leighton [Sat, 13 Oct 2018 13:40:30 +0000 (14:40 +0100)]
add sv_reg_t
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 17:22:43 +0000 (18:22 +0100)]
redirect WRITE_FRD including different types (128/64/32)
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 17:05:04 +0000 (18:05 +0100)]
add WRITE_FRD macro redirect
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 15:21:11 +0000 (16:21 +0100)]
changed style, can revert changes to amomin/max
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 15:20:27 +0000 (16:20 +0100)]
add frs2 redirect
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 15:15:32 +0000 (16:15 +0100)]
add RS3 replacement
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 15:12:31 +0000 (16:12 +0100)]
simplify sv_proc_t redirection of RS1-3 / FRS1 macros
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 14:16:07 +0000 (15:16 +0100)]
redirect RS2 to sv_proc_t class
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 12:27:11 +0000 (13:27 +0100)]
proof-of-concept, redirect RS1 to class sv_proc_t
Luke Kenneth Casson Leighton [Fri, 12 Oct 2018 02:38:33 +0000 (03:38 +0100)]
combination of redirection through a "property" class overloads WRITE_RD
WRITE_RD, formerly a macro, now replaced with a function.
also required a redirector function rd
Luke Kenneth Casson Leighton [Thu, 11 Oct 2018 20:45:26 +0000 (21:45 +0100)]
redirect instructions through a class called sv_proc_t
preparing the groundwork for a total over-ride of macros such as
WRITE_RD, and so on, so that element width can be implemented
Luke Kenneth Casson Leighton [Thu, 11 Oct 2018 12:17:11 +0000 (13:17 +0100)]
more explicit testing, duplicating header file algorithms for div, rem and shift
Luke Kenneth Casson Leighton [Thu, 11 Oct 2018 07:09:16 +0000 (08:09 +0100)]
whoops run from 0-255 not 0-254, and other test corrections
Luke Kenneth Casson Leighton [Thu, 11 Oct 2018 06:56:32 +0000 (07:56 +0100)]
add some operator tests for int8_t being typecast to uint16_t
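a small sketch of what such typecast tests can check, using only standard C++ conversion rules (all function names here are illustrative, not taken from the test class):

```cpp
#include <cassert>
#include <cstdint>

// Direct cast: the negative value is converted modulo 2^16, which has the
// same effect as sign-extending the 8-bit pattern.
inline uint16_t widen_direct(int8_t v) {
  return static_cast<uint16_t>(v);
}

// Cast through uint8_t first: the 8-bit pattern is zero-extended instead.
inline uint16_t widen_via_unsigned(int8_t v) {
  return static_cast<uint16_t>(static_cast<uint8_t>(v));
}

// Exhaustive byte loop: an int counter lets the range include 255
// (a uint8_t counter with "<= 255" would never terminate).
inline int count_byte_values() {
  int n = 0;
  for (int i = 0; i <= 255; i++) {
    n++;
  }
  return n;
}
```

the two widening paths differ only for negative inputs, which is why the full 0-255 range matters.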
Luke Kenneth Casson Leighton [Thu, 11 Oct 2018 06:34:13 +0000 (07:34 +0100)]
add more experimenting on operators
Luke Kenneth Casson Leighton [Wed, 10 Oct 2018 14:01:42 +0000 (15:01 +0100)]
add operators test class
Luke Kenneth Casson Leighton [Wed, 10 Oct 2018 08:49:43 +0000 (09:49 +0100)]
add operators library to contain operator-overloads of +/-/*/div/>/>= etc
for all types int8, uint8, float16_t, float32_t etc.
all to be template-ified
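a rough sketch of what "template-ified" operators could look like. the names rv_add and rv_div are borrowed from later commits in this log; the template signatures are an assumption, and division by zero is deliberately not handled here:

```cpp
#include <cassert>
#include <cstdint>

// Assumed shape: one template per operator, with the result narrowed back
// to the element type so that 8-bit and 16-bit elements wrap correctly.
template <typename T>
inline T rv_add(T a, T b) {
  return static_cast<T>(a + b);
}

// Signedness comes from T, so the same bit pattern divides differently
// as int8_t and as uint8_t. Caller must ensure b != 0 in this sketch.
template <typename T>
inline T rv_div(T a, T b) {
  return static_cast<T>(a / b);
}
```

the bit pattern 0xF8 is -8 as int8_t and 248 as uint8_t, so rv_div gives -4 and 124 respectively.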
Luke Kenneth Casson Leighton [Tue, 9 Oct 2018 18:33:53 +0000 (19:33 +0100)]
get predicated-vectorised branch working
Luke Kenneth Casson Leighton [Tue, 9 Oct 2018 16:01:12 +0000 (17:01 +0100)]
save branch address and predication merged result, and test after branch
if the predication was all good, go ahead with the branch
still not tested for predicated / vectorised branch yet however
at least scalar branches work
Luke Kenneth Casson Leighton [Tue, 9 Oct 2018 15:17:42 +0000 (16:17 +0100)]
add explanatory comment
Luke Kenneth Casson Leighton [Tue, 9 Oct 2018 15:02:00 +0000 (16:02 +0100)]
add explanatory comment
Luke Kenneth Casson Leighton [Tue, 9 Oct 2018 14:51:12 +0000 (15:51 +0100)]
start adding explicit twin-predicated branch identification (rs2)
Luke Kenneth Casson Leighton [Tue, 9 Oct 2018 10:39:49 +0000 (11:39 +0100)]
extend sv register file from 64 to 128 after discussion.
evaluation of even embedded GPUs shows that they have really enormous
register files. a fp16 x 4 to express quads, times four, takes up eight
consecutive registers just on its own.
Luke Kenneth Casson Leighton [Sun, 7 Oct 2018 14:06:39 +0000 (15:06 +0100)]
override setpc macro so that sv can redirect it in branch
this is slightly complicated. setpc is a global macro, intended for
use in the templates. however for SV it needs to be conditional,
so needs to redirect to a function in sv_insn_t.
that in turn needs quite a few extra parameters: the current loop
element offset, the argument to set_pc (in case it is detected that
this really is to be a branch and not a predication scenario),
and so on.
have not tried out vectorisation yet, at least straight non-vectorised
branch operations (unit tests) pass.
Luke Kenneth Casson Leighton [Sun, 7 Oct 2018 08:09:21 +0000 (09:09 +0100)]
swap #ifdef USING_NOREGS so that it is possible to redefine set_pc
to avoid having to modify the branch instructions the set_pc macro will
be #undefd and redefined to modify the predication target
Luke Kenneth Casson Leighton [Sun, 7 Oct 2018 07:03:11 +0000 (08:03 +0100)]
add rd bit-setting function
Luke Kenneth Casson Leighton [Sun, 7 Oct 2018 05:52:11 +0000 (06:52 +0100)]
add extra debug printing for c.lwsp
Luke Kenneth Casson Leighton [Sun, 7 Oct 2018 03:51:10 +0000 (04:51 +0100)]
add rvc_swsp_imm sv overload, provides vector unit stride now
Luke Kenneth Casson Leighton [Sat, 6 Oct 2018 19:18:32 +0000 (20:18 +0100)]
c.swsp and c.fswsp predication and offset enabling
Luke Kenneth Casson Leighton [Sat, 6 Oct 2018 16:07:16 +0000 (17:07 +0100)]
allow x2 (sp) to be redirected in C.LWSP
Luke Kenneth Casson Leighton [Sat, 6 Oct 2018 15:49:20 +0000 (16:49 +0100)]
temporary hack disabling SV in anything other than user mode
Luke Kenneth Casson Leighton [Sat, 6 Oct 2018 15:48:55 +0000 (16:48 +0100)]
whoops inverted ldsp and lwsp immediates
Luke Kenneth Casson Leighton [Sat, 6 Oct 2018 09:24:25 +0000 (10:24 +0100)]
create ldsp immediate offset overrides
this allows C.LWSP/C.LDSP to do predicated load/stores from contiguous
blocks of memory
Luke Kenneth Casson Leighton [Sat, 6 Oct 2018 05:51:46 +0000 (06:51 +0100)]
add in predication for immediate, for C.LWSP
Luke Kenneth Casson Leighton [Fri, 5 Oct 2018 14:26:20 +0000 (15:26 +0100)]
reorganise src and dest vector-element offsets
pass in pointer to offsets from processor_t->state.srcoffs and destoffs
into sv_insn_t so that the element state information about how many
parallel elements have been executed is recorded.
in this way it will be possible to restart a loop at the right place
if there is an exception (such as a memory cache miss on a LOAD/STORE)
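a simplified sketch of the restart idea, assuming the offsets live in processor state and the element loop only sees pointers to them. the struct, the function and the fault_at parameter are all hypothetical stand-ins, not the real sv_insn_t interface:

```cpp
#include <cassert>
#include <vector>

// Stand-in for the srcoffs/destoffs fields held in processor state.
struct sv_state_sketch {
  unsigned srcoffs = 0;
  unsigned destoffs = 0;
};

// Runs elements until vl is reached. fault_at simulates an exception on
// that element (-1 for none). On a fault the offsets are left as they
// are, so calling again resumes at the faulting element. Returns true
// once the whole loop has completed, at which point offsets are reset.
inline bool run_elements(unsigned *srcoffs, unsigned *destoffs, unsigned vl,
                         int fault_at, std::vector<unsigned> &done) {
  while (*destoffs < vl) {
    if (static_cast<int>(*destoffs) == fault_at) {
      return false;  // "trap": progress so far is kept in the state
    }
    done.push_back(*destoffs);  // record which element was executed
    ++*srcoffs;
    ++*destoffs;
  }
  *srcoffs = 0;
  *destoffs = 0;
  return true;
}
```

because the counters are not local to the loop, a second call after the fault picks up at element 2 rather than redoing elements 0 and 1.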
Luke Kenneth Casson Leighton [Fri, 5 Oct 2018 14:09:48 +0000 (15:09 +0100)]
add srcoffs and destoffs sv state, alter CSRs
remove SVREALVL, replace with SVSTATE
Luke Kenneth Casson Leighton [Thu, 4 Oct 2018 13:11:02 +0000 (14:11 +0100)]
reorganise twin-predication
move the offset incrementing to outside of the sv_insn_t, and pass in
the src_offs and dest_offs by reference, mirroring and matching the
predication src and dest referencing
Luke Kenneth Casson Leighton [Thu, 4 Oct 2018 09:16:10 +0000 (10:16 +0100)]
big reorganisation to support twin-predication
twin predication is on certain operations like LOAD, STORE, C.MV, FCVT
where the effect is similar to VSPLAT, VINSERT, and also bitmanip
scatter/gather.
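one way to picture twin-predication on a vectorised move, assuming each predicate independently skips its masked-out elements and the next active source is paired with the next active destination. this is a sketch of the concept with hypothetical names, not the simulator code, and it assumes vl is at most 64 and no larger than either vector:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// src_pred selects which source elements are read, dest_pred selects
// which destination elements are written. Returns the updated dest.
inline std::vector<int> twin_pred_mv(const std::vector<int> &src,
                                     std::vector<int> dest,
                                     uint64_t src_pred, uint64_t dest_pred,
                                     unsigned vl) {
  unsigned i = 0;  // source element offset
  unsigned j = 0;  // destination element offset
  while (true) {
    while (i < vl && !((src_pred >> i) & 1)) i++;   // skip masked sources
    while (j < vl && !((dest_pred >> j) & 1)) j++;  // skip masked dests
    if (i >= vl || j >= vl) break;
    dest[j] = src[i];
    i++;
    j++;
  }
  return dest;
}
```

with a sparse source predicate and a dense destination predicate this behaves like a gather; swapping the two gives a scatter, and a single-bit source predicate gives something close to a splat-style insert.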
Luke Kenneth Casson Leighton [Wed, 3 Oct 2018 09:42:23 +0000 (10:42 +0100)]
add in twin-predication identification
pass in second predicate for twin-predication operands
Luke Kenneth Casson Leighton [Wed, 3 Oct 2018 06:16:34 +0000 (07:16 +0100)]
decided not to change the behaviour of LOAD/STORE
Luke Kenneth Casson Leighton [Tue, 2 Oct 2018 15:01:22 +0000 (16:01 +0100)]
start work on parallelising LOAD, pass in parameter to reinterpret immed
Luke Kenneth Casson Leighton [Tue, 2 Oct 2018 11:26:47 +0000 (12:26 +0100)]
debug print for floating-point regs
Luke Kenneth Casson Leighton [Mon, 1 Oct 2018 14:09:21 +0000 (15:09 +0100)]
add comment explaining why invert isn't done in zeroing test
(already inverted basically)
Luke Kenneth Casson Leighton [Mon, 1 Oct 2018 14:08:25 +0000 (15:08 +0100)]
add comment explaining use of insn._rd() in zeroing
Luke Kenneth Casson Leighton [Mon, 1 Oct 2018 11:00:27 +0000 (12:00 +0100)]
whoops vloop continuation logic the wrong way round
the loop has to continue if there is one vectorised register left
rather than stop if there are *no* vectorised registers
Luke Kenneth Casson Leighton [Mon, 1 Oct 2018 01:16:21 +0000 (02:16 +0100)]
skip parallelisation of complex LR/SC operations
Luke Kenneth Casson Leighton [Mon, 1 Oct 2018 01:04:57 +0000 (02:04 +0100)]
identify type of instruction with additional #defines
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 12:33:07 +0000 (13:33 +0100)]
add a #define to id_regs.py which indicates name of the instruction
the c preprocessor cannot cope with detecting what the name of the
instruction is (and when it does, if it is "and" that throws a
compile-error), so as a workaround have id_regs.py create a
#define INSN_ADD, #define INSN_ADD_I etc.
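a sketch of how such a generated #define can be consumed. INSN_AND follows the INSN_<NAME> pattern above and is written by hand here as if the script had emitted it; is_and_insn is a hypothetical name. a likely reason the bare name fails is that "and" is an alternative operator token in C++:

```cpp
#include <cassert>

// As if emitted at the top of the generated file for the "and" instruction.
#define INSN_AND

// The template only ever tests whether the symbol is defined, so the
// preprocessor never has to compare against the instruction name itself.
#ifdef INSN_AND
constexpr bool is_and_insn = true;
#else
constexpr bool is_and_insn = false;
#endif
```

the same #ifdef test works for any instruction name, including ones that collide with keywords.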
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 09:37:05 +0000 (10:37 +0100)]
list of instructions to avoid parallelising
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 08:26:54 +0000 (09:26 +0100)]
update template comment
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 08:15:17 +0000 (09:15 +0100)]
lots of debugging of predication, found other errors
lots of stuff here:
* changed sv reg and pred entry layout to make them a bit more
human-readable: put the key and idx on byte-boundaries
* bug in SV REG setting (getting int table instead of fp one, twice)
* bug(s) in SV PRED table setting (zeroing the reg tables, using idx not
key and more)
* stop lookup of predication if the *REGISTER* entry is inactive
(but not if it is a scalar, because scalar predication is ok)
this was the final piece of the puzzle that got predication working
* added a useful macro for creating SV REG and PRED CAM table entries
predication now working including zeroing
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 07:00:09 +0000 (08:00 +0100)]
add sv support for zeroing predication in dest register
bit of a major rework:
* access to the "unpredicated" (non-zero-hacked) register was needed
* therefore all rd/rs1-3/rvc_xxx functions had to have _ variants
* the underscored variants are not predicated
* this in turn meant that the offset for each register was wrong
as it is incremented *after* being checked
* therefore a newoffs had to be added
* and the reset_cache function copies the newoffs values
bit of a mess but it works: this is a state machine after all...
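for illustration, the intended effect of zeroing versus non-zeroing predication on a destination vector, written as a plain loop. this shows the semantics only, under the assumption that vl is at most 64; it does not model the offset/newoffs state machine described above, and predicated_add is a hypothetical name:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// For each element: if its predicate bit is set, do the operation.
// Otherwise, with zeroing, write 0 to the destination element;
// without zeroing, leave the destination element untouched.
inline void predicated_add(std::vector<uint64_t> &rd,
                           const std::vector<uint64_t> &rs1,
                           const std::vector<uint64_t> &rs2,
                           uint64_t pred, bool zeroing, unsigned vl) {
  for (unsigned i = 0; i < vl; i++) {
    if ((pred >> i) & 1) {
      rd[i] = rs1[i] + rs2[i];
    } else if (zeroing) {
      rd[i] = 0;
    }
  }
}
```

with predicate 0b0101 only elements 0 and 2 are computed; elements 1 and 3 end up either zero or at their old value depending on the zeroing flag.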
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 06:07:43 +0000 (07:07 +0100)]
add in predication to sv instruction execution
this relies on setting the value of the destination register
(and source registers) to zero. a bad hack but it will do
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 05:14:17 +0000 (06:14 +0100)]
start linking in predication into sv
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 04:13:50 +0000 (05:13 +0100)]
use an alternative logic for detecting scalar / loop-end
instead of pre-checking do the check for "all-scalar" during the
first loop iteration i.e. when registers are first accessed
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 04:12:49 +0000 (05:12 +0100)]
add compressed-identifying patterns to id_regs.py
also skip LUI (and C.LUI) as non-paralleliseable instructions
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 03:25:55 +0000 (04:25 +0100)]
fix code template for when SPIKE_SIMPLEV is not defined
Luke Kenneth Casson Leighton [Sun, 30 Sep 2018 03:25:28 +0000 (04:25 +0100)]
yuk. break id_regs.py being a generic tool by skipping csr ops
trying to use c preprocessor macros to stop CSRs in sv from being
parallelised is too painful, yet skipping them is necessary: a parallel
CSR read/write does not make sense
Luke Kenneth Casson Leighton [Sat, 29 Sep 2018 11:16:00 +0000 (12:16 +0100)]
fix bug in sv template where FRS2 was checking rs3
Luke Kenneth Casson Leighton [Sat, 29 Sep 2018 11:15:29 +0000 (12:15 +0100)]
add checks for RVC registers to sv template