X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=openpower%2Fsv%2Fimplementation.mdwn;h=f55d5c58a0bdc14923cf5678202990b65383d174;hb=88a823723277b6f79040042f9f2c1a00b33d7f8d;hp=24e27d76e1120091d7681d3dfc63ebdcaa690b4f;hpb=07866151e9f90e7424996b81eec1c2cdcdbc8d75;p=libreriscv.git diff --git a/openpower/sv/implementation.mdwn b/openpower/sv/implementation.mdwn index 24e27d76e..f55d5c58a 100644 --- a/openpower/sv/implementation.mdwn +++ b/openpower/sv/implementation.mdwn @@ -1,8 +1,260 @@ # Implementation This page covers and coordinates implementing SV. The basic concept is -to go step-by-step through the [[sv/overview]] adding each feature. +to go step-by-step through the [[sv/overview]] adding each feature, +one at a time. Caveats and notes are included so that other implementors may avoid some common pitfalls. Links: * +* python-based svp64 + assembler translator +* c/c++ macro svp64 + assembler translator +* microwatt svp64-decode1.vhdl autogenerator +* gcc/binutils/svp64 +* gem5 / ISACaller simulator + - gem5 upstreaming +* TestIssuer +* PowerDecoder2 +* setvl ancillary tasks + (instruction form SVL-Form, field designations, pseudocode, SPR allocation) +* agree sv assembly syntax +* TestIssuer add single/twin Predication +* ISACaller add single/twin Predication +* tracking manual augmentation of CSV files +* add zeroing and exceptions +* element-width overrides + +# Code to convert + +There are five projects: + +* TestIssuer (the HDL) +* ISACaller (the python-based simulator) +* power-gem5 (a cycle accurate simulator) +* Microwatt (VHDL) +* gcc and binutils + +Each of these needs to have SV augmentation, and the best way to +do it is if they are all done at the same time, implementing the same +incremental feature. + +# Critical tasks + +These are prerequisite tasks: + +* power-gem5 automanagement, similar to pygdbmi for starting qemu + - found this + just use pygdbmi + - remote gdb should work +* c++, c and python macros for generating [[sv/svp64]] assembler + (svp64 prefixes) + - python svp64 underway, minimalist sufficient for FU unit tests + +* PowerDecoder2 - both TestIssuer and ISACaller are dependent on this + - underway + - INT and CR EXTRA svp64 fields completed. +* SVP64PowerDecoder2, used to identify SVP64 Prefixes. DONE. + +People coordinating different tasks. This doesn't mean exclusive work on these areas it just means they are the "coordinator" and lead: + +* Lauri: +* Jacob: C/C++ header for using SV through inline assembly +* Cesar: TestIssuer FSM +* Alain: power-gem5 +* Cole: +* Luke: ISACaller, python-assembler-generator-class +* Tobias: +* Alexandre: binutils-svp64-assembler and gcc +* Paul: microwatt + +# Adding SV + +order: listed in [[sv/overview]] + +## svp64 decoder + +An autogenerator containing CSV files is available so that the task of creating decoders is not burdensome. sv_analyse.py creates the CSV files, SVP64RM class picks them up. + +* ISACaller: part done. svp64 detected, PowerDecoder2 in use +* power-gem5: TODO +* TestIssuer: part done. svp64 detected, PowerDecoder2 in use. +* Microwatt: TODO, started auto-generated sv_decode.vhdl +* python-based assembler-translator: 40% done (lkcl) +* c++ macros: underway (jacob) + +Note when decoding the RM into bits different modes that LDST interprets the 5 mode bits differently not just on whether it is LD/ST bit also what *type* of LD/ST. Immediate LD/ST is further qualified to indicate if it operates in element-strided or unit-strided mode. However Indexed LD/ST is not. + +**IMPORTANT**! when spotting RA=0 in some instructions it is critical to note that the *full **seven** bits* are used (those from EXTRA2/3 included) because RA is no longer only five bits. + +Links: + +* +* +* + +## SVSTATE SPR needed + +This is a peer of MSR but is stored in an SPR. It should be considered part of the state of PC+MSR because SVSTATE is effectively a Sub-PC. + +Chosen values, fitting with v3.1 / v3.0C p12 "Sandbox" guidelines: + + num name priv width + 704,SVSTATE,no,no,32 + 720,SVSRR0,yes,yes,32 + +Progress: + +* ISACaller: done +* power-gem5: TODO +* TestIssuer: TODO +* Microwatt: TODO + +* + +## Adding SVSTATE "set/get" support for hw/sw debugging + +This includes adding DMI get/set support in hardware as well as gdb (remote) support. + +* LibreSOC DMI/JTAG: TODO +* Microwatt DMI: TODO +* power-gem5 remote gdb: TODO +* TestIssuer: DONE (read-only at least) + +Links: + +* + +## sv.setvl + +a [[sv/setvl]] instruction is needed, which also implements [[sv/sprs]] i.e. primarily the `SVSTATE` SPR. the dual-access SPRs for VL and MVL which mirror into the SVSTATE.VL and SVSTATE.MVL fields are not immediately essential to implement. + +* LibreSOC OpenPOWER wiki fields/forms: DONE. pseudocode: TODO +* ISACaller: TODO +* power-gem5: TODO +* TestIssuer: TODO +* Microwatt: TODO + +Links: + +## SVSRR0 for exceptions + +SV's SVSTATE context is effectively a Sub-PC. On exceptions the PC is saved into SRR0: it should come as no surprise that SVSTATE must be treated exactly the same. SVSRR0 therefore is added to the list to be saved/restored in **exactly** the same way and time as SRR0 and SRR1. This is fundamental and absolutely critical to view SVSTATE as a full peer of PC (CIA, NIA). + +* ISACaller: TODO unit test +* power-gem5: TODO +* TestIssuer: TODO +* Microwatt: TODO + +* added ISACaller SVSRR0 save + +## Illegal instruction exceptions + +Anything not listed as SVP64 extended must raise an illegal exception if prefixed. setvl, branch, mtmsr, mfmsr at the minimum. + +* ISACaller: TODO +* power-gem5: TODO +* TestIssuer: TODO +* Microwatt: TODO + +## VL for-loop + +main SV for-loop, as a FSM, updating `SVSTATE.srcstep`, using it as the index in the for-loop from 0 to VL-1. Register numbers are incremented by one if marked as vector. + +*This loop goes in between decode and issue phases*. It is as if there were multiple sequential instructions in the instruction stream *and the loop must be treated as such*. Specifically: all register read and write hazards **MUST** be respected; the Program Order must be respected even though and especially because this is Sub-PC execution. + +This **includes** any exceptions, hence why SVSTATE exists and why SVSRR0 must be used to store SVSTATE alongside when SRR0 and SRR1 store PC and MSR. + +Due to the need for exceptions to occur in the middle, the loop should *not* be implemented as an actual for-loop, whilst recognising that optimised implementations may do multi-issue element execution as long as Program Order is preserved, just as it would be for the PC. + +* ISACaller: DONE, first revision +* power-gem5: TODO +* TestIssuer: + - part done + - done +* Microwatt: TODO + +Remember the following register files need to have for-loops, plus +unit tests: + +* GPR +* SPRs (yes, really: mtspr and mfspr are SV Context-extensible) +* Condition Registers. see note below +* FPR (if present) + +When Rc=1 is encountered in an SVP64 Context the destination is different (TODO) i.e. not CR0 or CR1. Implicit Rc=1 Condition Registers are still Vectorised but do **not** have EXTRA2/3 spec adjustments. The only part if the EXTRA2/3 spec that is observed and respected is whether the CR is Vectorised (isvec). + +## Increasing register file sizes + +TODO. INTs, FPs, CRs, these all increase to 128. Welcome To Vector ISAs. + +At the same time the `Rc=1` CR offsets normslly CR0 and CR1 for fixed and FP scalar may also be adjusted. + +## Single and Twin Predication + +both CR and INT predication is needed, as well as zeroing in both. +the order is best done as follows: + +* INT-based single +* CR-based single +* srcstep+dststep +* INT-based twin +* CR-based twin +* Zeroing single +* Zeroing twin + +Best done as a FSM that "advances" srcstep and dststep over the +zeros in their respective predicate masks, *including* when the +src and dest predicate mask is "All 1s". + +Bear in mind that srcstep+deststep are a form of back-to-back +VGATHER+VSCATTER + +Watch out in zeroing! CR0 will *not* be set (itself) to zero: +the CR0.eq flag will be set because the *result* is still tested. +correction: CR0-and-any-other-Vector-of-CR-fields (Vector elements +have their corresponding CR field, so the test of zero needs to +be done for the associated *element* result, not jam absolutely +every element vector test *into* CR0) + +Progress: + +* TestIssuer + and Zeroing +* ISACaller +* power-gem5: TODO +* Microwatt: TODO + +## Element width overrides + + + +* Pseudocode: TODO +* Simulator: TODO +* TestIssuer: TODO +* unit tests: TODO +* power-gem5: TODO +* cavatools: TODO + +## Reduce Mode + +TODO + +## Saturation Mode + +TODO + +## REMAP and Context Propagation + +* +* +* + +## Vectorised Branches + +TODO [[sv/branches]] + +## Vectorised LD/ST + +TODO [[sv/ldst]]