+++ /dev/null
-# Resolving ISA conflicts and providing a pain-free RISC-V Standards Upgrade Path
-
-**Note: out-of-date as of review 31apr2018, requires updating to reflect
-"mvendorid-marchid-isamux" concept.** Recent discussion 10jun2019
-<https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/x-uFZDXiOxY/_ISBs1enCgAJ>.
-Now updated with its own spec [[isamux_isans]].
-
-[[!toc ]]
-
-## Executive Summary (old, still relevant for compilers)
-
-A non-invasive, backwards-compatible change is proposed: mvendorid and
-marchid being read-only is taken as a formal declaration that an
-architecture has no Custom Extensions, and the two CSRs are permitted to
-be WARL in order to support multiple simultaneous architectures on the
-same processor (or per hart or harts). This not only permits backwards
-and forwards compatibility with existing implementations of the RISC-V
-Standard and seamless transitions to future versions of the RISC-V
-Standard (something that is not possible at the moment), but also fixes
-the problem of clashes in Custom Extension opcodes on a global, worldwide,
-permanent and ongoing basis.
-
-Summary of impact and benefits:
-
-* Implementation impact for existing implementations (even though
- the Standard is not finalised) is zero.
-* Impact for future implementations compliant with (only one) version of the
- RISC-V Standard is zero.
-* The benefit for implementations complying with (one or more) versions
- of the RISC-V Standard is increased customer acceptance, due to
- a smooth upgrade path at the customer's pace and initiative vis-a-vis
- legacy proprietary software.
-* The benefits for implementations deploying multiple Custom Extensions
- are a massive reduction in NREs, hugely reduced ongoing software
- toolchain maintenance costs, and security updates
- from upstream software sources. These follow from
- *globally unique identifying information* resulting in zero binary
- encoding conflicts in the toolchains and resultant binaries,
- *even for Custom Extensions*.
-
-## Introduction
-
-In a lengthy thread that ironically was full of conflict indicative
-of the future direction in which RISC-V will go if left unresolved,
-multiple Custom Extensions were noted to be permitted free rein to
-introduce global binary-encoding conflict with no means of resolution
-described or endorsed by the RISC-V Standard: a practice that has known
-disastrous and irreversible consequences for any architecture that
-permits such practices (1).
-
-Much later on in the discussion it was realised that there is also no way
-within the current RISC-V Specification to transition to improved versions
-of the standard, regardless of whether the fixes are absolutely critical
-show-stoppers or whether they are just keeping the standard up-to-date (2).
-
-With no transition path there is guaranteed to be tension and conflict
-within the RISC-V Community over whether revisions should be made:
-should existing legacy designs be prioritised, mutually exclusively, over
-future designs? What happens during the transition period is absolute
-chaos, with the compiler toolchain, software ecosystem and ultimately
-the end-users bearing the full brunt of the impact. If several
-overlapping revisions are required that have not yet transitioned out
-of use (which could take well over two decades to occur) the situation
-becomes disastrous for the credibility of the entire RISC-V ecosystem.
-
-It was also pointed out that Compliance is an extremely important factor
-to take into consideration, and that Custom Extensions (as being optional)
-effectively and quite reasonably fall entirely outside of the scope of
-Compliance Testing. At this point in the discussion, however, the stark
-problem that the *mandatory* RISC-V Specification also faces had not yet
-been noted: there is no transitional way to bring in
-show-stopping critical alterations.
-
-To put this into perspective, just taking into account hardware costs
-alone: with production mask charges for 28nm being around USD $1.5m,
-engineering development costs and licensing of RTLs for peripherals
-being of a similar magnitude, no manufacturer is going to back away
-from selling a "flawed" or "legacy" product (whether it complies with
-the RISC-V Specification or not) without a bitter fight.
-
-It was also pointed out that there will be significant software tool
-maintenance costs for manufacturers, meaning that the probability will
-be extremely high that they will refuse to shoulder such costs, and
-will publish and continue to publish (and use) hopelessly out-of-date
-unpatched tools. This practice is well-known to result in security
-flaws going unpatched, with one of many immediate undesirable consequences
-being that product in extremely large volume gets discarded into landfill.
-
-**All and any of the issues that were discussed, and all of those that
-were not, can be avoided by providing a hardware-level runtime-enabled
-forwards and backwards compatible transition path between *all* parts
-(mandatory or not) of current and future revisions of the RISC-V ISA
-Standard.**
-
-The rest of the discussion, indicative as it was of the stark, mutually
-exclusive gap faced by a RISC-V ISA Standard that does not cope with
-the problem, was an effort by two groups in two clear
-camps: one that wanted things to remain as they are, and another that
-made efforts to point out that the consequences of not taking action
-are clearly extreme and irreversible. Unfortunately, given the
-severity, some of the first group were unable to believe this, despite
-clear historical precedent for the exact same mistake being made in
-other architectures, and the consequences for them being absolutely
-clear.
-
-However after a significant amount of time, certain clear requirements came
-out of the discussion:
-
-* Any proposal must be a minimal change with minimal (or zero) impact
-* Any proposal should place no restriction on existing or future
- ISA encoding space
-* Any proposal should take into account that there are existing implementors
- of the (yet to be finalised but still "partly frozen") Standard who may
- resist, for financial investment reasons, efforts to make any change
- (at all) that could cost them immediate short-term profits.
-
-Several proposals were put forward (and some are still under discussion):
-
-* "Do nothing": problem is not severe: no action needed.
-* "Do nothing": problem is out-of-scope for RISC-V Foundation.
-* "Do nothing": problem complicates Compliance Testing (and is out of scope)
-* "MISA": the MISA CSR enables and disables extensions already: use that
-* "MISA-like": a new CSR which switches in and out new encodings
- (without destroying state)
-* "mvendorid/marchid WARL": switching the entire "identity" of a machine
-* "ioctl-like": an OO proposal based around the Linux kernel "ioctl" system.
-
-Each of these will be discussed below in their own sections.
-
-# Do nothing (no problem exists)
-
-(Summary: not an option)
-
-There were several solutions offered that fell into this category.
-A few of them are listed in the introduction; more are listed below,
-and it was exhaustively (and exhaustingly) established that none of
-them are workable.
-
-Initially it was pointed out that Fabless Semiconductor companies could
-simply license multiple Custom Extensions and a suitable RISC-V core, and
-modify them accordingly. The Fabless Semi Company would be responsible
-for paying the NREs on re-developing the test vectors (as the extension
-licensers would be extremely unlikely to do that without payment), and
-given that said Companies have an "integration" job to do, it would
-be reasonable to expect them to have such additional costs as well.
-
-The costs of this approach were outlined and discussed as being
-disproportionate and extreme compared to the actual likely cost of
-licensing the Custom Extensions in the first place. Additionally it
-was pointed out that not only hardware NREs would be involved but
-custom software tools (compilers and more) would also be required
-(and maintained separately, on the basis that upstream would not
-accept them except under extreme pressure, and then only with
-prejudice).
-
-All similar schemes involving customisation of the custom extensions
-were likewise rejected, but not before the customisation process was
-mistakenly conflated with the *normal* integration process of developing
-a custom processor (Bus Architectures, Cache layouts, peripheral layouts).
-
-The most compelling hardware-related reason (excluding the severe impact on
-the software ecosystem) for rejecting the customisation-of-customisation
-approach was the case where Extensions were using an instruction encoding
-space (48-bit, 64-bit) *greater* than that which the chosen core could
-cope with (32-bit, 48-bit).
-
-Overall, none of the options presented were feasible. In addition,
-with no clear leadership from the RISC-V Foundation on how to avoid
-global world-wide encoding conflict, even if they were followed through
-they would still result in the failure of the RISC-V ecosystem due to
-irreversible, globally conflicting ISA binary-encoding meanings (PowerPC's
-AltiVec / SPE nightmare).
-
-This is in addition to the case where the RISC-V Foundation wishes to
-make a critical show-stopping fix to the Standard, post-release,
-where billions of dollars have been spent on deploying RISC-V in the
-field.
-
-# Do nothing (out of scope)
-
-(Summary: may not be RV Foundation's "scope", still results in
-problem, so not an option)
-
-This was one of the first arguments presented: The RISC-V Foundation
-considers Custom Extensions to be "out of scope"; that "it's not their
-problem, therefore there isn't a problem".
-
-The logical errors in this argument were quickly enumerated: namely that
-the RISC-V Foundation is not in control of the uses to which RISC-V is
-put, such that public global conflicts in binary-encoding are a hundred
-percent guaranteed to occur (*outside* of the control and remit of the
-RISC-V Foundation), and a hundred percent guaranteed to occur in
-*commodity* hardware where Debian, Fedora, SUSE and other distros will
-be hardest hit by the resultant chaos, and that will just be the more
-"visible" aspect of the underlying problem.
-
-# Do nothing (Compliance too complex, therefore out of scope)
-
-(Summary: may not be RV Foundation's "scope", still results in
-problem, so not an option)
-
-The summary here was that Compliance testing of Custom Extensions is
-not just out-of-scope, but even if it was taken into account that
-binary-encoding meanings could change, it would still be out-of-scope.
-
-However at the time that this argument was made, the impact that
-revisions to the Standard would have, when billions of dollars worth of
-(older, legacy) RISC-V hardware had already been deployed, had not yet
-been fully appreciated.
-
-Interestingly, two diametrically opposed yet equally valid arguments exist here:
-
-* Whilst Compliance testing of Custom Extensions is definitely legitimately
- out of scope, Compliance testing of simultaneous legacy (old revisions of
- ISA Standards) and current (new revisions of ISA Standard) definitely
- is not. Efforts to reduce *Compliance Testing* complexity are therefore
- "Compliance Tail Wagging Standard Dog".
-* Beyond a certain threshold, complexity of Compliance Testing is so
- burdensome that it risks outright rejection of the entire Standard.
-
-Meeting these two diametrically-opposed perspectives requires that the
-solution be very, very simple.
-
-# MISA
-
-(Summary: MISA not suitable, leads to better idea)
-
-MISA permits extensions to be disabled by masking out the relevant bit.
-Hypothetically it could be used to disable one extension, then enable
-another that happens to use the same binary encoding.
-
-*However*:
-
-* MISA Extension disabling is permitted (optionally) to **destroy**
- the state information. Thus it is totally unsuitable for cases
- where instructions from different Custom extensions are needed in
- quick succession.
-* MISA was only designed to cover Standard Extensions.
-* There is nothing to prevent multiple Extensions being enabled
- that wish to simultaneously interpret the same binary encoding.
-* There is nothing in the MISA specification which permits
- *future* versions (bug-fixes) of the RISC-V ISA to be "switched in".
-
-Overall, whilst the MISA concept is a step in the right direction it's
-a hundred percent unsuitable for solving the problem.
-
-# MISA-like
-
-(Summary: basically same as mvend/march WARL except needs an extra CSR where
-mv/ma doesn't. Along right lines, doesn't meet full requirements)
-
-Out of the MISA discussion came a "MISA-like" proposal, which would
-take into account the flaws pointed out by trying to use "MISA":
-
-* The MISA-like CSR's meaning would be identified by compilers using the
- mvendor-id/march-id tuple as a compiler target
-* Each custom-defined bit of the MISA-like CSR would (mutually-exclusively)
- redirect binary encoding(s) to specific encodings
-* No Extension would *actually* be disabled: its internal state would
- be left on (permanently) so that switching of ISA decoding
- could be done inside inner loops without adverse impact on
- performance.
-
-Whilst it was the first "workable" solution it was also noted that the
-scheme is invasive: it requires an entirely new CSR to be added
-to the privileged spec (thus making existing implementations redundant).
-This does not fulfil the "minimum impact" requirement.
-
-Also of interest: around the same time, an additional discussion was
-raised that covered the *compiler* side of the same equation. This
-revolved around using mvendorid-marchid tuples at the compiler level,
-to be put into assembly output (by gcc), preserving the required
-*globally* unique identifying information for binutils to successfully
-turn the custom instruction into an actual binary-encoding (plus
-binary-encoding of the context-switching information). (**TBD, Jacob,
-separate page? review this para?**)
-
-# mvendorid/marchid WARL <a name="mvendor_marchid_warl"></a>
-
-(Summary: the only idea that meets the full requirements. Needs
- toolchain backup, but only when the first chip is released)
-
-This proposal has full details at the following page:
-[[mvendor_march_warl]]
-
-Coming out of the software-related proposal by Jacob Bachmeyer, which
-hinged on the idea of a globally-maintained gcc / binutils database
-that kept and coordinated architectural encodings (curated by the Free
-Software Foundation), was the idea to quite simply give the mvendorid and
-marchid CSRs WARL (writeable) characteristics. Read-only is taken to
-mean a declaration of "Having no Custom Extensions" (a zero-impact
-change).
-
-By making mvendorid-marchid tuples WARL the instruction decode phase
-may re-route mutually-exclusively to different engines, thus providing
-a controlled means and method of supporting multiple (future, past and
-present) versions of the **Base** ISA, Custom Extensions and even
-completely foreign ISAs in the same processor.
-
-This incredibly simple non-invasive idea has some unique and distinct
-advantages over other proposals:
-
-* Existing designs - even though the specification is not finalised
- (but has "frozen" aspects) - would be completely unaffected: the
- change is to the "wording" of the specification to "retrospectively"
- fit reality.
-* Unlike with the MISA idea this is *purely* at the "decode" phase:
- no internal Extension state information is permitted to be disabled,
- altered or destroyed as a direct result of writing to the
- mvendor/march-id CSRs.
-* Compliance Testing may be carried out with a different vendorid/marchid
- tuple set prior to a test, allowing a vendor to claim *Certified*
- compatibility with *both* one (or more) legacy variants of the RISC-V
- Specification *and* with a present one.
-* With sufficient care taken in the implementation an implementor
- may have multiple interpretations of the same binary encoding within
- an inner loop, with a single instruction (to the WARL register)
- changing the meaning.
-
-**This is the only one of the proposals that meets the full requirements.**
-
-# Overloadable opcodes <a name="overloadable opcodes"></a>
-
-See [[overloadable opcodes]] for full details, including a description in terms of C functions.
-
-NOTE: under discussion.
-
-==RB 2018-5-1 dropped IOCTL proposal for the much simpler overloadable opcodes proposal==
-
-The overloadable opcode (or xext) proposal allows a non-standard extension to use a documented 20 + 3 bit (or 52 + 3 bit on RV64) UUID identifier for an instruction for _software_ to use. At runtime, a cpu translates the UUID to a small implementation-defined 12 + 3 bit identifier for _hardware_ to use. It also defines a fallback mechanism for the UUIDs of instructions the cpu does not recognise.
-
-The overloadable opcodes proposal defines 8 standardised R-type instructions xcmd0, xcmd1, ...xcmd7 preferably in the brownfield opcode space.
-Each xcmd takes in rs1 a 12 bit "logical unit" (lun) identifying a device on the cpu that implements some "extension interface" (xintf), together with some additional data. An xintf is a set of up to 8 commands with 2 input ports and 1 output port (i.e. like an R-type instruction), together with a description of the semantics of the commands. Calling e.g. xcmd3 routes its two input ports and one output port to command 3 on the device determined by the lun bits in rs1. Thus, the 8 standard xcmd instructions are standard-designated overloadable opcodes, with the non-standard semantics of the opcode determined by the lun.
-
-Portable software does not use luns directly. Instead, it goes through a level of indirection using a further instruction xext that translates a 20 bit globally unique identifier (UUID) of an xintf to the lun of a device on the cpu that implements that xintf. The cpu can do this because it knows (at manufacturing or boot time) which devices it has, and which xintfs they provide. This includes devices that would be described as non-standard extensions of the cpu if the designers had used custom opcodes instead of xintf as an interface. If the UUID of the xintf is not recognised at the current privilege level, the xext instruction returns the special lun = 0, causing any xcmd to trap. Minor variations of this scheme (requiring two more instructions) cause xcmd instructions to fall back to always returning 0 or -1 instead of trapping.
-
-The 20 bits provided by the UUID of the xintf give much more room than the 2 custom 32 bit, or even 4 custom 64/48 bit opcode spaces. Thus the overloadable opcodes proposal avoids most of the need to put a claim on opcode space and the associated collisions when combining independent extensions. In this respect it is similar to POSIX ioctls, which obviate the need for defining new syscalls to control new and nonstandard hardware.
-
-Remark1: the main difference with a previous "ioctl like proposal" is that UUID translation is stateless and does not use resources. The xext instruction _neither_ initialises a device _nor_ builds global state identified by a cookie. If a device needs initialisation it can do this using xcmds as init and deinit instructions. Likewise, it can hand out cookies (which can include the lun) as a return value.
-
-Remark2: Implementing devices can respond to an (essentially) arbitrary number of xintfs. Hence an implementing device can respond to an arbitrary number of commands. Organising related commands in xintfs helps avoid UUID space pollution, and allows the (small) cost of UUID to lun translation to be amortised if related commands are used in combination.
-
-==RB not sure if this is still correct and relevant==
-
-The proposal is functionally similar to that of the mvendor/march-id
-except the non standard extension is explicit and restricted to a small set of well defined individual opcodes.
-Hence several extensions can be mixed and there is no state to be tracked over context switches.
-As such it could hypothetically be proposed as an independent Standard Extension.
-
-Despite the proposal (which is still undergoing clarification)
-being worthwhile in its own right, and standing on its own merits and
-thus definitely worthwhile pursuing, it is non-trivial and more
-invasive than the mvendor/march-id WARL concept.
-
-==RB==
-
-# Dynamic runtime hardware-adjustable custom opcode encodings <a name="dynamic_opcodes"></a>
-
-Perhaps this is a misunderstanding of what is being advocated
-below (see link for full context):
-
-> The best that can be done is to allow each custom extension to have
-> its opcodes easily re positioned depending on what other custom extensions
-> the user wants available in the same program (without mode switches).
-
-It was suggested to use markers in the object files as a way to
-identify opcodes that can be "re-encoded". Contrast this with Jacob
-Bachmeyer's original idea where the *assembly code* (only) contains
-such markers (on a global world-wide unique basis, using mvendorid-marchid-isamux
-tuples to do so).
-
-<https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/Jnon96tVQD0/XuHWvduvDQAJ>
-
-There are two possible interpretations of this:
-
-* (1) the Hardware RTL is reconfigurable (parameterisable) to allow
- easy selection of *static* moving (adjustment) of which opcodes a
- particular instruction uses. This runs into the same difficulties
- as outlined in other areas of this document.
-* (2) the Hardware RTL contains DYNAMIC and RUN-TIME CONFIGURABLE
- opcodes (presumably using additional CSRs to move meanings)
-
-This would help any implementation to adjust to whatever future (official)
-use a particular encoding was selected for. It would be particularly useful
-if an implementation used certain brownfield encodings.
-
-The only downsides are:
-
-* (1) Compiler support for dynamic opcode reconfiguration would be...
- complex.
-* (2) The instruction decode phase is also made more complex, now
- involving reconfigurable lookup tables. Whilst major opcodes
- can be easily redirected, brownfield encodings are more involved.
-
-Compared to a stark choice of having to move (exclusively) to 48-bit
-or 64-bit encodings, dynamic runtime opcode reconfiguration is
-comparatively much more palatable.
-
-In effect, it is a much more advanced version of ISAMUX/NS
-(see [[isamux_isans]]).
-
-# Comments, Discussion and analysis
-
-TBD: placeholder as of 26apr2018
-
-## new (old) m-a-i tuple idea
-
-> actually that's a good point: where the user decides that they want
-> to boot one and only one tuple (for the entire OS), forcing a HARDWARE
-> level default m-a-i tuple at them actually prevents and prohibits them
-> from doing that, Jacob.
->
-> so we have apps on one RV-Base ISA and apps on an INCOMPATIBLE (future)
-> variant of RV-Base ISA. with the approach that i was advocating (S-mode
-> does NOT switch automatically), there are totally separate mtvec /
-> stvec / bstvec traps.
->
-> would it be reasonable to assume the following:
->
-> (a) RV-Base ISA, particularly code-execution in the critical S-mode
-> trap-handling, is *EXTREMELY* unlikely to ever be changed, even thinking
-> 30 years into the future ?
->
-> (b) if the current M-mode (user app level) context is "RV Base ISA 1"
-> then i would hazard a guess that S-mode is prettty much going to drop
-> down into *exactly* the same mode / context, the majority of the time
->
-> thus the hypothesis is that not only is it the common code-path to *not*
-> switch the ISA in the S-mode trap but that the instructions used are
-> extremely unlikely to be changed between "RV Base Revisions".
->
-> foreign isa hardware-level execution
-> ------------------------
->
-> this is the one i've not really thought through so much, other than it
-> would clearly be disadvantageous for S-mode to be arbitrarily restricted
-> to running RV-Base code (of any variant). a case could be made that by the
-> time the m-a-i tuple is switched to the foreign isa it's "all bets off",
-> foreign arch is "on its own", including having to devise a means and
-> method to switch back (equivalent in its ISA of m-a-i switching).
->
-> conclusion / idea
-> --------------------
->
-> the multi-base "user wants to run one and only one tuple" is the key
-> case, here, that is a show-stopper to the idea of hard-wiring the default
-> S-mode m-a-i.
->
-> now, if instead we were to say, "ok so there should be a default S-mode
-> m-a-i tuple" and it was permitted to SET (choose) that tuple, *that*
-> would solve that problem. it could even be set to the foreign isa.
-> which would be hilarious.
-
-jacob's idea: one hart, one configuration:
-
->>> (a) RV-Base ISA, particularly code-execution in the critical S-mode
->>> trap-handling, is *EXTREMELY* unlikely to ever be changed, even
->>> thinking 30 years into the future ?
->>
->> Oddly enough, due to the minimalism of RISC-V, I believe that this is
->> actually quite likely. :-)
->>
->>> thus the hypothesis is that not only is it the common code-path to
->>> *not* switch the ISA in the S-mode trap but that the instructions used
->>> are extremely unlikely to be changed between "RV Base Revisions".
->>>
->> Correct. I argue that S-mode should *not* be able to switch the selected
->> ISA on multi-arch processors.
->
-> that would produce an artificial limitation which would prevent
-> and prohibit implementors from making a single-core (single-hart)
-> multi-configuration processor.
-
-
-
-# Summary and Conclusion
-
-In the early sections (those in the category "no action") it was established
-in each case that the problem is not solved. Avoidance of responsibility,
-or conflation of "not our problem" with "no problem" does not make "problem"
-go away. Even "making it the Fabless Semiconductor's design problem" resulted
-in a chip being *more costly to engineer as hardware **and** more costly
-from a software-support perspective to maintain*... without actually
-fixing the problem.
-
-The first idea considered which could fix the problem was to just use
-the pre-existing MISA CSR; however, this was determined not to have
-the right coverage (Standard Extensions only), and also, crucially, it
-destroyed state. Whilst unworkable, it did lead to the first "workable"
-solution, "MISA-like".
-
-The "MISA-like" proposal, whilst meeting most of the requirements, led to
-a better idea: "mvendor/march-id WARL", which, in combination with an offshoot
-idea related to gcc and binutils, is the only proposal that fully meets the
-requirements.
-
-The "ioctl-like" idea *also* solves the problem, but, unlike the WARL idea
-does not meet the full requirements to be "non-invasive" and "backwards
-compatible" with pre-existing (pre-Standards-finalised) implementations.
-It does however stand on its own merit as a way to extend the extremely
-small Custom Extension opcode space, even if it is itself implemented *as*
-a Custom Extension into which *other* Custom Extensions are subsequently
-shoe-horned. This approach has the advantage that it requires no "approval"
-from the RISC-V Foundation... but without the RISC-V Standard "approval"
-guaranteeing no binary-encoding conflicts, still does not actually solve the
-problem (if deployed as a Custom Extension for extending Custom Extensions).
-
-Overall, the mvendor/march-id WARL idea is the only idea that meets
-all three requirements:
-
-* **Any proposal must be a minimal change with minimal (or zero) impact**
- (met through being purely a single backwards-compatible change to the
- wording of the specification: mvendor/march-id changes from read-only
- to WARL)
-* **Any proposal should place no restriction on existing or future
- ISA encoding space**
- (met because it is just a change to one pre-existing CSR, as opposed
- to requiring additional CSRs or requiring extra opcodes or changes
- to existing opcodes)
-* **Any proposal should take into account that there are existing implementors
- of the (yet to be finalised but still "partly frozen") Standard who may
- resist, for financial investment reasons, efforts to make any change
- (at all) that could cost them immediate short-term profits.**
- (met because existing implementations, with the exception of those
- that have Custom Extensions, come under the "vendor/arch-id read only
- is a formal declaration of an implementation having no Custom Extensions"
- fall-back category)
-
-So to summarise:
-
-* The consequences of not tackling this are severe: the RISC-V Foundation
- cannot take a back seat. If it does, clear historical precedent shows
- 100% what the outcome will be (1).
-* Making the mvendorid and marchid CSRs WARL solves the problem in a
- minimal to zero-disruptive backwards-compatible fashion that provides
- indefinite transparent *forwards*-compatibility.
-* The retro-fitting cost onto existing implementations (even though the
- specification has not been finalised) is zero to negligible
- (only changes to words in the specification required at this time:
- no vendor need discard existing designs, either being designed,
- taped out, or actually in production).
-* The benefits are clear (pain-free transition path for vendors to safely
- upgrade over time; no fights over Custom opcode space; no hassle for
- software toolchain; no hassle for GNU/Linux Distros)
-* The implementation details are clear (and problem-free except for
- vendors who insist on deploying dozens of conflicting Custom Extensions:
- an extremely unlikely outlier).
-* Compliance Testing is straightforward and allows vendors to seek and
- obtain *multiple* Compliance Certificates with past, present and future
- variants of the RISC-V Standard (in the exact same processor,
- simultaneously), in order to support end-customer legacy scenarios and
- provide the same with a way to avoid "impossible-to-make" decisions that
- throw out ultra-costly multi-decade-investment in proprietary legacy
- software at the same time as the (legacy) hardware.
-
--------
-
-# Conversation Excerpts
-
-The following conversation excerpts are taken from the ISA-dev discussion.
-
-## (1) Albert Calahan on SPE / AltiVec conflict in PowerPC
-
-> Yes. Well, it should be blocked via legal means. Incompatibility is
-> a disaster for an architecture.
->
-> The viability of PowerPC was badly damaged when SPE was
-> introduced. This was a vector instruction set that was incompatible
-> with the AltiVec instruction set. Software vendors had to choose,
-> and typically the choice was "neither". Nobody wants to put in the
-> effort when there is uncertainty and a market fragmented into
-> small bits.
->
-> Note how Intel did not screw up. When SSE was added, MMX remained.
-> Software vendors could trust that instructions would be supported.
-> Both MMX and SSE remain today, in all shipping processors. With very
-> few exceptions, Intel does not ship chips with missing functionality.
-> There is a unified software ecosystem.
->
-> This goes beyond the instruction set. MMU functionality also matters.
-> You can add stuff, but then it must be implemented in every future CPU.
-> You can not take stuff away without harming the architecture.
-
-## (2) Luke Kenneth Casson Leighton on Standards backwards-compatibility
-
-> For the case where "legacy" variants of the RISC-V Standard are
-> backwards-forwards-compatibly supported over a 10-20 year period in
-> Industrial and Military/Goverment-procurement scenarios (so that the
-> impossible-to-achieve pressure is off to get the spec ABSOLUTELY
-> correct, RIGHT now), nobody would expect a seriously heavy-duty amount
-> of instruction-by-instruction switching: it'd be used pretty much once
-> and only once at boot-up (or once in a Hypervisor Virtual Machine
-> client) and that's it.
-
-## (3) Allen Baum on Standards Compliance
-
-> Putting my compliance chair hat on: One point that was made quite
-> clear to me is that compliance will only test that an implementation
-> correctly implements the portions of the spec that are mandatory, and
-> the portions of the spec that are optional and the implementor claims
-> it is implementing. It will test nothing in the custom extension space,
-> and doesn't monitor or care what is in that space.
-
-## (4) Jacob Bachmeyer on explaining disambiguation of opcode space
-
-> ...have different harts with different sets of encodings.) Adding a "select"
-> CSR as has been proposed does not escape this fundamental truth that
-> instruction decode must be unambiguous, it merely expands every opcode with
-> extra bits from a "select" CSR.
-
-## (5) Krste Asanovic on clarification of use of opcode space
-
-> A CPU is even free to reuse some standard extension encoding space for
-> non-standard extensions provided it does not claim to implement that
-> standard extension.
-
-## (6) Clarification of difference between assembler and encodings
-
-> > The extensible assembler database I proposed assumes that each processor
-> > will have *one* and *only* one set of recognized instructions. (The "hidden
-> > prefix" is the immutable vendor/arch/impl tuple in my proposals.)
->
-> ah this is an extremely important thing to clarify, the difference
-> between the recognised instruction assembly mnemonic (which must be
-> globally world-wide accepted as canonical) and the binary-level encodings
-> of that mnemonic used by different vendor implementations which will most
-> definitely *not* be unique but require "registration" in the form of
-> atomic acceptance as a patch by the FSF to gcc and binutils [and other
-> compiler tools].
-
-
-# References
-
-* <https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/7bbwSIW5aqM>
-* <https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/InzQ1wr_3Ak%5B1-25%5D>
-* Review mvendorid-marchid WARL <https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/Uvy9paXN1xA>
+++ /dev/null
-The ioctls proposal was a precursor of the [[overloadable opcodes]] proposal. Please see there.
-
+++ /dev/null
-# Note-form on ISAMUX (aka "ISANS")
-
-Links:
-
-* <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2020-February/004190.html>
-* bugreport <http://bugs.libre-riscv.org/show_bug.cgi?id=214>
-
-A fixed number of additional (hidden) bits, conceptually a "namespace",
-set by way of a CSR or other out-of-band mechanism,
-that go directly and non-optionally
-into the instruction decode phase, extending (in each implementation) the
-opcode length to 16+N, 32+N, 48+N, where N is a hard fixed quantity on
-a per-implementor basis.
-
-Where the opcode is normally loaded from the location at the PC, the extra
-bits, set via a CSR, are mandatorily appended to every instruction: hence why they are described as "hidden" opcode bits, and as a "namespace".
-
-The parallels with C++ "using namespace" are direct and clear.
-Alternative conceptual ways to understand this concept include
-"escape-sequencing".
-
-TODO: reserve some bits which permit the namespace (escape-sequence) to
-be relevant for a fixed number of instructions at a time. Caveat:
-allowing such a countdown to cross branch-points is unwise (illegal
-instruction?)
-
-An example of a pre-existing "namespace" switch that has been in
-prevalent use for several decades (SPARC and other architectures):
-dynamic runtime selectability of little-endian / big-endian "meaning"
-of instructions by way of a "mode switch" instruction (of some kind).
-
-That "switch" is in effect a 33rd (hidden) bit that is part of the opcode,
-going directly into the mux / decode phase of instruction decode, and
-thus qualifies categorically as a "namespace". This proposal both formalises
-and generalises that concept.
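-
-As a minimal sketch of the "hidden bits" concept (Python; the namespace
-values, the opcode value and the mnemonic names below are hypothetical and
-purely illustrative), decode can be modelled as a lookup keyed on the
-namespace *and* the fetched opcode together:
-
-```python
-# Hypothetical decode table: the key is (namespace, opcode), so the same
-# 32-bit opcode may legitimately mean different things in different namespaces.
-DECODE_TABLE = {
-    (0b0, 0x0000000B): "vendor_a.custom_op",  # illustrative names only
-    (0b1, 0x0000000B): "vendor_b.custom_op",
-}
-
-
-def decode(namespace: int, opcode: int) -> str:
-    """Decode an opcode, treating the hidden namespace bits as extra opcode bits."""
-    try:
-        return DECODE_TABLE[(namespace, opcode)]
-    except KeyError:
-        # No entry for this (namespace, opcode) pair: illegal instruction.
-        raise ValueError("illegal instruction") from None
-```
-
-Two vendors reusing the same opcode no longer collide in this model, because
-the namespace is simply part of the lookup key.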
-
-# Hypothetical Format
-
-Note that this is a hypothetical format, yet TBD, where particular attention
-needs to be paid to the fact that there is an "immediate" version of CSRRW
-(with 5 bits of immediate) that could save a lot of space in binaries.
-
-<pre>
- 3 2 1
-|1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0|
-|------------------------------ |-------|---------------------|-|
-|1 custom custom custom custom custom | foreignarch |1|
-|0 reserved reserved reserved reserved reserved | foreignarch |1|
-|custom | reserved | official|B| rvcpage |0|
-</pre>
-
-RV Mode
-
-* when bit 0 is 0, "RV" mode is selected.
-* in RV mode, bits 1 thru 5 provide up to 16 possible alternative meanings (namespaces) for 16-bit opcodes: "pages", if you will. The top bit indicates custom meanings; when the top bit is 0, the page is for official usage.
-* Bits 15 thru 23 are reserved.
-* Bits 24 thru 31 are for custom usage.
-* bit 6 ("B") is endian-selection: LE/BE
-
-16 bit page examples:
-
-* 0b0000 STANDARD (2019) RVC
-* 0b0001 RVCv2
-* 0b0010 RV16
-* 0b0011 RVCv3
-* ...
-* 0b1000 custom 16 bit opcode meanings 1
-* 0b1001 custom 16 bit opcode meanings 2
-* .....
-
-Foreign Arch Mode
-
-* when bit 0 is 1, "Foreign arch" mode is selected.
-* Bits 1 thru 7 are a table of foreign arches.
-* when the MSB is 1, this is for custom use.
-* when the MSB is 0, bits 1 thru 6 are reserved for 64 possible official foreign archs.
-
-Foreign archs could be (examples):
-
-* 0b0000000 x86_32
-* 0b0000001 x86_64
-* 0b0000010 MIPS32
-* 0b0000011 MIPS64
-* ....
-* 0b0010000 Java Bytecode
-* 0b0010001 N.E.Other Bytecode
-* ....
-* 0b1000000 custom foreign arch 1
-* 0b1000001 custom foreign arch 2
-* ....
-
-Note that "official" foreign archs have a binary value where the MSB is zero,
-and custom foreign archs have a binary value where the MSB is 1.
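-
-The field layout can be sketched as a small decoder (Python). This follows
-the bullet-point descriptions above rather than the diagram, and every name
-in it is hypothetical: the format itself is still TBD, the polarity of the
-"B" bit is an assumption, and the 5-bit page field (top bit = custom) is
-taken from the RV Mode text, not from the 4-bit page examples.
-
-```python
-def decode_isans(value: int) -> dict:
-    """Split a 32-bit ISANS value into the hypothetical fields described above."""
-    if value & 1:  # bit 0 set: Foreign Arch mode
-        arch = (value >> 1) & 0x7F              # bits 1 thru 7
-        return {
-            "mode": "foreign",
-            "arch": arch,
-            "custom": bool(arch & 0x40),        # MSB of the 7-bit table
-        }
-    page = (value >> 1) & 0x1F                  # bits 1 thru 5
-    return {
-        "mode": "rv",
-        "rvc_page": page,
-        "custom_page": bool(page & 0x10),       # top bit of the page field
-        "big_endian": bool((value >> 6) & 1),   # bit 6 ("B"); polarity assumed
-        "custom_bits": (value >> 24) & 0xFF,    # bits 24 thru 31
-    }
-```
-
-For instance, `decode_isans((0b0000001 << 1) | 1)` selects foreign arch table
-entry 1 (x86_64 in the example list above) as an official, non-custom entry.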
-
-# Namespaces are permitted to swap to new state <a name="stateswap"></a>
-
-In each privilege level, on a change of ISANS (whether through manual setting of ISANS or through trap entry or exit changing the ISANS CSR), an implementation is permitted to completely and arbitrarily switch not only the instruction set, it is permitted to switch to a new bank of CSRs (or a subset of the same), and even to switch to a new PC.
-
-This is to occur immediately and atomically at the point at which the change in ISANS occurs.
-
-The most obvious application of this is for Foreign Archs, which may have their own completely separate PC. Thus, foreign assembly code and RISCV assembly code need not be mixed in the same binary.
-
-Further use-cases may be envisaged; however, great care needs to be taken not to cause massive complications for JIT emulation, as the RV ISANS is unary encoded (2^31 permutations).
-
-In addition, the state information of *all* namespaces has to be saved and restored on a context-switch (unless the SP is also switched as part of the state!) which is quite severely burdensome and getting exceptionally complex.
-
-Switching CSR, PC (and potentially SP) and other state on a NS change in the RISCV unary NS therefore needs to be done wisely and responsibly, i.e. minimised!
-
-To be discussed. Context <https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/x-uFZDXiOxY/27QDW5KvBQAJ>
-
-# Privileged Modes / Traps <a name="privtraps"></a>
-
-An additional WLRL CSR per priv-level named "LAST-ISANS" is required, and
-another called "TRAP-ISANS".
-These mirror the ISANS CSR: on a trap, the current ISANS in
-that privilege level is atomically
-transferred into LAST-ISANS by the hardware, and ISANS in that trap
-is set to TRAP-ISANS. Hardware is *only then* permitted to modify the PC to
-begin execution of the trap.
-
-On exit from the trap, LAST-ISANS is copied into the ISANS CSR, and
-LAST-ISANS is set to TRAP-ISANS. *Only then* is the hardware permitted
-to modify the PC to begin execution where the trap left off.
-
-This is identical to how xepc is handled.
-
-Note 1: in the case of Supervisor Mode (context switches in particular),
-saving and changing of LAST-ISANS (to and from the stack) must be done
-atomically and under the protection of the SIE bit. Failure to do so
-could result in corruption of LAST-ISANS when multiple traps occur in
-the same privilege level.
-
-Note 2: question - should the trap due to illegal (unsupported) values
-written into LAST-ISANS occur when the *software* writes to LAST-ISANS,
-or when the *trap* (on exit) writes into LAST-ISANS? this latter seems
-fraught: a trap, on exit, causing another trap??
-
-Per-privilege-level pseudocode (there exists UISANS, UTRAPISANS, ULASTISANS,
-MISANS, MTRAPISANS, MLASTISANS and so on):
-
-<pre>
-trap_entry()
-{
-    LAST_ISANS = ISANS       // record the old NS
-    ISANS = TRAP_ISANS       // traps are executed in "trap" NS
-}
-
-trap_exit()
-{
-    ISANS = LAST_ISANS       // resume in the NS that was interrupted
-    LAST_ISANS = TRAP_ISANS
-}
-</pre>
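-
-The same sequence as a runnable sketch (Python). The class and attribute
-names are hypothetical stand-ins for one privilege level's three CSRs; this
-models only the ordering of the copies, not the PC change or real hardware
-atomicity.
-
-```python
-from dataclasses import dataclass
-
-
-@dataclass
-class PrivLevelNS:
-    """Namespace CSRs of one privilege level (e.g. UISANS, UTRAPISANS, ULASTISANS)."""
-    isans: int = 0
-    trap_isans: int = 0
-    last_isans: int = 0
-
-    def trap_entry(self) -> None:
-        self.last_isans = self.isans    # record the interrupted namespace
-        self.isans = self.trap_isans    # handler runs in the "trap" namespace
-
-    def trap_exit(self) -> None:
-        self.isans = self.last_isans    # resume in the recorded namespace
-        self.last_isans = self.trap_isans
-```
-
-A context switch would overwrite `last_isans` inside the handler (just as it
-overwrites xepc), so that `trap_exit()` resumes a different process in that
-process's own namespace.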
-
-# Alternative RVC 16 Bit Opcode meanings
-
-Here it is appropriate to raise an idea of how to cover RVC and future
-variants, including RV16.
-
-Just as with foreign archs, it
-makes absolutely no sense to try to select RVCv1, v2, v3 and so on
-all simultaneously. A unary bit vector for RVC modes, changing the 16-bit
-opcode space meaning, is wasteful and again has us believe that WARL
-is the "solution".
-
-The correct thing to do is, again, just like with foreign archs, to
-treat RVCs as a *binary* namespace selector. Bits 1 thru 3 would give
-8 possible completely new alternative meanings, just like how the Z80
-and the 286 and 386 used to do bank switching.
-
-All zeros is clearly reserved for the present RVC. 0b001 for RVCv2. 0b010
-for RV16 (look it up) and there should definitely be room reserved here
-for custom reencodings of the 16 bit opcode space.
-
-# FAQ
-
-## Why not have TRAP-ISANS as a vector table, matching mtvec? <a name="trap-isans-vec"></a>
-
-Use case to be determined. Rather than be a global per-priv-level value,
-TRAP-ISANS is a table of length exactly equal to the mtvec/utvec/stvec table,
-with corresponding entries that specify the assembly-code namespace in which
-the trap handler routine is written.
-
-Open question: see <https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/IAhyOqEZoWA/BM0G3J2zBgAJ>
-
-<pre>
-trap_entry(x_cause)
-{
-    LAST_ISANS = ISANS                 // record the old NS
-    ISANS = TRAP_ISANS_VEC[x_cause]    // traps are executed in "trap" NS
-}
-
-trap_exit(x_cause)
-{
-    ISANS = LAST_ISANS
-    LAST_ISANS = TRAP_ISANS_VEC[x_cause]
-}
-</pre>
-
-## Is this like MISA? <a name="misa"></a>
-
-No.
-
-* MISA's space is entirely taken up (and running out).
-* There is no allocation (provision) for custom extensions.
-* MISA switches on and off entire extensions: ISAMUX/NS may be used to switch multiple opcodes (present and future), to alternate meanings.
-* MISA is WARL and is inaccessible from everything but M-Mode (not even readable).
-
-MISA is therefore wholly unsuited to U-Mode usage; ISANS is specifically permitted to be called by userspace to switch (with no stalling) between namespaces, repeatedly and in quick succession.
-
-## What happens if this scheme is not adopted? Why is it better than leaving things well alone? <a name="laissezfaire"></a>
-
-At the first sign of an emergency non-backwards compatible and unavoidable
-change to the *frozen* RISCV *official* Standards, the entire RISCV
-community is fragmented and divided into two:
-
-* Those vendors that are hardware compatible with the legacy standard.
-* Those that are compatible with the new standard.
-
-*These two communities would be mutually incompatible*. If
-a second emergency occurs, RISCV becomes even less tenable.
-
-Hardware that wished to be "compatible" with either flavour would require
-JIT or offline static binary recompilation. No vendor would willingly
-accept this as a condition of the standards divergence in the first place,
-locking up decision making to the detriment of RISCV as a whole.
-
-By providing a "safety valve" in the form of a hidden namespace, at least
-newer hardware has the option to implement both (or more) variations,
-*and still apply for Certification*.
-
-However to also allow "legacy" hardware to at least be JIT soft
-compatible, some very strict rules *must* be adhered to, that appear at
-first sight not to make any sense.
-
-It's complicated in other words!
-
-## Surely it's okay to just tell people to use 48-bit encodings? <a name="use48bit"></a>
-
-Short answer: it doesn't help resolve conflicts, and costs hardware and
-redesigns to do so. Softcores in cost-sensitive embedded applications may
-not even be able to fit the required 48-bit instruction decode engine
-into a (small, ICE40) FPGA. 48-bit instruction decoding is much more complex
-than straight 32-bit decoding, requiring a queue.
-
-Second answer: conflicts can still occur in the (unregulated, custom) 48-bit
-space, which *could* be resolved by ISAMUX/ISANS as applied to the *48* bit
-space in exactly the same way. And the 64-bit space.
-
-## Why not leave this to individual custom vendors to solve on a case by case basis? <a name="case-by-case"></a>
-
-The suggestion was raised that a custom extension vendor could create
-their own CSR that selects between conflicting namespaces that resolve
-the meaning of the exact same opcode. This to be done by all and any
-vendors, as they see fit, with little to no collaboration or coordination
-towards standardisation in any form.
-
-The problems with this approach are numerous when considered in the
-worldwide context that the UNIX Platform, in particular, has to face
-(and that the embedded platform does not).
-
-First: the cost of a lack of coordination, and of the resulting proliferation
-of arbitrary solutions, has to be borne primarily by gcc, binutils, LLVM and other compilers.
-
-Secondly: CSR space is precious. With each vendor likely needing only one
-or two bits to express the namespace collision avoidance, if they make
-even a token effort to use worldwide unique CSRs (an effort that would
-benefit compiler writers), the CSR register space is quickly exhausted.
-
-Thirdly: JIT Emulation of such an unregulated space becomes just as
-much hell as it is for compiler writers. In addition, if two vendors
-use conflicting CSR addresses, the only sane way to tell the emulator
-what to do is to give the emulator a runtime commandline argument.
-
-Fourthly: with each vendor coming up with their own way of handling
-conflicts, not only are the chances of mistakes higher, it is against the
-very principles of collaboration and cooperation that save vendors money
-on development and ongoing maintenance. Each custom vendor will have
-to maintain their own separate hard fork of the toolchain and software,
-which is well known to result in security vulnerabilities.
-
-By coordinating and managing the allocation of namespace bits (unary
-or binary) the above issues are solved. CSR space is no longer wasted,
-compiler and JIT software writers have an easier time, clashes are
-avoided, and RISCV is stabilised and has a trustable long term future.
-
-## Why ISAMUX / ISANS has to be WLRL and mandatory trap on illegal writes <a name="wlrlmandatorytrap"></a>
-
-The namespaces, set by bits in the CSR, are functionally directly
-equivalent to C++ namespaces, even down to the use of braces.
-
-WARL, by allowing implementors to choose the value, prevents and prohibits
-the critical and necessary raising of an exception that would begin the
-JIT process in the case of ongoing standards evolution.
-
-Without this opportunity, an implementation has no reliable guaranteed way
-of knowing when to drop into full JIT mode, which is the only guaranteed way
-to distinguish any given conflicting opcode. It is as if the C++ standard
-were given a similar optional opportunity to completely ignore the
-"using namespace" prefix!
-
---
-
-Ok so I trust it's now clear why WLRL (thanks Allen) is needed.
-
-When Dan raised the WARL concern initially a situation was masked by
-the conflict, that if gone unnoticed would jeopardise ISAMUX/ISANS
-entirely. Actually, two separate errors. So thank you for raising the
-question.
-
-The situation arises when foreign archs are to be given their own NS
-bit. MIPS is allocated bit 8, x86 bit 9, whilst LE/BE is given bit 0,
-RVCv2 bit 1 and so on. All of this potential rather than actual, clearly.
-
-Imagine then that software tries to write and set not just bit 8 and
-bit 9, it also tries to set bit 0 and 1 as well.
-
-This *IS* on the face of it a legitimate reason to make ISAMUX/ISANS WARL.
-
-However it masks a fundamental flaw that has to be addressed, which
-brings us back much closer to the original design of 18 months ago,
-and it's highlighted thus:
-
-x86 and simultaneous RVCv2 modes are total nonsense in the first place!
-
-The solution instead is to have a NS bit (bit0) that SPECIFICALLY
-determines if the arch is RV or not. If 0, the rest of the ISAMUX/ISANS
-is very specifically RV *only*, and if 1, the ISAMUX/ISANS is a *binary*
-table of foreign architectures and foreign architectures only.
-
-Exactly how many bits are used for the foreign arch table, is to
-be determined. 7 bits, one of which is reserved for custom usage,
-leaving a whopping 64 possible "official" foreign instruction sets to
-be hardware-supported/JIT-emulated seems to be sufficiently gratuitous,
-to me.
-
-One of those could even be Java Bytecode!
-
-Now, it could *hypothetically* be argued that the permutation of setting
-LE/BE and MIPS for example is desirable. A simple analysis shows this
-not to be the case: once in the MIPS foreign NS, it is the MIPS hardware
-implementation that should have its own way of setting and managing its
-LE/BE mode, because to do otherwise drastically interferes with MIPS
-binary compatibility.
-
-Thus, it is officially Not Our Problem: only flipping into one foreign
-arch at a time makes sense, thus this has to be reflected in the
-ISAMUX/ISANS CSR itself, completely side-stepping the (apparent) need
-to make the NS CSR WARL (which would not work anyway, as previously
-mentioned).
-
-So, thank you, again, Dan, for raising this. It would have completely
-jeopardised ISAMUX/NS if not spotted.
-
-The second issue is: how does any hardware system, whether it supports
-ISANS or not, and any future hardware that supports some Namespaces
-and, in a transitive fashion, has to support *more* future namespaces
-through JIT emulation, cope if this is not planned properly in advance?
-
-Let us take the simple case first: a current 2019 RISCV fully compliant
-RV64GC UNIX capable system (with mandatory traps on all unsupported CSRs).
-
-Fast forward 20 years, there are now 5 ISAMUX/NS unary bits, and 3
-foreign arch binary table entries.
-
-Such a system is perfectly capable of software JIT emulating ALL of these
-options because the write to the (illegal, for that system) ISAMUX/NS
-CSR generates the trap that is needed for that system to begin JIT mode.
-
-(This again emphasises exactly why the trap is mandatory).
-
-Now let us take the case of a hypothetical system from say 2021 that
-implements RVCv2 at the hardware level.
-
-Fast forward 20 years: if the CSR were made WARL, that system would be
-absolutely screwed. The implementor would be under the false impression
-that ignoring setting of "illegal" bits was acceptable, making the
-transition to JIT mode flat-out impossible to detect.
-
-When this is considered transitively, considering all future additions to
-the NS, and all permutations, it can be logically deduced that there is
-a need to reserve a *full* set of bits in the ISAMUX/NS CSR *in advance*.
-
-i.e. that *right now*, in the year 2019, the entire ISAMUX/NS CSR cannot
-be added to piecemeal: the full 32 (or 64) bits *have* to be reserved,
-and reserved bits set at zero.
-
-Furthermore, if any software attempts to write to those reserved bits,
-it *must* be treated just as if those bits were distinct and nonexistent
-CSRs, and a trap raised.
-
-It makes more sense to consider each NS as having its own completely
-separate CSR. If that CSR does not exist then, as with any unsupported
-CSR, it should be obvious that a trap must be raised (and JIT emulation
-activated).
-
-However given that only the one bit is needed (in RV NS Mode, not
-Foreign NS Mode), it would be terribly wasteful of the CSRs to do this,
-despite it being technically correct and much easier to understand why
-trap raising is so essential (mandatory).
-
-This again should emphasise how to mentally get one's head round this
-mind-bendingly complex problem space: think of each NS bit as its own
-totally separate CSR that every implementor is free and clear to implement
-(or leave to JIT Emulation) as they see fit.
-
-Only then does the mandatory need to trap on write really start to hit
-home, as does the need to preallocate a full set of reserved zero values
-in the RV ISAMUX/NS.
-
-Lastly, I *think* it's ok to only reserve say 32 bits, and, in 50 years
-time if that genuinely is not enough, start the process all over again
-with a new CSR. ISAMUX2/NS2.
-
-Subdivision of the RV NS (support for RVCv3/4/5/RV16 without wasting
-precious CSR bits) is best left for discussion another time: the above is
-a heck of a lot to absorb, already.
-
-## Why WARL will not work and why WLRL is required
-
-WARL requires a follow-up read of the CSR to ascertain what heuristic
-the hardware *might* have applied, and if that procedure is followed in
-this proposal, performance even on hardware would be severely compromised.
-
-In addition when switching to foreign architectures, the switch has to
-be done atomically and guaranteed to occur.
-
-In the case of JIT emulation, the WARL "detection" code will be in an
-assembly language that is alien to hardware.
-
-Support for both assembly languages immediately after the CSR write
-is clearly impossible: this leaves no other option but to have the CSR
-be WLRL (on all platforms) and for traps to be mandatory (on the UNIX
-Platform).
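-
-The difference can be sketched as follows (Python). The exception class, the
-function names and the set of supported values are all hypothetical; the
-WARL model shown (keep the old value on an unsupported write) is only one of
-the behaviours WARL would allow an implementor to choose.
-
-```python
-class IllegalCSRWrite(Exception):
-    """Stands in for the mandatory trap on an unsupported ISANS value."""
-
-
-SUPPORTED = {0b000, 0b010}  # hypothetical namespaces this hart implements
-
-
-def write_warl(current: int, value: int) -> int:
-    """WARL: any write is accepted; an unsupported value is silently dropped."""
-    return value if value in SUPPORTED else current
-
-
-def write_wlrl_trapping(current: int, value: int) -> int:
-    """WLRL with mandatory trap: an unsupported value raises instead."""
-    if value not in SUPPORTED:
-        # Software (e.g. a JIT emulator) takes over at this point.
-        raise IllegalCSRWrite(hex(value))
-    return value
-```
-
-With `write_warl` the caller must read the CSR back to find out whether the
-switch happened; with `write_wlrl_trapping` the code following the write can
-rely on the namespace having changed, since otherwise it is never reached.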
-
-## Is it strictly necessary for foreign archs to switch back? <a name="foreignswitch"></a>
-
-No, because LAST-ISANS handles the setting and unsetting of the ISANS CSR
-in a completely transparent fashion as far as the foreign arch is concerned.
-Supervisor or Hypervisor traps take care of the context switch in a way
-that the user mode (or guest) need not be aware of, in any way.
-
-Thus, in e.g. Hypervisor Mode, the foreign guest arch has no knowledge
-or need to know that the hypervisor is flipping back to RV at the time of
-a trap.
-
-Note however that this is **not** the same as the foreign arch executing
-*foreign* traps! Foreign architecture trap and interrupt handling mechanisms
-are **out of scope** of this document and MUST be handled by the foreign
-architecture implementation in a completely transparent fashion that in
-no way interacts or interferes with this proposal.
-
-## Can we have dynamic declaration and runtime declaration of capabilities? <a name="dynamic"></a>
-
-Answer: don't know (yet). Quoted from Rogier:
-
-> "A SOC may have several devices that one may want to directly control
-> with custom instructions. If independent vendors use the same opcodes you
-> either have to change the encodings for every different chip (not very
-> nice for software) or you can give the device an ID which is defined in
-> some device tree or something like that and use that."
-
-Dynamic detection wasn't originally planned: static
-compilation was envisaged to solve the need, with a table of
-mvendorid-marchid-isamux/isans being maintained inside gcc / binutils /
-llvm (or separate library?) that, like the Linux kernel ARCH table,
-requires a world-wide atomic "git commit" to add globally-unique
-registered entries that map functionality to actual namespaces.
-
-Where that goes wrong is if there is ever a pair (or more) of vendors
-that use the exact same custom feature that maps to different opcodes:
-a statically-compiled binary has no hope of executing natively on both
-systems.
-
-At that point: yes, something akin to device-tree would be needed.
-
-# Open Questions <a name="open-questions"></a>
-
-This section is from a post by Rogier Bruisse
-<http://hands.com/~lkcl/gmail_re_isadev_isamux.html>
-
-## is the ISANS CSR a 32 or XLEN bit value? <a name="isans-32-or-xlen"></a>
-
-This is partly answered in another FAQ above: if 32 bits is not enough
-for a full suite of official, custom-with-atomic-registration and custom-without
-then a second CSR group (ISANS2) may be added at a future date (10-20 years
-hence).
-
-32 bits would not inconvenience RV32, and implementors wishing to
-make significant alternative modifications to opcodes in the RV32 ISA space
-could do so without the burden of having to support a split 32/LO 32/HI
-CSR across two locations.
-
-## is the ISANS a flat number space or should some bits be reserved for use as flags?
-
-See 16-bit RV namespace "page" concept, above. Some bits have to be unary
-(multiple simultaneous features such as LE/BE in one bit, and augmented
-Floating-point rounding / clipping in another), whilst others definitely
-need to be binary (the most obvious one being "paging" in the space currently
-occupied by RVC).
-
-## should the ISANS space be partitioned between reserved, custom with registration guaranteed non clashing, custom, very likely non clashing?
-
-Yes. Format TBD.
-
-## should only compiler visible/generated constant setting with CSRRWI and/or using a clearly recognisable LI/LUI be accommodated or should dynamic setting be accommodated as well?
-
-This is almost certainly a software design issue, not so much a hardware
-issue.
-
-## How should the ISANS be (re)stored in a trap and in context switch?
-
-See section above on privilege mode: LAST-ISANS has been introduced that
-mirrors (x)CAUSE and (x)EPC pretty much exactly. Context switches change
-uepc just before exit from the trap, in order to change the user-mode PC
-to switch to a new process, and ulast-isans can - must - be treated in
-exactly the same way. When the context switch sets ulast-isans (and uepc),
-the hardware flips both ulast-isans into uisans and uepc into pc (atomically):
-both the new NS and the new PC activate immediately, on return to usermode.
-
-Quite simple.
-
-## Should the mechanism accommodate "foreign ISA's" and if so how does one restore the ISA.
-
-See section above on LAST-ISANS. With the introduction of LAST-ISANS, the
-change is entirely transparent, and handled by the Supervisor (or Hypervisor)
-trap, in a fashion that the foreign ISA need not even know of the existence
-of ISANS. At all.
-
-## Where is the default ISA stored and what is responsible for what it is after
-
-Options:
-
-* start up
-* starting a program
-* calling into a dynamically linked library
-* taking a trap
-* changing privilege levels
-
-These first four are entirely at the discretion of (and the
-responsibility of) the software. There is precedent for most of these
-having been implemented, historically, at some point, in relation to
-LE/BE mode CSRs in other hardware (MIPSEL vs MIPS distros for example).
-
-Traps are responsible for saving LAST-ISANS on the stack, exactly as they
-are also responsible for saving other context-sensitive information such
-as the registers and xEPC.
-
-The hardware is responsible for atomically switching out ISANS into the
-relevant xLAST-ISANS (and back again on exit). See Privileged Traps,
-above.
-
-## If the ISANS is just bits of an instruction that are to be prefixed by the cpu, can those bits contain immediates? Register numbers?
-
-The concept of a CSR containing an immediate makes no sense. The concept
-of a CSR containing a register number, the contents of which would, presumably,
-be inserted into the NS, would immediately make that register a permanent
-and irrevocably reserved register that could not be utilised for any other
-purpose.
-
-This is what the CSRs are supposed to be for!
-
-It would be better just to have a second CSR - ISANS2 - potentially even ISANS3
-in 60+ years time, rather than try to use a GPR for the purposes for which CSRs
-are intended.
-
-## How does the system indicate a namespace is not recognised? Does it trap or can/must a recoverable mechanism be provided?
-
-It doesn't "indicate" that a namespace is not recognised. WLRL fields only
-hold supported values. If the hardware cannot hold the value, a trap
-**MUST** be thrown (in the UNIX platform), and at that point it becomes the
-responsibility of software to deal with it.
-
-## What are the security implications? Can some ISA namespaces be set by user space?
-
-Of course they can. It becomes the responsibility of the Supervisor Mode
-(the kernel) to treat ISANS in a fashion orthogonal to the PC. If the OS
-is not capable of properly context-switching securely by setting the right
-PC, it's not going to be capable of properly looking after changes to ISANS.
-
-## Does the validity of an ISA namespace depend on privilege level? If so how?
-
-The question does not exactly make sense, and may need a re-reading of the
-section on Privileged Modes, above. In RISC-V, privilege modes do not
-actually change very much state of the system: the absolute minimum changes
-are made (swapped out) - xEPC, xSTATUS and so on - and the privilege mode
-is expected to handle the context switching (or other actions) itself.
-
-ISANS - through LAST-ISANS - is absolutely no different. The trap and the
-kernel (Supervisor or Hypervisor) are provided the *mechanism* by which
-ISA Namespace *may* be set: it is up to the software to use that mechanism
-correctly, just as the software is expected to use the mechanisms provided
-to correctly implement context-switching by saving and restoring register
-files, the PC, and other state. The NS effectively becomes just another
-part of that state.
-
-
+++ /dev/null
-# mvendorid/marchid/mimplid (mvendorid/marchid MRO, mimplid WARL)<a name="mvendor_marchid_mimplid"></a>
-
-This proposal explores the possibility of adding a "mimplid" (or isamux) CSR
-that acts as an extra bit of state that goes directly into instruction decoding.
-It would be analogous to extending every single RISC-V instruction by a few bits
-so as to guarantee that no conflicts may occur in either custom extensions or
-future revisions of the RISC-V Standard, as well as permitting processors
-to execute (rather than JIT decode) completely foreign architectures.
-
-Implementors register (mvendorid-marchid-mimpl) tuples with the FSF
-gcc and binutils teams, effectively making the FSF the de-facto atomic
-arbiter responsible for maintaining the world-wide unique encoding
-database as part of the gcc and binutils codebase.
-
-Conflicting custom extensions thus become world-wide globally unique
-such that assembly writers, gcc and binutils may have a high to 100%
-degree of confidence that a given binary will not need recompiling from
-source, if transferred from one architecture to another (that has the
-exact same set of extensions).
-
-# Ideas discussed so far
-
-## One hart, one ISA encoding
-
-This idea is quite straightforward: any given multi-core processor
-can have multiple mvendorid-marchid-mimplid tuples, where each core
-(hart) has *one* and *only* one tuple. Thus, running different
-encodings is a simple matter of selecting the correct core.
-
-clarification from jacob:
-
-> it solves the problem of one implementation needing to implement
-> conflicting extensions, with some limitations, specifically that each of
-> the conflicting extensions must be used in separate threads. The Rocket
-> RoCC coprocessor interface, in a multi-tile SoC where different tiles
-> have different coprocessors, provides a working example of this model.
-> The overall system has both of two conflicting coprocessors.
-
-There are a couple of issues with this approach:
-
-* Single-core (single hart) implementations are not possible.
-* Multi-core implementations are guaranteed, for high workloads,
- to have "incompatible" cores sitting idle whilst "compatible"
- cores are overloaded.
-
-Aside from those limitations it is a workable and valid proposal that has the
-potential to meet the requirements, that may turn out to be a legitimate
-and simple and easy to implement subset of other ideas outlined in this
-document.
-
-## Every hart, multiple ISA encodings, mimpl unchanged on traps
-
-This idea allows every hart (core) to have the ability to select
-any one of multiple ISA encodings, by setting mimpl *in U-mode*.
-Implementation-wise the value in mimpl is passed to an out-muxer
-that generates mutually-exclusive HIGH signals that are passed
-as an additional input to the selection/enabler blocks of multiple
-(conflicting) decoders. As an extra signal into the associated multi-input
-AND gate the overhead is negligible, and there is no possibility of
-a conflict due to the out-mux outputs being mutually-exclusive.
-
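A minimal behavioural sketch of the out-muxer described above, assuming two hypothetical vendor decoders that both claim the custom-0 major opcode. The names `one_hot_select`, `decode` and the vendor labels are illustrative only and do not come from any specification; real hardware would do this with enable wires rather than a list.

```python
CUSTOM0 = 0b0001011  # custom-0 major opcode in the RISC-V base opcode map


def one_hot_select(isamux, n_encodings):
    """Model the out-muxer: exactly one enable line is HIGH."""
    if not 0 <= isamux < n_encodings:
        raise ValueError("isamux selects no implemented encoding")
    return [i == isamux for i in range(n_encodings)]


def vendor1_match(insn):
    # Hypothetical Vendor 1 decoder: claims every custom-0 instruction.
    return (insn & 0x7F) == CUSTOM0


def vendor2_match(insn):
    # Hypothetical Vendor 2 decoder: claims the same (conflicting) space.
    return (insn & 0x7F) == CUSTOM0


DECODERS = [("vendor1.op", vendor1_match), ("vendor2.op", vendor2_match)]


def decode(insn, isamux):
    """AND each decoder's match signal with its mutually-exclusive enable."""
    enables = one_hot_select(isamux, len(DECODERS))
    hits = [name for (name, match), en in zip(DECODERS, enables)
            if en and match(insn)]
    # Because the enables are one-hot, at most one decoder can ever hit.
    assert len(hits) <= 1
    return hits[0] if hits else None
```

The same 32-bit word decodes to a different operation depending only on the selector value, and no state belonging to either extension is touched by the selection.
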
-Note that this is very similar to setting MISA bits, with one difference:
-MISA only disables instructions from being decoded, whereas the above
-disables a certain subset of the opcode space and also *enables* others
-in their place. Also - and this is extremely important - it is
-**forbidden** for the encoding-switching to alter the actual state
-of any Extensions (custom or otherwise). Changing of mimplid **only**
-affects the decoding: it does **not**, unlike MISA, actually switch on
-or off the actual Extension and **cannot** be used to "power down" any
-hardware.
-
-The tricky bit for this idea is: what happens when a trap occurs,
-to switch to M-Mode or S-Mode? If this is not taken care of properly
-there is the possibility for a trap to be running instructions that
-are completely alien and incompatible with the code and context from
-which the trap occurred.
-
-A cursory analysis of the situation came to the consensus that it would
-be highly unlikely that custom opcodes would be used *in the trap*, and
-that even when the hart can support multiple *approved* (present and
-future) variants of the *RV Base Standard*, it would be so unusual for
-RV Base to change between (approved) variants that the possibility of
-there actually being a conflict is extremely remote. This is good as the
-code-path in an OS trap (supervisor mode) needs to be kept as short as
-possible.
-
-However, the possibility that there will be a critical difference cannot
-be known or ruled out, and foreign ISAs will definitely be made much more
-difficult to implement full OSes for (particularly proprietary ones) if
-the M-Mode and S-Mode traps are running an incompatible ISA.
-
-So the solution here is that whenever M-mode changes the mimplid/isamux CSR,
-the underlying hardware switches mtvec, stvec and bstvec over to
-*separate* and distinct entries (stored on a per-hart, per-mimplid basis).
-In the context of there being an OS, the OS would need to be running
-in a "default" initial context. It would set up mtvec, stvec (and bstvec
-if required), then change the mimplid/isamux, and set up *new* mtvec etc.
-entries *as appropriate* for that particular (alternative) ISA (including
-if it is a foreign architecture).
-
-> I agree.. complete renumbering is a huge overhead. Guy's solution avoids
-> that overhead and provides a fast-switching mechanism. We had already
-> identified what happens on traps, flushes, caches, etc. Would prefer if
-> we could review/critique that proposal.
->
-> If someone wants to re-number the entire custom ISA even then Guy's
-> solution will stand. Plus, in the heterogeneous environment, considering
-> each hart/core has its own marchselect (mutable CSR), the M mode (or
-> user/supervisor) can simply choose to enable that hart/core by writing
-> to the marchselect CSR.
->
-> For compliance, yes we will need Jacob's idea of having a global database
-> somewhere. Also, I believe that the compliance will check only if the
-> core is RISC-V compliant and not worry about any other extensions present
-> or not.
->
-> By adding a new mutable csr in the MRW region even existing
-> implementations will be compliant since accessing this CSR in current
-> implementations would just trap.
-
-## Every hart, multiple ISA encodings, mimpl set to "default" on traps
-
-This is effectively the same as the above except that when switching to
-M-Mode or S-Mode, the ISA encoding is always automatically switched to
-one and only one (default) ISA encoding. There are no complications for
-the hardware. For software, however - particularly OSes and in particular
-for running foreign hardware ISAs - every single trap now has to work
-out which ISA the U-mode was running, and branch accordingly. Running a
-foreign OS thus becomes extremely challenging, although a case could be
-made for the foreign ISA having its own entirely different orthogonal
-trap-handling system.
-
-## Every hart, multiple ISA encodings, mimpl set to "supervisor-selectable"
-
-This is again identical as far as mimplid/isamux is concerned, with, again,
-a different kind of decision-making on traps. It was pointed out that
-if the mimplid/isamux is left unaltered when a trap occurs, switching over
-from one ISA to another inside a trap and dropping down to a different
-ISA in U-Mode is made slightly challenging by virtue of the fact that, when
-the trap changes the ISA, from that point onwards it *has to run that ISA
-inside the trap*. This may require extra code-paths (branches) and
-multiple different exit points from the trap.
-
-Whilst again it is quite unlikely that this scenario will arise (due to
-it being unlikely that the Base ISA will change significantly between
-stable, approved revisions to the RV Standard), the possibility cannot
-entirely be ruled out.
-
-So this idea is a hybrid of the above two: there is a default ISA for
-M-Mode and S-Mode, however in each it is possible to *set* that default.
-
-The idea has not yet been fully analysed and needs further discussion.
-
+++ /dev/null
-# mvendorid/marchid WARL <a name="mvendor_marchid_warl"></a>
-
-This proposal is to make the mvendorid and marchid CSRs have WARL (writeable)
-characteristics as a means and method of providing RISC-V implementations
-with a way to support multiple binary instruction encodings simultaneously
-within the same processor. Each unique tuple (including on a per-hart
-basis) uniquely identifies and permits switch-over
-to a completely separate and distinct binary-encoding such that:
-
-* Different versions (legacy and new) of the RISC-V Standard may be
- supported within the same processor
-* The fight over the extremely limited custom opcode space ends (permanently)
-* Entirely foreign ISAs may be supported within the same processor
- (actually executed: i.e. not the same thing at all as the JIT Extension).
-
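As a rough illustration of what WARL (Write Any values, Reads Legal values) behaviour could look like for the tuple, the sketch below models a selector that never traps on a write and only ever reads back a supported tuple. Keeping the previous value on an illegal write is an assumption (one policy among those WARL permits), and the class name and the vendor id `0x1` are hypothetical placeholders.

```python
class WarlTupleCsr:
    """Behavioural model of a WARL (mvendorid, marchid) selector."""

    def __init__(self, legal_tuples, reset_value):
        self._legal = frozenset(legal_tuples)
        if reset_value not in self._legal:
            raise ValueError("reset value must be a legal tuple")
        self._value = reset_value

    def write(self, value):
        # WARL: writing an unsupported value does not raise an exception.
        # Assumed policy: an illegal write leaves the current value intact.
        if value in self._legal:
            self._value = value

    def read(self):
        # Reads always return a legal (supported) tuple.
        return self._value
```

Software can therefore probe for support by writing a tuple and reading it back: if the value read differs from the value written, that encoding is not implemented on this hart.
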
-For instances where mvendorid and marchid are read-only, that would be
-taken to be a Standards-mandatory "declaration" that the architecture
-has *no* Custom Extensions (and that it conforms precisely to one and
-only one specific variant of the RISC-V Specification).
-
-Beyond that, the change is so simple and straightforward that there is not
-much to discuss aside from its feasibility and its implications. The
-main considerations are:
-
-* State information. How is state to be handled?
-* Compliance. What impact does the change have on Compliance (and testing)?
-* Implementation. Is it feasible and practical?
-* Exception-handling. What happens during a trap?
-* Backwards compatibility. Is the change zero-impact (for existing systems)?
-* Forwards compatibility. Does the change affect (limit) future hardware?
-
-## State information
-
-Unlike with MISA (which can be used to completely switch off, i.e. power
-down, certain Extensions), state information is **not permitted to be
-altered or destroyed** during or by a switch-over. Switch-over to a different
-mvendorid-marchid tuple shall have the effect of *purely* disabling certain
-instruction encodings and enabling others.
-
-Note also that during (for example) standard OS context-switching *all*
-state of *all* enabled extensions (and variants of the Base Standards) related
-to *all* mvendorid-marchid tuples will need to be saved onto the stack,
-given that a hart may, at any time, switch between any available
-mvendorid-marchid tuples.
-
-In other words there is absolutely zero connection *of any kind whatsoever*
-between the "encoding switching" and the state or status of the Extensions
-that the binary encodings are being directed *at* (on any upcoming
-conflicting instruction encodings). If a program requires the enablement
-or disablement of an Extension it **uses MISA and other official methods
-to do so** that have **absolutely nothing to do with what mvendorid-marchid
-is presently enabled**.
-
-## Compliance
-
-It was pointed out early in the discussions that Compliance Testing may
-**fail** any system that has mvendorid/marchid as WARL. This however is a
-clear case of "Compliance Tail Wagging Standard Dog". However it *was*
-recognised that overly complex Compliance Testing would result
-in rejection of the entire RISC-V Standard.
-
-A simple solution is to modify the Compliance Test Suite to specify the
-required mvendorid/marchid to be tested, as a parameter to the test
-applications. The test can be run multiple times, providing the
-implementor with multiple Compliance Certificates for the same processor,
-against *different variants* of past, present and future RISC-V Standards.
-
-*This is clearly a desirable characteristic*
-
-It's been noted that there may be certain legitimate cases where
-a mvendorid-marchid should *specifically* not be tested for RISC-V
-Certification Compliance: native support for foreign architectures (not
-related to the JIT Extension: *actual* full entire non-RISC-V foreign
-instruction encoding). Exactly how this would work (vis-a-vis Compliance)
-needs discussion, as it would be unfortunate and undesirable for a hybrid
-processor capable of executing more than one hardware-level ISA
-to not be permitted to receive RISC-V Certification Compliance.
-
-How such foreign architectures would switch back to RISC-V when the foreign
-architecture does not support the concept of mvendorid-marchid is out of
-scope and left to implementors to define and implement equivalent
-functionality.
-
-## Implementation
-
-The redirection of meaning of certain binary encodings to multiple
-engines was considered extreme, eyebrow-raising, and also (importantly)
-potentially expensive, introducing significant latency at the decode
-phase.
-
-However, it was observed that MISA already switches out entire
-sets of instructions (interacts at the "decode" phase). The difference
-between what MISA does and the mvendorid/marchid WARL idea is that whilst
-MISA only switches instruction decoding on (or off), the WARL idea
-*redirects* encoding, effectively to *different* simultaneous engines,
-fortunately in a deliberately mutually-exclusive fashion.
-
-Implementations would therefore, in each Extension (assuming one separate
-"decode" engine per Extension), simply have an extra (mutually-exclusively
-enabled) wire in the AND gate for any given binary encoding, and in this
-way there would actually be very little impact on the latency. The assumption
-here is that there are not dozens of Extensions vying for the same binary
-encoding (at which point the Fabless Semi Company has other much more
-pressing issues to deal with that make resolving binary encoding conflicts
-trivial by comparison).
-
-Also pointed out was that in certain cases pipeline stalls could be introduced
-during the switching phase, if needed, just as they may be needed for
-correct implementation of (mandatory) support for MISA.
-
-## Exception Handling (traps) and context-switching
-
-In cases where mvendorid and marchid are WARL, the mvendorid-marchid
-becomes part of the execution context that must be saved (and switched
-as necessary) just like any other state / CSR.
-
-When any trap exception is raised the context / state *must not* be
-altered (so that it can be properly saved, if needed, by the exception
-handler) and that includes the current mvendorid-marchid tuple. This
-leads to some interesting situations where a hart could conceivably be
-directed to a set of trap handler binary instructions that the current
-mvendorid-marchid setting is incapable of correctly interpreting.
-
-To fix this it will be necessary for implementations (hardware /
-software) to set up separate per-mvendorid-marchid trap handlers and
-for the hardware (or software) to switch to the appropriate trap "set"
-when the mvendorid-marchid is written to. The switch to a different
-"set" will almost undoubtedly require (transparent) **hardware** assistance.
-
-The reason for requiring hardware-assist for switching exception
-handling tables is because it has to be done atomically: there cannot
-be a situation where just as a hart is switching to a different
-mvendorid-marchid tuple an exception occurs, resulting in execution of
-effectively foreign instructions.
-
-In essence this means that mtvec needs to be a multi-entry table, one
-per (mvendorid-marchid) tuple. Likewise stvec, and bstvec.
-
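The multi-entry table can be pictured with the following behavioural sketch, in which each tuple owns its own bank of trap vectors and selecting a tuple switches the bank in the same step. The class and method names are hypothetical, and the single Python assignment merely stands in for the atomic hardware switch argued for above.

```python
class Hart:
    """Model of a hart with one trap-vector bank per (mvendorid, marchid)."""

    VECTORS = ("mtvec", "stvec", "bstvec")

    def __init__(self, tuples, default):
        if default not in tuples:
            raise ValueError("default tuple must be supported")
        # One independent set of trap vectors for every supported tuple.
        self._banks = {t: dict.fromkeys(self.VECTORS, 0) for t in tuples}
        self._active = default

    def select_tuple(self, new_tuple):
        # Tuple and vector bank change together, so a trap can never see
        # the new encoding paired with the old encoding's handlers.
        if new_tuple in self._banks:
            self._active = new_tuple

    def write_vector(self, name, address):
        # Writes land in the bank of the currently active tuple only.
        self._banks[self._active][name] = address

    def trap_target(self, name="mtvec"):
        return self._banks[self._active][name]
```

An OS following the sequence described earlier would set its vectors under the default tuple, select an alternative tuple, then install the handlers written for that encoding.
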
-## Backwards-compatibility
-
-Backwards compatibility is vital for Standards. There are two aspects
-to this:
-
-* The actual change to the Standard should be minimally-disruptive
-* There should be no interference between two different encodings
- (any two separate tuples).
-
-Given that mvendorid and marchid are presently read-only; given that
-the change is to the *wording* and does not add any new CSRs; the change
-can clearly be seen to be zero-impact, with the exception being to
-implementors that have Custom Extensions in silicon at the moment.
-
-On the second point: the *entire purpose* of the change is to provide
-globally world-wide irrevocable permanent distinction and separation
-between instruction encodings!
-
-## Forwards-compatibility
-
-Forwards compatibility is again vital for Standards. Standards are required
-to adapt, yet at the same time provide a means and method of identifying
-and separating older (and legacy) systems from present and future versions.
-
-The clear separation which mutually-exclusively redirects encodings based
-on which mvendorid-marchid tuple is currently active clearly meets that
-requirement.
-
-# How the "custom extension conflict" is solved
-
-* Vendor 1 produces a Custom Extension
-* Vendor 2 produces a Custom Extension
-* Both Custom Extensions have conflicting binary encodings.
-* Fabless Semi Company 1 licenses both Vendor 1 and 2 Custom Extensions
-* Fabless Semi Company 1 assigns the (WARL) value marchid=0xeee1 to
- represent enabling Vendor 1's conflicting encoding
-* Fabless Semi Company 1 assigns the (WARL) value marchid=0xeee2 to
- represent enabling Vendor 2's conflicting encoding
-* Fabless Semi Company 1 contacts the FSF, submitting patches to gcc
- (and likewise with binutils) to register
- (mvendorid=fabless1,marchid=0xeee1) to be added to the global
- (FSF-curated?) database for Vendor 1's instruction encoding.
-* Likewise for Vendor 2's instruction encoding.
-
-Note that the RISC-V Foundation is **not** involved (or consulted) in
-this process. The **FSF** (as the Copyright holder of gcc and binutils)
-inherently and implicitly becomes the de-facto atomic arbiter in control
-of the registration of Custom Extension instruction encodings.
-
-The FSF's "job" is however quite straightforward: ensure that all
-registrations are in fact unique. It would be absolutely no good if a
-Vendor decided to re-use one mvendorid-marchid tuple to mean two
-totally different sets of instructions needed to be enabled! Any
-Vendor attempting to do so should be red-flagged immediately.
-
-Situations in which the FSF receives requests for patches with
-*another fabless semiconductor company's* mvendorid should also be treated
-with suspicion, at the very least being queried as to why one fabless semi
-company is trying to encroach on another company's JEDEC-registered
-encoding space.
-
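The two checks just described (uniqueness of each tuple, and suspicion of submissions against someone else's mvendorid) could be sketched as below. This is purely illustrative: no such registry exists in gcc or binutils, and every name here (`EncodingRegistry`, `register`, the string labels) is hypothetical.

```python
class RegistrationError(Exception):
    """Raised when a submission should be red-flagged for review."""


class EncodingRegistry:
    def __init__(self):
        # Maps (mvendorid, marchid) -> name of the instruction encoding.
        self._entries = {}

    def register(self, submitter_vendorid, mvendorid, marchid, encoding):
        # A submitter touching another company's mvendorid is queried.
        if submitter_vendorid != mvendorid:
            raise RegistrationError("mvendorid belongs to another vendor")
        key = (mvendorid, marchid)
        existing = self._entries.get(key)
        # One tuple must never stand for two different encodings.
        if existing is not None and existing != encoding:
            raise RegistrationError("tuple already registered differently")
        self._entries[key] = encoding

    def lookup(self, mvendorid, marchid):
        return self._entries.get((mvendorid, marchid))
```

Re-submitting an identical registration is treated as harmless here; only a conflicting meaning for an existing tuple is rejected.
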
-The special case of the above is when a fabless semiconductor company
-implements a new version of the RISC-V Standard. Here, again, the
-fabless semi company will provide patches to gcc and binutils, requesting
-that their specific mvendorid-marchid tuple be added to the (inherently
-de-facto atomic arbitrated) global database for that particular RISC-V
-Standard "Variant".
-
-# Questions to be resolved
-
-* Can the declaration (meaning) of read-only be expanded to cover
- any number of (non-conflicting) Custom Extensions? What are the
- implications of doing so?