Bug 1244: changes to images

[libreriscv.git] / 3d_gpu / requirements_specification.mdwn
diff --git a/3d_gpu/requirements_specification.mdwn b/3d_gpu/requirements_specification.mdwn

index 611bf740e40028843d933f1c872d8723c3e7c201..2956ce5dd587248ca8f3e5d63ed6a2506975b2f7 100644 (file)
--- a/3d_gpu/requirements_specification.mdwn
+++ b/3d_gpu/requirements_specification.mdwn
@@ -2,7 +2,7 @@
  
  This document contains the Requirements Specification for the Libre RISC-V
  micro-architectural design.  It shall meet the target of 5-6 32-bit GFLOPs,
-150 M-Pixels/sec, 30 Million Triangles/sec, and minimum video decode 
+150 M-Pixels/sec, 30 Million Triangles/sec, and minimum video decode
  capability of 720p @ 30fps to a 1920x1080 framebuffer, in under 2.5 watts
  at an 800mhz clock rate.  Exceeding this target is acceptable if the
  power budget is not exceeded.  Exceeding this target "just because we can"
@@ -88,6 +88,15 @@ An overview of the design is as follows:
    corresponding 8/16-bit Function Unit(s) for that register, and vice-versa.
    8/16-bit operations will however **not** block the remaining
    (unallocated) bytes of the same register from being utilised.
+* Spectre timing attacks will be dealt with by ensuring that there
+  are no side-channels between cores in the usual ways (no shared
+  DIV unit, correct use of L1 cache), however there will be an
+  addition of a "Speculation Fence" instruction (or hint) that will
+  reset the internal state to a known quiescent state.  This involves
+  cancellation of all speculation, cancellation of "nameless" registers,
+  committing outstanding register writes to the register file, and
+  cancelling all Function Units waiting for read hazards.  This to
+  be automatically done on any exceptions or interrupts.
  
  # Register File
  
@@ -109,6 +118,7 @@ cycle, such that the register file may effectively be used as an
  
  # Function Units
  
+## Commit Phase (instruction order preservation)
  
  # 6600 Scoreboards
  
@@ -164,5 +174,82 @@ read and write hazards - dependencies - between Function Units.
  
  ## Branch Speculation
  
-![shadow branch][shadow_issue_flipflops.png]
-
+Branch speculation is done by preventing instructions from becoming
+"writeable" until the Branch Unit knows if it has resolved or not.
+This is done with the addition of "Shadow" lines, as shown below:
+
+This image reproduced with kind permission, Copyright (C) Mitch Alsup
+[[!img shadow_issue_flipflops.png]]
+
+Note that there are multiple "Shadow" signals, coming not just from Branch
+Speculation but also from predication and exception shadows.
+
+On a "Failed" signal, the instruction is told to "Go Die".  This is
+passed to the Computation Unit as well.  When all "Success" signals
+are raised the instruction is permitted to enter "Writeable".
+
+## Exceptions
+
+Exceptions shall be handled by each instruction that *may* throw an
+exception having and holding a "Shadow" wire over all dependent
+Function Units, in exactly the same way as Branch Speculation.
+Likewise, dependent instructions are prevented and prohibited from
+entering the "Writeable" state.
+
+Dependent downstream instructions, if the exception is thrown,
+shall have the "Failed" bit ASSERTED (by the Function Unit throwing
+the exception) such that the down-stream dependent instruction is told
+to "Go Die".
+
+If the point is reached at which the instruction knows that the
+Exception cannot possibly occur, the "Success" signal is raised
+instead, thus cancelling the "hold" over dependent downstream
+instructions - again in exactly the same way as Branch Speculation
+"Success".
+
+Exceptions may **only** be actually raised if they are at the front of
+the instruction queue, i.e. if they are free of write hazards.
+See section on "Function Unit Commit" phase, as the Function Units
+have a "link bit" that preserves the instruction issue order, which
+must also be respected.
+
+# Spectre-style timing mitigation
+
+Spectre-style timing attacks are defined by one instruction issue
+affecting the completion time of past **and future** instructions.
+The key insight to mitigation against such attacks is to note that
+arbitrary untrusted instructions must not be permitted to affect
+trusted instructions.  Consequently as long as there is a firebreak
+(a "Fence") between trusted and untrusted, timing attacks can be
+held off.
+
+Two instructions ("hints") shall therefore be added:
+
+* One that stops speculation, multi-issue and any out-of-order
+  resource allocation for a minimum of 16 instructions.
+* Another that **cancels** all speculation and reservations,
+  cancels "nameless" registers, waits for and ensures that all
+  outstanding instructions have completed and committed, before
+  permitting the processor to continue further.
+
+This latter shall occur unconditionally without requiring a special
+instruction to be called, on ECALL as well as all exceptions and
+interrupts.
+
+# ALU design
+
+There is a separate pipelined alu for fdiv/fsqrt/frsqrt/idiv/irem
+that is possibly shared between 2 or 4 cores.
+
+The main ALUs are each a unified ALU for i8-i64/f16-f64 where the
+ALU is split into lanes with separate instructions for each 32-bit half.
+So, the multiplier should be capable of 64-bit fmadd, 2x32-bit fmadd,
+4x16-bit fmadd, 1x32-bit fmadd + 2x16-bit fmadd (in either order), and all
+(8/16/32/64) sizes of integer mul/mulhsu/mulh/mulhu in 2 groups of 32-bits.
+We can implement fmul using fmadd with 0 (make sure that we get the right
+sign bit for 0 for all rounding modes).
+
+# Rowhammer Mitigation
+
+* <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-March/000699.html>
+* <https://arxiv.org/pdf/1903.00446.pdf>