# Major Opcode Allocation

SimpleV Prefix, 16-bit Compressed, and SV VBLOCK all require considerable
opcode space.  The approach is similar to OpenPOWER v3.1 "prefixes"; the
key driving difference here is the aim to reduce overall instruction size,
and thus greatly reduce I-Cache size and, in turn, power consumption.

Consequently, rather than settle for a v3.1 32 bit prefix, 8 major opcodes
are taken up and given new meanings.  Two options here involve either:

* Taking 8 arbitrary unused major opcodes as-is
* Moving anything currently in the major opcode range 0-7 elsewhere

This applies **only** in "LibreSOC Mode".  Candidates for moving elsewhere
include mulli, twi and tdi (major opcodes 7, 3 and 2 respectively).

* 2 opcodes for 16-bit Compressed instructions, giving 11 bits available
* 2 opcodes for SV-P48 (and SV-C32), giving the 11 bits needed for prefixing
* 2 opcodes for SV-P64 (and SV-C48), giving 27 bits available
* 2 opcodes for SV VBLOCK

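The bit counts in the list above can be cross-checked with simple arithmetic. This is a minimal sketch under stated assumptions, none of which are spelled out on this page: the major opcode field is 6 bits wide, a pair of major opcodes effectively hands back one bit of encoding space, SV-P48 uses a 16-bit prefix and SV-P64 a 32-bit prefix. The name `free_bits` is purely illustrative.

```python
# Assumption: OpenPOWER major opcode field is 6 bits wide.
MAJOR_OPCODE_BITS = 6


def free_bits(unit_bits, num_opcodes):
    """Bits left over in a unit of `unit_bits` once the major opcode
    field is accounted for, when `num_opcodes` (a power of two) major
    opcodes are allocated to that unit."""
    if num_opcodes < 1 or num_opcodes & (num_opcodes - 1):
        raise ValueError("num_opcodes must be a power of two")
    # Allocating 2**n major opcodes gives n bits back as a selector.
    selector_bits = num_opcodes.bit_length() - 1
    return unit_bits - MAJOR_OPCODE_BITS + selector_bits


# 16-bit Compressed, and the assumed 16-bit prefix of SV-P48
print(free_bits(16, 2))
# assumed 32-bit prefix of SV-P64
print(free_bits(32, 2))
```
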
With only 11 bits for 16-bit Compressed, it may be better to use the
opportunity to switch into a "16 bit mode".  Interestingly, SV-P32 could
likewise switch into the same mode.

# LE/BE complications

See <https://bugs.libre-soc.org/show_bug.cgi?id=529> for discussion.

In LE mode the Major Opcode is at the opposite end of the sequential byte
order when read from memory.  A solution which allows 16 and 48 bit
instructions to co-exist with 32 bit ones is therefore to look at bytes 2
and 3 *before* looking at 0 and 1.

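To illustrate where the Major Opcode ends up in memory, the sketch below packs one ordinary 32-bit instruction both ways. It assumes the major opcode is the top 6 bits of the 32-bit word; `addi r3, 0, 1` (encoding `0x38600001`, major opcode 14) is used only as a convenient example, and the helper names are hypothetical.

```python
import struct

ADDI_R3_0_1 = 0x38600001  # addi r3, 0, 1: major opcode 14


def major_opcode(word):
    """Top 6 bits of a 32-bit instruction word."""
    return (word >> 26) & 0x3F


def major_opcode_from_memory(raw, little_endian):
    """Read the major opcode from 4 bytes in memory order.
    In BE it is in byte 0; in LE it is in byte 3."""
    top_byte = raw[3] if little_endian else raw[0]
    return top_byte >> 2


be_bytes = struct.pack(">I", ADDI_R3_0_1)  # 38 60 00 01
le_bytes = struct.pack("<I", ADDI_R3_0_1)  # 01 00 60 38

print(be_bytes.hex(), major_opcode_from_memory(be_bytes, False))
print(le_bytes.hex(), major_opcode_from_memory(le_bytes, True))
```
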
Option 1:

A 16 bit instruction would therefore be in bytes 2 and 3.  It would be
removed from the instruction stream *ahead* of bytes 0 and 1, which would
remain where they were.  The next instruction would repeat the analysis,
starting now instead at the *new* bytes 2 and 3.

A 48 bit instruction would again use bytes 2 and 3 to read the major
opcode, and extract bytes 0 through 5 from the stream.  However the 48
bit instruction would be constructed from bytes 2,3,0,1,4,5.  Again:
after these 6 bytes were extracted from the stream the analysis would
begin again for the next instruction at bytes 2 and 3.

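Option 1 can be sketched as a small extraction routine. This is only an illustration under loudly stated assumptions: the sets `LEN16_MAJORS` and `LEN48_MAJORS` are hypothetical opcode assignments, the major opcode is assumed to be the top 6 bits of byte 3, and the handling of plain 32-bit instructions (bytes 0 to 3 taken as-is) is inferred rather than stated above.

```python
# HYPOTHETICAL major opcode assignments, for illustration only.
LEN16_MAJORS = {0, 1}
LEN48_MAJORS = {2, 3}


def extract_option1(stream):
    """Remove and return the next instruction from `stream`,
    a mutable bytearray holding the LE instruction stream."""
    if len(stream) < 4:
        raise ValueError("need at least 4 bytes to inspect bytes 2 and 3")
    # Assumption: major opcode is the top 6 bits of byte 3.
    major = stream[3] >> 2
    if major in LEN16_MAJORS:
        insn = bytes(stream[2:4])
        del stream[2:4]  # bytes 0 and 1 remain where they were
    elif major in LEN48_MAJORS:
        if len(stream) < 6:
            raise ValueError("truncated 48 bit instruction")
        # Constructed from bytes 2,3,0,1,4,5
        insn = bytes(stream[2:4] + stream[0:2] + stream[4:6])
        del stream[0:6]
    else:
        # Assumed: an ordinary 32 bit instruction, taken unchanged.
        insn = bytes(stream[0:4])
        del stream[0:4]
    return insn


stream = bytearray([0xA0, 0xA1, 0xC0, 0x04, 0xB0, 0x08, 0xD0, 0xD1])
first = extract_option1(stream)   # 16 bit: old bytes 2 and 3
second = extract_option1(stream)  # 48 bit: uses the *new* bytes 2 and 3
print(first.hex(), second.hex())
```
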
Option 2:

When reading from memory, before handing to the instruction decoder, bytes
0 and 1 are swapped unconditionally with bytes 2 and 3.  Effectively this
is near-identical to LE/BE byte-level swapping on a 32-bit block, except
that this time it is half-word (16 bit) swapping on a 32-bit block.

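The unconditional swap of Option 2 is small enough to show directly. A minimal sketch, assuming the fetched data arrives as a byte string whose length is a multiple of 4; the function name `swap_halfwords` is illustrative only.

```python
def swap_halfwords(block):
    """Swap bytes 0-1 with bytes 2-3 in every 32-bit block.
    Byte order *within* each half-word is left untouched."""
    if len(block) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    out = bytearray()
    for i in range(0, len(block), 4):
        out += block[i + 2:i + 4] + block[i:i + 2]
    return bytes(out)


fetched = bytes([0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])
print(swap_halfwords(fetched).hex())
```
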
With the Major Opcode then always being in the first 2 bytes, it becomes
much simpler for the pre-analysis phase to determine instruction length,
regardless of what that length is (16/32/48/64/VBLOCK).

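A length pre-analysis of that kind might look like the sketch below. Everything specific in it is an assumption made for illustration: the allocation of major opcodes 0-7 in pairs, the major opcode being the top 6 bits of the second byte after the half-word swap, and the use of `None` to signal that a VBLOCK needs further decoding to find its length.

```python
# HYPOTHETICAL allocation: pairs of major opcodes per instruction class.
LENGTH_BY_MAJOR = {
    0: 2, 1: 2,  # 16 bit Compressed
    2: 6, 3: 6,  # SV-P48
    4: 8, 5: 8,  # SV-P64
}
VBLOCK_MAJORS = {6, 7}


def instruction_length(first_two_bytes):
    """Length in bytes of the instruction starting at these two bytes,
    or None for VBLOCK, whose length needs further decoding."""
    # Assumption: after the swap, the major opcode is the top 6 bits
    # of the second byte (the former byte 3 in LE).
    major = first_two_bytes[1] >> 2
    if major in VBLOCK_MAJORS:
        return None
    # Anything not reallocated is an ordinary 32 bit instruction.
    return LENGTH_BY_MAJOR.get(major, 4)


print(instruction_length(bytes([0x60, 0x38])))  # major opcode 14
```
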
# 16 bit Compressed

See [[16_bit_compressed]]