# Major Opcode Allocation

SimpleV Prefix, 16-bit Compressed, and SV VBLOCK all require considerable
opcode space. The approach is similar to OpenPOWER v3.1 "prefixes"; the key
difference is that the aim here is to reduce overall instruction size,
which greatly reduces I-Cache size and, in turn, power consumption.

Consequently, rather than settle for a v3.1 32 bit prefix, 8 major opcodes
are taken up and given new meanings. There are two options:

* Taking 8 arbitrary unused major opcodes as-is
* Moving anything in the range 0-7 elsewhere

This applies **only** in "LibreSOC Mode". Candidates for moving elsewhere
include mulli, twi and tdi.

The 8 major opcodes are divided up as follows:

* 2 opcodes for 16-bit Compressed instructions, with 11 bits available
* 2 opcodes are required in order to give SV-P48 the 11 bits needed for prefixing
* 2 opcodes are likewise required for SV-P64 to have 27 bits available
* 2 opcodes for SV-C32 and SV-C48 (32 bit versions of P48 and P64)

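As a rough cross-check of the figures above, the bit budgets can be reproduced with simple arithmetic. This is only a sketch under assumptions that are not stated in the text: a 6-bit major opcode field, SV-P48 being a 16-bit prefix in front of a 32-bit instruction, SV-P64 being a 32-bit prefix, and each pair of reserved major opcodes contributing one extra selector bit. The name `free_bits` is purely illustrative.

```python
MAJOR_OPCODE_BITS = 6  # assumption: width of the major opcode field


def free_bits(prefix_bits, opcodes_reserved):
    """Bits left in a prefix once the major opcode field is removed,
    plus the selector bits gained by reserving several major opcodes."""
    # 2 reserved opcodes distinguish 2 cases, i.e. 1 extra bit
    selector_bits = (opcodes_reserved - 1).bit_length()
    return prefix_bits - MAJOR_OPCODE_BITS + selector_bits


print(free_bits(16, 2))  # 16-bit Compressed: 11
print(free_bits(16, 2))  # SV-P48, assumed 16-bit prefix: 11
print(free_bits(32, 2))  # SV-P64, assumed 32-bit prefix: 27
```
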
With only 11 bits for 16-bit Compressed, it may be better to use the
opportunity to switch into "16 bit mode". Interestingly, SV-C32 could
likewise switch into the same mode.

VBLOCK can be added later by using further VSX-dedicated major opcodes
(EXT62, EXT60).

* EXT01 - v3.1B prefix
* EXT04 - vector/bcd
* EXT06 - vector
* EXT09 - reserved
* EXT22 - reserved sandbox
* EXT57 - vector ld
* EXT58 - ld (leave ok)
* EXT59 - FP (leave ok)
* EXT60 - vector
* EXT61 - st (leave ok)
* EXT62 - vector st
* EXT63 - FP (leave ok)

# LE/BE complications

See <https://bugs.libre-soc.org/show_bug.cgi?id=529> for discussion.

With the Major Opcode being at the opposite end of the sequential byte
order when read from memory in LE mode, a solution which allows 16 and
48 bit instructions to co-exist with 32 bit ones is to look at bytes 2
and 3 *before* looking at bytes 0 and 1.

Option 1:

A 16 bit instruction would therefore be in bytes 2 and 3, and would be
removed from the instruction stream *ahead* of bytes 0 and 1, which would
remain where they were. The next instruction would repeat the analysis,
starting now instead at the *new* bytes 2 and 3.

A 48 bit instruction would again use bytes 2 and 3, read the major
opcode, and extract bytes 0 through 5 from the stream. However, the 48
bit instruction would be constructed from bytes 2,3,0,1,4,5. Again:
after these 6 bytes are extracted from the stream, the analysis would
begin again for the next instruction at bytes 2 and 3.

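Option 1 can be sketched as follows. This is a minimal illustration, not a definitive implementation: `next_instruction` and `demo_length` are hypothetical names, the mapping of major opcodes to lengths in `demo_length` is invented purely for the demonstration, and the handling of 32 bit instructions (taking bytes 0 to 3 as-is) is an assumption, since the description above only covers the 16 and 48 bit cases.

```python
def demo_length(halfword):
    """Hypothetical classifier: instruction length in bytes (2, 4 or 6).

    Assumes LE mode, so the major opcode is the top 6 bits of byte 3,
    i.e. of the second byte of the halfword passed in. The opcode
    numbers used here are invented for this example only.
    """
    major = halfword[1] >> 2
    return {0: 2, 1: 2, 2: 6, 3: 6}.get(major, 4)


def next_instruction(stream, length_of=demo_length):
    """Remove and return one instruction from `stream` (a bytearray)."""
    key = bytes(stream[2:4])          # look at bytes 2 and 3 first
    length = length_of(key)
    if length == 2:
        insn = key
        del stream[2:4]               # bytes 0 and 1 stay where they are
    elif length == 6:
        # rebuilt in the order 2,3,0,1,4,5
        insn = key + bytes(stream[0:2]) + bytes(stream[4:6])
        del stream[0:6]
    else:
        insn = bytes(stream[0:4])     # assumption: 32 bit taken as-is
        del stream[0:4]
    return insn


stream = bytearray.fromhex("10111200 22083334 0102037c")
while len(stream) >= 4:
    print(next_instruction(stream).hex())
# prints 1200, then 220810113334, then 0102037c
```
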
Option 2:

When reading from memory, before handing to the instruction decoder, bytes
0 and 1 are swapped unconditionally with bytes 2 and 3. Effectively this
is near-identical to LE/BE byte-level swapping on a 32-bit block, except
that this time it is half-word (16 bit) swapping on a 32-bit block.

With the Major Opcode then always being in the first 2 bytes, it becomes
much simpler for the pre-analysis phase to determine instruction length,
regardless of what that length is (16/32/48/64/VBLOCK).

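The half-word swap of Option 2 is small enough to sketch directly. The function name `halfword_swap` is hypothetical, and the sketch assumes the input is a whole number of aligned 32-bit blocks; how a stream containing 16 or 48 bit instructions is divided into such blocks is not addressed here.

```python
def halfword_swap(data):
    """Swap bytes 0,1 with bytes 2,3 in every 32-bit block of `data`."""
    if len(data) % 4:
        raise ValueError("expected a whole number of 32-bit blocks")
    out = bytearray()
    for i in range(0, len(data), 4):
        out += data[i + 2:i + 4]   # bytes 2 and 3 move to the front
        out += data[i:i + 2]       # bytes 0 and 1 follow
    return bytes(out)


raw = bytes([0, 1, 2, 3, 4, 5, 6, 7])
print(halfword_swap(raw).hex())    # 0203000106070405
```
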
Option 3:

Just as in VLE, require instructions to be in BE order. Data, which has nothing to do with instruction order, may optionally remain in LE order.

# 16 bit Compressed

See [[16_bit_compressed]]