# Major Opcode Allocation

SimpleV Prefix, 16-bit Compressed, and SV VBLOCK all require considerable
opcode space. As with OpenPOWER v3.1 "prefixes", new encodings are needed,
but the key driving difference here is the aim of reducing overall
instruction size, and thus greatly reducing I-Cache size and, in turn,
power consumption.

Consequently, rather than settle for a v3.1 32-bit prefix, 8 major opcodes
are taken up and given new meanings. There are two options:

* Taking 8 arbitrary unused major opcodes as-is
* Moving anything in the range 0-7 elsewhere

This applies **only** in "LibreSOC Mode". Candidates for moving elsewhere
include mulli, twi and tdi.

The 8 major opcodes are allocated in pairs:

* 2 opcodes for 16-bit Compressed instructions, with 11 bits available
* 2 opcodes are required in order to give SV-P48 the 11 bits needed for prefixing
* 2 opcodes are likewise required for SV-P64 to have 27 bits available
* 2 opcodes for SV-C32 and SV-C48 (32 bit versions of P48 and P64)
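The 11-bit and 27-bit figures in the list above can be checked with a little arithmetic. The sketch below is one reading of where they come from, not something the page states outright. It assumes a 6-bit major opcode field, and that reserving a pair of major opcodes hands back one bit, because the choice between the two opcodes carries information. It also assumes the SV-P48 and SV-P64 prefixes are what remains after a 32-bit base instruction. The helper name `free_bits` is hypothetical.

```python
MAJOR_OPCODE_BITS = 6  # Power ISA primary opcode field (bits 0:5)


def free_bits(field_bits, num_major_opcodes):
    """Bits left over in a field that begins with a major opcode.

    ASSUMPTION: num_major_opcodes is a power of two, and the choice
    between the reserved opcodes is usable as extra encoding bits.
    """
    if num_major_opcodes < 1 or num_major_opcodes & (num_major_opcodes - 1):
        raise ValueError("num_major_opcodes must be a power of two")
    selector_bits = num_major_opcodes.bit_length() - 1
    return field_bits - MAJOR_OPCODE_BITS + selector_bits


# 16-bit Compressed: the whole instruction is the 16-bit field.
compressed_bits = free_bits(16, 2)

# ASSUMPTION: SV-P48 / SV-P64 = prefix + 32-bit base instruction,
# so the prefix field is 16 and 32 bits wide respectively.
svp48_bits = free_bits(48 - 32, 2)
svp64_bits = free_bits(64 - 32, 2)
```

Under these assumptions a single major opcode would leave only 10 bits in a 16-bit field, which is consistent with the "C 10bit" label used later on this page.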

With only 11 bits for 16-bit Compressed, it may be better to use the
opportunity to switch into "16 bit mode". Interestingly, SV-C32 could
likewise switch into the same mode.

VBLOCK can be added later by using further VSX-dedicated major opcodes
(EXT62, EXT60).

Current use of the major opcodes under consideration:

* EXT00 - unused (one instruction: attn)
* EXT01 - v3.1B prefix
* EXT02 - tdi
* EXT03 - twi
* EXT04 - vector/bcd
* EXT05 - unused
* EXT06 - vector
* EXT07 - mulli
* EXT09 - reserved
* EXT17 - unused (2 instructions: sc, scv)
* EXT22 - reserved sandbox
* EXT46 - lmw
* EXT47 - stmw
* EXT56 - lq
* EXT57 - vector ld
* EXT58 - ld (leave ok)
* EXT59 - FP (leave ok)
* EXT60 - vector
* EXT61 - st (leave ok)
* EXT62 - vector st
* EXT63 - FP (leave ok)

Potential allocations:

    | hword 0 | hword 1 | hword 2 | hword 3 |
    EXT00/01 - C 10bit -> 16bit
    EXT60/62 - VBLOCK
    EXT09/17 - SV-C32 and other SV-C
    EXT06/07 - SV-C32-Swizzle and other SV-C-Swizzle
    EXT02/03 - SV-P48
    EXT04/05 - SV-P64
    EXT56/57 - Predicated SV-P48
    EXT46/47 - Predicated SV-P64

Spare:

* EXT22
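The potential allocations above amount to a lookup from major opcode to encoding family. A minimal sketch of that lookup follows. It assumes the major opcode is the top 6 bits of a 32-bit instruction word, as in the standard Power ISA encoding. The names `ALLOCATIONS`, `major_opcode` and `family` are illustrative only, and anything not listed is simply treated as "Standard".

```python
# Pairs of major opcodes and the encoding family proposed for them,
# copied from the "Potential allocations" list.
ALLOCATIONS = {
    (0, 1): "C 10bit -> 16bit",
    (60, 62): "VBLOCK",
    (9, 17): "SV-C32 and other SV-C",
    (6, 7): "SV-C32-Swizzle and other SV-C-Swizzle",
    (2, 3): "SV-P48",
    (4, 5): "SV-P64",
    (56, 57): "Predicated SV-P48",
    (46, 47): "Predicated SV-P64",
}

# Flatten to one entry per major opcode for direct lookup.
FAMILY_BY_OPCODE = {
    opcode: name for pair, name in ALLOCATIONS.items() for opcode in pair
}


def major_opcode(word32):
    """Major opcode: the 6 most significant bits of a 32-bit word."""
    return (word32 >> 26) & 0x3F


def family(word32):
    """Encoding family for a word; unlisted opcodes stay 'Standard'."""
    return FAMILY_BY_OPCODE.get(major_opcode(word32), "Standard")
```

With this table, 16 of the 64 major opcodes are claimed, and EXT22 falls through to "Standard" because it is listed as spare.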

## C10/16 FSM

    if EXT == 00/01:
        start @ 10bit
    if state==10bit:
        if bit15:
            next = 16bit
        else:
            next = Standard
    if state==16bit:
        if bit0 & bit15:
            insn = C.immediate
        if ~bit15:
            if ~bit0:
                next = Standard
            else:
                next = Standard.then.16bit
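The state machine above can be written as a small transition function. This is a sketch under stated assumptions, not a definitive decoder. The names `State`, `start_state` and `next_state` are hypothetical. `bit0` and `bit15` are passed in as booleans so that no bit-numbering convention has to be assumed. The pseudocode gives no transition when `bit15` is set in the 16bit state, so the sketch assumes the decoder stays in 16bit in that case.

```python
from enum import Enum


class State(Enum):
    STANDARD = "Standard"
    TEN_BIT = "10bit"
    SIXTEEN_BIT = "16bit"
    STANDARD_THEN_16BIT = "Standard.then.16bit"


C_OPCODES = {0, 1}  # EXT00/01


def start_state(ext):
    """State for an instruction whose major opcode is ext."""
    return State.TEN_BIT if ext in C_OPCODES else State.STANDARD


def next_state(state, bit0, bit15):
    """State for the following instruction, per the C10/16 pseudocode."""
    if state is State.TEN_BIT:
        return State.SIXTEEN_BIT if bit15 else State.STANDARD
    if state is State.SIXTEEN_BIT:
        if not bit15:
            return State.STANDARD_THEN_16BIT if bit0 else State.STANDARD
        # bit15 set: the pseudocode only says that bit0 & bit15 selects
        # C.immediate. ASSUMPTION: the decoder remains in 16bit.
        return State.SIXTEEN_BIT
    # Standard and Standard.then.16bit have no transitions in the pseudocode.
    raise ValueError(f"no transition defined for {state}")


def is_c_immediate(state, bit0, bit15):
    """True when the 16bit-state instruction is a C.immediate."""
    return state is State.SIXTEEN_BIT and bit0 and bit15
```

For example, an instruction with major opcode 00 starts in `10bit`, and with `bit15` set the next instruction is decoded in `16bit`.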

## SV-Compressed FSM

    if EXT == 09/17:
        if bit0:
            SV.mode =

# Major opcode map

Table 9: Primary Opcode Map (opcode bits 0:5)

        | 000    | 001    | 010    | 011    | 100    | 101    | 110    | 111    |
    000 |        |        | tdi    | twi    | EXT04  |        |        | mulli  | 000
    001 | subfic |        | cmpli  | cmpi   | addic  | addic. | addi   | addis  | 001
    010 | bc/l/a | EXT17  | b/l/a  | EXT19  | rlwimi | rlwinm |        | rlwnm  | 010
    011 | ori    | oris   | xori   | xoris  | andi.  | andis. | EXT30  | EXT31  | 011
    100 | lwz    | lwzu   | lbz    | lbzu   | stw    | stwu   | stb    | stbu   | 100
    101 | lhz    | lhzu   | lha    | lhau   | sth    | sthu   | lmw    | stmw   | 101
    110 | lfs    | lfsu   | lfd    | lfdu   | stfs   | stfsu  | stfd   | stfdu  | 110
    111 | lq     | EXT57  | EXT58  | EXT59  | EXT60  | EXT61  | EXT62  | EXT63  | 111
        | 000    | 001    | 010    | 011    | 100    | 101    | 110    | 111    |
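In the map above, rows are labelled by the upper three bits of the 6-bit major opcode and columns by the lower three. A short sketch for locating an opcode in the table (the helper name `map_position` is hypothetical):

```python
def map_position(opcode):
    """Return (row, column) labels of a major opcode in the map.

    The row is the upper 3 bits and the column the lower 3 bits of
    the 6-bit opcode, each formatted as a 3-digit binary string.
    """
    if not 0 <= opcode <= 63:
        raise ValueError("major opcode must be in the range 0..63")
    return format(opcode >> 3, "03b"), format(opcode & 0b111, "03b")
```

For instance, mulli (EXT07) sits in row 000, column 111, and lmw (EXT46) in row 101, column 110.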

# LE/BE complications

See <https://bugs.libre-soc.org/show_bug.cgi?id=529> for discussion.

In LE mode the Major Opcode sits at the opposite end of the sequential
byte order when read from memory. One solution that allows 16 and 48 bit
instructions to co-exist with 32 bit ones is to look at bytes 2 and 3
*before* looking at bytes 0 and 1.

Option 1:

A 16 bit instruction would therefore be in bytes 2 and 3, and would be
removed from the instruction stream *ahead* of bytes 0 and 1, which would
remain where they were. The next instruction would repeat the analysis,
starting now instead at the *new* bytes 2 and 3.

A 48 bit instruction would again use bytes 2 and 3 to read the major
opcode, and would extract bytes 0 through 5 from the stream. However, the
48 bit instruction would be constructed from bytes 2,3,0,1,4,5. Again,
after these 6 bytes were extracted from the stream, the analysis would
begin again for the next instruction at bytes 2 and 3.
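The extraction procedure of Option 1 can be sketched as a loop over a byte buffer. Several things here are assumptions rather than part of the proposal. The names `extract_option1` and `example_length_of` are hypothetical. The example length rule takes the major opcode from the top 6 bits of byte 3, and treats EXT00/01 as 16 bit and EXT02/03 as 48 bit, following the potential allocations listed earlier. A 32 bit instruction is assumed to be taken as bytes 0 to 3 unchanged, which the text does not state.

```python
def example_length_of(byte2, byte3):
    """Hypothetical length rule based on the major opcode.

    ASSUMPTION: in LE mode the major opcode is the top 6 bits of byte 3.
    """
    opcode = byte3 >> 2
    if opcode in (0, 1):  # EXT00/01: 16-bit Compressed
        return 2
    if opcode in (2, 3):  # EXT02/03: SV-P48
        return 6
    return 4              # everything else: standard 32 bit


def extract_option1(stream, length_of=example_length_of):
    """Split a byte stream into instructions by inspecting bytes 2 and 3.

    Returns (instructions, leftover_bytes).
    """
    buf = bytearray(stream)
    insns = []
    while len(buf) >= 4:
        length = length_of(buf[2], buf[3])
        if length == 2:
            # Bytes 2-3 are removed; bytes 0-1 stay where they are.
            insns.append(bytes(buf[2:4]))
            del buf[2:4]
        elif length == 4:
            # ASSUMPTION: a 32 bit instruction is bytes 0-3 as-is.
            insns.append(bytes(buf[0:4]))
            del buf[0:4]
        elif length == 6:
            if len(buf) < 6:
                break  # incomplete 48 bit instruction
            # Constructed from bytes 2,3,0,1,4,5.
            insns.append(bytes(buf[2:4] + buf[0:2] + buf[4:6]))
            del buf[0:6]
        else:
            raise ValueError(f"unsupported length {length}")
    return insns, bytes(buf)


insns, rest = extract_option1(bytes.fromhex("aabb11002280010203080506"))
```

In this example stream the 16 bit instruction `11 00` is removed first, which brings `22 80` into bytes 2 and 3 so that the second instruction is the 32 bit word `aa bb 22 80`. The remaining six bytes form one 48 bit instruction.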

Option 2:

When reading from memory, before handing to the instruction decoder, bytes
0 and 1 are swapped unconditionally with bytes 2 and 3. Effectively this
is near-identical to LE/BE byte-level swapping on a 32-bit block, except
that this time it is half-word (16 bit) swapping on a 32-bit block.

With the Major Opcode then always being in the first 2 bytes, it becomes
much simpler for the pre-analysis phase to determine instruction length,
regardless of what that length is (16/32/48/64/VBLOCK).
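The half-word swap of Option 2 can be illustrated next to the familiar byte-level swap. This is a minimal sketch: the function names are hypothetical, and the input is assumed to be a whole number of 32-bit blocks.

```python
def swap_halfwords(data):
    """Swap bytes 0-1 with bytes 2-3 in every 32-bit block."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    out = bytearray()
    for i in range(0, len(data), 4):
        out += data[i + 2:i + 4] + data[i:i + 2]
    return bytes(out)


def swap_bytes32(data):
    """Ordinary LE/BE byte-level swap of every 32-bit block."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    out = bytearray()
    for i in range(0, len(data), 4):
        out += data[i:i + 4][::-1]
    return bytes(out)
```

For a block holding bytes `00 01 02 03`, the half-word swap yields `02 03 00 01`, whereas the byte-level swap yields `03 02 01 00`. Both operations are their own inverse.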

Option 3:

Just as in VLE, require instructions to be in BE order. Data, which has
nothing to do with instruction order, may optionally remain in LE order.

## Why does VLE use a separate 64k page?

VLE requires that the memory page be marked as VLE-encoded. It also
requires that the instructions be in BE order even when 32 bit standard
opcodes are mixed in.

Questions:

* What would happen without the page being marked, when attempting to call ppc64le ABI code?
* How would ppc64le code in the same page be distinguished from SVPrefix code?

The answers are that it is either impossible, or that it requires a
special mode-switching instruction to be called on entry to and exit from
functions, transitioning to and from ppc64le mode.

This transition may be achieved very simply by marking the 64k page.

# 16 bit Compressed

See [[16_bit_compressed]]