1 # 16 bit Compressed
2
3 Similar to VLE (but without immediate-prefixing) this encoding is designed
4 to fit on top of OpenPOWER ISA v3.0B when a "Modeswitch" bit is set (PCR
5 is recommended). Note that Compressed is *mutually exclusively incompatible*
6 with OpenPOWER v3.1B "prefixing" due to using (requiring) both EXT000
7 and EXT001. Hypothetically it could be made to use anything other than
8 EXT001, with some inconvenience (extra gates). The incompatibility is
9 "fixed" by swapping out of "Compressed" Mode and back into "Normal"
10 (v3.1B) Mode, at runtime, as needed.
11
12 Although initially intended to be augmented by Simple-V Prefixing (to
13 add Vector context, width overrides, e.g. IEEE754 FP16, and predication) without putting pressure on I-Cache power
14 or size, this Compressed Encoding is not critically dependent
15 *on* SV Prefixing, and may be used stand-alone.
16
17 See:
18
19 * <https://bugs.libre-soc.org/show_bug.cgi?id=238>
20 * <https://ftp.libre-soc.org/VLE_314-68105.pdf> VLE Encoding
21 * <http://lists.mailinglist.openpowerfoundation.org/pipermail/openpower-hdl-cores/2020-November/000210.html>
22
23 This one is a conundrum. OpenPOWER ISA was never designed with 16
24 bit in mind. VLE was added 10 years ago but only by way of marking
25 an entire 64k page as "VLE". With VLE not maintained it is not
26 fully compatible with current PowerISA.
27
28 Here, where 16 bit is embedded into a predominantly 32 bit stream,
29 using an entire 16 bits just to switch into Compressed mode
30 is itself a significant overhead. The situation is made worse by 6 bits
31 being taken up by Major Opcode space, leaving only 10 bits to allocate
32 to actual instructions.
33
34 Contrast this with RVC which takes 3 out of 4
35 combinations of the first 2 bits for indicating 16-bit (anything with 0b00 to 0b10 in the LSBs), and uses the 4th as a Huffman-style escape-sequence, easily allowing standard 32 bit and 16 bit to intermingle cleanly. To achieve the same thing on OpenPOWER would require a whopping 24 6-bit Major Opcodes which is clearly impractical: other schemes need to be devised.
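
The RVC length rule described above can be sketched in a few lines. This is a minimal illustration, assuming the standard RISC-V convention that the two least-significant bits of the first halfword select the instruction length; the function name `rvc_is_16bit` is hypothetical.

```python
def rvc_is_16bit(halfword: int) -> bool:
    """Return True if the halfword starts a 16-bit RVC instruction.

    0b00, 0b01 and 0b10 in the two LSBs mean "16-bit";
    0b11 is the escape to 32-bit (or longer) encodings.
    """
    return (halfword & 0b11) != 0b11
```

No equivalent test exists for OpenPOWER, where the length-determining field is the 6-bit Major Opcode rather than a 2-bit field.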
36
37 In addition we would like to add SV-C32 which is a Vectorised version
38 of 16 bit Compressed, and ideally have a variant that adds the 27-bit
39 prefix format from SV-P64, as well.
40
41 Potential ways to reduce pressure on the 16 bit space are:
42
43 * To use more than one v3.0B Major Opcode, preferably an odd-even
44 contiguous pair
45 * To provide "paging". This involves bank-switching to alternative optimised encodings for specific workloads
46 * To enter "16 bit mode" for durations specified at the start
47 * To reserve one bit of every 16 bit instruction to indicate that the 16 bit mode is to continue to be sustained
48
49 This latter would be useful in the Vector context to have an alternative
50 meaning: as the bit which determines whether the instruction is 11-bit
51 prefixed or 27-bit prefixed:
52
53 0 1 2 3 4 5 6 7 8 9 a b c d e f |
54 |major op | 11 bit vector prefix|
55 |16 bit opcode alt vec. mode ^ |
56 | extra vector prefix if alt set|
57
58 Using a major opcode to enter 16 bit mode leaves 11 bits
59 for which a use must be found:
60
61 0 1 2 3 4 5 6 7 8 9 a b c d e f |
62 |major op | what to do here 1 |
63 |16 bit stay in 16bit mode 1 |
64 |16 bit stay in 16bit mode 1 |
65 |16 bit exit 16bit mode 0 |
66
67 One possibility is that the 11 bits are used for bank selection, with
68 some room for additional context such as altering the registers used
69 for the 16 bit operations (bank selection of which scalar regs).
70 However the downside is that short sequences of Compressed instructions
71 become penalised by the fixed overhead. Even a single 16 bit instruction requires a 16 bit overhead to "gain access" to 16 bit "mode", making the exercise pointless.
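
The fixed-overhead argument can be made concrete with a little arithmetic. The sketch below assumes (purely for illustration) that every Compressed instruction replaces exactly one 32-bit instruction and that entering 16 bit mode costs one 16-bit halfword; `stream_bits` is a hypothetical helper name.

```python
def stream_bits(n_compressed, entry_overhead_bits=16):
    """Return (bits_using_compressed, bits_using_32bit_only) for a run
    of n_compressed instructions.

    Assumption: each 16-bit instruction stands in for one 32-bit one,
    and one 16-bit halfword is spent entering 16 bit mode.
    """
    compressed = entry_overhead_bits + 16 * n_compressed
    uncompressed = 32 * n_compressed
    return compressed, uncompressed


for n in (1, 2, 4, 8):
    c, u = stream_bits(n)
    print(f"{n} instructions: {c} bits compressed, {u} bits standard")
```

For a run of one instruction both figures are 32 bits, so nothing is saved; the saving only appears from two instructions onwards.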
72
73 An alternative is to use the first 11 bits for only the most commonly used
74 instructions. That being the case then one of those 11 bits could
75 be dedicated to saying if 16 bit mode is to be continued, at which
76 point *all* 16 bits can be used for Compressed.
77 10 bits remain for actual opcodes, which is ridiculously tight,
78 however the opportunity to subsequently use all 16 bits is worth it.
79
80 The reason for picking 2 contiguous Major v3.0B opcodes is illustrated below:
81
82 |0 1 2 3 4 5 6 7 8 9 a b c d e f|
83 |major op..0| LO Half C space |
84 |major op..1| HI Half C space |
85 |N N N N N|<--11 bits C space-->|
86
87 If NNNNN is the same value (two contiguous Major v3.0B Opcodes) this saves gates at a critical part of the decode phase.
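
A minimal sketch of the check this enables, assuming MSB0 bit numbering (bit 0 is the most significant bit of the halfword) and assuming the pair chosen is EXT000/EXT001 so that NNNNN is all zeros. The function names are hypothetical.

```python
def split_halfword(hw):
    """Split a 16-bit halfword into (NNNNN, 11-bit C space).

    MSB0 numbering: bits 0-4 are the top five bits, bits 5-f the rest.
    """
    assert 0 <= hw <= 0xFFFF
    return hw >> 11, hw & 0x7FF


def is_compressed_major(hw, nnnnn_expected=0b00000):
    """True if the top five bits match the shared prefix of the pair.

    EXT000 (0b000000) and EXT001 (0b000001) differ only in their 6th
    bit, so a single 5-bit comparison covers both Major Opcodes.
    """
    nnnnn, _ = split_halfword(hw)
    return nnnnn == nnnnn_expected
```

With a non-contiguous pair, two separate 6-bit comparisons would be needed instead of one 5-bit comparison.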
88
89 # Opcode Allocation Ideas
90
91 * one bit from the 16-bit mode is used to indicate that standard
92 (v3.0B) mode is to be dropped into for only one single instruction
93 <https://bugs.libre-soc.org/show_bug.cgi?id=238#c2>
94
95 ## Opcodes exploration (Attempt 1)
96
97 Switching between different encoding modes is controlled by M (alone)
98 in 10-bit mode, and M and N in 16-bit mode.
99
100 * M in 10-bit mode if zero indicates that following instructions are
101 standard OpenPOWER ISA 32-bit encoded (including, redundantly,
102 further 10/16-bit instructions)
103 * M in 10-bit mode if 1 indicates that following instructions are
104 in 16-bit encoding mode
105
106 Once in 16-bit mode:
107
108 * 0b01 (M=1, N=0): stay in 16-bit mode
109 * 0b00: leave 16-bit mode permanently (return to standard OpenPOWER ISA)
110 * 0b10: leave 16-bit mode for one cycle (return to standard OpenPOWER ISA)
111 * 0b11: free to be used for something completely different.
112
113 The current "top" idea for 0b11 is to use it for a new encoding format
114 of predominantly "immediates-based" 16-bit instructions (branch-conditional,
115 addi, mulli etc.)
116
117 * The Compressed Major Opcode is in bits 5-7.
118 * Minor opcode in bit 8.
119 * In some cases bit 9 is taken as an additional sub-opcode, followed
120 by bits 0-4 (for CR operations)
121 * M+N mode-switching is not available for C-Major.minor 0b001.1
122 * 10 bit mode may be expanded by 16 bit mode, adding capabilities
123 that do not fit in the extremely limited space.
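
The bit positions listed above use MSB0 numbering (bit 0 is the most significant bit, bit f the least). A small sketch of how the common fields could be extracted from a 16-bit halfword follows; the helper names `bits` and `c_fields` are hypothetical, and N is only meaningful in 16-bit mode (in 10-bit mode bit 0 belongs to the EXT000/EXT001 Major Opcode).

```python
def bits(hw, start, end):
    """Extract bits start..end (inclusive, MSB0 numbering) of a halfword."""
    width = end - start + 1
    return (hw >> (15 - end)) & ((1 << width) - 1)


def c_fields(hw):
    """Return the mode-switch bits and the C major/minor opcode."""
    return {
        "N": bits(hw, 0, 0),       # 16-bit mode only
        "Cmaj": bits(hw, 5, 7),    # Compressed Major Opcode
        "minor": bits(hw, 8, 8),   # Minor opcode
        "M": bits(hw, 15, 15),     # bit f
    }
```

For example, `c_fields(0x0781)` gives Cmaj 0b111, minor 1 and M 1, which corresponds to a 10-bit LDST form with M set.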
124
125 Mode-switching FSM showing relationship between v3.0B, C 10bit and C 16bit.
126 16-bit immediate mode remains in 16-bit.
127
128 | 0 | 1234 | 567 8 | 9abcde | f | explanation
129 | EXT000/1 | Cmaj.m | fields | 0 | 10bit then v3.0B
130 | EXT000/1 | Cmaj.m | fields | 1 | 10bit then 16bit
131 | 0 | flds | Cmaj.m | fields | 0 | 16bit then v3.0B
132 | 0 | flds | Cmaj.m | fields | 1 | 16bit then 16bit
133 | 1 | flds | Cmaj.m | fields | 0 | 16b then 1x v3.0B
134 | 1 | flds | Cmaj.m | fields | 1 | 16b/imm then 16bit
135
136 Notes:
137
138 * Cmaj.m is the C major/minor opcode: 3 bits for major, 1 for minor
139 * EXT000 and EXT001 are v3.0B Major Opcodes. The first 5 bits
140 are zero, therefore the 6th bit is actually part of Cmaj.
141 * "10bit then 16bit" means "this instruction is encoded C 10bit
142 and the following one in C 16bit"
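
The mode-switching table can be read as a small state machine. The sketch below is one possible reading, not a definitive implementation: state names are invented for illustration, bit 0 is taken as N and bit f as M (MSB0 numbering), and the C 0b001.1 sub-encoding (where N is not available) is deliberately ignored.

```python
V30B = "v3.0B"
V30B_ONCE = "v3.0B for one instruction, then C16"
C10 = "C 10bit"
C16 = "C 16bit"


def next_encoding(current, hw):
    """Return the encoding of the instruction following halfword hw.

    current is the encoding that hw itself uses (C10 or C16).
    """
    n = (hw >> 15) & 1  # bit 0, only meaningful in 16-bit mode
    m = hw & 1          # bit f
    if current == C10:
        return C16 if m else V30B
    if current == C16:
        if m:
            # N=0: ordinary 16bit; N=1: 16bit immediate form.
            # Either way the next instruction is 16bit.
            return C16
        return V30B_ONCE if n else V30B
    raise ValueError("only C 10bit and C 16bit halfwords carry M/N")
```

Entry into C 10bit from v3.0B is not modelled here: it is implied by decoding a Major Opcode of EXT000 or EXT001.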
143
144 ### C Instruction Encoding types
145
146 10-bit Opcode formats (all start with v3.0B EXT000 or EXT001
147 Major Opcodes)
148
149 | 01234 | 567 8 | 9 | a b | c | d e | f | enc
150 | E01 | Cmaj.m | fld1 | fld2 | M | 10b
151 | E01 | Cmaj.m | offset | M | 10b b
152 | E01 | 001.1 | S1 | fd1 | S2 | fd2 | M | 10b sub
153 | E01 | 111.m | fld1 | fld2 | M | 10b LDST
154
155 16-bit Opcode formats (including 10/16/v3.0B Switching)
156
157 | 0 | 1234 | 567 8 | 9 | a b | c | d e | f | enc
158 | N | immf | Cmaj.m | fld1 | fld2 | M | 16b
159 | 1 | immf | Cmaj.m | fld1 | imm | 1 | 16b imm
160 | fd3 | 001.1 | S1 | fd1 | S2 | fd2 | M | 16b sub
161 | N | fd4 | 111.m | fld1 | fld2 | M | 16b LDST
162
163 Notes:
164
165 * fld1 and fld2 can contain reg numbers, immediates, or opcode
166 fields (BO, BI, LK)
167 * S1 and S2 are further sub-selectors of C 001.1
168
169 ### Immediate Opcodes
170
171 Only available in 16-bit mode, and only when M=1 and N=1
172 and Cmaj.m is not 0b001.1.
173
174 Instruction counts from objdump on /bin/bash:
175
176 466 extsw r1,r1
177 649 stw r1,1(r1)
178 691 lwz r1,1(r1)
179 705 cmpdi r1,1
180 791 cmpwi r1,1
181 794 addis r1,r1,1
182 1474 std r1,1(r1)
183 1846 li r1,1
184 2031 mr r1,r1
185 2473 addi r1,r1,1
186 3012 nop
187 3028 ld r1,1(r1)
188
189
190 | 0 | 1 | 2 | 3 4 | | 567.8 | 9ab | cde | f |
191 | 1 | 0 | 0 0 0 | | 001.0 | | 000 | 1 | TBD
192 | 1 | 0 | sh2 | | 001.0 | RA | sh | 1 | sradi.
193 | 1 | 1 | 0 0 0 | | 001.0 | | 000 | 1 | TBD
194 | 1 | 1 | 0 | sh2 | | 001.0 | RA | sh | 1 | srawi.
195 | 1 | 1 | 1 | | | 001.0 | | | 1 | TBD
196 | 1 | i2 | RT | | 010.0 | RA|0 | imm | 1 | addi
197 | 1 | 0 | i2 | | 010.1 | RA | imm | 1 | cmpdi
198 | 1 | 1 | i2 | | 010.1 | RA | imm | 1 | cmpwi
199 | 1 | 0 | i2 | | 011.0 | RT!=1| imm | 1 | ldspi
200 | 1 | 1 | i2 | | 011.0 | RT!=1| imm | 1 | lwspi
201 | 1 | 0 | i2 | | 011.1 | RT!=1| imm | 1 | stwspi
202 | 1 | 1 | i2 | | 011.1 | RT!=1| imm | 1 | stdspi
203 | 1 | | | 011.0 | 001 | | 1 | TBD
204 | 1 | | | 011.1 | 001 | | 1 | TBD
205 | 1 | i2 | | 100.0 | RT | imm | 1 | stwi
206 | 1 | i2 | | 100.1 | RT | imm | 1 | stdi
207 | 1 | i2 | | 101.0 | RA | imm | 1 | ldi
208 | 1 | i2 | | 101.1 | RA | imm | 1 | lwi
209 | 1 | i2 | RA | | 110.0 | RT | imm | 1 | fsti
210 | 1 | i2 | RA | | 110.1 | RT | imm | 1 | fstdi
211 | 1 | i2 | RT | | 111.0 | RA | imm | 1 | flwi
212 | 1 | i2 | RT | | 111.1 | RA | imm | 1 | fldi
213
214 Construction of immediate:
215
216 * LD/ST r1 (SP) variants should be offset by -256
217 see <https://bugs.libre-soc.org/show_bug.cgi?id=238#c43>
218 - SP variants map to e.g ld RT, imm(r1)
219 - SV Prefixing can be used to map r1 to alternate regs
220 * [1] not the same as v3.0B addis: the shift amount is smaller and actually
221 still maps to within the v3.0B addi immediate range.
222 * addi is EXTS(i2||imm) to give a 4-bit range -8 to +7
223 * addis is EXTS(i2||imm||000) to give an 11-bit range -1024 to +1023 in increments of 8
224 * all others are EXTS(i2||imm) to give a 7-bit range -64 to +63
225 (further for LD/ST due to word/dword-alignment)
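
The EXTS(i2||imm) construction can be sketched as follows. This is a minimal illustration for the addi case only, assuming a 1-bit i2 as the most significant bit and a 3-bit imm as the least significant bits; `exts` and `addi_immediate` are hypothetical helper names.

```python
def exts(value, width):
    """Sign-extend the low `width` bits of value (two's complement)."""
    value &= (1 << width) - 1
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def addi_immediate(i2, imm):
    """Build EXTS(i2 || imm) for addi: i2 is 1 bit, imm is 3 bits."""
    return exts((i2 << 3) | imm, 4)
```

With these definitions `addi_immediate(1, 0)` is -8 and `addi_immediate(0, 7)` is +7, matching the 4-bit range given for addi.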
226
227 Further Notes:
228
229 * bc also has an immediate mode, listed separately below in Branch section
230 * for LD/ST, offset is aligned. 8-byte: i2||imm||0b000, 4-byte: i2||imm||0b00
231 * SV Prefix overrides help provide alternative bitwidths for LD/ST
232 * RA|0: if RA is zero, addi becomes "li"
233 - this only works if RT takes part of opcode
234 - mv is also possible by specifying an immediate of zero
235
236 ### Illegal and nop
237
238 Note that illeg is all zeros, including in the 16-bit mode.
239 Given that C is allocated to OpenPOWER ISA Major opcodes EXT000 and
240 EXT001 this ensures that in both 10-bit *and* 16-bit mode, a 16-bit
241 run of all zeros is considered "illegal" whilst 0b0000.0000.1000.0000
242 is "nop"
243
244 | 16-bit mode | | 10-bit mode |
245 | 0 | 1 | 234 | | 567.8 | 9 ab | c de | f |
246 | 0 | 0 000 | | 000.0 | 0 00 | 0 00 | 0 | illeg
247 | 0 | 0 000 | | 000.0 | 0 00 | 0 00 | 1 | nop
248
249 16 bit mode only:
250
251 | 1 | 0 000 | | 000.0 | 0 00 | 0 00 | 0 | nop
252 | 1 | nonzero | | 000.0 | 0 00 | 0 00 | 0 | TBD
253
254 Notes:
255
256 * All-zeros being an illegal instruction is normal for ISAs. Ensuring that
257 this remains true at all times i.e. for both 10 bit and 16 bit mode is
258 common sense.
259 * The 10-bit nop (bit 15, M=1) is intended for circumstances
260 where alignment to 32-bit before returning to v3.0B is required.
261 M=1 being an indication "return to Standard v3.0B Encoding Mode".
262 * The 16-bit nop (bit 0, N=1) is intended for circumstances where a
263 return to Standard v3.0B Encoding is required for one cycle,
264 and where that cycle must be aligned to a 32-bit boundary.
265 An example would be returning to "strict" (non-C) mode,
266 where the PC must be word-aligned.
267 * If for any reason multiple 16 bit nops are needed in succession
268 the M=1 variant can be used, because each one returns to
269 Standard v3.0B Encoding Mode, each time.
270
271 In essence the 2 nops are needed due to there being 2 different C forms: 10 and 16 bit.
272
273 ### Branch
274
275 | 16-bit mode | | 10-bit mode |
276 | 0 | 1 | 234 | | 567.8 | 9 ab | c de | f |
277 | N | offs2 | | 000.LK | offs!=0 | M | b, bl
278 | 1 | offs2 | | 000.LK | BI | BO1 oo | 1 | bc, bcl
279 | N | BO3 BI3 | | 001.0 | LK BI | BO | M | bclr, bclrl
280
281 16 bit mode:
282
283 * bc only available when N,M=0b11
284 * offs2 extends offset in MSBs
285 * BI3 extends BI in MSBs to allow selection of full CR
286 * BO3 extends BO
287 * bc offset constructed from oo as LSBs and offs2 as MSBs
288 * bc BI allows selection of all bits from CR0 or CR1
289 * bc CR check is always active (as if BO0=1) therefore BO1 inverts
290
291 10 bit mode:
292
293 * illegal (all zeros) covers part of branch (offs=0,M=0,LK=0)
294 * nop also covers part of branch (offs=0,M=0,LK=1)
295 * bc **not available** in 10-bit mode
296 * BO[0] enables CR check, BO[1] inverts check
297 * BI refers to CR0 only (4 bits of)
298 * no Branch Conditional with immediate
299 * no Absolute Address
300 * CTR mode allowed with BO[2] for b only.
301 * offs is a signed offset, aligned to 2 bytes
302 * all branch targets are 2 byte aligned
303
304 ### LD/ST
305
306 | 16-bit mode | | 10-bit mode |
307 | 0 | 1 | 2 3 4 | | 567.8 | 9 a b | c d e | f |
308 | RA2 | SZ | RB | | 001.1 | 1 RA | 0 RT | M | st
309 | RA2 | SZ | RB | | 001.1 | 1 RA | 1 RT | M | fst
310 | N | SZ | RT | | 111.0 | RA | RB | M | ld
311 | N | SZ | RT | | 111.1 | RA | RB | M | fld
312
313 * elwidth overrides can set different widths
314
315 16 bit mode:
316
317 * SZ=1 is 64 bit, SZ=0 is 32 bit
318 * RA2 extends RA to 3 bits (MSB)
319 * RT2 extends RT to 3 bits (MSB)
320
321 10 bit mode:
322
323 * RA and RB are only 2 bit (0-3)
324 * for LD, RT is implicitly RB: "ld RT=RB, RA(RB)"
325 * for ST, there is no offset: "st RT, RA(0)"
326
327 ### Arithmetic
328
329 | 16-bit mode | | 10-bit mode |
330 | 0 | 1 | 234 | | 567.8 | 9ab | c d e | f |
331 | N | 0 | RT | | 010.0 | RB | RA!=0 | M | add
332 | N | 0 | RT | | 010.1 | RB | RA|0 | M | sub.
333 | N | 0 | BF | | 011.0 | RB | RA|0 | M | cmpl
334
335 Notes:
336
337 * sub. and cmpl: default CR target is CR0
338 * for (RA|0) when RA=0 the input is a zero immediate,
339 meaning that sub. becomes neg. and cmp becomes cmpi against zero
340 * RT is implicitly RB: "add RT(=RB), RA, RB"
341 * Opcode 0b010.0 RA=0 is not missing from the above:
342 it is a system-wide instruction, "cbank" (section below)
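
As an illustration of how the 10-bit rows of the table and the (RA|0) notes combine, here is a hedged decoding sketch. It assumes MSB0 numbering with Cmaj.m in bits 5-8, RB in bits 9-b, RA in bits c-e and M in bit f; the function name and the returned labels (including "cmpi-zero" for the compare-against-zero case) are invented for this example.

```python
def decode_arith_10bit(hw):
    """Decode a 10-bit mode arithmetic halfword into (name, RB, RA, M).

    Returns None if hw is not in 10-bit mode (top five bits nonzero)
    or Cmaj.m is not one of the three arithmetic opcodes.
    """
    if hw >> 11:
        return None
    cmaj_m = (hw >> 7) & 0xF
    rb = (hw >> 4) & 0x7   # holds CBank instead when the result is cbank
    ra = (hw >> 1) & 0x7
    m = hw & 1
    if cmaj_m == 0b0100:
        name = "add" if ra != 0 else "cbank"
    elif cmaj_m == 0b0101:
        name = "sub." if ra != 0 else "neg."
    elif cmaj_m == 0b0110:
        name = "cmpl" if ra != 0 else "cmpi-zero"
    else:
        return None
    return name, rb, ra, m
```

For instance `decode_arith_10bit(0x222)` yields `("add", 2, 1, 0)`, while the same opcode with RA=0 (`0x210`) is reported as cbank.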
343
344 16 bit mode only:
345
346 | 0 | 1 | 234 | | 567.8 | 9ab | cde | f |
347 | N | 1 | RA | | 010.0 | RB | RS | 0 | sld.
348 | N | 1 | RA | | 010.1 | RB | RS!=0 | 0 | srd.
349 | N | 1 | RA | | 010.1 | RB | 000 | 0 | srad.
350 | N | 1 | BF | | 011.0 | RB | RA|0 | 0 | cmpw
351
352 Notes:
353
354 * for srad, RS=RA: "srad. RA(=RS), RS, RB"
355
356
357 ### Logical
358
359 | 16-bit mode | | 10-bit mode |
360 | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
361 | N | 0 | RT | | 100.0 | RB | RA!=0 | M | and
362 | N | 0 | RT | | 100.1 | RB | RA!=0 | M | nand
363 | N | 0 | RT | | 101.0 | RB | RA!=0 | M | or
364 | N | 0 | RT | | 101.1 | RB | RA!=0 | M | nor
365 | N | 0 | RT | | 100.0 | RB | 0 0 0 | M | extsw
366 | N | 0 | RT | | 100.1 | RB | 0 0 0 | M | cntlz
367 | N | 0 | RT | | 101.0 | RB | 0 0 0 | M | popcnt
368 | N | 0 | RT | | 101.1 | RB | 0 0 0 | M | not
369
370 16-bit mode only:
371
372 | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
373 | N | 1 | RT | | 100.0 | RB | RA!=0 | 0 | TBD
374 | N | 1 | RT | | 100.1 | RB | RA!=0 | 0 | TBD
375 | N | 1 | RT | | 101.0 | RB | RA!=0 | 0 | xor
376 | N | 1 | RT | | 101.1 | RB | RA!=0 | 0 | eqv (xnor)
377 | N | 1 | RT | | 100.0 | RB | 0 0 0 | 0 | extsb
378 | N | 1 | RT | | 100.1 | RB | 0 0 0 | 0 | cnttz
379 | N | 1 | RT | | 101.0 | RB | 0 0 0 | 0 | TBD
380 | N | 1 | RT | | 101.1 | RB | 0 0 0 | 0 | extsh
381
382 10 bit mode:
383
384 * for (RA|0) when RA=0 the input is a zero immediate,
385 meaning that nor becomes not
386 * cntlz, popcnt, exts **not available** in 10-bit mode
387 * RT is implicitly RB: "and RT(=RB), RA, RB"
388
389 ### Floating Point
390
391 Note here that elwidth overrides (SV Prefix) can be used to select FP16/32/64
392
393 | 16-bit mode | | 10-bit mode |
394 | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
395 | N | | RT | | 011.1 | RB | RA!=0 | M | fsub.
396 | N | 0 | RT | | 110.0 | RB | RA!=0 | M | fadd
397 | N | 0 | RT | | 110.1 | RB | RA!=0 | M | fmul
398 | N | 0 | RT | | 011.1 | RB | 0 0 0 | M | fneg.
399 | N | 0 | RT | | 110.0 | RB | 0 0 0 | M |
400 | N | 0 | RT | | 110.1 | RB | 0 0 0 | M |
401
402 16-bit mode only:
403
404 | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
405 | N | 1 | RT | | 011.1 | RB | RA!=0 | 0 |
406 | N | 1 | RT | | 110.0 | RB | RA!=0 | 0 |
407 | N | 1 | RT | | 110.1 | RB | RA!=0 | 0 | fdiv
408 | N | 1 | RT | | 011.1 | RB | 0 0 0 | 0 | fabs.
409 | N | 1 | RT | | 110.0 | RB | 0 0 0 | 0 | fmr.
410 | N | 1 | RT | | 110.1 | RB | 0 0 0 | 0 |
411
412 16 bit only, FP to INT convert (using C 0b001.1 subencoding)
413
414 | 0123 | 4 | | 567.8 | 9 ab | cde | f |
415 | 0010 | X | | 001.1 | 0 RA | Y RT | M | fp2int
416 | 0011 | X | | 001.1 | 0 RA | Y RT | M | int2fp
417
418 * X: signed=1, unsigned=0
419 * Y: FP32=0, FP64=1
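
The fp2int/int2fp rows can be decoded along the following lines. This is only a sketch under the field layout shown in the table (bits 0-3 selector, bit 4 X, Cmaj.m 0b001.1, bit 9 zero, RA in bits a-b, Y in bit c, RT in bits d-e, M in bit f, MSB0 numbering); the function name and dictionary keys are hypothetical.

```python
def decode_fp_convert(hw):
    """Decode a 16-bit fp2int / int2fp halfword, or return None."""
    selector = (hw >> 12) & 0xF   # bits 0-3
    cmaj_m = (hw >> 7) & 0xF      # bits 5-8, must be 0b001.1
    bit9 = (hw >> 6) & 1          # must be 0 for this group
    if cmaj_m != 0b0011 or bit9 != 0:
        return None
    names = {0b0010: "fp2int", 0b0011: "int2fp"}
    if selector not in names:
        return None
    return {
        "op": names[selector],
        "signed": bool((hw >> 11) & 1),  # X: signed=1, unsigned=0
        "fp64": bool((hw >> 3) & 1),     # Y: FP32=0, FP64=1
        "RA": (hw >> 4) & 0x3,
        "RT": (hw >> 1) & 0x3,
        "M": hw & 1,
    }
```

As a usage example, `decode_fp_convert(0x29AB)` describes a signed FP64 fp2int with RA=2, RT=1 and M=1.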
420
421 10 bit mode:
422
423 * fsub. fneg. and fmr. default target is CR1
424 * fmr. is **not available** in 10-bit mode
425 * fdiv is **not available** in 10-bit mode
426
427 16 bit mode:
428
429 * fmr. copies RB to RT (and sets CR1)
430
431 ### Condition Register
432
433 | 16-bit mode | | 10-bit mode |
434 | 0 1 2 3 | 4 | | 567.8 | 9 ab | cde | f |
435 | 0 0 0 0 | BF2 | | 001.1 | 0 BF | BFA | M | mcrf
436 | 0 0 0 1 | BA2 | | 001.1 | 0 BA | BB | M | crnor
437 | 0 1 0 0 | BA2 | | 001.1 | 0 BA | BB | M | crandc
438 | 0 1 1 0 | BA2 | | 001.1 | 0 BA | BB | M | crxor
439 | 0 1 1 1 | BA2 | | 001.1 | 0 BA | BB | M | crnand
440 | 1 0 0 0 | BA2 | | 001.1 | 0 BA | BB | M | crand
441 | 1 0 0 1 | BA2 | | 001.1 | 0 BA | BB | M | creqv
442 | 1 1 0 1 | BA2 | | 001.1 | 0 BA | BB | M | crorc
443 | 1 1 1 0 | BA2 | | 001.1 | 0 BA | BB | M | cror
444
445 10 bit mode:
446
447 * mcrf BF is only 2 bits which means the destination is only CR0-CR3
448 * CR operations: **not available** in 10-bit mode (but mcrf is)
449
450 16 bit mode:
451
452 * mcrf BF2 extends BF (in MSB) to 3 bits
453 * CR operations: destination register is same as BA.
454 * CR operations: only possible on CR0 and CR1
455
456 SV (Vector Mode):
457
458 * CR operations: greatly extended reach/range (useful for predicates)
459
460 ### System
461
462 cbank: Selection of Compressed-encoding "Bank". Different "banks"
463 give different meanings to opcodes. Example: CBank=0b001 is heavily
464 optimised for Audio/Video Encode/Decode. cbank borrows from add's encoding
465 space (when RA==0).
466
467 | 16-bit mode | | 10-bit mode |
468 | 0 | 1 2 3 4 | | 567.8 | 9ab | cde | f |
469 | N | 0 Bank2 | | 010.0 | CBank | 000 | M | cbank
470
471 **not available** in 10-bit mode:
472
473 | 0 1 2 3 | 4 | | 567.8 | 9 ab | cde | f |
474 | 1 1 1 1 | 0 | | 001.1 | 0 00 | RT | M | mtlr
475 | 1 1 1 1 | 0 | | 001.1 | 0 01 | RT | M | mtctr
476 | 1 1 1 1 | 0 | | 001.1 | 0 11 | RT | M | mtcr
477 | 1 1 1 1 | 1 | | 001.1 | 0 00 | RA | M | mflr
478 | 1 1 1 1 | 1 | | 001.1 | 0 01 | RA | M | mfctr
479 | 1 1 1 1 | 1 | | 001.1 | 0 11 | RA | M | mfcr
480
481 ### Unallocated
482
483 | 0 1 2 3 | 4 | | 567.8 | 9 ab | cde | f |
484 | 0 1 0 1 | | | 001.1 | 0 | | M |
485 | 1 0 1 0 | | | 001.1 | 0 | | M |
486 | 1 0 1 1 | | | 001.1 | 0 | | M |
487 | 1 1 0 0 | | | 001.1 | 0 | | M |
488 | 1 1 1 1 | | | 001.1 | 0 10 | | M |