# 16 bit Compressed
+Similar to VLE (but without immediate-prefixing), this encoding is designed
+to fit on top of OpenPOWER ISA v3.0B when a "Modeswitch" bit is set (PCR
+is recommended). Note that Compressed is *mutually exclusive*
+with OpenPOWER v3.1B "prefixing" because it uses (and requires) both EXT000
+and EXT001. Hypothetically it could be made to use something other than
+EXT001, at the cost of some inconvenience (extra gates). The incompatibility is
+"fixed" by swapping out of "Compressed" Mode and back into "Normal"
+(v3.1B) Mode, at runtime, as needed.
+
+Although initially intended to be augmented by Simple-V Prefixing, to
+add Vector context and predication yet not put pressure on I-Cache power
+or size, this Compressed Encoding is not critically dependent
+*on* SV Prefixing, and may be used stand-alone.
+
See:
* <https://bugs.libre-soc.org/show_bug.cgi?id=238>
# Opcode Allocation Ideas
+* one bit from the 16-bit mode is used to indicate that 32-bit mode
+ is to be dropped into for only one single instruction
+ <https://bugs.libre-soc.org/show_bug.cgi?id=238#c2>
+
## Opcodes exploration (Attempt 1)
+Switching between different encoding modes is controlled by M (alone)
+in 10-bit mode, and M and N in 16-bit mode.
+
+* In 10-bit mode, M=0 indicates that the following instructions are
+ standard OpenPOWER ISA 32-bit encoded (including, redundantly,
+ further 10/16-bit instructions)
+* In 10-bit mode, M=1 indicates that the following instructions are
+ in 16-bit encoding mode
+
+Once in 16-bit mode:
+
+* 0b01 (M=1, N=0): stay in 16-bit mode
+* 0b00: leave 16-bit mode permanently (return to standard OpenPOWER ISA)
+* 0b10: leave 16-bit mode for one instruction only (return to standard OpenPOWER ISA for that instruction)
+* 0b11: free to be used for something completely different.
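
The mode-switching rules above can be summarised as a small lookup. This is only a sketch: the `Mode` names and both function names are hypothetical, and the pair is written as (N, M) to match the "0b01 (M=1, N=0)" notation used in the list.

```python
from enum import Enum


class Mode(Enum):
    STANDARD_32 = "standard OpenPOWER ISA 32-bit"
    COMPRESSED_16 = "16-bit compressed"
    ONE_SHOT_32 = "32-bit for one instruction only"
    UNASSIGNED_0B11 = "0b11: free, not a mode switch"


def after_10bit(m: int) -> Mode:
    """What M signals on a 10-bit instruction."""
    return Mode.COMPRESSED_16 if m else Mode.STANDARD_32


def after_16bit(n: int, m: int) -> Mode:
    """What the (N, M) pair signals on a 16-bit instruction."""
    table = {
        (0, 1): Mode.COMPRESSED_16,    # 0b01: stay in 16-bit mode
        (0, 0): Mode.STANDARD_32,      # 0b00: leave 16-bit mode
        (1, 0): Mode.ONE_SHOT_32,      # 0b10: leave temporarily
        (1, 1): Mode.UNASSIGNED_0B11,  # 0b11: free for another use
    }
    return table[(n, m)]
```

Keeping the four cases in one table makes it easy to see that every (N, M) combination has exactly one meaning.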
+
+The current "top" idea for 0b11 is to use it for a new encoding format
+of predominantly "immediates-based" 16-bit instructions (branch-conditional,
+addi, mulli etc.)
+
+* The Compressed Major Opcode is in bits 5-7.
+* Minor opcode in bit 8.
+* In some cases bit 9 is taken as an additional sub-opcode, followed
+ by bits 0-4 (for CR operations)
+* M+N mode-switching is not available for C-Major 0b001 or 0b111
+* 10 bit mode may be expanded by 16 bit mode, adding capabilities
+ that do not fit in the extremely limited space.
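
As an illustration of the field positions listed above, the following sketch pulls the common fields out of a 16-bit word. It assumes bit 0 is the most significant bit, which is inferred from the table layouts and from the nop pattern 0b0000.0000.1000.0000 given in the Branch section; the helper names `bits` and `split_fields` are hypothetical.

```python
def bits(word: int, first: int, last: int) -> int:
    """Extract bits first..last (inclusive), with bit 0 as the MSB of 16."""
    width = last - first + 1
    shift = 15 - last
    return (word >> shift) & ((1 << width) - 1)


def split_fields(word: int) -> dict:
    """Split a 16-bit word into the fields shared by most formats."""
    return {
        "N": bits(word, 0, 0),      # 16-bit mode only
        "major": bits(word, 5, 7),  # Compressed Major Opcode
        "minor": bits(word, 8, 8),  # Minor opcode
        "M": bits(word, 15, 15),    # mode-switch bit
    }


NOP = 0b0000_0000_1000_0000  # pattern quoted in the Branch section
fields = split_fields(NOP)
```

With this numbering, the nop pattern decodes to major opcode 0b000 with the minor opcode bit set, which agrees with the nop row of the Branch table.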
+
+### Immediate Opcodes
+
+Only available in 16-bit mode, and only when M=1 and N=1.
+
+ | 0 | 1 | 2 3 4 | | 567.8 | 9ab | cde | f |
+ | 1 | i2 | RT | | 010.0 | RA|0 | imm | 1 | addi
+ | 1 | i2 | | 010.1 | RA | imm | 1 | addis
+ | 1 | i2 | | 011.0 | RB | imm | 1 | cmpdi
+ | 1 | i2 | | 011.1 | RB | imm | 1 | cmpwi
+ | 1 | i2 | | 100.0 | RT | imm | 1 | sti
+ | 1 | i2 | | 100.1 | RT | imm | 1 | fstwi
+ | 1 | i2 | | 101.0 | RA | imm | 1 | ldi
+ | 1 | i2 | | 101.1 | RA | imm | 1 | lwi
+ | 1 | i2 | | 110.0 | RA | imm | 1 | flwi
+ | 1 | i2 | | 110.1 | RA | imm | 1 | fldi
+
+Construction of immediate:
+
+* addi is EXTS(i2||imm) to give a 4-bit range -8 to +7
+* addis is EXTS(i2||imm||000) to give an 11-bit range -1024 to +1023 in increments of 8
+* all others are EXTS(i2||imm) to give a 7-bit range -128 to +127
+ (further for LD/ST due to word/dword-alignment)
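
A minimal sketch of the addi case, EXTS(i2||imm), is given here. It assumes i2 is a single bit and imm is three bits, which is what the stated 4-bit range of -8 to +7 implies; the function names are hypothetical.

```python
def exts(value: int, width: int) -> int:
    """Sign-extend the low `width` bits of value (two's complement)."""
    sign = 1 << (width - 1)
    return (value & (sign - 1)) - (value & sign)


def addi_immediate(i2: int, imm: int) -> int:
    """EXTS(i2 || imm), assuming a 1-bit i2 and a 3-bit imm."""
    return exts((i2 << 3) | (imm & 0b111), 4)


# Every (i2, imm) combination, sorted, to show the reachable range
reachable = sorted(addi_immediate(i2, imm)
                   for i2 in (0, 1) for imm in range(8))
```

Under these assumptions `reachable` covers every integer from -8 to +7 exactly once.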
+
+Further Notes:
+
+* bc also has an immediate mode, listed separately below in Branch section
+* for LD/ST, offset is aligned. 8-byte: i2||imm||0b000; 4-byte: i2||imm||0b00
+* SV Prefix over-rides help provide alternative bitwidths for LD/ST
+* RA|0: if RA is zero, addi becomes "li"
+ - this only works if RT takes part of opcode
+ - mv is also possible by specifying an immediate of zero
+
+
### Branch
-10 bit mode may be expanded by 16 bit mode later, adding capabilities
-that do not fit in the extreme limited space.
+Note that illeg and nop are all zeros apart from nop's minor opcode bit
+(bit 8), including in the bits used only by 16-bit mode.
+Given that C is allocated to OpenPOWER ISA Major opcodes EXT000 and
+EXT001, this ensures that in both 10-bit *and* 16-bit mode, a 16-bit
+run of all zeros is considered "illegal" whilst 0b0000.0000.1000.0000
+is "nop".
- | 0 1 | 2 3 4 | | 567 | 8 9 a | b c d | e | f |
- | offs2 | | 000 | offs | LK | 1 | b
- | BO2 | BI3 | | 001 | 0 BI | 0 BO | LK | 1 | bclr
- | BO2 | BI3 | | 001 | 0 BI | 1 BO | LK | 1 | bctar
+ | 16-bit mode | | 10-bit mode |
+ | 0 | 1 | 234 | | 567.8 | 9 ab | c de | f |
+ | 0 | 0 000 | | 000.0 | 0 00 | 0 00 | 0 | illeg
+ | 0 | 0 000 | | 000.1 | 0 00 | 0 00 | 0 | nop
+ | N | offs2 | | 000.LK | offs!=0 | M | b, bl
+ | 1 | offs2 | | 000.LK | BI | BO1 oo | 1 | bc, bcl
+ | N | BO3 BI3 | | 001.0 | LK BI | BO | M | bclr, bclrl
16 bit mode:
+* bc only available when N,M=0b11
* offs2 extends offset in MSBs
* BI3 extends BI in MSBs to allow selection of full CR
-* BO2 extends BO
+* BO3 extends BO
+* bc offset constructed from oo as LSBs and offs2 as MSBs
+* bc BI allows selection of all bits from CR0 or CR1
+* bc CR check is always active (as if BO0=1) therefore BO1 inverts
10 bit mode:
+* illegal (all zeros) covers part of branch (offs=0,M=0,LK=0)
+* nop also covers part of branch (offs=0,M=0,LK=1)
+* bc **not available** in 10-bit mode
* BO[0] enables CR check, BO[1] inverts check
* BI refers to CR0 only (4 bits of)
* no Branch Conditional with immediate
* no Absolute Address
-* no CTR mode (and no bctr)
+* CTR mode allowed with BO[2] for b only.
* offs is a signed offset, aligned to 2 bytes
* all branch targets are 2-byte aligned
### LD/ST
- | 0 | 1 | 2 3 4 | | 567 | 8 9 a | b c d | e | f |
- | RB2 | RA2 | RT | | 001 | 1 RA | 1 RB | 0 | 1 | fld
- | RA2 | RT2 | RB | | 001 | 1 RA | 1 RT | 1 | 1 | fst
- | | | RT | | 111 | RA | RB | 0 | 1 | ld
- | | | RB | | 111 | RA | RT | 1 | 1 | st
+ | 16-bit mode | | 10-bit mode |
+ | 0 | 1 | 2 3 4 | | 567.8 | 9 a b | c d e | f |
+ | RB2 | RA2 | RT | | 001.1 | 1 RA | 0 RB | M | fld
+ | RA2 | RT2 | RB | | 001.1 | 1 RA | 1 RT | M | fst
+ | | | RT | | 111.0 | RA | RB | M | ld
+ | | | RB | | 111.1 | RA | RT | M | st
* elwidth overrides can set different widths
10 bit mode:
* RA and RB are only 2 bit (0-3)
-* for LD, RT is implicitly RB: ld RT=RB, RA(RB)
-* for ST, there is no offset: st RT, RA(0)
+* for LD, RT is implicitly RB: "ld RT=RB, RA(RB)"
+* for ST, there is no offset: "st RT, RA(0)"
### Arithmetic
- | 0 1 | 2 3 4 | | 567 | 8 9 a | b c d | e | f |
- | | RT | | 010 | RB | RA | 0 | 1 | add
- | | RT | | 010 | RB | RA | 1 | 1 | mul
- | | RT | | 011 | RB | (RA|0)| 0 | 1 | sub.
+ | 16-bit mode | | 10-bit mode |
+ | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
+ | N | 0 | RT | | 010.0 | RB | RA!=0 | M | add
+ | N | 0 | RT | | 010.1 | RB | RA | M | mul
+ | N | 0 | RT!=0 | | 011.0 | RB | RA!=0 | M | sub.
+ | N | 0 | 000 | | 011.0 | RB | RA!=0 | M | cmpw
+ | N | 0 | RT | | 011.0 | RB | 000 | M | neg.
+
+16 bit mode only:
+
+ | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
+ | N | 1 | RT | | 010.0 | | | M |
+ | N | 1 | RT | | 010.1 | RB | RA | M | div
+ | N | 1 | RT!=0 | | 011.0 | RB | RA!=0 | M |
+ | N | 1 | 000 | | 011.0 | RB | RA!=0 | M | cmpl
+ | N | 1 | RT | | 011.0 | RB | 000 | M |
10 bit mode:
* sub. default CR target is CR0
* for (RA|0) when RA=0 the input is a zero immediate,
meaning that sub. becomes neg.
+* RT is implicitly RB: "add RT(=RB), RA, RB"
+* Opcode 0b010.0 RA=0 is not missing from the above:
+ it is a system-wide instruction, "cbank" (section below)
### Logical
- | 0 1 | 2 3 4 | | 567 | 8 9 a | b c d | e | f |
- | | RT | | 100 | RB | RA!=0 | 0 | 1 | and
- | | RT | | 100 | RB | RA | 1 | 1 | nand
- | | RT | | 101 | RB | RA | 0 | 1 | or
- | | RT | | 101 | RB | (RA|0)| 1 | 1 | nor
+ | 16-bit mode | | 10-bit mode |
+ | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
+ | N | 0 | RT | | 100.0 | RB | RA!=0 | M | and
+ | N | 0 | RT | | 100.1 | RB | RA!=0 | M | nand
+ | N | 0 | RT | | 101.0 | RB | RA!=0 | M | or
+ | N | 0 | RT | | 101.1 | RB | RA!=0 | M | nor
+ | N | 0 | RT | | 100.0 | RB | 0 0 0 | M | extsw
+ | N | 0 | RT | | 100.1 | RB | 0 0 0 | M | cntlz
+ | N | 0 | RT | | 101.0 | RB | 0 0 0 | M | popcnt
+ | N | 0 | RT | | 101.1 | RB | 0 0 0 | M | not
+
+16-bit mode only:
+
+ | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
+ | N | 1 | RT | | 100.0 | RB | RA!=0 | M |
+ | N | 1 | RT | | 100.1 | RB | RA!=0 | M |
+ | N | 1 | RT | | 101.0 | RB | RA!=0 | M | xor
+ | N | 1 | RT | | 101.1 | RB | RA!=0 | M | eqv (xnor)
+ | N | 1 | RT | | 100.0 | RB | 0 0 0 | M | extsb
+ | N | 1 | RT | | 100.1 | RB | 0 0 0 | M | cnttz
+ | N | 1 | RT | | 101.0 | RB | 0 0 0 | M |
+ | N | 1 | RT | | 101.1 | RB | 0 0 0 | M | extsh
10 bit mode:
* for (RA|0) when RA=0 the input is a zero immediate,
meaning that nor becomes not
+* cntlz, popcnt, exts **not available** in 10-bit mode
+* RT is implicitly RB: "and RT(=RB), RA, RB"
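
The two Logical tables can be read as one lookup keyed on bit 1, the major and minor opcode, and whether RA is zero. The sketch below covers the 16-bit layout only and assumes bit 0 is the most significant bit; `LOGICAL` and `decode_logical` are hypothetical names, and unallocated slots return `None`.

```python
# (bit 1, major, minor, RA != 0) -> mnemonic, copied from the tables
LOGICAL = {
    (0, 0b100, 0, True): "and",
    (0, 0b100, 1, True): "nand",
    (0, 0b101, 0, True): "or",
    (0, 0b101, 1, True): "nor",
    (0, 0b100, 0, False): "extsw",
    (0, 0b100, 1, False): "cntlz",
    (0, 0b101, 0, False): "popcnt",
    (0, 0b101, 1, False): "not",
    (1, 0b101, 0, True): "xor",
    (1, 0b101, 1, True): "eqv",
    (1, 0b100, 0, False): "extsb",
    (1, 0b100, 1, False): "cnttz",
    (1, 0b101, 1, False): "extsh",
}


def decode_logical(word: int):
    """Return (mnemonic or None, RT, RB, RA) for a 16-bit word."""
    bit1 = (word >> 14) & 1       # bit 1: selects the second table
    rt = (word >> 11) & 0b111     # bits 2-4
    major = (word >> 8) & 0b111   # bits 5-7
    minor = (word >> 7) & 1       # bit 8
    rb = (word >> 4) & 0b111      # bits 9-b
    ra = (word >> 1) & 0b111      # bits c-e
    return LOGICAL.get((bit1, major, minor, ra != 0)), rt, rb, ra


# Example word: bit1=1, RT=2, major=0b101, minor=0, RB=3, RA=1, M=1
example = (1 << 14) | (2 << 11) | (0b101 << 8) | (3 << 4) | (1 << 1) | 1
decoded = decode_logical(example)
```

A dictionary keeps the RA=0 special cases next to the RA!=0 ones, so the sharing of opcode space is visible at a glance.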
### Floating Point
- | 0 1 | 2 3 4 | | 567 | 8 9 a | b c d | e | f |
- | | RT | | 011 | RB | (RA|0)| 1 | 1 | fsub.
- | | RT | | 110 | RB | RA!=0 | 0 | 1 | fadd
- | | RT | | 110 | RB | 0 0 0 | 0 | 1 | fabs
- | | RT | | 110 | RB | RA!=0 | 1 | 1 | fmul
- | | RT | | 110 | RB | 0 0 0 | 1 | 1 | fmr.
+Note here that elwidth overrides (SV Prefix) can be used to select FP16/32/64
+
+ | 16-bit mode | | 10-bit mode |
+ | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
+ | N | | RT | | 011.1 | RB | RA!=0 | M | fsub.
+ | N | 0 | RT | | 110.0 | RB | RA!=0 | M | fadd
+ | N | 0 | RT | | 110.1 | RB | RA!=0 | M | fmul
+ | N | 0 | RT | | 011.1 | RB | 0 0 0 | M | fneg.
+ | N | 0 | RT | | 110.0 | RB | 0 0 0 | M |
+ | N | 0 | RT | | 110.1 | RB | 0 0 0 | M |
+
+16-bit mode only:
+
+ | 0 | 1 | 2 3 4 | | 567.8 | 9ab | c d e | f |
+ | N | 1 | RT | | 011.1 | RB | RA!=0 | M |
+ | N | 1 | RT | | 110.0 | RB | RA!=0 | M |
+ | N | 1 | RT | | 110.1 | RB | RA!=0 | M | fdiv
+ | N | 1 | RT | | 011.1 | RB | 0 0 0 | M | fabs.
+ | N | 1 | RT | | 110.0 | RB | 0 0 0 | M | fmr.
+ | N | 1 | RT | | 110.1 | RB | 0 0 0 | M |
10 bit mode:
-* fsub default target is CR1
-* for (RA|0) when RA=0 the input is a zero immediate,
- meaning that fsub becomes fneg, and fcmp becomes fcmp-against-zero
+* fsub., fneg. and fmr. default target is CR1
* fmr. is **not available** in 10-bit mode
+* fdiv is **not available** in 10-bit mode
16 bit mode:
### Condition Register
- | 0 1 2 3 | 4 | | 567 | 8 9 a | b c d e | f |
- | 0 0 0 0 | BF2 | | 001 | 1 BF | 0 BFA | 1 | mcrf
- | 0 0 0 1 | BA2 | | 001 | 1 BA | 0 BB | 1 | crnor
- | 0 1 0 0 | BA2 | | 001 | 1 BA | 0 BB | 1 | crandc
- | 0 1 1 0 | BA2 | | 001 | 1 BA | 0 BB | 1 | crxor
- | 0 1 1 1 | BA2 | | 001 | 1 BA | 0 BB | 1 | crnand
- | 1 0 0 0 | BA2 | | 001 | 1 BA | 0 BB | 1 | crand
- | 1 0 0 1 | BA2 | | 001 | 1 BA | 0 BB | 1 | creqv
- | 1 1 0 1 | BA2 | | 001 | 1 BA | 0 BB | 1 | crorc
- | 1 1 1 0 | BA2 | | 001 | 1 BA | 0 BB | 1 | cror
+ | 16-bit mode | | 10-bit mode |
+ | 0 1 2 3 | 4 | | 567.8 | 9 ab | cde | f |
+ | 0 0 0 0 | BF2 | | 001.1 | 0 BF | BFA | M | mcrf
+ | 0 0 0 1 | BA2 | | 001.1 | 0 BA | BB | M | crnor
+ | 0 1 0 0 | BA2 | | 001.1 | 0 BA | BB | M | crandc
+ | 0 1 1 0 | BA2 | | 001.1 | 0 BA | BB | M | crxor
+ | 0 1 1 1 | BA2 | | 001.1 | 0 BA | BB | M | crnand
+ | 1 0 0 0 | BA2 | | 001.1 | 0 BA | BB | M | crand
+ | 1 0 0 1 | BA2 | | 001.1 | 0 BA | BB | M | creqv
+ | 1 1 0 1 | BA2 | | 001.1 | 0 BA | BB | M | crorc
+ | 1 1 1 0 | BA2 | | 001.1 | 0 BA | BB | M | cror
10 bit mode:
* mcrf BF is only 2 bits which means the destination is only CR0-CR3
-* CR operations: **not available**
+* CR operations: **not available** in 10-bit mode (but mcrf is)
16 bit mode:
### System
-10/16-bit mode:
+cbank: Selection of Compressed-encoding "Bank". Different "banks"
+give different meanings to opcodes. Example: CBank=0b001 is heavily
+optimised for Audio/Video Encode/Decode. cbank borrows from add's encoding
+space (when RA==0).
- | 0 1 | 2 3 4 | | 567 | 8 9 a | b c d | e | f |
- | | | | 100 | 0 0 0 | 0 0 0 | 0 | 1 | sc
- | | | | 100 | 0 0 1 | 0 0 0 | 0 | 1 | rfid
+ | 16-bit mode | | 10-bit mode |
+ | 0 | 1 2 3 4 | | 567.8 | 9ab | cde | f |
+ | N | 0 Bank2 | | 010.0 | CBank | 000 | M | cbank
**not available** in 10-bit mode:
- | 0 1 2 3 | 4 | | 567 | 8 9 a | b c d e | f |
- | 1 1 1 1 | 0 | | 001 | 1 00 | 0 RT | 1 | mtlr
- | 1 1 1 1 | 0 | | 001 | 1 01 | 0 RT | 1 | mtctr
- | 1 1 1 1 | 0 | | 001 | 1 10 | 0 RT | 1 | mttar
- | 1 1 1 1 | 0 | | 001 | 1 11 | 0 RT | 1 | mtcr
- | 1 1 1 1 | 1 | | 001 | 1 00 | 0 RA | 1 | mflr
- | 1 1 1 1 | 1 | | 001 | 1 01 | 0 RA | 1 | mfctr
- | 1 1 1 1 | 1 | | 001 | 1 10 | 0 RA | 1 | mftar
- | 1 1 1 1 | 1 | | 001 | 1 11 | 0 RA | 1 | mfcr
+ | 0 1 2 3 | 4 | | 567.8 | 9 ab | cde | f |
+ | 1 1 1 1 | 0 | | 001.1 | 0 00 | RT | M | mtlr
+ | 1 1 1 1 | 0 | | 001.1 | 0 01 | RT | M | mtctr
+ | 1 1 1 1 | 0 | | 001.1 | 0 11 | RT | M | mtcr
+ | 1 1 1 1 | 1 | | 001.1 | 0 00 | RA | M | mflr
+ | 1 1 1 1 | 1 | | 001.1 | 0 01 | RA | M | mfctr
+ | 1 1 1 1 | 1 | | 001.1 | 0 11 | RA | M | mfcr
### Unallocated
- | 0 1 | 2 3 4 | | 567 | 8 9 a | b c d | e | f |
- | | | | 100 | 0 1 0 | 0 0 0 | 0 | 1 |
- | | | | 100 | 0 1 1 | 0 0 0 | 0 | 1 |
- | | | | 100 | 1 0 0 | 0 0 0 | 0 | 1 |
- | | | | 100 | 1 0 1 | 0 0 0 | 0 | 1 |
- | | | | 100 | 1 1 0 | 0 0 0 | 0 | 1 |
- | | | | 100 | 1 1 1 | 0 0 0 | 0 | 1 |
-
- | 0 1 2 3 | 4 | | 567 | 8 9 a | b c d e | f |
- | 0 0 1 0 | | | 001 | 1 | 0 | 1 |
- | 0 0 1 1 | | | 001 | 1 | 0 | 1 |
- | 0 1 0 1 | | | 001 | 1 | 0 | 1 |
- | 1 0 1 0 | | | 001 | 1 | 0 | 1 |
- | 1 0 1 1 | | | 001 | 1 | 0 | 1 |
- | 1 1 0 0 | | | 001 | 1 | 0 | 1 |
+ | 0 1 2 3 | 4 | | 567.8 | 9 ab | cde | f |
+ | 0 0 1 0 | | | 001.1 | 0 | | M |
+ | 0 0 1 1 | | | 001.1 | 0 | | M |
+ | 0 1 0 1 | | | 001.1 | 0 | | M |
+ | 1 0 1 0 | | | 001.1 | 0 | | M |
+ | 1 0 1 1 | | | 001.1 | 0 | | M |
+ | 1 1 0 0 | | | 001.1 | 0 | | M |
+ | 1 1 1 1 | 0 | | 001.1 | 0 10 | | M |
+ | 1 1 1 1 | 1 | | 001.1 | 0 10 | | M |