include mulli, twi and tdi.
* 2 opcodes for 16-bit Compressed instructions with 11 bits available
-* 2 opcodes are required in order to give SV-P48 (and SV-C32) the 11 bits needed for prefixing
-* 2 opcodes are likewise required for SV-P64 (and SV-C48) to have 27 bits available
-* 2 opcodes for SV VBLOCK
+* 2 opcodes are required in order to give SV-P48 the 11 bits needed for prefixing
+* 2 opcodes are likewise required for SV-P64 to have 27 bits available
+* 2 opcodes for SV-C32 and SV-C48 (32 bit versions of P48 and P64)
With only 11 bits for 16-bit Compressed, it may be better to use the
-opportunity to switch into "16 bit mode". Interestingly SV-P32 could
+opportunity to switch into "16 bit mode". Interestingly SV-C32 could
likewise switch into the same.
+VBLOCK can be added later by using further VSX-dedicated major opcodes
+(EXT62, EXT60).
+
+* EXT00 - unused (one instruction: attn)
+* EXT01 - v3.1B prefix
+* EXT02 - tdi
+* EXT03 - twi
+* EXT04 - vector/bcd
+* EXT05 - unused
+* EXT06 - vector
+* EXT07 - mulli
+* EXT09 - reserved
+* EXT17 - unused (2 instructions: sc, scv)
+* EXT22 - reserved sandbox
+* EXT46 - lmw
+* EXT47 - stmw
+* EXT56 - lq
+* EXT57 - vector ld
+* EXT58 - ld (leave ok)
+* EXT59 - FP (leave ok)
+* EXT60 - vector
+* EXT61 - st (leave ok)
+* EXT62 - vector st
+* EXT63 - FP (leave ok)
+
+Potential allocations:
+
+ | hword 0 | hword 1 | hword 2 | hword 3 |
+ EXT00/01 - C 10bit -> 16bit
+ EXT60/62 - VBLOCK
+ EXT09/17 - SV-C32 and other SV-C
+ EXT06/07 - SV-C32-Swizzle and other SV-C-Swizzle
+ EXT02/03 - SV-P48
+ EXT04/05 - SV-P64
+ EXT56/57 - Predicated SV-P48
+ EXT46/47 - Predicated SV-P64
+
+Spare:
+
+* EXT22
+
+## C10/16 FSM
+
+    if EXT == 00/01:
+        start @ 10bit
+    if state==10bit:
+        if bit15:
+            next = 16bit
+        else:
+            next = Standard
+    if state==16bit:
+        if bit0 & bit15:
+            insn = C.immediate
+        if ~bit15:
+            if ~bit0:
+                next = Standard
+            else:
+                next = Standard.then.16bit
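
The C10/16 FSM pseudocode above can be sketched as a small Python transition function. This is an illustrative sketch only, not the project's decoder: the names `Mode` and `step` are hypothetical, `bit0` and `bit15` are taken exactly as named in the pseudocode, and two cases the pseudocode leaves open are filled by assumption (a 16bit instruction with bit15 set stays in 16bit mode, and the single instruction after `Standard.then.16bit` is always decoded as Standard).

```python
from enum import Enum


class Mode(Enum):
    STANDARD = "Standard"                      # next instruction is 32-bit
    C16 = "16bit"                              # next instruction is 16-bit
    STANDARD_THEN_C16 = "Standard.then.16bit"  # one 32-bit, then back to 16-bit


def step(mode, ext, bit0, bit15):
    """Return (decoding of the current instruction, mode for the next one)."""
    if mode is Mode.STANDARD_THEN_C16:
        # Assumption: exactly one Standard instruction, then resume 16bit.
        return "Standard", Mode.C16
    if mode is Mode.STANDARD:
        if ext not in (0, 1):
            return "Standard", Mode.STANDARD
        # EXT00/01 starts at 10bit; bit15 chooses what follows.
        return "10bit", (Mode.C16 if bit15 else Mode.STANDARD)
    # mode is Mode.C16
    if bit15:
        # bit0 & bit15 selects C.immediate.
        # Assumption: with bit15 set, 16bit mode is sustained.
        return ("C.immediate" if bit0 else "16bit"), Mode.C16
    # bit15 clear: leave 16bit mode, permanently or for one instruction.
    return "16bit", (Mode.STANDARD_THEN_C16 if bit0 else Mode.STANDARD)
```

The 10bit state is folded into the Standard branch because it only ever applies to the instruction that carries the EXT00/01 major opcode.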
+
+## SV-Compressed FSM
+
+ if EXT == 09/17:
+ if bit0:
+ SV.mode =
+
+# Major opcode map
+
+Table 9: Primary Opcode Map (opcode bits 0:5)
+
+ | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111
+ 000 | | | tdi | twi | EXT04 | | | mulli | 000
+ 001 | subfic | | cmpli | cmpi | addic | addic. | addi | addis | 001
+ 010 | bc/l/a | EXT17 | b/l/a | EXT19 | rlwimi| rlwinm | | rlwnm | 010
+ 011 | ori | oris | xori | xoris | andi. | andis. | EXT30 | EXT31 | 011
+ 100 | lwz | lwzu | lbz | lbzu | stw | stwu | stb | stbu | 100
+ 101 | lhz | lhzu | lha | lhau | sth | sthu | lmw | stmw | 101
+ 110 | lfs | lfsu | lfd | lfdu | stfs | stfsu | stfd | stfdu | 110
+ 111 | lq | EXT57 | EXT58 | EXT59 | EXT60 | EXT61 | EXT62 | EXT63 | 111
+ | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111
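
As a reading aid for the table above, the following Python sketch extracts the primary opcode from a 32-bit instruction word and splits it into the table's row and column. The function names are hypothetical. It assumes the instruction is held as an integer whose most significant bit is bit 0 (the MSB0 numbering used by "opcode bits 0:5"), with bits 0:2 selecting the row and bits 3:5 the column.

```python
def primary_opcode(insn):
    """Return opcode bits 0:5 (MSB0), i.e. the top 6 bits of a 32-bit word."""
    return (insn >> 26) & 0x3F


def table_cell(po):
    """Return (row, column) of a primary opcode as 3-bit binary strings."""
    return format(po >> 3, "03b"), format(po & 0x7, "03b")


# Example: 0x38600001 encodes "addi r3, 0, 1" (primary opcode 14).
po = primary_opcode(0x38600001)
row, col = table_cell(po)
```

For example, mulli (primary opcode 7) lands in row 000, column 111, matching its position in the map.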
+
# LE/BE complications.
See <https://bugs.libre-soc.org/show_bug.cgi?id=529> for discussion
much simpler for the pre-analysis phase to determine instruction length,
regardless of what that length is (16/32/48/64/VBLOCK).
-# 16 bit Compressed
-
-This one is a conundrum. OpenPOWER ISA was never designed with 16
-bit in mind. VLE was added 10 years ago but only by way of marking
-an entire 64k page as "VLE". With no means to mix 32 bit and 16 bit,
-jumping between the two would have been painful and taken up space.
-
-Here, in order to embed 16 bit into a predominantly 32 bit stream the
-overhead of using an entire 16 bits just to switch into Compressed mode
-is itself a significant overhead. The situation is made worse by 5 bits
-being taken up by Major Opcode space, leaving only 11 bits to allocate
-to actual instructions.
-
-In addition we would like to add SV-C32 which is a Vectorised version
-of 16 bit Compressed, and ideally have a variant that adds the 27-bit
-prefix format from SV-P64, as well.
-
-Potential ways to reduce pressure on the 16 bit space are:
-
-* To provide "paging". This involves bank-switching to alternative optimised encodings for specific workloads
-* To enter "16 bit mode" for durations specified at the start
-* To reserve one bit of every 16 bit instruction to indicate that the 16 bit mode is to continue to be sustained
-
-This latter would be useful in the Vector context to have an alternative
-meaning: as the bit which determines whether the instruction is 11-bit
-prefixed or 27-bit prefixed:
-
- 0 1 2 3 4 5 6 7 8 9 a b c d e f |
- |major op | 11 bit vector prefix|
- |16 bit opcode alt vec. mode ^ |
- | extra vector prefix if alt set|
-
-Using a major opcode to enter 16 bit mode, leaves 11 bits to find
-something to use them for:
-
- 0 1 2 3 4 5 6 7 8 9 a b c d e f |
- |major op | what to do here 1 |
- |16 bit stay in 16bit mode 1 |
- |16 bit stay in 16bit mode 1 |
- |16 bit exit 16bit mode 0 |
-
-One possibility is that the 11 bits are used for bank selection, with
-some room for additional context such as altering the registers used
-for the 16 bit operations (bank selection of which scalar regs)
-
-Another is to use the 11 bits for only the utmost commonly used
-instructions. That being the case then even one of those 11 bits would
-also need to be dedicated to saying if 16 bit mode is to be continued.
-10 bits remain for actual opcodes!
-
-## 16 bit Compressed opcodes exploration
-
-### Branch
-
-10 bit mode may be expanded by 16 bit mode later, adding capabilities
-that do not fit in the extreme limited space.
-
- | 0 1 | 2 3 4 | | 5 6 7 | 8 9 | a b | c d | e | f |
- | offs2 | | 0 0 0 | offs | LK | 1 | b
- | BO2 | BI3 | | 0 0 1 | 00 | BI | BO | LK | 1 | bclr
- | BO2 | BI3 | | 0 0 1 | 01 | BI | BO | LK | 1 | bctar
-
-16 bit mode:
+Option 3:
-* offs2 extends offset in MSBs
-* BI3 extends BI in MSBs to allow selection of full CR
-* BO2 extends BO
+Just as in VLE, require instructions to be in BE order. Data, which has nothing to do with instruction order, may optionally remain in LE order.
-10 bit mode:
+## Why does VLE use a separate 64k page?
-* BO[0] enables CR check, BO[1] inverts check
-* BI refers to CR0 only (4 bits of)
-* no Branch Conditional with immediate
-* no Absolute Address
-* no CTR mode (and no bctr)
-* offs is to 2 byte (signed) aligned
-* all branches to 2 byte aligned
+VLE requires that the memory page be marked as VLE-encoded. It also requires that the instructions be in BE order even when 32 bit standard opcodes are mixed in.
-### LD/ST
+Questions:
- | 0 | 1 | 2 3 4 | | 5 6 7 | 8 9 | a b | c d | e | f |
- | F | RA2 | RT | | 0 0 1 | 11 | RA | RB | 0 | 1 | ld
- | F | RT2 | RB | | 0 0 1 | 11 | RA | RT | 1 | 1 | st
+* What would happen without the page being marked, when attempting to call ppc64le ABI code?
+* How would ppc64le code in the same page be distinguished from SVPrefix code?
-* elwidth overrides can set different widths
+The answers are that it is either impossible or that it requires a special mode-switching instruction to be called on entry and exit from functions, transitioning to and from ppc64le mode.
-16 bit mode:
+This transition may be achieved very simply by marking the 64k page.
-* F=1 is FLD, FST
-* RA2 extends RA to 3 bits (MSB)
-* RT2 extends RT to 3 bits (MSB)
-
-10 bit mode:
-
-* RA and RB are only 2 bit (0-3)
-* for LD, RT is implicitly RB: ld RT=RB, RA(RB)
-* for ST, there is no offset: st RT, RA(0)
-
-### Arithmetic
-
- | 0 1 | 2 3 4 | | 5 6 7 | 8 9 a | b c d | e | f |
- | | | | 0 1 0 | RB | RA | 0 | 1 | add
- | | | | 0 1 0 | RB | RA | 1 | 1 | mul
- | | | | 0 1 1 | RB | (RA|0)| 0 | 1 | sub
- | | | | 0 1 1 | RB | (RA|0)| 1 | 1 | cmp
-
-10 bit mode:
-
-* cmp default target is CR0
-* for (RA|0) when RA=0 the input is a zero immediate,
- meaning that sub becomes neg, and cmp becomes cmp-against-zero
-
-### Logical
-
- | 0 1 | 2 3 4 | | 5 6 7 | 8 9 a | b c d | e | f |
- | | | | 1 0 0 | RB | RA | 0 | 1 | and
- | | | | 1 0 0 | RB | RA | 1 | 1 | nand
- | | | | 1 0 1 | RB | RA | 0 | 1 | or
- | | | | 1 0 1 | RB | (RA|0)| 1 | 1 | nor
-
-10 bit mode:
-
-* for (RA|0) when RA=0 the input is a zero immediate,
- meaning that nor becomes not
-
-### Floating Point
-
- | 0 1 | 2 3 4 | | 5 6 7 | 8 9 a | b c d | e | f |
- | | RT | | 1 1 0 | RB | RA!=0 | 0 | 1 | fadd
- | | RT | | 1 1 0 | RB | 0 0 0 | 0 | 1 | fabs
- | | RT | | 1 1 0 | RB | RA | 1 | 1 | fmul
- | | RT | | 1 1 1 | RB | (RA|0)| 0 | 1 | fsub
- | | RT | | 1 1 1 | RB | (RA|0)| 1 | 1 | fcmp
-
-10 bit mode:
-
-* fcmp default target is CR1
-* for (RA|0) when RA=0 the input is a zero immediate,
- meaning that fsub becomes fneg, and fcmp becomes fcmp-against-zero
-
-### Condition Register
-
- | 0 1 2 3 | 4 | | 5 6 7 | 8 9 | a b | c d e | f |
- | 0 0 0 0 | BF2 | | 0 0 1 | 10 | BF | BFA | 1 | mcrf
-
-10 bit mode:
+# 16 bit Compressed
-* BF is only 2 bits which means the destination is only CR0-CR3
+See [[16_bit_compressed]]