From: Konstantinos Margaritis
Date: Thu, 27 Apr 2023 09:35:03 +0000 (+0000)
Subject: really fix tables
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=68323e3c6728593eed6865b7d1239abfceabd102;p=libreriscv.git

really fix tables
---

diff --git a/openpower/sv/cookbook/chacha20.mdwn b/openpower/sv/cookbook/chacha20.mdwn
index 959326e39..f7ae2face 100644
--- a/openpower/sv/cookbook/chacha20.mdwn
+++ b/openpower/sv/cookbook/chacha20.mdwn
@@ -140,39 +140,53 @@ Let's assume the values `x` in the registers 24-36
 
 |--------|-----|-----|-----|-----|
 | GPR 24 | x0  | x1  | x2  | x3  |
+|--------|-----|-----|-----|-----|
 | GPR 28 | x4  | x5  | x6  | x7  |
+|--------|-----|-----|-----|-----|
 | GPR 32 | x8  | x9  | x10 | x11 |
+|--------|-----|-----|-----|-----|
 | GPR 36 | x12 | x13 | x14 | x15 |
+|--------|-----|-----|-----|-----|
 
 So for the addition in Vertical-First mode, `RT` (and `RA` as they are the
 same) indices are (in terms of x):
 
 |----|----|----|----|----|----|----|----|
 | 0  | 8  | 0  | 8  | 1  | 9  | 1  | 9  |
+|----|----|----|----|----|----|----|----|
 | 2  | 10 | 2  | 10 | 3  | 11 | 3  | 11 |
+|----|----|----|----|----|----|----|----|
 | 0  | 10 | 0  | 10 | 1  | 11 | 1  | 11 |
+|----|----|----|----|----|----|----|----|
 | 2  | 8  | 2  | 8  | 3  | 9  | 3  | 9  |
+|----|----|----|----|----|----|----|----|
 
 However, since the indices are small values, using a single 64-bit
 register for a single index value is a waste so we will compress them,
 8 indices in a 64-bit register:
 So, `RT` indices will fit inside these 4 registers (in Little Endian format):
 
- |-----------|-------------------|-------------------|-------------------|-------------------|
- | SVSHAPE0: | 0x901090108000800 | 0xb030b030a020a02 | 0xb010b010a000a00 | 0x903090308020802 |
+|-----------|-------------------|-------------------|-------------------|-------------------|
+| SVSHAPE0: | 0x901090108000800 | 0xb030b030a020a02 | 0xb010b010a000a00 | 0x903090308020802 |
+|-----------|-------------------|-------------------|-------------------|-------------------|
 
 Similarly we find the RB indices:
 
 |----|----|----|----|----|----|----|----|
 | 4  | 12 | 4  | 12 | 5  | 13 | 5  | 13 |
+|----|----|----|----|----|----|----|----|
 | 6  | 14 | 6  | 14 | 7  | 15 | 7  | 15 |
+|----|----|----|----|----|----|----|----|
 | 5  | 15 | 5  | 15 | 6  | 12 | 6  | 12 |
+|----|----|----|----|----|----|----|----|
 | 7  | 13 | 7  | 13 | 4  | 14 | 7  | 14 |
+|----|----|----|----|----|----|----|----|
 
 Using a similar method, we find the final 4 registers with the `RB` indices:
 
- |-----------|-------------------|-------------------|-------------------|-------------------|
- | SVSHAPE1: | 0xd050d050c040c04 | 0xf070f070e060e06 | 0xc060c060f050f05 | 0xe040e040d070d07 |
+|-----------|-------------------|-------------------|-------------------|-------------------|
+| SVSHAPE1: | 0xd050d050c040c04 | 0xf070f070e060e06 | 0xc060c060f050f05 | 0xe040e040d070d07 |
+|-----------|-------------------|-------------------|-------------------|-------------------|
 
 Now, we can construct the Vertical First loop:
 
@@ -285,14 +299,19 @@ for `sv.add` (`SHAPE0`). So, remembering that our
 
 |----|----|----|----|----|----|----|----|
 | 12 | 4  | 12 | 4  | 13 | 5  | 13 | 5  |
+|----|----|----|----|----|----|----|----|
 | 14 | 6  | 14 | 6  | 15 | 7  | 15 | 7  |
+|----|----|----|----|----|----|----|----|
 | 15 | 5  | 15 | 5  | 12 | 6  | 12 | 6  |
+|----|----|----|----|----|----|----|----|
 | 13 | 7  | 13 | 7  | 14 | 4  | 14 | 4  |
+|----|----|----|----|----|----|----|----|
 
 Again, we find
 
- |-----------|-------------------|-------------------|-------------------|-------------------|
- | SVSHAPE2: | 0x50d050d040c040c | 0x70f070f060e060e | 0x60c060c050f050f | 0x40e040e070d070d |
+|-----------|-------------------|-------------------|-------------------|-------------------|
+| SVSHAPE2: | 0x50d050d040c040c | 0x70f070f060e060e | 0x60c060c050f050f | 0x40e040e070d070d |
+|-----------|-------------------|-------------------|-------------------|-------------------|
 
 The next operation is the `ROTATE` which takes as operand the result of the
 `XOR` and a shift argument. You can easily see that the indices used in this
@@ -322,8 +341,9 @@ So, in a similar fashion, we instruct `XOR` (`sv.xor`) to use `SVSHAPE2` for
 (the shift values, which cycle every 4 elements). Note that the actual
 indices for `SVSHAPE3` will have to be in 32-bit elements:
 
- |---------|--------------------|--------------------|
- | SHIFTS: | 0x0000000c00000010 | 0x0000000700000008 |
+|---------|--------------------|--------------------|
+| SHIFTS: | 0x0000000c00000010 | 0x0000000700000008 |
+|---------|--------------------|--------------------|
 
 The complete algorithm for a loop with 10 iterations is as follows:
 