From: Luke Kenneth Casson Leighton
Date: Wed, 22 Apr 2020 12:31:24 +0000 (+0100)
Subject: write-up on twin L0 cache/buffer
X-Git-Tag: convert-csv-opcode-to-binary~2824
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=56eb408e8669fe032450f10c9366d82dd101ed93;p=libreriscv.git

write-up on twin L0 cache/buffer
---

diff --git a/3d_gpu/architecture/6600scoreboard.mdwn b/3d_gpu/architecture/6600scoreboard.mdwn
index 1ee935f49..0b0328888 100644
--- a/3d_gpu/architecture/6600scoreboard.mdwn
+++ b/3d_gpu/architecture/6600scoreboard.mdwn
@@ -341,6 +341,36 @@ Twin L0 cache/buffer design
 
 [Flaws](https://bugs.libre-soc.org/show_bug.cgi?id=216#c24) in the above were detected, and needed correction.
 
+Notes:
+
+* The flaw detected above is that for each pair of LD/ST operations
+  coming from the Function Unit (to cover mis-aligned requests),
+  the Addr[4] bit is **mutually-exclusive**. i.e. it is **guaranteed**
+  that Addr[4] for the first FU port's LD/ST request will **never**
+  equal that of the second.
+* Therefore, if the two requests are split into left/right separate L0
+  Cache/Buffers, the advantages and optimisations for XOR-comparison
+  of bits 12-48 of the address **may not take place**.
+* Solution: merge both L0-left and L0-right into one L0 Cache/Buffer,
+  with twin left/right banks in the same L0 Cache/Buffer
+* This then means that the number of rows may be reduced to 8
+* It also means that Addr[12-48] may be stored (and compared) only once
+* It does however mean that the reservation on the row has to wait for
+  *both* ports (left and right) to clear out their LD/ST operation(s).
+* Addr[4] still selects whether the request is to go into left or right bank
+
+Other than that, the design remains the same, as does the algorithm to
+merge the bytemasks. This remains as follows:
+
+* PriorityPicker selects one row
+* For all rows greater than the selected row, if Addr[5:48] matches
+  then the bytemask is "merged" into the output-bytemask-selector
+* The output-bytemask-selector is used as a "byte-enable" line on
+  a single 128-bit byte-level read-or-write (never both).
+
+Twin 128-bit requests (read-or-write) are then passed directly through
+to a pair of L1 Caches.
+
 [[!img twin_l0_cache_buffer.jpg size="600x"]]
 
 # Multi-input/output Dependency Cell and Computation Unit
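
Note on the added text: the bytemask-merge steps in the notes above (PriorityPicker selects one row, later rows with a matching Addr[5:48] are merged, Addr[4] selects the bank) can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the project's HDL: the names `Row`, `addr_tag`, `bank_of`, `priority_pick` and `merge_bytemasks` are hypothetical, `Addr[5:48]` is assumed to mean bits 5 up to 47 (Python-style slice), the PriorityPicker is assumed to pick the lowest-numbered valid row, and the bytemask is assumed to be 16 bits wide (one bit per byte of a 128-bit access).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumption: Addr[5:48] is read as a Python-style slice, i.e. bits 5..47.
ADDR_LO, ADDR_HI = 5, 48


@dataclass
class Row:
    """One hypothetical L0 buffer row (names are illustrative only)."""
    valid: bool
    addr: int      # full request address
    bytemask: int  # assumed 16-bit mask: one bit per byte of 128 bits


def addr_tag(addr: int) -> int:
    """Extract the address bits that are compared between rows."""
    width = ADDR_HI - ADDR_LO
    return (addr >> ADDR_LO) & ((1 << width) - 1)


def bank_of(addr: int) -> int:
    """Addr[4] selects the bank: 0 = left, 1 = right (labels assumed)."""
    return (addr >> 4) & 1


def priority_pick(rows: List[Row], bank: int) -> Optional[int]:
    """Assumed policy: pick the lowest-numbered valid row in this bank."""
    for idx, row in enumerate(rows):
        if row.valid and bank_of(row.addr) == bank:
            return idx
    return None


def merge_bytemasks(rows: List[Row],
                    bank: int) -> Optional[Tuple[int, int, List[int]]]:
    """Return (tag, merged bytemask, merged row indices) for one bank."""
    sel = priority_pick(rows, bank)
    if sel is None:
        return None
    tag = addr_tag(rows[sel].addr)
    mask = rows[sel].bytemask
    merged = [sel]
    # Only rows greater than the selected row are candidates for merging.
    for idx in range(sel + 1, len(rows)):
        row = rows[idx]
        if row.valid and bank_of(row.addr) == bank \
                and addr_tag(row.addr) == tag:
            mask |= row.bytemask
            merged.append(idx)
    return tag, mask, merged


rows = [
    Row(False, 0x0000, 0x0000),
    Row(True, 0x1000, 0x000F),  # left bank
    Row(True, 0x2010, 0x00F0),  # right bank, different tag
    Row(True, 0x1008, 0x0F00),  # left bank, same tag as row 1
    Row(True, 0x1010, 0xF000),  # right bank
]
left = merge_bytemasks(rows, bank=0)
right = merge_bytemasks(rows, bank=1)
```

In this model the merged mask plays the role of the "byte-enable" for a single 128-bit access per bank. The real design compares the upper address bits once per row for both banks, which the sketch does not attempt to reproduce.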