From: Luke Kenneth Casson Leighton
Date: Thu, 20 Dec 2018 08:01:54 +0000 (+0000)
Subject: clarify
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=113edd0aa326382a4b93d8dfc770afa4c2fce6cb;p=crowdsupply.git

clarify
---

diff --git a/updates/005_2018dec14_simd_without_simd.mdwn b/updates/005_2018dec14_simd_without_simd.mdwn
index c6e456c..6548729 100644
--- a/updates/005_2018dec14_simd_without_simd.mdwn
+++ b/updates/005_2018dec14_simd_without_simd.mdwn
@@ -143,8 +143,18 @@ anyway, for 3D, so if 64-bit operations happen to have half the number
 of Reservation Stations / Function Units, and block more often, we
 actually don't mind so much. Also, we can still apply the same "banks"
 trick on the Register File, except this time with 4-way multiplexing on 32-bit
-wide banks, and 4x4 crossbars on the bytes:
+wide banks, and 4x4 crossbars on the bytes as well:
 
 {{register_file_multiplexing.jpg}}
 
+To cope with 16-bit operations, pairs of 8-bit values in adjacent Function
+Units are reserved. Likewise for 64-bit operations, the 8-bit crossbars
+are not used, and pairs of 32-bit source values in adjacent Function Units
+in the *32-bit* FU area are reserved.
 
+However, the gate count in such a staggered crossbar arrangement is insane:
+bear in mind that this will be 3R1W or 2R1W (2 or 3 reads, 1 write per
+register), and that means **three** sets of crossbars, comprising **four**
+banks, with effectively 16 byte to 16 byte routing.
+
+It's too much - so in later updates, this will be explored further.