From 865239a05f8333403c50277775dc73697f779e87 Mon Sep 17 00:00:00 2001 From: Staf Verhaegen Date: Tue, 19 May 2020 14:30:23 +0200 Subject: [PATCH] Re: [libre-riscv-dev] daily kan-ban update 18may2020 --- cb/314f2d69be18d2bee50082fe0e09ae0975db9b | 116 ++++++++++++++++++++++ 1 file changed, 116 insertions(+) create mode 100644 cb/314f2d69be18d2bee50082fe0e09ae0975db9b diff --git a/cb/314f2d69be18d2bee50082fe0e09ae0975db9b b/cb/314f2d69be18d2bee50082fe0e09ae0975db9b new file mode 100644 index 0000000..ab5e2d2 --- /dev/null +++ b/cb/314f2d69be18d2bee50082fe0e09ae0975db9b @@ -0,0 +1,116 @@ +Return-path: +Envelope-to: publicinbox@libre-riscv.org +Delivery-date: Tue, 19 May 2020 13:30:31 +0100 +Received: from localhost ([::1] helo=libre-riscv.org) + by libre-soc.org with esmtp (Exim 4.89) + (envelope-from ) + id 1jb1OA-0007uJ-Qp; Tue, 19 May 2020 13:30:30 +0100 +Received: from vps2.stafverhaegen.be ([85.10.201.15]) + by libre-soc.org with esmtp (Exim 4.89) + (envelope-from ) id 1jb1O9-0007u2-4r + for libre-riscv-dev@lists.libre-riscv.org; Tue, 19 May 2020 13:30:29 +0100 +Received: from hpdc7800 (hpdc7800 [10.0.0.1]) + by vps2.stafverhaegen.be (Postfix) with ESMTP id 997E211C042D + for ; + Tue, 19 May 2020 14:30:28 +0200 (CEST) +Message-ID: +From: Staf Verhaegen +To: Libre-RISCV General Development +Date: Tue, 19 May 2020 14:30:23 +0200 +In-Reply-To: +References: +Organization: FibraServi bvba +X-Mailer: Evolution 3.28.5 (3.28.5-8.el7) +Mime-Version: 1.0 +X-Content-Filtered-By: Mailman/MimeDel 2.1.23 +Subject: Re: [libre-riscv-dev] daily kan-ban update 18may2020 +X-BeenThere: libre-riscv-dev@lists.libre-riscv.org +X-Mailman-Version: 2.1.23 +Precedence: list +List-Id: Libre-RISCV General Development + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +Reply-To: Libre-RISCV General Development + +Content-Type: multipart/mixed; boundary="===============9015389145604249614==" +Errors-To: libre-riscv-dev-bounces@lists.libre-riscv.org +Sender: 
"libre-riscv-dev" + + +--===============9015389145604249614== +Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; + boundary="=-xK/O/tGs5dCealA5HP1p" + + +--=-xK/O/tGs5dCealA5HP1p +Content-Type: text/plain; charset="UTF-8" +Content-Transfer-Encoding: quoted-printable + +Luke Kenneth Casson Leighton schreef op ma 18-05-2020 om 12:45 [+0100]: +> * and had a fascinating conversation thanks to yehowshua and jeremy(also = +welcome!), which resulted in this(https://libre-soc.org/3d_gpu/architecture= +/tomasulo_transformation/). + +If I understand this correct the big architectural difference between exten= +ded scoreboarding and Tomasulo is that in the former the register content i= +s stored in a central register file and for the latter it is distributed ov= +er several 'reservation stations'. In order to scale to for example multi-i= +ssue you need to go to higher order nRmW register files for scoreboarding a= +nd for Tomasulo you increase the number of reservation stations together wi= +th a more complex tracker of the register tagging/aliasing. +So some 2 cents from me. +=46rom physical implementation point of view the central high order nRmW regi= +ster file and scoreboard does worry me. Higher order nRmW register files wi= +ll become power and area hungry compared to multiple lower order reservatio= +n stations. +I have seen numbers of a few tens of functional units in your design. I thi= +nk it will become also a nightmare to connect and route all the input and o= +utputs of all the functional units to the central register file and scorebo= +ard. So at first sight, from physical implementation point for smaller node= +s, the Tomasulo algorithm seems more scalable than extended scoreboarding. = +I indicated before that in smaller nodes power consumption and delay is mai= +nly determined by the length of the interconnects and not by the input load= + of the logic gates itself; in 180nm it will be more fifty/fifty. 
As Jeremy= + indicated this is next to the power consumption in the register files and = +cache which scales with the total bit count of the block and the nRmW order= + of the block. +Also the triviality of a big fan-in NOR or NAND gate may be deceptive, the= +se gates are not feasible and will be synthesized to trees of NAND/NOR gate= +s. In that respect a high fan-in NOR/NAND can have similar time/power consu= +mption to a seemingly more complex case of if statement. In zero order, f= +or single output block, delay and power are determined by the number of inp= +uts independent of the complexity of the RTL/HDL code. In first order one h= +as to account that NAND/NOR logic is more efficient than XOR/XNOR logic but= + for bigger trees this difference is less pronounced as XOR/XNOR trees will= + be synthesized to more efficient trees using AOI (and-or-invert) cells. + +greets, +Staf. + + + +--=-xK/O/tGs5dCealA5HP1p-- + + + +--===============9015389145604249614== +Content-Type: text/plain; charset="utf-8" +MIME-Version: 1.0 +Content-Transfer-Encoding: base64 +Content-Disposition: inline + +X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KbGlicmUtcmlz +Y3YtZGV2IG1haWxpbmcgbGlzdApsaWJyZS1yaXNjdi1kZXZAbGlzdHMubGlicmUtcmlzY3Yub3Jn +Cmh0dHA6Ly9saXN0cy5saWJyZS1yaXNjdi5vcmcvbWFpbG1hbi9saXN0aW5mby9saWJyZS1yaXNj +di1kZXYK + +--===============9015389145604249614==-- + + + -- 2.30.2