From: Staf Verhaegen
To: Libre-RISCV General Development
Organization: FibraServi bvba
Date: Tue, 19 May 2020 14:30:23 +0200
Subject: Re: [libre-riscv-dev] daily kan-ban update 18may2020

Luke Kenneth Casson Leighton wrote on Mon 18-05-2020 at 12:45 [+0100]:
> * and had a fascinating conversation thanks to yehowshua and jeremy (also welcome!), which resulted in this (https://libre-soc.org/3d_gpu/architecture/tomasulo_transformation/).

If I understand this correctly, the big architectural difference between extended scoreboarding and Tomasulo is that in the former the register content is stored in a central register file, while in the latter it is distributed over several 'reservation stations'. To scale to, for example, multi-issue, scoreboarding needs higher-order nRmW register files, whereas Tomasulo needs more reservation stations together with a more complex tracker for the register tagging/aliasing.

So some 2 cents from me.

From a physical implementation point of view, the central high-order nRmW register file and scoreboard do worry me. Higher-order nRmW register files will become power and area hungry compared to multiple lower-order reservation stations.

I have seen numbers of a few tens of functional units in your design. I think it will also become a nightmare to connect and route all the inputs and outputs of all the functional units to the central register file and scoreboard. So at first sight, from a physical implementation point of view for smaller nodes, the Tomasulo algorithm seems more scalable than extended scoreboarding. I indicated before that in smaller nodes power consumption and delay are mainly determined by the length of the interconnects and not by the input load of the logic gates themselves; in 180nm it will be more fifty/fifty. As Jeremy indicated, this comes on top of the power consumption in the register files and cache, which scales with the total bit count of the block and the nRmW order of the block.

Also, the triviality of a big fan-in NOR or NAND gate may be deceptive: these gates are not feasible as single cells and will be synthesized to trees of NAND/NOR gates. In that respect a high fan-in NOR/NAND can have time/power consumption similar to that of a seemingly more complex case or if statement.
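
As a rough illustration of the point about wide gates being synthesized to trees, here is a minimal Python sketch. The function names and the maximum fan-in of 4 are assumptions chosen for illustration; they are not taken from any cell library or from the Libre-SOC design. The sketch groups the inputs level by level and only counts levels and gates; it ignores the NAND/NOR alternation and the wire lengths that a real synthesis tool has to deal with.

```python
import math


def tree_depth(n_inputs: int, max_fan_in: int) -> int:
    """Number of gate levels needed to reduce n_inputs signals to one output
    when every gate accepts at most max_fan_in inputs (hypothetical limit)."""
    depth = 0
    remaining = n_inputs
    while remaining > 1:
        # Each level groups the remaining signals into gates of max_fan_in inputs.
        remaining = math.ceil(remaining / max_fan_in)
        depth += 1
    return depth


def tree_gate_count(n_inputs: int, max_fan_in: int) -> int:
    """Gates used by the same simple level-by-level grouping.
    This is an estimate: a leftover group of one signal is still counted."""
    count = 0
    remaining = n_inputs
    while remaining > 1:
        gates = math.ceil(remaining / max_fan_in)
        count += gates
        remaining = gates
    return count


if __name__ == "__main__":
    # A "single" 64-input NOR built from 4-input cells (assumed fan-in limit).
    print(tree_depth(64, 4), tree_gate_count(64, 4))
```

With these assumptions a 64-input gate becomes three levels and 21 cells, so the one-line HDL expression is not cheaper than other logic with the same number of inputs.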
To zeroth order, for a single-output block, delay and power are determined by the number of inputs, independent of the complexity of the RTL/HDL code. To first order, one has to account for NAND/NOR logic being more efficient than XOR/XNOR logic, but for bigger trees this difference is less pronounced, as XOR/XNOR trees will be synthesized to more efficient trees using AOI (and-or-invert) cells.

greets,
Staf.