From: Staf Verhaegen
Date: Wed, 25 Mar 2020 10:54:20 +0000 (+0100)
Subject: Re: [libre-riscv-dev] cache SRAM organisation
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=a319b10f66e330bae9861c8e4776674b6d853ece;p=libre-riscv-dev.git

Re: [libre-riscv-dev] cache SRAM organisation
---

diff --git a/4d/308ca4191db4ebbab634351025cd3a566c71d3 b/4d/308ca4191db4ebbab634351025cd3a566c71d3
new file mode 100644
index 0000000..6b77284
--- /dev/null
+++ b/4d/308ca4191db4ebbab634351025cd3a566c71d3
@@ -0,0 +1,147 @@
+Return-path:
+Envelope-to: publicinbox@libre-riscv.org
+Delivery-date: Wed, 25 Mar 2020 10:54:33 +0000
+Received: from localhost ([::1] helo=libre-riscv.org)
+ by libre-riscv.org with esmtp (Exim 4.89)
+ (envelope-from )
+ id 1jH3g8-0006Xs-H7; Wed, 25 Mar 2020 10:54:32 +0000
+Received: from vps2.stafverhaegen.be ([85.10.201.15])
+ by libre-riscv.org with esmtp (Exim 4.89)
+ (envelope-from ) id 1jH3g7-0006Xm-BV
+ for libre-riscv-dev@lists.libre-riscv.org; Wed, 25 Mar 2020 10:54:31 +0000
+Received: from hpdc7800 (hpdc7800 [10.0.0.1])
+ by vps2.stafverhaegen.be (Postfix) with ESMTP id 96CA511C027D
+ for ;
+ Wed, 25 Mar 2020 11:54:30 +0100 (CET)
+Message-ID: <29b1a9ecedda151dc9c8da6516c3691dfede62ef.camel@fibraservi.eu>
+From: Staf Verhaegen
+To: Libre RISC-V dev list
+Date: Wed, 25 Mar 2020 11:54:20 +0100
+In-Reply-To:
+References:
+Organization: FibraServi bvba
+X-Mailer: Evolution 3.28.5 (3.28.5-5.el7)
+Mime-Version: 1.0
+X-Content-Filtered-By: Mailman/MimeDel 2.1.23
+Subject: Re: [libre-riscv-dev] cache SRAM organisation
+X-BeenThere: libre-riscv-dev@lists.libre-riscv.org
+X-Mailman-Version: 2.1.23
+Precedence: list
+List-Id: Libre-RISCV General Development
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+Reply-To: Libre-RISCV General Development
+
+Content-Type: multipart/mixed; boundary="===============2511390821872612374=="
+Errors-To: libre-riscv-dev-bounces@lists.libre-riscv.org
+Sender: "libre-riscv-dev"
+
+--===============2511390821872612374==
+Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature";
+ boundary="=-hcKE8vRVlmNXPJh1IKgI"
+
+
+--=-hcKE8vRVlmNXPJh1IKgI
+Content-Type: text/plain; charset="UTF-8"
+Content-Transfer-Encoding: quoted-printable
+
+Libre-SOC developers,
+
+That discussion is mainly at the system level, and I don't want to get
+too deep into it as I don't have the time.
+I am providing the SRAM blocks, and it is then up to the system guys to
+decide how to use them. In this case you guys (libre-soc + LIP6) are
+the system guys.
+
+On ASICs, three types of SRAM are commonly provided: a single port RAM,
+a 2-port RAM and a dual port RAM. Currently only a single port SRAM is
+foreseen for NLNet, as it is the most common, the smallest in area per
+bit and the fastest.
+A single port SRAM has one port on which you can do a read or a write
+each clock cycle. The 2-port one has one read port and one write port,
+so you can do a read and a write each clock cycle. The dual port one
+has two ports that can each do a read or a write each clock cycle, so
+you can do two reads, two writes or a read + write each clock cycle.
+For each of them you can have a synchronous or an asynchronous version.
+A synchronous RAM has a clock input, and the address and data inputs
+are latched on that clock signal. This means that the FFs are
+integrated in the SRAM, i.e. very close :) . The RAM currently being
+developed in my NLNet project is a synchronous SRAM, as this is easier
+from a timing point of view because all the timing can be related to
+the clock. A synchronous RAM actually functions as an addressable bunch
+of FFs, and the synthesis and P&R tools know how to handle them.
+
+Given this building block you can now make blocks that look to the
+outside world like blocks with a higher number of ports.
+You do this by instantiating multiple RAM blocks and making sure that
+the content is mirrored between all the blocks. This way you can read
+from the different blocks in parallel. A write still has to go to all
+the blocks at the same time.
+
+So if you take four single port SRAM blocks you can make a four port
+SRAM block. Each cycle you can do 1-4 reads or 1 write, but you can't
+read and write at the same time. With four 2-port RAMs you can do 4
+reads and 1 write each clock cycle. With four dual port RAMs you can do
+4 reads or 3 reads + 1 write or 2 reads + 2 writes each cycle.
+I will provide the single block; combining the blocks has to happen in
+RTL/HDL. For Libre-SOC this means in nmigen, using Coriolis for
+placement and for connecting the single blocks.
+
+Although the SRAM does an operation each clock cycle, its clock
+frequency could be different from that of the rest of the logic. If the
+RAM is fast enough it could run at double the frequency of the core, so
+a single port RAM could basically look like a dual port RAM to the rest
+of the logic, which is running at half the frequency. If the RAM is not
+fast enough, wait states need to be implemented for each operation. The
+maximum clock frequency goes down when you increase the size of a RAM
+block. So on a CPU the L1 cache typically runs at the same clock
+frequency as the core without any wait states, and higher level caches
+are bigger but also introduce more wait states for accessing them.
+If you are thinking about having different clock frequencies in your
+design, you first have to discuss this with Jean-Paul/LIP6, as doing
+multi clock designs opens its own can of worms (cross clock domain
+problems etc). For the October prototype I feel we need to stick with
+single port SRAM blocks and run the whole chip from the same clock.
+IMO, on this prototype you should accept any performance implications
+this has.
+
+greets,
+Staf.
+
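As an illustration of the mirroring scheme described above, here is a minimal behavioural sketch in plain Python. It is not nmigen and not synthesisable; the class names SinglePortSRAM and MirroredSRAM and their methods are hypothetical, chosen only for this example. It models the four-block single port case: up to four reads in one cycle, or one write that goes to every block, but never both in the same cycle.

```python
"""Behavioural (not synthesisable) model of a multi-read-port memory
built from mirrored single port SRAM blocks. All names are hypothetical."""


class SinglePortSRAM:
    """One block: at most one read or one write per clock cycle."""

    def __init__(self, depth):
        self.mem = [0] * depth
        self.busy = False  # True once the port has been used this cycle

    def _claim_port(self):
        if self.busy:
            raise RuntimeError("single port already used this cycle")
        self.busy = True

    def read(self, addr):
        self._claim_port()
        return self.mem[addr]

    def write(self, addr, data):
        self._claim_port()
        self.mem[addr] = data

    def tick(self):
        """Advance one clock cycle: the port becomes free again."""
        self.busy = False


class MirroredSRAM:
    """N single port blocks holding identical content."""

    def __init__(self, n_blocks, depth):
        self.blocks = [SinglePortSRAM(depth) for _ in range(n_blocks)]

    def read(self, addrs):
        """Up to n_blocks reads in one cycle, one read per block."""
        if len(addrs) > len(self.blocks):
            raise ValueError("more reads than mirrored blocks")
        return [blk.read(a) for blk, a in zip(self.blocks, addrs)]

    def write(self, addr, data):
        """A write must reach every block, so it needs every port."""
        if any(blk.busy for blk in self.blocks):
            raise RuntimeError("cannot read and write in the same cycle")
        for blk in self.blocks:
            blk.write(addr, data)

    def tick(self):
        for blk in self.blocks:
            blk.tick()


ram = MirroredSRAM(n_blocks=4, depth=16)
ram.write(3, 0xAB)               # cycle 0: one write, uses all four ports
ram.tick()
values = ram.read([3, 3, 0, 3])  # cycle 1: four reads in parallel
```

Here the write in cycle 0 occupies the port of every block, which is why no read can be served in that cycle; a model of the 2-port case would give each block a separate write port so that the reads and the write no longer compete.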
+Luke Kenneth Casson Leighton wrote on Tue 24-03-2020 at 22:32 [+0000]:
+> https://groups.google.com/d/msg/comp.arch/cbGAlcCjiZE/mgMZVINVIAAJ
+> Staf can i ask you the favour of reviewing Mitch's comments about
+> cache design?
+> in particular the comments about the possibility of using
+> multiported SRAM cells as long as only 1R or 1W is done on any given
+> cell?
+> also something about doing the FFs yourself, close to the SRAM cells?
+> l.
+>
+>
+
+
+
+
+--=-hcKE8vRVlmNXPJh1IKgI--
+
+
+
+--===============2511390821872612374==
+Content-Type: text/plain; charset="utf-8"
+MIME-Version: 1.0
+Content-Transfer-Encoding: base64
+Content-Disposition: inline
+
+X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KbGlicmUtcmlz
+Y3YtZGV2IG1haWxpbmcgbGlzdApsaWJyZS1yaXNjdi1kZXZAbGlzdHMubGlicmUtcmlzY3Yub3Jn
+Cmh0dHA6Ly9saXN0cy5saWJyZS1yaXNjdi5vcmcvbWFpbG1hbi9saXN0aW5mby9saWJyZS1yaXNj
+di1kZXYK
+
+--===============2511390821872612374==--
+
+