Envelope-to: publicinbox@libre-riscv.org
Delivery-date: Thu, 26 Mar 2020 12:19:01 +0000
Message-ID: <db557324bcb76a999d8f66c75b9319974a1a1e08.camel@fibraservi.eu>
From: Staf Verhaegen <staf@fibraservi.eu>
To: libre-riscv-dev@lists.libre-riscv.org
Date: Thu, 26 Mar 2020 13:18:53 +0100
In-Reply-To: <CAPweEDxiyTEsneXN65Kq0HsEsdL3wdY=NYayq2tz5egXJNCVfg@mail.gmail.com>
References: <CAPweEDx5QCCKxSr1gfuyuw_2D68Ld8fK85bEmmMTZi8S3w2E9g@mail.gmail.com>
 <29b1a9ecedda151dc9c8da6516c3691dfede62ef.camel@fibraservi.eu>
 <CAPweEDwfqMczPjg=5Fvt1J_S8nx1YK44XhyBY8H1abuTNF6=xg@mail.gmail.com>
 <6fa40cb78b3f8c013ca4953ccb4daa5c23e3b501.camel@fibraservi.eu>
 <CAPweEDxiyTEsneXN65Kq0HsEsdL3wdY=NYayq2tz5egXJNCVfg@mail.gmail.com>
Organization: FibraServi bvba
Mime-Version: 1.0
Subject: Re: [libre-riscv-dev] cache SRAM organisation
Precedence: list
Reply-To: Libre-RISCV General Development
 <libre-riscv-dev@lists.libre-riscv.org>
Content-Type: multipart/mixed; boundary="===============2305731690204497899=="
Errors-To: libre-riscv-dev-bounces@lists.libre-riscv.org
Sender: "libre-riscv-dev" <libre-riscv-dev-bounces@lists.libre-riscv.org>


--===============2305731690204497899==
Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature";
	boundary="=-ikvahZFcA7Cn0qEbbFmM"


--=-ikvahZFcA7Cn0qEbbFmM
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Luke Kenneth Casson Leighton wrote on Wed 25-03-2020 at 15:53 [+0000]:
> On Wed, Mar 25, 2020 at 1:46 PM Staf Verhaegen <staf@fibraservi.eu> wrote=
:
>=20
> > > this because it turns out that asynchronous SRAM can act, when =
used in a Register File, as if it was a (separate) Register Bypass / =
Forwarding Port.  with the Out-of-Order Engine being a huge cyclic =
feedback loop between ALUs and RegFile, clock delays are an =
impediment, and having completely separate (extra) Regfile Bypass =
ports dramatically increases the number of wires and Multiplexers.
> >=20
> > Could you detail how the address, data and output signals of this =
asynchronous block would be used and switched between synchronous and =
asynchronous operation? To me it seems that it would just change the =
place of the multiplexers, not their number.
>=20
> ok i will try to outline it, there is quite a lot of detail, i apologise.
> it's basically the "pass-through" system used in nmigen FIFOs, and =
the "synchronous" mode of nmigen Memory.  the requirements are: what =
is written has to be available *combinatorially* - i.e. on the same =
clock cycle - if simultaneously read via another port.
> now, yes i took note that this is not supposed to be permitted: =
you're not normally permitted to be able to read *and* write to an =
SRAM cell at the same time.  however, that's exactly what we need.

You seem to be mixing up two different concepts, namely synchronicity =
and write-through. Synchronous means signals are synced with an edge =
of a (clock) signal. SRAM write-through means that after a write =
operation the Q output also presents the data you have just written. =
These two concepts are orthogonal to each other.
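
A minimal behavioural sketch in Python of the distinction above. It is
not taken from any SRAM design: the class SyncSram, its write_through
flag and the demo function are hypothetical names for illustration,
and holding Q during a write without write-through is an assumption.
Both variants are synchronous, since Q only changes on a clock edge;
they differ only in whether a write also updates Q.

```python
class SyncSram:
    """Behavioural model of a single-port synchronous SRAM (sketch)."""

    def __init__(self, depth, write_through):
        self.mem = [0] * depth
        self.write_through = write_through
        self.q = 0  # registered output, changes only on a clock edge

    def clock(self, addr, we, d):
        """One active clock edge: sample addr/we/d, then update Q."""
        if we:
            self.mem[addr] = d
            if self.write_through:
                self.q = d  # the written data also appears on Q
            # Assumption: without write-through, Q keeps its old value.
        else:
            self.q = self.mem[addr]
        return self.q


def demo():
    """Run the same clocked sequence with and without write-through."""
    results = {}
    for wt in (False, True):
        ram = SyncSram(4, wt)
        ram.clock(0, True, 5)      # write 5 to address 0
        ram.clock(0, False, 0)     # read it back, Q becomes 5
        results[wt] = ram.clock(0, True, 9)  # overwrite with 9
    return results
```

In this model the final write leaves Q at the old value 5 without
write-through and at the new value 9 with it, while the clocking is
identical in both cases.
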
The current synchronous SRAM being developed will most likely have =
write-through behavior; this will be confirmed before the May test =
chip tape-out. It will add delay to the signal, though. I need to =
check whether it has changed, but in the OpenRAM 0.35um test tape-out =
I did, the address and data inputs were latched on the rising edge =
and the Q output was updated on the falling edge of the clock. So the =
delay on the Q output is half a clock cycle plus the internal delay =
on the output latch enable signal.
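
A small worked example of the delay estimate above, as a sketch only.
The function names are hypothetical, and any numbers fed in (such as a
10 ns clock period and a 1 ns latch-enable delay) are made up for
illustration, not measurements from the OpenRAM test chip.

```python
def q_output_delay_ns(clock_period_ns, latch_enable_delay_ns):
    """Delay from the rising edge (inputs latched) to an updated Q.

    Q is updated on the falling edge, so half a clock period passes
    first, followed by the internal output-latch-enable delay.
    """
    return clock_period_ns / 2 + latch_enable_delay_ns


def remaining_budget_ns(clock_period_ns, latch_enable_delay_ns):
    """Time left before the next rising edge for logic driven by Q."""
    delay = q_output_delay_ns(clock_period_ns, latch_enable_delay_ns)
    return clock_period_ns - delay
```

With the made-up values, q_output_delay_ns(10.0, 1.0) evaluates to
half of 10 ns plus 1 ns, which leaves less than half of the cycle for
any logic that consumes Q before the next rising edge.
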
So if the timing of the write-through is critical, it is still best =
to include MUXes, as said in Jacob's reply, to allow the signal to be =
bypassed. I have seen SRAM that did include an AWT (asynchronous =
write-through), but this just moved the MUXes inside the SRAM block, =
and it also adds them when you don't need the AWT. So I would like to =
keep these MUXes external, added only if needed.
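
The external bypass MUX described above can be sketched behaviourally
in Python. This is an illustration under assumptions, not an actual
design from this thread: RegFile1R1W is a hypothetical one-read,
one-write memory with no internal write-through, and bypass_mux is a
hypothetical forwarding MUX outside the block that selects the write
data when the read and write addresses match in the same cycle.

```python
class RegFile1R1W:
    """Synchronous 1-read/1-write memory, no internal write-through."""

    def __init__(self, depth):
        self.mem = [0] * depth

    def cycle(self, raddr, we, waddr, wdata):
        """One clock cycle: the read sees the value stored before it."""
        rdata = self.mem[raddr]
        if we:
            self.mem[waddr] = wdata
        return rdata


def bypass_mux(rdata, raddr, we, waddr, wdata):
    """External forwarding MUX: pick wdata on a same-cycle match."""
    if we and raddr == waddr:
        return wdata
    return rdata


def read_with_bypass(rf, raddr, we, waddr, wdata):
    """Memory read followed by the external bypass MUX."""
    rdata = rf.cycle(raddr, we, waddr, wdata)
    return bypass_mux(rdata, raddr, we, waddr, wdata)
```

Keeping bypass_mux outside the memory model mirrors the preference
above: a user who does not need forwarding calls cycle directly and
pays for no extra MUX.
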
In theory on a single port SRAM
> a workaround (fallback position) is, we use DFF latches.  i created =
a "bypass latch" function which creates DFF latches with such a =
combinatorial bypass: we actually use them quite a lot (including =
between pipeline stages so that we can programmatically cut the =
number of pipeline stages in half at the flick of a switch).
> however for the Register File we would not "switch" between =
synchronous / asynchronous mode.  the reason why we need the =
synchronous mode is because some Function Units will be sitting idle, =
waiting for their input operands, which have to come from other =
Function Units as "results".

I can understand that you do this to implement functional units with =
configurable pipeline length, but I would strongly discourage =
pipelining register files one after another. If the latter is =
excluded, would you still need an asynchronous RAM block?

greets,
Staf.


--=-ikvahZFcA7Cn0qEbbFmM--


--===============2305731690204497899==--