Envelope-to: publicinbox@libre-riscv.org
Delivery-date: Fri, 27 Mar 2020 09:25:39 +0000
Message-ID: <6fbfb2a3258be77f4fce69661b283dc31a683f7b.camel@fibraservi.eu>
From: Staf Verhaegen <staf@fibraservi.eu>
To: libre-riscv-dev@lists.libre-riscv.org
Date: Fri, 27 Mar 2020 10:25:24 +0100
In-Reply-To: <CAPweEDwznLD5o6rHfWsSXR-8e1hbAfAB04f5O+YkL6pCwGsNfQ@mail.gmail.com>
References: <CAPweEDx5QCCKxSr1gfuyuw_2D68Ld8fK85bEmmMTZi8S3w2E9g@mail.gmail.com>
 <29b1a9ecedda151dc9c8da6516c3691dfede62ef.camel@fibraservi.eu>
 <CAPweEDwfqMczPjg=5Fvt1J_S8nx1YK44XhyBY8H1abuTNF6=xg@mail.gmail.com>
 <6fa40cb78b3f8c013ca4953ccb4daa5c23e3b501.camel@fibraservi.eu>
 <CAPweEDxiyTEsneXN65Kq0HsEsdL3wdY=NYayq2tz5egXJNCVfg@mail.gmail.com>
 <e430ea6587d292166fd58460adf4dfebfad20c6d.camel@fibraservi.eu>
 <CAPweEDzEvtPYGKvGMvebmQzhJDhSgfvUOVZvB2WXxSbv_ebE8A@mail.gmail.com>
 <b18283c7e7a93fa8afdef2f0a8679b26e4569528.camel@fibraservi.eu>
 <CAPweEDwznLD5o6rHfWsSXR-8e1hbAfAB04f5O+YkL6pCwGsNfQ@mail.gmail.com>
Organization: FibraServi bvba
Mime-Version: 1.0
Subject: Re: [libre-riscv-dev] cache SRAM organisation
Precedence: list
Reply-To: Libre-RISCV General Development
 <libre-riscv-dev@lists.libre-riscv.org>
Content-Type: multipart/mixed; boundary="===============2592891676999365254=="
Errors-To: libre-riscv-dev-bounces@lists.libre-riscv.org
Sender: "libre-riscv-dev" <libre-riscv-dev-bounces@lists.libre-riscv.org>


--===============2592891676999365254==
Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature";
	boundary="=-VGIfDE4gJhKTDrBI5//N"


--=-VGIfDE4gJhKTDrBI5//N
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Luke Kenneth Casson Leighton wrote on Thu 26-03-2020 at 21:37 [+0000]:
> ---
> crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
>
> On Thu, Mar 26, 2020 at 8:18 PM Staf Verhaegen <staf@fibraservi.eu>
> wrote:
> > Luke Kenneth Casson Leighton wrote on Thu 26-03-2020 at 13:05 [+0000]:
> > > On Thursday, March 26, 2020, Staf Verhaegen <staf@fibraservi.eu>
> > > wrote:
> > > > Would like to make a separate side remark here. In ASICs, MUXes
> > > > are relatively expensive gates with respect to delay and power.
> > > > So if this principle is generally applied over the whole design,
> > > > it will make it difficult to make a chip that is competitive in
> > > > power/performance compared to ARM/x86 CPUs.
> > >
> > > just the ALU pipeline registers.  we felt that the advantage of
> > > being able to drop to say 500mhz and halve the number of pipeline
> > > stages to say 5, and also be able to ramp up to 1.6ghz and double
> > > back up to 10 stages, was worth considering.
> >
> > What would be the advantage over running at 800MHz with 5 pipeline
> > stages?
>
> i assume you mean fixed 5-pipeline stages.
>
> the problem is, if you *want* to run at 1.6ghz and have complex
> pipeline stages, you simply can't: 5 stages are too long, the gate
> propagation delay is too large.  the only way to get to 1.6ghz is:
> split those 5 stages into 10 smaller stages.
>
> the problem with _that_ is: if you then run those 10 stages at say
> 800mhz, or say even 400mhz or 100mhz (because you are in power-saving
> mode), you just *massively* increased the latency for completion of
> any given operation.
>
> so even though those 10 stages are so fast (because you are in 14nm)
> that, at 100mhz, they complete in under 5% of a 100mhz clock period,
> if you have a fixed 10-stage pipeline you are absolutely screwed, you
> *have* to have the penalty of the 10-stage pipeline latency.
>
> screwed 1: 5-stage pipeline FORCES you to ONLY be able to run at
> BELOW (e.g.) 800mhz
>
> screwed 2: 10-stage pipeline FORCES you to have massive instruction
> completion latency at below (e.g.) 800mhz.
>
> solution: give every other pipeline stage's registers a
> "combinatorial bypass".
>
> un-screwed 1: when speed is above 800mhz, switch off the
> combinatorial bypass, pipeline becomes 10-stage.
>
> un-screwed 2: when speed is below 800mhz, switch ON the combinatorial
> bypass, latency due to slower clock rate DISAPPEARS because all
> pipelines are now only 5-stage, not 10.

My point is that you will have the same performance for the fixed
5-stage pipeline running @ 800MHz as for the 10-stage pipeline running
@ 1600MHz. Why do you want to run @ 1600MHz?

Actually, the fixed 5-stage 800MHz-capable pipeline will not be able to
run @ 1600MHz when converted to a configurable 5/10-stage pipeline, due
to the additional delay from the MUXes inserted in the path, plus the
fact that you likely can't split up each stage into two stages with
each exactly half of the delay.
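
To put rough numbers on both arguments, here is a minimal Python sketch.
It only does the arithmetic: latency is the stage count times the clock
period, and the maximum clock is set by the slowest stage. The MUX delay
and the split imbalance are made-up values for illustration, not
measurements from any process or from this design.

```python
# Illustrative numbers only: none of these delays come from a real design.

def latency_ns(stages: int, clock_mhz: float) -> float:
    """Time for one operation to pass through `stages` registered stages."""
    return stages * 1000.0 / clock_mhz


def max_clock_mhz(stage_delay_ns: float) -> float:
    """Highest clock at which a stage with this delay still meets timing."""
    return 1000.0 / stage_delay_ns


# Same latency: 5 stages @ 800MHz versus 10 stages @ 1600MHz.
fixed5 = latency_ns(5, 800)
fixed10 = latency_ns(10, 1600)

# At 100MHz a fixed 10-stage pipeline takes twice as long per operation
# as the same logic with every other register bypassed (5 stages).
slow10 = latency_ns(10, 100)
slow5 = latency_ns(5, 100)

# Splitting one 800MHz-capable stage (1.25 ns) into two halves.
full_stage_ns = 1000.0 / 800
mux_ns = 0.05        # assumed delay of the bypass MUX
imbalance_ns = 0.05  # assumed: the split is not exactly 50/50
worst_half_ns = full_stage_ns / 2 + mux_ns + imbalance_ns
split_clock = max_clock_mhz(worst_half_ns)

print(f"5 stages @ 800MHz:   {fixed5:.2f} ns")
print(f"10 stages @ 1600MHz: {fixed10:.2f} ns")
print(f"10 stages @ 100MHz:  {slow10:.0f} ns, 5 stages: {slow5:.0f} ns")
print(f"max clock after split: {split_clock:.0f} MHz")
```

With these assumed delays the split pipeline tops out well below
1600MHz, while the two fixed configurations have identical latency.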
greets,
Staf.


--=-VGIfDE4gJhKTDrBI5//N--


--===============2592891676999365254==--