From: Luke Kenneth Casson Leighton
Date: Tue, 6 Nov 2018 08:15:16 +0000 (+0000)
Subject: expand architectural requirements page
X-Git-Tag: convert-csv-opcode-to-binary~4855
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=2b43c4dc503526b1bfd970a0cd97da8759bc760e;p=libreriscv.git

expand architectural requirements page
---

diff --git a/3d_gpu/microarchitecture.mdwn b/3d_gpu/microarchitecture.mdwn
index ca99e1441..60756b716 100644
--- a/3d_gpu/microarchitecture.mdwn
+++ b/3d_gpu/microarchitecture.mdwn
@@ -1,3 +1,37 @@
+# High-level architectural Requirements
+
+* SMP Cache coherency (TileLink?)
+* Minimum 800MHz
+* Minimum 2-core SMP, more likely 4-core uniform design,
+  each core with full 4-wide SIMD-style predicated ALUs
+* 6GFLOPS single-precision FP
+* 128 64-bit FP and 128 64-bit INT register files
+* RV64GC compliance
+* 4-lane 1Rx1W SRAMs for registers numbered 32 and above;
+  Multi-R x Multi-W for registers 1-31.
+  TODO: consider 2R for registers to be used as predication targets
+  if >= 32.
+
+# Conversation Notes
+
+----
+
+I'm thinking about using tilelink (or something similar) internally as
+having a cache-coherent protocol is required for implementing Vulkan
+(unless you want to turn off the cache for the GPU memory, which I
+don't think is a good idea), axi is not a cache-coherent protocol,
+and tilelink already has atomic rmw operations built into the protocol.
+We can use an axi to tilelink bridge to interface with the memory.
+
+I'm thinking we will want to have a dual-core GPU since a single
+core with 4xSIMD is too slow to achieve 6GFLOPS with a reasonable
+clock speed. Additionally, that allows us to use an 800MHz core clock
+instead of the 1.6GHz we would otherwise need, allowing us to lower the
+core voltage and save power, since the power used is proportional to
+F\*V^2. (just guessing on clock speeds.)
+
+----
+
 I don't know about power, however I have done some research and a 4Kbyte
 (or 16, icr) SRAM (what I was thinking of for a tile buffer) takes in the
 ballpark of 1000 um^2 in 28nm.
diff --git a/shakti/m_class/libre_3d_gpu.mdwn b/shakti/m_class/libre_3d_gpu.mdwn
index a493ee487..62ae71f7b 100644
--- a/shakti/m_class/libre_3d_gpu.mdwn
+++ b/shakti/m_class/libre_3d_gpu.mdwn
@@ -1,5 +1,7 @@
 # Libre 3D GPU Requirements
 
+See [[3d_gpu/microarchitecture]]
+
 ## GPU capabilities
 
 Based on GC800 the following would be acceptable performance (as would