From: Luke Kenneth Casson Leighton
Date: Wed, 27 Jun 2018 09:56:44 +0000 (+0100)
Subject: add libre 3d gpu page
X-Git-Tag: convert-csv-opcode-to-binary~5108
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=222dbad5ff73ded09fba7ab36d206d62bc44e6be;p=libreriscv.git

add libre 3d gpu page
---

diff --git a/shakti/m_class/libre_3d_gpu.mwdn b/shakti/m_class/libre_3d_gpu.mwdn
new file mode 100644
index 000000000..fc070176c
--- /dev/null
+++ b/shakti/m_class/libre_3d_gpu.mwdn
@@ -0,0 +1,109 @@
+# Requirements
+
+## GPU size and power
+
+> 1.1. GPU size MUST be < 0.XX mm for ASICs after synthesis with
+> DesignCompiler tool using YY cell library at ZZ nm tech.
+
+basically the power requirement should be at or below around 1 watt
+in 40nm. beyond 1 watt it becomes... difficult. size is not
+particularly critical as such but should not be insane.
+
+so here's a table showing embedded cores:
+
+
+GC800 has (in 40nm):
+
+* 35 million triangles/sec
+* 325 million pixels/sec
+* 6 GFLOPS
+* 1.9mm^2 synthesis area
+* 2.5mm^2 silicon area.
+
+silicon area corresponds *ROUGHLY* with power usage, but PLEASE do
+not take that as absolute, because if you read jeff's nyuzi 2016 paper
+you'll see that getting data through the L1/L2 cache barrier is by far
+and above the biggest eater of power.
+
+note lower down that the numbers for MALI400 are for the *4* core
+version - MALI400-MP4 - where jeff and i compared MALI400 SINGLE CORE
+and discovered that nyuzi, if 4 parallel nyuzi cores were put
+together, would reach only 25% of MALI400's performance (in about the
+same silicon area)
+
+## Other
+
+
+* Deadline = 12-18 months
+* The GPU is matched by the Gallium3D driver
+* RTL must be sufficient to run on an FPGA.
+* Software must be licensed under LGPLv2+ or BSD/MIT.
+* Hardware (RTL) must be licensed under BSD or MIT with no
+ "NON-COMMERCIAL" CLAUSES.
+* Any proposals will be competing against Vivante GC800 (using Etnaviv driver).
+* The GPU is integrated (like Mali400). So all that the GPU needs to do
+ is write to an area of memory (framebuffer or area of the framebuffer).
+ the SoC - which in this case has a RISC-V core and has peripherals such
+ as the LCD controller - will take care of the rest.
+* In this architecture, the GPU, the CPU and the peripherals are all on
+ the same AXI4 shared memory bus. They all have access to the same shared
+ DDR3/DDR4 RAM. So as a result the GPU will use AXI4 to write directly
+ to the framebuffer and the rest will be handled by the SoC.
+* The job must be done by a team that shows sufficient expertise to
+ reduce the risk. (Do you mean a team with good CVs? What about if the
+ team shows you an acceptable FPGA prototype? I’m talking about a team
+ of students which do not have big industrial CVs but they know how to
+ handle this job (just like RocketChip or MIAOW or etc…).
+
+response:
+
+> Deadline = ?
+
+about 12-18 months which is really tight. if an FPGA (or simulation)
+plus the basics of the software driver are at least prototyped by then
+it *might* be ok.
+
+if using nyuzi as the basis it *might* be possible to begin the
+software port in parallel because jeff went to the trouble of writing
+a cycle-accurate simulation.
+
+
+> The GPU must be matched by the Gallium3D driver
+
+that's the *recommended* approach, as i *suspect* it will result in less
+work than, for example, writing an entire OpenGL stack from scratch.
+
+
+> RTL must be sufficient to run on an FPGA.
+
+a *demo* must run on an FPGA as an initial
+
+> Software must be licensed under LGPLv2+ or BSD/MIT.
+
+and no other licenses. GPLv2+ is out.
+
+> Hardware (RTL) must be licensed under BSD or MIT with no “NON-COMMERCIAL
+> CLAUSES”.
+> Any proposals will be competing against Vivante GC800 (using Etnaviv
+> driver).
+
+in terms of price, performance and power budget, yes. if you look up
+the numbers (triangles/sec, pixels/sec, power usage, die area) you'll
+find it's really quite modest. nyuzi right now requires FOUR times the
+silicon area of e.g. MALI400 to achieve the same performance as MALI400,
+meaning that the power usage alone would be well in excess of the budget.
+
+> The job must be done by a team that shows sufficient expertise to reduce the
+> risk. (Do you mean a team with good CVs? What about if the team shows you an
+> acceptable FPGA prototype?
+
+that would be fantastic as it would demonstrate not only competence
+but also commitment. and will have taken out the "risk" of being
+"unknown", entirely.
+
+> I’m talking about a team of students which do not
+> have big industrial CVs but they know how to handle this job (just like
+> RocketChip or MIAOW or etc…).
+
+ works perfectly for me :)
+