From: Luke Kenneth Casson Leighton
Date: Wed, 27 Jun 2018 09:59:12 +0000 (+0100)
Subject: rename page
X-Git-Tag: convert-csv-opcode-to-binary~5106
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=5a4e07c8ae344a5936c5d97dec1649f6a4a5b3fe;p=libreriscv.git

rename page
---

diff --git a/shakti/m_class/libre_3d_gpu.mdwn b/shakti/m_class/libre_3d_gpu.mdwn
new file mode 100644
index 000000000..1d22554d4
--- /dev/null
+++ b/shakti/m_class/libre_3d_gpu.mdwn
@@ -0,0 +1,117 @@
+# Requirements
+
+## GPU 3D capabilities
+
+Based on GC800 the following would be acceptable performance
+(as would MALI400).
+
+* 35 million triangles/sec
+* 325 million pixels/sec
+* 6 GFLOPS
+
+## GPU size and power
+
+> 1.1. GPU size MUST be < 0.XX mm for ASICs after synthesis with
+> DesignCompiler tool using YY cell library at ZZ nm tech.
+
+basically the power requirement should be at or below around 1 watt
+in 40nm. beyond 1 watt it becomes... difficult. size is not
+particularly critical as such but should not be insane.
+
+so here's a table showing embedded cores:
+
+
+GC800 has (in 40nm):
+
+* 35 million triangles/sec
+* 325 million pixels/sec
+* 6 GFLOPS
+* 1.9mm^2 synthesis area
+* 2.5mm^2 silicon area.
+
+silicon area corresponds *ROUGHLY* with power usage, but PLEASE do
+not take that as absolute, because if you read jeff's nyuzi 2016 paper
+you'll see that getting data through the L1/L2 cache barrier is by far
+and above the biggest eater of power.
+
+note lower down that the numbers for MALI400 are for the *4* core
+version - MALI400-MP4 - where jeff and i compared MALI400 SINGLE CORE
+and discovered that nyuzi, if 4 parallel nyuzi cores were put
+together, would reach only 25% of MALI400's performance (in about the
+same silicon area)
+
+## Other
+
+* Deadline = 12-18 months
+* The GPU is matched by the Gallium3D driver
+* RTL must be sufficient to run on an FPGA.
+* Software must be licensed under LGPLv2+ or BSD/MIT.
+* Hardware (RTL) must be licensed under BSD or MIT with no
+  "NON-COMMERCIAL" CLAUSES.
+* Any proposals will be competing against Vivante GC800 (using Etnaviv driver).
+* The GPU is integrated (like Mali400). So all that the GPU needs to do
+  is write to an area of memory (framebuffer or area of the framebuffer).
+  the SoC - which in this case has a RISC-V core and has peripherals such
+  as the LCD controller - will take care of the rest.
+* In this architecture, the GPU, the CPU and the peripherals are all on
+  the same AXI4 shared memory bus. They all have access to the same shared
+  DDR3/DDR4 RAM. So as a result the GPU will use AXI4 to write directly
+  to the framebuffer and the rest will be handled by the SoC.
+* The job must be done by a team that shows sufficient expertise to
+  reduce the risk. (Do you mean a team with good CVs? What about if the
+  team shows you an acceptable FPGA prototype? I’m talking about a team
+  of students which do not have big industrial CVs but they know how to
+  handle this job (just like RocketChip or MIAOW or etc…).
+
+response:
+
+> Deadline = ?
+
+about 12-18 months which is really tight. if an FPGA (or simulation)
+plus the basics of the software driver are at least prototyped by then
+it *might* be ok.
+
+if using nyuzi as the basis it *might* be possible to begin the
+software port in parallel because jeff went to the trouble of writing
+a cycle-accurate simulation.
+
+
+> The GPU must be matched by the Gallium3D driver
+
+that's the *recommended* approach, as i *suspect* it will result in less
+work than, for example, writing an entire OpenGL stack from scratch.
+
+
+> RTL must be sufficient to run on an FPGA.
+
+a *demo* must run on an FPGA as an initial
+
+> Software must be licensed under LGPLv2+ or BSD/MIT.
+
+and no other licenses. GPLv2+ is out.
+
+> Hardware (RTL) must be licensed under BSD or MIT with no “NON-COMMERCIAL
+> CLAUSES”.
+> Any proposals will be competing against Vivante GC800 (using Etnaviv
+> driver).
+
+in terms of price, performance and power budget, yes. if you look up
+the numbers (triangles/sec, pixels/sec, power usage, die area) you'll
+find it's really quite modest. nyuzi right now requires FOUR times the
+silicon area of e.g. MALI400 to achieve the same performance as MALI400,
+meaning that the power usage alone would be well in excess of the budget.
+
+> The job must be done by a team that shows sufficient expertise to reduce the
+> risk. (Do you mean a team with good CVs? What about if the team shows you an
+> acceptable FPGA prototype?
+
+that would be fantastic as it would demonstrate not only competence
+but also commitment. and will have taken out the "risk" of being
+"unknown", entirely.
+
+> I’m talking about a team of students which do not
+> have big industrial CVs but they know how to handle this job (just like
+> RocketChip or MIAOW or etc…).
+
+ works perfectly for me :)
+
diff --git a/shakti/m_class/libre_3d_gpu.mwdn b/shakti/m_class/libre_3d_gpu.mwdn
deleted file mode 100644
index 1d22554d4..000000000
--- a/shakti/m_class/libre_3d_gpu.mwdn
+++ /dev/null
@@ -1,117 +0,0 @@
-# Requirements
-
-## GPU 3D capabilities
-
-Based on GC800 the following would be acceptable performance
-(as would MALI400).
-
-* 35 million triangles/sec
-* 325 million pixels/sec
-* 6 GFLOPS
-
-## GPU size and power
-
-> 1.1. GPU size MUST be < 0.XX mm for ASICs after synthesis with
-> DesignCompiler tool using YY cell library at ZZ nm tech.
-
-basically the power requirement should be at or below around 1 watt
-in 40nm. beyond 1 watt it becomes... difficult. size is not
-particularly critical as such but should not be insane.
-
-so here's a table showing embedded cores:
-
-
-GC800 has (in 40nm):
-
-* 35 million triangles/sec
-* 325 million pixels/sec
-* 6 GFLOPS
-* 1.9mm^2 synthesis area
-* 2.5mm^2 silicon area.
-
-silicon area corresponds *ROUGHLY* with power usage, but PLEASE do
-not take that as absolute, because if you read jeff's nyuzi 2016 paper
-you'll see that getting data through the L1/L2 cache barrier is by far
-and above the biggest eater of power.
-
-note lower down that the numbers for MALI400 are for the *4* core
-version - MALI400-MP4 - where jeff and i compared MALI400 SINGLE CORE
-and discovered that nyuzi, if 4 parallel nyuzi cores were put
-together, would reach only 25% of MALI400's performance (in about the
-same silicon area)
-
-## Other
-
-* Deadline = 12-18 months
-* The GPU is matched by the Gallium3D driver
-* RTL must be sufficient to run on an FPGA.
-* Software must be licensed under LGPLv2+ or BSD/MIT.
-* Hardware (RTL) must be licensed under BSD or MIT with no
-  "NON-COMMERCIAL" CLAUSES.
-* Any proposals will be competing against Vivante GC800 (using Etnaviv driver).
-* The GPU is integrated (like Mali400). So all that the GPU needs to do
-  is write to an area of memory (framebuffer or area of the framebuffer).
-  the SoC - which in this case has a RISC-V core and has peripherals such
-  as the LCD controller - will take care of the rest.
-* In this architecture, the GPU, the CPU and the peripherals are all on
-  the same AXI4 shared memory bus. They all have access to the same shared
-  DDR3/DDR4 RAM. So as a result the GPU will use AXI4 to write directly
-  to the framebuffer and the rest will be handled by the SoC.
-* The job must be done by a team that shows sufficient expertise to
-  reduce the risk. (Do you mean a team with good CVs? What about if the
-  team shows you an acceptable FPGA prototype? I’m talking about a team
-  of students which do not have big industrial CVs but they know how to
-  handle this job (just like RocketChip or MIAOW or etc…).
-
-response:
-
-> Deadline = ?
-
-about 12-18 months which is really tight. if an FPGA (or simulation)
-plus the basics of the software driver are at least prototyped by then
-it *might* be ok.
-
-if using nyuzi as the basis it *might* be possible to begin the
-software port in parallel because jeff went to the trouble of writing
-a cycle-accurate simulation.
-
-
-> The GPU must be matched by the Gallium3D driver
-
-that's the *recommended* approach, as i *suspect* it will result in less
-work than, for example, writing an entire OpenGL stack from scratch.
-
-
-> RTL must be sufficient to run on an FPGA.
-
-a *demo* must run on an FPGA as an initial
-
-> Software must be licensed under LGPLv2+ or BSD/MIT.
-
-and no other licenses. GPLv2+ is out.
-
-> Hardware (RTL) must be licensed under BSD or MIT with no “NON-COMMERCIAL
-> CLAUSES”.
-> Any proposals will be competing against Vivante GC800 (using Etnaviv
-> driver).
-
-in terms of price, performance and power budget, yes. if you look up
-the numbers (triangles/sec, pixels/sec, power usage, die area) you'll
-find it's really quite modest. nyuzi right now requires FOUR times the
-silicon area of e.g. MALI400 to achieve the same performance as MALI400,
-meaning that the power usage alone would be well in excess of the budget.
-
-> The job must be done by a team that shows sufficient expertise to reduce the
-> risk. (Do you mean a team with good CVs? What about if the team shows you an
-> acceptable FPGA prototype?
-
-that would be fantastic as it would demonstrate not only competence
-but also commitment. and will have taken out the "risk" of being
-"unknown", entirely.
-
-> I’m talking about a team of students which do not
-> have big industrial CVs but they know how to handle this job (just like
-> RocketChip or MIAOW or etc…).
-
- works perfectly for me :)
-