X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=shakti%2Fm_class.mdwn;h=7f6cdd84cb0f11f682bea9499aab1c54a8d2a2ad;hb=85263dcfdd213916d12961c5ec61d30e07916082;hp=eff8d8740011fd9a058b5b85a12981073b4fea1e;hpb=f4b82dc9d42cb8f37721d39fe4dbb88160faf593;p=libreriscv.git

diff --git a/shakti/m_class.mdwn b/shakti/m_class.mdwn
index eff8d8740..7f6cdd84c 100644
--- a/shakti/m_class.mdwn
+++ b/shakti/m_class.mdwn
@@ -10,11 +10,13 @@ yields.
 * See [[pinouts]] for auto-generated table of pinouts (including mux)
 * See [[peripheralschematics]] for example Reference Layouts
 * See [[ramanalysis]] for a comprehensive analysis of why DDR3 is to be used.
+* See [[todo]] for a rough list of tasks (and link to bugtracker)
+*
 
 ## Rough specification.
 
-Quad-core 28nm RISC-V 64-bit (RISCV64GC core with Vector SIMD Media / 3D
-extensions), 300-pin 15x15mm BGA 0.8mm pitch, 32-bit DDR3/DDR3L/LPDDR3
+Quad-core 28nm OpenPOWER 64-bit (OpenPOWER v3.0B core with Simple-V Vector Media / 3D
+extensions), 300-pin 15x15mm BGA 0.8mm pitch, 32-bit DDR3-4/LPDDR3/4
 memory interface and libre / open interfaces and accelerated hardware
 functions suitable for the higher-end, low-power, embedded, industrial
 and mobile space.
@@ -24,13 +26,22 @@ to be used (8-10mil) and 4-5mil tracks with 4mil clearance.
 
 For details see
 
+[[shakti_libre_riscv.jpg]]
+
+## Die area estimates
+
+*
+* 40nm 64-bit rocket single-core single-issue in-order: 0.14mm^2
+* 40nm 16-16k L1 caches, 0.25mm^2
+*
+
 ## Targetting full Libre Licensing to the bedrock.
 
 The only barrier to being able to replicate the masks from scratch
 is the proprietary cells (e.g. memory cells) designed by the Foundries:
 there is a potential long-term strategy in place to deal with that issue.
 
-The only proprietary interface utilised in the entire SoC is the DDR3
+The only proprietary interface utilised in the entire SoC is the DDR3/4
 PHY plus Controller, which will be replaced in a future revision, making
 the entire SoC exclusively designed and made from fully libre-licensed
 BSD and LGPL openly and freely accessible VLSI and VHDL source.
@@ -38,7 +49,7 @@ BSD and LGPL openly and freely accessible VLSI and VHDL source.
 In addition, no proprietary firmware whatsoever will be required to
 operate or boot the device right from the bedrock: the entire software
 stack will also be libre-licensed (even for programming the initial
-proprietary DDR3 PHY+Controller)
+proprietary DDR3/4 PHY+Controller)
 
 # Inspiration from several sources
 
@@ -89,7 +100,7 @@ firmly a priority focus.
 
 ## Common Peripherals to majority of target markets
 
-* SPI or 8080 or RGB/TTL or LVDS LCD display. SPI: 320x240. LVDS: 1440x900.
+* SPI or 8080 or [RGB/TTL](RGBTTL) or LVDS LCD display. SPI: 320x240. LVDS: 1440x900.
 * LCD Backlight, requires GPIO power-control plus PWM for brightness control
 * USB-OTG Port (OTG-Host, OTG Client, Charging capability)
 * Baseband Modem (GSM / GPRS / 3G / LTE) requiring USB, UART, and PCM audio
@@ -98,14 +109,19 @@ firmly a priority focus.
 * SD/MMC for external MicroSD
 * SD/MMC for on-PCB eMMC (care needed on power/boot sequence)
 * NAND Flash (not recommended), requires 8080/ATI-style Bus with dedicated CS#
-* Optional 4-wire SPI NAND/NOR for boot (XIP - Execute In-place - recommended).
-* Audio over I2S (5-pin: 4 for output, 1 for input), fall-back to USB Audio
+* Optional 4-wire [[QSPI]] NAND/NOR for boot (XIP - Execute In-place - recommended).
+* Audio over [[I2S]] (5-pin: 4 for output, 1 for input), fall-back to USB Audio
+* Audio also over [[AC97]]
 * Some additional SPI peripherals, e.g. connection to low-power MCU.
 * GPIO (EINT-capable, with wakeup) for buttons, power, volume etc.
 * Camera(s) either by CSI-1 (parallel CSI) or better by USB
 * I2C sensors: accelerometer, compass, etc. Each requires EINT and RST GPIO.
 * Capacitive Touchpanel (I2C and also requiring EINT and RST GPIO)
 * Real-time Clock (usually an I2C device but may be on-board a support MCU)
+* [[PCIe]] via PXPIPE
+* [[LPC]] from Raptor Engineering
+* [[USB3]]
+* [[RGMII]] Gigabit Ethernet
 
 ## Peripherals unique to laptop market
 
@@ -114,13 +130,13 @@ firmly a priority focus.
 
 ## Peripherals common to laptop and Industrial Market
 
-* Ethernet (RGMII or better 8080-style XT/AT/ATI MCU bus)
+* Ethernet ([[RGMII]] or better 8080-style XT/AT/ATI MCU bus for e.g. DM9000)
 
 ## Augmentation by an embedded MCU
 
 Some functions, particularly analog, are particularly tricky to implement
-in an early SoC. In addition, CAN is still patented. For unusual, patented
-or analog functionality such as CAN, RTC, ADC, DAC, SPDIF, One-wire Bus
+in an early SoC. In addition, CAN is still patented (not any more). For unusual, patented
+or analog functionality such as RTC, ADC, DAC, SPDIF, One-wire Bus
 and so on it is easier and simpler to deploy an ultra-low-cost low-speed
 companion Micro-Controller such as the crystal-less STMS8003 ($0.24) or
 the crystal-less STM32F072 or other suitable MCU, depending on requirements.
@@ -163,11 +179,18 @@ image acceleration, scalable fonts, and Z-buffering and much more.
+
+
 ### 3D acceleration
 
 * MIAOW: ATI-compatible shader engine
 * ORSOC GPU contains some primitives that can be used
-* SIMD RISC-V extensions can obviate the need for a "full" separate GPU
+* Simple-V Vector extensions can obviate the need for a "full" separate GPU
+* Nyuzi (OpenMP, based on Intel Larabee Compute Engine)
+* Rasteriser
+* OpenShader
+* GPLGPU
+* FlexGripPlus
 
 ### Video encode / decode
 
@@ -189,140 +212,56 @@ TBD
 
 # Proposed Interfaces
 
+* Plain [[GPIO]] multiplexed with a [[pinmux]] onto (nearly) all other pins
 * RGB/TTL up to 1440x900 @ 60fps, 24-bit colour
-* 2x 1-lane SPI
-* 1x 4-lane (quad) SPI
+* 2x 1-lane [[SPI]]
+* 1x 4-lane (quad) [[QSPI]]
 * 4x SD/MMC (1x 1/2/4/8-bit, 3x 1/2/4-bit)
-* 2x full UART incl. CTS/RTS
-* 3x UART (TX/RX only)
+* 2x full [[UART]] incl. CTS/RTS
+* 3x [[UART]] (TX/RX only)
 * 3x [[I2C]] (in case of address clashes between peripherals)
 * 8080-style AT/XT/ATI MCU Bus Interface, with multiple (8x CS#) lines
-* 3x PWM-capable GPIO
-* 32x EINT-cable GPIO with full edge-triggered and low/high IRQ capability
+* 3x [[PWM]]-capable GPIO
+* 32x [[EINT]]-cable GPIO with full edge-triggered and low/high IRQ capability
 * 1x [[I2S]] audio with 4-wire output and 1-wire input.
-* 3x USB2 (ULPI for reduced pincount) each capable of USB-OTG support
-* DDR3/DDR3L/LPDDR3 32-bit-wide memory controller
+* 3x [[USB2]] ([[ULPI]] for reduced pincount) each capable of USB-OTG support
+* [[DDR]] DDR3/DDR3L/LPDDR3 32-bit-wide memory controller
+* [[JTAG]] for debugging
 
 Some interfaces at:
 
+*
 * includes GPIO, SPI, UART, JTAG, I2C, PinCtrl, UART and PWM. Also included is a Watchdog Timer and others.
 * Pinmux ("IOF") for multiplexing several I/O functions onto a single pin
-
-## I2C
-
-At its own page [[I2C]]
-
-## I2S
-
-At its own page [[I2S]]
-
-## FlexBus
-
-FlexBus is capable of emulating the 8080-style / ATI MCU Bus, as well as
-providing support for access to SRAM. It is extremely likely that it will
-provide access to MCU-style Ethernet PHY ICs such as the DM9000, the
-AX88180 (gigabit ethernet but an enormous number of pins), the AX88796A
-(8/16-bit 80186 or MC68k).
-
-## RGB/TTL interface
-
- full linux kernel driver also available
-
-## SPI
-
-* APB to SPI
-* ASIC-proven
-* Wishbone-compliant
-
-## SD/MMC (including eMMC)
-
-*
-* (needs work)
-
-# Pin Multiplexing
-
-Complex! Covered in [[pinouts]]. The general idea is to target several
-distinct applications and, by trial-and-error, create a pinmux table that
-successfully covers all the target scenarios by providing absolutely all
-required functions for each and every target. A few general rules:
-
-* Different functions (SPI, I2C) which overlap on the same pins on one
-  bank should also be duplicated on completely different banks, both from
-  each other and also the bank on which they overlap. With each bank having
-  separate Power Domains this strategy increases the chances of being able
-  to place low-power and high-power peripherals and sensors on separate
-  GPIO banks without needing external level-shifters.
-* Functions which have optional bus-widths (eMMC: 1/2/4/8) may have more
-  functions overlapping them than would otherwise normally be considered.
-* Then the same overlapped high-order bus pins can also be mapped onto
-  other pins. This particularly applies to the very large buses, such
-  as FlexBus (over 50 pins). However if the overlapped pins are on a
-  different bank it becomes necessary to have both banks run in the same
-  GPIO Power Domain.
-* All functions should really be pin-muxed at least twice, preferably
-  three times. Four or more times on average makes it pointless to
-  even have four-way pinmuxing at all, so this should be avoided.
-  The only exceptions (functions which have not been pinmuxed multiple
-  times) are the RGB/TTL LCD channel, and both ULPI interfaces.
-
-## GPIO Pinmux Power Domains
-
-Of particular importance is the Power Domains for the GPIO. Realistically
-it has to be flexible (simplest option: recommended to be between
-1.8v and 3.3v) as the majority of low-cost mass-produced sensors and
-peripherals on I2C, SPI, UART and SD/MMC are at or are compatible with
-this voltage range. Long-tail (older / stable / low-cost / mass-produced)
-peripherals in particular tend to be 3.3v, whereas newer ones with a
-particular focus on Mobile tend to be 1.2v to 1.8v.
-
-A large percentage of sensors and peripherals have separate IO voltage
-domains from their main supply voltage: a good example is the SN75LVDS83b
-which has one power domain for the RGB/TTL I/O, one for the LVDS output,
-and one for the internal logic controller (typical deployments tend not
-to notice the different power-domain capability, as they usually supply all
-three voltages at 3.3v).
-
-Relying on this capability, however, by selecting a fixed voltage for
-the entire SoC's GPIO domain, is simply not a good idea: all sensors
-and peripherals which do not have a variable (VREF) capability for the
-logic side, or coincidentally are not at the exact same fixed voltage,
-will simply not be compatible if they are high-speed CMOS-level push-push
-driven. Open-Drain on the other hand can be handled with a MOSFET for
-two-way or even a diode for one-way depending on the levels, but this means
-significant numbers of external components if the number of lines is large.
-
-So, selecting a fixed voltage (such as 1.8v or 3.3v) results in a bit of a
-problem: external level-shifting is required on pretty much absolutely every
-single pin, particularly the high-speed (CMOS) push-push I/O. An example: the
-DM9000 is best run at 3.3v. A fixed 1.8v FlexBus would
-require a whopping 18 pins (possibly even 24 for a 16-bit-wide bus)
-worth of level-shifting, which is not just costly
-but also a huge amount of PCB space: bear in mind that for level-shifting, an
-IC with **double** the number of pins being level-shifted is required.
-
-Given that level-shifting is an unavoidable necessity, and external
-level-shifting has such high cost(s), the workable solution is to
-actually include GPIO-group level-shifting actually on the SoC die,
-after the pin-muxer at the front-end (on the I/O pads of the die),
-on a per-bank basis. This is an extremely common technique that is
-deployed across a very wide range of mass-volume SoCs.
-
-One very useful side-effect for example of a variable Power Domain voltage
-on a GPIO bank containing SD/MMC functionality is to be able to change the
-bank's voltage from 3.3v to 1.8v, to match an SD Card's capabilities, as
-permitted under the SD/MMC Specification. The alternative is to be forced to
-deploy an external level-shifter IC (if PCB space and BOM target allows) or to
-fix the voltage at 3.3v and thus lose access to the low-power and higher-speed
-capabilities of modern SD Cards.
-
-In summary: putting level shifters right at the I/O pads of the SoC, after
-the pin-mux (so that the core logic remains at the core voltage) is a
-cost-effective solution that can have additional unintended side-benefits
-and cost savings beyond simply saving on external level-shifting components
-and board space.
+*
+  including AXI, DMA, GPIO, I2C, JTAG, PLIC, QSPI, SDRAM, UART (and TCM?).
+  FlexBus, HyperBus and xSPI to be added.
+
+List of Interfaces:
+
+* [[CSI]]
+* [[DDR]]
+* [[JTAG]]
+* [[I2C]]
+* [[I2S]]
+* [[PWM]]
+* [[EINT]]
+* [[FlexBus]]
+* LCD / RGB/TTL [[RGBTTL]]
+* [[SPI]]
+* [[QSPI]]
+* SD/MMC and eMMC [[sdmmc]]
+* Pin Multiplexing [[pinmux]]
+* Gigabit Ethernet [[RGMII]]
+* SDRAM [[sdram]]
+
+List of Internal Interfaces:
+
+* [[AXI]]
+* [[wishbone]]
 
 # Items requiring clarification, or proposals TBD
 
@@ -412,10 +351,24 @@ and accurate PLL clock timing provided, it may become possible to
 bit-bang and software-emulate high-speed interfaces such as SATA, HDMI,
 PCIe and many more.
 
+# Testing
+
+* cocotb
+* cocotb AXI4 stream interface
+
 # Research (to investigate)
 
+* LPC Interface
 *
 *
 * 110nm DDR3 PHY
+* myhdl HDL cores
+* B Extension proposal
+* Bit-extracts
+* Bit-reverse
+* Bit-permutations
+* Commentary on Micro-controller
+* P-SIMD
+
+> [[!tag cpus]]
-