# Decoder

* Context and walkthrough
* First steps for a newbie developer [[docs/firststeps]]
* bugreport

The decoder is in charge of translating the POWER instruction stream into operations that can be handled by the backend.

Source code:

# POWER

The decoder has been written in python, to parse straight CSV files and other information taken directly from the Power ISA Standards PDF files. This significantly reduces the possibility of manual transcription errors and greatly reduces code size. Based on Anton Blanchard's excellent microwatt design, these tables are in [[openpower/isatables]] which includes links to download the csv files.

The top level decoder object recursively drops through progressive levels of case statement groups, covering additional portions of the incoming instruction bits. More on this technique - for which python and nmigen were *specifically* and strategically chosen - is outlined here

The PowerDecoder2, on encountering for example an ADD operation, needs to know whether Rc=0/1, whether OE=0/1, whether RB is to be read, whether an immediate is to be read and so on. With all of this information being specified in the CSV files, on a per-instruction basis, it is simply a matter of expanding that information out into a data structure called Decode2ToExecute1Type. From there it becomes easily possible for other parts of the processor to take appropriate action.

* [Decode2ToExecute1Type](https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/decoder/decode2execute1.py;hb=HEAD)

## Link to Function Units

The Decoder (PowerDecode2) knows which registers are needed, however what it does not know is:

* which Register file ports to connect to (this is defined by regspecs)
* the order of those regfile ports (again: defined by regspecs)

Neither do the Phase-aware Function Units (derived from MultiCompUnit) themselves know anything about the PowerDecoder, and they certainly do not know when a given instruction will need to tell *them* to read RA, or RB.
For example: negation of RA only requires one operand, whereas add RA, RB requires two. Who tells whom that information, when the ALU's job is simply to add, and the Decoder's job is simply to decode?

This is where a special function called "rdflags()" comes into play. rdflags works closely in conjunction with regspecs and the PowerDecoder2, in each Function Unit's "pipe\_data.py" file. It defines the flags that determine, from the current instruction, whether the Function Unit actually *wants* any given Register Read Ports activated or not. That dynamically-determined information will then actively disable (or allow) Register file Read requests (rd.req) on a per-port basis.

Example:

    class ALUInputData(IntegerData):
        regspec = [('INT', 'ra', '0:63'),      # RA
                   ('INT', 'rb', '0:63'),      # RB/immediate
                   ('XER', 'xer_so', '32'),    # XER bit 32: SO
                   ('XER', 'xer_ca', '34,45')] # XER bit 34/45: CA/CA32

This shows us that the ALU pipeline expects two INTEGER operands (RA and RB), both 64-bit, and that it expects the XER SO, CA and CA32 bits.

However this information - as to which operands are required - is *dynamic*. Continuing from the OP_ADD example, where inspection of the CSV files (or the ISA tables) shows that we optionally need xer_so (OE=1), optionally need xer_ca (Rc=1), and even optionally need RB (add with immediate), we begin to understand that a dynamic system linking the PowerDecoder2 information to the Function Units is needed. This is where power\_regspec\_map.py comes into play.
Its regspec\_decode\_read function begins:

    def regspec_decode_read(e, regfile, name):
        if regfile == 'INT':
            # Int register numbering is *unary* encoded
            if name == 'ra': # RA
                return e.read_reg1.ok, 1<

[...] then wrote a parser and language translator (aka compiler) to convert those code-fragments to python:

then went to a lot of trouble over the course of several months to co-simulate them, update them, and make them accurate according to the actual spec:

and created a fully-functioning python-based OpenPOWER ISA simulator:

there is absolutely no reason why this language-translator (aka compiler) here should not be joined by another compiler, targeting C for use inside the linux kernel, or another compiler which auto-generates C++ for use inside power-gem5, such that this:

becomes an absolute breeze to update.

note that we maintain a decoder which is based on Microwatt: we extracted microwatt's decode1.vhdl into CSV files, and parse them in python as hierarchical recursive data structures:

where the actual CSV files that it reads are here:

this is then combined with *another* table that was extracted from the OpenPOWER v3.0B PDF:

(the parser for that recognises "vertical bars" as being field-separators):

and FINALLY - and this is about the only major piece of code that actually involves any kind of manual code - again it is based on Microwatt decode2.vhdl - we put everything together to turn a binary opcode into "something that needs to be executed":

so our OpenPOWER simulator is actually based on:

* machine-readable CSV files
* machine-readable Field-Form files
* machine-readable spec-accurate pseudocode files

the only reason we haven't used those to turn it into HDL is because doing so is a massive research project, where a first pass would be highly likely to generate sub-optimal HDL.
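
As a rough illustration of what "machine-readable CSV files" driving a decoder can look like, here is a minimal sketch in plain python. It is not the Libre-SOC decoder: the column names, the two table rows and the helper names (`load_table`, `major_opcode`, `decode`) are all hypothetical and chosen only for this example. It assumes nothing more than the idea described above, namely that per-instruction properties live in a table and that the decoder looks them up from the instruction bits.

```python
import csv
import io

# Hypothetical, heavily trimmed table: the column names and rows are
# illustrative only and are NOT copied from the real isatables CSV files.
EXAMPLE_CSV = """opcode,unit,internal_op,in1,in2,rc
14,ALU,OP_ADD,RA_OR_ZERO,CONST_SI,NONE
24,LOGICAL,OP_OR,RS,CONST_UI,NONE
"""


def load_table(text):
    """Parse CSV text into a dict keyed by integer major opcode."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table[int(row["opcode"])] = row
    return table


def major_opcode(insn):
    """Return the top 6 bits (Power ISA bits 0:5) of a 32-bit instruction."""
    return (insn >> 26) & 0x3F


def decode(table, insn):
    """Look up the table row for an instruction, or None if it is unknown."""
    return table.get(major_opcode(insn))


TABLE = load_table(EXAMPLE_CSV)

# 0x38600001 is "addi r3, 0, 1": major opcode 14, RT=3, RA=0, SI=1
row = decode(TABLE, 0x38600001)
print(row["unit"], row["internal_op"], row["in1"], row["in2"])
```

The point of the design is that adding or correcting an instruction means editing a table row rather than decoder logic. A real decoder would go on to subdivide by further opcode fields, which this sketch deliberately leaves out.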