From: Rob Clark
Date: Thu, 23 Jul 2020 21:59:38 +0000 (-0700)
Subject: freedreno: slurp in rnndb
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=b721d336da94b2ea26744a2961d026081f6ba0a3;p=mesa.git

freedreno: slurp in rnndb

Pull in all of $envytools/rnndb (including display, etc) from envytools
commit 6ccdda33ac4d88e19d2a70e1b4edaaab5ec4b026

This changes the directory structure to match the organization in the
envytools tree.

Signed-off-by: Rob Clark
Part-of:
---

diff --git a/src/freedreno/registers/a2xx.xml b/src/freedreno/registers/a2xx.xml
deleted file mode 100644
index 549ce1ec6d9..00000000000
--- a/src/freedreno/registers/a2xx.xml
+++ /dev/null
note: only 0x3f worth of valid register values for VS_REGS and - PS_REGS, but high bit is set to indicate '0 registers usedexture state dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - " - " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
diff --git a/src/freedreno/registers/a3xx.xml b/src/freedreno/registers/a3xx.xml
deleted file mode 100644
index 0819dc4ede7..00000000000
--- a/src/freedreno/registers/a3xx.xml
+++ /dev/null
he pair of MEM_SIZE/ADDR registers get programmed - in sequence with the size/addr of each buffer. - - - - - - - - - - - - - - - - - aka clip_halfz - - - - - - - - - - - - - - - - - - - - - - - - - - range of -8.0 to 8.0 - - - range of -512.0 to 512.0 - - - - - - - - - - - - - - - - - - - - - - - - - - RENDER_MODE is RB_RESOLVE_PASS for gmem->mem, otherwise RB_RENDER_PASS - - - - render targets - 1 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Pitch (actually, appears to be pitch in bytes, so really is a stride) - in GMEM, so pitch of the current tile.
- - - - - offset into GMEM (or system memory address in bypass mode) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - actually, appears to be pitch in bytes, so really is a stride - - - - - - - - - - - - - - - - - - - - Z_TEST_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER - - - - seems to be always set to 0x00000000 - - - - - DEPTH_BASE is offset in GMEM to depth/stencil buffer, ie - bin_w * bin_h / 1024 (possible rounded up to multiple of - something?? ie. 39 becomes 40, 78 becomes 80.. 75 becomes - 80.. so maybe it needs to be multiple of 8?? - - - - - - Pitch of depth buffer or combined depth+stencil buffer - in z24s8 cases. - - - - - - - - - - - - - - - - - - seems to be always set to 0x00000000 - - - Base address for stencil when not using interleaved depth/stencil - - - - pitch of stencil buffer when not using interleaved depth/stencil - - - - - - seems to be set to 0x00000002 during binning pass - - - - X/Y offset of current bin - - - - - - - - - - - - - - - seems to be where firmware writes BIN_DATA_ADDR from - CP_SET_BIN_DATA packet.. probably should be called - PC_BIN_BASE (just using name from yamato for now) - - - - probably should be PC_BIN_SIZE - - - SIZE is current pipe width * height (in tiles) - - - N is some sort of slot # between 0..(SIZE-1). In case - multiple tiles use same pipe, each tile gets unique slot # - - - - - - - STRIDE_IN_VPC: ALIGN(next_outloc - 8, 4) / 4 - (but, in cases where you'd expect 1, the blob driver uses - 2, so possibly 0 (no varying) or minimum of 2) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - indexed by dimension - - - - - - - - indexed by dimension, global_size / local_size - - - - - - - - - - TOTALATTRTOVS is # of attributes to vertex shader, in register - slots (ie. 
vec4+vec3 -> 7) - - - - STRMDECINSTRCNT is # of VFD_DECODE_INSTR registers valid - - STRMFETCHINSTRCNT is # of VFD_FETCH_INSTR registers valid - - - - MAXSTORAGE could be # of attributes/vbo's - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - SHIFTCNT appears to be size, ie. FLOAT_32_32_32 is 12, and BYTE_8 is 1 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - From register spec: - SP_FS_OBJ_OFFSET_REG.CONSTOBJECTSTARTOFFSET [16:24]: Constant object - start offset in on chip RAM, - 128bit aligned - - - - - - - - - - - - - - - - - - - - - - The full/half register footprint is in units of four components, - so if r0.x is used, that counts as all of r0.[xyzw] as used. - There are separate full/half register footprint values as the - full and half registers are independent (not overlapping). - Presumably the thread scheduler hardware allocates the full/half - register names from the actual physical register file and - handles the register renaming. - - - - - - - From regspec: - SP_FS_CTRL_REG0.FS_LENGTH [31:24]: FS length, unit = 256bits. - If bit31 is 1, it means overflow - or any long shader. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - These seem to be offsets for storage of the varyings. - Always seems to start from 8, possibly loc 0 and 4 - are for gl_Position and gl_PointSize? - - - - - - - - - - SP_VS_OBJ_START_REG contains pointer to the vertex shader program, - immediately followed by the binning shader program (although I - guess that is probably just re-using the same gpu buffer) - - - - - - - - - - - - - - - - - - - - - The full/half register footprint is in units of four components, - so if r0.x is used, that counts as all of r0.[xyzw] as used. - There are separate full/half register footprint values as the - full and half registers are independent (not overlapping). 
- Presumably the thread scheduler hardware allocates the full/half - register names from the actual physical register file and - handles the register renaming. - - - - - - - - - - - - From regspec: - SP_FS_CTRL_REG0.FS_LENGTH [31:24]: FS length, unit = 256bits. - If bit31 is 1, it means overflow - or any long shader. - - - - - - - - - - - SP_FS_OBJ_START_REG contains pointer to fragment shader program - - - - - - - - - - - - - seems to be one bit per scalar, '1' for flat, '0' for smooth - - - seems to be one bit per scalar, '1' for flat, '0' for smooth - - - - render targets - 1 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Configures the mapping between VSC_PIPE buffer and - bin, X/Y specify the bin index in the horiz/vert - direction (0,0 is upper left, 0,1 is leftmost bin - on second row, and so on). W/H specify the number - of bins assigned to this VSC_PIPE in the horiz/vert - dimension. 
- - - - - - - - - - - seems to be set to 0x00000001 during binning pass - - - - seems to be always set to 0x00000001 - - - - - - - seems to be always set to 0x00000001 - - - - - - - - - - - - - - - - - - - - - - - - - - - - seems to be always set to 0x00000001 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - seems to be always set to 0x00000003 - - - seems to be always set to 0x00000001 - - - - - - - - - - - - - - - - - - - Texture sampler dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Texture constant dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - INDX is index of texture address(es) in MIPMAP state block - - Pitch in bytes (so actually stride) - - SWAP bit is set for BGRA instead of RGBA - - - - - - - - - - - diff --git a/src/freedreno/registers/a4xx.xml b/src/freedreno/registers/a4xx.xml deleted file mode 100644 index b4c42538b7f..00000000000 --- a/src/freedreno/registers/a4xx.xml +++ /dev/nullitch (actually, appears to be pitch in bytes, so really is a stride) - in GMEM, so pitch of the current tile. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - actually, appears to be pitch in bytes, so really is a stride - - - - - - - - - - - - - - - - - - - - - - - - - - Z_TEST_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER - - - - - - - DEPTH_BASE is offset in GMEM to depth/stencil buffer, ie - bin_w * bin_h / 1024 (possible rounded up to multiple of - something?? ie. 39 becomes 40, 78 becomes 80.. 75 becomes - 80.. so maybe it needs to be multiple of 8?? - - - - - stride of depth/stencil buffer - - - ??? 
- - - - - - - - - - - - - - - - - - - - - - Base address for stencil when not using interleaved depth/stencil - - - - pitch of stencil buffer when not using interleaved depth/stencilhe full/half register footprint is in units of four components, - so if r0.x is used, that counts as all of r0.[xyzw] as used. - There are separate full/half register footprint values as the - full and half registers are independent (not overlapping). - Presumably the thread scheduler hardware allocates the full/half - register names from the actual physical register file and - handles the register renaming. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - These seem to be offsets for storage of the varyings. - Always seems to start from 8, possibly loc 0 and 4 - are for gl_Position and gl_PointSize? - - - - - - - - - - - - From register spec: - SP_FS_OBJ_OFFSET_REG.CONSTOBJECTSTARTOFFSET [16:24]: Constant object - start offset in on chip RAM, - 128bit aligned - - - - - - " - - - - - - - - - - - - - - - " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - " - - - - - - - - - - - - - - - - - - - These seem to be offsets for storage of the varyings. - Always seems to start from 8, possibly loc 0 and 4 - are for gl_Position and gl_PointSize? - - - - - - - - - - - - - " - - - - - - - - - - - - - - - - - - - - These seem to be offsets for storage of the varyings. - Always seems to start from 8, possibly loc 0 and 4 - are for gl_Position and gl_PointSize? - - - - - - - - - - - - - " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Configures the mapping between VSC_PIPE buffer and - bin, X/Y specify the bin index in the horiz/vert - direction (0,0 is upper left, 0,1 is leftmost bin - on second row, and so on). W/H specify the number - of bins assigned to this VSC_PIPE in the horiz/vert - dimension. 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - TOTALATTRTOVS is # of attributes to vertex shader, in register - slots (ie. vec4+vec3 -> 7) - - - - BYPASSATTROVS seems to count varyings that are just directly - assigned from attributes (ie, "vFoo = aFoo;") - - - STRMDECINSTRCNT is # of VFD_DECODE_INSTR registers valid - - STRMFETCHINSTRCNT is # of VFD_FETCH_INSTR registers valid - - - - MAXSTORAGE could be # of attributes/vbo's - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - SHIFTCNT appears to be size, ie. FLOAT_32_32_32 is 12, and BYTE_8 is 1 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - SIZE is current pipe width * height (in tiles) - - - N is some sort of slot # between 0..(SIZE-1). 
In case - multiple tiles use same pipe, each tile gets unique slot # - - - - - - - in groups of 4x vec4, blob only uses values - 0, 1, 2, 4, 6, 8 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Texture sampler dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Texture constant dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Pitch in bytes (so actually stride) - - - - - - - - - - - - - - - - - - - - - - - Pitch in bytes (so actually stride) - - - - - - - - - - - - - - - - - - - - - - - - diff --git a/src/freedreno/registers/a5xx.xml b/src/freedreno/registers/a5xx.xml deleted file mode 100644 index 34ae474b9d4..00000000000 --- a/src/freedreno/registers/a5xx.xml +++ /dev/nullonfigures the mapping between VSC_PIPE buffer and - bin, X/Y specify the bin index in the horiz/vert - direction (0,0 is upper left, 0,1 is leftmost bin - on second row, and so on). W/H specify the number - of bins assigned to this VSC_PIPE in the horiz/vert - dimensionow Resolution Z ??) - ---- - - I think it serves two functions, early discard of primitives in binning - pass without needing full resolution depth buffer, and also functions as - a depth-prepass, used during the GMEM draws to discard primitives that - would not be visible due to later draws. - - The LRZ buffer always seems to be z16 format, regardless of actual - depth buffer format. - - Note that LRZ write should be disabled when blend/stencil/etc is enabled, - since the occluded primitive can still contribute to final color value - of a fragment. - - Only enabled for GL_LESS/GL_LEQUAL/GL_GREATER/GL_GEQUAL? - - - - LRZ write also disabled for blend/etc. - - update MAX instead of MIN value, ie. 
GL_GREATER/GL_GEQUAL - - - - - - - - Pitch is depth width (in pixels) / 8 (aligned to 32). Height - is also divided by 8 (ie. covers 8x8 pixels) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Z_TEST_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER - - - - - - - - - stride of depth/stencil buffer - - - size of layer - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Blits: - ------ - - Blits are triggered by CP_EVENT_WRITE:BLIT, compared to previous - generations where they shared most of the gl pipeline and were - triggered by CP_DRAW_INDX* - - For gmem->mem blob uses RB_BLIT_CNTL.BUF to specify src of - blit (ie MRTn, ZS, etc) and RB_BLIT_DST_LO/HI for destination - gpuaddr. The gmem offset is taken from RB_MRT[n].BASE_LO/HI - - For mem->gmem blob uses just MRT0 or ZS and RB_BLIT_DST_LO/HI - for the GMEM offset, and gpuaddr from RB_MRT[0].BASE_LO/HI - (I suppose this is just to avoid trashing RB_MRT[1..7]??) - - - - - - - - - - - - - - - - - - - - - - - - - For MASK, if RB_BLIT_CNTL.BUF=BLIT_ZS: - 1 - depth - 2 - stencil - 3 - depth+stencil - if RB_BLIT_CNTL.BUF=BLIT_MRTn - then probably a component mask, I always see 0xf - - - - - - Buffer Metadata (flag buffers): - ------------------------------- - - Blob seems to stick some metadata at the front of the buffer, - both z/s and MRT. I think this is same as UBWC (bandwidth - compression) metadata that mdp 1.7 and later supports. See - 1d3fae5698ce5358caab87a15383b690941697e8 in downstream kernel. - UBWC seems to stand for "universal bandwidth compression". - - Before glReadPixels() it does a pair of BYPASS blits (at least - if metadata is used) presumably to resolve metadata. 
- - NOTES: see: getUBwcBlockSize(), getUBwcMetaBufferSize() at - https://android.googlesource.com/platform/hardware/qcom/display/+/android-6.0.1_r40/msm8994/libgralloc/alloc_controller.cpp - (note that bpp in bytes, not bits, so really cpp) - - Example Layout 2d w/ mipmap levels: - - 100x2000, ifmt=GL_RG, fmt=GL_RG16F, type=GL_FLOAT, meta=64x512@0x8000 (7x500) - base=c072e000, offset=16384, size=1703936 - - color flags - 0 c073a000 c0732000 - level 0 flags is address - 1 c0838000 c0834000 programmed in texture state - 2 c0879000 c0877000 - 3 c089a000 c0899000 - 4 c08ab000 c08aa000 - 5 c08b4000 c08b3000 - 6 c08b9000 c08b8000 - 7 c08bc000 c08bb000 - 8 c08be000 c08bd000 - 9 c08c0000 c08bf000 - 10 c08c2000 c08c1000 - - ARRAY_PITCH is the combined size of all the levels plus flags, - so 0xc08c3000 - 0xc0732000 = 0x00191000 (1642496); each level - takes up a minimum of 2 pages (since color and flags parts are - each page aligned. - - { TILE_MODE = TILE5_3 | SWIZ_X = A5XX_TEX_X | SWIZ_Y = A5XX_TEX_Y | SWIZ_Z = A5XX_TEX_ZERO | SWIZ_W = A5XX_TEX_ONE | MIPLVLS = 0 | FMT = TFMT5_16_16_FLOAT | SWAP = WZYX } - { WIDTH = 100 | HEIGHT = 2000 } - { FETCHSIZE = TFETCH5_4_BYTE | PITCH = 512 | TYPE = A5XX_TEX_2D } - { ARRAY_PITCH = 1642496 | 0x18800000 } - NOTE c2dc always has 0x18800000 but - { BASE_LO = 0xc0732000 } this varies for blob gles driver.. - { BASE_HI = 0 | DEPTH = 1 } not sure what it is - - - - - - - - - - - - - - - - - - - - - - - - - - num of varyings plus four for gl_Position (plus one if gl_PointSize) - plus # of transform-feedback (streamout) varyings if using the - hw streamout (rather than stg instructions in shader) - - - - - - - - - - - - - - - - - - - - - - - - - - - Stream-Out: - ----------- - - VPC_SO[0..3] registers setup details about streamout buffers, and - number of components to write to each. - - VPC_SO_PROG provides the mapping between output varyings and the SO - buffers. 
It is written multiple times (via a CP_CONTEXT_REG_BUNCH - packet, not sure if that matters), each write can handle up to two - components of stream-out output. Order matches up to OUTLOC, - including padding. So, if outputting first 3 varyings: - - SP_VS_OUT[0].REG: { A_REGID = r0.w | A_COMPMASK = 0xf | B_REGID = r0.x | B_COMPMASK = 0x7 } - SP_VS_OUT[0x1].REG: { A_REGID = r1.w | A_COMPMASK = 0x3 | B_REGID = r2.y | B_COMPMASK = 0xf } - SP_VS_VPC_DST[0].REG: { OUTLOC0 = 0 | OUTLOC1 = 4 | OUTLOC2 = 8 | OUTLOC3 = 12 } - - Then: - - VPC_SO_PROG: { A_BUF = 0 | A_OFF = 0 | A_EN | A_BUF = 0 | B_OFF = 4 | B_EN } - VPC_SO_PROG: { A_BUF = 0 | A_OFF = 8 | A_EN | A_BUF = 0 | B_OFF = 12 | B_EN } - VPC_SO_PROG: { A_BUF = 2 | A_OFF = 0 | A_EN | A_BUF = 2 | B_OFF = 4 | B_EN } - VPC_SO_PROG: { A_BUF = 2 | A_OFF = 8 | A_EN | A_BUF = 0 | B_OFF = 0 } - VPC_SO_PROG: { A_BUF = 1 | A_OFF = 0 | A_EN | A_BUF = 1 | B_OFF = 4 | B_EN } - - Note that varying order is OUTLOC0, OUTLOC2, OUTLOC1, and note - the padding between OUTLOC1 and OUTLOC2. - - The BUF bitfield indicates which of the four streamout buffers - to write into at the specified offset. - - The VPC_SO[n].FLUSH_BASE_LO/HI is used for hw to write back next - offset which gets loaded back into VPC_SO[n].BUFFER_OFFSET via a - CP_MEM_TO_REG. Probably can be ignored until we have GS/etc, at - which point we can't calculate the offset on the CPU. 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - per MRT - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Texture sampler dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Texture constant dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Pitch in bytes (so actually stride) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Pitch in bytes (so actually stride) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - diff --git a/src/freedreno/registers/a6xx.xml b/src/freedreno/registers/a6xx.xml deleted file mode 100644 index 1e7eefb1bef..00000000000 --- a/src/freedreno/registers/a6xx.xml +++ /dev/nullllow early z-test and early-lrz (if applicable) - - Disable early z-test and early-lrz test (if applicable) - - - A special mode that allows early-lrz test but disables - early-z test. Which might sound a bit funny, since - lrz-test happens before z-test. But as long as a couple - conditions are maintained this allows using lrz-test in - cases where fragment shader has kill/discard: - - 1) Disable lrz-write in cases where it is uncertain during - binning pass that a fragment will pass. Ie. 
if frag - shader has-kill, writes-z, or alpha/stencil test is - enabled. (For correctness, lrz-write must be disabled - when blend is enabled.) This is analogous to how a - z-prepass works. - - 2) Disable lrz-write and test if a depth-test direction - reversal is detected. Due to condition (1), the contents - of the lrz buffer are a conservative estimation of the - depth buffer during the draw pass. Meaning that geometry - that we know for certain will not be visible will not pass - lrz-test. But geometry which may be (or contributes to - blend) will pass the lrz-test. - - This allows us to keep early-lrz-test in cases where the frag - shader does not write-z (ie. we know the z-value before FS) - and does not have side-effects (image/ssbo writes, etc), but - does have kill/discard. Which turns out to be a common - enough case that it is useful to keep early-lrz test against - the conservative lrz buffer to discard fragments that we - know will definitely not be visible. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - b0..7 seems to contain the size of buffered by not yet processed - RB level cmdstream.. it's possible that it is a low threshold - and b8..15 is a high threshold? - - b16..23 identifies where IB1 data starts (and RB data ends?) - - b24..31 identifies where IB2 data starts (and IB1 data ends) - - - - - - - - - low bits identify where CP_SET_DRAW_STATE stateobj - processing starts (and IB2 data ends). I'm guessing - b8 is part of this since (from downstream kgsl): - - /* ROQ sizes are twice as big on a640/a680 than on a630 */ - if (adreno_is_a640(adreno_dev) || adreno_is_a680(adreno_dev)) { - kgsl_regwrite(device, A6XX_CP_ROQ_THRESHOLDS_2, 0x02000140); - kgsl_regwrite(device, A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362C); - } ... 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - number of remaining dwords incl current dword being consumed? - - - - number of remaining dwords incl current dword being consumedonfigures the mapping between VSC_PIPE buffer and - bin, X/Y specify the bin index in the horiz/vert - direction (0,0 is upper left, 0,1 is leftmost bin - on second row, and so on). W/H specify the number - of bins assigned to this VSC_PIPE in the horiz/vert - dimension. - - - - - - - - - - - - - - - - - - - - - - Seems to be a bitmap of which tiles mapped to the VSC - pipe contain geometry. - - I suppose we can connect a maximum of 32 tiles to a - single VSC pipe. - - - - - - - Has the size of data written to corresponding VSC_PRIM_STRM - buffer. - - - - - - - Has the size of data written to corresponding VSC pipe, ie. - same thing that is written out to VSC_DRAW_STRM_SIZE_ADDRESS_LO/HI - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - LRZ write also disabled for blend/etc. 
- - update MAX instead of MIN value, iebit is set for zfunc other than GL_ALWAYS or GL_NEVER - also set when Z_BOUNDS_ENABLE is set - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - For clearing depth/stencil - 1 - depth - 2 - stencil - 3 - depth+stencil - For clearing color buffer: - then probably a component mask, I always see 0xf - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - num of varyings plus four for gl_Position (plus one if gl_PointSize) - plus # of transform-feedback (streamout) varyings if using the - hw streamout (rather than stg instructions in shader) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - num of varyings plus four for gl_Position (plus one if gl_PointSize) - plus # of transform-feedback (streamout) varyings if using the - hw streamout (rather than stg instructions in shader) - - - - - - - - - - - - - - - - - - geometry shader - - - - - - - - - - - size in vec4s of per-primitive storage for gs. TODO: not actually inbit 0 seems to toggle between 2k and 32k of shared storage - the ldl/stl offset seems to be rewritten to 0 when it is beyond - this limit. 
This is different from ldlw/stlw, which wraps at - 64k (and has 36k of storage on A640 - reads between 36k-64k - always return 0) - - - - - - - - - - - - - - - - - - - - - - - - per MRT - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - This register clears pending loads queued up by - CP_LOAD_STATE6. Each bit resets a particular kind(s) of - CP_LOAD_STATE6. - - - - - - - - - - - - - - - - - - - - - - - - - - - Shared constants are intended to be used for Vulkan push - constants. When enabled, 8 vec4's are reserved in the FS - const pool and 16 in the geometry const pool although - only 8 are actually used (why?) and they are mapped to - c504-c511 in each stage. Both VS and FS shared consts - are written using ST6_CONSTANTS/SB6_IBO, so that both - the geometry and FS shared consts can be written at once - by using CP_LOAD_STATE6 rather than - CP_LOAD_STATE6_FRAG/CP_LOAD_STATE6_GEOM. In addition - DST_OFF and NUM_UNIT are in units of dwords instead of - vec4's. - - There is also a separate shared constant pool for CS, - which is loaded through CP_LOAD_STATE6_FRAG with - ST6_UBO/ST6_IBO. However the only real difference for CS - is the dword units. 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Texture sampler dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Texture constant dwords - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Pitch in bytes (so actually stride) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Pitch in bytes (so actually stride) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - diff --git a/src/freedreno/registers/adreno.xml b/src/freedreno/registers/adreno.xml new file mode 100644 index 00000000000..92b7f37a721 --- /dev/null +++ b/src/freedreno/registers/adreno.xml @@ -0,0 +1,17 @@ + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/adreno/a2xx.xml b/src/freedreno/registers/adreno/a2xx.xml new file mode 100644 index 00000000000..549ce1ec6d9 --- /dev/null +++ b/src/freedreno/registers/adreno/a2xx.xmlnote: only 0x3f worth of valid register values for VS_REGS and + PS_REGS, but high bit is set to indicate '0 registers usedexture state dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + " + " + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/adreno/a3xx.xml b/src/freedreno/registers/adreno/a3xx.xml new file mode 100644 index 00000000000..0819dc4ede7 --- /dev/null +++ b/src/freedreno/registers/adreno/a3xx.xmlhe pair of MEM_SIZE/ADDR registers get programmed + in sequence with the size/addr of each buffer. 
+ + + + + + + + + + + + + + + + + aka clip_halfz + + + + + + + + + + + + + + + + + + + + + + + + + + range of -8.0 to 8.0 + + + range of -512.0 to 512.0 + + + + + + + + + + + + + + + + + + + + + + + + + + RENDER_MODE is RB_RESOLVE_PASS for gmem->mem, otherwise RB_RENDER_PASS + + + + render targets - 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch (actually, appears to be pitch in bytes, so really is a stride) + in GMEM, so pitch of the current tile. + + + + + offset into GMEM (or system memory address in bypass mode) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + actually, appears to be pitch in bytes, so really is a stride + + + + + + + + + + + + + + + + + + + + Z_TEST_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER + + + + seems to be always set to 0x00000000 + + + + + DEPTH_BASE is offset in GMEM to depth/stencil buffer, ie + bin_w * bin_h / 1024 (possible rounded up to multiple of + something?? ie. 39 becomes 40, 78 becomes 80.. 75 becomes + 80.. so maybe it needs to be multiple of 8?? + + + + + + Pitch of depth buffer or combined depth+stencil buffer + in z24s8 cases. + + + + + + + + + + + + + + + + + + seems to be always set to 0x00000000 + + + Base address for stencil when not using interleaved depth/stencil + + + + pitch of stencil buffer when not using interleaved depth/stencil + + + + + + seems to be set to 0x00000002 during binning pass + + + + X/Y offset of current bin + + + + + + + + + + + + + + + seems to be where firmware writes BIN_DATA_ADDR from + CP_SET_BIN_DATA packet.. probably should be called + PC_BIN_BASE (just using name from yamato for now) + + + + probably should be PC_BIN_SIZE + + + SIZE is current pipe width * height (in tiles) + + + N is some sort of slot # between 0..(SIZE-1). 
In case + multiple tiles use same pipe, each tile gets unique slot # + + + + + + + STRIDE_IN_VPC: ALIGN(next_outloc - 8, 4) / 4 + (but, in cases where you'd expect 1, the blob driver uses + 2, so possibly 0 (no varying) or minimum of 2) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + indexed by dimension + + + + + + + + indexed by dimension, global_size / local_size + + + + + + + + + + TOTALATTRTOVS is # of attributes to vertex shader, in register + slots (ie. vec4+vec3 -> 7) + + + + STRMDECINSTRCNT is # of VFD_DECODE_INSTR registers valid + + STRMFETCHINSTRCNT is # of VFD_FETCH_INSTR registers valid + + + + MAXSTORAGE could be # of attributes/vbo's + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + SHIFTCNT appears to be size, ie. FLOAT_32_32_32 is 12, and BYTE_8 is 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + From register spec: + SP_FS_OBJ_OFFSET_REG.CONSTOBJECTSTARTOFFSET [16:24]: Constant object + start offset in on chip RAM, + 128bit aligned + + + + + + + + + + + + + + + + + + + + + + The full/half register footprint is in units of four components, + so if r0.x is used, that counts as all of r0.[xyzw] as used. + There are separate full/half register footprint values as the + full and half registers are independent (not overlapping). + Presumably the thread scheduler hardware allocates the full/half + register names from the actual physical register file and + handles the register renaming. + + + + + + + From regspec: + SP_FS_CTRL_REG0.FS_LENGTH [31:24]: FS length, unit = 256bits. + If bit31 is 1, it means overflow + or any long shader. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + These seem to be offsets for storage of the varyings. + Always seems to start from 8, possibly loc 0 and 4 + are for gl_Position and gl_PointSize? 
+ + + + + + + + + + SP_VS_OBJ_START_REG contains pointer to the vertex shader program, + immediately followed by the binning shader program (although I + guess that is probably just re-using the same gpu buffer) + + + + + + + + + + + + + + + + + + + + + The full/half register footprint is in units of four components, + so if r0.x is used, that counts as all of r0.[xyzw] as used. + There are separate full/half register footprint values as the + full and half registers are independent (not overlapping). + Presumably the thread scheduler hardware allocates the full/half + register names from the actual physical register file and + handles the register renaming. + + + + + + + + + + + + From regspec: + SP_FS_CTRL_REG0.FS_LENGTH [31:24]: FS length, unit = 256bits. + If bit31 is 1, it means overflow + or any long shader. + + + + + + + + + + + SP_FS_OBJ_START_REG contains pointer to fragment shader program + + + + + + + + + + + + + seems to be one bit per scalar, '1' for flat, '0' for smooth + + + seems to be one bit per scalar, '1' for flat, '0' for smooth + + + + render targets - 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Configures the mapping between VSC_PIPE buffer and + bin, X/Y specify the bin index in the horiz/vert + direction (0,0 is upper left, 0,1 is leftmost bin + on second row, and so on). W/H specify the number + of bins assigned to this VSC_PIPE in the horiz/vert + dimension. 
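The VSC_PIPE configuration note above describes each pipe as a rectangle of bins: X/Y give the first bin index, W/H give the number of bins covered. As an illustration only, the sketch below models that rectangle in C and checks whether a given bin falls inside it. The struct and function names are hypothetical, and the real register bit layout is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one VSC_PIPE config entry: x/y are the bin
 * index of the upper-left bin, w/h the number of bins assigned to the
 * pipe in the horizontal/vertical direction. */
struct vsc_pipe_rect {
    uint32_t x, y, w, h;
};

/* True if bin (bx, by) is one of the bins assigned to this pipe. */
static bool pipe_covers_bin(const struct vsc_pipe_rect *p,
                            uint32_t bx, uint32_t by)
{
    assert(p);
    return bx >= p->x && bx < p->x + p->w &&
           by >= p->y && by < p->y + p->h;
}

/* Number of bins assigned to the pipe, i.e. width * height in tiles. */
static uint32_t pipe_bin_count(const struct vsc_pipe_rect *p)
{
    assert(p);
    return p->w * p->h;
}
```

For a pipe at X=2, Y=1 with W=4, H=2, bins (2,1) through (5,2) are covered and the bin count is 8.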
+
+ seems to be set to 0x00000001 during binning pass
+
+ seems to be always set to 0x00000001
+
+ seems to be always set to 0x00000001
+
+ seems to be always set to 0x00000001
+
+ seems to be always set to 0x00000003
+
+ seems to be always set to 0x00000001
+
+ Texture sampler dwords
+
+ Texture constant dwords
+
+ INDX is index of texture address(es) in MIPMAP state block
+
+ Pitch in bytes (so actually stride)
+
+ SWAP bit is set for BGRA instead of RGBA
+
diff --git a/src/freedreno/registers/adreno/a4xx.xml b/src/freedreno/registers/adreno/a4xx.xml
new file mode 100644
index 00000000000..b4c42538b7f
--- /dev/null
+++ b/src/freedreno/registers/adreno/a4xx.xml
+ Pitch (actually, appears to be pitch in bytes, so really is a stride)
+ in GMEM, so pitch of the current tile.
+
+ actually, appears to be pitch in bytes, so really is a stride
+
+ Z_TEST_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER
+
+ DEPTH_BASE is offset in GMEM to depth/stencil buffer, ie
+ bin_w * bin_h / 1024 (possible rounded up to multiple of
+ something?? ie. 39 becomes 40, 78 becomes 80.. 75 becomes
+ 80.. so maybe it needs to be multiple of 8??
+
+ stride of depth/stencil buffer
+
+ ???
+ + + + + + + + + + + + + + + + + + + + + + Base address for stencil when not using interleaved depth/stencil + + + + pitch of stencil buffer when not using interleaved depth/stencilhe full/half register footprint is in units of four components, + so if r0.x is used, that counts as all of r0.[xyzw] as used. + There are separate full/half register footprint values as the + full and half registers are independent (not overlapping). + Presumably the thread scheduler hardware allocates the full/half + register names from the actual physical register file and + handles the register renaming. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + These seem to be offsets for storage of the varyings. + Always seems to start from 8, possibly loc 0 and 4 + are for gl_Position and gl_PointSize? + + + + + + + + + + + + From register spec: + SP_FS_OBJ_OFFSET_REG.CONSTOBJECTSTARTOFFSET [16:24]: Constant object + start offset in on chip RAM, + 128bit aligned + + + + + + " + + + + + + + + + + + + + + + " + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + " + + + + + + + + + + + + + + + + + + + These seem to be offsets for storage of the varyings. + Always seems to start from 8, possibly loc 0 and 4 + are for gl_Position and gl_PointSize? + + + + + + + + + + + + + " + + + + + + + + + + + + + + + + + + + + These seem to be offsets for storage of the varyings. + Always seems to start from 8, possibly loc 0 and 4 + are for gl_Position and gl_PointSize? + + + + + + + + + + + + + " + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Configures the mapping between VSC_PIPE buffer and + bin, X/Y specify the bin index in the horiz/vert + direction (0,0 is upper left, 0,1 is leftmost bin + on second row, and so on). W/H specify the number + of bins assigned to this VSC_PIPE in the horiz/vert + dimension. 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + TOTALATTRTOVS is # of attributes to vertex shader, in register + slots (ie. vec4+vec3 -> 7) + + + + BYPASSATTROVS seems to count varyings that are just directly + assigned from attributes (ie, "vFoo = aFoo;") + + + STRMDECINSTRCNT is # of VFD_DECODE_INSTR registers valid + + STRMFETCHINSTRCNT is # of VFD_FETCH_INSTR registers valid + + + + MAXSTORAGE could be # of attributes/vbo's + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + SHIFTCNT appears to be size, ie. FLOAT_32_32_32 is 12, and BYTE_8 is 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + SIZE is current pipe width * height (in tiles) + + + N is some sort of slot # between 0..(SIZE-1). 
In case + multiple tiles use same pipe, each tile gets unique slot # + + + + + + + in groups of 4x vec4, blob only uses values + 0, 1, 2, 4, 6, 8 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture sampler dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture constant dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/adreno/a5xx.xml b/src/freedreno/registers/adreno/a5xx.xml new file mode 100644 index 00000000000..34ae474b9d4 --- /dev/null +++ b/src/freedreno/registers/adreno/a5xx.xmlonfigures the mapping between VSC_PIPE buffer and + bin, X/Y specify the bin index in the horiz/vert + direction (0,0 is upper left, 0,1 is leftmost bin + on second row, and so on). W/H specify the number + of bins assigned to this VSC_PIPE in the horiz/vert + dimensionow Resolution Z ??) + ---- + + I think it serves two functions, early discard of primitives in binning + pass without needing full resolution depth buffer, and also functions as + a depth-prepass, used during the GMEM draws to discard primitives that + would not be visible due to later draws. + + The LRZ buffer always seems to be z16 format, regardless of actual + depth buffer format. + + Note that LRZ write should be disabled when blend/stencil/etc is enabled, + since the occluded primitive can still contribute to final color value + of a fragment. + + Only enabled for GL_LESS/GL_LEQUAL/GL_GREATER/GL_GEQUAL? + + + + LRZ write also disabled for blend/etc. + + update MAX instead of MIN value, ie. 
GL_GREATER/GL_GEQUAL + + + + + + + + Pitch is depth width (in pixels) / 8 (aligned to 32). Height + is also divided by 8 (ie. covers 8x8 pixels) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Z_TEST_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER + + + + + + + + + stride of depth/stencil buffer + + + size of layer + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Blits: + ------ + + Blits are triggered by CP_EVENT_WRITE:BLIT, compared to previous + generations where they shared most of the gl pipeline and were + triggered by CP_DRAW_INDX* + + For gmem->mem blob uses RB_BLIT_CNTL.BUF to specify src of + blit (ie MRTn, ZS, etc) and RB_BLIT_DST_LO/HI for destination + gpuaddr. The gmem offset is taken from RB_MRT[n].BASE_LO/HI + + For mem->gmem blob uses just MRT0 or ZS and RB_BLIT_DST_LO/HI + for the GMEM offset, and gpuaddr from RB_MRT[0].BASE_LO/HI + (I suppose this is just to avoid trashing RB_MRT[1..7]??) + + + + + + + + + + + + + + + + + + + + + + + + + For MASK, if RB_BLIT_CNTL.BUF=BLIT_ZS: + 1 - depth + 2 - stencil + 3 - depth+stencil + if RB_BLIT_CNTL.BUF=BLIT_MRTn + then probably a component mask, I always see 0xf + + + + + + Buffer Metadata (flag buffers): + ------------------------------- + + Blob seems to stick some metadata at the front of the buffer, + both z/s and MRT. I think this is same as UBWC (bandwidth + compression) metadata that mdp 1.7 and later supports. See + 1d3fae5698ce5358caab87a15383b690941697e8 in downstream kernel. + UBWC seems to stand for "universal bandwidth compression". + + Before glReadPixels() it does a pair of BYPASS blits (at least + if metadata is used) presumably to resolve metadata. 
+ + NOTES: see: getUBwcBlockSize(), getUBwcMetaBufferSize() at + https://android.googlesource.com/platform/hardware/qcom/display/+/android-6.0.1_r40/msm8994/libgralloc/alloc_controller.cpp + (note that bpp in bytes, not bits, so really cpp) + + Example Layout 2d w/ mipmap levels: + + 100x2000, ifmt=GL_RG, fmt=GL_RG16F, type=GL_FLOAT, meta=64x512@0x8000 (7x500) + base=c072e000, offset=16384, size=1703936 + + color flags + 0 c073a000 c0732000 - level 0 flags is address + 1 c0838000 c0834000 programmed in texture state + 2 c0879000 c0877000 + 3 c089a000 c0899000 + 4 c08ab000 c08aa000 + 5 c08b4000 c08b3000 + 6 c08b9000 c08b8000 + 7 c08bc000 c08bb000 + 8 c08be000 c08bd000 + 9 c08c0000 c08bf000 + 10 c08c2000 c08c1000 + + ARRAY_PITCH is the combined size of all the levels plus flags, + so 0xc08c3000 - 0xc0732000 = 0x00191000 (1642496); each level + takes up a minimum of 2 pages (since color and flags parts are + each page aligned. + + { TILE_MODE = TILE5_3 | SWIZ_X = A5XX_TEX_X | SWIZ_Y = A5XX_TEX_Y | SWIZ_Z = A5XX_TEX_ZERO | SWIZ_W = A5XX_TEX_ONE | MIPLVLS = 0 | FMT = TFMT5_16_16_FLOAT | SWAP = WZYX } + { WIDTH = 100 | HEIGHT = 2000 } + { FETCHSIZE = TFETCH5_4_BYTE | PITCH = 512 | TYPE = A5XX_TEX_2D } + { ARRAY_PITCH = 1642496 | 0x18800000 } - NOTE c2dc always has 0x18800000 but + { BASE_LO = 0xc0732000 } this varies for blob gles driver.. + { BASE_HI = 0 | DEPTH = 1 } not sure what it is + + + + + + + + + + + + + + + + + + + + + + + + + + num of varyings plus four for gl_Position (plus one if gl_PointSize) + plus # of transform-feedback (streamout) varyings if using the + hw streamout (rather than stg instructions in shader) + + + + + + + + + + + + + + + + + + + + + + + + + + + Stream-Out: + ----------- + + VPC_SO[0..3] registers setup details about streamout buffers, and + number of components to write to each. + + VPC_SO_PROG provides the mapping between output varyings and the SO + buffers. 
It is written multiple times (via a CP_CONTEXT_REG_BUNCH + packet, not sure if that matters), each write can handle up to two + components of stream-out output. Order matches up to OUTLOC, + including padding. So, if outputting first 3 varyings: + + SP_VS_OUT[0].REG: { A_REGID = r0.w | A_COMPMASK = 0xf | B_REGID = r0.x | B_COMPMASK = 0x7 } + SP_VS_OUT[0x1].REG: { A_REGID = r1.w | A_COMPMASK = 0x3 | B_REGID = r2.y | B_COMPMASK = 0xf } + SP_VS_VPC_DST[0].REG: { OUTLOC0 = 0 | OUTLOC1 = 4 | OUTLOC2 = 8 | OUTLOC3 = 12 } + + Then: + + VPC_SO_PROG: { A_BUF = 0 | A_OFF = 0 | A_EN | A_BUF = 0 | B_OFF = 4 | B_EN } + VPC_SO_PROG: { A_BUF = 0 | A_OFF = 8 | A_EN | A_BUF = 0 | B_OFF = 12 | B_EN } + VPC_SO_PROG: { A_BUF = 2 | A_OFF = 0 | A_EN | A_BUF = 2 | B_OFF = 4 | B_EN } + VPC_SO_PROG: { A_BUF = 2 | A_OFF = 8 | A_EN | A_BUF = 0 | B_OFF = 0 } + VPC_SO_PROG: { A_BUF = 1 | A_OFF = 0 | A_EN | A_BUF = 1 | B_OFF = 4 | B_EN } + + Note that varying order is OUTLOC0, OUTLOC2, OUTLOC1, and note + the padding between OUTLOC1 and OUTLOC2. + + The BUF bitfield indicates which of the four streamout buffers + to write into at the specified offset. + + The VPC_SO[n].FLUSH_BASE_LO/HI is used for hw to write back next + offset which gets loaded back into VPC_SO[n].BUFFER_OFFSET via a + CP_MEM_TO_REG. Probably can be ignored until we have GS/etc, at + which point we can't calculate the offset on the CPU. 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + per MRT + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture sampler dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture constant dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/adreno/a6xx.xml b/src/freedreno/registers/adreno/a6xx.xml new file mode 100644 index 00000000000..1e7eefb1bef --- /dev/null +++ b/src/freedreno/registers/adreno/a6xx.xml @@ -0,0 +1,3944 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Allow early z-test and early-lrz (if applicable) + + Disable early z-test and early-lrz test (if applicable) + + + A special mode that allows early-lrz test but disables + early-z test. Which might sound a bit funny, since + lrz-test happens before z-test. But as long as a couple + conditions are maintained this allows using lrz-test in + cases where fragment shader has kill/discard: + + 1) Disable lrz-write in cases where it is uncertain during + binning pass that a fragment will pass. Ie. if frag + shader has-kill, writes-z, or alpha/stencil test is + enabled. (For correctness, lrz-write must be disabled + when blend is enabled.) This is analogous to how a + z-prepass works. 
+ + 2) Disable lrz-write and test if a depth-test direction + reversal is detected. Due to condition (1), the contents + of the lrz buffer are a conservative estimation of the + depth buffer during the draw pass. Meaning that geometry + that we know for certain will not be visible will not pass + lrz-test. But geometry which may be (or contributes to + blend) will pass the lrz-test. + + This allows us to keep early-lrz-test in cases where the frag + shader does not write-z (ie. we know the z-value before FS) + and does not have side-effects (image/ssbo writes, etc), but + does have kill/discard. Which turns out to be a common + enough case that it is useful to keep early-lrz test against + the conservative lrz buffer to discard fragments that we + know will definitely not be visible. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + b0..7 seems to contain the size of buffered by not yet processed + RB level cmdstream.. it's possible that it is a low threshold + and b8..15 is a high threshold? + + b16..23 identifies where IB1 data starts (and RB data ends?) + + b24..31 identifies where IB2 data starts (and IB1 data ends) + + + + + + + + + low bits identify where CP_SET_DRAW_STATE stateobj + processing starts (and IB2 data ends). I'm guessing + b8 is part of this since (from downstream kgsl): + + /* ROQ sizes are twice as big on a640/a680 than on a630 */ + if (adreno_is_a640(adreno_dev) || adreno_is_a680(adreno_dev)) { + kgsl_regwrite(device, A6XX_CP_ROQ_THRESHOLDS_2, 0x02000140); + kgsl_regwrite(device, A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362C); + } ... + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + number of remaining dwords incl current dword being consumed? 
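The ROQ threshold notes above split one 32-bit register into four byte-wide fields (b0..7, b8..15, b16..23, b24..31) and quote 0x8040362C as the A6XX_CP_ROQ_THRESHOLDS_1 value programmed by the downstream kernel on a640/a680. The sketch below unpacks such a value in C. It assumes the first comment describes A6XX_CP_ROQ_THRESHOLDS_1; the struct, its field names, and the function name are hypothetical, and the meaning of the two low bytes is as uncertain here as it is in the original note.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the four byte-wide fields described in the note. */
struct roq_thresholds {
    uint8_t b0_7;      /* size of buffered, not yet processed RB cmdstream? */
    uint8_t b8_15;     /* possibly a high threshold (unconfirmed) */
    uint8_t ib1_start; /* b16..23: where IB1 data starts */
    uint8_t ib2_start; /* b24..31: where IB2 data starts */
};

static_assert(sizeof(struct roq_thresholds) == 4, "one byte per field");

/* Split a 32-bit register value into its four byte fields. */
static struct roq_thresholds decode_roq_thresholds(uint32_t val)
{
    struct roq_thresholds t;

    t.b0_7      = (uint8_t)(val & 0xff);
    t.b8_15     = (uint8_t)((val >> 8) & 0xff);
    t.ib1_start = (uint8_t)((val >> 16) & 0xff);
    t.ib2_start = (uint8_t)((val >> 24) & 0xff);
    return t;
}
```

Decoding 0x8040362C this way gives 0x2c, 0x36, 0x40 and 0x80 for the four fields, from low byte to high byte.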
+ + + + number of remaining dwords incl current dword being consumedonfigures the mapping between VSC_PIPE buffer and + bin, X/Y specify the bin index in the horiz/vert + direction (0,0 is upper left, 0,1 is leftmost bin + on second row, and so on). W/H specify the number + of bins assigned to this VSC_PIPE in the horiz/vert + dimension. + + + + + + + + + + + + + + + + + + + + + + Seems to be a bitmap of which tiles mapped to the VSC + pipe contain geometry. + + I suppose we can connect a maximum of 32 tiles to a + single VSC pipe. + + + + + + + Has the size of data written to corresponding VSC_PRIM_STRM + buffer. + + + + + + + Has the size of data written to corresponding VSC pipe, ie. + same thing that is written out to VSC_DRAW_STRM_SIZE_ADDRESS_LO/HI + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + LRZ write also disabled for blend/etc. 
+ + update MAX instead of MIN value, iebit is set for zfunc other than GL_ALWAYS or GL_NEVER + also set when Z_BOUNDS_ENABLE is set + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + For clearing depth/stencil + 1 - depth + 2 - stencil + 3 - depth+stencil + For clearing color buffer: + then probably a component mask, I always see 0xf + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + num of varyings plus four for gl_Position (plus one if gl_PointSize) + plus # of transform-feedback (streamout) varyings if using the + hw streamout (rather than stg instructions in shader) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + num of varyings plus four for gl_Position (plus one if gl_PointSize) + plus # of transform-feedback (streamout) varyings if using the + hw streamout (rather than stg instructions in shader) + + + + + + + + + + + + + + + + + + geometry shader + + + + + + + + + + + size in vec4s of per-primitive storage for gs. 
TODO: not actually in VPC + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + bit 0 seems to toggle between 2k and 32k of shared storage + the ldl/stl offset seems to be rewritten to 0 when it is beyond + this limit. This is different from ldlw/stlw, which wraps at + 64k (and has 36k of storage on A640 - reads between 36k-64k + always return 0) + + + + + + + + + + + + + + + + + + + + + + + + per MRT + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + This register clears pending loads queued up by + CP_LOAD_STATE6. Each bit resets a particular kind(s) of + CP_LOAD_STATE6. + + + + + + + + + + + + + + + + + + + + + + + + + + + Shared constants are intended to be used for Vulkan push + constants. When enabled, 8 vec4's are reserved in the FS + const pool and 16 in the geometry const pool although + only 8 are actually used (why?) and they are mapped to + c504-c511 in each stage. 
Both VS and FS shared consts + are written using ST6_CONSTANTS/SB6_IBO, so that both + the geometry and FS shared consts can be written at once + by using CP_LOAD_STATE6 rather than + CP_LOAD_STATE6_FRAG/CP_LOAD_STATE6_GEOM. In addition + DST_OFF and NUM_UNIT are in units of dwords instead of + vec4's. + + There is also a separate shared constant pool for CS, + which is loaded through CP_LOAD_STATE6_FRAG with + ST6_UBO/ST6_IBO. However the only real difference for CS + is the dword units. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture sampler dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture constant dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/adreno/a6xx_gmu.xml b/src/freedreno/registers/adreno/a6xx_gmu.xml new file mode 100644 index 00000000000..dbefd0cb70a --- /dev/null +++ b/src/freedreno/registers/adreno/a6xx_gmu.xml @@ -0,0 +1,218 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/adreno/adreno_common.xml 
b/src/freedreno/registers/adreno/adreno_common.xml
new file mode 100644
index 00000000000..d70fbaf10c1
--- /dev/null
+++ b/src/freedreno/registers/adreno/adreno_common.xml
@@ -0,0 +1,370 @@
+
+ Registers in common between a2xx and a3xx
+
+ Address mode for a5xx+
+
diff --git a/src/freedreno/registers/adreno/adreno_control_regs.xml b/src/freedreno/registers/adreno/adreno_control_regs.xml
new file mode 100644
index 00000000000..ed7c86b7700
--- /dev/null
+++ b/src/freedreno/registers/adreno/adreno_control_regs.xml
@@ -0,0 +1,131 @@
+
+ To use these, write the address and number of dwords, then read
+ the result from $addr.
+
+ Instruction to jump to when the CP is preempted to perform a
+ context switch, initialized to entry 15 of the jump table at
+ bootup.
+
+ Writing to this triggers a register write and auto-increments
+ REG_WRITE_ADDR.
+
+ After setting these, read result from $addr2
+
+ Write to increase WFI_PEND_CTR, decremented by WFI_PEND_DECR
+ pipe register.
+
+ Controls whether RB, IB1, or IB2 is executed
+
+ Controls high 32 bits used by load and store afuc instructions
+
+ Used to initialize the jump table for handling packets at bootup
+
+ These are addresses of various preemption records for the
+ current context. When context switching, the CP will save the
+ current state into these buffers, restore the state of the
+ next context from the buffers in the corresponding
+ CP_CONTEXT_SWITCH_PRIV_* registers written by the kernel,
+ then set these internal registers to the contents of
+ those registers. The kernel sets the initial values via
+ CP_SET_PSEUDO_REG on startup, and from then on the firmware
+ keeps track of them.
+
+ Used only during preemption, saved and restored from the "info"
+ field of a6xx_preemption_record. From the downstream kernel:
+
+ "Type of record. Written non-zero (usually) by CP.
+ we must set to zero for all ringbuffers."
+
+ Set by SET_MARKER, used to conditionally execute
+ CP_COND_REG_EXEC and draw states.
+
diff --git a/src/freedreno/registers/adreno/adreno_pipe_regs.xml b/src/freedreno/registers/adreno/adreno_pipe_regs.xml
new file mode 100644
index 00000000000..d5292694cda
--- /dev/null
+++ b/src/freedreno/registers/adreno/adreno_pipe_regs.xml
@@ -0,0 +1,77 @@
+
+ Special type to mark registers with no payload.
+
diff --git a/src/freedreno/registers/adreno/adreno_pm4.xml b/src/freedreno/registers/adreno/adreno_pm4.xml
new file mode 100644
index 00000000000..f16793be5e4
--- /dev/null
+++ b/src/freedreno/registers/adreno/adreno_pm4.xml
@@ -0,0 +1,1705 @@
+
+ initialize CP's micro-engine
+
+ skip N 32-bit words to get to the next packet
+
+ indirect buffer dispatch. prefetch parser uses this packet
+ type to determine whether to pre-fetch the IB
+
+ Takes the same arguments as CP_INDIRECT_BUFFER, but jumps to
+ another buffer at the same level. Must be at the end of IB, and
+ doesn't work with draw state IB's.
+
+ indirect buffer dispatch. same as IB, but init is pipelined
+
+ wait for the IDLE state of the engine
+
+ wait until a register or memory location is a specific value
+
+ wait until a register location is equal to a specific value
+
+ wait until a register location is >= a specific value
+
+ wait until a read completes
+
+ wait until all base/size writes from an IB_PFD packet have completed
+
+ register read/modify/write
+
+ Set binning configuration registers
+
+ reads register in chip and writes to memory
+
+ write N 32-bit words to memory
+
+ write CP_PROG_COUNTER value to memory
+
+ conditional execution of a sequence of packets
+
+ conditional write to memory or register
+
+ generate an event that creates a write to memory when completed
+
+ generate a VS|PS_done event
+
+ generate a cache flush done event
+
+ generate a z_pass done event
+
+ not sure the real name, but this seems to be what is used for
+ opencl, instead of CP_DRAW_INDX..
+ + + initiate fetch of index buffer and draw + + draw using supplied indices in packet + + initiate fetch of index buffer and binIDs and draw + + initiate fetch of bin IDs and draw using supplied indices + + begin/end initiator for viz query extent processing + + fetch state sub-blocks and initiate shader code DMAs + + load constant into chip and to memory + + load sequencer instruction memory (pointer-based) + + load sequencer instruction memory (code embedded in packet) + + load constants from a location in memory + + selective invalidation of state pointers + + dynamically changes shader instruction memory partition + + sets the 64-bit BIN_MASK register in the PFP + + sets the 64-bit BIN_SELECT register in the PFP + + updates the current context, if needed + + generate interrupt from the command stream + + copy sequencer instruction memory to system memory + + + + + + + + sets draw initiator flags register in PFP, gets bitwise-ORed into + every draw initiator + + + sets the register protection mode + + + + + + load high level sequencer command + + + Conditionally load a IB based on a flag, prefetch enabled + + Conditionally load a IB based on a flag, prefetch disabled + + Load a buffer with pre-fetch enabled + + Set bin (?) + + + test 2 memory locations to dword values specified + + + Write register, ignoring context state for context sensitive registers + + + Record the real-time when this packet is processed by PFP + + + + + + PFP waits until the FIFO between the PFP and the ME is empty + + + + + Used a bit like CP_SET_CONSTANT on a2xx, but can write multiple + groups of registers. 
Looks like it can be used to create state + objects in GPU memory, and on state change only emit pointer + (via CP_SET_DRAW_STATE), which should be nice for reducing CPU + overhead: + + (A4x) save PM4 stream pointers to execute upon a visible draw + + + + + + + + + + + set to 1 for fastclear..: + + + + + + for A4xx + Write to register with address that does not fit into type-0 pkt + + + + copy from ME scratch RAM to a register + + + Copy from REG to ME scratch RAM + + + Wait for memory writes to complete + + + Conditional execution based on register comparison + + + Memory to REG copy + + + + + + + for a5xx + + + + + + Tells CP the current mode of GPU operation + + Instruct CP to set a few internal CP registers + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Load state, a3xx (and later?) + + + + + + + + + + + + + + + + + inline with the CP_LOAD_STATE packet + + + + + in buffer pointed to by EXT_SRC_ADDR + + + + + + + + + + + + + + + + + + Load state, a4xx+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Load state, a6xx+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + SS6_UBO used by the a6xx vulkan blob with tesselation constants + in this case, EXT_SRC_ADDR is (ubo_id shl 16 | offset) + to load constants from a UBO loaded with DST_OFF = 14 and offset 0, + EXT_SRC_ADDR = 0xe0000 + (offset is a guess, should be in bytes given that maxUniformBufferRange=64k) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + DST_OFF same as in CP_LOAD_STATE6 - vec4 VS const at this offset will + be updated for each draw to {draw_id, first_vertex, 
first_instance, 0} + value of 0 disables it + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + value at offset 0 always seems to be 0x00000000.. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Like CP_SET_BIN_DATA5, but set the pointers as offsets from the + pointers stored in VSC_PIPE_{DATA,DATA2,SIZE}_ADDRESS. Useful + for Vulkan where these values aren't known when the command + stream is recorded. + + + + + + + + + + + + + + + + + + + + + + + + Modifies DST_REG using two sources that can either be registers + or immediates. If SRC1_ADD is set, then do the following: + + $dst = (($dst & $src0) rot $rotate) + $src1 + + Otherwise: + + $dst = (($dst & $src0) rot $rotate) | $src1 + + Here "rot" means rotate left. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Like CP_REG_TO_MEM, but the memory address to write to can be + offsetted using either one or two registers or scratch + registers. + + + + + + + + + + + + + + + + + + + + + + + + Like CP_REG_TO_MEM, but the memory address to write to can be + offsetted using a DWORD in memory. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Wait until a memory value is greater than or equal to the + reference, using signed comparison. + + + + + + + + + + + + + + + + + + + This uses the same internal comparison as CP_COND_WRITE, + but waits until the comparison is true instead. It busy-loops in + the CP for the given number of cycles before trying again. 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + Waits for REG0 to not be 0 or REG1 to not equal REF + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + " + + + + + + + + + + + + + + + + + + + + + + + + " + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Tell CP the current operation mode, indicates save and restore procedure + + + + + + + + + + + + + + + + + + + + + + + + + + Set internal CP registers, used to indicate context save data addresses + + + + + + + + + + + + + + + + + + + + + + + Tests bit in specified register and sets predicate for CP_COND_REG_EXEC. + So: + + opcode: CP_REG_TEST (39) (2 dwords) + { REG = 0xc10 | BIT = 0 } + 0000: 70b90001 00000c10 + opcode: CP_COND_REG_EXEC (47) (3 dwords) + 0000: 70c70002 10000000 00000004 + opcode: CP_INDIRECT_BUFFER (3f) (4 dwords) + + Will execute the CP_INDIRECT_BUFFER only if b0 in the register at + offset 0x0c10 is 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Executes the following DWORDs of commands if the dword at ADDR0 + is not equal to 0 and the dword at ADDR1 is less than REF + (signed comparison). + + + + + + + + + + + + + + + + + + + + + + + + Used by the userspace driver to set various IB's which are + executed during context save/restore for handling + state that isn't restored by the + context switch routine itself. + + + + Executed unconditionally when switching back to the context. + + + + Executed when switching back after switching + away during execution of + a CP_SET_MARKER packet with RM6_YIELD as the + payload *and* the normal save routine was + bypassed for a shorter one. I think this is + connected to the "skipsaverestore" bit set by + the kernel when preempting. 
+ + + + + Executed when switching away from the context, + except for context switches initiated via + CP_YIELD. + + + + + This can only be set by the RB (i.e. the kernel) + and executes with protected mode off, but + is otherwise similar to SAVE_IB. + + + + + + + + + + + + + + + + + + + Keep shadow copies of these registers and only set them + when drawing, avoiding redundant writes: + - VPC_CNTL_0 + - HLSQ_CONTROL_1_REG + - HLSQ_UNKNOWN_B980 + + + + Track RB_RENDER_CNTL, and insert a WFI in the following + situation: + - There is a write that disables binning + - There was a draw with binning left enabled, but in + BYPASS mode + Presumably this is a hang workaround? + + + + Do a mysterious CP_EVENT_WRITE 0x3f when the low bit of + the data to write is 0. Used by the Vulkan blob with + PC_UNKNOWN_9B07, but this isn't predicated on particular + register(s) like the others. + + + + + + + + + + + Note that the SMMU's definition of TTBRn can take different forms + depending on the pgtable format. But a5xx+ only uses aarch64 + format. 
+ + + + + + + + + + Unused, does not apply to aarch64 pgtable format + + + + + + + + + diff --git a/src/freedreno/registers/adreno/ocmem.xml b/src/freedreno/registers/adreno/ocmem.xml new file mode 100644 index 00000000000..7eb3fc8312e --- /dev/null +++ b/src/freedreno/registers/adreno/ocmem.xml @@ -0,0 +1,42 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/adreno_common.xml b/src/freedreno/registers/adreno_common.xml deleted file mode 100644 index d70fbaf10c1..00000000000 --- a/src/freedreno/registers/adreno_common.xml +++ /dev/null @@ -1,370 +0,0 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Registers in common between a2xx and a3xx - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Address mode for a5xx+ - - - - - - - diff --git a/src/freedreno/registers/adreno_pm4.xml b/src/freedreno/registers/adreno_pm4.xml deleted file mode 100644 index f16793be5e4..00000000000 --- a/src/freedreno/registers/adreno_pm4.xml +++ /dev/null @@ -1,1705 +0,0 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - initialize CP's micro-engine - - skip N 32-bit words to get to the next 
packet - - - indirect buffer dispatch. prefetch parser uses this packet - type to determine whether to pre-fetch the IB - - - - - - Takes the same arguments as CP_INDIRECT_BUFFER, but jumps to - another buffer at the same level. Must be at the end of IB, and - doesn't work with draw state IB's. - - - indirect buffer dispatch. same as IB, but init is pipelined - - wait for the IDLE state of the engine - - wait until a register or memory location is a specific value - - wait until a register location is equal to a specific value - - wait until a register location is >= a specific value - - wait until a read completes - - wait until all base/size writes from an IB_PFD packet have completed - - register read/modify/write - - Set binning configuration registers - - - reads register in chip and writes to memory - - write N 32-bit words to memory - - write CP_PROG_COUNTER value to memory - - conditional execution of a sequence of packets - - conditional write to memory or register - - - generate an event that creates a write to memory when completed - - generate a VS|PS_done event - - generate a cache flush done event - - generate a z_pass done event - - - not sure the real name, but this seems to be what is used for - opencl, instead of CP_DRAW_INDX.. 
- - - initiate fetch of index buffer and draw - - draw using supplied indices in packet - - initiate fetch of index buffer and binIDs and draw - - initiate fetch of bin IDs and draw using supplied indices - - begin/end initiator for viz query extent processing - - fetch state sub-blocks and initiate shader code DMAs - - load constant into chip and to memory - - load sequencer instruction memory (pointer-based) - - load sequencer instruction memory (code embedded in packet) - - load constants from a location in memory - - selective invalidation of state pointers - - dynamically changes shader instruction memory partition - - sets the 64-bit BIN_MASK register in the PFP - - sets the 64-bit BIN_SELECT register in the PFP - - updates the current context, if needed - - generate interrupt from the command stream - - copy sequencer instruction memory to system memory - - - - - - - - sets draw initiator flags register in PFP, gets bitwise-ORed into - every draw initiator - - - sets the register protection mode - - - - - - load high level sequencer command - - - Conditionally load a IB based on a flag, prefetch enabled - - Conditionally load a IB based on a flag, prefetch disabled - - Load a buffer with pre-fetch enabled - - Set bin (?) - - - test 2 memory locations to dword values specified - - - Write register, ignoring context state for context sensitive registers - - - Record the real-time when this packet is processed by PFP - - - - - - PFP waits until the FIFO between the PFP and the ME is empty - - - - - Used a bit like CP_SET_CONSTANT on a2xx, but can write multiple - groups of registers. 
Looks like it can be used to create state - objects in GPU memory, and on state change only emit pointer - (via CP_SET_DRAW_STATE), which should be nice for reducing CPU - overhead: - - (A4x) save PM4 stream pointers to execute upon a visible draw - - - - - - - - - - - set to 1 for fastclear..: - - - - - - for A4xx - Write to register with address that does not fit into type-0 pkt - - - - copy from ME scratch RAM to a register - - - Copy from REG to ME scratch RAM - - - Wait for memory writes to complete - - - Conditional execution based on register comparison - - - Memory to REG copy - - - - - - - for a5xx - - - - - - Tells CP the current mode of GPU operation - - Instruct CP to set a few internal CP registers - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Load state, a3xx (and later?) - - - - - - - - - - - - - - - - - inline with the CP_LOAD_STATE packet - - - - - in buffer pointed to by EXT_SRC_ADDR - - - - - - - - - - - - - - - - - - Load state, a4xx+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Load state, a6xx+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - SS6_UBO used by the a6xx vulkan blob with tesselation constants - in this case, EXT_SRC_ADDR is (ubo_id shl 16 | offset) - to load constants from a UBO loaded with DST_OFF = 14 and offset 0, - EXT_SRC_ADDR = 0xe0000 - (offset is a guess, should be in bytes given that maxUniformBufferRange=64k) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - DST_OFF same as in CP_LOAD_STATE6 - vec4 VS const at this offset will - be updated for each draw to {draw_id, first_vertex, 
first_instance, 0} - value of 0 disables it - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - value at offset 0 always seems to be 0x00000000.. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Like CP_SET_BIN_DATA5, but set the pointers as offsets from the - pointers stored in VSC_PIPE_{DATA,DATA2,SIZE}_ADDRESS. Useful - for Vulkan where these values aren't known when the command - stream is recorded. - - - - - - - - - - - - - - - - - - - - - - - - Modifies DST_REG using two sources that can either be registers - or immediates. If SRC1_ADD is set, then do the following: - - $dst = (($dst & $src0) rot $rotate) + $src1 - - Otherwise: - - $dst = (($dst & $src0) rot $rotate) | $src1 - - Here "rot" means rotate left. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Like CP_REG_TO_MEM, but the memory address to write to can be - offsetted using either one or two registers or scratch - registers. - - - - - - - - - - - - - - - - - - - - - - - - Like CP_REG_TO_MEM, but the memory address to write to can be - offsetted using a DWORD in memory. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Wait until a memory value is greater than or equal to the - reference, using signed comparison. - - - - - - - - - - - - - - - - - - - This uses the same internal comparison as CP_COND_WRITE, - but waits until the comparison is true instead. It busy-loops in - the CP for the given number of cycles before trying again. 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - Waits for REG0 to not be 0 or REG1 to not equal REF - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - " - - - - - - - - - - - - - - - - - - - - - - - - " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Tell CP the current operation mode, indicates save and restore procedure - - - - - - - - - - - - - - - - - - - - - - - - - - Set internal CP registers, used to indicate context save data addresses - - - - - - - - - - - - - - - - - - - - - - - Tests bit in specified register and sets predicate for CP_COND_REG_EXEC. - So: - - opcode: CP_REG_TEST (39) (2 dwords) - { REG = 0xc10 | BIT = 0 } - 0000: 70b90001 00000c10 - opcode: CP_COND_REG_EXEC (47) (3 dwords) - 0000: 70c70002 10000000 00000004 - opcode: CP_INDIRECT_BUFFER (3f) (4 dwords) - - Will execute the CP_INDIRECT_BUFFER only if b0 in the register at - offset 0x0c10 is 1 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Executes the following DWORDs of commands if the dword at ADDR0 - is not equal to 0 and the dword at ADDR1 is less than REF - (signed comparison). - - - - - - - - - - - - - - - - - - - - - - - - Used by the userspace driver to set various IB's which are - executed during context save/restore for handling - state that isn't restored by the - context switch routine itself. - - - - Executed unconditionally when switching back to the context. - - - - Executed when switching back after switching - away during execution of - a CP_SET_MARKER packet with RM6_YIELD as the - payload *and* the normal save routine was - bypassed for a shorter one. I think this is - connected to the "skipsaverestore" bit set by - the kernel when preempting. 
- - - - - Executed when switching away from the context, - except for context switches initiated via - CP_YIELD. - - - - - This can only be set by the RB (i.e. the kernel) - and executes with protected mode off, but - is otherwise similar to SAVE_IB. - - - - - - - - - - - - - - - - - - - Keep shadow copies of these registers and only set them - when drawing, avoiding redundant writes: - - VPC_CNTL_0 - - HLSQ_CONTROL_1_REG - - HLSQ_UNKNOWN_B980 - - - - Track RB_RENDER_CNTL, and insert a WFI in the following - situation: - - There is a write that disables binning - - There was a draw with binning left enabled, but in - BYPASS mode - Presumably this is a hang workaround? - - - - Do a mysterious CP_EVENT_WRITE 0x3f when the low bit of - the data to write is 0. Used by the Vulkan blob with - PC_UNKNOWN_9B07, but this isn't predicated on particular - register(s) like the others. - - - - - - - - - - - Note that the SMMU's definition of TTBRn can take different forms - depending on the pgtable format. But a5xx+ only uses aarch64 - format. - - - - - - - - - - Unused, does not apply to aarch64 pgtable format - - - - - - - - - diff --git a/src/freedreno/registers/dsi/dsi.xml b/src/freedreno/registers/dsi/dsi.xml new file mode 100644 index 00000000000..a1ebee71e95 --- /dev/null +++ b/src/freedreno/registers/dsi/dsi.xmldiff --git a/src/freedreno/registers/dsi/mmss_cc.xml b/src/freedreno/registers/dsi/mmss_cc.xml new file mode 100644 index 00000000000..ccd4083fdf9 --- /dev/null +++ b/src/freedreno/registers/dsi/mmss_cc.xml @@ -0,0 +1,48 @@ + + + + + + + Multimedia sub-system clock control.. appears to be used by DSI + for clocks.. 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/dsi/sfpb.xml b/src/freedreno/registers/dsi/sfpb.xml new file mode 100644 index 00000000000..a08c82ff169 --- /dev/null +++ b/src/freedreno/registers/dsi/sfpb.xml @@ -0,0 +1,17 @@ + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/edp/edp.xml b/src/freedreno/registers/edp/edp.xml new file mode 100644 index 00000000000..00fc6112585 --- /dev/null +++ b/src/freedreno/registers/edp/edp.xml @@ -0,0 +1,239 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/hdmi/hdmi.xml b/src/freedreno/registers/hdmi/hdmi.xml new file mode 100644 index 00000000000..af22313b6c5 --- /dev/null +++ b/src/freedreno/registers/hdmi/hdmi.xmldiff --git a/src/freedreno/registers/hdmi/qfprom.xml b/src/freedreno/registers/hdmi/qfprom.xml new file mode 100644 index 00000000000..4ae1221aba8 --- /dev/null +++ b/src/freedreno/registers/hdmi/qfprom.xml @@ -0,0 +1,18 @@ + + + + + + + seems to be something external to display block, for finding + what features are enabled/supported? 
+ + + + + + + + diff --git a/src/freedreno/registers/mdp/mdp4.xml b/src/freedreno/registers/mdp/mdp4.xml new file mode 100644 index 00000000000..a84f5308040 --- /dev/null +++ b/src/freedreno/registers/mdp/mdp4.xml @@ -0,0 +1,480 @@ + + + + + + + + pipe names, index into PIPE[] + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + appears to map pipe to mixer stage + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 8bit characters per pixel minus 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/mdp/mdp5.xml b/src/freedreno/registers/mdp/mdp5.xml new file mode 100644 index 00000000000..a5ae1e39621 --- /dev/null +++ b/src/freedreno/registers/mdp/mdp5.xmlbit characters per pixel minus 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/mdp/mdp_common.xml b/src/freedreno/registers/mdp/mdp_common.xml new file mode 100644 index 00000000000..226596a29fb --- /dev/null +++ b/src/freedreno/registers/mdp/mdp_common.xml @@ -0,0 +1,83 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + bits per component (non-alpha channel) + + + + + + + + bits per component (alpha channel) + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/meson.build b/src/freedreno/registers/meson.build index 26335ce608d..d38e1661262 100644 --- a/src/freedreno/registers/meson.build +++ b/src/freedreno/registers/meson.build @@ -33,7 +33,7 @@ foreach f : xml_files _name = f + '.h' freedreno_xml_header_files += custom_target( _name, - input : ['gen_header.py', f], + input : ['gen_header.py', 'adreno/' + f], output : _name, command : [prog_python, '@INPUT@'], capture : true, @@ -42,14 +42,14 @@ endforeach freedreno_xml_header_files += custom_target( 'a6xx-pack.xml.h', - input : ['gen_header.py', 'a6xx.xml'], + input : ['gen_header.py', 'adreno/a6xx.xml'], output : 'a6xx-pack.xml.h', command : [prog_python, '@INPUT@', '--pack-structs'], capture : true, ) freedreno_xml_header_files += custom_target( 'adreno-pm4-pack.xml.h', - input : ['gen_header.py', 'adreno_pm4.xml'], + input : ['gen_header.py', 'adreno/adreno_pm4.xml'], output : 'adreno-pm4-pack.xml.h', command : [prog_python, '@INPUT@', '--pack-structs'], capture : true, diff --git a/src/freedreno/registers/msm.xml b/src/freedreno/registers/msm.xml new file mode 100644 index 00000000000..2fbf5d0d9b0 --- /dev/null +++ b/src/freedreno/registers/msm.xml @@ 
-0,0 +1,26 @@ + + + + + + Register definitions for the display related hw blocks on + msm/snapdragon + + + + + + + + + + + + + + + + +
diff --git a/src/freedreno/registers/rules-ng-ng.txt b/src/freedreno/registers/rules-ng-ng.txt
new file mode 100644
index 00000000000..8b1de10241a
--- /dev/null
+++ b/src/freedreno/registers/rules-ng-ng.txt
@@ -0,0 +1,703 @@
+1. Introduction
+
+rules-ng is a package consisting of a database of nvidia GPU registers in XML
+format, and tools made to parse this database and do useful work with it. It
+is in mostly usable state, but there are some annoyances that prevent its
+adoption as the home of all nouveau documentation.
+
+Note that this document and rules-ng understand "register" quite liberally as
+"anything that has an address and can have a value in it, written to it, or
+read from it". This includes conventional MMIO registers, as well as fields of
+memory structures and grobj methods.
+
+Its parsable XML format is supposed to solve three problems:
+
+ - serve as actual documentation for the known registers: supports attaching
+   arbitrary text in <doc> tags and generating HTML for easy reading.
+ - name -> hex translation: supports generating C headers that #define all
+   known registers, bitfields and enum values as C constants.
+ - hex -> name translation: you tell it the address or address+value of a
+   register, and it decodes the address to its symbolic name, and the value to
+   its constituting bitfields, if any. Useful for decoding mmio-traces /
+   renouveau dumps, as well as standalone use.
+
+What's non-trivial about all this [ie. why rules-ng is not a long series of
+plain address - name - documentation tuples]:
+
+ - The registers may be split into bitfields, each with a different purpose
+   and name [and separate documentation].
+ - The registers/bitfields may accept values from a predefined set [enum],
+   each with a different meaning. Each value also has a name and
+   documentation.
+ - The registers may come in multiple copies, forming arrays. They can also
+   form logical groups. And these groups can also come in multiple copies,
+   forming larger arrays... and you get a hierarchical structure.
+ - There are multiple different GPU chipsets. The available register set
+   changed between these chipsets - sometimes only a few registers, sometimes
+   half the card was remade from scratch. More annoyingly, sometimes some
+   registers move from one place to another, but are otherwise unchanged.
+   Also [nvidia-specific], new grobj classes are sometimes really just new
+   revisions of a base class with a few methods changed. In both of these
+   cases, we want to avoid duplication as much as possible.
+
+2. Proposed new XML format
+
+2.1. General tags
+
+Root tag is <database>. There is one per the whole file and it should contain
+everything else.
+
+<brief> and <doc> are tags that can appear inside any other tag, and document
+whatever it defines. <brief> is supposed to be a short one-line description
+giving a rough idea what a given item is for if no sufficiently descriptive
+name was used. <doc> can be of any length, can contain some html and html-like
+tags, and is supposed to describe a given item in as much detail as needed.
+There should be at most one <doc> and at most one <brief> tag for any parent.
+
+Tags that define top-level entities include:
+
+ <domain>: Declares an addressing space containing registers
+ <group>: Declares a block of registers, expected to be included by one or
+          more <domain>s
+ <bitset>: Declares a list of applicable bitfields for some register
+ <enum>: Declares a list of related symbolic values. Can describe params to
+         a register/bitfield, or discriminate between card variants.
+
+Each of these has an associated global name used to refer to them from other
+parts of database. As a convenience, and to allow related stuff to be kept
+together, the top-level entities are allowed to occur pretty much anywhere
+inside the XML file except inside <doc> tags.
This implies no scoping,
+however: the effect is the same as putting the entity right below <database>.
+If two top-level elements of the same type and name are defined, they'll be
+merged into a single one, as if contents of one were written right after
+contents of the other. All attributes of the merged tags need to match.
+
+Another top-level tag that can be used anywhere is the <import> tag. It's used
+like <import file="foo.xml"/> and makes all of foo.xml's definitions available
+to the containing file. If a single file is <import>ed more than one time, all
+<import>s other than the first are ignored.
+
+2.2. Domains
+
+All register definitions ultimately belong to a <domain>. <domain> is
+basically just a single address space. So we'll have a domain for the MMIO
+BAR, one for each type of memory structure we need to describe, a domain for
+the grobj/FIFO methods, and a domain for each indirect index-data pair used to
+access something useful. <domain> can have the following attributes:
+
+ - name [required]: The name of the domain.
+ - width [optional]: the size, in bits, of a single addressable unit. This is
+   8 by default for usual byte-addressable memory, but 32 can be useful
+   occasionally for indexed spaces of 32-bit cells. Values sane enough to
+   support for now include 8, 16, 32, 64.
+ - size [optional]: total number of addressable units it spans. Can be
+   undefined if you don't know it or it doesn't make sense. As a special
+   exception to the merging rules, size attribute need not be specified on all
+   tags that will result in a merged domain: tags with size can be merged with
+   tags without size, resulting in merged domain that has size. Error only
+   happens when the merged domains both have sizes, and the sizes differ.
+ - bare [optional]: if set to "no", all children items will have the domain
+   name prepended to their names. If set to "yes", such prefixing doesn't
+   happen. Default is "no".
+ - prefix [optional]: selects the string that should be prepended to name
+   of every child item.
The special value "none" means no prefix, and is the
+   default. All other values are looked up as <enum> names and, for each child
+   item, its name is prefixed with name of the earliest variant in the given
+   enum that supports given item.
+ + + + + + +
+Describes a space with 0x1000000 of 8-bit addressable cells. Cells 0-3 belong
+to NV04_PMC_BOOT_0 register, 4-7 belong to NV10_PMC_BOOT_1 register,
+0x100-0x103 belong to NV04_PMC_INTR register, and remaining cells are either
+unused or unknown. The generated .h definitions are:
+
+#define NV_MMIO__SIZE 0x1000000
+#define NV04_PMC_BOOT_0 0
+#define NV10_PMC_BOOT_1 4
+#define NV04_PMC_INTR 0x100
+ + + + + + + + + +
+Defines a 6-cell address space with each cell 32 bits in size and
+corresponding to a single register. Definitions are:
+
+#define NV50_PFB_VM_TRAP__SIZE 6
+#define NV50_PFB_VM_TRAP_STATUS 0
+#define NV50_PFB_VM_TRAP_CHANNEL 1
+#define NV50_PFB_VM_TRAP_UNK2 2
+#define NV50_PFB_VM_TRAP_ADDRLOW 3
+#define NV50_PFB_VM_TRAP_ADDRMID 4
+#define NV50_PFB_VM_TRAP_ADDRHIGH 5
+
+2.3. Registers
+
+What we really want all the time is defining registers. This is done with
+<reg8>, <reg16>, <reg32> or <reg64> tags. The register of course takes
+reg_width / domain_width cells in the domain. It's an error to define a
+register with smaller width than the domain it's in. The <reg*> attributes
+are:
+
+ - name [required]: the name of the register
+ - offset [required]: the offset of the register
+ - access [optional]: "rw" [default], "r", or "w" to mark the register as
+   read-write, read-only, or write-only. Only makes sense for real MMIO
+   domains.
+ - varset [optional]: the <enum> to choose from by the variant attribute.
+   Defaults to first <enum> used in currently active prefix.
+ - variants [optional]: space-separated list of variants and variant ranges that this
+   register is present on.
The items of this list can be: + - var1: a single variant + - var1-var2: all variants starting with var1 up to and including var2 + - var1:var2: all variants starting with var1 up to, but not including var2 + - :var1: all variants before var1 + - -var1: all variants up to and including var1 + - var1-: all variants starting from var1 + - type [optional]: How to interpret the contents of this register. + - "uint": unsigned decimal integer + - "int": signed decimal integer + - "hex": unsigned hexadecimal integer + - "float": IEEE 16-bit, 32-bit or 64-bit floating point format, depending + on register/bitfield size + - "boolean": a boolean value: 0 is false, 1 is true + - any defined enum name: value from that enum + - "enum": value from the inline tags in this + - any defined bitset name: value decoded further according to that bitset + - "bitset": value decoded further according to the inline + tags + - any defined domain name: value decoded as an offset in that domain + The default is "bitset" if there are inline tags present, + otherwise "enum" if there are inline tags present, otherwise + "boolean" if this is a bitfield with width 1, otherwise "hex". + - shr [optional]: the value in this register is the real value shifted right + by this many bits. I.e. for register with shr="12", register value 0x1234 + should be interpreted as 0x1234000. May sound too specific, but happens + quite often in nvidia hardware. + - length [optional]: if specified to be other than 1, the register is treated + as if it was enclosed in an anonymous with corresponding length + and stride attributes, except the __ESIZE and __LEN stripe defines are + emitted with the register's name. If not specified, defaults to 1. + - stride [optional]: the stride value to use if length is non-1. Defaults to + the register's size in cells. + +The definitions emitted for a non-stripe register include only its offset and +shr value.
Other information is generally expected to be a part of code +logic anyway: + + +results in + +#define PGRAPH_CTXCTL_SWAP 0x400784 +#define PGRAPH_CTXCTL_SWAP__SHR 12 + +For striped registers, __LEN and __ESIZE definitions like are emitted +too: + + + +results in + +#define NV50_COMPUTE_USER_PARAM(i) (0x600 + (i)*4) +#define NV50_COMPUTE_USER_PARAM__LEN 64 +#define NV50_COMPUTE_USER_PARAM__ESIZE 4 + +The tags can also contain either bitfield definitions, or enum value +definitions. + +2.4. Enums and variants + +Enum is, basically, a set of values. They're defined by tag with the +following attributes: + + - name [required]: an identifying name. + - inline [optional]: "yes" or "no", with "no" being the default. Selects if + this enum should emit its own definitions in .h file, or be inlined into + any / definitions that reference it. + - bare [optional]: only for no-inline enums, behaves like bare attribute + to + - prefix [optional]: only for no-inline enums, behaves like prefix attribute + to . + +The tag contains tags with the following attributes: + + - name [required]: the name of the value + - value [optional]: the value + - varset [optional]: like in + - variants [optional]: like in + +The s are referenced from inside and tags by setting +the type attribute to the name of the enum. For single-use enums, the +tags can also be written directly inside tag. + + + + + + + + + + + + + + + + + + +Result: + +#define NV04_SURFACE_FORMAT_A8R8G8B8 6 +#define NV04_SURFACE_FORMAT_A8R8G8B8_RECT 0x12 +#define TEXTURE_FORMAT 0x1234 +#define SHADE_MODEL 0x1238 +#define SHADE_MODEL_FLAT 0x1d00 +#define SHADE_MODEL_SMOOTH 0x1d01 +#define PATTERN_SELECT 0x123c +#define PATTERN_SELECT_MONO 1 +#define PATTERN_SELECT_COLOR 2 + +Another use for enums is describing variants: slightly different versions of +cards, objects, etc. The varset and variant attributes of most tags allow +defining items that are only present when you're dealing with something of the +matching variant.
The variant space is "multidimensional" - so you can have +a variant "dimension" representing what GPU chipset you're using at the +moment, and another dimension representing what grobj class you're dealing +with [taken from another enum]. Both of these can be independent. + + + The chipset of the card + + RIVA TNT + + + RIVA TNT2 + + + GeForce 256 + + + G80: GeForce 8800 GTX, Tesla *870, ... + + + G84: GeForce 8600 GT, ... + + + G200: GeForce 260 GTX, Tesla C1060, ... + + + GT216: GeForce GT 220 + + + +If enabled for a given domain, the name of the earliest variant to support +a given register / bitfield / value / whatever will be automatically prepended +to its name. For this purpose, "earliest" is defined as "comes first in the +XML file". + +s used for this purpose can still be used as normal enums. And can even +have variant-specific values referencing another . Example: + + + + + + + + + + +In generated .h file, this will result in: + +#define NV04_MEMORY_TO_MEMORY_FORMAT 0x0039 +#define NV50_MEMORY_TO_MEMORY_FORMAT 0x5039 +#define NV50_2D 0x502d +#define NV50_TCL 0x5097 +#define NV84_TCL 0x8297 +#define NV50_COMPUTE 0x50c0 + +2.5. Bitfields + +Often, registers store not a single full-width value, but are split into +bitfields. Like values can be grouped in enums, bitfields can be called in +bitsets. The tag has the same set of attributes as tag, and +contains tags with the following attributes: + + - name [required]: name of the bitfield + - low [required]: index of the lowest bit belonging to this bitfield. bits + are counted from 0, LSB-first. + - high [required]: index of the highest bit belonging to this bitfield. + - varset [optional]: like in + - variants [optional]: like in + - type [optional]: like in + - shr [optional]: like in + +Like s, s are also allowed to be written directly inside + tags. + +s themselves can contain s. 
The defines generated for +s include "name__MASK" equal to the bitmask corresponding to given +bitfield, "name__SHIFT" equal to the low attribute, "name__SHR" equal to +the shr attribute [if defined]. Single-bit bitfields with type "boolean" are +treated specially, and get "name" defined to the bitmask instead. If the +bitfield contains any s, s, or references an inlined +enum/bitset, defines for them are also generated, pre-shifted to the correct +position. Example: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Result: + +#define NV04_GROBJ_1_GRCLASS__MASK 0x000000ff +#define NV04_GROBJ_1_GRCLASS__SHIFT 0 +#define NV04_GROBJ_1_CHROMA_KEY 0x00001000 +#define NV04_GROBJ_1_USER_CLIP 0x00002000 +#define NV04_GROBJ_1_SWIZZLE 0x00004000 +#define NV04_GROBJ_1_PATCH_CONFIG__MASK 0x00038000 +#define NV04_GROBJ_1_PATCH_CONFIG__SHIFT 15 +#define NV04_GROBJ_1_PATCH_CONFIG_SRCCOPY_AND 0x00000000 +#define NV04_GROBJ_1_PATCH_CONFIG_ROP_AND 0x00008000 +#define NV04_GROBJ_1_PATCH_CONFIG_BLEND_AND 0x00010000 +#define NV04_GROBJ_1_PATCH_CONFIG_SRCCOPY 0x00018000 +#define NV04_GROBJ_1_PATCH_CONFIG_SRCCOPY_PRE 0x00020000 +#define NV04_GROBJ_1_PATCH_CONFIG_BLEND_PRE 0x00028000 + +#define PGRAPH_CTX_SWITCH_1 0x40014c + +#define FORMAT 0x0404 +#define FORMAT_PITCH__MASK 0x0000ffff +#define FORMAT_PITCH__SHIFT 0 +#define FORMAT_ORIGIN__MASK 0x00ff0000 +#define FORMAT_ORIGIN__SHIFT 16 +#define FORMAT_FILTER__MASK 0xff000000 +#define FORMAT_FILTER__SHIFT 24 + +#define POINT 0x040c +#define POINT_X 0x0000ffff +#define POINT_X__SHIFT 0 +#define POINT_Y 0xffff0000 +#define POINT_Y__SHIFT 16 + +#define FP_INTERPOLANT_CTRL 0x00001988 +#define FP_INTERPOLANT_CTRL_UMASK__MASK 0xff000000 +#define FP_INTERPOLANT_CTRL_UMASK__SHIFT 24 +#define FP_INTERPOLANT_CTRL_UMASK_X 0x01000000 +#define FP_INTERPOLANT_CTRL_UMASK_Y 0x02000000 +#define FP_INTERPOLANT_CTRL_UMASK_Z 0x04000000 +#define FP_INTERPOLANT_CTRL_UMASK_W 0x08000000 +#define
FP_INTERPOLANT_CTRL_COUNT_NONFLAT__MASK 0x00ff0000 +#define FP_INTERPOLANT_CTRL_COUNT_NONFLAT__SHIFT 16 +#define FP_INTERPOLANT_CTRL_OFFSET__MASK 0x0000ff00 +#define FP_INTERPOLANT_CTRL_OFFSET__SHIFT 8 +#define FP_INTERPOLANT_CTRL_COUNT__MASK 0x000000ff +#define FP_INTERPOLANT_CTRL_COUNT__SHIFT 0 + +2.6. Arrays and stripes. + +Sometimes you have multiple copies of a register. Sometimes you actually have +multiple copies of a whole set of registers. And sometimes this set itself +contains multiple copies of something. This is what s are for. The + represents "length" units, each of size "stride" packed next to each +other starting at "offset". Offsets of everything inside the array are +relative to the start of an element of the array. The attributes include: + + - name [required]: name of the array, also used as prefix for all items + inside it + - offset [required]: starting offset of the array. + - stride [required]: size of a single element of the array, as well as the + difference between offsets of two neighboring elements + - length [required]: Number of elements in the array + - varset [optional]: As in + - variants [optional]: As in + +The definitions emitted for an array include: + - name(i) defined to be the starting offset of element i, if length > 1 + - name defined to be the starting offset of the array, if length == 1 + - name__LEN defined to be the length of array + - name__ESIZE defined to be the stride of array + +Also, if length is not 1, definitions for all items inside the array that +involve offsets become parameter-taking C macros that calculate the offset +based on array index. For nested arrays, this macro takes as many arguments +as there are indices involved. + +It's an error if an item inside an array doesn't fit inside the array element. + + + + + + + + s, items can have offsets larger than stride, and offsets aren't +automatically assumed to be a part of unless a explicitly +hits that particular offset for some index.
Also, s of length 1 and +stride 0 can be used as generic container, for example to apply a variant set +or a prefix to a bigger set of elements. Attributes: + + - name [optional]: like in . If not given, no prefixing happens, and + the defines for itself aren't emitted. + - offset [optional]: like . Defaults to 0 if unspecified. + - stride [optional]: the difference between offsets of items with indices i + and i+1. Or size of the if it makes sense in that particular + context. Defaults to 0. + - length [optional]: like in array. Defaults to 1. + - varset [optional]: as in + - variants [optional]: as in + - prefix [optional]: as in , overrides parent's prefix option. + +Definitions are emitted like for arrays, but: + - if no name is given, the definitions for stripe itself won't be emitted + - if length is 0, the length is assumed to be unknown or undefined. No __LEN + is emitted in this case. + - if stride is 0, __ESIZE is not emitted + - it's an error to have stride 0 with length different than 1 + + +Examples: + + + + + + + + + + + + + +Results in: + +#define NV04_PGRAPH 0x400000 +#define NV04_PGRAPH_INTR 0x400100 +#define NV04_PGRAPH_INTR_EN 0x400140 +#define NV50_PGRAPH 0x400000 +#define NV50_PGRAPH_INTR 0x400100 +#define NV50_PGRAPH_TRAP 0x400108 +#define NV50_PGRAPH_TRAP_EN 0x400138 +#define NV50_PGRAPH_INTR_EN 0x40013c + + + + + + + + + + + +Results in: + +#define PVIDEO_BASE (0x8900+(i)*4) +#define PVIDEO_LIMIT (0x8908+(i)*4) +#define PVIDEO_LUMINANCE (0x8910+(i)*4) +#define PVIDEO_CHROMINANCE (0x8918+(i)*4) + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Results in: + +#define NV01_OBJECT_NAME 0x00 +#define NV50_OBJECT_FENCE_ADDRESS_HIGH 0x10 +#define NV50_MEMORY_TO_MEMORY_FORMAT_LINEAR_IN 0x200 +#define NV04_MEMORY_TO_MEMORY_FORMAT_BUFFER_NOTIFY 0x328 +#define NV50_COMPUTE_LAUNCH 0x368 +#define NV50_COMPUTE_GLOBAL 0x400 +#define NV50_COMPUTE_GLOBAL__LEN 16 +#define NV50_COMPUTE_GLOBAL__ESIZE 0x20 +#define NV50_COMPUTE_GLOBAL_ADDRESS_HIGH (0x400 
+ (i)*0x20) +#define NV50_COMPUTE_GLOBAL_ADDRESS_LOW (0x404 + (i)*0x20) +#define NV50_COMPUTE_GLOBAL_PITCH (0x408 + (i)*0x20) +#define NV50_COMPUTE_GLOBAL_LIMIT (0x40c + (i)*0x20) +#define NV50_COMPUTE_GLOBAL_MODE (0x410 + (i)*0x20) +#define NV50_COMPUTE_USER_PARAM(i) (0x600 + (i)*4) +#define NV50_COMPUTE_USER_PARAM__LEN 64 +#define NV50_COMPUTE_USER_PARAM__ESIZE 4 + +2.7. Groups + +Groups are just sets of registers and/or arrays that can be copied-and-pasted +together, when they're duplicated in several places in the same , +two different s, or have different offsets for different variants. + and only have the name attribute. can appear +wherever can, including inside a . + + + + + + + + + + + + + + + + + + + + +Will get you: + +#define NV50_PGRAPH_TP_MP_TRAPPED_OPCODE(i, j) (0x408270 + (i)*0x1000 + (j)*0x80) +#define NVA0_PGRAPH_TP_MP_TRAPPED_OPCODE(i, j) (0x408170 + (i)*0x800 + (j)*0x80) + +3. The utilities. + +The header generation utility will take a set of XML files and generate .h +file with all of their definitions, as defined above. + +The HTML generation utility will take an XML file and generate HTML +documentation out of it. The documentation will include the and +tags in some way, as well as information from all the attributes, in some easy +to read format. Some naming scheme for the HTML files should be decided, so +that cross-refs to HTML documentation generated for ed files will work +correctly if the generator is run on both. + +The lookup utility will perform database lookups of the following types: + + - domain name, offset, access type, variant type[s] -> register name + array + indices if applicable + - the above + register value -> same as above + decoded value. For registers + with bitfields, print all bitfields, and indicate if any bits not covered + by the bitfields are set to 1. For registers/bitfields with enum values, + print the matching one if any. For remaining registers/bitfields, print + according to type attribute.
+ - bitset name + value -> decoded value, as above + - enum name + value -> decoded value, as above + +The mmio-parse utility will parse a mmio-trace file and apply the second kind +of database lookups to all memory accesses matching a given range. Some +nv-specific hacks will be in order to automate the parsing: extract the +chipset from PMC_BOOT_0, figure out the mmio base from PCI config, etc. + +The renouveau-parse utility will take contents of a PFIFO pushbuffer and +decode them. The splitting to method,data pair will be done by nv-specific +code, then the pair will be handed over to generic rules-ng lookup. + +4. Issues + + - Random typing-saving feature for bitfields: make high default to same value + as low, to have one less attribute for single-bit bitfields? + + - What about allowing nameless registers and/or bitfields? These are + supported by renouveau.xml and are used commonly to signify an unknown + register. + + - How about cross-ref links in tags? + + - : do we need them? Sounds awesome and useful, but as defined + by the old spec, they're quite limited. The only examples of straight + translations that I know of are the legacy VGA registers and + NV50_PFB_VM_TRAP. And NV01_PDAC, but I doubt anybody gives a damn about it. + This list is small enough to be just handled by nv-specific hacks in + mmio-trace parser if really needed. + + - Another thing that renouveau.xml does is disassembling NV20-NV40 shaders. + Do we want that in rules-ng? IMO we'd be better off hacking nv50dis to + support it... diff --git a/src/freedreno/registers/rules-ng-ng.xsd b/src/freedreno/registers/rules-ng-ng.xsd new file mode 100644 index 00000000000..10b5f3712f8 --- /dev/null +++ b/src/freedreno/registers/rules-ng-ng.xsd @@ -0,0 +1,388 @@ + + + + + + An updated version of the old rules.xml file from the + RivaTV project. Specifications by Pekka Paalanen, + preliminary attempt by KoalaBR, + first working version by Jakob Bornecrantz. 
+ For specifications, see the file rules-ng-format.txt + in Nouveau CVS module 'rules-ng'. + + Version 0.1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + databaseType + + + + + + + + + + importType + + + + + + + domainType + + + + + + + + + + + + + + + + groupType + + + + + + + + + + + + arrayType + + + + + + + + + + + + + + + + + stripeType + + + + + + + + + + + + + + + + + + + registerType used by reg8, reg16, reg32, reg64 + + + + + + + + + + + + + + + + + + + + + + bitsetType + + + + + + + + + + + + + + + bitfieldType + + + + + + + + + + + + + + + + + + enumType + + + + + + + + + + + + + + + valueType + + + + + + + + + + + + + + refType + + + + + + + + + + + brief documentation, no markup + + + + + + + + + + + root element of documentation sub-tree + + + + + + + + + + + + + for bold, underline, italics + + + + + + + + + + + + + + + + + + + definition of a list, ordered or unordered + + + + + + + + + + + items of a list + + + + + + + + + + + + + + + + + + + + + + + + HexOrNumber + + + + + + + Access + + + + + + + + + + + DomainWidth + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/freedreno/registers/text-format.txt b/src/freedreno/registers/text-format.txt new file mode 100644 index 00000000000..8e4343f9e08 --- /dev/null +++ b/src/freedreno/registers/text-format.txt @@ -0,0 +1,101 @@ +1. Introduction to rules-ng-ng text format + +This-specification defines a text format that can be converted to and from rules-ng-ng XML. +It is intended to allow to create rules-ng-ng files with much less typing and with a more readable text. +xml2text can convert rules-ng-ng XML to this text format +text2xml can convert this text format to rules-ng-ng XML + +This specification is an addendum to the rules-ng-ng specification and assumes familiarity with it. + +2. Format + +2.1. 
Line format + +The initial indentation of a line is divided by 8 and the result determines the position in the document structure (similar to the Python language). +A "//" anywhere in the line causes the rest to be converted to an XML comment (like C++). +A line starting with ":" creates a tag with the rest of the line (excluding anything starting with //). +The content of multiple lines starting with ":" is merged into a single tag. + +2.2. Tokenization + +The line is then tokenized. +Tokens are generally contiguous strings of non-whitespace characters, with some exceptions. +Some characters (such as ":", "=" and "-") form a single-character token. +Text within double quotes generates a tag. +Any token formatted as ATTR(VALUE) generates an ATTR="VALUE" attribute. No whitespace is allowed between ATTR and the '(' character. +Any token formatted as (VALUE) generates a variants="VALUE" attribute. +Any token formatted as (VARSET=VALUE) generates a varset="VARSET" variants="VALUE" attribute. + +2.3. Special token sequences + +These sequences are recognized and extracted before matching the line format: + +: NUM + set REGLIKE to regNUM + you must specify a type if the reg is anonymous + the : is recognized only if it is the third or subsequent token (and not the last) to avoid ambiguity with bitfields and generic tags + +{ STRIDE } + stride="STRIDE" attribute + +[ LENGTH ] + length="LENGTH" attribute + +!FLAGS + access="FLAGS" + no whitespace allowed after '!' + +:= + at the end of the line + set REGLIKE to "stripe" + += + at the end of the line + set REGLIKE to "array" + +inline + at the beginning of the line + inline="yes" attribute + +2.4. Line patterns + +The following line patterns are understood. +Only word tokens are used to match lines. +All tokens with special meaning are treated separately as described above.
+[FOO] means that FOO is optional + +#import "FILE" + + +#pragma regNUM + REGLIKE is now set by default to regNUM instead of reg32 + +@TAG [NAME] + + use this if there are no children + +TAG [NAME] : + + use this if there are children + +TOKEN + if inside a reg or enum and TOKEN starts with a digit + if inside a reg or enum and TOKEN does not start with a digit + otherwise + +POS NAME + if inside a reg or bitset + otherwise + +LOW - HIGH NAME [TYPE] + + +VALUE = NAME + + +use WHAT NAME + + +OFFSET NAME [TYPE] + + diff --git a/src/freedreno/registers/update-headers.sh b/src/freedreno/registers/update-headers.sh deleted file mode 100755 index 4e521e8c74a..00000000000 --- a/src/freedreno/registers/update-headers.sh +++ /dev/null @@ -1,14 +0,0 @@ -#!/bin/sh - -d=$(dirname $0) - -rnndb=$1 - -if [ ! -f $rnndb/rnndb/adreno/adreno_common.xml ]; then - echo directory does not look like envytools: $rnndb - exit 1 -fi - -for f in $d/*.xml.h; do - cp -v $rnndb/rnndb/adreno/$(basename $f) $d -done