From 23025ed15dc5d99ab895f425986a66f941d7c012 Mon Sep 17 00:00:00 2001
From: Roland Scheidegger
Date: Fri, 3 May 2013 21:34:12 +0200
Subject: [PATCH] gallium: tgsi documentation updates and clarification for integer opcodes.

A lot of them were missing.
Others were moved from the Compute ISA to a new Integer ISA section as
that seemed more appropriate.

Reviewed-by: Jose Fonseca
---
 src/gallium/docs/source/tgsi.rst | 362 ++++++++++++++++++++++++-------
 1 file changed, 289 insertions(+), 73 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index a528fd27688..b2f7a85a7f5 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -872,6 +872,16 @@ This instruction replicates its result.
   as an integer register.


+.. opcode:: CONT - Continue
+
+  TBD
+
+.. note::
+
+   Support for CONT is determined by a special capability bit,
+   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
+
+
 .. opcode:: IF - Float If

   Start an IF ... ELSE .. ENDIF block. Condition evaluates to true if
@@ -977,6 +987,7 @@ These opcodes are primarily provided for special-use computational shaders.
 Support for these opcodes indicated by a special pipe capability bit (TBD).

 XXX so let's discuss it, yeah?
+XXX doesn't look like most of the opcodes really belong here.

 .. opcode:: CEIL - Ceiling

@@ -991,7 +1002,89 @@ XXX so let's discuss it, yeah?
   dst.w = \lceil src.w\rceil


-.. opcode:: I2F - Integer To Float
+.. opcode:: TRUNC - Truncate
+
+.. math::
+
+  dst.x = trunc(src.x)
+
+  dst.y = trunc(src.y)
+
+  dst.z = trunc(src.z)
+
+  dst.w = trunc(src.w)
+
+
+.. opcode:: MOD - Modulus
+
+.. math::
+
+  dst.x = src0.x \bmod src1.x
+
+  dst.y = src0.y \bmod src1.y
+
+  dst.z = src0.z \bmod src1.z
+
+  dst.w = src0.w \bmod src1.w
+
+
+.. opcode:: UARL - Integer Address Register Load
+
+  Moves the contents of the source register, assumed to be an integer, into the
+  destination register, which is assumed to be an address (ADDR) register.
+
+
+.. opcode:: SAD - Sum Of Absolute Differences
+
+.. math::
+
+  dst.x = |src0.x - src1.x| + src2.x
+
+  dst.y = |src0.y - src1.y| + src2.y
+
+  dst.z = |src0.z - src1.z| + src2.z
+
+  dst.w = |src0.w - src1.w| + src2.w
+
+
+.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
+                  from a specified texture image. The source sampler may
+                  not be a CUBE or SHADOW.
+                  src 0 is a four-component signed integer vector used to
+                  identify the single texel accessed. 3 components + level.
+                  src 1 is a 3 component constant signed integer vector,
+                  with each component only have a range of
+                  -8..+8 (hw only seems to deal with this range, interface
+                  allows for up to unsigned int).
+                  TXF(uint_vec coord, int_vec offset).
+
+
+.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
+                  retrieve the dimensions of the texture
+                  depending on the target. For 1D (width), 2D/RECT/CUBE
+                  (width, height), 3D (width, height, depth),
+                  1D array (width, layers), 2D array (width, height, layers)
+
+.. math::
+
+  lod = src0.x
+
+  dst.x = texture_width(unit, lod)
+
+  dst.y = texture_height(unit, lod)
+
+  dst.z = texture_depth(unit, lod)
+
+
+Integer ISA
+^^^^^^^^^^^^^^^^^^^^^^^^
+These opcodes are used for integer operations.
+Support for these opcodes indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
+
+
+.. opcode:: I2F - Signed Integer To Float
+
+  Rounding is unspecified (round to nearest even suggested).

 .. math::

@@ -1004,56 +1097,157 @@ XXX so let's discuss it, yeah?
   dst.w = (float) src.w


-.. opcode:: NOT - Bitwise Not
+.. opcode:: U2F - Unsigned Integer To Float
+
+  Rounding is unspecified (round to nearest even suggested).

 .. math::

-  dst.x = ~src.x
+  dst.x = (float) src.x

-  dst.y = ~src.y
+  dst.y = (float) src.y

-  dst.z = ~src.z
+  dst.z = (float) src.z

-  dst.w = ~src.w
+  dst.w = (float) src.w


-.. opcode:: TRUNC - Truncate
+.. opcode:: F2I - Float to Signed Integer
+
+  Rounding is towards zero (truncate).
+  Values outside signed range (including NaNs) produce undefined results.

 .. math::

-  dst.x = trunc(src.x)
+  dst.x = (int) src.x

-  dst.y = trunc(src.y)
+  dst.y = (int) src.y

-  dst.z = trunc(src.z)
+  dst.z = (int) src.z

-  dst.w = trunc(src.w)
+  dst.w = (int) src.w


-.. opcode:: SHL - Shift Left
+.. opcode:: F2U - Float to Unsigned Integer
+
+  Rounding is towards zero (truncate).
+  Values outside unsigned range (including NaNs) produce undefined results.

 .. math::

-  dst.x = src0.x << src1.x
+  dst.x = (unsigned) src.x

-  dst.y = src0.y << src1.x
+  dst.y = (unsigned) src.y

-  dst.z = src0.z << src1.x
+  dst.z = (unsigned) src.z

-  dst.w = src0.w << src1.x
+  dst.w = (unsigned) src.w


-.. opcode:: SHR - Shift Right
+.. opcode:: UADD - Integer Add
+
+  This instruction works the same for signed and unsigned integers.
+  The low 32bit of the result is returned.

 .. math::

-  dst.x = src0.x >> src1.x
+  dst.x = src0.x + src1.x

-  dst.y = src0.y >> src1.x
+  dst.y = src0.y + src1.y

-  dst.z = src0.z >> src1.x
+  dst.z = src0.z + src1.z

-  dst.w = src0.w >> src1.x
+  dst.w = src0.w + src1.w
+
+
+.. opcode:: UMAD - Integer Multiply And Add
+
+  This instruction works the same for signed and unsigned integers.
+  The multiplication returns the low 32bit (as does the result itself).
+
+.. math::
+
+  dst.x = src0.x \times src1.x + src2.x
+
+  dst.y = src0.y \times src1.y + src2.y
+
+  dst.z = src0.z \times src1.z + src2.z
+
+  dst.w = src0.w \times src1.w + src2.w
+
+
+.. opcode:: UMUL - Integer Multiply
+
+  This instruction works the same for signed and unsigned integers.
+  The low 32bit of the result is returned.
+
+.. math::
+
+  dst.x = src0.x \times src1.x
+
+  dst.y = src0.y \times src1.y
+
+  dst.z = src0.z \times src1.z
+
+  dst.w = src0.w \times src1.w
+
+
+.. opcode:: IDIV - Signed Integer Division
+
+  TBD: behavior for division by zero.
+
+.. math::
+
+  dst.x = src0.x \ src1.x
+
+  dst.y = src0.y \ src1.y
+
+  dst.z = src0.z \ src1.z
+
+  dst.w = src0.w \ src1.w
+
+
+.. opcode:: UDIV - Unsigned Integer Division
+
+  For division by zero, 0xffffffff is returned.
+
+.. math::
+
+  dst.x = src0.x \ src1.x
+
+  dst.y = src0.y \ src1.y
+
+  dst.z = src0.z \ src1.z
+
+  dst.w = src0.w \ src1.w
+
+
+.. opcode:: UMOD - Unsigned Integer Remainder
+
+  If second arg is zero, 0xffffffff is returned.
+
+.. math::
+
+  dst.x = src0.x \ src1.x
+
+  dst.y = src0.y \ src1.y
+
+  dst.z = src0.z \ src1.z
+
+  dst.w = src0.w \ src1.w
+
+
+.. opcode:: NOT - Bitwise Not
+
+.. math::
+
+  dst.x = ~src.x
+
+  dst.y = ~src.y
+
+  dst.z = ~src.z
+
+  dst.w = ~src.w


 .. opcode:: AND - Bitwise And
@@ -1082,114 +1276,136 @@ XXX so let's discuss it, yeah?
   dst.w = src0.w | src1.w


-.. opcode:: MOD - Modulus
+.. opcode:: XOR - Bitwise Xor

 .. math::

-  dst.x = src0.x \bmod src1.x
+  dst.x = src0.x \oplus src1.x

-  dst.y = src0.y \bmod src1.y
+  dst.y = src0.y \oplus src1.y

-  dst.z = src0.z \bmod src1.z
+  dst.z = src0.z \oplus src1.z

-  dst.w = src0.w \bmod src1.w
+  dst.w = src0.w \oplus src1.w


-.. opcode:: XOR - Bitwise Xor
+.. opcode:: IMAX - Maximum of Signed Integers

 .. math::

-  dst.x = src0.x \oplus src1.x
+  dst.x = max(src0.x, src1.x)

-  dst.y = src0.y \oplus src1.y
+  dst.y = max(src0.y, src1.y)

-  dst.z = src0.z \oplus src1.z
+  dst.z = max(src0.z, src1.z)

-  dst.w = src0.w \oplus src1.w
+  dst.w = max(src0.w, src1.w)


-.. opcode:: UCMP - Integer Conditional Move
+.. opcode:: UMAX - Maximum of Unsigned Integers

 .. math::

-  dst.x = src0.x ? src1.x : src2.x
+  dst.x = max(src0.x, src1.x)

-  dst.y = src0.y ? src1.y : src2.y
+  dst.y = max(src0.y, src1.y)

-  dst.z = src0.z ? src1.z : src2.z
+  dst.z = max(src0.z, src1.z)

-  dst.w = src0.w ? src1.w : src2.w
+  dst.w = max(src0.w, src1.w)


-.. opcode:: UARL - Integer Address Register Load
+.. opcode:: IMIN - Minimum of Signed Integers

-  Moves the contents of the source register, assumed to be an integer, into the
-  destination register, which is assumed to be an address (ADDR) register.
+.. math::

+  dst.x = min(src0.x, src1.x)

-.. opcode:: IABS - Integer Absolute Value
+  dst.y = min(src0.y, src1.y)
+
+  dst.z = min(src0.z, src1.z)
+
+  dst.w = min(src0.w, src1.w)
+
+
+.. opcode:: UMIN - Minimum of Unsigned Integers

 .. math::

-  dst.x = |src.x|
+  dst.x = min(src0.x, src1.x)

-  dst.y = |src.y|
+  dst.y = min(src0.y, src1.y)

-  dst.z = |src.z|
+  dst.z = min(src0.z, src1.z)

-  dst.w = |src.w|
+  dst.w = min(src0.w, src1.w)


-.. opcode:: SAD - Sum Of Absolute Differences
+.. opcode:: SHL - Shift Left

 .. math::

-  dst.x = |src0.x - src1.x| + src2.x
+  dst.x = src0.x << src1.x

-  dst.y = |src0.y - src1.y| + src2.y
+  dst.y = src0.y << src1.x

-  dst.z = |src0.z - src1.z| + src2.z
+  dst.z = src0.z << src1.x

-  dst.w = |src0.w - src1.w| + src2.w
+  dst.w = src0.w << src1.x


-.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
-                  from a specified texture image. The source sampler may
-                  not be a CUBE or SHADOW.
-                  src 0 is a four-component signed integer vector used to
-                  identify the single texel accessed. 3 components + level.
-                  src 1 is a 3 component constant signed integer vector,
-                  with each component only have a range of
-                  -8..+8 (hw only seems to deal with this range, interface
-                  allows for up to unsigned int).
-                  TXF(uint_vec coord, int_vec offset).
+.. opcode:: ISHR - Arithmetic Shift Right (of Signed Integer)

+.. math::

-.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
-                  retrieve the dimensions of the texture
-                  depending on the target. For 1D (width), 2D/RECT/CUBE
-                  (width, height), 3D (width, height, depth),
-                  1D array (width, layers), 2D array (width, height, layers)
+  dst.x = src0.x >> src1.x
+
+  dst.y = src0.y >> src1.x
+
+  dst.z = src0.z >> src1.x
+
+  dst.w = src0.w >> src1.x
+
+
+.. opcode:: USHR - Logical Shift Right

 .. math::

-  lod = src0
+  dst.x = src0.x >> (unsigned) src1.x

-  dst.x = texture_width(unit, lod)
+  dst.y = src0.y >> (unsigned) src1.x

-  dst.y = texture_height(unit, lod)
+  dst.z = src0.z >> (unsigned) src1.x

-  dst.z = texture_depth(unit, lod)
+  dst.w = src0.w >> (unsigned) src1.x


-.. opcode:: CONT - Continue

-  TBD

-.. note::
+.. opcode:: UCMP - Integer Conditional Move

-   Support for CONT is determined by a special capability bit,
-   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
+.. math::
+
+  dst.x = src0.x ? src1.x : src2.x
+
+  dst.y = src0.y ? src1.y : src2.y
+
+  dst.z = src0.z ? src1.z : src2.z
+
+  dst.w = src0.w ? src1.w : src2.w
+
+
+.. opcode:: IABS - Integer Absolute Value
+
+.. math::
+
+  dst.x = |src.x|
+
+  dst.y = |src.y|
+
+  dst.z = |src.z|
+
+  dst.w = |src.w|


 Geometry ISA
-- 
2.30.2