X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=src%2Fgallium%2Fdocs%2Fsource%2Ftgsi.rst;h=ecab7cb8097bba02831174a295918f804157915f;hb=1e6d51e805baa11eff17ea784c92ffc7933c56c5;hp=7e6dce995389a17b3624cbac9a3af49f8f0e6306;hpb=62ca7b85ae1f7d914156a9b376d0520db85ba495;p=mesa.git

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 7e6dce99538..ecab7cb8097 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -19,12 +19,18 @@ Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
 components as integers. Other instructions permit using registers as
 two-component vectors with double precision; see :ref:`Double Opcodes`.
 
+When an instruction has a scalar result, the result is usually copied into
+each of the components of *dst*. When this happens, the result is said to be
+*replicated* to *dst*. :opcode:`RCP` is one such instruction.
+
 Instruction Set
 ---------------
 
-From GL_NV_vertex_program
+Core ISA
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
+These opcodes are guaranteed to be available regardless of the driver being
+used.
 
 .. opcode:: ARL - Address Register Load
 
@@ -67,28 +73,20 @@ From GL_NV_vertex_program
 
 .. opcode:: RCP - Reciprocal
 
-.. math::
-
-  dst.x = \frac{1}{src.x}
-
-  dst.y = \frac{1}{src.x}
+This instruction replicates its result.
 
-  dst.z = \frac{1}{src.x}
+.. math::
 
-  dst.w = \frac{1}{src.x}
+  dst = \frac{1}{src.x}
 
 
 .. opcode:: RSQ - Reciprocal Square Root
 
-.. math::
-
-  dst.x = \frac{1}{\sqrt{|src.x|}}
-
-  dst.y = \frac{1}{\sqrt{|src.x|}}
+This instruction replicates its result.
 
-  dst.z = \frac{1}{\sqrt{|src.x|}}
+.. math::
 
-  dst.w = \frac{1}{\sqrt{|src.x|}}
+  dst = \frac{1}{\sqrt{|src.x|}}
 
 
 .. opcode:: EXP - Approximate Exponential Base 2
@@ -145,28 +143,20 @@ From GL_NV_vertex_program
 
 .. opcode:: DP3 - 3-component Dot Product
 
-.. math::
-
-  dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
-
-  dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+This instruction replicates its result.
 
-  dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+.. math::
 
-  dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
 
 
 .. opcode:: DP4 - 4-component Dot Product
 
-.. math::
-
-  dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
-
-  dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
+This instruction replicates its result.
 
-  dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
+.. math::
 
-  dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
+  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
 
 
 .. opcode:: DST - Distance Vector
@@ -299,7 +289,7 @@ From GL_NV_vertex_program
   dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
 
 
-.. opcode:: FRAC - Fraction
+.. opcode:: FRC - Fraction
 
 .. math::
 
@@ -327,7 +317,7 @@ From GL_NV_vertex_program
 
 .. opcode:: FLR - Floor
 
-This is identical to ARL.
+This is identical to :opcode:`ARL`.
 
 .. math::
 
@@ -355,41 +345,29 @@ This is identical to ARL.
 
 .. opcode:: EX2 - Exponential Base 2
 
-.. math::
-
-  dst.x = 2^{src.x}
-
-  dst.y = 2^{src.x}
+This instruction replicates its result.
 
-  dst.z = 2^{src.x}
+.. math::
 
-  dst.w = 2^{src.x}
+  dst = 2^{src.x}
 
 
 .. opcode:: LG2 - Logarithm Base 2
 
-.. math::
-
-  dst.x = \log_2{src.x}
+This instruction replicates its result.
 
-  dst.y = \log_2{src.x}
-
-  dst.z = \log_2{src.x}
+.. math::
 
-  dst.w = \log_2{src.x}
+  dst = \log_2{src.x}
 
 
 .. opcode:: POW - Power
 
-.. math::
-
-  dst.x = src0.x^{src1.x}
+This instruction replicates its result.
 
-  dst.y = src0.x^{src1.x}
-
-  dst.z = src0.x^{src1.x}
+.. math::
 
-  dst.w = src0.x^{src1.x}
+  dst = src0.x^{src1.x}
 
 .. opcode:: XPD - Cross Product
 
@@ -419,43 +397,31 @@ This is identical to ARL.
 
 .. opcode:: RCC - Reciprocal Clamped
 
+This instruction replicates its result.
+
 XXX cleanup on aisle three
 
 .. math::
 
-  dst.x = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
-
-  dst.y = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
-
-  dst.z = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
-
-  dst.w = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
+  dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
 
 
 .. opcode:: DPH - Homogeneous Dot Product
 
-.. math::
-
-  dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
-
-  dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+This instruction replicates its result.
 
-  dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+.. math::
 
-  dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
 
 
 .. opcode:: COS - Cosine
 
-.. math::
+This instruction replicates its result.
 
-  dst.x = \cos{src.x}
-
-  dst.y = \cos{src.x}
-
-  dst.z = \cos{src.x}
+.. math::
 
-  dst.w = \cos{src.x}
+  dst = \cos{src.x}
 
 
 .. opcode:: DDX - Derivative Relative To X
@@ -521,7 +487,9 @@ XXX cleanup on aisle three
 
   dst.w = 1
 
-Considered for removal.
+.. note::
+
+   Considered for removal.
 
 
 .. opcode:: SEQ - Set On Equal
@@ -539,17 +507,16 @@ Considered for removal.
 
 .. opcode:: SFL - Set On False
 
-.. math::
+This instruction replicates its result.
 
-  dst.x = 0
+.. math::
 
-  dst.y = 0
+  dst = 0
 
-  dst.z = 0
+.. note::
 
-  dst.w = 0
+   Considered for removal.
 
-Considered for removal.
 
 .. opcode:: SGT - Set On Greater Than
 
@@ -566,15 +533,11 @@ Considered for removal.
 
 .. opcode:: SIN - Sine
 
-.. math::
-
-  dst.x = \sin{src.x}
-
-  dst.y = \sin{src.x}
+This instruction replicates its result.
 
-  dst.z = \sin{src.x}
+.. math::
 
-  dst.w = \sin{src.x}
+  dst = \sin{src.x}
 
 
 .. opcode:: SLE - Set On Less Equal Than
@@ -605,15 +568,11 @@ Considered for removal.
 
 .. opcode:: STR - Set On True
 
-.. math::
+This instruction replicates its result.
 
-  dst.x = 1
-
-  dst.y = 1
+.. math::
 
-  dst.z = 1
-
-  dst.w = 1
+  dst = 1
 
 
 .. opcode:: TEX - Texture Lookup
@@ -635,25 +594,33 @@ Considered for removal.
 
   TBD
 
-  Considered for removal.
+.. note::
+
+   Considered for removal.
 
 .. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
 
   TBD
 
-  Considered for removal.
+.. note::
+
+   Considered for removal.
 
 .. opcode:: UP4B - Unpack Four Signed 8-Bit Values
 
   TBD
 
-  Considered for removal.
+.. note::
+
+   Considered for removal.
 
 .. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
 
   TBD
 
-  Considered for removal.
+.. note::
+
+   Considered for removal.
 
 .. opcode:: X2D - 2D Coordinate Transformation
 
@@ -667,18 +634,18 @@ Considered for removal.
 
   dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
 
-Considered for removal.
+.. note::
 
-
-From GL_NV_vertex_program2
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+   Considered for removal.
 
 
 .. opcode:: ARA - Address Register Add
 
   TBD
 
-  Considered for removal.
+.. note::
+
+   Considered for removal.
 
 .. opcode:: ARR - Address Register Load With Round
 
@@ -697,7 +664,9 @@ From GL_NV_vertex_program2
 
   pc = target
 
-  Considered for removal.
+.. note::
+
+   Considered for removal.
 
 .. opcode:: CAL - Subroutine Call
 
@@ -793,15 +762,11 @@ From GL_NV_vertex_program2
 
 .. opcode:: DP2 - 2-component Dot Product
 
-.. math::
-
-  dst.x = src0.x \times src1.x + src0.y \times src1.y
-
-  dst.y = src0.x \times src1.x + src0.y \times src1.y
+This instruction replicates its result.
 
-  dst.z = src0.x \times src1.x + src0.y \times src1.y
+.. math::
 
-  dst.w = src0.x \times src1.x + src0.y \times src1.y
+  dst = src0.x \times src1.x + src0.y \times src1.y
 
 
 .. opcode:: TXL - Texture Lookup With LOD
@@ -819,27 +784,6 @@ From GL_NV_vertex_program2
   TBD
 
 
-.. opcode:: BGNFOR - Begin a For-Loop
-
-  dst.x = floor(src.x)
-  dst.y = floor(src.y)
-  dst.z = floor(src.z)
-
-  if (dst.y <= 0)
-    pc = [matching ENDFOR] + 1
-  endif
-
-  Note: The destination must be a loop register.
-        The source must be a constant register.
-
-  Considered for cleanup / removal.
-
-
-.. opcode:: REP - Repeat
-
-  TBD
-
-
 .. opcode:: ELSE - Else
 
   TBD
@@ -850,24 +794,6 @@ From GL_NV_vertex_program2
   TBD
 
 
-.. opcode:: ENDFOR - End a For-Loop
-
-  dst.x = dst.x + dst.z
-  dst.y = dst.y - 1.0
-
-  if (dst.y > 0)
-    pc = [matching BGNFOR instruction] + 1
-  endif
-
-  Note: The destination must be a loop register.
-
-  Considered for cleanup / removal.
-
-.. opcode:: ENDREP - End Repeat
-
-  TBD
-
-
 .. opcode:: PUSHA - Push Address Register On Stack
 
   push(src.x)
@@ -875,7 +801,13 @@ From GL_NV_vertex_program2
   push(src.z)
   push(src.w)
 
-  Considered for cleanup / removal.
+.. note::
+
+   Considered for cleanup.
+
+.. note::
+
+   Considered for removal.
 
 .. opcode:: POPA - Pop Address Register From Stack
 
@@ -884,14 +816,23 @@ From GL_NV_vertex_program2
   dst.y = pop()
   dst.x = pop()
 
-  Considered for cleanup / removal.
+.. note::
+
+   Considered for cleanup.
 
+.. note::
 
-From GL_NV_gpu_program4
+   Considered for removal.
+
+
+Compute ISA
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
+These opcodes are primarily provided for special-use computational shaders.
 Support for these opcodes indicated by a special pipe capability bit (TBD).
 
+XXX so let's discuss it, yeah?
+
 .. opcode:: CEIL - Ceiling
 
 .. math::
@@ -1049,10 +990,17 @@ Support for these opcodes indicated by a special pipe capability bit (TBD).
 
   TBD
 
+.. note::
+
+   Support for CONT is determined by a special capability bit,
+   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
 
-From GL_NV_geometry_program4
+
+Geometry ISA
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
+These opcodes are only supported in geometry shaders; they have no meaning
+in any other type of shader.
 
 .. opcode:: EMIT - Emit
 
@@ -1064,9 +1012,11 @@ From GL_NV_geometry_program4
   TBD
 
 
-From GLSL
+GLSL ISA
 ^^^^^^^^^^
 
+These opcodes are part of :term:`GLSL`'s opcode set. Support for these
+opcodes is determined by a special capability bit, ``GLSL``.
 
 .. opcode:: BGNLOOP - Begin a Loop
 
@@ -1095,20 +1045,17 @@ From GLSL
 
 .. opcode:: NRM4 - 4-component Vector Normalise
 
-.. math::
-
-  dst.x = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
-
-  dst.y = \frac{src.y}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+This instruction replicates its result.
 
-  dst.z = \frac{src.z}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+.. math::
 
-  dst.w = \frac{src.w}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+  dst = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
 
 
 ps_2_x
 ^^^^^^^^^^^^
 
+XXX wait what
 
 .. opcode:: CALLNZ - Subroutine Call If Not Zero
 
@@ -1126,10 +1073,15 @@ ps_2_x
 
 .. _doubleopcodes:
 
-Double Opcodes
+Double ISA
 ^^^^^^^^^^^^^^^
 
-.. opcode:: DADD - Add Double
+The double-precision opcodes reinterpret four-component vectors into
+two-component vectors with doubled precision in each component.
+
+Support for these opcodes is XXX undecided. :T
+
+.. opcode:: DADD - Add
 
 .. math::
 
@@ -1138,7 +1090,7 @@ Double Opcodes
   dst.zw = src0.zw + src1.zw
 
 
-.. opcode:: DDIV - Divide Double
+.. opcode:: DDIV - Divide
 
 .. math::
 
@@ -1146,7 +1098,7 @@ Double Opcodes
 
   dst.zw = src0.zw / src1.zw
 
-.. opcode:: DSEQ - Set Double on Equal
+.. opcode:: DSEQ - Set on Equal
 
 .. math::
 
@@ -1154,7 +1106,7 @@ Double Opcodes
 
   dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
 
-.. opcode:: DSLT - Set Double on Less than
+.. opcode:: DSLT - Set on Less than
 
 .. math::
 
@@ -1162,7 +1114,7 @@ Double Opcodes
 
   dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
 
-.. opcode:: DFRAC - Double Fraction
+.. opcode:: DFRAC - Fraction
 
 .. math::
 
@@ -1171,23 +1123,33 @@ Double Opcodes
   dst.zw = src.zw - \lfloor src.zw\rfloor
 
 
-.. opcode:: DFRACEXP - Convert Double Number to Fractional and Integral Components
+.. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
+
+Like the ``frexp()`` routine in many math libraries, this opcode stores the
+exponent of its source to ``dst0``, and the significand to ``dst1``, such that
+:math:`dst1 \times 2^{dst0} = src`.
 
 .. math::
 
-  dst0.xy = frexp(src.xy, dst1.xy)
+  dst0.xy = exp(src.xy)
+
+  dst1.xy = frac(src.xy)
+
+  dst0.zw = exp(src.zw)
+
+  dst1.zw = frac(src.zw)
 
-  dst0.zw = frexp(src.zw, dst1.zw)
+.. opcode:: DLDEXP - Multiply Number by Integral Power of 2
 
-.. opcode:: DLDEXP - Multiple Double Number by Integral Power of 2
+This opcode is the inverse of :opcode:`DFRACEXP`.
 
 .. math::
 
-  dst.xy = ldexp(src0.xy, src1.xy)
+  dst.xy = src0.xy \times 2^{src1.xy}
 
-  dst.zw = ldexp(src0.zw, src1.zw)
+  dst.zw = src0.zw \times 2^{src1.zw}
 
-.. opcode:: DMIN - Minimum Double
+.. opcode:: DMIN - Minimum
 
 .. math::
 
@@ -1195,7 +1157,7 @@ Double Opcodes
 
   dst.zw = min(src0.zw, src1.zw)
 
-.. opcode:: DMAX - Maximum Double
+.. opcode:: DMAX - Maximum
 
 .. math::
 
@@ -1203,7 +1165,7 @@ Double Opcodes
 
   dst.zw = max(src0.zw, src1.zw)
 
-.. opcode:: DMUL - Multiply Double
+.. opcode:: DMUL - Multiply
 
 .. math::
 
@@ -1212,7 +1174,7 @@ Double Opcodes
   dst.zw = src0.zw \times src1.zw
 
 
-.. opcode:: DMAD - Multiply And Add Doubles
+.. opcode:: DMAD - Multiply And Add
 
 .. math::
 
@@ -1221,7 +1183,7 @@ Double Opcodes
   dst.zw = src0.zw \times src1.zw + src2.zw
 
 
-.. opcode:: DRCP - Reciprocal Double
+.. opcode:: DRCP - Reciprocal
 
 .. math::
 
@@ -1229,7 +1191,7 @@ Double Opcodes
 
    dst.zw = \frac{1}{src.zw}
 
-.. opcode:: DSQRT - Square root double
+.. opcode:: DSQRT - Square Root
 
 .. math::
 
@@ -1293,6 +1255,34 @@ Other tokens
 ---------------
 
 
+Declaration
+^^^^^^^^^^^
+
+
+Declares a register that will be referenced as an operand in Instruction
+tokens.
+
+File field contains register file that is being declared and is one
+of TGSI_FILE.
+
+UsageMask field specifies which of the register components can be accessed
+and is one of TGSI_WRITEMASK.
+
+Interpolate field is only valid for fragment shader INPUT register files.
+It specifies the way input is being interpolated by the rasteriser and is one
+of TGSI_INTERPOLATE.
+
+If Dimension flag is set to 1, a Declaration Dimension token follows.
+
+If Semantic flag is set to 1, a Declaration Semantic token follows.
+
+CylindricalWrap bitfield is only valid for fragment shader INPUT register
+files. It specifies which register components should be subject to cylindrical
+wrapping when interpolating by the rasteriser. If TGSI_CYLINDRICAL_WRAP_X
+is set to 1, the X component should be interpolated according to cylindrical
+wrapping rules.
+
+
 Declaration Semantic
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -1365,10 +1355,8 @@ TGSI_SEMANTIC_PSIZE
 """""""""""""""""""
 
 PSIZE, or point size, is used to specify point sizes per-vertex. It should
-be in ``(p, n, x, f)`` format, where ``p`` is the point size, ``n`` is the minimum
-size, ``x`` is the maximum size, and ``f`` is the fade threshold.
-
-XXX this is arb_vp. is this what we actually do? should double-check...
+be in ``(s, 0, 0, 1)`` format, where ``s`` is the (possibly clamped) point size.
+Only the first component matters when writing from the vertex shader.
 
 When using this semantic, be sure to set the appropriate state in the
 :ref:`rasterizer` first.
@@ -1450,16 +1438,17 @@ DirectX 10 uses HALF_INTEGER.
 Texture Sampling and Texture Formats
 ------------------------------------
 
-This table shows how texture image components are returned as (x,y,z,w)
-tuples by TGSI texture instructions, such as TEX, TXD, and TXP.
-For reference, OpenGL and Direct3D conventions are shown as well.
+This table shows how texture image components are returned as (x,y,z,w) tuples
+by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
+:opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
+well.
 
 +--------------------+--------------+--------------------+--------------+
 | Texture Components | Gallium      | OpenGL             | Direct3D 9   |
 +====================+==============+====================+==============+
-| R                  | XXX TBD      | (r, 0, 0, 1)       | (r, 1, 1, 1) |
+| R                  | (r, 0, 0, 1) | (r, 0, 0, 1)       | (r, 1, 1, 1) |
 +--------------------+--------------+--------------------+--------------+
-| RG                 | XXX TBD      | (r, g, 0, 1)       | (r, g, 1, 1) |
+| RG                 | (r, g, 0, 1) | (r, g, 0, 1)       | (r, g, 1, 1) |
 +--------------------+--------------+--------------------+--------------+
 | RGB                | (r, g, b, 1) | (r, g, b, 1)       | (r, g, b, 1) |
 +--------------------+--------------+--------------------+--------------+
@@ -1482,4 +1471,4 @@ For reference, OpenGL and Direct3D conventions are shown as well.
 
 .. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
 .. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
- or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
+   or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
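
The *replicated* convention that this patch introduces for scalar-result opcodes such as RCP and DP3 can be illustrated with a small standalone C sketch. The helper names ``replicate``, ``exec_rcp`` and ``exec_dp3``, and the use of a plain ``float[4]`` as a register, are assumptions made for illustration only; they are not taken from the TGSI sources, and write masks are deliberately left out.

```c
/* Hypothetical helper: copy one scalar result into all four
 * components (x, y, z, w) of a destination register. */
static void replicate(float dst[4], float scalar)
{
    for (int i = 0; i < 4; i++)
        dst[i] = scalar;
}

/* RCP as documented: dst = 1 / src.x, replicated to every component. */
static void exec_rcp(float dst[4], const float src[4])
{
    replicate(dst, 1.0f / src[0]);
}

/* DP3 as documented: the 3-component dot product, replicated. */
static void exec_dp3(float dst[4], const float src0[4], const float src1[4])
{
    float dot = src0[0] * src1[0] + src0[1] * src1[1] + src0[2] * src1[2];
    replicate(dst, dot);
}
```

Under these assumptions, calling ``exec_rcp`` with ``src.x = 4`` leaves 0.25 in all four components of ``dst``, which is what the shortened ``dst = \frac{1}{src.x}`` notation is meant to convey.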