Merge branch 'gallium-nopointsizeminmax'

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst

index 34d80da1d340ce7a8d3e5dfac4eb99ca2193f346..c292cd37d5c21f8c940f1d7bf22caa56fcf37075 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -6,6 +6,23 @@ for describing shaders. Since Gallium is inherently shaderful, shaders are
  an important part of the API. TGSI is the only intermediate representation
  used by all drivers.
  
+Basics
+------
+
+All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
+floating-point four-component vectors. An opcode may have up to one
+destination register, known as *dst*, and between zero and three source
+registers, called *src0* through *src2*, or simply *src* if there is only
+one.
+
+Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
+components as integers. Other instructions permit using registers as
+two-component vectors with double precision; see :ref:`Double Opcodes`.
+
+When an instruction has a scalar result, the result is usually copied into
+each of the components of *dst*. When this happens, the result is said to be
+*replicated* to *dst*. :opcode:`RCP` is one such instruction.
+
  Instruction Set
  ---------------
  
@@ -54,28 +71,20 @@ From GL_NV_vertex_program
  
  .. opcode:: RCP - Reciprocal
  
-.. math::
-
-  dst.x = \frac{1}{src.x}
-
-  dst.y = \frac{1}{src.x}
+This instruction replicates its result.
  
-  dst.z = \frac{1}{src.x}
+.. math::
  
-  dst.w = \frac{1}{src.x}
+  dst = \frac{1}{src.x}
  
  
  .. opcode:: RSQ - Reciprocal Square Root
  
-.. math::
-
-  dst.x = \frac{1}{\sqrt{|src.x|}}
+This instruction replicates its result.
  
-  dst.y = \frac{1}{\sqrt{|src.x|}}
-
-  dst.z = \frac{1}{\sqrt{|src.x|}}
+.. math::
  
-  dst.w = \frac{1}{\sqrt{|src.x|}}
+  dst = \frac{1}{\sqrt{|src.x|}}
  
  
  .. opcode:: EXP - Approximate Exponential Base 2
@@ -132,28 +141,20 @@ From GL_NV_vertex_program
  
  .. opcode:: DP3 - 3-component Dot Product
  
-.. math::
-
-  dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
-
-  dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+This instruction replicates its result.
  
-  dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+.. math::
  
-  dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
  
  
  .. opcode:: DP4 - 4-component Dot Product
  
-.. math::
+This instruction replicates its result.
  
-  dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
-
-  dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
-
-  dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
+.. math::
  
-  dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
+  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
  
  
  .. opcode:: DST - Distance Vector
@@ -314,7 +315,7 @@ From GL_NV_vertex_program
  
  .. opcode:: FLR - Floor
  
-This is identical to ARL.
+This is identical to :opcode:`ARL`.
  
  .. math::
  
@@ -342,41 +343,29 @@ This is identical to ARL.
  
  .. opcode:: EX2 - Exponential Base 2
  
-.. math::
-
-  dst.x = 2^{src.x}
-
-  dst.y = 2^{src.x}
+This instruction replicates its result.
  
-  dst.z = 2^{src.x}
+.. math::
  
-  dst.w = 2^{src.x}
+  dst = 2^{src.x}
  
  
  .. opcode:: LG2 - Logarithm Base 2
  
-.. math::
-
-  dst.x = \log_2{src.x}
-
-  dst.y = \log_2{src.x}
+This instruction replicates its result.
  
-  dst.z = \log_2{src.x}
+.. math::
  
-  dst.w = \log_2{src.x}
+  dst = \log_2{src.x}
  
  
  .. opcode:: POW - Power
  
-.. math::
+This instruction replicates its result.
  
-  dst.x = src0.x^{src1.x}
-
-  dst.y = src0.x^{src1.x}
-
-  dst.z = src0.x^{src1.x}
+.. math::
  
-  dst.w = src0.x^{src1.x}
+  dst = src0.x^{src1.x}
  
  .. opcode:: XPD - Cross Product
  
@@ -406,43 +395,31 @@ This is identical to ARL.
  
  .. opcode:: RCC - Reciprocal Clamped
  
+This instruction replicates its result.
+
  XXX cleanup on aisle three
  
  .. math::
  
-  dst.x = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
-
-  dst.y = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
-
-  dst.z = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
-
-  dst.w = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
+  dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
  
  
  .. opcode:: DPH - Homogeneous Dot Product
  
-.. math::
-
-  dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+This instruction replicates its result.
  
-  dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
-
-  dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+.. math::
  
-  dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
  
  
  .. opcode:: COS - Cosine
  
-.. math::
+This instruction replicates its result.
  
-  dst.x = \cos{src.x}
-
-  dst.y = \cos{src.x}
-
-  dst.z = \cos{src.x}
+.. math::
  
-  dst.w = \cos{src.x}
+  dst = \cos{src.x}
  
  
  .. opcode:: DDX - Derivative Relative To X
@@ -508,7 +485,9 @@ XXX cleanup on aisle three
  
    dst.w = 1
  
-Considered for removal.
+.. note::
+
+   Considered for removal.
  
  
  .. opcode:: SEQ - Set On Equal
@@ -526,17 +505,16 @@ Considered for removal.
  
  .. opcode:: SFL - Set On False
  
-.. math::
+This instruction replicates its result.
  
-  dst.x = 0
+.. math::
  
-  dst.y = 0
+  dst = 0
  
-  dst.z = 0
+.. note::
  
-  dst.w = 0
+   Considered for removal.
  
-Considered for removal.
  
  .. opcode:: SGT - Set On Greater Than
  
@@ -553,15 +531,11 @@ Considered for removal.
  
  .. opcode:: SIN - Sine
  
-.. math::
-
-  dst.x = \sin{src.x}
+This instruction replicates its result.
  
-  dst.y = \sin{src.x}
-
-  dst.z = \sin{src.x}
+.. math::
  
-  dst.w = \sin{src.x}
+  dst = \sin{src.x}
  
  
  .. opcode:: SLE - Set On Less Equal Than
@@ -592,15 +566,11 @@ Considered for removal.
  
  .. opcode:: STR - Set On True
  
-.. math::
+This instruction replicates its result.
  
-  dst.x = 1
-
-  dst.y = 1
-
-  dst.z = 1
+.. math::
  
-  dst.w = 1
+  dst = 1
  
  
  .. opcode:: TEX - Texture Lookup
@@ -622,25 +592,33 @@ Considered for removal.
  
    TBD
  
-  Considered for removal.
+.. note::
+
+   Considered for removal.
  
  .. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
  
    TBD
  
-  Considered for removal.
+.. note::
+
+   Considered for removal.
  
  .. opcode:: UP4B - Unpack Four Signed 8-Bit Values
  
    TBD
  
-  Considered for removal.
+.. note::
+
+   Considered for removal.
  
  .. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
  
    TBD
  
-  Considered for removal.
+.. note::
+
+   Considered for removal.
  
  .. opcode:: X2D - 2D Coordinate Transformation
  
@@ -654,7 +632,9 @@ Considered for removal.
  
    dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
  
-Considered for removal.
+.. note::
+
+   Considered for removal.
  
  
  From GL_NV_vertex_program2
@@ -665,7 +645,9 @@ From GL_NV_vertex_program2
  
    TBD
  
-  Considered for removal.
+.. note::
+
+   Considered for removal.
  
  .. opcode:: ARR - Address Register Load With Round
  
@@ -684,7 +666,9 @@ From GL_NV_vertex_program2
  
    pc = target
  
-  Considered for removal.
+.. note::
+
+   Considered for removal.
  
  .. opcode:: CAL - Subroutine Call
  
@@ -780,15 +764,11 @@ From GL_NV_vertex_program2
  
  .. opcode:: DP2 - 2-component Dot Product
  
-.. math::
-
-  dst.x = src0.x \times src1.x + src0.y \times src1.y
-
-  dst.y = src0.x \times src1.x + src0.y \times src1.y
+This instruction replicates its result.
  
-  dst.z = src0.x \times src1.x + src0.y \times src1.y
+.. math::
  
-  dst.w = src0.x \times src1.x + src0.y \times src1.y
+  dst = src0.x \times src1.x + src0.y \times src1.y
  
  
  .. opcode:: TXL - Texture Lookup With LOD
@@ -819,7 +799,13 @@ From GL_NV_vertex_program2
    Note: The destination must be a loop register.
          The source must be a constant register.
  
-  Considered for cleanup / removal.
+.. note::
+
+   Considered for cleanup.
+
+.. note::
+
+   Considered for removal.
  
  
  .. opcode:: REP - Repeat
@@ -848,7 +834,13 @@ From GL_NV_vertex_program2
  
    Note: The destination must be a loop register.
  
-  Considered for cleanup / removal.
+.. note::
+
+   Considered for cleanup.
+
+.. note::
+
+   Considered for removal.
  
  .. opcode:: ENDREP - End Repeat
  
@@ -862,7 +854,13 @@ From GL_NV_vertex_program2
    push(src.z)
    push(src.w)
  
-  Considered for cleanup / removal.
+.. note::
+
+   Considered for cleanup.
+
+.. note::
+
+   Considered for removal.
  
  .. opcode:: POPA - Pop Address Register From Stack
  
@@ -871,7 +869,13 @@ From GL_NV_vertex_program2
    dst.y = pop()
    dst.x = pop()
  
-  Considered for cleanup / removal.
+.. note::
+
+   Considered for cleanup.
+
+.. note::
+
+   Considered for removal.
  
  
  From GL_NV_gpu_program4
@@ -1082,15 +1086,11 @@ From GLSL
  
  .. opcode:: NRM4 - 4-component Vector Normalise
  
-.. math::
-
-  dst.x = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
-
-  dst.y = \frac{src.y}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+This instruction replicates its result.
  
-  dst.z = \frac{src.z}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+.. math::
  
-  dst.w = \frac{src.w}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+  dst = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
  
  
  ps_2_x
@@ -1111,6 +1111,8 @@ ps_2_x
  
    TBD
  
+.. _doubleopcodes:
+
  Double Opcodes
  ^^^^^^^^^^^^^^^
  
@@ -1269,25 +1271,41 @@ Keywords
  
    discard           Discard fragment.
  
-  dst               First destination register.
+  pc                Program counter.
  
-  dst0              First destination register.
+  target            Label of target instruction.
  
-  pc                Program counter.
  
-  src               First source register.
+Other tokens
+---------------
  
-  src0              First source register.
  
-  src1              Second source register.
+Declaration
+^^^^^^^^^^^
  
-  src2              Third source register.
  
-  target            Label of target instruction.
+Declares a register that will be referenced as an operand in Instruction
+tokens.
  
+File field contains register file that is being declared and is one
+of TGSI_FILE.
  
-Other tokens
----------------
+UsageMask field specifies which of the register components can be accessed
+and is one of TGSI_WRITEMASK.
+
+Interpolate field is only valid for fragment shader INPUT register files.
+It specifies the way input is being interpolated by the rasteriser and is one
+of TGSI_INTERPOLATE.
+
+If Dimension flag is set to 1, a Declaration Dimension token follows.
+
+If Semantic flag is set to 1, a Declaration Semantic token follows.
+
+CylindricalWrap bitfield is only valid for fragment shader INPUT register
+files. It specifies which register components should be subject to cylindrical
+wrapping when interpolating by the rasteriser. If TGSI_CYLINDRICAL_WRAP_X
+is set to 1, the X component should be interpolated according to cylindrical
+wrapping rules.
  
  
  Declaration Semantic
@@ -1362,10 +1380,8 @@ TGSI_SEMANTIC_PSIZE
  """""""""""""""""""
  
  PSIZE, or point size, is used to specify point sizes per-vertex. It should
-be in ``(p, n, x, f)`` format, where ``p`` is the point size, ``n`` is the minimum
-size, ``x`` is the maximum size, and ``f`` is the fade threshold.
-
-XXX this is arb_vp. is this what we actually do? should double-check...
+be in ``(s, 0, 0, 1)`` format, where ``s`` is the (possibly clamped) point size.
+Only the first component matters when writing from the vertex shader.
  
  When using this semantic, be sure to set the appropriate state in the
  :ref:`rasterizer` first.
@@ -1447,9 +1463,10 @@ DirectX 10 uses HALF_INTEGER.
  Texture Sampling and Texture Formats
  ------------------------------------
  
-This table shows how texture image components are returned as (x,y,z,w)
-tuples by TGSI texture instructions, such as TEX, TXD, and TXP.
-For reference, OpenGL and Direct3D conventions are shown as well.
+This table shows how texture image components are returned as (x,y,z,w) tuples
+by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
+:opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
+well.
  
  +--------------------+--------------+--------------------+--------------+
  | Texture Components | Gallium      | OpenGL             | Direct3D 9   |
@@ -1479,4 +1496,4 @@ For reference, OpenGL and Direct3D conventions are shown as well.
  
  .. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
  .. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
- or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
+   or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
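
The "Basics" section added by this change says that a scalar result is usually *replicated* into every component of *dst*, and the opcode entries are shortened accordingly (for example, RCP becomes ``dst = 1 / src.x``). The following is a minimal Python sketch of that convention, not Mesa code. The helper names ``replicate``, ``rcp`` and ``dp3`` are hypothetical, registers are modelled as plain 4-tuples in ``(x, y, z, w)`` order, and hardware behaviour for special cases such as division by zero is not modelled.

```python
def replicate(scalar):
    """Copy one scalar result into all four components (x, y, z, w)."""
    return (scalar, scalar, scalar, scalar)


def rcp(src):
    """Sketch of RCP: read only src.x and replicate 1 / src.x.

    Assumption: src is a 4-tuple (x, y, z, w). Python raises
    ZeroDivisionError for src.x == 0; real hardware behaves differently.
    """
    return replicate(1.0 / src[0])


def dp3(src0, src1):
    """Sketch of DP3: 3-component dot product, replicated to dst."""
    dot = src0[0] * src1[0] + src0[1] * src1[1] + src0[2] * src1[2]
    return replicate(dot)


if __name__ == "__main__":
    # The y, z and w components of src are ignored by RCP.
    print(rcp((4.0, 9.0, 9.0, 9.0)))
    # The w components are ignored by DP3.
    print(dp3((1.0, 2.0, 3.0, 7.0), (4.0, 5.0, 6.0, 7.0)))
```

Under these assumptions, the shortened single-line formula and the four per-component formulas removed by the diff describe the same value in every component of *dst*.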