gallium: fixup definitions of the rsq and sqrt
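
Several of the integer opcode rules documented in the diff below (low-32-bit wraparound for UADD/UMUL, the 0xffffffff result for UDIV/UMOD with a zero divisor, ~0 as the "true" value of ISLT/USLT, and truncation in F2I) can be sketched per channel in C. This is only an illustrative sketch: the `ref_*` helper names are hypothetical and are not part of Mesa or the TGSI headers, and the sketch assumes a 32-bit `unsigned int`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-channel reference helpers. The ref_* names are
 * illustrative only and are not part of Mesa or the TGSI headers. */

/* UADD: same bits for signed and unsigned; only the low 32 bits survive. */
static uint32_t ref_uadd(uint32_t a, uint32_t b) { return a + b; }

/* UMUL: compute the full product, then keep the low 32 bits. */
static uint32_t ref_umul(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

/* UDIV: division by zero returns 0xffffffff. */
static uint32_t ref_udiv(uint32_t a, uint32_t b)
{
    return b ? a / b : 0xffffffffu;
}

/* UMOD: a zero second argument returns 0xffffffff. */
static uint32_t ref_umod(uint32_t a, uint32_t b)
{
    return b ? a % b : 0xffffffffu;
}

/* ISLT / USLT: "true" is all bits set (~0), "false" is 0. */
static uint32_t ref_islt(int32_t a, int32_t b) { return a < b ? ~0u : 0u; }
static uint32_t ref_uslt(uint32_t a, uint32_t b) { return a < b ? ~0u : 0u; }

/* F2I: rounds toward zero. The documentation leaves out-of-range inputs
 * and NaN undefined; the plain C cast is likewise undefined for them. */
static int32_t ref_f2i(float x) { return (int32_t)x; }

/* Small self-check exercising the cases called out in the documentation. */
static void ref_selfcheck(void)
{
    assert(ref_uadd(0xffffffffu, 1u) == 0u);
    assert(ref_umul(0x10000u, 0x10000u) == 0u);
    assert(ref_udiv(7u, 0u) == 0xffffffffu);
    assert(ref_umod(7u, 0u) == 0xffffffffu);
    assert(ref_islt(-1, 0) == 0xffffffffu);
    assert(ref_uslt(0xffffffffu, 0u) == 0u);
    assert(ref_f2i(-1.7f) == -1);
}
```

Calling `ref_selfcheck()` from a test harness runs the assertions. The signed and unsigned comparisons differ only in how the same 32 bits are interpreted, which is why ISLT and USLT are separate opcodes while UADD and UMUL are shared.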

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst

index 0002626c7152a07aa11c39cc7686d23c1fb3f77f..4d26c465579e24238dfb20b8428a1bb79727b0ba 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -32,11 +32,8 @@ For inputs which have a floating point type, both absolute value and negation
  modifiers are supported (with absolute value being applied first).
  TGSI_OPCODE_MOV is considered to have float input type for applying modifiers.
  
-For inputs which have signed type only the negate modifier is supported. This
-includes instructions which are otherwise ignorant if the type is signed or
-unsigned, such as TGSI_OPCODE_UADD.
-
-For inputs with unsigned type no modifiers are allowed.
+For inputs which have signed or unsigned type only the negate modifier is
+supported.
  
  Instruction Set
  ---------------
@@ -97,16 +94,16 @@ This instruction replicates its result.
  
  .. opcode:: RSQ - Reciprocal Square Root
  
-This instruction replicates its result.
+This instruction replicates its result. The results are undefined for src <= 0.
  
  .. math::
  
-  dst = \frac{1}{\sqrt{|src.x|}}
+  dst = \frac{1}{\sqrt{src.x}}
  
  
  .. opcode:: SQRT - Square Root
  
-This instruction replicates its result.
+This instruction replicates its result. The results are undefined for src < 0.
  
  .. math::
  
@@ -474,11 +471,6 @@ This instruction replicates its result.
    dst.w = partialy(src.w)
  
  
-.. opcode:: KILP - Predicated Discard
-
-  discard
-
-
  .. opcode:: PK2H - Pack Two 16-bit Floats
  
    TBD
@@ -723,25 +715,6 @@ This instruction replicates its result.
    dst.w = round(src.w)
  
  
-.. opcode:: BRA - Branch
-
-  pc = target
-
-.. note::
-
-   Considered for removal.
-
-.. opcode:: CAL - Subroutine Call
-
-  push(pc)
-  pc = target
-
-
-.. opcode:: RET - Subroutine Call Return
-
-  pc = pop()
-
-
  .. opcode:: SSG - Set Sign
  
  .. math::
@@ -768,7 +741,9 @@ This instruction replicates its result.
    dst.w = (src0.w < 0) ? src1.w : src2.w
  
  
-.. opcode:: KIL - Conditional Discard
+.. opcode:: KILL_IF - Conditional Discard
+
+  Conditional discard.  Allowed in fragment shaders only.
  
  .. math::
  
@@ -777,6 +752,11 @@ This instruction replicates its result.
    endif
  
  
+.. opcode:: KILL - Discard
+
+  Unconditional discard.  Allowed in fragment shaders only.
+
+
  .. opcode:: SCS - Sine Cosine
  
  .. math::
@@ -859,26 +839,6 @@ This instruction replicates its result.
    dst = texture_sample(unit, coord, lod)
  
  
-.. opcode:: BRK - Break
-
-  TBD
-
-
-.. opcode:: IF - If
-
-  TBD
-
-
-.. opcode:: ELSE - Else
-
-  TBD
-
-
-.. opcode:: ENDIF - End If
-
-  TBD
-
-
  .. opcode:: PUSHA - Push Address Register On Stack
  
    push(src.x)
@@ -910,13 +870,35 @@ This instruction replicates its result.
     Considered for removal.
  
  
+.. opcode:: BRA - Branch
+
+  pc = target
+
+.. note::
+
+   Considered for removal.
+
+
+.. opcode:: CALLNZ - Subroutine Call If Not Zero
+
+   TBD
+
+.. note::
+
+   Considered for cleanup.
+
+.. note::
+
+   Considered for removal.
+
+
  Compute ISA
  ^^^^^^^^^^^^^^^^^^^^^^^^
  
  These opcodes are primarily provided for special-use computational shaders.
  Support for these opcodes indicated by a special pipe capability bit (TBD).
  
-XXX so let's discuss it, yeah?
+XXX doesn't look like most of the opcodes really belong here.
  
  .. opcode:: CEIL - Ceiling
  
@@ -931,7 +913,89 @@ XXX so let's discuss it, yeah?
    dst.w = \lceil src.w\rceil
  
  
-.. opcode:: I2F - Integer To Float
+.. opcode:: TRUNC - Truncate
+
+.. math::
+
+  dst.x = trunc(src.x)
+
+  dst.y = trunc(src.y)
+
+  dst.z = trunc(src.z)
+
+  dst.w = trunc(src.w)
+
+
+.. opcode:: MOD - Modulus
+
+.. math::
+
+  dst.x = src0.x \bmod src1.x
+
+  dst.y = src0.y \bmod src1.y
+
+  dst.z = src0.z \bmod src1.z
+
+  dst.w = src0.w \bmod src1.w
+
+
+.. opcode:: UARL - Integer Address Register Load
+
+  Moves the contents of the source register, assumed to be an integer, into the
+  destination register, which is assumed to be an address (ADDR) register.
+
+
+.. opcode:: SAD - Sum Of Absolute Differences
+
+.. math::
+
+  dst.x = |src0.x - src1.x| + src2.x
+
+  dst.y = |src0.y - src1.y| + src2.y
+
+  dst.z = |src0.z - src1.z| + src2.z
+
+  dst.w = |src0.w - src1.w| + src2.w
+
+
+.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
+                  from a specified texture image. The source sampler may
+                  not be a CUBE or SHADOW.
+                  src 0 is a four-component signed integer vector used to
+                  identify the single texel accessed. 3 components + level.
+                  src 1 is a 3-component constant signed integer vector,
+                  with each component only having a range of
+                  -8..+8 (hw only seems to deal with this range, interface
+                  allows for up to unsigned int).
+                  TXF(uint_vec coord, int_vec offset).
+
+
+.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
+                  retrieve the dimensions of the texture
+                  depending on the target. For 1D (width), 2D/RECT/CUBE
+                  (width, height), 3D (width, height, depth),
+                  1D array (width, layers), 2D array (width, height, layers)
+
+.. math::
+
+  lod = src0.x
+
+  dst.x = texture_width(unit, lod)
+
+  dst.y = texture_height(unit, lod)
+
+  dst.z = texture_depth(unit, lod)
+
+
+Integer ISA
+^^^^^^^^^^^^^^^^^^^^^^^^
+These opcodes are used for integer operations.
+Support for these opcodes is indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
+
+
+.. opcode:: I2F - Signed Integer To Float
+
+   Rounding is unspecified (round to nearest even suggested).
  
  .. math::
  
@@ -944,56 +1008,157 @@ XXX so let's discuss it, yeah?
    dst.w = (float) src.w
  
  
-.. opcode:: NOT - Bitwise Not
+.. opcode:: U2F - Unsigned Integer To Float
+
+   Rounding is unspecified (round to nearest even suggested).
  
  .. math::
  
-  dst.x = ~src.x
+  dst.x = (float) src.x
  
-  dst.y = ~src.y
+  dst.y = (float) src.y
  
-  dst.z = ~src.z
+  dst.z = (float) src.z
  
-  dst.w = ~src.w
+  dst.w = (float) src.w
  
  
-.. opcode:: TRUNC - Truncate
+.. opcode:: F2I - Float to Signed Integer
+
+   Rounding is towards zero (truncate).
+   Values outside signed range (including NaNs) produce undefined results.
  
  .. math::
  
-  dst.x = trunc(src.x)
+  dst.x = (int) src.x
  
-  dst.y = trunc(src.y)
+  dst.y = (int) src.y
  
-  dst.z = trunc(src.z)
+  dst.z = (int) src.z
  
-  dst.w = trunc(src.w)
+  dst.w = (int) src.w
  
  
-.. opcode:: SHL - Shift Left
+.. opcode:: F2U - Float to Unsigned Integer
+
+   Rounding is towards zero (truncate).
+   Values outside unsigned range (including NaNs) produce undefined results.
  
  .. math::
  
-  dst.x = src0.x << src1.x
+  dst.x = (unsigned) src.x
  
-  dst.y = src0.y << src1.x
+  dst.y = (unsigned) src.y
  
-  dst.z = src0.z << src1.x
+  dst.z = (unsigned) src.z
  
-  dst.w = src0.w << src1.x
+  dst.w = (unsigned) src.w
  
  
-.. opcode:: SHR - Shift Right
+.. opcode:: UADD - Integer Add
+
+   This instruction works the same for signed and unsigned integers.
+   The low 32 bits of the result are returned.
  
  .. math::
  
-  dst.x = src0.x >> src1.x
+  dst.x = src0.x + src1.x
  
-  dst.y = src0.y >> src1.x
+  dst.y = src0.y + src1.y
  
-  dst.z = src0.z >> src1.x
+  dst.z = src0.z + src1.z
  
-  dst.w = src0.w >> src1.x
+  dst.w = src0.w + src1.w
+
+
+.. opcode:: UMAD - Integer Multiply And Add
+
+   This instruction works the same for signed and unsigned integers.
+   The multiplication returns the low 32 bits (as does the result itself).
+
+.. math::
+
+  dst.x = src0.x \times src1.x + src2.x
+
+  dst.y = src0.y \times src1.y + src2.y
+
+  dst.z = src0.z \times src1.z + src2.z
+
+  dst.w = src0.w \times src1.w + src2.w
+
+
+.. opcode:: UMUL - Integer Multiply
+
+   This instruction works the same for signed and unsigned integers.
+   The low 32 bits of the result are returned.
+
+.. math::
+
+  dst.x = src0.x \times src1.x
+
+  dst.y = src0.y \times src1.y
+
+  dst.z = src0.z \times src1.z
+
+  dst.w = src0.w \times src1.w
+
+
+.. opcode:: IDIV - Signed Integer Division
+
+   TBD: behavior for division by zero.
+
+.. math::
+
+  dst.x = \frac{src0.x}{src1.x}
+
+  dst.y = \frac{src0.y}{src1.y}
+
+  dst.z = \frac{src0.z}{src1.z}
+
+  dst.w = \frac{src0.w}{src1.w}
+
+
+.. opcode:: UDIV - Unsigned Integer Division
+
+   For division by zero, 0xffffffff is returned.
+
+.. math::
+
+  dst.x = \frac{src0.x}{src1.x}
+
+  dst.y = \frac{src0.y}{src1.y}
+
+  dst.z = \frac{src0.z}{src1.z}
+
+  dst.w = \frac{src0.w}{src1.w}
+
+
+.. opcode:: UMOD - Unsigned Integer Remainder
+
+   If second arg is zero, 0xffffffff is returned.
+
+.. math::
+
+  dst.x = src0.x \bmod src1.x
+
+  dst.y = src0.y \bmod src1.y
+
+  dst.z = src0.z \bmod src1.z
+
+  dst.w = src0.w \bmod src1.w
+
+
+.. opcode:: NOT - Bitwise Not
+
+.. math::
+
+  dst.x = ~src.x
+
+  dst.y = ~src.y
+
+  dst.z = ~src.z
+
+  dst.w = ~src.w
  
  
  .. opcode:: AND - Bitwise And
@@ -1022,30 +1187,108 @@ XXX so let's discuss it, yeah?
    dst.w = src0.w | src1.w
  
  
-.. opcode:: MOD - Modulus
+.. opcode:: XOR - Bitwise Xor
  
  .. math::
  
-  dst.x = src0.x \bmod src1.x
+  dst.x = src0.x \oplus src1.x
  
-  dst.y = src0.y \bmod src1.y
+  dst.y = src0.y \oplus src1.y
  
-  dst.z = src0.z \bmod src1.z
+  dst.z = src0.z \oplus src1.z
  
-  dst.w = src0.w \bmod src1.w
+  dst.w = src0.w \oplus src1.w
  
  
-.. opcode:: XOR - Bitwise Xor
+.. opcode:: IMAX - Maximum of Signed Integers
  
  .. math::
  
-  dst.x = src0.x \oplus src1.x
+  dst.x = max(src0.x, src1.x)
  
-  dst.y = src0.y \oplus src1.y
+  dst.y = max(src0.y, src1.y)
  
-  dst.z = src0.z \oplus src1.z
+  dst.z = max(src0.z, src1.z)
  
-  dst.w = src0.w \oplus src1.w
+  dst.w = max(src0.w, src1.w)
+
+
+.. opcode:: UMAX - Maximum of Unsigned Integers
+
+.. math::
+
+  dst.x = max(src0.x, src1.x)
+
+  dst.y = max(src0.y, src1.y)
+
+  dst.z = max(src0.z, src1.z)
+
+  dst.w = max(src0.w, src1.w)
+
+
+.. opcode:: IMIN - Minimum of Signed Integers
+
+.. math::
+
+  dst.x = min(src0.x, src1.x)
+
+  dst.y = min(src0.y, src1.y)
+
+  dst.z = min(src0.z, src1.z)
+
+  dst.w = min(src0.w, src1.w)
+
+
+.. opcode:: UMIN - Minimum of Unsigned Integers
+
+.. math::
+
+  dst.x = min(src0.x, src1.x)
+
+  dst.y = min(src0.y, src1.y)
+
+  dst.z = min(src0.z, src1.z)
+
+  dst.w = min(src0.w, src1.w)
+
+
+.. opcode:: SHL - Shift Left
+
+.. math::
+
+  dst.x = src0.x << src1.x
+
+  dst.y = src0.y << src1.x
+
+  dst.z = src0.z << src1.x
+
+  dst.w = src0.w << src1.x
+
+
+.. opcode:: ISHR - Arithmetic Shift Right (of Signed Integer)
+
+.. math::
+
+  dst.x = src0.x >> src1.x
+
+  dst.y = src0.y >> src1.x
+
+  dst.z = src0.z >> src1.x
+
+  dst.w = src0.w >> src1.x
+
+
+.. opcode:: USHR - Logical Shift Right
+
+.. math::
+
+  dst.x = src0.x >> (unsigned) src1.x
+
+  dst.y = src0.y >> (unsigned) src1.x
+
+  dst.z = src0.z >> (unsigned) src1.x
+
+  dst.w = src0.w >> (unsigned) src1.x
  
  
  .. opcode:: UCMP - Integer Conditional Move
@@ -1061,75 +1304,125 @@ XXX so let's discuss it, yeah?
    dst.w = src0.w ? src1.w : src2.w
  
  
-.. opcode:: UARL - Integer Address Register Load
  
-  Moves the contents of the source register, assumed to be an integer, into the
-  destination register, which is assumed to be an address (ADDR) register.
+.. opcode:: ISSG - Integer Set Sign
  
+.. math::
  
-.. opcode:: IABS - Integer Absolute Value
+  dst.x = (src0.x < 0) ? -1 : (src0.x > 0) ? 1 : 0
+
+  dst.y = (src0.y < 0) ? -1 : (src0.y > 0) ? 1 : 0
+
+  dst.z = (src0.z < 0) ? -1 : (src0.z > 0) ? 1 : 0
+
+  dst.w = (src0.w < 0) ? -1 : (src0.w > 0) ? 1 : 0
+
+
+
+.. opcode:: ISLT - Signed Integer Set On Less Than
  
  .. math::
  
-  dst.x = |src.x|
+  dst.x = (src0.x < src1.x) ? ~0 : 0
  
-  dst.y = |src.y|
+  dst.y = (src0.y < src1.y) ? ~0 : 0
  
-  dst.z = |src.z|
+  dst.z = (src0.z < src1.z) ? ~0 : 0
  
-  dst.w = |src.w|
+  dst.w = (src0.w < src1.w) ? ~0 : 0
  
  
-.. opcode:: SAD - Sum Of Absolute Differences
+.. opcode:: USLT - Unsigned Integer Set On Less Than
  
  .. math::
  
-  dst.x = |src0.x - src1.x| + src2.x
+  dst.x = (src0.x < src1.x) ? ~0 : 0
  
-  dst.y = |src0.y - src1.y| + src2.y
+  dst.y = (src0.y < src1.y) ? ~0 : 0
  
-  dst.z = |src0.z - src1.z| + src2.z
+  dst.z = (src0.z < src1.z) ? ~0 : 0
  
-  dst.w = |src0.w - src1.w| + src2.w
+  dst.w = (src0.w < src1.w) ? ~0 : 0
  
  
-.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
-                  from a specified texture image. The source sampler may
-                 not be a CUBE or SHADOW.
-                  src 0 is a four-component signed integer vector used to
-                 identify the single texel accessed. 3 components + level.
-                 src 1 is a 3 component constant signed integer vector,
-                 with each component only have a range of
-                 -8..+8 (hw only seems to deal with this range, interface
-                 allows for up to unsigned int).
-                 TXF(uint_vec coord, int_vec offset).
+.. opcode:: ISGE - Signed Integer Set On Greater Than Or Equal
  
+.. math::
  
-.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
-                  retrieve the dimensions of the texture
-                  depending on the target. For 1D (width), 2D/RECT/CUBE
-                 (width, height), 3D (width, height, depth),
-                 1D array (width, layers), 2D array (width, height, layers)
+  dst.x = (src0.x >= src1.x) ? ~0 : 0
+
+  dst.y = (src0.y >= src1.y) ? ~0 : 0
+
+  dst.z = (src0.z >= src1.z) ? ~0 : 0
+
+  dst.w = (src0.w >= src1.w) ? ~0 : 0
+
+
+.. opcode:: USGE - Unsigned Integer Set On Greater Than Or Equal
  
  .. math::
  
-  lod = src0
+  dst.x = (src0.x >= src1.x) ? ~0 : 0
  
-  dst.x = texture_width(unit, lod)
+  dst.y = (src0.y >= src1.y) ? ~0 : 0
  
-  dst.y = texture_height(unit, lod)
+  dst.z = (src0.z >= src1.z) ? ~0 : 0
  
-  dst.z = texture_depth(unit, lod)
+  dst.w = (src0.w >= src1.w) ? ~0 : 0
  
  
-.. opcode:: CONT - Continue
+.. opcode:: USEQ - Integer Set On Equal
  
-  TBD
+.. math::
  
-.. note::
+  dst.x = (src0.x == src1.x) ? ~0 : 0
  
-   Support for CONT is determined by a special capability bit,
-   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
+  dst.y = (src0.y == src1.y) ? ~0 : 0
+
+  dst.z = (src0.z == src1.z) ? ~0 : 0
+
+  dst.w = (src0.w == src1.w) ? ~0 : 0
+
+
+.. opcode:: USNE - Integer Set On Not Equal
+
+.. math::
+
+  dst.x = (src0.x != src1.x) ? ~0 : 0
+
+  dst.y = (src0.y != src1.y) ? ~0 : 0
+
+  dst.z = (src0.z != src1.z) ? ~0 : 0
+
+  dst.w = (src0.w != src1.w) ? ~0 : 0
+
+
+.. opcode:: INEG - Integer Negate
+
+  Two's complement.
+
+.. math::
+
+  dst.x = -src.x
+
+  dst.y = -src.y
+
+  dst.z = -src.z
+
+  dst.w = -src.w
+
+
+.. opcode:: IABS - Integer Absolute Value
+
+.. math::
+
+  dst.x = |src.x|
+
+  dst.y = |src.y|
+
+  dst.z = |src.z|
+
+  dst.w = |src.w|
  
  
  Geometry ISA
@@ -1140,12 +1433,14 @@ in any other type of shader.
  
  .. opcode:: EMIT - Emit
  
-  TBD
+  Generate a new vertex for the current primitive using the values in the
+  output registers.
  
  
  .. opcode:: ENDPRIM - End Primitive
  
-  TBD
+  Complete the current primitive (consisting of the emitted vertices),
+  and start a new one.
  
  
  GLSL ISA
@@ -1153,25 +1448,48 @@ GLSL ISA
  
  These opcodes are part of :term:`GLSL`'s opcode set. Support for these
  opcodes is determined by a special capability bit, ``GLSL``.
+Some require GLSL version 1.30 (UIF/BREAKC/SWITCH/CASE/DEFAULT/ENDSWITCH).
+
+.. opcode:: CAL - Subroutine Call
+
+  push(pc)
+  pc = target
+
+
+.. opcode:: RET - Subroutine Call Return
+
+  pc = pop()
+
+
+.. opcode:: CONT - Continue
+
+  Unconditionally moves the point of execution to the instruction after the
+  last bgnloop. The instruction must appear within a bgnloop/endloop.
+
+.. note::
+
+   Support for CONT is determined by a special capability bit,
+   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
+
  
  .. opcode:: BGNLOOP - Begin a Loop
  
-  TBD
+  Start a loop. Must have a matching endloop.
  
  
  .. opcode:: BGNSUB - Begin Subroutine
  
-  TBD
+  Starts definition of a subroutine. Must have a matching endsub.
  
  
  .. opcode:: ENDLOOP - End a Loop
  
-  TBD
+  End a loop started with bgnloop.
  
  
  .. opcode:: ENDSUB - End Subroutine
  
-  TBD
+  Ends definition of a subroutine.
  
  
  .. opcode:: NOP - No Operation
@@ -1179,28 +1497,102 @@ opcodes is determined by a special capability bit, ``GLSL``.
    Do nothing.
  
  
-.. opcode:: NRM4 - 4-component Vector Normalise
+.. opcode:: BRK - Break
  
-This instruction replicates its result.
+  Unconditionally moves the point of execution to the instruction after the
+  next endloop or endswitch. The instruction must appear within a loop/endloop
+  or switch/endswitch.
  
-.. math::
  
-  dst = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+.. opcode:: BREAKC - Break Conditional
  
+  Conditionally moves the point of execution to the instruction after the
+  next endloop or endswitch. The instruction must appear within a loop/endloop
+  or switch/endswitch.
+  Condition evaluates to true if src0.x != 0 where src0.x is interpreted
+  as an integer register.
  
-ps_2_x
-^^^^^^^^^^^^
+.. note::
  
-XXX wait what
+   Considered for removal as it is quite inconsistent with other opcodes
+   (could be emulated with UIF/BRK/ENDIF).
  
-.. opcode:: CALLNZ - Subroutine Call If Not Zero
  
-  TBD
+.. opcode:: IF - Float If
  
+  Start an IF ... ELSE ... ENDIF block.  Condition evaluates to true if
  
-.. opcode:: BREAKC - Break Conditional
+    src0.x != 0.0
+
+  where src0.x is interpreted as a floating point register.
+
+
+.. opcode:: UIF - Bitwise If
+
+  Start a UIF ... ELSE ... ENDIF block. Condition evaluates to true if
+
+    src0.x != 0
+
+  where src0.x is interpreted as an integer register.
+
+
+.. opcode:: ELSE - Else
+
+  Starts an else block, after an IF or UIF statement.
+
+
+.. opcode:: ENDIF - End If
+
+  Ends an IF or UIF block.
+
+
+.. opcode:: SWITCH - Switch
+
+   Starts a C-style switch expression. The switch consists of one or more
+   CASE statements and at most one DEFAULT statement. Execution of a statement
+   ends when a BRK is hit but, just as in C, falling through to other cases
+   without a break is allowed. Similarly, the DEFAULT label is allowed anywhere,
+   not just as the last statement, and fallthrough is allowed into and from it.
+   CASE src arguments are compared at bit level against the SWITCH src argument.
+
+   Example:
+   SWITCH src[0].x
+   CASE src[0].x
+   (some instructions here)
+   (optional BRK here)
+   DEFAULT
+   (some instructions here)
+   (optional BRK here)
+   CASE src[0].x
+   (some instructions here)
+   (optional BRK here)
+   ENDSWITCH
+
+
+.. opcode:: CASE - Switch case
+
+   This represents a switch case label. The src arg must be an integer immediate.
+
+
+.. opcode:: DEFAULT - Switch default
+
+   This represents the default case in the switch, which is taken if no other
+   case matches.
+
+
+.. opcode:: ENDSWITCH - End of switch
+
+   Ends a switch expression.
+
+
+.. opcode:: NRM4 - 4-component Vector Normalise
+
+This instruction replicates its result.
+
+.. math::
+
+  dst = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
  
-  TBD
  
  .. _doubleopcodes:
  
@@ -2010,14 +2402,66 @@ Edge flags are used to control which lines or points are actually
  drawn when the polygon mode converts triangles/quads/polygons into
  points or lines.
  
+
  TGSI_SEMANTIC_STENCIL
-""""""""""""""""""""""
+"""""""""""""""""""""
  
-For fragment shaders, this semantic label indicates than an output
+For fragment shaders, this semantic label indicates that an output
  is a writable stencil reference value. Only the Y component is writable.
  This allows the fragment shader to change the fragments stencilref value.
  
  
+TGSI_SEMANTIC_VIEWPORT_INDEX
+""""""""""""""""""""""""""""
+
+For geometry shaders, this semantic label indicates that an output
+contains the index of the viewport (and scissor) to use.
+Only the X value is used.
+
+
+TGSI_SEMANTIC_LAYER
+"""""""""""""""""""
+
+For geometry shaders, this semantic label indicates that an output
+contains the layer value to use for the color and depth/stencil surfaces.
+Only the X value is used. (Also known as rendertarget array index.)
+
+
+TGSI_SEMANTIC_CULLDIST
+""""""""""""""""""""""
+
+Used as distance to plane for performing application-defined culling
+of individual primitives against a plane. When components of vertex
+elements are given this label, these values are assumed to be a
+float32 signed distance to a plane. Primitives will be completely
+discarded if the plane distances for all of the vertices in the
+primitive are < 0. If a vertex has a cull distance of NaN, that
+vertex counts as "out" (as if it is < 0).
+The limits on both clip and cull distances are bound
+by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_COUNT define which defines
+the maximum number of components that can be used to hold the
+distances and by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT
+which specifies the maximum number of registers which can be
+annotated with those semantics.
+
+
+TGSI_SEMANTIC_CLIPDIST
+""""""""""""""""""""""
+
+When components of vertex elements are identified this way, these
+values are each assumed to be a float32 signed distance to a plane.
+Primitive setup only invokes rasterization on pixels for which
+the interpolated plane distances are >= 0. Multiple clip planes
+can be implemented simultaneously, by annotating multiple
+components of one or more vertex elements with the above specified
+semantic. The limits on both clip and cull distances are bound
+by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_COUNT define which defines
+the maximum number of components that can be used to hold the
+distances and by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT
+which specifies the maximum number of registers which can be
+annotated with those semantics.
+
+
  Declaration Interpolate
  ^^^^^^^^^^^^^^^^^^^^^^^
  
@@ -2110,7 +2554,7 @@ If HALF_INTEGER, the fractionary part of the position will be 0.5
  If INTEGER, the fractionary part of the position will be 0.0
  
  Note that this does not affect the set of fragments generated by
-rasterization, which is instead controlled by gl_rasterization_rules in the
+rasterization, which is instead controlled by half_pixel_center in the
  rasterizer.
  
  OpenGL defaults to HALF_INTEGER, and is configurable with the