4 TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
5 for describing shaders. Since Gallium is inherently shaderful, shaders are
6 an important part of the API. TGSI is the only intermediate representation
12 All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
13 floating-point four-component vectors. An opcode may have up to one
14 destination register, known as *dst*, and between zero and three source
15 registers, called *src0* through *src2*, or simply *src* if there is only
18 Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
19 components as integers. Other instructions permit using registers as
20 two-component vectors with double precision; see :ref:`Double Opcodes`.
22 When an instruction has a scalar result, the result is usually copied into
23 each of the components of *dst*. When this happens, the result is said to be
24 *replicated* to *dst*. :opcode:`RCP` is one such instruction.
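
A minimal sketch of replication in Python, modelling a register as a four-element list. The helper names ``replicate`` and ``rcp`` are invented for illustration and are not part of any TGSI or Gallium API; the sketch also assumes :opcode:`RCP` computes the reciprocal of ``src.x``.

```python
def replicate(scalar):
    """Copy a scalar result into all four components (x, y, z, w)."""
    return [scalar] * 4


def rcp(src):
    """Illustrative model of RCP: reciprocal of src.x, replicated to dst."""
    return replicate(1.0 / src[0])


dst = rcp([4.0, 1.0, 2.0, 3.0])
print(dst)  # [0.25, 0.25, 0.25, 0.25]
```

Only ``src.x`` influences the result; the other source components are ignored.
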
30 ^^^^^^^^^^^^^^^^^^^^^^^^^
32 These opcodes are guaranteed to be available regardless of the driver being
35 .. opcode:: ARL - Address Register Load
39 dst.x = \lfloor src.x\rfloor
41 dst.y = \lfloor src.y\rfloor
43 dst.z = \lfloor src.z\rfloor
45 dst.w = \lfloor src.w\rfloor
48 .. opcode:: MOV - Move
61 .. opcode:: LIT - Light Coefficients
  dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0
74 .. opcode:: RCP - Reciprocal
76 This instruction replicates its result.
83 .. opcode:: RSQ - Reciprocal Square Root
85 This instruction replicates its result.
89 dst = \frac{1}{\sqrt{|src.x|}}
92 .. opcode:: EXP - Approximate Exponential Base 2
96 dst.x = 2^{\lfloor src.x\rfloor}
98 dst.y = src.x - \lfloor src.x\rfloor
105 .. opcode:: LOG - Approximate Logarithm Base 2
109 dst.x = \lfloor\log_2{|src.x|}\rfloor
111 dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
113 dst.z = \log_2{|src.x|}
118 .. opcode:: MUL - Multiply
122 dst.x = src0.x \times src1.x
124 dst.y = src0.y \times src1.y
126 dst.z = src0.z \times src1.z
128 dst.w = src0.w \times src1.w
131 .. opcode:: ADD - Add
135 dst.x = src0.x + src1.x
137 dst.y = src0.y + src1.y
139 dst.z = src0.z + src1.z
141 dst.w = src0.w + src1.w
144 .. opcode:: DP3 - 3-component Dot Product
146 This instruction replicates its result.
150 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
153 .. opcode:: DP4 - 4-component Dot Product
155 This instruction replicates its result.
159 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
162 .. opcode:: DST - Distance Vector
168 dst.y = src0.y \times src1.y
175 .. opcode:: MIN - Minimum
179 dst.x = min(src0.x, src1.x)
181 dst.y = min(src0.y, src1.y)
183 dst.z = min(src0.z, src1.z)
185 dst.w = min(src0.w, src1.w)
188 .. opcode:: MAX - Maximum
192 dst.x = max(src0.x, src1.x)
194 dst.y = max(src0.y, src1.y)
196 dst.z = max(src0.z, src1.z)
198 dst.w = max(src0.w, src1.w)
201 .. opcode:: SLT - Set On Less Than
205 dst.x = (src0.x < src1.x) ? 1 : 0
207 dst.y = (src0.y < src1.y) ? 1 : 0
209 dst.z = (src0.z < src1.z) ? 1 : 0
211 dst.w = (src0.w < src1.w) ? 1 : 0
214 .. opcode:: SGE - Set On Greater Equal Than
218 dst.x = (src0.x >= src1.x) ? 1 : 0
220 dst.y = (src0.y >= src1.y) ? 1 : 0
222 dst.z = (src0.z >= src1.z) ? 1 : 0
224 dst.w = (src0.w >= src1.w) ? 1 : 0
227 .. opcode:: MAD - Multiply And Add
231 dst.x = src0.x \times src1.x + src2.x
233 dst.y = src0.y \times src1.y + src2.y
235 dst.z = src0.z \times src1.z + src2.z
237 dst.w = src0.w \times src1.w + src2.w
240 .. opcode:: SUB - Subtract
244 dst.x = src0.x - src1.x
246 dst.y = src0.y - src1.y
248 dst.z = src0.z - src1.z
250 dst.w = src0.w - src1.w
253 .. opcode:: LRP - Linear Interpolate
257 dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x
259 dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y
261 dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z
263 dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
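
Read per component, the formula above treats *src0* as the weight given to *src1*, with the remainder going to *src2*. A small Python sketch, assuming registers are modelled as four-element lists (the function name ``lrp`` is only illustrative):

```python
def lrp(src0, src1, src2):
    """Per-component LRP: src0 weights src1, (1 - src0) weights src2."""
    return [w * a + (1.0 - w) * b for w, a, b in zip(src0, src1, src2)]


weight = [0.0, 0.25, 0.5, 1.0]
a = [8.0, 8.0, 8.0, 8.0]
b = [0.0, 0.0, 0.0, 0.0]
print(lrp(weight, a, b))  # [0.0, 2.0, 4.0, 8.0]
```

A weight of 0 returns *src2* unchanged and a weight of 1 returns *src1* unchanged.
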
266 .. opcode:: CND - Condition
270 dst.x = (src2.x > 0.5) ? src0.x : src1.x
272 dst.y = (src2.y > 0.5) ? src0.y : src1.y
274 dst.z = (src2.z > 0.5) ? src0.z : src1.z
276 dst.w = (src2.w > 0.5) ? src0.w : src1.w
279 .. opcode:: DP2A - 2-component Dot Product And Add
283 dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x
285 dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x
287 dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x
289 dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
292 .. opcode:: FRC - Fraction
296 dst.x = src.x - \lfloor src.x\rfloor
298 dst.y = src.y - \lfloor src.y\rfloor
300 dst.z = src.z - \lfloor src.z\rfloor
302 dst.w = src.w - \lfloor src.w\rfloor
305 .. opcode:: CLAMP - Clamp
309 dst.x = clamp(src0.x, src1.x, src2.x)
311 dst.y = clamp(src0.y, src1.y, src2.y)
313 dst.z = clamp(src0.z, src1.z, src2.z)
315 dst.w = clamp(src0.w, src1.w, src2.w)
318 .. opcode:: FLR - Floor
320 This is identical to :opcode:`ARL`.
324 dst.x = \lfloor src.x\rfloor
326 dst.y = \lfloor src.y\rfloor
328 dst.z = \lfloor src.z\rfloor
330 dst.w = \lfloor src.w\rfloor
333 .. opcode:: ROUND - Round
346 .. opcode:: EX2 - Exponential Base 2
348 This instruction replicates its result.
355 .. opcode:: LG2 - Logarithm Base 2
357 This instruction replicates its result.
364 .. opcode:: POW - Power
366 This instruction replicates its result.
370 dst = src0.x^{src1.x}
372 .. opcode:: XPD - Cross Product
376 dst.x = src0.y \times src1.z - src1.y \times src0.z
378 dst.y = src0.z \times src1.x - src1.z \times src0.x
380 dst.z = src0.x \times src1.y - src1.x \times src0.y
385 .. opcode:: ABS - Absolute
398 .. opcode:: RCC - Reciprocal Clamped
400 This instruction replicates its result.
402 XXX cleanup on aisle three
406 dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
409 .. opcode:: DPH - Homogeneous Dot Product
411 This instruction replicates its result.
415 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
418 .. opcode:: COS - Cosine
420 This instruction replicates its result.
427 .. opcode:: DDX - Derivative Relative To X
431 dst.x = partialx(src.x)
433 dst.y = partialx(src.y)
435 dst.z = partialx(src.z)
437 dst.w = partialx(src.w)
440 .. opcode:: DDY - Derivative Relative To Y
444 dst.x = partialy(src.x)
446 dst.y = partialy(src.y)
448 dst.z = partialy(src.z)
450 dst.w = partialy(src.w)
453 .. opcode:: KILP - Predicated Discard
458 .. opcode:: PK2H - Pack Two 16-bit Floats
463 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
468 .. opcode:: PK4B - Pack Four Signed 8-bit Scalars
473 .. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars
478 .. opcode:: RFL - Reflection Vector
482 dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x
484 dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y
486 dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z
492 Considered for removal.
495 .. opcode:: SEQ - Set On Equal
499 dst.x = (src0.x == src1.x) ? 1 : 0
501 dst.y = (src0.y == src1.y) ? 1 : 0
503 dst.z = (src0.z == src1.z) ? 1 : 0
505 dst.w = (src0.w == src1.w) ? 1 : 0
508 .. opcode:: SFL - Set On False
510 This instruction replicates its result.
518 Considered for removal.
521 .. opcode:: SGT - Set On Greater Than
525 dst.x = (src0.x > src1.x) ? 1 : 0
527 dst.y = (src0.y > src1.y) ? 1 : 0
529 dst.z = (src0.z > src1.z) ? 1 : 0
531 dst.w = (src0.w > src1.w) ? 1 : 0
534 .. opcode:: SIN - Sine
536 This instruction replicates its result.
543 .. opcode:: SLE - Set On Less Equal Than
547 dst.x = (src0.x <= src1.x) ? 1 : 0
549 dst.y = (src0.y <= src1.y) ? 1 : 0
551 dst.z = (src0.z <= src1.z) ? 1 : 0
553 dst.w = (src0.w <= src1.w) ? 1 : 0
556 .. opcode:: SNE - Set On Not Equal
560 dst.x = (src0.x != src1.x) ? 1 : 0
562 dst.y = (src0.y != src1.y) ? 1 : 0
564 dst.z = (src0.z != src1.z) ? 1 : 0
566 dst.w = (src0.w != src1.w) ? 1 : 0
569 .. opcode:: STR - Set On True
571 This instruction replicates its result.
578 .. opcode:: TEX - Texture Lookup
586 dst = texture_sample(unit, coord, bias)
589 .. opcode:: TXD - Texture Lookup with Derivatives
601 dst = texture_sample_deriv(unit, coord, bias, ddx, ddy)
604 .. opcode:: TXP - Projective Texture Lookup
  coord.x = src0.x / src0.w

  coord.y = src0.y / src0.w

  coord.z = src0.z / src0.w
618 dst = texture_sample(unit, coord, bias)
621 .. opcode:: UP2H - Unpack Two 16-Bit Floats
627 Considered for removal.
629 .. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
635 Considered for removal.
637 .. opcode:: UP4B - Unpack Four Signed 8-Bit Values
643 Considered for removal.
645 .. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
651 Considered for removal.
653 .. opcode:: X2D - 2D Coordinate Transformation
657 dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
659 dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
661 dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
663 dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
667 Considered for removal.
670 .. opcode:: ARA - Address Register Add
676 Considered for removal.
678 .. opcode:: ARR - Address Register Load With Round
691 .. opcode:: BRA - Branch
697 Considered for removal.
699 .. opcode:: CAL - Subroutine Call
705 .. opcode:: RET - Subroutine Call Return
710 .. opcode:: SSG - Set Sign
714 dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
716 dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
718 dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
720 dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
723 .. opcode:: CMP - Compare
727 dst.x = (src0.x < 0) ? src1.x : src2.x
729 dst.y = (src0.y < 0) ? src1.y : src2.y
731 dst.z = (src0.z < 0) ? src1.z : src2.z
733 dst.w = (src0.w < 0) ? src1.w : src2.w
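
As an illustration of the per-component select above, the following Python sketch models registers as four-element lists. The name ``cmp_select`` is hypothetical and chosen only for this example.

```python
def cmp_select(src0, src1, src2):
    """Per-component CMP: pick src1 where src0 < 0, otherwise src2."""
    return [b if c < 0 else d for c, b, d in zip(src0, src1, src2)]


result = cmp_select([-1.0, 0.0, 2.0, -0.5],
                    [10.0, 11.0, 12.0, 13.0],
                    [20.0, 21.0, 22.0, 23.0])
print(result)  # [10.0, 21.0, 22.0, 13.0]
```

Note that a condition component of exactly zero selects *src2*, because the test is a strict less-than.
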
736 .. opcode:: KIL - Conditional Discard
740 if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
745 .. opcode:: SCS - Sine Cosine
758 .. opcode:: TXB - Texture Lookup With Bias
772 dst = texture_sample(unit, coord, bias)
775 .. opcode:: NRM - 3-component Vector Normalise
  dst.x = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}

  dst.y = \frac{src.y}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}

  dst.z = \frac{src.z}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}
788 .. opcode:: DIV - Divide
792 dst.x = \frac{src0.x}{src1.x}
794 dst.y = \frac{src0.y}{src1.y}
796 dst.z = \frac{src0.z}{src1.z}
798 dst.w = \frac{src0.w}{src1.w}
801 .. opcode:: DP2 - 2-component Dot Product
803 This instruction replicates its result.
807 dst = src0.x \times src1.x + src0.y \times src1.y
.. opcode:: TXL - Texture Lookup With Explicit LOD
824 dst = texture_sample(unit, coord, lod)
827 .. opcode:: BRK - Break
837 .. opcode:: ELSE - Else
842 .. opcode:: ENDIF - End If
847 .. opcode:: PUSHA - Push Address Register On Stack
856 Considered for cleanup.
860 Considered for removal.
862 .. opcode:: POPA - Pop Address Register From Stack
871 Considered for cleanup.
875 Considered for removal.
879 ^^^^^^^^^^^^^^^^^^^^^^^^
881 These opcodes are primarily provided for special-use computational shaders.
Support for these opcodes is indicated by a special pipe capability bit (TBD).
884 XXX so let's discuss it, yeah?
886 .. opcode:: CEIL - Ceiling
890 dst.x = \lceil src.x\rceil
892 dst.y = \lceil src.y\rceil
894 dst.z = \lceil src.z\rceil
896 dst.w = \lceil src.w\rceil
899 .. opcode:: I2F - Integer To Float
903 dst.x = (float) src.x
905 dst.y = (float) src.y
907 dst.z = (float) src.z
909 dst.w = (float) src.w
912 .. opcode:: NOT - Bitwise Not
925 .. opcode:: TRUNC - Truncate
938 .. opcode:: SHL - Shift Left
942 dst.x = src0.x << src1.x
944 dst.y = src0.y << src1.x
946 dst.z = src0.z << src1.x
948 dst.w = src0.w << src1.x
951 .. opcode:: SHR - Shift Right
955 dst.x = src0.x >> src1.x
957 dst.y = src0.y >> src1.x
959 dst.z = src0.z >> src1.x
961 dst.w = src0.w >> src1.x
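
In the :opcode:`SHL` and :opcode:`SHR` formulas above, every component of *src0* is shifted by the same amount, ``src1.x``. A hedged Python sketch, assuming components are non-negative integers and ignoring any 32-bit wrap-around that real hardware might apply (the function names are illustrative only):

```python
def shl(src0, src1):
    """Shift every component of src0 left by src1.x."""
    return [c << src1[0] for c in src0]


def shr(src0, src1):
    """Shift every component of src0 right by src1.x."""
    return [c >> src1[0] for c in src0]


# Only the first component of the second operand is used as the shift count.
print(shl([1, 2, 3, 4], [2, 9, 9, 9]))   # [4, 8, 12, 16]
print(shr([16, 8, 5, 1], [1, 9, 9, 9]))  # [8, 4, 2, 0]
```
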
964 .. opcode:: AND - Bitwise And
968 dst.x = src0.x & src1.x
970 dst.y = src0.y & src1.y
972 dst.z = src0.z & src1.z
974 dst.w = src0.w & src1.w
977 .. opcode:: OR - Bitwise Or
981 dst.x = src0.x | src1.x
983 dst.y = src0.y | src1.y
985 dst.z = src0.z | src1.z
987 dst.w = src0.w | src1.w
990 .. opcode:: MOD - Modulus
994 dst.x = src0.x \bmod src1.x
996 dst.y = src0.y \bmod src1.y
998 dst.z = src0.z \bmod src1.z
1000 dst.w = src0.w \bmod src1.w
1003 .. opcode:: XOR - Bitwise Xor
1007 dst.x = src0.x \oplus src1.x
1009 dst.y = src0.y \oplus src1.y
1011 dst.z = src0.z \oplus src1.z
1013 dst.w = src0.w \oplus src1.w
1016 .. opcode:: SAD - Sum Of Absolute Differences
1020 dst.x = |src0.x - src1.x| + src2.x
1022 dst.y = |src0.y - src1.y| + src2.y
1024 dst.z = |src0.z - src1.z| + src2.z
1026 dst.w = |src0.w - src1.w| + src2.w
1029 .. opcode:: TXF - Texel Fetch
1034 .. opcode:: TXQ - Texture Size Query
1039 .. opcode:: CONT - Continue
1045 Support for CONT is determined by a special capability bit,
1046 ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
1050 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1052 These opcodes are only supported in geometry shaders; they have no meaning
1053 in any other type of shader.
1055 .. opcode:: EMIT - Emit
1060 .. opcode:: ENDPRIM - End Primitive
1068 These opcodes are part of :term:`GLSL`'s opcode set. Support for these
1069 opcodes is determined by a special capability bit, ``GLSL``.
1071 .. opcode:: BGNLOOP - Begin a Loop
1076 .. opcode:: BGNSUB - Begin Subroutine
1081 .. opcode:: ENDLOOP - End a Loop
1086 .. opcode:: ENDSUB - End Subroutine
1091 .. opcode:: NOP - No Operation
1096 .. opcode:: NRM4 - 4-component Vector Normalise
1098 This instruction replicates its result.
  dst = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
1110 .. opcode:: CALLNZ - Subroutine Call If Not Zero
1115 .. opcode:: IFC - If
1120 .. opcode:: BREAKC - Break Conditional
1129 The double-precision opcodes reinterpret four-component vectors into
1130 two-component vectors with doubled precision in each component.
1132 Support for these opcodes is XXX undecided. :T
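
One way to picture this reinterpretation is with Python's ``struct`` module. The sketch below assumes 32-bit components and a little-endian layout in which ``x`` and ``z`` hold the low words of each double; the text above does not specify the word order, so treat that as an assumption. The function names are hypothetical.

```python
import struct


def pack_doubles(d0, d1):
    """Return four 32-bit words (x, y, z, w) that hold two 64-bit doubles."""
    return list(struct.unpack("<4I", struct.pack("<2d", d0, d1)))


def unpack_doubles(reg):
    """Reinterpret four 32-bit words as two doubles: (xy, zw)."""
    return struct.unpack("<2d", struct.pack("<4I", *reg))


reg = pack_doubles(1.5, -2.0)
print([hex(word) for word in reg])  # ['0x0', '0x3ff80000', '0x0', '0xc0000000']
print(unpack_doubles(reg))          # (1.5, -2.0)
```
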
1134 .. opcode:: DADD - Add
1138 dst.xy = src0.xy + src1.xy
1140 dst.zw = src0.zw + src1.zw
1143 .. opcode:: DDIV - Divide
1147 dst.xy = src0.xy / src1.xy
1149 dst.zw = src0.zw / src1.zw
1151 .. opcode:: DSEQ - Set on Equal
1155 dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
1157 dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
1159 .. opcode:: DSLT - Set on Less than
1163 dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
1165 dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
1167 .. opcode:: DFRAC - Fraction
1171 dst.xy = src.xy - \lfloor src.xy\rfloor
1173 dst.zw = src.zw - \lfloor src.zw\rfloor
1176 .. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
1178 Like the ``frexp()`` routine in many math libraries, this opcode stores the
1179 exponent of its source to ``dst0``, and the significand to ``dst1``, such that
:math:`dst1 \times 2^{dst0} = src`.
1184 dst0.xy = exp(src.xy)
1186 dst1.xy = frac(src.xy)
1188 dst0.zw = exp(src.zw)
1190 dst1.zw = frac(src.zw)
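
Python's ``math.frexp`` follows the same convention as the C ``frexp()`` routine mentioned above, so it can serve as a rough model of one double-precision lane. The wrapper name ``dfracexp`` and the ``(dst0, dst1)`` return order are choices made for this sketch only.

```python
import math


def dfracexp(value):
    """Split value into (exponent, significand), i.e. (dst0, dst1)."""
    significand, exponent = math.frexp(value)
    return exponent, significand


dst0, dst1 = dfracexp(12.0)
print(dst0, dst1)  # 4 0.75

# The defining identity dst1 * 2**dst0 == src; math.ldexp recombines the parts.
print(math.ldexp(dst1, dst0))  # 12.0
```
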
1192 .. opcode:: DLDEXP - Multiply Number by Integral Power of 2
1194 This opcode is the inverse of :opcode:`DFRACEXP`.
1198 dst.xy = src0.xy \times 2^{src1.xy}
1200 dst.zw = src0.zw \times 2^{src1.zw}
1202 .. opcode:: DMIN - Minimum
1206 dst.xy = min(src0.xy, src1.xy)
1208 dst.zw = min(src0.zw, src1.zw)
1210 .. opcode:: DMAX - Maximum
1214 dst.xy = max(src0.xy, src1.xy)
1216 dst.zw = max(src0.zw, src1.zw)
1218 .. opcode:: DMUL - Multiply
1222 dst.xy = src0.xy \times src1.xy
1224 dst.zw = src0.zw \times src1.zw
1227 .. opcode:: DMAD - Multiply And Add
1231 dst.xy = src0.xy \times src1.xy + src2.xy
1233 dst.zw = src0.zw \times src1.zw + src2.zw
1236 .. opcode:: DRCP - Reciprocal
1240 dst.xy = \frac{1}{src.xy}
1242 dst.zw = \frac{1}{src.zw}
1244 .. opcode:: DSQRT - Square Root
1248 dst.xy = \sqrt{src.xy}
1250 dst.zw = \sqrt{src.zw}
1253 Explanation of symbols used
1254 ------------------------------
1261 :math:`|x|` Absolute value of `x`.
1263 :math:`\lceil x \rceil` Ceiling of `x`.
1265 clamp(x,y,z) Clamp x between y and z.
1266 (x < y) ? y : (x > z) ? z : x
1268 :math:`\lfloor x\rfloor` Floor of `x`.
1270 :math:`\log_2{x}` Logarithm of `x`, base 2.
1272 max(x,y) Maximum of x and y.
1275 min(x,y) Minimum of x and y.
1278 partialx(x) Derivative of x relative to fragment's X.
1280 partialy(x) Derivative of x relative to fragment's Y.
1282 pop() Pop from stack.
1284 :math:`x^y` `x` to the power `y`.
1286 push(x) Push x on stack.
1290 trunc(x) Truncate x, i.e. drop the fraction bits.
1297 discard Discard fragment.
1301 target Label of target instruction.
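
For readers who prefer code to notation, here is a minimal Python rendering of two of the helper functions listed above, ``clamp`` and ``trunc``. It is a sketch of the definitions as written in this list, not of any driver's implementation.

```python
import math


def clamp(x, y, z):
    """(x < y) ? y : (x > z) ? z : x"""
    return y if x < y else z if x > z else x


def trunc(x):
    """Drop the fraction bits, rounding toward zero."""
    return float(math.trunc(x))


print(clamp(-3.0, 0.0, 1.0))  # 0.0
print(clamp(0.5, 0.0, 1.0))   # 0.5
print(clamp(7.0, 0.0, 1.0))   # 1.0

# trunc rounds toward zero, while floor rounds toward negative infinity.
print(trunc(-1.7), math.floor(-1.7))  # -1.0 -2
```
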
Declares a register that will be referenced as an operand in Instruction
1315 File field contains register file that is being declared and is one
1318 UsageMask field specifies which of the register components can be accessed
1319 and is one of TGSI_WRITEMASK.
The Interpolate field is only valid for fragment shader INPUT register files.
It specifies how the input is interpolated by the rasteriser and is one
of TGSI_INTERPOLATE.
1325 If Dimension flag is set to 1, a Declaration Dimension token follows.
1327 If Semantic flag is set to 1, a Declaration Semantic token follows.
1329 CylindricalWrap bitfield is only valid for fragment shader INPUT register
1330 files. It specifies which register components should be subject to cylindrical
1331 wrapping when interpolating by the rasteriser. If TGSI_CYLINDRICAL_WRAP_X
1332 is set to 1, the X component should be interpolated according to cylindrical
1336 Declaration Semantic
1337 ^^^^^^^^^^^^^^^^^^^^^^^^
1339 Vertex and fragment shader input and output registers may be labeled
1340 with semantic information consisting of a name and index.
This token follows a Declaration token if the Semantic bit is set.

Since its purpose is to link a shader with other stages of the pipeline,
it may only follow Declaration tokens that declare a register
in either the INPUT or the OUTPUT file.
1348 SemanticName field contains the semantic name of the register being declared.
1349 There is no default value.
1351 SemanticIndex is an optional subscript that can be used to distinguish
1352 different register declarations with the same semantic name. The default value
1355 The meanings of the individual semantic names are explained in the following
1358 TGSI_SEMANTIC_POSITION
1359 """"""""""""""""""""""
1361 For vertex shaders, TGSI_SEMANTIC_POSITION indicates the vertex shader
1362 output register which contains the homogeneous vertex position in the clip
1363 space coordinate system. After clipping, the X, Y and Z components of the
1364 vertex will be divided by the W value to get normalized device coordinates.
1366 For fragment shaders, TGSI_SEMANTIC_POSITION is used to indicate that
1367 fragment shader input contains the fragment's window position. The X
1368 component starts at zero and always increases from left to right.
The Y component starts at zero and always increases, but Y=0 may indicate
either the top or the bottom of the window, depending on the fragment
coordinate origin convention (see TGSI_PROPERTY_FS_COORD_ORIGIN).
The Z coordinate ranges from 0 to 1 to represent depth from the front
to the back of the Z buffer. The W component contains the reciprocal
1374 of the interpolated vertex position W component.
1376 Fragment shaders may also declare an output register with
1377 TGSI_SEMANTIC_POSITION. Only the Z component is writable. This allows
1378 the fragment shader to change the fragment's Z position.
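
The perspective divide described for the vertex shader output, and the reciprocal W seen by the fragment shader, can be sketched in Python as follows. The function names are hypothetical, and clipping and the viewport transform are deliberately left out.

```python
def clip_to_ndc(clip):
    """Divide X, Y and Z by W to get normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def fragment_w(interpolated_w):
    """The fragment position's W component is the reciprocal of W."""
    return 1.0 / interpolated_w


print(clip_to_ndc((2.0, -1.0, 0.5, 2.0)))  # (1.0, -0.5, 0.25)
print(fragment_w(2.0))                     # 0.5
```
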
1385 For vertex shader outputs or fragment shader inputs/outputs, this
label indicates that the register contains an R,G,B,A color.
1388 Several shader inputs/outputs may contain colors so the semantic index
1389 is used to distinguish them. For example, color[0] may be the diffuse
1390 color while color[1] may be the specular color.
1392 This label is needed so that the flat/smooth shading can be applied
1393 to the right interpolants during rasterization.
1397 TGSI_SEMANTIC_BCOLOR
1398 """"""""""""""""""""
1400 Back-facing colors are only used for back-facing polygons, and are only valid
1401 in vertex shader outputs. After rasterization, all polygons are front-facing
1402 and COLOR and BCOLOR end up occupying the same slots in the fragment shader,
1403 so all BCOLORs effectively become regular COLORs in the fragment shader.
1409 Vertex shader inputs and outputs and fragment shader inputs may be
1410 labeled with TGSI_SEMANTIC_FOG to indicate that the register contains
1411 a fog coordinate in the form (F, 0, 0, 1). Typically, the fragment
1412 shader will use the fog coordinate to compute a fog blend factor which
1413 is used to blend the normal fragment color with a constant fog color.
1415 Only the first component matters when writing from the vertex shader;
1416 the driver will ensure that the coordinate is in this format when used
1417 as a fragment shader input.
1423 Vertex shader input and output registers may be labeled with
TGSI_SEMANTIC_PSIZE to indicate that the register contains a point size
1425 in the form (S, 0, 0, 1). The point size controls the width or diameter
1426 of points for rasterization. This label cannot be used in fragment
1429 When using this semantic, be sure to set the appropriate state in the
1430 :ref:`rasterizer` first.
1433 TGSI_SEMANTIC_GENERIC
1434 """""""""""""""""""""
1436 All vertex/fragment shader inputs/outputs not labeled with any other
1437 semantic label can be considered to be generic attributes. Typical
1438 uses of generic inputs/outputs are texcoords and user-defined values.
1441 TGSI_SEMANTIC_NORMAL
1442 """"""""""""""""""""
1444 Indicates that a vertex shader input is a normal vector. This is
1445 typically only used for legacy graphics APIs.
1451 This label applies to fragment shader inputs only and indicates that
1452 the register contains front/back-face information of the form (F, 0,
1453 0, 1). The first component will be positive when the fragment belongs
1454 to a front-facing polygon, and negative when the fragment belongs to a
1455 back-facing polygon.
1458 TGSI_SEMANTIC_EDGEFLAG
1459 """"""""""""""""""""""
For vertex shaders, this semantic label indicates that an input or
1462 output is a boolean edge flag. The register layout is [F, x, x, x]
1463 where F is 0.0 or 1.0 and x = don't care. Normally, the vertex shader
1464 simply copies the edge flag input to the edgeflag output.
1466 Edge flags are used to control which lines or points are actually
1467 drawn when the polygon mode converts triangles/quads/polygons into
1470 TGSI_SEMANTIC_STENCIL
1471 """"""""""""""""""""""
For fragment shaders, this semantic label indicates that an output
is a writable stencil reference value. Only the Y component is writable.
This allows the fragment shader to change the fragment's stencilref value.
1479 ^^^^^^^^^^^^^^^^^^^^^^^^
1482 Properties are general directives that apply to the whole TGSI program.
1487 Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
1488 The default value is UPPER_LEFT.
1490 If UPPER_LEFT, the position will be (0,0) at the upper left corner and
1491 increase downward and rightward.
1492 If LOWER_LEFT, the position will be (0,0) at the lower left corner and
1493 increase upward and rightward.
1495 OpenGL defaults to LOWER_LEFT, and is configurable with the
1496 GL_ARB_fragment_coord_conventions extension.
1498 DirectX 9/10 use UPPER_LEFT.
1500 FS_COORD_PIXEL_CENTER
1501 """""""""""""""""""""
1503 Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
1504 The default value is HALF_INTEGER.
If HALF_INTEGER, the fractional part of the position will be 0.5.
If INTEGER, the fractional part of the position will be 0.0.
1509 Note that this does not affect the set of fragments generated by
1510 rasterization, which is instead controlled by gl_rasterization_rules in the
1513 OpenGL defaults to HALF_INTEGER, and is configurable with the
1514 GL_ARB_fragment_coord_conventions extension.
1516 DirectX 9 uses INTEGER.
1517 DirectX 10 uses HALF_INTEGER.
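
To see how this property combines with FS_COORD_ORIGIN, the following Python sketch computes the position a fragment shader would observe for a given pixel. The function and its parameters are hypothetical; the defaults mirror the default property values stated in this section (UPPER_LEFT and HALF_INTEGER).

```python
def fragment_position(column, row_from_top, height,
                      origin="UPPER_LEFT", pixel_center="HALF_INTEGER"):
    """Return the (x, y) position seen for one pixel of the window.

    column and row_from_top are integer pixel indices, counted from the
    left and from the top; height is the window height in pixels.
    """
    if origin == "UPPER_LEFT":
        row = row_from_top
    else:  # LOWER_LEFT: count rows from the bottom instead
        row = height - 1 - row_from_top
    offset = 0.5 if pixel_center == "HALF_INTEGER" else 0.0
    return (column + offset, row + offset)


# Top-left pixel of a 480-pixel-high window under each convention.
print(fragment_position(0, 0, 480))                                     # (0.5, 0.5)
print(fragment_position(0, 0, 480, origin="LOWER_LEFT"))                # (0.5, 479.5)
print(fragment_position(0, 0, 480, pixel_center="INTEGER"))             # (0.0, 0.0)
print(fragment_position(0, 0, 480, "LOWER_LEFT", "INTEGER"))            # (0.0, 479.0)
```
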
1519 FS_COLOR0_WRITES_ALL_CBUFS
1520 """"""""""""""""""""""""""
1521 Specifies that writes to the fragment shader color 0 are replicated to all
1522 bound cbufs. This facilitates OpenGL's fragColor output vs fragData[0] where
1523 fragData is directed to a single color buffer, but fragColor is broadcast.
1526 Texture Sampling and Texture Formats
1527 ------------------------------------
1529 This table shows how texture image components are returned as (x,y,z,w) tuples
1530 by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
1531 :opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
1534 +--------------------+--------------+--------------------+--------------+
1535 | Texture Components | Gallium | OpenGL | Direct3D 9 |
1536 +====================+==============+====================+==============+
1537 | R | (r, 0, 0, 1) | (r, 0, 0, 1) | (r, 1, 1, 1) |
1538 +--------------------+--------------+--------------------+--------------+
1539 | RG | (r, g, 0, 1) | (r, g, 0, 1) | (r, g, 1, 1) |
1540 +--------------------+--------------+--------------------+--------------+
1541 | RGB | (r, g, b, 1) | (r, g, b, 1) | (r, g, b, 1) |
1542 +--------------------+--------------+--------------------+--------------+
1543 | RGBA | (r, g, b, a) | (r, g, b, a) | (r, g, b, a) |
1544 +--------------------+--------------+--------------------+--------------+
1545 | A | (0, 0, 0, a) | (0, 0, 0, a) | (0, 0, 0, a) |
1546 +--------------------+--------------+--------------------+--------------+
1547 | L | (l, l, l, 1) | (l, l, l, 1) | (l, l, l, 1) |
1548 +--------------------+--------------+--------------------+--------------+
1549 | LA | (l, l, l, a) | (l, l, l, a) | (l, l, l, a) |
1550 +--------------------+--------------+--------------------+--------------+
1551 | I | (i, i, i, i) | (i, i, i, i) | N/A |
1552 +--------------------+--------------+--------------------+--------------+
1553 | UV | XXX TBD | (0, 0, 0, 1) | (u, v, 1, 1) |
1554 | | | [#envmap-bumpmap]_ | |
1555 +--------------------+--------------+--------------------+--------------+
1556 | Z | XXX TBD | (z, z, z, 1) | (0, z, 0, 1) |
1557 | | | [#depth-tex-mode]_ | |
1558 +--------------------+--------------+--------------------+--------------+
1559 | S | (s, s, s, s) | unknown | unknown |
1560 +--------------------+--------------+--------------------+--------------+
1562 .. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
1563 .. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
1564 or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
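
The Gallium column of the table can be expressed as a small lookup, sketched here in Python. The dictionary and function names are invented for this example, and the UV and Z rows are left out because the table marks them as TBD.

```python
# One selector per output component: a channel letter, or the constants 0 and 1.
GALLIUM_SWIZZLES = {
    "R": "r001",
    "RG": "rg01",
    "RGB": "rgb1",
    "RGBA": "rgba",
    "A": "000a",
    "L": "lll1",
    "LA": "llla",
    "I": "iiii",
    "S": "ssss",
}


def sample_result(components, **channels):
    """Expand stored texture channels into an (x, y, z, w) tuple."""
    result = []
    for selector in GALLIUM_SWIZZLES[components]:
        if selector == "0":
            result.append(0.0)
        elif selector == "1":
            result.append(1.0)
        else:
            result.append(channels[selector])
    return tuple(result)


print(sample_result("R", r=0.5))          # (0.5, 0.0, 0.0, 1.0)
print(sample_result("LA", l=0.3, a=0.8))  # (0.3, 0.3, 0.3, 0.8)
```
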