4 TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
5 for describing shaders. Since Gallium is inherently shaderful, shaders are
6 an important part of the API. TGSI is the only intermediate representation
12 All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
13 floating-point four-component vectors. An opcode may have up to one
14 destination register, known as *dst*, and between zero and three source
15 registers, called *src0* through *src2*, or simply *src* if there is only
18 Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
19 components as integers. Other instructions permit using registers as
20 two-component vectors with double precision; see :ref:`Double Opcodes`.
22 When an instruction has a scalar result, the result is usually copied into
23 each of the components of *dst*. When this happens, the result is said to be
24 *replicated* to *dst*. :opcode:`RCP` is one such instruction.
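
A minimal sketch of replication, assuming a register is modelled as a Python
4-tuple. The helpers ``replicate`` and ``rcp`` are illustrative names chosen
for this example; they are not part of TGSI or Gallium:

```python
def replicate(scalar):
    """Copy one scalar result into all four components of dst."""
    return (scalar, scalar, scalar, scalar)


def rcp(src):
    """Model of RCP: the reciprocal of src.x, replicated to dst."""
    return replicate(1.0 / src[0])


# Only src.x is read; the other source components are ignored.
dst = rcp((4.0, 9.0, 16.0, 25.0))
print(dst)
```

Every component of ``dst`` holds the same value, which is what "replicated"
means in the opcode descriptions.
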
30 ^^^^^^^^^^^^^^^^^^^^^^^^^
32 These opcodes are guaranteed to be available regardless of the driver being
35 .. opcode:: ARL - Address Register Load
39 dst.x = \lfloor src.x\rfloor
41 dst.y = \lfloor src.y\rfloor
43 dst.z = \lfloor src.z\rfloor
45 dst.w = \lfloor src.w\rfloor
48 .. opcode:: MOV - Move
61 .. opcode:: LIT - Light Coefficients
  dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0
74 .. opcode:: RCP - Reciprocal
76 This instruction replicates its result.
83 .. opcode:: RSQ - Reciprocal Square Root
85 This instruction replicates its result.
89 dst = \frac{1}{\sqrt{|src.x|}}
92 .. opcode:: EXP - Approximate Exponential Base 2
96 dst.x = 2^{\lfloor src.x\rfloor}
98 dst.y = src.x - \lfloor src.x\rfloor
105 .. opcode:: LOG - Approximate Logarithm Base 2
109 dst.x = \lfloor\log_2{|src.x|}\rfloor
111 dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
113 dst.z = \log_2{|src.x|}
118 .. opcode:: MUL - Multiply
122 dst.x = src0.x \times src1.x
124 dst.y = src0.y \times src1.y
126 dst.z = src0.z \times src1.z
128 dst.w = src0.w \times src1.w
131 .. opcode:: ADD - Add
135 dst.x = src0.x + src1.x
137 dst.y = src0.y + src1.y
139 dst.z = src0.z + src1.z
141 dst.w = src0.w + src1.w
144 .. opcode:: DP3 - 3-component Dot Product
146 This instruction replicates its result.
150 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
153 .. opcode:: DP4 - 4-component Dot Product
155 This instruction replicates its result.
159 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
162 .. opcode:: DST - Distance Vector
168 dst.y = src0.y \times src1.y
175 .. opcode:: MIN - Minimum
179 dst.x = min(src0.x, src1.x)
181 dst.y = min(src0.y, src1.y)
183 dst.z = min(src0.z, src1.z)
185 dst.w = min(src0.w, src1.w)
188 .. opcode:: MAX - Maximum
192 dst.x = max(src0.x, src1.x)
194 dst.y = max(src0.y, src1.y)
196 dst.z = max(src0.z, src1.z)
198 dst.w = max(src0.w, src1.w)
201 .. opcode:: SLT - Set On Less Than
205 dst.x = (src0.x < src1.x) ? 1 : 0
207 dst.y = (src0.y < src1.y) ? 1 : 0
209 dst.z = (src0.z < src1.z) ? 1 : 0
211 dst.w = (src0.w < src1.w) ? 1 : 0
214 .. opcode:: SGE - Set On Greater Equal Than
218 dst.x = (src0.x >= src1.x) ? 1 : 0
220 dst.y = (src0.y >= src1.y) ? 1 : 0
222 dst.z = (src0.z >= src1.z) ? 1 : 0
224 dst.w = (src0.w >= src1.w) ? 1 : 0
227 .. opcode:: MAD - Multiply And Add
231 dst.x = src0.x \times src1.x + src2.x
233 dst.y = src0.y \times src1.y + src2.y
235 dst.z = src0.z \times src1.z + src2.z
237 dst.w = src0.w \times src1.w + src2.w
240 .. opcode:: SUB - Subtract
244 dst.x = src0.x - src1.x
246 dst.y = src0.y - src1.y
248 dst.z = src0.z - src1.z
250 dst.w = src0.w - src1.w
253 .. opcode:: LRP - Linear Interpolate
257 dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x
259 dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y
261 dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z
263 dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
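
The LRP formulas can be sketched in Python as follows. Registers are assumed
to be 4-tuples and ``lrp`` is a hypothetical helper name. Note that *src0* is
the blend weight, while *src1* and *src2* are the values being blended:

```python
def lrp(src0, src1, src2):
    """Per-component LRP: dst = src0 * src1 + (1 - src0) * src2."""
    return tuple(w * a + (1.0 - w) * b for w, a, b in zip(src0, src1, src2))


weights = (0.0, 1.0, 0.5, 0.25)
a = (10.0, 10.0, 10.0, 10.0)
b = (20.0, 20.0, 20.0, 20.0)
print(lrp(weights, a, b))
```

A weight of 1 selects *src1* and a weight of 0 selects *src2*.
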
266 .. opcode:: CND - Condition
270 dst.x = (src2.x > 0.5) ? src0.x : src1.x
272 dst.y = (src2.y > 0.5) ? src0.y : src1.y
274 dst.z = (src2.z > 0.5) ? src0.z : src1.z
276 dst.w = (src2.w > 0.5) ? src0.w : src1.w
279 .. opcode:: DP2A - 2-component Dot Product And Add
283 dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x
285 dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x
287 dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x
289 dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
292 .. opcode:: FRC - Fraction
296 dst.x = src.x - \lfloor src.x\rfloor
298 dst.y = src.y - \lfloor src.y\rfloor
300 dst.z = src.z - \lfloor src.z\rfloor
302 dst.w = src.w - \lfloor src.w\rfloor
305 .. opcode:: CLAMP - Clamp
309 dst.x = clamp(src0.x, src1.x, src2.x)
311 dst.y = clamp(src0.y, src1.y, src2.y)
313 dst.z = clamp(src0.z, src1.z, src2.z)
315 dst.w = clamp(src0.w, src1.w, src2.w)
318 .. opcode:: FLR - Floor
320 This is identical to :opcode:`ARL`.
324 dst.x = \lfloor src.x\rfloor
326 dst.y = \lfloor src.y\rfloor
328 dst.z = \lfloor src.z\rfloor
330 dst.w = \lfloor src.w\rfloor
333 .. opcode:: ROUND - Round
346 .. opcode:: EX2 - Exponential Base 2
348 This instruction replicates its result.
355 .. opcode:: LG2 - Logarithm Base 2
357 This instruction replicates its result.
364 .. opcode:: POW - Power
366 This instruction replicates its result.
370 dst = src0.x^{src1.x}
372 .. opcode:: XPD - Cross Product
376 dst.x = src0.y \times src1.z - src1.y \times src0.z
378 dst.y = src0.z \times src1.x - src1.z \times src0.x
380 dst.z = src0.x \times src1.y - src1.x \times src0.y
385 .. opcode:: ABS - Absolute
398 .. opcode:: RCC - Reciprocal Clamped
400 This instruction replicates its result.
402 XXX cleanup on aisle three
406 dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
409 .. opcode:: DPH - Homogeneous Dot Product
411 This instruction replicates its result.
415 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
418 .. opcode:: COS - Cosine
420 This instruction replicates its result.
427 .. opcode:: DDX - Derivative Relative To X
431 dst.x = partialx(src.x)
433 dst.y = partialx(src.y)
435 dst.z = partialx(src.z)
437 dst.w = partialx(src.w)
440 .. opcode:: DDY - Derivative Relative To Y
444 dst.x = partialy(src.x)
446 dst.y = partialy(src.y)
448 dst.z = partialy(src.z)
450 dst.w = partialy(src.w)
453 .. opcode:: KILP - Predicated Discard
458 .. opcode:: PK2H - Pack Two 16-bit Floats
463 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
468 .. opcode:: PK4B - Pack Four Signed 8-bit Scalars
473 .. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars
478 .. opcode:: RFL - Reflection Vector
482 dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x
484 dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y
486 dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z
492 Considered for removal.
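
As an illustration of the RFL formulas, here is a small Python sketch. It
assumes 3-component tuples and covers only the x, y and z results shown
above. The names ``dot3`` and ``rfl`` are made up for this example:

```python
def dot3(a, b):
    """Three-component dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rfl(src0, src1):
    """Reflect src1 about the axis src0 (xyz only)."""
    scale = 2.0 * dot3(src0, src1) / dot3(src0, src0)
    return tuple(scale * n - v for n, v in zip(src0[:3], src1[:3]))


# Reflecting (1, 1, 0) about the Y axis flips the sign of x.
print(rfl((0.0, 1.0, 0.0), (1.0, 1.0, 0.0)))
```

Because the formula divides by the squared length of *src0*, the axis does
not need to be normalized beforehand.
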
495 .. opcode:: SEQ - Set On Equal
499 dst.x = (src0.x == src1.x) ? 1 : 0
501 dst.y = (src0.y == src1.y) ? 1 : 0
503 dst.z = (src0.z == src1.z) ? 1 : 0
505 dst.w = (src0.w == src1.w) ? 1 : 0
508 .. opcode:: SFL - Set On False
510 This instruction replicates its result.
518 Considered for removal.
521 .. opcode:: SGT - Set On Greater Than
525 dst.x = (src0.x > src1.x) ? 1 : 0
527 dst.y = (src0.y > src1.y) ? 1 : 0
529 dst.z = (src0.z > src1.z) ? 1 : 0
531 dst.w = (src0.w > src1.w) ? 1 : 0
534 .. opcode:: SIN - Sine
536 This instruction replicates its result.
543 .. opcode:: SLE - Set On Less Equal Than
547 dst.x = (src0.x <= src1.x) ? 1 : 0
549 dst.y = (src0.y <= src1.y) ? 1 : 0
551 dst.z = (src0.z <= src1.z) ? 1 : 0
553 dst.w = (src0.w <= src1.w) ? 1 : 0
556 .. opcode:: SNE - Set On Not Equal
560 dst.x = (src0.x != src1.x) ? 1 : 0
562 dst.y = (src0.y != src1.y) ? 1 : 0
564 dst.z = (src0.z != src1.z) ? 1 : 0
566 dst.w = (src0.w != src1.w) ? 1 : 0
569 .. opcode:: STR - Set On True
571 This instruction replicates its result.
578 .. opcode:: TEX - Texture Lookup
583 .. opcode:: TXD - Texture Lookup with Derivatives
588 .. opcode:: TXP - Projective Texture Lookup
593 .. opcode:: UP2H - Unpack Two 16-Bit Floats
599 Considered for removal.
601 .. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
607 Considered for removal.
609 .. opcode:: UP4B - Unpack Four Signed 8-Bit Values
615 Considered for removal.
617 .. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
623 Considered for removal.
625 .. opcode:: X2D - 2D Coordinate Transformation
629 dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
631 dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
633 dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
635 dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
639 Considered for removal.
642 .. opcode:: ARA - Address Register Add
648 Considered for removal.
650 .. opcode:: ARR - Address Register Load With Round
663 .. opcode:: BRA - Branch
669 Considered for removal.
671 .. opcode:: CAL - Subroutine Call
677 .. opcode:: RET - Subroutine Call Return
682 .. opcode:: SSG - Set Sign
686 dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
688 dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
690 dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
692 dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
695 .. opcode:: CMP - Compare
699 dst.x = (src0.x < 0) ? src1.x : src2.x
701 dst.y = (src0.y < 0) ? src1.y : src2.y
703 dst.z = (src0.z < 0) ? src1.z : src2.z
705 dst.w = (src0.w < 0) ? src1.w : src2.w
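
A short Python sketch of CMP, assuming registers are 4-tuples
(``cmp_select`` is a hypothetical name). It makes the boundary case explicit:
a *src0* component of exactly zero selects *src2*, because the test is a
strict ``< 0``:

```python
def cmp_select(src0, src1, src2):
    """Per-component CMP: pick src1 where src0 < 0, otherwise src2."""
    return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))


cond = (-1.0, 0.0, 2.0, -0.5)
print(cmp_select(cond, (1, 1, 1, 1), (2, 2, 2, 2)))
```
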
708 .. opcode:: KIL - Conditional Discard
712 if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
717 .. opcode:: SCS - Sine Cosine
730 .. opcode:: TXB - Texture Lookup With Bias
735 .. opcode:: NRM - 3-component Vector Normalise
  dst.x = src.x / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.y = src.y / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.z = src.z / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
748 .. opcode:: DIV - Divide
752 dst.x = \frac{src0.x}{src1.x}
754 dst.y = \frac{src0.y}{src1.y}
756 dst.z = \frac{src0.z}{src1.z}
758 dst.w = \frac{src0.w}{src1.w}
761 .. opcode:: DP2 - 2-component Dot Product
763 This instruction replicates its result.
767 dst = src0.x \times src1.x + src0.y \times src1.y
770 .. opcode:: TXL - Texture Lookup With LOD
775 .. opcode:: BRK - Break
785 .. opcode:: ELSE - Else
790 .. opcode:: ENDIF - End If
795 .. opcode:: PUSHA - Push Address Register On Stack
804 Considered for cleanup.
808 Considered for removal.
810 .. opcode:: POPA - Pop Address Register From Stack
819 Considered for cleanup.
823 Considered for removal.
827 ^^^^^^^^^^^^^^^^^^^^^^^^
These opcodes are primarily provided for special-use computational shaders.
Support for these opcodes is indicated by a special pipe capability bit (TBD).
832 XXX so let's discuss it, yeah?
834 .. opcode:: CEIL - Ceiling
838 dst.x = \lceil src.x\rceil
840 dst.y = \lceil src.y\rceil
842 dst.z = \lceil src.z\rceil
844 dst.w = \lceil src.w\rceil
847 .. opcode:: I2F - Integer To Float
851 dst.x = (float) src.x
853 dst.y = (float) src.y
855 dst.z = (float) src.z
857 dst.w = (float) src.w
860 .. opcode:: NOT - Bitwise Not
873 .. opcode:: TRUNC - Truncate
886 .. opcode:: SHL - Shift Left
890 dst.x = src0.x << src1.x
892 dst.y = src0.y << src1.x
894 dst.z = src0.z << src1.x
896 dst.w = src0.w << src1.x
899 .. opcode:: SHR - Shift Right
903 dst.x = src0.x >> src1.x
905 dst.y = src0.y >> src1.x
907 dst.z = src0.z >> src1.x
909 dst.w = src0.w >> src1.x
912 .. opcode:: AND - Bitwise And
916 dst.x = src0.x & src1.x
918 dst.y = src0.y & src1.y
920 dst.z = src0.z & src1.z
922 dst.w = src0.w & src1.w
925 .. opcode:: OR - Bitwise Or
929 dst.x = src0.x | src1.x
931 dst.y = src0.y | src1.y
933 dst.z = src0.z | src1.z
935 dst.w = src0.w | src1.w
938 .. opcode:: MOD - Modulus
942 dst.x = src0.x \bmod src1.x
944 dst.y = src0.y \bmod src1.y
946 dst.z = src0.z \bmod src1.z
948 dst.w = src0.w \bmod src1.w
951 .. opcode:: XOR - Bitwise Xor
955 dst.x = src0.x \oplus src1.x
957 dst.y = src0.y \oplus src1.y
959 dst.z = src0.z \oplus src1.z
961 dst.w = src0.w \oplus src1.w
964 .. opcode:: SAD - Sum Of Absolute Differences
968 dst.x = |src0.x - src1.x| + src2.x
970 dst.y = |src0.y - src1.y| + src2.y
972 dst.z = |src0.z - src1.z| + src2.z
974 dst.w = |src0.w - src1.w| + src2.w
977 .. opcode:: TXF - Texel Fetch
982 .. opcode:: TXQ - Texture Size Query
987 .. opcode:: CONT - Continue
993 Support for CONT is determined by a special capability bit,
994 ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
998 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1000 These opcodes are only supported in geometry shaders; they have no meaning
1001 in any other type of shader.
1003 .. opcode:: EMIT - Emit
1008 .. opcode:: ENDPRIM - End Primitive
1016 These opcodes are part of :term:`GLSL`'s opcode set. Support for these
1017 opcodes is determined by a special capability bit, ``GLSL``.
1019 .. opcode:: BGNLOOP - Begin a Loop
1024 .. opcode:: BGNSUB - Begin Subroutine
1029 .. opcode:: ENDLOOP - End a Loop
1034 .. opcode:: ENDSUB - End Subroutine
1039 .. opcode:: NOP - No Operation
1044 .. opcode:: NRM4 - 4-component Vector Normalise
1046 This instruction replicates its result.
  dst = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
1058 .. opcode:: CALLNZ - Subroutine Call If Not Zero
1063 .. opcode:: IFC - If
1068 .. opcode:: BREAKC - Break Conditional
1077 The double-precision opcodes reinterpret four-component vectors into
1078 two-component vectors with doubled precision in each component.
1080 Support for these opcodes is XXX undecided. :T
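
To make the reinterpretation concrete, the sketch below packs two doubles
into four 32-bit words and back using Python's ``struct`` module. The word
order (low word in the first component of each pair, little-endian) is an
assumption made for illustration only; this document does not specify the
layout. ``pack_doubles`` and ``unpack_doubles`` are hypothetical helpers:

```python
import struct


def pack_doubles(d_xy, d_zw):
    """Return four 32-bit words (x, y, z, w) that hold two doubles."""
    return struct.unpack("<4I", struct.pack("<2d", d_xy, d_zw))


def unpack_doubles(words):
    """Reinterpret four 32-bit words as the two doubles (xy, zw)."""
    return struct.unpack("<2d", struct.pack("<4I", *words))


words = pack_doubles(1.5, -2.25)
print(words)
print(unpack_doubles(words))
```
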
1082 .. opcode:: DADD - Add
1086 dst.xy = src0.xy + src1.xy
1088 dst.zw = src0.zw + src1.zw
1091 .. opcode:: DDIV - Divide
1095 dst.xy = src0.xy / src1.xy
1097 dst.zw = src0.zw / src1.zw
1099 .. opcode:: DSEQ - Set on Equal
1103 dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
1105 dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
1107 .. opcode:: DSLT - Set on Less than
1111 dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
1113 dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
1115 .. opcode:: DFRAC - Fraction
1119 dst.xy = src.xy - \lfloor src.xy\rfloor
1121 dst.zw = src.zw - \lfloor src.zw\rfloor
1124 .. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
Like the ``frexp()`` routine in many math libraries, this opcode stores the
exponent of its source to ``dst0``, and the significand to ``dst1``, such that
:math:`dst1 \times 2^{dst0} = src`.
1132 dst0.xy = exp(src.xy)
1134 dst1.xy = frac(src.xy)
1136 dst0.zw = exp(src.zw)
1138 dst1.zw = frac(src.zw)
1140 .. opcode:: DLDEXP - Multiply Number by Integral Power of 2
1142 This opcode is the inverse of :opcode:`DFRACEXP`.
1146 dst.xy = src0.xy \times 2^{src1.xy}
1148 dst.zw = src0.zw \times 2^{src1.zw}
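
The relationship between :opcode:`DFRACEXP` and :opcode:`DLDEXP` can be
sketched with Python's ``math.frexp`` and ``math.ldexp``, applied to a single
double. The helper names are illustrative, and the significand range of
[0.5, 1) is that of the C library routine, which is assumed rather than
stated by this document:

```python
import math


def dfracexp(src):
    """Return (dst0, dst1) = (exponent, significand) for one double."""
    significand, exponent = math.frexp(src)
    return exponent, significand


def dldexp(src0, src1):
    """Return src0 * 2**src1 for one double and an integral exponent."""
    return math.ldexp(src0, src1)


dst0, dst1 = dfracexp(8.0)
print(dst0, dst1)
print(dldexp(dst1, dst0))  # recombines the two parts into the source
```
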
1150 .. opcode:: DMIN - Minimum
1154 dst.xy = min(src0.xy, src1.xy)
1156 dst.zw = min(src0.zw, src1.zw)
1158 .. opcode:: DMAX - Maximum
1162 dst.xy = max(src0.xy, src1.xy)
1164 dst.zw = max(src0.zw, src1.zw)
1166 .. opcode:: DMUL - Multiply
1170 dst.xy = src0.xy \times src1.xy
1172 dst.zw = src0.zw \times src1.zw
1175 .. opcode:: DMAD - Multiply And Add
1179 dst.xy = src0.xy \times src1.xy + src2.xy
1181 dst.zw = src0.zw \times src1.zw + src2.zw
1184 .. opcode:: DRCP - Reciprocal
1188 dst.xy = \frac{1}{src.xy}
1190 dst.zw = \frac{1}{src.zw}
1192 .. opcode:: DSQRT - Square Root
1196 dst.xy = \sqrt{src.xy}
1198 dst.zw = \sqrt{src.zw}
1201 Explanation of symbols used
1202 ------------------------------
1209 :math:`|x|` Absolute value of `x`.
1211 :math:`\lceil x \rceil` Ceiling of `x`.
1213 clamp(x,y,z) Clamp x between y and z.
1214 (x < y) ? y : (x > z) ? z : x
1216 :math:`\lfloor x\rfloor` Floor of `x`.
1218 :math:`\log_2{x}` Logarithm of `x`, base 2.
1220 max(x,y) Maximum of x and y.
1223 min(x,y) Minimum of x and y.
1226 partialx(x) Derivative of x relative to fragment's X.
1228 partialy(x) Derivative of x relative to fragment's Y.
1230 pop() Pop from stack.
1232 :math:`x^y` `x` to the power `y`.
1234 push(x) Push x on stack.
1238 trunc(x) Truncate x, i.e. drop the fraction bits.
1245 discard Discard fragment.
1249 target Label of target instruction.
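
Some of the symbols above translate directly into Python. The sketch below
is illustrative only; it follows the ``clamp`` expansion given in the list
and contrasts ``trunc`` with floor for a negative input:

```python
import math


def clamp(x, y, z):
    """Clamp x between y and z: (x < y) ? y : (x > z) ? z : x."""
    return y if x < y else (z if x > z else x)


def trunc(x):
    """Drop the fraction bits, rounding toward zero."""
    return float(math.trunc(x))


print(clamp(5, 0, 3), clamp(-1, 0, 3), clamp(2, 0, 3))
print(trunc(-1.7), math.floor(-1.7))  # trunc and floor differ below zero
```
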
Declares a register that will be referenced as an operand in Instruction
1263 File field contains register file that is being declared and is one
1266 UsageMask field specifies which of the register components can be accessed
1267 and is one of TGSI_WRITEMASK.
Interpolate field is only valid for fragment shader INPUT register files.
It specifies the way input is being interpolated by the rasteriser and is one
of TGSI_INTERPOLATE.
1273 If Dimension flag is set to 1, a Declaration Dimension token follows.
1275 If Semantic flag is set to 1, a Declaration Semantic token follows.
1277 CylindricalWrap bitfield is only valid for fragment shader INPUT register
1278 files. It specifies which register components should be subject to cylindrical
1279 wrapping when interpolating by the rasteriser. If TGSI_CYLINDRICAL_WRAP_X
1280 is set to 1, the X component should be interpolated according to cylindrical
1284 Declaration Semantic
1285 ^^^^^^^^^^^^^^^^^^^^^^^^
1287 Vertex and fragment shader input and output registers may be labeled
1288 with semantic information consisting of a name and index.
1290 Follows Declaration token if Semantic bit is set.
1292 Since its purpose is to link a shader with other stages of the pipeline,
1293 it is valid to follow only those Declaration tokens that declare a register
1294 either in INPUT or OUTPUT file.
1296 SemanticName field contains the semantic name of the register being declared.
1297 There is no default value.
1299 SemanticIndex is an optional subscript that can be used to distinguish
1300 different register declarations with the same semantic name. The default value
1303 The meanings of the individual semantic names are explained in the following
1306 TGSI_SEMANTIC_POSITION
1307 """"""""""""""""""""""
1309 For vertex shaders, TGSI_SEMANTIC_POSITION indicates the vertex shader
1310 output register which contains the homogeneous vertex position in the clip
1311 space coordinate system. After clipping, the X, Y and Z components of the
1312 vertex will be divided by the W value to get normalized device coordinates.
For fragment shaders, TGSI_SEMANTIC_POSITION is used to indicate that a
fragment shader input contains the fragment's window position. The X
component starts at zero and always increases from left to right.
The Y component starts at zero and always increases, but Y=0 may indicate
either the top or the bottom of the window, depending on the fragment
coordinate origin convention (see TGSI_PROPERTY_FS_COORD_ORIGIN).
The Z coordinate ranges from 0 to 1 to represent depth from the front
to the back of the Z buffer. The W component contains the reciprocal
of the interpolated vertex position W component.
1324 Fragment shaders may also declare an output register with
1325 TGSI_SEMANTIC_POSITION. Only the Z component is writable. This allows
1326 the fragment shader to change the fragment's Z position.
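
The perspective divide described for vertex shader outputs can be sketched
as below. This is an illustration under simple assumptions: positions are
Python 4-tuples, clipping has already happened, and ``clip_to_ndc`` and
``fragment_w`` are hypothetical names:

```python
def clip_to_ndc(clip):
    """Divide X, Y and Z by W to get normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def fragment_w(interpolated_w):
    """W as seen by the fragment shader: the reciprocal of position W."""
    return 1.0 / interpolated_w


position = (2.0, -4.0, 1.0, 4.0)
print(clip_to_ndc(position))
print(fragment_w(position[3]))
```
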
For vertex shader outputs or fragment shader inputs/outputs, this
label indicates that the register contains an R,G,B,A color.
1336 Several shader inputs/outputs may contain colors so the semantic index
1337 is used to distinguish them. For example, color[0] may be the diffuse
1338 color while color[1] may be the specular color.
1340 This label is needed so that the flat/smooth shading can be applied
1341 to the right interpolants during rasterization.
1345 TGSI_SEMANTIC_BCOLOR
1346 """"""""""""""""""""
1348 Back-facing colors are only used for back-facing polygons, and are only valid
1349 in vertex shader outputs. After rasterization, all polygons are front-facing
1350 and COLOR and BCOLOR end up occupying the same slots in the fragment shader,
1351 so all BCOLORs effectively become regular COLORs in the fragment shader.
1357 Vertex shader inputs and outputs and fragment shader inputs may be
1358 labeled with TGSI_SEMANTIC_FOG to indicate that the register contains
1359 a fog coordinate in the form (F, 0, 0, 1). Typically, the fragment
1360 shader will use the fog coordinate to compute a fog blend factor which
1361 is used to blend the normal fragment color with a constant fog color.
1363 Only the first component matters when writing from the vertex shader;
1364 the driver will ensure that the coordinate is in this format when used
1365 as a fragment shader input.
Vertex shader input and output registers may be labeled with
TGSI_SEMANTIC_PSIZE to indicate that the register contains a point size
in the form (S, 0, 0, 1). The point size controls the width or diameter
of points for rasterization. This label cannot be used in fragment
1377 When using this semantic, be sure to set the appropriate state in the
1378 :ref:`rasterizer` first.
1381 TGSI_SEMANTIC_GENERIC
1382 """""""""""""""""""""
1384 All vertex/fragment shader inputs/outputs not labeled with any other
1385 semantic label can be considered to be generic attributes. Typical
1386 uses of generic inputs/outputs are texcoords and user-defined values.
1389 TGSI_SEMANTIC_NORMAL
1390 """"""""""""""""""""
1392 Indicates that a vertex shader input is a normal vector. This is
1393 typically only used for legacy graphics APIs.
1399 This label applies to fragment shader inputs only and indicates that
1400 the register contains front/back-face information of the form (F, 0,
1401 0, 1). The first component will be positive when the fragment belongs
1402 to a front-facing polygon, and negative when the fragment belongs to a
1403 back-facing polygon.
1406 TGSI_SEMANTIC_EDGEFLAG
1407 """"""""""""""""""""""
For vertex shaders, this semantic label indicates that an input or
output is a boolean edge flag. The register layout is [F, x, x, x]
where F is 0.0 or 1.0 and x = don't care. Normally, the vertex shader
simply copies the edge flag input to the edgeflag output.
1414 Edge flags are used to control which lines or points are actually
1415 drawn when the polygon mode converts triangles/quads/polygons into
1418 TGSI_SEMANTIC_STENCIL
1419 """"""""""""""""""""""
For fragment shaders, this semantic label indicates that an output
is a writable stencil reference value. Only the Y component is writable.
This allows the fragment shader to change the fragment's stencil reference
value.
1427 ^^^^^^^^^^^^^^^^^^^^^^^^
1430 Properties are general directives that apply to the whole TGSI program.
1435 Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
1436 The default value is UPPER_LEFT.
1438 If UPPER_LEFT, the position will be (0,0) at the upper left corner and
1439 increase downward and rightward.
1440 If LOWER_LEFT, the position will be (0,0) at the lower left corner and
1441 increase upward and rightward.
1443 OpenGL defaults to LOWER_LEFT, and is configurable with the
1444 GL_ARB_fragment_coord_conventions extension.
1446 DirectX 9/10 use UPPER_LEFT.
1448 FS_COORD_PIXEL_CENTER
1449 """""""""""""""""""""
1451 Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
1452 The default value is HALF_INTEGER.
If HALF_INTEGER, the fractional part of the position will be 0.5.
If INTEGER, the fractional part of the position will be 0.0.
1457 Note that this does not affect the set of fragments generated by
1458 rasterization, which is instead controlled by gl_rasterization_rules in the
1461 OpenGL defaults to HALF_INTEGER, and is configurable with the
1462 GL_ARB_fragment_coord_conventions extension.
1464 DirectX 9 uses INTEGER.
1465 DirectX 10 uses HALF_INTEGER.
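
The two properties interact when a Y coordinate has to be converted between
the UPPER_LEFT and LOWER_LEFT conventions. The sketch below is not taken from
this document; it is the arithmetic that follows from the definitions above,
with a hypothetical ``height`` parameter giving the window height in pixels:

```python
def flip_y(y, height, pixel_center="HALF_INTEGER"):
    """Convert a window Y between UPPER_LEFT and LOWER_LEFT origins."""
    if pixel_center == "HALF_INTEGER":
        # Row r from the top has y = r + 0.5 and mirrors to height - y.
        return height - y
    if pixel_center == "INTEGER":
        # Row r from the top has y = r and mirrors to (height - 1) - y.
        return (height - 1) - y
    raise ValueError("unknown pixel center convention: " + pixel_center)


print(flip_y(0.5, 480))
print(flip_y(0, 480, "INTEGER"))
```

Applying the conversion twice returns the original coordinate, so the same
helper works in both directions.
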
1469 Texture Sampling and Texture Formats
1470 ------------------------------------
1472 This table shows how texture image components are returned as (x,y,z,w) tuples
1473 by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
1474 :opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
1477 +--------------------+--------------+--------------------+--------------+
1478 | Texture Components | Gallium | OpenGL | Direct3D 9 |
1479 +====================+==============+====================+==============+
1480 | R | (r, 0, 0, 1) | (r, 0, 0, 1) | (r, 1, 1, 1) |
1481 +--------------------+--------------+--------------------+--------------+
1482 | RG | (r, g, 0, 1) | (r, g, 0, 1) | (r, g, 1, 1) |
1483 +--------------------+--------------+--------------------+--------------+
1484 | RGB | (r, g, b, 1) | (r, g, b, 1) | (r, g, b, 1) |
1485 +--------------------+--------------+--------------------+--------------+
1486 | RGBA | (r, g, b, a) | (r, g, b, a) | (r, g, b, a) |
1487 +--------------------+--------------+--------------------+--------------+
1488 | A | (0, 0, 0, a) | (0, 0, 0, a) | (0, 0, 0, a) |
1489 +--------------------+--------------+--------------------+--------------+
1490 | L | (l, l, l, 1) | (l, l, l, 1) | (l, l, l, 1) |
1491 +--------------------+--------------+--------------------+--------------+
1492 | LA | (l, l, l, a) | (l, l, l, a) | (l, l, l, a) |
1493 +--------------------+--------------+--------------------+--------------+
1494 | I | (i, i, i, i) | (i, i, i, i) | N/A |
1495 +--------------------+--------------+--------------------+--------------+
1496 | UV | XXX TBD | (0, 0, 0, 1) | (u, v, 1, 1) |
1497 | | | [#envmap-bumpmap]_ | |
1498 +--------------------+--------------+--------------------+--------------+
1499 | Z | XXX TBD | (z, z, z, 1) | (0, z, 0, 1) |
1500 | | | [#depth-tex-mode]_ | |
1501 +--------------------+--------------+--------------------+--------------+
1502 | S | (s, s, s, s) | unknown | unknown |
1503 +--------------------+--------------+--------------------+--------------+
1505 .. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
1506 .. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
1507 or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
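
The Gallium column of the table can be expressed as a lookup from component
layout to result tuple. This is an illustrative sketch: the names
``GALLIUM_RESULT`` and ``sample_result`` are hypothetical, and the UV and Z
rows are left out because the table marks them as TBD:

```python
# Each entry gives the (x, y, z, w) result; strings name texel channels,
# numbers are constants filled in for missing channels.
GALLIUM_RESULT = {
    "R": ("r", 0, 0, 1),
    "RG": ("r", "g", 0, 1),
    "RGB": ("r", "g", "b", 1),
    "RGBA": ("r", "g", "b", "a"),
    "A": (0, 0, 0, "a"),
    "L": ("l", "l", "l", 1),
    "LA": ("l", "l", "l", "a"),
    "I": ("i", "i", "i", "i"),
    "S": ("s", "s", "s", "s"),
}


def sample_result(components, texel):
    """Expand a texel (dict of channel values) into an (x, y, z, w) tuple."""
    template = GALLIUM_RESULT[components]
    return tuple(
        float(texel[t]) if isinstance(t, str) else float(t) for t in template
    )


print(sample_result("RG", {"r": 0.25, "g": 0.5}))
print(sample_result("LA", {"l": 0.5, "a": 0.75}))
```
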