4 TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
5 for describing shaders. Since Gallium is inherently shaderful, shaders are
6 an important part of the API. TGSI is the only intermediate representation
12 All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
13 floating-point four-component vectors. An opcode may have up to one
14 destination register, known as *dst*, and between zero and three source
15 registers, called *src0* through *src2*, or simply *src* if there is only
18 Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
19 components as integers. Other instructions permit using registers as
20 two-component vectors with double precision; see :ref:`Double Opcodes`.
22 When an instruction has a scalar result, the result is usually copied into
23 each of the components of *dst*. When this happens, the result is said to be
24 *replicated* to *dst*. :opcode:`RCP` is one such instruction.
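Replication can be modelled in a few lines. The sketch below is illustrative Python rather than TGSI; the helper name ``rcp_replicated`` is hypothetical, and registers are represented as plain 4-tuples purely for the sake of the example.

```python
def rcp_replicated(src):
    """Model RCP: compute 1 / src.x and copy it into all four dst components."""
    result = 1.0 / src[0]          # only the x component of src is read
    return (result, result, result, result)

# The y, z and w components of the source have no influence on the result.
dst = rcp_replicated((4.0, 9.0, 16.0, 25.0))
```

Every component of ``dst`` holds the same scalar, which is what "replicated" means above.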
TGSI supports modifiers on inputs (as well as a saturate modifier on instructions).
31 For inputs which have a floating point type, both absolute value and negation
32 modifiers are supported (with absolute value being applied first).
33 TGSI_OPCODE_MOV is considered to have float input type for applying modifiers.
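The ordering of the two float modifiers matters when both are set. As a minimal sketch (the function name and its keyword arguments are hypothetical, chosen only for illustration), applying absolute value first and negation second always yields a non-positive result:

```python
def apply_float_modifiers(value, absolute=False, negate=False):
    """Apply source modifiers in the documented order: absolute value, then negate."""
    if absolute:
        value = abs(value)
    if negate:
        value = -value
    return value

both_on_negative = apply_float_modifiers(-3.0, absolute=True, negate=True)
both_on_positive = apply_float_modifiers(2.0, absolute=True, negate=True)
```

With both modifiers enabled the operand becomes ``-|x|`` regardless of the sign of the input.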
For inputs which have a signed type, only the negate modifier is supported. This
includes instructions which are otherwise agnostic as to whether the type is
signed or unsigned, such as TGSI_OPCODE_UADD.
39 For inputs with unsigned type no modifiers are allowed.
45 ^^^^^^^^^^^^^^^^^^^^^^^^^
47 These opcodes are guaranteed to be available regardless of the driver being
50 .. opcode:: ARL - Address Register Load
54 dst.x = \lfloor src.x\rfloor
56 dst.y = \lfloor src.y\rfloor
58 dst.z = \lfloor src.z\rfloor
60 dst.w = \lfloor src.w\rfloor
63 .. opcode:: MOV - Move
76 .. opcode:: LIT - Light Coefficients
dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0
89 .. opcode:: RCP - Reciprocal
91 This instruction replicates its result.
98 .. opcode:: RSQ - Reciprocal Square Root
100 This instruction replicates its result.
104 dst = \frac{1}{\sqrt{|src.x|}}
107 .. opcode:: SQRT - Square Root
109 This instruction replicates its result.
116 .. opcode:: EXP - Approximate Exponential Base 2
120 dst.x = 2^{\lfloor src.x\rfloor}
122 dst.y = src.x - \lfloor src.x\rfloor
129 .. opcode:: LOG - Approximate Logarithm Base 2
133 dst.x = \lfloor\log_2{|src.x|}\rfloor
135 dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
137 dst.z = \log_2{|src.x|}
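The three LOG components listed above split the magnitude of ``src.x`` into an integer exponent and a mantissa. The following Python sketch evaluates just those three formulas for a nonzero input; the function name ``log_partial`` is hypothetical, and the sketch makes no claim about any component not listed here.

```python
import math

def log_partial(src_x):
    """Return (dst.x, dst.y, dst.z) from the LOG formulas for a nonzero src.x."""
    magnitude = abs(src_x)
    exponent = math.floor(math.log2(magnitude))   # dst.x
    mantissa = magnitude / 2.0 ** exponent        # dst.y
    return float(exponent), mantissa, math.log2(magnitude)  # dst.z is the full log

dst_x, dst_y, dst_z = log_partial(-10.0)
```

Multiplying ``dst_y`` by two raised to ``dst_x`` recovers the magnitude of the input.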
142 .. opcode:: MUL - Multiply
146 dst.x = src0.x \times src1.x
148 dst.y = src0.y \times src1.y
150 dst.z = src0.z \times src1.z
152 dst.w = src0.w \times src1.w
155 .. opcode:: ADD - Add
159 dst.x = src0.x + src1.x
161 dst.y = src0.y + src1.y
163 dst.z = src0.z + src1.z
165 dst.w = src0.w + src1.w
168 .. opcode:: DP3 - 3-component Dot Product
170 This instruction replicates its result.
174 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
177 .. opcode:: DP4 - 4-component Dot Product
179 This instruction replicates its result.
183 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
186 .. opcode:: DST - Distance Vector
192 dst.y = src0.y \times src1.y
199 .. opcode:: MIN - Minimum
203 dst.x = min(src0.x, src1.x)
205 dst.y = min(src0.y, src1.y)
207 dst.z = min(src0.z, src1.z)
209 dst.w = min(src0.w, src1.w)
212 .. opcode:: MAX - Maximum
216 dst.x = max(src0.x, src1.x)
218 dst.y = max(src0.y, src1.y)
220 dst.z = max(src0.z, src1.z)
222 dst.w = max(src0.w, src1.w)
225 .. opcode:: SLT - Set On Less Than
229 dst.x = (src0.x < src1.x) ? 1 : 0
231 dst.y = (src0.y < src1.y) ? 1 : 0
233 dst.z = (src0.z < src1.z) ? 1 : 0
235 dst.w = (src0.w < src1.w) ? 1 : 0
238 .. opcode:: SGE - Set On Greater Equal Than
242 dst.x = (src0.x >= src1.x) ? 1 : 0
244 dst.y = (src0.y >= src1.y) ? 1 : 0
246 dst.z = (src0.z >= src1.z) ? 1 : 0
248 dst.w = (src0.w >= src1.w) ? 1 : 0
251 .. opcode:: MAD - Multiply And Add
255 dst.x = src0.x \times src1.x + src2.x
257 dst.y = src0.y \times src1.y + src2.y
259 dst.z = src0.z \times src1.z + src2.z
261 dst.w = src0.w \times src1.w + src2.w
264 .. opcode:: SUB - Subtract
268 dst.x = src0.x - src1.x
270 dst.y = src0.y - src1.y
272 dst.z = src0.z - src1.z
274 dst.w = src0.w - src1.w
277 .. opcode:: LRP - Linear Interpolate
281 dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x
283 dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y
285 dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z
287 dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
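Note that in LRP the interpolation weight is ``src0``, not ``src2``. A small Python sketch of the per-component formula (the name ``lrp`` and the tuple representation are assumptions made for illustration):

```python
def lrp(src0, src1, src2):
    """Per-component LRP: src0 is the weight given to src1; (1 - src0) goes to src2."""
    return tuple(w * a + (1 - w) * b for w, a, b in zip(src0, src1, src2))

weights = (0.0, 1.0, 0.5, 0.25)
dst = lrp(weights, (10.0, 10.0, 10.0, 10.0), (20.0, 20.0, 20.0, 20.0))
```

A weight of 0 selects ``src2`` and a weight of 1 selects ``src1``.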
290 .. opcode:: CND - Condition
294 dst.x = (src2.x > 0.5) ? src0.x : src1.x
296 dst.y = (src2.y > 0.5) ? src0.y : src1.y
298 dst.z = (src2.z > 0.5) ? src0.z : src1.z
300 dst.w = (src2.w > 0.5) ? src0.w : src1.w
303 .. opcode:: DP2A - 2-component Dot Product And Add
307 dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x
309 dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x
311 dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x
313 dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
316 .. opcode:: FRC - Fraction
320 dst.x = src.x - \lfloor src.x\rfloor
322 dst.y = src.y - \lfloor src.y\rfloor
324 dst.z = src.z - \lfloor src.z\rfloor
326 dst.w = src.w - \lfloor src.w\rfloor
329 .. opcode:: CLAMP - Clamp
333 dst.x = clamp(src0.x, src1.x, src2.x)
335 dst.y = clamp(src0.y, src1.y, src2.y)
337 dst.z = clamp(src0.z, src1.z, src2.z)
339 dst.w = clamp(src0.w, src1.w, src2.w)
342 .. opcode:: FLR - Floor
344 This is identical to :opcode:`ARL`.
348 dst.x = \lfloor src.x\rfloor
350 dst.y = \lfloor src.y\rfloor
352 dst.z = \lfloor src.z\rfloor
354 dst.w = \lfloor src.w\rfloor
357 .. opcode:: ROUND - Round
370 .. opcode:: EX2 - Exponential Base 2
372 This instruction replicates its result.
379 .. opcode:: LG2 - Logarithm Base 2
381 This instruction replicates its result.
388 .. opcode:: POW - Power
390 This instruction replicates its result.
394 dst = src0.x^{src1.x}
396 .. opcode:: XPD - Cross Product
400 dst.x = src0.y \times src1.z - src1.y \times src0.z
402 dst.y = src0.z \times src1.x - src1.z \times src0.x
404 dst.z = src0.x \times src1.y - src1.x \times src0.y
409 .. opcode:: ABS - Absolute
422 .. opcode:: RCC - Reciprocal Clamped
424 This instruction replicates its result.
426 XXX cleanup on aisle three
430 dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
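The RCC formula can be read as "take the reciprocal, then keep its magnitude inside a fixed range while preserving its sign". A Python sketch under that reading follows; the helper names are hypothetical, and the behaviour for ``src.x = 0`` is not covered by the formula, so the sketch simply lets Python raise ``ZeroDivisionError`` in that case.

```python
def clamp(x, lo, hi):
    """clamp(x, y, z) as defined in the symbol table: (x < y) ? y : (x > z) ? z : x."""
    return lo if x < lo else hi if x > hi else x

def rcc(src_x):
    """Clamped reciprocal of a nonzero scalar, following the RCC formula."""
    r = 1.0 / src_x
    if r > 0:
        return clamp(r, 5.42101e-20, 1.884467e+19)
    return clamp(r, -1.884467e+19, -5.42101e-20)

ordinary = rcc(2.0)        # within range, returned unchanged
tiny = rcc(1e30)           # reciprocal is below the positive lower bound
huge_negative = rcc(-1e-30)  # reciprocal is below the negative lower bound
```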
433 .. opcode:: DPH - Homogeneous Dot Product
435 This instruction replicates its result.
439 dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
442 .. opcode:: COS - Cosine
444 This instruction replicates its result.
451 .. opcode:: DDX - Derivative Relative To X
455 dst.x = partialx(src.x)
457 dst.y = partialx(src.y)
459 dst.z = partialx(src.z)
461 dst.w = partialx(src.w)
464 .. opcode:: DDY - Derivative Relative To Y
468 dst.x = partialy(src.x)
470 dst.y = partialy(src.y)
472 dst.z = partialy(src.z)
474 dst.w = partialy(src.w)
477 .. opcode:: KILP - Predicated Discard
482 .. opcode:: PK2H - Pack Two 16-bit Floats
487 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
492 .. opcode:: PK4B - Pack Four Signed 8-bit Scalars
497 .. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars
502 .. opcode:: RFL - Reflection Vector
506 dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x
508 dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y
510 dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z
516 Considered for removal.
519 .. opcode:: SEQ - Set On Equal
523 dst.x = (src0.x == src1.x) ? 1 : 0
525 dst.y = (src0.y == src1.y) ? 1 : 0
527 dst.z = (src0.z == src1.z) ? 1 : 0
529 dst.w = (src0.w == src1.w) ? 1 : 0
532 .. opcode:: SFL - Set On False
534 This instruction replicates its result.
542 Considered for removal.
545 .. opcode:: SGT - Set On Greater Than
549 dst.x = (src0.x > src1.x) ? 1 : 0
551 dst.y = (src0.y > src1.y) ? 1 : 0
553 dst.z = (src0.z > src1.z) ? 1 : 0
555 dst.w = (src0.w > src1.w) ? 1 : 0
558 .. opcode:: SIN - Sine
560 This instruction replicates its result.
567 .. opcode:: SLE - Set On Less Equal Than
571 dst.x = (src0.x <= src1.x) ? 1 : 0
573 dst.y = (src0.y <= src1.y) ? 1 : 0
575 dst.z = (src0.z <= src1.z) ? 1 : 0
577 dst.w = (src0.w <= src1.w) ? 1 : 0
580 .. opcode:: SNE - Set On Not Equal
584 dst.x = (src0.x != src1.x) ? 1 : 0
586 dst.y = (src0.y != src1.y) ? 1 : 0
588 dst.z = (src0.z != src1.z) ? 1 : 0
590 dst.w = (src0.w != src1.w) ? 1 : 0
593 .. opcode:: STR - Set On True
595 This instruction replicates its result.
602 .. opcode:: TEX - Texture Lookup
610 dst = texture_sample(unit, coord, bias)
For array textures, src0.y contains the slice for 1D,
and src0.z contains the slice for 2D.
614 for shadow textures with no arrays, src0.z contains
616 for shadow textures with arrays, src0.z contains
617 the reference value for 1D arrays, and src0.w contains
618 the reference value for 2D arrays.
619 There is no way to pass a bias in the .w value for
620 shadow arrays, and GLSL doesn't allow this.
621 GLSL does allow cube shadows maps to take a bias value,
622 and we have to determine how this will look in TGSI.
624 .. opcode:: TXD - Texture Lookup with Derivatives
636 dst = texture_sample_deriv(unit, coord, bias, ddx, ddy)
639 .. opcode:: TXP - Projective Texture Lookup
coord.x = src0.x / src0.w
coord.y = src0.y / src0.w
coord.z = src0.z / src0.w
653 dst = texture_sample(unit, coord, bias)
656 .. opcode:: UP2H - Unpack Two 16-Bit Floats
662 Considered for removal.
664 .. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
670 Considered for removal.
672 .. opcode:: UP4B - Unpack Four Signed 8-Bit Values
678 Considered for removal.
680 .. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
686 Considered for removal.
688 .. opcode:: X2D - 2D Coordinate Transformation
692 dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
694 dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
696 dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
698 dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
702 Considered for removal.
705 .. opcode:: ARA - Address Register Add
711 Considered for removal.
713 .. opcode:: ARR - Address Register Load With Round
726 .. opcode:: BRA - Branch
732 Considered for removal.
734 .. opcode:: CAL - Subroutine Call
740 .. opcode:: RET - Subroutine Call Return
745 .. opcode:: SSG - Set Sign
749 dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
751 dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
753 dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
755 dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
758 .. opcode:: CMP - Compare
762 dst.x = (src0.x < 0) ? src1.x : src2.x
764 dst.y = (src0.y < 0) ? src1.y : src2.y
766 dst.z = (src0.z < 0) ? src1.z : src2.z
768 dst.w = (src0.w < 0) ? src1.w : src2.w
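CMP selects on the sign of ``src0``, with zero counting as "not negative". A short Python sketch of the per-component selection (the name ``cmp_select`` is hypothetical):

```python
def cmp_select(src0, src1, src2):
    """Per-component CMP: pick src1 where src0 < 0, otherwise src2."""
    return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))

dst = cmp_select((-1.0, 0.0, 2.0, -0.5),
                 (1.0, 1.0, 1.0, 1.0),
                 (2.0, 2.0, 2.0, 2.0))
```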
771 .. opcode:: KIL - Conditional Discard
775 if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
780 .. opcode:: SCS - Sine Cosine
793 .. opcode:: TXB - Texture Lookup With Bias
807 dst = texture_sample(unit, coord, bias)
810 .. opcode:: NRM - 3-component Vector Normalise
dst.x = src.x / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
dst.y = src.y / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
dst.z = src.z / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
823 .. opcode:: DIV - Divide
827 dst.x = \frac{src0.x}{src1.x}
829 dst.y = \frac{src0.y}{src1.y}
831 dst.z = \frac{src0.z}{src1.z}
833 dst.w = \frac{src0.w}{src1.w}
836 .. opcode:: DP2 - 2-component Dot Product
838 This instruction replicates its result.
842 dst = src0.x \times src1.x + src0.y \times src1.y
845 .. opcode:: TXL - Texture Lookup With explicit LOD
859 dst = texture_sample(unit, coord, lod)
862 .. opcode:: BRK - Break
872 .. opcode:: ELSE - Else
877 .. opcode:: ENDIF - End If
882 .. opcode:: PUSHA - Push Address Register On Stack
891 Considered for cleanup.
895 Considered for removal.
897 .. opcode:: POPA - Pop Address Register From Stack
906 Considered for cleanup.
910 Considered for removal.
914 ^^^^^^^^^^^^^^^^^^^^^^^^
916 These opcodes are primarily provided for special-use computational shaders.
Support for these opcodes is indicated by a special pipe capability bit (TBD).
919 XXX so let's discuss it, yeah?
921 .. opcode:: CEIL - Ceiling
925 dst.x = \lceil src.x\rceil
927 dst.y = \lceil src.y\rceil
929 dst.z = \lceil src.z\rceil
931 dst.w = \lceil src.w\rceil
934 .. opcode:: I2F - Integer To Float
938 dst.x = (float) src.x
940 dst.y = (float) src.y
942 dst.z = (float) src.z
944 dst.w = (float) src.w
947 .. opcode:: NOT - Bitwise Not
960 .. opcode:: TRUNC - Truncate
973 .. opcode:: SHL - Shift Left
977 dst.x = src0.x << src1.x
979 dst.y = src0.y << src1.x
981 dst.z = src0.z << src1.x
983 dst.w = src0.w << src1.x
986 .. opcode:: SHR - Shift Right
990 dst.x = src0.x >> src1.x
992 dst.y = src0.y >> src1.x
994 dst.z = src0.z >> src1.x
996 dst.w = src0.w >> src1.x
999 .. opcode:: AND - Bitwise And
1003 dst.x = src0.x & src1.x
1005 dst.y = src0.y & src1.y
1007 dst.z = src0.z & src1.z
1009 dst.w = src0.w & src1.w
1012 .. opcode:: OR - Bitwise Or
1016 dst.x = src0.x | src1.x
1018 dst.y = src0.y | src1.y
1020 dst.z = src0.z | src1.z
1022 dst.w = src0.w | src1.w
1025 .. opcode:: MOD - Modulus
1029 dst.x = src0.x \bmod src1.x
1031 dst.y = src0.y \bmod src1.y
1033 dst.z = src0.z \bmod src1.z
1035 dst.w = src0.w \bmod src1.w
1038 .. opcode:: XOR - Bitwise Xor
1042 dst.x = src0.x \oplus src1.x
1044 dst.y = src0.y \oplus src1.y
1046 dst.z = src0.z \oplus src1.z
1048 dst.w = src0.w \oplus src1.w
1051 .. opcode:: UCMP - Integer Conditional Move
1055 dst.x = src0.x ? src1.x : src2.x
1057 dst.y = src0.y ? src1.y : src2.y
1059 dst.z = src0.z ? src1.z : src2.z
1061 dst.w = src0.w ? src1.w : src2.w
1064 .. opcode:: UARL - Integer Address Register Load
1066 Moves the contents of the source register, assumed to be an integer, into the
1067 destination register, which is assumed to be an address (ADDR) register.
1070 .. opcode:: IABS - Integer Absolute Value
1083 .. opcode:: SAD - Sum Of Absolute Differences
1087 dst.x = |src0.x - src1.x| + src2.x
1089 dst.y = |src0.y - src1.y| + src2.y
1091 dst.z = |src0.z - src1.z| + src2.z
1093 dst.w = |src0.w - src1.w| + src2.w
1096 .. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
1097 from a specified texture image. The source sampler may
1098 not be a CUBE or SHADOW.
1099 src 0 is a four-component signed integer vector used to
1100 identify the single texel accessed. 3 components + level.
src 1 is a three-component constant signed integer vector,
with each component limited to the range
-8..+8 (hardware only seems to deal with this range, although the
interface allows for up to unsigned int).
1105 TXF(uint_vec coord, int_vec offset).
1108 .. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
1109 retrieve the dimensions of the texture
1110 depending on the target. For 1D (width), 2D/RECT/CUBE
1111 (width, height), 3D (width, height, depth),
1112 1D array (width, layers), 2D array (width, height, layers)
1118 dst.x = texture_width(unit, lod)
1120 dst.y = texture_height(unit, lod)
1122 dst.z = texture_depth(unit, lod)
1125 .. opcode:: CONT - Continue
1131 Support for CONT is determined by a special capability bit,
1132 ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
1136 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1138 These opcodes are only supported in geometry shaders; they have no meaning
1139 in any other type of shader.
1141 .. opcode:: EMIT - Emit
1146 .. opcode:: ENDPRIM - End Primitive
1154 These opcodes are part of :term:`GLSL`'s opcode set. Support for these
1155 opcodes is determined by a special capability bit, ``GLSL``.
1157 .. opcode:: BGNLOOP - Begin a Loop
1162 .. opcode:: BGNSUB - Begin Subroutine
1167 .. opcode:: ENDLOOP - End a Loop
1172 .. opcode:: ENDSUB - End Subroutine
1177 .. opcode:: NOP - No Operation
1182 .. opcode:: NRM4 - 4-component Vector Normalise
1184 This instruction replicates its result.
dst = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
1196 .. opcode:: CALLNZ - Subroutine Call If Not Zero
1201 .. opcode:: BREAKC - Break Conditional
1210 The double-precision opcodes reinterpret four-component vectors into
1211 two-component vectors with doubled precision in each component.
1213 Support for these opcodes is XXX undecided. :T
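The reinterpretation of component pairs as doubles can be illustrated with Python's ``struct`` module. This is only a sketch: the helper names are hypothetical, and the choice of which 32-bit component holds the low half of the double (here, the first one) is an assumption that this section does not specify.

```python
import struct

def split_double(value):
    """Split a 64-bit double into two 32-bit words (low word first, by assumption)."""
    return struct.unpack("<II", struct.pack("<d", value))

def join_double(low, high):
    """Reassemble a double from the two 32-bit words of a component pair."""
    return struct.unpack("<d", struct.pack("<II", low, high))[0]

xy = split_double(1.0)        # the pair of words that would occupy dst.xy
value = join_double(*xy)
```

Two such pairs (``xy`` and ``zw``) fit in one four-component register, which is why each double opcode below is written as two equations.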
1215 .. opcode:: DADD - Add
1219 dst.xy = src0.xy + src1.xy
1221 dst.zw = src0.zw + src1.zw
1224 .. opcode:: DDIV - Divide
1228 dst.xy = src0.xy / src1.xy
1230 dst.zw = src0.zw / src1.zw
1232 .. opcode:: DSEQ - Set on Equal
1236 dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
1238 dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
1240 .. opcode:: DSLT - Set on Less than
1244 dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
1246 dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
1248 .. opcode:: DFRAC - Fraction
1252 dst.xy = src.xy - \lfloor src.xy\rfloor
1254 dst.zw = src.zw - \lfloor src.zw\rfloor
1257 .. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
1259 Like the ``frexp()`` routine in many math libraries, this opcode stores the
1260 exponent of its source to ``dst0``, and the significand to ``dst1``, such that
1261 :math:`dst1 \times 2^{dst0} = src` .
1265 dst0.xy = exp(src.xy)
1267 dst1.xy = frac(src.xy)
1269 dst0.zw = exp(src.zw)
1271 dst1.zw = frac(src.zw)
1273 .. opcode:: DLDEXP - Multiply Number by Integral Power of 2
1275 This opcode is the inverse of :opcode:`DFRACEXP`.
1279 dst.xy = src0.xy \times 2^{src1.xy}
1281 dst.zw = src0.zw \times 2^{src1.zw}
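The inverse relationship between DFRACEXP and DLDEXP mirrors ``frexp()`` and ``ldexp()``. The Python sketch below uses the standard ``math`` functions for one double component; the wrapper names are hypothetical, and the significand range of [0.5, 1) is the convention of Python's ``math.frexp``, not something this section states for TGSI.

```python
import math

def dfracexp(src):
    """Return (dst0, dst1) = (exponent, significand) for one double component."""
    significand, exponent = math.frexp(src)
    return exponent, significand

def dldexp(significand, exponent):
    """Multiply a number by an integral power of two."""
    return math.ldexp(significand, exponent)

dst0, dst1 = dfracexp(48.0)
restored = dldexp(dst1, dst0)   # dst1 * 2**dst0 gives back the source
```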
1283 .. opcode:: DMIN - Minimum
1287 dst.xy = min(src0.xy, src1.xy)
1289 dst.zw = min(src0.zw, src1.zw)
1291 .. opcode:: DMAX - Maximum
1295 dst.xy = max(src0.xy, src1.xy)
1297 dst.zw = max(src0.zw, src1.zw)
1299 .. opcode:: DMUL - Multiply
1303 dst.xy = src0.xy \times src1.xy
1305 dst.zw = src0.zw \times src1.zw
1308 .. opcode:: DMAD - Multiply And Add
1312 dst.xy = src0.xy \times src1.xy + src2.xy
1314 dst.zw = src0.zw \times src1.zw + src2.zw
1317 .. opcode:: DRCP - Reciprocal
1321 dst.xy = \frac{1}{src.xy}
1323 dst.zw = \frac{1}{src.zw}
1325 .. opcode:: DSQRT - Square Root
1329 dst.xy = \sqrt{src.xy}
1331 dst.zw = \sqrt{src.zw}
1334 .. _samplingopcodes:
1336 Resource Sampling Opcodes
1337 ^^^^^^^^^^^^^^^^^^^^^^^^^
These opcodes closely follow the semantics of the respective Direct3D
instructions. If in doubt, double-check the Direct3D documentation.
1342 .. opcode:: SAMPLE - Using provided address, sample data from the
1343 specified texture using the filtering mode identified
by the given sampler. The source data may come from
1345 any resource type other than buffers.
1346 SAMPLE dst, address, sampler_view, sampler
1348 SAMPLE TEMP[0], TEMP[1], SVIEW[0], SAMP[0]
1350 .. opcode:: SAMPLE_I - Simplified alternative to the SAMPLE instruction.
1351 Using the provided integer address, SAMPLE_I fetches data
1352 from the specified sampler view without any filtering.
1353 The source data may come from any resource type other
1355 SAMPLE_I dst, address, sampler_view
1357 SAMPLE_I TEMP[0], TEMP[1], SVIEW[0]
1358 The 'address' is specified as unsigned integers. If the
1359 'address' is out of range [0...(# texels - 1)] the
1360 result of the fetch is always 0 in all components.
As such the instruction doesn't honor address wrap
modes; in cases where that behavior is desirable the
'SAMPLE' instruction should be used.
1364 address.w always provides an unsigned integer mipmap
1365 level. If the value is out of the range then the
1366 instruction always returns 0 in all components.
1367 address.yz are ignored for buffers and 1d textures.
1368 address.z is ignored for 1d texture arrays and 2d
1370 For 1D texture arrays address.y provides the array
1371 index (also as unsigned integer). If the value is
1372 out of the range of available array indices
1373 [0... (array size - 1)] then the opcode always returns
1374 0 in all components.
1375 For 2D texture arrays address.z provides the array
1376 index, otherwise it exhibits the same behavior as in
1377 the case for 1D texture arrays.
1378 The exact semantics of the source address are presented
resource type           X     Y     Z     W
-------------           ------------------------
PIPE_BUFFER             x                 ignored
PIPE_TEXTURE_1D         x                 mpl
PIPE_TEXTURE_2D         x     y           mpl
PIPE_TEXTURE_3D         x     y     z     mpl
PIPE_TEXTURE_RECT       x     y           mpl
PIPE_TEXTURE_CUBE       not allowed as source
PIPE_TEXTURE_1D_ARRAY   x     idx         mpl
PIPE_TEXTURE_2D_ARRAY   x     y     idx   mpl
1391 Where 'mpl' is a mipmap level and 'idx' is the
.. opcode:: SAMPLE_I_MS - Just like SAMPLE_I but allows fetching data from
1395 multi-sampled surfaces.
1396 SAMPLE_I_MS dst, address, sampler_view, sample
1398 .. opcode:: SAMPLE_B - Just like the SAMPLE instruction with the
1399 exception that an additional bias is applied to the
1400 level of detail computed as part of the instruction
1402 SAMPLE_B dst, address, sampler_view, sampler, lod_bias
1404 SAMPLE_B TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x
1406 .. opcode:: SAMPLE_C - Similar to the SAMPLE instruction but it
1407 performs a comparison filter. The operands to SAMPLE_C
1408 are identical to SAMPLE, except that there is an additional
1409 float32 operand, reference value, which must be a register
1410 with single-component, or a scalar literal.
SAMPLE_C makes the hardware use the current sampler's
compare_func (in pipe_sampler_state) to compare the
reference value against the red component value for the
source resource at each texel that the currently configured
texture filter covers based on the provided coordinates.
1416 SAMPLE_C dst, address, sampler_view.r, sampler, ref_value
1418 SAMPLE_C TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x
1420 .. opcode:: SAMPLE_C_LZ - Same as SAMPLE_C, but LOD is 0 and derivatives
1421 are ignored. The LZ stands for level-zero.
1422 SAMPLE_C_LZ dst, address, sampler_view.r, sampler, ref_value
1424 SAMPLE_C_LZ TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x
1427 .. opcode:: SAMPLE_D - SAMPLE_D is identical to the SAMPLE opcode except
1428 that the derivatives for the source address in the x
1429 direction and the y direction are provided by extra
1431 SAMPLE_D dst, address, sampler_view, sampler, der_x, der_y
1433 SAMPLE_D TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2], TEMP[3]
1435 .. opcode:: SAMPLE_L - SAMPLE_L is identical to the SAMPLE opcode except
1436 that the LOD is provided directly as a scalar value,
1437 representing no anisotropy.
1438 SAMPLE_L dst, address, sampler_view, sampler, explicit_lod
1440 SAMPLE_L TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x
1442 .. opcode:: GATHER4 - Gathers the four texels to be used in a bi-linear
1443 filtering operation and packs them into a single register.
Only works with 2D, 2D array, cubemap, and cubemap array textures.
1445 For 2D textures, only the addressing modes of the sampler and
1446 the top level of any mip pyramid are used. Set W to zero.
1447 It behaves like the SAMPLE instruction, but a filtered
1448 sample is not generated. The four samples that contribute
1449 to filtering are placed into xyzw in counter-clockwise order,
1450 starting with the (u,v) texture coordinate delta at the
1451 following locations (-, +), (+, +), (+, -), (-, -), where
the magnitude of the deltas is half a texel.
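The counter-clockwise ordering of the four gathered samples can be written out explicitly. The sketch below is illustrative Python; it assumes coordinates expressed in texel units so that "half a texel" is simply 0.5, and the function name ``gather4_sample_points`` is hypothetical.

```python
def gather4_sample_points(u, v):
    """Return the four sample positions in the order they land in dst.xyzw."""
    half = 0.5  # half a texel, assuming (u, v) are in texel units
    signs = [(-1, +1), (+1, +1), (+1, -1), (-1, -1)]  # (-,+), (+,+), (+,-), (-,-)
    return [(u + su * half, v + sv * half) for su, sv in signs]

points = gather4_sample_points(2.0, 3.0)
```

The first entry corresponds to the texel written to ``dst.x``, the last to ``dst.w``.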
1455 .. opcode:: SVIEWINFO - query the dimensions of a given sampler view.
1456 dst receives width, height, depth or array size and
1457 number of mipmap levels as int4. The dst can have a writemask
1458 which will specify what info is the caller interested
1460 SVIEWINFO dst, src_mip_level, sampler_view
1462 SVIEWINFO TEMP[0], TEMP[1].x, SVIEW[0]
src_mip_level is an unsigned integer scalar. If it is
out of range, 0 is returned for width, height and
depth/array size, but the total number of mipmap levels is
still returned correctly for the given sampler view.
1467 The returned width, height and depth values are for
1468 the mipmap level selected by the src_mip_level and
1469 are in the number of texels.
1470 For 1d texture array width is in dst.x, array size
1471 is in dst.y and dst.zw are always 0.
1473 .. opcode:: SAMPLE_POS - query the position of a given sample.
dst receives float4 (x, y, 0, 0) indicating where the
1475 sample is located. If the resource is not a multi-sample
1476 resource and not a render target, the result is 0.
1478 .. opcode:: SAMPLE_INFO - dst receives number of samples in x.
1479 If the resource is not a multi-sample resource and
1480 not a render target, the result is 0.
1483 .. _resourceopcodes:
1485 Resource Access Opcodes
1486 ^^^^^^^^^^^^^^^^^^^^^^^
1488 .. opcode:: LOAD - Fetch data from a shader resource
1490 Syntax: ``LOAD dst, resource, address``
1492 Example: ``LOAD TEMP[0], RES[0], TEMP[1]``
1494 Using the provided integer address, LOAD fetches data
1495 from the specified buffer or texture without any
1498 The 'address' is specified as a vector of unsigned
1499 integers. If the 'address' is out of range the result
1502 Only the first mipmap level of a resource can be read
1503 from using this instruction.
1505 For 1D or 2D texture arrays, the array index is
1506 provided as an unsigned integer in address.y or
1507 address.z, respectively. address.yz are ignored for
1508 buffers and 1D textures. address.z is ignored for 1D
1509 texture arrays and 2D textures. address.w is always
1512 .. opcode:: STORE - Write data to a shader resource
1514 Syntax: ``STORE resource, address, src``
1516 Example: ``STORE RES[0], TEMP[0], TEMP[1]``
1518 Using the provided integer address, STORE writes data
1519 to the specified buffer or texture.
1521 The 'address' is specified as a vector of unsigned
1522 integers. If the 'address' is out of range the result
1525 Only the first mipmap level of a resource can be
1526 written to using this instruction.
1528 For 1D or 2D texture arrays, the array index is
1529 provided as an unsigned integer in address.y or
1530 address.z, respectively. address.yz are ignored for
1531 buffers and 1D textures. address.z is ignored for 1D
1532 texture arrays and 2D textures. address.w is always
1536 .. _threadsyncopcodes:
1538 Inter-thread synchronization opcodes
1539 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1541 These opcodes are intended for communication between threads running
1542 within the same compute grid. For now they're only valid in compute
1545 .. opcode:: MFENCE - Memory fence
1547 Syntax: ``MFENCE resource``
1549 Example: ``MFENCE RES[0]``
1551 This opcode forces strong ordering between any memory access
1552 operations that affect the specified resource. This means that
1553 previous loads and stores (and only those) will be performed and
1554 visible to other threads before the program execution continues.
1557 .. opcode:: LFENCE - Load memory fence
1559 Syntax: ``LFENCE resource``
1561 Example: ``LFENCE RES[0]``
1563 Similar to MFENCE, but it only affects the ordering of memory loads.
1566 .. opcode:: SFENCE - Store memory fence
1568 Syntax: ``SFENCE resource``
1570 Example: ``SFENCE RES[0]``
1572 Similar to MFENCE, but it only affects the ordering of memory stores.
1575 .. opcode:: BARRIER - Thread group barrier
1579 This opcode suspends the execution of the current thread until all
1580 the remaining threads in the working group reach the same point of
1581 the program. Results are unspecified if any of the remaining
1582 threads terminates or never reaches an executed BARRIER instruction.
1590 These opcodes provide atomic variants of some common arithmetic and
1591 logical operations. In this context atomicity means that another
1592 concurrent memory access operation that affects the same memory
1593 location is guaranteed to be performed strictly before or after the
1594 entire execution of the atomic operation.
1596 For the moment they're only valid in compute programs.
1598 .. opcode:: ATOMUADD - Atomic integer addition
1600 Syntax: ``ATOMUADD dst, resource, offset, src``
1602 Example: ``ATOMUADD TEMP[0], RES[0], TEMP[1], TEMP[2]``
1604 The following operation is performed atomically on each component:
1608 dst_i = resource[offset]_i
1610 resource[offset]_i = dst_i + src_i
1613 .. opcode:: ATOMXCHG - Atomic exchange
1615 Syntax: ``ATOMXCHG dst, resource, offset, src``
1617 Example: ``ATOMXCHG TEMP[0], RES[0], TEMP[1], TEMP[2]``
1619 The following operation is performed atomically on each component:
1623 dst_i = resource[offset]_i
1625 resource[offset]_i = src_i
1628 .. opcode:: ATOMCAS - Atomic compare-and-exchange
1630 Syntax: ``ATOMCAS dst, resource, offset, cmp, src``
1632 Example: ``ATOMCAS TEMP[0], RES[0], TEMP[1], TEMP[2], TEMP[3]``
1634 The following operation is performed atomically on each component:
1638 dst_i = resource[offset]_i
1640 resource[offset]_i = (dst_i == cmp_i ? src_i : dst_i)
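The compare-and-exchange step can be modelled for a single component as follows. This Python sketch is single-threaded and therefore says nothing about atomicity itself; it only shows which value ends up in ``dst`` and which in memory. The function name and the list used as a stand-in for the resource are assumptions.

```python
def atomcas(resource, offset, cmp, src):
    """Model one component of ATOMCAS; returns the old value, as written to dst."""
    old = resource[offset]
    resource[offset] = src if old == cmp else old
    return old

memory = [5, 7]
first = atomcas(memory, 0, 5, 9)    # comparison succeeds: memory[0] becomes 9
second = atomcas(memory, 1, 5, 9)   # comparison fails: memory[1] stays 7
```

In both cases ``dst`` receives the value that was in memory before the operation, so the caller can tell whether the exchange took place by comparing ``dst`` with ``cmp``.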
1643 .. opcode:: ATOMAND - Atomic bitwise And
1645 Syntax: ``ATOMAND dst, resource, offset, src``
1647 Example: ``ATOMAND TEMP[0], RES[0], TEMP[1], TEMP[2]``
1649 The following operation is performed atomically on each component:
1653 dst_i = resource[offset]_i
1655 resource[offset]_i = dst_i \& src_i
1658 .. opcode:: ATOMOR - Atomic bitwise Or
1660 Syntax: ``ATOMOR dst, resource, offset, src``
1662 Example: ``ATOMOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
1664 The following operation is performed atomically on each component:
1668 dst_i = resource[offset]_i
1670 resource[offset]_i = dst_i | src_i
1673 .. opcode:: ATOMXOR - Atomic bitwise Xor
1675 Syntax: ``ATOMXOR dst, resource, offset, src``
1677 Example: ``ATOMXOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
1679 The following operation is performed atomically on each component:
1683 dst_i = resource[offset]_i
1685 resource[offset]_i = dst_i \oplus src_i
1688 .. opcode:: ATOMUMIN - Atomic unsigned minimum
1690 Syntax: ``ATOMUMIN dst, resource, offset, src``
1692 Example: ``ATOMUMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``
1694 The following operation is performed atomically on each component:
1698 dst_i = resource[offset]_i
1700 resource[offset]_i = (dst_i < src_i ? dst_i : src_i)
1703 .. opcode:: ATOMUMAX - Atomic unsigned maximum
1705 Syntax: ``ATOMUMAX dst, resource, offset, src``
1707 Example: ``ATOMUMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``
1709 The following operation is performed atomically on each component:
1713 dst_i = resource[offset]_i
1715 resource[offset]_i = (dst_i > src_i ? dst_i : src_i)
1718 .. opcode:: ATOMIMIN - Atomic signed minimum
1720 Syntax: ``ATOMIMIN dst, resource, offset, src``
1722 Example: ``ATOMIMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``
1724 The following operation is performed atomically on each component:
1728 dst_i = resource[offset]_i
1730 resource[offset]_i = (dst_i < src_i ? dst_i : src_i)
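ATOMUMIN and ATOMIMIN share the same formula and differ only in how the bits are interpreted for the comparison. The Python sketch below illustrates this for one component, assuming 32-bit words (an assumption not stated in this section) and using hypothetical helper names; atomicity is not modelled.

```python
def to_signed32(word):
    """Reinterpret an unsigned 32-bit word as a two's-complement signed integer."""
    return word - (1 << 32) if word & 0x80000000 else word

def umin_store(old, src):
    """Value stored by ATOMUMIN: minimum under unsigned comparison."""
    return old if old < src else src

def imin_store(old, src):
    """Value stored by ATOMIMIN: minimum under signed comparison."""
    return old if to_signed32(old) < to_signed32(src) else src

# 0xFFFFFFFF is the largest unsigned value but -1 when read as signed.
unsigned_result = umin_store(0xFFFFFFFF, 1)
signed_result = imin_store(0xFFFFFFFF, 1)
```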
1733 .. opcode:: ATOMIMAX - Atomic signed maximum
1735 Syntax: ``ATOMIMAX dst, resource, offset, src``
1737 Example: ``ATOMIMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``
1739 The following operation is performed atomically on each component:
1743 dst_i = resource[offset]_i
1745 resource[offset]_i = (dst_i > src_i ? dst_i : src_i)
1749 Explanation of symbols used
1750 ------------------------------
1757 :math:`|x|` Absolute value of `x`.
1759 :math:`\lceil x \rceil` Ceiling of `x`.
1761 clamp(x,y,z) Clamp x between y and z.
1762 (x < y) ? y : (x > z) ? z : x
1764 :math:`\lfloor x\rfloor` Floor of `x`.
1766 :math:`\log_2{x}` Logarithm of `x`, base 2.
1768 max(x,y) Maximum of x and y.
1771 min(x,y) Minimum of x and y.
1774 partialx(x) Derivative of x relative to fragment's X.
1776 partialy(x) Derivative of x relative to fragment's Y.
1778 pop() Pop from stack.
1780 :math:`x^y` `x` to the power `y`.
1782 push(x) Push x on stack.
1786 trunc(x) Truncate x, i.e. drop the fraction bits.
1793 discard Discard fragment.
1797 target Label of target instruction.
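
A few of the numeric helpers listed above can be written out directly. This
is an illustrative Python sketch only; the function names mirror the symbols
in the list, and the use of Python floats is an assumption.

```python
import math


def clamp(x, y, z):
    """clamp(x,y,z): (x < y) ? y : (x > z) ? z : x."""
    return y if x < y else (z if x > z else x)


def trunc(x):
    """Drop the fractional part, which rounds toward zero."""
    return float(math.trunc(x))


def floor(x):
    """Largest integer not greater than x."""
    return float(math.floor(x))
```

Note that ``trunc`` and ``floor`` differ for negative inputs:
``trunc(-1.5)`` yields ``-1.0`` while ``floor(-1.5)`` yields ``-2.0``.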
1808 Declares a register that will be referenced as an operand in Instruction
1811 File field contains register file that is being declared and is one
1814 UsageMask field specifies which of the register components can be accessed
1815 and is one of TGSI_WRITEMASK.
1817 The Local flag specifies that a given value isn't intended for
1818 subroutine parameter passing and, as a result, the implementation
1819 isn't required to give any guarantees of it being preserved across
1820 subroutine boundaries. As it's merely a compiler hint, the
1821 implementation is free to ignore it.
1823 If Dimension flag is set to 1, a Declaration Dimension token follows.
1825 If Semantic flag is set to 1, a Declaration Semantic token follows.
1827 If Interpolate flag is set to 1, a Declaration Interpolate token follows.
1829 If file is TGSI_FILE_RESOURCE, a Declaration Resource token follows.
1831 If Array flag is set to 1, a Declaration Array token follows.
1834 ^^^^^^^^^^^^^^^^^^^^^^^^
1836 Declarations can optionally have an ArrayID attribute which can be referred to by
1837 indirect addressing operands. An ArrayID of zero is reserved and treated as
1838 if no ArrayID is specified.
1840 If an indirect addressing operand refers to a specific declaration by using
1841 an ArrayID, only the registers in this declaration are guaranteed to be
1842 accessed; accessing any register outside this declaration results in undefined
1843 behavior. Note that for compatibility the effective index is zero-based and
1844 not relative to the specified declaration.
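
The zero-based indexing rule can be illustrated with a small Python sketch.
It assumes an indirect operand is formed from a constant base index plus the
value of an address register, and that the declaration covers the inclusive
range ``decl_first..decl_last``. The function ``resolve_indirect`` and its
parameters are hypothetical names used only for illustration.

```python
def resolve_indirect(base_index, addr_value, decl_first, decl_last):
    """Return the effective register index, or None if it falls outside
    the declaration named by the ArrayID.

    The effective index is zero-based within the register file; it is
    not an offset from decl_first.
    """
    effective = base_index + addr_value
    if decl_first <= effective <= decl_last:
        return effective
    # Outside the declared range: undefined behavior in TGSI.
    # Returning None is just this sketch's way of flagging it.
    return None
```

For a declaration covering registers 4 through 7,
``resolve_indirect(4, 2, 4, 7)`` selects register 6, while
``resolve_indirect(0, 2, 4, 7)`` is out of range because index 2 is not
reinterpreted relative to the start of the declaration.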
1846 If no ArrayID is specified with an indirect addressing operand the whole
1847 register file might be accessed by this operand. This is strongly discouraged
1848 and will prevent packing of scalar/vec2 arrays and effective alias analysis.
1850 Declaration Semantic
1851 ^^^^^^^^^^^^^^^^^^^^^^^^
1853 Vertex and fragment shader input and output registers may be labeled
1854 with semantic information consisting of a name and index.
1856 Follows Declaration token if Semantic bit is set.
1858 Since its purpose is to link a shader with other stages of the pipeline,
1859 it is valid to follow only those Declaration tokens that declare a register
1860 either in INPUT or OUTPUT file.
1862 SemanticName field contains the semantic name of the register being declared.
1863 There is no default value.
1865 SemanticIndex is an optional subscript that can be used to distinguish
1866 different register declarations with the same semantic name. The default value
1869 The meanings of the individual semantic names are explained in the following
1872 TGSI_SEMANTIC_POSITION
1873 """"""""""""""""""""""
1875 For vertex shaders, TGSI_SEMANTIC_POSITION indicates the vertex shader
1876 output register which contains the homogeneous vertex position in the clip
1877 space coordinate system. After clipping, the X, Y and Z components of the
1878 vertex will be divided by the W value to get normalized device coordinates.
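
The division by W described above can be sketched in Python as follows.
Clipping itself is not modeled, and ``clip_to_ndc`` is a hypothetical helper
name.

```python
def clip_to_ndc(x, y, z, w):
    """Divide the clip-space X, Y and Z components by W to obtain
    normalized device coordinates. Assumes w is non-zero, which holds
    for vertices that survive clipping."""
    return (x / w, y / w, z / w)


ndc = clip_to_ndc(2.0, -4.0, 1.0, 4.0)
```

Here ``ndc`` is ``(0.5, -1.0, 0.25)``.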
1880 For fragment shaders, TGSI_SEMANTIC_POSITION is used to indicate that
1881 fragment shader input contains the fragment's window position. The X
1882 component starts at zero and always increases from left to right.
1883 The Y component starts at zero and always increases but Y=0 may either
1884 indicate the top of the window or the bottom depending on the fragment
1885 coordinate origin convention (see TGSI_PROPERTY_FS_COORD_ORIGIN).
1886 The Z coordinate ranges from 0 to 1 to represent depth from the front
1887 to the back of the Z buffer. The W component contains the reciprocal
1888 of the interpolated vertex position W component.
1890 Fragment shaders may also declare an output register with
1891 TGSI_SEMANTIC_POSITION. Only the Z component is writable. This allows
1892 the fragment shader to change the fragment's Z position.
1899 For vertex shader outputs or fragment shader inputs/outputs, this
1900 label indicates that the register contains an R,G,B,A color.
1902 Several shader inputs/outputs may contain colors so the semantic index
1903 is used to distinguish them. For example, color[0] may be the diffuse
1904 color while color[1] may be the specular color.
1906 This label is needed so that the flat/smooth shading can be applied
1907 to the right interpolants during rasterization.
1911 TGSI_SEMANTIC_BCOLOR
1912 """"""""""""""""""""
1914 Back-facing colors are only used for back-facing polygons, and are only valid
1915 in vertex shader outputs. After rasterization, all polygons are front-facing
1916 and COLOR and BCOLOR end up occupying the same slots in the fragment shader,
1917 so all BCOLORs effectively become regular COLORs in the fragment shader.
1923 Vertex shader inputs and outputs and fragment shader inputs may be
1924 labeled with TGSI_SEMANTIC_FOG to indicate that the register contains
1925 a fog coordinate in the form (F, 0, 0, 1). Typically, the fragment
1926 shader will use the fog coordinate to compute a fog blend factor which
1927 is used to blend the normal fragment color with a constant fog color.
1929 Only the first component matters when writing from the vertex shader;
1930 the driver will ensure that the coordinate is in this format when used
1931 as a fragment shader input.
1937 Vertex shader input and output registers may be labeled with
1938 TGSI_SEMANTIC_PSIZE to indicate that the register contains a point size
1939 in the form (S, 0, 0, 1). The point size controls the width or diameter
1940 of points for rasterization. This label cannot be used in fragment
1943 When using this semantic, be sure to set the appropriate state in the
1944 :ref:`rasterizer` first.
1947 TGSI_SEMANTIC_TEXCOORD
1948 """"""""""""""""""""""
1950 Only available if PIPE_CAP_TGSI_TEXCOORD is exposed.
1952 Vertex shader outputs and fragment shader inputs may be labeled with
1953 this semantic to make them replaceable by sprite coordinates via the
1954 sprite_coord_enable state in the :ref:`rasterizer`.
1955 The semantic index permitted with this semantic is limited to <= 7.
1957 If the driver does not support TEXCOORD, sprite coordinate replacement
1958 applies to inputs with the GENERIC semantic instead.
1960 The intended use case for this semantic is gl_TexCoord.
1963 TGSI_SEMANTIC_PCOORD
1964 """"""""""""""""""""
1966 Only available if PIPE_CAP_TGSI_TEXCOORD is exposed.
1968 Fragment shader inputs may be labeled with TGSI_SEMANTIC_PCOORD to indicate
1969 that the register contains sprite coordinates in the form (x, y, 0, 1), if
1970 the current primitive is a point and point sprites are enabled. Otherwise,
1971 the contents of the register are undefined.
1973 The intended use case for this semantic is gl_PointCoord.
1976 TGSI_SEMANTIC_GENERIC
1977 """""""""""""""""""""
1979 All vertex/fragment shader inputs/outputs not labeled with any other
1980 semantic label can be considered to be generic attributes. Typical
1981 uses of generic inputs/outputs are texcoords and user-defined values.
1984 TGSI_SEMANTIC_NORMAL
1985 """"""""""""""""""""
1987 Indicates that a vertex shader input is a normal vector. This is
1988 typically only used for legacy graphics APIs.
1994 This label applies to fragment shader inputs only and indicates that
1995 the register contains front/back-face information of the form (F, 0,
1996 0, 1). The first component will be positive when the fragment belongs
1997 to a front-facing polygon, and negative when the fragment belongs to a
1998 back-facing polygon.
2001 TGSI_SEMANTIC_EDGEFLAG
2002 """"""""""""""""""""""
2004 For vertex shaders, this semantic label indicates that an input or
2005 output is a boolean edge flag. The register layout is [F, x, x, x]
2006 where F is 0.0 or 1.0 and x = don't care. Normally, the vertex shader
2007 simply copies the edge flag input to the edgeflag output.
2009 Edge flags are used to control which lines or points are actually
2010 drawn when the polygon mode converts triangles/quads/polygons into
2013 TGSI_SEMANTIC_STENCIL
2014 """"""""""""""""""""""
2016 For fragment shaders, this semantic label indicates that an output
2017 is a writable stencil reference value. Only the Y component is writable.
2018 This allows the fragment shader to change the fragment's stencilref value.
2021 Declaration Interpolate
2022 ^^^^^^^^^^^^^^^^^^^^^^^
2024 This token is only valid for fragment shader INPUT declarations.
2026 The Interpolate field specifies the way input is being interpolated by
2027 the rasteriser and is one of TGSI_INTERPOLATE_*.
2029 The CylindricalWrap bitfield specifies which register components
2030 should be subject to cylindrical wrapping when interpolating by the
2031 rasteriser. If TGSI_CYLINDRICAL_WRAP_X is set to 1, the X component
2032 should be interpolated according to cylindrical wrapping rules.
2035 Declaration Sampler View
2036 ^^^^^^^^^^^^^^^^^^^^^^^^
2038 Follows Declaration token if file is TGSI_FILE_SAMPLER_VIEW.
2040 DCL SVIEW[#], resource, type(s)
2042 Declares a shader input sampler view and assigns it to a SVIEW[#]
2045 resource can be one of BUFFER, 1D, 2D, 3D, 1DArray and 2DArray.
2047 type must be 1 or 4 entries (if specifying on a per-component
2048 level) out of UNORM, SNORM, SINT, UINT and FLOAT.
2051 Declaration Resource
2052 ^^^^^^^^^^^^^^^^^^^^
2054 Follows Declaration token if file is TGSI_FILE_RESOURCE.
2056 DCL RES[#], resource [, WR] [, RAW]
2058 Declares a shader input resource and assigns it to a RES[#]
2061 resource can be one of BUFFER, 1D, 2D, 3D, CUBE, 1DArray and
2064 If the RAW keyword is not specified, the texture data will be
2065 subject to conversion, swizzling and scaling as required to yield
2066 the specified data type from the physical data format of the bound
2069 If the RAW keyword is specified, no channel conversion will be
2070 performed: the values read for each of the channels (X,Y,Z,W) will
2071 correspond to consecutive words in the same order and format
2072 they're found in memory. No element-to-address conversion will be
2073 performed either: the value of the provided X coordinate will be
2074 interpreted in byte units instead of texel units. The result of
2075 accessing a misaligned address is undefined.
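
The RAW addressing rule can be illustrated with a short Python sketch. It
assumes 32-bit little-endian words and a 4-byte alignment requirement; none
of these are stated in this section. ``raw_load`` is a hypothetical helper,
not a Gallium function.

```python
import struct


def raw_load(memory, byte_offset):
    """Read four consecutive words (X, Y, Z, W) starting at byte_offset.

    The offset is in bytes, not texels, and no format conversion is
    applied to the values that are read.
    """
    if byte_offset % 4 != 0:
        # Misaligned access is undefined in TGSI; the sketch rejects it.
        raise ValueError("misaligned RAW access")
    return struct.unpack_from("<4I", memory, byte_offset)


# Eight words holding the values 10..17.
memory = struct.pack("<8I", *range(10, 18))
xyzw = raw_load(memory, 8)
```

A byte offset of 8 skips two words, so ``xyzw`` is ``(12, 13, 14, 15)``.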
2077 Usage of the STORE opcode is only allowed if the WR (writable) flag
2082 ^^^^^^^^^^^^^^^^^^^^^^^^
2085 Properties are general directives that apply to the whole TGSI program.
2090 Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
2091 The default value is UPPER_LEFT.
2093 If UPPER_LEFT, the position will be (0,0) at the upper left corner and
2094 increase downward and rightward.
2095 If LOWER_LEFT, the position will be (0,0) at the lower left corner and
2096 increase upward and rightward.
2098 OpenGL defaults to LOWER_LEFT, and is configurable with the
2099 GL_ARB_fragment_coord_conventions extension.
2101 DirectX 9/10 use UPPER_LEFT.
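
Converting a window-space Y coordinate between the two origin conventions is
a reflection about the window height. A minimal Python sketch, assuming a
known window height in pixels; ``flip_origin`` is a hypothetical helper name.

```python
def flip_origin(y, window_height):
    """Convert a Y coordinate from UPPER_LEFT to LOWER_LEFT or back.

    The same formula works in both directions because reflecting twice
    returns the original coordinate.
    """
    return window_height - y
```

For a 480-pixel-high window, a fragment at ``y = 0.5`` under UPPER_LEFT is at
``y = 479.5`` under LOWER_LEFT.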
2103 FS_COORD_PIXEL_CENTER
2104 """""""""""""""""""""
2106 Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
2107 The default value is HALF_INTEGER.
2109 If HALF_INTEGER, the fractional part of the position will be 0.5.
2110 If INTEGER, the fractional part of the position will be 0.0.
2112 Note that this does not affect the set of fragments generated by
2113 rasterization, which is instead controlled by gl_rasterization_rules in the
2116 OpenGL defaults to HALF_INTEGER, and is configurable with the
2117 GL_ARB_fragment_coord_conventions extension.
2119 DirectX 9 uses INTEGER.
2120 DirectX 10 uses HALF_INTEGER.
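
The effect of the pixel center convention on the reported position can be
sketched as follows. This Python helper is illustrative only; the name
``fragment_position`` and its parameters are hypothetical, and ``column`` and
``row`` are assumed to be integer pixel indices.

```python
def fragment_position(column, row, half_integer=True):
    """Return the (x, y) position reported for the pixel at (column, row).

    half_integer=True models HALF_INTEGER, False models INTEGER.
    """
    offset = 0.5 if half_integer else 0.0
    return (column + offset, row + offset)
```

Pixel (3, 7) is reported at ``(3.5, 7.5)`` under HALF_INTEGER and at
``(3.0, 7.0)`` under INTEGER.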
2122 FS_COLOR0_WRITES_ALL_CBUFS
2123 """"""""""""""""""""""""""
2124 Specifies that writes to the fragment shader color 0 are replicated to all
2125 bound cbufs. This facilitates OpenGL's fragColor output vs fragData[0] where
2126 fragData is directed to a single color buffer, but fragColor is broadcast.
2129 """"""""""""""""""""""""""
2130 If this property is set on the program bound to the shader stage before the
2131 fragment shader, user clip planes should have no effect (be disabled) even if
2132 that shader does not write to any clip distance outputs and the rasterizer's
2133 clip_plane_enable is non-zero.
2134 This property is only supported by drivers that also support shader clip
2136 This is useful for APIs that don't have UCPs and where clip distances written
2137 by a shader cannot be disabled.
2140 Texture Sampling and Texture Formats
2141 ------------------------------------
2143 This table shows how texture image components are returned as (x,y,z,w) tuples
2144 by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
2145 :opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
2148 +--------------------+--------------+--------------------+--------------+
2149 | Texture Components | Gallium | OpenGL | Direct3D 9 |
2150 +====================+==============+====================+==============+
2151 | R | (r, 0, 0, 1) | (r, 0, 0, 1) | (r, 1, 1, 1) |
2152 +--------------------+--------------+--------------------+--------------+
2153 | RG | (r, g, 0, 1) | (r, g, 0, 1) | (r, g, 1, 1) |
2154 +--------------------+--------------+--------------------+--------------+
2155 | RGB | (r, g, b, 1) | (r, g, b, 1) | (r, g, b, 1) |
2156 +--------------------+--------------+--------------------+--------------+
2157 | RGBA | (r, g, b, a) | (r, g, b, a) | (r, g, b, a) |
2158 +--------------------+--------------+--------------------+--------------+
2159 | A | (0, 0, 0, a) | (0, 0, 0, a) | (0, 0, 0, a) |
2160 +--------------------+--------------+--------------------+--------------+
2161 | L | (l, l, l, 1) | (l, l, l, 1) | (l, l, l, 1) |
2162 +--------------------+--------------+--------------------+--------------+
2163 | LA | (l, l, l, a) | (l, l, l, a) | (l, l, l, a) |
2164 +--------------------+--------------+--------------------+--------------+
2165 | I | (i, i, i, i) | (i, i, i, i) | N/A |
2166 +--------------------+--------------+--------------------+--------------+
2167 | UV | XXX TBD | (0, 0, 0, 1) | (u, v, 1, 1) |
2168 | | | [#envmap-bumpmap]_ | |
2169 +--------------------+--------------+--------------------+--------------+
2170 | Z | XXX TBD | (z, z, z, 1) | (0, z, 0, 1) |
2171 | | | [#depth-tex-mode]_ | |
2172 +--------------------+--------------+--------------------+--------------+
2173 | S | (s, s, s, s) | unknown | unknown |
2174 +--------------------+--------------+--------------------+--------------+
2176 .. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
2177 .. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
2178 or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
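
The Gallium column of the table can be expressed as a lookup that expands a
texel into an (x, y, z, w) tuple. This Python sketch is for illustration
only; ``GALLIUM_SWIZZLE`` and ``sample_result`` are hypothetical names, and
the UV and Z rows are omitted because the table lists them as TBD.

```python
# Each entry is a 4-tuple: a string names a texel component to copy,
# a float is a constant that fills the slot.
GALLIUM_SWIZZLE = {
    "R":    ("r", 0.0, 0.0, 1.0),
    "RG":   ("r", "g", 0.0, 1.0),
    "RGB":  ("r", "g", "b", 1.0),
    "RGBA": ("r", "g", "b", "a"),
    "A":    (0.0, 0.0, 0.0, "a"),
    "L":    ("l", "l", "l", 1.0),
    "LA":   ("l", "l", "l", "a"),
    "I":    ("i", "i", "i", "i"),
    "S":    ("s", "s", "s", "s"),
}


def sample_result(components, texel):
    """Expand a texel (dict of component name to value) into (x, y, z, w)."""
    template = GALLIUM_SWIZZLE[components]
    return tuple(texel[c] if isinstance(c, str) else c for c in template)
```

For example, ``sample_result("LA", {"l": 0.25, "a": 0.5})`` gives
``(0.25, 0.25, 0.25, 0.5)``.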