4 TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
5 for describing shaders. Since Gallium is inherently shaderful, shaders are
6 an important part of the API. TGSI is the only intermediate representation
12 All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
13 floating-point four-component vectors. An opcode may have up to one
14 destination register, known as *dst*, and between zero and three source
registers, called *src0* through *src2*, or simply *src* if there is only one.
18 Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
19 components as integers. Other instructions permit using registers as
20 two-component vectors with double precision; see :ref:`Double Opcodes`.
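As a rough illustration of re-interpretation, the Python sketch below reads the same four components either as floats or as integers. It assumes each component is stored as a 32-bit IEEE 754 value, which this section does not state, and the helper names are hypothetical.

```python
import struct

def float_bits_as_ints(vec):
    """View four float32 components as four signed 32-bit integers (same bits)."""
    return struct.unpack("<4i", struct.pack("<4f", *vec))

def i2f(int_vec):
    """I2F as documented: dst.c = (float) src.c, a value conversion."""
    return tuple(float(c) for c in int_vec)

reg = (1.0, 0.0, -2.0, 0.5)
bits = float_bits_as_ints(reg)   # same storage, read as integers
converted = i2f((1, 0, -2, 7))   # integer values converted to floats
```

The first helper leaves the bits unchanged and only changes how they are read, while ``i2f`` changes the stored value.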
25 From GL_NV_vertex_program
26 ^^^^^^^^^^^^^^^^^^^^^^^^^
29 .. opcode:: ARL - Address Register Load
33 dst.x = \lfloor src.x\rfloor
35 dst.y = \lfloor src.y\rfloor
37 dst.z = \lfloor src.z\rfloor
39 dst.w = \lfloor src.w\rfloor
42 .. opcode:: MOV - Move
55 .. opcode:: LIT - Light Coefficients
dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0
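The LIT computation can be sketched in Python as follows. Only the ``dst.z`` formula appears in this section; the values used for ``dst.x``, ``dst.y`` and ``dst.w`` follow the customary definition of LIT and should be read as assumptions, as should the handling of a zero base with a negative exponent.

```python
import math

def clamp(x, lo, hi):
    return lo if x < lo else hi if x > hi else x

def lit(src):
    x, y, _, w = src
    base = max(y, 0.0)
    exponent = clamp(w, -128.0, 128.0)
    if x > 0:
        # Python raises on 0.0 ** negative; treat that case as +infinity here
        # (an assumption, the text does not define it).
        z = math.inf if (base == 0.0 and exponent < 0) else base ** exponent
    else:
        z = 0.0
    # dst.x, dst.y and dst.w are assumed from the usual LIT definition.
    return (1.0, max(x, 0.0), z, 1.0)
```

For example, ``lit((0.5, 0.25, 0.0, 2.0))`` squares the specular term because ``src.w`` is 2.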
68 .. opcode:: RCP - Reciprocal
72 dst.x = \frac{1}{src.x}
74 dst.y = \frac{1}{src.x}
76 dst.z = \frac{1}{src.x}
78 dst.w = \frac{1}{src.x}
81 .. opcode:: RSQ - Reciprocal Square Root
85 dst.x = \frac{1}{\sqrt{|src.x|}}
87 dst.y = \frac{1}{\sqrt{|src.x|}}
89 dst.z = \frac{1}{\sqrt{|src.x|}}
91 dst.w = \frac{1}{\sqrt{|src.x|}}
94 .. opcode:: EXP - Approximate Exponential Base 2
98 dst.x = 2^{\lfloor src.x\rfloor}
100 dst.y = src.x - \lfloor src.x\rfloor
107 .. opcode:: LOG - Approximate Logarithm Base 2
111 dst.x = \lfloor\log_2{|src.x|}\rfloor
113 dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
115 dst.z = \log_2{|src.x|}
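The components of EXP and LOG listed above split a value into an integer part and a remainder. The sketch below covers only those listed components, uses hypothetical function names, and does not handle a zero input to LOG.

```python
import math

def exp_xy(x):
    """EXP: dst.x = 2^floor(x), dst.y = x - floor(x)."""
    f = math.floor(x)
    return (2.0 ** f, x - f)

def log_xyz(x):
    """LOG: exponent, mantissa and full base-2 logarithm of |x| (x != 0)."""
    a = abs(x)
    e = math.floor(math.log2(a))
    return (float(e), a / 2.0 ** e, math.log2(a))
```

For LOG, the first two results recombine as ``dst.y * 2 ** dst.x == |src.x|``; for example, 12 yields an exponent of 3 and a mantissa of 1.5.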
120 .. opcode:: MUL - Multiply
124 dst.x = src0.x \times src1.x
126 dst.y = src0.y \times src1.y
128 dst.z = src0.z \times src1.z
130 dst.w = src0.w \times src1.w
133 .. opcode:: ADD - Add
137 dst.x = src0.x + src1.x
139 dst.y = src0.y + src1.y
141 dst.z = src0.z + src1.z
143 dst.w = src0.w + src1.w
146 .. opcode:: DP3 - 3-component Dot Product
150 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
152 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
154 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
156 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
159 .. opcode:: DP4 - 4-component Dot Product
163 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
165 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
167 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
169 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
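Both dot products write the same scalar to every destination component. A minimal Python sketch, with registers modelled as plain 4-tuples (a modelling assumption):

```python
def dp3(src0, src1):
    d = src0[0] * src1[0] + src0[1] * src1[1] + src0[2] * src1[2]
    return (d, d, d, d)  # scalar result replicated to x, y, z, w

def dp4(src0, src1):
    d = sum(a * b for a, b in zip(src0, src1))
    return (d, d, d, d)
```

DP3 ignores the ``w`` components of both sources, while DP4 includes them.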
172 .. opcode:: DST - Distance Vector
178 dst.y = src0.y \times src1.y
185 .. opcode:: MIN - Minimum
189 dst.x = min(src0.x, src1.x)
191 dst.y = min(src0.y, src1.y)
193 dst.z = min(src0.z, src1.z)
195 dst.w = min(src0.w, src1.w)
198 .. opcode:: MAX - Maximum
202 dst.x = max(src0.x, src1.x)
204 dst.y = max(src0.y, src1.y)
206 dst.z = max(src0.z, src1.z)
208 dst.w = max(src0.w, src1.w)
211 .. opcode:: SLT - Set On Less Than
215 dst.x = (src0.x < src1.x) ? 1 : 0
217 dst.y = (src0.y < src1.y) ? 1 : 0
219 dst.z = (src0.z < src1.z) ? 1 : 0
221 dst.w = (src0.w < src1.w) ? 1 : 0
224 .. opcode:: SGE - Set On Greater Equal Than
228 dst.x = (src0.x >= src1.x) ? 1 : 0
230 dst.y = (src0.y >= src1.y) ? 1 : 0
232 dst.z = (src0.z >= src1.z) ? 1 : 0
234 dst.w = (src0.w >= src1.w) ? 1 : 0
237 .. opcode:: MAD - Multiply And Add
241 dst.x = src0.x \times src1.x + src2.x
243 dst.y = src0.y \times src1.y + src2.y
245 dst.z = src0.z \times src1.z + src2.z
247 dst.w = src0.w \times src1.w + src2.w
250 .. opcode:: SUB - Subtract
254 dst.x = src0.x - src1.x
256 dst.y = src0.y - src1.y
258 dst.z = src0.z - src1.z
260 dst.w = src0.w - src1.w
263 .. opcode:: LRP - Linear Interpolate
267 dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x
269 dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y
271 dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z
273 dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
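In LRP the interpolation weight is ``src0``, not the last operand. The following Python sketch mirrors the formula, with registers modelled as 4-tuples:

```python
def lrp(src0, src1, src2):
    """Per component: weight * src1 + (1 - weight) * src2, with weight = src0."""
    return tuple(w * a + (1 - w) * b for w, a, b in zip(src0, src1, src2))
```

A weight of 1 selects ``src1`` and a weight of 0 selects ``src2``. This operand order differs from GLSL's ``mix(x, y, a)``, where the weight comes last.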
276 .. opcode:: CND - Condition
280 dst.x = (src2.x > 0.5) ? src0.x : src1.x
282 dst.y = (src2.y > 0.5) ? src0.y : src1.y
284 dst.z = (src2.z > 0.5) ? src0.z : src1.z
286 dst.w = (src2.w > 0.5) ? src0.w : src1.w
289 .. opcode:: DP2A - 2-component Dot Product And Add
293 dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x
295 dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x
297 dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x
299 dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
302 .. opcode:: FRAC - Fraction
306 dst.x = src.x - \lfloor src.x\rfloor
308 dst.y = src.y - \lfloor src.y\rfloor
310 dst.z = src.z - \lfloor src.z\rfloor
312 dst.w = src.w - \lfloor src.w\rfloor
315 .. opcode:: CLAMP - Clamp
319 dst.x = clamp(src0.x, src1.x, src2.x)
321 dst.y = clamp(src0.y, src1.y, src2.y)
323 dst.z = clamp(src0.z, src1.z, src2.z)
325 dst.w = clamp(src0.w, src1.w, src2.w)
328 .. opcode:: FLR - Floor
330 This is identical to ARL.
334 dst.x = \lfloor src.x\rfloor
336 dst.y = \lfloor src.y\rfloor
338 dst.z = \lfloor src.z\rfloor
340 dst.w = \lfloor src.w\rfloor
343 .. opcode:: ROUND - Round
356 .. opcode:: EX2 - Exponential Base 2
369 .. opcode:: LG2 - Logarithm Base 2
373 dst.x = \log_2{src.x}
375 dst.y = \log_2{src.x}
377 dst.z = \log_2{src.x}
379 dst.w = \log_2{src.x}
382 .. opcode:: POW - Power
386 dst.x = src0.x^{src1.x}
388 dst.y = src0.x^{src1.x}
390 dst.z = src0.x^{src1.x}
392 dst.w = src0.x^{src1.x}
394 .. opcode:: XPD - Cross Product
398 dst.x = src0.y \times src1.z - src1.y \times src0.z
400 dst.y = src0.z \times src1.x - src1.z \times src0.x
402 dst.z = src0.x \times src1.y - src1.x \times src0.y
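The three components listed for XPD form the usual right-handed cross product. A short Python sketch of just those components (``dst.w`` is not covered here):

```python
def xpd(src0, src1):
    return (
        src0[1] * src1[2] - src1[1] * src0[2],
        src0[2] * src1[0] - src1[2] * src0[0],
        src0[0] * src1[1] - src1[0] * src0[1],
    )
```

Crossing the X axis with the Y axis yields the Z axis.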
407 .. opcode:: ABS - Absolute
420 .. opcode:: RCC - Reciprocal Clamped
422 XXX cleanup on aisle three
426 dst.x = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
428 dst.y = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
430 dst.z = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
432 dst.w = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
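RCC clamps the magnitude of the reciprocal while preserving its sign. The Python sketch below computes the single scalar that is replicated to all four components. Treating a zero input as a signed infinity before clamping is an assumption, since Python raises an exception on division by zero.

```python
import math

RCC_MIN = 5.42101e-20   # smallest magnitude, taken from the formula above
RCC_MAX = 1.884467e+19  # largest magnitude, taken from the formula above

def clamp(x, lo, hi):
    return lo if x < lo else hi if x > hi else x

def rcc(x):
    # Assumed behaviour: 1/0 is +inf or -inf according to the sign of zero.
    r = math.copysign(math.inf, x) if x == 0 else 1.0 / x
    if r > 0:
        return clamp(r, RCC_MIN, RCC_MAX)
    return clamp(r, -RCC_MAX, -RCC_MIN)
```

Under these assumptions the result is never zero and never infinite.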
435 .. opcode:: DPH - Homogeneous Dot Product
439 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
441 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
443 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
445 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
448 .. opcode:: COS - Cosine
461 .. opcode:: DDX - Derivative Relative To X
465 dst.x = partialx(src.x)
467 dst.y = partialx(src.y)
469 dst.z = partialx(src.z)
471 dst.w = partialx(src.w)
474 .. opcode:: DDY - Derivative Relative To Y
478 dst.x = partialy(src.x)
480 dst.y = partialy(src.y)
482 dst.z = partialy(src.z)
484 dst.w = partialy(src.w)
487 .. opcode:: KILP - Predicated Discard
492 .. opcode:: PK2H - Pack Two 16-bit Floats
497 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
502 .. opcode:: PK4B - Pack Four Signed 8-bit Scalars
507 .. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars
512 .. opcode:: RFL - Reflection Vector
516 dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x
518 dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y
520 dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z
524 Considered for removal.
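The RFL formula reflects ``src1`` about the axis ``src0``, and it does not require ``src0`` to be normalised. A Python sketch of the three listed components (the function name is hypothetical, and a zero-length axis is not handled):

```python
def rfl(src0, src1):
    axis_dot_dir = sum(a * d for a, d in zip(src0[:3], src1[:3]))
    axis_dot_axis = sum(a * a for a in src0[:3])
    scale = 2 * axis_dot_dir / axis_dot_axis
    return tuple(scale * a - d for a, d in zip(src0[:3], src1[:3]))
```

Reflecting the direction ``(1, 1, 0)`` about the Y axis flips its X component.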
527 .. opcode:: SEQ - Set On Equal
531 dst.x = (src0.x == src1.x) ? 1 : 0
533 dst.y = (src0.y == src1.y) ? 1 : 0
535 dst.z = (src0.z == src1.z) ? 1 : 0
537 dst.w = (src0.w == src1.w) ? 1 : 0
540 .. opcode:: SFL - Set On False
552 Considered for removal.
554 .. opcode:: SGT - Set On Greater Than
558 dst.x = (src0.x > src1.x) ? 1 : 0
560 dst.y = (src0.y > src1.y) ? 1 : 0
562 dst.z = (src0.z > src1.z) ? 1 : 0
564 dst.w = (src0.w > src1.w) ? 1 : 0
567 .. opcode:: SIN - Sine
580 .. opcode:: SLE - Set On Less Equal Than
584 dst.x = (src0.x <= src1.x) ? 1 : 0
586 dst.y = (src0.y <= src1.y) ? 1 : 0
588 dst.z = (src0.z <= src1.z) ? 1 : 0
590 dst.w = (src0.w <= src1.w) ? 1 : 0
593 .. opcode:: SNE - Set On Not Equal
597 dst.x = (src0.x != src1.x) ? 1 : 0
599 dst.y = (src0.y != src1.y) ? 1 : 0
601 dst.z = (src0.z != src1.z) ? 1 : 0
603 dst.w = (src0.w != src1.w) ? 1 : 0
606 .. opcode:: STR - Set On True
619 .. opcode:: TEX - Texture Lookup
624 .. opcode:: TXD - Texture Lookup with Derivatives
629 .. opcode:: TXP - Projective Texture Lookup
634 .. opcode:: UP2H - Unpack Two 16-Bit Floats
638 Considered for removal.
640 .. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
644 Considered for removal.
646 .. opcode:: UP4B - Unpack Four Signed 8-Bit Values
650 Considered for removal.
652 .. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
656 Considered for removal.
658 .. opcode:: X2D - 2D Coordinate Transformation
662 dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
664 dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
666 dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
668 dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
670 Considered for removal.
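X2D can be read as a 2x2 matrix, stored in ``src2`` as the rows ``(x, y)`` and ``(z, w)``, applied to ``src1.xy`` and then offset by ``src0.xy``. That reading is an interpretation of the formulas above. A Python sketch:

```python
def x2d(src0, src1, src2):
    x = src0[0] + src1[0] * src2[0] + src1[1] * src2[1]
    y = src0[1] + src1[0] * src2[2] + src1[1] * src2[3]
    return (x, y, x, y)  # the xy pair is repeated in zw
```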
673 From GL_NV_vertex_program2
674 ^^^^^^^^^^^^^^^^^^^^^^^^^^
677 .. opcode:: ARA - Address Register Add
681 Considered for removal.
683 .. opcode:: ARR - Address Register Load With Round
696 .. opcode:: BRA - Branch
700 Considered for removal.
702 .. opcode:: CAL - Subroutine Call
708 .. opcode:: RET - Subroutine Call Return
712 Potential restrictions:
713 * Only occurs at end of function.
715 .. opcode:: SSG - Set Sign
719 dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
721 dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
723 dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
725 dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
728 .. opcode:: CMP - Compare
732 dst.x = (src0.x < 0) ? src1.x : src2.x
734 dst.y = (src0.y < 0) ? src1.y : src2.y
736 dst.z = (src0.z < 0) ? src1.z : src2.z
738 dst.w = (src0.w < 0) ? src1.w : src2.w
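CND and CMP are both per-component selects, but they test different operands against different thresholds. The Python sketch below places them side by side, with registers modelled as 4-tuples:

```python
def cnd(src0, src1, src2):
    """CND: pick src0 where src2 > 0.5, otherwise src1."""
    return tuple(a if c > 0.5 else b for a, b, c in zip(src0, src1, src2))

def cmp(src0, src1, src2):
    """CMP: pick src1 where src0 < 0, otherwise src2."""
    return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))
```

A condition value of exactly 0.5 (for CND) or 0 (for CMP) selects the second alternative, because both tests are strict.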
741 .. opcode:: KIL - Conditional Discard
745 if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
750 .. opcode:: SCS - Sine Cosine
763 .. opcode:: TXB - Texture Lookup With Bias
768 .. opcode:: NRM - 3-component Vector Normalise
dst.x = src.x / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
dst.y = src.y / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
dst.z = src.z / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
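A unit-length result requires dividing each component by the square root of the sum of squares. The Python sketch below does that, on the assumption that NRM is meant to produce a unit vector. It does not handle the zero vector.

```python
import math

def nrm(src):
    x, y, z = src[:3]
    length = math.sqrt(x * x + y * y + z * z)  # raises ZeroDivisionError below if 0
    return (x / length, y / length, z / length)
```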
781 .. opcode:: DIV - Divide
785 dst.x = \frac{src0.x}{src1.x}
787 dst.y = \frac{src0.y}{src1.y}
789 dst.z = \frac{src0.z}{src1.z}
791 dst.w = \frac{src0.w}{src1.w}
794 .. opcode:: DP2 - 2-component Dot Product
798 dst.x = src0.x \times src1.x + src0.y \times src1.y
800 dst.y = src0.x \times src1.x + src0.y \times src1.y
802 dst.z = src0.x \times src1.x + src0.y \times src1.y
804 dst.w = src0.x \times src1.x + src0.y \times src1.y
807 .. opcode:: TXL - Texture Lookup With LOD
812 .. opcode:: BRK - Break
822 .. opcode:: BGNFOR - Begin a For-Loop
829 pc = [matching ENDFOR] + 1
832 Note: The destination must be a loop register.
833 The source must be a constant register.
835 Considered for cleanup / removal.
838 .. opcode:: REP - Repeat
843 .. opcode:: ELSE - Else
848 .. opcode:: ENDIF - End If
853 .. opcode:: ENDFOR - End a For-Loop
855 dst.x = dst.x + dst.z
859 pc = [matching BGNFOR instruction] + 1
862 Note: The destination must be a loop register.
864 Considered for cleanup / removal.
866 .. opcode:: ENDREP - End Repeat
871 .. opcode:: PUSHA - Push Address Register On Stack
878 Considered for cleanup / removal.
880 .. opcode:: POPA - Pop Address Register From Stack
887 Considered for cleanup / removal.
890 From GL_NV_gpu_program4
891 ^^^^^^^^^^^^^^^^^^^^^^^^
Support for these opcodes is indicated by a special pipe capability bit (TBD).
895 .. opcode:: CEIL - Ceiling
899 dst.x = \lceil src.x\rceil
901 dst.y = \lceil src.y\rceil
903 dst.z = \lceil src.z\rceil
905 dst.w = \lceil src.w\rceil
908 .. opcode:: I2F - Integer To Float
912 dst.x = (float) src.x
914 dst.y = (float) src.y
916 dst.z = (float) src.z
918 dst.w = (float) src.w
921 .. opcode:: NOT - Bitwise Not
934 .. opcode:: TRUNC - Truncate
947 .. opcode:: SHL - Shift Left
951 dst.x = src0.x << src1.x
953 dst.y = src0.y << src1.x
955 dst.z = src0.z << src1.x
957 dst.w = src0.w << src1.x
960 .. opcode:: SHR - Shift Right
964 dst.x = src0.x >> src1.x
966 dst.y = src0.y >> src1.x
968 dst.z = src0.z >> src1.x
970 dst.w = src0.w >> src1.x
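Unlike most opcodes in this section, SHL and SHR take the shift count for every component from ``src1.x``. The Python sketch below assumes unsigned 32-bit components and a logical right shift; the text specifies neither, nor the behaviour for counts of 32 or more.

```python
MASK32 = 0xFFFFFFFF  # assumed component width of 32 bits

def shl(src0, src1):
    count = src1[0]  # only src1.x is used
    return tuple((c << count) & MASK32 for c in src0)

def shr(src0, src1):
    count = src1[0]  # only src1.x is used
    return tuple((c & MASK32) >> count for c in src0)
```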
973 .. opcode:: AND - Bitwise And
977 dst.x = src0.x & src1.x
979 dst.y = src0.y & src1.y
981 dst.z = src0.z & src1.z
983 dst.w = src0.w & src1.w
986 .. opcode:: OR - Bitwise Or
990 dst.x = src0.x | src1.x
992 dst.y = src0.y | src1.y
994 dst.z = src0.z | src1.z
996 dst.w = src0.w | src1.w
999 .. opcode:: MOD - Modulus
1003 dst.x = src0.x \bmod src1.x
1005 dst.y = src0.y \bmod src1.y
1007 dst.z = src0.z \bmod src1.z
1009 dst.w = src0.w \bmod src1.w
1012 .. opcode:: XOR - Bitwise Xor
1016 dst.x = src0.x \oplus src1.x
1018 dst.y = src0.y \oplus src1.y
1020 dst.z = src0.z \oplus src1.z
1022 dst.w = src0.w \oplus src1.w
1025 .. opcode:: SAD - Sum Of Absolute Differences
1029 dst.x = |src0.x - src1.x| + src2.x
1031 dst.y = |src0.y - src1.y| + src2.y
1033 dst.z = |src0.z - src1.z| + src2.z
1035 dst.w = |src0.w - src1.w| + src2.w
1038 .. opcode:: TXF - Texel Fetch
1043 .. opcode:: TXQ - Texture Size Query
1048 .. opcode:: CONT - Continue
1053 From GL_NV_geometry_program4
1054 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1057 .. opcode:: EMIT - Emit
1062 .. opcode:: ENDPRIM - End Primitive
1071 .. opcode:: BGNLOOP - Begin a Loop
1076 .. opcode:: BGNSUB - Begin Subroutine
1081 .. opcode:: ENDLOOP - End a Loop
1086 .. opcode:: ENDSUB - End Subroutine
1091 .. opcode:: NOP - No Operation
1096 .. opcode:: NRM4 - 4-component Vector Normalise
dst.x = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
dst.y = \frac{src.y}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
dst.z = \frac{src.z}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
dst.w = \frac{src.w}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
1113 .. opcode:: CALLNZ - Subroutine Call If Not Zero
1118 .. opcode:: IFC - If
1123 .. opcode:: BREAKC - Break Conditional
1132 .. opcode:: DADD - Add Double
1136 dst.xy = src0.xy + src1.xy
1138 dst.zw = src0.zw + src1.zw
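The double opcodes treat a register as two 64-bit values, one in ``xy`` and one in ``zw``. The Python sketch below models this with four 32-bit words per register. The little-endian word order (low word in ``x`` and ``z``) and the helper names are assumptions.

```python
import struct

def pack_doubles(d0, d1):
    """Store two doubles as four 32-bit words: d0 in xy, d1 in zw."""
    return struct.unpack("<4I", struct.pack("<2d", d0, d1))

def unpack_doubles(reg):
    """Read the xy and zw word pairs back as two doubles."""
    return struct.unpack("<2d", struct.pack("<4I", *reg))

def dadd(src0, src1):
    a0, a1 = unpack_doubles(src0)
    b0, b1 = unpack_doubles(src1)
    return pack_doubles(a0 + b0, a1 + b1)
```

For example, ``unpack_doubles(dadd(pack_doubles(1.5, 2.0), pack_doubles(0.25, -2.0)))`` yields the two sums ``1.75`` and ``0.0``.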
1141 .. opcode:: DDIV - Divide Double
1145 dst.xy = src0.xy / src1.xy
1147 dst.zw = src0.zw / src1.zw
1149 .. opcode:: DSEQ - Set Double on Equal
1153 dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
1155 dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
.. opcode:: DSLT - Set Double on Less Than
1161 dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
1163 dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
1165 .. opcode:: DFRAC - Double Fraction
1169 dst.xy = src.xy - \lfloor src.xy\rfloor
1171 dst.zw = src.zw - \lfloor src.zw\rfloor
1174 .. opcode:: DFRACEXP - Convert Double Number to Fractional and Integral Components
1178 dst0.xy = frexp(src.xy, dst1.xy)
1180 dst0.zw = frexp(src.zw, dst1.zw)
.. opcode:: DLDEXP - Multiply Double Number by Integral Power of 2
1186 dst.xy = ldexp(src0.xy, src1.xy)
1188 dst.zw = ldexp(src0.zw, src1.zw)
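DFRACEXP and DLDEXP are written in terms of the C library functions ``frexp`` and ``ldexp``, which Python exposes in its ``math`` module. The sketch below handles a single double (one ``xy`` or ``zw`` pair), and the wrapper names are hypothetical.

```python
import math

def dfracexp(x):
    """Return (fraction, exponent) with x == fraction * 2**exponent."""
    return math.frexp(x)

def dldexp(x, exponent):
    """Return x * 2**exponent."""
    return math.ldexp(x, exponent)
```

The two operations are inverses: feeding the outputs of ``dfracexp`` into ``dldexp`` reproduces the input.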
1190 .. opcode:: DMIN - Minimum Double
1194 dst.xy = min(src0.xy, src1.xy)
1196 dst.zw = min(src0.zw, src1.zw)
1198 .. opcode:: DMAX - Maximum Double
1202 dst.xy = max(src0.xy, src1.xy)
1204 dst.zw = max(src0.zw, src1.zw)
1206 .. opcode:: DMUL - Multiply Double
1210 dst.xy = src0.xy \times src1.xy
1212 dst.zw = src0.zw \times src1.zw
1215 .. opcode:: DMAD - Multiply And Add Doubles
1219 dst.xy = src0.xy \times src1.xy + src2.xy
1221 dst.zw = src0.zw \times src1.zw + src2.zw
1224 .. opcode:: DRCP - Reciprocal Double
1228 dst.xy = \frac{1}{src.xy}
1230 dst.zw = \frac{1}{src.zw}
.. opcode:: DSQRT - Square Root Double
1236 dst.xy = \sqrt{src.xy}
1238 dst.zw = \sqrt{src.zw}
1241 Explanation of symbols used
1242 ------------------------------
1249 :math:`|x|` Absolute value of `x`.
1251 :math:`\lceil x \rceil` Ceiling of `x`.
1253 clamp(x,y,z) Clamp x between y and z.
1254 (x < y) ? y : (x > z) ? z : x
1256 :math:`\lfloor x\rfloor` Floor of `x`.
1258 :math:`\log_2{x}` Logarithm of `x`, base 2.
1260 max(x,y) Maximum of x and y.
1263 min(x,y) Minimum of x and y.
1266 partialx(x) Derivative of x relative to fragment's X.
1268 partialy(x) Derivative of x relative to fragment's Y.
1270 pop() Pop from stack.
1272 :math:`x^y` `x` to the power `y`.
1274 push(x) Push x on stack.
1278 trunc(x) Truncate x, i.e. drop the fraction bits.
1285 discard Discard fragment.
1289 target Label of target instruction.
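Two of the helper symbols above translate directly into Python. The sketch follows the definitions as written; returning a float from ``trunc`` is a choice made here, not something the text specifies.

```python
import math

def clamp(x, y, z):
    """(x < y) ? y : (x > z) ? z : x"""
    return y if x < y else z if x > z else x

def trunc(x):
    """Drop the fraction bits, rounding toward zero."""
    return float(math.trunc(x))
```

Truncation and floor differ for negative inputs: ``trunc(-1.5)`` is ``-1.0``, whereas the floor of ``-1.5`` is ``-2``.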
1296 Declaration Semantic
1297 ^^^^^^^^^^^^^^^^^^^^^^^^
A Declaration Semantic token follows a Declaration token if the Semantic bit is set.

Because its purpose is to link a shader with other stages of the pipeline,
it may only follow Declaration tokens that declare a register in either the
INPUT or the OUTPUT file.
1306 SemanticName field contains the semantic name of the register being declared.
1307 There is no default value.
1309 SemanticIndex is an optional subscript that can be used to distinguish
1310 different register declarations with the same semantic name. The default value
1313 The meanings of the individual semantic names are explained in the following
1316 TGSI_SEMANTIC_POSITION
1317 """"""""""""""""""""""
Position, sometimes known as HPOS or WPOS for historical reasons, is the
location of the vertex in space, in ``(x, y, z, w)`` format. ``x``, ``y``, and ``z``
are the Cartesian coordinates, and ``w`` is the homogeneous coordinate, used
for the perspective divide, if enabled.
1324 As a vertex shader output, position should be scaled to the viewport. When
1325 used in fragment shaders, position will be in window coordinates. The convention
1326 used depends on the FS_COORD_ORIGIN and FS_COORD_PIXEL_CENTER properties.
1328 XXX additionally, is there a way to configure the perspective divide? it's
1329 accelerated on most chipsets AFAIK...
1331 Position, if not specified, usually defaults to ``(0, 0, 0, 1)``, and can
1332 be partially specified as ``(x, y, 0, 1)`` or ``(x, y, z, 1)``.
1334 XXX usually? can we solidify that?
1339 Colors are used to, well, color the primitives. Colors are always in
1340 ``(r, g, b, a)`` format.
1342 If alpha is not specified, it defaults to 1.
1344 TGSI_SEMANTIC_BCOLOR
1345 """"""""""""""""""""
1347 Back-facing colors are only used for back-facing polygons, and are only valid
1348 in vertex shader outputs. After rasterization, all polygons are front-facing
1349 and COLOR and BCOLOR end up occupying the same slots in the fragment, so
1350 all BCOLORs effectively become regular COLORs in the fragment shader.
1355 The fog coordinate historically has been used to replace the depth coordinate
1356 for generation of fog in dedicated fog blocks. Gallium, however, does not use
1357 dedicated fog acceleration, placing it entirely in the fragment shader
1360 The fog coordinate should be written in ``(f, 0, 0, 1)`` format. Only the first
1361 component matters when writing from the vertex shader; the driver will ensure
1362 that the coordinate is in this format when used as a fragment shader input.
1367 PSIZE, or point size, is used to specify point sizes per-vertex. It should
1368 be in ``(p, n, x, f)`` format, where ``p`` is the point size, ``n`` is the minimum
1369 size, ``x`` is the maximum size, and ``f`` is the fade threshold.
1371 XXX this is arb_vp. is this what we actually do? should double-check...
1373 When using this semantic, be sure to set the appropriate state in the
1374 :ref:`rasterizer` first.
1376 TGSI_SEMANTIC_GENERIC
1377 """""""""""""""""""""
1379 Generic semantics are nearly always used for texture coordinate attributes,
1380 in ``(s, t, r, q)`` format. ``t`` and ``r`` may be unused for certain kinds
1381 of lookups, and ``q`` is the level-of-detail bias for biased sampling.
1383 These attributes are called "generic" because they may be used for anything
1384 else, including parameters, texture generation information, or anything that
1385 can be stored inside a four-component vector.
1387 TGSI_SEMANTIC_NORMAL
1388 """"""""""""""""""""
1390 Vertex normal; could be used to implement per-pixel lighting for legacy APIs
1391 that allow mixing fixed-function and programmable stages.
FACE is the facing bit, to store the facing information for the fragment
shader. ``(f, 0, 0, 1)`` is the format. The first component will be positive
when the fragment is front-facing, and negative when the fragment is
back-facing.
1401 TGSI_SEMANTIC_EDGEFLAG
1402 """"""""""""""""""""""
1408 ^^^^^^^^^^^^^^^^^^^^^^^^
1411 Properties are general directives that apply to the whole TGSI program.
1416 Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
1417 The default value is UPPER_LEFT.
1419 If UPPER_LEFT, the position will be (0,0) at the upper left corner and
1420 increase downward and rightward.
1421 If LOWER_LEFT, the position will be (0,0) at the lower left corner and
1422 increase upward and rightward.
1424 OpenGL defaults to LOWER_LEFT, and is configurable with the
1425 GL_ARB_fragment_coord_conventions extension.
1427 DirectX 9/10 use UPPER_LEFT.
1429 FS_COORD_PIXEL_CENTER
1430 """""""""""""""""""""
1432 Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
1433 The default value is HALF_INTEGER.
If HALF_INTEGER, the fractional part of the position will be 0.5.
If INTEGER, the fractional part of the position will be 0.0.
1438 Note that this does not affect the set of fragments generated by
1439 rasterization, which is instead controlled by gl_rasterization_rules in the
1442 OpenGL defaults to HALF_INTEGER, and is configurable with the
1443 GL_ARB_fragment_coord_conventions extension.
1445 DirectX 9 uses INTEGER.
1446 DirectX 10 uses HALF_INTEGER.
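The combined effect of FS_COORD_ORIGIN and FS_COORD_PIXEL_CENTER on the fragment position can be sketched in Python. The function name and the convention of addressing a pixel by its column and its row counted from the top of the framebuffer are assumptions made for illustration.

```python
def fragment_position(col, row_from_top, height,
                      origin="UPPER_LEFT", pixel_center="HALF_INTEGER"):
    """Return the (x, y) a fragment shader would see for one pixel."""
    # LOWER_LEFT counts rows upward from the bottom edge.
    row = row_from_top if origin == "UPPER_LEFT" else height - 1 - row_from_top
    offset = 0.5 if pixel_center == "HALF_INTEGER" else 0.0
    return (col + offset, row + offset)
```

For the top-left pixel of a framebuffer four pixels high, the defaults give ``(0.5, 0.5)``, while ``LOWER_LEFT`` with ``HALF_INTEGER`` gives ``(0.5, 3.5)``.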
1450 Texture Sampling and Texture Formats
1451 ------------------------------------
1453 This table shows how texture image components are returned as (x,y,z,w)
1454 tuples by TGSI texture instructions, such as TEX, TXD, and TXP.
1455 For reference, OpenGL and Direct3D conventions are shown as well.
+--------------------+--------------+--------------------+--------------+
| Texture Components | Gallium      | OpenGL             | Direct3D 9   |
+====================+==============+====================+==============+
| R                  | XXX TBD      | (r, 0, 0, 1)       | (r, 1, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RG                 | XXX TBD      | (r, g, 0, 1)       | (r, g, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RGB                | (r, g, b, 1) | (r, g, b, 1)       | (r, g, b, 1) |
+--------------------+--------------+--------------------+--------------+
| RGBA               | (r, g, b, a) | (r, g, b, a)       | (r, g, b, a) |
+--------------------+--------------+--------------------+--------------+
| A                  | (0, 0, 0, a) | (0, 0, 0, a)       | (0, 0, 0, a) |
+--------------------+--------------+--------------------+--------------+
| L                  | (l, l, l, 1) | (l, l, l, 1)       | (l, l, l, 1) |
+--------------------+--------------+--------------------+--------------+
| LA                 | (l, l, l, a) | (l, l, l, a)       | (l, l, l, a) |
+--------------------+--------------+--------------------+--------------+
| I                  | (i, i, i, i) | (i, i, i, i)       | N/A          |
+--------------------+--------------+--------------------+--------------+
| UV                 | XXX TBD      | (0, 0, 0, 1)       | (u, v, 1, 1) |
|                    |              | [#envmap-bumpmap]_ |              |
+--------------------+--------------+--------------------+--------------+
| Z                  | XXX TBD      | (z, z, z, 1)       | (0, z, 0, 1) |
|                    |              | [#depth-tex-mode]_ |              |
+--------------------+--------------+--------------------+--------------+

.. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
.. [#depth-tex-mode] The default is (z, z, z, 1) but may also be (0, 0, 0, z)
   or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
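The Gallium column of the table can be expressed as a small lookup that expands a texel into an ``(x, y, z, w)`` tuple. The Python sketch below covers only the rows with a defined Gallium result; rows marked XXX TBD are deliberately left out, and the swizzle-string encoding is an illustrative choice.

```python
# One character per output component: a channel name, or the constants 0 and 1.
GALLIUM_SWIZZLE = {
    "RGB": "rgb1",
    "RGBA": "rgba",
    "A": "000a",
    "L": "lll1",
    "LA": "llla",
    "I": "iiii",
}

def expand_texel(components, texel):
    """Expand a dict of channel values into the (x, y, z, w) tuple."""
    constants = {"0": 0.0, "1": 1.0}
    swizzle = GALLIUM_SWIZZLE[components]  # KeyError for the TBD rows
    return tuple(constants[s] if s in constants else texel[s] for s in swizzle)
```

For instance, a luminance-alpha texel with ``l = 0.5`` and ``a = 0.25`` expands to ``(0.5, 0.5, 0.5, 0.25)``.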