TGSI
====

TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
for describing shaders. Since Gallium is inherently shaderful, shaders are
an important part of the API. TGSI is the only intermediate representation
used by all drivers.

Instruction Set
---------------

From GL_NV_vertex_program
^^^^^^^^^^^^^^^^^^^^^^^^^
14
15
16 ARL - Address Register Load
17
18 .. math::
19
20 dst.x = \lfloor src.x\rfloor
21
22 dst.y = \lfloor src.y\rfloor
23
24 dst.z = \lfloor src.z\rfloor
25
26 dst.w = \lfloor src.w\rfloor
27
28
29 MOV - Move
30
31 .. math::
32
33 dst.x = src.x
34
35 dst.y = src.y
36
37 dst.z = src.z
38
39 dst.w = src.w
40
41
LIT - Light Coefficients

.. math::

  dst.x = 1

  dst.y = max(src.x, 0)

  dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0

  dst.w = 1
53
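As an illustration, the LIT formulas above can be sketched in Python. The
function name ``lit`` and the modelling of a register as an ``(x, y, z, w)``
tuple are assumptions of this example, not part of TGSI; ``clamp`` follows the
definition given in the Functions list of this document.

```python
def clamp(x, lo, hi):
    # Same definition as clamp(x,y,z) in the Functions list.
    return lo if x < lo else hi if x > hi else x


def lit(src):
    """Hypothetical helper: src is an (x, y, z, w) tuple of floats."""
    x, y, _z, w = src
    if x > 0:
        # Python raises ZeroDivisionError for 0.0 ** negative exponent;
        # what a driver does in that case is not stated in this document.
        dst_z = max(y, 0.0) ** clamp(w, -128.0, 128.0)
    else:
        dst_z = 0.0
    return (1.0, max(x, 0.0), dst_z, 1.0)
```

For example, ``lit((0.5, 0.25, 0.0, 2.0))`` squares the ``y`` input because
the exponent in ``w`` is 2.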
54
55 RCP - Reciprocal
56
57 .. math::
58
59 dst.x = \frac{1}{src.x}
60
61 dst.y = \frac{1}{src.x}
62
63 dst.z = \frac{1}{src.x}
64
65 dst.w = \frac{1}{src.x}
66
67
68 RSQ - Reciprocal Square Root
69
70 .. math::
71
72 dst.x = \frac{1}{\sqrt{|src.x|}}
73
74 dst.y = \frac{1}{\sqrt{|src.x|}}
75
76 dst.z = \frac{1}{\sqrt{|src.x|}}
77
78 dst.w = \frac{1}{\sqrt{|src.x|}}
79
80
81 EXP - Approximate Exponential Base 2
82
83 .. math::
84
85 dst.x = 2^{\lfloor src.x\rfloor}
86
87 dst.y = src.x - \lfloor src.x\rfloor
88
89 dst.z = 2^{src.x}
90
91 dst.w = 1
92
93
94 LOG - Approximate Logarithm Base 2
95
96 .. math::
97
98 dst.x = \lfloor\log_2{|src.x|}\rfloor
99
100 dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
101
102 dst.z = \log_2{|src.x|}
103
104 dst.w = 1
105
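EXP and LOG each return partial results alongside the full value. The sketch
below mirrors the formulas above in Python; the names ``exp_op`` and ``log_op``
are hypothetical, and only the scalar ``src.x`` is passed in since no other
channel is read.

```python
import math


def exp_op(src_x):
    """Hypothetical helper mirroring EXP."""
    fl = math.floor(src_x)
    # (2^floor(x), fractional part of x, 2^x, 1)
    return (2.0 ** fl, src_x - fl, 2.0 ** src_x, 1.0)


def log_op(src_x):
    """Hypothetical helper mirroring LOG; src_x == 0 is not handled."""
    a = abs(src_x)
    e = math.floor(math.log2(a))
    # (integer exponent, mantissa in [1, 2), log2|x|, 1)
    return (float(e), a / 2.0 ** e, math.log2(a), 1.0)
```

With ``log_op``, the product ``dst.y * 2 ** dst.x`` recovers ``|src.x|``.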
106
107 MUL - Multiply
108
109 .. math::
110
111 dst.x = src0.x \times src1.x
112
113 dst.y = src0.y \times src1.y
114
115 dst.z = src0.z \times src1.z
116
117 dst.w = src0.w \times src1.w
118
119
120 ADD - Add
121
122 .. math::
123
124 dst.x = src0.x + src1.x
125
126 dst.y = src0.y + src1.y
127
128 dst.z = src0.z + src1.z
129
130 dst.w = src0.w + src1.w
131
132
133 DP3 - 3-component Dot Product
134
135 .. math::
136
137 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
138
139 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
140
141 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
142
143 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
144
145
146 DP4 - 4-component Dot Product
147
148 .. math::
149
150 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
151
152 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
153
154 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
155
156 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
157
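The dot-product opcodes write the same scalar to every destination channel.
A minimal Python sketch, with hypothetical helper names ``dp3`` and ``dp4``
and registers modelled as 4-tuples (both assumptions of this example):

```python
def dp3(src0, src1):
    # Only x, y and z take part; the w channels are ignored.
    s = src0[0] * src1[0] + src0[1] * src1[1] + src0[2] * src1[2]
    return (s, s, s, s)


def dp4(src0, src1):
    # All four channels take part.
    s = sum(a * b for a, b in zip(src0, src1))
    return (s, s, s, s)
```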
158
159 DST - Distance Vector
160
161 .. math::
162
163 dst.x = 1
164
165 dst.y = src0.y \times src1.y
166
167 dst.z = src0.z
168
169 dst.w = src1.w
170
171
172 MIN - Minimum
173
174 .. math::
175
176 dst.x = min(src0.x, src1.x)
177
178 dst.y = min(src0.y, src1.y)
179
180 dst.z = min(src0.z, src1.z)
181
182 dst.w = min(src0.w, src1.w)
183
184
185 MAX - Maximum
186
187 .. math::
188
189 dst.x = max(src0.x, src1.x)
190
191 dst.y = max(src0.y, src1.y)
192
193 dst.z = max(src0.z, src1.z)
194
195 dst.w = max(src0.w, src1.w)
196
197
198 SLT - Set On Less Than
199
200 .. math::
201
202 dst.x = (src0.x < src1.x) ? 1 : 0
203
204 dst.y = (src0.y < src1.y) ? 1 : 0
205
206 dst.z = (src0.z < src1.z) ? 1 : 0
207
208 dst.w = (src0.w < src1.w) ? 1 : 0
209
210
211 SGE - Set On Greater Equal Than
212
213 .. math::
214
215 dst.x = (src0.x >= src1.x) ? 1 : 0
216
217 dst.y = (src0.y >= src1.y) ? 1 : 0
218
219 dst.z = (src0.z >= src1.z) ? 1 : 0
220
221 dst.w = (src0.w >= src1.w) ? 1 : 0
222
223
224 MAD - Multiply And Add
225
226 .. math::
227
228 dst.x = src0.x \times src1.x + src2.x
229
230 dst.y = src0.y \times src1.y + src2.y
231
232 dst.z = src0.z \times src1.z + src2.z
233
234 dst.w = src0.w \times src1.w + src2.w
235
236
237 SUB - Subtract
238
239 .. math::
240
241 dst.x = src0.x - src1.x
242
243 dst.y = src0.y - src1.y
244
245 dst.z = src0.z - src1.z
246
247 dst.w = src0.w - src1.w
248
249
250 LRP - Linear Interpolate
251
252 .. math::
253
254 dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x
255
256 dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y
257
258 dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z
259
260 dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
261
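In LRP, ``src0`` is the per-channel blend weight: a weight of 1 selects
``src1`` and a weight of 0 selects ``src2``. A small Python sketch, assuming
registers are modelled as 4-tuples (``lrp`` is a hypothetical name):

```python
def lrp(src0, src1, src2):
    # Per channel: weight * src1 + (1 - weight) * src2.
    return tuple(w * a + (1 - w) * b for w, a, b in zip(src0, src1, src2))
```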
262
263 CND - Condition
264
265 .. math::
266
267 dst.x = (src2.x > 0.5) ? src0.x : src1.x
268
269 dst.y = (src2.y > 0.5) ? src0.y : src1.y
270
271 dst.z = (src2.z > 0.5) ? src0.z : src1.z
272
273 dst.w = (src2.w > 0.5) ? src0.w : src1.w
274
275
276 DP2A - 2-component Dot Product And Add
277
278 .. math::
279
280 dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x
281
282 dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x
283
284 dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x
285
286 dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
287
288
289 FRAC - Fraction
290
291 .. math::
292
293 dst.x = src.x - \lfloor src.x\rfloor
294
295 dst.y = src.y - \lfloor src.y\rfloor
296
297 dst.z = src.z - \lfloor src.z\rfloor
298
299 dst.w = src.w - \lfloor src.w\rfloor
300
301
302 CLAMP - Clamp
303
304 .. math::
305
306 dst.x = clamp(src0.x, src1.x, src2.x)
307
308 dst.y = clamp(src0.y, src1.y, src2.y)
309
310 dst.z = clamp(src0.z, src1.z, src2.z)
311
312 dst.w = clamp(src0.w, src1.w, src2.w)
313
314
315 FLR - Floor
316
317 This is identical to ARL.
318
319 .. math::
320
321 dst.x = \lfloor src.x\rfloor
322
323 dst.y = \lfloor src.y\rfloor
324
325 dst.z = \lfloor src.z\rfloor
326
327 dst.w = \lfloor src.w\rfloor
328
329
330 ROUND - Round
331
332 .. math::
333
334 dst.x = round(src.x)
335
336 dst.y = round(src.y)
337
338 dst.z = round(src.z)
339
340 dst.w = round(src.w)
341
342
343 EX2 - Exponential Base 2
344
345 .. math::
346
347 dst.x = 2^{src.x}
348
349 dst.y = 2^{src.x}
350
351 dst.z = 2^{src.x}
352
353 dst.w = 2^{src.x}
354
355
356 LG2 - Logarithm Base 2
357
358 .. math::
359
360 dst.x = \log_2{src.x}
361
362 dst.y = \log_2{src.x}
363
364 dst.z = \log_2{src.x}
365
366 dst.w = \log_2{src.x}
367
368
369 POW - Power
370
371 .. math::
372
373 dst.x = src0.x^{src1.x}
374
375 dst.y = src0.x^{src1.x}
376
377 dst.z = src0.x^{src1.x}
378
379 dst.w = src0.x^{src1.x}
380
381 XPD - Cross Product
382
383 .. math::
384
385 dst.x = src0.y \times src1.z - src1.y \times src0.z
386
387 dst.y = src0.z \times src1.x - src1.z \times src0.x
388
389 dst.z = src0.x \times src1.y - src1.x \times src0.y
390
391 dst.w = 1
392
393
394 ABS - Absolute
395
396 .. math::
397
398 dst.x = |src.x|
399
400 dst.y = |src.y|
401
402 dst.z = |src.z|
403
404 dst.w = |src.w|
405
406
407 RCC - Reciprocal Clamped
408
409 XXX cleanup on aisle three
410
411 .. math::
412
413 dst.x = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
414
415 dst.y = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
416
417 dst.z = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
418
419 dst.w = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
420
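The RCC formulas above keep the magnitude of the reciprocal inside a fixed
range while preserving its sign. A Python sketch under the assumption that
``src.x`` is non-zero (the formulas do not say what happens otherwise);
``rcc`` and the constant names are hypothetical:

```python
RCC_MIN = 5.42101e-20   # smallest magnitude, taken from the formulas above
RCC_MAX = 1.884467e+19  # largest magnitude, taken from the formulas above


def clamp(x, lo, hi):
    return lo if x < lo else hi if x > hi else x


def rcc(src_x):
    r = 1.0 / src_x  # raises ZeroDivisionError for src_x == 0 in Python
    if r > 0:
        r = clamp(r, RCC_MIN, RCC_MAX)
    else:
        r = clamp(r, -RCC_MAX, -RCC_MIN)
    return (r, r, r, r)
```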
421
422 DPH - Homogeneous Dot Product
423
424 .. math::
425
426 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
427
428 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
429
430 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
431
432 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
433
434
435 COS - Cosine
436
437 .. math::
438
439 dst.x = \cos{src.x}
440
441 dst.y = \cos{src.x}
442
443 dst.z = \cos{src.x}
444
445 dst.w = \cos{src.x}
446
447
448 DDX - Derivative Relative To X
449
450 .. math::
451
452 dst.x = partialx(src.x)
453
454 dst.y = partialx(src.y)
455
456 dst.z = partialx(src.z)
457
458 dst.w = partialx(src.w)
459
460
461 DDY - Derivative Relative To Y
462
463 .. math::
464
465 dst.x = partialy(src.x)
466
467 dst.y = partialy(src.y)
468
469 dst.z = partialy(src.z)
470
471 dst.w = partialy(src.w)
472
473
474 KILP - Predicated Discard
475
476 discard
477
478
479 PK2H - Pack Two 16-bit Floats
480
481 TBD
482
483
484 PK2US - Pack Two Unsigned 16-bit Scalars
485
486 TBD
487
488
489 PK4B - Pack Four Signed 8-bit Scalars
490
491 TBD
492
493
494 PK4UB - Pack Four Unsigned 8-bit Scalars
495
496 TBD
497
498
499 RFL - Reflection Vector
500
501 .. math::
502
503 dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x
504
505 dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y
506
507 dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z
508
509 dst.w = 1
510
511 Considered for removal.
512
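The RFL formulas above share two dot products across the three channels.
Factoring them out makes the structure easier to see. This is only a sketch:
``rfl`` is a hypothetical name, registers are modelled as 4-tuples, and a
zero-length ``src0`` is not handled.

```python
def rfl(src0, src1):
    dot01 = src0[0] * src1[0] + src0[1] * src1[1] + src0[2] * src1[2]
    dot00 = src0[0] * src0[0] + src0[1] * src0[1] + src0[2] * src0[2]
    k = 2 * dot01 / dot00  # common factor of all three channels
    return (k * src0[0] - src1[0],
            k * src0[1] - src1[1],
            k * src0[2] - src1[2],
            1.0)
```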
513
SEQ - Set On Equal

.. math::

  dst.x = (src0.x == src1.x) ? 1 : 0

  dst.y = (src0.y == src1.y) ? 1 : 0

  dst.z = (src0.z == src1.z) ? 1 : 0

  dst.w = (src0.w == src1.w) ? 1 : 0


SFL - Set On False

.. math::

  dst.x = 0

  dst.y = 0

  dst.z = 0

  dst.w = 0

Considered for removal.


SGT - Set On Greater Than

.. math::

  dst.x = (src0.x > src1.x) ? 1 : 0

  dst.y = (src0.y > src1.y) ? 1 : 0

  dst.z = (src0.z > src1.z) ? 1 : 0

  dst.w = (src0.w > src1.w) ? 1 : 0


SIN - Sine

.. math::

  dst.x = \sin{src.x}

  dst.y = \sin{src.x}

  dst.z = \sin{src.x}

  dst.w = \sin{src.x}


SLE - Set On Less Equal Than

.. math::

  dst.x = (src0.x <= src1.x) ? 1 : 0

  dst.y = (src0.y <= src1.y) ? 1 : 0

  dst.z = (src0.z <= src1.z) ? 1 : 0

  dst.w = (src0.w <= src1.w) ? 1 : 0


SNE - Set On Not Equal

.. math::

  dst.x = (src0.x != src1.x) ? 1 : 0

  dst.y = (src0.y != src1.y) ? 1 : 0

  dst.z = (src0.z != src1.z) ? 1 : 0

  dst.w = (src0.w != src1.w) ? 1 : 0


STR - Set On True

.. math::

  dst.x = 1

  dst.y = 1

  dst.z = 1

  dst.w = 1
586
587
588 TEX - Texture Lookup
589
590 TBD
591
592
593 TXD - Texture Lookup with Derivatives
594
595 TBD
596
597
598 TXP - Projective Texture Lookup
599
600 TBD
601
602
603 UP2H - Unpack Two 16-Bit Floats
604
605 TBD
606
607 Considered for removal.
608
609 UP2US - Unpack Two Unsigned 16-Bit Scalars
610
611 TBD
612
613 Considered for removal.
614
615 UP4B - Unpack Four Signed 8-Bit Values
616
617 TBD
618
619 Considered for removal.
620
621 UP4UB - Unpack Four Unsigned 8-Bit Scalars
622
623 TBD
624
625 Considered for removal.
626
X2D - 2D Coordinate Transformation

.. math::

  dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y

  dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w

  dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y

  dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w

Considered for removal.
637
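Read as a whole, X2D applies a 2x2 matrix stored in ``src2`` to the point
``(src1.x, src1.y)`` and adds the offset ``(src0.x, src0.y)``; the result is
duplicated into ``z`` and ``w``. A Python sketch of the formulas above, with
registers modelled as 4-tuples and ``x2d`` as a hypothetical name:

```python
def x2d(src0, src1, src2):
    # src2 holds the matrix rows (x, y) and (z, w).
    x = src0[0] + src1[0] * src2[0] + src1[1] * src2[1]
    y = src0[1] + src1[0] * src2[2] + src1[1] * src2[3]
    return (x, y, x, y)
```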
638
639 From GL_NV_vertex_program2
640 ^^^^^^^^^^^^^^^^^^^^^^^^^^
641
642
643 ARA - Address Register Add
644
645 TBD
646
647 Considered for removal.
648
649 ARR - Address Register Load With Round
650
651 .. math::
652
653 dst.x = round(src.x)
654
655 dst.y = round(src.y)
656
657 dst.z = round(src.z)
658
659 dst.w = round(src.w)
660
661
662 BRA - Branch
663
664 pc = target
665
666 Considered for removal.
667
668 CAL - Subroutine Call
669
670 push(pc)
671 pc = target
672
673
RET - Subroutine Call Return

  pc = pop()

  Potential restrictions:

  * Only occurs at end of function.
680
681 SSG - Set Sign
682
683 .. math::
684
685 dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
686
687 dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
688
689 dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
690
691 dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
692
693
694 CMP - Compare
695
696 .. math::
697
698 dst.x = (src0.x < 0) ? src1.x : src2.x
699
700 dst.y = (src0.y < 0) ? src1.y : src2.y
701
702 dst.z = (src0.z < 0) ? src1.z : src2.z
703
704 dst.w = (src0.w < 0) ? src1.w : src2.w
705
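CMP and CND are both per-channel selects, but they test different operands:
CMP selects ``src1`` where ``src0`` is negative, while CND selects ``src0``
where ``src2`` exceeds 0.5. A Python sketch of both, with registers modelled
as 4-tuples and hypothetical names ``cmp_op`` and ``cnd``:

```python
def cmp_op(src0, src1, src2):
    # CMP: the condition lives in src0.
    return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))


def cnd(src0, src1, src2):
    # CND: the condition lives in src2.
    return tuple(a if c > 0.5 else b for a, b, c in zip(src0, src1, src2))
```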
706
707 KIL - Conditional Discard
708
709 .. math::
710
711 if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
712 discard
713 endif
714
715
SCS - Sine Cosine

.. math::

  dst.x = \cos{src.x}

  dst.y = \sin{src.x}

  dst.z = 0

  dst.w = 1
727
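A short Python sketch of SCS, assuming the result is modelled as a 4-tuple
and that the constant 1 lands in the ``w`` channel; the name ``scs`` is
hypothetical:

```python
import math


def scs(src_x):
    # Cosine in x, sine in y, constants in z and w.
    return (math.cos(src_x), math.sin(src_x), 0.0, 1.0)
```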
728
729 TXB - Texture Lookup With Bias
730
731 TBD
732
733
NRM - 3-component Vector Normalise

.. math::

  dst.x = src.x / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.y = src.y / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.z = src.z / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.w = 1
745
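A Python sketch of NRM, assuming the conventional meaning of normalisation,
i.e. division by the Euclidean length of the ``xyz`` part. The name ``nrm``
is hypothetical, and a zero-length input is not handled.

```python
import math


def nrm(src):
    length = math.sqrt(src[0] * src[0] + src[1] * src[1] + src[2] * src[2])
    # The w channel of the source is ignored; dst.w is always 1.
    return (src[0] / length, src[1] / length, src[2] / length, 1.0)
```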
746
747 DIV - Divide
748
749 .. math::
750
751 dst.x = \frac{src0.x}{src1.x}
752
753 dst.y = \frac{src0.y}{src1.y}
754
755 dst.z = \frac{src0.z}{src1.z}
756
757 dst.w = \frac{src0.w}{src1.w}
758
759
760 DP2 - 2-component Dot Product
761
762 .. math::
763
764 dst.x = src0.x \times src1.x + src0.y \times src1.y
765
766 dst.y = src0.x \times src1.x + src0.y \times src1.y
767
768 dst.z = src0.x \times src1.x + src0.y \times src1.y
769
770 dst.w = src0.x \times src1.x + src0.y \times src1.y
771
772
773 TXL - Texture Lookup With LOD
774
775 TBD
776
777
778 BRK - Break
779
780 TBD
781
782
783 IF - If
784
785 TBD
786
787
788 BGNFOR - Begin a For-Loop
789
790 dst.x = floor(src.x)
791 dst.y = floor(src.y)
792 dst.z = floor(src.z)
793
794 if (dst.y <= 0)
795 pc = [matching ENDFOR] + 1
796 endif
797
798 Note: The destination must be a loop register.
799 The source must be a constant register.
800
801 Considered for cleanup / removal.
802
803
804 REP - Repeat
805
806 TBD
807
808
809 ELSE - Else
810
811 TBD
812
813
814 ENDIF - End If
815
816 TBD
817
818
819 ENDFOR - End a For-Loop
820
821 dst.x = dst.x + dst.z
822 dst.y = dst.y - 1.0
823
824 if (dst.y > 0)
825 pc = [matching BGNFOR instruction] + 1
826 endif
827
828 Note: The destination must be a loop register.
829
830 Considered for cleanup / removal.
831
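Taken together, BGNFOR and ENDFOR describe a counted loop: ``x`` is the
current index, ``y`` the remaining iteration count and ``z`` the step. The
Python sketch below replays that bookkeeping and records the value of
``dst.x`` seen by each pass of the loop body. The function name and the
list-based result are assumptions of this example.

```python
import math


def for_loop_indices(src):
    """Return the dst.x values visible to the loop body, in order."""
    # BGNFOR: load the loop register from the (constant) source.
    x, y, z = (math.floor(c) for c in src[:3])
    indices = []
    if y <= 0:
        return indices  # BGNFOR jumps past the matching ENDFOR
    while True:
        indices.append(x)  # loop body executes here
        # ENDFOR: advance the index, decrement the counter.
        x += z
        y -= 1
        if y <= 0:
            break  # fall through; otherwise jump back after BGNFOR
    return indices
```
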
832 ENDREP - End Repeat
833
834 TBD
835
836
837 PUSHA - Push Address Register On Stack
838
839 push(src.x)
840 push(src.y)
841 push(src.z)
842 push(src.w)
843
844 Considered for cleanup / removal.
845
846 POPA - Pop Address Register From Stack
847
848 dst.w = pop()
849 dst.z = pop()
850 dst.y = pop()
851 dst.x = pop()
852
853 Considered for cleanup / removal.
854
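PUSHA pushes ``x`` first and ``w`` last, so POPA must pop in the reverse
order to restore the register. A Python sketch using a plain list as the
stack (``pusha`` and ``popa`` are hypothetical names):

```python
def pusha(stack, src):
    # Push x, y, z, w in that order.
    for component in src:
        stack.append(component)


def popa(stack):
    # Pop in reverse order: w, z, y, x.
    w = stack.pop()
    z = stack.pop()
    y = stack.pop()
    x = stack.pop()
    return (x, y, z, w)
```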
855
856 From GL_NV_gpu_program4
857 ^^^^^^^^^^^^^^^^^^^^^^^^
858
Support for these opcodes is indicated by a special pipe capability bit (TBD).
860
861 CEIL - Ceiling
862
863 .. math::
864
865 dst.x = \lceil src.x\rceil
866
867 dst.y = \lceil src.y\rceil
868
869 dst.z = \lceil src.z\rceil
870
871 dst.w = \lceil src.w\rceil
872
873
874 I2F - Integer To Float
875
876 .. math::
877
878 dst.x = (float) src.x
879
880 dst.y = (float) src.y
881
882 dst.z = (float) src.z
883
884 dst.w = (float) src.w
885
886
NOT - Bitwise Not

.. math::

  dst.x = \sim src.x

  dst.y = \sim src.y

  dst.z = \sim src.z

  dst.w = \sim src.w
898
899
900 TRUNC - Truncate
901
902 .. math::
903
904 dst.x = trunc(src.x)
905
906 dst.y = trunc(src.y)
907
908 dst.z = trunc(src.z)
909
910 dst.w = trunc(src.w)
911
912
913 SHL - Shift Left
914
915 .. math::
916
917 dst.x = src0.x << src1.x
918
919 dst.y = src0.y << src1.x
920
921 dst.z = src0.z << src1.x
922
923 dst.w = src0.w << src1.x
924
925
926 SHR - Shift Right
927
928 .. math::
929
930 dst.x = src0.x >> src1.x
931
932 dst.y = src0.y >> src1.x
933
934 dst.z = src0.z >> src1.x
935
936 dst.w = src0.w >> src1.x
937
938
AND - Bitwise And

.. math::

  dst.x = src0.x \& src1.x

  dst.y = src0.y \& src1.y

  dst.z = src0.z \& src1.z

  dst.w = src0.w \& src1.w
950
951
952 OR - Bitwise Or
953
954 .. math::
955
956 dst.x = src0.x | src1.x
957
958 dst.y = src0.y | src1.y
959
960 dst.z = src0.z | src1.z
961
962 dst.w = src0.w | src1.w
963
964
965 MOD - Modulus
966
967 .. math::
968
969 dst.x = src0.x \bmod src1.x
970
971 dst.y = src0.y \bmod src1.y
972
973 dst.z = src0.z \bmod src1.z
974
975 dst.w = src0.w \bmod src1.w
976
977
XOR - Bitwise Xor

.. math::

  dst.x = src0.x \oplus src1.x

  dst.y = src0.y \oplus src1.y

  dst.z = src0.z \oplus src1.z

  dst.w = src0.w \oplus src1.w
989
990
991 SAD - Sum Of Absolute Differences
992
993 .. math::
994
995 dst.x = |src0.x - src1.x| + src2.x
996
997 dst.y = |src0.y - src1.y| + src2.y
998
999 dst.z = |src0.z - src1.z| + src2.z
1000
1001 dst.w = |src0.w - src1.w| + src2.w
1002
1003
1004 TXF - Texel Fetch
1005
1006 TBD
1007
1008
1009 TXQ - Texture Size Query
1010
1011 TBD
1012
1013
1014 CONT - Continue
1015
1016 TBD
1017
1018
1019 From GL_NV_geometry_program4
1020 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1021
1022
1023 EMIT - Emit
1024
1025 TBD
1026
1027
1028 ENDPRIM - End Primitive
1029
1030 TBD
1031
1032
1033 From GLSL
1034 ^^^^^^^^^^
1035
1036
1037 BGNLOOP - Begin a Loop
1038
1039 TBD
1040
1041
1042 BGNSUB - Begin Subroutine
1043
1044 TBD
1045
1046
1047 ENDLOOP - End a Loop
1048
1049 TBD
1050
1051
1052 ENDSUB - End Subroutine
1053
1054 TBD
1055
1056
1057 NOP - No Operation
1058
1059 Do nothing.
1060
1061
NRM4 - 4-component Vector Normalise

.. math::

  dst.x = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}

  dst.y = \frac{src.y}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}

  dst.z = \frac{src.z}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}

  dst.w = \frac{src.w}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
1073
1074
1075 ps_2_x
1076 ^^^^^^^^^^^^
1077
1078
1079 CALLNZ - Subroutine Call If Not Zero
1080
1081 TBD
1082
1083
1084 IFC - If
1085
1086 TBD
1087
1088
1089 BREAKC - Break Conditional
1090
1091 TBD
1092
1093
Explanation of symbols used
---------------------------


Functions
^^^^^^^^^


  :math:`|x|`               Absolute value of `x`.

  :math:`\lceil x \rceil`   Ceiling of `x`.

  clamp(x,y,z)              Clamp x between y and z.
                            (x < y) ? y : (x > z) ? z : x

  :math:`\lfloor x\rfloor`  Floor of `x`.

  :math:`\log_2{x}`         Logarithm of `x`, base 2.

  max(x,y)                  Maximum of x and y.
                            (x > y) ? x : y

  min(x,y)                  Minimum of x and y.
                            (x < y) ? x : y

  partialx(x)               Derivative of x relative to fragment's X.

  partialy(x)               Derivative of x relative to fragment's Y.

  pop()                     Pop from stack.

  :math:`x^y`               `x` to the power `y`.

  push(x)                   Push x on stack.

  round(x)                  Round x.

  trunc(x)                  Truncate x, i.e. drop the fraction bits.
1132
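The ternary definitions of ``clamp``, ``max`` and ``min`` given above
translate directly into Python, as does ``trunc``. The ``tgsi_`` prefixes are
hypothetical and only avoid shadowing Python built-ins. ``round`` is left out
because the rounding mode is not stated in this document.

```python
import math


def clamp(x, y, z):
    # (x < y) ? y : (x > z) ? z : x
    return y if x < y else z if x > z else x


def tgsi_max(x, y):
    # (x > y) ? x : y
    return x if x > y else y


def tgsi_min(x, y):
    # (x < y) ? x : y
    return x if x < y else y


def trunc(x):
    # Drop the fraction bits, i.e. round toward zero.
    return float(math.trunc(x))
```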
1133
Keywords
^^^^^^^^


  discard           Discard fragment.

  dst               First destination register.

  dst0              First destination register.

  pc                Program counter.

  src               First source register.

  src0              First source register.

  src1              Second source register.

  src2              Third source register.

  target            Label of target instruction.
1155
1156
Other tokens
------------


Declaration Semantic
^^^^^^^^^^^^^^^^^^^^


Follows a Declaration token if the Semantic bit is set.

Since its purpose is to link a shader with other stages of the pipeline,
it may follow only those Declaration tokens that declare a register
in either the INPUT or the OUTPUT file.

The SemanticName field contains the semantic name of the register being
declared. There is no default value.

SemanticIndex is an optional subscript that can be used to distinguish
different register declarations with the same semantic name. The default value
is 0.
1177
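To illustrate how a semantic name and index can tie the outputs of one stage
to the inputs of the next, here is a purely illustrative Python sketch. The
``Decl`` record, the ``link`` function and the register strings are all
hypothetical; this document does not describe how Gallium performs the
linking.

```python
from collections import namedtuple

# Hypothetical stand-in for a declaration that carries a semantic.
Decl = namedtuple("Decl", ["register", "semantic_name", "semantic_index"])


def link(outputs, inputs):
    """Map each input register to the output register that has the same
    (semantic_name, semantic_index) pair; unmatched inputs map to None."""
    by_semantic = {(d.semantic_name, d.semantic_index): d.register
                   for d in outputs}
    return {d.register: by_semantic.get((d.semantic_name, d.semantic_index))
            for d in inputs}
```

Two declarations with the same name, such as two GENERIC registers, stay
distinct as long as their indices differ.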
The meanings of the individual semantic names are explained in the following
sections.

TGSI_SEMANTIC_POSITION
""""""""""""""""""""""

Position, sometimes known as HPOS or WPOS for historical reasons, is the
location of the vertex in space, in ``(x, y, z, w)`` format. ``x``, ``y``, and ``z``
are the Cartesian coordinates, and ``w`` is the homogeneous coordinate, used
for the perspective divide, if enabled.

As a vertex shader output, position should be scaled to the viewport. When
used in fragment shaders, position will ---

XXX --- wait a minute. Should position be in [0,1] for x and y?

XXX additionally, is there a way to configure the perspective divide? it's
accelerated on most chipsets AFAIK...

Position, if not specified, usually defaults to ``(0, 0, 0, 1)``, and can
be partially specified as ``(x, y, 0, 1)`` or ``(x, y, z, 1)``.

XXX usually? can we solidify that?

TGSI_SEMANTIC_COLOR
"""""""""""""""""""

Colors are used to, well, color the primitives. Colors are always in
``(r, g, b, a)`` format.

If alpha is not specified, it defaults to 1.

TGSI_SEMANTIC_BCOLOR
""""""""""""""""""""

Back-facing colors are only used for back-facing polygons, and are only valid
in vertex shader outputs. After rasterization, all polygons are front-facing
and COLOR and BCOLOR end up occupying the same slots in the fragment, so
all BCOLORs effectively become regular COLORs in the fragment shader.

TGSI_SEMANTIC_FOG
"""""""""""""""""

The fog coordinate historically has been used to replace the depth coordinate
for generation of fog in dedicated fog blocks. Gallium, however, does not use
dedicated fog acceleration, placing it entirely in the fragment shader
instead.

The fog coordinate should be written in ``(f, 0, 0, 1)`` format. Only the first
component matters when writing from the vertex shader; the driver will ensure
that the coordinate is in this format when used as a fragment shader input.

TGSI_SEMANTIC_PSIZE
"""""""""""""""""""

PSIZE, or point size, is used to specify point sizes per-vertex. It should
be in ``(p, n, x, f)`` format, where ``p`` is the point size, ``n`` is the minimum
size, ``x`` is the maximum size, and ``f`` is the fade threshold.

XXX this is arb_vp. is this what we actually do? should double-check...

When using this semantic, be sure to set the appropriate state in the
:ref:`rasterizer` first.

TGSI_SEMANTIC_GENERIC
"""""""""""""""""""""

Generic semantics are nearly always used for texture coordinate attributes,
in ``(s, t, r, q)`` format. ``t`` and ``r`` may be unused for certain kinds
of lookups, and ``q`` is the level-of-detail bias for biased sampling.

These attributes are called "generic" because they may be used for anything
else, including parameters, texture generation information, or anything that
can be stored inside a four-component vector.

TGSI_SEMANTIC_NORMAL
""""""""""""""""""""

Vertex normal; could be used to implement per-pixel lighting for legacy APIs
that allow mixing fixed-function and programmable stages.

TGSI_SEMANTIC_FACE
""""""""""""""""""

FACE is the facing bit, which stores the facing information for the fragment
shader. ``(f, 0, 0, 1)`` is the format. The first component will be positive
when the fragment is front-facing, and negative when the fragment is
back-facing.

TGSI_SEMANTIC_EDGEFLAG
""""""""""""""""""""""

XXX no clue