TGSI
====

TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
for describing shaders. Since Gallium is inherently shaderful, shaders are
an important part of the API. TGSI is the only intermediate representation
used by all drivers.
8
9 Instruction Set
10 ---------------
11
12 From GL_NV_vertex_program
13 ^^^^^^^^^^^^^^^^^^^^^^^^^
14
15
16 ARL - Address Register Load
17
18 .. math::
19
20 dst.x = \lfloor src.x\rfloor
21
22 dst.y = \lfloor src.y\rfloor
23
24 dst.z = \lfloor src.z\rfloor
25
26 dst.w = \lfloor src.w\rfloor
27
28
29 MOV - Move
30
31 .. math::
32
33 dst.x = src.x
34
35 dst.y = src.y
36
37 dst.z = src.z
38
39 dst.w = src.w
40
41
42 LIT - Light Coefficients
43
44 .. math::
45
46 dst.x = 1
47
48 dst.y = max(src.x, 0)
49
dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0
51
52 dst.w = 1
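
The LIT formulas can be read as a diffuse/specular lighting helper. In the ARB/NV vertex program conventions, ``src.x`` usually carries N·L, ``src.y`` carries N·H and ``src.w`` the specular exponent; that reading is an assumption here, not something this document states. A minimal Python sketch of the same arithmetic, where ``lit`` and ``clamp`` are illustrative names rather than Gallium functions:

```python
def clamp(x, lo, hi):
    """clamp(x, y, z) as defined in the Functions section."""
    return lo if x < lo else hi if x > hi else x


def lit(src):
    """Reference sketch of LIT for one (x, y, z, w) source tuple."""
    x, y, _z, w = src
    diffuse = max(x, 0.0)
    if x > 0:
        # Python raises ZeroDivisionError for 0.0 ** negative exponent;
        # what hardware returns in that case is not stated in this text.
        specular = max(y, 0.0) ** clamp(w, -128.0, 128.0)
    else:
        specular = 0.0
    return (1.0, diffuse, specular, 1.0)
```

For example, ``lit((0.5, 0.5, 0.0, 2.0))`` gives ``(1.0, 0.5, 0.25, 1.0)``.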
53
54
55 RCP - Reciprocal
56
57 .. math::
58
59 dst.x = \frac{1}{src.x}
60
61 dst.y = \frac{1}{src.x}
62
63 dst.z = \frac{1}{src.x}
64
65 dst.w = \frac{1}{src.x}
66
67
68 RSQ - Reciprocal Square Root
69
70 .. math::
71
72 dst.x = \frac{1}{\sqrt{|src.x|}}
73
74 dst.y = \frac{1}{\sqrt{|src.x|}}
75
76 dst.z = \frac{1}{\sqrt{|src.x|}}
77
78 dst.w = \frac{1}{\sqrt{|src.x|}}
79
80
81 EXP - Approximate Exponential Base 2
82
83 .. math::
84
85 dst.x = 2^{\lfloor src.x\rfloor}
86
87 dst.y = src.x - \lfloor src.x\rfloor
88
89 dst.z = 2^{src.x}
90
91 dst.w = 1
92
93
94 LOG - Approximate Logarithm Base 2
95
96 .. math::
97
98 dst.x = \lfloor\log_2{|src.x|}\rfloor
99
100 dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
101
102 dst.z = \log_2{|src.x|}
103
104 dst.w = 1
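
EXP and LOG both split their argument into an integer part and a remainder. For EXP, ``dst.x`` times 2 to the power ``dst.y`` reproduces ``dst.z``; for LOG, ``dst.x`` plus the base-2 logarithm of ``dst.y`` reproduces ``dst.z``. The Python sketch below checks only that arithmetic relationship; the function names are illustrative and nothing here models hardware precision.

```python
import math


def exp_approx(x):
    """Reference sketch of EXP applied to src.x."""
    fl = math.floor(x)
    return (2.0 ** fl, x - fl, 2.0 ** x, 1.0)


def log_approx(x):
    """Reference sketch of LOG applied to src.x (x must be non-zero)."""
    a = abs(x)
    e = math.floor(math.log2(a))
    return (float(e), a / 2.0 ** e, math.log2(a), 1.0)
```

``math.log2`` raises ``ValueError`` for a zero argument, so this sketch leaves the result for ``src.x = 0`` undefined, as the formulas do.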
105
106
107 MUL - Multiply
108
109 .. math::
110
111 dst.x = src0.x \times src1.x
112
113 dst.y = src0.y \times src1.y
114
115 dst.z = src0.z \times src1.z
116
117 dst.w = src0.w \times src1.w
118
119
120 ADD - Add
121
122 .. math::
123
124 dst.x = src0.x + src1.x
125
126 dst.y = src0.y + src1.y
127
128 dst.z = src0.z + src1.z
129
130 dst.w = src0.w + src1.w
131
132
133 DP3 - 3-component Dot Product
134
135 .. math::
136
137 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
138
139 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
140
141 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
142
143 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
144
145
146 DP4 - 4-component Dot Product
147
148 .. math::
149
150 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
151
152 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
153
154 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
155
156 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
157
158
159 DST - Distance Vector
160
161 .. math::
162
163 dst.x = 1
164
165 dst.y = src0.y \times src1.y
166
167 dst.z = src0.z
168
169 dst.w = src1.w
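
DST is conventionally fed ``src0 = (NA, d*d, d*d, NA)`` and ``src1 = (NA, 1/d, NA, 1/d)``, which yields ``(1, d, d*d, 1/d)``, a vector that is handy for distance attenuation. That usage comes from the ARB/NV vertex program conventions and is an assumption here; the sketch below (``dst_op`` is an illustrative name) only restates the four formulas.

```python
def dst_op(src0, src1):
    """Reference sketch of DST for two (x, y, z, w) source tuples."""
    return (1.0, src0[1] * src1[1], src0[2], src1[3])


# Build the conventional operands for a distance d.
d = 4.0
attenuation_terms = dst_op((0.0, d * d, d * d, 0.0),
                           (0.0, 1.0 / d, 0.0, 1.0 / d))
```

With ``d = 4.0`` the result is ``(1.0, 4.0, 16.0, 0.25)``.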
170
171
172 MIN - Minimum
173
174 .. math::
175
176 dst.x = min(src0.x, src1.x)
177
178 dst.y = min(src0.y, src1.y)
179
180 dst.z = min(src0.z, src1.z)
181
182 dst.w = min(src0.w, src1.w)
183
184
185 MAX - Maximum
186
187 .. math::
188
189 dst.x = max(src0.x, src1.x)
190
191 dst.y = max(src0.y, src1.y)
192
193 dst.z = max(src0.z, src1.z)
194
195 dst.w = max(src0.w, src1.w)
196
197
198 SLT - Set On Less Than
199
200 .. math::
201
202 dst.x = (src0.x < src1.x) ? 1 : 0
203
204 dst.y = (src0.y < src1.y) ? 1 : 0
205
206 dst.z = (src0.z < src1.z) ? 1 : 0
207
208 dst.w = (src0.w < src1.w) ? 1 : 0
209
210
211 SGE - Set On Greater Equal Than
212
213 .. math::
214
215 dst.x = (src0.x >= src1.x) ? 1 : 0
216
217 dst.y = (src0.y >= src1.y) ? 1 : 0
218
219 dst.z = (src0.z >= src1.z) ? 1 : 0
220
221 dst.w = (src0.w >= src1.w) ? 1 : 0
222
223
224 MAD - Multiply And Add
225
226 .. math::
227
228 dst.x = src0.x \times src1.x + src2.x
229
230 dst.y = src0.y \times src1.y + src2.y
231
232 dst.z = src0.z \times src1.z + src2.z
233
234 dst.w = src0.w \times src1.w + src2.w
235
236
237 SUB - Subtract
238
239 .. math::
240
241 dst.x = src0.x - src1.x
242
243 dst.y = src0.y - src1.y
244
245 dst.z = src0.z - src1.z
246
247 dst.w = src0.w - src1.w
248
249
250 LRP - Linear Interpolate
251
252 .. math::
253
254 dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x
255
256 dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y
257
258 dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z
259
260 dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
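
Note the operand order in LRP: ``src0`` is the per-component weight, so a weight of 1 selects ``src1`` and a weight of 0 selects ``src2``. A short Python sketch of the formulas (``lrp`` is an illustrative name):

```python
def lrp(src0, src1, src2):
    """Reference sketch of LRP: src0 weights src1 against src2."""
    return tuple(t * a + (1.0 - t) * b
                 for t, a, b in zip(src0, src1, src2))
```

For example, ``lrp((0.0, 1.0, 0.5, 0.25), (10.0,) * 4, (20.0,) * 4)`` gives ``(20.0, 10.0, 15.0, 17.5)``.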
261
262
263 CND - Condition
264
265 .. math::
266
267 dst.x = (src2.x > 0.5) ? src0.x : src1.x
268
269 dst.y = (src2.y > 0.5) ? src0.y : src1.y
270
271 dst.z = (src2.z > 0.5) ? src0.z : src1.z
272
273 dst.w = (src2.w > 0.5) ? src0.w : src1.w
274
275
276 DP2A - 2-component Dot Product And Add
277
278 .. math::
279
280 dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x
281
282 dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x
283
284 dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x
285
286 dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
287
288
289 FRAC - Fraction
290
291 .. math::
292
293 dst.x = src.x - \lfloor src.x\rfloor
294
295 dst.y = src.y - \lfloor src.y\rfloor
296
297 dst.z = src.z - \lfloor src.z\rfloor
298
299 dst.w = src.w - \lfloor src.w\rfloor
300
301
302 CLAMP - Clamp
303
304 .. math::
305
306 dst.x = clamp(src0.x, src1.x, src2.x)
307
308 dst.y = clamp(src0.y, src1.y, src2.y)
309
310 dst.z = clamp(src0.z, src1.z, src2.z)
311
312 dst.w = clamp(src0.w, src1.w, src2.w)
313
314
315 FLR - Floor
316
317 This is identical to ARL.
318
319 .. math::
320
321 dst.x = \lfloor src.x\rfloor
322
323 dst.y = \lfloor src.y\rfloor
324
325 dst.z = \lfloor src.z\rfloor
326
327 dst.w = \lfloor src.w\rfloor
328
329
330 ROUND - Round
331
332 .. math::
333
334 dst.x = round(src.x)
335
336 dst.y = round(src.y)
337
338 dst.z = round(src.z)
339
340 dst.w = round(src.w)
341
342
343 EX2 - Exponential Base 2
344
345 .. math::
346
347 dst.x = 2^{src.x}
348
349 dst.y = 2^{src.x}
350
351 dst.z = 2^{src.x}
352
353 dst.w = 2^{src.x}
354
355
356 LG2 - Logarithm Base 2
357
358 .. math::
359
360 dst.x = \log_2{src.x}
361
362 dst.y = \log_2{src.x}
363
364 dst.z = \log_2{src.x}
365
366 dst.w = \log_2{src.x}
367
368
369 POW - Power
370
371 .. math::
372
373 dst.x = src0.x^{src1.x}
374
375 dst.y = src0.x^{src1.x}
376
377 dst.z = src0.x^{src1.x}
378
379 dst.w = src0.x^{src1.x}
380
381 XPD - Cross Product
382
383 .. math::
384
385 dst.x = src0.y \times src1.z - src1.y \times src0.z
386
387 dst.y = src0.z \times src1.x - src1.z \times src0.x
388
389 dst.z = src0.x \times src1.y - src1.x \times src0.y
390
391 dst.w = 1
392
393
394 ABS - Absolute
395
396 .. math::
397
398 dst.x = |src.x|
399
400 dst.y = |src.y|
401
402 dst.z = |src.z|
403
404 dst.w = |src.w|
405
406
407 RCC - Reciprocal Clamped
408
409 XXX cleanup on aisle three
410
411 .. math::
412
413 dst.x = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
414
415 dst.y = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
416
417 dst.z = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
418
419 dst.w = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
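
The RCC formula keeps the sign of the reciprocal and clamps only its magnitude, so the result is never zero and never exceeds the upper bound. A Python sketch using the constants exactly as written above (``rcc``, ``RCC_MIN`` and ``RCC_MAX`` are illustrative names):

```python
RCC_MIN = 5.42101e-20
RCC_MAX = 1.884467e+19


def clamp(x, lo, hi):
    """clamp(x, y, z) as defined in the Functions section."""
    return lo if x < lo else hi if x > hi else x


def rcc(x):
    """Reference sketch of RCC applied to src.x (x must be non-zero here)."""
    r = 1.0 / x
    if r > 0:
        r = clamp(r, RCC_MIN, RCC_MAX)
    else:
        r = clamp(r, -RCC_MAX, -RCC_MIN)
    return (r, r, r, r)
```

Python raises ``ZeroDivisionError`` for ``x = 0``; the formulas do not say what a real implementation returns in that case.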
420
421
422 DPH - Homogeneous Dot Product
423
424 .. math::
425
426 dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
427
428 dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
429
430 dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
431
432 dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
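
Read term by term, DPH is a DP4 in which ``src0.w`` is treated as 1, whatever value the register actually holds. The Python sketch below illustrates that equivalence; ``dp4`` and ``dph`` are illustrative names.

```python
def dp4(a, b):
    """Scalar 4-component dot product of two tuples."""
    return sum(x * y for x, y in zip(a, b))


def dph(src0, src1):
    """Reference sketch of DPH: src0.w is ignored and treated as 1."""
    d = (src0[0] * src1[0] + src0[1] * src1[1]
         + src0[2] * src1[2] + src1[3])
    return (d, d, d, d)
```

For ``src0 = (1, 2, 3, 99)`` and ``src1 = (4, 5, 6, 7)`` every destination component is 39.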
433
434
435 COS - Cosine
436
437 .. math::
438
439 dst.x = \cos{src.x}
440
441 dst.y = \cos{src.x}
442
443 dst.z = \cos{src.x}
444
445 dst.w = \cos{src.x}
446
447
448 DDX - Derivative Relative To X
449
450 .. math::
451
452 dst.x = partialx(src.x)
453
454 dst.y = partialx(src.y)
455
456 dst.z = partialx(src.z)
457
458 dst.w = partialx(src.w)
459
460
461 DDY - Derivative Relative To Y
462
463 .. math::
464
465 dst.x = partialy(src.x)
466
467 dst.y = partialy(src.y)
468
469 dst.z = partialy(src.z)
470
471 dst.w = partialy(src.w)
472
473
474 KILP - Predicated Discard
475
476 discard
477
478
479 PK2H - Pack Two 16-bit Floats
480
481 TBD
482
483
484 PK2US - Pack Two Unsigned 16-bit Scalars
485
486 TBD
487
488
489 PK4B - Pack Four Signed 8-bit Scalars
490
491 TBD
492
493
494 PK4UB - Pack Four Unsigned 8-bit Scalars
495
496 TBD
497
498
499 RFL - Reflection Vector
500
501 .. math::
502
503 dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x
504
505 dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y
506
507 dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z
508
509 dst.w = 1
510
511 Considered for removal.
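
The RFL formulas reflect ``src1`` about the axis given by ``src0``; dividing by the squared length of ``src0`` means the axis does not need to be normalised. A Python sketch of the same computation (``rfl`` and ``dot3`` are illustrative names):

```python
def dot3(a, b):
    """Dot product of the first three components."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rfl(src0, src1):
    """Reference sketch of RFL: reflect src1 about the axis src0."""
    scale = 2.0 * dot3(src0, src1) / dot3(src0, src0)
    return (scale * src0[0] - src1[0],
            scale * src0[1] - src1[1],
            scale * src0[2] - src1[2],
            1.0)
```

Reflecting ``(1, 2, 3)`` about the unnormalised Z axis ``(0, 0, 2)`` gives ``(-1, -2, 3, 1)``: the component along the axis is kept and the other two are negated.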
512
513
514 SEQ - Set On Equal
515
516 .. math::
517
518 dst.x = (src0.x == src1.x) ? 1 : 0
519
520 dst.y = (src0.y == src1.y) ? 1 : 0
521
522 dst.z = (src0.z == src1.z) ? 1 : 0
523
524 dst.w = (src0.w == src1.w) ? 1 : 0
525
526
527 SFL - Set On False
528
529 .. math::
530
531 dst.x = 0
532
533 dst.y = 0
534
535 dst.z = 0
536
537 dst.w = 0
538
539 Considered for removal.
540
541 SGT - Set On Greater Than
542
543 .. math::
544
545 dst.x = (src0.x > src1.x) ? 1 : 0
546
547 dst.y = (src0.y > src1.y) ? 1 : 0
548
549 dst.z = (src0.z > src1.z) ? 1 : 0
550
551 dst.w = (src0.w > src1.w) ? 1 : 0
552
553
554 SIN - Sine
555
556 .. math::
557
558 dst.x = \sin{src.x}
559
560 dst.y = \sin{src.x}
561
562 dst.z = \sin{src.x}
563
564 dst.w = \sin{src.x}
565
566
567 SLE - Set On Less Equal Than
568
569 .. math::
570
571 dst.x = (src0.x <= src1.x) ? 1 : 0
572
573 dst.y = (src0.y <= src1.y) ? 1 : 0
574
575 dst.z = (src0.z <= src1.z) ? 1 : 0
576
577 dst.w = (src0.w <= src1.w) ? 1 : 0
578
579
580 SNE - Set On Not Equal
581
582 .. math::
583
584 dst.x = (src0.x != src1.x) ? 1 : 0
585
586 dst.y = (src0.y != src1.y) ? 1 : 0
587
588 dst.z = (src0.z != src1.z) ? 1 : 0
589
590 dst.w = (src0.w != src1.w) ? 1 : 0
591
592
593 STR - Set On True
594
595 .. math::
596
597 dst.x = 1
598
599 dst.y = 1
600
601 dst.z = 1
602
603 dst.w = 1
604
605
606 TEX - Texture Lookup
607
608 TBD
609
610
611 TXD - Texture Lookup with Derivatives
612
613 TBD
614
615
616 TXP - Projective Texture Lookup
617
618 TBD
619
620
621 UP2H - Unpack Two 16-Bit Floats
622
623 TBD
624
625 Considered for removal.
626
627 UP2US - Unpack Two Unsigned 16-Bit Scalars
628
629 TBD
630
631 Considered for removal.
632
633 UP4B - Unpack Four Signed 8-Bit Values
634
635 TBD
636
637 Considered for removal.
638
639 UP4UB - Unpack Four Unsigned 8-Bit Scalars
640
641 TBD
642
643 Considered for removal.
644
645 X2D - 2D Coordinate Transformation
646
647 .. math::
648
649 dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
650
651 dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
652
653 dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
654
655 dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
656
657 Considered for removal.
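
One way to read X2D is as a 2D affine transform: ``src2`` holds a 2x2 matrix in row-major order (``x y`` / ``z w``), ``src1.xy`` is the point being transformed and ``src0.xy`` is the translation. This reading is inferred from the formulas, not stated by them. A Python sketch (``x2d`` is an illustrative name):

```python
def x2d(src0, src1, src2):
    """Reference sketch of X2D; the result is replicated into zw."""
    x = src0[0] + src1[0] * src2[0] + src1[1] * src2[1]
    y = src0[1] + src1[0] * src2[2] + src1[1] * src2[3]
    return (x, y, x, y)
```

With a 90-degree rotation matrix ``(0, -1, 1, 0)``, point ``(1, 2)`` and translation ``(10, 20)``, the result is ``(8, 21, 8, 21)``.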
658
659
660 From GL_NV_vertex_program2
661 ^^^^^^^^^^^^^^^^^^^^^^^^^^
662
663
664 ARA - Address Register Add
665
666 TBD
667
668 Considered for removal.
669
670 ARR - Address Register Load With Round
671
672 .. math::
673
674 dst.x = round(src.x)
675
676 dst.y = round(src.y)
677
678 dst.z = round(src.z)
679
680 dst.w = round(src.w)
681
682
683 BRA - Branch
684
685 pc = target
686
687 Considered for removal.
688
689 CAL - Subroutine Call
690
691 push(pc)
692 pc = target
693
694
695 RET - Subroutine Call Return
696
697 pc = pop()
698
699 Potential restrictions:
700 * Only occurs at end of function.
701
702 SSG - Set Sign
703
704 .. math::
705
706 dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
707
708 dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
709
710 dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
711
712 dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
713
714
715 CMP - Compare
716
717 .. math::
718
719 dst.x = (src0.x < 0) ? src1.x : src2.x
720
721 dst.y = (src0.y < 0) ? src1.y : src2.y
722
723 dst.z = (src0.z < 0) ? src1.z : src2.z
724
725 dst.w = (src0.w < 0) ? src1.w : src2.w
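
SSG and CMP both branch per component on the sign of a source value. As an illustration, CMP can select between a value and its negation to compute an absolute value. The Python sketch below uses illustrative names (``ssg``, ``cmp_op``) and is not taken from any driver.

```python
def ssg(src):
    """Reference sketch of SSG: per-component sign as -1, 0 or 1."""
    return tuple(1.0 if c > 0 else -1.0 if c < 0 else 0.0 for c in src)


def cmp_op(src0, src1, src2):
    """Reference sketch of CMP: pick src1 where src0 < 0, else src2."""
    return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))


v = (-2.0, 0.0, 3.0, -0.5)
negated = tuple(-c for c in v)
absolute = cmp_op(v, negated, v)   # |v| built from CMP
```

Here ``ssg(v)`` is ``(-1.0, 0.0, 1.0, -1.0)`` and ``absolute`` is ``(2.0, 0.0, 3.0, 0.5)``.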
726
727
728 KIL - Conditional Discard
729
730 .. math::
731
732 if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
733 discard
734 endif
735
736
737 SCS - Sine Cosine
738
739 .. math::
740
741 dst.x = \cos{src.x}
742
743 dst.y = \sin{src.x}
744
745 dst.z = 0
746
dst.w = 1
748
749
750 TXB - Texture Lookup With Bias
751
752 TBD
753
754
755 NRM - 3-component Vector Normalise
756
757 .. math::
758
dst.x = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}

dst.y = \frac{src.y}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}

dst.z = \frac{src.z}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}
764
765 dst.w = 1
766
767
768 DIV - Divide
769
770 .. math::
771
772 dst.x = \frac{src0.x}{src1.x}
773
774 dst.y = \frac{src0.y}{src1.y}
775
776 dst.z = \frac{src0.z}{src1.z}
777
778 dst.w = \frac{src0.w}{src1.w}
779
780
781 DP2 - 2-component Dot Product
782
783 .. math::
784
785 dst.x = src0.x \times src1.x + src0.y \times src1.y
786
787 dst.y = src0.x \times src1.x + src0.y \times src1.y
788
789 dst.z = src0.x \times src1.x + src0.y \times src1.y
790
791 dst.w = src0.x \times src1.x + src0.y \times src1.y
792
793
794 TXL - Texture Lookup With LOD
795
796 TBD
797
798
799 BRK - Break
800
801 TBD
802
803
804 IF - If
805
806 TBD
807
808
809 BGNFOR - Begin a For-Loop
810
  dst.x = floor(src.x)
  dst.y = floor(src.y)
  dst.z = floor(src.z)

  if (dst.y <= 0)
    pc = [matching ENDFOR] + 1
  endif

  Note: The destination must be a loop register.
        The source must be a constant register.
821
822 Considered for cleanup / removal.
823
824
825 REP - Repeat
826
827 TBD
828
829
830 ELSE - Else
831
832 TBD
833
834
835 ENDIF - End If
836
837 TBD
838
839
840 ENDFOR - End a For-Loop
841
  dst.x = dst.x + dst.z
  dst.y = dst.y - 1.0

  if (dst.y > 0)
    pc = [matching BGNFOR instruction] + 1
  endif

  Note: The destination must be a loop register.
850
851 Considered for cleanup / removal.
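
Taken together, the BGNFOR and ENDFOR pseudocode describes a counted loop over a loop register. Judging from the pseudocode, ``x`` acts as the running index, ``y`` as the remaining iteration count and ``z`` as the step; those role names are an interpretation, not terms this document uses. A small Python simulation (``run_for_loop`` and ``body`` are hypothetical names):

```python
import math


def run_for_loop(src, body):
    """Simulate BGNFOR ... ENDFOR; returns the final loop register."""
    # BGNFOR: load the loop register from the constant source.
    loop = [math.floor(src[0]), math.floor(src[1]), math.floor(src[2])]
    if loop[1] <= 0:
        return loop              # jump past the matching ENDFOR
    while True:
        body(loop[0])            # instructions between BGNFOR and ENDFOR
        # ENDFOR: advance the index, decrement the count.
        loop[0] += loop[2]
        loop[1] -= 1
        if not loop[1] > 0:
            return loop          # fall through after ENDFOR
```

Running it with ``src = (0.0, 3.0, 2.0)`` executes the body three times, with index values 0, 2 and 4.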
852
853 ENDREP - End Repeat
854
855 TBD
856
857
858 PUSHA - Push Address Register On Stack
859
860 push(src.x)
861 push(src.y)
862 push(src.z)
863 push(src.w)
864
865 Considered for cleanup / removal.
866
867 POPA - Pop Address Register From Stack
868
869 dst.w = pop()
870 dst.z = pop()
871 dst.y = pop()
872 dst.x = pop()
873
874 Considered for cleanup / removal.
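
PUSHA pushes the components in ``x, y, z, w`` order and POPA pops them in the reverse order, so a PUSHA followed by a POPA restores the original register. A Python sketch using a list as the stack (``pusha`` and ``popa`` are illustrative names):

```python
def pusha(stack, src):
    """Reference sketch of PUSHA: push x, y, z, w in that order."""
    for component in src:
        stack.append(component)


def popa(stack):
    """Reference sketch of POPA: pop w, z, y, x and rebuild the register."""
    w = stack.pop()
    z = stack.pop()
    y = stack.pop()
    x = stack.pop()
    return (x, y, z, w)
```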
875
876
877 From GL_NV_gpu_program4
878 ^^^^^^^^^^^^^^^^^^^^^^^^
879
Support for these opcodes is indicated by a special pipe capability bit (TBD).
881
882 CEIL - Ceiling
883
884 .. math::
885
886 dst.x = \lceil src.x\rceil
887
888 dst.y = \lceil src.y\rceil
889
890 dst.z = \lceil src.z\rceil
891
892 dst.w = \lceil src.w\rceil
893
894
895 I2F - Integer To Float
896
897 .. math::
898
899 dst.x = (float) src.x
900
901 dst.y = (float) src.y
902
903 dst.z = (float) src.z
904
905 dst.w = (float) src.w
906
907
908 NOT - Bitwise Not
909
910 .. math::
911
912 dst.x = ~src.x
913
914 dst.y = ~src.y
915
916 dst.z = ~src.z
917
918 dst.w = ~src.w
919
920
921 TRUNC - Truncate
922
923 .. math::
924
925 dst.x = trunc(src.x)
926
927 dst.y = trunc(src.y)
928
929 dst.z = trunc(src.z)
930
931 dst.w = trunc(src.w)
932
933
934 SHL - Shift Left
935
936 .. math::
937
938 dst.x = src0.x << src1.x
939
940 dst.y = src0.y << src1.x
941
942 dst.z = src0.z << src1.x
943
944 dst.w = src0.w << src1.x
945
946
947 SHR - Shift Right
948
949 .. math::
950
951 dst.x = src0.x >> src1.x
952
953 dst.y = src0.y >> src1.x
954
955 dst.z = src0.z >> src1.x
956
957 dst.w = src0.w >> src1.x
958
959
960 AND - Bitwise And
961
962 .. math::
963
964 dst.x = src0.x & src1.x
965
966 dst.y = src0.y & src1.y
967
968 dst.z = src0.z & src1.z
969
970 dst.w = src0.w & src1.w
971
972
973 OR - Bitwise Or
974
975 .. math::
976
977 dst.x = src0.x | src1.x
978
979 dst.y = src0.y | src1.y
980
981 dst.z = src0.z | src1.z
982
983 dst.w = src0.w | src1.w
984
985
986 MOD - Modulus
987
988 .. math::
989
990 dst.x = src0.x \bmod src1.x
991
992 dst.y = src0.y \bmod src1.y
993
994 dst.z = src0.z \bmod src1.z
995
996 dst.w = src0.w \bmod src1.w
997
998
999 XOR - Bitwise Xor
1000
1001 .. math::
1002
1003 dst.x = src0.x \oplus src1.x
1004
1005 dst.y = src0.y \oplus src1.y
1006
1007 dst.z = src0.z \oplus src1.z
1008
1009 dst.w = src0.w \oplus src1.w
1010
1011
1012 SAD - Sum Of Absolute Differences
1013
1014 .. math::
1015
1016 dst.x = |src0.x - src1.x| + src2.x
1017
1018 dst.y = |src0.y - src1.y| + src2.y
1019
1020 dst.z = |src0.z - src1.z| + src2.z
1021
1022 dst.w = |src0.w - src1.w| + src2.w
1023
1024
1025 TXF - Texel Fetch
1026
1027 TBD
1028
1029
1030 TXQ - Texture Size Query
1031
1032 TBD
1033
1034
1035 CONT - Continue
1036
1037 TBD
1038
1039
1040 From GL_NV_geometry_program4
1041 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1042
1043
1044 EMIT - Emit
1045
1046 TBD
1047
1048
1049 ENDPRIM - End Primitive
1050
1051 TBD
1052
1053
1054 From GLSL
1055 ^^^^^^^^^^
1056
1057
1058 BGNLOOP - Begin a Loop
1059
1060 TBD
1061
1062
1063 BGNSUB - Begin Subroutine
1064
1065 TBD
1066
1067
1068 ENDLOOP - End a Loop
1069
1070 TBD
1071
1072
1073 ENDSUB - End Subroutine
1074
1075 TBD
1076
1077
1078 NOP - No Operation
1079
1080 Do nothing.
1081
1082
1083 NRM4 - 4-component Vector Normalise
1084
1085 .. math::
1086
dst.x = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}

dst.y = \frac{src.y}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}

dst.z = \frac{src.z}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}

dst.w = \frac{src.w}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
1094
1095
1096 ps_2_x
1097 ^^^^^^^^^^^^
1098
1099
1100 CALLNZ - Subroutine Call If Not Zero
1101
1102 TBD
1103
1104
1105 IFC - If
1106
1107 TBD
1108
1109
1110 BREAKC - Break Conditional
1111
1112 TBD
1113
1114
1115 Explanation of symbols used
1116 ------------------------------
1117
1118
1119 Functions
1120 ^^^^^^^^^^^^^^
1121
1122
1123 :math:`|x|` Absolute value of `x`.
1124
1125 :math:`\lceil x \rceil` Ceiling of `x`.
1126
1127 clamp(x,y,z) Clamp x between y and z.
1128 (x < y) ? y : (x > z) ? z : x
1129
1130 :math:`\lfloor x\rfloor` Floor of `x`.
1131
1132 :math:`\log_2{x}` Logarithm of `x`, base 2.
1133
1134 max(x,y) Maximum of x and y.
1135 (x > y) ? x : y
1136
1137 min(x,y) Minimum of x and y.
1138 (x < y) ? x : y
1139
1140 partialx(x) Derivative of x relative to fragment's X.
1141
1142 partialy(x) Derivative of x relative to fragment's Y.
1143
1144 pop() Pop from stack.
1145
1146 :math:`x^y` `x` to the power `y`.
1147
1148 push(x) Push x on stack.
1149
1150 round(x) Round x.
1151
1152 trunc(x) Truncate x, i.e. drop the fraction bits.
1153
1154
1155 Keywords
1156 ^^^^^^^^^^^^^
1157
1158
1159 discard Discard fragment.
1160
1161 dst First destination register.
1162
1163 dst0 First destination register.
1164
1165 pc Program counter.
1166
1167 src First source register.
1168
1169 src0 First source register.
1170
1171 src1 Second source register.
1172
1173 src2 Third source register.
1174
1175 target Label of target instruction.
1176
1177
1178 Other tokens
1179 ---------------
1180
1181
1182 Declaration Semantic
1183 ^^^^^^^^^^^^^^^^^^^^^^^^
1184
1185
Follows a Declaration token if its Semantic bit is set.

Since its purpose is to link a shader with other stages of the pipeline,
it may only follow Declaration tokens that declare a register in either
the INPUT or the OUTPUT file.
1191
1192 SemanticName field contains the semantic name of the register being declared.
1193 There is no default value.
1194
1195 SemanticIndex is an optional subscript that can be used to distinguish
1196 different register declarations with the same semantic name. The default value
1197 is 0.
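
The rules for SemanticName and SemanticIndex can be summarised in a few lines of Python. This is only a sketch: ``declare``, ``LINKABLE_FILES`` and the tuple layout are hypothetical and do not mirror the actual token encoding.

```python
# Register files for which a semantic may follow the declaration.
LINKABLE_FILES = ("INPUT", "OUTPUT")


def declare(file, semantic_name, semantic_index=0):
    """Hypothetical helper: SemanticName is required, SemanticIndex
    is optional and defaults to 0."""
    if file not in LINKABLE_FILES:
        raise ValueError("semantics only apply to INPUT/OUTPUT registers")
    return (file, semantic_name, semantic_index)


# Two registers sharing a semantic name are told apart by their index.
texcoord0 = declare("OUTPUT", "GENERIC")
texcoord1 = declare("OUTPUT", "GENERIC", 1)
```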
1198
1199 The meanings of the individual semantic names are explained in the following
1200 sections.
1201
1202 TGSI_SEMANTIC_POSITION
1203 """"""""""""""""""""""
1204
Position, sometimes known as HPOS or WPOS for historical reasons, is the
location of the vertex in space, in ``(x, y, z, w)`` format. ``x``, ``y``, and ``z``
are the Cartesian coordinates, and ``w`` is the homogeneous coordinate, which is
used for the perspective divide, if enabled.
1209
1210 As a vertex shader output, position should be scaled to the viewport. When
1211 used in fragment shaders, position will ---
1212
1213 XXX --- wait a minute. Should position be in [0,1] for x and y?
1214
1215 XXX additionally, is there a way to configure the perspective divide? it's
1216 accelerated on most chipsets AFAIK...
1217
1218 Position, if not specified, usually defaults to ``(0, 0, 0, 1)``, and can
1219 be partially specified as ``(x, y, 0, 1)`` or ``(x, y, z, 1)``.
1220
1221 XXX usually? can we solidify that?
1222
1223 TGSI_SEMANTIC_COLOR
1224 """""""""""""""""""
1225
1226 Colors are used to, well, color the primitives. Colors are always in
1227 ``(r, g, b, a)`` format.
1228
1229 If alpha is not specified, it defaults to 1.
1230
1231 TGSI_SEMANTIC_BCOLOR
1232 """"""""""""""""""""
1233
1234 Back-facing colors are only used for back-facing polygons, and are only valid
1235 in vertex shader outputs. After rasterization, all polygons are front-facing
1236 and COLOR and BCOLOR end up occupying the same slots in the fragment, so
1237 all BCOLORs effectively become regular COLORs in the fragment shader.
1238
1239 TGSI_SEMANTIC_FOG
1240 """""""""""""""""
1241
1242 The fog coordinate historically has been used to replace the depth coordinate
1243 for generation of fog in dedicated fog blocks. Gallium, however, does not use
1244 dedicated fog acceleration, placing it entirely in the fragment shader
1245 instead.
1246
1247 The fog coordinate should be written in ``(f, 0, 0, 1)`` format. Only the first
1248 component matters when writing from the vertex shader; the driver will ensure
1249 that the coordinate is in this format when used as a fragment shader input.
1250
1251 TGSI_SEMANTIC_PSIZE
1252 """""""""""""""""""
1253
1254 PSIZE, or point size, is used to specify point sizes per-vertex. It should
1255 be in ``(p, n, x, f)`` format, where ``p`` is the point size, ``n`` is the minimum
1256 size, ``x`` is the maximum size, and ``f`` is the fade threshold.
1257
1258 XXX this is arb_vp. is this what we actually do? should double-check...
1259
1260 When using this semantic, be sure to set the appropriate state in the
1261 :ref:`rasterizer` first.
1262
1263 TGSI_SEMANTIC_GENERIC
1264 """""""""""""""""""""
1265
1266 Generic semantics are nearly always used for texture coordinate attributes,
1267 in ``(s, t, r, q)`` format. ``t`` and ``r`` may be unused for certain kinds
1268 of lookups, and ``q`` is the level-of-detail bias for biased sampling.
1269
1270 These attributes are called "generic" because they may be used for anything
1271 else, including parameters, texture generation information, or anything that
1272 can be stored inside a four-component vector.
1273
1274 TGSI_SEMANTIC_NORMAL
1275 """"""""""""""""""""
1276
1277 Vertex normal; could be used to implement per-pixel lighting for legacy APIs
1278 that allow mixing fixed-function and programmable stages.
1279
1280 TGSI_SEMANTIC_FACE
1281 """"""""""""""""""
1282
FACE is the facing bit, which stores the facing information for the fragment
shader. The format is ``(f, 0, 0, 1)``. The first component will be positive
when the fragment is front-facing, and negative when the fragment is
back-facing.
1287
1288 TGSI_SEMANTIC_EDGEFLAG
1289 """"""""""""""""""""""
1290
1291 XXX no clue