.. _context:

Context
=======

A Gallium rendering context encapsulates the state which affects 3D
rendering such as blend state, depth/stencil state, texture samplers,
etc.

Note that resource/texture allocation is not per-context but per-screen.


Methods
-------

CSO State
^^^^^^^^^

All Constant State Object (CSO) state is created, bound, and destroyed
with triplets of methods that all follow a specific naming scheme.
For example, ``create_blend_state``, ``bind_blend_state``, and
``destroy_blend_state``.

CSO objects handled by the context object:

* :ref:`Blend`: ``*_blend_state``
* :ref:`Sampler`: Texture sampler states are bound separately for fragment,
  vertex and geometry samplers. Note that sampler states are set en masse.
  If M is the max number of sampler units supported by the driver and N
  samplers are bound with ``bind_fragment_sampler_states`` then sampler
  units N..M-1 are considered disabled/NULL.
* :ref:`Rasterizer`: ``*_rasterizer_state``
* :ref:`Depth, Stencil, & Alpha`: ``*_depth_stencil_alpha_state``
* :ref:`Shader`: These are create, bind and destroy methods for vertex,
  fragment and geometry shaders.
* :ref:`Vertex Elements`: ``*_vertex_elements_state``
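
The en masse rule for sampler states can be illustrated with a small
stand-alone model. This is only a sketch: ``MAX_SAMPLERS``,
``struct sampler_slots`` and ``bind_sampler_states`` are hypothetical names
chosen for the example and are not part of the Gallium interface.

.. code-block:: c

    #include <stddef.h>

    #define MAX_SAMPLERS 16  /* M: assumed driver limit, illustration only */

    /* Hypothetical per-stage table of bound sampler state objects. */
    struct sampler_slots {
        void *state[MAX_SAMPLERS];
    };

    /* Bind `count` (N) states; units N..M-1 become NULL, i.e. disabled. */
    static void bind_sampler_states(struct sampler_slots *slots,
                                    unsigned count, void **states)
    {
        unsigned i;

        if (count > MAX_SAMPLERS)
            count = MAX_SAMPLERS;
        for (i = 0; i < count; i++)
            slots->state[i] = states[i];
        for (; i < MAX_SAMPLERS; i++)
            slots->state[i] = NULL;
    }

Binding three states and then one state leaves units 1 and 2 NULL after the
second call, because each call replaces the whole set.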


Resource Binding State
^^^^^^^^^^^^^^^^^^^^^^

This state describes how resources in various flavours (textures,
buffers, surfaces) are bound to the driver.


* ``set_constant_buffer`` sets a constant buffer to be used for a given shader
  type. ``index`` indicates which buffer to set (some APIs may allow
  multiple ones to be set, and binding a specific one later, though drivers
  are mostly restricted to the first one right now).

* ``set_framebuffer_state``

* ``set_vertex_buffers``

* ``set_index_buffer``

* ``set_stream_output_buffers``


Non-CSO State
^^^^^^^^^^^^^

These pieces of state are too small, variable, and/or trivial to have CSO
objects. They all follow simple, one-method binding calls, e.g.
``set_blend_color``.

* ``set_stencil_ref`` sets the stencil front and back reference values
  which are used as comparison values in the stencil test.
* ``set_blend_color``
* ``set_sample_mask``
* ``set_clip_state``
* ``set_polygon_stipple``
* ``set_scissor_state`` sets the bounds for the scissor test, which culls
  pixels before blending to render targets. If the :ref:`Rasterizer` does
  not have the scissor test enabled, then the scissor bounds never need to
  be set since they will not be used. Note that scissor xmin and ymin are
  inclusive, but xmax and ymax are exclusive. The inclusive ranges in x
  and y would be [xmin..xmax-1] and [ymin..ymax-1].
* ``set_viewport_state``
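
The inclusive/exclusive convention of the scissor bounds can be made concrete
with a short test function. This is a minimal sketch: ``struct scissor_rect``
and ``scissor_contains`` are hypothetical stand-ins written for this example,
not the actual Gallium structures.

.. code-block:: c

    #include <stdbool.h>

    /* Hypothetical stand-in for the scissor bounds described above. */
    struct scissor_rect {
        unsigned minx, miny;  /* inclusive */
        unsigned maxx, maxy;  /* exclusive */
    };

    /* Returns true if pixel (x, y) passes the scissor test. */
    static bool scissor_contains(const struct scissor_rect *s,
                                 unsigned x, unsigned y)
    {
        return x >= s->minx && x < s->maxx &&
               y >= s->miny && y < s->maxy;
    }

With bounds of (0, 0) to (4, 4), pixel (3, 3) passes while (4, 0) and (0, 4)
are culled.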


Sampler Views
^^^^^^^^^^^^^

These are the means to bind textures to shader stages. To create one, specify
its format, swizzle and LOD range in a sampler view template.

If the texture format differs from the template format, the texture is said
to be cast to another format. Casting can be done only between compatible
formats, that is, formats that have matching component order and sizes.

Swizzle fields specify the way in which fetched texel components are placed
in the result register. For example, ``swizzle_r`` specifies what is going to be
placed in the first component of the result register.

The ``first_level`` and ``last_level`` fields of the sampler view template
specify the LOD range the texture is going to be constrained to. Note that
these values are in addition to the respective ``min_lod`` and ``max_lod``
values in the ``pipe_sampler_state`` (that is, if ``min_lod`` is 2.0 and
``first_level`` is 3, the first mip level used for sampling from the resource
is effectively level 5).

The ``first_layer`` and ``last_layer`` fields specify the layer range the
texture is going to be constrained to. Similar to the LOD range, this is added
to the array index which is used for sampling.
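
The additive relationship between the view range and the index used for
sampling can be sketched as follows. This is a minimal illustration that
assumes an integer level or layer index has already been selected;
``struct view_range`` and both helpers are hypothetical, and clamping to
``last_level`` and ``last_layer`` is an assumption about how the constraint
is applied.

.. code-block:: c

    /* Hypothetical subset of a sampler view template. */
    struct view_range {
        unsigned first_level, last_level;
        unsigned first_layer, last_layer;
    };

    /* Map a view-relative mip level to a level of the resource,
     * clamping to the last level the view exposes (assumed behavior). */
    static unsigned resource_level(const struct view_range *v, unsigned level)
    {
        unsigned l = v->first_level + level;
        return l > v->last_level ? v->last_level : l;
    }

    /* Same mapping for array layers. */
    static unsigned resource_layer(const struct view_range *v, unsigned layer)
    {
        unsigned l = v->first_layer + layer;
        return l > v->last_layer ? v->last_layer : l;
    }

With ``first_level`` set to 3, a view-relative level of 2 maps to level 5 of
the resource, matching the example in the text.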

* ``set_fragment_sampler_views`` binds an array of sampler views to the
  fragment shader stage. Every binding point acquires a reference
  to a respective sampler view and releases a reference to the previous
  sampler view. If M is the maximum number of sampler units and N units
  are passed to ``set_fragment_sampler_views``, the driver should unbind the
  sampler views for units N..M-1.

* ``set_vertex_sampler_views`` binds an array of sampler views to the vertex
  shader stage. Every binding point acquires a reference to a respective
  sampler view and releases a reference to the previous sampler view.

* ``create_sampler_view`` creates a new sampler view. ``texture`` is associated
  with the sampler view, which results in the sampler view holding a reference
  to the texture. The format specified in the template must be compatible
  with the texture format.

* ``sampler_view_destroy`` destroys a sampler view and releases its reference
  to the associated texture.

Surfaces
^^^^^^^^

These are the means to use resources as color render targets or depth/stencil
attachments. To create one, specify the mip level, the range of layers, and
the bind flags (either PIPE_BIND_DEPTH_STENCIL or PIPE_BIND_RENDER_TARGET).
Note that layer values are in addition to what is indicated by the geometry
shader output variable XXX_FIXME (that is, if first_layer is 3 and the geometry
shader indicates index 2, layer 5 of the resource will be used). These
first_layer and last_layer parameters will only be used for 1d array, 2d array,
cube, and 3d textures; otherwise they are 0.

* ``create_surface`` creates a new surface.

* ``surface_destroy`` destroys a surface and releases its reference to the
  associated resource.

Clearing
^^^^^^^^

Clear is one of the most difficult concepts to nail down to a single
interface (due to both different requirements from APIs and also driver/hw
specific differences).

``clear`` initializes some or all of the surfaces currently bound to
the framebuffer to particular RGBA, depth, or stencil values.
Currently, this does not take into account color or stencil write masks (as
used by GL), and always clears the whole surfaces (no scissoring as used by
GL clear or explicit rectangles like d3d9 uses). It can, however, also clear
only depth or stencil in a combined depth/stencil surface, if the driver
supports PIPE_CAP_DEPTHSTENCIL_CLEAR_SEPARATE.
If a surface includes several layers then all layers will be cleared.

``clear_render_target`` clears a single color rendertarget with the specified
color value. While it is only possible to clear one surface at a time (which can
include several layers), this surface need not be bound to the framebuffer.

``clear_depth_stencil`` clears a single depth, stencil or depth/stencil surface
with the specified depth and stencil values (for combined depth/stencil buffers,
it is also possible to only clear one or the other part). While it is only
possible to clear one surface at a time (which can include several layers),
this surface need not be bound to the framebuffer.


Drawing
^^^^^^^

``draw_vbo`` draws a specified primitive. The primitive mode and other
properties are described by ``pipe_draw_info``.

The ``mode``, ``start``, and ``count`` fields of ``pipe_draw_info`` specify
the mode of the primitive and the vertices to be fetched, in the range from
``start`` to ``start + count - 1``, inclusive.

Every instance with instanceID in the range from ``start_instance`` to
``start_instance + instance_count - 1``, inclusive, will be drawn.

If there is an index buffer bound, and the ``indexed`` field is true, all
vertex indices will be looked up in the index buffer.

In an indexed draw, ``min_index`` and ``max_index`` respectively provide a
lower and upper bound of the indices contained in the index buffer inside the
range from ``start`` to ``start + count - 1``. This allows the driver to
determine which subset of vertices will be referenced during the draw call
without having to scan the index buffer. Providing an overestimation of
the true bounds, for example, a ``min_index`` and ``max_index`` of 0 and
0xffffffff respectively, must give exactly the same rendering, albeit with less
performance due to unreferenced vertex buffers being unnecessarily DMA'ed or
processed. Providing an underestimation of the true bounds will result in
undefined behavior, but should not result in program or system failure.

In the case of a non-indexed draw, ``min_index`` should be set to
``start`` and ``max_index`` should be set to ``start + count - 1``.

``index_bias`` is a value added to every vertex index after lookup and before
fetching vertex attributes.

When drawing indexed primitives, the primitive restart index can be
used to draw disjoint primitive strips. For example, several separate
line strips can be drawn by designating a special index value as the
restart index. The ``primitive_restart`` flag enables/disables this
feature. The ``restart_index`` field specifies the restart index value.

When primitive restart is in use, array indexes are compared to the
restart index before adding the index_bias offset.
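
The following sketch shows one way a state tracker might compute exact
``min_index`` and ``max_index`` bounds, and the order in which the restart
comparison and ``index_bias`` are applied. It is an illustration under stated
assumptions, not driver code: ``struct index_bounds``,
``compute_index_bounds`` and ``fetch_index`` are hypothetical, 32-bit indices
are assumed, and skipping restart indices while scanning is a choice made for
this example.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical container for the computed bounds. */
    struct index_bounds {
        uint32_t min_index;
        uint32_t max_index;
        bool valid;   /* false if no index referenced a vertex */
    };

    /* Scan indices[start .. start+count-1]; restart indices are skipped
     * because they do not reference a vertex. */
    static struct index_bounds
    compute_index_bounds(const uint32_t *indices, size_t start, size_t count,
                         bool primitive_restart, uint32_t restart_index)
    {
        struct index_bounds b = { UINT32_MAX, 0, false };
        size_t i;

        for (i = start; i < start + count; i++) {
            uint32_t idx = indices[i];

            if (primitive_restart && idx == restart_index)
                continue;
            if (idx < b.min_index)
                b.min_index = idx;
            if (idx > b.max_index)
                b.max_index = idx;
            b.valid = true;
        }
        return b;
    }

    /* The restart comparison uses the raw index; index_bias is added
     * afterwards, just before vertex attributes would be fetched.
     * Returns false when the index is a restart marker. */
    static bool fetch_index(uint32_t raw, bool primitive_restart,
                            uint32_t restart_index, int32_t index_bias,
                            int64_t *vertex)
    {
        if (primitive_restart && raw == restart_index)
            return false;
        *vertex = (int64_t)raw + index_bias;
        return true;
    }

For the index list 5, 7, 0xffff, 3, 9 with a restart index of 0xffff, the
scan yields bounds of 3 and 9; passing wider bounds would still be valid, only
potentially slower.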

If a given vertex element has ``instance_divisor`` set to 0, it is said
to contain per-vertex data, and the effective vertex attribute address needs
to be recalculated for every index.

  attribAddr = ``stride`` * index + ``src_offset``

If a given vertex element has ``instance_divisor`` set to non-zero,
it is said to contain per-instance data, and the effective vertex attribute
address needs to be recalculated for every ``instance_divisor``-th instance.

  attribAddr = ``stride`` * (instanceID / ``instance_divisor``) + ``src_offset``

In the above formulas, ``src_offset`` is taken from the given vertex element,
``stride`` is taken from a vertex buffer associated with the given
vertex element, and the division is an integer division.

The calculated attribAddr is used as an offset into the vertex buffer to
fetch the attribute data.
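
The two address formulas can be combined into one helper. This is a minimal
sketch: ``struct attrib_layout`` and ``attrib_addr`` are hypothetical names
that only mirror the fields used in the formulas, and the per-instance case
assumes an integer division of the instance ID by the divisor.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical mirror of the fields used in the formulas above. */
    struct attrib_layout {
        uint32_t stride;            /* from the vertex buffer */
        uint32_t src_offset;        /* from the vertex element */
        uint32_t instance_divisor;  /* 0 means per-vertex data */
    };

    /* Byte offset into the vertex buffer for one attribute fetch. */
    static uint64_t attrib_addr(const struct attrib_layout *a,
                                uint32_t index, uint32_t instance_id)
    {
        if (a->instance_divisor == 0)
            return (uint64_t)a->stride * index + a->src_offset;

        /* Integer division: the offset only changes on every
         * instance_divisor-th instance. */
        return (uint64_t)a->stride * (instance_id / a->instance_divisor)
               + a->src_offset;
    }

With a stride of 16, a source offset of 4 and a divisor of 2, instances 0 and
1 both read from offset 4, and instance 2 reads from offset 20.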

The value of ``instanceID`` can be read in a vertex shader through a system
value register declared with the INSTANCEID semantic name.


Queries
^^^^^^^

Queries gather some statistic from the 3D pipeline over one or more
draws. Queries may be nested, though only d3d1x currently exercises this.

Queries can be created with ``create_query`` and deleted with
``destroy_query``. To start a query, use ``begin_query``, and when finished,
use ``end_query`` to end the query.

``get_query_result`` is used to retrieve the results of a query. If
the ``wait`` parameter is TRUE, then the ``get_query_result`` call
will block until the results of the query are ready (and TRUE will be
returned). Otherwise, if the ``wait`` parameter is FALSE, the call
will not block and the return value will be TRUE if the query has
completed or FALSE otherwise.

The interface currently includes the following types of queries:

``PIPE_QUERY_OCCLUSION_COUNTER`` counts the number of fragments which
are written to the framebuffer without being culled by
:ref:`Depth, Stencil, & Alpha` testing or shader KILL instructions.
The result is an unsigned 64-bit integer.
This query can be used with ``render_condition``.

In cases where a boolean result of an occlusion query is enough,
``PIPE_QUERY_OCCLUSION_PREDICATE`` should be used. It is just like
``PIPE_QUERY_OCCLUSION_COUNTER`` except that the result is a boolean
value of FALSE for cases where COUNTER would result in 0 and TRUE
for all other cases.
This query can be used with ``render_condition``.

``PIPE_QUERY_TIME_ELAPSED`` returns the amount of time, in nanoseconds,
the context takes to perform operations.
The result is an unsigned 64-bit integer.

``PIPE_QUERY_TIMESTAMP`` returns a device/driver internal timestamp,
scaled to nanoseconds, recorded after all commands issued prior to
``end_query`` have been processed.
This query does not require a call to ``begin_query``.
The result is an unsigned 64-bit integer.

``PIPE_QUERY_TIMESTAMP_DISJOINT`` can be used to check whether the
internal timer resolution is good enough to distinguish between the
events at ``begin_query`` and ``end_query``.
The result is a 64-bit integer specifying the timer resolution in Hz,
followed by a boolean value indicating whether the timer has incremented.

``PIPE_QUERY_PRIMITIVES_GENERATED`` returns a 64-bit integer indicating
the number of primitives processed by the pipeline.

``PIPE_QUERY_PRIMITIVES_EMITTED`` returns a 64-bit integer indicating
the number of primitives written to stream output buffers.

``PIPE_QUERY_SO_STATISTICS`` returns two 64-bit integers corresponding to
the results of
``PIPE_QUERY_PRIMITIVES_EMITTED`` and
``PIPE_QUERY_PRIMITIVES_GENERATED``, in this order.

``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` returns a boolean value indicating
whether the stream output targets have overflowed as a result of the
commands issued between ``begin_query`` and ``end_query``.
This query can be used with ``render_condition``.

``PIPE_QUERY_GPU_FINISHED`` returns a boolean value indicating whether
all commands issued before ``end_query`` have completed. However, this
does not imply serialization.
This query does not require a call to ``begin_query``.

``PIPE_QUERY_PIPELINE_STATISTICS`` returns an array of the following
64-bit integers:

* Number of vertices read from vertex buffers.
* Number of primitives read from vertex buffers.
* Number of vertex shader threads launched.
* Number of geometry shader threads launched.
* Number of primitives generated by geometry shaders.
* Number of primitives forwarded to the rasterizer.
* Number of primitives rasterized.
* Number of fragment shader threads launched.
* Number of tessellation control shader threads launched.
* Number of tessellation evaluation shader threads launched.

If a shader type is not supported by the device/driver,
the corresponding values should be set to 0.
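
As an illustration of the pipeline statistics result, the counters can be
pictured as a structure of ten 64-bit fields in the order listed above. The
structure and field names below are hypothetical and chosen for readability
only; they are not the names used by the Gallium headers. The helper shows the
zeroing rule for unsupported shader stages.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical layout: one 64-bit counter per entry, in the order
     * listed above. Field names are illustrative only. */
    struct pipeline_stats {
        uint64_t vertices_read;
        uint64_t primitives_read;
        uint64_t vs_threads;
        uint64_t gs_threads;
        uint64_t gs_primitives;
        uint64_t rast_primitives_in;
        uint64_t rast_primitives;
        uint64_t fs_threads;
        uint64_t tcs_threads;
        uint64_t tes_threads;
    };

    /* Zero the counters of shader stages the device does not support. */
    static void zero_unsupported(struct pipeline_stats *s,
                                 bool has_geometry, bool has_tessellation)
    {
        if (!has_geometry) {
            s->gs_threads = 0;
            s->gs_primitives = 0;
        }
        if (!has_tessellation) {
            s->tcs_threads = 0;
            s->tes_threads = 0;
        }
    }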

Gallium does not guarantee the availability of any query types; one must
always check the capabilities of the :ref:`Screen` first.


Conditional Rendering
^^^^^^^^^^^^^^^^^^^^^

A drawing command can be skipped depending on the outcome of a query
(typically an occlusion query). The ``render_condition`` function specifies
the query which should be checked prior to rendering anything.

If ``render_condition`` is called with ``query`` = NULL, conditional
rendering is disabled and drawing takes place normally.

If ``render_condition`` is called with a non-null ``query``, subsequent
drawing commands will be predicated on the outcome of the query. If
the query result is zero, subsequent drawing commands will be skipped.

If ``mode`` is PIPE_RENDER_COND_WAIT, the driver will wait for the
query to complete before deciding whether to render.

If ``mode`` is PIPE_RENDER_COND_NO_WAIT and the query has not yet
completed, the drawing command will be executed normally. If the query
has completed, drawing will be predicated on the outcome of the query.
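
The decision described by the WAIT and NO_WAIT modes can be summarized in a
small model. This is a sketch under stated assumptions: ``struct cond_query``,
``enum cond_mode``, ``should_draw`` and ``finish_now`` are hypothetical, and
the ``wait_for`` callback stands in for whatever mechanism a driver uses to
wait for a query to complete.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Stand-ins for PIPE_RENDER_COND_WAIT and PIPE_RENDER_COND_NO_WAIT. */
    enum cond_mode { COND_WAIT, COND_NO_WAIT };

    /* Hypothetical query model: `done` tells whether the result is ready. */
    struct cond_query {
        bool done;
        uint64_t result;
    };

    /* Trivial stand-in for waiting: marks the query as complete. */
    static void finish_now(struct cond_query *q)
    {
        q->done = true;
    }

    /* Decide whether a draw command should execute. */
    static bool should_draw(struct cond_query *q, enum cond_mode mode,
                            void (*wait_for)(struct cond_query *))
    {
        if (q == NULL)
            return true;          /* conditional rendering disabled */
        if (!q->done) {
            if (mode == COND_NO_WAIT)
                return true;      /* result unknown: draw normally */
            wait_for(q);          /* COND_WAIT: block until complete */
        }
        return q->result != 0;    /* a zero result skips the draw */
    }

An incomplete query with a zero result therefore still draws in NO_WAIT mode,
but is skipped in WAIT mode once the result is known.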

If ``mode`` is PIPE_RENDER_COND_BY_REGION_WAIT or
PIPE_RENDER_COND_BY_REGION_NO_WAIT, rendering will be predicated as above
for the non-REGION modes, but in the case that an occlusion query returns
a non-zero result, regions which were occluded may be omitted by subsequent
drawing commands. This can result in better performance with some GPUs.
Normally, if the occlusion query returned a non-zero result, subsequent
drawing happens normally, so fragments may be generated, shaded and
processed even where they're known to be obscured.


Flushing
^^^^^^^^

``flush``


Resource Busy Queries
^^^^^^^^^^^^^^^^^^^^^

``is_resource_referenced``



Blitting
^^^^^^^^

These methods emulate classic blitter controls.

These methods operate directly on ``pipe_resource`` objects, and stand
apart from any 3D state in the context. Blitting functionality may be
moved to a separate abstraction at some point in the future.

``resource_copy_region`` blits a region of a resource to a region of another
resource, provided that both resources have the same format, or compatible
formats, i.e., formats for which copying the bytes from the source resource
unmodified to the destination resource will achieve the same effect as a
textured quad blitter. The source and destination may be the same resource,
but overlapping blits are not permitted.

``resource_resolve`` resolves a multisampled resource into a non-multisampled
one. Their formats must match. This function must be present if a driver
supports multisampling.
The region that is to be resolved is described by ``pipe_resolve_info``, which
provides a source and a destination rectangle.
The source rectangle may be vertically flipped, but otherwise the dimensions
of the rectangles must match, unless PIPE_CAP_SCALED_RESOLVE is supported,
in which case scaling and horizontal flipping are allowed as well.
The result of resolving depth/stencil values may be any function of the values at
the sample points, but returning the value of the centermost sample is preferred.

The interfaces to these calls are likely to change to make it easier
for a driver to batch multiple blits with the same source and
destination.


Stream Output
^^^^^^^^^^^^^

Stream output, also known as transform feedback, allows the results of the
vertex pipeline (after the geometry shader, or the vertex shader if no geometry
shader is present) to be written to a buffer created with the
``PIPE_BIND_STREAM_OUTPUT`` flag.

First a stream output state needs to be created with the
``create_stream_output_state`` call. It specifies the details of what's being
written, to which buffer and with what kind of a writemask.

Then the target buffers need to be set with a call to
``set_stream_output_buffers``, which sets the buffers and the offsets from the
start of those buffers to where the data will be written.


Transfers
^^^^^^^^^

These methods are used to get data to/from a resource.

``get_transfer`` creates a transfer object.

``transfer_destroy`` destroys the transfer object. May cause
data to be written to the resource at this point.

``transfer_map`` creates a memory mapping for the transfer object.
The returned map points to the start of the mapped range according to
the box region, not the beginning of the resource.

``transfer_unmap`` removes the memory mapping for the transfer object.
Any pointers into the map should be considered invalid and discarded.

``transfer_inline_write`` performs a simplified transfer for simple writes.
Basically get_transfer, transfer_map, data write, transfer_unmap, and
transfer_destroy all in one.


The box parameter to some of these functions defines a 1D, 2D or 3D
region of pixels. This is self-explanatory for 1D, 2D and 3D texture
targets.

For PIPE_TEXTURE_1D_ARRAY, the box::y and box::height fields refer to the
array dimension of the texture.

For PIPE_TEXTURE_2D_ARRAY, the box::z and box::depth fields refer to the
array dimension of the texture.

For PIPE_TEXTURE_CUBE, the box::z and box::depth fields refer to the
faces of the cube map (z + depth <= 6).
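
The cube map constraint can be expressed as a simple range check. This is a
minimal sketch: ``struct box3d`` and ``cube_box_valid`` are hypothetical names
for this example, and only the z range is examined.

.. code-block:: c

    #include <stdbool.h>

    /* Hypothetical box; only z and depth matter for this check. */
    struct box3d {
        int x, y, z;
        int width, height, depth;
    };

    /* For a cube map, z selects the first face and depth the number of
     * faces, so the range must stay within the 6 faces. */
    static bool cube_box_valid(const struct box3d *b)
    {
        return b->z >= 0 && b->depth >= 0 && b->z + b->depth <= 6;
    }

A box with z = 0 and depth = 6 covers all faces, while z = 4 with depth = 3
would reach past the last face and is rejected.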



.. _transfer_flush_region:

transfer_flush_region
%%%%%%%%%%%%%%%%%%%%%

If a transfer was created with ``FLUSH_EXPLICIT``, it will not automatically
be flushed on write or unmap. Flushes must be requested with
``transfer_flush_region``. Flush ranges are relative to the mapped range, not
the beginning of the resource.



.. _redefine_user_buffer:

redefine_user_buffer
%%%%%%%%%%%%%%%%%%%%

This function notifies a driver that the user buffer content has been changed.
The updated region starts at ``offset`` and is ``size`` bytes large.
The ``offset`` is relative to the pointer specified in ``user_buffer_create``.
While uploading the user buffer, the driver is allowed not to upload
the memory outside of this region.
The width0 is redefined to ``MAX2(width0, offset+size)``.
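
The width0 update amounts to growing the buffer size so that it covers the
updated region. A minimal sketch follows; ``redefined_width0`` is a
hypothetical helper, and ``MAX2`` is defined locally so the example is
self-contained.

.. code-block:: c

    /* Local definition for the example. */
    #define MAX2(a, b) ((a) > (b) ? (a) : (b))

    /* Returns the new width0 after `size` bytes changed at `offset`. */
    static unsigned redefined_width0(unsigned width0,
                                     unsigned offset, unsigned size)
    {
        return MAX2(width0, offset + size);
    }

An update that lies entirely inside the current size leaves width0 unchanged;
an update of 32 bytes at offset 48 in a 64-byte buffer grows it to 80.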



.. _texture_barrier:

texture_barrier
%%%%%%%%%%%%%%%

This function flushes all pending writes to the currently-set surfaces and
invalidates all read caches of the currently-set samplers.



.. _pipe_transfer:

PIPE_TRANSFER
^^^^^^^^^^^^^

These flags control the behavior of a transfer object.

``PIPE_TRANSFER_READ``
  Resource contents are read back (or accessed directly) at transfer create
  time.

``PIPE_TRANSFER_WRITE``
  Resource contents will be written back at transfer_destroy time (or modified
  as a result of being accessed directly).

``PIPE_TRANSFER_MAP_DIRECTLY``
  A transfer should directly map the resource. May return NULL if not supported.

``PIPE_TRANSFER_DISCARD_RANGE``
  The memory within the mapped region is discarded. Cannot be used with
  ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE``
  Discards all memory backing the resource. It should not be used with
  ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_DONTBLOCK``
  Fail if the resource cannot be mapped immediately.

``PIPE_TRANSFER_UNSYNCHRONIZED``
  Do not synchronize pending operations on the resource when mapping. The
  interaction of any writes to the map and any operations pending on the
  resource are undefined. Cannot be used with ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_FLUSH_EXPLICIT``
  Written ranges will be notified later with :ref:`transfer_flush_region`.
  Cannot be used with ``PIPE_TRANSFER_READ``.
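
The restrictions on combining flags with ``PIPE_TRANSFER_READ`` can be
collected into one validity check. This is only a sketch: the ``XFER_*``
constants use placeholder bit values invented for the example (the real
values come from the Gallium headers), ``transfer_usage_valid`` is a
hypothetical helper, and treating the "should not" case of
``PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE`` as invalid is an assumption.

.. code-block:: c

    #include <stdbool.h>

    /* Placeholder bit values for illustration only. */
    enum {
        XFER_READ           = 1 << 0,
        XFER_WRITE          = 1 << 1,
        XFER_DISCARD_RANGE  = 1 << 2,
        XFER_DISCARD_WHOLE  = 1 << 3,
        XFER_UNSYNCHRONIZED = 1 << 4,
        XFER_FLUSH_EXPLICIT = 1 << 5
    };

    /* Returns true if the combination obeys the READ restrictions above. */
    static bool transfer_usage_valid(unsigned usage)
    {
        const unsigned no_read = XFER_DISCARD_RANGE | XFER_DISCARD_WHOLE |
                                 XFER_UNSYNCHRONIZED | XFER_FLUSH_EXPLICIT;

        return !((usage & XFER_READ) && (usage & no_read));
    }

A write-only mapping with a discarded range passes the check, while adding the
read flag to the same combination does not.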