1 This document compares the D3D10/D3D11 device driver interface with Gallium.
2 It is written from the perspective of a developer implementing a D3D10/D3D11 driver as a Gallium state tracker.
3
4 Naming and other cosmetic differences are not listed, since they don't really matter and would severely clutter the document.
5 Gallium/OpenGL terminology is used in preference to D3D terminology.
6
7 NOTE: this document aims to be complete, but is most likely neither fully complete nor fully correct: please submit patches if you spot anything incorrect.
8
9 Also note that this is specifically for the DirectX 10/11 Windows Vista/7 DDI interfaces.
10 DirectX 9 has both user-mode (for Vista) and kernel mode (pre-Vista) interfaces, but they are significantly different from Gallium due to the presence of a lot of fixed function functionality.
11
12 The user-visible DirectX 10/11 interfaces are distinct from the kernel DDI, but they match very closely.
13
14 * Accessing Microsoft documentation
15
16 See http://msdn.microsoft.com/en-us/library/dd445501.aspx ("D3D11DDI_DEVICEFUNCS") for D3D documentation.
17
18 Also see http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/direct3d10_web.pdf ("The Direct3D 10 System" by David Blythe) for an introduction to Direct3D 10 and the rationale for its design.
19
20 The Windows Driver Kit contains the actual headers, as well as shader bytecode documentation.
21
22 To get the headers from Linux, run the following, in a dedicated directory:
23 wget http://download.microsoft.com/download/4/A/2/4A25C7D5-EFBE-4182-B6A9-AE6850409A78/GRMWDK_EN_7600_1.ISO
24 sudo mount -o loop GRMWDK_EN_7600_1.ISO /mnt/tmp
25 cabextract -x /mnt/tmp/wdk/headers_cab001.cab
26 rename 's/^_(.*)_[0-9]*$/$1/' *
27 sudo umount /mnt/tmp
28
29 d3d10umddi.h contains the DDI interface analyzed in this document: note that it is much easier to read this online on MSDN.
30 d3d{10,11}TokenizedProgramFormat.hpp contains the shader bytecode definitions: this is not available on MSDN.
31 d3d9types.h contains DX9 shader bytecode, and DX9 types
32 d3dumddi.h contains the DirectX 9 DDI interface
33
34 * Glossary
35
36 BC1: DXT1
37 BC2: DXT3
38 BC3: DXT5
39 BC5: RGTC
40 BC6H: BPTC float
41 BC7: BPTC
42 CS = compute shader: OpenCL-like shader
43 DS = domain shader: tessellation evaluation shader
44 HS = hull shader: tessellation control shader
45 IA = input assembler: primitive assembly
46 Input layout: vertex elements
47 OM = output merger: blender
48 PS = pixel shader: fragment shader
49 Primitive topology: primitive type
50 Resource: buffer or texture
51 Shader resource (view): sampler view
52 SO = stream out: transform feedback
53 Unordered access view: view supporting random read/write access (usually from compute shaders)
54
55 * Legend
56
57 -: features D3D11 has and Gallium lacks
58 +: features Gallium has and D3D11 lacks
59 !: differences between D3D11 and Gallium
60 *: possible improvements to Gallium
61 >: references to comparisons of special enumerations
62 #: comment
63
64 * Gallium functions with no direct D3D10/D3D11 equivalent
65
66 clear
67 + Gallium supports clearing both render targets and depth/stencil with a single call
68
69 draw_range_elements
70 + Gallium supports indexed draw with explicit range
71
72 fence_signalled
73 fence_finish
74 + D3D10/D3D11 don't appear to support explicit fencing; queries can often substitute though, and flushing is supported
75
76 set_clip_state
77 + Gallium supports fixed function user clip planes, D3D10/D3D11 only support using the vertex shader for them
78
79 set_polygon_stipple
80 + Gallium supports polygon stipple
81
82 surface_fill
83 + Gallium supports subrectangle fills of surfaces, D3D10 only supports full clears of views
84
85 * DirectX 10/11 DDI functions and Gallium equivalents
86
87 AbandonCommandList (D3D11 only)
88 - Gallium does not support deferred contexts
89
90 CalcPrivateBlendStateSize
91 CalcPrivateDepthStencilStateSize
92 CalcPrivateDepthStencilViewSize
93 CalcPrivateElementLayoutSize
94 CalcPrivateGeometryShaderWithStreamOutput
95 CalcPrivateOpenedResourceSize
96 CalcPrivateQuerySize
97 CalcPrivateRasterizerStateSize
98 CalcPrivateRenderTargetViewSize
99 CalcPrivateResourceSize
100 CalcPrivateSamplerSize
101 CalcPrivateShaderResourceViewSize
102 CalcPrivateShaderSize
103 CalcDeferredContextHandleSize (D3D11 only)
104 CalcPrivateCommandListSize (D3D11 only)
105 CalcPrivateDeferredContextSize (D3D11 only)
106 CalcPrivateTessellationShaderSize (D3D11 only)
107 CalcPrivateUnorderedAccessViewSize (D3D11 only)
108 ! D3D11 allocates private objects itself, using the size computed here
109 * Gallium could do something similar to be able to put the private data inline into state tracker objects: this would allow them to fit in the same cacheline and improve performance
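# A minimal sketch of the two-step pattern described above: the driver first reports a size, the caller allocates one block, and the driver then initializes its private data inside that block.
# All type and function names below (drv_blend_state, rt_blend_object, drv_calc_private_blend_state_size, ...) are hypothetical and are not part of the D3D DDI or Gallium.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical driver-private blend state; not a real Gallium or D3D type. */
struct drv_blend_state {
    unsigned blend_enable;
    unsigned rt_write_mask;
};

/* Hypothetical caller-side object: a header followed by driver-private bytes. */
struct rt_blend_object {
    unsigned refcount;
    size_t   private_size;
    /* driver-private data follows inline, right after this header */
};

/* Step 1: the driver reports how much storage it needs (the CalcPrivate*Size role). */
static size_t drv_calc_private_blend_state_size(void)
{
    return sizeof(struct drv_blend_state);
}

/* Step 2: the driver initializes storage the caller already owns (the Create* role). */
static void drv_create_blend_state(void *private_mem, unsigned enable, unsigned mask)
{
    struct drv_blend_state *bs = private_mem;
    assert(private_mem != NULL);
    bs->blend_enable = enable;
    bs->rt_write_mask = mask;
}

/* Return the inline driver-private area of a caller-side object. */
static void *rt_private(struct rt_blend_object *obj)
{
    return (void *)(obj + 1);
}

/* Caller side: a single allocation holds both the header and the private data. */
static struct rt_blend_object *rt_create_blend(unsigned enable, unsigned mask)
{
    size_t priv = drv_calc_private_blend_state_size();
    struct rt_blend_object *obj = malloc(sizeof(*obj) + priv);
    if (!obj)
        return NULL;
    obj->refcount = 1;
    obj->private_size = priv;
    drv_create_blend_state(rt_private(obj), enable, mask);
    return obj;
}
```

# With this layout the header and the private data sit next to each other in memory, which is the locality benefit suggested above.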
110
111 CheckDeferredContextHandleSizes (D3D11 only)
112 - Gallium does not support deferred contexts
113
114 CheckFormatSupport -> screen->is_format_supported
115 ! Gallium passes usages to this function, D3D11 returns them
116 - Gallium does not differentiate between blendable and non-blendable render targets
117 - Gallium lacks multisampled-texture and multisampled-render-target usages
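# A minimal sketch of one way a state tracker might bridge the "passes usages" versus "returns usages" difference: probe the yes/no query once per usage bit and OR the results together.
# The usage bits, formats and sketch_is_format_supported below are hypothetical stand-ins, not the real Gallium or D3D definitions.

```c
#include <assert.h>

/* Hypothetical usage bits standing in for real texture usage flags. */
enum {
    USAGE_SAMPLER       = 1 << 0,
    USAGE_RENDER_TARGET = 1 << 1,
    USAGE_DEPTH_STENCIL = 1 << 2
};

/* Hypothetical formats. */
enum sketch_format { FMT_RGBA8, FMT_Z24S8, FMT_DXT1 };

/* Gallium-style query: the caller passes a usage, the driver answers yes/no. */
static int sketch_is_format_supported(enum sketch_format fmt, unsigned usage)
{
    assert(usage != 0);
    switch (fmt) {
    case FMT_RGBA8: return (usage & ~(USAGE_SAMPLER | USAGE_RENDER_TARGET)) == 0;
    case FMT_Z24S8: return (usage & ~(USAGE_SAMPLER | USAGE_DEPTH_STENCIL)) == 0;
    case FMT_DXT1:  return (usage & ~USAGE_SAMPLER) == 0;
    }
    return 0;
}

/* D3D-style query: return the mask of supported usages by probing each bit. */
static unsigned sketch_check_format_support(enum sketch_format fmt)
{
    static const unsigned probes[] = {
        USAGE_SAMPLER, USAGE_RENDER_TARGET, USAGE_DEPTH_STENCIL
    };
    unsigned supported = 0;
    for (unsigned i = 0; i < sizeof(probes) / sizeof(probes[0]); ++i)
        if (sketch_is_format_supported(fmt, probes[i]))
            supported |= probes[i];
    return supported;
}
```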
118
119 CheckMultisampleQualityLevels
120 * could merge this with is_format_supported
121 - Gallium lacks multisampling support
122
123 CommandListExecute (D3D11 only)
124 - Gallium does not support command lists
125
126 CopyStructureCount (D3D11 only)
127 - Gallium does not support unordered access views (views that can be written to arbitrarily from compute shaders)
128
129 ClearDepthStencilView -> clear
130 ClearRenderTargetView -> clear
131 # D3D11 is not totally clear about whether this applies to any view or only a "currently-bound view"
132 + Gallium allows clearing both depth/stencil and render target(s) in a single operation
133 + Gallium supports double-precision depth values (but not rgba values!)
134 * May want to also support double-precision rgba or use "float" for "depth"
135
136 ClearUnorderedAccessViewFloat (D3D11 only)
137 ClearUnorderedAccessViewUint (D3D11 only)
138 - Gallium does not support unordered access views (views that can be written to arbitrarily from compute shaders)
139
140 CreateBlendState (extended in D3D10.1) -> create_blend_state
141 # D3D10 supports per-RT blend enables but not per-RT blend modes; only D3D10.1 supports the latter
142 - Gallium lacks alpha-to-coverage
143 + Gallium supports logic ops
144 + Gallium supports dithering
145 + Gallium supports using the broadcast alpha component of the blend constant color
146
147 CreateCommandList (D3D11 only)
148 - Gallium does not support command lists
149
150 CreateComputeShader (D3D11 only)
151 - Gallium does not support compute shaders
152
153 CreateDeferredContext (D3D11 only)
154 - Gallium does not support deferred contexts
155
156 CreateDomainShader (D3D11 only)
157 - Gallium does not support domain shaders
158
159 CreateHullShader (D3D11 only)
160 - Gallium does not support hull shaders
161
162 CreateUnorderedAccessView (D3D11 only)
163 - Gallium does not support unordered access views
164
165 CreateDepthStencilState -> create_depth_stencil_alpha_state
166 ! D3D11 has both a global stencil enable, and front/back enables; Gallium has only front/back enables
167 + Gallium has per-face writemask/valuemasks, D3D11 uses the same value for back and front
168 + Gallium supports the alpha test, which D3D11 lacks
169
170 CreateDepthStencilView -> get_tex_surface
171 CreateRenderTargetView -> get_tex_surface
172 ! Gallium merges depthstencil and rendertarget views into pipe_surface, which also doubles as a 2D surface abstraction
173 - lack of texture array support
174 - lack of render-to-buffer support
175 + Gallium supports using 3D texture zslices as a depth/stencil buffer (in theory)
176
177 CreateElementLayout -> create_vertex_elements_state
178 ! D3D11 allows sparse vertex elements (via InputRegister); in Gallium they must be specified sequentially
179 ! D3D11 has an extra flag (InputSlotClass) that is the same as instance_divisor == 0
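# A minimal sketch of one possible conversion from a sparse D3D-style element list to a dense, register-ordered array, mapping per-vertex data to instance_divisor == 0.
# The structs are simplified, hypothetical mirrors of the real descriptions, and filling gaps with unused entries is an assumption made for illustration only.
# A per-instance element with a step rate of 0 is not handled specially here.

```c
#include <assert.h>
#include <string.h>

enum sketch_slot_class { SLOT_PER_VERTEX_DATA, SLOT_PER_INSTANCE_DATA };

/* Hypothetical, simplified D3D-style element description. */
struct sketch_d3d_element {
    unsigned input_register;      /* shader input register; may be sparse */
    unsigned input_slot;          /* vertex buffer index */
    unsigned byte_offset;
    enum sketch_slot_class slot_class;
    unsigned instance_step_rate;
};

/* Hypothetical, simplified Gallium-style vertex element. */
struct sketch_vertex_element {
    unsigned src_offset;
    unsigned vertex_buffer_index;
    unsigned instance_divisor;    /* 0 means per-vertex data */
    int      used;                /* sketch-only: 0 marks a filler entry */
};

#define SKETCH_MAX_ELEMENTS 16

/* Fill a dense array indexed by input register. Returns the element count
 * (highest register + 1), or 0 if a register is out of range. */
static unsigned sketch_convert_layout(const struct sketch_d3d_element *in, unsigned n,
                                      struct sketch_vertex_element out[SKETCH_MAX_ELEMENTS])
{
    unsigned count = 0;
    assert(out != NULL);
    memset(out, 0, sizeof(out[0]) * SKETCH_MAX_ELEMENTS);
    for (unsigned i = 0; i < n; ++i) {
        unsigned reg = in[i].input_register;
        if (reg >= SKETCH_MAX_ELEMENTS)
            return 0;
        out[reg].src_offset = in[i].byte_offset;
        out[reg].vertex_buffer_index = in[i].input_slot;
        out[reg].instance_divisor =
            (in[i].slot_class == SLOT_PER_VERTEX_DATA) ? 0 : in[i].instance_step_rate;
        out[reg].used = 1;
        if (reg + 1 > count)
            count = reg + 1;
    }
    return count;
}
```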
180
181 CreateGeometryShader -> create_gs_state
182 CreateGeometryShaderWithStreamOutput -> create_gs_state
183 CreatePixelShader -> create_fs_state
184 CreateVertexShader -> create_vs_state
185 > bytecode is different (see d3d10TokenizedProgramFormat.hpp)
186 ! D3D11 describes input/outputs separately from bytecode; Gallium has the tgsi_scan.c module to extract it from TGSI
187 @ TODO: look into DirectX 10/11 semantics specification and bytecode
188
189 CheckCounter
190 CheckCounterInfo
191 CreateQuery -> create_query
192 - Gallium only supports occlusion, primitives generated and primitives emitted queries
193 ! D3D11 implements fences with "event" queries
194 * TIMESTAMP could be implemented as an additional field for other queries: some cards have hardware support for exactly this
195 * OCCLUSIONPREDICATE is required for the OpenGL v2 occlusion query functionality
196 * others are performance counters, we may want them but they are not critical
197
198 CreateRasterizerState
199 - Gallium lacks clamping of polygon offset depth biases
200 - Gallium lacks support to disable depth clipping
201 - Gallium lacks multisampling
202 + Gallium, like OpenGL, supports PIPE_POLYGON_MODE_POINT
203 + Gallium, like OpenGL, supports per-face polygon fill modes
204 + Gallium, like OpenGL, supports culling everything
205 + Gallium, like OpenGL, supports two-side lighting; D3D11 only has the facing attribute
206 + Gallium, like OpenGL, supports per-fill-mode polygon offset enables
207 + Gallium, like OpenGL, supports polygon smoothing
208 + Gallium, like OpenGL, supports polygon stipple
209 + Gallium, like OpenGL, supports point smoothing
210 + Gallium, like OpenGL, supports point sprites
211 + Gallium supports specifying point quad rasterization
212 + Gallium, like OpenGL, supports per-point point size
213 + Gallium, like OpenGL, supports line smoothing
214 + Gallium, like OpenGL, supports line stipple
215 + Gallium supports line last pixel rule specification
216 + Gallium, like OpenGL, supports provoking vertex convention
217 + Gallium supports D3D9 rasterization rules
218 + Gallium supports fixed line width
219 + Gallium supports fixed point size
220
221 CreateResource -> texture_create or buffer_create
222 ! D3D11 passes the dimensions of all mipmap levels to the create call, while Gallium has an implicit floor(x/2) rule
223 # Note that hardware often has the implicit rule, so the D3D11 interface seems to make little sense
224 # Also, the D3D11 API does not allow the user to specify mipmap sizes, so this really seems a dubious decision on Microsoft's part
225 - D3D11 supports specifying initial data to write in the resource
226 - Gallium lacks support for stream output buffer usage
227 - Gallium does not support unordered access buffers
228 ! D3D11 specifies mapping flags (i.e. read/write/discard): it's unclear what they are used for here
229 - D3D11 supports odd things in the D3D10_DDI_RESOURCE_MISC_FLAG enum (D3D10_DDI_RESOURCE_MISC_DISCARD_ON_PRESENT, D3D11_DDI_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS, D3D11_DDI_RESOURCE_MISC_BUFFER_STRUCTURED)
230 - Gallium does not support indirect draw call parameter buffers
231 - Gallium lacks multisampling
232 - Gallium lacks array textures
233 ! D3D11 supports specifying hardware modes and other stuff here for scanout resources
234 + Gallium allows specifying minimum buffer alignment
235 ! D3D11 implements cube maps as 2D array textures
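# A minimal sketch of the implicit mipmap size rule mentioned above: each level halves the previous one, rounding down, with a lower bound of 1.
# The clamp to 1 and the helper names are additions for illustration; a state tracker could use a check like the second function to confirm that the explicit per-level sizes it receives agree with the implicit rule.

```c
#include <assert.h>
#include <stddef.h>

/* Size of mip level `level` under the implicit rule: halve and round down
 * once per level, but never go below 1. */
static unsigned sketch_minify(unsigned base, unsigned level)
{
    unsigned v;
    assert(level < 32);   /* shifting by 32 or more would be undefined */
    v = base >> level;
    return v ? v : 1;
}

/* Return 1 if explicit per-level widths follow the implicit rule, else 0. */
static int sketch_levels_match_implicit_rule(const unsigned *widths, unsigned num_levels)
{
    assert(widths != NULL && num_levels > 0);
    for (unsigned l = 1; l < num_levels; ++l)
        if (widths[l] != sketch_minify(widths[0], l))
            return 0;
    return 1;
}
```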
236
237 CreateSampler
238 - D3D11 supports a monochrome convolution filter for "text filtering"
239 + Gallium supports non-normalized coordinates
240 + Gallium supports CLAMP, MIRROR_CLAMP and MIRROR_CLAMP_TO_BORDER
241 + Gallium supports setting min/max/mip filters and anisotropy independently
242
243 CreateShaderResourceView (extended in D3D10.1) -> create_sampler_view
244 - Gallium lacks sampler views over buffers
245 - Gallium lacks texture arrays, and cube map views over texture arrays
246 + Gallium supports specifying a swizzle
247 ! D3D11 implements "cube views" as views into a 2D array texture
248
249 CsSetConstantBuffers (D3D11 only)
250 CsSetSamplers (D3D11 only)
251 CsSetShader (D3D11 only)
252 CsSetShaderResources (D3D11 only)
253 CsSetShaderWithIfaces (D3D11 only)
254 CsSetUnorderedAccessViews (D3D11 only)
255 - Gallium does not support compute shaders
256
257 DestroyBlendState
258 DestroyCommandList (D3D11 only)
259 DestroyDepthStencilState
260 DestroyDepthStencilView
261 DestroyDevice
262 DestroyElementLayout
263 DestroyQuery
264 DestroyRasterizerState
265 DestroyRenderTargetView
266 DestroyResource
267 DestroySampler
268 DestroyShader
269 DestroyShaderResourceView
270 DestroyUnorderedAccessView (D3D11 only)
271 # these are trivial
272
273 Dispatch (D3D11 only)
274 - Gallium does not support compute shaders
275
276 DispatchIndirect (D3D11 only)
277 - Gallium does not support compute shaders
278
279 Draw -> draw_arrays
280 ! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better
281
282 DrawAuto
283 - Gallium lacks stream out and DrawAuto
284
285 DrawIndexed -> draw_elements
286 ! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better
287 * may want to add a separate set_index_buffer
288 - Gallium lacks base vertex for indexed draw calls
289 + D3D11 lacks draw_range_elements functionality, which is required for OpenGL
290
291 DrawIndexedInstanced -> draw_elements_instanced
292 ! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better
293 * may want to add a separate set_index_buffer
294 - Gallium lacks base vertex for indexed draw calls
295
296 DrawIndexedInstancedIndirect (D3D11 only) -> call draw_elements_instanced multiple times in software
297 # this allows using a hardware buffer to specify the parameters for multiple draw_elements_instanced calls
298 - Gallium does not support draw call parameter buffers and indirect draw
299
300 DrawInstanced -> draw_arrays_instanced
301 ! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better
302
303 DrawInstancedIndirect (D3D11 only) -> call draw_arrays_instanced multiple times in software
304 # this allows using a hardware buffer to specify the parameters for multiple draw_arrays_instanced calls
305 - Gallium does not support draw call parameter buffers and indirect draws
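# A minimal sketch of the software fallback named above: read the draw arguments from a CPU mapping of the parameter buffer, then issue an ordinary instanced draw.
# The four-field argument layout follows what D3D11 documents for DrawInstancedIndirect; sketch_draw_arrays_instanced and the recording struct are hypothetical stand-ins for a real context call.
# Mapping the buffer on the CPU typically forces a wait for the GPU, so this is a correctness fallback rather than a fast path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Argument layout for an instanced, non-indexed indirect draw: four 32-bit values. */
struct sketch_draw_instanced_args {
    uint32_t vertex_count_per_instance;
    uint32_t instance_count;
    uint32_t start_vertex_location;
    uint32_t start_instance_location;
};

/* Sketch-only record of the last draw, standing in for a real driver call. */
static struct {
    unsigned mode, start, count, start_instance, instance_count, calls;
} sketch_last_draw;

static void sketch_draw_arrays_instanced(unsigned mode, unsigned start, unsigned count,
                                         unsigned start_instance, unsigned instance_count)
{
    sketch_last_draw.mode = mode;
    sketch_last_draw.start = start;
    sketch_last_draw.count = count;
    sketch_last_draw.start_instance = start_instance;
    sketch_last_draw.instance_count = instance_count;
    sketch_last_draw.calls++;
}

/* Read the arguments at byte_offset inside the mapping and issue a direct draw.
 * Returns 0 without drawing if the arguments would fall outside the mapping. */
static int sketch_draw_instanced_indirect(unsigned mode, const void *mapped,
                                          size_t size, size_t byte_offset)
{
    struct sketch_draw_instanced_args args;
    assert(mapped != NULL);
    if (byte_offset > size || size - byte_offset < sizeof(args))
        return 0;
    memcpy(&args, (const char *)mapped + byte_offset, sizeof(args));
    sketch_draw_arrays_instanced(mode,
                                 args.start_vertex_location,
                                 args.vertex_count_per_instance,
                                 args.start_instance_location,
                                 args.instance_count);
    return 1;
}
```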
306
307 DsSetConstantBuffers (D3D11 only)
308 DsSetSamplers (D3D11 only)
309 DsSetShader (D3D11 only)
310 DsSetShaderResources (D3D11 only)
311 DsSetShaderWithIfaces (D3D11 only)
312 - Gallium does not support domain shaders
313
314 Flush -> flush
315 ! Gallium supports fencing and several kinds of flushing here, D3D11 just has a dumb glFlush-like function
316
317 GenMips
318 - Gallium lacks a mipmap generation interface, and does this manually with the 3D engine
319 * it may be useful to add a mipmap generation interface, since the hardware (especially older cards) may have a better way than using the 3D engine
320
321 GsSetConstantBuffers -> for(i = 0; i < NumBuffers; ++i) set_constant_buffer(PIPE_SHADER_GEOMETRY, StartBuffer + i, phBuffers[i])
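# A minimal sketch of the range-to-per-slot mapping, keeping the destination slot (StartBuffer + i) separate from the index into the passed array (i).
# sketch_set_constant_buffer and the sketch_bound table are hypothetical stand-ins for the real per-slot call and driver state; the shader-stage argument is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CONSTANT_BUFFERS 16

/* Sketch-only table of bound buffers, standing in for driver state. */
static const void *sketch_bound[SKETCH_MAX_CONSTANT_BUFFERS];

/* Stand-in for a per-slot set_constant_buffer(shader, index, buffer) call. */
static void sketch_set_constant_buffer(unsigned index, const void *buffer)
{
    assert(index < SKETCH_MAX_CONSTANT_BUFFERS);
    sketch_bound[index] = buffer;
}

/* D3D-style range bind: buffers[0 .. num_buffers-1] go to slots
 * start_buffer .. start_buffer + num_buffers - 1. */
static void sketch_set_constant_buffers(unsigned start_buffer, unsigned num_buffers,
                                        const void *const *buffers)
{
    for (unsigned i = 0; i < num_buffers; ++i)
        sketch_set_constant_buffer(start_buffer + i, buffers[i]);
}
```

# The same loop shape applies to the pixel and vertex shader variants, with only the shader stage changing.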
322
323 GsSetSamplers
324 - Gallium does not support sampling in geometry shaders
325
326 GsSetShader -> bind_gs_state
327
328 GsSetShaderWithIfaces (D3D11 only)
329 - Gallium does not support shader interfaces
330
331 GsSetShaderResources
332 - Gallium does not support sampling in geometry shaders
333
334 HsSetConstantBuffers (D3D11 only)
335 HsSetSamplers (D3D11 only)
336 HsSetShader (D3D11 only)
337 HsSetShaderResources (D3D11 only)
338 HsSetShaderWithIfaces (D3D11 only)
339 - Gallium does not support hull shaders
340
341 IaSetIndexBuffer
342 ! Gallium passes this to the draw_elements or draw_elements_instanced calls
343 + Gallium supports 8-bit indices
344 ! the D3D11 interface allows index-size-unaligned byte offsets into index buffers; it's not clear whether they actually work
345
346 IaSetInputLayout -> bind_vertex_elements_state
347
348 IaSetTopology
349 ! Gallium passes the topology = primitive type to the draw calls
350 * may want to add an interface for this
351 - Gallium lacks support for DirectX 11 tessellated primitives
352 + Gallium supports line loops, triangle fans, quads, quad strips and polygons
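# A minimal sketch of how a state tracker could bridge the difference: latch the topology when it is set, then pass it as the primitive type on every draw.
# sketch_context, sketch_draw_arrays and the enum values are hypothetical; a real state tracker would call the pipe_context draw functions instead of recording the arguments.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_prim { SKETCH_PRIM_POINTS, SKETCH_PRIM_LINES, SKETCH_PRIM_TRIANGLES };

struct sketch_context {
    enum sketch_prim topology;    /* latched by the IaSetTopology-style call */
    enum sketch_prim last_mode;   /* sketch-only: what the last draw received */
    unsigned last_start, last_count;
};

/* Stand-in for a Gallium-style draw that takes the primitive type per call. */
static void sketch_draw_arrays(struct sketch_context *ctx, enum sketch_prim mode,
                               unsigned start, unsigned count)
{
    ctx->last_mode = mode;
    ctx->last_start = start;
    ctx->last_count = count;
}

/* D3D-style: the topology is separate state, set ahead of the draw. */
static void sketch_ia_set_topology(struct sketch_context *ctx, enum sketch_prim topology)
{
    assert(ctx != NULL);
    ctx->topology = topology;
}

/* D3D-style draw: no primitive type argument, so use the latched one. */
static void sketch_draw(struct sketch_context *ctx, unsigned vertex_count,
                        unsigned start_vertex)
{
    assert(ctx != NULL);
    sketch_draw_arrays(ctx, ctx->topology, start_vertex, vertex_count);
}
```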
353
354 IaSetVertexBuffers -> set_vertex_buffers
355 + Gallium allows specifying a max_index here
356 - Gallium only allows setting all vertex buffers at once, while D3D11 supports setting a subset
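# A minimal sketch of one way to emulate subset binding on top of an all-at-once interface: keep a shadow copy of every slot, patch the requested range, and resubmit the whole array.
# All names are hypothetical; sketch_set_vertex_buffers only records the count it was given, where a real state tracker would call set_vertex_buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_VERTEX_BUFFERS 16

struct sketch_vertex_buffer {
    const void *buffer;   /* NULL means the slot is unbound */
    unsigned stride;
    unsigned offset;
};

struct sketch_vb_state {
    struct sketch_vertex_buffer shadow[SKETCH_MAX_VERTEX_BUFFERS];
    unsigned count;               /* highest bound slot + 1 */
    unsigned last_count_passed;   /* sketch-only: what the last submit received */
};

/* Stand-in for an all-at-once bind of `count` vertex buffers. */
static void sketch_set_vertex_buffers(struct sketch_vb_state *st, unsigned count,
                                      const struct sketch_vertex_buffer *all)
{
    (void)all;
    st->last_count_passed = count;
}

/* D3D-style subset bind: update slots [start, start + num) and resubmit all. */
static void sketch_ia_set_vertex_buffers(struct sketch_vb_state *st, unsigned start,
                                         unsigned num,
                                         const struct sketch_vertex_buffer *bufs)
{
    unsigned count = 0;
    assert(start <= SKETCH_MAX_VERTEX_BUFFERS);
    assert(num <= SKETCH_MAX_VERTEX_BUFFERS - start);
    if (num)
        memcpy(&st->shadow[start], bufs, num * sizeof(bufs[0]));
    for (unsigned i = 0; i < SKETCH_MAX_VERTEX_BUFFERS; ++i)
        if (st->shadow[i].buffer)
            count = i + 1;
    st->count = count;
    sketch_set_vertex_buffers(st, count, st->shadow);
}
```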
357
358 OpenResource -> texture_from_handle
359
360 PsSetConstantBuffers -> for(i = 0; i < NumBuffers; ++i) set_constant_buffer(PIPE_SHADER_FRAGMENT, StartBuffer + i, phBuffers[i])
361 * may want to split into fragment/vertex-specific versions
362
363 PsSetSamplers -> bind_fragment_sampler_states
364 * may want to allow binding subsets instead of all at once
365
366 PsSetShader -> bind_fs_state
367
368 PsSetShaderWithIfaces (D3D11 only)
369 - Gallium does not support shader interfaces
370
371 PsSetShaderResources -> set_fragment_sampler_views
372 * may want to allow binding subsets instead of all at once
373
374 QueryBegin -> begin_query
375
376 QueryEnd -> end_query
377
378 QueryGetData -> get_query_result
379 - D3D11 supports reading an arbitrary data chunk for query results, Gallium only supports reading a 64-bit integer
380 + D3D11 doesn't seem to support actually waiting for the query result (?!)
381 - D3D11 supports optionally not flushing command buffers here and instead returning DXGI_DDI_ERR_WASSTILLDRAWING
382
383 RecycleCommandList (D3D11 only)
384 RecycleCreateCommandList (D3D11 only)
385 RecycleDestroyCommandList (D3D11 only)
386 - Gallium does not support command lists
387
388 RecycleCreateDeferredContext (D3D11 only)
389 - Gallium does not support deferred contexts
390
391 RelocateDeviceFuncs
392 - Gallium does not support moving pipe_context, while D3D11 seems to, using this
393
394 ResetPrimitiveID (D3D10.1+ only, #ifdef D3D10PSGP)
395 # used to do vertex processing on the GPU on Intel G45 chipsets when it is faster this way (see www.intel.com/Assets/PDF/whitepaper/322931.pdf)
396 # presumably this resets the primitive id system value
397 - Gallium does not support vertex pipeline bypass anymore
398
399 ResourceCopy
400 ResourceCopyRegion
401 ResourceConvert (D3D10.1+ only)
402 ResourceConvertRegion (D3D10.1+ only)
403 -> surface_copy
404 - Gallium does not support hardware buffer copies
405 - Gallium does not support copying 3D texture subregions in a single call
406
407 ResourceIsStagingBusy -> is_texture_referenced, is_buffer_referenced
408 - Gallium does not support checking reference for a whole texture, but only a specific surface
409
410 ResourceReadAfterWriteHazard
411 ! Gallium hides this, except for the render and texture caches
412
413 ResourceResolveSubresource
414 - Gallium does not support multisample sample resolution
415
416 ResourceMap
417 ResourceUnmap
418 DynamicConstantBufferMapDiscard
419 DynamicConstantBufferUnmap
420 DynamicIABufferMapDiscard
421 DynamicIABufferMapNoOverwrite
422 DynamicIABufferUnmap
423 DynamicResourceMapDiscard
424 DynamicResourceUnmap
425 StagingResourceMap
426 StagingResourceUnmap
427 -> buffer_map / buffer_unmap
428 -> transfer functions
429 ! Gallium and D3D have different semantics for transfers
430 * D3D separates vertex/index buffers from constant buffers
431 ! D3D separates some buffer flags into specialized calls
432
433 ResourceUpdateSubresourceUP -> transfer functionality, transfer_inline_write in gallium-resources
434 DefaultConstantBufferUpdateSubresourceUP -> transfer functionality, transfer_inline_write in gallium-resources
435
436 SetBlendState -> bind_blend_state and set_blend_color
437 ! D3D11 fuses bind_blend_state and set_blend_color in a single function
438 - Gallium lacks the sample mask
439
440 SetDepthStencilState -> bind_depth_stencil_alpha_state and set_stencil_ref
441 ! D3D11 fuses bind_depth_stencil_alpha_state and set_stencil_ref in a single function
442
443 SetPredication -> render_condition
444 # here both D3D11 and Gallium seem very limited (though hardware probably is too)
445 # ideally, we should support nested conditional rendering, as well as more complex tests (checking for an arbitrary range, after an AND with an arbitrary mask)
446 # of course, hardware support is probably as limited as OpenGL/D3D11
447 + Gallium, like NV_conditional_render, supports by-region and wait flags
448 - D3D11 supports predication conditional on being equal to any value (along with occlusion predicates); Gallium only supports predication on non-zero
449
450 SetRasterizerState -> bind_rasterizer_state
451
452 SetRenderTargets (extended in D3D11) -> set_framebuffer_state
453 ! Gallium passes a width/height here, D3D11 does not
454 ! Gallium lacks ClearTargets (but this is redundant and the driver can trivially compute this if desired)
455 - Gallium does not support unordered access views
456 - Gallium does not support geometry shader selection of texture array image / 3D texture zslice
457
458 SetResourceMinLOD (D3D11 only)
459 - Gallium does not support min lod directly on textures
460
461 SetScissorRects
462 - Gallium lacks support for the multiple geometry-shader-selectable scissor rectangles that D3D11 has
463
464 SetTextFilterSize
465 - Gallium lacks support for text filters
466
467 SetVertexPipelineOutput (D3D10.1+ only)
468 # used to do vertex processing on the GPU on Intel G45 chipsets when it is faster this way (see www.intel.com/Assets/PDF/whitepaper/322931.pdf)
469 - Gallium does not support vertex pipeline bypass anymore
470
471 SetViewports
472 - Gallium lacks support for the multiple geometry-shader-selectable viewports that D3D11 has
473
474 ShaderResourceViewReadAfterWriteHazard -> flush(PIPE_FLUSH_RENDER_CACHE)
475 - Gallium does not support specifying this per-render-target/view
476
477 SoSetTargets
478 - Gallium does not support stream out
479
480 VsSetConstantBuffers -> for(i = 0; i < NumBuffers; ++i) set_constant_buffer(PIPE_SHADER_VERTEX, StartBuffer + i, phBuffers[i])
481 * may want to split into fragment/vertex-specific versions
482
483 VsSetSamplers -> bind_vertex_sampler_states
484 * may want to allow binding subsets instead of all at once
485
486 VsSetShader -> bind_vs_state
487
488 VsSetShaderWithIfaces (D3D11 only)
489 - Gallium does not support shader interfaces
490
491 VsSetShaderResources -> set_vertex_sampler_views
492 * may want to allow binding subsets instead of all at once