Mesa 20.1.0 Release Notes / 2020-05-27
======================================

Mesa 20.1.0 is a new development release. People who are concerned with
stability and reliability should stick with a previous release or wait
for Mesa 20.1.1.

Mesa 20.1.0 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being
used. Some drivers don't support all the features required in OpenGL
4.6. OpenGL 4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each
driver.

Mesa 20.1.0 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct depends
on the particular driver being used.
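
As a rough illustration of the last point, the sketch below unpacks a value read from ``apiVersion`` into its major, minor and patch fields. It follows the bit layout the Vulkan specification defines for packed version numbers (patch in the low 12 bits, minor in the next 10 bits, major in the 7 bits above that). The function name ``decode_vk_api_version`` and the sample value are assumptions made for this example; they are not taken from Mesa or from any particular driver.

```python
def decode_vk_api_version(api_version):
    """Split a packed Vulkan version number into (major, minor, patch)."""
    major = (api_version >> 22) & 0x7F   # 7 bits above the minor field
    minor = (api_version >> 12) & 0x3FF  # 10 bits
    patch = api_version & 0xFFF          # low 12 bits
    return major, minor, patch


# Hypothetical sample value, packed the same way a driver would pack 1.2.131.
sample = (1 << 22) | (2 << 12) | 131
print(decode_vk_api_version(sample))  # prints (1, 2, 131)
```

A driver that reports 1.1 rather than 1.2 decodes to a smaller minor field, so an application can compare the decoded tuple against the version it needs.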

SHA256 checksum
---------------

::

   2109055d7660514fc4c1bcd861bcba9db00c026119ae222720111732dba27c83 mesa-20.1.0.tar.xz
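
One way to check a downloaded tarball against the checksum above is sketched below. It hashes the file in chunks with Python's standard ``hashlib`` module. The helper names ``sha256_of_file`` and ``matches_release_checksum`` and the 1 MiB chunk size are choices made for this example, not part of any Mesa tooling.

```python
import hashlib

# Checksum published in these release notes for mesa-20.1.0.tar.xz.
EXPECTED_SHA256 = "2109055d7660514fc4c1bcd861bcba9db00c026119ae222720111732dba27c83"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release_checksum(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected value, ignoring hex case."""
    return sha256_of_file(path) == expected.lower()
```

Assuming the tarball sits in the current directory under its published name, ``matches_release_checksum("mesa-20.1.0.tar.xz")`` should return ``True`` for an intact download.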

New features
------------

- GL_ARB_compute_variable_group_size on i965.
- GL_EXT_depth_bounds_test on Iris.
- GL_EXT_texture_shadow_lod on radeonsi, nvc0.
- GL_NV_alpha_to_coverage_dither_control on radeonsi
- GL_NV_copy_image on all gallium drivers.
- GL_NV_pixel_buffer_object on all gallium drivers, i915, i965, swrast.
- GL_NV_viewport_array2 on nvc0 (GM200+).
- GL_NV_viewport_swizzle on nvc0 (GM200+).
- VK_AMD_memory_overallocation_behavior on RADV.
- VK_KHR_shader_non_semantic_info on Intel, RADV.
- GL_EXT_draw_instanced on gles2
- VK_KHR_8bit_storage for ACO on GFX8+
- VK_KHR_16bit_storage for ACO on GFX8+ (storageInputOutput16 is still
  unsupported)
- shaderInt16 for ACO on GFX9+
- VK_KHR_shader_float16_int8 for ACO on GFX8+ (shaderFloat16 is still
  unsupported)
- VK_EXT_robustness2 on Intel, RADV.
- Add Rocket Lake (RKL) support on anvil and iris.

Bug fixes
---------

- Reproduceable i915 gpu hang Intel Iris Plus Graphics (Ice Lake 8x8
  GT2)
- glsl: regression affecting shader compilation time
- freedreno: glamor issue with x11 desktops
- [gles3] supertuxkart: some textures are incorrect
- Double lock in fbobject.c
- [bisected] Steam crashes when newest Iris built with LTO
- i965/vec4: opt_cse_local cause the out of bound array access
- NIR: Regression on shader using 8/16-bit integers
- lp_bld_intr.c:70:16: error: use of undeclared identifier
  'LLVMFixedVectorTypeKind'; did you mean 'LLVMVectorTypeKind'?
- Deadlock in anv_timelines_wait()
- post_version.py does not work with release candidates
- radv regression on android
- src\util\meson.build:294:4: ERROR: Program or command 'winepath' not
  found or not executable
- debug builds are massively broken on Windows
- heavy glitches on amd ryzen 5 since version 20.x
- zink asserts with 32-bit boolean
- Dirt: Showdown bad performance and broken rendering with enabled
  advanced lightning
- gravit & Firefox WebGL broken since
  3dc2ccc14c0e035368fea6ae3cce8c481f3c4ad2 "ac/surface: replace
  RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE"
- mesa 20.0.5 causing kitty to crash
- radeonsi: "Torchlight II" trace showing regression on mesa-20.0.6
  [bisected]
- [RADV/LLVM/ACO/Regression] After mesa commit
  a3dc7fffbb7be0f1b2ac478b16d3acc5662dff66 all games stucks at start
- Android building error after commit 2ab45f41
- iris: Crash when trying to capture window in OBS Studio
- Properly annotate control flow convergence points
- intel/compiler: Register coalesce doesn't move conditional modifiers
- [bisected] [iris] mpv under wayland: failed to import supplied
  dmabufs: Unsupported buffer format 808669784
- [Bisected][Iris] piglit.spec.!opengl 1_1.max-texture-size crashes on
  x32 platform
- anv : android deqp assert
  dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.image#export_import_bind_bind
- GL cts gtf30.GL3Tests.sgis_texture_lod.sgis_texture_lod_basic_getter
  failure
- freedreno/a6xx: texture cache vs realloc_bo()
- [Bisected]
  dEQP-VK.subgroups.ballot_mask.ext_shader_subgroup_ballot.\* failures
- dEQP-VK.subgroups.size_control.compute.\* crashes on HSW and TGL
- zink: framebuffer and pipeline caches accumulate due to
  zink_create_surface()
- FTBFS due to LLVM commit 2dea3f129878 (LLVMVectorTypeKind is gone)
- [r600/Turks] 20.0.2: modesetting/radeon driver SIGABRT at loading X
  (kernel 5.5.10, ppc64)
- piglit spec.!opengl 1.0.gl-1.0-fpexceptions crash on Iris
- ci: Update the Wine version
- SPIR-V: Failure in dEQP-VK.graphicsfuzz.control-flow-switch
- SPIR-V: OpConvertUToPtr from spec constant fails to compile
- ACO: Regression: Texture corruption
- radv: Reading ViewportIndex in fragment shader returns garbage
- piglit
  spec.arb_gpu_shader_fp64.execution.arb_gpu_shader_fp64-vs-non-uniform-control-flow-ssbo
  crash on Iris
- piglit
  spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-neg-abs.shader_test
  failure on IVB
- [ANV] gfxbench Aztec Ruins misrenders on gen11+
- glxinfo cmd crashed
- radeonsi: GL_LINES rendering is affected by GL_POINT_SPRITE
- nir: nir_lower_returns can't handle nested loops
- Graphic artifacts with Mesa 20.0.4 on intel HD 510 GPU
- [Iris] [Bisected] Some KHR-GL46.arrays_of_arrays_gl. tests are
  failing
- Mesa 20 regression makes Lightsprint demos crash
- metro redux games crash upon loading certain levels on amdgpu
- dri_common.h:58:8: error: unknown type name '__GLXDRIdrawable'
- Graphical glitches on Intel Graphics when Xorg started on Iris driver
- GL/GLES test crashes on G33/i915 platforms
- SIGSEGV src/compiler/glsl/ast_function.cpp:53
- manywin aborts with "i965: Failed to submit batchbuffer: Invalid
  argument"
- v3d: transform feedback issue
- radv: Enable TC-compat HTILE in VK_IMAGE_LAYOUT_GENERAL.
- radv:
  dEQP-VK.binding_model.descriptorset_random.sets4.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.noia.0
  segfault
- radv: RAVEN fails
  dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy
- buffer overflow in nouveau driver on mesa 20.0.2
- xmlconfig sha1 code has overflow and possible bug
- enable storageBuffer16BitAccess feature in radv for SI and CIK
- Build Fails with Clang Shared Library
- Thousands of 32 bit regressions in VulkanCTS and GL test suites due
  to handling of cross-invocation
- anv: isl assert when running dEQP-VK.geometry.layered.3d.*.readback
- Weston drm-backend.so seems to fail with Mesa master and
  LIBGL_ALWAYS_SOFTWARE=1
- freedreno/turnip: Don't request pixlodenable when we don't use it
- VulkanCTS uniform_buffer_block_geom spins forever
- freedreno: dEQP-GLES3.functional.fbo.msaa.4_samples.r16f flakiness in
  CI
- src\util\meson.build:291:4: ERROR: Program or command 'winepath' not
  found or not executable
- RADV: flickering textures in Q.U.B.E. 2 through Proton
- Missing ENDBR in entry_x86-64_tls.h, entry_x86_tls.h and
  entry_x86_tsd.h
- [regression][bisected] Android build test fails:
  marshal_generated.c', missing and no known rule to make it
- Missing ENDBR in rtasm_x86sse.c
- src/intel/tools/aubinator_viewer.cpp:383:52: error: format ‘%lx’
  expects argument of type ‘long unsigned int’, but argument 5 has type
  ‘uint64_t {aka long long unsigned int}’ [-Werror=format=]
- src/compiler/glsl/ast_to_hir.cpp:2134: ir_rvalue\*
  ast_expression::do_hir(exec_list*, \_mesa_glsl_parse_state*, bool):
  Assertion \`result != NULL \|\| !needs_rvalue' failed.
- process_test fails on macOS
- Vulkan Overlay is blinking
- Regression: 9d64ad2fe79 broke Rocket League
- GameMaker games (Memoranda and Undertale) + amdgpu — Segmentation
  fault on launch
- Civilization VI - Animated leader characters small black squares
  artifacts
- [ACO] Reliable crash with RPCS3 that is not present with LLVM
- [RADV] vkCmdBindTransformFeedbackBuffersEXT pSizes optional parameter
  not handled
- [RadeonSI] - Curse of the Dead Gods (1123770) - Lighting is not
  rendering correctly.
- soft-fp64: \__fsat64 incorrectly returns NaN for a NaN input. It
  should return zero.
- Hang when using glWaitSync with multithreaded shared GL contexts
- RPCS3 / Persona 5 - Performance regression [RADV / Navi]
- [ANV] Rendering corruption in Shadow of the Tomb Raider
- src/compiler/glsl/glcpp/glcpp-parse.y:1297: \_token_print: Assertion
  \`!"Error: Don't know how to print token."' failed.
- [CTS] dEQP-VK.descriptor_indexing.\* fails on RADV/LLVM
- Unigine Valley failure / assert
- [Gen9/icl] [Bisected] [Regression]
  dEQP-GLES3.functional.shaders.loops.short_circuit.do_while_fragment
  fail
- [RadeonSI][gfx10/navi] Kerbal Space Program crash: si_draw_vbo:
  Assertion \`0' failed
- Budget Cuts hits VK_AMD_shader_fragment_mask assert
- Follow-up from "i965/blorp: Don't resolve HiZ unless we're
  reinterpreting"
- crash in vc4_write_uniforms with shaders involving YUV textures
- Corrupted output with vaapi 10 bit -> 8 bit transcoding on AMD RAVEN
- tessellator.cpp:78:7: error: 'fmin' is missing exception
  specification 'noexcept'
- Please add Raspberry Pi 4 to features.txt
- Build failure with bison 2.3.
- Mesa build fails on 32 bit architecture
- Incorrect rendering with vaapi + uyvy422
- V3D/Broadcom (Raspberry Pi 4) - GLES 3.1 - GL_EXT_texture_norm16
  advertised, but not usable
- mesa-20.0.0/src/amd/compiler/aco_instruction_selection.cpp:7221:55:
  style: Same expression on both sides of '&&
- i965 assertion failure in fallback_rgbx_to_rgba
- vaapi bob deinterlacer produces wrong output height on AMD
- Compute copies do not handle SUBSAMPLED formats
- Please document RADV_TEX_ANISO variable in envvars.html
- unexpected CI failure
- Multiple glapi_mapi_tmp.h
- drisw crashes on calling NULL putImage on EGL surfaceless platform
  (pbuffer EGLSurface)
- VRAM leak with vuilkan external memory + opengl memory objects
- [radeonsi][vaapi][bisected] invalid VASurfaceID when playing
  interlaced DVB stream in Kodi
- [RADV] GPU hangs while the cutscene plays in the game Assassin's
  Creed Origins
- ACO: The Elder Scrolls Online crashes on startup (Navi)
- Broken rendering of glxgears on S/390 architecture (64bit, BigEndian)
- aco: sun flickering with Assassins Creeds Origins
- !1896 broke ext_image_dma_buf_import piglit tests with radeonsi
- aco: wrong geometry with Assassins Creed Origins on GFX6
- valgrind errors since commit a8ec4082a41
- src/broadcom/qpu/qpu_pack.c:962:25: error: implicit declaration of
  function 'ffs' is invalid in C99
  [-Werror,-Wimplicit-function-declaration] mux_b =
  ffs(desc->mux_b_mask) - 1;
- X fails to start with amdgpu and Mesa 20.1 on Fedora
- GPU hangs in Factorio on Radeon RX 5700 XT (MSI GAMING X)
- OSMesa osmesa_choose_format returns a format not supported by
  st_new_renderbuffer_fb
- Build error with VS on WIN
- Using EGL_KHR_surfaceless_context causes spurious "libEGL warning:
  FIXME: egl/x11 doesn't support front buffer rendering."
- !3460 broke texsubimage test with piglit on zink+anv
- VERSION needs to be bumped for trunk master
- The screen is black when using ACO

Changes
-------

Abhishek Kumar (1):

- anv/android: fix assert in anv_import_ahw_memory

Adam Jackson (1):

- gallium: enable EGL_EXT_image_dma_buf_import_modifiers
  unconditionally

Albert Astals Cid (5):

- cube_face_coord: Use fabsf instead of fabs since we know it's floats
- cube_face_index: Use fabsf instead of fabs since we know it's floats
- aco: Minor optimization in spill_ctx constructor
- aco: pass vars by const &
- Fix promotion of floats to doubles

Alejandro Piñeiro (7):

- docs/features: add v3d driver
- nir/linker: remove reference to just SPIR-V linking
- v3d/tex: don't configure tmu config 1 if not needed
- v3d/tex: Configuration Parameter 1 can be only skipped if P2 can be
  skipped too
- v3d/packet: fixing TMU_Config_Parameter_2 definition
- nir: add nir_tex_instr_need_sampler helper
- v3d: support for textureQueryLOD

Alexandros Frantzis (3):

- gitlab-ci: Automated testing with OpenGL traces
- gitlab-ci: Fix traces caching in tracie
- gitlab-ci: Check the Mesa version used for tracie tests

Alyssa Rosenzweig (505):

- pan/midgard: Break out one-src read_components
- pan/midgard: Implement mixed-type constant packing
- panfrost: Avoid overlapping copy
- pan/midgard: Check for null consts
- pan/midgard: Remove unused variable
- panfrost: Use size0 when calculating the offset to a depth level
- pan/midgard: Fix scheduling issue with csel + render target reference
- panfrost: Simplify swizzle translation
- panfrost: Update comment about magic number relating to barriers
- panfrost: Ensure compute shader_meta is zeroed
- panfrost: Identify mali_shared_memory structure
- panfrost: Unify bifrost_scratchpad with mali_shared_memory
- panfrost: Rename bifrost_framebuffer->mali_framebuffer
- panfrost: Rename unknown2_8 to padding
- panfrost: Allocate RAM backing of shared memory
- pan/midgard: Track pressure when scheduling ld/st
- pan/midgard: Fix missing prefixes
- pan/midgard: Fix swizzles harder
- pan/midgard: Implement barriers
- pan/midgard: Allow jumping out of a shader
- pan/midgard: Fix 32/64 mixed swizzle packing
- pan/midgard: Use dummy tag for empty shaders
- pan/midgard: Improve barrier disassembly
- pan/midgard: Overhaul tag handling
- pan/midgard: Imply next tags
- pan/midgard: Infer tags entirely
- pan/midgard: Set xyzx swizzle for load_compute_arg
- pan/midgard: Identify stack barrier flag
- pan/midgard: Don't crash with constants on unknown ops
- pan/midgard: Use fprintf instead of printf for constants
- pan/decode: Remove extraneous newline
- pan/decode: Add \`minimal\` mode
- pan/decode: Cleanup pandecode_jc
- panfrost: Implement PAN_DBG_SYNC with pandecode/minimal
- panfrost: Print synced traces to stderr
- panfrost: Rewrite scoreboarding routines
- panfrost: Update scoreboarding notes
- panfrost: Cleanup transfer_map
- panfrost: Avoid reading GPU memory when packing vertices
- panfrost: Debitfieldize mali_uniform_buffer_meta
- panfrost: Remove enum panfrost_memory_layout
- panfrost: Remove dirty tracking
- panfrost: Remove old comment
- panfrost: Remove old hack
- panfrost: Remove flush_frontbuffer
- pan/midgard: Identify clamp(x, -1.0, 1.0) flag
- panfrost: Move checksum routines to root panfrost
- panfrost: Move pan_afbc.c to root
- panfrost: Move format translation to root
- panfrost: Rewrite texture descriptor creation logic
- nir: Add SSBO->global lowering pass
- pan/midgard: Lower SSBOs in NIR
- pan/midgard: Implement nir_intrinsic_get_buffer_size
- pan/midgard: Implement load/store_shared
- panfrost: Combine get_index_buffer with bound computation
- panfrost: Implement index buffer cache
- pan/decode: Dump scratchpad size if present
- pan/midgard: Don't spill near a branch
- panfrost: Fix gl_VertexID/InstanceID
- panfrost: Fix padded_vertex_count generation
- panfrost: Update spilling comment framebuffer->shared
- panfrost: Don't set shared->unk0
- panfrost: Fix param getting
- panfrost: Default to 256 threads for TLS
- panfrost: Reserve an extra page for spilling
- panfrost: Simplify stack shift calculation
- panfrost: Expose PIPE_CAP_PRIMITIVE_RESTART
- panfrost: Add PAN_MESA_DEBUG=gles3 option
- panfrost: Increase SSBO/image limit from 4->8
- pan/midgard: Allow inverted inverted ops
- pan/midgard: Allow fusing inverted sources for inverted ops
- pan/midgard: Partially fix 64-bit swizzle alignment
- pan/midgard: Extract nir_ssa_index helper
- pan/midgard: Add LDST_ADDRESS property
- pan/midgard: Fix load/store argument sizing
- pan/midgard: Round up bytemasks when promoting uniforms
- pan/midgard: Force address alignment
- pan/midgard: Add address analysis framework
- pan/midgard: Use address analysis for globals, etc
- pan/decode: Calm an assert to a pandecode error
- pan/decode: Restore bifrost sample_locations
- pan/decode: Fix tiler weights printing
- pan/decode: Skip analysis for Bifrost tiler structures
- pan/bi: Add discard ops
- pan/bi: Add ICMP.GL.NEQ op
- pan/bi: Move notes on FMA opcodes from disassembler
- pan/bi: Introduce CSEL4 class
- pan/bi: Move notes on ADD ops to notes file
- pan/bi: Decode FMA_SHIFT properly
- pan/bi: Add v4i8 mode to FMA_SHIFT
- pan/bi: Identify extended FMA opcodes
- pan/bi: Decode ADD_SHIFT properly
- pan/bi: Combine LOAD_VARYING_ADDRESS instructions by type
- pan/bi: Squash LD_ATTR ops together
- pan/bi: Structify FMA_FADD
- pan/bi: Move some definitions from disasm to bifrost.h
- panfrost: Add note about preloaded varyings
- pan/bi: Gut old compiler
- pan/bi: Stub out new compiler
- pan/bi: Add the control flow graph
- pan/bi: Add src/dest fields to bifrost_instruction
- pan/bi: Add class properties
- pan/bi: Add modifiers to bi_instruction
- pan/bi: Add BI_GENERIC property
- pan/bi: Factor out enum bifrost_minmax_mode
- pan/bi: Add a bifrost_roundmode field
- pan/bi: Add bifrost_minmax_mode field
- pan/bi: Add bi_load structure
- pan/bi: Pull out bifrost_load_var
- pan/bi: Add bi_load_vary structure
- pan/bi: Add PAN_SCHED\_\* flags
- pan/bi: Add bi_clause, bi_bundle abstractions
- pan/bi: Add dest_type field to bifrost_instruction
- pan/bi: Add special indices
- pan/bi: Add constant field to bi_instruction
- pan/bi: Add class-specific ops
- pan/bi: Add clause header fields to bi_clause
- pan/bi: Clarify special op scheduling
- pan/bi: Add swizzles
- pan/bi: Add source type for conversions
- pan/bi: Add EXTRACT, MAKE_VEC synthetic ops
- pan/bi: Add constants to bi_clause
- pan/bi: Add pred/successors to build CFG
- pan/bi: Extract bifrost_branch structure
- pan/bi: Add bi_branch data
- pan/bi: Add CSEL condition
- pan/bi: Add high-latency property for classes
- pan/bi: Add quirks system
- pan/bi: Add IR iteration macros
- pan/bi: Move some print routines out of the disasm
- pan/bi: Add BIR manipulation routines to bir.c
- pan/bi: Move bi_interp_mode_name to bi_print
- pan/bi: Add bi_instruction printing
- pan/bi: Add bi_print_bundle for printing bi_bundle
- pan/bi: Add bi_print_clause
- pan/bi: Add bi_print_block
- pan/bi: Add bi_print_shader
- pan/bi: Lower and optimize NIR
- pan/bi: Walk through the NIR control flow graph
- pan/bi: Improve block printing
- pan/bi: Don't print types for unconditional branches
- pan/bi: Print branch target
- pan/bi: Add instruction emit/remove helpers
- pan/bi: Call nir_lower_io_to_temporaries in cmdline
- pan/bi: Add support for if-else blocks
- pan/bi: Handle loops when ingesting CFG
- pan/bi: Handle jumps (breaks, continues)
- pan/bi: Fix destination printing
- pan/bi: Implement nir_intrsinic_load_interpolated_input
- pan/bi: Add blend_location to IR for BI_BLEND
- pan/bi: Add bi_schedule_barrier helper
- pan/bi: Implement store_output for fragment shaders
- pan/bi: Implement load_input for vertex shaders
- pan/bi: Add helpers for creating temporaries
- pan/bi: Implement store_vary for vertex shaders
- pan/bi: Add preliminary LOAD_UNIFORM implementation
- pan/bi: Implement load_const
- pan/bi: Add dummy scheduler
- pan/bi: Rename next-wait to simply 'wait'
- pan/bi: Fix Android.mk
- panfrost: Move mir_to_bytemask to common code
- pan/bi: Generalize swizzles to avoid extracts
- pan/bi: Introduce writemasks
- pan/bi: Remove bi_load
- pan/bi: Lower vec\* to writemasks in NIR
- pan/bi: Add initial handling of ALU ops
- pan/bi: Allow inlining constants
- pan/bi: Implement fsat as mov.sat
- pan/bi: Add a bunch of ALU ops
- pan/bi: Add BI_SPECIAL\_\* enum
- pan/bi: Handle special ops in NIR->BIR
- pan/bi: Implement fabs, fneg as fmov with mods
- pan/bi: Disable lower_sub
- pan/bi: Add isub op
- pan/bi: Import algebraic pass from midgard
- pan/bi: Implement nir_op_bcsel
- pan/bi: Lower b2f to bcsel
- pan/bi: Specify comparison op for BI_CMP
- pan/bi: Print source types unconditionally
- pan/bi: Implement comparison opcodes via BI_CMP
- panfrost: Promote midgard_program to panfrost/util
- pan/midgard: Remove unused iterators
- pan/midgard: Adjust sysval-related prototypes
- pan/midgard: Remove indexing dependency of sysvals
- pan/midgard: Decontextualize midgard_nir_assign_sysval_body
- pan/midgard: Remove dest_override sysval argument
- panfrost: Move Midgard sysval code to common Panfrost
- pan/bi: Switch to panfrost_program
- pan/bi: Implement sysvals
- pan/midgard: Localize \`visited\` tracking
- pan/midgard: Decontextualize liveness analysis core
- pan/midgard: Sync midgard_block field names with Bifrost
- pan/midgard: Subclass midgard_block from pan_block
- panfrost: Move liveness analysis to root panfrost/
- panfrost: Sync Midgard/Bifrost control flow
- pan/bi: Paste over bi_has_arg
- pan/bi: Add bi_bytemask_of_read_components helpers
- pan/bi: Add bi_next/prev_op helpers
- pan/bi: Add bi_max_temp helper
- pan/bi: Add liveness analysis pass
- pan/bi: Add dead code elimination pass
- pan/bi: Implement nir_op_ffma
- pan/bi: Fix swizzle for second argument to ST_VARY
- panfrost: Move lcra to panfrost/util
- pan/midgard: Remove incorrect comment in RA
- pan/bi: Minor fixes in iteration macros
- pan/bi: Fix vector handling of readmasks
- pan/bi: Fix missing src_types
- pan/bi: Add register allocator
- pan/bi: Interpret register allocation results
- pan/bi: Setup initial clause packing
- pan/bi: Sketch out instruction word packing
- pan/bi: Add packing for register control field
- pan/bi: Pack register fields
- pan/bi: Add missing \__attribute__((packed))
- pan/bi: Assign registers to ports
- pan/bi: Route through first_instruction field
- pan/bi: Model 3-bit Bifrost srcs in IR
- pan/bi: Add struct bifrost_fma_fma
- pan/bi: Pack BI_FMA ops
- pan/bi: Pack fadd32
- pan/bi: List ADD classes in bi_pack_add
- pan/bi: Generalize bi_get_src a bit
- pan/bi: Pass second src for load_vary ops
- pan/bi: Emit load_vary ops
- pan/bi: Skip over data registers in port assignment
- pan/bi: Route through clause header
- pan/bi: Pretty-print clause types in disassembler
- pan/bi: Don't hide SCHED_ADD inside HI_LATENCY
- pan/bi: Track clause types during scheduling
- pan/bi: Flesh out ATEST in IR
- pan/bi: Add ATEST packing
- pan/bi: Flesh out BI_BLEND
- pan/bi: Pack BI_BLEND
- pan/bi: Implement FMA/MOV without modifiers
- pan/bi: Add bi_emit_before helper
- pan/bi: Add move lowering pass
- pan/bi: Pack a constant quadword
- pan/bi: Document constant related errata(?)
- pan/bi: Index out constants in instructions
- pan/bi: Include UBO index for sysval reads
- pan/bi: Add bi_load32_components helper
- pan/bi: Pack ld_ubo ops
- pan/bi: Pack ld_var_addr
- pan/bi: Flesh out st_vary IR
- pan/bi: Generalize data register setting
- pan/bi: Add store_channels property
- pan/bi: Pack st_vary
- pan/bi: Pack LD_ATTR
- pan/bi: Lower bool to ints
- pan/bi: Remove hacks for 1-bit booleans in IR
- pan/bi: Add \`soft\` NIR->BIR condition translation
- pan/bi: Implement csel fusing
- pan/bi: Respect shift when printing immediates
- pan/bi: Use bi_lookup_immediate when packing
- pan/bi: Default csel to "!= 0" mode
- pan/bi: Pack csel4 opcodes
- pan/bi: Ingest vecN directly (again)
- pan/bi: Lower combines to rewrites for scalars
- pan/bi: Rewrite aligned vectors as well
- panfrost: Split panfrost_device from panfrost_screen
- panfrost: Isolate panfrost_bo_access_for_stage to pan_cmdstream.c
- panfrost: Inline reference counting routines
- panfrost: Move pan_bo to root panfrost
- pan/bit: Link standalone compiler with en/decoder
- panfrost: Move device open/close to root panfrost
- pan/bit: Open up the device
- panfrost: Stub out G31/G52 quirks
- pan/bit: Submit a WRITE_VALUE job as a sanity check
- pan/bit: Begin generating a vertex job
- pan/bi: Fix overzealous write barriers
- pan/bi: Fix off-by-one in scoreboarding packing
- pan/bi: Enable precision lowering in standalone compiler
- panfrost: Enable PIPE_SHADER_CAP_FP16 on Bifrost
- pan/bi: Handle f2f\* opcodes
- pan/bi: Ignore swizzle in unwritten component
- pan/bi: Finish FMA structures
- pan/bi: Fix missing type for fmul
- pan/bi: Add FMA16 packing
- pan/bi: Pack outmod and roundmode with FMA
- pan/bi: Expand out FMA conversion opcodes
- pan/bi: Enumerate conversions
- pan/bi: Handle standard FMA conversions
- pan/bi: Add bifrost_fma_2src generic
- pan/bi: Add one-source f32->f16 op
- pan/bi: Assert out i16 related converts for now
- pan/bi: Handle round opcodes in frontend
- pan/bi: Add v2f16 versions of rounding ops
- pan/bi: Structify fadd/min/max16
- pan/bi: Handle core faddminmax16 packing
- pan/bi: Handle abs packing for fp16/FMA add/min
- pan/bi: Handle fp16/abs scheduling restriction
- pan/bi: Fix handling of constants with COMBINE
- pan/bit: Add \`run\` mode to the cmdline
- pan/bit: Wire through I/O
- pan/bi: Fix writes_component for VECTOR
- pan/bi: Use STAGE srcs for scheduler nops
- pan/bi: Don't set the back-to-back bit yet
- pan/bi: Add cmdline option for verbose disassembly
- pan/bi: Fix unused port swapping
- pan/bi: Handle fmov class ops
- pan/bi: Fix outmod/roundmode flip
- pan/bi: Export bi_class_name
- pan/bi: Fix duplicated source in ADD.v2f16
- pan/bi: Fix negation in ADD.v2f16
- pan/bi: Don't gobble zero ports
- pan/bi: Allow BI_FMA to take mods
- pan/bi: Handle BIFROST_FIRST_WRITE_FMA_P2_READ_P3
- pan/bi: Add helper to debug port assignment
- pan/bi: Match CSEL argument order with hw
- pan/bit: Stub out BIR interpreter
- pan/bit: Handle read/write
- pan/bit: Add preliminary FMA/ADD/MOV implementations
- pan/bit: Implement outmods
- pan/bit: Implement floating source mods
- pan/bit: Add packing test framework
- pan/bit: Add helper for generating floating mod tests
- pan/bit: Add verbose printing for tests
- pan/bit: Add 16-bit fmod tests
- pan/bit: Add FMA tests
- pan/bit: Add CSEL to interpreter
- pan/bit: Add csel tests
- pan/bit: Make run more useful
- pan/bit: Add mode to run unit tests
- pan/bi: Remove nontrivial SPECIAL ops
- pan/bi: Add 32-bit \_FAST packing
- pan/bi: Add fp16 support for frcp/frsq
- pan/bit: Add special op interpreting
- pan/bit: Add special unit test
- pan/bi: Implement min/max on FMA
- pan/bi: Structify ADD unit add/min/max
- pan/bi: Add ADD add/min/max fp32 packing
- pan/bi: Set BI_MODS for MINMAX
- pan/bi: Fix incorrect abs flip in fma/fadd16
- pan/bi: Force ADD scheduling for MINMAX
- pan/bit: Unify test frontends
- pan/bit: Add min/max support to interpreter
- pan/bit: Enable more debug for \`run\`
- pan/bit: Add fmin/max16 tests
- pan/bit: Wire up add/add op+test
- panfrost: Add IS_BIFROST quirk
- panfrost: Populate bifrost-specific structs within mali_shader_meta
- panfrost: Staticize a few cmdstream functions
- panfrost: Unify vertex/tiler structures
- panfrost: Set mfbd.msaa.sample_locations on Bifrost
- panfrost: Call the Bifrost compiler on bi devices
- pan/bi: Fix nondeterministic register packing
- pan/midgard: Remove unused max_varying variable
- panfrost: Move varying linking to cmdstream
- panfrost: Move uniform_count to pan_assemble
- panfrost: Pass compiler-appropriate options
- pan/bi: Fix backwards registers ports
- panfrost: Fix BI_BLEND packing
- pan/bi: Let !b2b imply branch_cond
- pan/decode: Print Bifrost blend descriptor
- panfrost: Drop dependency on nonexistant write_value
- pan/bi: Lower fsqrt
- pan/midgard: Fix f2u naming confusion
- pan/bi: Set BI_ROUNDMODE for BI_CONVERT
- pan/bi: Fix incorrect swizzle packing assert
- pan/bi: Rewrite conversion packing
- pan/bi: ADD packing for CONVERT
- pan/bit: Add BI_CONVERT interpretation
- pan/bit: Add BI_CONVERT tests
- pan/bi: Add disasm for ADD.i8
- pan/bi: Disable FMA scheduling for CONVERT
- pan/bi: Add BI_TABLE for fast table accesses
- pan/bi: Add special op for exp2
- pan/bi: Add op for ADD_FREXPM
- pan/bi: Add FLOG2_U op to disassembler
- pan/bi: Add log_frexpe op to IR
- pan/bi: Add frexp_log packing
- pan/bi: Add bi_pack_fma_2src helper
- pan/bi: Pack ADD_FREXPM
- pan/bi: Add log2_help packing
- pan/bi: Add \_MSCALE flag for FMA/ADD
- pan/bi: Structify FMA_MSCALE
- pan/bi: Pack FMA_MSCALE
- pan/bi: Add fexp2_fast packing
- pan/bi: Split src/dest index printing
- pan/bi: Ensure CONSTANT srcs have types
- pan/bi: Fix bi_get_immediate with multiple imms
- pan/bi: Fix packing with multiple constants
- pan/bi: Fix packing with low-nibble-set on hi constant
- pan/bi: Fix lower_combine swizzle rewrite
- pan/bi: Add fexp2 implementation
674 - pan/bi: Implement flog2
675 - pan/bi: Fix vec2/3 handling
676 - pan/bi: Handle st_vary with <4 components
677 - pan/bi: Try to reuse constants in ALU
678 - pan/bi: Workaround constant packing errata
679 - pan/bi: Structify add and min/max fp16 ADD
680 - pan/bi: Pack ADD.v2f16
681 - pan/bi: Pack MAX.v2f16
682 - pan/bi: Dump extra bits for disasm
683 - pan/bi: Round constants to 32-bit
684 - pan/bi: Lower special ops to 32-bit
685 - pan/bit: Add FREXP interp support
686 - pan/bit: Add frexp_log test
687 - pan/bit: Add BI_REDUCE_FMA interp
688 - pan/bit: Add FMA_REDUCE test
689 - pan/bit: Add log2 helper interp
690 - pan/bit: Add BI_TABLE test
691 - pan/bit: \_MSCALE interp
692 - pan/bit: Add FMA_MSCALE test
693 - pan/bit: Add fexp2_fast interp
694 - pan/bit: Add fexp2_fast test
695 - pan/bit: Add constants test
696 - pan/bit: Add fp16 min/max tests
697 - pan/bi: Print tex_compact coordinates
698 - pan/bi: Document when dual-tex is triggered
699 - pan/bi: Disassemble f16 dual tex
700 - pan/bi: Structify TEX compact
701 - pan/bi: Include TEX_COMPACT f16 opcode
702 - pan/bi: Feed data register to BI_TEX
703 - pan/bi: Add normal/compact/dual switch to IR
704 - pan/bi: Stub out tex_compact logic
705 - pan/bi: Generate TEX_COMPACT instruction
706 - pan/bi: Pack TEX compact instructions
707 - pan/bi: Assert out multiple textures
708 - panfrost: Fix crashes with small BOs
709 - panfrost: Assert on unimplemented fragcoord etc
710 - panfrost: Set clear_color_[12] in the extra fb desc
711 - panfrost: Add tentative bifrost_texture_descriptor
712 - panfrost: decode textures and samplers on bifrost
713 - pan/decode: Remove is_zs weirdness
714 - panfrost: Identify texture layout field
715 - panfrost: The texture descriptor has a pointer to a trampoline
716 - pan/bi: Pack fp16 ATEST
717 - pan/bi: Passthrough type for ATEST
718 - pan/bi: Passthrough blend types
719 - pan/bi: Assign blend descriptor for BLEND op
720 - pan/bi: Add missing BI_VECTOR
721 - pan/bi: Fix ADD.v4i8 opcode
722 - pan/bi: Eliminate writemasks in the IR
723 - pan/bi: Rename BI_SWIZZLE to BI_SELECT
724 - pan/bi: Pack FMA SEL16
725 - pan/bi: Pack FMA SEL8
726 - pan/bi: Pack ADD SEL16
727 - pan/bi: Force BI_SELECT arguments scalar
728 - pan/bit: Interpret BI_SELECT
729 - pan/bit: Add SELECT tests
730 - pan/bi: Fix RA wrt 16-bit swizzles
731 - pan/bi: Implement 16-bit COMBINE lowering
- nir: Move nir_lower_mediump_outputs from ir3
- ir3: Use shared mediump output lowering
- pan/bi: Add bool->float opcodes
- pan/bi: Add CSEL.64 opcode
- pan/bi: Add some 8-bit compares
- pan/bi: Add 64-bit int compares
- pan/bi: Add FCMP.GL.v2f16 on ADD opcode
- pan/bi: Add CSEL.8 opcode
- pan/bi(t): Fix SELECT tests
- pan/bi: Deduplicate csel/cmp cond
- pan/bi: Remove bi_round_op
- pan/bi: Structify FMA FCMP
- pan/bi Strucitfy ADD FCMP 32
- pan/bi: Structify FMA FCMP16
- pan/bi: Structify ADD FCMP16
- pan/bi: Structify FMA ICMP 32
- pan/bi: Structify FMA ICMP 16
- pan/bi: Structify ADD ICMP 32
- pan/bi: Fix source mod testing for CMP
- pan/bi: Pack FMA 32 FCMP
- pan/bi: Factor out fp16 abs logic
- pan/bi: Pack fma.fcmp16
- pan/bi: Relax double-abs condition
- pan/bit: Prepare condition evaluation for vectors
- pan/bit: Interpret CMP
- pan/bi: Add initial fcmp test
- pan/bi: Add bitwise modifiers
- pan/bi: Pack BI_BITWISE
- pan/bi: Handle iand/ior/ixor in NIR->BIR
- pan/bit: Interpret BI_BITWISE
- pan/bit: Add BITWISE test
- panfrost: Fix BO reference counting
- panfrost: Move Bifrost IR indexing to common
- pan/bi: Use common IR indices
- pan/mdg: Remove nir_alu_src_index
- pan/mdg: Use PAN_IS_REG
- pan/mdg: SSA_FIXED_MINIMUM already covered by PAN_IS_REG
- pan/mdg: Don't break SSA
- pan/mdg: Remove goofy 16-bit comment
- pan/mdg: Remove old hack
- pan/mdg: Set lower_flrp16
- pan/bi: Share ALU type printing
- pan/mdg: Add type fields to IR
- pan/mdg: Track ALU src types
- pan/mdg: Track ALU dest type
- pan/mdg: Another goofy comment gone
- pan/mdg: Track a primary type for I/O
- pan/mdg: Denoise prints
- pan/mdg: Track v_mov type (force uint32 for now?)
- pan/mdg: Track texture types
- pan/mdg: Set texture full fields at pack time
- pan/mdg: Move sampler_type emission to pack time
- pan/mdg: Lower specials to 32-bit
- pan/mdg: Specialize swizzle to type
- pan/mdg: Always print the mask
- pan/mdg: Make some branch targets more explicit
- pan/mdg: Don't crash on unknown branch target
- pan/mdg: Pass through some types from scheduling
- pan/mdg: Move condense_writemask to disasm
- pan/mdg: Ensure fdot is scalar out in disasm
- pan/mdg: Replicate 16-bit swizzles

Andreas Baierl (8):

- lima/parser: Fix RSW depth test parsing
- lima/parser: Extend AUX0 findings
- lima/parser: Change value name in RSW parser
- lima/parser: Extend rsw parsing showing strings instead of numbers
- gitlab-ci: lima: Add flaky tests to the skips list
- gitlab-ci: Enable the lima job again
- gitlab-ci: Add add a set of lima flakes
- lima: Add etc1 support

Andres Gomez (27):

- tracie: correct typo
- gitlab-ci: add missing popd to the build-deqp-vk.sh script
- gitlab-ci: build gfxreconstruct into the Vulkan testing container
- gitlab-ci: build VulkanTools into the Vulkan testing container
- gitlab-ci: Change devices format to <api-vendor-deviceId>
- gitlab-ci: Add gfxreconstruct traces support
- gitlab-ci: Add jobs to be able to test Vulkan
- gitlab-ci: Fix indentation and dangerous "\" in the last multiline
  line
- gitlab-ci: Remove unneeded python3-pilkit dependency
- gitlab-ci: Sort packages to install alphabetically
- gitlab-ci: add python3-requests to the test-vk container
- gitlab-ci/traces: Add Vulkan sample entries for POLARIS10
- gitlab-ci: Don't use buster-backports packages by default for
  x86_test-vk
- gitlab-ci: add Wine, win64's apitrace and DXVK to the Vulkan testing
  container
- gitlab-ci: add apitrace's DXGI traces support
- gitlab-ci: replay apitrace traces in headless mode
- gitlab-ci: add Wine and DXVK env variables to Vulkan's tracie runner
- gitlab-ci/traces: Add D3D11 sample entry for POLARIS10
- gitlab-ci: Vulkan tracie runner to return last command exit code
- gitlab-ci: protect usage of shell variables with double quotes
- gitlab-ci: make explicit tracie is gitlab specific
- gitlab-ci: adapt query_traces_yaml to gitlab specific changes
- gitlab-ci: install winehq-stable to get 5.0 instead of 4.0
- Revert "meson,ci: Disable sparse_array tests on windows"
- gitlab-ci: update tracie README after changes in main script
- gitlab-ci: create always the "results" directory with tracie
- gitlab-ci: correct tracie behavior with replay errors

Andrii Simiklit (2):

- Revert "glx: convert glx_config_create_list to one big calloc"
- i965/vec4: Ignore swizzle of VGRF for use by var_range_end()

Anuj Phogat (2):

- intel/gen12+: Reserve 4KB of URB space per bank for Compute Engine
- intel/gen12+: Set way_size_per_bank to 4

Arcady Goldmints-Orlov (7):

- compiler/nir: Add support for variable initialization from a pointer
- compiler/spirv: Add support for non-constant initializers
- Rename nir_lower_constant_initializers to
  nir_lower_variable_initalizers
- spirv: Remove outdated SPIR-V decoration warnings
- nir: Lower returns correctly inside nested loops
- anv: increase minUniformBufferOffsetAlignment to 64
- intel/compiler: fix alignment assert in nir_emit_intrinsic

Axel Davy (1):

- gallium/util: Fix leak in the live shader cache

Bas Nieuwenhuizen (29):

- radv: Allow non-dedicated linear images and buffer.
- radv: Do not set SX DISABLE bits for RB+ with unused surfaces.
- radv: Optimize emitting index buffer changes.
- radv: Do not redundantly set the RB+ regs on pipeline switch.
- radeonsi: Fix compute copies for subsampled formats.
- amd/llvm: Fix divergent descriptor indexing. (v3)
- amd/llvm: Fix divergent descriptor regressions with radeonsi.
- radv: Store 64-bit availability bools if requested.
- radv: Consider maximum sample distances for entire grid.
- radv: Whitespace fixup.
- radv: Use correct buffer count with variable descriptor set sizes.
- winsys/amdgpu: Retrieve WC flags from imported buffers.
- drm-uapi,radv,radeonsi: Add amdgpu_drm.h header.
- vulkan/wsi: Add callback to set ownership of buffer.
- radv: Add WSI buffers to BO list only if they can be used.
- st/dri: Set next in template instead of after creation. (v2)
- radeonsi: Count planes for imported textures.
- radv: Use actual memory type count for setting app-visible bitset.
- radv: Stop using memory type indices.
- radv/winsys: Add function to get domains/flags from fd.
- radv: Determine memory type for import based on fd.
- radv: Expose 4G element texel buffers.
- radv: Fix implicit sync with recent allocation changes.
- radv: Extend tiling flags to 64-bit.
- radv: Provide a better error for permission issues with priorities.
- radv/winsys: Remove extra sizeof multiply.
- radv: Handle failing to create .cache dir.
- radv: Do not close fd -1 when NULL-winsys creation fails.
- radv: Implement vkGetSwapchainGrallocUsage2ANDROID.

Bernd Kuhls (1):

- util/os_socket: Include unistd.h to fix build error

Blaž Tomažič (1):

- radeonsi: Fix omitted flush when moving suballocated texture

Boris Brezillon (45):

- pan/midgard: Add an enum to describe the render targets
- pan/midgard: Make sure we pass the right RT id to
  emit_fragment_store()
- pan/midgard: Lower bitfield extract to shifts
- pan/midgard: Don't check 'branch && branch->writeout' twice in
  mir_schedule_alu()
- pan/midgard: Stop leaking instruction objects in mir_schedule_alu()
- panfrost: Fix the damage box clamping logic
- pan/midgard: Turn Z/S stores into zs_output_pan intrinsics
- pan/midgard: Add nir_intrinsic_store_zs_output_pan support
- panfrost: Z24 variants should be sampled as R32UI
- panfrost: Add the MALI_WRITES_{Z,S} flags
- panfrost: Set the MALI_WRITES_{Z,S} flags when needed
- Revert "panfrost: Z24 variants should be sampled as R32UI"
- panfrost: Pass the sampler view format when creating a tex descriptor
- panfrost: Assign primitive_size.pointer only if writes_point_size()
  returns true
- panfrost: Add an helper to retrieve the currently active shader state
- panfrost: Move the batch stack size adjustment out of
  panfrost_queue_draw()
- panfrost: Move viewport desc emission out of panfrost_emit_for_draw()
- panfrost: Move the const buf emission logic out of
  panfrost_emit_for_draw()
- panfrost: Move shared mem desc emission out of panfrost_launch_grid()
- panfrost: Dissociate shader meta patching from the desc emission
- panfrost: Move panfrost_attach_vt_framebuffer() to pan_cmdstream.c
- panfrost: Stop using panfrost_emit_for_draw() for compute jobs
- panfrost: Simplify panfrost_emit_for_draw() and make it private
- panfrost: Add an helper to update the occclusion query part of a
  tiler job desc
- panfrost: Add an helper to update the rasterizer part of a tiler job
  desc
- panfrost: Prepare things to get rid of panfrost_shader_state.tripipe
- panfrost: Prepare shader_meta descriptors at emission time
- panfrost: Add a panfrost_sampler_desc_init() helper
- panfrost: Move sampler/tex descs emission helpers to pan_cmdstream.c
- panfrost: Add an helper to emit a pair of vertex/tiler jobs
- panfrost: Drop initial mali_attr_meta.src_offset assignment
- panfrost: Ignore BO start addr when adjusting src_offset
- panfrost: Prepare attribute for builtins at state creation time
- panfrost: Emit attribute descriptors after patching the templates
- panfrost: Move the mali_attr.src_offset adjustment to a sub-function
- panfrost: Rename panfrost_stage_attributes()
- panfrost: Move streamout offset update out of panfrost_draw_vbo()
- panfrost: Move vertex/tiler payload initialization out of
  panfrost_draw_vbo()
- panfrost: Inline panfrost_queue_draw() and panfrost_emit_for_draw()
- panfrost: Move panfrost_emit_vertex_data() to pan_cmdstream.c
- panfrost: Move panfrost_emit_varying_descriptor() to pan_cmdstream.c
- panfrost: Re-init the VT payloads at draw/launch_grid() time
- panfrost: Use ctx->active_prim in panfrost_writes_point_size()
- panfrost: Get rid of ctx->payloads[]
- vtn/opencl: add rint-support

Brian Ho (17):

- turnip: Promote tu_cs_get_size/is_empty to header
- turnip: Execute main cs for secondary command buffers
- turnip: Advertise 8 bit subpixel precision
- ir3: Disable copy prop for immediate ldlw offsets
- turnip: Set has_gs in ir3_shader_key
- turnip: Emit geometry shader obj and related consts
- turnip: Configure VPC for geometry shaders
- turnip: Configure VFD_CONTROL with gsheader and primitiveid
- turnip: Set up REG_A6XX_SP_GS_CONFIG
- turnip: Selectively configure GRAS_LAYER_CNTL
- turnip: Update maxGeometryShaderInvocations to match blob
- turnip: Populate tu_pipeline.active_stages
- turnip: Enable geometry shaders for CP_DRAWs
- turnip: Enable geometryShader device feature
- turnip: Correctly set layer stride for 3D images
- turnip: Emit geometry shader descriptor consts
- freedreno/turnip: Update GRAS_LAYER_CNTL to GRAS_MAX_LAYER_INDEX

Caio Marcelo de Oliveira Filho (46):

- anv: Advertise VK_KHR_shader_non_semantic_info
- radv: Advertise VK_KHR_shader_non_semantic_info
- intel/gen12: Take into account opcode when decoding SWSB
- spirv: Be consistent when checking for Shader/Kernel
- anv: Use intel_debug_flag_for_shader_stage()
- anv: Add pipe_state_for_stage() helper
- nir/builder: Add nir_scoped_memory_barrier()
- nir: Add the alias NIR_MEMORY_ACQ_REL
- nir/tests: Use nir_scoped_memory_barrier() helper
- nir, intel: Move use_scoped_memory_barrier to nir_options
- anv: Remove unused field xfb_used from anv_pipeline
- anv: Remove unused field \`urb.total_size\`
- nir: Don't skip a bit in nir_memory_semantics
- nir: Reorder nir_scopes so wider scope has larger numeric value
- nir: Add pass to combine adjacent scoped memory barriers
- intel/fs: Combine adjacent memory barriers
- anv: Add a new enum to identify the pipeline type
- anv: Use pipeline type to decide whether or not lower multiview
- anv: Use a dynamic array for storing executables in pipeline
- anv: Keep the shader stage in anv_shader_bin
- anv: Pass the right pipe_state to flush_descriptor_sets()
- anv: Remove redundant check in flush_descriptor_sets() helpers
- anv: Decouple flush_descriptor_sets() helpers from pipeline struct
- anv: Decouple flush_descriptor_sets() from pipeline struct
- anv: Use a separate field in the pipeline for compute shader
- anv: Split graphics and compute bits from anv_pipeline
- anv: Reduce compute pipeline batch_data size
- anv: Remove duplicate code in anv_cmd_buffer_bind_descriptor_set
- intel/blorp: Plumb the stage through blorp upload_shader
- mesa/main: Fix overflow in validation of DispatchComputeGroupSizeARB
- nir: Add per_view attribute to nir_variable
- intel/gen12: Add XML description for 3DSTATE_PRIMITIVE_REPLICATION
- intel/fs: Allow multiple slots for position
- anv/gen12: Lower VK_KHR_multiview using Primitive Replication
- intel/compiler: Replace cs_prog_data->push.total with a helper
- anv: Stop using cs_prog_data->threads
- iris: Stop using cs_prog_data->threads
- intel/compiler: Remove cs_prog_data->threads
- intel/fs,vec4: Properly account SENDs in IVB memory fence
- spirv: Fix propagation of OpVariable access flags
- spirv: Handle instruction aliases in vtn_gather_types
- spirv: Update the headers from latest Khronos master
- intel/fs: Allow FS_OPCODE_SCHEDULING_FENCE stall on registers
- intel/fs,vec4: Pull stall logic for memory fences up into the IR
- intel/fs: Only stall after sending all memory fence messages
- i965: Use correct constant for max_variable_local_size

Chad Versace (12):

- anv: Drop unused anv_image_get_surface_for_aspect_mask()
- anv: Rename param make_surface::dev to device
- anv: Delete anv_image::ccs_e_compatible
- anv: Clarify behavior of anv_image_aspect_to_plane()
- anv: Respect ISL_SURF_USAGE_DISABLE_AUX_BIT in make_surface()
- turnip: Add magic register values to tu_physical_device
- turnip: Add a618 support
- anv: Drop anv_image.c:get_surface()
- anv: Add anv_image_plane_needs_shadow_surface() (v2)
- anv: Refactor creation of aux surfaces (v2)
- anv: Flatten the logic add_aux_surface_if_supported (v3)
- anv: Use isl_drm_modifier_get_default_aux_state()

Chia-I Wu (2):

- egl/android: require ANDROID_native_fence_sync for buffer age
- egl/android: enable/disable KHR_partial_update correctly

Chris Lord (2):

- vc4: fix vc4_yuv_blit overwriting fragment constant buffer slot 0
- vc4: Fix query_dmabuf_modifiers mis-reporting external_only property

Chris Wilson (1):

- iris: Fix import sync-file into syncobj

Christian Gmeiner (44):

- etnaviv: enable texture upload memory throttling
- etnaviv: update headers from rnndb
- etnaviv: fix alpha test on GC3000
- etnaviv: add etna_constbuf_state object
- etnaviv: ask kernel for max number of supported varyings
- etnaviv: update headers from rnndb
- etnaviv: increase number of supported varyings to 16
- etnaviv: implement emit_string_marker
- etnaviv: get rid of etna_spec in etna_context
- etnaviv: enable shareable shaders
- freedreno: calculate modified bit mask only once
- freedreno: simplify fd_set_shader_buffers(..)
- freedreno: ssbo: keep track if a buffer gets written
- freedreno: ssbo: mark resource read or written depending on usage
- etnaviv: get rid of SE_CLIP\_\*
- etnaviv: rework clippling calculation to be a derived state
- etnaviv: do the left shift by 16 at emit time
- etnaviv: get rid of struct compiled_scissor_state
- etnaviv: s/scissor_s/scissor
- etnaviv: compiled_framebuffer_state: get rid of SE_SCISSOR\_\*
- etnaviv: rename hw queries to acc queries
- etnaviv: rework etna_acc_sample_provider
- etnaviv: explicitly call resource_written(..)
- etnaviv: reset no_wait_cnt after triggered flush
- etnaviv: rework wait/flush logic
- etnaviv: extend acc query provider with supports(..) function
- etnaviv: make use of a fixed size array to track of all acc query
  provider
- etnaviv: extend result(..) to return if data is ready
- etnaviv: extend acc sample provide with an allocate(..)
- etnaviv: move generic perfmon functionality into own file
- etnaviv: convert perfmon queries to acc queries
- etnaviv: drop redundant calls to etna_acc_query_suspend(..)
- etnaviv: change begin_query(..) to a void function
- etnaviv: remove the "active" member of queries
- etnaviv: anisotropic filtering is supported starting with HALTI0
- etnaviv: update headers from rnndb
- etnaviv: add anisotropic filter support
- docs/features: mark GL_ARB_texture_filter_anisotropic as done for
  etnaviv
- etnaviv: drop default state for FE_HALTI5_ID_CONFIG
- etnaviv: call util_blitter_save_fragment_constant_buffer_slot(..)
- etnaviv: support for using generic blit path
- ci: bare-metal: power down device after tests
- etnaviv: fix SAMP_ANISOTROPY register value
- etnaviv: do not use int filter when anisotropic filtering is used

Christopher Egert (1):

- radv: use util_float_to_half_rtz

Christopher James Halse Rogers (1):

- egl/wayland: Fix zwp_linux_dmabuf usage

Connor Abbott (55):

- freedreno: Fix CP_COND_REG_EXEC bit positions
- freedreno: Add CP_REG_WRITE documentation
- freedreno: Fix CP_COND_EXEC
- tu: Move vsc_data and vsc_data2 allocation into the device
- tu: Don't emit initial render target state in tile_load_ib
- tu: Properly set UBWC flags in RB_RENDER_CNTL
- tu/blit: Support blits in secondary cmdstreams
- tu: Support multisample image clears
- tu: Disable linear depth attachments
- tu: Sysmem rendering
- tu: Add helper for CP_COND_REG_EXEC
- tu: Handle vkCmdClearAttachments() with sysmem
- tu: Support resolve ops with sysmem rendering
- tu: Support input attachments with sysmem
- tu: Force sysmem with mipmapped non-aligned linear stores
- tu: Rewrite border color handling
- lima/gpir: Make lima_gpir_node_insert_child() useful
- lima/gpir: Optimize conditional break/continue
- lima/gpir: Optimize nots created from branch lowering
- tu: Fix border color with compute shaders
- freedreno/fdl: Add base_align
- tu: Return the correct alignment for images
- freedreno: Cleanup event names
- freedreno: Rename RB_DONE_TS
- tu: Dump out shader assembly when requested
- tu: ir3: Emit push constants directly
- freedreno/a6xx: Add UBO size field
- freedreno/a6xx: Add registers for the bindless model
- ir3: Add bindless instruction encoding
- ir3: Plumb through support for a1.x
- ir3: Also don't propagate immediate offset with LDC
- ir3: LDC also has a destination
- ir3: Plumb through bindless support
- ir3: Rewrite UBO push analysis to support bindless
- tu: Switch to the bindless descriptor model
- tu: Emit CP_LOAD_STATE6 for descriptors
- tu: Add missing code for immutable samplers
- tu: Implement descriptor set update templates
- ir3: Fix txs with bindless
- ir3: Fix LDC offset units
- ir3: Handle load_ubo_ir3 when promoting to constants
- tu: Align GMEM resolve blit scissor
- tu: Use tu_cs_add_entries() with non-render-pass secondaries
- ir3/ra: Fix off-by-one issues with live-range extension
- freedreno/a6xx: Expand various varying-count bitfields
- tu: Fix the advertised maxFragmentInputComponents
- ir3: Don't double-insert the first block
- ir3: Fix bug with shaders that only exit via discard
- freedreno/a6xx: Document PrimID passthrough registers
- ir3: Skip missing VS outputs in VS out map when linking
- tu: Implement PrimID passthrough
- freedreno/a6xx: Implement PrimID passthrough
- st/nir: Fix assigning PointCoord location with !PIPE_CAP_TEXCOORD
- ir3: Remove VARYING_SLOT_PNTC remapping hack
- tu: Don't invert point coords

D Scott Phillips (6):

- intel/tools/aubinator_error_decode: read HW Context before other
  batches
- intel/tools/aubinator_error_decode: Decode ring buffers from HEAD to
  TAIL
- util/sparse_array: don't stomp head's counter on pop operations
- intel/fs: Update location of Render Target Array Index for gen12
- anv,iris: Fix input vertex max for tcs on gen12
- anv/gen11+: Disable object level preemption

Daniel Schürmann (73):

- aco: fix image_atomic_cmp_swap
- nir: gather info whether a shader uses demote_to_helper
- nir: add pass to lower discard() to demote()
- amd/llvm: implement nir_intrinsic_demote(_if) and
  nir_intrinsic_is_helper_invocation
- radeonsi: lower discard to demote when FS_CORRECT_DERIVS_AFTER_KILL
  is enabled
- radv: use nir_lower_discard_to_demote to work around game bugs
- amd: join emit_kill() from radv and radeonsi in ac_nir_to_llvm
- nir: fix unpack_64_4x16 in lower_alu_to_scalar()
- aco: add comparison operators for PhysReg
- aco: add sub-dword regclasses
- aco: refactor regClass setup for subdword VGPRs
- aco: validate p_create_vector with subdword elements properly
- aco: validate register alignment of subdword operands and definitions
- aco: validate uninitialized operands
- aco: validate RA of subdword assignments
- aco: print subdword registers
- aco: fix Temp and assignment of renamed operands during RA
- aco: remove unnecessary reg_file.fill() operation in
  get_reg_create_vector()
- aco: add notion of subdword registers to register allocator
- aco: create helper function to collect variables from register area
- aco: adapt register allocation for subdword registers
- aco: align subdword registers during RA when necessary
- aco: small refactoring of shuffle code lowering
- aco: add builder function for subdword copy()
- aco: lower subdword shuffles correctly.
- aco: don't propagate SGPRs into subdword PSEUDO instructions
- aco: don't assume split_vector(create_vector) has the same number of
  elements when optimizing
- aco: don't vectorize 8/16bit load/store_ssbo
- aco: add missing conversion operations for small bitsizes
- aco: add byte_align_scalar() & trim_subdword_vector() helper
  functions
- aco: prepare helper functions for subdword handling
- aco: implement vec2/3/4 with subdword operands
- aco: implement storagePushConstant8 & storagePushConstant16
- aco: implement 8bit/16bit load_buffer
- aco: implement 8bit/16bit store_ssbo
- aco: use MUBUF to load subdword SSBO
- aco: guarantee that Temp fits in 4 bytes
- aco: add explicit padding for all Instruction sub-structs
- aco: improve hashing for value numbering
- aco: improve register assignment when live-range splits are necessary
- aco: replace assignment hashmap by std::vector in register allocation
- aco: during RA only insert into renames table if a variable got
  renamed
- aco: improve speed of live_var_analysis
- aco: refactor try_remove_trivial_phi() in RA
- aco: change some std::map to std::unordered_map in
  register_allocation
- aco: change live_out variables to std::unordered_set
- aco: move all needed helper containers to ra_ctx
- aco: RA - move all std::function objects into proper functions
- aco: setup subdword regclasses for ssa_undef & load_const
- aco: ensure correct bit representation of subdword constants
- aco: don't constant-propagate into subdword PSEUDO instructions
- aco: lower subdword phis with SGPR operands
- aco: rename aco_lower_bool_phis() -> aco_lower_phis()
- aco: make some reg_file helpers private and fix their uses
- aco: fix p_extract_vector optimization in presence of unequally sized
  vector operands
- aco: use v_subrev_f32 for fsub with an sgpr operand in src1
- aco: fix 64bit fsub
- aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions
  during isel
- aco: simplify operand handling in RA
- aco: refactor get_reg() to take Temp instead of RegClass
- aco: refactor get_reg() to also handle affinities
- aco: create pseudo dummy instruction in RA to be used for live-range
  splits
- aco: create and use DefInfo struct in RA
- aco: use DefInfo in more places to simplify RA
- aco: move attempt to find strided register into get_reg_simple()
- aco: allocate full register for subdword definitions if HW doesn't
  support it
- aco: don't create vector affinities for operands which are not killed
  or are duplicates
- aco: refactor get_reg_simple() to return early on exact matches
- aco: stop get_reg_simple after reaching max_used_gpr
- aco: try to always find a register with stride for even sizes
- aco: use upper part of gap in register file if it is beneficial for
  striding
- aco: coalesce v_mad's accumulator with definition's affinities
- aco: either copy-propagate or inline create_vector operands

Daniel Stone (15):

- Revert "gitlab-ci: disable panfrost runners"
- egl/wayland: Don't invalidate buffers on no-op resize
- util/test: Use MAX_PATH on Windows
- CI: Add native Windows VS2019 build
- CI: Windows: Fix Docker tag argument inversion
- CI: Disable Panfrost Mali-T820 jobs
- CI: Avoid htz4 runner for VS2019
- meson: Add VS 4624 warning exclusion to remove piles of LLVM warnings
- CI: Re-enable Windows VS2019 builds
- EGL: Add eglSetDamageRegionKHR to GLVND dispatch list
- meson: Make shared-llvm into a tri-state boolean
- CI: Disable Windows/VS2019 builds
- Revert "CI: Disable Windows/VS2019 builds"
- ci/windows: Make Chocolatey installs more reliable
- CI: Disable Lima jobs due to lab unhealthiness

Danylo Piliaiev (29):

- i965: Do not set front_buffer_dirty if there is no front buffer
- st/mesa: Handle the rest renderbuffer formats from OSMesa
- osmesa/tests: Cover OSMESA_RGB GL_UNSIGNED_BYTE case
- st/nir: Unify inputs_read/outputs_written before serializing NIR
- brw_nir: Cast bitshift to unsigned
- brw_fs: Avoid zero size vla
- intel/compiler: Do not qsort zero sized array
- intel/bufmgr: Cast bitshift to unsigned
- glsl/blob: Do not call memcpy if there is nothing to copy
- iris: Do not dereference nullptr with pipe_reference
- i965: Do not generate D16 B5G6R5_UNORM configs on gen < 8
- intel/tools: Fix compilation with UBSan
- glsl: do not crash if string literal is used outside of
  #include/#line
- st/mesa: Fix signed integer overflow when using
  util_throttle_memory_usage
- intel/aub_viewer: Fix format specifier for uint64_t
- nir: Fix breakage of foreach_list_typed_safe assumptions in loop
  unrolling
- anv: Do not sample from 3d depth image with HiZ
- glsl/list: Fix undefined behaviour of foreach\_\* macros
- st/mesa: Update shader info of ffvp/ARB_vp after translation to NIR
- st/mesa: Re-assign vs in locations after updating nir info for
  ffvp/ARB_vp
- spirv: Expand workaround for OpControlBarrier on old GLSLang
- st/mesa: Treat vertex inputs absent in inputMapping as zero in
  mesa_to_tgsi
- iris/bufmgr: Check if iris_bo_gem_mmap failed
- i965: Fix out-of-bounds access to brw_stage_state::surf_offset
- anv: Translate relative timeout to absolute when calling
  anv_timelines_wait
- anv: Fix deadlock in anv_timelines_wait
- meson: Disable GCC's dead store elimination for memory zeroing custom
  new
- mesa: Fix double-lock of Shared->FrameBuffers and usage of wrong
  mutex
- intel/fs: Work around dual-source blending hangs in combination with
  SIMD16

Dave Airlie (69):

- llvmpipe/query: add support for indexed queries
1334 - gallivm/swr: add stream_id to geom epilogue emit
1335 - gallivm/nir: add support for multiple vertex streams
1336 - draw: change geom shader output to an array of outputs.
1337 - draw/gs: track emitted prims + verts per stream.
1338 - draw: emit multiple streams to streamout.
1339 - draw: don't emit vertex to streams with no outputs
1340 - llvmpipe: advertise 4 vertex streams
1341 - gallivm/s390: fix pass init order on s390 with llvm 8 (v2)
1342 - ci: bump debian image and change llvm deps to 8
1343 - dri: add another get shm variant.
1344 - glx/drisw: add getImageShm2 path
1345 - glx/drisw: return false if shmid == -1
1346 - glx/drisw: fix shm put image fallback
1347 - gallivm/tgsi: fix stream id regression
1348 - gallivm/nir: fix integer divide SIGFPE
1349 - gallivm/nir: handle mod 0 better.
1350 - gallium/auxiliary: add the microsoft tessellator and a pipe wrapper.
1351 - gallivm/nir: split out 64-bit splitting code
1352 - gallivm/nir: add support for tess system values
1353 - gallivm/nir: align store_var param order with load_var
1354 - gallivm/tgsi/swr: add mask vec to the tcs store
1355 - gallivm/nir: add tessellation i/o support.
1356 - draw: add JIT context/functions for tess stages.
1357 - draw: add main tessellation code
1358 - draw: hook up final bits of tessellation
1359 - gallium/nir/tgsi: only scan fragment shader inputs for usage_mask
1360 - llvmpipe: add support for tessellation shaders
1361 - gallivm/tessellator: use private functions for min/max to avoid
1362 namespace issues
1363 - gallium: fix build with latest meson and gcc10
1364 - gallivm/s3tc: split out dxt5 alpha code
1365 - gallivm: add support for rgtc/latc fetches.
1366 - gallium/llvmpipe: add an optimised 32-bit memset
1367 - gallivm/rgtc: fix the truncation to 8-bit
1368 - gallivm/rgtc: enable fast path for snorm types.
1369 - Revert "gallivm: disable rgtc/latc SNORM accellerated fetches"
1370 - llvmpipe: fixup context leaks.
1371 - draw: collect tessellation invocations statistics
1372 - llvmpipe: report tessellation shader statistics.
1373 - llvmpipe/query: fix transform feedback overflow any queries.
1374 - gallivm: fix left over shader vote debug
1375 - gallivm/nir: lower implicit lod to tex.
1376 - gallivm/draw: calloc prim id to avoid undef
1377 - llvmpipe: fix no tokens detections.
1378 - draw: fix tessellation stats query
1379 - llvmpipe/setup: move line stats collection earlier.
1380 - draw/cull: run pipeline for culled points.
1381 - draw: fix user culling pipeline order. (v2)
1382 - u_blitter: fix stencil blitting
1383 - draw: free the NIR IR.
1384 - draw/tess: free the NIR
1385 - llvmpipe/nir: free the nir shader
1386 - nir/linking: fix issue with two compact variables in a row. (v2)
1387 - gallivm/nir: fix image store conversions
1388 - gallivm/nir: add helper invocation support
1389 - util/indirect: handle stride less than number of parameters.
1390 - llvmpipe: bump max images to 16
1391 - llvmpipe: fix ssbo alignment
1392 - draw/tess: fix TES patch vertices in.
1393 - llvmpipe: fix d32 unorm depth conversions.
1394 - llvmpipe/setup: add point size clamping
1395 - llvmpipe: enable stencil only formats. (v2)
1396 - llvmpipe: clamp color storage for integer types.
1397 - gallivm: fix stencil border
1398 - vulkan: add initial device selection layer. (v6.1)
1399 - ci: add llvmpipe paths to virgl rules
1400 - draw/tess: free tessellation control shader i/o memory.
1401 - llvmpipe/nir: free compute shader NIR
1402 - llvmpipe: compute shaders work better with all the threads.
1403
1404 David Stevens (1):
1405
1406 - egl/android: set window usage flags
1407
1408 Denys (1):
1409
1410 - gitlab: add bug report template
1411
1412 Dominik Behr (1):
1413
1414 - meson: fix debug build on Android
1415
1416 Drew Davenport (1):
1417
1418 - radv: Filter extensions not whitelisted for Android
1419
1420 Duncan Hopkins (2):
1421
1422 - zink: Added storage CIS to descriptor pool. Added storage in
1423 descriptor pool for combined image samplers as well as uniform
1424 buffers. Stops some shaders from running through a pool's storage
1425 faster than zink's internal tracking.
1426 - zink: zero out zink_render_pass_state
1427
1428 Dylan Baker (48):
1429
1430 - docs/release-calendar: 20.0.0-rc1 has been released
1431 - docs: Mark 20.0-rc2 as done
1432 - docs: Add release notes for 19.3.4
1433 - docs: Add SHA256 sum for 19.3.4
1434 - docs: Mark 19.3.4 as done
1435 - docs: Mark 20.0.0-rc3 as done
1436 - Docs: Add 20.0.0 release notes
1437 - docs: Update index, relnotes, and release-calendar for 20.0
1438 - docs: Update stable process around using fixes: and gitlab
1439 - docs/submittingpatches: Fix confusing typo + missing pronoun
1440 - docs: Update release notes with current process
1441 - bin/post_version.py: Update the release calendar as well
1442 - bin/post_version.py: Pretty print the html
1443 - bin/post_version.py: Make the git commit as well.
1444 - docs: update releasing to cover updated post_version.py
1445 - docs: add relnotes for 20.0.1
1446 - docs: Add sha256sums for 20.0.1
1447 - docs: update news, calendar, and link release notes for 20.0.1
1448 - Docs: Add release notes for 20.0.2
1449 - docs/relnotes: Add sha256 sums for 20.0.2
1450 - docs: update calendar, add news item, and link releases notes for
1451 20.0.2
1452 - docs/release-calendar: Add calendar for 20.1 Release candidates
1453 - bin/gen_release_notes.py: Fix version detection for .0 release
1454 - bin/pick-ui: Add a new maintainer script for picking patches
1455 - replace \_mesa_is_pow_two with util_is_power_of_two\_\*
1456 - replace \_mesa_next_pow_two\_\* with util_next_power_of_two\_\*
1457 - replace \_mesa_logbase2 with util_logbase2
1458 - replace LOG2 with util_fast_log2
1459 - u_math: add x86 optimized version of ifloor
1460 - replace IFLOOR with util_ifloor
1461 - Replace IROUND_POS with \_mesa_roundevenf
1462 - mesa/main: remove unused IROUNDD
1463 - replace IROUND with util functions
1464 - move windows strtok_r define to u_string
1465 - Replace IS_INF_OR_NAN with util_is_inf_or_nan
1466 - replace malloc macros in imports.h with u_memory.h versions
1467 - util: Add an aligned realloc function
1468 - replace imports memory functions with utils memory functions
1469 - mesa|mapi: replace \_mesa_[v]snprintf with [v]snprintf
1470 - mesa: move ADD_POINTERS to macros.h
1471 - dri/nouveau: replace assert with unreachable
1472 - remove final imports.h and imports.c bits
1473 - meson: update llvm dependency logic for meson 0.54.0
1474 - docs: Add relnotes for 20.0.5
1475 - docs: Add sha256 sums for 20.0.5
1476 - docs: update calendar, add news item, and link releases notes for
1477 20.0.5
1478 - mesa: Follow OpenGL conversion rules for values that exceed storage
1479 size
1480 - tests: Make tests aware of meson test wrapper
1481
1482 Edmondo Tommasina (1):
1483
1484 - radv/sqtt: fix RADV_THREAD_TRACE_BUFFER_SIZE spelling
1485
1486 Eduardo Lima Mitev (3):
1487
1488 - turnip/pipeline: Don't assume tu_shader is a valid object
1489 - turnip: Instance can be NULL resolving 'GetInstanceProcAddr' entry
1490 point
1491 - anv/radv: Resolving 'GetInstanceProcAddr' should not require a valid
1492 instance
1493
1494 Eli Schwartz (1):
1495
1496 - docs: fix typo in v20 release notes
1497
1498 Elie Tournier (3):
1499
1500 - spirv2nir: print nir shader if translation succeeds
1501 - spirv2nir: Add kernel spirv support
1502 - docs/features: Update virgl OpenGL 4.5 features GL_ARB_clip_control
1503 and GL_KHR_robustness are now exposed in the guest.
1504
1505 Emil Velikov (11):
1506
1507 - meson: glx: drop with_glx == dri check
1508 - glx: set the loader_logger early and for everyone
1509 - egl/drm: reinstate (kms\_)swrast support
1510 - Revert "egl/dri2: Don't dlclose() the driver on
1511 dri2_load_driver_common failure"
1512 - loader: use a maximum of 64 drmDevices
1513 - loader: simplify loader_get_user_preferred_fd()
1514 - loader: simplify codeflow in drm_get_pci_id_for_fd
1515 - loader: move "using driver..." message to
1516 loader_get_kernel_driver_name
1517 - loader: fallback to kernel name, if PCI fails
1518 - glx: omit loader_loader() for macOS
1519 - egl: simplify client/platform extension handling
1520
1521 Emmanuel Gil Peyrot (1):
1522
1523 - Expose EGL_KHR_platform\_\* when EXT is supported
1524
1525 Eric Anholt (144):
1526
1527 - gallium/osmesa: Fix a typo in the unit test's test names.
1528 - gallium/osmesa: Fix MakeCurrent of non-8888 contexts.
1529 - gallium/osmesa: Fill out other format tests.
1530 - gallium/osmesa: Try to fix the test for big-endian.
1531 - util: Make helper functions for pack/unpacking pixel rows.
1532 - mesa/st: Use direct util_format_pack/unpack instead of u_tile.
1533 - gallium/util: Remove pipe_get_tile_z/put_tile_z.
1534 - softpipe: Drop the raw_to\* part of the tile cache interface.
1535 - softpipe: Refactor pipe_get/put_tile_rgba\_\* paths.
1536 - gallium: Add and use a helper for packing uc from a color_union.
1537 - gallium: Refactor some single-pixel util_format_read/writes.
1538 - util: Drop unpacking from int signed to unsigned and vice versa.
1539 - freedreno: Move the layout debug under FD_MESA_DEBUG=layout.
1540 - freedreno: Include the layer size in layout debug.
1541 - freedreno: Rename the UBWC layer size field and store it as bytes.
1542 - freedreno/a6xx: Disable the core layer-size setup.
1543 - freedreno: Swap the whole resource layout in shadowing.
1544 - freedreno: Blit all array levels when uncompressing UBWC.
1545 - freedreno: Disable UBWC on Z24S8 if not TEXTURE_2D.
1546 - freedreno: Allow UBWC on textures with multiple mipmap levels.
1547 - mesa: Clean up some endianness adapters for shader image formats.
1548 - intel/isl: Move iris's pipe-to-isl format function to isl.
1549 - glsl,nir: Switch the enum representing shader image formats to
1550 PIPE_FORMAT.
1551 - mesa/st: Move the SYSTEM_VALUE -> TGSI_SEMANTIC map to
1552 tgsi_from_mesa.
1553 - nouveau: Reuse tgsi_get_sysval_semantic().
1554 - nouveau: reuse tgsi_get_gl_frag_result_semantic().
1555 - nouveau: Reuse tgsi_get_gl_varying_semantic().
1556 - u_tile: Skip the packed temporary and just store tiles directly.
1557 - ci: Disable a bunch of tests on freedreno a630.
1558 - ci: Bump the GLES CTS version to 3.2.6.1.
1559 - Revert "gallium: Fix big-endian addressing of non-bitmask array
1560 formats."
1561 - ci: Extend the a630 flake list to reduce spurious failures.
1562 - radv: Squelch possibly-undefined warning
1563 - llvmpipe: Fix real uninitialized use of "atype" for SEMANTIC_FACE
1564 - llvmpipe: Silence "possibly uninitialized value" warning for
1565 ssbo_limit.
1566 - llvmpipe: Silence uninitialized variable warning about "chan"
1567 - llvmpipe: Fix warning about uninitialized "op" in the NIR path.
1568 - llvmpipe: Silence uninitialized variable warning about "vals"
1569 - llvmpipe: Silence uninitialized variable warning about "scissor"
1570 - llvmpipe: Fix another uninitialized value warning, on init_val.
1571 - gallium: Only define PIPE_ALIGNSTACK on x86.
1572 - ci: prepare-artifacts: Make the indent here match previously in the
1573 file
1574 - ci: Make sure that we have a proper shell prompt for LAVA.
1575 - ci: Make LAVA job fails emit the full list of unexpected test
1576 results.
1577 - ci: Document how LAVA runners work.
1578 - ci: Don't bother generating deqp junit results since we don't present
1579 it.
1580 - ci: Remove a useless filtering of the lava logs.
1581 - nir: Rename gl_nir_lower_bindless_images.c in preparation for
1582 extending it.
1583 - nir: Make image lowering optionally handle the !bindless case as
1584 well.
1585 - gallium: Add a cap for enabling lowering of image load/store
1586 intrinsics.
1587 - v3d: Ask the state tracker to lower image accesses off of derefs.
1588 - glsl: Factor out the sampler dim coordinate components switch
1589 statement.
1590 - spirv_to_nir: Reuse glsl_sampler_dim_coordinate_components().
1591 - freedreno/ir3: Reuse glsl_get_sampler_dim_coordinate_components() in
1592 tex_info.
1593 - tgsi_to_nir: Reuse glsl_get_sampler_dim_coordinate_components().
1594 - prog_to_nir: Reuse glsl_get_sampler_dim_coordinate_components().
1595 - freedreno/ir3: Fix the arg to
1596 ir3_get_num_components_for_image_format()
1597 - nir: Move intel's intrinsic_image_coordinate_components() to core
1598 nir.
1599 - freedreno: Switch to using lowered image intrinsics.
1600 - ci: Blacklist another freedreno flaky test.
1601 - meson: Disable bison's -Wdeprecated since we still support old bison.
1602 - turnip: Fix compiler warning about casting a nondispatchable handle.
1603 - freedreno/computerator: Fix defined-but-not-used warnings from
1604 lex/yacc.
1605 - ci: Remove LLVM from ARM test drivers.
1606 - ci: Stop disabling ACPI in the LAVA arm64 kernel build.
1607 - ci: Shrink the arm64 kernel build a bit.
1608 - ci: Include db410c support in the ARM container.
1609 - aco: Fix signed-vs-unsigned warning.
1610 - ci: Enable -Werror on meson-vulkan and meson-testing.
1611 - ci: Switch testing on db410c over to LAVA.
1612 - ci: Add a disabled-by-default job for GLES3 testing on db410c.
1613 - ci: Flip db410c back to docker mode.
1614 - ci: Print the renderer/version that our dEQP invocation is using.
1615 - ci: Fix installation of firmware for db410c's nic.
1616 - ci: Make a simple little bare-metal fastboot mode for db410c.
1617 - glsl/tests: Catch mkdir errors to help explain when they happen.
1618 - glsl/tests: Fix waiting for disk_cache_put() to finish.
1619 - ci: Update the ci-templates commit.
1620 - ci: Enable ccache in the container builds.
1621 - ci: Enable ccaching of CMake builds as well.
1622 - ci: Enable testing GLES2-3 on a530 (Dragonboard 820c).
1623 - freedreno/a5xx: Fix min-vs-mag filtering decisions on non-mipmap tex.
1624 - gallium/util: Switch util_float_to_half to \_mesa_float_to_half()'s
1625 impl.
1626 - ci: Ban the recent popular freedreno a630 flakes.
1627 - ci: Disable tests that showed intermittent fails on a530 in day 1.
1628 - ci: Only run the freedreno baremetal tests when freedreno/core
1629 changes.
1630 - freedreno: Switch to exposing only half-integer pixel centers.
1631 - ci: Move db820c and db410c's gles3 tests to manual, like radv did.
1632 - glsl: Restore the IsES flag on the shader when reading from cache.
1633 - ci: Ban the recent popular freedreno a630 intermittent failure.
1634 - freedreno: Remove always-true return from per-gen begin_query.
1635 - freedreno: Remove the "active" member of queries.
1636 - freedreno: Fix acc query handling in the presence of batch
1637 reordering.
1638 - freedreno: Associate the acc query bo with the batch.
1639 - freedreno: Count blits in GL_TIME_ELAPSED and perf counter queries.
1640 - freedreno/a6xx: Fix timestamp queries.
1641 - freedreno: Rename "is_blit" to "is_discard_blit"
1642 - freedreno: Fix detection of being in a blit for acc queries.
1643 - freedreno: Work around UBWC flakiness.
1644 - freedreno: Drop an unnecessary include marked "this should go away"
1645 - freedreno/turnip: Use the NIR info to decide if we need helper
1646 invocations.
1647 - loader: Warn when we fail to open a device node due to permissions.
1648 - ci: Consistently use -j4 across x86 build jobs and -j8 on ARM.
1649 - freedreno/a6xx: Sink the per-level size temps inside the loop.
1650 - freedreno/a6xx: Remove the "aligned_height" temporary.
1651 - freedreno/a6xx: Drop the "alignment" layout temporary.
1652 - freedreno: Add the outline of a test for a6xx texture layout.
1653 - freedreno/a6xx: Set a level's pitch based on minified level0 pitch,
1654 not width0.
1655 - freedreno: Fix leak of binning shader variants.
1656 - freedreno/ir3: Stop doing b2n on the SEL condition.
1657 - freedreno/ir3: CSE the up/downconversion of SEL's cond's size.
1658 - freedreno/a5xx+: Skip compiling the old gmem blit programs.
1659 - freedreno/drm-shim: Add support for faking other adreno chips.
1660 - freedreno/ir3: Drop handling FRAG_RESULT_DEPTH writing to .z
1661 - freedreno: Introduce a "cpp_shift" value for cpp divs/muls.
1662 - freedreno: Make the slice pitch be bytes, not pixels.
1663 - drm-shim: Let the driver choose to overwrite the first render node.
1664 - nir/lower_two_sided_color: Fix picking of new driver location.
1665 - nir/lower_clip: Fix picking of unused driver locations.
1666 - gallium: Fix setup of pstipple frag coord var.
1667 - freedreno/ir3: Fix driver_location of the added vertex_flags varying.
1668 - freedreno/ir3: Fix sizing of the inputs/outputs array.
1669 - vc4: Use NIR shader's num_outputs for generating our new output.
1670 - ci: Drop redundant freedreno stage specification.
1671 - ci: Enable GLES3 testing on db410c/db820c (freedreno a306 and a530).
1672 - freedreno: Fix derivatives without texturing on a3xx-a5xx.
1673 - ci: Enable GLES 3.1 testing on db820c (a530).
1674 - freedreno/ir3: Fix the disasm of half-float STG dests.
1675 - freedreno/ir3: Print a space after nop counts, like qcom's disasm.
1676 - freedreno/ir3: Add a unit test for our disassembler.
1677 - freedreno/ir3: Convert remaining disasm src prints to reginfo.
1678 - freedreno/ir3: Refactor out print_reg_src().
1679 - freedreno/ir3: Add support for disasm of cat2 float32 immediates.
1680 - ci: Enable --compact-display false on all dEQP runs.
1681 - ci: Add sanity checking that dEQP gets the expected GL_RENDERER.
1682 - freedreno: Fix calculation of the const buffer cmdstream size.
1683 - ci: Allow namespacing of dEQP run results files.
1684 - ci: Clean up some excessive use of pipes in dEQP results processing.
1685 - ci/freedreno: Add a test run of a few driver options.
1686 - util/ra: Sanity check that the driver selected a valid reg.
1687 - util/ra: Sanity check that we're adding a valid reg to a class.
1688 - util/ra: Use util_dynarray for the adjacency list.
1689 - util/ra: Use util_dynarray for handling the conflict lists.
1690 - util/ra: Improve ra_set_finalize() performance.
1691
1692 Eric Engestrom (58):
1693
1694 - VERSION: bump after 20.0 branch point
1695 - egl: put full path to libEGL_mesa.so in GLVND json
1696 - gitlab-ci: disable a630 tests as mesa-cheza is down
1697 - util/os_socket: fix header unavailable on windows
1698 - freedreno/perfcntrs: fix fd leak
1699 - dri: delete gen-symbol-redefs.py
1700 - util/disk_cache: check for write() failure in the zstd path
1701 - meson: don't bother trying \`python2\`
1702 - Revert "egl: put full path to libEGL_mesa.so in GLVND json"
1703 - egl: directly access static members instead of using
1704 \_egl{Get,Set}ConfigKey()
1705 - meson: explicitly disallow unsupported build directory layout
1706 - docs: fix typos in the release docs
1707 - bin/gen_release_notes.py: fix commit list command
1708 - gen_release_notes: fix vulkan version reported
1709 - docs/relnotes/19.3: fix vulkan version reported
1710 - docs/relnotes/20.0: fix vulkan version reported
1711 - Revert "docs/relnotes/19.3: fix vulkan version reported"
1712 - docs: trivial fix for html structure
1713 - docs/releasing: add missing </li> tags
1714 - docs: add release notes for 19.3.5
1715 - docs: update calendar, add news item, and link releases notes for
1716 19.3.5
1717 - vulkan/wsi: fix cleanup when dup() fails
1718 - gen_release_notes: fix version in "you should wait" message
1719 - gen_release_notes: resolve ambiguity by renaming \`version\` to
1720 \`previous_version\` and \`next_version\` to \`this_version\`
1721 - meson: use existing variables in inc_common
1722 - meson: inline \`inc_common\`
1723 - vulkan: drop unused include directories
1724 - intel: drop unused include directories
1725 - scons: prune unused Makefile.sources
1726 - docs: add release notes for 20.0.3
1727 - docs/relnotes: add sha256sum for 20.0.3
1728 - docs: update calendar, add news item, and link releases notes for
1729 20.0.3
1730 - docs: add release notes for 20.0.4
1731 - docs/relnotes: add sha256sum for 20.0.4
1732 - docs: update calendar, add news item, and link releases notes for
1733 20.0.4
1734 - glx: fix 630 times -Wlto-type-mismatch when building with LTO enabled
1735 - glx: use anonymous namespace to avoid -Wodr issues when building with
1736 LTO enabled
1737 - pick-ui: auto-scroll the feedback window
1738 - pick-ui: compute .pick_status.json path only once
1739 - pick-ui: make .pick_status.json path relative to the git root instead
1740 of the script
1741 - pick-ui: show commit sha in the pick list
1742 - VERSION: bump to 20.1.0-rc1
1743 - .pick_status.json: Update to af55bdd05d94eda59ee1c9331a50045000da5db5
1744 - .pick_status.json: Update to 57796946985de60204189426ca8eb7bbfa97c396
1745 - .pick_status.json: Mark 3fac55ce0d066d767d6c6c8308f79d0c3e566ec0 as
1746 denominated
1747 - .pick_status.json: Update to 29da52128090a1ef8ef782188c0f67c7f5ec8d19
1748 - VERSION: bump to 20.1.0-rc2
1749 - .pick_status.json: Update to 772b15ad3227e08bb4e18932ac9ecf4c29271160
1750 - .pick_status.json: Update to 56f955e4850035d915a2a87e2ebea7fa66ab5e19
1751 - .pick_status.json: Update to c1c0cf7a66905e8d7ad506842a41b0ad0c5b10da
1752 - VERSION: bump to 20.1.0-rc3
1753 - .pick_status.json: Update to 5a6beb6a24aa084adfd6c57edd0a64f0a044611a
1754 - post_version.py: fix branch name construction for release candidates
1755 - post_version.py: invert \`is_point\` into \`is_first_release\` to
1756 make its purpose clearer
1757 - post_version.py: stop adding release candidates to the index and
1758 relnotes
1759 - VERSION: bump to 20.1.0-rc4
1760 - .pick_status.json: Update to a91306677c613ba7511b764b3decc9db42b24de1
1761 - tree-wide: fix deprecated GitLab URLs
1762
1763 Erik Faye-Lund (154):
1764
1765 - zink: enable texture-buffer objects
1766 - zink: implement load_instance_id
1767 - zink: implement support for derivative-control
1768 - zink: be more careful about the mask-check
1769 - zink: disallow depth-stencil blits with format-change
1770 - st/mesa: use uint-result for sampling stencil buffers
1771 - zink: lower away fdph
1772 - zink: fixup sampler-usage
1773 - zink: replace unset buffer with a dummy-buffer
1774 - zink: emit blend-target index
1775 - zink: only inspect dual-src limit if feature enabled
1776 - Revert "nir: Add a couple trivial abs optimizations"
1777 - zink: do not use SpvDimRect
1778 - zink: fix binding-usage
1779 - zink: do not report texture-samplers for unsupported stages
1780 - zink/spirv: do not reinvent store_dest
1781 - zink/spirv: prefer store_dest over store_dest_uint
1782 - zink/spirv: rename functions a bit
1783 - zink/spirv: unit_value -> raw_value
1784 - zink/spirv: uint -> raw
1785 - zink: do not convert bools to/from uint
1786 - util: promote u_debug_memory.c to src/util
1787 - util: move debug_memory_{begin,end} to os_memory_debug.h
1788 - gallium/util: do not use debug_print_format
1789 - gallium/util: remove unused debug_print_foo helpers
1790 - zink/spirv: do not use bitwise operations on booleans
1791 - pipebuffer: clean up cast-warnings
1792 - rbug: clean up cast-warnings
1793 - rbug: do not return void-value
1794 - vtn/opencl: fully enable OpenCLstd_Clz
1795 - compiler/nir: move build_exp helper into builtin-builder
1796 - compiler/nir: move build_log helper into builtin-builder
1797 - vtn/opencl: add native exp/log-support
1798 - vtn/opencl: add native exp10/log10-support
1799 - vtn/opencl: add native exp2/log2-support
1800 - nv50: remove unused variable
1801 - meson: disable some more warnings on msvc
1802 - mesa/main: correct extension-checks for GL_BLACKHOLE_RENDER_INTEL
1803 - mesa/main: clean-up extension-checks for point-sprites
1804 - mesa/main: clean up extension-check for GL_VERTEX_PROGRAM
1805 - mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_TWO_SIDE
1806 - mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_POINT_SIZE
1807 - mesa/main: clean up extension-check for GL_TEXTURE_RECTANGLE
1808 - mesa/main: clean up extension-check for GL_STENCIL_TEST_TWO_SIDE
1809 - mesa/main: clean up extension-check for GL_DEPTH_BOUNDS_TEST
1810 - mesa/main: clean up extension-check for AMD_depth_clamp_separate
1811 - mesa/main: clean up extension-check for GL_FRAGMENT_SHADER_ATI
1812 - mesa/main: clean up extension-check for GL_TEXTURE_CUBE_MAP_SEAMLESS
1813 - mesa/main: clean up extension-check for GL_RASTERIZER_DISCARD
1814 - mesa/main: clean up extension-check for GL_TEXTURE_EXTERNAL
1815 - mesa/main: remove unused macro
1816 - wgl: drop pointless debug_printf
1817 - wgl: drop unused member
1818 - wgl: move screen-init to a helper
1819 - wgl: do not create screen from DllMain
1820 - st/dri: make sure software color-buffers are linear
1821 - zink: be less picky about tiled resources
1822 - .mailmap: add an alias for Alan Swanson
1823 - .mailmap: add an alias for Alyssa Rosenzweig
1824 - .mailmap: add an alias for Andrii Simiklit
1825 - .mailmap: add an alias for Anuj Phogat
1826 - .mailmap: add an alias for Axel Davy
1827 - .mailmap: add an alias for Boris Brezillon
1828 - .mailmap: add an alias for Bruce Cherniak
1829 - .mailmap: update aliases for Carl-Philip Hänsch
1830 - .mailmap: add an alias for Chad Versace
1831 - .mailmap: add a couple of aliases for Chandu Babu Namburu
1832 - .mailmap: add alias for Chenglei Ren
1833 - .mailmap: add an alias for Christian Gmeiner
1834 - .mailmap: add an alias for Christian Inci
1835 - .mailmap: add a few aliases for Christoph Haag
1836 - .mailmap: add an alias for Colin McDonald
1837 - .mailmap: specify spelling for Constantine Kharlamov
1838 - .mailmap: add an alias for Craig Stout
1839 - .mailmap: add an alias for Daniel Schürmann
1840 - .mailmap: add an alias for Danylo Piliaiev
1841 - .mailmap: add an alias for Dave Airlie
1842 - .mailmap: add an alias for Dylan Baker
1843 - .mailmap: add a couple of aliases for Dylan Noblesmith
1844 - .mailmap: add an alias for Emmanuel Gil Peyrot
1845 - .mailmap: add an alias for Erik Faye-Lund
1846 - .mailmap: specify spelling for Francesco Ansanelli
1847 - .mailmap: specify spelling for Gurchetan Singh
1848 - .mailmap: add an alias for Haihao Xiang
1849 - .mailmap: add an alias for Harish Krupo
1850 - .mailmap: specify spelling for Heinrich Fink
1851 - .mailmap: specify spelling for Henri Verbeet
1852 - .mailmap: add an alias for Igor Gnatenko
1853 - .mailmap: add an alias for Illia Iorin
1854 - .mailmap: specify spelling for James Zhu
1855 - .mailmap: add an alias for Jan Beich
1856 - .mailmap: clean up aliases for Jeremy Huddleston
1857 - .mailmap: add an alias for Julien Isorce
1858 - .mailmap: add a few aliases for Karol Herbst
1859 - .mailmap: add a few aliases for Kevin Rogovin
1860 - .mailmap: add a few aliases for Kristian Høgsberg
1861 - .mailmap: add an alias for Lionel Landwerlin
1862 - .mailmap: specify spelling for Liviu Prodea
1863 - .mailmap: update aliases for Marc-André Lureau
1864 - .mailmap: add alias for Matthias Groß
1865 - .mailmap: add an alias for Neha Bhende
1866 - .mailmap: add an alias for Neil Roberts
1867 - .mailmap: specify spelling for Nian Wu
1868 - .mailmap: add an alias for Nicholas Bishop
1869 - .mailmap: update aliases for Nicolai Hähnle
1870 - .mailmap: add an alias for Philipp Zabel
1871 - .mailmap: update aliases for Pierre-Eric Pelloux-Prayer
1872 - .mailmap: add an alias for Plamena Manolova
1873 - .mailmap: add an alias for Qiang Yu
1874 - .mailmap: specify spelling for Randy Xu
1875 - .mailmap: add an alias for Renato Caldas
1876 - .mailmap: add an alias for Rob Clark
1877 - .mailmap: add an alias for Rodrigo Vivi
1878 - .mailmap: add an alias for Samuel Li
1879 - .mailmap: add an alias for Sergii Romantsov
1880 - .mailmap: specify spelling for Sonny Jiang
1881 - .mailmap: add a couple of aliases for Steinar H. Gunderson
1882 - .mailmap: add a couple of aliases for Suresh Guttula
1883 - .mailmap: add an alias for Thierry Reding
1884 - .mailmap: add an alias for Timo Aaltonen
1885 - .mailmap: add a couple of aliases for Timothy Arceri
1886 - .mailmap: add an alias for Tim Wiederhake
1887 - .mailmap: add an alias for Tom Stellard
1888 - .mailmap: add an alias for Tomasz Figa
1889 - .mailmap: add an alias for Topi Pohjolainen
1890 - .mailmap: add an alias for Vadym Shovkoplias
1891 - .mailmap: add an alias for Varad Gautam
1892 - .mailmap: specify spelling for Vivek Kasireddy
1893 - .mailmap: specify spelling for Wladimir J. van der Laan
1894 - .mailmap: add an alias for Xavier Bouchoux
1895 - .mailmap: add an alias for Yaakov Selkowitz
1896 - .mailmap: add alias for Zhaowei Yuan
1897 - .mailmap: add an alias for Zhongmin Wu
1898 - meson: use override_options to change warning-level
1899 - wgl: silence some cast-warnings
1900 - util/tests: initialize variable
1901 - mesa: fixup cast expression
1902 - vbo: avoid including wingdi.h on win32
1903 - meson: tell flex that we support c99
1904 - gtest: Update to 1.10.0
1905 - meson: do not disable incremental linking for debug-builds
1906 - docs: remove outdated sentence
1907 - mesa/gallium: do not use enum for bit-allocated member
1908 - meson: correct windows-version define
1909 - mesa/main: do not store unrecognized extensions in context
1910 - mesa/main: do not pass context to one-time extension init
1911 - mesa/main: do not init remap-table per api
1912 - mesa/main: Do not pass context to one_time_init
1913 - mesa/main: one_time_init() -> \_mesa_initialize()
1914 - mesa/st: call \_mesa_initialize() early
1915 - zink: lower b2b to b2i
1916 - util/os_memory: never use os_memory_debug.h
1917 - zink: implement i2b1
1918 - zink: use general-layout when blitting to/from same resource
1919
1920 Francisco Jerez (57):
1921
1922 - intel/fs/cse: Make HALT instruction act as CSE barrier.
1923 - intel/fs/gen7: Fix fs_inst::flags_written() for
1924 SHADER_OPCODE_FIND_LIVE_CHANNEL.
1925 - intel/fs: Add virtual instruction to load mask of live channels into
1926 flag register.
1927 - intel/fs/gen12: Workaround unwanted SEND execution due to broken
1928 NoMask control flow.
1929 - intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch
1930 writes.
1931 - intel/fs/gen12: Workaround data coherency issues due to broken NoMask
1932 control flow.
1933 - intel/fs: Set src0 alpha present bit in header when provided in
1934 message payload.
1935 - intel/fs/gen11: Work around dual-source blending hangs in combination
1936 with SIMD32.
1937 - intel/fs: Make sample_mask_reg() local to brw_fs.cpp and use it in
1938 more places.
1939 - intel/fs: Use helper for discard sample mask flag subregister number.
1940 - intel/fs/gen7+: Swap sample mask flag register and FIND_LIVE_CHANNEL
1941 temporary.
1942 - intel/fs: Refactor predication on sample mask into helper function.
1943 - intel/fs: Return consistent UW types from sample_mask_reg() in
1944 fragment shaders.
1945 - intel/fs/gen7+: Implement discard/demote for SIMD32 programs.
1946 - intel/compiler: Move base IR definitions into a separate header file
1947 - intel/compiler: Reverse inclusion dependency between brw_cfg.h and
1948 brw_shader.h
1949 - intel/compiler: Nest definition of live variables block_data
1950 structures
1951 - intel/compiler: Reverse inclusion dependency between
1952 brw_fs_live_variables.h and brw_fs.h
1953 - intel/compiler: Reverse inclusion dependency between
1954 brw_vec4_live_variables.h and brw_vec4.h
1955 - intel/compiler: Introduce simple IR analysis pass framework
1956 - intel/compiler: Introduce backend_shader method to propagate IR
1957 changes to analysis passes
1958 - intel/compiler: Define more detailed analysis dependency classes
1959 - intel/compiler: Pass detailed dependency classes to
1960 invalidate_analysis()
1961 - intel/compiler: Mark virtual_grf_interferes and vars_interfere as
1962 const
1963 - intel/compiler: Move all live interval analysis results into
1964 fs_live_variables
1965 - intel/compiler: Move all live interval analysis results into
1966 vec4_live_variables
1967 - intel/compiler: Restructure live intervals computation code
1968 - intel/compiler: Pass single backend_shader argument to the
1969 fs_live_variables constructor
1970 - intel/compiler: Pass single backend_shader argument to the
1971 vec4_live_variables constructor
1972 - intel/compiler/fs: Add live interval validation pass
1973 - intel/compiler/vec4: Add live interval validation pass
1974 - intel/compiler/fs: Switch liveness analysis to IR analysis framework
1975 - intel/compiler/vec4: Switch liveness analysis to IR analysis
1976 framework
1977 - intel/compiler: Drop invalidate_live_intervals()
1978 - intel/compiler: Move idom tree calculation and related logic into
1979 analysis object
1980 - intel/compiler: Move dominance tree data structure into idom_tree
1981 object
1982 - intel/compiler: Simplify new_idom reduction in dominance tree
1983 calculation
1984 - intel/compiler: Move register pressure calculation into IR analysis
1985 object
1986 - intel/compiler: Calculate num_instructions in O(1) during register
1987 pressure calculation
1988 - intel/fs: Fix workaround for VxH indirect addressing bug under
1989 control flow.
1990 - intel/fs/gen12: Fix interaction of SWSB dependency combination with
1991 EU fusion workaround.
1992 - intel/fs/gen12: Fix hangs with per-sample SIMD32 fragment shader
1993 dispatch.
1994 - intel/fs/gen12: Work around dual-source blending hangs in combination
1995 with SIMD32.
1996 - intel/fs/gen12: Fix Render Target Read header setup for new thread
1997 payload layout.
1998 - intel/ir: Add missing initialization of backend_reg::offset during
1999 construction.
2000 - intel/fs: Rename half() helpers to quarter(), allow index up to 3.
2001 - intel/fs: Fix constness of argument of
2002 fs_instruction_scheduler::is_compressed().
2003 - intel/fs: Replace fs_visitor::bank_conflict_cycles() with stand-alone
2004 function.
2005 - intel/vec4: Fix constness of vec4_instruction::reads_flag() and
2006 ::writes_flag().
2007 - intel/ir: Import shader performance analysis pass.
2008 - intel/fs: Heap-allocate fs_visitors in brw_compile_fs().
2009 - intel/fs: Implement performance analysis-based SIMD32 heuristic for
2010 fragment shaders.
2011 - intel/fs: Add INTEL_DEBUG=no32 debugging flag.
2012 - intel/ir: Use brw::performance object instead of CFG cycle counts for
2013 codegen stats.
2014 - intel/ir: Pass block cycle count information explicitly to
2015 disassembler.
2016 - intel/ir: Remove scheduling-based cycle count estimates.
2017 - intel/ir: Update performance analysis parameters for memory fence
2018 codegen changes.
2019
2020 Fritz Koenig (3):
2021
2022 - Revert "gitlab-ci: disable a630 tests as mesa-cheza is down"
2023 - Revert "gitlab-ci: disable a630 tests as mesa-cheza is down (again)"
2024 - freedreno: allow FMT6_8_UNORM as a UBWC format
2025
2026 Georg Lehmann (3):
2027
2028 - Correctly wait in the fragment stage until all semaphores are
2029 signaled
2030 - Vulkan Overlay: Don't try to change the image layout to present twice
2031 - Vulkan overlay: use the corresponding image index for each swapchain
2032
2033 Gert Wollny (63):
2034
2035 - r600: force new CF with TEX only if any texture value is written
2036 - r600: Increase space for IO values to agree with
2037 PIPE_MAX_SHADER_IN/OUTPUTS
2038 - r600: Add NIR compiler options
2039 - r600: Update state code to accept NIR shaders
2040 - r600/sfn: Add a basic nir shader backend
2041 - r600: enable NIR backend DEBUG flag for supported architectures
2042 - r600/sfn: Add the VS in and FS out vectorization
2043 - r600/sfn: Add the WaitAck instruction
2044 - r600/sfn: add live range evaluation for the GPR
2045 - r600/sfn: add register remapping
2046 - r600/sfn: Add lowering arrays to scratch and according instructions
2047 - r600/sfn: Add a load GDS result instruction
2048 - r600/sfn: Add MemRingOut instructions
2049 - r600/sfn: add emitVertex instructions
2050 - r600/sfn: Add support for geometry shader
2051 - r600/sfn: Add VS for TCS shader skeleton
2052 - r600/sfn: Add compute shader skeleton
2053 - r600/sfn: Add GDS instructions
2054 - r600/sfn: Add lowering UBO access to r600 specific codes
2055 - r600: Make sure LLVM is not used for DRAW
2056 - r600/sfn: Add support for atomic instructions
2057 - r600/sfn: Add support for SSBO load and store
2058 - r600/sfn: Add .editorconfig file
2059 - r600/sfn: Add some documentation
2060 - r600/sfn: Avoid using dynamic_cast to identify type
2061 - r600/sfn: Use static_cast when type is already known
2062 - r600/sfn: Don't try to catch exceptions, the driver doesn't throw any
2063 - gallium/tgsi_to_nir: Set nir_intrinsic_align_mul to 16 and offset to
2064 0
2065 - r600: Dump a few more variables when requested
2066 - r600/sfn: Reduce array limit for scratch usage
2067 - r600/sfn: Fix setting alignments when lowering UBOs
2068 - r600/sfn: Implementing instructions blocks
2069 - r600/nir: Pin interpolation results to channel
2070 - r600/sfn: Fix null pointer deref in live range evalation
2071 - r600/sfn: Handle b2b1 like it was a mov
2072 - r600/sfn: Fix handling of GS inputs
2073 - r600/sfn: Fix using the result of a fetch instruction in next fetch
2074 - r600/sfn: Count only literals that are not inline to split
2075 instruction groups
2076 - r600/sfn: use new temp register allocation when loading single value
2077 temporaries
2078 - nir: Add r600 specific intrinsics for tesselation shader IO
2079 - nir: Add umad24 and umul24 opcodes
2080 - r600: Handle texcoord semantics in LDS index evaluation
2081 - r600/sfn: simplify UBO lowering pass
2082 - r600/sfn: Don't emit inline constants in the r600 IR
2083 - r600/sfn: Add LDS IO instructions to r600 IR
2084 - r600/sfn: Add LDS instruction to assembly conversion
2085 - r600/sfn: Add TF write instruction
2086 - r600/sfn: Add IR instruction to fetch the TESS parameters
2087 - r600/sfn: Handle umul24 and umad24
2088 - r600/sfn: Emit some LDS instructions
2089 - r600/sfn: Move emission of barrier from compute shader to shader base
2090 - r600/sfn: Add methods to valuepool to get a vector of values
2091 - r600/sfn: Move some shader base methods to the public interface
2092 - r600/sfn: extract class to handle the VS export to different stages
2093 - r600/sfn: derive the GS from the vertex stage for a common interface
2094 - r600/sfn: Handle LDS output in VS
2095 - r600/sfn: Move removing of unused variables
2096 - r600/sfn: Add lowering passes for Tesselation IO
2097 - r600/sfn: Add tesselation shaders
2098 - r600: Enable tesselation for NIR
2099 - r600: Fix nir compiler options, i.e. don't lower IO to temps for TESS
2100 - r600/sfn: Fix printing vertex fetch instruction flags
2101 - r600: Fix duplicated subexpression in r600_asm.c
2102
2103 Greg V (3):
2104
2105 - amd/addrlib: fix build on non-x86 platforms
2106 - r600: add missing <array> include
2107 - svga: fix build on FreeBSD
2108
2109 H.J. Lu (2):
2110
2111 - x86_init_func_common: Add ENDBR at function entry
2112 - x86: Add ENDBR at function entries
2113
2114 Hanno Böck (1):
2115
2116 - Properly check mmap return value
2117
2118 Hyunjun Ko (27):
2119
2120 - freedreno/ir3: fix printing half constant registers.
2121 - freedreno/ir3: Add cat4 mediump opcodes
2122 - freedreno/ir3: put the conversion back for half const to the right
2123 place.
2124 - freedreno/ir3: Fold const only when the type is float
2125 - freedreno/ir3: Add new ir3 pass to fold out fp16 conversions
2126 - nir: Add optimization for doing removing f16/f32 conversions
2127 - freedreno/ir3: handle half registers for arrays during register
2128 allocation.
2129 - turnip: support indirect draw
2130 - glsl: Handle fp16 unary operations when lowering matrix operations
2131 - glsl/lower_instructions: Handle fp16 for MOD_TO_FLOOR
2132 - turnip: Gather information for transform feedback
2133 - turnip: Define structs for transform feedback
2134 - turnip: Setup stream-output when linking program
2135 - turnip: Implement stream-out emit and vkApis for transform feedback
2136 - turnip: Implement an empty function vkCmdDrawIndirectByteCountEXT
2137 - turnip: Enable VK_EXT_transform_feedback
2138 - turnip: Add tu6_control struct.
2139 - turnip: Fix wrong assignment of xfb output's offset.
2140 - turnip: Do gathering xfb info after nir_remove_dead_variables
2141 - freedreno: Enable mediump lowering
2142 - freedreno/ir3: enable nir_opt_loop_unroll on a6xx
2143 - nir: fix wrong assignment to buffer in xfb_varyings_info
2144 - turnip: make the struct slot_value of queries get 2 values
2145 - turnip: Implement and enable
2146 VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT
2147 - turnip : Fix wrong offset calculation for xfb buffer.
2148 - turnip: Skip unused regs when setting up streamout buffers
2149 - turnip: Fix crashes when geometry shader constants aren't used
2150
2151 Iago Toral Quiroga (1):
2152
2153 - nir: add a bool bitsize lowering pass
2154
2155 Ian Romanick (62):
2156
2157 - intel/fs: Don't count integer instructions as being possibly coissue
2158 - nir: Mark fmin and fmax as commutative and associative
2159 - mesa/draw: Make sure all the unused fields are initialized to zero
2160 - nir/search: Use larger type to hold linearized index
2161 - intel/fs: Correctly handle multiply of fsign with a source modifier
2162 - intel/fs: Do cmod prop again after scheduling
2163 - intel/fs: Allow NOT instructions in conditional discard optimization
2164 - intel/fs: Fix NULL destinations on 3-source instructions again after
2165 late DCE
2166 - nir/algebraic: Simplify logic to detect sign of an integer
2167 - nir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b),
2168 0)
2169 - nir/algebraic: Generalize some and-of-shift-right patterns [v2]
2170 - nir/algebraic: Constant reassociation for bitwise operations too
2171 - nir/algebraic: Simplify a contradiction that can occur in
2172 \__flt64_nonnan
2173 - soft-fp64/b2f: Reimplement using bitwise logic ops
2174 - soft-fp64: Don't open-code umulExtended
2175 - soft-fp64: Simplify \__countLeadingZeros32 function
2176 - soft-fp64: Pick a single idiom for treating sign value as a Boolean
2177 - soft-fp64: Store sign value as 0 or 0x80000000
2178 - soft-fp64/fneg: Don't treat NaN specially
2179 - soft-fp64/flt: Perform checks in a different order
2180 - soft-fp64/fsat: Correctly handle NaN
2181 - soft-fp64/fsat: Micro-optimize x < 0 test
2182 - soft-fp64/fsat: Micro-optimize x >= 1 test
2183 - soft-fp64: Relax the way NaN is propagated
2184 - soft-fp64/ffloor: Simplify the >= 0 comparison
2185 - soft-fp64: Optimize \__fmin64 and \__fmax64 by using different
2186 evaluation order [v2]
2187 - soft-fp64/fadd: Instead of tracking "b < a", track sign of the
2188 difference
2189 - soft-fp64/fadd: Massively split the live range of zFrac0 and zFrac1
2190 - soft-fp64/fadd: Pick zero or non-zero result based on subtraction
2191 result
2192 - soft-fp64/fadd: Just let the subtraction happen when the result will
2193 be zero
2194 - soft-fp64/fadd: Delete a redundant condition check
2195 - soft-fp64/fadd: Reformat after previous commit
2196 - soft-fp64/fadd: Combine an if-statement into the preceeding
2197 else-clause
2198 - soft-fp64/fadd: Rename aFrac and bFrac variables
2199 - soft-fp64/fadd: Use absolute value of expDiff
2200 - soft-fp64/fadd: Move common code out of both branches of an
2201 if-statement
2202 - soft-fp64/fadd: Common code optimization for differing sign case
2203 - soft-fp64: Split a block that was missing a cast on a comparison
2204 - intel/vec4: Allow late copy propagation on vec4
2205 - nir/algebraic: Change the default cursor location when replacing a
2206 unary op
2207 - nir/algebraic: Distribute source modifiers into instructions
2208 - nir/algebraic: Use value range analysis to convert fmax to fsat
2209 - nir/algebraic: Remove a redundant fabs pattern
2210 - tnl: Don't dereference NULL obj pointer in bind_indices
2211 - tnl: Don't dereference NULL obj pointer in replay_init
2212 - tnl: Don't dereference NULL obj pointer in t_rebase_prims
2213 - tnl: Silence unused parameter 'attrib' warning in
2214 convert_half_to_float
2215 - tnl: Silence unused parameter warnings in \_tnl_draw_prims
2216 - tnl: Silence unused parameter warnings in dump_draw_info
2217 - tnl: Silence unused parameter warnings in \_tnl_split_inplace
2218 - tnl: Code formatting in t_draw.c
2219 - tnl: Code formatting in t_rebase.c
2220 - intel/compiler: Silence unused parameter warnings in vec4_tcs_visitor
2221 - intel/compiler: Silence unused parameter warning in
2222 fs_live_variables::setup_one_read
2223 - intel/compiler: Silence unused parameter warning in
2224 update_inst_scoreboard
2225 - intel/compiler: Only GE and L modifiers are commutative for SEL
2226 - intel/compiler: CSEL can do saturate
2227 - intel/compiler: Fixup operands in fs_builder::emit() that takes array
2228 - nir/algebraic: Detect some kinds of malformed variable names
2229 - nir/algebraic: Require operands to iand be 32-bit
2230 - nir/algebraic: Optimize ushr of pack_half, not ishr
2231 - anv/tests: Don't rely on assert or changing NDEBUG in tests
2232
2233 Icecream95 (16):
2234
2235 - panfrost: Fix non-debug builds
2236 - panfrost: Inline panfrost_get_default_swizzle
2237 - panfrost: LogicOp support
2238 - nir: Allow nir_format conversions to work on 32-bit values
2239 - panfrost: LogicOp fixes and non 8-bit format support
2240 - mesa/format_utils: Add a fast-path for RGBA to BGRA
2241 - panfrost: Extend the tiled store fast-path to loads
2242 - panfrost: Mark 64-bit formats as unsupported
2243 - panfrost: Add support for B5G5R5X1
2244 - st/mesa: Fall back on R3G3B2 for R3_G3_B2
2245 - panfrost: Add support for R3G3B2
2246 - panfrost: Correctly identify format 0x4c
2247 - pan/midgard: Fix a divide by zero in emit_alu_bundle
2248 - panfrost: Fix GL_EXT_vertex_array_bgra
2249 - panfrost: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED
2250 - panfrost: Fix background showing when using discard
2251
2252 Icenowy Zheng (3):
2253
2254 - lima: remove its hash table entry when invalidating a resource
2255 - lima: expose fragment shader derivatives capability
2256 - lima: implement zsbuf reload
2257
2258 Ilia Mirkin (24):
2259
2260 - nv50: report max lod bias of 15.0
2261 - gitlab-ci: disable panfrost runners
2262 - mesa: fix \_mesa_draw_nonzero_divisor_bits to return nonzero divisors
2263 - nv50,nvc0: add newly added PIPE_CAP's to list
2264 - st/mesa: allow TXB2/TXL2 to work with cube array shadow textures
2265 - nvc0: enable EXT_texture_shadow_lod
2266 - st/vdpau: avoid asserting on new VDP_YCBCR\_\* formats
2267 - st/vdpau: make query test for 2D support
2268 - nv50: don't try to upload MSAA settings for BUFFER textures
2269 - gallium: add viewport swizzling state and cap
2270 - mesa: add GL_NV_viewport_swizzle support
2271 - st/mesa: add NV_viewport_swizzle support
2272 - nvc0: add NV_viewport_swizzle support for GM200+
2273 - compiler: add VARYING_SLOT_VIEWPORT_MASK
2274 - glsl: add NV_viewport_array2 support
2275 - mesa: add NV_viewport_array2 enable, attach to glsl
2276 - gallium: add TGSI_SEMANTIC_VIEWPORT_MASK
2277 - gallium: add TGSI_PROPERTY_LAYER_VIEWPORT_RELATIVE
2278 - gallium: add PIPE_CAP_VIEWPORT_MASK
2279 - st/mesa: add support for GL_NV_viewport_array2
2280 - nvc0: enable GL_NV_viewport_array2
2281 - nv50,nvc0: update with latest caps
2282 - docs: update for recently-added nvc0 features
2283 - mesa: add interaction between compute derivatives and variable local
2284 sizes
2285
2286 Indrajit Kumar Das (4):
2287
2288 - glapi/copyimage: Implement CopyImageSubDataNV
2289 - gallium: prepare framework for supporting
2290 AlphaToCoverageDitherControlNV
2291 - mesa: add support for AlphaToCoverageDitherControlNV
2292 - radeonsi: enable support for AlphaToCoverageDitherControlNV
2293
2294 Ivan Molodetskikh (1):
2295
2296 - egl: allow INVALID format for linux_dmabuf
2297
2298 James Xiong (2):
2299
2300 - iris: handle the failure of converting unsupported yuv formats to isl
2301 - gallium: let the pipe drivers decide the supported modifiers
2302
2303 James Zhu (1):
2304
2305 - radeonsi: fix Segmentation fault during vaapi enc test
2306
2307 Jan Palus (1):
2308
2309 - targets/opencl: fix build against LLVM>=10 with Polly support
2310
2311 Jan Vesely (2):
2312
2313 - clover: Use explicit conversion from llvm::StringRef to std::string
2314 - clover: Check if the detected clang libraries are usable
2315
2316 Jan Zielinski (8):
2317
2318 - gallium/swr: Fix various asserts and security issues
2319 - gallium/swr: fix corruptions in Unigine Heaven
2320 - gallium/swr: use ElementCount type arguments for getSplat()
2321 - gallium/gallivm: Remove workaround disabling AVX code for newer CPUs
2322 - gallium/gallivm: fix compilation issues with llvm 11
2323 - gallium/gallivm: remove unused header include for newer LLVM
2324 - gallium/swr: Fix LLVM 11 compilation issues
2325 - gallium/swr: Fix crashes and failures in vertex fetch
2326
2327 Jason Ekstrand (202):
2328
2329 - genxml: Add a new 3DSTATE_SF field on gen12
2330 - anv,iris: Set 3DSTATE_SF::DerefBlockSize to per-poly on Gen12+
2331 - intel/genxml: Drop SLMEnable from L3CNTLREG on Gen11
2332 - iris: Set SLMEnable based on the L3$ config
2333 - iris: Store the L3$ configs in the screen
2334 - iris: Use the URB size from the L3$ config
2335 - i965: Re-emit l3 state before BLORP executes
2336 - intel: Take a gen_l3_config in gen_get_urb_config
2337 - intel/blorp: Always emit URB config on Gen7+
2338 - iris: Consolodate URB emit
2339 - anv: Emit URB setup earlier
2340 - intel/common: Return the block size from get_urb_config
2341 - intel/blorp: Plumb deref block size through to 3DSTATE_SF
2342 - anv: Plumb deref block size through to 3DSTATE_SF
2343 - iris: Plumb deref block size through to 3DSTATE_SF
2344 - anv: Always fill out the AUX table even if CCS is disabled
2345 - intel/eu/validate: Don't validate regions of sends
2346 - intel/disasm: SEND has two sources on Gen12+
2347 - intel/tools: Handle strides better when dumping buffers
2348 - intel/fs: Write the address register with NoMask for MOV_INDIRECT
2349 - anv/blorp: Use the correct size for vkCmdCopyBufferToImage
2350 - anv: No-op submit and wait calls when no_hw is set
2351 - anv: Reject modifiers on depth/stencil formats
2352 - vulkan: Update the XML and headers to 1.2.133
2353 - nir: Fix the nir_builder include path for nir_builtin_builder
2354 - nir/builder: Return an integer from nir_get_texture_size
2355 - intel/isl: Add isl_aux_info.c to Makefile.sources
2356 - anv: Always enable the data cache
2357 - nir: Drop nir_tex_instr::texture_array_size
2358 - anv: Use the PIPE_CONTROL instead of bits for the CS stall W/A
2359 - anv: Use a proper end-of-pipe sync instead of just CS stall
2360 - anv: Do end-of-pipe sync around MCS/CCS ops instead of CS stall
2361 - nir: Flush to zero with OOB low exponents in ldexp
2362 - isl: Set 3DSTATE_DEPTH_BUFFER::Depth correctly for 3D surfaces
2363 - iris: Allow HiZ on blit sources
2364 - blorp: Write to depth/stencil images as depth/stencil when possible
2365 - anv: Enable HiZ for VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL
2366 - iris: Enable CCS for copies from HiZ+CCS depth buffers
2367 - iris: Enable HiZ and stencil CCS for blorp blit destinations
2368 - iris: Don't skip fast depth clears if the color changed
2369 - anv: Parse VkPhysicalDeviceFeatures2 in CreateDevice
2370 - anv: Mark max_push_range UNUSED and simplify the code
2371 - anv: Pass buffer addresses into emit_push_constant\*
2372 - anv: Delete some pointless break statements
2373 - anv: Align UBO sizes to 32B
2374 - anv: Add an align_down_u32 helper
2375 - anv: Bounds-check pushed UBOs when robustBufferAccess = true
2376 - vulkan/wsi: Don't leak the FD when
2377 GetImageDrmFormatModifierProperties fails
2378 - vulkan/wsi: Return an error if dup() fails
2379 - intel/isl: Clean up some aux surface logic
2380 - intel/isl: Add a separate ISL_AUX_USAGE_HIZ_CCS_WT
2381 - intel/blorp: Allow HIZ_CCS_WT in copy sources
2382 - iris: Use ISL_AUX_USAGE_HIZ_CCS_WT to indicate write-through HiZ
2383 - intel/isl: Require ISL_AUX_USAGE_HIZ_CCS_WT for HZ+CCS WT mode
2384 - intel/isl: Add a separate ISL_AUX_USAGE_STC_CCS
2385 - intel/blorp: Allow STC_CCS in blit sources
2386 - iris: Use ISL_AUX_USAGE_STC_CCS for stencil CCS
2387 - intel: Require ISL_AUX_USAGE_STC_CCS for stencil CCS
2388 - intel/isl: Set DepthStencilResource based on aux usage
2389 - anv: Dump push ranges via VK_KHR_pipeline_executable_properties
2390 - anv: Fix the comparison in an assert
2391 - anv: Push UBO ranges relative to the start of the binding
2392 - anv: Do an end-of-pipe sync before updating AUX table entries
2393 - intel/isl: Don't align linear images to 64K on Gen12+
2394 - intel/blorp: Add support for swizzling fast-clear colors
2395 - anv: Swizzle fast-clear values
2396 - intel/iris: Always initialize CCS to 0
2397 - anv: Only add END_OF_PIPE_SYNC if we actually have AUX_INVAL
2398 - util/sparse_array: Finish the sparse_array in the tests
2399 - util/sparse_array: Add a node_size_log2 temporary
2400 - meson,ci: Disable sparse_array tests on windows
2401 - util/sparse_array: Stash the node level in the node pointer
2402 - anv: Stop fetching the timestamp frequency ourselves
2403 - intel/dump_gpu: Add an ensure_device_info helper
2404 - intel/dump_gpu: Handle a bunch of getparam in the no-HW case
2405 - intel/nir: Run copy-prop and DCE after lower_bool_to_int32
2406 - nir: Add b2b opcodes
2407 - aco: Implement b2b32 and b2b1
2408 - nir: Use b2b opcodes for shared and constant memory
2409 - nir: Insert b2b1s around booleans in nir_lower_to
2410 - anv: Set alignments on descriptor and constant loads
2411 - nir: Validate that memory load/store ops work on whole bytes
2412 - nir: Set UBO alignments in lower_uniforms_to_ubo
2413 - nir/opt_loop_unroll: Fix has_nested_loop handling
2414 - nir/lower_int64: Lower 8 and 16-bit downcasts with nir_lower_mov64
2415 - nir/algebraic: Add downcast-of-pack opts
2416 - nir: Add a nir_op_is_vec helper
2417 - nir: Copy propagate through vec8s and vec16s
2418 - nir: Handle vec8/16 in bool_to_bitsize
2419 - nir: Handle vec8/16 in gather_ssa_types
2420 - nir: Handle vec8/16 in lower_phis_to_scalar
2421 - nir: Handle vec8/16 in lower_regs_to_ssa
2422 - nir: Handle vec8/16 in opt_split_alu_of_phi
2423 - nir: Treat vec8/16 as select in opt_peephole_select
2424 - nir: Handle vec8/16 in opt_undef_vecN
2425 - nir: Handle vec8/16 in nir_shrink_array_vars
2426 - anv: Account for the header in anv_state_stream_alloc
2427 - anv/allocator: Use util_dynarray for blocks in anv_state_stream
2428 - spirv: Implement OpCopyObject and OpCopyLogical as blind copies
2429 - Revert "spirv: Implement OpCopyObject and OpCopyLogical as blind
2430 copies"
2431 - anv/image: Use align_u64 for image offsets
2432 - nir/from_ssa: Only chain movs when a src is also a dest
2433 - intel/fs: Choose memory message type based on bit size
2434 - anv: Improve brw_nir_lower_mem_access_bit_sizes
2435 - iris: Set alignments on cbuf0 and constant reads
2436 - intel/nir: Lower memory access bit sizes later
2437 - nir/load_store_vectorize: Fix shared atomic info
2438 - nir/load_store_vectorize: Use nir_iadd_imm for offsets
2439 - nir/load_store_vectorize: Add support for nir_var_mem_global
2440 - intel/nir: Enable load/store vectorization
2441 - spirv: Add a vtn_block() helper
2442 - spirv: Add cast and loop helpers for vtn_cf_node
2443 - spirv: Make vtn_case a vtn_cf_node
2444 - spirv: Make vtn_function a vtn_cf_node
2445 - spirv: Add a parent field to vtn_cf_node
2446 - spirv: Rewrite CFG construction
2447 - Revert "spirv: Rewrite CFG construction"
2448 - nir: Assert memory loads are aligned
2449 - anv: Advertise SEND count through
2450 VK_EXT_pipeline_executable_properties
2451 - anv: Fix UBO range detection in anv_nir_compute_push_layout
2452 - nir: Add an alignment to nir_intrinsic_load_constant
2453 - nir: Add some sanity assertions in opt_large_constants
2454 - intel: Add \_const versions of prog_data cast helpers
2455 - anv: Report correct SLM size
2456 - intel/batch_decoder: Stop printing to stdout
2457 - intel/cfg: Add first/last_block helpers
2458 - anv: Emit pushed UBO bounds checking code in the back-end compiler
2459 - intel/blorp: Delete an unused enum
2460 - spirv: Handle OOB vector extract operations
2461 - spirv,nir: Add a better vector_insert
2462 - spirv: Error if OpCompositeInsert/Extract has OOB indices
2463 - nir/builder: Handle any bit-size selector in nir_extract
2464 - spirv: Call nir_builder directly for vector_extract
2465 - spirv,nir: Move the SPIR-V vector insert code to NIR
2466 - anv: Move vb_emit setup closer to where it's used in flush_state
2467 - anv: Apply any needed PIPE_CONTROLs before emitting state
2468 - nir/dominance: Better handle unreachable blocks
2469 - nir/gcm: Loop over blocks in pin_instructions
2470 - nir/gcm: Use an array for storing the early block
2471 - nir/gcm: Move block choosing into a helper function
2472 - nir/gcm: Add a real concept of "progress"
2473 - nir/gcm: Delete dead instructions
2474 - nir/gcm: Prefer the instruction's original block
2475 - intel/fs: Rename block to scan_block in can_coalesce_vars
2476 - intel/fs: Coalesce when the src live range is contained in the dst
2477 - glsl: Hard-code noise to zero in builtin_functions.cpp
2478 - nir: Delete the fnoise opcodes
2479 - meta,i965: Rip GL_EXT_texture_multisample_blit_scaled support out of
2480 meta
2481 - spirv: Allow constants and NULLs in SpvOpConvertUToPtr
2482 - anv: Properly handle all sizes of specialization constants
2483 - radv: Properly handle all sizes of specialization constants
2484 - turnip: Properly handle all sizes of specialization constants
2485 - spirv: Use nir_const_value for spec constants
2486 - nir/opt_deref: Remove certain sampler type casts
2487 - spirv: Fix passing combined image/samplers through function calls
2488 - anv: Drop an assert
2489 - nir/lower_subgroups: Mask off unused bits in ballot ops
2490 - anv: Add a vk_image_layout_to_usage_flags helper
2491 - anv: Move vk_image_layout_is_read_only higher
2492 - anv: Be more conservative about image view usage
2493 - anv: Rework anv_layout_to_aux_state
2494 - anv/blorp: Do less hard-coding of aux usages
2495 - anv: Generalize some aux usage checks
2496 - intel/blorp: Allow more HiZ usages in hiz_clear_depth_stencil
2497 - anv: Simplify a case in layout_to_aux_usage
2498 - anv/cmd_buffer: Move anv_image_init_aux_tt higher
2499 - intel/isl: Delete a misleading comment
2500 - intel/isl: Refactor isl_surf_get_ccs_surf
2501 - anv: Add support for HiZ+CCS
2502 - spirv: Rewrite CFG construction
2503 - intel/devinfo: Compute the correct L3$ size for Gen12
2504 - anv: Expose CS workgroup sizes based on a maximum of 64 threads
2505 - anv: Return an error if allocating attachment memory fails
2506 - anv: Add TRANSFER_SRC to pass usage not subpass usage
2507 - anv: Stop filling out the clear color in compute_aux_usage
2508 - anv: Assert surface states are valid
2509 - anv: Use ANV_FROM_HANDLE for pInheritanceInfo fields
2510 - anv: Mark images written in end_subpass
2511 - anv: Split command buffer attachment setup in three
2512 - anv: Allocate surface states per-subpass
2513 - intel: Move swizzle_color_value from blorp to ISL
2514 - anv: Disallow fast-clears which require format-reinterpretation
2515 - anv: Stop allowing non-zero clear colors in input attachments
2516 - anv: Refactor cmd_buffer_setup_attachments
2517 - anv: Rework depth_stencil_attachment_compute_aux_usage
2518 - anv: Split color_attachment_compute_aux_usage in two
2519 - anv: Use anv_layout_to_aux_usage for color during render passes
2520 - anv: Allow all clear colors for texturing on Gen11+
2521 - vulkan: Update Vulkan XML and headers to 1.2.139
2522 - nir/copy_prop_vars: Handle volatile better
2523 - nir/copy_prop_vars: Report progress when deleting self-copies
2524 - nir/dead_write_vars: Handle volatile
2525 - nir/combine_stores: Handle volatile
2526 - anv: Handle NULL descriptors
2527 - anv: Handle null vertex buffer bindings
2528 - anv: Claim VK_EXT_robustness2 support
2529 - intel/fs: Don't delete coalesced MOVs if they have a cmod
2530 - vulkan: Allow destroying NULL debug report callbacks
2531 - anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+
2532 - nir/lower_double_ops: Rework the if (progress) tree
2533 - nir/opt_deref: Report progress if we remove a deref
2534 - nir/copy_prop_vars: Record progress in more places
2535
2536 Jesse Natalie (3):
2537
2538 - wgl: add official gldrv.h header-file
2539 - wgl: use gldrv.h instead of stw_icd.h
2540 - util/ralloc: fix ralloc alignment on Win64
2541
2542 John Stultz (7):
2543
2544 - freedreno: Add ir3_cf.c and ir3_delay.c to Makefile.sources
2545 - panfrost: Move pan_afbc.c file to the the right Makefile.source file
2546 - gallium: hud_context: Fix scalar initializer warning.
2547 - Android.mk: Tweak MESA_ENABLE_LLVM checks
2548 - etnaviv: Avoid shift overflow
2549 - vc4_bufmgr: Remove duplicative VC definition
2550 - r600: Fix build error in sfn_nir_lower_fs_out_to_vector.cpp
2551
2552 Jon Turney (1):
2553
2554 - Fix util/process test on Cygwin
2555
2556 Jonathan Marek (79):
2557
2558 - freedreno/a6xx: use single format enum
2559 - freedreno/a6xx: fix Z24_UNORM_S8_UINT_AS_R8G8B8A8
2560 - freedreno: name sysmem color/depth flush events
2561 - freedreno/a6xx: document some unknown bits
2562 - turnip: add option to force use of hw binning
2563 - turnip: fix COND_EXEC reserved size in tu_query
2564 - turnip: add tu_device pointer to tu_cs
2565 - turnip: automatically reserve cmdstream space in emit_pkt4/emit_pkt7
2566 - turnip: remove marker seqno
2567 - turnip: make cond_exec helper easier to use
2568 - turnip: move tile_load_ib/sysmem_clear_ib into draw_cs
2569 - hud: add GALLIUM_HUD_SCALE
2570 - turnip: enable sampleRateShading feature
2571 - turnip: enable
2572 fullDrawIndexUint32/independentBlend/dualSrcBlend/logicOp
2573 - etnaviv: disable INT_FILTER for ASTC
2574 - util/format: add missing BC4/BC5 vulkan formats
2575 - turnip: rework format table to support r5g5b5a1_unorm/b5g5r5a1_unorm
2576 - turnip: add r5g5b5a1_unorm/b5g5r5a1_unorm formats
2577 - turnip: check the right alignment requirement on shader iova
2578 - turnip: move some constant state to tu6_init_hw
2579 - turnip: remove unecessary MRT_CONTROL fill
2580 - turnip: minify image_view extent
2581 - turnip: fix hw binning + render_area offset interaction
2582 - turnip: fix srgb MRT
2583 - turnip: don't hardcode gmem base for input attachment
2584 - turnip: remove unnecessary fb size check
2585 - turnip: fall back to sysmem when attachments don't fit into gmem
2586 - turnip: increase array sizes in tu_descriptor_map
2587 - turnip: improve binning pipe layout config
2588 - turnip: fix tile->slot calculation
2589 - etnaviv: nir: add compile_check_limits
2590 - freedreno/registers: more GRAS_CL_CNTL bits, Z_CLAMP
2591 - turnip: fix znear clipping
2592 - turnip: implement depth clamp
2593 - turnip: implement timestamp query
2594 - turnip: fix compute shaders crashing after geometry shader change
2595 - turnip: improve vertex input handling
2596 - turnip: use buffer size instead of bo size for VFD_FETCH_SIZE
2597 - freedreno/registers: add RB_CCU_CNTL bitfields
2598 - freedreno/a6xx: set bypass RB_CCU_CNTL value for blitter
2599 - turnip: RB_CCU_CNTL fixes
2600 - turnip: split up gmem/tile alignment
2601 - turnip: fix nir validate failure from push constant lowering
2602 - turnip: disable 8x msaa
2603 - turnip: save attachment samples in renderpass state
2604 - turnip: use dirty bits for dynamic viewport/scissor state
2605 - turnip: rework format helpers
2606 - turnip: add vk_format_is_snorm/is_float
2607 - turnip: new clear/blit implementation with shader path fallback
2608 - freedreno/computerator: support nop prefix
2609 - freedreno/computerator: support bindless sampler instructions
2610 - freedreno/ir3: fix emit_tex_info split_dest
2611 - freedreno/ir3: don't overwrite wrmask in ir3_SAM
2612 - turnip: compute render_components/srgb_cntl at renderpass creation
2613 time
2614 - turnip: don't limit framebuffer size to image size
2615 - turnip: image_view rework
2616 - nir: add common convert_ycbcr for vulkan csc
2617 - nir: convert_ycbcr: preserve alpha channel
2618 - anv: use common nir_convert_ycbcr
2619 - radv: use common nir_convert_ycbcr
2620 - turnip: fix GMEM resolve in CmdNextSubpass
2621 - turnip: disable depth test for S8_UINT attachment
2622 - turnip: improve GMEM load/store logic
2623 - turnip: enable VK_FORMAT_S8_UINT as stencil format
2624 - turnip: set shader key msaa field
2625 - turnip: implement VK_EXT_sample_locations
2626 - turnip: implement VK_EXT_filter_cubic
2627 - turnip: enable cube arrays
2628 - turnip: implement VK_EXT_sampler_filter_minmax
2629 - turnip: divide cube map depth by 6
2630 - freedreno/ir3: fix 16-bit ssbo access
2631 - freedreno/ir3: set even bit for f2f16_rtne
2632 - freedreno/ir3: fix incorrect conversion folding
2633 - turnip: remove unused RB_UNKNOWN_8E04_blit
2634 - turnip: use RESOLVE_TS event
2635 - turnip: add adreno 650
2636 - nir: add pack_32_2x16_split/unpack_32_2x16_split lowering
2637 - freedreno/ir3: run nir_lower_pack
2638 - turnip: fix wrong substream size in parse_multisample_and_color_blend
2639
2640 Jordan Justen (6):
2641
2642 - intel/compiler: Restrict cs_threads to 64
2643 - intel: Update TGL PCI strings
2644 - intel: Add TGL PCI ID
2645 - intel/dev: Split .num_subslices out of GEN12_FEATURES macro
2646 - intel/dev: Add device info for RKL
2647 - docs/relnotes/new_features.txt: Add RKL to 20.1 release notes
2648
2649 Jose Maria Casanova Crespo (5):
2650
2651 - broadcom: Fix implicit declaration of ffs for Android build
2652 - v3d: Sync on last CS when non-compute stage uses resource written by
2653 CS
2654 - v3d: Primitive Counts Feedback needs an extra 32-bit padding.
2655 - v3d: Fix swizzle in DXT3 and DXT5 formats
2656 - v3d: Include supported DXT formats to enable s3tc/dxt extensions
2657
2658 Joshua Ashton (3):
2659
2660 - radv: Use TRUNC_COORD on samplers
2661 - radv: Pass logical device to si_emit_graphics
2662 - radeonsi: Use TRUNC_COORD on samplers
2663
2664 José Fonseca (4):
2665
2666 - meson: Avoid duplicate symbols.
2667 - scons: Prune out unnecessary targets.
2668 - gitlab-ci: Prune all SCons jobs except scons-win64, and allow
2669 failures.
2670 - appveyor: Remove Meson job.
2671
2672 Juan A. Suarez Romero (6):
2673
2674 - nir/lower_double_ops: add note for lowering mod
2675 - nir/lower_double_ops: relax lower mod()
2676 - nir/algebraic: coalesce fmod lowering
2677 - anv: use urb_setup_attribs in SBE
2678 - intel/compiler: store the FS inputs in WM prog data
2679 - anv/pipeline: allow more than 16 FS inputs
2680
2681 Karol Herbst (18):
2682
2683 - clover: add trivial clCreateCommandQueueWithProperties implementation
2684 - nir/lower_ssbo: handle atomics
2685 - gallium: make handles of set_global_binding 64 bit
2686 - Revert "gallium: make handles of set_global_binding 64 bit"
2687 - nv50, nvc0: fix must_check warning of util_dynarray_resize_bytes
2688 - clover: fix build with single library clang build
2689 - gallium: add PIPE_CAP_SYSTEM_SVM
2690 - clover: add stubs for SVM
2691 - clover: implement CL_DEVICE_SVM_CAPABILITIES
2692 - clover: implement clSetKernelArgSVMPointer
2693 - clover: implement SVM functions for devices with fine grained system
2694 SVM support
2695 - clover: implement cl_arm_shared_virtual_memory
2696 - clover: expose cl_arm_shared_virtual_memory for devices with SVM
2697 support
2698 - nvc0: enable ASTC and ETC on GM20B
2699 - mesa: fix enum value of VIEWPORT_SWIZZLE_POSITIVE_W_NV
2700 - gallium: initialize viewport swizzle in cso_set_viewport_dims
2701 - Revert "nvc0: fix line width on GM20x+"
2702 - st/mesa: properly guard fallback_copy_texsubimage against failed maps
2703
2704 Kenneth Graunke (14):
2705
2706 - intel/genxml: Drop "reserved" enum
2707 - isl: Fix the android build.
2708 - iris: Dump frame markers with INTEL_DEBUG=submit
2709 - iris: Trim "../../src/gallium/drivers/iris/" out of debug dump
2710 filenames
2711 - iris: Make mocs an inline helper in iris_resource.h
2712 - iris: Fix BLORP vertex buffers to respect ISL MOCS settings
2713 - iris: Set MOCS for constant packets on Gen12+
2714 - intel/compiler: Drop nir_lower_to_source_mods() and related handling.
2715 - intel/compiler: Put back saturate on [iu]add_sat opcodes
2716 - intel/compiler: Don't copy prop source mods into PICK_HIGH_32BIT
2717 - intel/compiler: Delete abs/neg handling in fsign code
2718 - intel/compiler: Don't create 64-bit src1 immediates in
2719 opt_peephole_sel
2720 - nir: Actually do load/store vectorization beyond vec2
2721 - iris: Fix downcast of bound_vertex_buffers from uint64_t to int
2722
2723 Konrad Dybcio (1):
2724
2725 - freedreno/a4xx: enable A405
2726
2727 Kristian Høgsberg (39):
2728
2729 - nir: Delete unused is_var_constant() helper
2730 - nir: Make unroll pragma work on clang
2731 - freedreno/fdperf: Cast away some ignored return values
2732 - spirv/opencl: Cast opcode up front to avoid warnings
2733 - glsl: Use 'using' to be explicit about visitor overloads
2734 - nir: Remove always-true assert
2735 - turnip: Be explicit about converting vk compare func to a6xx
2736 - freedreno/a6xx: Add fd6_resource_screen_init()
2737 - freedreno: Set up supported modifiers in fd*_resource_screen_init()
2738 - freedreno: Add layout_resource_for_modifier screen vfunc
2739 - freedreno/a6xx: Implement layout for DRM_FORMAT_MOD_QCOM_COMPRESSED
2740 - turnip: Drop explicit configure opt-in for turnip
2741 - ci: Drop turnip opt-in option
2742 - freedreno/ir3: Set IR3_REG_HALF flag on src as well in immediate MOV
2743 - Mark a few static inline helpers with ASSERTED
2744 - main/get: Converted type conversion macros to inline functions
2745 - nir/types: Add glsl_float16_type() helper
2746 - freedreno/ir3: Lower output precision
2747 - Revert "glsl: Use a simpler formula for tanh"
2748 - Revert "spirv: Use a simpler and more correct implementation of
2749 tanh()"
2750 - freedreno/ir3: Don't fold conversions into sign
2751 - glsl: Add ir_constant constructor for fp16
2752 - glsl: Add fp16 case for ir_triop_lrp optimization
2753 - glsl: Implement constant propagation for fp16
2754 - glsl: Expand fp16 to float before constant expression evaluation
2755 - glsl: Add type queries for fp16+float and fp16+float+double
2756 - glsl/lower_instructions: Handle fp16 for FDIV_TO_MUL_RCP
2757 - radeonsi: Stop exposing PIPE_SHADER_CAP_FP16
2758 - turnip: Add missing VKAPI_ATTR annotations
2759 - turnip: Stub out VK_KHR_external_{fence,semaphore}_fd
2760 - turnip: Make Android platform build
2761 - turnip: Drop dep_llvm from dependencies
2762 - freedreno/ir3: Fix sz vs class confusion
2763 - freedreno/computerator: Decouple ir3 assembler
2764 - freedreno/ir3: Move ir3 assembler to backend compiler
2765 - freedreno/ir3: Parse, but ignore @in, @out and @tex headers
2766 - freedreno/ir3: Reset lex line number when we start parsing
2767 - freedreno/ir3: Print @tex write mask using 0x%x
2768 - freedreno: Use the right amount of &'s
2769
2770 Krzysztof Raszkowski (10):
2771
2772 - gallium/swr: fix gcc warnings
2773 - gallium/swr: Fix gcc 4.8.5 compile error
2774 - gallium/swr: Fix llvm11 compilation issues
2775 - gallium/swr: simplify environment variable expansion code
2776 - gallium/swr: fix rdtsc debug statistics mechanism
2777 - gallium/swr: Fix min/max range index draw
2778 - Revert "gallium/swr: Fix min/max range index draw"
2779 - gallium/swr: Fix vcvtph2ps llvm intrinsic compile error
2780 - gallium/swr: Fix array stride problem.
2781 - gallium/swr: Re-enable scratch space for client-memory buffers
2782
2783 Leandro Ribeiro (1):
2784
2785 - i965: remove duplicated comment
2786
2787 Leo Liu (1):
2788
2789 - radeon/jpeg: fix the jpeg dt_pitch with YUYV format
2790
2791 Lepton Wu (1):
2792
2793 - virgl: Use ETC2 formats directly when possible.
2794
2795 Lionel Landwerlin (49):
2796
2797 - iris: implement gen12 post sync pipe control workaround
2798 - anv: implement gen9 post sync pipe control workaround
2799 - anv: implement gen12 post sync pipe control workaround
2800 - anv: set MOCS on push constants
2801 - mesa: add INTEL_blackhole_render
2802 - i965: enable INTEL_blackhole_render
2803 - st: add support for INTEL_blackhole_render
2804 - iris: add support for INTEL_blackhole_render
2805 - intel/tools/aub_dump: move aub file initialization to maybe_init()
2806 - intel/tools/aub_dump: fix crash when using the default legacy context
2807 - intel/aub_dump: stub the waits when overriding the device
2808 - intel/tools/dump_gpu: fix getparam values
2809 - anv: stop storing prog param data into shader blobs
2810 - intel/decoder: don't consider header fields past dword0
2811 - isl: implement linear tiling row pitch requirement for display
2812 - isl: properly filter supported display modifiers on Gen9+
2813 - isl: only apply main surface ccs pitch constraint with CCS
2814 - isl: drop min row pitch alignment when set by the driver
2815 - intel: add new TGL pci ids
2816 - i965/iris: fix crash when calling GetPerfQueryDataINTEL
2817 - vulkan/overlay: Add a workaround semaphore for application presenting
2818 without one
2819 - intel/perf: move register definition to special file
2820 - intel/perf: break GL query stuff away
2821 - intel/perf: move mdapi query definitions to their own file
2822 - intel/perf: document meaning of query field
2823 - intel/perf: store the probed i915-perf version
2824 - isl: set bpb for Y8_UNORM
2825 - isl: don't warn in physical extent calculation for yuv formats
2826 - intel/aub_viewer: fix access to freed memory
2827 - drm-shim: return device platform as specified
2828 - drm-shim: stub libdrm's use of realpath()
2829 - iris: properly free resources on BO allocation failure
2830 - iris: share buffer managers across screens
2831 - iris: make resources take a ref on the screen object
2832 - i965: store DRM fd on intel_screen
2833 - i965: share buffer managers across screens
2834 - iris: drop cache coherent cpu mapping for external BO
2835 - intel/perf: Enable MDAPI queries for Gen12
2836 - anv: skip writing perfcntr in results on Gen12+
2837 - util/sparse_free_list: manipulate node pointers using atomic
2838 primitives
2839 - iris: fail screen creation when kernel support is not there
2840 - include/drm-uapi: bump headers
2841 - intel/perf: store default sseu configuration
2842 - intel/perf: specify sseu configuration when supported
2843 - anv: force whole EU array to be powered for perf queries
2844 - drm-shim: provide a valid fake syncobj handle at creation
2845 - drm-shim: stub syncobj wait ioctl
2846 - iris: don't assert on unfinished aux import in copy paths
2847 - anv: don't expose VK_INTEL_performance_query without kernel support
2848
2849 Liviu Prodea (2):
2850
2851 - scons/windows: Support build with LLVM 10.
2852 - util: Make process_test path compatible with mingw native toolchains
2853
2854 Louis-Francis Ratté-Boulianne (7):
2855
2856 - glsl/linker: add DisableTransformFeedbackPacking workaround
2857 - glsl/linker: handle array/struct members for DisableXfbPacking
2858 - glsl/linker: add xfb workaround for modified built-in variables
2859 - gallium: add PIPE_CAP_PACKED_STREAM_OUTPUT
2860 - gallium: add PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED
2861 - gallium: add PIPE_CAP_PSIZ_CLAMPED
2862 - panfrost: fix transform feedback
2863
2864 Lucas Stach (1):
2865
2866 - etnaviv: retarget transfer to render resource when necessary
2867
2868 Marek Olšák (254):
2869
2870 - vbo: move GLvertexformat initialization into a template header file
2871 for reuse
2872 - vbo: use the template for noop GLvertexformat initialization
2873 - vbo: use the template for save GLvertexformat initialization
2874 - vbo: move reusable code from vbo_attrib_tmp.h into vbo_util.h
2875 - mesa: implement missing display list functions while switching to the
2876 template
2877 - radeonsi: don't report that multi-plane formats are supported
2878 - radeonsi: fix the DCC MSAA bug workaround
2879 - radeonsi: don't update states for the DCC MSAA bug on GFX6-7
2880 - glx: print FPS with 2 decimal places
2881 - mesa: fix incorrect uses of FLUSH_CURRENT
2882 - mesa: remove FLUSH_CURRENT calls that have no effect
2883 - mesa: import PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET handling
2884 - vbo: create the immediate mode buffer only in vbo_exec_vtx_map
2885 - vbo: skip FlushMappedBufferRange for glBegin/End by using a
2886 persistent mapping
2887 - vbo: don't unmap persistent buffer mappings for glBegin/End
2888 - vbo: remove immediate mode code that doesn't do anything and simplify
2889 stuff
2890 - vbo: interleave attrsz, attrtype, and active_sz in memory
2891 - vbo: remove a funky recursive call in glBegin
2892 - vbo: don't check ctx->NewState twice in glBegin
2893 - vbo: keep the immediate mode buffer always mapped for simplicity
2894 - vbo: don't set FLUSH_UPDATE_CURRENT for glVertex
2895 - vbo: pass only either uint32_t or uint64_t into ATTR_UNION
2896 - vbo: don't store glVertex values temporarily into exec
2897 - vbo: optimize resizing vertex attributes during immediate mode
2898 - vbo: fix resizing 64-bit vertex attributes
2899 - vbo: use FlushVertices flags properly and clear NeedFlush correctly
2900 - vbo: increase the size of the immediate mode buffer to decrease draw
2901 count
2902 - vbo: add/update unlikely statements in ATTR_UNION
2903 - vbo: delay flagging FLUSH_STORED_VERTICES until glEnd
2904 - vbo: also map the immediate mode buffer for read
2905 - vbo: clean up resetting vertex attribs
2906 - vbo: merge use_buffer_objects into vbo_CreateContext to skip the big
2907 malloc
2908 - i965: don't use \_mesa_prim::is_indirect
2909 - mesa: remove unused \_mesa_prim::is_indirect
2910 - mesa: don't use bitfields in \_mesa_prim
2911 - st/mesa: optimize st_update_array with ALWAYSINLINE
2912 - radeonsi: don't wait for shader compilation to finish when destroying
2913 a context
2914 - mesa: translate into gallium vertex formats in mesa/main
2915 - mesa: remove unused \_mesa_draw_indirect
2916 - st/mesa: always inline the code setting non-64bit vertex elements
2917 - st/mesa: simplify determination whether a draw has user vertex
2918 buffers
2919 - st/mesa: simplify determination whether a draw needs min/max index
2920 - st/mesa: change some loops from while to do..while in st_atom_array.c
2921 - st/mesa: make st_setup_current static
2922 - st/mesa: simplify releasing the current attrib buffer
2923 - gallium/u_upload_mgr: reduce dereferences by adding buffer_size
2924 - gallium/u_upload_mgr: don't do align twice in the u_upload_alloc fast
2925 path
2926 - gallium/u_vbuf: adjust the heuristic for unrolling indices
2927 - gallium/cso_hash: inline a bunch of functions
2928 - gallium/cso_hash: make cso_hash declared within structures instead of
2929 alloc'd
2930 - gallium/cso_hash: remove always constant variable nodeSize
2931 - gallium/cso_hash: cosmetic changes, no behavior changes
2932 - gallium/cso_hash: remove another layer of pointer indirection
2933 - st/mesa: try to fix MSVC build failure due to ALWAYS_INLINE
2934 - vbo: remove dead code in vbo_can_merge_prims
2935 - vbo: remove redundant code in vbo_exec_fixup_vertex
2936 - mesa: document \_mesa_prim::begin/end
2937 - mesa: don't use memset in glDrawArrays
2938 - mesa: fix immediate mode with tessellation and varying patch vertices
2939 - gallium/util: remove unused u_surfaces.c/h
2940 - util: remove the dependency on kcmp.h
2941 - nir: fix gl_nir_lower_images for bindless images
2942 - tgsi_to_nir: set num_images and num_samplers with holes correctly
2943 - gallium/hash_table: consolidate hash tables with pointer keys
2944 - gallium/hash_table: consolidate hash tables with FD keys
2945 - gallium/hash_table: use the same callback signatures as
2946 util/hash_table
2947 - gallium/hash_table: turn it into a wrapper around util/hash_table
2948 - gallium/hash_table: remove some function wrappers
2949 - mesa: remove leftovers from ARB_shadow_ambient
2950 - mesa: call FLUSH_VERTICES before updating CoordReplace
2951 - i965: stop using "indirect" parameter from Driver.Draw (non-indirect)
2952 - mesa: remove unused "indirect" parameter from Driver.Draw
2953 - gallium/cso_hash: pack cso_node better
2954 - gallium/cso_hash: inline struct cso_hash_data
2955 - gallium: pass cso_velems_state into cso_context instead of
2956 pipe_vertex_element
2957 - gallium/u_threaded: fix uploading user indices with start != 0
2958 - gallium/u_threaded: convert dividing by index_size to a bit shift
2959 - mesa/i965: remove \_mesa_prim::indirect_offset
2960 - mesa: remove redundant \_mesa_prim::is_indexed
2961 - mesa: move num_instances and base_instance out of \_mesa_prim
2962 - mesa: clean up glMultiDrawElements code, use alloca for small draw
2963 count (v2)
2964 - mesa: don't unroll glMultiDrawElements if one count is 0
2965 - mesa: optimize glMultiDrawArrays, call Draw only once (v2)
2966 - mesa: fix incorrect prim.begin/end for glMultiDrawElements
2967 - nir: replace GCC unroll with an option that works on GCC < 8.0
2968 - gallivm: fix 5 warnings
2969 - nir: fix 5 warnings
2970 - mesa: fix 11 warnings
2971 - gallium/u_vbuf: silence a warning by using unreachable
2972 - mesa: add index_size_shift = log2(index_size) into
2973 \_mesa_index_buffer
2974 - mesa: replace some index_size multiplications and divisions with
2975 shifts
2976 - vbo: don't look at the second draw's count when merging 2 glBegin/End
2977 draws
2978 - vbo: deduplicate copy_vertices functions
2979 - vbo: clean up vbo_copy_vertices
2980 - vbo: handle GS and tess primitive types when splitting Begin/End
2981 - vbo: clean up conditional blocks in ATTR_UNION
2982 - vbo: fold code from vbo_exec_fixup_vertex to
2983 vbo_exec_wrap_upgrade_vertex
2984 - Revert "mesa: check for z=0 in \_mesa_Vertex3dv()"
2985 - mesa: remove \_mesa_index_buffer::index_size in favor of
2986 index_size_shift
2987 - mesa: optimize get_index_size
2988 - mesa: deduplicate draw indirect functions
2989 - vbo: merge more primitive types for glBegin/End (v2)
2990 - vbo: merge draws even when begin==0 or end==0
2991 - glthread: don't generate the sync fallback if the call size is not
2992 variable
2993 - glthread: don't prefix variable_data with const
2994 - glthread: inline \_mesa_unmarshal_dispatch_cmd and convert the switch
2995 to a table
2996 - glthread: reduce pointer dereferences in glthread_unmarshal_batch
2997 - glthread: use int instead of size_t where it's OK
2998 - glthread: simplify repeated function sequences in marshal_generated.c
2999 - glthread: don't insert \_mesa_post_marshal_hook into every function
3000 - glthread: don't increment variable_data if it's the last
3001 variable-size param
3002 - glthread: add GL_DRAW_INDIRECT_BUFFER tracking and generator support
3003 - glthread: add/update count and marshal fields for many GL functions
3004 - glthread: handle complex pointer parameters and support GL functions
3005 with strings
3006 - glthread: check the size of all variable params and clean up the code
3007 - glthread: replace custom ClearBuffer marshalling with generated one
3008 - glthread: add support for TexParameteri and SamplerParameteri
3009 functions
3010 - glthread: add support for glFog, glLight, glLightModel, glTexEnv,
3011 glTexGen
3012 - glthread: add support for glClearNamedFramebuffer, glMaterial,
3013 glPointParameter
3014 - glthread: add support for glCallLists, glPatchParameterfv
3015 - glthread: add support for glMemoryObjectParameteriv,
3016 glSemaphoreParameterui64v
3017 - glthread: don't insert an empty line after (void) cmd;
3018 - glthread: add marshal_call_after and remove custom glFlush and
3019 glEnable code
3020 - glthread: track for each VAO whether the user has set a user pointer
3021 - glthread: sync instead of disabling glthread for non-VBO pointers
3022 - glthread: replace custom glBindBuffer marshalling with generated one
3023 - glthread: merge glBufferData and glNamedBufferData into 1 set of
3024 functions
3025 - glthread: merge glBufferSubData and glNamedBufferSubData into 1 set
3026 of functions
3027 - glthread: add custom marshalling for glNamedBuffer(Sub)DataEXT
3028 - glthread: fix a crash with incorrect glShaderSource parameters
3029 - glthread: fall back if a param size is non-zero and a pointer param
3030 is NULL
3031 - radeonsi: add a bug workaround for NGG - LATE_ALLOC_GS
3032 - ac: add a bug workaround for the 100% NGG culling case
3033 - radeonsi: determine uses_bindless_samplers correctly
3034 - st/mesa: flush the bitmap cache before st/dri and vbo flushes
3035 - st/mesa: fix a possible crash with selection and feedback modes
3036 - gallium/cso_context: remove cso_delete_xxx_shader helpers to fix the
3037 live cache
3038 - st/mesa: keep serialized NIR instead of nir_shader in st_program
3039 - vbo: use vbo_exec_wrap_upgrade_vertex for glVertex in ATTR_UNION
3040 - vbo: fix transitions from glVertexN to glVertexM where M < N
3041 - vbo: fix vbo_copy_vertices for GL_PATCHES and adjacency primitive
3042 types
3043 - gallium: add PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES
3044 - mesa: don't unroll glMultiDrawElements with user indices for gallium
3045 - radeonsi/gfx10: cache metadata in L2 on small chips
3046 - radeonsi: set better tessellation tunables on gfx9 and gfx10
3047 - radeonsi: tune primitive binning for small chips
3048 - ac: add radeon_info::use_late_alloc to control LATE_ALLOC globally
3049 - ac: disable late alloc on small gfx10 chips
3050 - gallium/u_threaded: don't sync the thread for all unsynchronized
3051 mappings
3052 - gallium/u_vbuf: simplify the first if statement in
3053 u_vbuf_upload_buffers
3054 - ac: unify denorm setting enforcement
3055 - ac: set new LLVM denormal flags
3056 - ac: don't set old denormals flags with LLVM >= 11
3057 - nir: fix clip/cull_distance_array_size in
3058 nir_lower_clip_cull_distance_arrays
3059 - mesa: use vbo_attrib_tmp.h to generate display list vertex attrib
3060 functions
3061 - mesa: remove redundant api_loopback functions
3062 - glthread: align the batch buffer to 8 bytes for pointers and doubles
3063 again
3064 - glthread: enable display lists
3065 - glthread: track VAOs created by CreateVertexArrays
3066 - glthread: don't execute any custom VAO and BindBuffer code in the
3067 Core profile
3068 - glthread: remove debug_print_marshal function
3069 - glthread: clean up debug_print_sync code
3070 - glthread: don't declare unmarshal functions as inline
3071 - winsys/radeon: change to 3-space indentation
3072 - driconf: enable glthread for "From The Depths"
3073 - glthread: remove \_mesa_post_marshal_hook, because it's not very
3074 useful
3075 - glthread: simplify printing safe_mul in gl_marshal.py
3076 - glthread: autogenerate prototypes for custom-marshalled functions
3077 - glthread: move buffer functions into glthread_bufferobj.c
3078 - glthread: rename marshal.h/c to glthread_marshal.h and
3079 glthread_shaderobj.c
3080 - mesa: put gl_thread_state inside gl_context to remove pointer
3081 indirection
3082 - glthread: handle buffer unbinding via glDeleteBuffers
3083 - glthread: rename non_vbo helper functions
3084 - glthread: track which vertex array attribs are enabled
3085 - glthread: ignore vertex arrays with user pointers if they're disabled
3086 - glthread: remove the marshal_fail XML attribute
3087 - vbo,gallium: make glBegin/End buffer size configurable by drivers
3088 - ac: fix fast division
3089 - st/mesa: fix use of uninitialized memory due to st_nir_lower_builtin
3090 - glthread: inline SET_func and add -O1 to build
3091 \_mesa_create_marshal_table faster
3092 - glthread: declare marshal and unmarshal functions as non-static
3093 - glthread: compile marshal_generated.c faster by breaking it up into 8
3094 files
3095 - nir: add and gather shader_info::writes_memory
3096 - glsl_to_tgsi: set shader_info::writes_memory
3097 - mesa: allow out-of-order drawing to optimize immediate mode if it's
3098 safe
3099 - radeonsi: enable full out-of-order drawing when
3100 allow_draw_out_of_order is set
3101 - mesa: try to fix the android build
3102 - Move compiler.h and imports.h/c from src/mesa/main into src/util
3103 - mesa: don't use <> for including internal headers
3104 - util: stop including files from mesa/main
3105 - radv: stop including files from mesa/main
3106 - util: don't include p_defines.h and u_pointer.h from gallium
3107 - util: remove duplicated MALLOC_STRUCT and CALLOC_STRUCT
3108 - radeonsi: remove obsolete TODO comment related to compute-based
3109 culling
3110 - radeonsi: fix incorrect ordered_wave_id initialization for
3111 compute-based culling
3112 - radeonsi: set amdgpu-gds-size for mode == 2 of compute-based culling
3113 - radeonsi: always create wait_mem_scratch for compute-based culling
3114 - radeonsi: add num_vbos_in_user_sgprs into the shader cache key
3115 - radeonsi/gfx10: don't use NGG culling if compute-based culling is
3116 used
3117 - radeonsi/gfx10: fix ds.ordered.add intrinsic for compute-based
3118 culling
3119 - radeonsi/gfx10: use correct ACQUIRE_MEM packet for compute-based
3120 culling
3121 - radeonsi/gfx10: fix the wave size for compute-based culling
3122 - radeonsi/gfx10: fix descriptors and compute registers for
3123 compute-based culling
3124 - gallium/u_threaded: call the driver to pin threads to L3 immediately
3125 - st/mesa: add environment variable pin_app_thread for faster glthread
3126 on AMD Zen
3127 - driconf: whitelist more games for glthread
3128 - mesa: optimize initialization of new VAOs
3129 - mesa: don't ever set NullBufferObj in gl_vertex_array_binding
3130 - mesa: don't ever bind NullBufferObj for glBindBuffer targets
3131 - mesa: don't ever bind NullBufferObj to glBindBuffer(Base,Range) slots
3132 - mesa: remove NullBufferObj
3133 - mesa: remove no longer needed \_mesa_is_bufferobj function
3134 - mesa: precompute \_mesa_primitive_restart_index during state changes
3135 - mesa: split \_mesa_primitive_restart_index into a function without
3136 gl_context
3137 - vbo: expose helper function vbo_get_minmax_index_mapped for glthread
3138 - util: move and adjust the vertex upload heuristic equation from
3139 u_vbuf
3140 - st/mesa: fix a crash due to passing a draw vertex shader into the
3141 driver
3142 - ac: out-of-order rasterization is not supported on gfx10
3143 - ac,radeonsi: simplify checking for Navi1x chips
3144 - radeonsi: use pipe_blend_state::max_rt to update fewer blend
3145 registers
3146 - ac: force enable -structurizecfg-skip-uniform-regions for LLVM 11
3147 - ac: update and document fast math flags used by radeonsi
3148 - ac: generate FMA for inexact instructions for radeonsi
3149 - ac: reassociate FP expressions for inexact instructions for radeonsi
3150 - mesa: replace \_NEW_EVAL with vbo_exec_update_eval_maps
3151 - mesa: reset primitive restart state in glClientAttribDefaultEXT
3152 - mesa: remove exec="dynamic" from Draw functions that are not really
3153 dynamic
3154 - glthread: use 32-bit align instead of 64-bit ALIGN
3155 - glthread: reduce dereferences of the next batch
3156 - glthread: use GLenum16 in batch buffers to save space
3157 - glthread: sort variables in marshal structures to pack them optimally
3158 - gallium: add PIPE_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE for glthread
3159 - mesa: add Const.BufferCreateMapUnsynchronizedThreadSafe &
3160 MESA_MAP_THREAD_SAFE
3161 - mesa: add offset_is_int32 param into \_mesa_bind_vertex_buffer for
3162 glthread
3163 - mesa: extend \_mesa_bind_vertex_buffer to take ownership of the
3164 buffer reference
3165 - mesa: replace GLenum target with gl_shader_stage in NewProgram
3166 - ac/surface: rename micro tile mode enums like gfx10 uses them
3167 - ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it's
3168 always set
3169 - ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with
3170 !FORCE_SWIZZLE_MODE
3171 - ac/surface: match get_display_flag() with expectations for
3172 is_displayable
3173 - ac/surface: don't compute DCC if it's unsupported by DCN on gfx9+
3174 - ac/surface: move non-displayable DCC to the end of the buffer
3175 - ac/surface: add code for gfx10 displayable DCC
3176 - ac/surface: validate that DCC is enabled correctly on gfx9+
3177 - ac: enable displayable DCC on Navi12 & Navi14
3178 - mesa: report GL_INVALID_OPERATION for invalid glTextureBuffer target
3179 - st/mesa: expose more SPIR-V capabilities
3180 - radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding
3181 size
3182 - radeonsi: revert an accidental change in si_clear_buffer
3183 - Revert "ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume
3184 it's always set"
3185 - Revert "ac: reassociate FP expressions for inexact instructions for
3186 radeonsi"
3187 - ac/surface: fix MSAA crash with FORCE_SWIZZLE_MODE on gfx9
3188 - radeonsi: fix compilation of monolithic PS
3189 - radeonsi: don't expose 16xAA on chips with 1 RB due to an occlusion
3190 query issue
3191
3192 Marek Vasut (4):
3193
3194 - etnaviv: Destroy rsc->pending_ctx set in etna_resource_destroy()
3195 - etnaviv: Emit PE.ALPHA_COLOR_EXT\* on GPUs with half-float support
3196 - etnaviv: Fix depth stencil ops on GC880/GC2000
3197 - etnaviv: Disable seamless cube map on GC880
3198
3199 Mark Janes (2):
3200
3201 - nir: check shader type before writing to shaderinfo.tess union
3202 - nir: place aligned members after bitfields in shader_info.tess
3203
3204 Mark Menzynski (2):
3205
3206 - util/blob: Add overwrite function for uint8
3207 - tgsi/util: Change boolean for bool
3208
3209 Martin Fuzzey (3):
3210
3211 - freedreno: android: fix build failure on android due to python
3212 version
3213 - freedreno: android: add a6xx-pack.xml.h generation to android build
3214 - freedreno: android: fix build of perfcounters.
3215
3216 Mathias Fröhlich (19):
3217
3218 - egl: Implement getImage/putImage on pbuffer swrast.
3219 - mesa: Fix FLUSH_VERTICES in SubpixelPrecisionBiasNV.
3220 - egl: Fix A2RGB10 platform_{device,surfaceless} PBuffer configs.
3221 - egl: Factor out dri2_add_pbuffer_configs_for_visuals
3222 {device,surfaceless}.
3223 - mesa: Check for OpenGL state change before flushing vertices.
3224 - mesa: Flush vertices before changing the OpenGL state.
3225 - i965: Move down genX_upload_sbe in profiles.
3226 - iris: Move down iris_emit_sbe_swiz in profiles.
3227 - i965: Use 32 bit u_bit_scan for vertex attribute setup.
3228 - i965: Use the VAOs binding information in array setup.
3229 - i965: Test original vertex array pointer to skip array upload.
3230 - i965: Split merge_inputs and clear_buffers.
3231 - i965: Reorder workaround flags computation.
3232 - i965: Remove glbinding from brw_vertex_element.
3233 - mesa: Remove now unused \_mesa_draw_attrib_and_binding.
3234 - mesa: Remove now unused \_mesa_draw_attrib.
3235 - mesa: Provide gl_vertex_format accessors.
3236 - i965: Make use of the vertex format functions in i965.
3237 - i965: Use gl_vertex_format in brw_vertex_element.
3238
3239 Matt Turner (11):
3240
3241 - intel/tools: Do not print type/qualifiers/name for c_literal
3242 - intel/vec4: Make implied_mrf_writes() a vec4_instruction method
3243 - intel/compiler: Remove unnecessary local variables
3244 - intel/compiler: Make instructions_to_schedule a local variable
3245 - intel/compiler: Mark some methods and parameters const
3246 - intel/compiler: Mark visitor parameters to scheduler const
3247 - intel/compiler: Pass backend_shader \* to cfg_t()
3248 - intel/compiler: Pass shader_stats for each SIMD mode
3249 - intel/compiler: Discount NOPs from instruction counts
3250 - isl: Avoid EXPECT_DEATH in unit tests
3251 - meson: Specify the maximum required libdrm in dri.pc
3252
3253 Mauro Rossi (5):
3254
3255 - android: gallium/auxiliary: fix "Unused source files" in tesselator
3256 - android: aco: fix PIPE_FORMAT related building errors
3257 - android: r600/sfn: fix includes and libmesa_nir dependency
3258 - android: r600/sfn: Add GDS instructions
3259 - android: aco: add various compiler statistics
3260
3261 Michel Dänzer (33):
3262
3263 - gitlab-ci: Update to latest ci-templates HEAD
3264 - gitlab-ci: Pass -j4 to make
3265 - gitlab-ci: Merge ccache and libxml2-utils into main apt-get install
3266 - gitlab-ci: Add ppc64el and s390x cross-build jobs
3267 - gitlab-ci: Build radeonsi & RADV in the ppc64el job
3268 - llvmpipe: Bump test timeout to 180 seconds
3269 - gitlab-ci: Only use gstreamer runners for the s390x job for now
3270 - gitlab-ci: Sort random failure softpipe skips
3271 - gitlab-ci: Add three more dEQP-GLES31 tests to softpipe skips
3272 - st/vdpau: Only call is_video_format_supported hook if needed
3273 - winsys/amdgpu: Make local variable r signed
3274 - util: Change os_same_file_description return type from bool to int
3275 - gitlab-ci: Drop "test-" prefix from llvmpipe/softpipe job names
3276 - gitlab-ci: Distribute jobs across more stages
3277 - gitlab-ci: Always name artifacts archive after the job producing it
3278 - gitlab-ci: Don't restrict ppc64el/s390x build jobs to gstreamer
3279 runners
3280 - gitlab-ci: Don't use buster-backports packages by default for
3281 x86_build
3282 - gitlab-ci: Fold scons-swr job into scons job
3283 - gitlab-ci: Move classic driver testing to a new meson-classic job
3284 - llvmpipe: Use uintptr_t for pointer values
3285 - gitlab-ci: Enable more Gallium drivers in meson-i386 job
3286 - gitlab-ci: Restrict s390x/ppc64el jobs to packet runners
3287 - gitlab-ci: Update to current templates
3288 - gitlab-ci: Rename "paths" YAML anchor to "all_paths"
3289 - gitlab-ci/lava: Add needs: for container image to test jobs (again)
3290 - gitlab-ci: Don't require triggering build/test jobs manually
3291 - gitlab-ci: Run merge request pipelines automatically only for Marge
3292 Bot
3293 - gitlab-ci: Use all_paths in .test-manual rules
3294 - gbm/dri: Propagate queryDmaBufModifiers return value
3295 - amd/addrlib: Use enum instead of sparse chars to identify dimensions
3296 - mesa: Skip 3-byte array formats in \_mesa_array_format_flip_channels
3297 - Revert "ac,radeonsi: fix compilations issues with LLVM 11"
3298 - Revert "gallium/gallivm: fix compilation issues with llvm 11"
3299
3300 Mike Blumenkrantz (6):
3301
3302 - zink: set UBO alignments in nir_intrinsic_load_uniform lowering
3303 - zink: remove framebuffer cache
3304 - zink: explicitly unref old fb object when setting new one
3305 - iris: move iris_vtable to iris_screen
3306 - gallium: add pipe cap for scissored clears and pass scissor state to
3307 clear() hook
3308 - iris: handle PIPE_CAP_CLEAR_SCISSORED
3309
3310 Nanley Chery (6):
3311
3312 - isl: Add a module which manages aux resolves
3313 - iris: Use isl_aux_usage_has_fast_clear()
3314 - iris: Use ISL's access preparation functions
3315 - iris: Use isl_aux_state_transition_write()
3316 - i965: Use ISL's access preparation functions
3317 - i965: Use isl_aux_state_transition_write()
3318
3319 Nataraj Deshpande (1):
3320
3321 - dri_util: Update internal_format to GL_RGB8 for
3322 MESA_FORMAT_R8G8B8X8_UNORM
3323
3324 Neha Bhende (2):
3325
3326 - svga: fix size of format_conversion_table[]
3327 - svga: Use pipe_shader_state_from_tgsi to set shader state
3328
3329 Neil Armstrong (4):
3330
3331 - gitlab-ci/lava: fix handling of lava tags
3332 - Revert "ci: Remove T820 from CI temporarily"
3333 - gitlab-ci: add FILES_HOST_URL and move FILES_HOST_NAME into jobs
3334 - gitlab-ci: re-enable mali400/450 and t820 jobs
3335
3336 Neil Roberts (17):
3337
3338 - nir/opcodes: Add nir_op_f2fmp
3339 - glsl: Add support for float16 types in the IR tree
3340 - glsl: Add IR conversion ops for 16-bit float types
3341 - glsl: Add b2f16 and f162b conversion operations
3342 - glsl: Add ir_unop_f2fmp
3343 - glsl/validate: Allow float16 in the expression tree
3344 - glsl/lower_instructions: Use float16 constants when appropriate
3345 - glsl/opt_minmax: Add support for float16
3346 - glsl: Add a method to get precision from a deref instruction
3347 - glsl/hierarchical_visitor: Call leave_callback on leaf nodes
3348 - glsl: Add an IR lowering pass to convert mediump operations to 16-bit
3349 - glsl/standalone: Add an option to lower the precision
3350 - glsl: Add unit tests for the lower_precision pass
3351 - freedreno/ir3: Lower bools to bitsize
3352 - glsl: Inline builtins in a separate pass
3353 - glsl/lower_precision: Lower builtins depending on arguments
3354 - glsl/lower_precision: Use vector.back() instead of vector.end()[-1]
3355
3356 Paulo Zanoni (8):
3357
3358 - intel: fix the gen 11 compute shader scratch IDs
3359 - intel: fix the gen 12 compute shader scratch IDs
3360 - intel/device: bdw_gt1 actually has 6 eus per subslice
3361 - anv: multiply the scratch space by 4 on gen9-10 like iris and i965
3362 - iris: remove hole from struct iris_bo
3363 - iris: remove unnecessary forward declaration
3364 - iris: remove useless bo->gtt_offset assignment
3365 - iris: make BATCH_SZ smaller by BATCH_RESERVED bytes
3366
3367 Peng Huang (1):
3368
3369 - radeonsi: make si_fence_server_signal flush pipe without work
3370
3371 Pierre Moreau (1):
3372
3373 - clover/nir: Check the result of spirv_to_nir
3374
3375 Pierre-Eric Pelloux-Prayer (44):
3376
3377 - radeonsi/ngg: add VGT_FLUSH when enabling fast launch
3378 - radeonsi: test subsampled format in testdma
3379 - format: add format_to_chroma_format
3380 - gallium/video: remove pipe_video_buffer.chroma_format
3381 - gallium/vl: add 4:2:2 support
3382 - radeonsi: fix surf_pitch for subsampled surface
3383 - st/va: enable 4:2:2 chroma format
3384 - st/va: add support YUY2
3385 - radeonsi: remove AMD_DEBUG=sisched option
3386 - omx: fix build with gcc 10
3387 - meson: enable -fno-common by default
3388 - gitlab-ci: rules:changes to test on tested drivers changes
3389 - vdpau: remove bogus assert
3390 - st/mesa: disallow deferred flush if there are multiple contexts
3391 - radeonsi: enable glsl_zero_init for Curse of the Dead Gods
3392 - radeonsi: clarify the conditions when FLUSH_AND_INV_DB is needed
3393 - util/os_file: extend os_read_file to return the file size
3394 - util/u_process: add util_get_process_exec_path
3395 - util/xmlconfig: add new sha1 application attribute
3396 - radeonsi: enable workarounds for YoYo engine based games
3397 - util/u_process: fix Windows build
3398 - nir: update uses_demote flag in discard_to_demote pass
3399 - ac: fix ac_build_is_helper_invocation when postponed_kill is null
3400 - util: fix process_test path
3401 - ddebug: add missing forward declaration
3402 - radeon: fix includes
3403 - radeonsi: switch to 3-spaces style
3404 - radeon: switch to 3-spaces style
3405 - gallium/util: let shader live cache users know if a hit occurred
3406 - radeonsi: dump shader stats when hitting the live cache
3407 - util/xmlconfig: fix sha1 comparison code
3408 - mesa: update pipeline when re-linking a program in use
3409 - gallium/u_threaded: flush batch when hitting mapping limit
3410 - radeonsi: use thread_context::bytes_mapped_limit
3411 - radeonsi: don't assume ctx is always a threaded_context
3412 - radeonsi: skip vs output optimizations for some outputs
3413 - mesa: fix crash in find_value
3414 - gallium/utils: silence strncpy warning
3415 - st/omx: fix gcc warnings
3416 - radeonsi: fix export count
3417 - mesa: add gl_context::ForceIntegerTexNearest
3418 - driconf: add force_integer_tex_nearest option
3419 - radeonsi: don't print gs_copy_shader stats for shaderdb
3420 - amd/addrlib: fix forgotten char -> enum conversions
3421
3422 Plamena Manolova (2):
3423
3424 - intel/compiler: Add support for variable workgroup size
3425 - i965: Implement ARB_compute_variable_group_size
3426
3427 Qiang Yu (35):
3428
3429 - lima: remove definition of lima_is_scanout
3430 - lima: use util_copy_framebuffer_state
3431 - lima: always add texture bo to submit
3432 - lima: remove lima_ctx_buff_va submit flags (v2)
3433 - lima: pass array as parameter to PLBU and VS command macros
3434 - lima: delay add plb buffer to submit when flush
3435 - lima: delay plbu head command generation to flush stage (v2)
3436 - lima: add render target to submit by dirty buffer flags
3437 - lima: add missing resolve check for damage and reload
3438 - lima: move syncobj from lima_submit to lima_context
3439 - lima: merge gp/pp submit
3440 - lima: put hardware related info to lima_gpu.h
3441 - lima: move flush code to lima_submit.c
3442 - lima: pass submit parameter for functions in lima_submit.c (v2)
3443 - lima: add lima_submit_create_stream_bo
3444 - lima: adjust pp_stream to use lima_submit_create_stream_bo
3445 - lima: use lima_submit_create_stream_bo for plbu/vs_cmd and pp_stack
3446 - lima: add lima_submit_get
3447 - lima: make lima_submit one time use drop data (v3)
3448 - lima: track write submits of context (v3)
3449 - lima: move plbu/vs_cmd_array into lima_submit
3450 - lima: move resolve into lima_submit
3451 - lima: move pp_max_stack_size to lima_submit
3452 - lima: move damage_rect into lima_submit
3453 - lima: move clear into submit (v2)
3454 - lima: move framebuffer info to lima_submit
3455 - lima: use per submit dump file
3456 - lima: optional flush submit in lima_clear
3457 - lima: enable multi submit optimization
3458 - lima: move dump check to macro for lima_dump_command_stream_print
3459 - lima: rename lima_submit to lima_job
3460 - lima: fix buffer import with offset
3461 - lima: also check tiled and depth case when import
3462 - lima: set offset when export resource
3463 - panfrost: don't always build bifrost_compiler
3464
3465 Quentin Glidic (1):
3466
3467 - meson: Use dependency.partial_dependency()
3468
3469 Rafael Antognolli (18):
3470
3471 - intel: Load the driver even if I915_PARAM_REVISION is not found.
3472 - intel/tools: Update aubinator_error_decode.
3473 - intel/blorp: Implement GEN:BUG:1605967699.
3474 - iris: Apply the flushes when switching pipelines.
3475 - anv: Wait for the GPU to be idle before invalidating the aux table.
3476 - iris: Split aux map initialization from invalidation.
3477 - iris: Wait for the GPU to be idle before invalidating the aux table.
3478 - intel/isl: Implement D16_UNORM workarounds.
3479 - intel/gen12+: Disable mid thread preemption.
3480 - iris: Enable EXT_depth_bounds_test extension.
3481 - drm-uapi: Update headers from Linux 5.7-rc1.
3482 - i965/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.
3483 - iris/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.
3484 - i965/bufmgr: Add support for MMAP_OFFSET ioctl.
3485 - iris/bufmgr: Add support for MMAP_OFFSET ioctl.
3486 - anv: Add anv_device parameter to anv_gem_munmap.
3487 - anv: Add support for new MMAP_OFFSET ioctl.
3488 - anv: Enable HiZ on multi-layer depth buffers.
3489
3490 Rhys Perry (118):
3491
3492 - aco: fix gfx10_wave64_bpermute
3493 - aco: gfx10_wave64_bpermute reduce op to print_ir
3494 - aco: disable some instruction combining if it could change an exec
3495 operand
3496 - aco: improve SCC handling in some SALU combines
3497 - nir: fix nir_const_value_as_uint bit size in load/store vectorizer
3498 tests
3499 - gitlab-ci: remove load_store_vectorizer from expected s390x test
3500 failures
3501 - aco: add RegisterFile
3502 - aco: add some helpers for filling/testing register ranges
3503 - aco: improve GFX9 1D ddx/ddy assertion
3504 - spirv: improve creation of memory_barrier
3505 - spirv: fix memory_barrier_tcs_patch emission
3506 - aco: keep track of which events are used in a barrier
3507 - aco: fix carry-out size for wave32 v_add_co_u32_e64
3508 - aco: handle v_add_co_u32_e64 in parse_base_offset()
3509 - aco: add new NOP insertion pass for GFX6-9
3510 - aco: improve get_wait_states()
3511 - aco: consider non-hazard writes in handle_raw_hazard_internal
3512 - aco: improve control flow handling in GFX6-9 NOP pass
3513 - aco: only reserve sgprs for vcc if it's used
3514 - aco: fix uninitialized data error in waitcnt pass
3515 - glsl/list: use uintptr_t for exec_node_data()'s subtraction
3516 - aco: add helpers for moving instructions for scheduling
3517 - aco: add helpers for ensuring correct ordering while scheduling
3518 - aco: allow barriers to be skipped during scheduling
3519 - aco: don't stop scheduling at exports
3520 - aco: move some register demand helpers into aco_live_var_analysis.cpp
3521 - aco: add a late kill flag
3522 - aco: set late kill for v_interp_p1_f32 for some APUs
3523 - aco: fix instruction encoding for LS VGPR init bug workaround
3524 - aco: fix operand order for LS VGPR init bug workaround
3525 - nir/gather_info: handle emit_vertex_with_counter
3526 - radv: call nir_shader_gather_info again
3527 - radv/winsys: set has_syncobj_wait_for_submit in the null winsys
3528 - aco: set has_divergent_branch for discards in loops
3529 - aco: handle missing second predecessors at merge block phis
3530 - aco: handle when ACO adds new continue edges
3531 - aco: skip NIR in unreachable merge blocks
3532 - aco: improve check for unreachable loop continue blocks
3533 - aco: emit IR in IF's merge block instead if the other side ends in a
3534 jump
3535 - aco: fix boolean undef regclass
3536 - nir/gather_info: fix per-vertex handling in try_mask_partial_io
3537 - aco: remove dead code in handle_operands()
3538 - aco: implement 64-bit VGPR constant copies in handle_operands()
3539 - aco: look at p_{extract,split}_vector's definitions in
3540 pred_by_exec_mask()
3541 - glsl: fix race in instance getters
3542 - util/u_queue: fix race in total_jobs_size access
3543 - radv: add code for exposing compiler statistics
3544 - aco: add various compiler statistics
3545 - aco: add vmem/smem score statistic
3546 - radv, aco: collect statistics if requested but executables are not
3547 - radv: fix null winsys gpu_info array
3548 - aco: make PhysReg in units of bytes
3549 - aco: add SDWA_instruction
3550 - aco: print and validate opsel
3551 - aco: add emission support for register-allocated sdwa sels
3552 - aco: remove divergence check in sanitize_if()
3553 - aco: zero-initialize Temp
3554 - aco: improve vector optimization with sub-dword vectors
3555 - aco: fix p_extract_vector validation
3556 - aco: improve p_create_vector RA for sub-dword operands
3557 - aco: clear moved operands in get_reg_create_vector()
3558 - aco: fix 1D textureGrad() on GFX9
3559 - aco: implement various 8/16-bit conversions
3560 - aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y
3561 - aco: fix copy statistic for 64-bit vgpr constant copy
3562 - aco: add VOP3P_instruction
3563 - aco: implement sub-dword swaps
3564 - aco: implement 64-bit sgpr swaps
3565 - nir/lower_bit_size: fix lowering of shifts
3566 - nir/lower_bit_size: fix lowering of {imul,umul}_high
3567 - nir/algebraic: don't undo lowering of 8/16-bit comparisons to 32-bit
3568 - aco: decrease the uses of other copy operations after
3569 splitting/removing
3570 - aco: copy-propagate p_create_vector copies of vectors
3571 - aco: remove copy in load_input_from_temps()
3572 - aco: move call to store_output_to_temps in store_ls_or_es_output
3573 earlier
3574 - aco: combine VALU and SALU into various VOP3 instructions
3575 - aco: improve code for 32-bit isign
3576 - aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations
3577 - aco: fix outdated label_vec from p_create_vector labelling
3578 - radv: align buffer descriptor sizes to dword
3579 - radv: allocate larger shader memory slabs if needed
3580 - aco: be more careful about using SMEM for load_global
3581 - aco: add and use RegClass::get() helper
3582 - aco: add emit_load helper
3583 - aco: refactor load_lds to use new helpers
3584 - aco: use emit_load helper for VMEM/SMEM loads
3585 - aco: add helpers for splitting stores
3586 - aco: refactor store_lds() to use new helpers
3587 - aco: refactor store_vmem_mubuf() to use new helpers
3588 - aco: refactor visit_store_ssbo() to use new helpers
3589 - aco: refactor visit_store_global() to use new helpers
3590 - aco: refactor visit_store_scratch() to use new helpers
3591 - aco: add and use get_buffer_store_op() helper
3592 - aco: allow 8/16-bit shared loads
3593 - aco: vectorize global loads/stores
3594 - aco: handle undef p_create_vector operands in the optimizer
3595 - aco: clobber scc in s_bfe_u32 in get_alu_src()
3596 - aco: improve sub-dword emit_split_vector() with sgprs
3597 - aco: lower 8/16-bit integer arithmetic
3598 - radv/aco: enable 8/16-bit storage and int8/int16 on GFX8+
3599 - aco: make RegisterFile::block() take a regclass
3600 - aco: check alignment of non-subdword registers in get_reg_specified()
3601 - aco: fix neighboring register check in get_reg_simple()
3602 - aco: split self-intersecting copies instead of swapping
3603 - aco: don't recurse in sub-dword get_reg_simple()
3604 - aco: improve RA for uneven p_split_vector
3605 - aco: add missing adjust_max_used_regs()
3606 - aco: fix sub-dword out-of-bounds check in RA validator
3607 - aco: fix sub-dword overwrite check in RA validator
3608 - aco: add various GFX10 int16 opcodes
3609 - aco: improve clamped integer addition disassembly workaround
3610 - aco: fix vgpr nir_op_vecn with sgpr operands
3611 - aco: consider blocks unreachable if they are in the logical cfg
3612 - aco: remove use of f-strings
3613 - aco: add message to static_assert
3614 - nir: add missing group_memory_barrier handling
3615 - nir/opt_if: run opt_peel_loop_initial_if after all other
3616 optimizations
3617 - nir: fix lowering to scratch with boolean access
3618
3619 Rob Clark (147):
3620
3621 - freedreno/drm: readonly cmdstream
3622 - freedreno/ir3: shuffle a few ir3_register fields
3623 - freedreno/ir3: cleanup after lower_locals_to_regs
3624 - freedreno/ir3: fix crash when no non-input instructions
3625 - freedreno/ir3: split out delay helpers
3626 - freedreno/ir3: move nop padding to legalize
3627 - freedreno/ir3: move block-scheduling into legalize
3628 - freedreno/ir3: move atomic fixup after RA
3629 - freedreno/ir3: a bit more optmsgs debug
3630 - freedreno/ir3/ra: make use()/def() functions instead of macros
3631 - freedreno/ir3: fix kill scheduling
3632 - freedreno/ir3: post-RA sched pass
3633 - freedreno/ir3: number instructions from one
3634 - freedreno/ir3: add is_tex_or_prefetch()
3635 - freedreno/ir3: don't precolor unused inputs
3636 - freedreno/ir3: two pass register allocation
3637 - freedreno/a6xx: fix lrz overflow
3638 - freedreno/ir3: add RA sanity check
3639 - freedreno/ir3: remove unused tex arg harder
3640 - freedreno/ir3: create fragcoord instructions in input block
3641 - freedreno/ir3: simplify split from collect
3642 - freedreno/ir3: fix a dirty lie
3643 - freedreno: allow ctx->batch to be NULL
3644 - freedreno/ir3: fold const conversion into consumer
3645 - freedreno: allow INVALID modifier
3646 - freedreno/registers: teach gen_header.py about a3xx_regid
3647 - freedreno/a6xx: few register updates
3648 - freedreno: quiet INFO_MSG
3649 - freedreno/registers: cleanup CP_SET_MARKER
3650 - freedreno/computerator: import parser/lexer from fdre-a3xx
3651 - freedreno/computerator: polish out some of the rust
3652 - freedreno/computerator: rename prefix asm->ir3
3653 - freedreno/ir3: allow block->predecessors to be null
3654 - freedreno/computerator: add computerator
3655 - freedreno/computerator: fix build dependency
3656 - freedreno/ir3: remove from_tgsi
3657 - freedreno/a6xx: remove unused param
3658 - freedreno/a6xx: emit LRZ clear in sysmem too
3659 - freedreno/a6xx: whitespace fix
3660 - freedreno/a6xx: don't emit YIELD packet
3661 - freedreno/a6xx: enable SKIP_IB2_ENABLE properly
3662 - freedreno: honor FD_MESA_DEBUG=nogrow
3663 - freedreno/ir3: remove regmask_set_if_not()
3664 - freedreno/ir3: rewrite regmask to better support a6xx+
3665 - freedreno/ir3: don't hide latency when there is none to hide
3666 - freedreno/ir3: track half-precision live values
3667 - freedreno/ir3: update SFU delay
3668 - freedreno/ir3: fix crash with samgq workaround
3669 - freedreno/ir3: don't precolor unassigned inputs
3670 - freedreno/ir3: fix assert with getinfo
3671 - freedreno/ir3: add assert
3672 - nir/print: show variable precision
3673 - freedreno/ir3: also lower lowp frag outputs
3674 - freedreno/computerator: add hrsq/hlog2/hexp2
3675 - freedreno/ir3: remove extra nops inserted in scheduler
3676 - freedreno/ir3: add simplified stall estimation
3677 - freedreno: fix FD_MESA_DEBUG=inorder
3678 - util/ra: spiff out select_reg_callback
3679 - util/ra: move NO_REG to header
3680 - freedreno/ir3: split out has_latency_to_hide()
3681 - freedreno/ir3: fix has_latency_to_hide
3682 - freedreno/ir3: track register usage in first RA pass
3683 - freedreno/ir3: round-robin RA
3684 - freedreno/ir3: try to avoid syncs
3685 - freedreno/computerator: add performance counter support
3686 - freedreno/fdperf: set locale
3687 - freedreno/a6xx: register update
3688 - freedreno/ir3: small cleanup and comments
3689 - freedreno/ir3: add bary_ij as src for meta:tex_prefetch
3690 - freedreno/ir3: remove unused helper
3691 - freedreno/ir3: fix bogus register footprint with tess/gs
3692 - freedreno/ir3: reformat disasm output
3693 - freedreno/ir3: convert debug bitfield to BITFIELD_BIT()
3694 - freedreno/ir3/ra: add debug option for RA debug msgs
3695 - freedreno/ir3/ra: split-up
3696 - freedreno/ir3/ra: add helper to map name to instruction
3697 - freedreno/ir3/ra: fix target register calculation
3698 - freedreno/ir3/ra: add helper to map name to array
3699 - freedreno/ir3/ra: drop extending output live-ranges
3700 - freedreno/ir3/ra: add def/use iterators
3701 - freedreno/ir3/ra: fix array liveranges
3702 - freedreno/ir3/ra: compute register target from liveranges
3703 - freedreno/ir3/ra: pick higher numbered scalars in first pass
3704 - freedreno/ir3/ra: split building regs/classes and conflicts
3705 - freedreno/ir3/ra: re-work a6xx merged register file conflicts
3706 - gitlab-ci: disable vs2019 build
3707 - freedreno: remove some obsolete debug options
3708 - util: fix u_fifo_pop()
3709 - freedreno: add logging infrastructure
3710 - freedreno/a6xx: timestamp logging support
3711 - freedreno: add some initial fd_log tracepoints
3712 - freedreno/a6xx: add some more tracepoints
3713 - freedreno/log: avoid duplicate ts's
3714 - util: move ALIGN/ROUND_DOWN_TO to u_math.h
3715 - freedreno/ir3: fix android build
3716 - freedreno/log: fix build error
3717 - nir: fix definition of imadsh_mix16 for vectors
3718 - freedreno/ir3/cf: handle widening too
3719 - freedreno/ir3: fixup cat3 32b vs 16b
3720 - freedreno/ir3/cf: skip array load/store
3721 - freedreno/ir3: add a pass to collect SSA uses
3722 - freedreno/ir3/cf: use ssa-uses
3723 - freedreno/a6xx: add some compute logging
3724 - freedreno: fix missing locking
3725 - freedreno/ir3: also precompile compute shaders for shaderdb
3726 - freedreno: limit fp16 to frag and compute
3727 - glsl: don't limit fp16 lowering to frag
3728 - nir: add some swizzle helpers
3729 - nir/lower_amul: fix slot calculation
3730 - freedreno/log: android support
3731 - freedreno/log: spiff out parser some more
3732 - freedreno/log: better decoding for multiple chunks per batch
3733 - freedreno/ir3: spiff out disasm a bit
3734 - freedreno/ir3: make falsedep use's optional
3735 - freedreno/ir3: simplify grouping pass
3736 - freedreno/ir3: fix location of inserted mov's
3737 - freedreno/ir3: new pre-RA scheduler
3738 - freedreno/ir3/sched: awareness of partial liveness
3739 - freedreno/ir3/postsched: remove some leftovers
3740 - freedreno/ir3/postsched: avoid moving tex ahead of kill
3741 - freedreno/ir3: add mov/cov stats
3742 - freedreno/ir3/ra: handle array case for SFU select_reg opt
3743 - freedreno/ir3: better cleanup when removing unused instructions
3744 - freedreno/ir3: rename depth->dce
3745 - freedreno/ir3/ra: cleanup some leftovers
3746 - mesa: avoid redundant VBO updates
3747 - mesa/st: avoid u_vbuf for GLES
3748 - gallium: add # of MRT to blend state
3749 - freedreno/computer: add script to test widening/narrowing
3750 - freedreno/ir3/ra: remove unused variable
3751 - freedreno/ir3/ra: use ir3_debug_print helper
3752 - freedreno/ir3/ra: split out helper for array assignment
3753 - freedreno/ir3/ra: only assign array base in first pass
3754 - freedreno/a6xx+tu: rename VSC_DATA/VSC_DATA2
3755 - freedreno: add helper to estimate # of bins per pipe
3756 - freedreno/a6xx: pre-calculate expected vsc stream sizes
3757 - freedreno/log-parser: support to read gzip'd logs
3758 - freedreno: small whitespace fix
3759 - freedreno: don't realloc idle bo's
3760 - freedreno: mark more state dirty when rebinding resources
3761 - freedreno: optimize rebind_resource()
3762 - freedreno: rebind resource in all contexts
3763 - freedreno: rebind_resource() \*before\* bo changes
3764 - freedreno/a6xx: invalidate tex state cache entries on rebind
3765 - freedreno: fix buffer import
3766 - freedreno/ir3: fix indirect cb0 load_ubo lowering
3767 - freedreno: clear last_fence after resource tracking
3768
3769 Rohan Garg (5):
3770
3771 - ci: Split out radv build-testing on arm64
3772 - ci: Drop the git dependency in tracie
3773 - tracie: Switch to using shutil.move for cross filesystem moves
3774 - tracie: Print results in a machine readable format
3775 - tracie: Reformat code to fix indentation
3776
3777 Roland Scheidegger (7):
3778
3779 - gallivm: fix crash with bptc border color sampling
3780 - gallivm: fix crash in emit_get_buffer_size
3781 - gallivm: disable rgtc/latc SNORM accelerated fetches
3782 - gallium/util: Add back (and rename) util_float_to_half implementation
3783 - gallivm: fix rgtc2 format
3784 - gallivm: switch the mask6/mask7 cases for signed rgtc formats
3785 - gallivm: fix stream id fetch
3786
3787 Roman Stratiienko (3):
3788
3789 - panfrost: Align Android makefiles with recent changes
3790 - lima: Add missing source file to Android.mk
3791 - panfrost: Align Android makefiles with recent changes
3792
3793 Sagar Ghuge (13):
3794
3795 - intel/isl: Move get_format_encoding function to isl
3796 - intel/isl: Switch to R8_UNORM format for compatibility
3797 - intel/tools: Handle illegal instruction
3798 - intel/tools: Handle STATE_REG in typed source operand
3799 - intel/tools: Set correct address register file and number in i965_asm
3800 - intel/tools: Add test for address register as source
3801 - intel/tools: Add test for state register as source
3802 - intel/tools: Print c_literals 4 byte wide
3803 - intel/tools: Allow i965_disasm to disassemble c_literal input type
3804 - intel/genxml: Add patch count threshold field on gen12
3805 - intel/compiler: Track patch count threshold
3806 - anv: Set patch count threshold in 3DSTATE_HS
3807 - iris: Set patch count threshold in 3DSTATE_HS
3808
3809 Samuel Iglesias Gonsálvez (2):
3810
3811 - radv: check buffer size in vkCreateBuffer()
3812 - radv: set sparseAddressSpaceSize to RADV_MAX_MEMORY_ALLOCATION_SIZE
3813
3814 Samuel Pitoiset (197):
3815
3816 - aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6
3817 - aco: do not use ds_{read,write}2 on GFX6
3818 - gitlab-ci: disable a630 tests as mesa-cheza is down (again)
3819 - aco: fix waiting for scalar stores before "writing back" data on
3820 GFX8-GFX9
3821 - radv: make sure to not submit any IBs when RADV_FORCE_FAMILY is set
3822 - radv: set the chip name to GCN-NOOP when RADV_FORCE_FAMILY is set
3823 - aco: fix creating v_madak if v_mad_f32 has two sgpr literals
3824 - nir: do not use De Morgan's Law rules for flt and fge
3825 - radv: fix line width range and granularity
3826 - radv: implement VK_EXT_line_rasterization
3827 - radv: remove LLVM sicheduler enable for The Talos Principle
3828 - radv: remove RADV_DEBUG=nosisched and RADV_PERFTEST=sisched
3829 - radv: remove unused RADV_HASH_SHADER_IS_GEOM_COPY_SHADER
3830 - radv: remove unnecessary RADV_DEBUG=nobatchchain option
3831 - docs/new_features: empty the feature list for the 20.1 cycle
3832 - radv: enable shaderStorageImageMultisample on GFX6-GFX7
3833 - radv: enable VK_EXT_sampler_filter_minmax on GFX6
3834 - radv: enable VK_NV_compute_shader_derivatives on GFX6-GFX7
3835 - radv: add a comment about VK_AMD_mixed_attachment_samples on
3836 GFX6-GFX7
3837 - docs/envvars: document RADV_TEX_ANISO
3838 - radv/winsys: add a new flag that requests zerovram allocations
3839 - radv: use RADEON_FLAG_ZERO_VRAM when creating the trace BO
3840 - radv: add the trace BO to the BO list at submit time
3841 - radv: implement a dummy winsys for creating devices without AMDGPU
3842 - ac,radeonsi: add ac_gpu_info::lds_size_per_cu
3843 - ac: add more ac_gpu_info related shader fields
3844 - radv/gfx10: adjust the number of simd per compute unit
3845 - radv/gfx10: adjust SGPRs/VGPRs related info
3846 - radv/gfx10: adjust the LDS size used to compute waves
3847 - radv/gfx10: adjust the number of VGPRs used to compute waves
3848 - radv: make use of ac_gpu_info::max_wave64_per_simd
3849 - radv: fix creating null devices if KHR_display is enabled
3850 - ac/llvm: fix 64-bit fmed3
3851 - ac/llvm: fix 16-bit fmed3 on GFX8 and older gens
3852 - ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens
3853 - ac: add more fields to ac_gpu_info
3854 - ac/registers: add definitions for thread trace
3855 - radv: add a small helper that allows to submit internal CS
3856 - radv: add initial SQ Thread Trace support for GFX9
3857 - radv: emit thread trace markers after every draw/dispatch call
3858 - radv: add initial SQTT files generation support
3859 - radv: allow to capture SQTT traces with
3860 RADV_THREAD_TRACE=<start_frame>
3861 - radv: fix 32-bit build failure in radv_queue_internal_submit()
3862 - radv: fix size of sqtt_file_chunk_asic_info on 32-bit system
3863 - radv/rgp: adjust trace memory/shader clocks to fix frame duration
3864 - radv/sqtt: do not assume that the number of shader engines is 4
3865 - radv/sqtt: update SPI_CONFIG_CNTL.EXP_PRIORITY_ORDER value
3866 - ac/registers: add definitions for thread trace on GFX10
3867 - radv/sqtt: add support for GFX10
3868 - radv: update entrypoints generation from ANV
3869 - ac: rename lds_size_per_cu to lds_size_per_workgroup
3870 - ac: rename vgpr_alloc_granularity to wave64_vgpr_alloc_granularity
3871 - ac: rename min_vgpr_alloc to min_wave64_vgpr_alloc
3872 - aco: fix image load/store with lod and 1D images
3873 - gitlab-ci: build Fossilize in the test image for VK
3874 - gitlab-ci: add Fossilize support to detect compiler regressions
3875 - gitlab-ci: enable building the test image for VK unconditionally
3876 - gitlab-ci: add a job that runs Fossilize on RADV/Polaris10
3877 - radv/winsys: fix missing initializations of shader info in the null
3878 device
3879 - radv/sqtt: fix wrong check in radv_is_thread_trace_complete()
3880 - radv/sqtt: tidy up radv_emit_thread_trace_{start,stop}
3881 - radv/sqtt: add radv_copy_thread_trace_info_regs() helper
3882 - ac/registers: adjust some definitions for thread trace on GFX8
3883 - radv/sqtt: add support for GFX8
3884 - radv/sqtt: abort if SQTT is used on GFX6-GFX7
3885 - ac: add ac_gpu_info::cu_mask to store bitmask of compute units
3886 - radv/rgp: report correct cu_mask info
3887 - radv/rgp: report correct system ram size
3888 - nir/lower_input_attachments: remove bogus assert in
3889 try_lower_input_texop()
3890 - radv/entrypoints: declare a driver internal layer for SQTT
3891 - radv: use device entrypoints from the SQTT layer if enabled
3892 - radv/sqtt: add a helper that emits thread trace userdata markers
3893 - radv: initial implementation of the driver internal layer SQTT
3894 - radv/sqtt: describe begin/end command buffers with user markers
3895 - radv/sqtt: describe draw/dispatch and emit event markers
3896 - radv/sqtt: describe render pass color/depthstencil clears
3897 - radv/rgp: bump the instrumentation spec version to 1
3898 - radv/sqtt: describe pipeline and wait events barriers
3899 - gitlab-ci: add rules:changes for RADV
3900 - radv: do not recursively begin/end render pass for meta operations
3901 - radv: fix 32-bits build (again)
3902 - gitlab-ci: build RADV in meson-i386 to avoid 32-bit build failures
3903 - ac/llvm: add missing optimization barrier for 64-bit readlanes
3904 - radv/sqtt: describe begin/end subpass barriers with user markers
3905 - radv/sqtt: describe layout transitions with user markers
3906 - radv/gfx10: cache metadata in L2 on small chips
3907 - radv: use better tessellation tunables on GFX9+
3908 - radv: tune primitive binning for small chips
3909 - radv: rewrite late alloc computation
3910 - radv: use ac_gpu_info::use_late_alloc
3911 - radv: cleanup occurrences of use_aco everywhere
3912 - radv: remove radv_shader_variant::aco_used
3913 - radv: remove unnecessary LLVM includes
3914 - radv: add llvm_compiler_shader() helper
3915 - gitlab-ci: remove useless 'patch' package in the VK test image
3916 - gitlab-ci: allow deqp-runner to use the maximum number of jobs
3917 - gitlab-ci: do not set the number of deqp-parallel jobs for RADV CTS
3918 - gitlab-ci: bump Vulkan CTS to 1.2.1.0
3919 - radv/sqtt: handle thread trace capture in sqtt_QueuePresentKHR()
3920 - radv: only inject implicit subpass dependencies if necessary
3921 - radv/gfx10: fix required subgroup size with
3922 VK_EXT_subgroup_size_control
3923 - radv/gfx10: fix required ballot size with
3924 VK_EXT_subgroup_size_control
3925 - radv: fix random depth range unrestricted failures due to a cache
3926 issue
3927 - radv: remove wrong assert that checks compute subgroup size
3928 - radv: fix optional pSizes parameter when binding streamout buffers
3929 - radv/winsys: fix wrong PCI ID for Vega10 in the null winsys
3930 - radv/winsys: spoof some values for num_render_backends in the null
3931 winsys
3932 - gitlab-ci: compile fossils with both RADV compiler backends
3933 (LLVM/ACO)
3934 - gitlab-ci: compile fossils with more ASICs
3935 - gitlab-ci: add a new stage for RADV CI
3936 - gitlab-ci: add a bunch of new fossils from the Sascha Vulkan demos
3937 - radv/llvm: fix subgroup shuffle for chips without bpermute
3938 - radv: enable VK_KHR_8bit_storage on GFX6-GFX7
3939 - ac/nir: use llvm.amdgcn.rcp for nir_op_frcp
3940 - ac/nir: use llvm.amdgcn.rsq for nir_op_frsq
3941 - ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()
3942 - nir/algebraic: add fexp2(fmul(flog2(a), 0.5)) -> fsqrt(a) optimization
3943 - aco: only break SMEM clauses if XNACK is enabled (mostly APUs)
3944 - aco: always optimize v_mad to v_madak in presence of literals
3945 - ac/nir: split 8-bit load/store to global memory on GFX6
3946 - ac/nir: split 8-bit SSBO stores on GFX6
3947 - radv/llvm: enable 8-bit storage features on GFX6-GFX7
3948 - ac/nir: split 16-bit load/store to global memory on GFX6
3949 - ac/nir: split 16-bit SSBO stores on GFX6
3950 - radv/llvm: enable 16-bit storage features on GFX6-GFX7
3951 - radv: rename decompress/resummarize depth/stencil functions
3952 - radv: rename extra graphics pipeline decompress/resummarize fields
3953 - radv: cleanup creating the decompress/resummarize pipelines
3954 - radv: remove radv_layout_has_htile() helper
3955 - radv: enable lowering of GS intrinsics for the LLVM backend
3956 - ac,radv: add ac_gpu_info::has_double_rate_fp16
3957 - radv: only expose shaderFloat16 for chips with double rate fp16
3958 - radv: only expose storageInputOutput16 for chips with double rate
3959 fp16
3960 - radv: only expose fp16 control features for chips with double rate
3961 fp16
3962 - radv: only enable TC-compat HTILE for images readable by a shader
3963 - radv: allow TC-compat HTILE with GENERAL outside of render loops
3964 - aco: implement 16-bit nir_op_frexp_sig/nir_op_frexp_exp
3965 - aco: implement 16-bit nir_op_ffract
3966 - aco: implement 16-bit nir_op_fexp2/nir_op_flog2
3967 - aco: implement 16-bit nir_op_ftrunc/nir_op_fround_even
3968 - aco: implement 16-bit nir_op_fsqrt/nir_op_frcp/nir_op_frsq
3969 - aco: implement 16-bit nir_op_ffloor/nir_op_fceil
3970 - aco: implement 16-bit nir_op_fmax/nir_op_fmin
3971 - aco: implement 16-bit nir_op_fabs/nir_op_fneg
3972 - aco: implement 16-bit nir_op_fsub/nir_op_fadd
3973 - aco: implement 16-bit nir_op_fcos/nir_op_fsin
3974 - aco: implement 16-bit nir_op_fmul
3975 - aco: implement 16-bit nir_op_fsat
3976 - aco: implement 16-bit nir_op_fsign
3977 - aco: implement 16-bit nir_op_bcsel
3978 - aco: implement 16-bit nir_op_f2i32/nir_op_f2u32
3979 - aco: implement 16-bit nir_op_ldexp
3980 - aco: implement 16-bit nir_op_fmax3/nir_op_fmin3/nir_op_fmed3
3981 - aco: implement 16-bit comparisons
3982 - aco: implement nir_op_b2f16/nir_op_i2f16/nir_op_u2f16
3983 - aco: fix f2i64/f2u64 with sgprs if the exponent computation overflows
3984 - aco: implement 16-bit nir_op_f2i64/nir_op_f2u64
3985 - aco: fix nir_op_pack_32_2x16_split if one operand is a constant
3986 - radv: add radeon_set_context_reg_rmw() helper
3987 - radv: use RMW packets for updating the maximum sample distance
3988 - aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents
3989 - radv/aco: do not advertise VK_KHR_shader_subgroup_extended_types
3990 - aco: implement nir_op_f2i8/nir_op_f2u8
3991 - aco: fix emitting stream output with tess eval shaders
3992 - radv: do not abort with unknown/unimplemented descriptor types
3993 - radv: fix geometry shader primitives query with ACO on GFX10
3994 - radv: set missing SHARED_VGPR_CNT for NGG VS and ACO
3995 - radv/llvm: fix exporting the viewport index if the fragment shader
3996 needs it
3997 - aco: fix exporting the viewport index if the fragment shader needs it
3998 - nir/lower_int64: lower imin3/imax3/umin3/umax3/imed3/umed3
3999 - nir/opt_algebraic: lower 64-bit fmin3/fmax3/fmed3
4000 - gitlab-ci: add a list of excluded tests for RADV
4001 - radv: make sure to export the viewport index if FS needs it
4002 - radv: simplify checking for Navi1x chips
4003 - radv: adjust the supported subgroup stages
4004 - radv: fix robust_buffer_access if enabled via
4005 VkPhysicalDeviceFeatures2
4006 - gitlab-ci: add lists of expected failures for RADV CI
4007 - ac,radeonsi: fix compilation issues with LLVM 11
4008 - radv: do not expose GTT as device local memory mostly for APUs
4009 - radv: enable FMASK for color attachments only
4010 - radv: remove unused radv_device_memory::map_size field
4011 - radv: track memory heaps usage if overallocation is explicitly
4012 disallowed
4013 - radv: advertise VK_AMD_memory_overallocation_behavior
4014 - ac/llvm: fix nir_texop_texture_samples with NULL descriptors
4015 - aco: fix nir_texop_texture_samples with NULL descriptors
4016 - aco: fix adjusting the sample index with FMASK if value is negative
4017 - radv: handle NULL descriptors
4018 - radv: handle NULL vertex bindings
4019 - radv: advertise VK_EXT_robustness2
4020 - gitlab-ci: add a list of expected failures for FIJI with ACO
4021 - ci: fix reporting the number of unexpected/flakes
4022 - radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed
4023 - radv: don't report error with other vendor DRM devices
4024 - aco: fix 64-bit trunc with negative exponents on GFX6
4025 - radv: limit the Vulkan version to 1.1 for Android
4026 - radv: handle different Vulkan API versions correctly
4027 - radv: update the list of allowed Android extensions
4028
4029 Satyajit Sahu (1):
4030
4031 - st/va: GetConfigAttributes: check profile and entrypoint combination
4032
4033 Simon Ser (1):
4034
4035 - mesa: add support for NV_pixel_buffer_object
4036
4037 Simon Zeni (1):
4038
4039 - mesa: enable GL_EXT_draw_instanced for gles2
4040
4041 Sonny Jiang (1):
4042
4043 - radeonsi: enable EXT_texture_shadow_lod
4044
4045 Szymon Andrzejuk (1):
4046
4047 - virgl: Use align_free for align_malloc allocated buffer
4048
4049 Tapani Pälli (27):
4050
4051 - intel/vec4: fix valgrind errors with vf_values array
4052 - glsl: fix a memory leak with resource_set
4053 - iris: fix aux buf map failure in 32bits app on Android
4054 - mesa: introduce boolean toggle for EXT_texture_norm16
4055 - i965: toggle on EXT_texture_norm16
4056 - mesa/st: toggle EXT_texture_norm16 based on format support
4057 - mesa/st: fix formats required for EXT_texture_norm16
4058 - nir: fix compilation warning on glsl_get_internal_ifc_packing
4059 - iris: toggle on PIPE_CAP_MIXED_COLOR_DEPTH_BITS
4060 - nir/glsl: gather bitmask of images used by program
4061 - iris: use the images_used mask in resolve pass
4062 - intel/compiler: detect if atomic load store operations are used
4063 - iris: provide dummy iris_image_view_aux_usage
4064 - iris: move existing image format fallback as a helper function
4065 - iris: determine aux usage during predraw and state setup
4066 - isl: allow compression for storage images on gen12+
4067 - iris: allow compression conditionally for images on gen12
4068 - glsl: set error_emitted true if type not ok for assignment
4069 - mesa/st: unbind shader state before deleting it
4070 - mesa/st: release variants for active programs before unref
4071 - mesa: remove redundant check
4072 - mesa: remove redundant assignment
4073 - glsl: remove redundant assignment
4074 - glsl: stop processing function parameters if error happened
4075 - mesa/st: initialize all winsys_handle fields for memory objects
4076 - anv: remove assert from GetImageMemoryRequirements[2]
4077 - st/mesa: destroy only own program variants when program is released
4078
4079 Thomas Hellstrom (5):
4080
4081 - svga: Fix banded DMA upload
4082 - svga, winsys/svga: Fix persistent memory discard maps
4083 - svga: Treat forced coherent maps as maps of persistent memory
4084 - gallium/pipebuffer: Use persistent maps for slabs
4085 - winsys/svga: Optionally avoid caching buffer maps
4086
4087 Thong Thai (7):
4088
4089 - Revert "st/va: Convert interlaced NV12 to progressive"
4090 - gallium/auxiliary/vl: fix bob compute shaders for deint yuv
4091 - st/va: remove unneeded code
4092 - st/va/postproc: reallocate interlaced destination buffer
4093 - radeonsi: add 10-bit HEVC encode support for VCN2.0 devices
4094 - radeon: add support for 10-bit HEVC encoding to VCN 2.0
4095 - st/va: add check for P010 and P016 encode/decode support
4096
4097 Timothy Arceri (51):
4098
4099 - glsl: fix gl_nir_set_uniform_initializers() for image arrays
4100 - glsl: fix possible memory leak in nir uniform linker
4101 - glsl: set the correct number of samplers in a shader
4102 - glsl: set the correct number of images in a shader
4103 - glsl: fix resizing of the uniform remap table
4104 - glsl: reset next_image_index count for each shader stage
4105 - glsl: fix sampler index calculation in nir linker
4106 - glsl: add some error checks to the nir uniform linker
4107 - glsl: move nir link uniforms struct defs earlier
4108 - glsl: move add_parameter() earlier in nir link uniforms
4109 - glsl: move get_next_index() earlier in nir link uniforms
4110 - glsl: add name support to nir uniform linker
4111 - glsl: correctly find block index when linking glsl with nir linker
4112 - nir: add glsl_get_internal_ifc_packing() helper
4113 - nir: add glsl_get_std140_base_alignment() helper
4114 - nir: add glsl_get_std140_size() helper
4115 - nir: add glsl_get_std430_base_alignment() helper
4116 - nir: add glsl_get_std430_size() helper
4117 - glsl: add std140 and std430 layouts to nir uniform linker
4118 - glsl: correctly set explicit offsets for struct members
4119 - glsl: find the base offset for block members from unnamed blocks
4120 - glsl: nir linker fix setting of ssbo top level array
4121 - glsl: set ShaderStorageBlocksWriteAccess in the nir linker
4122 - glsl: add support for builtins to the nir uniform linker
4123 - glsl: don't try to assign uniform storage for uniform blocks
4124 - glsl: add subroutine support to nir linker
4125 - glsl: fix varying packing for 64bit integers
4126 - nir: fix packing of TCS varyings not read by the TES
4127 - nir: fix crash in varying packing on interface mismatch
4128 - glsl_to_nir: remove dead code
4129 - radeonsi: don't lower constant arrays to uniforms in GLSL IR
4130 - nir: make opt_if_loop_terminator() less strict
4131 - nir: add matrix_layout to nir_variable data
4132 - glsl: fix struct offsets in the nir uniform linker
4133 - glsl: tidy up uniform storage value count code in NIR linker
4134 - Revert "glsl: fix resizing of the uniform remap table"
4135 - glsl: fix explicit locations for the glsl linker
4136 - glsl: error check max user assignable uniform locations
4137 - glsl: fix block index in NIR uniform linker
4138 - glsl: pull mark_array_elements_referenced() out into common helper
4139 - glsl: only set stage ref when uniforms referenced in stage
4140 - nir/gcm: allow derivative dependent intrinsics to be moved earlier
4141 - nir/gcm: be more conservative about moving instructions from loops
4142 - nir/gcm: don't move movs unless we can replace them later with their
4143 src
4144 - glsl: add bindless support to nir uniform linker
4145 - glsl: fix gl_nir_set_uniform_initializers() for bindless textures
4146 - st/glsl_to_nir: make use of nir linker for linking uniforms
4147 - glsl: some nir uniform linker fixes
4148 - glsl: remove some duplicate code from the nir uniform linker
4149 - glsl: stop cascading errors if process_parameters() fails
4150 - glsl: fix slow linking of uniforms in the nir linker
4151
4152 Timur Kristóf (90):
4153
4154 - aco/optimizer: Don't combine uniform bool s_and to s_andn2.
4155 - radv: Move some helper functions to the radv_shader.h header file.
4156 - aco: Extract setup_gs_variables into a separate function.
4157 - aco: Setup tessellation control shader variables.
4158 - aco: Implement load_tess_coord.
4159 - aco: Implement load_primitive_id for tessellation shaders.
4160 - aco: Implement load_patch_vertices_in.
4161 - aco: Implement load_invocation_id for tessellation control shaders.
4162 - aco: Implement control_barrier for tessellation control shaders.
4163 - aco: Implement memory_barrier_tcs_patch.
4164 - aco: Implement load_view_index for TCS and TES.
4165 - aco: Setup correct HW stages when tessellation is used.
4166 - aco: Use mesa shader stage when loading inputs.
4167 - aco: Remove vertex_geometry_gs assertion from merged shaders.
4168 - aco: Extract LDS alignment calculation to a separate function.
4169 - aco: Remove esgs_itemsize from LDS alignment calculation.
4170 - aco: Introduce new VMEM load/store helpers.
4171 - aco: Introduce new helpers for calculating address offsets.
4172 - aco: Refactor load_per_vertex_input in preparation for tessellation.
4173 - aco: Refactor VS output stores in preparation for tessellation.
4174 - aco: Slight fix to lds_store and lds_load.
4175 - aco: Fix combining DS additions in the optimizer.
4176 - aco: Implement tessellation control shader input/output.
4177 - aco: Store VS outputs correctly when tessellation is used.
4178 - aco: Fix LS VGPR init bug on affected hardware.
4179 - radv: Enable ACO for tessellation control shaders.
4180 - aco: Setup tessellation evaluation shader variables.
4181 - aco: Use TES output info when TES runs on the VS stage.
4182 - aco: Store TES outputs when TES runs on the HW VS stage.
4183 - aco: Enable streamout when TES runs on the HW VS stage.
4184 - aco: Implement loading TES inputs.
4185 - radv: Enable ACO for TES when there is no GS.
4186 - aco: Enable running TES as ES, including merged TES+GS.
4187 - radv: Enable ACO on all stages.
4188 - aco: Don't generate an if when the first part of a merged HS or GS is
4189 empty.
4190 - aco: Store tess factors in VMEM only at the end of the shader.
4191 - aco: Only write TCS outputs to LDS when they are read by the TCS.
4192 - aco: Don't store TCS outputs to LDS when we're sure that none are
4193 read.
4194 - nir: Add ability to lower non-const quad broadcasts to const ones.
4195 - radv: Enable lowering dynamic quad broadcasts.
4196 - radv: Enable subgroup shuffle on GFX10 when ACO is used.
4197 - aco: Create null exports in instruction selection instead of
4198 assembler.
4199 - aco: Extract tcs_driver_location_matches_api_mask to separate
4200 function.
4201 - aco: Fix handling of tess factors.
4202 - aco: Allow combining TCS output VMEM stores.
4203 - aco: Allow combining LDS loads when loading tess factors.
4204 - aco: Skip 2nd read of merged wave info when TCS in/out vertices are
4205 equal.
4206 - aco: Use more optimal sequence at the beginning of merged shaders.
4207 - nir: Collect if shader uses cross-invocation or indirect I/O.
4208 - aco: Treat outputs of the previous stage as inputs of the next stage.
4209 - aco: Change isel inputs/outputs to a flat array.
4210 - aco: Zero-fill undefined elements in create_vec_from_array.
4211 - aco: Extract setup_tcs_info to a separate function.
4212 - aco: Fix workgroup size calculation.
4213 - aco: Extract store_output_to_temps into a separate function.
4214 - aco: When LS and HS invocations are the same, pass LS outputs in
4215 temps.
4216 - aco: Don't store LS VS outputs to LDS when TCS doesn't need them.
4217 - aco: Fix crash in insert_wait_states.
4218 - aco: Extract uniform if handling to separate functions.
4219 - aco: Print block_kind_export_end.
4220 - aco: Extract merged_wave_info_to_mask to its own function.
4221 - aco: Treat s_setprio as a scheduling barrier.
4222 - aco/ngg: Add new stage for hw_ngg_gs.
4223 - aco/ngg: Initialize exec mask for NGG VS and TES.
4224 - aco/ngg: Fix exports for NGG VS and TES.
4225 - aco/ngg: Setup NGG VS and TES stages.
4226 - aco/ngg: Implement NGG VS and TES.
4227 - aco/ngg: Schedule position exports of NGG VS/TES.
4228 - aco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.
4229 - radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.
4230 - aco: Print shader stage in aco_print_program.
4231 - radv: Print shader stage before disassembly.
4232 - radv: Add inputs read by TES to radv_shader_info.
4233 - aco: Only store TCS outputs to VMEM when they are read by TES.
4234 - aco: Increase barrier_count to 7 to include barrier_barrier.
4235 - aco: Abort when RA can't find a register.
4236 - aco: Const correctness for get_barrier_interaction.
4237 - aco: Const correctness for aco_print_ir.
4238 - aco: Use 24-bit multiplication in TCS I/O
4239 - aco: Use 24-bit multiplication for NGG wave id and thread id.
4240 - aco: Move s_setprio to correct place after the gs_alloc_req.
4241 - radv: Refactor calculate_tess_lds_size and get_tcs_num_patches.
4242 - aco: Use context variables instead of calculating TCS inputs/outputs.
4243 - aco: Remember VS/TCS output driver locations.
4244 - aco: Calculate workgroup size of legacy GS.
4245 - aco: Set config->lds_size when TES or VS is running on HW ESGS.
4246 - nir: Add new linking helper to set linked driver locations.
4247 - radv: Use new linking helper to set default driver locations.
4248 - aco: Use new default driver locations.
4249 - radv: Use smaller esgs_itemsize for ACO.
4250
4251 Tobias Jakobi (1):
4252
4253 - meson: Link Gallium Nine with ld_args_build_id
4254
4255 Tomasz Pyra (1):
4256
4257 - gallium/swr: spin-lock performance improvement
4258
4259 Tomeu Vizoso (34):
4260
4261 - panfrost: Print intended field when decoding
4262 - panfrost: Add more info to some assertions
4263 - pan/midgard: Handle nir_intrinsic_load_barycentric_centroid
4264 - panfrost: Use DBG macro to avoid noise in the console
4265 - panfrost: Fix decoding of tiled 3D textures
4266 - panfrost: Only clamp the LOD to disable mipmapping when needed
4267 - gitlab-ci: Switch kernel for LAVA jobs to 5.5
4268 - gitlab-ci: Disable the lima job for now
4269 - gitlab-ci: Run GLES3 tests in dEQP on Panfrost
4270 - panfrost: Remove some more prints to stdout
4271 - gitlab-ci: Move to 5.5 kernel plus fixes for Panfrost
4272 - gitlab-ci: Use PAN_MESA_DEBUG=gles3 for Panfrost
4273 - gitlab-ci: Remove GLES3 test from Panfrost fails list
4274 - gitlab-ci: Skip dEQP-GLES3.functional.shaders.derivate.\*
4275 - gallium: Add forgotten docs for new CAPs related to transform
4276 feedback
4277 - gitlab-ci: Update renderdoc
4278 - gitlab-ci: Use surfaceless platform also for apitrace
4279 - gitlab-ci: Place files from the Mesa repo into the build tarball
4280 - gitlab-ci: Serve files for LAVA via separate service
4281 - gitlab-ci: Disable jobs for Collabora's LAVA lab
4282 - Revert "gitlab-ci: Disable jobs for Collabora's LAVA lab"
4283 - panfrost: Remove most usage of midgard_payload_vertex_tiler
4284 - panfrost: Pass IS_BIFROST to pandecode_jc
4285 - panfrost: Don't emit write_value jobs on Bifrost
4286 - panfrost: On Bifrost, set the right tiler descriptor
4287 - gitlab-ci: Test virgl driver
4288 - panfrost: Clean up the tiler structs for Bifrost a bit
4289 - panfrost: Emit sampler descriptor on bifrost
4290 - panfrost: Emit texture descriptor on bifrost
4291 - gitlab-ci: Update virglrenderer in the x86_test-gl image
4292 - gitlab-ci: Allow test jobs to add options to the dEQP invocation
4293 - gitlab-ci: Test OpenGL ES 3.1 on virgl
4294 - gitlab-ci: Test Virgl with traces
4295 - panfrost: Add Bifrost texture trampoline BO to batch
4296
4297 Uros Bizjak (1):
4298
4299 - doc: Update features.txt for r600 with misc supported features
4300
4301 Vasily Khoruzhick (19):
4302
4303 - lima: handle early-z and pixel kill better
4304 - lima: implement PLB PP stream cache
4305 - lima: add RGBA5551 and RGBA4444 formats
4306 - lima: don't disable tiling if there's linear modifier in list
4307 - lima: gpir: enforce instruction limit earlier
4308 - panfrost: split index cache into shared part
4309 - lima: enable minmax cache for index buffers
4310 - lima: print gp uniforms if gp debug is enabled
4311 - lima/gpir: improve disassembler output
4312 - lima/gpir: print acc ops even if we have only one source
4313 - lima/gpir: kill dead writes to regs in DCE
4314 - lima/gpir: add better lowering for ftrunc
4315 - lima/gpir: fix crash in schedule_insert_ready_list()
4316 - lima: disable Z16 format
4317 - lima: decode depth/stencil write bits in RSW
4318 - lima: split pixel and texel format tables
4319 - lima: add support for R and RG formats
4320 - lima: Implement lima_texture_subdata
4321 - lima: avoid situations when scissor minx > maxx or miny > maxy
4322
4323 Veerabadhran (1):
4324
4325 - radeon/vce: Move global function pointer si_get_pic_param to local
4326 encoder structure. Multi-GPU use case was broken when the function
4327 was global
4328
4329 Vilya Harvey (1):
4330
4331 - zink: Don't set incorrect sType in VkImportMemoryFdInfoKHR struct
4332
4333 Vinson Lee (16):
4334
4335 - swr: Fix build with GCC 10.
4336 - lima: Fix build with GCC 10.
4337 - swr: Fix GCC 4.9 checks.
4338 - panfrost: Remove unused anonymous enum variables.
4339 - meson: Enable -Wno-deprecated only for bison > 2.3.
4340 - swr: Fix non-pod-varargs error.
4341 - st/nine: Fix incompatible-pointer-types-discards-qualifiers errors.
4342 - panfrost: Fix gnu-empty-initializer error.
4343 - util/u_process: Add util_get_process_exec_path for macOS.
4344 - mesa: Change \_mesa_exec_malloc argument type.
4345 - gallivm: Add missing header for powf.
4346 - swr/rasterizer: Use private functions for min/max to avoid namespace
4347 issues.
4348 - swr: Remove Byte Order Mark.
4349 - r600/sfn: Initialize VertexStageExportForGS m_num_clip_dist member
4350 variable.
4351 - r600/sfn: Use correct setter method.
4352 - freedreno: Add missing va_end.
4353
4354 Yevhenii Kolesnikov (1):
4355
4356 - intel/compiler: fix cmod propagation optimisations
4357
4358 Zhang, Boyuan (1):
4359
4360 - radeonsi: Add support for midstream bitrate change in encoder
4361
4362 luc (1):
4363
4364 - zink: fix confused compilation macro usage for zink in target helpers.