git.libre-soc.org Git - mesa.git/log

egl/x11: deduplicate depth-to-format logic

Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

i965: enable OES_texture_view for gen8+

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

mesa: changes to expose OES_texture_view extension

Functionality already covered by ARB_texture_view, patch also
adds missing 'gles guard' for enums (added in f1563e6392).

Tested via arb_texture_view.*_gles3 tests and individual app
utilizing texture view with ETC2.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

docs: update release calendar for 18.1 series

v2: extend 18.1 series (Andres)
v3: fix copy/paste typo (Engestrom)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS

Do not lower FS inputs because this moves all load_var
instructions at beginning of shaders and because
interp_var_at_sample (and friends) seem broken. That might
be eventually enabled later on if we really want to preload
all FS inputs at beginning.

Polaris10:
Totals from affected shaders:
SGPRS: 54072 -> 54264 (0.36 %)
VGPRS: 38580 -> 38124 (-1.18 %)
Spilled SGPRs: 652 -> 652 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 2128116 -> 2127380 (-0.03 %) bytes
Max Waves: 8048 -> 8086 (0.47 %)

Vega10:
Totals from affected shaders:
SGPRS: 52616 -> 52656 (0.08 %)
VGPRS: 37536 -> 37116 (-1.12 %)
Spilled SGPRs: 828 -> 828 (0.00 %)
Code Size: 2043756 -> 2042672 (-0.05 %) bytes
Max Waves: 9176 -> 9254 (0.85 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: call nir_split_var_copies() before nir_lower_var_copies()

This doesn't nothing special currently because we don't create
any copy_var instructions, but this is needed for the next patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.

Instead of directly using intel_obj->buffer. Among other things
intel_bufferobj_buffer() will update intel_buffer_object::
gpu_active_start/end, which are used by glBufferSubData() to decide
which path to take. Fixes a failure in the Piglit
ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests,
which could be reproduced with a non-standard glGetTexSubImage
implementation (see bug report).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351
Reported-by: Nanley Chery <nanleychery@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

i965: Handle non-zero texture buffer offsets in buffer object range calculation.

Otherwise the specified surface state will allow the GPU to access
memory up to BufferOffset bytes past the end of the buffer. Found by
inspection.

v2: Protect against out-of-range BufferOffset (Nanley).
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

i965: Move buffer texture size calculation into a common helper function.

The buffer texture size calculations (should be easy enough, right?)
are repeated in three different places, each of them subtly broken in
a different way. E.g. the image load/store path was never fixed to
clamp to MaxTextureBufferSize, and none of them are taking into
account the buffer offset correctly. It's easier to fix it all in one
place.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"

This reverts commit c0ed52f6146c7e24e1275451773bd47c1eda3145. It was
preventing the image format validation from being done on buffer
textures, which is required to ensure that the application doesn't
attempt to bind a buffer texture with an internal format incompatible
with the image unit format (e.g. of different texel size), which is
not allowed by the spec (it's not allowed for *any* texture target,
whether or not there is spec wording restricting this behavior
specifically for buffer textures) and will cause the driver to
calculate texel bounds incorrectly and potentially crash instead of
the expected behavior.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

ac: Use DPP for build_ddxy where possible.

WQM is pretty reliable now on LLVM 7, so let us just use
DPP + WQM.

This gives approximately a 1.5% performance increase on the
vrcompositor built-in benchmark.

v2: Use ac_build_quad_swizzle.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

i965: add {X,A}BGR2101010 to 'intel_image_formats'

This patch adds {X,A}BGR2101010 entries to the list of supported
'intel_image_formats'.

Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format.

Add R10G10B10{A,X}2 translation between mesa_format and DRI format
to driGLFormatToImageFormat() and driImageFormatToGLFormat().

Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

bin/get-pick-listh.sh: force git --pretty=medium

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>

bin/bugzilla_mesa.sh: explicitly set the --pretty argument

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>

docs: drop unnecessary out-of-frame target

I'm guessing an earlier version of the website used to have the page
contents in <frames>, but this isn't the case anymore so just drop the
unnecessary `target="_main"` :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

docs: fix various html tags mistakes

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

docs: fix `<` & `>` used in html code

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

docs: add news notes to 18.1.0

CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>

tgsi/scan: add hw atomic to the list of memory accessing files

This fixes 4 out of 5 cases in:
arb_framebuffer_no_attachments-atomic on cayman.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>

llvmpipe: improve rasterization discard logic

This unifies the explicit rasterization discard as well as the implicit
rasterization disabled logic (which we need for another state tracker),
which really should do the exact same thing.
We'll now toss out the prims early on in setup with (implicit or
explicit) discard, rather than do setup and binning with them, which
was entirely pointless.
(We should eventually get rid of implicit discard, which should also
enable us to discard stuff already in draw, hence draw would be
able to skip the pointless clip and fallback stages in this case.)
We still need separate logic for only null ps - this is not the same
as rasterization discard. But simplify the logic there and don't count
primitives simply when there's an empty fs, regardless of depth/stencil
tests, which seems perfectly acceptable by d3d10.
While here, also fix statistics for primitives if face culling is
enabled.
No piglit changes.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

ac/surface/gfx6: Don't force a tile index for fmask.

The bpe of the fmask often differs from the bpe of the main
surface. On SI that means it has to get a different tile
index.

addrlib is capable of figuring this out itself, so just pass
-1 instead to let it know that it is not preset.

Fixes: 9bf3570fed0 "ac/surface/gfx6: compute FMASK together with the color surface"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106511
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106499
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

i965: Remove ring switching entirely

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965/miptree: Move the access_raw call to the individual map functions

The only function that doesn't need to call access_raw is map_blit. If
it takes the blitter path, it will happen as part of intel_miptree_copy.
If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle
doing whatever resolves are needed. This should save us resolves in
quite a few cases and will probably help performance a bit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965: Remove support for the BLT ring

We still support the blitter on gen4-5 but it's on the same ring as 3D.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965/miptree: Use blorp for blit maps on gen6+

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965/miptree: Use blorp for validation tex copies on gen6+

It's faster than the blitter and can handle things like stencil properly
so it doesn't require software fallbacks.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965: Delete the blitter path for CopyTexSubImage

The blorp path (called first) can do anything the blitter path can do so
it's just dead code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965: Don't fall back to the blitter in BlitFramebuffer

On gen4-5, we try the blitter before we even try blorp. On newer
platforms, blorp can do everything the blitter can so there's no point
in even having the blitter fall-back path.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965: Remove some unused includes of intel_blit.h

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965/blit: Delete intel_emit_linear_blit

This function is no longer used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965: Use meta for pixel ops on gen6+

Using meta for anything is fairly aweful and definitely has more CPU
overhead.  However, it also uses the 3D pipe and is therefore likely
faster in terms of GPU time than the blitter.  Also, the blitter code
has so many early returns that it's probably not buying us that much.
We may as well just use meta all the time instead of working over-time
to find the tiny case where we can use the blitter.  We keep gen4-5
using the old blit paths to avoid perturbing old hardware too much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.

We'd like to start using soft-pin to assign BO addresses up front, and
never move them again.  Our previous plan for dealing with 48-bit VF
cache bugs was to relocate vertex buffers to the low 4GB, so we'd never
have addresses that alias in the low 32 bits.  But that requires moving
buffers dynamically.

This patch tracks the last seen BO address for each vertex/index buffer,
and emits a VF cache invalidate if the high bits change.  (Ideally, we
won't hit this case very often.)  This should work for the soft-pin
case, but unfortunately won't work in the relocation case, as we don't
actually know the addresses.  So, we have to use both methods.

v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple
    more explicitly (suggested by Scott).  Mention "single batch" too
    (suggested by Chris).

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>

i965: Introduce a "memory zone" concept on BO allocation.

We're planning to start managing the PPGTT in userspace in the near
future, rather than relying on the kernel to assign addresses. While
most buffers can go anywhere, some need to be restricted to within 4GB
of a base address.

This commit adds a "memory zone" parameter to the BO allocation
functions, which lets the caller specify which base address the BO will
be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA.

Eventually, I hope to create a 4GB memory zone corresponding to each
state base address.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0

Fixes: d6cd14f2131a5b "i965/fs: Define new shader opcode to..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>

dri3: Stricter SBC wraparound handling

Prevents corrupting the upper 32 bits of draw->recv_sbc when
draw->send_sbc resets to 0 (which currently happens when the window is
unbound from a context and bound to one again), which in turn caused
loader_dri3_swap_buffers_msc to calculate target_msc with corrupted
upper 32 bits. This resulted in hangs with the Xorg modesetting driver
as of xserver 1.20 (older versions and other drivers ignored the upper
32 bits of the target MSC, which is why this wasn't noticed earlier).

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/106351
Tested-by: Mike Lothian <mike@fireburn.co.uk>

radv: fix computation of user sgprs for 32-bit pointers

With 32-bit pointers we only need one user SGPR per desc set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: drop user_sgpr_info::sgpr_count

It's only used inside allocate_user_sgprs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: add support for 32-bit pointers in user data SGPRs

We still use 64-bit GPU pointers for all ring buffers because
llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit
GPU pointers for now. This can be improved later anyways.

Vega10:
Totals from affected shaders:
SGPRS: 1008722 -> 1026710 (1.78 %)
VGPRS: 706580 -> 707136 (0.08 %)
Spilled SGPRs: 22555 -> 22209 (-1.53 %)
Spilled VGPRs: 75 -> 75 (0.00 %)
Code Size: 34819208 -> 35202140 (1.10 %) bytes
Max Waves: 175423 -> 175086 (-0.19 %)

Polaris10:
Totals from affected shaders:
SGPRS: 1029849 -> 1036517 (0.65 %)
VGPRS: 709984 -> 708872 (-0.16 %)
Spilled SGPRs: 22672 -> 22309 (-1.60 %)
Spilled VGPRs: 82 -> 66 (-19.51 %)
Scratch size: 76 -> 60 (-21.05 %) dwords per thread
Code Size: 34915336 -> 35309752 (1.13 %) bytes
Max Waves: 151221 -> 151677 (0.30 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: add set_loc_shader_ptr() helper

This helper will hep for switching to 32-bit GPU pointers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: allocate descriptor BOs in the 32-bit addr space

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: allocate the upload BO in the 32-bit addr space

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: set amdgpu-32bit-address-high-bits LLVM attribute

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv/winsys: allow to allocate BOs in the 32-bit addr space

This introduces a new flag called RADEON_FLAG_32BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv/winsys: request high address

This is needed for 32-bit GPU pointers. Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

i965/glk: Add l3 banks count for 2x6 configuration

2x6 configuration with pci-id 0x3185 has same number of
banks (2) as 3x6 configuration (pci-id 0x3184).

Reported-by: Clayton Craft <clayton.a.craft@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: eb23be1d97da "i965: Add and initialize l3_banks field for gen7+"
Cc: Francisco Jerez <currojerez@riseup.net>

v3d: Include v3d_drm.h path.

Fix build error.

CC v3d_blit.lo
In file included from v3d_blit.c:27:0:
v3d_context.h:39:10: fatal error: v3d_drm.h: No such file or directory
#include "v3d_drm.h"
^~~~~~~~~~~

Fixes: 8a793d42f1cc ("v3d: Switch the vc5 driver to using the finalized V3D UABI.")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>

radv: fix centroid interpolation

It's legal to set the centroid and sample interpolation modes
when MSAA disabled. So, we have to initialize the centroid
inputs because the hardware doesn't.

This fixes rendering issues with DXVK and The Witness, World of
Warcraft, Trackmania and probably more games.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: Cleanup unused prime blit path.

Since we have the common WSI code, we use vkCmdCopyImageToBuffer
instead.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

radv: Fix SRGB compute copies.

SRGB stores are broken. We had compensation code in the
resolve path but none in the copy path. Since we don't
want any conversion and it does not matter for DCC,
just make everything UNORM instead.

This happened to cause wrong colors for the PRIME path, as
that uses image->buffer copies which always use the compute
path.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587
Reviewed-by: Dave Airlie <airlied@redhat.com>

android: enable VK_ANDROID_native_buffer

Patch changes entrypoints generator to not skip this extension even
though it is set as disabled in the xml. We also need compilation
flag VK_USE_PLATFORM_ANDROID_KHR to be enabled.

It looks like this extension got disabled in commit 69f447553c.

v2: just remove the whole 'supported' attrib check + remove
vk_icd.h compilation fix (fix in VulkanHeaders instead)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

vulkan: update vk_icd.h to current upstream

Import from commit eb0c1fd on branch 'master'
of https://github.com/KhronosGroup/Vulkan-Headers.git.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>

virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.

The host side hasn't got support for this feature yet, so don't enable it
unless we get the caps from the host.

This makes the texture buffer range piglit tests skip now.

Fixes: fe0647df5a7 (virgl: add offset alignment values to to v2 caps struct)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>

mesa: stop hiding query parameters from OpenGL compat

Just let the extension detection do its job as we will be adding
compat profile support in future, also we want these to work
with compat profile version overrides.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

radv: fix VK_EXT_descriptor_indexing

GetPhysicalDeviceProperties2KHR() was crashing because features was null

Fixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing."
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

ac/surface: Only align linear power of two fmt textures.

We're not sharing 32_32_32 formats between different GPUs, so we
do not have to align for vega on pre-vega cards.

Fixes: e361970ed73 "radv: Add support for IMG_DATA_FORMAT_32_32_32."
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

amd/addrlib: Use defines in autotools build.

Otherwise stuff like NDEBUG would not be passed through.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106479
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

r600/compute: Mark several functions as static

They're not used anywhere else, so keep them private

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>

r600/compute: Remove unused compute_memory_pool functions

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>

draw: get rid of special logic to not emit null tris

I've confirmed after 77554d220d6d74b4d913dc37ea3a874e9dc550e4 we no
longer need this to pass some tests from another api (as we no longer
generate the bogus extra null tris in the first place).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

docs: Add sha sums for release

docs: Add release notes for 18.1.0

nir: Implement optional b2f->iand lowering

This pass is required by the Midgard compiler; our instruction set uses
NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
Normally, this lowering pass would be implemented in a backend-specific
algebraic pass, but this conflicts with the existing iand->b2f pass in
nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
existing pass optional (default on -- all other backends should remain
unaffected), adding an optional pass for lowering the opposite
direction.

v2: Defer lowering until late algebraic optimisations to allow
optimising the b2f instruction itself.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

travis: Adapt to radeonsi dropping support for LLVM 4

meson Vulkan, Clover, and autotools Vulkan need to be switched to llvm 5

Fixes: f9eb1ef870eba9fdacf9a8cbd815ec3bff81db05
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

radeonsi: skip ES output stores for undefined output components

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

i965: isl: Move the MCS gen7+ assertion into ISL

This is useful for every user of ISL. Drop the comment along the way to
match similar functions in ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>

i965/miptree: Remove format assertion in alloc_aux

intel_miptree_supports_{ccs,mcs,hiz} ensures the format is valid for the
color or depth miptree before the miptree is assigned an aux_usage.
alloc_aux switches on the aux_usage so don't assert that the format is
valid.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>

i965/miptree: Simplify the switch in supports_ccs

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

i965: Make get_ccs_surf succeed in alloc_aux

Synchronize the requirements listed in isl_surf_get_ccs_surf with
intel_miptree_supports_ccs by importing a restriction from ISL. Some
implications:
* We successfully create every aux_surf in alloc_aux
* We only return false from alloc_aux if we run out of memory

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>

llvmpipe: fix check for a no-op shader

The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders.
Fix the code to check for num_instructions <= 1 instead.

Fixes: 8fde9429c36b75 ("tgsi: fix incorrect tgsi_shader_info::num_tokens
computation")
Tested-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

radv: pass radv_nir_compiler_options directly to create_llvm_function()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

st/mesa: only define GLSL 1.4 for compat if driver supports it

Currently GLSL 1.4 is defined for all gallium drivers even only
GLSL 1.2 is supported as seen on etnaviv.

v1 -> v2:
- use _min(..) as suggested by Lucas Stach and Michel Dänzer

Fixes: 4560aad780b ("mesa: add GLSLVersionCompat constant")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

vbo: remove MaxVertexAttribStride assert check.

Some drivers (virgl) don't support GL4.4 or GLES3.1 yet,
so never fill in this const.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>

mesa: drop GL_EXT_polygon_offset support

glPolygonOffset() has been part of the GL standard since 1.1. Also
niether AMD or Nvidia support this in their binary drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61761

tgsi: fix incorrect tgsi_shader_info::num_tokens computation

We were incrementing num_tokens in each loop iteration while parsing
the shader. But each call to tgsi_parse_token() can consume more than
one token (and often does). Instead, just call the tgsi_num_tokens()
function.

Luckily, this issue doesn't seem to effect any current users of this
field (llvmpipe just checks for <= 1, for example).

Reviewed-by: Neha Bhende<bhenden@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

radv: add radv_emit_shader_pointer() helper

For future work (support for 32-bit GPU pointers).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: add some helpers for cleaning up radv_get_preamble_cs()

Because this function looks a bit ugly to me.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

amd: remove support for LLVM 4.0

It doesn't support GFX9.

Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

docs: update calendar, add news and link release notes to 18.0.4

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

docs: add sha256 checksums for 18.0.4

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 69ef6e4a75255e60a4c4a2419d03c9352b9eb8f2)

docs: add release notes for 18.0.4

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3b49ab6219790c341ffb78a6eeaaa8b1a4b29bcc)

mesa: The glArrayElement api is independent of the current program.

All the shader program dependent handling is done on the level
of the gl_Context::Array._DrawVAO/_DrawVAOEnabledAttribs.
So, skip array element invalidation on _NEW_PROGRAM.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>

mesa: Flag _NEW_ARRAY only if we are changing ctx->Array.VAO.

For the VAO internal helper functions that may be called
with a non current VAO, flag the _NEW_ARRAY state only
if it is the current ctx->Array.VAO.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>

mesa: Remove flush_vertices argument from VAO methods.

The flush_vertices argument is now unused, remove it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>

mesa: Remove FLUSH_VERTICES from VAO state changes.

Pending draw calls on immediate mode or display list calls do
not depend on changes of the VAO state. So, remove calls to
FLUSH_VERTICES and flag _NEW_ARRAY as appropriate.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>

docs: add 18.0.5 in the release calendar

Mesa 18.1 series has not been released yet, so let's extend 18.0 lifetime.

v2: Add missing closing TR tags (Eric Engestrom)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>

swr/rast: Added FEClipRectangles event

and also added some comments

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>

swr/rast: Whitespace and tab-to-spaces changes

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>

swr/rast: fix VCVTPD2PS generation for AVX512

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>

swr/rast: Rectlist support for GS

Add rectlist as an option for GS. Needed to support some driver
optimizations.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>

swr/rast: Remove unneeded virtual from methods

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>

broadcom/vc4: Native fence fd support

With the syncobj support in place, lets use it to implement the
EGL_ANDROID_native_fence_sync extension. This mostly follows previous
implementations in freedreno and etnaviv.

v2: Drop the flags (Eric)
    Handle in_fence_fd already in job_submit (Eric)
    Drop extra vc4_fence_context_init (Eric)
    Dup fds with CLOEXEC (Eric)
    Mention exact extension name (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

broadcom/vc4: Store job fence in syncobj

This gives us access to the fence created for the render job.

v2: Drop flag (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

broadcom/vc4: Detect syncobj support

We need to know if the kernel supports syncobj submission since otherwise
all the DRM syncobj calls fail.

v2: Use drmGetCap to detect syncobj support (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

broadcom/vc4: Bump libdrm requirement

Require a version of libdrm with syncobj support.

v2: Don't require a libdrm_vc4, just bump core libdrm if vc4 enabled (by
anholt)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

drm-uapi: Update vc4 header with syncobj submit support

v2: Synchronized with kernel v2
v3: Update for the finalized kernel ABI (pad2 field)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

broadcom/vc4: Drop libdrm_vc4 requirement

This was missed in the move back to the local uapi copy.
libdrm_vc4 only seems to consist of headers that also exist in the
Mesa tree.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

v3d: Add support for glSampleMask / glSampleCoverage.

v3d: Enable NaN propagation in the VS and CS as well.

Fixes piglit vs-isnan-*.shader_test at the expense of gl-1.0-spot-light.

i965/blorp: Disable BLORP clear color updates

With the previous patches, we now update the indirect clear color buffer
every time the clear color changes. Avoid redundant updates.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>