git.libre-soc.org Git - mesa.git/log

nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107

currently while insterting barriers, writes and reads to FILE_FLAGS aren't
considered. This can lead to WaR hazards in some situations.

With the previous commit fixes shaders with intstructions like this:
  mad u32 $r2 $r4 $r11 $r2
  mad u32 { $r5 $c0 } $r4 $r10 $r6
  mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0

Affects OpenCL CTS tests on Maxwell+:
basic/test_basic intmath_long
basic/test_basic intmath_long2
basic/test_basic intmath_long4

v2: only put barriers on instructions which actually read flags

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>

nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUse

In the sched data calculator we have to track first use of defs by iterating
over all defs of an instruction, not just the first one.

v2: fix minGRP and maxGRP values

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>

ac/nir: use ordered float comparisons except for not equal

Original patch from Timothy Arceri, I have just fixed the
not equal case locally.

This fixes one important rendering issue in Wolfenstein 2
(the cutscene transition issue).

RadeonSI uses the same ordered comparisons, so I guess that
what we should do as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

android: vulkan/util: add dependency on libnativewindow for O and later

Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building error:

In file included from out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util/vk_enum_to_str.c:26:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found
^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

android: anv: add dependency on libnativewindow for O and later

Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building errors:

In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
^~~~~~~~~~~~~~~~~
1 error generated.
...
In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

android: anv/extensions: fix generated sources build

Building rules are aligned to automake ones

The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py
Generation rules for anv_extensions.c requires --out-c option
Generation rules for anv_extensions.h were missing
Necessary include paths are added to avoid following build errors:

cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c':
No such file or directory

In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
^~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
^~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: dd088d4bec7 ("anv/extensions: Generate a header file with extension tables")
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers

The effect of the last 13 commits on user SGPR counts:

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input

so that it can be removed and replaced with inline VBO descriptors,
and the pointer can be packed in unused bits of VBO descriptors.
This also removes the pointer from merged TES-GS where it's useless.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: set correct num_input_sgprs for VS prolog in merged shaders

We need to take num_input_sgprs from VS, not the second shader.
No apps suffered from this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: don't use struct si_descriptors for vertex buffer descriptors

VBO descriptor code will change a lot one day.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

build: Move wayland-scanner check into platform

Also only check for wayland-scanner if building for the Wayland
platform.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd4d ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211

build: Move wayland-protocols check into platform

In line with wayland-client and wayland-server, move the check for
wayland-protocols into the wayland platform branch.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd4d ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211

vulkan/wsi/wayland: Move Wayland protocol from BUILT_SOURCES

autotools wants to have the BUILT_SOURCES ready as soon as it enters the
directory, even if they are not used. This meant the build failed if
wayland-protocols was not available on the system, even if it was not
enabled.

As BUILT_SOURCES cannot be used in a conditional (cf. 166852ee957f), do
the same thing as EGL and manually encode the dependencies in the
Makefile.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd4d ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211

r600: fix tgsi clock last setting

On cayman this was hitting an assert later, which probably wasn't
see on non-cayman due to having the t slot.

Fixes: 9041730d1 (r600: add support for ARB_shader_clock.)

r600: add time lo/hi debugging output.

This just adds the these to the debug prints.

radeonsi/nir: enable lowering of fpow

Lowering fpow in NIR rather than LLVM can be beneficial.

Polaris results:

Totals from affected shaders:
SGPRS: 124928 -> 124896 (-0.03 %)
VGPRS: 68616 -> 68332 (-0.41 %)
Spilled SGPRs: 394 -> 413 (4.82 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3668912 -> 3658368 (-0.29 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 18575 -> 18593 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

Fixes: d6b753920677 "ac/nir: remove emission of nir_op_fpow"
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

ac: make use of ac_get_llvm_num_components() helper

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info

Seems to have not been used since 16be87c90429

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

radeonsi/nir: fix loading of doubles for tess varyings

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

radeonsi/nir: fix lds store in tcs outputs handling

We were ignoring the channel offset.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

r600: Take ALU_EXTENDED into account when evaluating jump offsets

ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU
clause within a IF-JUMP or ELSE branch is ALU_EXTENDED the target jump
offset needs to be adjusted accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

mesa: Expose EXT_shader_framebuffer_fetch(_non_coherent) on desktop and embedded GL.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glsl: Silence warnings when reading from a framebuffer fetch output.

Framebuffer fetch outputs are implicitly initialized upon entry to the
fragment shader.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glsl: Specify framebuffer fetch coherency mode in lower_blend_equation_advanced().

This requires passing an extra argument to the lowering pass because
the KHR_blend_equation_advanced specification doesn't seem to define
any mechanism for the implementation to determine at compile-time
whether coherent blending can ever be used (not even an "#extension
KHR_blend_equation_advanced_coherent" directive seems to be required
in the shader source AFAICT).

In the long run we'll probably want to do state-dependent recompiles
based on the value of ctx->Color.BlendCoherent, but right now there
would be no benefit from that because the only driver that supports
coherent framebuffer fetch is i965 on SKL+ hardware, which are unable
to support the non-coherent path for the moment because of texture
layout issues, so framebuffer fetch coherency is always enabled for
them.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier.

This allows the application to request framebuffer fetch coherency
with per-fragment output granularity. Coherent framebuffer fetch
outputs (which is the default if no qualifier is present for
compatibility with older versions of the EXT_shader_framebuffer_fetch
extension) will have ir_variable_data::memory_coherent set to true.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent.

EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers
even on GL(ES) 2.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2.

At the same point where it is initialized on GL(ES) 3.0+ so we can
implement some common layout qualifier handling in a future commit.
Until now the fb_fetch_output flag would be inherited from the
original implicit gl_LastFragData declaration at a later point in the
AST to GLSL IR translation.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glsl: Switch ast_type_qualifier to a 128-bit bitset.

This should end the drought of bits in the ast_type_qualifier object.
The bitset_t type works pretty much as a drop-in replacement for the
current uint64_t bitset.

The only catch is that the bitset_t type as defined in the previous
commit doesn't have a trivial constructor (because it has a
user-defined constructor), so it cannot be used as union member
without providing a user-defined constructor for the union (which
causes it in turn to be non-trivially constructible). This annoyance
could be easily addressed in C++11 by declaring the default
constructor of bitset_t to be the implicitly defined one -- IMO one
more reason to drop support for GCC 4.2-4.3.

The other minor change was required because glsl_parser_extras.cpp was
hard-coding the type of bitset temporaries as uint64_t, which (unlike
would have been the case if the uint64_t had been replaced with
e.g. an __int128) would otherwise have caused a build failure, because
the boolean conversion operator of bitset_t is marked explicit (if
C++11 is available), so the bitset won't be silently truncated down to
1 bit in order to use it to initialize the uint64_t temporaries
(yikes).

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

util/bitset: Add C++ wrapper for static-size bitsets.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

util: Add EXPLICIT_CONVERSION macro.

This can be used to specify that a C++ conversion operator is not
meant to be used for implicit conversions, which can lead to
unintended loss of information in some cases. Implemented as a macro
in order to keep old GCC versions happy.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

mesa: Implement glFramebufferFetchBarrierEXT entry point.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

glapi: Update XML for last revision of EXT_shader_framebuffer_fetch.

Desktop GL is now supported, and there is an additional entry-point
for EXT_shader_framebuffer_fetch_non_coherent.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

mesa: Rename MESA_shader_framebuffer_fetch gl_extensions bits to EXT.

The changes I had originally planned for the MESA_shader_framebuffer_fetch
extension have been merged into the EXT spec, there's no point in keeping
MESA_shader_framebuffer_fetch extension enables.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

mesa: Rename dd_function_table::BlendBarrier to match latest EXT spec.

This GL entry point was renamed to glFramebufferFetchBarrier() in the
EXT extension on request from Khronos members. Update the Mesa
codebase to match the latest spec.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

i965: Fix KHR_blend_equation_advanced with some render targets.

This reverts two bogus and seemingly useless changes from the commits
referenced below, which broke KHR_blend_equation_advanced (and
EXT_shader_framebuffer_fetch_non_coherent which wasn't exposed yet)
for any kind of render target surface that would cause the
get_isl_surf() call in brw_emit_surface_state() to do anything useful
(notice how the result of get_isl_surf() is completely ignored by the
caller right now), as was the case while using those extensions with
1D array or 3D framebuffers in particular.

Fixes: f5859b45b1686e8116380d87 "i965/miptree: Switch remaining surfaces to isl"
Fixes: bf24c3539e4b6989512968ca "i965/miptree: Clean-up unused"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>

radeonsi: remove si_descriptors parameter from emit_shader_pointer functions

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: preload the tess offchip ring in TES

so that it's not done multiple times in branches

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs

TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: move 2nd-shader descriptor pointers into s[0:1]

If 32-bit pointers are supported, both pointers can be moved into s[0:1]
and then ESGS has exactly the same user data SGPR declarations as VS.

If 32-bit pointers are not supported, only one pointer can be moved into
s[0:1]. In that case, the 2nd pointer is moved before TCS constants,
so that the location is the same in HS and GS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: change si_descriptors::shader_userdata_offset type to short

We will want to use SH registers outside of user data SGPRs, like the GFX9
special SGPRs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: put both tessellation rings into 1 buffer

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: move tessellation ring info into si_screen

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits

For a later patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

nvir: dont optimize mad with subops to shladd

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

radv: Really use correct HTILE expanded words.

When transitioning to an htile compressed depth format, Set the full
depth range, so later rasterization can pass HiZ. Previously, for depth
only formats, the depth range was set to 0 to 0. This caused unwanted
HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer
(VK_FORMAT_D32_SFLOAT was not affected somehow).

These values are derived from PAL [0], since I can't find the
specification describing the htile values.

[0] https://github.com/GPUOpen-Drivers/pal/blob/5cba4ecbda9452773f59692f5915301e7db4a183/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp#L1500

CC: Dave Airlie <airlied@redhat.com>
CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Fixes: 5158603182fe7435 "radv: Use correct HTILE expanded words."

radv/extensions: fix c_vk_version for patch == None

Similar to cb0d1ba156 ("anv/extensions: Fix VkVersion::c_vk_version for patch == None")
fixes the following building errors:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48:
error: use of undeclared identifier 'None'; did you mean 'long'?
      return instance && VK_MAKE_VERSION(1, 0, None) <= core_version;
                                               ^~~~
                                               long
external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION'
    (((major) << 22) | ((minor) << 12) | (patch))
                                          ^
...
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

Fixes: e72ad05c1d ("radv: Return NULL for entrypoints when not supported.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

broadcom/vc5: Fix layout of 3D textures.

Cube maps are entire miptrees repeated, while 3D textures have each level
have all of its layers next to each other. Fixes tex3d and
tex-miplevel-selection GL2:texture() 3D.

broadcom/vc5: Ignore unused usage flags in is_format_supported.

Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to
match. Just drop the whole retval == usage thing and return early when we
hit a known unsupported case.

Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")

gbm: Fix the alpha masks in the GBM format table.

Once GBM started looking at the values of the alpha masks, ARGB/ABGR
wouldn't match any more because we had both A and R in the low bits.

Fixes: 2ed344645d65 ("gbm/dri: Add RGBA masks to GBM format table")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Daniel Stone <daniels@collabora.com>

mesa: Update vertex processing mode on _mesa_UseProgram.

The change is a bug fix for 92d76a169:
mesa: Provide an alternative to get_vp_mode()
that actually got exposed through 4562a7b0:
vbo: Make use of _DrawVAO from the dlist code.

Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: rename has_core_gs -> has_gs in get_programiv

This is also true for GLES.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl

This is more accurate with respect to the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: add some of missing compatibility support for ARB_bindless_texture

The extension is exposed in the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: expose ARB_enhanced_layouts in the compatibility profile

GLSL 1.40 is required.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: enable OpenGL 3.1 with ARB_compatibility

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: implement ARB_compatibility

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

swr: remove dead LLVM code paths

LLVM requirement was bumped to 4.0.0 with earlier commit.
Hence any code tailored for older versions is now unreachable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>

broadcom/vc4: Remove the retval==usage check in is_format_supported().

This got us into trouble recently, so just remove it entirely.

broadcom/vc4: Add support for YUV textures using unaccelerated blits.

Previously we would assertion fail about having no hardware format. This
is enough to get kmscube -M nv12-2img working.

broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows.

When we set up the shadow resource we were copying the original resource
as the template, including its prsc->next field. When we shadowed the
first YUV plane's resource for linear-to-tiled conversion, we would end up
unbalancing the refcount on the shadow resource's destruction.

broadcom/vc4: Add pipe_reference debugging for vc4_bos.

Trying to track down the YUV EGLImage use-after-free, it helps to see what
the mystery objects are that are being refcounted.

broadcom/vc4: Remove dead vc4_bo_set_reference().

It would be broken if NULL was passed to it anyway, since it wouldn't
participate in screen->bo_handles management.

broadcom/vc4: Use pipe_resource_reference in sampler views.

Improves u_debug_refcount output.

broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride.

This is part of supporting YUV textures -- MMAL will be handing us a
single GEM BO with the planes at offsets within it, and MMAL-decided
stride.

broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported().

We were failing the retval == usage check at the end.

Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")

etnaviv: fix in-place resolve tile count

TS tiles map to a fixed amount of bytes in the color/depth surface,
so the blocksize of the format needs to be taken into account when
calculating the number of tiles to fill.

The simplest fix is to just use the layer stride, which is the surface
size in bytes.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

etnaviv: switch magic single buffer state to "3"

Some of the 16bit formats misrender with missing tiles with the current
"2" state. As all the previously working formats also work with the "3"
state, just always use that one.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

etnaviv: add debug switch to disable single buffer feature

This feature has caused some trouble already. Add a debug switch to
allow users to quickly check if a specific issue is caused by this
feature.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

meson: Fix GL and EGL pkg-config files with glvnd

Currently meson will generate a pkg-config that links to EGL_mesa (or
GLX_mesa), but this isn't correct, it should always link to EGL or GL.
Probably the "right" solution is to have glvnd itself provide the pkg
config files for GL and EGL, but that also means that glvnd needs to
provide many of the header files, which makes it a more involved job.

Fixes: a47c525f3281a27 ("meson: build glx")
Fixes: 035ec7a2bb2d5e4 ("meson: Add support for EGL glvnd")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

egl/dri2: fix segfault when display initialisation fails

dri2_display_destroy() is called when platform specific display
initialisation fails. However, this would typically lead to a
segfault due to the dri2_egl_display vbtl not having been set up.

Fixes: 2db95482964 ("loader_dri3/glx/egl: Optionally use a blit
context for blitting operations")
Signed-off-by: Frank Binns <francisbinns@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

RGB9_E5 should be accepted by RenderbufferStorage if the
EXT_texture_shared_exponent is exposed. It is left to the
implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT
when checking the framebuffer completeness if they do not
support rendering in this format.

Discussed in:
https://github.com/KhronosGroup/OpenGL-API/issues/32

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5

v2: Added more info to the commit message (Antia)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>

etnaviv: npot_tex_any_wrap needs one bit only

Reduces size of struct etna_specs from 100 to 94 bytes.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

vbo: Make use of _DrawVAO from the dlist code.

Finally use an internal VAO to execute display list draws. Avoid
duplicate state validation for display list draws. Remove client arrays
previously used exclusively for display lists.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: Use atomics for shared VAO reference counts.

VAOs will be used in the next change as immutable object across multiple
contexts. Only reference counting may write concurrently on the VAO. So,
make the reference count thread safe for those and only those VAO objects.

v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

vbo: Make use of _DrawVAO from immediate mode draw

Finally use an internal VAO to execute immediate mode draws. Avoid
duplicate state validation for immediate mode draws. Remove client arrays
previously used exclusively for immediate mode draws.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

vbo: Implement tool functions for vbo specific VAO setup.

Correct VBO_MATERIAL_SHIFT value.
The functions will be used next in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: Add flush_vertices to _mesa_bind_vertex_buffer.

We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: Make _mesa_vertex_attrib_binding public.

Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a
flush_vertices argument, and make it publicly available.
The function will be needed later in the series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib.

We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

vbo: Use _DrawVAO for array type draw commands.

Switch over to use the _DrawVAO for all the array type draws.
The _DrawVAO needs to be set before we enter _mesa_update_state, so move
setting the draw method in front of the first call to _mesa_update_state
which is in turn called from the *validate*Draw* calls. Using the
gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode
and gl_vertex_array_object::_AttributeMapMode we can already set
varying_vp_inputs before we call _mesa_update_state the first time.
Thus remove duplicate state validation.

v2: Update comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

vbo: Implement method to track the inputs array.

Provided the _DrawVAO and the derived state that is maintained if we have
the _DrawVAO set, implement a method to incrementally update the array of
gl_vertex_array input pointers.

v2: Add some more comments.
Rename _vbo_array_init to _vbo_init_inputs.
Rename vbo_context::arrays to vbo_context::draw_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: Introduce a yet unused _DrawVAO.

During the patch series this VAO gets populated with either the currently
bound VAO or an internal VAO that will be used for immediate mode and
dlist rendering.

v2: More comments about the _DrawVAO, filter and enabled mask.
Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs.
v3: Fix and move comment.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

vbo: Remove get_vp_mode() and enum vp_mode.

Is now unused.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

vbo: Use _VPMode instead of get_vp_mode().

At those places where we used get_vp_mode() use
gl_vertex_program_state::_VPMode instead.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

mesa: Provide an alternative to get_vp_mode()

To get equivalent information than get_vp_mode(), track the vertex
processing mode in a per context variable at
gl_vertex_program_state::_VPMode.
This aims to replace get_vp_mode() as seen in the vbo module.
But instead of the get_vp_mode() implementation which only gives correct
answers past calling _mesa_update_state() this context variable is
immediately tracked when the vertex processing state is modified. The
correctness of this value is asserted on state validation.

With this in place we should be able to untangle the dependency with
varying_vp_inputs and state invalidation.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>

nv50,nvc0: fix integer MS resolves using 2d engine

We don't want filtering for integer textures, same as depth/stencil.

Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>

nvc0: fix writing query results into buffer

We need to mark the range as valid, and validate the resource using a
helper to ensure that the buffer status is marked properly.

Fixes some CTS pipeline stats query tests, and
KHR-GL45.direct_state_access.queries_functional

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>

nv50,nvc0: fix clear buffer acceleration

Two things were off:
- valid range was not updated, which could affect waiting for future
maps
- fencing was done manually instead of using the *_resource_validate
helper, which resulted in a missed dirty buffer flag being set

Fixes: KHR-GL45.direct_state_access.buffers_clear
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>

i965: perf: ensure reading config IDs from sysfs isn't interrupted

Fixes: 458468c136e "i965: Expose OA counters via INTEL_performance_query"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

radv: Fix autotools build.

Somewhere along the way the Makefile changes got lost ...

Fixes: 4db78f3a6b "radv: Put supported extensions in a struct."
Acked-by: Dave Airlie <airlied@redhat.com>

radv: Return NULL for entrypoints when not supported.

This implements strict checking for the entrypoint ProcAddr
functions.

- InstanceProcAddr with instance = NULL, only returns the 3 allowed
   entrypoints.
- DeviceProcAddr does not return any instance entrypoints.
- InstanceProcAddr does not return non-supported or disabled
   instance entrypoints.
- DeviceProcAddr does not return non-supported or disabled device
   entrypoints.
- InstanceProcAddr still returns non-supported device entrypoints.

Reviewed-by: Dave Airlie <airlied@redhat.com>

radv: Reword radv_entrypoints_gen.py

With a big inspiration from anv as always ...

Reviewed-by: Dave Airlie <airlied@redhat.com>

radv: Track enabled extensions.

Reviewed-by: Dave Airlie <airlied@redhat.com>

radv: Put supported extensions in a struct.

Reviewed-by: Dave Airlie <airlied@redhat.com>

appveyor: Build with MSVC 2015.

The MSVC version we (at VMware) primarily care about from now on is
2015.

See https://ci.appveyor.com/project/jrfonseca/mesa/build/46

We can drop support for building with 2013 in a future commit. I'm not
aware of significant changes in C99/C11 support from MSVC 2013 to 2015,
but there's no point in continuing supporting old MSVC versions when
nobody cares.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

ac/nir: remove emission of nir_op_fpow

fpow is now lowered at NIR level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: enable lowering of fpow to fexp2 and flog2

There is no fpow in hardware, so it's always lowered somewhere,
but it appears that lowering at NIR level is better. Figured while
comparing compute shaders between RadeonSI and RADV.

Polaris10:
Totals from affected shaders:
SGPRS: 18936 -> 18904 (-0.17 %)
VGPRS: 12240 -> 12220 (-0.16 %)
Spilled SGPRs: 2809 -> 2809 (0.00 %)
Code Size: 718116 -> 719848 (0.24 %) bytes
Max Waves: 1409 -> 1410 (0.07 %)

Vega10:
Totals from affected shaders:
SGPRS: 18392 -> 18392 (0.00 %)
VGPRS: 12008 -> 11920 (-0.73 %)
Spilled SGPRs: 3001 -> 2981 (-0.67 %)
Code Size: 777444 -> 778788 (0.17 %) bytes
Max Waves: 1503 -> 1504 (0.07 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a)

Similar for the 4 case.

Suggested by Bas.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>