git.libre-soc.org Git - mesa.git/log

egl/android: Use per surface out fence

Use the plumbing introduced with previous patch to interact with the
Android framework.

Namely: currently we use an invalid fd of -1 for our calls to
ANativeWindow::{queue,cancel}Buffer.

At the same time applications (like flatland) may rely on it being
a valid one. Thus as they attempt to query the timestamp of the fence,
they get unexpected results/behaviour.

In the case of flatland - the benchmark hang inside getSignalTime().

Make use of the out fence and pass the correct fd to Android.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655
Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
[Emil Velikov: split from larger patch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

egl: Allow creation of per surface out fence

Add plumbing to allow creation of per display surface out fence.

This can be used to implement explicit sync. One user of which is
Android - which will be addressed with next commit.

Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
[Emil Velikov: reorder so there's no intermetent regressions, split]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

egl: Wrap dri3 surface primitive around dri2 egl surface

Originally dri3 egl surface was wrapped around _EGLSurface.

With next commit we'll add additional attributes, which will be checked
from generic code. Thus in order to access that we need to use
dri2_egl_surface.

The name of the latter is a misnomer - it should really be dri or
dri_common...

Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
[Emil Velikov: commit message, squash the patches appropriately, add
relevant _eglInitSurface hunk to prevent build breakage]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Scons: Add LLVM 5.0 support

1 new required library - LLVMBinaryFormat

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org=/show_bug.cgi?id=3D102318
Signed-off-by: Alexandru-Liviu Prodea <liviuprodea@yahoo.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

radv: replace conditional compilation with MAYBE_UNUSED

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

glsl: replace conditional compilation with MAYBE_UNUSED

Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

broadcom/vc4: Fix use-after-free when deleting a program.

By leaving the compiled shader in the context's stage state, the next
compile of a new FS would look in the old compiled FS for figuring out
whether to set various dirty flags for the VS compile. Clear out the
pointer when deleting the program, and make sure that we always mark the
state as dirty if the previous program had been lost. Fixes valgrind
warnings on glsl-max-varyings.

Fixes: 2350569a78c6 ("vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far.")

i965: Fix batch map failure check in INTEL_DEBUG=bat handling.

I originally wrote the code to call the maps 'batch' and 'state',
until I remembered that 'batch' is the intel_batchbuffer struct pointer.
The NULL check was still using the wrong variable.

Caught by Coverity.

CID: 1418109

broadcom/vc4: Fix crashes since the gallium blitter reworks.

Even if we're not clearing color, the blitter has started dereferencing
the color value.

broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.

The blitter will bind just the depth buffer, which flushes the current job
if we had both a color and depth/stencil. If the clear was doing partial
depth/stencil (quad-based) and color (tile-based), we'd go on to try to
set up the rest of the tile clear in the now flushed job.

Instead, move the partial clear up before we start setting up the job for
the current FBO state, and re-fetch the job if we're continuing on to a
tile-based clear. Fixes valgrind failures in fbo-depthtex.

Fixes: 9421a6065c4e ("vc4: Fix fallback to quad clears of depth in GLX.")

broadcom/vc4: Fix use-after-free for flushing when writing to a texture.

I was trying to continue the hash table loop, not the inner loop. This
tended to work out, because we would have *just* freed the job struct.
Fixes some valgrind failures in fbo-depthtex.

Fixes: f597ac396640 ("vc4: Implement job shuffling")

ttn: Fix out-of-bounds accesses since the always-2D-constants change.

Only one of the three checks for dim was updated, so we would try to set a
UBO buffer index source value on a nir_load_uniform, and wouldn't actually
declare non-UBO uniforms.

Fixes: 37dd8e8dee1d ("gallium: all drivers should accept two-dimensional constant buffer indexing")
Tested-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

anv/android: Disable surface and swapchain extensions (v2)

Android's Vulkan loader implements VK_KHR_surface and VK_KHR_swapchain,
and applications cannot access the driver's implementation. Moreoever,
if the driver exposes the those extension strings, then tests
dEQP-VK.api.info.instance.extensions and dEQP-VK.api.info.device fail
due to the duplicated strings.

v2: Replace !ANDROID with ANV_HAS_SURFACE. (for jekstrand)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>

anv: Feed vk_android_native_buffer.xml to generators (v2)

Feed the XML to anv_extensions.py and anv_entrypoints_gen.py.
Do it on all platforms, not just Android. Tested on Android and Fedora.

We always parse the Android XML, regardless of target platform, to
help reduce the chance that people working on non-Android break the
Android build.

v2:
- Squash in Tapani's changes to Android.*.mk.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)

anv: Teach generator scripts how to parse mutliple XML files

The taught scripts are anv_extensions.py and anv_entrypoints_gen.py. To
give a script multiple XML files, call it like so:

anv_extensions.py --xml a.xml --xml b.xml --xml c.xml ...

The scripts parse the XML files in the given order.

This will allow us to feed the scripts XML files for extensions that are
missing from the official vk.xml, such as VK_ANDROID_native_buffer.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

vulkan/registry: Feed vk_android_native_buffer.xml to gen_enum_to_str.py

Tested on Android and Fedora.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

vulkan/util: Teach gen_enum_to_str.py to parse mutliple XML files

To give the script multiple XML files, call it like so:

gen_enum_to_str.py --xml a.xml --xml b.xml --xml c.xml ...

The script parses the XML files in the given order.

This will allow us to feed the script XML files for extensions that are
missing from the official vk.xml, such as VK_ANDROID_native_buffer.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

vulkan/registry: Add VK_ANDROID_native_buffer

The VK_ANDROID_native_buffer extension is missing from the official
vk.xml. This patch defines the extension in a separate, minimal XML
file: vk_android_native_buffer.xml.

I chose to add the extension to a new XML file instead of adding it to
the official vk.xml in order to avoid conflicts each time we sync the
vk.xml from Khronos.

This should be only a temporary solution until Jesse Hall is persuaded
to add it to the official vk.xml.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

vulkan: Add #ifdef hack to vk_android_native_buffer.h

This patch consolidates many potential `#ifdef ANDROID` messes
throughout src/vulkan and src/intel/vulkan into a simple, localized
hack. The hack is an `#ifdef ANDROID` in vk_android_native_buffer.h
that, on non-Android platorms, avoids including the Android platform
headers and typedefs any Android-specific types to void*.

This hack doesn't remove *all* the `#ifdef ANDROID`s in upcoming
patches, but it does remove a lot.

I first tried implementing VK_ANDROID_native_buffer without this hack,
but eventually gave up when the yak shaving became too much.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

vulkan: Import vk_android_native_buffer.h

Just as Mesa imports the Khronos Vulkan headers, it should import this
Android-private Vulkan header too. This guarantees that Mesa will
continue to build even when upstream Android breaks header
compatibility.

This header is only for *implementers* of Vulkan, not for consumers of
Vulkan.

Imported from tag 'android-7.1.1_r28' in aosp/frameworks/native.

References: https://android.googlesource.com/platform/frameworks/native/+/android-7.1.1_r28/vulkan/include/vulkan/vk_android_native_buffer.h
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

i965: Use prepare_external instead of make_shareable in setTexBuffer2

The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT
which has tighter restrictions than just "it's shared".  In particular,
it says that any rendering to the image while it is bound causes the
contents to become undefined.  This means that we can do whatever aux
tracking we want between glxBindTexImageEXT and glxReleaseTexImageEXT so
long as we always transition from external in Bind and to external in
Release.

The fact that we were using make_shareable before was a problem because
it would resolve away 100% of the aux data and then throw away our
reference to the aux buffer.  If the aux data was shared with some other
application (i.e. if we're using I915_FORMAT_MOD_Y_TILED_CCS) then we
would forget that the aux data even existed for the rest of eternity.
This is fine for the first frame but any subsequent calls to
glxBindTexImageEXT would bind the texture as if it has no aux
whatsoever and no resolves would happen and texturing would happen as if
there is no aux.  This was causing rendering corruption in mutter when
running on top of X11 with modifiers.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>

i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2

The old code made a new miptree that referenced the same BO as the
renderbuffer and just trusted in the memory aliasing to work.  There are
only two ways in which the new miptree is liable to differ from the one
in the renderbuffer and neither of them matter:

1) It may have a different target.  The only targets that we can ever
    see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
    and the difference between the two doesn't matter as far as the
    miptree is concerned; genX(update_sampler_state) only looks at the
    gl_texture_object and not the miptree when determining whether or
    not to use normalized coordinates.

2) It may have a very slightly different format.  Again, this doesn't
    matter because we've supported texture views for quite some time so
    we always look at the gl_texture_object format instead of the
    miptree format for hardware setup anyway.

On the other hand, because we were recreating the miptree, we were using
intel_miptree_create_for_bo which doesn't understand modifiers.  We
really want this function to work without doing a resolve so long as you
have modifiers so we need to fix that.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>

i965: Reset miptree aux state on update_image_buffer

When we get a miptree in through glxBindImageEXT, we don't know the
current aux state so we have to assume the worst-case. If the image
gets recreated, everything is fine because miptreecreate_for_dri_image
sets it to the default. However, if our miptree is recycled, then we
may have stale aux_usage and we need to reset to the default otherwise
our aux_state tracking will get messed up.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>

intel/isl: Add a drm_modifier_get_default_aux_state helper

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>

i965: Warn for GTT fallbacks when mapping the batch/state buffers.

This shouldn't really happen in practice, but I hit it a couple of times
when running a driver with a bad memory leak. We may as well hook up
the warning, because if it ever triggers, we'll know something is wrong.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Plumb brw through to intel_batchbuffer_reset().

We'll want to pass this to brw_bo_map in a moment.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

radeonsi: reallocate if a non-sharable textures is being shared

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: PIPE_BIND_SHARED should allow inter-process sharing

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

freedreno: compile fix

Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Reported-by: Jan Vesely <jan.vesely@rutgers.edu>

clover: add missing include to compat.h

Fixes build issues with llvm-3.6
Fixes: 3115687f9b9830417c408228db2bc679e346bba6 (clover: Fix build after
LLVM r313390)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

clover: Query and export half precision support

v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16
has_halfs -> has_halves

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

gallium: Add PIPE_SHADER_CAP_FP16

Denotes native half precision float operations capability
v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16
fix indentation

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

anv: Implement VK_KHR_image_format_list

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

anv: Implement VK_KHR_bind_memory2

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

nvc0: fix compile error

Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Benedikt Schemmer <ben@besd.de>
Previously-pointed-out-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

radeonsi: allow out-of-order rasterization in commutative blending cases

We do not enable this by default for additive blending, since it slightly
breaks OpenGL invariance guarantees due to non-determinism.

Still, there may be some applications can benefit from white-listing
via the radeonsi_commutative_blend_add drirc setting without any real
visible artifacts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

radeonsi: add drirc option "radeonsi_assume_no_z_fights"

This option enables a performance optimization where typical non-blending
draws with depth buffer may be rasterized out-of-order (on VI+, multi-SE
chips).

This optimization can lead to incorrect results when an applications
renders multiple objects with the same Z value at the same pixel, so we
will never enable it by default. But there may be applications that could
benefit from white-listing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUs

This does not take commutative blending into account yet.

R600_DEBUG=nooutoforder disables it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

gallium/radeon: pass old_(perfect_)enable to set_occlusion_query_state

The callee can derive the current enable state itself.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE

To be able to properly distinguish between GL_ANY_SAMPLES_PASSED
and GL_ANY_SAMPLES_PASSED_CONSERVATIVE.

This patch goes through all drivers, having them treat the two
query types identically, except:

1. radeon incorrectly enabled conservative mode on
PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only
on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE.
2. st/mesa uses the new query type.

Fixes dEQP-GLES31.functional.fbo.no_attachments.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

amd/common: add workaround for cube map array layer clamping

Fixes dEQP-GLES31.functional.texture.filtering.cube_array.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

amd/common: remove has_ds_bpermute argument from ac_build_ddxy

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

amd/common: add chip_class to ac_llvm_context

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

amd/common: round cube array slice in ac_prepare_cube_coords

The NIR-to-LLVM pass already does this; now the same fix covers
radeonsi as well.

Fixes various tests of
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

radeonsi: workaround for gather4 on integer cube maps

This is the same workaround that radv already applied in commit
3ece76f03dc0 ("radv/ac: gather4 cube workaround integer").

Fixes dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i/ui.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

st/glsl_to_tgsi: fix theoretical memory leak

It can't *really* happen since we don't use subroutines.

CID: 1417491
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>

i965: emit BRW_NEW_AUX_STATE on aux state changes

Fixes a regression introduced with b96313c0e1289b296d7, which removed
BRW_NEW_BLORP for a bunch of SURFACE_STATE setup code, including render
targets, on the basis that blorp invalidates binding tables but not
surface states, however, at least on Broadwell, this caused a regression
in a CTS test, which Ken and Jason tracked down to the fact that we
are not uploading new render target surface states after allocating
new CCS_D surfaces for fast clears (which allocation is deferred until
an actual clear occurs).

The reason this only fails in BDW is that on SKL+ we use CCS_E which
is allocated up front so it exists in the initial surface state, the
problem can be reproduced in these platforms too if we use
INTEL_DEBUG=norcb to force the CCS_D path.

This patch, together with the ones preceding it, fixes the regression
by ensuring that we track and flag as dirty all aux state changes.

Credit goes to Jason and Ken for figuring out the reason for the
regression.

Fixes:
KHR-GL45.transform_feedback.draw_xfb_test

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

i965: emit BRW_NEW_AUX_STATE when we change the fast clear value

v2: rename intel_miptree_set_clear_value to intel_miptree_set_clear_color
(Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

i965: emit BRW_NEW_AUX_STATE if we drop the aux surface

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE

We want to use this flag to signal changes to the aux surfaces,
so let's not make it about fast clearing only. Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

docs: update calendar, add news item and link release notes for 17.2.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

docs: add sha256 checksums for 17.2.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bd903d4ee15333288848708a60d6c8002cbb5cb1)

docs: add release notes for 17.2.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d6d2b6b5ec9b1638c0827582872670c7da79bb53)

docs: update sourcetree following omx rename

Fixes: 6a8aa11c207b99920b93 "st/omx_bellagio: Rename state tracker and option"
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Andres Gomez <agomez@igalia.com>

gbm: Add gbm_device_get_format_modifier_plane_count to test

Adding gbm_device_get_format_modifier_plane_count made the
test gbm-symbols-check fail, this patch adds the according
function name to the test.

Fixes: 8824141b8d48d9120ddbf542d6fb661046c41c62
(gbm: Add a gbm_device_get_format_modifier_plane_count function)

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Andres Gomez <agomez@igalia.com>

travis: replace omx feature flag with omx-bellagio one

Fixes: 6a8aa11c207 ("st/omx_bellagio: Rename state tracker and
option")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>

docs/submittingpatches: add 'test each commit' instructions

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>

radv: Add support for more DCC compression with VK_KHR_image_format_list.

Reviewed-by: Dave Airlie <airlied@redhat.com>

radv: Add code to check if two formats can share DCC metadata.

Ported from radeonsi.

Reviewed-by: Dave Airlie <airlied@redhat.com>

i965: Add an INTEL_DEBUG=reemit option.

Jason and I use this for debugging all the time. Recompiling the driver
to enable it is kind of annoying. It's a great thing to try along with
always_flush_batch=true and always_flush_cache=true to detect a class of
problems - namely, atoms listening to an insufficient set of dirty bits.

Reviewed-by: Matt Turner <mattst88@gmail.com>

clover: Fix build after LLVM r313390

v2: pass llvm context reference instead of a pointer

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

radv: Don't redundantly emit pipelines after secondary cmd buffer.

Reviewed-by: Dave Airlie <airlied@redhat.com>

radv: Check for GFX9 for 1D arrays in image_size intrinsic.

Only on GFX9 we implement them as 2D images.

This fixes:
dEQP-VK.image.image_size.1d_array.readonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_7x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_writeonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_writeonly_7x1
dEQP-VK.image.image_size.1d_array.writeonly_12x34
dEQP-VK.image.image_size.1d_array.writeonly_1x1
dEQP-VK.image.image_size.1d_array.writeonly_32x32
dEQP-VK.image.image_size.1d_array.writeonly_7x1

Fixes: 1bcb953e166 "radv: handle GFX9 1D textures"
Reviewed-by: Dave Airlie <airlied@redhat.com>

i965: drop unused variables

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

i965/tex: Unify the TexImage and TexSubImage code

It's nearly the same so there's no good reason why it can't be in a
common function. The one difference is that _mesa_store_teximage
calls AllocTextureImageBuffer for us, while _mesa_store_texsubimage
doesn't, but we don't need that anyway - intelTexImage already does it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

i965/tex: Remove the for_glTexImage parameter from texsubimage_tiled_memcpy

It is set to false in both callers. It isn't needed for glTexImage
because intelTexImage calls AllocTextureImageBuffer before calling
texsubimage_tiled_memcpy.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

i965/tex: Make a couple of helpers static

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

i965: Move TexSubImage functions to intel_tex_image.c

These two paths are basically the same. There's no good reason to have
them in different files.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

i965/blorp: Set r8stencil_needs_update when writing stencil

This fixes a crash on Haswell when we try to upload a stencil texture
with blorp. It would also be a problem if someone tried to texture from
stencil after glBlitFramebuffers.

Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

util/u_atomic: Add implementation of __sync_val_compare_and_swap_8

Needed for 32-bit PowerPC.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038bd ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

util: Link libmesautil into u_atomic_test

Platforms without particular atomic operations require the
implementations in u_atomic.c

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038bd ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

vulkan: update headers & registry to VK 1.0.61

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>

automake: enable libunwind in `make distcheck'

Enable the toggle to catch when the library is missing from the link
path. Better to test, fail and address before releasing Mesa ;-)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

travis: Add libunwind-dev to gallium/make builds

libunwind is a optional dependency used by the gallium aux module
(libgallium) and consequently the final binaries must be linked against
it. To test whether the library is properly specified in the link pass
add it to the travis-ci build environment and force its use.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

travis: force llvm-3.3 for "make Gallium ST Other"

In Ubuntu Trusty the default version of llvm is 3.4 and the build was
actually randomly picking 3.5 or 3.9. Adding libunwind would then result
is build success or failure depending of what version was picked.

Install the llvm-3.3-dev package and force its use: On one hand it is
the minimum required version we want to the build test against, and on
the other hand forcing the version stabilizes the build.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

mesa/st/tests: Correct build flags and force -std=c++11

Include src/gallium/Automake.inc, correct the build flags accordingly.

Force -std=c++11 (extensively used by the test) as otherwise it gets
defined only when building against llvm >= 3.9.

Fixes: 7be6d8fe12 ("mesa/st: glsl_to_tgsi: add tests for the new
temporary lifetime tracker")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)

automake: include radv_shader.h in the sources list

Otherwise it will be missing from the tarball, leadin to build failure.

Fixes: d4d777317b9 ("radv: move shaders related code to radv_shader.c")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

st/omx_bellagio: Rename state tracker and option

Changes --enable-omx option to --enable-omx-bellagio

Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com>
Reviewed-and-Tested-by: Julien Isorce <julien.iso...@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>

i965: fix build warning on clang

fixes following warning:
warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long')

cast is needed to avoid this change turning in to another warning:
warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long')

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

radv: fix a potential crash if attachments allocation failed

Also, it's useless to set the error code twice. Though, we
should probably skip the next commands when the command buffer
is considered invalid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: dump the device name into the hang report

Similar to RadeonSI renderer string.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: add get_chip_name() callback

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

r600: add .gitignore for egd_tables.h

radeonsi: enable STD430 packing of UBOs by default

Before this change we were defaulting to STD140 which is slightly
less efficient at packing arrays.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

st/mesa: set UseSTD430AsDefaultPacking const based on CAP

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

gallium: introduce PIPE_CAP_LOAD_CONSTBUF

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

radeonsi: make use of LOAD for UBOs

v2: always set can_speculate and allow_smem to true

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

mesa/st: add LOAD support for UBOs

This will allow us to use STD430 packing by default if the driver
supports it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

mesa/st: create add_buffer_to_load_and_stores() helper

Will be used to add LOAD support to UBOs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

gallium: add CONSTBUF type to tgsi_file_type

This will be use to distinguish between load types when using
the TGSI_OPCODE_LOAD opcode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

virgl: drop const dimensions on first block.

The virgl protocol version of tgsi doesn't handle this yet,
transform it back to the old ways.

Thanks to Nicolai Hähnle <nicolai.haehnle@amd.com>
for also writing nearly the same patch.

Fixes: 41e342d5 tgsi/ureg: always emit constants (and their decls) as 2D
Tested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

st/glsl->tgsi: fix u64 to bool comparisons.

Otherwise we end up using a 32-bit comparison which didn't end well.

Timothy caught this while playing around with some opt passes.

Fixes: 278580729a (st/glsl_to_tgsi: add support for 64-bit integers)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

i965: Print size of validation and relocation lists in INTEL_DEBUG=flush

It's nice to have this information. While we're at it, tweak the
formatting to try and vertically align numbers in the common case.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Disentangle batch and state buffer flushing.

We now flush the batch when either the batchbuffer or statebuffer
reaches the original intended batch size, instead of when the sum of
the two reaches a certain size (which makes no sense now that they're
separate buffers).

With this change, we also need to update our "are we near the end?"
estimate to require separate batch and state buffer space.  I obtained
these estimates by looking at the size of draw calls in the Unreal 4
Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).

This will significantly impact the size of our batches.  I've adjusted
both down to try and be roughly similar to what we had been doing.  On
various benchmarks, a 20kB batch and 16kB statebuffer seemed to about
right, but we may need to adjust this further.  I tried a 16kB batch,
but that regressed Synmark OglMultithread performance by a fair bit.
32kB for both would have significantly increased our batch sizes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Delete BATCH_RESERVED handling.

Now that we can grow the batchbuffer if we absolutely need the extra
space, we don't need to reserve space for the final do-or-die ending
commands.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Make BLORP properly avoid batch wrapping.

We need to set brw->no_batch_wrap to actually avoid flushing in the
middle of our BLORP operation, and instead grow the batchbuffer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Grow the batch/state buffers if we need space and can't flush.

Previously, we would just assert fail and die in this case.  The only
safeguard is the "estimated max prim size" checks when starting a draw
(or compute dispatch or BLORP operation)...which are woefully broken.

Growing is fairly straightforward:

1. Allocate a new larger BO.
2. memcpy the existing contents over to the new buffer
3. Set the new BO to the same GTT offset as the old BO.  When emitting
   relocations, we write the presumed GTT offset of the target BO.  If
   we changed it, we'd have to update all the existing values (by
   walking the relocation list and looking at offsets), which is more
   expensive.  With the old BO freed, ideally the kernel could simply
   place the new BO at that offset anyway.
4. Update the validation list to contain the new BO.
5. Update the relocation list to have the GEM handle for the new BO
   (which we can skip if using I915_EXEC_HANDLE_LUT).

v2: Update to handle malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Use a separate state buffer, but avoid changing flushing behavior.

Previously, we emitted GPU commands and indirect state into the same
buffer, using a stack/heap like system where we filled in commands from
the start of the buffer, and state from the end of the buffer.  We then
flushed before the two met in the middle.

Meeting in the middle is fatal, so you have to be certain that you
reserve the correct amount of space before emitting commands or state
for a draw.  Currently, we will assert !no_batch_wrap and die if the
estimate is ever too small.  This has been mercifully obscure, but has
happened on a number of occasions, and could in theory happen to any
application that issues a large draw at just the wrong time.

Estimating the amount of batch space required is painful - it's hard to
get right, and getting it right involves a lot of code that would burn
CPU time, and also be painful to maintain.  Rolling back to a saved
state and retrying is also painful - failing to save/restore all the
required state will break things, and redoing state emission burns a
lot of CPU.  memcpy'ing to a new batch and continuing is painful,
because commands we issue for a draw depend on earlier commands as well
(such as STATE_BASE_ADDRESS, or the GPU being in a pirtacular state).

The best plan is to never run out of space, which is totally doable but
pretty wasteful - a pessimal draw requires a huge amount of space, and
rarely occurs.  Instead, we'd like to grow the batch buffer if we need
more space and can't safely flush.

We can't grow with a meet in the middle approach - we'd have to move the
state to the end, which would mean updating every offset from dynamic
state base address.  Using separate batch and state buffers, where both
fill starting at the beginning, makes it easy to grow either as needed.

This patch separates the two concepts.  We create a separate state
buffer, with a second relocation list, and use that for brw_state_batch.

However, this patch tries to retain the original flushing behavior - it
adds the amount of batch and state space together, as if they were still
co-existing in a single buffer.  The hope is to flush at the same time
as before.  This is necessary to avoid provoking bugs caused by broken
batch wrap handling (which we'll fix shortly).  It also avoids suddenly
increasing the size of the batch (due to state not taking up space),
which could have a significant performance impact.  We'll tune it later.

v2:
- Mark the statebuffer with EXEC_OBJECT_CAPTURE when supported (caught
  by Chris).  Unfortunately, we lose the ability to capture state data
  on older kernels.
- Continue to support the malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Pass screen to intel_batchbuffer_reset().

This will let us access screen->kernel_features in the next patch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.

We'll need to read from both buffers when decoding state.

This also drops the "failed to map" fallback - it's completely useless
on LLC systems where we write directly to the mapped BO. It's not that
useful on non-LLC systems either.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>