mesa.git
Marek Olšák [Sat, 3 Feb 2018 02:19:25 +0000 (03:19 +0100)]
radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs

TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
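The arithmetic behind the "13-bit pointer" can be sketched as follows: a 512KB-aligned address has its low 19 bits clear, so only the upper 32 - 19 = 13 bits need to be stored. The macro and function names are hypothetical, and the position of the field inside TCS_OUT_LAYOUT is not modelled because the commit message does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only. */
#define RING_ALIGN_SHIFT 19u                       /* 512KB == 1 << 19 */
#define RING_PTR_BITS    (32u - RING_ALIGN_SHIFT)  /* 13 bits remain   */
#define RING_PTR_MASK    ((1u << RING_PTR_BITS) - 1u)

/* Compress a 512KB-aligned 32-bit address into a 13-bit value. */
static inline uint32_t pack_ring_addr(uint32_t addr)
{
    /* The low 19 bits must be zero, otherwise information would be lost. */
    assert((addr & ((1u << RING_ALIGN_SHIFT) - 1u)) == 0);
    return (addr >> RING_ALIGN_SHIFT) & RING_PTR_MASK;
}

/* Recover the full 32-bit address from the 13-bit value. */
static inline uint32_t unpack_ring_addr(uint32_t packed)
{
    return (packed & RING_PTR_MASK) << RING_ALIGN_SHIFT;
}
```

Packing and unpacking round-trip for any aligned address, which is what lets the two user SGPRs that previously held the address be dropped.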
Marek Olšák [Fri, 2 Feb 2018 20:35:20 +0000 (21:35 +0100)]
radeonsi: move 2nd-shader descriptor pointers into s[0:1]

If 32-bit pointers are supported, both pointers can be moved into s[0:1]
and then ESGS has exactly the same user data SGPR declarations as VS.

If 32-bit pointers are not supported, only one pointer can be moved into
s[0:1]. In that case, the 2nd pointer is moved before TCS constants,
so that the location is the same in HS and GS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 7 Feb 2018 00:31:33 +0000 (01:31 +0100)]
radeonsi: change si_descriptors::shader_userdata_offset type to short

We will want to use SH registers outside of user data SGPRs, like the GFX9
special SGPRs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 3 Feb 2018 01:03:08 +0000 (02:03 +0100)]
radeonsi: put both tessellation rings into 1 buffer

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 3 Feb 2018 00:51:53 +0000 (01:51 +0100)]
radeonsi: move tessellation ring info into si_screen

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 2 Feb 2018 20:04:57 +0000 (21:04 +0100)]
radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits

For a later patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Karol Herbst [Fri, 23 Feb 2018 21:51:05 +0000 (22:51 +0100)]
nvir: dont optimize mad with subops to shladd

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
James Legg [Thu, 22 Feb 2018 16:57:53 +0000 (16:57 +0000)]
radv: Really use correct HTILE expanded words.

When transitioning to an htile compressed depth format, set the full
depth range so that later rasterization can pass HiZ. Previously, for
depth-only formats, the depth range was set to 0 to 0. This caused unwanted
HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer
(VK_FORMAT_D32_SFLOAT was not affected somehow).

These values are derived from PAL [0], since I can't find the
specification describing the htile values.

[0] https://github.com/GPUOpen-Drivers/pal/blob/5cba4ecbda9452773f59692f5915301e7db4a183/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp#L1500

CC: Dave Airlie <airlied@redhat.com>
CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Fixes: 5158603182fe7435 "radv: Use correct HTILE expanded words."
Mauro Rossi [Fri, 23 Feb 2018 22:33:37 +0000 (23:33 +0100)]
radv/extensions: fix c_vk_version for patch == None

Similar to cb0d1ba156 ("anv/extensions: Fix VkVersion::c_vk_version for patch == None")
fixes the following building errors:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48:
error: use of undeclared identifier 'None'; did you mean 'long'?
      return instance && VK_MAKE_VERSION(1, 0, None) <= core_version;
                                               ^~~~
                                               long
external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION'
    (((major) << 22) | ((minor) << 12) | (patch))
                                          ^
...
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

Fixes: e72ad05c1d ("radv: Return NULL for entrypoints when not supported.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
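To show why a literal None cannot appear in the generated C, here is a small sketch of the version packing quoted in the error output. The helper names are hypothetical, and substituting 0 for a missing patch level is an assumption; the commit message does not say what the generator emits instead.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as the VK_MAKE_VERSION macro in the error above:
 * major in bits 31:22, minor in bits 21:12, patch in bits 11:0. */
static inline uint32_t make_version(uint32_t major, uint32_t minor,
                                    uint32_t patch)
{
    assert(major < (1u << 10) && minor < (1u << 10) && patch < (1u << 12));
    return (major << 22) | (minor << 12) | patch;
}

/* Hypothetical check in the style of the generated code: an entrypoint
 * that needs `required` is usable when core_version is at least that. */
static inline int version_supported(uint32_t required, uint32_t core_version)
{
    return required <= core_version;
}
```

Every argument has to be an integer expression, so a version with no patch level must be emitted as a number (assumed here to be 0, as in make_version(1, 0, 0)).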
Eric Anholt [Thu, 2 Nov 2017 23:59:10 +0000 (16:59 -0700)]
broadcom/vc5: Fix layout of 3D textures.

Cube maps are entire miptrees repeated, while 3D textures store all of
the layers of each level next to each other.  Fixes tex3d and
tex-miplevel-selection GL2:texture() 3D.

Eric Anholt [Fri, 23 Feb 2018 17:10:36 +0000 (09:10 -0800)]
broadcom/vc5: Ignore unused usage flags in is_format_supported.

Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to
match.  Just drop the whole retval == usage thing and return early when we
hit a known unsupported case.

Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")
Eric Anholt [Fri, 23 Feb 2018 22:55:51 +0000 (14:55 -0800)]
gbm: Fix the alpha masks in the GBM format table.

Once GBM started looking at the values of the alpha masks, ARGB/ABGR
wouldn't match any more because we had both A and R in the low bits.

Fixes: 2ed344645d65 ("gbm/dri: Add RGBA masks to GBM format table")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Daniel Stone <daniels@collabora.com>
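A minimal sketch of the kind of table entry involved, assuming the usual ARGB8888 layout with alpha in the top byte. The struct and function names are hypothetical and do not reflect the actual GBM format table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-format channel masks. */
struct rgba_masks {
    uint32_t red, green, blue, alpha;
};

/* ARGB8888: A in bits 31:24, R in 23:16, G in 15:8, B in 7:0. */
static const struct rgba_masks argb8888_masks = {
    0x00ff0000u, 0x0000ff00u, 0x000000ffu, 0xff000000u
};

/* Returns 1 when no two channels claim the same bit. */
static inline int masks_disjoint(const struct rgba_masks *m)
{
    assert(m != 0);
    return (m->red & m->green) == 0 && (m->red & m->blue) == 0 &&
           (m->red & m->alpha) == 0 && (m->green & m->blue) == 0 &&
           (m->green & m->alpha) == 0 && (m->blue & m->alpha) == 0;
}
```

An entry whose alpha mask shares low bits with a color channel fails masks_disjoint(), and such an entry cannot match a real config once the alpha mask is compared.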
Mathias Fröhlich [Fri, 23 Feb 2018 19:46:20 +0000 (20:46 +0100)]
mesa: Update vertex processing mode on _mesa_UseProgram.

The change is a bug fix for 92d76a169:
  mesa: Provide an alternative to get_vp_mode()
that actually got exposed through 4562a7b0:
  vbo: Make use of _DrawVAO from the dlist code.

Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Wed, 14 Feb 2018 22:21:14 +0000 (23:21 +0100)]
mesa: rename has_core_gs -> has_gs in get_programiv

This is also true for GLES.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Wed, 14 Feb 2018 20:19:33 +0000 (21:19 +0100)]
mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl

This is more accurate with respect to the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Wed, 14 Feb 2018 21:32:59 +0000 (22:32 +0100)]
mesa: add some of missing compatibility support for ARB_bindless_texture

The extension is exposed in the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Wed, 14 Feb 2018 22:42:08 +0000 (23:42 +0100)]
mesa: expose ARB_enhanced_layouts in the compatibility profile

GLSL 1.40 is required.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Wed, 14 Feb 2018 19:13:40 +0000 (20:13 +0100)]
mesa: enable OpenGL 3.1 with ARB_compatibility

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Wed, 14 Feb 2018 19:12:51 +0000 (20:12 +0100)]
mesa: implement ARB_compatibility

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Tue, 20 Feb 2018 18:01:24 +0000 (18:01 +0000)]
swr: remove dead LLVM code paths

The LLVM requirement was bumped to 4.0.0 by an earlier commit.
Hence any code tailored for older versions is now unreachable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Eric Anholt [Fri, 23 Feb 2018 01:52:03 +0000 (17:52 -0800)]
broadcom/vc4: Remove the retval==usage check in is_format_supported().

This got us into trouble recently, so just remove it entirely.

Eric Anholt [Fri, 23 Feb 2018 01:43:21 +0000 (17:43 -0800)]
broadcom/vc4: Add support for YUV textures using unaccelerated blits.

Previously we would hit an assertion failure about having no hardware
format.  This is enough to get kmscube -M nv12-2img working.

Eric Anholt [Tue, 20 Feb 2018 16:28:07 +0000 (16:28 +0000)]
broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows.

When we set up the shadow resource we were copying the original resource
as the template, including its prsc->next field.  When we shadowed the
first YUV plane's resource for linear-to-tiled conversion, we would end up
unbalancing the refcount on the shadow resource's destruction.

Eric Anholt [Tue, 20 Feb 2018 16:05:29 +0000 (16:05 +0000)]
broadcom/vc4: Add pipe_reference debugging for vc4_bos.

Trying to track down the YUV EGLImage use-after-free, it helps to see what
the mystery objects are that are being refcounted.

Eric Anholt [Tue, 20 Feb 2018 15:59:10 +0000 (15:59 +0000)]
broadcom/vc4: Remove dead vc4_bo_set_reference().

It would be broken if NULL was passed to it anyway, since it wouldn't
participate in screen->bo_handles management.

Eric Anholt [Wed, 7 Feb 2018 11:16:12 +0000 (11:16 +0000)]
broadcom/vc4: Use pipe_resource_reference in sampler views.

Improves u_debug_refcount output.

Eric Anholt [Tue, 6 Feb 2018 17:42:44 +0000 (17:42 +0000)]
broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride.

This is part of supporting YUV textures -- MMAL will be handing us a
single GEM BO with the planes at offsets within it, and MMAL-decided
stride.

Eric Anholt [Fri, 23 Feb 2018 01:38:50 +0000 (17:38 -0800)]
broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported().

We were failing the retval == usage check at the end.

Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")
Lucas Stach [Mon, 5 Feb 2018 17:59:48 +0000 (18:59 +0100)]
etnaviv: fix in-place resolve tile count

TS tiles map to a fixed amount of bytes in the color/depth surface,
so the blocksize of the format needs to be taken into account when
calculating the number of tiles to fill.

The simplest fix is to just use the layer stride, which is the surface
size in bytes.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
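The byte-based calculation described above can be sketched as below. TS_TILE_BYTES is a placeholder value chosen for illustration, not the real hardware constant, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder: bytes of surface covered by one TS tile (assumed value). */
#define TS_TILE_BYTES 64u

/* Surface size of one layer in bytes; cpp is bytes per pixel. */
static inline uint32_t layer_stride_bytes(uint32_t width, uint32_t height,
                                          uint32_t cpp)
{
    return width * height * cpp;
}

/* Number of TS tiles to fill, derived from the size in bytes so that the
 * format's block size is taken into account automatically. */
static inline uint32_t ts_tile_count(uint32_t layer_stride)
{
    assert(layer_stride % TS_TILE_BYTES == 0);
    return layer_stride / TS_TILE_BYTES;
}
```

With this approach a 16-bit surface yields half as many tiles as a 32-bit surface of the same dimensions, which a count based on pixels alone would miss.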
Lucas Stach [Mon, 5 Feb 2018 17:56:09 +0000 (18:56 +0100)]
etnaviv: switch magic single buffer state to "3"

Some of the 16bit formats misrender with missing tiles with the current
"2" state. As all the previously working formats also work with the "3"
state, just always use that one.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Lucas Stach [Mon, 5 Feb 2018 14:36:34 +0000 (15:36 +0100)]
etnaviv: add debug switch to disable single buffer feature

This feature has caused some trouble already. Add a debug switch to
allow users to quickly check if a specific issue is caused by this
feature.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Dylan Baker [Tue, 20 Feb 2018 18:36:44 +0000 (10:36 -0800)]
meson: Fix GL and EGL pkg-config files with glvnd

Currently meson will generate a pkg-config file that links to EGL_mesa (or
GLX_mesa), but this isn't correct; it should always link to EGL or GL.
Probably the "right" solution is to have glvnd itself provide the pkg
config files for GL and EGL, but that also means that glvnd needs to
provide many of the header files, which makes it a more involved job.

Fixes: a47c525f3281a27 ("meson: build glx")
Fixes: 035ec7a2bb2d5e4 ("meson: Add support for EGL glvnd")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Frank Binns [Thu, 22 Feb 2018 13:37:54 +0000 (13:37 +0000)]
egl/dri2: fix segfault when display initialisation fails

dri2_display_destroy() is called when platform specific display
initialisation fails. However, this would typically lead to a
segfault due to the dri2_egl_display vtbl not having been set up.

Fixes: 2db95482964 ("loader_dri3/glx/egl: Optionally use a blit context for blitting operations")
Signed-off-by: Frank Binns <francisbinns@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Juan A. Suarez Romero [Mon, 15 Jan 2018 10:58:50 +0000 (10:58 +0000)]
mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

RGB9_E5 should be accepted by RenderbufferStorage if the
EXT_texture_shared_exponent is exposed. It is left to the
implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT
when checking the framebuffer completeness if they do not
support rendering in this format.

Discussed in:
https://github.com/KhronosGroup/OpenGL-API/issues/32

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5

v2: Added more info to the commit message (Antia)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Christian Gmeiner [Tue, 20 Feb 2018 19:47:18 +0000 (20:47 +0100)]
etnaviv: npot_tex_any_wrap needs one bit only

Reduces size of struct etna_specs from 100 to 94 bytes.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Mathias Fröhlich [Sat, 3 Feb 2018 21:25:50 +0000 (22:25 +0100)]
vbo: Make use of _DrawVAO from the dlist code.

Finally use an internal VAO to execute display list draws. Avoid
duplicate state validation for display list draws. Remove client arrays
previously used exclusively for display lists.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 14:22:33 +0000 (15:22 +0100)]
mesa: Use atomics for shared VAO reference counts.

VAOs will be used in the next change as immutable objects across multiple
contexts. Only reference counting may write concurrently to the VAO. So,
make the reference count thread safe for those and only those VAO objects.

v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
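The idea of paying for atomic operations only on shared objects can be sketched with C11 atomics. This is an illustration under assumptions: the struct and function names are hypothetical, and Mesa uses its own atomic helpers rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VAO with a reference count. */
struct vao {
    bool shared_and_immutable;  /* true: may be referenced from many contexts */
    atomic_int ref_count;
};

static void vao_init(struct vao *v, bool shared)
{
    v->shared_and_immutable = shared;
    atomic_init(&v->ref_count, 1);
}

static void vao_ref(struct vao *v)
{
    if (v->shared_and_immutable) {
        /* Concurrent writers are possible: use an atomic read-modify-write. */
        atomic_fetch_add(&v->ref_count, 1);
    } else {
        /* Single-context object: a plain load and store is sufficient. */
        int n = atomic_load_explicit(&v->ref_count, memory_order_relaxed);
        atomic_store_explicit(&v->ref_count, n + 1, memory_order_relaxed);
    }
}

/* Returns true when the last reference was dropped. */
static bool vao_unref(struct vao *v)
{
    int old;
    if (v->shared_and_immutable) {
        old = atomic_fetch_sub(&v->ref_count, 1);
    } else {
        old = atomic_load_explicit(&v->ref_count, memory_order_relaxed);
        atomic_store_explicit(&v->ref_count, old - 1, memory_order_relaxed);
    }
    assert(old > 0);
    return old == 1;
}
```

Because the object is immutable once shared, the reference count is the only field that needs this treatment.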
Mathias Fröhlich [Wed, 7 Feb 2018 07:59:13 +0000 (08:59 +0100)]
vbo: Make use of _DrawVAO from immediate mode draw

Finally use an internal VAO to execute immediate mode draws. Avoid
duplicate state validation for immediate mode draws. Remove client arrays
previously used exclusively for immediate mode draws.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 20:28:40 +0000 (21:28 +0100)]
vbo: Implement tool functions for vbo specific VAO setup.

Correct VBO_MATERIAL_SHIFT value.
The functions will be used next in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 20:16:19 +0000 (21:16 +0100)]
mesa: Add flush_vertices to _mesa_bind_vertex_buffer.

We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 19:50:35 +0000 (20:50 +0100)]
mesa: Make _mesa_vertex_attrib_binding public.

Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a
flush_vertices argument, and make it publicly available.
The function will be needed later in the series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Wed, 24 Aug 2016 06:45:05 +0000 (08:45 +0200)]
mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib.

We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 18:42:20 +0000 (19:42 +0100)]
vbo: Use _DrawVAO for array type draw commands.

Switch over to use the _DrawVAO for all the array type draws.
The _DrawVAO needs to be set before we enter _mesa_update_state, so move
setting the draw method in front of the first call to _mesa_update_state
which is in turn called from the *validate*Draw* calls. Using the
gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode
and gl_vertex_array_object::_AttributeMapMode we can already set
varying_vp_inputs before we call _mesa_update_state the first time.
Thus remove duplicate state validation.

v2: Update comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 16:19:24 +0000 (17:19 +0100)]
vbo: Implement method to track the inputs array.

Provided the _DrawVAO and the derived state that is maintained if we have
the _DrawVAO set, implement a method to incrementally update the array of
gl_vertex_array input pointers.

v2: Add some more comments.
    Rename _vbo_array_init to _vbo_init_inputs.
    Rename vbo_context::arrays to vbo_context::draw_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Wed, 24 Aug 2016 06:45:05 +0000 (08:45 +0200)]
mesa: Introduce a yet unused _DrawVAO.

During the patch series this VAO gets populated with either the currently
bound VAO or an internal VAO that will be used for immediate mode and
dlist rendering.

v2: More comments about the _DrawVAO, filter and enabled mask.
    Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs.
v3: Fix and move comment.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 09:44:10 +0000 (10:44 +0100)]
vbo: Remove get_vp_mode() and enum vp_mode.

It is now unused.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Sat, 3 Feb 2018 09:42:01 +0000 (10:42 +0100)]
vbo: Use _VPMode instead of get_vp_mode().

At those places where we used get_vp_mode() use
gl_vertex_program_state::_VPMode instead.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Mathias Fröhlich [Fri, 2 Feb 2018 20:31:27 +0000 (21:31 +0100)]
mesa: Provide an alternative to get_vp_mode()

To get information equivalent to get_vp_mode(), track the vertex
processing mode in a per context variable at
gl_vertex_program_state::_VPMode.
This aims to replace get_vp_mode() as seen in the vbo module.
Unlike the get_vp_mode() implementation, which only gives correct
answers after _mesa_update_state() has been called, this context variable
is updated immediately when the vertex processing state is modified. The
correctness of this value is asserted on state validation.

With this in place we should be able to untangle the dependency with
varying_vp_inputs and state invalidation.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ilia Mirkin [Thu, 22 Feb 2018 04:32:49 +0000 (23:32 -0500)]
nv50,nvc0: fix integer MS resolves using 2d engine

We don't want filtering for integer textures, same as depth/stencil.

Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
Ilia Mirkin [Wed, 21 Feb 2018 05:10:24 +0000 (00:10 -0500)]
nvc0: fix writing query results into buffer

We need to mark the range as valid, and validate the resource using a
helper to ensure that the buffer status is marked properly.

Fixes some CTS pipeline stats query tests, and
KHR-GL45.direct_state_access.queries_functional

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
Ilia Mirkin [Wed, 21 Feb 2018 04:17:31 +0000 (23:17 -0500)]
nv50,nvc0: fix clear buffer acceleration

Two things were off:
 - valid range was not updated, which could affect waiting for future
   maps
 - fencing was done manually instead of using the *_resource_validate
   helper, which resulted in a missed dirty buffer flag being set

Fixes: KHR-GL45.direct_state_access.buffers_clear
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
Lionel Landwerlin [Wed, 7 Feb 2018 10:48:32 +0000 (10:48 +0000)]
i965: perf: ensure reading config IDs from sysfs isn't interrupted

Fixes: 458468c136e "i965: Expose OA counters via INTEL_performance_query"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Bas Nieuwenhuizen [Fri, 23 Feb 2018 00:42:07 +0000 (01:42 +0100)]
radv: Fix autotools build.

Somewhere along the way the Makefile changes got lost ...

Fixes: 4db78f3a6b "radv: Put supported extensions in a struct."
Acked-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 11 Feb 2018 13:38:42 +0000 (14:38 +0100)]
radv: Return NULL for entrypoints when not supported.

This implements strict checking for the entrypoint ProcAddr
functions.

 - InstanceProcAddr with instance = NULL, only returns the 3 allowed
   entrypoints.
 - DeviceProcAddr does not return any instance entrypoints.
 - InstanceProcAddr does not return non-supported or disabled
   instance entrypoints.
 - DeviceProcAddr does not return non-supported or disabled device
   entrypoints.
 - InstanceProcAddr still returns non-supported device entrypoints.

Reviewed-by: Dave Airlie <airlied@redhat.com>
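The rules listed above can be sketched as a small lookup. The table contents, struct, and function names are hypothetical, and the functions return a visibility flag rather than a function pointer. Returning the global entrypoints when a valid instance is passed is an assumption, since the commit message does not cover that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum ep_level { EP_GLOBAL, EP_INSTANCE, EP_DEVICE };

struct entrypoint {
    const char *name;
    enum ep_level level;
    bool enabled;   /* supported and enabled (core version or extension) */
};

/* Hypothetical table; the three EP_GLOBAL entries are the ones Vulkan 1.0
 * allows to be queried with instance == NULL. */
static const struct entrypoint entrypoints[] = {
    { "vkCreateInstance",                       EP_GLOBAL,   true  },
    { "vkEnumerateInstanceExtensionProperties", EP_GLOBAL,   true  },
    { "vkEnumerateInstanceLayerProperties",     EP_GLOBAL,   true  },
    { "vkEnumeratePhysicalDevices",             EP_INSTANCE, true  },
    { "vkCreateDebugReportCallbackEXT",         EP_INSTANCE, false },
    { "vkCmdDraw",                              EP_DEVICE,   true  },
    { "vkCreateSwapchainKHR",                   EP_DEVICE,   false },
};

static const struct entrypoint *find_entrypoint(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(entrypoints) / sizeof(entrypoints[0]); i++)
        if (strcmp(entrypoints[i].name, name) == 0)
            return &entrypoints[i];
    return NULL;
}

/* Would InstanceProcAddr return a non-NULL pointer for this name? */
static bool instance_proc_visible(bool have_instance, const char *name)
{
    const struct entrypoint *e = find_entrypoint(name);
    if (e == NULL)
        return false;
    if (!have_instance)
        return e->level == EP_GLOBAL;      /* only the 3 allowed entrypoints */
    if (e->level == EP_INSTANCE)
        return e->enabled;                 /* no unsupported/disabled ones   */
    return true;  /* device entrypoints are still returned unconditionally;
                     global ones with an instance: assumption */
}

/* Would DeviceProcAddr return a non-NULL pointer for this name? */
static bool device_proc_visible(const char *name)
{
    const struct entrypoint *e = find_entrypoint(name);
    return e != NULL && e->level == EP_DEVICE && e->enabled;
}
```

The asymmetry in the last rule shows up as instance_proc_visible(true, "vkCreateSwapchainKHR") being true while device_proc_visible("vkCreateSwapchainKHR") is false.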
Bas Nieuwenhuizen [Sun, 11 Feb 2018 12:12:33 +0000 (13:12 +0100)]
radv: Reword radv_entrypoints_gen.py

With a big inspiration from anv as always ...

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sat, 10 Feb 2018 23:32:34 +0000 (00:32 +0100)]
radv: Track enabled extensions.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sat, 10 Feb 2018 20:43:55 +0000 (21:43 +0100)]
radv: Put supported extensions in a struct.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Jose Fonseca [Mon, 22 Jan 2018 11:11:16 +0000 (11:11 +0000)]
appveyor: Build with MSVC 2015.

The MSVC version we (at VMware) primarily care about from now on is
2015.

See https://ci.appveyor.com/project/jrfonseca/mesa/build/46

We can drop support for building with 2013 in a future commit.  I'm not
aware of significant changes in C99/C11 support from MSVC 2013 to 2015,
but there's no point in continuing supporting old MSVC versions when
nobody cares.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Samuel Pitoiset [Mon, 5 Feb 2018 14:51:37 +0000 (15:51 +0100)]
ac/nir: remove emission of nir_op_fpow

fpow is now lowered at NIR level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 2 Feb 2018 18:04:57 +0000 (19:04 +0100)]
radv: enable lowering of fpow to fexp2 and flog2

There is no fpow in hardware, so it's always lowered somewhere,
but it appears that lowering at NIR level is better. Noticed while
comparing compute shaders between RadeonSI and RADV.

Polaris10:
Totals from affected shaders:
SGPRS: 18936 -> 18904 (-0.17 %)
VGPRS: 12240 -> 12220 (-0.16 %)
Spilled SGPRs: 2809 -> 2809 (0.00 %)
Code Size: 718116 -> 719848 (0.24 %) bytes
Max Waves: 1409 -> 1410 (0.07 %)

Vega10:
Totals from affected shaders:
SGPRS: 18392 -> 18392 (0.00 %)
VGPRS: 12008 -> 11920 (-0.73 %)
Spilled SGPRs: 3001 -> 2981 (-0.67 %)
Code Size: 777444 -> 778788 (0.17 %) bytes
Max Waves: 1503 -> 1504 (0.07 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 5 Feb 2018 14:08:03 +0000 (15:08 +0100)]
nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a)

Similar for the 4 case.

Suggested by Bas.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
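For reference, the identities behind this rewrite can be written out as follows. This is the real-number reasoning only, valid for a > 0; floating-point rounding and special values are not considered here.

```latex
\begin{align*}
\operatorname{fpow}(a, b)
  &= \operatorname{fexp2}\bigl(b \cdot \operatorname{flog2}(a)\bigr)
   = 2^{\,b \log_2 a} \\
\operatorname{fexp2}\bigl(2 \cdot \operatorname{flog2}(a)\bigr)
  &= \bigl(2^{\log_2 a}\bigr)^{2} = a \cdot a \\
\operatorname{fexp2}\bigl(4 \cdot \operatorname{flog2}(a)\bigr)
  &= \bigl(2^{\log_2 a}\bigr)^{4} = (a \cdot a) \cdot (a \cdot a)
\end{align*}
```

The first line is the general fpow lowering; the other two are the constant-exponent cases that collapse into one or two multiplies.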
Samuel Pitoiset [Mon, 5 Feb 2018 15:07:45 +0000 (16:07 +0100)]
nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b))

Otherwise the code size increases because the original fexp2()
instructions can't be deleted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 22 Feb 2018 09:25:38 +0000 (10:25 +0100)]
ac/nir: set GLC=1 for load/store of coherent/volatile images

This disables persistence across wavefronts.

F1 2017 and Wolfenstein 2 appear to use some coherent images
but this patch doesn't seem to change anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 22 Feb 2018 09:25:37 +0000 (10:25 +0100)]
spirv: apply memory qualifiers to images

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Chuck Atkins [Thu, 22 Feb 2018 14:19:37 +0000 (09:19 -0500)]
glx: Properly handle cases where screen creation fails

This fixes a segfault exposed by a29d63ecf7 which occurs when swr is
used on an unsupported architecture.

v2: re-work to place logic in xmesa_init_display

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Iago Toral Quiroga [Wed, 14 Feb 2018 10:48:05 +0000 (11:48 +0100)]
anv/blorp: multisample resolve all attachment layers

We were only resolving the first.

v2:
  - Do not require that the number of layers on dst and src are an
    exact match; it is okay if the dst has more layers so long as
    it has at least as many as we are going to resolve.
  - Do not always resolve array_len layers, we should resolve
    only from base_array_layer to array_len.

v3:
  - v2 was assuming that array_len represented the total number of
    layers in the image, but it represents the number of layers
    starting at the base array layer.

v4:
 - The number of layers to resolve should be taken from the
   framebuffer (Nanley).

Fixes new CTS tests for multisampled layered rendering:
dEQP-VK.renderpass.multisample_resolve.layers_*

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
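The layer selection that the revisions converge on can be sketched as below: the count comes from the framebuffer, each view contributes its base layer, and each view's array_len is the number of layers available from that base. The function and parameter names are hypothetical and do not correspond to the anv code.

```c
#include <assert.h>
#include <stdint.h>

/* Fill src_layers/dst_layers with the image layers to resolve, one pair per
 * framebuffer layer. Returns the number of pairs, or -1 if either view has
 * fewer than fb_layers layers starting at its base layer. */
static int plan_layer_resolves(uint32_t src_base, uint32_t src_array_len,
                               uint32_t dst_base, uint32_t dst_array_len,
                               uint32_t fb_layers,
                               uint32_t *src_layers, uint32_t *dst_layers)
{
    assert(src_layers != 0 && dst_layers != 0);

    /* dst may have more layers than src, but both must cover fb_layers. */
    if (fb_layers > src_array_len || fb_layers > dst_array_len)
        return -1;

    for (uint32_t i = 0; i < fb_layers; i++) {
        src_layers[i] = src_base + i;
        dst_layers[i] = dst_base + i;
    }
    return (int)fb_layers;
}
```

Resolving only the first pair corresponds to the original bug; iterating over all fb_layers pairs is what the new CTS tests exercise.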
Jason Ekstrand [Tue, 28 Nov 2017 19:07:48 +0000 (11:07 -0800)]
intel/isl: Improve the documentation on get_default_aux_state

Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Mon, 23 Oct 2017 23:32:42 +0000 (16:32 -0700)]
i965: Use finish_external instead of make_shareable in setTexBuffer2

The setTexBuffer2 hook from GLX is used to implement glXBindTexImageEXT
which has tighter restrictions than just "it's shared".  In particular,
it says that any rendering to the image while it is bound causes the
contents to become undefined.

The GLX_EXT_texture_from_pixmap extension provides us with an acquire
and release in the form of glXBindTexImageEXT and glXReleaseTexImageEXT.
The extension spec says,

    "Rendering to the drawable while it is bound to a texture will leave
    the contents of the texture in an undefined state.  However, no
    synchronization between rendering and texturing is done by GLX.  It
    is the application's responsibility to implement any synchronization
    required."

From the EGL 1.4 spec for eglBindTexImage:

    "After eglBindTexImage is called, the specified surface is no longer
    available for reading or writing.  Any read operation, such as
    glReadPixels or eglCopyBuffers, which reads values from any of the
    surface’s color buffers or ancillary buffers will produce
    indeterminate results.  In addition, draw operations that are done
    to the surface before its color buffer is released from the texture
    produce indeterminate results."

In other words, between the bind and release calls, we effectively own
those pixels and can assume, so long as we don't crash, that no one else
is reading from/writing to the surface.  The GLX and EGL implementations
call the setTexBuffer2 and releaseTexBuffer function pointers that the
driver can hook.

In theory, this means that, between BindTexImage and ReleaseTexImage, we
own the pixels and it should be safe to track aux usage so we
can avoid redundant resolves so long as we start off with the right
assumption at the start of the bind/release pair.

In practice, however, X11 has slightly different expectations.  It's
expected that the server may be drawing to the image at the same time as
the compositor is texturing from it.  In that case, the worst expected
outcome should be tearing or partial rendering and not random corruption
like we see when rendering races with scanout with CCS.  Fortunately,
the GEM rules about texture/render dependencies save us here.  If X11
submits work to write to a pixmap after the compositor has submitted
work to texture from it, GEM inserts a dependency between the compositor
and X11.  If X11 is using a high-priority context, this will cause the
compositor to get a temporarily boosted priority while the batch from
X11 is waiting on it.  This means that we will never have an actual race
between X11 and the compositor so no corruption can happen.

Unfortunately, however, this means that X11 will likely be rendering to
the pixmap between the compositor's BindTexImage and ReleaseTexImage
calls.  If we want to avoid strange issues, we need to be a bit careful about
resolves because we can't really transition it away from the "default"
aux usage.  The only case where this would practically be a problem is
with image_load_store where we have to do a full resolve in order to use
the image via the data port.  Even there it would only be a problem if
batches were split such that X11's rendering happens between the resolve
and the use of it as a storage image.  However, the chances of this
happening are very slim so we just emit a warning and hope for the best.

This commit adds a new helper intel_miptree_finish_external which resets
all aux state to whatever ISL says is the right worst-case "default" for
the given modifier.  It feels a little awkward to call it "finish"
because it's actually an acquire from the perspective of the driver, but
it matches the semantics of the other prepare/finish functions.  This
new helper gets called in intelSetTexBuffer2 instead of make_shareable.
We also add an intelReleaseTexBuffer (we passed NULL to releaseTexBuffer
before) and call intel_miptree_prepare_external in it.  This probably
does nothing most of the time but it means that the prepare/finish calls
are properly matched.
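
A minimal sketch of the bind/release pairing described above. This is a toy
model, not the i965 code: every sketch_* name, the two aux states, and the
resolve counter are assumptions made for illustration.

```c
/* Hypothetical aux states; the real driver tracks more than two. */
enum sketch_aux_state { SKETCH_AUX_PASS_THROUGH, SKETCH_AUX_COMPRESSED };

struct sketch_miptree {
   enum sketch_aux_state modifier_default; /* worst-case default for the modifier */
   enum sketch_aux_state tracked;          /* what the driver currently believes */
   int resolves;                           /* number of resolves performed */
};

/* Acquire from external use: forget what we tracked, assume the default. */
static void
sketch_finish_external(struct sketch_miptree *mt)
{
   mt->tracked = mt->modifier_default;
}

/* Release to external use: resolve only if we left the default state. */
static void
sketch_prepare_external(struct sketch_miptree *mt)
{
   if (mt->tracked != mt->modifier_default) {
      mt->resolves++;
      mt->tracked = mt->modifier_default;
   }
}

/* Bind hook: the driver acquires the pixels. */
static void
sketch_set_tex_buffer(struct sketch_miptree *mt)
{
   sketch_finish_external(mt);
}

/* Release hook: the driver hands the pixels back. */
static void
sketch_release_tex_buffer(struct sketch_miptree *mt)
{
   sketch_prepare_external(mt);
}
```

In this model a bind followed directly by a release performs no resolve,
which corresponds to the "probably does nothing most of the time" remark.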

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agoi965/tex_image: Reference the renderbuffer miptree in setTexBuffer2
Jason Ekstrand [Tue, 12 Sep 2017 21:26:04 +0000 (14:26 -0700)]
i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2

The old code made a new miptree that referenced the same BO as the
renderbuffer and just trusted in the memory aliasing to work.  There are
only two ways in which the new miptree is liable to differ from the one
in the renderbuffer and neither of them matter:

 1) It may have a different target.  The only targets that we can ever
    see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
    and the difference between the two doesn't matter as far as the
    miptree is concerned; genX(update_sampler_state) only looks at the
    gl_texture_object and not the miptree when determining whether or
    not to use normalized coordinates.

 2) It may have a very slightly different format.  Again, this doesn't
    matter because we've supported texture views for quite some time so
    we always look at the gl_texture_object format instead of the
    miptree format for hardware setup anyway.

On the other hand, because we were recreating the miptree, we were using
intel_miptree_create_for_bo which doesn't understand modifiers.  We
really want this function to work without doing a resolve so long as you
have modifiers so we need to fix that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agoi965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2
Jason Ekstrand [Tue, 28 Nov 2017 19:26:55 +0000 (11:26 -0800)]
i965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agoi965/miptree: Loosen the format check in miptree_match_image
Jason Ekstrand [Mon, 23 Oct 2017 22:06:11 +0000 (15:06 -0700)]
i965/miptree: Loosen the format check in miptree_match_image

This function is used to determine when we need to re-allocate a
miptree.  Since we do nothing different in miptree allocation for
sRGB vs. linear, loosening this should be safe and may lead to less
copying and reallocating in some odd cases.
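
As a rough illustration of what a loosened check could look like, the sketch
below compares formats only after mapping sRGB to its linear counterpart.
The enum and both helpers are hypothetical and are not the Mesa format API.

```c
#include <stdbool.h>

/* Hypothetical format enum for illustration; not Mesa's format list. */
enum sketch_format {
   SKETCH_FORMAT_RGBA8_UNORM,
   SKETCH_FORMAT_RGBA8_SRGB,
   SKETCH_FORMAT_R16_FLOAT,
};

/* Map an sRGB format to its linear counterpart; others map to themselves. */
static enum sketch_format
sketch_linear_format(enum sketch_format fmt)
{
   return fmt == SKETCH_FORMAT_RGBA8_SRGB ? SKETCH_FORMAT_RGBA8_UNORM : fmt;
}

/* Loosened check: formats match if they agree once sRGB-ness is ignored. */
static bool
sketch_formats_match(enum sketch_format miptree_fmt,
                     enum sketch_format image_fmt)
{
   return sketch_linear_format(miptree_fmt) == sketch_linear_format(image_fmt);
}
```

With a strict equality test, an sRGB image on a linear miptree would force a
reallocation; with the comparison above it would not.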

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agoi965/state: Ignore intel_obj->_Format for depth/stencil and ETC2
Jason Ekstrand [Wed, 29 Nov 2017 00:06:27 +0000 (16:06 -0800)]
i965/state: Ignore intel_obj->_Format for depth/stencil and ETC2

We're about to start letting the intel_obj->_Format be the "real"
texture format.  For depth/stencil textures, this may be a combined
depth stencil format.  For ETC2 on gen7 and earlier, this will be the
actual ETC2 format.  This makes a bit more GL sense but means we have to
be careful in state upload.

Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agoglsl: Parse 'layout' as a token with advanced blending or bindless
Kenneth Graunke [Mon, 19 Feb 2018 17:35:46 +0000 (09:35 -0800)]
glsl: Parse 'layout' as a token with advanced blending or bindless

Both KHR_blend_equation_advanced and ARB_bindless_texture provide
layout qualifiers, and are exposed in compatibility contexts.  We
need to parse the layout qualifier as a token in order for those
to work, but forgot to extend this check.

ARB_shader_image_load_store would need a similar treatment, but we
don't expose that in legacy OpenGL contexts.
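
A small sketch of the kind of check being extended, assuming a lexer that
decides whether the word "layout" is a keyword token or a plain identifier.
The struct, its field names, and the token enum are hypothetical and do not
mirror the real GLSL parser state.

```c
#include <stdbool.h>

/* Hypothetical parser state; field names are illustrative only. */
struct sketch_parse_state {
   bool core_layout_supported;          /* language version has 'layout' natively */
   bool blend_equation_advanced_enable; /* KHR_blend_equation_advanced */
   bool bindless_texture_enable;        /* ARB_bindless_texture */
};

enum sketch_token { SKETCH_TOK_IDENTIFIER, SKETCH_TOK_LAYOUT };

/* Decide how the lexer should classify the word "layout". */
static enum sketch_token
sketch_classify_layout(const struct sketch_parse_state *st)
{
   if (st->core_layout_supported ||
       st->blend_equation_advanced_enable ||
       st->bindless_texture_enable)
      return SKETCH_TOK_LAYOUT;

   /* Otherwise "layout" stays an ordinary identifier. */
   return SKETCH_TOK_IDENTIFIER;
}
```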

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agovulkan/wsi/x11: Consistently update and return swapchain status
Daniel Stone [Tue, 20 Feb 2018 20:56:02 +0000 (20:56 +0000)]
vulkan/wsi/x11: Consistently update and return swapchain status

Use a helper function for updating the swapchain status. This will be
used later to handle VK_SUBOPTIMAL_KHR, where we need to make a
non-error status stick to the swapchain until recreation.  Instead of
direct comparisons to VK_SUCCESS to check for error, test for negative
numbers meaning an error status, and positive numbers indicating
non-error statuses.

v2 (Jason Ekstrand):
 - Use a pattern of "return x11_swapchain_result(chain, VK_WHATEVER)"
 - Handle wsi_queue_pull returning VK_TIMEOUT
 - Call x11_swapchain_result in x11_present_to_x11
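
The sign convention and the sticky status can be sketched as follows. This
is not the helper from the patch: the sketch_* names are hypothetical, the
constants only mirror the VkResult numbering, and the suboptimal branch
illustrates the later use mentioned above rather than anything this commit
adds.

```c
#include <stdint.h>

/* Hypothetical result codes following the VkResult convention:
 * negative = error, zero = success, positive = non-error status. */
typedef int32_t sketch_result;
#define SKETCH_SUCCESS            0
#define SKETCH_TIMEOUT            2
#define SKETCH_SUBOPTIMAL         1000001003
#define SKETCH_ERROR_OUT_OF_DATE  (-1000001004)

struct sketch_swapchain {
   sketch_result status; /* sticky status, starts at SKETCH_SUCCESS */
};

static sketch_result
sketch_swapchain_result(struct sketch_swapchain *chain, sketch_result result)
{
   /* Once an error has been recorded, keep returning it. */
   if (chain->status < 0)
      return chain->status;

   /* A new error becomes the sticky status. */
   if (result < 0) {
      chain->status = result;
      return result;
   }

   /* A timeout is transient: pass it through without sticking. */
   if (result == SKETCH_TIMEOUT)
      return result;

   /* Assumed later behaviour: suboptimal sticks until recreation. */
   if (result == SKETCH_SUBOPTIMAL) {
      chain->status = result;
      return result;
   }

   /* Plain success: report whatever non-error status is stuck. */
   return chain->status;
}
```

Callers would then follow the "return helper(chain, result)" pattern from
the v2 notes so that every exit path updates the stored status.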

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agovulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails
Jason Ekstrand [Wed, 21 Feb 2018 20:38:12 +0000 (12:38 -0800)]
vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails

This most likely means we lost our connection to the X server so
OUT_OF_DATE is reasonable.  This was also the one case where we pushed a
UINT32_MAX into the queue without setting an error condition.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Stone <daniels@collabora.com>
6 years agovulkan/wsi/wayland: Add support for zwp_dmabuf
Daniel Stone [Fri, 9 Feb 2018 23:43:30 +0000 (15:43 -0800)]
vulkan/wsi/wayland: Add support for zwp_dmabuf

zwp_linux_dmabuf_v1 lets us use multi-planar images and buffer
modifiers.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoanv/image: Add support for modifiers for WSI
Jason Ekstrand [Tue, 14 Nov 2017 00:44:07 +0000 (16:44 -0800)]
anv/image: Add support for modifiers for WSI

This adds support for the modifiers portion of the WSI "extension".

Reviewed-by: Daniel Stone <daniels@collabora.com>
6 years agoanv/image: Separate modifiers from legacy scanout
Jason Ekstrand [Thu, 25 Jan 2018 03:47:14 +0000 (19:47 -0800)]
anv/image: Separate modifiers from legacy scanout

For a bit there, we had a bug in i965 where it ignored the tiling of the
modifier and used the one from the BO instead.  At one point, we thought
this was best fixed by setting a tiling from Vulkan.  However, we've
decided that i965 was just doing the wrong thing and have fixed it as of
50485723523d2948a44570ba110f02f726f86a54.

The old assumptions also affected the solution we used for legacy
scanout in Vulkan.  Instead of treating it specially, we just treated it
like a modifier like we do in GL.  This commit goes back to making it
its own thing so that it's clear in the driver when we're using
modifiers and when we're using legacy paths.

v2 (Jason Ekstrand):
 - Rename legacy_scanout to needs_set_tiling

Reviewed-by: Daniel Stone <daniels@collabora.com>
6 years agovulkan/wsi: Add modifiers support to wsi_create_native_image
Jason Ekstrand [Fri, 9 Feb 2018 23:43:27 +0000 (15:43 -0800)]
vulkan/wsi: Add modifiers support to wsi_create_native_image

This involves extending our fake extension a bit to allow for additional
querying and passing of modifier information.  The added bits are
intended to look a lot like the draft of VK_EXT_image_drm_format_modifier.
Once the extension gets finalized, we'll simply transition all of the
structs used in wsi_common to the real extension structs.

Reviewed-by: Daniel Stone <daniels@collabora.com>
6 years agovulkan/wsi: Add drm_modifier member to wsi_image
Daniel Stone [Fri, 9 Feb 2018 23:43:26 +0000 (15:43 -0800)]
vulkan/wsi: Add drm_modifier member to wsi_image

Not yet used anywhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agovulkan/wsi: Add multiple planes to wsi_image
Daniel Stone [Fri, 9 Feb 2018 23:43:25 +0000 (15:43 -0800)]
vulkan/wsi: Add multiple planes to wsi_image

Not currently used.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agonir: remove old assert
Timothy Arceri [Wed, 21 Feb 2018 03:36:09 +0000 (14:36 +1100)]
nir: remove old assert

This was originally intended to make sure the remap location
was not -1. However, the code has changed a lot since then:
the location is now never set to -1 and we also handle
components, meaning this old assert has been comparing against
the pointer to the array of component data.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183

6 years agoradeonsi/nir: collect more accurate output_usagemask
Timothy Arceri [Wed, 21 Feb 2018 02:27:17 +0000 (13:27 +1100)]
radeonsi/nir: collect more accurate output_usagemask

Fixes assert in the glsl-1.50-gs-max-output-components piglit test.

Note that the double handling will only work for doubles that
don't take up multiple slots i.e. double and dvec2. However
dual slot double handling is an existing bug which is made no
worse by this patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi/nir: disable GLSL IR loop unrolling
Timothy Arceri [Wed, 21 Feb 2018 01:30:30 +0000 (12:30 +1100)]
radeonsi/nir: disable GLSL IR loop unrolling

Delaying unrolling and allowing NIR to do it instead has been shown
to result in better code in drivers such as i965. shader-db results
appear to show the same is true for radeonsi.

The other advantage is that using NIR unrolling improves compile
times significantly.

Totals from affected shaders:
SGPRS: 9624 -> 10016 (4.07 %)
VGPRS: 6800 -> 6464 (-4.94 %)
Spilled SGPRs: 0 -> 2 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 359176 -> 332264 (-7.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1355 -> 1432 (5.68 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi/nir: fix tess varying loads for doubles
Timothy Arceri [Tue, 20 Feb 2018 23:10:33 +0000 (10:10 +1100)]
radeonsi/nir: fix tess varying loads for doubles

Fixes the following piglit tests:

tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test
tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoac/radeonsi: pass type to load_tess_varyings()
Timothy Arceri [Tue, 20 Feb 2018 23:09:18 +0000 (10:09 +1100)]
ac/radeonsi: pass type to load_tess_varyings()

We need this to be able to load 64bit varyings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agox11/dri3: Store raw present completion mode
Daniel Stone [Wed, 21 Feb 2018 10:39:34 +0000 (10:39 +0000)]
x11/dri3: Store raw present completion mode

The DRI3 drawable info struct currently stores a boolean for whether the
last completed operation was a flip or not. As we need to track the full
completion mode for handling suboptimal returns, change the 'flipping'
field to the raw present completion mode from the server.
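
A sketch of the field change, assuming completion modes along the lines of
the Present extension's copy, flip, skip and suboptimal-copy values. The
enum, struct and helper names are hypothetical, not the loader's.

```c
#include <stdbool.h>

/* Hypothetical completion modes, modelled on the Present extension. */
enum sketch_complete_mode {
   SKETCH_COMPLETE_COPY,
   SKETCH_COMPLETE_FLIP,
   SKETCH_COMPLETE_SKIP,
   SKETCH_COMPLETE_SUBOPTIMAL_COPY,
};

struct sketch_drawable {
   /* Was "bool flipping"; now keeps the full mode reported by the server. */
   enum sketch_complete_mode last_present_mode;
};

/* The old boolean question can still be answered from the raw mode. */
static bool
sketch_was_flip(const struct sketch_drawable *draw)
{
   return draw->last_present_mode == SKETCH_COMPLETE_FLIP;
}

/* The new question a boolean could not answer. */
static bool
sketch_was_suboptimal(const struct sketch_drawable *draw)
{
   return draw->last_present_mode == SKETCH_COMPLETE_SUBOPTIMAL_COPY;
}
```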

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years agox11/dri3: Don't open-code ARRAY_SIZE
Daniel Stone [Wed, 21 Feb 2018 11:39:09 +0000 (11:39 +0000)]
x11/dri3: Don't open-code ARRAY_SIZE
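
For readers unfamiliar with the idiom, a sketch of open-coded versus
macro-based element counting. The macro is defined locally under a
hypothetical name so the example is self-contained, and the table contents
are made up.

```c
#include <stddef.h>

/* Local stand-in for an ARRAY_SIZE-style macro (hypothetical name). */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Made-up table used only for this example. */
static const int sketch_table[] = { 8, 16, 24, 32 };

/* Open-coded: the division is repeated at every use site. */
static size_t
sketch_count_open_coded(void)
{
   return sizeof(sketch_table) / sizeof(sketch_table[0]);
}

/* Macro-based: same value, with the intent stated once. */
static size_t
sketch_count_with_macro(void)
{
   return SKETCH_ARRAY_SIZE(sketch_table);
}
```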

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agoanv: Don't assert that stencil HiZ clears are single-slice
Jason Ekstrand [Wed, 21 Feb 2018 21:07:10 +0000 (13:07 -0800)]
anv: Don't assert that stencil HiZ clears are single-slice

It's true for depth HiZ clears because we only have HiZ on single-slice
images right now.  However, for stencil-only clears there is no such
restriction.

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv: Only copy clear dwords if we're rendering to the first slice
Jason Ekstrand [Sun, 11 Feb 2018 06:10:03 +0000 (22:10 -0800)]
anv: Only copy clear dwords if we're rendering to the first slice

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agoradeonsi: don't flush when si_eliminate_fast_color_clear is no-op
Marek Olšák [Tue, 30 Jan 2018 01:51:47 +0000 (02:51 +0100)]
radeonsi: don't flush when si_eliminate_fast_color_clear is no-op

6 years agoradeonsi: make texture_discard_cmask/eliminate functions non-static
Marek Olšák [Sun, 7 Jan 2018 20:04:55 +0000 (21:04 +0100)]
radeonsi: make texture_discard_cmask/eliminate functions non-static

6 years agoradeonsi: enable uvd encode for HEVC main
James Zhu [Mon, 5 Feb 2018 17:02:50 +0000 (12:02 -0500)]
radeonsi: enable uvd encode for HEVC main

Enable UVD encode for HEVC main profile

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
6 years agoradeonsi:create uvd hevc enc entry
James Zhu [Mon, 5 Feb 2018 22:08:22 +0000 (17:08 -0500)]
radeonsi:create uvd hevc enc entry

Add UVD hevc encode pipe video codec creation entry

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
6 years agoradeon/uvd:add uvd hevc enc functions
James Zhu [Tue, 6 Feb 2018 18:29:11 +0000 (13:29 -0500)]
radeon/uvd:add uvd hevc enc functions

Implement UVD hevc encode functions

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
6 years agoradeon/uvd:add uvd hevc enc hw ib implementation
James Zhu [Tue, 6 Feb 2018 18:26:28 +0000 (13:26 -0500)]
radeon/uvd:add uvd hevc enc hw ib implementation

Implement required IBs for UVD HEVC encode.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
6 years agoradeon/uvd:add uvd hevc enc hw interface header
James Zhu [Tue, 6 Feb 2018 18:18:21 +0000 (13:18 -0500)]
radeon/uvd:add uvd hevc enc hw interface header

Add hevc encode hardware interface for UVD

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
6 years agowinsys/amdgpu:add uvd hevc enc support in amdgpu cs
James Zhu [Tue, 6 Feb 2018 17:39:03 +0000 (12:39 -0500)]
winsys/amdgpu:add uvd hevc enc support in amdgpu cs

Support UVD HEVC encode in amdgpu cs

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
6 years agoamd/common:add uvd hevc enc support check in hw query
James Zhu [Mon, 5 Feb 2018 21:28:13 +0000 (16:28 -0500)]
amd/common:add uvd hevc enc support check in hw query

Use the amdgpu hardware query information to check whether UVD HEVC
encode is supported.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agonvir/nvc0: fix legalizing of ld unlock c0[0x10000]
Karol Herbst [Mon, 19 Feb 2018 23:45:14 +0000 (00:45 +0100)]
nvir/nvc0: fix legalizing of ld unlock c0[0x10000]

We have to increase the file index also for 0x10000 not just for values
greater than 0x10000.
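
The boundary condition can be sketched as below, assuming the offset has to
fit in 16 bits and the overflow is folded into the file index. The struct
and function are hypothetical and do not reproduce the nv50_ir code.

```c
#include <stdint.h>

/* Hypothetical model of a constant-buffer reference: a buffer (file)
 * index plus a byte offset that must fit in 16 bits. */
struct sketch_cb_ref {
   uint32_t file_index;
   uint32_t offset;
};

static struct sketch_cb_ref
sketch_legalize(struct sketch_cb_ref ref)
{
   /* 0x10000 itself does not fit in 16 bits, so the test must be ">=",
    * not ">". */
   if (ref.offset >= 0x10000) {
      ref.file_index += ref.offset >> 16;
      ref.offset &= 0xffff;
   }
   return ref;
}
```

With a ">" comparison, an offset of exactly 0x10000 would be left untouched
even though it is out of range.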

Fixes: 37b67db6ae34fb6586d640a7a1b6232f091dd812
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>