mesa.git
6 years agoradv: Fix fragment resolve destination offset.
Bas Nieuwenhuizen [Tue, 26 Dec 2017 15:11:35 +0000 (16:11 +0100)]
radv: Fix fragment resolve destination offset.

The position starts at (dst.x, dst.y), so if we want the source to
start at (src.x, src.y), we have to offset by (src.x-dst.x,src.y-dst.y).

Haven't tested that this fixed anything yet, but found by inspection.

Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
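
The offset arithmetic described in the fragment resolve fix above can be sketched as follows. This is only an illustration of the coordinate math, not the radv meta code: the struct offset2d type and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical 2D offset type, used only for this sketch. */
struct offset2d {
    int32_t x;
    int32_t y;
};

/* The fragment position starts at dst, so adding (src - dst) maps the
 * first destination pixel onto the first source pixel. */
static struct offset2d
resolve_src_offset(struct offset2d src, struct offset2d dst)
{
    struct offset2d delta = { src.x - dst.x, src.y - dst.y };
    return delta;
}

/* Source texel that a given fragment position should read from. */
static struct offset2d
resolve_src_coord(struct offset2d frag_pos, struct offset2d delta)
{
    struct offset2d coord = { frag_pos.x + delta.x, frag_pos.y + delta.y };
    return coord;
}
```

With src = (4, 8) and dst = (10, 20) the offset is (-6, -12), so the fragment at (10, 20) reads the source texel at (4, 8).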
6 years agoradv: Don't handle DCC in compute resolve.
Bas Nieuwenhuizen [Mon, 25 Dec 2017 13:30:50 +0000 (14:30 +0100)]
radv: Don't handle DCC in compute resolve.

If the destination has DCC, we will use the FS resolve.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Flush caches before subpass resolve.
Bas Nieuwenhuizen [Mon, 25 Dec 2017 13:27:28 +0000 (14:27 +0100)]
radv: Flush caches before subpass resolve.

Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Invert condition for all samples identical during resolve.
Bas Nieuwenhuizen [Mon, 25 Dec 2017 12:15:06 +0000 (13:15 +0100)]
radv: Invert condition for all samples identical during resolve.

The samples_identical instruction returns 0 if the samples are different,
so we have to do the extra work if the result is 0, not if it is != 0.

Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
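
The corrected condition from the commit above, sketched in plain C. The real code emits shader instructions; here samples_identical() is a CPU stand-in with the same return convention, and resolve_pixel() is a hypothetical helper that averages the samples.

```c
#include <stddef.h>

/* Stand-in for the samples_identical instruction: returns nonzero when
 * all samples of a pixel hold the same value, 0 when they differ. */
static int
samples_identical(const float *samples, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        if (samples[i] != samples[0])
            return 0;
    }
    return 1;
}

/* Resolve one pixel; assumes count >= 1. */
static float
resolve_pixel(const float *samples, size_t count)
{
    float result = samples[0];

    /* The extra averaging work is needed only when the result is 0,
     * i.e. when the samples differ. */
    if (samples_identical(samples, count) == 0) {
        for (size_t i = 1; i < count; i++)
            result += samples[i];
        result /= (float)count;
    }
    return result;
}
```

With the condition inverted, differing samples would be skipped and only sample 0 would be returned for exactly the pixels that need averaging.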
6 years agoegl: don't try the software path twice
Eric Engestrom [Wed, 20 Dec 2017 15:53:10 +0000 (15:53 +0000)]
egl: don't try the software path twice

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Brendan King <Brendan.King@imgtec.com>
6 years agoegl: rename LIBGL_ALWAYS_SOFTWARE variable from UseFallback to ForceSoftware
Eric Engestrom [Wed, 20 Dec 2017 15:53:09 +0000 (15:53 +0000)]
egl: rename LIBGL_ALWAYS_SOFTWARE variable from UseFallback to ForceSoftware

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agoegl: let each platform decide how to handle LIBGL_ALWAYS_SOFTWARE
Eric Engestrom [Wed, 20 Dec 2017 15:53:08 +0000 (15:53 +0000)]
egl: let each platform decide how to handle LIBGL_ALWAYS_SOFTWARE

My refactor in 47273d7312cb5b5b6b0b9 missed this early return; because
of it, setting UseFallback one layer above actually prevented the
software path from being used.

Remove this early return and let each platform's dri2_initialize_*()
decide what it can do with the LIBGL_ALWAYS_SOFTWARE restriction.

platform_{surfaceless,x11,wayland} were already handling it themselves.

Fixes: 47273d7312cb5b5b6b0b9 "egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Brendan King <Brendan.King@imgtec.com>
6 years agoegl: link libEGL against the dynamic version of libglapi
Brendan King [Mon, 18 Dec 2017 16:33:18 +0000 (16:33 +0000)]
egl: link libEGL against the dynamic version of libglapi

Note: the following happens only when using slibtool.
Since this is a very serious breakage, we will keep the workaround until
a better solution is available.

DRI modules store the address of the dispatch table in a TLS variable,
_glapi_tls_Dispatch.

Changes to the way libEGL is built in d884d8d0077c16d459b1 resulted in
it being statically linked against libglapi, and thus containing its own
copy of _glapi_tls_Dispatch. The result was that some applications would
fail to work (e.g. deqp-egl, which dynamically loads libEGL), due to the
DRI module storing the dispatch table address in one copy of
_glapi_tls_Dispatch, and libEGL obtaining the address from another copy
of the variable.

Fixes: d884d8d0077c16d459b1 "egl/dri: link directly to libglapi.so"
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agoradv: don't do format replacement on tc compat htile surfaces.
Dave Airlie [Wed, 27 Dec 2017 07:00:29 +0000 (17:00 +1000)]
radv: don't do format replacement on tc compat htile surfaces.

For copies the texture unit needs to know the depth format so
it can read the htile data properly.

This fixes:
dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.load.clear

Fixes: ad3d98da9f (radv: enable tc compatible htile for d32s8 also.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv/gfx9: use correct stencil format for tc compat htile.
Dave Airlie [Wed, 27 Dec 2017 01:22:58 +0000 (11:22 +1000)]
radv/gfx9: use correct stencil format for tc compat htile.

This needs to correspond to the bit depth of the Z plane.

Noticed in passing while reading amdvlk.

Fixes: fc6c77e162df3 (radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agosvga: move variant->fs_shadow_compare_units assignment
Brian Paul [Wed, 27 Dec 2017 18:05:52 +0000 (11:05 -0700)]
svga: move variant->fs_shadow_compare_units assignment

Fixes a crash since the variant object isn't allocated until later
in the function.  Not sure how this got through.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agoamd/common: rework set_userdata_location() and rename to set_loc()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:07 +0000 (20:56 +0100)]
amd/common: rework set_userdata_location() and rename to set_loc()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: rename set_userdata_location_shader() to set_loc_shader()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:06 +0000 (20:56 +0100)]
amd/common: rename set_userdata_location_shader() to set_loc_shader()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: replace set_userdata_location_indirect() by set_loc_desc()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:05 +0000 (20:56 +0100)]
amd/common: replace set_userdata_location_indirect() by set_loc_desc()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: rename radv_define_vs_user_sgprs_phase2()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:04 +0000 (20:56 +0100)]
amd/common: rename radv_define_vs_user_sgprs_phase2()

... to set_vs_specific_input_locs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: rename radv_define_common_user_sgprs_phase2()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:03 +0000 (20:56 +0100)]
amd/common: rename radv_define_common_user_sgprs_phase2()

... to set_global_input_locs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: rename add_user_sgpr_array_argument() to add_array_arg()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:02 +0000 (20:56 +0100)]
amd/common: rename add_user_sgpr_array_argument() to add_array_arg()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: replace add_sgpr_argument() by add_arg()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:01 +0000 (20:56 +0100)]
amd/common: replace add_sgpr_argument() by add_arg()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: replace add_user_sgpr_argument() by add_arg()
Samuel Pitoiset [Wed, 20 Dec 2017 19:56:00 +0000 (20:56 +0100)]
amd/common: replace add_user_sgpr_argument() by add_arg()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: replace add_vgpr_argument() by add_arg()
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:59 +0000 (20:55 +0100)]
amd/common: replace add_vgpr_argument() by add_arg()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: add new add_arg() helper for SGPRs/VGPRs arguments
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:58 +0000 (20:55 +0100)]
amd/common: add new add_arg() helper for SGPRs/VGPRs arguments

The idea is to clean up the add arguments logic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: rename radv_define_common_user_sgprs_phase1()
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:57 +0000 (20:55 +0100)]
amd/common: rename radv_define_common_user_sgprs_phase1()

... to declare_global_input_sgprs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: rename radv_define_vs_user_sgprs_phase1()
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:56 +0000 (20:55 +0100)]
amd/common: rename radv_define_vs_user_sgprs_phase1()

... to declare_vs_specific_inputs_sgprs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: do not try to declare input VS SGPRs for GS
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:55 +0000 (20:55 +0100)]
amd/common: do not try to declare input VS SGPRs for GS

It's a no-op anyway but it looked strange to me, remove it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: add declare_vs_input_vgprs() helper
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:54 +0000 (20:55 +0100)]
amd/common: add declare_vs_input_vgprs() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: add declare_tes_input_vgprs() helper
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:53 +0000 (20:55 +0100)]
amd/common: add declare_tes_input_vgprs() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: remove unnecessary num_user_sgprs_used
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:52 +0000 (20:55 +0100)]
amd/common: remove unnecessary num_user_sgprs_used

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: remove unnecessary user_sgpr_count
Samuel Pitoiset [Wed, 20 Dec 2017 19:55:51 +0000 (20:55 +0100)]
amd/common: remove unnecessary user_sgpr_count

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradeonsi: make use of ac_init_exec_full_mask()
Samuel Pitoiset [Wed, 13 Dec 2017 12:59:12 +0000 (13:59 +0100)]
radeonsi: make use of ac_init_exec_full_mask()

Similar to si_init_exec_full_mask().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agosvga: use tgsi_util_get_shadow_ref_src_index() in a couple of places
Brian Paul [Sun, 24 Dec 2017 22:38:01 +0000 (15:38 -0700)]
svga: use tgsi_util_get_shadow_ref_src_index() in a couple of places

No piglit changes.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agotgsi: improve comment on tgsi_util_get_shadow_ref_src_index()
Brian Paul [Sun, 24 Dec 2017 22:37:09 +0000 (15:37 -0700)]
tgsi: improve comment on tgsi_util_get_shadow_ref_src_index()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agosvga: fix TGSI_TEXTURE_SHADOW1D coordinate selection
Brian Paul [Sun, 24 Dec 2017 05:11:47 +0000 (22:11 -0700)]
svga: fix TGSI_TEXTURE_SHADOW1D coordinate selection

Fixes about 24 Piglit tex-miplevel-selection tests.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agosvga: fix shadow comparison failures
Brian Paul [Sat, 23 Dec 2017 21:16:52 +0000 (14:16 -0700)]
svga: fix shadow comparison failures

In some cases, we do the shadow comparison in the fragment shader
instead of with texture sampler state.  But when we do so, we must
disable the shadow comparison test in the sampler state.  As it
was, we were doing the comparison twice, which resulted in nonsense.
Also, we had the texcoord and texel value swapped in the comparison
instruction.

Fixes about 38 Piglit tex-miplevel-selection tests.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
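
Why comparing twice gives nonsense can be shown with a small model. This is a sketch under assumptions, not the svga code: all function names are hypothetical, and a GEQUAL-style compare is picked arbitrarily because it makes the difference visible.

```c
/* Shadow test in the style of GL_GEQUAL: passes when ref >= texel. */
static float
shadow_compare_gequal(float ref, float texel)
{
    return ref >= texel ? 1.0f : 0.0f;
}

/* Sampler with the comparison disabled: returns the raw depth value. */
static float
sample_raw(float depth)
{
    return depth;
}

/* Sampler with the comparison still enabled: returns the test result. */
static float
sample_compared(float ref, float depth)
{
    return shadow_compare_gequal(ref, depth);
}

/* Intended behavior: the shader compares against the raw depth. */
static float
shade_fixed(float ref, float depth)
{
    return shadow_compare_gequal(ref, sample_raw(depth));
}

/* Old behavior: the shader compares against an already-compared value. */
static float
shade_buggy(float ref, float depth)
{
    return shadow_compare_gequal(ref, sample_compared(ref, depth));
}
```

For ref = 0.5 and depth = 0.25 the single comparison passes (1.0), while the doubled one compares 0.5 against the sampler's 1.0 and fails (0.0).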
6 years agoutil: add trivial comment on u_upload_create()
Brian Paul [Sun, 24 Dec 2017 05:12:52 +0000 (22:12 -0700)]
util: add trivial comment on u_upload_create()

6 years agor600: fix atomic counter index mode getting emitted on pre-cayman
Dave Airlie [Wed, 27 Dec 2017 01:56:20 +0000 (01:56 +0000)]
r600: fix atomic counter index mode getting emitted on pre-cayman

This is a regression since I added cayman atomic support, not sure
it fixes anything, but the shader dumps look better.

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: set some dcc parameters depending on if texture will be sampled
Dave Airlie [Tue, 26 Dec 2017 22:16:53 +0000 (08:16 +1000)]
radv: set some dcc parameters depending on if texture will be sampled

This is ported from amdvlk which sets the independent 64b blocks
only for image which will sample dcc.

I'm not sure how to port this to radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv/radeonsi: set dcc min uncompressed properly for APUs.
Dave Airlie [Tue, 26 Dec 2017 22:02:30 +0000 (08:02 +1000)]
radv/radeonsi: set dcc min uncompressed properly for APUs.

This is ported from amdvlk.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoamd/common/radv/radeonsi: use register defines for dcc block sizes.
Dave Airlie [Tue, 26 Dec 2017 21:56:12 +0000 (07:56 +1000)]
amd/common/radv/radeonsi: use register defines for dcc block sizes.

These are just taken from amdvlk, we probably knew these already,
but may as well port them now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agost/glsl_to_nir: add patch support to st_nir_assign_var_locations()
Timothy Arceri [Wed, 13 Dec 2017 23:14:34 +0000 (10:14 +1100)]
st/glsl_to_nir: add patch support to st_nir_assign_var_locations()

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/glsl_to_nir: call post opt functions after opts have finished
Timothy Arceri [Thu, 14 Dec 2017 03:48:49 +0000 (14:48 +1100)]
st/glsl_to_nir: call post opt functions after opts have finished

We need to move this to a separate loop because
nir_compact_varyings() can alter the IR of a previous stage.

Fixes: 6648bd68fd27 "st/glsl_to_nir: enable NIR link time opts"
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/st_glsl_to_nir: call nir_lower_64bit_pack
Timothy Arceri [Thu, 14 Dec 2017 05:02:45 +0000 (16:02 +1100)]
st/st_glsl_to_nir: call nir_lower_64bit_pack

Fixes 56 crashes in the radeonsi nir backend.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agodocs/features: show es3.1 compat done on r600.
Dave Airlie [Wed, 27 Dec 2017 00:07:25 +0000 (00:07 +0000)]
docs/features: show es3.1 compat done on r600.

This was already being reported, just missed the docs.

Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agomesa: always compare optype with symbolic name in ATI_fs
Miklós Máté [Sat, 2 Dec 2017 22:35:25 +0000 (23:35 +0100)]
mesa: always compare optype with symbolic name in ATI_fs

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: document ati_fragment_shader::cur_pass and swizzlerq
Miklós Máté [Sat, 2 Dec 2017 22:35:24 +0000 (23:35 +0100)]
mesa: document ati_fragment_shader::cur_pass and swizzlerq

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
6 years agomesa: move ATI_fs state compile changes after the error checks
Miklós Máté [Sat, 2 Dec 2017 22:35:23 +0000 (23:35 +0100)]
mesa: move ATI_fs state compile changes after the error checks

Both in setup and arithmetic instructions. Also, remove the useless
new_*_inst() functions, and refactor check_arith_arg(), because it did
two completely different things.

Piglit: spec/ati_fragment_shader/error04-endshader

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
6 years agotnl: fix not having texture coords in ATI_fs in swrast
Miklós Máté [Sat, 2 Dec 2017 22:35:22 +0000 (23:35 +0100)]
tnl: fix not having texture coords in ATI_fs in swrast

ATI_fs in swrast only had access to texture coordinates if there was a
valid texture bound and texturing was enabled.

Piglit: spec/ati_fragment_shader/render-sources and render-notexture

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
6 years agomesa: fix not having secondary color in ATI_fs in swrast
Miklós Máté [Sat, 2 Dec 2017 22:35:21 +0000 (23:35 +0100)]
mesa: fix not having secondary color in ATI_fs in swrast

ATI_fs in swrast only had secondary color if GL_COLOR_SUM was enabled.
This patch probably fixes the same issue in r200.

Piglit: spec/ati_fragment_shader/render-sources and render-precedence

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
6 years agomesa: fix validate for secondary interpolator
Miklós Máté [Sat, 2 Dec 2017 22:35:20 +0000 (23:35 +0100)]
mesa: fix validate for secondary interpolator

This patch fixes multiple problems:
- the interpolator check was duplicated
- both had arg instead of argRep
- I split it into color and alpha for better readability and error msg
- the DOT4 check only applies to color instruction according to the spec
- made the DOT4 check fatal, and improved the error msg

Piglit: spec/ati_fragment_shader/error08-secondary

v2: fixed formatting, added spec quotations

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
6 years agomesa: fix typo in ATI_fs dstMod error checking
Miklós Máté [Sat, 2 Dec 2017 22:35:19 +0000 (23:35 +0100)]
mesa: fix typo in ATI_fs dstMod error checking

Piglit: spec/ati_fragment_shader/error14-invalidmod

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years agomesa: fix crash when an ATI_fs pass begins with an alpha inst
Miklós Máté [Sat, 2 Dec 2017 22:35:18 +0000 (23:35 +0100)]
mesa: fix crash when an ATI_fs pass begins with an alpha inst

This fixes a crash when:
- first pass begins with alpha inst
- first pass ends with color inst, second pass begins with alpha inst
Also, use the symbolic name instead of a number.

Piglit: spec/ati_fragment_shader/api-alphafirst

v2: fixed formatting

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
6 years agomesa: add fallback texture for SampleMapATI if there is nothing
Miklós Máté [Sat, 2 Dec 2017 22:35:17 +0000 (23:35 +0100)]
mesa: add fallback texture for SampleMapATI if there is nothing

This fixes a crash in the state tracker.

Piglit: spec/ati_fragment_shader/render-notexture

v2: fixed formatting, moved stuff inside the loop,
    moved the fallback later to fix more cases

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
6 years agoradeonsi: don't use fast color clear for small images even on APUs
Marek Olšák [Tue, 12 Dec 2017 23:40:19 +0000 (00:40 +0100)]
radeonsi: don't use fast color clear for small images even on APUs

Increase the limit and handle non-square images better.

This makes glxgears 20% faster on APUs, and a little more on dGPUs.
We all use and love glxgears.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradeonsi: set PNT_SPRITE_ENA = point_quad_rasterization
Marek Olšák [Mon, 11 Dec 2017 18:28:01 +0000 (19:28 +0100)]
radeonsi: set PNT_SPRITE_ENA = point_quad_rasterization

This is based on how nvc0 translates the state.

6 years agogallium/util: add util_num_layers helper
Marek Olšák [Mon, 11 Dec 2017 18:27:39 +0000 (19:27 +0100)]
gallium/util: add util_num_layers helper

6 years agoradv: Fix DCC compatible formats.
Bas Nieuwenhuizen [Sat, 23 Dec 2017 00:40:03 +0000 (01:40 +0100)]
radv: Fix DCC compatible formats.

DCC was disabled when the image format is !!supported, which is one ! too many.

Ironically the commit that introduced it was supposed to lead to more DCC use ...

Fixes: 969537d9358 "radv: Add support for more DCC compression with VK_KHR_image_format_list."
Reviewed-by: Dave Airlie <airlied@redhat.com>
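
The "one ! too many" bug in the DCC commit above, reduced to a standalone sketch. The function names and the boolean parameter are hypothetical; the actual radv check is not reproduced here.

```c
#include <stdbool.h>

/* Buggy form: "!!" is a no-op on a boolean, so DCC ends up disabled
 * exactly when the format IS supported. */
static bool
dcc_enabled_buggy(bool format_supported)
{
    bool disable = !!format_supported; /* one '!' too many */
    return !disable;
}

/* Fixed form: disable DCC only when the format is NOT supported. */
static bool
dcc_enabled_fixed(bool format_supported)
{
    bool disable = !format_supported;
    return !disable;
}
```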
6 years agoRevert "i965/fs: Use align1 mode on ternary instructions on Gen10+"
Anuj Phogat [Fri, 22 Dec 2017 21:54:08 +0000 (13:54 -0800)]
Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+"

This reverts commit 9cd60fce9c22737000a8f8dc711141f8a523fe75.
Above commit caused 2000+ piglit tests to assert fail. Disabling
the align1 mode on gen10 for now to avoid failures.

Cc: Matt Turner <mattst88@gmail.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agodocs: update calendar, add news item and link release notes for 17.2.8
Andres Gomez [Fri, 22 Dec 2017 22:59:22 +0000 (00:59 +0200)]
docs: update calendar, add news item and link release notes for 17.2.8

Signed-off-by: Andres Gomez <agomez@igalia.com>
6 years agodocs: add sha256 checksums for 17.2.8
Andres Gomez [Fri, 22 Dec 2017 22:54:11 +0000 (00:54 +0200)]
docs: add sha256 checksums for 17.2.8

Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 3281775ab9993d700a0a01a2823b6e7c72fce150)

6 years agodocs: add release notes for 17.2.8
Andres Gomez [Fri, 22 Dec 2017 20:39:47 +0000 (22:39 +0200)]
docs: add release notes for 17.2.8

Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 3482790712e92d660706952f9ff282d904415941)

6 years agofreedreno: set missing internal_format when importing texture
Ilia Mirkin [Fri, 22 Dec 2017 05:27:50 +0000 (00:27 -0500)]
freedreno: set missing internal_format when importing texture

Fixes running piglits without -fbo. Probably lots of other stuff too.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agoamd/common: add ac_export_mrt_z() helper
Samuel Pitoiset [Thu, 21 Dec 2017 16:53:15 +0000 (17:53 +0100)]
amd/common: add ac_export_mrt_z() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoamd/common: pass the family to ac_llvm_context_init()
Samuel Pitoiset [Thu, 21 Dec 2017 16:53:14 +0000 (17:53 +0100)]
amd/common: pass the family to ac_llvm_context_init()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: reduce the number of small surfaces that need CMASK or DCC
Samuel Pitoiset [Thu, 21 Dec 2017 16:45:23 +0000 (17:45 +0100)]
radv: reduce the number of small surfaces that need CMASK or DCC

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agogm107/ir: use lane 0 for manual textureGrad handling
Ilia Mirkin [Wed, 20 Dec 2017 04:37:25 +0000 (23:37 -0500)]
gm107/ir: use lane 0 for manual textureGrad handling

This is parallel to the pre-SM50 change which does this. Adjusts the
shuffles / quadops to make the values correct relative to lane 0, and
then splat the results to all lanes for the final move into the target
register.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-By: Karol Herbst <kherbst@redhat.com>
6 years agoradv/meta: fix blit paths for depth/stencil (v2.1)
Dave Airlie [Thu, 21 Dec 2017 06:52:24 +0000 (16:52 +1000)]
radv/meta: fix blit paths for depth/stencil (v2.1)

This fixes the layout issue for the blit path as well.

This fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint*

v2: use compatible render passes.
v2.1: use enum

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: handle depth/stencil image copy with layouts better. (v3.1)
Dave Airlie [Thu, 21 Dec 2017 06:23:30 +0000 (16:23 +1000)]
radv: handle depth/stencil image copy with layouts better. (v3.1)

If we are doing a general->general transfer with HIZ enabled,
we want to hit the tile surface disable bits in radv_emit_fb_ds_state,
however we never get the current layout to know we are in general
and meta hardcoded the transfer layout which is always tile enabled.

This fixes:
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general

v2: refactor some shared helpers for blit patches
v3: we only need multiple render passes as they should be compatible.
v3.1: use enum (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: refactor blit2d pipeline creation
Dave Airlie [Wed, 20 Dec 2017 23:00:43 +0000 (09:00 +1000)]
radv: refactor blit2d pipeline creation

This just refactors the gfx9 blit2d pipeline creation
to use fewer lines of code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv/gfx9: add support for 3d images to blit 2d paths
Dave Airlie [Tue, 19 Dec 2017 05:42:10 +0000 (15:42 +1000)]
radv/gfx9: add support for 3d images to blit 2d paths

This adds support for a 3D image reading path to the blit 2d paths,
like I did for the clear paths.

Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv/gfx9: add 3d sampler image->buffer copy shader. (v3)
Dave Airlie [Tue, 19 Dec 2017 03:55:18 +0000 (13:55 +1000)]
radv/gfx9: add 3d sampler image->buffer copy shader. (v3)

On GFX9 we must access 3D textures with 3D samplers AFAICS.

This fixes:
dEQP-VK.api.image_clearing.core.clear_color_image.3d.single_layer

on GFX9 for me.

v1.1: fix tex->sampler_dim to dim
v2: send layer in from outside
v3: don't regress on pre-gfx9

Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: fix surface max layer count (v2)
Dave Airlie [Tue, 19 Dec 2017 05:41:42 +0000 (15:41 +1000)]
radv: fix surface max layer count (v2)

Looking at traces, I noticed we'd set slice_max too large sometimes.

This should fix it.

v2: fix missing - 1

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
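
The "missing - 1" mentioned in v2 of the commit above is the usual count-to-inclusive-index conversion. A minimal sketch, assuming slice_max is an inclusive last-layer index; last_layer() and its parameters are hypothetical names, not the radv code.

```c
/* Inclusive index of the last layer in a range that starts at base_layer
 * and spans layer_count layers. Assumes layer_count >= 1. */
static unsigned
last_layer(unsigned base_layer, unsigned layer_count)
{
    return base_layer + layer_count - 1;
}
```

For a view starting at layer 2 with 4 layers, the last layer is 5; without the "- 1" the value would point one layer past the end.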
6 years agointel/fs: Initialize fs_visitor::grf_used on construction.
Francisco Jerez [Sun, 17 Dec 2017 08:21:13 +0000 (00:21 -0800)]
intel/fs: Initialize fs_visitor::grf_used on construction.

This should shut up some Valgrind errors during pre-regalloc
scheduling.  The errors were harmless since they could only have led
to the estimation of the bank conflict penalty of an instruction
pre-regalloc, which is inaccurate at that point of the program
compilation, but no less accurate than the intended "return 0"
fall-back path.  The scheduling pass is normally re-run after regalloc
with a well-defined grf_used value and accurate bank conflict
information.

Fixes: acf98ff933d "intel/fs: Teach instruction scheduler about GRF bank conflict cycles."
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years agointel/fs/bank_conflicts: Use posix_memalign() instead of overaligned new to obtain...
Francisco Jerez [Sun, 17 Dec 2017 21:05:55 +0000 (13:05 -0800)]
intel/fs/bank_conflicts: Use posix_memalign() instead of overaligned new to obtain vector storage.

The weight_vector_type constructor was inadvertently assuming C++17
semantics of the new operator applied on a type with alignment
requirement greater than the largest fundamental alignment.
Unfortunately on earlier C++ dialects the implementation was allowed
to raise an allocation failure when the alignment requirement of the
allocated type was unsupported, in an implementation-defined fashion.
It's expected that a C++ implementation recent enough to implement
P0035R4 would have honored allocation requests for such over-aligned
types even if the C++17 dialect wasn't active, which is likely the
reason why this problem wasn't caught by our CI system.

A more elegant fix would involve wrapping the __SSE2__ block in a
'__cpp_aligned_new >= 201606' preprocessor conditional and continue
taking advantage of the language feature, but that would yield lower
compile-time performance on old compilers not implementing it
(e.g. GCC versions older than 7.0).

Fixes: af2c320190f3c731 "intel/fs: Implement GRF bank conflict mitigation pass."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104226
Reported-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
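
A minimal C sketch of the allocation approach named in the commit above: requesting over-aligned storage explicitly through posix_memalign() rather than relying on the allocator to honor the type's alignment. The alloc_aligned() wrapper is a hypothetical helper, not the Mesa code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdint.h>
#include <stdlib.h>

/* Allocate `size` bytes aligned to `alignment`, which must be a power of
 * two and a multiple of sizeof(void *). Returns NULL on failure. The
 * result is released with free(). */
static void *
alloc_aligned(size_t alignment, size_t size)
{
    void *p = NULL;

    /* posix_memalign() returns 0 on success and an error number otherwise;
     * it does not set errno. */
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL;
    return p;
}
```

A caller needing 16-byte-aligned vector storage would use alloc_aligned(16, n * sizeof(float)) and check the result for NULL.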
6 years agoRevert "spirv: consider bitsize when handling OpSwitch cases"
Mark Janes [Thu, 21 Dec 2017 20:15:40 +0000 (12:15 -0800)]
Revert "spirv: consider bitsize when handling OpSwitch cases"

This reverts commit 9702fac68e8bd07be8871f7925d7f9fb98da3699, which
hangs vulkancts and crucible on all platforms.

The patch is being reverted because it disables continuous integration
testing.  The patch from bug 104359 does not apply to master.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359

6 years agoradv: fix issue with multisample positions and interp_var_at_sample.
Dave Airlie [Thu, 21 Dec 2017 04:03:20 +0000 (14:03 +1000)]
radv: fix issue with multisample positions and interp_var_at_sample.

This fixes vmfaults seen on vega with:
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_single_sample_.128_128_1.samples_1

These were triggered by the "don't allocate cmask" change, but that was just accidental.

The actual problem was the shader was trying to get the sample positions from
a buffer, but the buffer was never getting configured to contain them, as the
previous shader never needed them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 1171b304f3 (radv: overhaul fragment shader sample positions.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agodocs: update calendar, add news item and link release notes for 17.3.1
Emil Velikov [Thu, 21 Dec 2017 17:38:04 +0000 (17:38 +0000)]
docs: update calendar, add news item and link release notes for 17.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
6 years agodocs: add sha256 checksums for 17.3.1
Emil Velikov [Thu, 21 Dec 2017 17:34:52 +0000 (17:34 +0000)]
docs: add sha256 checksums for 17.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f66496d291881f1eaca2ee5d326367fb73537541)

6 years agodocs: add release notes for 17.3.1
Emil Velikov [Thu, 21 Dec 2017 17:04:41 +0000 (17:04 +0000)]
docs: add release notes for 17.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4f5e85e9e97de4ae6e3d779ff42bf392c4739234)

6 years agoradv/gfx9: fix primitive topology when adjacency is used
Samuel Pitoiset [Wed, 20 Dec 2017 19:57:21 +0000 (20:57 +0100)]
radv/gfx9: fix primitive topology when adjacency is used

Found by inspection.

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoglsl: disable vec3 packing/splitting in tfb separate mode
Brian Paul [Mon, 18 Dec 2017 19:32:56 +0000 (12:32 -0700)]
glsl: disable vec3 packing/splitting in tfb separate mode

This fixes a varying packing issue when using transform feedback in
GL_SEPARATE_ATTRIBS mode.  By the time we get to linking, we already
know that the number of feedback attributes is under the
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't
as critical.  In fact, packing/splitting vec3 attributes can cause
trouble because splitting effectively creates another TFB output
which can exceed device limits.  So, disable vec3 packing when it's
not needed to avoid that issue.

Fixes the Piglit ext_transform_feedback-separate test on VMware
driver.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoglsl: simplify packing class comparison
Brian Paul [Fri, 15 Dec 2017 22:21:46 +0000 (15:21 -0700)]
glsl: simplify packing class comparison

Handle comparing the packing class using the same method as we do
for var->data.is_xfb_only

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoglsl: document varying_matches::assign_locations() params and return value
Brian Paul [Fri, 15 Dec 2017 22:08:17 +0000 (15:08 -0700)]
glsl: document varying_matches::assign_locations() params and return value

And change *components to components[] as a reminder that it's an array.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoglsl: remove some continue statements
Brian Paul [Fri, 15 Dec 2017 21:36:25 +0000 (14:36 -0700)]
glsl: remove some continue statements

In some cases, I think loop code is easier to read without continue
statements.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoglsl: use bitwise operators in varying_matches::compute_packing_class()
Brian Paul [Fri, 15 Dec 2017 21:30:26 +0000 (14:30 -0700)]
glsl: use bitwise operators in varying_matches::compute_packing_class()

The mix of bitwise operators with * and + to compute the packing_class
values was a little weird.  Just use bitwise ops instead.

v2: add assertion to make sure interpolation bits fit without collision,
per Timothy.  Basically, rewrite function to be simpler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
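
A minimal sketch of the idea described in the commit above: build a packing
class purely from shifts and ORs, with an assertion that the interpolation
value fits in its bit field.  The struct, field names and bit positions are
assumptions made for illustration; they are not taken from Mesa's
varying_matches::compute_packing_class().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for a varying's qualifiers.  The fields and
 * their bit positions are assumptions for this sketch, not Mesa's layout. */
struct varying_quals {
    unsigned interpolation; /* must fit in INTERP_BITS bits */
    bool centroid;
    bool sample;
    bool patch;
};

enum { INTERP_BITS = 3 };

/* Hypothetical helper: combine the qualifiers into one integer using only
 * bitwise operators, so each qualifier owns a distinct bit range. */
static unsigned
packing_class(const struct varying_quals *q)
{
    /* Guard against the interpolation value spilling into the flag bits. */
    assert(q->interpolation < (1u << INTERP_BITS));

    return (q->interpolation << 0) |
           ((unsigned)q->centroid << INTERP_BITS) |
           ((unsigned)q->sample << (INTERP_BITS + 1)) |
           ((unsigned)q->patch << (INTERP_BITS + 2));
}
```

Two varyings would then be candidates for packing together only when their
class values compare equal.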
6 years agoglsl: simplify loop in varying_matches::assign_locations()
Brian Paul [Fri, 15 Dec 2017 21:27:55 +0000 (14:27 -0700)]
glsl: simplify loop in varying_matches::assign_locations()

The use of break/continue was kind of weird/confusing.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoglsl: minor simplification in assign_varying_locations()
Brian Paul [Fri, 15 Dec 2017 21:25:20 +0000 (14:25 -0700)]
glsl: minor simplification in assign_varying_locations()

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoglsl: make varying_matches::is_varying_packing_safe() const
Brian Paul [Fri, 15 Dec 2017 21:23:39 +0000 (14:23 -0700)]
glsl: make varying_matches::is_varying_packing_safe() const

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoglsl: trivial comment fixes in lower_packed_varyings.cpp
Brian Paul [Fri, 15 Dec 2017 17:18:00 +0000 (10:18 -0700)]
glsl: trivial comment fixes in lower_packed_varyings.cpp

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

6 years agodocs: update 17.3 and 18.0 cycles for the release calendar
Andres Gomez [Mon, 18 Dec 2017 19:31:23 +0000 (21:31 +0200)]
docs: update 17.3 and 18.0 cycles for the release calendar

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agospirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist
Juan A. Suarez Romero [Wed, 20 Dec 2017 10:51:31 +0000 (11:51 +0100)]
spirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist

Fixes: bb1e6ff161c ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
6 years agost/dri: allow direct YUYV import
Lucas Stach [Tue, 30 May 2017 13:07:13 +0000 (15:07 +0200)]
st/dri: allow direct YUYV import

Push this format to the pipe driver unchanged.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
6 years agospirv: consider bitsize when handling OpSwitch cases
Juan A. Suarez Romero [Tue, 19 Dec 2017 17:55:24 +0000 (17:55 +0000)]
spirv: consider bitsize when handling OpSwitch cases

When walking over all the cases in an OpSwitch, take into account the bitsize
of the literals to avoid getting wrong cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
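
As a rough illustration of why the bitsize matters: in SPIR-V, each OpSwitch
case is a (literal, label) pair, and the literal takes two words when the
selector is 64 bits wide, low-order word first.  The sketch below is a
hypothetical standalone decoder, not the vtn code touched by this commit;
the struct and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded case; not a Mesa type. */
struct switch_case {
    uint64_t literal;
    uint32_t label_id;
};

/* Decode the (literal, label) pairs that follow the selector and default
 * label of an OpSwitch.  `w` points at the first case word, `bit_size` is
 * the width of the selector type.  Returns the number of cases written. */
static size_t
decode_switch_cases(const uint32_t *w, size_t num_words, unsigned bit_size,
                    struct switch_case *out, size_t max_out)
{
    /* Literals wider than 32 bits occupy two words. */
    const size_t lit_words = bit_size > 32 ? 2 : 1;
    const size_t stride = lit_words + 1; /* literal words + label word */
    size_t n = 0;

    for (size_t i = 0; i + stride <= num_words && n < max_out; i += stride) {
        uint64_t lit = w[i];
        if (lit_words == 2)
            lit |= (uint64_t)w[i + 1] << 32;
        out[n].literal = lit;
        out[n].label_id = w[i + lit_words];
        n++;
    }
    return n;
}
```

Reading a 64-bit switch with a fixed one-word stride would treat the high
word of each literal as a label and yield extra, wrong cases.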
6 years agodrirc: set allow_glsl_cross_stage_interpolation_mismatch for more games
Tapani Pälli [Wed, 20 Dec 2017 07:23:55 +0000 (09:23 +0200)]
drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104288
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoanv: disallow VK_REMAINING_ARRAY_LAYERS in vkCmdClearAttachments()
Samuel Iglesias Gonsálvez [Tue, 19 Dec 2017 07:59:36 +0000 (08:59 +0100)]
anv: disallow VK_REMAINING_ARRAY_LAYERS in vkCmdClearAttachments()

The Vulkan spec doesn't specify that VK_REMAINING_ARRAY_LAYERS is allowed
in the passed VkClearRect struct.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agonvc0/ir: change textureGrad to always use lane 0 as the tex origin
Ilia Mirkin [Wed, 16 Aug 2017 04:34:43 +0000 (00:34 -0400)]
nvc0/ir: change textureGrad to always use lane 0 as the tex origin

Thanks to Karol Herbst for the debugging / tracing work that led to this
change.

Move to using lane 0 as the "work" lane for the texture. It is unclear
why this helps, as that computation should be identical to doing it in
the "correct" lane with the properly adjusted quadops.

In order to be able to use the lane 0 result, we also have to ensure
that lane 0 contains the proper array/indirect/shadow values.

This applies to Fermi and Kepler. Maxwell+ may or may not need fixing,
but that lowering logic is separate.

Fixes KHR-GL45.texture_cube_map_array.sampling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agobroadcom/vc5: Add missing setting of the UIF XOR disable flag in textures.
Eric Anholt [Tue, 19 Dec 2017 22:23:06 +0000 (14:23 -0800)]
broadcom/vc5: Add missing setting of the UIF XOR disable flag in textures.

Most piglit textures happened to work out by RGBW not changing in that
bit, but it did cause failures in RGBA16F fbo-generatemipmap-formats.

6 years agobroadcom/vc5: Clean up the comment and code around level 0 UIF.
Eric Anholt [Tue, 19 Dec 2017 22:20:19 +0000 (14:20 -0800)]
broadcom/vc5: Clean up the comment and code around level 0 UIF.

I wrote this early in driver development, and our UIF handling is much
better now.

6 years agobroadcom/vc5: Simplify the tiling calculations.
Eric Anholt [Tue, 19 Dec 2017 22:08:18 +0000 (14:08 -0800)]
broadcom/vc5: Simplify the tiling calculations.

The mb_tile_layout table was just the utile_w/h times two, so reuse the
utile code instead.
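
A small sketch of that simplification, assuming a 64-byte utile whose
dimensions depend on bytes per pixel (cpp).  The dimension values and all
function names here are assumptions for illustration, not copied from the
vc5 driver; the point is only that the doubled sizes can be derived rather
than kept in a second table.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed utile width in pixels for a given cpp (64-byte utile). */
static uint32_t
utile_width(int cpp)
{
    switch (cpp) {
    case 1:
    case 2:
        return 8;
    case 4:
    case 8:
        return 4;
    case 16:
        return 2;
    default:
        return 0; /* unsupported cpp in this sketch */
    }
}

/* Assumed utile height in pixels for a given cpp (64-byte utile). */
static uint32_t
utile_height(int cpp)
{
    switch (cpp) {
    case 1:
        return 8;
    case 2:
    case 4:
        return 4;
    case 8:
    case 16:
        return 2;
    default:
        return 0; /* unsupported cpp in this sketch */
    }
}

/* Hypothetical replacements for a separate lookup table: the larger block
 * is just the utile doubled in each dimension. */
static uint32_t block_width(int cpp)  { return utile_width(cpp) * 2; }
static uint32_t block_height(int cpp) { return utile_height(cpp) * 2; }
```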

6 years agobroadcom/vc5: Return the depth in all components of depth textures.
Eric Anholt [Thu, 16 Nov 2017 20:01:13 +0000 (12:01 -0800)]
broadcom/vc5: Return the depth in all components of depth textures.

Apparently gallium's u_blitter wants depth from at least the .z component,
and other swizzling appears to apply on top of that.  Fixes
fbo-generatemipmap-formats failures with depth formats.

6 years agobroadcom/vc5: Enable decompressing RGTC for desktop GL support.
Eric Anholt [Fri, 15 Dec 2017 22:40:43 +0000 (14:40 -0800)]
broadcom/vc5: Enable decompressing RGTC for desktop GL support.

This matches freedreno's behavior.

6 years agobroadcom/vc5: Use u_transfer_helper for MSAA mappings.
Eric Anholt [Fri, 15 Dec 2017 22:40:24 +0000 (14:40 -0800)]
broadcom/vc5: Use u_transfer_helper for MSAA mappings.