mesa.git
5 years agoradeonsi: fix packing of key.mono.u.ps
Marek Olšák [Thu, 25 Jul 2019 00:07:52 +0000 (20:07 -0400)]
radeonsi: fix packing of key.mono.u.ps

5 years agoac/nir: fix incorrect Phis if callbacks use control flow inside control flow
Marek Olšák [Thu, 25 Jul 2019 00:21:27 +0000 (20:21 -0400)]
ac/nir: fix incorrect Phis if callbacks use control flow inside control flow

5 years agoac/nir: handle abs modifier
Marek Olšák [Wed, 24 Jul 2019 21:44:51 +0000 (17:44 -0400)]
ac/nir: handle abs modifier

5 years agoac: fix a memory leak in the error path of ac_build_type_name_for_intr
Marek Olšák [Wed, 24 Jul 2019 21:36:25 +0000 (17:36 -0400)]
ac: fix a memory leak in the error path of ac_build_type_name_for_intr

5 years agoac: allow control flow statements in NIR callbacks
Marek Olšák [Wed, 24 Jul 2019 21:19:38 +0000 (17:19 -0400)]
ac: allow control flow statements in NIR callbacks

This fixes a crash when compiling geometry shaders on radeonsi.

5 years agoac/nir: handle negate modifier
Marek Olšák [Wed, 24 Jul 2019 03:11:40 +0000 (23:11 -0400)]
ac/nir: handle negate modifier

5 years agoradeonsi: don't use lp_build_if for the prim discard compute shader
Marek Olšák [Wed, 24 Jul 2019 00:49:27 +0000 (20:49 -0400)]
radeonsi: don't use lp_build_if for the prim discard compute shader

5 years agoradeonsi: don't use lp_build_if for the wrapping if block in the VS prolog
Marek Olšák [Wed, 24 Jul 2019 00:34:03 +0000 (20:34 -0400)]
radeonsi: don't use lp_build_if for the wrapping if block in the VS prolog

5 years agoradeonsi: don't use lp_build_if for the wrapping if block in merged shaders
Marek Olšák [Wed, 24 Jul 2019 00:34:03 +0000 (20:34 -0400)]
radeonsi: don't use lp_build_if for the wrapping if block in merged shaders

5 years agoradeonsi: don't use lp_build_if (in most common places)
Marek Olšák [Wed, 24 Jul 2019 00:41:55 +0000 (20:41 -0400)]
radeonsi: don't use lp_build_if (in most common places)

5 years agoradeonsi: don't use lp_build_alloca
Marek Olšák [Wed, 24 Jul 2019 00:41:30 +0000 (20:41 -0400)]
radeonsi: don't use lp_build_alloca

5 years agoradeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced
Marek Olšák [Tue, 23 Jul 2019 23:32:50 +0000 (19:32 -0400)]
radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced

5 years agoradeonsi/nir: set input_interpolate_loc for color inputs
Marek Olšák [Tue, 23 Jul 2019 23:22:57 +0000 (19:22 -0400)]
radeonsi/nir: set input_interpolate_loc for color inputs

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agoradeonsi/nir: set tgsi_shader_info::num_memory_instructions
Marek Olšák [Tue, 23 Jul 2019 23:15:45 +0000 (19:15 -0400)]
radeonsi/nir: set tgsi_shader_info::num_memory_instructions

5 years agoradeonsi/nir: accurately set input_usage_mask for doubles (v2)
Marek Olšák [Tue, 23 Jul 2019 23:12:45 +0000 (19:12 -0400)]
radeonsi/nir: accurately set input_usage_mask for doubles (v2)

v2: fix doubles

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoradeonsi/nir: accurately set output_usagemask (v2)
Marek Olšák [Tue, 23 Jul 2019 22:55:47 +0000 (18:55 -0400)]
radeonsi/nir: accurately set output_usagemask (v2)

v2: fix doubles

5 years agoradeonsi/nir: accurately set reads_*_outputs for TCS
Marek Olšák [Tue, 23 Jul 2019 22:03:39 +0000 (18:03 -0400)]
radeonsi/nir: accurately set reads_*_outputs for TCS

5 years agoradeonsi/nir: clean up gather_intrinsic_load_deref_input_info
Marek Olšák [Tue, 23 Jul 2019 22:00:50 +0000 (18:00 -0400)]
radeonsi/nir: clean up gather_intrinsic_load_deref_input_info

5 years agoradeonsi/nir: add an option to convert TGSI to NIR
Marek Olšák [Tue, 23 Jul 2019 21:36:09 +0000 (17:36 -0400)]
radeonsi/nir: add an option to convert TGSI to NIR

Use at your own risk.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoradeonsi/nir: clean up some nir_scan_shader code
Marek Olšák [Tue, 23 Jul 2019 21:42:26 +0000 (17:42 -0400)]
radeonsi/nir: clean up some nir_scan_shader code

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoradeonsi/gfx10: disable DCC image stores
Marek Olšák [Tue, 23 Jul 2019 21:46:38 +0000 (17:46 -0400)]
radeonsi/gfx10: disable DCC image stores

Uncompressed image stores are usually faster.

Also, the driver didn't set WRITE_COMPRESS_ENABLE, so I don't know
what the hw did for image stores.

5 years agoradeonsi: adjust RB+ blend optimization settings
Marek Olšák [Tue, 23 Jul 2019 04:32:06 +0000 (00:32 -0400)]
radeonsi: adjust RB+ blend optimization settings

based on PAL

5 years agoac/surface: allow linear swizzle mode automatic selection on gfx9 & 10
Marek Olšák [Tue, 23 Jul 2019 05:55:12 +0000 (01:55 -0400)]
ac/surface: allow linear swizzle mode automatic selection on gfx9 & 10

let addrlib make the decision to get the same result as PAL.

5 years agomesa: add EXT_dsa indexed generic queries
Pierre-Eric Pelloux-Prayer [Thu, 2 May 2019 13:01:05 +0000 (15:01 +0200)]
mesa: add EXT_dsa indexed generic queries

Only GetPointerIndexedvEXT needs an implementation, the other functions are
aliases of existing functions.

5 years agomesa: add EXT_dsa indexed texture commands functions
Pierre-Eric Pelloux-Prayer [Mon, 29 Apr 2019 15:39:49 +0000 (17:39 +0200)]
mesa: add EXT_dsa indexed texture commands functions

Added functions:
  - EnableClientStateIndexedEXT
  - DisableClientStateIndexedEXT
  - EnableClientStateiEXT
  - DisableClientStateiEXT

Implemented using the idiom provided by the spec:

        if (array == TEXTURE_COORD_ARRAY) {
          int savedClientActiveTexture;

          GetIntegerv(CLIENT_ACTIVE_TEXTURE, &savedClientActiveTexture);
          ClientActiveTexture(TEXTURE0+index);
          XXX(array);
          ClientActiveTexture(savedActiveTexture);
        } else {
          // Invalid enum
        }

5 years agomesa: add EXT_dsa (Named)Framebuffer functions
Pierre-Eric Pelloux-Prayer [Mon, 29 Apr 2019 11:53:29 +0000 (13:53 +0200)]
mesa: add EXT_dsa (Named)Framebuffer functions

These functions dont support display list as specified:

    Should the selector-free versions of various OpenGL 3.0 and
    EXT_framebuffer_object framebuffer object commands not be allowed
    in display lists [...]?

    RESOLVED:  Yes

5 years agomesa: add EXT_dsa NamedBuffer functions
Pierre-Eric Pelloux-Prayer [Fri, 26 Apr 2019 16:10:44 +0000 (18:10 +0200)]
mesa: add EXT_dsa NamedBuffer functions

5 years agoi965/curbe: Look at SYSTEM_VALUE_FRAG_COORD instead of VARYING_SLOT_POS
Jason Ekstrand [Tue, 30 Jul 2019 23:07:08 +0000 (18:07 -0500)]
i965/curbe: Look at SYSTEM_VALUE_FRAG_COORD instead of VARYING_SLOT_POS

When transitioning gl_FragCoord over to a system value, we missed one
instance of VARYING_SLOT_POS in i965.  As of this commit, i965 has no
references to VARYING_SLOT_POS.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111263
Fixes: 4bb6e6817ec "intel: Use a system value for gl_FragCoord"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/fs: Implement quad_swap_horizontal with a swizzle on gen7
Jason Ekstrand [Fri, 26 Jul 2019 21:03:08 +0000 (16:03 -0500)]
intel/fs: Implement quad_swap_horizontal with a swizzle on gen7

This fixes dEQP-VK.subgroups.quad.compute.subgroupquadswaphorizontal_*
on all gen7 platforms.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/fs: Use ALIGN16 instructions for all derivatives on gen <= 7
Jason Ekstrand [Thu, 25 Jul 2019 23:28:44 +0000 (18:28 -0500)]
intel/fs: Use ALIGN16 instructions for all derivatives on gen <= 7

The issue here was discovered by a set of Vulkan CTS tests:

    dEQP-VK.glsl.derivate.*.dynamic_*

These tests use ballot ops to construct a branch condition that takes
the same path for each 2x2 quad but may not be uniform across the whole
subgroup.  They then tests that derivatives work and give the correct
value even when executed inside such a branch.  Because the derivative
isn't executed in uniform control-flow and the values coming into the
derivative aren't smooth (or worse, linear), they nicely catch bugs that
aren't uncovered by simpler derivative tests.

Unfortunately, these tests require Vulkan and the equivalent GL test
would require the GL_ARB_shader_ballot extension which requires int64.
Because the requirements for these tests are so high, it's not easy to
test on older hardware and the bug is only proven to exist on gen7;
gen4-6 are a conjecture.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoscons+meson: suppress spammy build warning on MacOS
Eric Engestrom [Tue, 30 Jul 2019 14:41:06 +0000 (15:41 +0100)]
scons+meson: suppress spammy build warning on MacOS

Originally introduced in c7f36574506838274460 ("darwin: Suppress type
conversion warnings for GLhandleARB") to fix Bugzilla #66346 [1], this
workaround was never ported to Scons or Meson.

[1] https://bugs.freedesktop.org/66346

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agoi965/fs: Print the scheduler mode.
Matt Turner [Mon, 17 Oct 2016 21:12:28 +0000 (14:12 -0700)]
i965/fs: Print the scheduler mode.

Line wrap some awfully long lines while we are here.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoi965/fs: Add a shader_stats struct.
Matt Turner [Mon, 17 Oct 2016 21:10:26 +0000 (14:10 -0700)]
i965/fs: Add a shader_stats struct.

It'll grow further, and we'd like to avoid adding an additional
parameter to fs_generator() for each new piece of data.

v2 (idr): Rebase on 17 months.  Track a visitor instead of a cfg.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agolima/gp: Support exp2 and log2
Connor Abbott [Sun, 21 Apr 2019 19:46:46 +0000 (21:46 +0200)]
lima/gp: Support exp2 and log2

log2 is tricky because there cannot be a move between complex1 and
postlog2. We can't guarantee that scheduling complex1 will succeed when
we schedule postlog2, so we try to schedule complex1 and if it fails we
back out by rewriting the postlog2 as a move and introducing a new
postlog2 so that we can try again later.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/gpir: Always schedule complex2 and *_impl right after complex1
Connor Abbott [Sat, 27 Jul 2019 23:13:10 +0000 (01:13 +0200)]
lima/gpir: Always schedule complex2 and *_impl right after complex1

See https://gitlab.freedesktop.org/lima/mesa/issues/94 for the gory
details of why this is needed. For *_impl this is easy, since it never
increases register pressure and it goes in the complex slot hence it
never counts against max nodes. It's a bit more challenging for
complex2, since it does count against max nodes, so we need to change
the reservation logic to reserve an extra slot for complex2 when
scheduling complex1. This second part isn't strictly necessary yet, but
it will be for exp2.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Qiang Yu <yuq825@gmail.com>
5 years agoradv: Fix descriptor set allocation failure.
Bas Nieuwenhuizen [Tue, 30 Jul 2019 20:23:02 +0000 (22:23 +0200)]
radv: Fix descriptor set allocation failure.

Set all the handles to VK_NULL_HANDLE:

"If the creation of any of those descriptor sets fails, then the implementation
must destroy all successfully created descriptor set objects from this command,
set all entries of the pDescriptorSets array to VK_NULL_HANDLE and return the
error."

(Vulkan 1.1.117 Spec, section 13.2)

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoradv: fix queries with WAIT_BIT returning VK_NOT_READY
Andres Rodriguez [Sat, 27 Jul 2019 13:44:44 +0000 (09:44 -0400)]
radv: fix queries with WAIT_BIT returning VK_NOT_READY

When vkGetQueryPoolResults() is called with VK_QUERY_RESULT_WAIT_BIT
set, the driver is supposed to wait for the query to become available
before returning.

Currently, radv returns once the query is indeed ready, but it returns
VK_NOT_READY. It also fails to populate the results.

The problem is a missing volatile in the secondary check for query
availability. This patch removes the secondary check altogether since it
is redundant with the preceding loop.

This bug was found with an unreleased version of SteamVR.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agomeson: Test for program_invocation_name
Matt Turner [Mon, 29 Jul 2019 20:51:55 +0000 (13:51 -0700)]
meson: Test for program_invocation_name

program_invocation_name and program_invocation_short_name are both GNU
extensions. I don't believe one can exist without the other, so only
check for program_invocation_name.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoscons: Test for random_r()
Matt Turner [Mon, 29 Jul 2019 22:31:34 +0000 (15:31 -0700)]
scons: Test for random_r()

Suggested-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomeson: Test for random_r()
Matt Turner [Thu, 25 Jul 2019 01:44:35 +0000 (18:44 -0700)]
meson: Test for random_r()

It's better to test for needed functions instead of using external
knowledge about presence in this or that C library.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agost/nine: Drop preprocessor guards for glibc-2.12
Matt Turner [Thu, 25 Jul 2019 01:28:38 +0000 (18:28 -0700)]
st/nine: Drop preprocessor guards for glibc-2.12

Same rationale as the previous patch, but additionally these checks just
seem entirely unnecessary. pthread_self() has been used in Mesa since at
least 1999.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoutil: Drop preprocessor guards for glibc-2.12
Matt Turner [Thu, 25 Jul 2019 01:26:49 +0000 (18:26 -0700)]
util: Drop preprocessor guards for glibc-2.12

glibc-2.12 was released in 2010. No one is building new Mesa against 9
year old glibc, and removing these checks allows the code to work on
other C libraries like musl.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agopan/midgard: Nothing to see here, move along folks
Alyssa Rosenzweig [Tue, 30 Jul 2019 17:49:13 +0000 (10:49 -0700)]
pan/midgard: Nothing to see here, move along folks

Fixes: dee1e18fe4f ("pan/midgard: Cleanup ops table")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agospirv: don't discard access set by vtn_pointer_dereference
Lionel Landwerlin [Fri, 26 Jul 2019 19:47:09 +0000 (22:47 +0300)]
spirv: don't discard access set by vtn_pointer_dereference

We can have a access flag already set here so just augment the
existing ones.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0fb61dfdeb ("spirv: propagate access qualifiers through ssa & pointer")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Enable EXT_texture_shadow_lod
Sagar Ghuge [Thu, 25 Jul 2019 21:07:36 +0000 (14:07 -0700)]
iris: Enable EXT_texture_shadow_lod

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agogallium: Add PIPE_CAP_TEXTURE_SHADOW_LOD
Sagar Ghuge [Thu, 25 Jul 2019 21:05:58 +0000 (14:05 -0700)]
gallium: Add PIPE_CAP_TEXTURE_SHADOW_LOD

v2: Line wrap to 80 char (Marek Olsak)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoi965: Enable EXT_texture_shadow_lod
Sagar Ghuge [Fri, 31 May 2019 19:56:03 +0000 (12:56 -0700)]
i965: Enable EXT_texture_shadow_lod

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: Add builtin functions for EXT_texture_shadow_lod
Paulo Zanoni [Thu, 25 Jul 2019 18:49:09 +0000 (11:49 -0700)]
glsl: Add builtin functions for EXT_texture_shadow_lod

With the help of Sagar, Ian and Ivan.

v2: Fix dependencies (Ian Romanick)

v3: 1) fix function name (Marek Olsak)
    2) Add check for extension enable (Marek Olsak)

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: Allow _textureCubeArrayShadow function to accept ir_texture_opcode
Paulo Zanoni [Thu, 25 Jul 2019 17:57:43 +0000 (10:57 -0700)]
glsl: Allow _textureCubeArrayShadow function to accept ir_texture_opcode

This will be used to support one of the function from
Ext_texture_shadow_lod specification.

With the help of Sagar, Ian and Ivan.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agomesa: extension boilerplate for EXT_texture_shadow_lod
Paulo Zanoni [Mon, 18 Mar 2019 23:37:17 +0000 (16:37 -0700)]
mesa: extension boilerplate for EXT_texture_shadow_lod

With the help of Sagar, Ian and Ivan.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agopan/midgard: Cleanup ops table
Alyssa Rosenzweig [Tue, 30 Jul 2019 17:26:44 +0000 (10:26 -0700)]
pan/midgard: Cleanup ops table

Hopefully this should make a few ops make more sense. No functional
changes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Extend copy-propagation to swizzles
Alyssa Rosenzweig [Fri, 26 Jul 2019 19:05:23 +0000 (12:05 -0700)]
pan/midgard: Extend copy-propagation to swizzles

We can compose them when we rewrite, which is.. more code.. but helps.

total instructions in shared programs: 3611 -> 3513 (-2.71%)
instructions in affected programs: 672 -> 574 (-14.58%)
helped: 11
HURT: 2
helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10
helped stats (rel) min: 5.71% max: 24.56% x̄: 17.99% x̃: 18.87%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 2.08% x̄: 1.64% x̃: 1.64%
95% mean confidence interval for instructions value: -10.45 -4.62
95% mean confidence interval for instructions %-change: -20.07% -9.87%
Instructions are helped.

total bundles in shared programs: 2117 -> 2067 (-2.36%)
bundles in affected programs: 356 -> 306 (-14.04%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 4.55 x̃: 5
helped stats (rel) min: 4.55% max: 15.22% x̄: 13.63% x̃: 14.71%
95% mean confidence interval for bundles value: -5.64 -3.45
95% mean confidence interval for bundles %-change: -15.71% -11.55%
Bundles are helped.

total quadwords in shared programs: 3567 -> 3468 (-2.78%)
quadwords in affected programs: 695 -> 596 (-14.24%)
helped: 11
HURT: 1
helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10
helped stats (rel) min: 5.56% max: 21.88% x̄: 14.97% x̃: 15.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 2.38% max: 2.38% x̄: 2.38% x̃: 2.38%
95% mean confidence interval for quadwords value: -10.96 -5.54
95% mean confidence interval for quadwords %-change: -17.42% -9.63%
Quadwords are helped.

total registers in shared programs: 391 -> 383 (-2.05%)
registers in affected programs: 46 -> 38 (-17.39%)
helped: 9
HURT: 1
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 10.00% max: 10.00% x̄: 10.00% x̃: 10.00%
95% mean confidence interval for registers value: -1.25 -0.35
95% mean confidence interval for registers %-change: -29.42% -13.58%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Extract simple source mod check
Alyssa Rosenzweig [Fri, 26 Jul 2019 18:52:30 +0000 (11:52 -0700)]
pan/midgard: Extract simple source mod check

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Lower texr/texw mixed registers
Alyssa Rosenzweig [Tue, 30 Jul 2019 15:09:51 +0000 (08:09 -0700)]
pan/midgard: Lower texr/texw mixed registers

Conceptually, r28-r29 (as used for reading) and r28-r29 (as used for
writing) aren't registers at all, merely push/pull arrangements. So you
can't feed a texture result back into itself without explicitly moving
in the middle.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Always set .cont for derivatives in loops
Alyssa Rosenzweig [Mon, 29 Jul 2019 23:55:15 +0000 (16:55 -0700)]
pan/midgard: Always set .cont for derivatives in loops

We need to keep the helper invocations alive.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Implement derivatives
Alyssa Rosenzweig [Mon, 29 Jul 2019 22:11:12 +0000 (15:11 -0700)]
pan/midgard: Implement derivatives

Implement the fdd* and fdd* opcodes in the Midgard compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Compose original texture swizzle in RA
Alyssa Rosenzweig [Mon, 29 Jul 2019 23:56:03 +0000 (16:56 -0700)]
pan/midgard: Compose original texture swizzle in RA

Used for lowering derivatives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add new swizzles
Alyssa Rosenzweig [Mon, 29 Jul 2019 23:53:05 +0000 (16:53 -0700)]
pan/midgard: Add new swizzles

Used for derivatives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add OP_IS_DERIVATIVE helper
Alyssa Rosenzweig [Mon, 29 Jul 2019 23:52:55 +0000 (16:52 -0700)]
pan/midgard: Add OP_IS_DERIVATIVE helper

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add make_compiler_temp_reg helper
Alyssa Rosenzweig [Mon, 29 Jul 2019 23:52:36 +0000 (16:52 -0700)]
pan/midgard: Add make_compiler_temp_reg helper

Corrollary to make_compiler_temp (for SSA).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Move nir_*_src_index to compiler.h
Alyssa Rosenzweig [Mon, 29 Jul 2019 22:10:41 +0000 (15:10 -0700)]
pan/midgard: Move nir_*_src_index to compiler.h

These helpers are useful for code emission everywhere. Share the love!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Disassemble unknown texture ops as hex
Alyssa Rosenzweig [Mon, 29 Jul 2019 21:07:42 +0000 (14:07 -0700)]
pan/midgard: Disassemble unknown texture ops as hex

I'm not sure why I ever thought decimal was a good idea.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add support for disassembling derivatives
Alyssa Rosenzweig [Mon, 29 Jul 2019 21:07:19 +0000 (14:07 -0700)]
pan/midgard: Add support for disassembling derivatives

They're just texture ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agonir/find_array_copies: Use correct parent array length
Connor Abbott [Tue, 30 Jul 2019 09:05:22 +0000 (11:05 +0200)]
nir/find_array_copies: Use correct parent array length

instr->type is the type of the array element, not the type of the array
being dereferenced. Rather than fishing out the parent type, just use
parent->num_children which should be the length plus 1. While we're here
add another assert for the issue fixed by the previous commit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251
Fixes: 156306e5e62 ("nir/find_array_copies: Handle wildcards and overlapping copies")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: Fix comparison for nir_deref_instr_is_known_out_of_bounds()
Connor Abbott [Tue, 30 Jul 2019 09:04:14 +0000 (11:04 +0200)]
nir: Fix comparison for nir_deref_instr_is_known_out_of_bounds()

There was an off-by-one error.

Fixes: 156306e5e62 ("nir/find_array_copies: Handle wildcards and overlapping copies")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv/gfx10: only compile the GS copy shader on-demand
Samuel Pitoiset [Tue, 30 Jul 2019 13:14:35 +0000 (15:14 +0200)]
radv/gfx10: only compile the GS copy shader on-demand

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agogitlab-ci: Fix scons build directory path
Michel Dänzer [Fri, 26 Jul 2019 10:20:41 +0000 (12:20 +0200)]
gitlab-ci: Fix scons build directory path

Fixes: dd3d0b2897b8 "gitlab-ci: Only keep the build logs as artifacts."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoswr/rasterizer: Add memory tracking support
Jan Zielinski [Fri, 26 Jul 2019 07:37:12 +0000 (09:37 +0200)]
swr/rasterizer: Add memory tracking support

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
5 years agoswr/rasterizer: Better implementation of scatter
Jan Zielinski [Wed, 24 Jul 2019 10:25:27 +0000 (12:25 +0200)]
swr/rasterizer: Better implementation of scatter

Added support for avx512 scatter instruction. Non-avx512 will
now call into a C function to do the scatter emulation.

This has better jit compile performance than
the previous approach of jitting scalar loops.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
5 years agoswr/rasterizer: cleanups for tessellation
Jan Zielinski [Wed, 24 Jul 2019 10:10:27 +0000 (12:10 +0200)]
swr/rasterizer: cleanups for tessellation

This commit introduces small fixes in preparation for tessellation
support.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
5 years agorasterizer/swr: move BucketMgr to SwrContext
Jan Zielinski [Wed, 24 Jul 2019 10:03:49 +0000 (12:03 +0200)]
rasterizer/swr: move BucketMgr to SwrContext

This move gets us back to parity  with global manager
in that we can dump render context buckets now.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
5 years agov3d: take into account separate_stencil when checking if stencil should be cleared
Alejandro Piñeiro [Mon, 29 Jul 2019 11:06:44 +0000 (13:06 +0200)]
v3d: take into account separate_stencil when checking if stencil should be cleared

In most cases this is not needed because the usual is that when a
separate stencil is written, the parent resource is also written.

This is needed if we have a separate stencil, no depth buffer, and the
source and destination is the same, as in that case the stencil can be
updated, but not the parent source (like if you are blitting only the
stencil buffer). On that situation, the following access to the
stencil buffer would clear the stencil buffer (so overwritting the
previous blitting) cleared because the parent source has
v3d_resource.writes to 0.

As far as I see, that situation only happens with the
GL_DEPTH32F_STENCIL8 format.

Note that one alternative would consider that if the separate_stencil
has been written, the parent should also be considered written (and
update its "writes" field accordingly). But I found this patch more
natural.

Fixes the following piglit tests:
   spec/arb_depth_buffer_float/fbo-stencil-gl_depth32f_stencil8-blit
   spec/arb_depth_buffer_float/fbo-stencil-gl_depth32f_stencil8-copypixels

the latter regressed when internally glCopyPixels implementation
started to use blitting. So:

Fixes: 131d40cfc91f ("st/mesa: accelerate glCopyPixels(STENCIL)")
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradv: Don't include radv_private.h from radv_shader.h
Daniel Schürmann [Mon, 29 Jul 2019 15:51:01 +0000 (17:51 +0200)]
radv: Don't include radv_private.h from radv_shader.h

This patch decouples radv_shader.h from any LLVM dependency.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoi965/gen10: Remove unnecessary workaround.
Rafael Antognolli [Fri, 19 Jul 2019 22:41:35 +0000 (15:41 -0700)]
i965/gen10: Remove unnecessary workaround.

In fact, the description of the workaround states that the mask field
doesn't work correctly on gen10, and we need to set it to 0xffff even we
we only want to update a single field:

 "The mask bits are not implemented properly on 3DSTATE_3D_MODE.  Driver
 must always program bits 31:16 of DW1 a value of 0xFFFF.   This means
 if it is only updating 1 field, it must update all the fields to the
 correct value."

So unless we want to change any of the fields of 3DSTATE_3D_MODE,
there's not need to emit. Additionally, it seems this workaround is not
required on gen11. And last but not least, this workaround is not
implemented on iris or anv, and it doesn't seem to be missed there.

So let's just remove the whole thing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Fix SO offset to be 32-bit in DrawTransformFeedback handling
Kenneth Graunke [Mon, 29 Jul 2019 22:33:02 +0000 (15:33 -0700)]
iris: Fix SO offset to be 32-bit in DrawTransformFeedback handling

We accidentally started copying a full 64-bit value rather than copying
a 32-bit offset and zeroing the top 32-bits.  This caused us to compute
bogus vertex counts which could lead to GPU hangs in some cases.

Thanks to Clayton Craft for catching the regressions!

Fixes: 0e24d10ff5c ("iris: Use gen_mi_builder to handle CS ALU operations.")
5 years agointel: Use a system value for gl_FragCoord
Jason Ekstrand [Thu, 18 Jul 2019 14:59:44 +0000 (09:59 -0500)]
intel: Use a system value for gl_FragCoord

It's kind-of an anomaly that the Intel drivers are still treating
gl_FragCoord as an input.  It also makes zero sense because we have to
special-case it in the back-end.

Because ANV is the only user of nir_lower_wpos_center, we go ahead and
just update it to look for nir_intrinsic_load_frag_coord as part of this
patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: Treat gl_FragCoord as a varying even when it's a system value
Jason Ekstrand [Fri, 19 Jul 2019 15:42:56 +0000 (10:42 -0500)]
glsl: Treat gl_FragCoord as a varying even when it's a system value

This fixes glsl-fcoord-invariant-pass.shader_test on drivers that set
GLSLFragCoordIsSysVal which includes radeonsi among others.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agomesa/spirv: Set frag_coord_is_sysval to GLSLFragCoordIsSysVal
Jason Ekstrand [Thu, 18 Jul 2019 18:54:57 +0000 (13:54 -0500)]
mesa/spirv: Set frag_coord_is_sysval to GLSLFragCoordIsSysVal

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/fs: Remove calculate_urb_setup from fs_visitor
Jason Ekstrand [Thu, 18 Jul 2019 14:15:15 +0000 (09:15 -0500)]
intel/fs: Remove calculate_urb_setup from fs_visitor

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agofreedreno/a6xx: fix MSAA resolve hangs
Rob Clark [Wed, 24 Jul 2019 20:31:13 +0000 (13:31 -0700)]
freedreno/a6xx: fix MSAA resolve hangs

Seems like RB_BLIT_SCISSOR needs to be aligned to (minimum?) tile size.

Fixes intermittent GPU hangs triggered by some of the three.js samples
on https://threejs.org/

Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years agofreedreno/ir3: fix for array/reg store vs meta instructions
Rob Clark [Fri, 19 Jul 2019 23:47:15 +0000 (16:47 -0700)]
freedreno/ir3: fix for array/reg store vs meta instructions

fishgl.com has a shader which does roughly:

   foo = texture(...);
   if (bar)
      foo = texture(...);

after lowering phi webs to regs we end up w/ a vec4 reg (array).  But
since it was not an indirect access, we try to skip the extra mov.  This
results that the per-component fanout (split) meta instructions store
directly to the reg (array).  Which doesn't work out in RA.

Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years agomeson: bump required version to 0.46
Eric Engestrom [Thu, 18 Jul 2019 11:55:09 +0000 (12:55 +0100)]
meson: bump required version to 0.46

0.45 has a few annoying bugs (like the one in !358 [1]), and 0.46 is
well over a year old by now, so let's move to it.

[1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/358

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoradeon/vcn/vp9: add Arcturus VP9 support
Leo Liu [Fri, 12 Jul 2019 13:47:44 +0000 (09:47 -0400)]
radeon/vcn/vp9: add Arcturus VP9 support

Arcturus CHIP enum is less than Navi10, since it's still gfx9,
but its VCN version belongs to VCN2.x

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoradeon/vcn: add Arcturus decode support
Leo Liu [Thu, 20 Jun 2019 13:00:27 +0000 (09:00 -0400)]
radeon/vcn: add Arcturus decode support

different internal registers offset from previous HW

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoamd: add support for Arcturus
Marek Olšák [Mon, 22 Jul 2019 19:11:37 +0000 (15:11 -0400)]
amd: add support for Arcturus

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoradeonsi: add AMD_DEBUG=nogfx for testing
Marek Olšák [Mon, 22 Jul 2019 19:09:54 +0000 (15:09 -0400)]
radeonsi: add AMD_DEBUG=nogfx for testing

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoradeonsi: add support for compute-only chips
Marek Olšák [Thu, 7 Feb 2019 05:04:32 +0000 (00:04 -0500)]
radeonsi: add support for compute-only chips

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agogallium/auxiliary/vl: add compute shaders for deint yuv
Sonny Jiang [Fri, 7 Jun 2019 20:07:29 +0000 (16:07 -0400)]
gallium/auxiliary/vl: add compute shaders for deint yuv

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agogallium/auxiliary/vl: don't call gfx functions on compute-only chips
Sonny Jiang [Fri, 17 May 2019 19:07:29 +0000 (15:07 -0400)]
gallium/auxiliary/vl: don't call gfx functions on compute-only chips

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agogallium/auxiliary/vl: add PIPE_CAP_GRAPHICS check for vl compositor
James Zhu [Fri, 26 Apr 2019 15:40:09 +0000 (11:40 -0400)]
gallium/auxiliary/vl: add PIPE_CAP_GRAPHICS check for vl compositor

Init graphic shader Only when PIPE_CAP_GRAPHICS is true.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agogallium: create multimedia contexts as compute-only if graphics is unsupported
Marek Olšák [Thu, 7 Feb 2019 05:13:44 +0000 (00:13 -0500)]
gallium: create multimedia contexts as compute-only if graphics is unsupported

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agogallium: add PIPE_CAP_GRAPHICS
Marek Olšák [Thu, 7 Feb 2019 05:06:28 +0000 (00:06 -0500)]
gallium: add PIPE_CAP_GRAPHICS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoradv: implement VK_EXT_index_type_uint8
Samuel Pitoiset [Mon, 29 Jul 2019 08:50:56 +0000 (10:50 +0200)]
radv: implement VK_EXT_index_type_uint8

Natively supported on VI+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoanv: implement VK_EXT_index_type_uint8
Lionel Landwerlin [Mon, 13 May 2019 15:33:22 +0000 (16:33 +0100)]
anv: implement VK_EXT_index_type_uint8

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agovulkan: Bump headers to 1.1.117
Lionel Landwerlin [Mon, 13 May 2019 15:29:31 +0000 (16:29 +0100)]
vulkan: Bump headers to 1.1.117

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoinclude/vulkan: bump vk_android_native_buffer
Lionel Landwerlin [Mon, 29 Jul 2019 13:00:00 +0000 (16:00 +0300)]
include/vulkan: bump vk_android_native_buffer

Taken off https://android.googlesource.com/platform/frameworks/native/+/refs/tags/android-9.0.0_r45/vulkan/include/vulkan/vk_android_native_buffer.h

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/mi: only resolve to a temp register if source isn't in memory
Eric Engestrom [Mon, 29 Jul 2019 14:11:13 +0000 (15:11 +0100)]
intel/mi: only resolve to a temp register if source isn't in memory

aka. fix a s/||/&&/ typo

Fixes: 74063ee61aadd1371a9b ("intel/mi: Add a new gen_mi_store_if() helper.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agogitlab-ci: Enable freedreno shader-db runs.
Eric Anholt [Thu, 25 Jul 2019 22:46:51 +0000 (15:46 -0700)]
gitlab-ci: Enable freedreno shader-db runs.

Now that helgrind is less upset and I've completed many successful
full shader-db runs, we should be able to enable freedreno shader-db
runs for Mesa checkins on the tiny public shader-db.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agonir: Fix helgrind complaints about data race in trivial_swizzle init.
Eric Anholt [Thu, 25 Jul 2019 20:37:28 +0000 (13:37 -0700)]
nir: Fix helgrind complaints about data race in trivial_swizzle init.

Even if the data race wasn't real (I'm not great at reasoning about
this), helgrind is a nice enough tool that keeping noise out of it is
probably worthwhile.  Besides, typing out the numbers keeps the data
in the read-only data section instead of emitting code to initialize
it every time.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
5 years agofreedreno: Fix data race on making the shader's id.
Eric Anholt [Thu, 25 Jul 2019 20:26:01 +0000 (13:26 -0700)]
freedreno: Fix data race on making the shader's id.

The value is only used for IR3_DBG_DISASM, but it cleans up the
helgrind output.

Reviewed-by: Rob Clark <robdclark@gmail.com>