mesa.git
5 years ago docs: Add release notes for 19.2.2
Dylan Baker [Wed, 23 Oct 2019 15:54:11 +0000 (08:54 -0700)]
docs: Add release notes for 19.2.2

5 years ago freedreno/ir3: handle the progress case
Rob Clark [Fri, 18 Oct 2019 22:55:10 +0000 (15:55 -0700)]
freedreno/ir3: handle the progress case

In some cases, in particular when you have things that can be src
modifiers ((abs)/(neg)), once eliminating one mov, there is a
possibility to remove another.  Handle this by re-visiting an
instruction after eliminating a copy on one of its srcs.
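
The re-visit idea can be illustrated with a toy model.  This is only a
sketch under stated assumptions, not the ir3 code: `toy_instr`,
`fold_one_mov` and `propagate_copies` are hypothetical names, each
instruction has a single source, and the modifier rules are the
arithmetic ones (abs on the consumer hides an inner neg, two negs
cancel).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical source modifiers, loosely modelled on (abs)/(neg). */
enum { MOD_ABS = 1u << 0, MOD_NEG = 1u << 1 };

/* Toy single-source instruction; not the ir3 data structure. */
struct toy_instr {
   bool is_mov;            /* plain copy, possibly with src modifiers */
   unsigned src_mods;      /* modifiers applied to `src` when it is read */
   struct toy_instr *src;  /* producer of the source, NULL for a leaf */
};

/* Fold one mov feeding `use`.  Returns true if a copy was eliminated. */
static bool
fold_one_mov(struct toy_instr *use)
{
   struct toy_instr *mov = use->src;
   if (mov == NULL || !mov->is_mov || mov->src == NULL)
      return false;

   unsigned mods;
   if (use->src_mods & MOD_ABS) {
      /* abs() on the consumer hides any abs/neg done by the mov. */
      mods = use->src_mods;
   } else {
      /* Keep the mov's abs and combine the two negations. */
      mods = (mov->src_mods & MOD_ABS) |
             ((use->src_mods ^ mov->src_mods) & MOD_NEG);
   }

   use->src_mods = mods;
   use->src = mov->src;
   return true;
}

/* The "progress case": keep re-visiting the same instruction for as
 * long as the previous visit eliminated a copy.
 */
static unsigned
propagate_copies(struct toy_instr *use)
{
   unsigned eliminated = 0;
   while (fold_one_mov(use))
      eliminated++;
   return eliminated;
}
```

With a chain use <- mov(neg) <- mov(neg) <- leaf, a single visit would
stop after the first mov; the loop removes both and leaves the use
reading the leaf with no modifiers.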

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

5 years ago freedreno/ir3: remove restrictions on const + (abs)/(neg)
Rob Clark [Fri, 18 Oct 2019 22:53:07 +0000 (15:53 -0700)]
freedreno/ir3: remove restrictions on const + (abs)/(neg)

These date back to relatively early days of ir3, when a lot was still
not well understood.  But according to CI (and what I've seen the blob
driver do), these are not actually real restrictions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

5 years ago freedreno/ir3: allow copy-propagate out of fanout
Rob Clark [Fri, 18 Oct 2019 22:46:59 +0000 (15:46 -0700)]
freedreno/ir3: allow copy-propagate out of fanout

Now that we fixed the sharp edges that this was papering over, we can
relax the restriction about eliminating a mov coming out of a fanout
(for example from result of texture fetch).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

5 years ago freedreno/ir3: treat high vs low reg as conversion
Rob Clark [Tue, 15 Oct 2019 23:08:26 +0000 (16:08 -0700)]
freedreno/ir3: treat high vs low reg as conversion

This avoids copy-propagating a high register into an instruction which
cannot consume it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

5 years ago freedreno/ir3: propagate dest flags for collect/fanin
Rob Clark [Tue, 15 Oct 2019 22:46:42 +0000 (15:46 -0700)]
freedreno/ir3: propagate dest flags for collect/fanin

We did this properly already for split/fanout.  But collect was missed.
Extract out a helper to share.

This way we avoid copy propagating a mov from high or half reg into an
instruction which cannot consume a high/half reg.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

5 years ago freedreno/ir3: make high regs easier to see in IR dumps
Rob Clark [Tue, 15 Oct 2019 23:28:50 +0000 (16:28 -0700)]
freedreno/ir3: make high regs easier to see in IR dumps

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

5 years ago freedreno/ir3: debug cleanup
Rob Clark [Mon, 14 Oct 2019 17:42:25 +0000 (10:42 -0700)]
freedreno/ir3: debug cleanup

1) deduplicate IR3_SHADER_DEBUG=disasm versus fs/vs/etc handling
2) standardize shader stage name prints, in particular VERT vs BVERT
3) don't mix stderr and stdout

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

5 years ago spirv: Add helper to find args of Image Operands
Caio Marcelo de Oliveira Filho [Wed, 23 Oct 2019 06:37:18 +0000 (23:37 -0700)]
spirv: Add helper to find args of Image Operands

Avoid keeping track of the idx and all possible image operands for
each operation.  Note for convenience we split up the handling of
ImageOperandsOffsetMask and ImageOperandsConstOffsetMask.
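
A minimal sketch of what such a lookup can look like, assuming only
what the SPIR-V specification says: the image operands are a bitmask
followed by their arguments in increasing bit order, and Grad takes two
arguments.  This is not the helper added by this commit; the enum names
and `image_operand_arg` are hypothetical, and only the operand bits up
to MinLod are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the SPIR-V Image Operands bits (hypothetical names). */
enum {
   OPERAND_BIAS          = 0x01,
   OPERAND_LOD           = 0x02,
   OPERAND_GRAD          = 0x04, /* takes two arguments: dx and dy */
   OPERAND_CONST_OFFSET  = 0x08,
   OPERAND_OFFSET        = 0x10,
   OPERAND_CONST_OFFSETS = 0x20,
   OPERAND_SAMPLE        = 0x40,
   OPERAND_MIN_LOD       = 0x80,
};

/* Return the index of the first argument of `operand`, counted from the
 * first word after the mask, or -1 if the operand is not present.
 */
static int
image_operand_arg(uint32_t mask, uint32_t operand)
{
   /* The caller must ask for exactly one operand bit. */
   assert(operand != 0 && (operand & (operand - 1)) == 0);

   if (!(mask & operand))
      return -1;

   /* Skip the arguments of every lower bit that is set in the mask. */
   int idx = 0;
   for (uint32_t bit = 1; bit < operand; bit <<= 1) {
      if (mask & bit)
         idx += (bit == OPERAND_GRAD) ? 2 : 1;
   }
   return idx;
}
```

A caller then no longer has to walk every possible operand to keep a
running index; it asks for the one operand it cares about.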

Suggested by Jason Ekstrand.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv: Check that only one offset is defined as Image Operand
Caio Marcelo de Oliveira Filho [Wed, 23 Oct 2019 06:40:08 +0000 (23:40 -0700)]
spirv: Check that only one offset is defined as Image Operand

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv: Add imageoperands_to_string helper
Caio Marcelo de Oliveira Filho [Wed, 23 Oct 2019 05:25:29 +0000 (22:25 -0700)]
spirv: Add imageoperands_to_string helper

Change the information to also include the category, so that the
particulars of BitEnum enumeration can be handled in the template.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago anv: Implement VK_KHR_vulkan_memory_model
Caio Marcelo de Oliveira Filho [Thu, 5 Sep 2019 18:10:02 +0000 (11:10 -0700)]
anv: Implement VK_KHR_vulkan_memory_model

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

5 years ago spirv: Handle MakePointerAvailable/Visible
Caio Marcelo de Oliveira Filho [Tue, 10 Sep 2019 20:21:08 +0000 (13:21 -0700)]
spirv: Handle MakePointerAvailable/Visible

Emit barriers with semantics matching the access operand and the
storage class of the pointer.

v2: Fix order of visible / available emission relative to the
    operations.  (Bas)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv: Handle MakeTexelAvailable/Visible
Caio Marcelo de Oliveira Filho [Tue, 10 Sep 2019 20:16:46 +0000 (13:16 -0700)]
spirv: Handle MakeTexelAvailable/Visible

Set the memory semantics and scope for later emitting the barrier.
Note the barrier emission code already exists in vtn_handle_image for
the Image atomics.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv: Add option to emit scoped memory barriers
Caio Marcelo de Oliveira Filho [Tue, 10 Sep 2019 19:19:08 +0000 (12:19 -0700)]
spirv: Add option to emit scoped memory barriers

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv: Add SpvMemoryModelVulkan and related capabilities
Caio Marcelo de Oliveira Filho [Tue, 10 Sep 2019 19:38:00 +0000 (12:38 -0700)]
spirv: Add SpvMemoryModelVulkan and related capabilities

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv: Emit memory barriers for atomic operations
Caio Marcelo de Oliveira Filho [Tue, 10 Sep 2019 20:21:49 +0000 (13:21 -0700)]
spirv: Emit memory barriers for atomic operations

Add a helper to split the memory semantics into before and after the
operation, and use that result to emit memory barriers.

v2: Be more explicit about which bits we are keeping around when
    splitting memory semantics into a before and after.  For now
    we are ignoring Volatile.  (Jason)
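
One way to picture the split, as a sketch rather than the helper from
this commit: release ordering has to take effect before the atomic,
acquire ordering after it.  The enum values are the Memory Semantics
bits from the SPIR-V specification; the names, the two functions, and
the choice to copy every non-ordering bit (for example storage-class
bits) to both halves are assumptions made for illustration.

```c
#include <stdint.h>

/* SPIR-V Memory Semantics ordering bits, under hypothetical names. */
enum {
   TOY_SEM_ACQUIRE         = 0x02,
   TOY_SEM_RELEASE         = 0x04,
   TOY_SEM_ACQUIRE_RELEASE = 0x08,
   TOY_SEM_SEQ_CST         = 0x10,
   TOY_SEM_ORDER_MASK      = 0x1e,
};

/* Semantics for the barrier emitted before the atomic operation. */
static uint32_t
semantics_before(uint32_t sem)
{
   const uint32_t other = sem & ~(uint32_t)TOY_SEM_ORDER_MASK;

   if (sem & (TOY_SEM_RELEASE | TOY_SEM_ACQUIRE_RELEASE | TOY_SEM_SEQ_CST))
      return TOY_SEM_RELEASE | other;
   return 0; /* no barrier needed before the operation */
}

/* Semantics for the barrier emitted after the atomic operation. */
static uint32_t
semantics_after(uint32_t sem)
{
   const uint32_t other = sem & ~(uint32_t)TOY_SEM_ORDER_MASK;

   if (sem & (TOY_SEM_ACQUIRE | TOY_SEM_ACQUIRE_RELEASE | TOY_SEM_SEQ_CST))
      return TOY_SEM_ACQUIRE | other;
   return 0; /* no barrier needed after the operation */
}
```

An AcquireRelease atomic therefore gets a release barrier in front of
it and an acquire barrier behind it, while a plain Acquire atomic only
gets the second one.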

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv: Parse memory semantics for atomic operations
Caio Marcelo de Oliveira Filho [Tue, 10 Sep 2019 20:16:36 +0000 (13:16 -0700)]
spirv: Parse memory semantics for atomic operations

This includes the right storage memory semantics, based on the storage
class of the operation.  These will be used later to emit memory barriers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago intel/fs: Implement scoped_memory_barrier
Caio Marcelo de Oliveira Filho [Thu, 5 Sep 2019 18:08:05 +0000 (11:08 -0700)]
intel/fs: Implement scoped_memory_barrier

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

5 years ago nir/tests: Add copy propagation tests with scoped_memory_barrier
Caio Marcelo de Oliveira Filho [Fri, 18 Oct 2019 05:58:57 +0000 (22:58 -0700)]
nir/tests: Add copy propagation tests with scoped_memory_barrier

Three groups of tests, effectively defining the cases in which the
optimization is allowed or prevented:

- Redundant loads      (a load  generated the value)
- Propagate SSA values (a store generated the value)
- Propagate a var      (a copy  generated the value)

Change the shader type of the tests to be COMPUTE so
nir_var_mem_shared can also be used.  This doesn't affect the semantics
of the copy propagation.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago nir: Add scoped_memory_barrier intrinsic
Caio Marcelo de Oliveira Filho [Thu, 18 Jul 2019 23:14:03 +0000 (16:14 -0700)]
nir: Add scoped_memory_barrier intrinsic

Add a NIR intrinsic that represents a memory barrier in SPIR-V /
Vulkan Memory Model, with extra attributes that describe the barrier:

- Ordering: whether it is an Acquire or Release;
- "Cache control": availability ("ensure this gets written in the memory")
  and visibility ("ensure my cache is up to date when I'm reading");
- Variable modes: which memory types this barrier applies to;
- Scope: how far this barrier applies.

Note that unlike in SPIR-V, the "Storage Semantics" and the "Memory
Semantics" are split into two different attributes so we can use
variable modes for the former.

NIR passes that took barriers into consideration were also changed:

- nir_opt_copy_prop_vars: clean up the values for the mode of an
  ACQUIRE barrier.  Copy propagation effect is to "pull up a load" (by
  not performing it), which is what ACQUIRE restricts.

- nir_opt_dead_write_vars and nir_opt_combine_writes: clean up the
  pending writes for the modes of a RELEASE barrier.  Dead writes
  effect is to "push down a store", which is what RELEASE restricts.

- nir_opt_access: treat the ACQUIRE and RELEASE as a full barrier for
  the modes.  This is conservative, but since this is a GL-specific
  pass, doesn't make a difference for now.
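
The copy propagation rule above can be sketched with a toy model.  All
names here (`toy_barrier`, `known_value`, the TOY_* flags) are
hypothetical and are not NIR types; scope and the availability and
visibility bits are left out.  The point is only that an ACQUIRE
barrier forgets the values known for the modes it covers, while a
barrier without ACQUIRE leaves them alone.

```c
#include <stdbool.h>
#include <stddef.h>

enum { TOY_ACQUIRE = 1u << 0, TOY_RELEASE = 1u << 1 };    /* ordering */
enum { TOY_MODE_SSBO = 1u << 0, TOY_MODE_SHARED = 1u << 1 }; /* modes */

/* Toy barrier: ordering plus the memory types it applies to. */
struct toy_barrier {
   unsigned semantics;
   unsigned modes;
};

/* A value that copy propagation currently knows for some variable. */
struct known_value {
   unsigned mode;   /* memory type of the variable */
   bool valid;      /* false once the value may be stale */
};

/* Forget known values that an ACQUIRE barrier makes unsafe to reuse.
 * Returns how many entries were invalidated.
 */
static size_t
apply_barrier(const struct toy_barrier *b, struct known_value *vals, size_t n)
{
   if (!(b->semantics & TOY_ACQUIRE))
      return 0; /* only ACQUIRE restricts "pulling up" a load */

   size_t dropped = 0;
   for (size_t i = 0; i < n; i++) {
      if (vals[i].valid && (vals[i].mode & b->modes)) {
         vals[i].valid = false;
         dropped++;
      }
   }
   return dropped;
}
```

The dead-write side is the mirror image: a RELEASE barrier would flush
the pending writes for its modes instead of the known values.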

v2: Fix the scoped barrier handling in copy propagation.  (Jason)
    Add scoped barrier handling to nir_opt_access and
    nir_opt_combine_writes.  (Rhys)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago spirv/info: Add a memorymodel_to_string helper
Jason Ekstrand [Fri, 6 Jul 2018 21:05:22 +0000 (14:05 -0700)]
spirv/info: Add a memorymodel_to_string helper

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago docs: Add release not about scons deprecation
Dylan Baker [Thu, 24 Oct 2019 17:09:21 +0000 (10:09 -0700)]
docs: Add release not about scons deprecation

5 years ago scons: Also print a deprecation warning on windows
Dylan Baker [Mon, 21 Oct 2019 17:18:50 +0000 (10:18 -0700)]
scons: Also print a deprecation warning on windows

This warning is different. Meson support for windows is less mature than
for other platforms, and the goal here is to alert people that
eventually we plan to drop scons and move to meson, and that they should
try out meson and report issues.

Reviewed-by: Eric Anholt <eric@anholt.net>

5 years ago scons: Print a deprecation warning about using scons on not windows
Dylan Baker [Mon, 21 Oct 2019 16:29:23 +0000 (09:29 -0700)]
scons: Print a deprecation warning about using scons on not windows

At this point meson should be able to handle all of the non-windows
platforms just fine; we'd like to be able to stop maintaining scons for
those platforms sooner rather than later.

Reviewed-by: Eric Anholt <eric@anholt.net>

5 years ago scons: Use print_function ins SConstruct
Dylan Baker [Mon, 21 Oct 2019 16:24:12 +0000 (09:24 -0700)]
scons: Use print_function ins SConstruct

This ensures that we get python3's print() function behavior even in
python2, instead of python2's print statement behavior. We'll be using
this in the next patch.

Reviewed-by: Eric Anholt <eric@anholt.net>

5 years ago gallium: Fix a bunch of undefined left-shifts in u_format_*
Adam Jackson [Wed, 23 Oct 2019 21:07:03 +0000 (17:07 -0400)]
gallium: Fix a bunch of undefined left-shifts in u_format_*

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>

5 years ago radv: compute the number of records correctly for vertex buffers
Samuel Pitoiset [Mon, 21 Oct 2019 16:41:33 +0000 (18:41 +0200)]
radv: compute the number of records correctly for vertex buffers

On GFX8 the number of records is in bytes while on other chips
it's in units of "stride".
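
As a rough sketch of the distinction, not the radv code: the function
name and the `records_in_bytes` flag are hypothetical, attribute
offsets and a trailing partial element are ignored, and treating a
zero stride like the byte case is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compute the record count for a vertex buffer descriptor.
 * `records_in_bytes` stands for "this chip counts records in bytes"
 * (the GFX8 case described above).
 */
static uint32_t
vertex_buffer_num_records(uint32_t size_bytes, uint32_t stride,
                          bool records_in_bytes)
{
   if (records_in_bytes || stride == 0)
      return size_bytes;        /* one record per byte */

   return size_bytes / stride;  /* one record per whole stride */
}
```

For a 64-byte buffer with a 16-byte stride this gives 64 records when
counting in bytes and 4 when counting in strides.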

Fixes dEQP-VK.robustness.vertex_access.*.draw.vertex_* on RAVEN.

Tested on GFX6, GFX8, GFX10 and RAVEN.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

5 years ago gitlab-ci: Enable UBSan for the meson-vulkan job
Michel Dänzer [Wed, 25 Sep 2019 10:56:58 +0000 (12:56 +0200)]
gitlab-ci: Enable UBSan for the meson-vulkan job

It doesn't report any errors now.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>

5 years ago util/tests: Avoid int64_t overflow issues in fast_idiv_by_const test
Michel Dänzer [Wed, 25 Sep 2019 09:55:33 +0000 (11:55 +0200)]
util/tests: Avoid int64_t overflow issues in fast_idiv_by_const test

Flagged by UBSan:

../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233:14: runtime error: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
    #0 0x55b4c1a2a428 in rand_sint ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233
    #1 0x55b4c1a2ad3a in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:308
    #2 0x55b4c1a2b837 in fast_idiv_by_const_int32_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:410
    #3 0x55b4c1abc13f in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #4 0x55b4c1aa7a4d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #5 0x55b4c1a4ce57 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #6 0x55b4c1a4f530 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #7 0x55b4c1a51cbe in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #8 0x55b4c1a6d698 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #9 0x55b4c1abfd58 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #10 0x55b4c1aab425 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #11 0x55b4c1a64cba in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #12 0x55b4c1ae4b73 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #13 0x55b4c1ae4a33 in main ../src/gtest/src/gtest_main.cc:37
    #14 0x7ff172d1dbba in __libc_start_main ../csu/libc-start.c:308
    #15 0x55b4c1a28dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309:52: runtime error: negation of -9223372036854775808 cannot be represented in type 'long int'; cast to an unsigned type to negate this value to itself
    #0 0x563b24dafd2d in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309
    #1 0x563b24db0f0f in fast_idiv_by_const_int64_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:473
    #2 0x563b24e41111 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #3 0x563b24e2ca1f in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #4 0x563b24dd1e29 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #5 0x563b24dd4502 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #6 0x563b24dd6c90 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #7 0x563b24df266a in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #8 0x563b24e44d2a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #9 0x563b24e303f7 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #10 0x563b24de9c8c in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #11 0x563b24e69b45 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #12 0x563b24e69a05 in main ../src/gtest/src/gtest_main.cc:37
    #13 0x7f9a90330bba in __libc_start_main ../csu/libc-start.c:308
    #14 0x563b24daddc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

v2:
* Use INT64_MIN instead of LLONG_MIN (Jason Ekstrand)
* Simpler test for INT64_MIN result from rand_sint (Jason Ekstrand)
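
For reference, the pattern UBSan hints at ("cast to an unsigned type to
negate this value") can be sketched as below.  This is an illustration,
not the change made to the test; `abs_u64` and `can_negate` are
hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Magnitude of x as an unsigned value.  The negation happens on
 * uint64_t, where wrap-around is well defined, so INT64_MIN is fine.
 */
static uint64_t
abs_u64(int64_t x)
{
   return x < 0 ? -(uint64_t)x : (uint64_t)x;
}

/* -x is representable in int64_t for every value except INT64_MIN. */
static bool
can_negate(int64_t x)
{
   return x != INT64_MIN;
}
```

A test that needs a signed negation can check `can_negate()` first and
pick another value otherwise.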

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>

5 years ago util: Use uint64_t for shifting left in sign_extend and strunc
Michel Dänzer [Wed, 25 Sep 2019 09:44:24 +0000 (11:44 +0200)]
util: Use uint64_t for shifting left in sign_extend and strunc

Shifting int64_t values left into the sign bit has undefined behaviour:

../src/util/fast_idiv_by_const.c:175:14: runtime error: left shift of 131 by 56 places cannot be represented in type 'long int'
    #0 0x561337ed10c1 in sign_extend ../src/util/fast_idiv_by_const.c:175
    #1 0x561337ed1335 in util_compute_fast_sdiv_info ../src/util/fast_idiv_by_const.c:239
    #2 0x561337e17519 in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:357
    #3 0x561337ea815d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #4 0x561337e93a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #5 0x561337e38e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #6 0x561337e3b54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #7 0x561337e3dcdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #8 0x561337e596b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #9 0x561337eabd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #10 0x561337e97443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #11 0x561337e50cd8 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #12 0x561337ed0b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #13 0x561337ed0a51 in main ../src/gtest/src/gtest_main.cc:37
    #14 0x7f85ba483bba in __libc_start_main ../csu/libc-start.c:308
    #15 0x561337e14dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51:14: runtime error: left shift of negative value -63
    #0 0x55fc3c0e67cc in strunc ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51
    #1 0x55fc3c0e6d93 in smul_high ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:140
    #2 0x55fc3c0e7067 in fast_sdiv ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:181
    #3 0x55fc3c0e858b in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:358
    #4 0x55fc3c17915d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x55fc3c164a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x55fc3c109e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #7 0x55fc3c10c54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x55fc3c10ecdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x55fc3c12a6b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x55fc3c17cd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x55fc3c168443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x55fc3c121cd8 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x55fc3c1a1b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x55fc3c1a1a51 in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7fd224759bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x55fc3c0e5dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

v2:
* Use two casts instead of changing the argument type (Jason Ekstrand)
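
A sketch of the two-cast approach, written from the description above
rather than copied from fast_idiv_by_const.c.  It assumes the usual
compiler behaviour for the two implementation-defined steps: converting
an out-of-range uint64_t to int64_t wraps, and >> on a negative value
is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of x to a full int64_t. */
static int64_t
sign_extend(int64_t x, unsigned bits)
{
   assert(bits >= 1 && bits <= 64);
   const unsigned shift = 64 - bits;

   /* First cast: shift left as uint64_t, which cannot overflow into
    * undefined behaviour.  Second cast: back to int64_t so that the
    * right shift replicates the sign bit.
    */
   return (int64_t)((uint64_t)x << shift) >> shift;
}
```

With the value from the first report above, `sign_extend(131, 8)` moves
bit 7 into the sign position and yields -125 without tripping UBSan.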

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>

5 years ago gallium/util: Cast to target type before shifting left
Michel Dänzer [Wed, 25 Sep 2019 09:37:49 +0000 (11:37 +0200)]
gallium/util: Cast to target type before shifting left

Otherwise a smaller type may be promoted to int, which can hit undefined
behaviour:

../src/gallium/auxiliary/util/u_half.h:126:29: runtime error: left shift of 32768 by 16 places cannot be represented in type 'int'
    #0 0x5646ff63d488 in util_half_to_float ../src/gallium/auxiliary/util/u_half.h:126
    #1 0x5646ff63d749 in _mesa_half_to_float ../src/util/half_float.c:145
    #2 0x5646ff54d557 in nir_const_value_negative_equal ../src/compiler/nir/nir_instr_set.c:372
    #3 0x5646ff44d29a in const_value_negative_equal_test_nir_type_float16_trivially_true_Test::TestBody() ../src/compiler/nir/tests/negative_equal_tests.cpp:121
    #4 0x5646ff505c05 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x5646ff4f1513 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x5646ff4979b5 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #7 0x5646ff49a08e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x5646ff49c81c in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x5646ff4b81f6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x5646ff50981e in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x5646ff4f4eeb in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x5646ff4af818 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x5646ff52e639 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x5646ff52e4f9 in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7f6bacb78bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x5646ff448019 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/compiler/nir/negative_equal+0x17c019)
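
The promotion problem can be shown in isolation.  This is a sketch, not
the code from u_half.h; `half_sign_to_float_bits` is a hypothetical
helper that only moves the sign bit of a 16-bit half float to bit 31.

```c
#include <stdint.h>

/* Move the sign bit of a half float (bit 15) to bit 31. */
static uint32_t
half_sign_to_float_bits(uint16_t h)
{
   /* (h & 0x8000) is promoted to int, and 32768 << 16 does not fit in
    * a 32-bit int.  Casting to uint32_t before the shift keeps the
    * whole computation in an unsigned type.
    */
   return (uint32_t)(h & 0x8000) << 16;
}
```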

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>

5 years ago intel/fs: Check for NULL key in fs_visitor constructor
Michel Dänzer [Wed, 25 Sep 2019 09:34:27 +0000 (11:34 +0200)]
intel/fs: Check for NULL key in fs_visitor constructor

Flagged by UBSan:

../src/intel/compiler/brw_fs_visitor.cpp:986:20: runtime error: member access within null pointer of type 'const struct brw_base_prog_key'
    #0 0x559fadb48556 in fs_visitor::init() ../src/intel/compiler/brw_fs_visitor.cpp:986
    #1 0x559fadb46db3 in fs_visitor::fs_visitor(brw_compiler const*, void*, void*, brw_base_prog_key const*, brw_stage_prog_data*, nir_shader const*, unsigned int, int, brw_vue_map const*) ../src/intel/compiler/brw_fs_visitor.cpp:962
    #2 0x559fad9c7cd8 in saturate_propagation_fs_visitor::saturate_propagation_fs_visitor(brw_compiler*, brw_wm_prog_data*, nir_shader*) (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x61bcd8)
    #3 0x559fad9960a1 in saturate_propagation_test::SetUp() ../src/intel/compiler/test_fs_saturate_propagation.cpp:65
    #4 0x559fadd7a32d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x559fadd65c3b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x559fadd0af75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2470
    #7 0x559fadd0d8a4 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x559fadd10032 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x559fadd2ba0c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x559fadd7df46 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x559fadd69613 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x559fadd2302e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x559fadda2d61 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x559fadda2c21 in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7fe8f6748bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x559fad9950f9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x5e90f9)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>

5 years ago intel/compiler: Cast to target type before shifting left
Michel Dänzer [Wed, 25 Sep 2019 09:31:23 +0000 (11:31 +0200)]
intel/compiler: Cast to target type before shifting left

Otherwise a smaller type may be promoted to int, which can hit undefined
behaviour:

../src/intel/compiler/brw_packed_float.c:66:17: runtime error: left shift of 128 by 24 places cannot be represented in type 'int'
    #0 0x5604a03969aa in brw_vf_to_float ../src/intel/compiler/brw_packed_float.c:66
    #1 0x5604a0391305 in vf_float_conversion_test_test_vf_to_float_Test::TestBody() ../src/intel/compiler/test_vf_float_conversions.cpp:70
    #2 0x5604a041a323 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #3 0x5604a0405c31 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #4 0x5604a03ab03b in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #5 0x5604a03ad714 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #6 0x5604a03afea2 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #7 0x5604a03cb87c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #8 0x5604a041df3c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #9 0x5604a0409609 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #10 0x5604a03c2e9e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #11 0x5604a0442d57 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #12 0x5604a0442c17 in main ../src/gtest/src/gtest_main.cc:37
    #13 0x7f9a1983dbba in __libc_start_main ../csu/libc-start.c:308
    #14 0x5604a0390d89 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/vf_float_conversions+0x8dd89)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>

5 years ago intel/compiler: Don't left-shift by >= the number of bits of the type
Michel Dänzer [Wed, 25 Sep 2019 09:17:11 +0000 (11:17 +0200)]
intel/compiler: Don't left-shift by >= the number of bits of the type

To avoid it, use the modulo of the number of bits in the value being
shifted, which is presumably what ended up happening on x86.

Flagged by UBSan:

../src/intel/compiler/brw_eu_validate.c:974:33: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
    #0 0x561abb612ab3 in general_restrictions_on_region_parameters ../src/intel/compiler/brw_eu_validate.c:974
    #1 0x561abb617574 in brw_validate_instructions ../src/intel/compiler/brw_eu_validate.c:1851
    #2 0x561abb53bd31 in validate ../src/intel/compiler/test_eu_validate.cpp:106
    #3 0x561abb555369 in validation_test_source_cannot_span_more_than_2_registers_Test::TestBody() ../src/intel/compiler/test_eu_validate.cpp:486
    #4 0x561abb742651 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x561abb72e64d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x561abb6d5451 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #7 0x561abb6d7b2a in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x561abb6da2b8 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x561abb6f5c92 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x561abb74626a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x561abb732025 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x561abb6ed2b4 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x561abb768b3b in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x561abb7689fb in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7f525e5a9bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x561abb538ed9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/eu_validate+0x1b8ed9)

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agoanv: fix error message
Eric Engestrom [Thu, 24 Oct 2019 12:04:51 +0000 (13:04 +0100)]
anv: fix error message

`strerror()` takes an `errno`, not the negative value returned by the
`ioctl()`.
Instead of fixing this as `"%s", strerror(errno)`, let's just use the
`"%m"` shortcut for it.
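
A minimal sketch of the `"%m"` form, assuming a C library that supports
this glibc extension to the printf family. The helper name
format_errno_message is hypothetical and is not the anv code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format an error message for a failed call.
 * "%m" expands to strerror(errno) and consumes no argument, so the
 * (negative) return value of the failed call is never passed in. */
static inline int
format_errno_message(char *buf, size_t size, const char *what)
{
   return snprintf(buf, size, "%s failed: %m", what);
}
```

The caller has to format the message before anything else can overwrite
errno.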

Fixes: 2b5f30b1d91b98ab27ba ("anv: implement VK_INTEL_performance_query")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agomeson: add -Werror=empty-body to disallow `if(x);`
Eric Engestrom [Mon, 23 Sep 2019 16:21:20 +0000 (17:21 +0100)]
meson: add -Werror=empty-body to disallow `if(x);`

This would have prevented a bug in MR 2058 [1]; with that MR fixed,
nothing else uses empty-body blocks, so let's just forbid them altogether.

[1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2058#note_237880

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agollvmpipe: avoid generating empty-body blocks
Eric Engestrom [Wed, 25 Sep 2019 07:49:05 +0000 (08:49 +0100)]
llvmpipe: avoid generating empty-body blocks

Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agollvmpipe: avoid compiling no-op block on release builds
Eric Engestrom [Wed, 25 Sep 2019 07:47:28 +0000 (08:47 +0100)]
llvmpipe: avoid compiling no-op block on release builds

Suggested-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agowinsys/svga: Limit the maximum DMA hardware buffer size
Thomas Hellstrom [Thu, 3 Oct 2019 10:44:42 +0000 (12:44 +0200)]
winsys/svga: Limit the maximum DMA hardware buffer size

The kernel limits the total GMR/DMA size, but it may still allow a larger
buffer allocation to succeed. Command submission using that buffer as a
GMR would then fail, typically causing an application crash.

So have the winsys limit the size of GMR/DMA buffers. The pipe driver will
then resort to allocating smaller buffers and perform the DMA transfer in
multiple bands, also allowing for the pre-flush mechanism to kick in.
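
The banding arithmetic implied above might look roughly like this. It is
a sketch under assumptions: the helper names and the row-based split are
hypothetical and are not taken from the svga driver.

```c
#include <stddef.h>

/* Hypothetical helper: how many rows fit in one band when a single
 * DMA buffer may not exceed max_buffer_size bytes. Returns 0 if not
 * even one row fits. */
static inline size_t
rows_per_band(size_t row_stride, size_t max_buffer_size)
{
   return row_stride ? max_buffer_size / row_stride : 0;
}

/* Hypothetical helper: number of bands needed to upload total_rows. */
static inline size_t
band_count(size_t total_rows, size_t row_stride, size_t max_buffer_size)
{
   size_t per_band = rows_per_band(row_stride, max_buffer_size);
   if (per_band == 0)
      return 0;
   return (total_rows + per_band - 1) / per_band;  /* round up */
}
```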

This avoids the related application crashes.

Fixes: e7843273fae ("winsys/svga: Update to vmwgfx kernel module 2.1")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agosvga: Fix banded DMA upload unmap
Thomas Hellstrom [Thu, 3 Oct 2019 10:26:39 +0000 (12:26 +0200)]
svga: Fix banded DMA upload unmap

Even with banded DMA uploads, st->hwbuf is always non-NULL, but when we've
allocated a software buffer to hold the full upload, unmapping of the
hardware buffer has already been done before
svga_texture_transfer_unmap_dma(), and the code was performing an unmap of
an already unmapped buffer.

Fix this by testing for software buffer not present.

Fixes: a9c4a861d5d ("svga: refactor svga_texture_transfer_map/unmap functions")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agogitlab-ci: Update kernel for LAVA jobs to 5.4-rc4
Tomeu Vizoso [Mon, 21 Oct 2019 14:27:31 +0000 (16:27 +0200)]
gitlab-ci: Update kernel for LAVA jobs to 5.4-rc4

Update to 5.4-rc4 so we can test Panfrost on devices with Mali T720 and
T820.

A bug was found that prevented things working at all on RK3288 devices,
so we carry a patch for now in my personal fork.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
5 years agoglsl: remove propagate_invariance() call from the linker
Timothy Arceri [Wed, 23 Oct 2019 03:23:31 +0000 (14:23 +1100)]
glsl: remove propagate_invariance() call from the linker

This was added in 586f4a42e78f and became redundant with 34ab9b0947cd

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agonir: improve nir_variable packing
Timothy Arceri [Wed, 23 Oct 2019 00:43:59 +0000 (11:43 +1100)]
nir: improve nir_variable packing

Before:

/* size: 136, cachelines: 3, members: 10 */

After:

/* size: 128, cachelines: 2, members: 10 */
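
As a generic illustration of how member ordering affects struct size (the
structs below are made up and are not the nir_variable fields), grouping
small members together removes alignment padding on common ABIs:

```c
#include <stdint.h>

/* Illustrative only: small members separated by a pointer. */
struct padded {
   uint8_t  flag_a;   /* followed by padding up to pointer alignment */
   void    *ptr;
   uint8_t  flag_b;   /* followed by tail padding */
};

/* Same members, reordered so the small ones share one aligned slot. */
struct reordered {
   void    *ptr;
   uint8_t  flag_a;
   uint8_t  flag_b;
};
```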

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
5 years agonir: fix nir_variable_data packing
Timothy Arceri [Wed, 23 Oct 2019 00:37:28 +0000 (11:37 +1100)]
nir: fix nir_variable_data packing

Before:

/* size: 60, cachelines: 1, members: 29 */

After:

/* size: 56, cachelines: 1, members: 29 */

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
5 years agoradeonsi/nir: implement pipe_screen::finalize_nir
Marek Olšák [Fri, 27 Sep 2019 00:24:17 +0000 (20:24 -0400)]
radeonsi/nir: implement pipe_screen::finalize_nir

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agost/mesa: use pipe_screen::finalize_nir
Marek Olšák [Fri, 27 Sep 2019 22:09:11 +0000 (18:09 -0400)]
st/mesa: use pipe_screen::finalize_nir

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agotgsi_to_nir: use pipe_screen::finalize_nir
Marek Olšák [Fri, 27 Sep 2019 18:55:58 +0000 (14:55 -0400)]
tgsi_to_nir: use pipe_screen::finalize_nir

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agogallium: add pipe_screen::finalize_nir
Marek Olšák [Fri, 18 Oct 2019 01:28:56 +0000 (21:28 -0400)]
gallium: add pipe_screen::finalize_nir

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agost/mesa: update VS shader_info for NIR after lowering passes
Marek Olšák [Fri, 18 Oct 2019 22:02:57 +0000 (18:02 -0400)]
st/mesa: update VS shader_info for NIR after lowering passes

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agost/mesa: assign driver locations for VS inputs for NIR before caching
Marek Olšák [Fri, 18 Oct 2019 17:02:15 +0000 (13:02 -0400)]
st/mesa: assign driver locations for VS inputs for NIR before caching

fix up edge flags in the NIR pass, because st/mesa doesn't touch the inputs
after caching

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agost/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs
Marek Olšák [Tue, 22 Oct 2019 19:32:17 +0000 (15:32 -0400)]
st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agost/mesa: move some NIR lowering before shader caching
Marek Olšák [Fri, 18 Oct 2019 01:03:34 +0000 (21:03 -0400)]
st/mesa: move some NIR lowering before shader caching

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoutil/u_queue: skip util_queue_finish if num_threads is 0
Marek Olšák [Thu, 24 Oct 2019 01:01:38 +0000 (21:01 -0400)]
util/u_queue: skip util_queue_finish if num_threads is 0

This fixes a deadlock in pthread_barrier_destroy.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoutil/disk_cache: finish all queue jobs in destroy instead of killing them
Marek Olšák [Wed, 23 Oct 2019 20:15:37 +0000 (16:15 -0400)]
util/disk_cache: finish all queue jobs in destroy instead of killing them

If there are queued shaders to be written to disk, wait for that.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Rework edgeflag handling
Kenneth Graunke [Wed, 23 Oct 2019 22:38:52 +0000 (15:38 -0700)]
iris: Rework edgeflag handling

We were relying on specific pass ordering in st to avoid setting
inputs_read/outputs_written for edge flags.  Instead, just assume
that it happens and throw out the results we don't want.

We should probably revisit this and try and add a vertex element
property like I originally wanted so we can avoid having it be
associated with the VS altogether.

5 years agogallium/noop: implement get_disk_shader_cache and get_compiler_options
Marek Olšák [Wed, 23 Oct 2019 21:10:01 +0000 (17:10 -0400)]
gallium/noop: implement get_disk_shader_cache and get_compiler_options

trivial

5 years agoaco: take LDS into account when calculating num_waves
Rhys Perry [Fri, 18 Oct 2019 18:06:10 +0000 (19:06 +0100)]
aco: take LDS into account when calculating num_waves

pipeline-db (Vega):
SGPRS: 344 -> 344 (0.00 %)
VGPRS: 424 -> 524 (23.58 %)
Spilled SGPRs: 84 -> 80 (-4.76 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 52812 -> 52484 (-0.62 %) bytes
LDS: 135 -> 135 (0.00 %) blocks
Max Waves: 56 -> 53 (-5.36 %)

v2: consider WGP, rework to be clearer and apply the
    "maximum 16 workgroups per CU" limit properly
v2: use "SIMD" instead of "EU"
v2: fix spiller by introducing "Program::max_waves"
v2: rename "lds_size" to "lds_limit"
v3: make max_waves actually independent of register usage
v3: fix issue where max_waves was way too high
v3: use DIV_ROUND_UP(a, b) instead of max(a / b, 1)
v3: rename "workgroups_per_cu" to "workgroups_per_cu_wgp"
v4: fix typo from "workgroups_per_cu" rename
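
To illustrate the v3 note about DIV_ROUND_UP(a, b) versus max(a / b, 1):
the two only agree when a is a multiple of b or when a < b. The macros
below are local definitions for this sketch, and the function names are
hypothetical.

```c
/* Local definitions for illustration; Mesa provides its own macros. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Rounds the quotient up, so any remainder counts as a full unit. */
static inline unsigned
ceil_div(unsigned a, unsigned b)
{
   return DIV_ROUND_UP(a, b);
}

/* Rounds the quotient down and only clamps the result to at least 1. */
static inline unsigned
floor_div_at_least_one(unsigned a, unsigned b)
{
   return MAX2(a / b, 1u);
}
```

For a = 5 and b = 4 the first form yields 2 while the second yields 1.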

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)
5 years agoaco: increase accuracy of SGPR limits
Rhys Perry [Fri, 13 Sep 2019 15:41:00 +0000 (16:41 +0100)]
aco: increase accuracy of SGPR limits

SGPRs are allocated in groups of 16 on GFX8/GFX9. GFX10 allocates a fixed
number of SGPRs and has 106 addressable SGPRs.
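
Rounding to the allocation granule can be sketched as below. Only the
granule of 16 on GFX8/GFX9 comes from this message; the helper name is
hypothetical and this is not the ACO code.

```c
/* Hypothetical helper: round a register count up to the next multiple
 * of the allocation granule (e.g. 16 SGPRs on GFX8/GFX9). */
static inline unsigned
round_up_to_granule(unsigned count, unsigned granule)
{
   return (count + granule - 1) / granule * granule;
}
```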

pipeline-db (Vega):
SGPRS: 5912 -> 6232 (5.41 %)
VGPRS: 1772 -> 1780 (0.45 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 88228 -> 87904 (-0.37 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 559 -> 571 (2.15 %)

pipeline-db (Navi):
SGPRS: 341256 -> 363384 (6.48 %)
VGPRS: 171536 -> 170960 (-0.34 %)
Spilled SGPRs: 832 -> 581 (-30.17 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 14207332 -> 14190872 (-0.12 %) bytes
LDS: 33 -> 33 (0.00 %) blocks
Max Waves: 18072 -> 18251 (0.99 %)

v2: unconditionally count vcc as an extra sgpr on GFX10+
v3: pass SGPRs rounded to 8

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoradv: round vgprs/sgprs before calculating max_waves
Rhys Perry [Fri, 18 Oct 2019 20:13:44 +0000 (21:13 +0100)]
radv: round vgprs/sgprs before calculating max_waves

Note that ACO doesn't correctly round SGPR counts on GFX8/GFX9.

pipeline-db (ACO/Vega):
SGPRS: 11000 -> 11000 (0.00 %)
VGPRS: 3120 -> 3120 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 164328 -> 164328 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1125 -> 1000 (-11.11 %)

v2: consider wave32

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agodocs: Add new Intel extension
Lionel Landwerlin [Wed, 23 Oct 2019 16:07:32 +0000 (19:07 +0300)]
docs: Add new Intel extension

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoRevert "vc4: do not report alpha-test as supported"
Erik Faye-Lund [Wed, 23 Oct 2019 11:02:55 +0000 (13:02 +0200)]
Revert "vc4: do not report alpha-test as supported"

This reverts commit a79b93269cf340ce4d23b5b34100039bcaafc841.

Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
5 years agoRevert "v3d: do not report alpha-test as supported"
Erik Faye-Lund [Mon, 21 Oct 2019 08:48:11 +0000 (10:48 +0200)]
Revert "v3d: do not report alpha-test as supported"

This reverts commit 9d0523b569bb7208c6e74cafc0f3945415d94336.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
5 years agoRevert "nir: drop support for using load_alpha_ref_float"
Erik Faye-Lund [Mon, 21 Oct 2019 08:48:09 +0000 (10:48 +0200)]
Revert "nir: drop support for using load_alpha_ref_float"

This reverts commit 5af272b47469398762e984e27f65fc4ecc293d28.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
5 years agoRevert "nir: drop unused alpha_ref_float"
Erik Faye-Lund [Mon, 21 Oct 2019 08:48:07 +0000 (10:48 +0200)]
Revert "nir: drop unused alpha_ref_float"

This reverts commit e8095f2af0736b5937674ca319f29cc9dabb17d4.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
5 years agoradv: fix a performance regression with graphics depth/stencil clears
Samuel Pitoiset [Tue, 22 Oct 2019 14:43:56 +0000 (16:43 +0200)]
radv: fix a performance regression with graphics depth/stencil clears

I recently changed the slow depth/stencil clear path to make sure
depth values are explicitly exported by the fragment shader. This
is actually only useful when VK_EXT_depth_range_unrestricted is
enabled.

While this path is correct, it introduced a performance regression
with Heroes of the Storm, Shadow of Mordor (Vulkan beta) and
probably more titles. This is because it prevents the hardware
from doing some optimizations like discarding fragments.

This commit re-introduces the previous (a bit faster) slow
depth/stencil clear path and it selects the unrestricted path
only if VK_EXT_depth_range_unrestricted is enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/863
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix vkUpdateDescriptorSets with inline uniform blocks
Samuel Pitoiset [Mon, 21 Oct 2019 11:32:05 +0000 (13:32 +0200)]
radv: fix vkUpdateDescriptorSets with inline uniform blocks

For inline uniform blocks, descriptorCount is the number of bytes to
copy, so it shouldn't be used as an index. srcArrayElement/dstArrayElement
specify the starting byte offset within the binding to copy from/to.
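
A minimal sketch of these byte-based semantics, assuming each binding is
backed by a plain byte array. The helper name and parameters are
hypothetical and are not radv code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: for inline uniform blocks the "array element"
 * fields are byte offsets and the count is a byte count, so the copy
 * is a plain memcpy between the two bindings. */
static inline void
copy_inline_uniform_bytes(uint8_t *dst_binding, uint32_t dst_array_element,
                          const uint8_t *src_binding, uint32_t src_array_element,
                          uint32_t descriptor_count)
{
   memcpy(dst_binding + dst_array_element,
          src_binding + src_array_element,
          descriptor_count);
}
```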

This fixes new CTS tests:
dEQP-VK.binding_model.descriptor_copy.*.inline_uniform_block_*
dEQP-VK.binding_model.descriptor_copy.*.mix_3
dEQP-VK.binding_model.descriptor_copy.*.mix_array1

Fixes: 8d2654a4197 ("radv: Support VK_EXT_inline_uniform_block.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: fix 3D images
Samuel Pitoiset [Mon, 21 Oct 2019 13:11:35 +0000 (15:11 +0200)]
radv/gfx10: fix 3D images

GFX10 does act like GFX9 actually.

This fixes
dEQP-VK.glsl.texture_functions.query.texturesize.*sampler3d_*.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: re-enable fast depth/stencil clears with separate aspects
Samuel Pitoiset [Thu, 17 Oct 2019 08:19:37 +0000 (10:19 +0200)]
radv/gfx10: re-enable fast depth/stencil clears with separate aspects

It used to cause weird issues on GFX10 in the past with vkmark and
Wreckfest, and they can't be reproduced now. Shadow Of Mordor
(Vulkan beta) hits that path and it works fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: do not emit rbplus if attachments are undefined
Samuel Pitoiset [Mon, 21 Oct 2019 14:03:47 +0000 (16:03 +0200)]
radv: do not emit rbplus if attachments are undefined

Fixes some crashes with dEQP-VK.geometry.layered.*.secondary_cmd_buffer
on Raven and other chips that allow rbplus.

This just prevents a crash and rbplus probably needs more work.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add an assertion in radv_gfx10_compute_bin_size()
Samuel Pitoiset [Mon, 21 Oct 2019 08:40:23 +0000 (10:40 +0200)]
radv: add an assertion in radv_gfx10_compute_bin_size()

To prevent out of bounds access.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: do not create meta pipelines with 16 samples
Samuel Pitoiset [Mon, 21 Oct 2019 08:42:30 +0000 (10:42 +0200)]
radv: do not create meta pipelines with 16 samples

The driver only supports up to 8 samples, so it's useless to
create more pipelines than needed.

This fixes a conditional jump reported by Valgrind on GFX10:

==194282== Conditional jump or move depends on uninitialised value(s)
==194282==    at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242)
==194282==    by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334)
==194282==    by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440)
==194282==    by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764)
==194282==    by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788)
==194282==    by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114)
==194282==    by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297)
==194282==    by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277)
==194282==    by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363)
==194282==    by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080)

This is caused by an out-of-bounds access of 'fmask_array' (i.e. the
index is 4 for 16 samples).
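
The indexing problem can be sketched as follows. The table size of four
entries (1, 2, 4 and 8 samples) is an assumption made for illustration,
and the helper names are hypothetical rather than taken from
radv_pipeline.c.

```c
#include <assert.h>

/* Assumed for illustration: one table entry per supported sample
 * count (1, 2, 4 and 8 samples). */
#define NUM_SAMPLE_ENTRIES 4

/* Hypothetical helper: map a power-of-two sample count to a table
 * index (log2). 16 samples gives index 4, one past the end. */
static inline unsigned
sample_count_to_index(unsigned samples)
{
   unsigned index = 0;
   while (samples > 1) {
      samples >>= 1;
      index++;
   }
   return index;
}

/* Hypothetical lookup guarded by an assertion on the index. */
static inline int
lookup_by_samples(const int table[NUM_SAMPLE_ENTRIES], unsigned samples)
{
   unsigned index = sample_count_to_index(samples);
   assert(index < NUM_SAMPLE_ENTRIES);  /* catch out-of-bounds access */
   return table[index];
}
```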

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoanv: implement VK_INTEL_performance_query
Lionel Landwerlin [Thu, 7 Jun 2018 17:02:03 +0000 (18:02 +0100)]
anv: implement VK_INTEL_performance_query

v2: Introduce the appropriate pipe controls
    Properly deal with changes in metric sets (using execbuf parameter)
    Record marker at query end

v3: Fill out PerfCntr1&2

v4: Introduce vkUninitializePerformanceApiINTEL

v5: Use new execbuf extension mechanism

v6: Fix comments in genX_query.c (Rafael)
    Use PIPE_CONTROL workarounds (Rafael)
    Refactor on the last kernel series update (Lionel)

v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/perf: add mdapi writes for register perf counters
Lionel Landwerlin [Wed, 28 Nov 2018 15:10:09 +0000 (15:10 +0000)]
intel/perf: add mdapi writes for register perf counters

Those are not part of the OA reports.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/genxml: add RPSTAT register for core frequency
Lionel Landwerlin [Fri, 11 Oct 2019 12:53:16 +0000 (15:53 +0300)]
intel/genxml: add RPSTAT register for core frequency

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/genxml: add generic perf counters registers
Lionel Landwerlin [Wed, 28 Nov 2018 15:08:51 +0000 (15:08 +0000)]
intel/genxml: add generic perf counters registers

We have 2 of those we can configure to source programmable events.
Those are not part of the OA reports. Configuration happens in i915
through the metric set selected by the application. On the Mesa side
we'll just sample those and do a diff.
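
The "sample and diff" step might be sketched like this. The 64-bit
counter width and the helper name are assumptions; the actual register
width is not stated here.

```c
#include <stdint.h>

/* Hypothetical helper: difference between two samples of a
 * free-running counter. Unsigned subtraction also gives the right
 * answer across a single wraparound, provided the counter really
 * uses the full width of the type (an assumption in this sketch). */
static inline uint64_t
counter_delta(uint64_t begin, uint64_t end)
{
   return end - begin;
}
```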

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/perf: add support for querying kernel loaded configurations
Lionel Landwerlin [Mon, 22 Oct 2018 14:39:29 +0000 (15:39 +0100)]
intel/perf: add support for querying kernel loaded configurations

We use this as a communication mechanism between MDAPI & Anv.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agodrm-uapi: Update headers from drm-next
Lionel Landwerlin [Wed, 29 Aug 2018 12:58:23 +0000 (13:58 +0100)]
drm-uapi: Update headers from drm-next

Pull new updates from drm-next as of the following commit:

commit f1b4a9217efd61d0b84c6dc404596c8519ff6f59
Merge: 400e91347e1d f3a36d469621
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Oct 22 15:04:00 2019 +1000

    Merge tag 'du-next-20191016' of git://linuxtv.org/pinchartl/media into drm-next

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/perf: move registers to their own header
Lionel Landwerlin [Fri, 11 Oct 2019 12:54:57 +0000 (15:54 +0300)]
intel/perf: move registers to their own header

Will conflict with the genxml RPSTAT register.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/perf: extract register configuration
Lionel Landwerlin [Fri, 19 Oct 2018 17:25:13 +0000 (18:25 +0100)]
intel/perf: extract register configuration

We want to query the content of register configurations from the
kernel. Let's pull this out of the query.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/perf: expose some utility functions
Lionel Landwerlin [Wed, 28 Aug 2019 12:45:00 +0000 (15:45 +0300)]
intel/perf: expose some utility functions

The Vulkan performance query extension is a bit lower level than the
GL one. Expose some of the functions to do the result accumulation
directly in the Anv driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/perf: add mdapi marker helper
Lionel Landwerlin [Sat, 9 Jun 2018 22:20:10 +0000 (23:20 +0100)]
intel/perf: add mdapi marker helper

A simple utility to put the marker at the right location.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agost/mesa: Silence chatty debug printf
Kenneth Graunke [Wed, 23 Oct 2019 01:01:10 +0000 (18:01 -0700)]
st/mesa: Silence chatty debug printf

Other debug_printf's in this file are in if (0) blocks.

Trivial.

5 years agost/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM
Chris Wilson [Wed, 10 Jul 2019 18:10:25 +0000 (19:10 +0100)]
st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM

This is useful for PBO texture upload with GL_RGB and GL_UNSIGNED_BYTE.

v2: Vasily Khoruzhick provided an update for the Lima CI expectations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoanv: fix unwind of vkCreateDevice fail
Lionel Landwerlin [Tue, 22 Oct 2019 12:34:12 +0000 (15:34 +0300)]
anv: fix unwind of vkCreateDevice fail

We're skipping the context destruction in some cases, which in the
grand scheme of things is not that important because closing device->fd
will destroy the associated context as well.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: b30e01aef56 ("anv: fix memory leak on device destroy")
5 years agoRevert "aco: only emit waitcnt on loop continues if we there was some load or export"
Rhys Perry [Tue, 15 Oct 2019 16:27:07 +0000 (17:27 +0100)]
Revert "aco: only emit waitcnt on loop continues if we there was some load or export"

We don't properly pass on ctx.lgkm_cnt/ctx.barrier_imm/etc, so this
waitcnt was necessary for barriers and correctly waiting for SMEM before
s_dcache_wb on GFX10.

Totals from affected shaders:
SGPRS: 33200 -> 33200 (0.00 %)
VGPRS: 31376 -> 31376 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2431804 -> 2433956 (0.09 %) bytes
LDS: 316 -> 316 (0.00 %) blocks
Max Waves: 1609 -> 1609 (0.00 %)

This reverts commit 2c050b49b3d776f054f1265d5523cabb61f22fc3.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: add missing bld.scc()
Rhys Perry [Tue, 15 Oct 2019 16:56:54 +0000 (17:56 +0100)]
aco: add missing bld.scc()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: keep can_reorder/barrier when combining addition into SMEM
Rhys Perry [Tue, 15 Oct 2019 16:01:24 +0000 (17:01 +0100)]
aco: keep can_reorder/barrier when combining addition into SMEM

Affects 30 shaders in the pipeline-db (all youngblood).

Totals from affected shaders:
SGPRS: 2656 -> 2456 (-7.53 %)
VGPRS: 2260 -> 2260 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 240680 -> 240944 (0.11 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 90 -> 90 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: add a few missing checks in value numbering
Rhys Perry [Mon, 14 Oct 2019 16:19:19 +0000 (17:19 +0100)]
aco: add a few missing checks in value numbering

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: use ds_read2_b64/ds_write2_b64
Rhys Perry [Tue, 15 Oct 2019 10:31:11 +0000 (11:31 +0100)]
aco: use ds_read2_b64/ds_write2_b64

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: properly combine additions into ds_write2_b64/ds_read2_b64
Rhys Perry [Mon, 14 Oct 2019 16:17:00 +0000 (17:17 +0100)]
aco: properly combine additions into ds_write2_b64/ds_read2_b64

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: fix sparse store_lds()
Rhys Perry [Mon, 14 Oct 2019 19:25:27 +0000 (20:25 +0100)]
aco: fix sparse store_lds()

p_extract_vector's second operand is in units of the definition size, not
dwords.

v2: move extract_subvector() to right before ds_write_helper

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: create load_lds/store_lds helpers
Rhys Perry [Fri, 11 Oct 2019 11:02:49 +0000 (12:02 +0100)]
aco: create load_lds/store_lds helpers

We'll want these for GS, since VS->GS IO on Vega is done using LDS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: fix 64-bit p_extract_vector on 32-bit p_create_vector
Rhys Perry [Mon, 14 Oct 2019 18:27:52 +0000 (19:27 +0100)]
aco: fix 64-bit p_extract_vector on 32-bit p_create_vector

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: small stage corrections
Rhys Perry [Tue, 10 Sep 2019 14:08:31 +0000 (15:08 +0100)]
aco: small stage corrections

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agost/mesa: replace pipe_shader_state with tgsi_token* in st_vp_variant
Marek Olšák [Fri, 18 Oct 2019 02:41:54 +0000 (22:41 -0400)]
st/mesa: replace pipe_shader_state with tgsi_token* in st_vp_variant

we don't need more than that

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: allow nir_lower_uniforms_to_ubo to be run repeatedly
Marek Olšák [Fri, 18 Oct 2019 23:49:44 +0000 (19:49 -0400)]
nir: allow nir_lower_uniforms_to_ubo to be run repeatedly

for st/mesa

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agofreedreno/ir3: fixup register footprint fixup
Rob Clark [Mon, 21 Oct 2019 23:33:50 +0000 (16:33 -0700)]
freedreno/ir3: fixup register footprint fixup

Small typo resulted in not converting footprint to vec4, meaning that we
could potentially ask for quite a few more registers than required

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: handle scalarized varying inputs
Rob Clark [Mon, 21 Oct 2019 18:15:53 +0000 (11:15 -0700)]
freedreno/ir3: handle scalarized varying inputs

If the load_interpolated_input is scalarized, we would be too
conservative about deciding the tex instruction wasn't a candidate to
pre-fetch:

vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */)
vec2 32 ssa_1 = intrinsic load_barycentric_pixel () (0) /* interp_mode=0 */
vec1 32 ssa_2 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 0) /* base=0 */ /* component=0 */ /* packed:v_uv,v_uv1 */
vec1 32 ssa_3 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 1) /* base=0 */ /* component=1 */ /* packed:v_uv,v_uv1 */
vec2 32 ssa_8 = vec2 ssa_2, ssa_3
vec4 32 ssa_9 = tex ssa_8 (coord), 0 (texture), 0 (sampler)

Really we don't care that the texcoord components come from different
load_interpolated_input instructions, just that they have consecutive
varying offsets.
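
The "consecutive varying offsets" condition could be checked along these
lines. This is a sketch, not the ir3 implementation: the struct, the
helper name, and the assumption of four scalar components per varying
slot are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of where one coordinate component comes
 * from, mirroring the base/component pair in the NIR dump above. */
struct varying_component {
   unsigned base;       /* varying slot */
   unsigned component;  /* component within the slot */
};

/* Hypothetical check: true if the components occupy consecutive
 * scalar varying offsets, regardless of which load produced them.
 * Assumes four scalar components per slot. */
static inline bool
components_are_consecutive(const struct varying_component *c, size_t count)
{
   for (size_t i = 1; i < count; i++) {
      unsigned prev = c[i - 1].base * 4 + c[i - 1].component;
      unsigned cur  = c[i].base * 4 + c[i].component;
      if (cur != prev + 1)
         return false;
   }
   return true;
}
```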

Reported-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoaco: refactor value numbering
Daniel Schürmann [Sat, 19 Oct 2019 14:11:13 +0000 (16:11 +0200)]
aco: refactor value numbering

Previously, we used one hashset per BB, so that we could
always initialize the current hashset from the immediate
dominator. This patch changes the behavior to a single
hashmap using the block index per instruction to resolve
dominance.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>