mesa.git
6 years agoac: add reusable helpers for direct LLVM compilation
Marek Olšák [Wed, 4 Jul 2018 05:11:47 +0000 (01:11 -0400)]
ac: add reusable helpers for direct LLVM compilation

This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked.

struct ac_compiler_passes (opaque type) contains the main pass manager.

ac_create_llvm_passes -- the result can go to thread local storage
ac_destroy_llvm_passes -- can be called by a destructor in TLS
ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary

The motivation is to do the expensive call addPassesToEmitFile once
per context or thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agonvc0: implement multisampled images on Maxwell+
Rhys Perry [Wed, 4 Jul 2018 09:21:41 +0000 (10:21 +0100)]
nvc0: implement multisampled images on Maxwell+

Changes in v2:
- make loadSuInfo32() protected without making the rest protected
- move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating
  NVC0_SU_INFO_MS

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
6 years agoi965: Fix output register sizes when variable ranges are interleaved
Neil Roberts [Fri, 18 May 2018 11:39:13 +0000 (13:39 +0200)]
i965: Fix output register sizes when variable ranges are interleaved

In 6f5abf31466aed this code was fixed to calculate the maximum size of
an attribute in a seperate pass and then allocate the registers to
that size. However this wasn’t taking into account ranges that overlap
but don’t have the same starting location. For example:

layout(location = 0, component = 0) out float a[4];
layout(location = 2, component = 1) out float b[4];

Previously, if ‘a’ was processed first then it would allocate a
register of size 4 for location 0 and it wouldn’t allocate another
register for location 2 because it would already be covered by the
range of 0. Then if something tries to write to b[2] it would try to
write past the end of the register allocated for ‘a’ and it would hit
an assert.

This patch changes it to scan for any overlapping ranges that start
within each range to calculate the maximum extent and allocate that
instead.

Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/
vs-fs-array-interleave-range.shader_test

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 6f5abf31466 "i965: Fix output register sizes when multiple variables
       share a slot."

6 years agor600/sb: cleanup if_conversion iterator to be legal C++
Dave Airlie [Fri, 29 Jun 2018 02:47:26 +0000 (03:47 +0100)]
r600/sb: cleanup if_conversion iterator to be legal C++

The current code causes:
/usr/include/c++/8/debug/safe_iterator.h:207:
Error: attempt to copy from a singular iterator.

This is due to the iterators getting invalidated, fix the
reverse iterator to use the return value from erase, and
cast it properly.

(used Mathias suggestion)
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
6 years agoradeonsi: fix compiler breakage
Marek Olšák [Wed, 4 Jul 2018 04:13:04 +0000 (00:13 -0400)]
radeonsi: fix compiler breakage

Broken by d853d3a59bd5f8720a5b021bcd64a193d370b623.

6 years agoac: make some fns static
Dave Airlie [Wed, 27 Jun 2018 00:24:18 +0000 (10:24 +1000)]
ac: make some fns static

Some of the compiler functions are no longer called outside
the util file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoac/radv: move llvm compiler info to struct and init in one place
Dave Airlie [Tue, 26 Jun 2018 23:27:03 +0000 (09:27 +1000)]
ac/radv: move llvm compiler info to struct and init in one place

This ports radv to the shared code, however due to a bug in LLVM
version prior to 7, radv cannot add target info at this stage,
as it would leak one for every shader compile, however I'd prefer
to keep this llvm damage in the shared code, since it isn't the
driver at fault here. We just add a flag to denote if the driver
can support leaking the target info or not, and the common code
does the right thing depending on the llvm version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoac/radeonsi: port compiler init/destroy out of radeonsi.
Dave Airlie [Mon, 2 Jul 2018 23:51:42 +0000 (09:51 +1000)]
ac/radeonsi: port compiler init/destroy out of radeonsi.

We want to share this code with radv in the future, so port
it out of radeonsi.

Add a return value as radv will want that to know if this
succeeds

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradv/radeonsi: add a check ir tm options
Dave Airlie [Mon, 2 Jul 2018 23:44:22 +0000 (09:44 +1000)]
radv/radeonsi: add a check ir tm options

This doesn't do much yet, but it makes it easier to move the code
to a common shared code base.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: rename si_compiler -> ac_llvm_compiler
Dave Airlie [Mon, 2 Jul 2018 23:39:27 +0000 (09:39 +1000)]
radeonsi: rename si_compiler -> ac_llvm_compiler

As precursor to moving init to common code, just rename the struct
and move it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoac: add target library info helpers
Dave Airlie [Tue, 26 Jun 2018 23:34:42 +0000 (09:34 +1000)]
ac: add target library info helpers

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradv: create/destroy passmgr at the higher level.
Dave Airlie [Tue, 26 Jun 2018 23:11:47 +0000 (09:11 +1000)]
radv: create/destroy passmgr at the higher level.

This is prep work for moving this to a per-thread struct

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradv: port to use common passmgr code.
Dave Airlie [Tue, 26 Jun 2018 23:02:25 +0000 (09:02 +1000)]
radv: port to use common passmgr code.

This adds a inline always pass, but otherwise should work the
same.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/radeonsi: refactor out pass manager init to common code.
Dave Airlie [Tue, 26 Jun 2018 22:52:20 +0000 (08:52 +1000)]
ac/radeonsi: refactor out pass manager init to common code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradv: drop copy of ac_create_target_machine.
Dave Airlie [Tue, 26 Jun 2018 22:38:30 +0000 (08:38 +1000)]
radv: drop copy of ac_create_target_machine.

Once we split the init once stuff out, this can be shared again.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoac/radv: split the non-common init_once code from the common target code. (v2)
Dave Airlie [Tue, 26 Jun 2018 22:36:41 +0000 (08:36 +1000)]
ac/radv: split the non-common init_once code from the common target code. (v2)

This just splits out the non-shared code and reuses ac_get_llvm_target in radv.

v2: rebase on Marek's patch - fixup brace position/whitespace

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoi965: Use the new nir atomic counter linker for SPIR-V shaders
Neil Roberts [Wed, 29 Nov 2017 09:14:25 +0000 (10:14 +0100)]
i965: Use the new nir atomic counter linker for SPIR-V shaders

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoi965: enable AtomicStorage capability for gen7+
Alejandro Piñeiro [Sat, 28 Oct 2017 09:27:17 +0000 (11:27 +0200)]
i965: enable AtomicStorage capability for gen7+

That is the same gen requirement for ARB_shader_atomic_counters.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agomesa/glspirv: lower workgroup access to offsets
Antia Puentes [Wed, 14 Feb 2018 11:58:33 +0000 (12:58 +0100)]
mesa/glspirv: lower workgroup access to offsets

This will perform the CS shared lowering. See 8761a04d0d93

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir: Fix OpAtomicCounterIDecrement for uniform atomic counters
Antia Puentes [Thu, 22 Feb 2018 12:50:23 +0000 (13:50 +0100)]
nir: Fix OpAtomicCounterIDecrement for uniform atomic counters

From the SPIR-V 1.0 specification, section 3.32.18, "Atomic
Instructions":

   "OpAtomicIDecrement:
    <skip>
    The instruction's result is the Original Value."

However, we were implementing it, for uniform atomic counters, as a
pre-decrement operation, as was the one available from GLSL.

Renamed the former nir intrinsic 'atomic_counter_dec*' to
'atomic_counter_pre_dec*' for clarification purposes, as it implements
a pre-decrement operation as specified for GLSL. From GLSL 4.50 spec,
section 8.10, "Atomic Counter Functions":

   "uint atomicCounterDecrement (atomic_uint c)

    Atomically
    1. decrements the counter for c, and
    2. returns the value resulting from the decrement operation.

    These two steps are done atomically with respect to the atomic
    counter functions in this table."

Added a new nir intrinsic 'atomic_counter_post_dec*' which implements
a post-decrement operation as required by SPIR-V.

v2: (Timothy Arceri)
   * Add extra spec quotes on commit message
   * Use "post" instead "pos" to avoid confusion with "position"

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir/linker: Add a pure NIR implementation of the atomic counter linker
Neil Roberts [Tue, 28 Nov 2017 12:39:44 +0000 (13:39 +0100)]
nir/linker: Add a pure NIR implementation of the atomic counter linker

This is mostly just a straight-forward conversion of
link_assign_atomic_counter_resources to C directly using nir variables
instead of GLSL IR variables.

It is based on the version of link_assign_atomic_counter_resources in
6b8909f2d1906. I’m noting this here to make it easier to track changes
and keep the NIR version up-to-date.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir/types: Add wrappers for a couple of atomic counter methods
Neil Roberts [Tue, 28 Nov 2017 12:38:32 +0000 (13:38 +0100)]
nir/types: Add wrappers for a couple of atomic counter methods

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agospirv/nir: add capability check for SpvCapabilityAtomicStorage
Alejandro Piñeiro [Sat, 28 Oct 2017 08:57:35 +0000 (10:57 +0200)]
spirv/nir: add capability check for SpvCapabilityAtomicStorage

Capability that informs if atomic counters are supported. From SPIR-V
1.0 spec, section 3.7, "Storage Class", item 10 from table:

(Column "Storage Class"):

   "AtomicCounter For holding atomic counters. Visible across all
    functions of the current invocation. Atomic counter-specific
    memory."

(Column "Required Capability"):

   "AtomicStorage"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agospirv/nir: add atomic counter support on vtn_handle_ssbo_or_shared_atomic
Alejandro Piñeiro [Tue, 31 Oct 2017 12:12:11 +0000 (13:12 +0100)]
spirv/nir: add atomic counter support on vtn_handle_ssbo_or_shared_atomic

So renamed to a more general vtn_handle_atomics

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agospirv/nir: initialize offset on the nir var at vtn_create_variable
Alejandro Piñeiro [Sun, 5 Nov 2017 15:19:43 +0000 (16:19 +0100)]
spirv/nir: initialize offset on the nir var at vtn_create_variable

This is convenient when dealing with atomic counter uniforms. The
alternative would be doing that at vtn_handle_atomics.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir/spirv: Fix atomic counter (multidimensional-)arrays
Antia Puentes [Wed, 2 May 2018 20:28:43 +0000 (22:28 +0200)]
nir/spirv: Fix atomic counter (multidimensional-)arrays

When constructing NIR if we have a SPIR-V uint variable and the
storage class is SpvStorageClassAtomicCounter, we store as NIR's
glsl_type an atomic_uint to reflect the fact that the variable is an
atomic counter.

However, we were tweaking the type only for atomic_uint scalars, we
have to do it as well for atomic_uint arrays and atomic_uint arrays of
arrays of any depth.

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
v2: update after deref patches got pushed (Alejandro Piñeiro)
v3: simplify repair_atomic_type (suggested by Timothy Arceri, included
    on the patch by Alejandro)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agospirv/nir: tweak nir type when storage class is SpvStorageClassAtomicCounter
Alejandro Piñeiro [Fri, 10 Nov 2017 15:57:40 +0000 (16:57 +0100)]
spirv/nir: tweak nir type when storage class is SpvStorageClassAtomicCounter

GLSL types differentiates uint from atomic uint. On SPIR-V the type is
uint, and the variable has a specific storage class. So we need to
tweak the type based on the storage class.

Ideally we would like to get the proper type at vtn_handle_type, but
we don't have the storage class at that moment.

We tweak only the nir type, as is the one that really requires it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir_types: add glsl_atomic_uint_type() helper
Alejandro Piñeiro [Fri, 10 Nov 2017 15:32:41 +0000 (16:32 +0100)]
nir_types: add glsl_atomic_uint_type() helper

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agospirv/nir: add offset at vtn_variable
Alejandro Piñeiro [Sun, 5 Nov 2017 11:00:19 +0000 (12:00 +0100)]
spirv/nir: add offset at vtn_variable

Also initialize it on var_decoration_cb

This is equivalent to nir_variable.offset, used to store the location
an atomic counter is stored at.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agospirv/nir: SpvStorageClassAtomicCounter support on vtn_storage_class_to_mode
Alejandro Piñeiro [Fri, 27 Oct 2017 10:40:35 +0000 (12:40 +0200)]
spirv/nir: SpvStorageClassAtomicCounter support on vtn_storage_class_to_mode

Atomic Counters are uniforms per spec.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir/linker: handle uniforms without explicit location
Alejandro Piñeiro [Fri, 13 Apr 2018 13:47:49 +0000 (15:47 +0200)]
nir/linker: handle uniforms without explicit location

ARB_gl_spirv points that uniforms in general need explicit
location. But there are still some cases of uniforms without location,
like for example uniform atomic counters. Those doesn't have a
location from the OpenGL point of view (they are identified with a
binding and offset), but Mesa internally assigns it a location.

Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
v2: squash with another patch, minor variable name tweak (Timothy
Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agocompiler/glsl: refactor empty_uniform_block utilities to linker_util
Alejandro Piñeiro [Tue, 26 Jun 2018 14:28:59 +0000 (16:28 +0200)]
compiler/glsl: refactor empty_uniform_block utilities to linker_util

This includes:
  * Move the defition of empty_uniform_block to linker_util.h
  * Move find_empty_block (with a rename) to linker_util.h
  * Refactor some code at linker.cpp to a new method at linker_util.h
    (link_util_update_empty_uniform_locations)

So all that code could be used by the GLSL linker and the NIR linker
used for ARB_gl_spirv.

v2: include just "ir_uniform.h" (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoi965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible
Ian Romanick [Thu, 28 Jun 2018 00:25:34 +0000 (17:25 -0700)]
i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible

Otherwise we can incorrectly cmod propagate in situations like

    add(8)          g10<1>.xD       g2<0>.xD        -16D
    ...
    cmp.ge.f0(8)    null<1>D        g2<0>.xD        16D
    ...
    (+f0) sel(8)    g21<1>.xyUD     g14<4>.xyyyUD   g18<4>.xyyyUD

Sadly, this change hurts quite a few shaders.

v2: Refactor writemask compatibility check into a separate function.
Suggested by Caio.

Ivy Bridge and Haswell had similar results. (Haswell shown)
total instructions in shared programs: 12968489 -> 12968738 (<.01%)
instructions in affected programs: 60679 -> 60928 (0.41%)
helped: 0
HURT: 249
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.22% max: 0.81% x̄: 0.46% x̃: 0.44%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.44% 0.48%
Instructions are HURT.

total cycles in shared programs: 409171965 -> 409172317 (<.01%)
cycles in affected programs: 260056 -> 260408 (0.14%)
helped: 0
HURT: 176
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.04% max: 0.34% x̄: 0.17% x̃: 0.17%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: 0.16% 0.18%
Cycles are HURT.

Sandy Bridge
total instructions in shared programs: 10423577 -> 10423753 (<.01%)
instructions in affected programs: 40667 -> 40843 (0.43%)
helped: 0
HURT: 176
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.29% max: 0.79% x̄: 0.48% x̃: 0.42%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.46% 0.51%
Instructions are HURT.

total cycles in shared programs: 146097503 -> 146097855 (<.01%)
cycles in affected programs: 503990 -> 504342 (0.07%)
helped: 0
HURT: 176
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.02% max: 0.36% x̄: 0.12% x̃: 0.11%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: 0.11% 0.13%
Cycles are HURT.

No changes on any other platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: cd635d149b2 i965/vec4: Propagate conditional modifiers from compares to adds
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
6 years agointel/compiler: Silence unused parameter warnings brw_nir.c
Ian Romanick [Wed, 23 May 2018 18:33:51 +0000 (11:33 -0700)]
intel/compiler: Silence unused parameter warnings brw_nir.c

src/intel/compiler/brw_nir.c: In function ‘brw_nir_lower_vue_outputs’:
src/intel/compiler/brw_nir.c:464:32: warning: unused parameter ‘is_scalar’ [-Wunused-parameter]
                           bool is_scalar)
                                ^~~~~~~~~
src/intel/compiler/brw_nir.c: In function ‘lower_bit_size_callback’:
src/intel/compiler/brw_nir.c:610:57: warning: unused parameter ‘data’ [-Wunused-parameter]
 lower_bit_size_callback(const nir_alu_instr *alu, void *data)
                                                         ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
6 years agoi965: Fix BRW_NEW_NUM_SAMPLES to be in .brw, not .mesa
Kenneth Graunke [Mon, 2 Jul 2018 21:17:37 +0000 (14:17 -0700)]
i965: Fix BRW_NEW_NUM_SAMPLES to be in .brw, not .mesa

This is the wrong kind of dirty bit.  Caught by GCC warnings, due to
64-bit values being truncated to 32 bits.

Fixes: b95b0e2918c052068caeb4f6c2802ba89be043a3 (intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoanv: Add support for the on-disk shader cache
Jason Ekstrand [Sat, 30 Jun 2018 00:08:30 +0000 (17:08 -0700)]
anv: Add support for the on-disk shader cache

The Vulkan API provides a mechanism for applications to cache their own
shaders and manage on-disk pipeline caching themselves.  Generally, this
is what I would recommend to application developers and I've resisted
implementing driver-side transparent caching in the Vulkan driver for a
long time.  However, not all applications do this and, for some
use-cases, it's just not practical.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoanv/pipeline_cache: Add a _locked suffix to a function
Jason Ekstrand [Sat, 30 Jun 2018 01:12:34 +0000 (18:12 -0700)]
anv/pipeline_cache: Add a _locked suffix to a function

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoanv: Add device-level helpers for searching for and uploading kernels
Jason Ekstrand [Sat, 30 Jun 2018 01:02:07 +0000 (18:02 -0700)]
anv: Add device-level helpers for searching for and uploading kernels

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoanv/pipeline: Stop optimizing for not having a cache
Jason Ekstrand [Sat, 30 Jun 2018 00:53:47 +0000 (17:53 -0700)]
anv/pipeline: Stop optimizing for not having a cache

Before, we were only hashing the shader if we had a shader cache to
cache things in.  This means that if we ever get it wrong, we could end
up trying to cache a shader with an undefined hash.  Since not having a
shader cache is an extremely uncommon case, let's optimize for code
clarity and obvious correctness over avoiding a hash operation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoanv: Use a default pipeline cache if none is specified
Jason Ekstrand [Sat, 30 Jun 2018 00:29:35 +0000 (17:29 -0700)]
anv: Use a default pipeline cache if none is specified

If a client is dumb enough to not specify a pipeline cache, give it a
default.  We have to create one anyway for blorp so we may as well let
the client cache shaders in it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoanv: Be more careful about hashing pipeline layouts
Jason Ekstrand [Sat, 30 Jun 2018 03:03:49 +0000 (20:03 -0700)]
anv: Be more careful about hashing pipeline layouts

Previously, we just hashed the entire descriptor set layout verbatim.
This meant that a bunch of extra stuff such as pointers and reference
counts made its way into the cache.  It also meant that we weren't
properly hashing in the Y'CbCr conversion information information from
bound immutable samplers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoanv,intel: Enable nir_opt_large_constants for Vulkan
Jason Ekstrand [Fri, 29 Jun 2018 05:44:43 +0000 (22:44 -0700)]
anv,intel: Enable nir_opt_large_constants for Vulkan

According to RenderDoc, this shaves 99.6% of the run time off of the
ambient occlusion pass in Skyrim Special Edition when running under DXVK
and shaves 92% off the runtime for a reasonably representative frame.
When running the actual game, Skyrim goes from being a slide-show to a
very stable and playable framerate on my SKL GT4e machine.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoanv: Add state setup support for shader constants
Jason Ekstrand [Fri, 29 Jun 2018 05:44:24 +0000 (22:44 -0700)]
anv: Add state setup support for shader constants

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoanv: Add support for shader constant data to the pipeline cache
Jason Ekstrand [Fri, 29 Jun 2018 05:41:21 +0000 (22:41 -0700)]
anv: Add support for shader constant data to the pipeline cache

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agonir: Add a large constants optimization pass
Jason Ekstrand [Fri, 29 Jun 2018 02:16:58 +0000 (19:16 -0700)]
nir: Add a large constants optimization pass

This pass searches for reasonably large local variables which can be
statically proven to be constant and moves them into shader constant
data.  This is especially useful when large tables are baked into the
shader source code because they can be moved into a UBO by the driver to
reduce register pressure and make indirect access cheaper.

v2 (Jason Ekstrand):
 - Use a size/align function to ensure we get the right alignments
 - Use the newly added deref offset helpers

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agonir: Add a concept of constant data associated with a shader
Jason Ekstrand [Fri, 29 Jun 2018 02:16:19 +0000 (19:16 -0700)]
nir: Add a concept of constant data associated with a shader

This commit adds a concept to NIR of having a blob of constant data
associated with a shader.  Instead of being a UBO or uniform that can be
manipulated by the client, this constant data considered part of the
shader and remains constant across all invocations of the given shader
until the end of time.  To access this constant data from the shader, we
add a new load_constant intrinsic.  The intention is that drivers will
eventually lower load_constant intrinsics to load_ubo, load_uniform, or
something similar.  Constant data will be used by the optimization pass
in the next commit but this concept may also be useful for OpenCL.

v2 (Jason Ekstrand):
 - Rename num_constants to constant_data_size (anholt)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agonir/deref: Add helpers for getting offsets
Jason Ekstrand [Fri, 29 Jun 2018 21:44:19 +0000 (14:44 -0700)]
nir/deref: Add helpers for getting offsets

These are very similar to the related function in nir_lower_io except
that they don't handle per-vertex or packed things (that could be added,
in theory) and they take a more detailed size/align function pointer.
One day, we should consider switching nir_lower_io over to using the
more detailed size/align functions and then we could make it use these
helpers instead of having its own.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agonir/types: Add a natural size and alignment helper
Jason Ekstrand [Fri, 29 Jun 2018 21:14:52 +0000 (14:14 -0700)]
nir/types: Add a natural size and alignment helper

The size and alignment are "natural" in the sense that everything is
aligned to a scalar.  This is a bit tighter than std430 where vec3s are
required to be aligned to a vec4.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agonir: Add a deref_instr_has_indirect helper
Jason Ekstrand [Fri, 29 Jun 2018 02:46:01 +0000 (19:46 -0700)]
nir: Add a deref_instr_has_indirect helper

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoutil/macros: Import ALIGN_POT from ralloc.c
Jason Ekstrand [Fri, 29 Jun 2018 21:59:56 +0000 (14:59 -0700)]
util/macros: Import ALIGN_POT from ralloc.c

v2 (Jason Ekstrand):
 - Rename y to pot_align (Brian)
 - Also use ALIGN_POT in build_id.c and slab.c (Brian)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agov3d: Claim PIPE_CAP_TGSI_CAN_READ_OUTPUTS.
Eric Anholt [Mon, 2 Jul 2018 17:19:47 +0000 (10:19 -0700)]
v3d: Claim PIPE_CAP_TGSI_CAN_READ_OUTPUTS.

Fixes warning at screen creation.  We store our outputs in normal temps
and just emit them to shader I/O at the end, due to our I/O ordering
requirements, so reading "outputs" in NIR is fine.

6 years agoac: move all LLVM module initialization into ac_create_module
Marek Olšák [Sat, 30 Jun 2018 04:54:30 +0000 (00:54 -0400)]
ac: move all LLVM module initialization into ac_create_module

This removes some ugly code around module initialization.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agov3d: Emit a TF flush after each draw using TF.
Eric Anholt [Mon, 25 Jun 2018 17:12:03 +0000 (10:12 -0700)]
v3d: Emit a TF flush after each draw using TF.

This fixes GPU hangs on 7278 in transform feedback tests such as
GTF-GLES3.gtf.GL3Tests.transform_feedback2.transform_feedback2_basic

6 years agonv50/ir: handle clipvertex for geom and tess shaders as well
Karol Herbst [Sat, 30 Jun 2018 02:58:30 +0000 (04:58 +0200)]
nv50/ir: handle clipvertex for geom and tess shaders as well

this will be needed for compatibility profiles

v2: handle tess shaders

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
6 years agogallium/u_vbuf: drop min/max-scanning for empty indirect draws
Erik Faye-Lund [Mon, 25 Jun 2018 20:10:31 +0000 (21:10 +0100)]
gallium/u_vbuf: drop min/max-scanning for empty indirect draws

When building with asserts enabled, we'll end up triggering an assert
in pipe_buffer_map_range down this code-path, due to trying to map
an empty range. Even if we avoid that, we'll trigger another assert
a bit later, because u_vbuf_get_minmax_index returns a min-index of
-1 here, which gets promoted to an unsigned value, and gives us an
out-of-bounds buffer-mapping offset.

Since we can't really have a well-defined min/max range here when
the range is empty anyway, we should just drop this dance in the
first place. After all, no rendering is going to be produced.

This fixes a crash in dEQP-GLES31.functional.draw_indirect.random.0
on VirGL for me.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradv: reset the image's predicate after a color decompression pass
Samuel Pitoiset [Wed, 18 Apr 2018 12:34:55 +0000 (14:34 +0200)]
radv: reset the image's predicate after a color decompression pass

After performing a fast-clear eliminate, a FMASK decompress,
or a DCC decompress, we can reset the predicate to FALSE.

With that, the GPU should be able to skip unnecessary color
decompression passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: enable/disable predication for the DCC decompression pass
Samuel Pitoiset [Wed, 18 Apr 2018 12:34:54 +0000 (14:34 +0200)]
radv: enable/disable predication for the DCC decompression pass

Performing a DCC decompression pass is currently pretty rare,
but using predication allows the GPU to skip unnecessary passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: add padding for the UMR disassembler
Samuel Pitoiset [Wed, 27 Jun 2018 08:39:51 +0000 (10:39 +0200)]
radv: add padding for the UMR disassembler

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agovirgl: Add support for glGetMultisample
Gert Wollny [Fri, 29 Jun 2018 10:39:06 +0000 (12:39 +0200)]
virgl: Add support for glGetMultisample

Use caps to obtain the multisample sample positions for up to 16
positions and implement the according Gallium interface.

This implemenation (plus its counterpart in virglrenderer) assume that
the fixed sample position are always the same for a given number of samples
over the whole live time of a qemu session. It also assumes that sample
series are only given for 2, 4, 8, and 16 samples, and for intermediate
numbers N of samples the next higher supported set from above list is picked
and the sample positions for the first N samples are returned accordingly.

Fixes (when run on GL host):
    dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
    dEQP-GLES31.functional.texture.multisample.samples_16.sample_position

v2: remove unrelated chunk (thanks Ilia Mirkin)
v3: - also return positions for intermediate sample counts
    - fix unused varible warning
    - update description
v4: explain better what this patch assumes and how it handles sample numbers
    that are not directly advertised (thanks go to Erik Faye-Lund for making
    me aware that this should be documented)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
6 years agost/mesa: Also check for PIPE_FORMAT_A8R8G8B8_SRGB for texture_sRGB
Tomeu Vizoso [Fri, 22 Jun 2018 13:59:10 +0000 (15:59 +0200)]
st/mesa: Also check for PIPE_FORMAT_A8R8G8B8_SRGB for texture_sRGB

and PIPE_FORMAT_R8G8B8A8_SRGB, as well.

The reason for this is that when Virgl runs with GLES on the host, it
cannot directly upload textures in BGRA.

So to avoid a conversion step, consider the RGB sRGB formats as well for
this extension.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/mesa: Fall back to R8G8B8A8_SRGB for ETC2
Tomeu Vizoso [Fri, 22 Jun 2018 13:59:09 +0000 (15:59 +0200)]
st/mesa: Fall back to R8G8B8A8_SRGB for ETC2

If the driver doesn't support PIPE_FORMAT_B8G8R8A8_SRGB, fall back to
PIPE_FORMAT_R8G8B8A8_SRGB.

Drivers such as Virgl will have a hard time supporting
PIPE_FORMAT_B8G8R8A8_SRGB when the host runs GLES, as GL_BGRA isn't as
well suported there.

So go with PIPE_FORMAT_R8G8B8A8_SRGB so these drivers can avoid a
conversion copy.

v2: Fix typo in commit message

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/mesa/i965: Allow decompressing ETC2 to GL_RGBA
Tomeu Vizoso [Fri, 22 Jun 2018 13:59:08 +0000 (15:59 +0200)]
st/mesa/i965: Allow decompressing ETC2 to GL_RGBA

When Mesa itself implements ETC2 decompression, it currently
decompresses to formats in the GL_BGRA component order.

That can be problematic for drivers which cannot upload the texture data
as GL_BGRA, such as Virgl when it's backed by GLES on the host.

So this commit adds a flag to _mesa_unpack_etc2_format so callers can
specify the optimal component order.

In Gallium's case, it will be requested if the format isn't in
PIPE_FORMAT_B8G8R8A8_SRGB format.

For i965, it will remain GL_BGRA, as before.

v2: * Remove unnecesary include (Emil Velikov)

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoanv/cmd_buffer: make descriptors dirty when emitting base state address
Iago Toral Quiroga [Thu, 28 Jun 2018 11:12:53 +0000 (13:12 +0200)]
anv/cmd_buffer: make descriptors dirty when emitting base state address

Every time we emit a new state base address we will need to re-emit our
binding tables, since they might have been emitted with a different base
state adress.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
6 years agoanv/cmd_buffer: clean dirty push constants flag after emitting push constants
Iago Toral Quiroga [Thu, 28 Jun 2018 11:16:53 +0000 (13:16 +0200)]
anv/cmd_buffer: clean dirty push constants flag after emitting push constants

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
6 years agoanv/cmd_buffer: never shrink the push constant buffer size
Iago Toral Quiroga [Thu, 28 Jun 2018 06:10:16 +0000 (08:10 +0200)]
anv/cmd_buffer: never shrink the push constant buffer size

If we have to re-emit push constant data, we need to re-emit all
of it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
6 years agogallium/llvmpipe: Enable support bptc format.
Denis Pauk [Tue, 26 Jun 2018 20:30:52 +0000 (23:30 +0300)]
gallium/llvmpipe: Enable support bptc format.

v2: none
v3: none

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
CC: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agogallium/softpipe: Enable support bptc format.
Denis Pauk [Tue, 26 Jun 2018 20:30:51 +0000 (23:30 +0300)]
gallium/softpipe: Enable support bptc format.

v2: none
v3: none

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agogallium/auxiliary: Add helper support for bptc format compress/decompress
Denis Pauk [Tue, 26 Jun 2018 20:30:50 +0000 (23:30 +0300)]
gallium/auxiliary: Add helper support for bptc format compress/decompress

Reuse code shared with mesa/main/texcompress_bptc.

v2: Use block decompress function
v3: Include static bptc code from texcompress_bptc_tmp.h
Suggested-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Nicolai Hähnle <nicolai.haehnle@amd.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add header for share bptc decompress functions
Denis Pauk [Tue, 26 Jun 2018 20:30:49 +0000 (23:30 +0300)]
mesa: add header for share bptc decompress functions

Move shared bptc functions to texcompress_bptc_tmp.h:
* fetch_rgba_unorm_from_block
* fetch_rgb_float_from_block
* compress_rgba_unorm
* compress_rgb_float

Create decompress functions:
* decompress_rgba_unorm
* decompress_rgb_float

Functions will be reused in gallium/auxiliary code.

v2: Add block decompress function
v3: Move all shared code to header
Suggested-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agoglsl/cache: save and restore ExternalSamplersUsed
Marek Olšák [Sat, 30 Jun 2018 04:57:08 +0000 (00:57 -0400)]
glsl/cache: save and restore ExternalSamplersUsed

Shaders that need special code for external samplers were broken if
they were loaded from the cache.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir: fix selection of loop terminator when two or more have the same limit
Timothy Arceri [Mon, 4 Jun 2018 06:26:46 +0000 (16:26 +1000)]
nir: fix selection of loop terminator when two or more have the same limit

We need to add loop terminators to the list in the order we come
across them otherwise if two or more have the same exit condition
we will select that last one rather than the first one even though
its unreachable.

This fix is for simple unrolls where we only have a single exit
point. When unrolling these type of loops the unreachable
terminators and their unreachable branch are removed prior to
unrolling. Because of the logic change we also switch some
list access in the complex unrolling logic to avoid breakage.

Fixes: 6772a17acc8e ("nir: Add a loop analysis pass")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradeonsi: enable OpenGL 4.4 compat profile
Timothy Arceri [Mon, 25 Jun 2018 10:31:02 +0000 (20:31 +1000)]
radeonsi: enable OpenGL 4.4 compat profile

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: enable ARB_vertex_attrib_64bit in compat profile
Timothy Arceri [Wed, 20 Jun 2018 03:05:05 +0000 (13:05 +1000)]
mesa: enable ARB_vertex_attrib_64bit in compat profile

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add outstanding ARB_vertex_attrib_64bit dlist support
Timothy Arceri [Thu, 28 Jun 2018 05:31:09 +0000 (15:31 +1000)]
mesa: add outstanding ARB_vertex_attrib_64bit dlist support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agovbo_save: add support for doubles to display list code
Dave Airlie [Thu, 28 Jun 2018 02:40:20 +0000 (12:40 +1000)]
vbo_save: add support for doubles to display list code

Required for ARB_vertex_attrib_64bit compat profile support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add compat profile support for ARB_multi_draw_indirect
Timothy Arceri [Mon, 25 Jun 2018 00:32:58 +0000 (10:32 +1000)]
mesa: add compat profile support for ARB_multi_draw_indirect

v2: add missing ARB_base_instance support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: make valid_draw_indirect_multi() accessible externally
Timothy Arceri [Mon, 25 Jun 2018 00:31:34 +0000 (10:31 +1000)]
mesa: make valid_draw_indirect_multi() accessible externally

We will use this to add compat support to ARB_multi_draw_indirect
in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add ARB_draw_indirect support to compat profile
Timothy Arceri [Sat, 23 Jun 2018 07:09:13 +0000 (17:09 +1000)]
mesa: add ARB_draw_indirect support to compat profile

v2: add missing ARB_base_instance support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: generate GL_INVALID_OPERATION using draw indirect in dlist
Timothy Arceri [Sat, 23 Jun 2018 02:29:50 +0000 (12:29 +1000)]
mesa: generate GL_INVALID_OPERATION using draw indirect in dlist

The spec doesn't explicitly say to generate an error but since
DrawArraysInstanced* and DrawElementsInstanced* do, it makes
sense to do it for these functions also.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add missing display list support for ARB_compute_shader
Timothy Arceri [Thu, 28 Jun 2018 00:25:17 +0000 (10:25 +1000)]
mesa: add missing display list support for ARB_compute_shader

The extension is enabled for compat profile but there is currently
no display list support.

I filed a spec bug and it has been agreed that
glDispatchComputeIndirect should generate an INVALID_OPERATION
error when called during display list compilation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: expose some ARB_viewport_array dependent extensions in compat
Timothy Arceri [Thu, 21 Jun 2018 00:35:15 +0000 (10:35 +1000)]
mesa: expose some ARB_viewport_array dependent extensions in compat

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: enable ARB_viewport_array in compat profile
Timothy Arceri [Wed, 20 Jun 2018 03:03:40 +0000 (13:03 +1000)]
mesa: enable ARB_viewport_array in compat profile

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add ARB_viewport_array display list support
Timothy Arceri [Thu, 21 Jun 2018 00:14:36 +0000 (10:14 +1000)]
mesa: add ARB_viewport_array display list support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: enable ARB_shader_subroutine in compat profile
Timothy Arceri [Wed, 20 Jun 2018 00:55:34 +0000 (10:55 +1000)]
mesa: enable ARB_shader_subroutine in compat profile

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add glUniformSubroutinesuiv() display list support
Timothy Arceri [Wed, 20 Jun 2018 01:08:35 +0000 (11:08 +1000)]
mesa: add glUniformSubroutinesuiv() display list support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: stop hiding remaining query parameters from OpenGL compat
Timothy Arceri [Wed, 20 Jun 2018 00:16:20 +0000 (10:16 +1000)]
mesa: stop hiding remaining query parameters from OpenGL compat

I managed to miss these two in my last pass at this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: enable ARB_gpu_shader_fp64 in compat profile
Timothy Arceri [Tue, 19 Jun 2018 09:35:17 +0000 (19:35 +1000)]
mesa: enable ARB_gpu_shader_fp64 in compat profile

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add ProgramUniform*d display list support
Timothy Arceri [Tue, 19 Jun 2018 09:33:26 +0000 (19:33 +1000)]
mesa: add ProgramUniform*d display list support

This is required for fp64 to be enabled in compat profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: add Uniform*d support to display lists
Timothy Arceri [Tue, 19 Jun 2018 09:05:25 +0000 (19:05 +1000)]
mesa: add Uniform*d support to display lists

This is required so we can enable fp64 support in compat profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/glsl_to_nir: run lower_output_reads on !PIPE_CAP_TGSI_CAN_READ_OUTPUTS
Karol Herbst [Tue, 20 Feb 2018 16:56:47 +0000 (17:56 +0100)]
st/glsl_to_nir: run lower_output_reads on !PIPE_CAP_TGSI_CAN_READ_OUTPUTS

this is required for Drivers which don't allow reading from outputs.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
6 years agov3d: Move GL shader state dumping out of per-version compilation.
Eric Anholt [Thu, 28 Jun 2018 19:33:43 +0000 (12:33 -0700)]
v3d: Move GL shader state dumping out of per-version compilation.

It doesn't depend on V3D_VER, since it's just calling v3d_print_group.

6 years agov3d: Add missing Stream field to transform feedback specs on V3D 4.1.
Eric Anholt [Thu, 28 Jun 2018 20:08:59 +0000 (13:08 -0700)]
v3d: Add missing Stream field to transform feedback specs on V3D 4.1.

Noticed when trying to CLIF parse a transform feedback job that hangs on
HW.

6 years agov3d: Add missing "tri trip or fan" flag in Primitive List Format.
Eric Anholt [Wed, 27 Jun 2018 23:40:36 +0000 (16:40 -0700)]
v3d: Add missing "tri trip or fan" flag in Primitive List Format.

6 years agov3d: Fix the shader code address field widths on V3D 4.1+
Eric Anholt [Wed, 27 Jun 2018 23:31:19 +0000 (16:31 -0700)]
v3d: Fix the shader code address field widths on V3D 4.1+

We were overlapping it with the threadable/nan flags, resulting in
incorrect relocations (threadable/nan included in the offset) and wrong
ordering in the CLIF files.

6 years agov3d: Add missing "no prim pack" field to the V3D4.1+ GL shader state.
Eric Anholt [Wed, 27 Jun 2018 23:28:25 +0000 (16:28 -0700)]
v3d: Add missing "no prim pack" field to the V3D4.1+ GL shader state.

It looks like we don't need this flag for anything (not that I'm clear on
what it does), but it makes our struct dumping line up with CLIF parsing.

6 years agov3d: Express dithering mode in the same way that the CLIF parser does.
Eric Anholt [Wed, 27 Jun 2018 23:00:16 +0000 (16:00 -0700)]
v3d: Express dithering mode in the same way that the CLIF parser does.

6 years agov3d: Add missing "number of bin tile lists" field.
Eric Anholt [Wed, 27 Jun 2018 22:55:32 +0000 (15:55 -0700)]
v3d: Add missing "number of bin tile lists" field.

Noticed when trying to feed our dumps through the CLIF parser.  Since this
is a "minus one" field, we were already filling in the value we wanted (0).

6 years agov3d: Rewrite the color write masks to match CLIF format.
Eric Anholt [Wed, 27 Jun 2018 22:25:03 +0000 (15:25 -0700)]
v3d: Rewrite the color write masks to match CLIF format.

The render_target_* fields gave us pretty(ish) printing, but meant we were
incompatible with CLIF, and had much more verbose code generating them.

6 years agov3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3'x XML.
Eric Anholt [Wed, 27 Jun 2018 18:21:34 +0000 (11:21 -0700)]
v3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3'x XML.

The XML ends up noisier if you're only looking at one version, but from
the diffstat there's obvious wins in terms of deduplication.  This will
get even more significant if we ever support 3.2 or 4.0.

6 years agov3d: Switch v3d_decoder.c to the XML's top min_ver/max_ver fields.
Eric Anholt [Wed, 27 Jun 2018 21:10:52 +0000 (14:10 -0700)]
v3d: Switch v3d_decoder.c to the XML's top min_ver/max_ver fields.

The XML zipper wants one XML per version for filling out its tables, but
we want to do more than one GPU version per XML now.  Assume that the
"gen" field will be the same as min_ver and look up our XML text assuming
that they're listed in increasing min_ver.