mesa.git
5 years agoradv/gfx10: add Wave32 support for fragment shaders
Samuel Pitoiset [Thu, 1 Aug 2019 08:43:41 +0000 (10:43 +0200)]
radv/gfx10: add Wave32 support for fragment shaders

It can be enabled with RADV_PERFTEST=pswave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agogallium: Implement GL_EXT_shader_samples_identical via a new capability
Kenneth Graunke [Wed, 31 Jul 2019 22:47:34 +0000 (15:47 -0700)]
gallium: Implement GL_EXT_shader_samples_identical via a new capability

This exposes the textureSamplesIdenticalEXT function in GLSL.

We enable it for iris and radeonsi, because their compilers already
have support for this.  Tested on Intel Kabylake and AMD Vega 64.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agointel/tools: Fix aubinator_viewer build.
Kenneth Graunke [Fri, 2 Aug 2019 06:36:41 +0000 (23:36 -0700)]
intel/tools: Fix aubinator_viewer build.

This function was recently renamed and not all callers were updated.

Fixes: 086c486a75f ("intel/device: rename gen_get_device_info")
5 years agointel/ir: Fix CFG corruption in opt_predicated_break().
Francisco Jerez [Tue, 23 Jul 2019 23:17:07 +0000 (16:17 -0700)]
intel/ir: Fix CFG corruption in opt_predicated_break().

Specifically the optimization of a conditional BREAK + WHILE sequence
into a conditional WHILE seems pretty broken.  The list of successors
of "earlier_block" (where the conditional BREAK was found) is emptied
and then re-created with the same edges for no apparent reason.  On
top of that the list of predecessors of the block immediately after
the WHILE loop is emptied, but only one of the original edges will be
added back, which means that potentially several blocks that still
have it on their list of successors won't be on its list of
predecessors anymore, causing all sorts of hilarity due to the
inconsistency in the control flow graph.

The solution is to remove the code that's removing valid edges from
the CFG.  cfg_t::remove_block() will already clean up after itself.
The assert in bblock_t::combine_with() also needs to be removed since
we will be merging a block with multiple children into the first one
of them.
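
The invariant that the broken code violated can be sketched as follows. This is a minimal illustration, not Mesa's cfg_t or bblock_t: the struct, the fixed-size edge arrays, and the helper names are all hypothetical. It only shows that every successor edge must have a matching predecessor entry, and vice versa.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_EDGES 4

/* Hypothetical, simplified basic block; not Mesa's bblock_t. */
struct block {
    struct block *succs[MAX_EDGES];
    size_t num_succs;
    struct block *preds[MAX_EDGES];
    size_t num_preds;
};

static bool list_contains(struct block *const *list, size_t n,
                          const struct block *b)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == b)
            return true;
    return false;
}

/* Record the edge on both endpoints so the two lists stay in sync. */
static void add_edge(struct block *from, struct block *to)
{
    from->succs[from->num_succs++] = to;
    to->preds[to->num_preds++] = from;
}

/* True if each successor edge has a matching predecessor entry and
 * each predecessor entry has a matching successor edge. */
static bool cfg_is_consistent(struct block *const *blocks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct block *b = blocks[i];
        for (size_t s = 0; s < b->num_succs; s++)
            if (!list_contains(b->succs[s]->preds, b->succs[s]->num_preds, b))
                return false;
        for (size_t p = 0; p < b->num_preds; p++)
            if (!list_contains(b->preds[p]->succs, b->preds[p]->num_succs, b))
                return false;
    }
    return true;
}
```

Emptying a block's predecessor list and adding back only one of several original edges, as described above, makes cfg_is_consistent() return false for such a graph.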

Found the issue on a hardware enabling branch originally, but
apparently somebody reproduced the same problem independently on
master in the meantime.

Fixes: d13bcdb3a9f ("i965/fs: Extend predicated break pass to predicate WHILE.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009
Cc: jiradet.jd@gmail.com
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/device: make internal functions private
Mark Janes [Tue, 30 Jul 2019 00:38:42 +0000 (17:38 -0700)]
intel/device: make internal functions private

The device info initializer makes several functions internal:

  - handling of device override
  - updating topology from kernel information

The implementation file is slightly reordered due to the renamed
functions being static.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/device: rename gen_get_device_info
Mark Janes [Thu, 25 Jul 2019 22:57:30 +0000 (15:57 -0700)]
intel/device: rename gen_get_device_info

Rename the original device info initialization routine so callers
don't mistakenly call the wrong one:

  gen_get_device_info_from_fd:

      Queries kernel for full device info, including topology
      details.

  gen_get_device_info_from_pci_id:

      Partially initializes device info based on PCI ID lookup, when
      the kernel is not available.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/tools: use device info initializer
Mark Janes [Thu, 25 Jul 2019 21:31:40 +0000 (14:31 -0700)]
intel/tools: use device info initializer

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: use initialization routine for gen_device_info
Mark Janes [Thu, 25 Jul 2019 17:40:55 +0000 (10:40 -0700)]
anv: use initialization routine for gen_device_info

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris/screen: use initialization routine for gen_device_info
Mark Janes [Wed, 24 Jul 2019 22:21:36 +0000 (15:21 -0700)]
iris/screen: use initialization routine for gen_device_info

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoi965: Move device info initialization to common code
Mark Janes [Wed, 24 Jul 2019 20:48:03 +0000 (13:48 -0700)]
i965: Move device info initialization to common code

With perf queries, initializing the device info is much more complex
than just getting a PCI ID and calling gen_get_device_info.  This commit
adds a new gen_get_device_info_from_fd helper in common code which does
all of the requisite kernel queries to get device info including all of
the topology information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoi965/perf: verify kernel support before registering OA metrics
Mark Janes [Wed, 31 Jul 2019 23:16:50 +0000 (16:16 -0700)]
i965/perf: verify kernel support before registering OA metrics

When gen_device_info updates the topology in its initializer, the
kernel queries will fail silently.  Iris and anv have minimum
kernel requirements that support the queries.  i965 must verify kernel
support before reporting OA metrics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/common: provide common ioctl routine
Mark Janes [Thu, 25 Jul 2019 17:50:36 +0000 (10:50 -0700)]
intel/common: provide common ioctl routine

i965 links against libdrm for drmIoctl, but anv and iris both
re-implement this routine to avoid the dependency.

intel/dev also needs an ioctl wrapper, so let's share the same
implementation everywhere.
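
For reference, a wrapper of this kind is typically only a few lines. The sketch below assumes the shared routine follows the same retry behaviour as libdrm's drmIoctl (retry while the call is interrupted); the name retry_ioctl is hypothetical and not taken from this commit.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Hypothetical name. Repeats the ioctl while it fails with EINTR or
 * EAGAIN, which is the behaviour drmIoctl provides in libdrm. */
static int retry_ioctl(int fd, unsigned long request, void *arg)
{
    int ret;

    do {
        ret = ioctl(fd, request, arg);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

    return ret;
}
```

Any other failure is returned to the caller unchanged, with errno left as set by the kernel.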

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agopanfrost: Remove unused argument
Alyssa Rosenzweig [Thu, 1 Aug 2019 15:10:03 +0000 (08:10 -0700)]
panfrost: Remove unused argument

A relic from when we didn't have an online compiler, hah.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Handle MESA_SHADER_COMPUTE in compile callback
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:52:04 +0000 (15:52 -0700)]
panfrost: Handle MESA_SHADER_COMPUTE in compile callback

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Use standard list traversal to find initial tag
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:49:30 +0000 (15:49 -0700)]
pan/midgard: Use standard list traversal to find initial tag

Fixes a hang (and abort) on empty shaders, which you shouldn't have
anyway but better safe than sorry. DCE going on the fritz is no reason
to freeze the system.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Use gl_shader_stage directly for compiles
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:49:13 +0000 (15:49 -0700)]
panfrost: Use gl_shader_stage directly for compiles

No need to add a third set of enums to the mix.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Emit "draw" info for compute jobs
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:32:18 +0000 (15:32 -0700)]
panfrost: Emit "draw" info for compute jobs

Important fields relating to shader state and UBOs are filled out from
this (misnomer) function.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Feed compute shaders into the compiler
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:31:23 +0000 (15:31 -0700)]
panfrost: Feed compute shaders into the compiler

The path for compute shader compiles resembles the graphic shader
compile path, although it is substantially simpler as we don't need any
shader keying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Expose compute shaders as panfrost_shader_variants
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:20:00 +0000 (15:20 -0700)]
panfrost: Expose compute shaders as panfrost_shader_variants

Whether variants are packed by graphics or compute is irrelevant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Remove shader state *base
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:19:44 +0000 (15:19 -0700)]
panfrost: Remove shader state *base

It is now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Remove CSO dependency from shader_compile
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:19:09 +0000 (15:19 -0700)]
panfrost: Remove CSO dependency from shader_compile

We want this routine to be generic across graphics and compute, so let
the caller deal with the typing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Generalize UBO upload for other shader stages
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:06:38 +0000 (15:06 -0700)]
panfrost: Generalize UBO upload for other shader stages

Now that everything is unified, this generalization is nice and easy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Guard vertex upload by ctx->vertex != NULL
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:06:14 +0000 (15:06 -0700)]
panfrost: Guard vertex upload by ctx->vertex != NULL

This is irrelevant for graphics but matters for compute workloads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Generalize vertex shader upload
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:05:57 +0000 (15:05 -0700)]
panfrost: Generalize vertex shader upload

This allows us to reuse the same code path for compute.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Share gl_enables between VERTEX/COMPUTE
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:56:03 +0000 (14:56 -0700)]
panfrost: Share gl_enables between VERTEX/COMPUTE

Catch-all for magic bits.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Invoke compute shader according to grid info
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:27:53 +0000 (14:27 -0700)]
panfrost: Invoke compute shader according to grid info

We already have helpers for packing invocations (due to its role in
instanced vertex shaders), so we can reuse them as a drop-in for compute
shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Explain and include compute FBD
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:22:37 +0000 (14:22 -0700)]
panfrost: Explain and include compute FBD

Squint at it hard enough and you realize it's the beginning of an
SFBD... I guess...

A compute shader with register spilling would be able to confirm this,
but we would expect to see the first field | 1 and an address splattered
later, setting up TLS.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Unify-driven cleanup
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:15:19 +0000 (14:15 -0700)]
panfrost: Unify-driven cleanup

Again, now that stages are unified some logic goes away.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Unify ctx->vs and ctx->fs
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:13:30 +0000 (14:13 -0700)]
panfrost: Unify ctx->vs and ctx->fs

It's a little verbose, but this way we can support other shader stages
without too much contortion.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Flesh out launch_grid stub
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:08:59 +0000 (14:08 -0700)]
panfrost: Flesh out launch_grid stub

It's still incomplete, but we're able to hook into launch_grid to
create a stub COMPUTE job.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Cleanup via payload unification
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:08:07 +0000 (14:08 -0700)]
panfrost: Cleanup via payload unification

Since these are now indexable, quite a bit of code cleans up.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Unify payload_vertex/payload_tiler
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:05:14 +0000 (14:05 -0700)]
panfrost: Unify payload_vertex/payload_tiler

Rather than disparate variables, let's use an array of payloads indexed
by the shader stage.
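
The shape of this change can be sketched roughly as below. The enum, struct fields, and function names are hypothetical stand-ins; the real driver indexes by Mesa's gl_shader_stage and uses its own payload layout.

```c
#include <string.h>

/* Hypothetical stage enum standing in for gl_shader_stage. */
enum sketch_stage {
    SKETCH_STAGE_VERTEX,
    SKETCH_STAGE_FRAGMENT,
    SKETCH_STAGE_COMPUTE,
    SKETCH_STAGE_COUNT
};

/* Hypothetical payload contents, for illustration only. */
struct sketch_payload {
    unsigned shader_addr;
    unsigned uniform_count;
};

struct sketch_context {
    /* One payload per stage instead of separately named variables. */
    struct sketch_payload payloads[SKETCH_STAGE_COUNT];
};

static void reset_payloads(struct sketch_context *ctx)
{
    memset(ctx->payloads, 0, sizeof(ctx->payloads));
}

/* The same code path serves every stage; only the index differs. */
static void bind_shader(struct sketch_context *ctx,
                        enum sketch_stage stage, unsigned addr)
{
    ctx->payloads[stage].shader_addr = addr;
}
```

With this layout, adding a stage such as compute means adding an enum entry rather than another set of per-stage variables and functions.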

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Only wallpaper if we drew something
Alyssa Rosenzweig [Wed, 31 Jul 2019 20:54:23 +0000 (13:54 -0700)]
panfrost: Only wallpaper if we drew something

last_tiler.gpu may be NULL at flush time despite no clear and existing
jobs -- if we executed a compute-only workload.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Adjust shader CAPs to expose dEQP compute
Alyssa Rosenzweig [Wed, 31 Jul 2019 20:40:46 +0000 (13:40 -0700)]
panfrost: Adjust shader CAPs to expose dEQP compute

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Expose NIR as our PIPE_SHADER_CAP_SUPPORTED_IRS
Alyssa Rosenzweig [Tue, 23 Jul 2019 23:40:42 +0000 (16:40 -0700)]
panfrost: Expose NIR as our PIPE_SHADER_CAP_SUPPORTED_IRS

We *could* expose TGSI as well -- we pipe it through tgsi_to_nir for
Gallium-internal shaders anyway -- but we'd rather not.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Copy freedreno's panfrost_get_compute_param
Alyssa Rosenzweig [Tue, 23 Jul 2019 17:41:25 +0000 (10:41 -0700)]
panfrost: Copy freedreno's panfrost_get_compute_param

Values reported here aren't remotely correct, but it's a start to just
get the entrypoint stubbed out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Expose COMPUTE-related caps for GLES3.1
Alyssa Rosenzweig [Tue, 23 Jul 2019 16:05:40 +0000 (09:05 -0700)]
panfrost: Expose COMPUTE-related caps for GLES3.1

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Stub out launch_grid
Alyssa Rosenzweig [Tue, 23 Jul 2019 15:31:14 +0000 (08:31 -0700)]
panfrost: Stub out launch_grid

Just dumps some information about the invocation for later debug.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Stub out compute CSO
Alyssa Rosenzweig [Tue, 23 Jul 2019 15:28:23 +0000 (08:28 -0700)]
panfrost: Stub out compute CSO

Doesn't do anything, just gets the functions there.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Implement gl_FrontFacing
Alyssa Rosenzweig [Wed, 31 Jul 2019 19:24:32 +0000 (12:24 -0700)]
panfrost: Implement gl_FrontFacing

Interestingly, this requires no compiler changes. It's just exposed as a
special varying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Add support for decoding gl_FrontFacing
Alyssa Rosenzweig [Wed, 31 Jul 2019 18:56:55 +0000 (11:56 -0700)]
panfrost: Add support for decoding gl_FrontFacing

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/decode: Use max varying index as varying buffer count
Alyssa Rosenzweig [Wed, 31 Jul 2019 18:52:52 +0000 (11:52 -0700)]
pan/decode: Use max varying index as varying buffer count

This allows us to decode asymmetric varyings correctly, which occurs
with e.g. gl_FrontFacing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoiris: add support for gl_ClipVertex in tess eval shaders
Timothy Arceri [Fri, 28 Jun 2019 12:25:57 +0000 (22:25 +1000)]
iris: add support for gl_ClipVertex in tess eval shaders

Required for OpenGL compat support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: add support for gl_ClipVertex in geometry shaders
Timothy Arceri [Thu, 27 Jun 2019 05:06:30 +0000 (15:06 +1000)]
iris: add support for gl_ClipVertex in geometry shaders

This will enable us to support the OpenGL compat profile.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Stop whacking gl_FrontFacing to a system value
Jason Ekstrand [Wed, 31 Jul 2019 20:17:17 +0000 (15:17 -0500)]
nir: Stop whacking gl_FrontFacing to a system value

We have a cap bit for gallium and a GLSL compiler flag to control this.
Just trust what GLSL gives us and stop forcing it.  In order for this to
be safe, we have to advertise another cap in some of the gallium
drivers.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agopanfrost: Implement panfrost_set_shader_buffers callback
Alyssa Rosenzweig [Thu, 1 Aug 2019 17:31:35 +0000 (10:31 -0700)]
panfrost: Implement panfrost_set_shader_buffers callback

Just copy over the passed SSBO for now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogallium/util: Add util_set_shader_buffers_mask helper
Alyssa Rosenzweig [Thu, 1 Aug 2019 17:30:40 +0000 (10:30 -0700)]
gallium/util: Add util_set_shader_buffers_mask helper

Conceptually follows util_set_vertex_buffers_mask but for SSBOs.

v2: Fix missing ~ when clearing mask. Adjust mask behaviour to match
freedreno/v3d when buffer == NULL.
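
A minimal sketch of the masking behaviour described in the v2 note, assuming a slot's bit is set when a buffer is bound and cleared (with ~) otherwise. The struct and function names are hypothetical and much simpler than the Gallium types; start + count is assumed to stay below 32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a shader buffer binding. */
struct sketch_buffer {
    const void *buffer;
};

/* Copy `count` bindings into dst[start..] and keep `enabled` in sync.
 * A NULL src, or a NULL buffer inside src, unbinds the slot. */
static void set_buffers_mask(struct sketch_buffer *dst, uint32_t *enabled,
                             const struct sketch_buffer *src,
                             unsigned start, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        uint32_t bit = UINT32_C(1) << (start + i);

        dst[start + i].buffer = src ? src[i].buffer : NULL;

        if (dst[start + i].buffer)
            *enabled |= bit;
        else
            *enabled &= ~bit; /* without the ~ this would keep only `bit` */
    }
}
```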

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agokmsro: move entry points from etnaviv to kmsro
Jonathan Marek [Thu, 1 Aug 2019 16:48:40 +0000 (12:48 -0400)]
kmsro: move entry points from etnaviv to kmsro

These drivers are kmsro drivers, so they should be part of the kmsro #if.

This fixes the missing imx_drm driver when building with only freedreno+kmsro.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogitlab-ci: remove software-properties-common
Emil Velikov [Thu, 25 Jul 2019 13:45:08 +0000 (14:45 +0100)]
gitlab-ci: remove software-properties-common

Currently we use the python package to manage repositories. At the same
time we also do that by hand - since it's a trivial echo to a file.

Stay consistent, remove the package and manage things manually.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agost/mesa: fix MSVC compile breakage
Brian Paul [Thu, 1 Aug 2019 15:07:19 +0000 (09:07 -0600)]
st/mesa: fix MSVC compile breakage

Trivial.

5 years agovirgl: Enable depth_clamp by lowering if the host is new enough.
Gert Wollny [Thu, 25 Jul 2019 08:45:14 +0000 (10:45 +0200)]
virgl: Enable depth_clamp by lowering if the host is new enough.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agogallium: Make PIPE_CAP_DEPTH_CLIP_DISABLE a tri-state value and use it
Gert Wollny [Thu, 25 Jul 2019 08:44:39 +0000 (10:44 +0200)]
gallium: Make PIPE_CAP_DEPTH_CLIP_DISABLE a tri-state value and use it

Use value "2" to signal that lowering is needed and supported and enable
it accordingly.
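
How a consumer might interpret the tri-state value is sketched below. Only the meaning of "2" comes from this commit; the enum names, and the reading of 0 as unsupported and 1 as natively supported, are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical names; only the meaning of value 2 is from the commit. */
enum sketch_depth_clip_disable {
    SKETCH_DEPTH_CLIP_DISABLE_UNSUPPORTED = 0, /* assumed */
    SKETCH_DEPTH_CLIP_DISABLE_NATIVE = 1,      /* assumed */
    SKETCH_DEPTH_CLIP_DISABLE_LOWERED = 2      /* lowering needed and supported */
};

/* Depth clamp can be exposed either natively or through lowering. */
static bool depth_clamp_available(int cap)
{
    return cap != SKETCH_DEPTH_CLIP_DISABLE_UNSUPPORTED;
}

/* Only value 2 asks the state tracker to lower depth clamp in shaders. */
static bool depth_clamp_needs_lowering(int cap)
{
    return cap == SKETCH_DEPTH_CLIP_DISABLE_LOWERED;
}
```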

v2: - Note in CAP description that this lowering currently requires TGSI
    - use "true" instead of GL_TRUE (both Erik)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/st: Signal state changes when depth_clamp is emulated
Gert Wollny [Thu, 25 Jul 2019 08:40:36 +0000 (10:40 +0200)]
mesa/st: Signal state changes when depth_clamp is emulated

v1 implemented by Erik Faye-Lund <erik.faye-lund@collabora.com>
v2: - Add GS and TES
    - fix constants state update flags (Erik)
v3: don't update rasterizer when depth_clamp is lowered (Erik)
v4: Correct NewDepthClamp and also set flags for NewClipControl (Erik)
v5: Also set shader_has_one_variant property according to possible
   depth_clamp lowering (Marek)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/st: Add depth clamping to rasterizer code
Gert Wollny [Thu, 25 Jul 2019 08:38:24 +0000 (10:38 +0200)]
mesa/st: Add depth clamping to rasterizer code

implemented by Erik Faye-Lund <erik.faye-lund@collabora.com>

v2: Use current depth range values for clamping (Erik)
v3: fix scons-win64 build

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/st: Tie depth_clamp code into other shaders (GS and TES)
Gert Wollny [Thu, 25 Jul 2019 08:35:46 +0000 (10:35 +0200)]
mesa/st: Tie depth_clamp code into other shaders (GS and TES)

v2: Use file scope defined depth_range_state in common
v3: - don't use the one_shader_variant property, as this is
      not correct (Marek)
    - also use tests on available shader stages to enable
      depth_clamp lowering
v4: Don't use key.st, use st directly (Marek)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/st: Tie depth_clamp lowering into the FS
Gert Wollny [Thu, 25 Jul 2019 08:34:48 +0000 (10:34 +0200)]
mesa/st: Tie depth_clamp lowering into the FS

v1 implemented by Erik Faye-Lund <erik.faye-lund@collabora.com>
v2: Use different call for FS
v3: Use file scope defined depth_range_state

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/st: Tie depth clamp lowering in to the VP code
Gert Wollny [Thu, 25 Jul 2019 08:33:36 +0000 (10:33 +0200)]
mesa/st: Tie depth clamp lowering in to the VP code

v1: implemented by Erik Faye-Lund <erik.faye-lund@collabora.com>
v2: Add handling of the ARB_clip_control depth mode
v3: Move depth_range_state to file scope and remove trailing zeros (Erik)
v4: - don't use the one_shader_variant property, as this is not correct (Marek)
    - also use tests on available shader stages to enable depth_clamp lowering
v5: Don't use key.st, use st directly (Marek)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/st: add tgsi-lowering code for depth-clamp
Erik Faye-Lund [Wed, 5 Jun 2019 13:39:41 +0000 (15:39 +0200)]
mesa/st: add tgsi-lowering code for depth-clamp

This is a TGSI pass that lowers depth-clamping into shader-operations,
by replacing the depth-value with 0 (a z-coordinate of zero will always
pass the OpenGL depth test conditions), and using a dedicated varying to
interpolate the real depth-value instead. Finally we replace the
depth-output in the fragment shader.
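
The idea can be modelled on the CPU as below. This is not the TGSI pass itself: the struct and function names are hypothetical, and the sketch assumes the real depth is already in the same space as the depth range, ignoring the clip control depth mode handled in later revisions.

```c
/* Hypothetical output of the last vertex stage after lowering. */
struct sketch_lowered_vertex {
    float position_z; /* forced to 0 so the vertex is never depth-clipped */
    float real_depth; /* carried in a dedicated varying */
};

static struct sketch_lowered_vertex lower_vertex_depth(float z)
{
    struct sketch_lowered_vertex v = { 0.0f, z };
    return v;
}

/* Fragment side: clamp the interpolated real depth to the current
 * depth range and use the result as the fragment's depth output. */
static float lower_fragment_depth(float depth, float range_near,
                                  float range_far)
{
    float lo = range_near < range_far ? range_near : range_far;
    float hi = range_near < range_far ? range_far : range_near;

    if (depth < lo)
        return lo;
    if (depth > hi)
        return hi;
    return depth;
}
```

Taking the minimum and maximum of the two range values keeps the clamp correct when the application sets a reversed depth range.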

v1 implemented by Erik Faye-Lund <erik.faye-lund@collabora.com>
v2: Add support for handling depth clip mode, and refactor code
v3: - Rename *_vs functions to *_last_vertex_stage (Erik)
    - Use 0.0 depth to avoid clipping (Erik)
v4: Fix inversion of bool value for clip control property

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/st: replace boolean declarations by bool
Gert Wollny [Tue, 23 Jul 2019 05:07:14 +0000 (07:07 +0200)]
mesa/st: replace boolean declarations by bool

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoRevert "softpipe: Don't draw when rasterizer_discard is set"
Gert Wollny [Mon, 29 Jul 2019 16:13:44 +0000 (18:13 +0200)]
Revert "softpipe: Don't draw when rasterizer_discard is set"

This was too aggressive and breaks TF (Ilia)

This reverts commit 4ee638cd7826e8a4bed76f51c7b73395a2fcdbbc.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agodocs: reword meson instructions
Eric Engestrom [Wed, 31 Jul 2019 10:49:03 +0000 (11:49 +0100)]
docs: reword meson instructions

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agotravis: drop unnecessary Meson option for MacOS
Eric Engestrom [Tue, 30 Jul 2019 15:17:15 +0000 (16:17 +0100)]
travis: drop unnecessary Meson option for MacOS

Those are already their default values on MacOS.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agointel/vec4: Drop all of the 64-bit varying code
Jason Ekstrand [Sat, 20 Jul 2019 13:17:59 +0000 (08:17 -0500)]
intel/vec4: Drop all of the 64-bit varying code

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/fs: Drop all of the 64-bit varying code
Jason Ekstrand [Fri, 19 Jul 2019 22:38:04 +0000 (17:38 -0500)]
intel/fs: Drop all of the 64-bit varying code

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel: Use NIR to lower 64-bit varying access
Jason Ekstrand [Fri, 19 Jul 2019 22:23:26 +0000 (17:23 -0500)]
intel: Use NIR to lower 64-bit varying access

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir/lower_io: Add an option to lower 64-bit varyings
Jason Ekstrand [Fri, 19 Jul 2019 22:10:07 +0000 (17:10 -0500)]
nir/lower_io: Add an option to lower 64-bit varyings

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agodocs: Update Platforms and Drivers page with more comprehensive information.
Jorge Natz [Wed, 31 Jul 2019 22:50:43 +0000 (22:50 +0000)]
docs: Update Platforms and Drivers page with more comprehensive information.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agonir: use common deref has indirect code in scratch lowering.
Dave Airlie [Wed, 31 Jul 2019 04:05:49 +0000 (14:05 +1000)]
nir: use common deref has indirect code in scratch lowering.

This doesn't seem to need its own copy here.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: remove explicit nir_intrinsic_index_flag values
Eric Engestrom [Wed, 31 Jul 2019 21:50:56 +0000 (22:50 +0100)]
nir: remove explicit nir_intrinsic_index_flag values

These were left after a rebase and happen to make
NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it
was noticed.
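
The failure mode is easy to reproduce in plain C: an enumerator without an explicit value takes the previous value plus one, so a stale explicit value can silently alias a neighbour. The enumerator names below are hypothetical and only mirror the shape of the problem.

```c
/* A stale explicit value left behind after an entry was added before it. */
enum sketch_broken_flag {
    SKETCH_BROKEN_FORMAT,           /* 0 */
    SKETCH_BROKEN_SRC_ACCESS,       /* 1, newly added entry */
    SKETCH_BROKEN_SWIZZLE_MASK = 1, /* stale value: aliases SRC_ACCESS */
    SKETCH_BROKEN_NUM_FLAGS         /* 2, so the count is wrong as well */
};

/* Without explicit values the compiler numbers every entry uniquely. */
enum sketch_fixed_flag {
    SKETCH_FIXED_FORMAT,       /* 0 */
    SKETCH_FIXED_SRC_ACCESS,   /* 1 */
    SKETCH_FIXED_SWIZZLE_MASK, /* 2 */
    SKETCH_FIXED_NUM_FLAGS     /* 3 */
};
```

C allows duplicate enumerator values, so the compiler does not warn about the broken form.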

Fixes: 6f20643b471a851c936f ("nir: Allow qualifiers on copy_deref and image instructions")
Cc: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agostate_tracker: Free Labels for querry and tranform_feedback
Yevhenii Kolesnikov [Fri, 26 Jul 2019 14:30:55 +0000 (17:30 +0300)]
state_tracker: Free Labels for querry and tranform_feedback

Memory leaks were observed on iris with GL_KHR_debug.

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoiris: Skip emitting 3DSTATE_INDEX_BUFFER if possible
Kenneth Graunke [Sat, 16 Feb 2019 08:57:54 +0000 (00:57 -0800)]
iris: Skip emitting 3DSTATE_INDEX_BUFFER if possible

We were emitting 3DSTATE_INDEX_BUFFER on every indexed draw, even if
back-to-back draws referred to the same index buffer.  This improves
drawoverhead scores in the DrawElements cases by about 10%, by giving
us even more minimal batches.
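
The general technique is to remember the last state emitted into the batch and skip the packet when nothing changed. The sketch below illustrates that pattern only; the struct layout, field names, and packet counter are hypothetical and not iris code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the index buffer state a packet encodes. */
struct sketch_index_buffer_state {
    uint64_t address;
    uint32_t size;
    uint32_t index_size;
};

struct sketch_batch {
    struct sketch_index_buffer_state last_ib;
    bool last_ib_valid;       /* false until a packet has been emitted */
    unsigned packets_emitted; /* stands in for writing the real packet */
};

static bool same_ib(const struct sketch_index_buffer_state *a,
                    const struct sketch_index_buffer_state *b)
{
    return a->address == b->address && a->size == b->size &&
           a->index_size == b->index_size;
}

/* Emit the packet only if the state differs from the last one emitted. */
static void emit_index_buffer(struct sketch_batch *batch,
                              const struct sketch_index_buffer_state *ib)
{
    if (batch->last_ib_valid && same_ib(&batch->last_ib, ib))
        return;

    batch->packets_emitted++;
    batch->last_ib = *ib;
    batch->last_ib_valid = true;
}
```

A cache like this has to be invalidated whenever a new batch starts, since the hardware state from the previous batch can no longer be assumed.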

5 years agost/dri: simplify dri_get_egl_image by reusing dri2_format_table
Mike Blumenkrantz [Thu, 6 Jun 2019 23:47:23 +0000 (19:47 -0400)]
st/dri: simplify dri_get_egl_image by reusing dri2_format_table

this makes dri2_get_mapping_by_fourcc accessible from dri_helpers.h
and does a direct lookup on the fourcc id to match the pipe format

v2 (Ken): Allow map to be NULL, use img->texture->format.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agolima: enable lower_bitops in ppir
Erico Nunes [Thu, 18 Jul 2019 19:13:19 +0000 (21:13 +0200)]
lima: enable lower_bitops in ppir

The mali pp doesn't support integers and some nir_algebraic
optimizations may result in ops that are not easily lowerable to floats,
so disable optimizations resulting in bitops.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agonir/algebraic: rename lower_bitshift to lower_bitops
Erico Nunes [Thu, 18 Jul 2019 18:56:27 +0000 (20:56 +0200)]
nir/algebraic: rename lower_bitshift to lower_bitops

Optimizations that insert bitshift or bitwise operations should not be
applied on GPUs that don't support integer operations.
The .lower_bitshift could be used to control the bitshift related ones,
but there was also one bitwise optimization uncovered.
Since only lima and freedreno use this option and the use case is that
no bit operations are wanted, let's rename it to .lower_bitops and use
it to control all bitops related optimizations.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agolima/ppir: lower fdot in nir_opt_algebraic
Erico Nunes [Sat, 27 Jul 2019 16:10:46 +0000 (18:10 +0200)]
lima/ppir: lower fdot in nir_opt_algebraic

Now that we have fsum in nir, we can move fdot lowering there.
This helps reduce ppir complexity and enables the lowered ops to be part
of other nir optimizations in the optimization loop.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agonir/algebraic: add new fsum ops and fdot lowering
Erico Nunes [Sat, 27 Jul 2019 15:58:53 +0000 (17:58 +0200)]
nir/algebraic: add new fsum ops and fdot lowering

The Mali400 pp doesn't implement fdot but has fsum3 and fsum4, which can
be used to optimize fdot lowering. fsum2 is not implemented and can be
further lowered to an add with the vector components.
Currently lima ppir handles this lowering internally, however this
happens in a very late stage and requires a big chunk of code compared
to a nir_opt_algebraic lowering.
By having fsum in nir, we can reduce ppir complexity and enable the
lowered ops to be part of other nir optimizations in the optimization
loop.
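
Numerically, the lowering amounts to a component-wise multiply followed by a horizontal sum. The C model below is a sketch of that equivalence, not the nir_opt_algebraic rule; the type and function names are hypothetical.

```c
/* Hypothetical three-component vector. */
struct sketch_vec3 {
    float x, y, z;
};

/* Component-wise multiply, the first half of the lowered dot product. */
static struct sketch_vec3 sketch_fmul3(struct sketch_vec3 a,
                                       struct sketch_vec3 b)
{
    struct sketch_vec3 r = { a.x * b.x, a.y * b.y, a.z * b.z };
    return r;
}

/* Models an fsum3 op: add all components of one vector. */
static float sketch_fsum3(struct sketch_vec3 v)
{
    return v.x + v.y + v.z;
}

/* fdot3(a, b) lowered to fsum3(fmul(a, b)). */
static float sketch_fdot3_lowered(struct sketch_vec3 a, struct sketch_vec3 b)
{
    return sketch_fsum3(sketch_fmul3(a, b));
}

/* With no fsum2, the two-component case becomes a plain add. */
static float sketch_fdot2_lowered(float ax, float ay, float bx, float by)
{
    return ax * bx + ay * by;
}
```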

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agolima/ppir: refactor texture code to simplify scheduler
Erico Nunes [Sun, 28 Jul 2019 19:27:46 +0000 (21:27 +0200)]
lima/ppir: refactor texture code to simplify scheduler

The 'varying fetch' pp instruction deals only with coordinates, and
'texture fetch' deals only with the sampler index.
Previously it was not possible to clearly map ppir_op_load_coords and
ppir_op_load_texture to pp instructions as the source coordinates were
kept in the ppir_op_load_texture node, making this harder to maintain.
The refactor attempts to clearly map ppir_op_load_coords
to the 'varying fetch' and ppir_op_load_texture to the 'texture fetch'.
The coordinates are still temporarily kept in the ppir_op_load_texture
node as nir has both sampler and coordinates in a single instruction and
it is only possible to output one ppir node during emit. But now after
lowering, the sources are transferred to the (always) created
ppir_op_load_coords node, and it should be possible to directly map them
to their pp instructions from there onwards.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/ppir: lower texture projection
Erico Nunes [Sun, 28 Jul 2019 19:25:42 +0000 (21:25 +0200)]
lima/ppir: lower texture projection

Lower texture projection in ppir using nir_lower_tex.
This will insert a mul with the coordinate division before the load
varying.

Even though the lima pp supports projection in the load varying
instruction while loading the coordinates (from a register or a
varying), it requires that both the coordinates and projector be
components in a single register.
nir currently handles them in separate SSA values, and attempting to merge them
manually may end up in worse code than just doing the coordinate
division manually. So for now let's just lower the projection to add
support for it in lima.
In the future, an optimization pass may be implemented in lima to ensure
that both coords and projector come in the same register, then this
lowering may be disabled and in this case lima may use the built-in
projection and save the mul instruction from lowering.
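
The arithmetic that the lowering adds can be sketched as follows: the
projector is inverted once and each coordinate is multiplied by the result
before the texture fetch. This is a scalar model in plain C and not the
nir_lower_tex implementation; struct vec2 and model_lower_projection are
names invented for the example.

```c
#include <assert.h>

/* Hypothetical 2D texture coordinate. */
struct vec2 {
    float x;
    float y;
};

/* Hypothetical model of a lowered projection: compute the reciprocal of
 * the projector q once, then spend one multiply per coordinate. */
struct vec2 model_lower_projection(struct vec2 coord, float q)
{
    assert(q != 0.0f);          /* the sketch does not model division by zero */
    float rcp = 1.0f / q;
    struct vec2 out = { coord.x * rcp, coord.y * rcp };
    return out;
}
```

The multiplies here are the extra work that the built-in projection would
save if coordinates and projector could be kept in a single register.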

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agoscons: Fix random_r check.
Vinson Lee [Wed, 31 Jul 2019 02:24:16 +0000 (19:24 -0700)]
scons: Fix random_r check.

Fixes: 597bddad47e8 ("scons: Test for random_r()")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoRevert "st/dri: simplify dri_get_egl_image by reusing dri2_format_table"
Kenneth Graunke [Wed, 31 Jul 2019 18:05:49 +0000 (11:05 -0700)]
Revert "st/dri: simplify dri_get_egl_image by reusing dri2_format_table"

This reverts commit c47af8b95f26bd83efe322ff0baa52263fb8625e.  It causes
dEQP-EGL regressions.  (I think there is an easy fix, but we'll have it
go through review again.)

5 years agopan/midgard: Don't special case inline_constant
Alyssa Rosenzweig [Tue, 30 Jul 2019 19:25:21 +0000 (12:25 -0700)]
pan/midgard: Don't special case inline_constant

Another constant source of bugs. Ain't that special.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: De-special-case branching
Alyssa Rosenzweig [Tue, 30 Jul 2019 19:20:24 +0000 (12:20 -0700)]
pan/midgard: De-special-case branching

It's not that special.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Add MALI_SAMP_NORM_COORDS flag
Alyssa Rosenzweig [Wed, 31 Jul 2019 16:08:07 +0000 (09:08 -0700)]
panfrost: Add MALI_SAMP_NORM_COORDS flag

Corresponds to the normalized coordinates? flag on images in OpenCL and
evidently also shows up in GL, so let's wire it in.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Simplify filter_mode definition
Alyssa Rosenzweig [Wed, 31 Jul 2019 15:50:02 +0000 (08:50 -0700)]
panfrost: Simplify filter_mode definition

It's just a bit field containing some flags; there's no need for all the
macro magic.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Shrink "compute FBD"
Alyssa Rosenzweig [Wed, 31 Jul 2019 14:25:24 +0000 (07:25 -0700)]
pan/midgard: Shrink "compute FBD"

We still don't know what it is, but from a newer trace we now know it's
half the size we thought it was.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Flip texture/sampler fields
Alyssa Rosenzweig [Wed, 31 Jul 2019 14:20:29 +0000 (07:20 -0700)]
panfrost: Flip texture/sampler fields

We had them backwards in both the command stream and the Midgard stack.
In OpenGL ES 2.0, they're always the same, but in Vulkan/later-GL/CL
they diverge so we can fix this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Add MALI_ATTR_IMAGE value
Alyssa Rosenzweig [Wed, 31 Jul 2019 00:27:03 +0000 (17:27 -0700)]
panfrost: Add MALI_ATTR_IMAGE value

Images are implemented (in part) as special attributes, so include
support for decoding this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agost/dri: simplify dri_get_egl_image by reusing dri2_format_table
Mike Blumenkrantz [Thu, 6 Jun 2019 23:47:23 +0000 (19:47 -0400)]
st/dri: simplify dri_get_egl_image by reusing dri2_format_table

this makes dri2_get_mapping_by_fourcc accessible from dri_helpers.h
and does a direct lookup on the fourcc id to match the pipe format

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agogallium: add handling for YUV planar surfaces
Mike Blumenkrantz [Wed, 29 May 2019 21:14:32 +0000 (17:14 -0400)]
gallium: add handling for YUV planar surfaces

st/dri:
this adds a table (similar to the one in i965) which provides
mappings for turning various planar formats into multiple sampler views.
whereas previously only NV12 and IYUV were supported, many more formats
are now supported here:
* P0XX
* YUV4XX
* YVU4XX
* AYUV
* XYUV
* YUYV
* UYVY

the table is used directly to handle image creation, simplifying
a lot of code and resolving related TODO/FIXME items where workarounds were
previously in place to manage NV12 and IYUV formats exclusively

st/mesa:
the changes here relate to setting up samplers for the planar formats.
this requires:
* checking for driver support for all the sampler formats
* creating the samplers with the corresponding formats and swizzling
* running nir_lower_tex with the appropriate options to trigger the lowering
  for each plane->sampler
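
A table-driven lookup of the kind described above might look roughly like
the sketch below. It is not the table added by this commit: the struct
layout, the names and the two entries are assumptions chosen for
illustration, and only the fourcc packing and the plane layouts of NV12 and
three-plane 4:2:0 YUV are taken as known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a little-endian fourcc code. */
#define MODEL_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical per-plane description: subsampling shifts relative to the
 * image size and the number of components stored per texel. */
struct model_plane {
    int width_shift;
    int height_shift;
    int components;
};

/* Hypothetical table entry: one planar format and its planes. */
struct model_planar_format {
    uint32_t fourcc;
    int num_planes;
    struct model_plane planes[3];
};

static const struct model_planar_format model_planar_table[] = {
    /* NV12: full-size Y plane, then a half-size interleaved CbCr plane. */
    { MODEL_FOURCC('N', 'V', '1', '2'), 2, { {0, 0, 1}, {1, 1, 2} } },
    /* YU12: full-size Y plane, then half-size Cb and Cr planes. */
    { MODEL_FOURCC('Y', 'U', '1', '2'), 3, { {0, 0, 1}, {1, 1, 1}, {1, 1, 1} } },
};

/* Direct lookup by fourcc; returns NULL for formats not in the table. */
const struct model_planar_format *model_planar_lookup(uint32_t fourcc)
{
    size_t count = sizeof(model_planar_table) / sizeof(model_planar_table[0]);
    for (size_t i = 0; i < count; i++) {
        if (model_planar_table[i].fourcc == fourcc)
            return &model_planar_table[i];
    }
    return NULL;
}
```

With a table like this, supporting another format means adding one entry
rather than another special case in the image creation path.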

fixes kwg/mesa#36

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agogallium: add AYUV and XYUV formats
Mike Blumenkrantz [Wed, 3 Jul 2019 16:14:54 +0000 (12:14 -0400)]
gallium: add AYUV and XYUV formats

this only adds the PIPE_FORMAT members, not any direct handling for them

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agopan/midgard: Simplify discard logic
Alyssa Rosenzweig [Wed, 31 Jul 2019 00:07:25 +0000 (17:07 -0700)]
pan/midgard: Simplify discard logic

The "branch offset" is, in fact, ignored.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add units for more instructions
Alyssa Rosenzweig [Tue, 30 Jul 2019 23:55:16 +0000 (16:55 -0700)]
pan/midgard: Add units for more instructions

For everything but freduce, we have some sense of what units the
instruction takes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Fix ball/bany opcode table
Alyssa Rosenzweig [Tue, 30 Jul 2019 23:46:57 +0000 (16:46 -0700)]
pan/midgard: Fix ball/bany opcode table

These were seriously messed up beyond all recognition. How we're passing
shaders.random.* is a mystery.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Document branch combination LUT
Alyssa Rosenzweig [Mon, 29 Jul 2019 16:15:32 +0000 (09:15 -0700)]
pan/midgard: Document branch combination LUT

This took way longer to figure out than it should have.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agost/mesa: Skip scissor rect updates when scissor is entirely disabled.
Kenneth Graunke [Wed, 31 Jul 2019 01:07:17 +0000 (18:07 -0700)]
st/mesa: Skip scissor rect updates when scissor is entirely disabled.

If any scissor rectangles are enabled, then we need to set proper
scissor rectangles for all viewports.  But if the scissor test is
entirely disabled, then we can skip updating any scissor rectangles.

Without this step, we were updating the scissor rectangles based on
the current framebuffer size.  So if an app rendered to a variety of
render targets at different sizes, with scissor test disabled each
time, we'd still be continually updating the scissor rectangles,
even though it's not necessary.

In Civilization VI, this drops us from 310-350 set_scissor_state
calls per frame to 0, as it doesn't appear to use scissor testing.
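
The early-out can be pictured with the sketch below. It is a simplified
model and not the st/mesa code: the context struct, its field names and the
call counter are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state tracked by the sketch. */
struct model_scissor_ctx {
    uint32_t enable_mask;       /* bit i set: scissor test on for viewport i */
    unsigned fb_width;
    unsigned fb_height;
    unsigned set_state_calls;   /* counts simulated set_scissor_state calls */
};

/* Returns true if scissor state was pushed to the (simulated) driver. */
bool model_update_scissor(struct model_scissor_ctx *ctx)
{
    /* Scissor test disabled for every viewport: nothing to update, even if
     * the framebuffer size changed since the last call. */
    if (ctx->enable_mask == 0)
        return false;

    /* Real code would compute one rectangle per viewport here, clamped to
     * fb_width x fb_height; the sketch only counts the call. */
    ctx->set_state_calls++;
    return true;
}
```

In this model, an application that never enables the scissor test leaves
the call counter at zero regardless of how often the render target changes.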

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoegl/drm: ensure the backing gbm is set before using it
Emil Velikov [Fri, 5 Jul 2019 10:14:30 +0000 (11:14 +0100)]
egl/drm: ensure the backing gbm is set before using it

Currently, if we error out before gbm_dri is set (say due to a different
name of the backing GBM implementation, or otherwise) the tear down will
trigger a NULL ptr deref and crash out.

Move the gbm_dri initialization as early as possible.
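
The ordering issue can be sketched as below. This is a simplified model and
not the egl/drm code: the structs, the field names and the backend name
check are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the backing GBM device. */
struct model_gbm {
    const char *backend_name;
    int users;
};

/* Hypothetical stand-in for the per-display state. */
struct model_display {
    struct model_gbm *gbm_dri;
    int torn_down;
};

/* Tear down dereferences gbm_dri without checking it, so it must be set
 * before any path that can reach this function. */
void model_teardown(struct model_display *dpy)
{
    dpy->gbm_dri->users--;
    dpy->gbm_dri = NULL;
    dpy->torn_down = 1;
}

/* Returns 0 on success and -1 on failure. */
int model_initialize(struct model_display *dpy, struct model_gbm *gbm)
{
    /* Set the backing pointer first, before any step that can fail. */
    dpy->gbm_dri = gbm;

    if (strcmp(gbm->backend_name, "drm") != 0) {
        model_teardown(dpy);    /* safe: gbm_dri is already valid */
        return -1;
    }
    return 0;
}
```

If the assignment were placed after the name check instead, the failure
path would call the tear down with gbm_dri still NULL.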

v2: Drop check in dri2_teardown_drm (Eric)

Reported-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: update required meson version
Eric Engestrom [Mon, 29 Jul 2019 23:44:21 +0000 (00:44 +0100)]
docs: update required meson version

Fixes: f7b6a8d12fdc446e3251 ("meson: bump required version to 0.46")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoradv/gfx10: implement a GE bug workaround
Samuel Pitoiset [Wed, 31 Jul 2019 07:39:23 +0000 (09:39 +0200)]
radv/gfx10: implement a GE bug workaround

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: remove an obsolete VGT_REUSE_OFF workaround
Samuel Pitoiset [Wed, 31 Jul 2019 07:39:22 +0000 (09:39 +0200)]
radv/gfx10: remove an obsolete VGT_REUSE_OFF workaround

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: disable LATE_ALLOC_GS on Navi14
Samuel Pitoiset [Wed, 31 Jul 2019 07:39:21 +0000 (09:39 +0200)]
radv/gfx10: disable LATE_ALLOC_GS on Navi14

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>