mesa.git
4 years ago pan/midgard: Fix masks/alignment for 64-bit loads
Alyssa Rosenzweig [Fri, 15 Nov 2019 20:16:53 +0000 (15:16 -0500)]
pan/midgard: Fix masks/alignment for 64-bit loads

These need to be handled with special care.

Oh, Midgard, you're *extra* special.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago pan/midgard: Expose more typesize helpers
Alyssa Rosenzweig [Fri, 15 Nov 2019 20:16:28 +0000 (15:16 -0500)]
pan/midgard: Expose more typesize helpers

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago pan/midgard: Implement non-aligned UBOs
Alyssa Rosenzweig [Fri, 15 Nov 2019 19:13:18 +0000 (14:13 -0500)]
pan/midgard: Implement non-aligned UBOs

The field is more fine-grained than we had assumed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago etnaviv: rs: upsampling is not supported
Christian Gmeiner [Sat, 9 Nov 2019 18:27:54 +0000 (19:27 +0100)]
etnaviv: rs: upsampling is not supported

This change makes it possible to support different downsample cases
like 4 -> 2 or 4 -> 1.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
4 years ago freedreno/registers: fix a6xx_2d_blit_cntl ROTATE
Jonathan Marek [Fri, 15 Nov 2019 20:34:09 +0000 (15:34 -0500)]
freedreno/registers: fix a6xx_2d_blit_cntl ROTATE

A change from b7093882 got overwritten by 610c8c93

Fixes: 610c8c93 ("freedreno/registers: Update with GS, HS and DS registers")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
4 years ago freedreno/ir3: disable texture prefetch for 1d array textures
Jonathan Marek [Fri, 15 Nov 2019 17:38:28 +0000 (12:38 -0500)]
freedreno/ir3: disable texture prefetch for 1d array textures

Prefetch only supports the basic 2D texture case. Checking is_array is
needed because 1d array textures pass the coord num_components==2 test.

Fixes: 2a0d45ae ("freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
4 years ago lima: Parse VS and PLBU command stream while making a dump
Andreas Baierl [Fri, 25 Oct 2019 14:38:52 +0000 (16:38 +0200)]
lima: Parse VS and PLBU command stream while making a dump

This makes the streams more readable and comparable with the blob's parser
as it parses the VS and PLBU stream and shows the currently known values.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
4 years ago lima: Beautify stream dumps
Andreas Baierl [Thu, 17 Oct 2019 10:21:13 +0000 (12:21 +0200)]
lima: Beautify stream dumps

Change the dump so that its output looks more like the output of
mali-syscall-tracker [1].
This is preparation for a more detailed stream analysis.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
[1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker

4 years ago clover/llvm: fix build after llvm 10 commit 1dfede3122ee
Aaron Watry [Fri, 15 Nov 2019 04:44:02 +0000 (22:44 -0600)]
clover/llvm: fix build after llvm 10 commit 1dfede3122ee

CodeGenFileType moved from ::llvm::TargetMachine in
llvm/Target/TargetMachine.h to ::llvm:: in llvm/Support/CodeGen.h

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
4 years ago android: util/format: fix include path list
Mauro Rossi [Fri, 15 Nov 2019 22:54:52 +0000 (23:54 +0100)]
android: util/format: fix include path list

To avoid the following build error:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_util_intermediates/format/u_format_table.c:30:10:
fatal error: 'u_format.h' file not found
         ^~~~~~~~~~~~
1 error generated.

Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
4 years ago android: radeonsi: fix build error due to wrong u_format.csv file path
Mauro Rossi [Fri, 15 Nov 2019 22:13:49 +0000 (23:13 +0100)]
android: radeonsi: fix build error due to wrong u_format.csv file path

GEN10_FORMAT_TABLE_INPUTS requires correction of the u_format.csv file path
in order to avoid the following build error:

ninja: error: 'external/mesa/util/format/u_format.csv',
needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/gfx10_format_table.h',
missing and no known rule to make it

Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
4 years ago mesa/st: Reuse st_choose_matching_format from st_choose_format().
Eric Anholt [Tue, 17 Sep 2019 19:39:23 +0000 (12:39 -0700)]
mesa/st: Reuse st_choose_matching_format from st_choose_format().

We had this ad-hoc exact size matching for unsized internalformats,
but st_choose_matching_format() can do exactly what we want.  This
means that, for example, we'll now prefer the matching ordering for
565/565_REV if the driver supports both orders.  We also pass
Unpack.SwapBytes through from ChooseTextureFormat so that we can hit
the memcpy path for 8888 formats when that flag is set.

Some interesting format choice changes from this (on softpipe):
intf/form/type        before            after
----------------------------------------------------
RGBA/RGBA/USHORT:     R8G8B8A8_UNORM -> RGBA_UNORM16
RGB/RGBA/8888:        X8B8G8R8_UNORM -> R8G8B8X8_UNORM
RGB/ABGR/8888_REV:    X8B8G8R8_UNORM -> R8G8B8X8_UNORM
RGBA/RGBA/5551:       B5G5R5A1_UNORM -> A1B5G5R5_UNORM
RGBA/RGBA/4444:       R8G8B8A8_UNORM -> A4B4G4R4_UNORM
RGBA/GL_RGBA/1010102: R8G8B8A8_UNORM -> A2B10G10R10_UNORM
DEPTH/DEPTH/UINT:     Z24X8          -> Z_UNORM32
DEPTH/DEPTH/USHORT:   Z24X8          -> Z_UNORM16

v2: Make sure that the baseformat still matches.  v1 would pick
    MESA_FORMAT_L16_UNORM for RED/LUMINANCE/SHORT, when we clearly
    want a red format.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years ago mesa: Don't put sRGB formats in the array format table.
Eric Anholt [Fri, 15 Nov 2019 01:41:27 +0000 (17:41 -0800)]
mesa: Don't put sRGB formats in the array format table.

sRGB vs unorm was the only conflict case being guarded against in this
function.  Before the PIPE_FORMAT conversion, we always listed the
unorm before the sRGB in the enums, but PIPE_FORMAT_A8B8G8R8_SRGB
happens to be before _UNORM.  We always want the unorm result here.

Fixes: 807a800d8c3e ("mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
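
The rule described above can be sketched in a few lines of Python. This is an illustration only, not Mesa's C code: the `Format` tuple, the key value 0x1234 and `build_array_format_table` are hypothetical names. The point it shows is that skipping sRGB entries while building the lookup table makes the unorm format win, whichever of the two comes first in the enum.

```python
from typing import Dict, List, NamedTuple


class Format(NamedTuple):
    name: str        # illustrative format name
    array_key: int   # hypothetical stand-in for the packed array format
    is_srgb: bool


def build_array_format_table(formats: List[Format]) -> Dict[int, str]:
    """Map each array-format key to a format name, never to an sRGB format."""
    table: Dict[int, str] = {}
    for fmt in formats:
        if fmt.is_srgb:
            # sRGB and unorm share a key; the unorm entry is always wanted.
            continue
        # Keep the first non-sRGB format seen for a key.
        table.setdefault(fmt.array_key, fmt.name)
    return table


# Enum order with the sRGB variant listed first, as the commit describes
# for PIPE_FORMAT_A8B8G8R8_SRGB.
FORMATS = [
    Format("A8B8G8R8_SRGB", 0x1234, True),
    Format("A8B8G8R8_UNORM", 0x1234, False),
]
TABLE = build_array_format_table(FORMATS)
```

With this filter the result no longer depends on enum ordering, which is the property the commit relies on.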
4 years ago mesa/st: Simplify st_choose_matching_format().
Eric Anholt [Tue, 17 Sep 2019 22:13:30 +0000 (15:13 -0700)]
mesa/st: Simplify st_choose_matching_format().

We now have a nice helper function for finding those memcpy formats,
without needing to go through each entry of the mesa format table to
see if it happens to match.

While looking at sysprof of a softpipe GLES2 CTS run, we were spending
~8% of the CPU on ChooseTextureFormat.  With this, roughly the same
region of the testsuite was 0.4%.

v2: Add Ken's fix for canonicalizing array formats.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years ago mesa: Handle GL_COLOR_INDEX in _mesa_format_from_format_and_type().
Kenneth Graunke [Wed, 13 Nov 2019 07:12:54 +0000 (23:12 -0800)]
mesa: Handle GL_COLOR_INDEX in _mesa_format_from_format_and_type().

Just return MESA_FORMAT_NONE to avoid triggering unreachable; there's
really no sensible thing to return for this case anyway.

This prevents regressions in the next commit, which makes st/mesa
start using this function to find a reasonable format from GL format
and type enums.

Reviewed-by: Eric Anholt <eric@anholt.net>
4 years ago pan/midgard: Use generic constant packing for 8/64-bit
Alyssa Rosenzweig [Tue, 5 Nov 2019 14:06:41 +0000 (09:06 -0500)]
pan/midgard: Use generic constant packing for 8/64-bit

Eventually, we will want to combine constants across types, but for now
let's not break the world.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago pan/midgard: Pack 64-bit swizzles
Alyssa Rosenzweig [Tue, 5 Nov 2019 13:50:29 +0000 (08:50 -0500)]
pan/midgard: Pack 64-bit swizzles

64-bit ops have their own funky swizzles. Let's pack them, both for
native 64-bit sources as well as extended 32-bit sources.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago pan/midgard: Fix mir_round_bytemask_down for !32b
Alyssa Rosenzweig [Tue, 5 Nov 2019 03:21:47 +0000 (22:21 -0500)]
pan/midgard: Fix mir_round_bytemask_down for !32b

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago pan/midgard: Implement i2i64 and u2u64
Alyssa Rosenzweig [Tue, 5 Nov 2019 03:21:20 +0000 (22:21 -0500)]
pan/midgard: Implement i2i64 and u2u64

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago pan/midgard: Expand 64-bit writemasks
Alyssa Rosenzweig [Tue, 5 Nov 2019 03:20:59 +0000 (22:20 -0500)]
pan/midgard: Expand 64-bit writemasks

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago radeonsi/nir: don't lower fma, instead, fuse fma
Marek Olšák [Wed, 13 Nov 2019 05:21:54 +0000 (00:21 -0500)]
radeonsi/nir: don't lower fma, instead, fuse fma

We want fma. This decreases compile times by 4% for Borderlands 2.

48505 shaders in 30515 tests
Totals:
SGPRS: 2206584 -> 2204784 (-0.08 %)
VGPRS: 1647892 -> 1648964 (0.07 %)
Spilled SGPRs: 6256 -> 6078 (-2.85 %)
Spilled VGPRs: 72 -> 72 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2240 -> 2240 (0.00 %) dwords per thread
Code Size: 49680804 -> 49837988 (0.32 %) bytes
LDS: 74 -> 74 (0.00 %) blocks
Max Waves: 371387 -> 371352 (-0.01 %)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years ago radeonsi/nir: call nir_lower_flrp only once per shader
Marek Olšák [Wed, 13 Nov 2019 04:41:23 +0000 (23:41 -0500)]
radeonsi/nir: call nir_lower_flrp only once per shader

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years ago radeonsi/nir: remove dead function temps
Marek Olšák [Sat, 9 Nov 2019 01:16:20 +0000 (20:16 -0500)]
radeonsi/nir: remove dead function temps

glxgears has dead temps after lowering color inputs to load intrinsics.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years ago gallium/noop: call finalize_nir
Marek Olšák [Thu, 14 Nov 2019 02:20:55 +0000 (21:20 -0500)]
gallium/noop: call finalize_nir

For measuring st/mesa compile time.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years ago panfrost: Make sure the shader descriptor is in sync with the GL state
Tomeu Vizoso [Tue, 12 Nov 2019 12:48:54 +0000 (13:48 +0100)]
panfrost: Make sure the shader descriptor is in sync with the GL state

State was leaking from previous frames as we weren't updating the
descriptor in all cases.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
4 years ago pan/midgard: Prioritize texture registers
Alyssa Rosenzweig [Wed, 13 Nov 2019 14:00:37 +0000 (09:00 -0500)]
pan/midgard: Prioritize texture registers

On newer GPUs, this is a no-op. On older GPUs, this prevents needless
spilling since texture registers are shared with a subset of work
registers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
4 years ago pan/midgard: Disassemble with old pipeline always on T720
Alyssa Rosenzweig [Wed, 13 Nov 2019 13:48:12 +0000 (08:48 -0500)]
pan/midgard: Disassemble with old pipeline always on T720

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
4 years ago pan/midgard: Use texture, not textureLod, on early Midgard
Alyssa Rosenzweig [Mon, 11 Nov 2019 13:19:18 +0000 (08:19 -0500)]
pan/midgard: Use texture, not textureLod, on early Midgard

We have to disable the fixup.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
4 years ago pan/midgard: Fix vertex texturing on early Midgard
Alyssa Rosenzweig [Mon, 11 Nov 2019 13:15:46 +0000 (08:15 -0500)]
pan/midgard: Fix vertex texturing on early Midgard

We use a different set of texture registers, probably to save hardware.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
4 years ago pan/midgard: Generalize texture registers across GPUs
Alyssa Rosenzweig [Mon, 11 Nov 2019 13:13:46 +0000 (08:13 -0500)]
pan/midgard: Generalize texture registers across GPUs

Early Midgard uses a different set of texture registers; let's not
hardcode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
4 years ago aco: implement VK_KHR_shader_float_controls
Rhys Perry [Sat, 9 Nov 2019 20:51:45 +0000 (20:51 +0000)]
aco: implement VK_KHR_shader_float_controls

This actually supports more of the extension than the LLVM backend but we
can't enable it because ACO doesn't work with all stages yet.

With more of it enabled, some CTS tests fail because our 64-bit sqrt
is very imprecise. I can't find any precision requirements for it
anywhere, so I'm thinking it might be a CTS issue.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years ago aco: fix 64-bit fsign with 0
Rhys Perry [Mon, 11 Nov 2019 14:19:51 +0000 (14:19 +0000)]
aco: fix 64-bit fsign with 0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years ago aco: don't combine literals into v_cndmask_b32/v_subb/v_addc
Rhys Perry [Mon, 11 Nov 2019 14:15:04 +0000 (14:15 +0000)]
aco: don't combine literals into v_cndmask_b32/v_subb/v_addc

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years ago radv: enable FP16/FP64 denormals earlier and only for LLVM
Rhys Perry [Mon, 11 Nov 2019 13:41:32 +0000 (13:41 +0000)]
radv: enable FP16/FP64 denormals earlier and only for LLVM

ACO sets this itself and will have to set it differently in the future to
support shaderDenormFlushToZeroFloat64.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years ago gitlab-ci: Organize images using new REPO_SUFFIX templates feature
Michel Dänzer [Mon, 11 Nov 2019 17:13:28 +0000 (18:13 +0100)]
gitlab-ci: Organize images using new REPO_SUFFIX templates feature

Two benefits:

Most docker image related environment variables can now be defined in
the jobs where they're used instead of globally. The DEBIAN_TAG values
are propagated to other jobs via YAML anchors.

Images on https://gitlab.freedesktop.org/mesa/mesa/container_registry
are now organized in separate repositories with a suffix matching the
name of the job which makes sure the image is there.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
4 years ago gitlab-ci: Rename container install scripts to match job names (better)
Michel Dänzer [Thu, 7 Nov 2019 19:08:03 +0000 (20:08 +0100)]
gitlab-ci: Rename container install scripts to match job names (better)

Cleans up .gitlab-ci/ a little, and allows using a single DEBIAN_EXEC
line for all container jobs.

v2:
* Use lava_arm.sh instead of arm_lava.sh for consistency with v2 of the
  previous change

Reviewed-by: Eric Anholt <eric@anholt.net> # v1
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
4 years ago gitlab-ci: Use functional container job names
Michel Dänzer [Wed, 13 Nov 2019 16:43:41 +0000 (17:43 +0100)]
gitlab-ci: Use functional container job names

This makes it easier to tell which job is which in a pipeline.

v2:
* Use lava_arm{64,hf} instead of arm{64,hf}_lava to keep these jobs
  together in pipeline overviews

Reviewed-by: Eric Anholt <eric@anholt.net> # v1
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
4 years ago gitlab-ci: Document that ci-templates refs must be in sync
Michel Dänzer [Thu, 7 Nov 2019 19:25:10 +0000 (20:25 +0100)]
gitlab-ci: Document that ci-templates refs must be in sync

Otherwise there can be weird breakage.

(Removing the include from .gitlab-ci/lava-gitlab-ci.yml doesn't seem
possible unfortunately:
https://gitlab.freedesktop.org/daenzer/mesa/pipelines/79458)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
4 years ago panfrost: Multiply offset_units by 2
Tomeu Vizoso [Wed, 13 Nov 2019 07:42:34 +0000 (08:42 +0100)]
panfrost: Multiply offset_units by 2

Per the spec, the units passed to glPolygonOffset are to be multiplied
by an implementation-defined constant.

On Midgard, this constant seems to be 2.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years ago intel/perf: add EHL performance query support
Lionel Landwerlin [Wed, 30 Oct 2019 09:18:42 +0000 (11:18 +0200)]
intel/perf: add EHL performance query support

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
4 years ago intel/dev: flag the Elkhart Lake platform
Lionel Landwerlin [Wed, 30 Oct 2019 08:59:35 +0000 (10:59 +0200)]
intel/dev: flag the Elkhart Lake platform

We'll use this for performance metrics which are different from ICL.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
4 years ago gitlab-ci: update Piglit commit, update skips
Tapani Pälli [Fri, 15 Nov 2019 05:12:24 +0000 (07:12 +0200)]
gitlab-ci: update Piglit commit, update skips

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
4 years ago mesa: allow bit queries for EXT_disjoint_timer_query
Tapani Pälli [Tue, 12 Nov 2019 11:43:21 +0000 (13:43 +0200)]
mesa: allow bit queries for EXT_disjoint_timer_query

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years ago radv: make sure to not clear the ds attachment after resolves
Samuel Pitoiset [Wed, 6 Nov 2019 12:55:08 +0000 (13:55 +0100)]
radv: make sure to not clear the ds attachment after resolves

This avoids overwriting the resolve if there are pending clear aspects,
the same as for color resolves.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years ago radv: remove useless RADV_DEBUG=unsafemath debug option
Samuel Pitoiset [Fri, 8 Nov 2019 07:22:15 +0000 (08:22 +0100)]
radv: remove useless RADV_DEBUG=unsafemath debug option

This option is useless and shouldn't be used at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years ago llvmpipe: Check thread creation errors
Nathan Kidd [Fri, 15 Nov 2019 01:35:11 +0000 (02:35 +0100)]
llvmpipe: Check thread creation errors

In the case of glibc, pthread_t is internally a pointer.  If
lp_rast_destroy() passes a 0-value pthread_t to pthread_join(), the
latter will SEGV dereferencing it.

pthread_create() can fail if either the user's ulimit -u or Linux
kernel's /proc/sys/kernel/threads-max is reached.

Choosing to continue, rather than fail, on the theory that it is better to
run with the one main thread than not run at all.

Keeping as many threads as we got, since lack of threads severely
degrades llvmpipe performance.

Signed-off-by: Nathan Kidd <nkidd@opentext.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
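
The policy described above (keep however many threads were created, continue with fewer or none, and never join a thread that was not created) can be sketched in Python. This is an analogy, not llvmpipe's pthread code: `spawn_workers`, `join_workers` and `limited_factory` are hypothetical helpers, and the failure is modelled with the RuntimeError that Python's `Thread.start()` raises when no new thread can be started.

```python
import threading
from typing import Callable, List


def spawn_workers(requested: int,
                  make_thread: Callable[[int], threading.Thread]
                  ) -> List[threading.Thread]:
    """Start up to `requested` workers; return only those that started."""
    started: List[threading.Thread] = []
    for i in range(requested):
        try:
            thread = make_thread(i)
            thread.start()
        except RuntimeError:
            # Thread limit reached: keep what we have, stop asking for more.
            break
        started.append(thread)
    return started


def join_workers(workers: List[threading.Thread]) -> None:
    """Join only threads that really exist, never a placeholder handle."""
    for thread in workers:
        thread.join()


def limited_factory(limit: int) -> Callable[[int], threading.Thread]:
    """Simulate a thread limit: creation fails from index `limit` onwards."""
    def make(i: int) -> threading.Thread:
        if i >= limit:
            raise RuntimeError("can't start new thread")
        return threading.Thread(target=lambda: None)
    return make
```

For example, `spawn_workers(4, limited_factory(2))` returns two running workers, and a limit of 0 returns an empty list, so the caller falls back to doing all work on the main thread.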
4 years ago llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders
Ben Crocker [Wed, 13 Nov 2019 20:27:24 +0000 (20:27 +0000)]
llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders

Large programs, e.g. gnome-shell and firefox, may tax the
addressability of the Medium code model once a (potentially unbounded)
number of dynamically generated JIT-compiled shader programs are
linked in and relocated.  Yet the default code model as of LLVM 8 is
Medium or even Small.

The cost of changing from Medium to Large is negligible:
- an additional 8-byte pointer stored immediately before the shader entrypoint;
- change an add-immediate (addis) instruction to a load (ld).

Testing with WebGL Conformance
(https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html)
yields clean runs with this change (and crashes without it).

Testing with glxgears shows no detectable performance difference.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/223
Co-authored by: Nemanja Ivanovic <nemanjai@ca.ibm.com>, Tom Stellard <tstellar@redhat.com>

CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
4 years ago iris: Wrap iris_fix_edge_flags in NIR_PASS
Kenneth Graunke [Thu, 14 Nov 2019 18:22:17 +0000 (10:22 -0800)]
iris: Wrap iris_fix_edge_flags in NIR_PASS

So nir_validate happens properly.  Unfortunately this means we have
to play the metadata song and dance, so walk over all impls and say
that we didn't hurt anything.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years ago iris: Properly move edgeflag_out from output list to global list
Kenneth Graunke [Thu, 14 Nov 2019 18:10:27 +0000 (10:10 -0800)]
iris: Properly move edgeflag_out from output list to global list

When demoting it from an output to a global, we need to actually move
it to the correct list.  While here, we also refactor so it's clear
we aren't mutating the list while iterating.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2106
Fixes: f9fd04aca15 ("nir: Fix non-determinism in lower_global_vars_to_local")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
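
The "don't mutate the list while iterating" refactor mentioned above follows a common pattern: find the entry first, then move it. A minimal Python sketch, using plain lists of names instead of NIR variable lists (`demote_edgeflag` is a hypothetical helper, not the iris function):

```python
from typing import List, Tuple


def demote_edgeflag(outputs: List[str], globals_: List[str],
                    name: str = "edgeflag_out") -> Tuple[List[str], List[str]]:
    """Move `name` from the output list to the global list.

    Matches are collected first and the lists are changed afterwards, so
    neither list is modified while it is being iterated.
    """
    found = [var for var in outputs if var == name]
    for var in found:
        outputs.remove(var)    # leave the output list...
        globals_.append(var)   # ...and actually join the global list
    return outputs, globals_
```

Both halves matter: removing the variable from the outputs without appending it to the globals would leave it on the wrong list, which is the kind of bug the commit describes.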
4 years ago mesa: Move compile of common Mesa core files to a static lib.
Eric Anholt [Tue, 12 Nov 2019 19:25:09 +0000 (11:25 -0800)]
mesa: Move compile of common Mesa core files to a static lib.

We were compiling them twice, costing extra build time.  Reduces my
ccache-hot clean build time by a second (24.3s to 23.3s, 3 runs each).

The windows args are a little strange -- it's not clear to me that
they're actually used for building these files, but keep them in place
just in case, since we don't have a good windows CI story yet.  We
should want them on both gallium and classic regardless: Only osmesa
could be built for windows in classic, and classic OSMesa's scons
build defines these flags too.

Closes: #2052
Acked-by: Dylan Baker <dylan@pnwbakers.com>
4 years ago Appveyor: Quickly fix meson build.
Prodea Alexandru-Liviu [Thu, 14 Nov 2019 21:45:23 +0000 (21:45 +0000)]
Appveyor: Quickly fix meson build.
As this required use of Python 3.8, mako module also had to be updated.

v2 - Unbind mako module version when using Meson.
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
4 years ago intel/fs: Do not lower large local arrays to scratch on gen7
Danylo Piliaiev [Tue, 12 Nov 2019 16:32:25 +0000 (18:32 +0200)]
intel/fs: Do not lower large local arrays to scratch on gen7

On gen7 and earlier the scratch space size is limited to 12kB.
By enabling this optimization we may easily exceed this limit
without having any fallback.

arb_compute_shader/linker/bug-93840.shader_test crashes with
this lowering on IVB due to exceeding scratch size limit.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2092
Fixes: 69244fc7
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years ago util: Move gallium's PIPE_FORMAT utils to /util/format/
Eric Anholt [Thu, 27 Jun 2019 22:05:31 +0000 (15:05 -0700)]
util: Move gallium's PIPE_FORMAT utils to /util/format/

To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium.  Since u_format used
util_copy_rect(), I moved that in there, too.

I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.

Closes: #1905
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago gitlab-ci: auto-cancel CI runs when a newer commit is pushed to the same branch
Eric Engestrom [Tue, 12 Nov 2019 23:42:21 +0000 (23:42 +0000)]
gitlab-ci: auto-cancel CI runs when a newer commit is pushed to the same branch

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
5 years ago aco: Optimize out trivial code from uniform bools.
Timur Kristóf [Tue, 5 Nov 2019 10:41:00 +0000 (11:41 +0100)]
aco: Optimize out trivial code from uniform bools.

This should remove most of the excess code size that was
introduced by making all booleans per-lane.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Treat all booleans as per-lane.
Timur Kristóf [Mon, 4 Nov 2019 18:28:08 +0000 (19:28 +0100)]
aco: Treat all booleans as per-lane.

Previously, instruction selection had two kinds of booleans:
1. divergent which was per-lane and stored in s2 (VCC size)
2. uniform which was stored in s1
Additionally, uniform booleans were made per-lane when they resulted
from operations which were supported only by the VALU.

To decide which type was used, we relied on the destination size,
which was not reliable due to the per-lane uniform bools, but it
mostly works on wave64.
However, in wave32 mode (where VCC is also s1) this approach
makes it impossible to keep track of which boolean is uniform and
which is divergent.

This commit makes all booleans per-lane.
The resulting excess code size will be taken care of by the optimizer.

v2 (by Daniel Schürmann):
- Better names for some functions
- Use s_andn2_b64 with exec for nir_op_inot
- Simplify code due to using s_and_b64 in bool_to_scalar_condition

v3 (by Timur Kristóf):
- Fix several subgroups regressions

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
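
To make the per-lane representation above concrete, here is a small Python model, assuming wave64 and treating a boolean as a 64-bit lane mask held in a Python int. The two scalar instructions are modelled by their documented semantics (bitwise result, SCC set when the result is non-zero); the Python function names mirror the commit text but are not ACO's C++ code.

```python
from typing import Tuple

WAVE64_MASK = (1 << 64) - 1  # one bit per lane, wave64 assumed


def s_and_b64(s0: int, s1: int) -> Tuple[int, int]:
    """Model of s_and_b64: result = s0 & s1, SCC = (result != 0)."""
    result = s0 & s1 & WAVE64_MASK
    return result, int(result != 0)


def s_andn2_b64(s0: int, s1: int) -> Tuple[int, int]:
    """Model of s_andn2_b64: result = s0 & ~s1, SCC = (result != 0)."""
    result = s0 & ~s1 & WAVE64_MASK
    return result, int(result != 0)


def bool_to_scalar_condition(lane_bool: int, exec_mask: int) -> int:
    """Reduce a uniform per-lane boolean to one bit, ignoring inactive lanes."""
    _, scc = s_and_b64(lane_bool, exec_mask)
    return scc


def inot(lane_bool: int, exec_mask: int) -> int:
    """Per-lane NOT, restricted to the active lanes in exec."""
    result, _ = s_andn2_b64(exec_mask, lane_bool)
    return result
```

Masking with exec is what keeps stale bits in inactive lanes from leaking into the scalar condition: with exec = 0b1111, a mask of 0b110000 reduces to 0.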
5 years ago aco: use s_and_b64 exec to reduce uniform booleans to one bit
Daniel Schürmann [Tue, 12 Nov 2019 10:40:28 +0000 (11:40 +0100)]
aco: use s_and_b64 exec to reduce uniform booleans to one bit

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
5 years ago aco: Make sure not to mistakenly propagate 64-bit constants.
Timur Kristóf [Wed, 6 Nov 2019 16:42:32 +0000 (17:42 +0100)]
aco: Make sure not to mistakenly propagate 64-bit constants.

ACO's optimizer would try to propagate 64-bit constants, but
does so in such a way that wouldn't work due to how the 64-bit
constants are handled in the IR.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: value number instructions using the execution mask
Daniel Schürmann [Mon, 11 Nov 2019 10:41:31 +0000 (11:41 +0100)]
aco: value number instructions using the execution mask

This patch tries to give instructions with the same execution
mask the same pass_flags as well, and enables VN for SALU instructions
that use exec as an Operand.
This patch also adds back VN for VOPC instructions and removes VN for phis.

v2 (by Timur Kristóf):
- Fix some regressions.
v3 (by Daniel Schürmann):
- Fix additional issues

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
5 years ago aco: check if SALU instructions are preceded by exec when calculating WQM needs
Daniel Schürmann [Mon, 11 Nov 2019 15:21:51 +0000 (16:21 +0100)]
aco: check if SALU instructions are preceded by exec when calculating WQM needs

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
5 years ago ac: fix build with recent LLVM
Samuel Pitoiset [Thu, 14 Nov 2019 09:04:29 +0000 (10:04 +0100)]
ac: fix build with recent LLVM

Build is broken since "Move CodeGenFileType enum to Support/CodeGen.h".

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago Revert "mesa: allow bit queries for EXT_disjoint_timer_query"
Tapani Pälli [Thu, 14 Nov 2019 12:50:30 +0000 (14:50 +0200)]
Revert "mesa: allow bit queries for EXT_disjoint_timer_query"

This reverts commit 66d24a9ef705b8f9f15dab8059b63781f9fb28ca.

This commit made Mesa CI red because it depends on a Piglit test
change.

5 years ago nir: Fix non-determinism in lower_global_vars_to_local
Connor Abbott [Tue, 22 Oct 2019 15:50:07 +0000 (17:50 +0200)]
nir: Fix non-determinism in lower_global_vars_to_local

Using a hash-table walk means that variables will get inserted in
different orders on different runs. Just walk the list of globals
instead, even if some of them can't be turned into locals.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
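
The fix described above swaps a hash-table walk for a walk over the ordered list of globals. A minimal Python sketch of the same idea, with variable names as strings and a set standing in for the hash table (`lowerable_in_list_order` is a hypothetical helper, not the NIR pass):

```python
from typing import List, Set


def lowerable_in_list_order(global_vars: List[str],
                            candidates: Set[str]) -> List[str]:
    """Return the candidate variables in the order of the globals list.

    Iterating `candidates` directly would follow hash order, which may
    differ between runs; walking the list keeps the result stable, even
    though some list entries are not candidates and are simply skipped.
    """
    return [var for var in global_vars if var in candidates]
```

The set is still used for the membership test, but it no longer decides the order in which variables are processed.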
5 years ago mesa/st: make sure we remove dead IO variables before handing NIR to backends
Iago Toral Quiroga [Wed, 13 Nov 2019 08:19:22 +0000 (09:19 +0100)]
mesa/st: make sure we remove dead IO variables before handing NIR to backends

Commit "1c2bf82d24a glsl: disable lower_fragdata_array() for NIR drivers"
disabled the GLSL IR lowering that turned gl_FragData from an array into a
collection of scalar outputs under the assumption that this was already being
handled properly elsewhere, however there are some corner cases where NIR
would fail to do this, leaving gl_FragData[] as an array variable. This can
break backends that assume that all their outputs will be scalar and use the
variable definitions from the shader to do their output setup, such as the
case of V3D.

At least one corner case was found in some Portal shaders from shader-db, where
NIR would optimize out the full body of a fragment shader. In this scenario,
the empty shader would keep the original array definition of gl_FragData[],
causing the backend to assert.

We need to do this late enough for it to be effective, since doing it in
st_nir_preprocess does not fix the original problem.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2091
Fixes: 1c2bf82d ("glsl: disable lower_fragdata_array() for NIR drivers")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: allow bit queries for EXT_disjoint_timer_query
Tapani Pälli [Tue, 12 Nov 2019 11:43:21 +0000 (13:43 +0200)]
mesa: allow bit queries for EXT_disjoint_timer_query

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago Revert "dri_interface: add interface for EGL_EXT_image_flush_external"
Tapani Pälli [Tue, 12 Nov 2019 15:33:08 +0000 (17:33 +0200)]
Revert "dri_interface: add interface for EGL_EXT_image_flush_external"

This reverts commit 7520478461d8ab1cda415ff689d6b9058213ff43.

This series caused unexpected flickering artifacts with the Iris driver on
Chrome OS, and the EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago Revert "st/dri: assume external consumers of back buffers can write to the buffers"
Tapani Pälli [Tue, 12 Nov 2019 15:32:56 +0000 (17:32 +0200)]
Revert "st/dri: assume external consumers of back buffers can write to the buffers"

This reverts commit 1d1b4578211dcc69cfab8879d0cdafaba1eec948.

This series caused unexpected flickering artifacts with the Iris driver on
Chrome OS, and the EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago Revert "st/dri: add support for EGL_EXT_image_flush_external"
Tapani Pälli [Tue, 12 Nov 2019 15:32:49 +0000 (17:32 +0200)]
Revert "st/dri: add support for EGL_EXT_image_flush_external"

This reverts commit 1d122c104a7a3d9348ab347e1e843b7e2bf3b498.

This series caused unexpected flickering artifacts with the Iris driver on
Chrome OS, and the EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago Revert "egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT"
Tapani Pälli [Tue, 12 Nov 2019 15:32:41 +0000 (17:32 +0200)]
Revert "egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT"

This reverts commit 34b1aa957a3f44ea9587ec43311e8434d3782cc1.

This series caused unexpected flickering artifacts with the Iris driver on
Chrome OS, and the EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago Revert "egl: implement new functions from EGL_EXT_image_flush_external"
Tapani Pälli [Tue, 12 Nov 2019 15:32:33 +0000 (17:32 +0200)]
Revert "egl: implement new functions from EGL_EXT_image_flush_external"

This reverts commit c1c574fdf18f2aeb1c03f9670bf00e1dcd22d99d.

This series caused unexpected flickering artifacts with the Iris driver on
Chrome OS, and the EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago pan/midgard: Fix copypropagation for textures
Alyssa Rosenzweig [Fri, 8 Nov 2019 18:11:25 +0000 (13:11 -0500)]
pan/midgard: Fix copypropagation for textures

total instructions in shared programs: 3562 -> 3457 (-2.95%)
instructions in affected programs: 575 -> 470 (-18.26%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 6.56 x̃: 10
helped stats (rel) min: 5.71% max: 24.56% x̄: 16.83% x̃: 18.87%
95% mean confidence interval for instructions value: -9.07 -4.06
95% mean confidence interval for instructions %-change: -19.00% -14.66%
Instructions are helped.
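
The percentages in these reports are plain relative changes. A small helper
that reproduces the two instruction-count lines above; percent_change() is a
name invented for this sketch and is not part of shader-db.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double
percent_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

With the counts quoted above, percent_change(3562, 3457) is about -2.95 and
percent_change(575, 470) is about -18.26.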

total bundles in shared programs: 1846 -> 1830 (-0.87%)
bundles in affected programs: 338 -> 322 (-4.73%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.50% max: 20.00% x̄: 8.85% x̃: 3.33%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -13.02% -4.67%
Bundles are helped.

total quadwords in shared programs: 3191 -> 3144 (-1.47%)
quadwords in affected programs: 606 -> 559 (-7.76%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 2.94 x̃: 3
helped stats (rel) min: 5.17% max: 22.22% x̄: 11.20% x̃: 5.62%
95% mean confidence interval for quadwords value: -4.58 -1.29
95% mean confidence interval for quadwords %-change: -15.16% -7.24%
Quadwords are helped.

total registers in shared programs: 312 -> 303 (-2.88%)
registers in affected programs: 27 -> 18 (-33.33%)
helped: 9
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%
95% mean confidence interval for registers value: -1.00 -1.00
95% mean confidence interval for registers %-change: -33.33% -33.33%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Copypropagate vector creation
Alyssa Rosenzweig [Fri, 8 Nov 2019 18:27:39 +0000 (13:27 -0500)]
pan/midgard: Copypropagate vector creation

total instructions in shared programs: 3457 -> 3431 (-0.75%)
instructions in affected programs: 787 -> 761 (-3.30%)
helped: 14
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 1.86 x̃: 1
helped stats (rel) min: 1.01% max: 11.11% x̄: 9.22% x̃: 11.11%
95% mean confidence interval for instructions value: -3.55 -0.16
95% mean confidence interval for instructions %-change: -11.41% -7.03%
Instructions are helped.

total bundles in shared programs: 1830 -> 1826 (-0.22%)
bundles in affected programs: 279 -> 275 (-1.43%)
helped: 2
HURT: 0

total quadwords in shared programs: 3144 -> 3121 (-0.73%)
quadwords in affected programs: 645 -> 622 (-3.57%)
helped: 13
HURT: 0
helped stats (abs) min: 1 max: 11 x̄: 1.77 x̃: 1
helped stats (rel) min: 2.09% max: 16.67% x̄: 12.61% x̃: 14.29%
95% mean confidence interval for quadwords value: -3.45 -0.09
95% mean confidence interval for quadwords %-change: -15.43% -9.79%
Quadwords are helped.

total registers in shared programs: 303 -> 301 (-0.66%)
registers in affected programs: 14 -> 12 (-14.29%)
helped: 2
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/lcra: Use Chaitin's spilling heuristic
Alyssa Rosenzweig [Wed, 13 Nov 2019 20:57:18 +0000 (15:57 -0500)]
pan/lcra: Use Chaitin's spilling heuristic

Not much of a difference but slightly better and slightly less
arbitrary.
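
For reference, Chaitin's heuristic picks the spill candidate with the lowest
ratio of spill cost to interference degree. A minimal sketch under that
definition follows; struct ra_node and choose_spill_node() are hypothetical
and do not mirror the actual LCRA data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node data; not the actual LCRA structures. */
struct ra_node {
    float spill_cost;   /* estimated cost of spilling this node */
    unsigned degree;    /* number of interfering neighbours */
    int spillable;      /* nonzero if the node may be spilled */
};

/* Chaitin's heuristic: spill the node with the lowest cost/degree ratio.
 * Returns the index of the chosen node, or -1 if none qualifies. */
static int
choose_spill_node(const struct ra_node *nodes, size_t count)
{
    int best = -1;
    float best_ratio = 0.0f;

    assert(nodes != NULL || count == 0);

    for (size_t i = 0; i < count; ++i) {
        /* Nodes without neighbours gain nothing from being spilled. */
        if (!nodes[i].spillable || nodes[i].degree == 0)
            continue;

        float ratio = nodes[i].spill_cost / (float) nodes[i].degree;

        if (best < 0 || ratio < best_ratio) {
            best = (int) i;
            best_ratio = ratio;
        }
    }

    return best;
}
```

Dividing by the degree favours nodes whose removal relieves the most
interference per unit of spill cost.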

total instructions in shared programs: 3560 -> 3559 (-0.03%)
instructions in affected programs: 44 -> 43 (-2.27%)
helped: 1
HURT: 0

total bundles in shared programs: 1844 -> 1843 (-0.05%)
bundles in affected programs: 23 -> 22 (-4.35%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Compute spill costs
Alyssa Rosenzweig [Sat, 26 Oct 2019 14:08:18 +0000 (10:08 -0400)]
pan/midgard: Compute spill costs

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago intel/compiler: fix nir_op_{i,u}*32 on ICL
Paulo Zanoni [Tue, 12 Nov 2019 00:49:15 +0000 (16:49 -0800)]
intel/compiler: fix nir_op_{i,u}*32 on ICL

On ICL we have the src1 restriction which is applied through
fix_byte_src() and potentially changes the type of the operands from 8
to 32 bits. When this change happens, we fall into the "else if
(bit_size < 32)" case and miscompute src_type because it takes into
consideration bit_size (8) instead of the adjusted size of temp_op
(32). This results in the shader reading unused memory, giving us
mostly failures, but occasional passes due to whatever was already in
the registers we were reading.
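
A toy model of the bug, assuming the widening behaves as described above. The
struct operand type and the three helpers are invented for illustration; the
real code operates on backend registers and register types.

```c
#include <assert.h>

/* Hypothetical stand-in for a backend operand. */
struct operand {
    unsigned type_bits;   /* size of the register type in use */
};

/* Mimics the ICL src1 restriction: byte operands are widened to 32 bits. */
static struct operand
widen_byte_operand(struct operand op)
{
    if (op.type_bits == 8)
        op.type_bits = 32;
    return op;
}

/* Buggy: sizes the source type from the original NIR bit size. */
static unsigned
src_type_bits_buggy(unsigned nir_bit_size, struct operand temp_op)
{
    (void) temp_op;
    return nir_bit_size;
}

/* Fixed: sizes the source type from the operand that is actually read. */
static unsigned
src_type_bits_fixed(struct operand temp_op)
{
    assert(temp_op.type_bits != 0);
    return temp_op.type_bits;
}
```

After widening an 8-bit operand, the buggy variant still reports 8 bits while
the fixed variant reports 32, matching the adjusted operand.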

This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as:
    dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2

This can also be verified by simply changing fix_byte_src() to apply
on all platforms.

Fixes: 5847de6e9afe ("intel/compiler: don't use byte operands for src1 on ICL")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
5 years ago spirv: Consider the sampled_image case in wa_glslang_179 workaround
Caio Marcelo de Oliveira Filho [Wed, 13 Nov 2019 19:04:39 +0000 (11:04 -0800)]
spirv: Consider the sampled_image case in wa_glslang_179 workaround

Fixes: 9e440b8d0b9 ("spirv: Sort out the mess that is sampled image")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years ago docs: update calendar, add news item and link release notes for 19.2.4
Dylan Baker [Wed, 13 Nov 2019 19:12:53 +0000 (11:12 -0800)]
docs: update calendar, add news item and link release notes for 19.2.4

5 years ago docs: Add SHA256 sum for 19.2.4
Dylan Baker [Wed, 13 Nov 2019 19:09:32 +0000 (11:09 -0800)]
docs: Add SHA256 sum for 19.2.4

5 years ago docs: Add release notes for 19.2.4
Dylan Baker [Wed, 13 Nov 2019 18:38:40 +0000 (10:38 -0800)]
docs: Add release notes for 19.2.4

5 years ago ci: Expand the freedreno blit skip regex to cover more cases.
Eric Anholt [Wed, 13 Nov 2019 17:40:27 +0000 (09:40 -0800)]
ci: Expand the freedreno blit skip regex to cover more cases.

We've had flaps on at least:
- r16f_to_r16f
- r16i_to_rg16i

Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years ago anv: Initialize depth_bounds_test_enable when not explicitly set
Caio Marcelo de Oliveira Filho [Tue, 12 Nov 2019 18:42:09 +0000 (10:42 -0800)]
anv: Initialize depth_bounds_test_enable when not explicitly set

This was causing an uninitialized value to be propagated to the
3DSTATE_DEPTH_BOUNDS packet, leading to asserts during packet
building because the value was greater than 1.

Fixes: 939ddccb7a5 ("anv: Add support for depth bounds testing.")
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
5 years ago pan/midgard: Remove util/ra support
Alyssa Rosenzweig [Fri, 1 Nov 2019 02:25:05 +0000 (22:25 -0400)]
pan/midgard: Remove util/ra support

It's now unused, in favour of LCRA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Integrate LCRA
Alyssa Rosenzweig [Fri, 1 Nov 2019 20:46:38 +0000 (16:46 -0400)]
pan/midgard: Integrate LCRA

Pretty routine. We do have a hack to force swizzle alignment for !32-bit
until we implement !32-bit the right way.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Implement linearly-constrained register allocation
Alyssa Rosenzweig [Sat, 19 Oct 2019 23:43:47 +0000 (19:43 -0400)]
pan/midgard: Implement linearly-constrained register allocation

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Add blend shader selection bits for MRT
Alyssa Rosenzweig [Tue, 12 Nov 2019 19:19:52 +0000 (14:19 -0500)]
pan/midgard: Add blend shader selection bits for MRT

This is less complicated than previously thought. Note we have no way of
specifying the work register count for blend shaders; it must be
strictly less than the work register count of the corresponding fragment
shader (which is fine since we force the fragment shader to report a
count of 16 with a blend shader as a major hack until we get register
pressure down for blend shaders).

TODO: pandecode the flags.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago drm-shim: fix EOF case
Christian Gmeiner [Tue, 12 Nov 2019 11:46:20 +0000 (12:46 +0100)]
drm-shim: fix EOF case

Close the input end of the pipe after the data has been written. Without this
fix I have seen a hang in sysfs_uevent_get(.., "OF_FULLNAME")
when the key was not found.
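
The underlying POSIX behaviour can be shown in isolation: read() on a pipe
only returns 0 (EOF) once every write end is closed. The sketch below is a
standalone demonstration, not the drm-shim code; pipe_roundtrip() is a
hypothetical helper and assumes the data fits in the pipe buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `data` into a pipe, close the write end, then drain the read end
 * into `out`. Returns the number of bytes read back, or -1 on error.
 * Assumes `data` is small enough to fit in the pipe buffer. */
static ssize_t
pipe_roundtrip(const char *data, char *out, size_t out_size)
{
    int fds[2];
    size_t len = strlen(data);
    size_t total = 0;

    assert(out != NULL);

    if (pipe(fds) != 0)
        return -1;

    if (write(fds[1], data, len) != (ssize_t) len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Without this close, the final read() below would block forever
     * instead of reporting EOF. */
    close(fds[1]);

    while (total < out_size) {
        ssize_t n = read(fds[0], out + total, out_size - total);

        if (n < 0) {
            close(fds[0]);
            return -1;
        }
        if (n == 0)   /* EOF: all write ends are closed */
            break;

        total += (size_t) n;
    }

    close(fds[0]);
    return (ssize_t) total;
}
```

A reader that scans for a key and never finds it keeps reading until EOF,
which is why a write end left open shows up as a hang.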

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago util/android: fix android build errors
Tapani Pälli [Tue, 12 Nov 2019 11:45:54 +0000 (13:45 +0200)]
util/android: fix android build errors

Fixes: 9020f519 ("util/u_endian: Add error checks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2078
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago gitlab-ci: build RADV on ARM64
Samuel Pitoiset [Tue, 12 Nov 2019 13:25:16 +0000 (14:25 +0100)]
gitlab-ci: build RADV on ARM64

The ARMHF LLVM package is LLVM 7 but RADV requires LLVM 8.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years ago gitlab-ci: build a specific libdrm version for ARM64
Samuel Pitoiset [Tue, 12 Nov 2019 13:56:35 +0000 (14:56 +0100)]
gitlab-ci: build a specific libdrm version for ARM64

RADV requires libdrm-2.4.100 but the distrib package is too old.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years ago zink: move drawing to separate source
Erik Faye-Lund [Wed, 6 Nov 2019 14:46:16 +0000 (15:46 +0100)]
zink: move drawing to separate source

This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree.

5 years ago zink: move blitting to separate source
Erik Faye-Lund [Wed, 6 Nov 2019 14:36:43 +0000 (15:36 +0100)]
zink: move blitting to separate source

This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree.

5 years ago zink: move filter-helper to separate helper-header
Erik Faye-Lund [Wed, 6 Nov 2019 14:36:43 +0000 (15:36 +0100)]
zink: move filter-helper to separate helper-header

This will help code-reuse a bit in the next commit.

5 years ago zink: move format-checking to separate source
Erik Faye-Lund [Wed, 6 Nov 2019 14:26:12 +0000 (15:26 +0100)]
zink: move format-checking to separate source

This code is more or less stand-alone, and this keeps the formats array
a bit more encapsulated.

5 years ago ci: Disable flappy blit tests on a630.
Eric Anholt [Tue, 12 Nov 2019 22:01:58 +0000 (14:01 -0800)]
ci: Disable flappy blit tests on a630.

These have shown up with the new CTS runner, which has changed test
ordering.

Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years ago freedreno/ir3: remove unused parameter
Rob Clark [Tue, 12 Nov 2019 21:54:22 +0000 (13:54 -0800)]
freedreno/ir3: remove unused parameter

Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years ago freedreno/ir3: legalize cleanups
Rob Clark [Sun, 10 Nov 2019 18:49:59 +0000 (10:49 -0800)]
freedreno/ir3: legalize cleanups

We can clear the "needs" flags once we emit a flag.  And also, don't
open-code the opcode name.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago freedreno/ir3: fix gpu hang with pre-fs-tex-fetch
Rob Clark [Fri, 8 Nov 2019 20:55:27 +0000 (12:55 -0800)]
freedreno/ir3: fix gpu hang with pre-fs-tex-fetch

For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x,
even if it is not used in the shader (ie. only varying use is for tex
coords).  But if, for example, gl_FragCoord is used, it could get
assigned on top of bary_ij, resulting in a GPU hang.

The solution to this is two-fold: (1) the inputs/outputs rework has the
benefit of making RA realize bary_ij is a vec2, even if there are no
split/collect instructions (due to no varying fetches in the shader
itself).  And (2) extend the live ranges of meta:input instructions to
the first non-input, to prevent RA from assigning the same register to
multiple inputs.
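
Point (2) can be sketched with simple interval live ranges. This is an
illustration only: struct instr, struct range and both helpers are
hypothetical, and the sketch assumes all inputs sit at the start of the
instruction list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction record, not the ir3 representation. */
struct instr {
    bool is_input;   /* meta:input style instruction */
    int last_use;    /* index of the last reader, or -1 if unused */
};

struct range {
    int start, end;  /* inclusive instruction indices */
};

/* Compute one live range per instruction, extending every input's range
 * to the first non-input instruction so that all inputs interfere. */
static void
compute_ranges(const struct instr *instrs, size_t count, struct range *out)
{
    size_t first_non_input = 0;

    assert(instrs != NULL && out != NULL);

    while (first_non_input < count && instrs[first_non_input].is_input)
        first_non_input++;

    for (size_t i = 0; i < count; ++i) {
        out[i].start = (int) i;
        out[i].end = instrs[i].last_use > (int) i ? instrs[i].last_use
                                                  : (int) i;

        if (instrs[i].is_input && out[i].end < (int) first_non_input)
            out[i].end = (int) first_non_input;
    }
}

/* Two values may share a register only if their ranges do not overlap. */
static bool
ranges_overlap(struct range a, struct range b)
{
    return a.start <= b.end && b.start <= a.end;
}
```

With an unused bary_ij input at index 0 and a gl_FragCoord input at index 1,
the unextended ranges would not overlap; after the extension they do, so the
two can no longer be assigned the same register.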

Backport note: because of (1) above, a better solution for 19.3 would be
to revert f30c256ec05.

Fixes: f30c256ec05 ("freedreno/ir3: enable pre-fs texture fetch for a6xx")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago freedreno/ir3: only tex instructions have wrmask
Rob Clark [Fri, 1 Nov 2019 22:17:22 +0000 (15:17 -0700)]
freedreno/ir3: only tex instructions have wrmask

At the ir3 level, we would assume that we could use wrmask to mask
off other components of an instruction returning a vecN when they are
not used, which would let RA use the unwritten components for other live
values.  But this is only true for tex instructions.
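
In other words, a non-tex instruction has to be treated as writing every
component of its destination. A small sketch of that rule, using a
hypothetical effective_wrmask() helper with one bit per component:

```c
#include <assert.h>
#include <stdbool.h>

/* Components that must be considered written by an instruction returning
 * `ncomp` components. Only tex instructions honour a partial wrmask. */
static unsigned
effective_wrmask(bool is_tex, unsigned wrmask, unsigned ncomp)
{
    assert(ncomp >= 1 && ncomp <= 4);

    unsigned full = (1u << ncomp) - 1;

    return is_tex ? (wrmask & full) : full;
}
```

For a vec4 result with wrmask 0x5, a tex instruction leaves .y and .w free
for other values, while any other instruction occupies all four components.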

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago freedreno/ir3: re-work shader inputs/outputs
Rob Clark [Fri, 25 Oct 2019 22:37:56 +0000 (15:37 -0700)]
freedreno/ir3: re-work shader inputs/outputs

Allow inputs/outputs to be vecN (ie. whatever their actual size is), and
use split to get scalar components of inputs, and collect to gather up
scalar components of outputs.

The main motivation is to simplify RA, by only having to consider split/
collect to figure out where values need to land in consecutive scalar
registers, rather than having to also deal with left/right neighbors.

Because of varying packing, and the resulting fractional location
(location_frac), to implement load_input/store_output, it is still
convenient to have a table of scalar inputs/outputs.  We move this to
the compile ctx (since it is only needed for nir->ir3).
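
One way to picture such a scalar table is a flat array indexed by slot and
component. The layout below is an assumption made for illustration (four
scalars per slot, location_frac as the starting component);
scalar_index() is not an ir3 function.

```c
#include <assert.h>

#define COMPONENTS_PER_SLOT 4

/* Index of one scalar in a flat per-shader table, assuming each slot holds
 * four scalars and a packed varying starts at component `location_frac`. */
static unsigned
scalar_index(unsigned slot, unsigned location_frac, unsigned component)
{
    assert(location_frac + component < COMPONENTS_PER_SLOT);
    return slot * COMPONENTS_PER_SLOT + location_frac + component;
}
```

Under this layout, the second component of a vec2 packed at location_frac 2
of slot 1 lands at scalar_index(1, 2, 1), which is table entry 7.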

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago freedreno/ir3: simplify creating sysval inputs
Rob Clark [Fri, 25 Oct 2019 17:48:22 +0000 (10:48 -0700)]
freedreno/ir3: simplify creating sysval inputs

In almost all places, the add_sysval_input() is paired directly with a
create_input().  (The one exception is frag shader ij bary coord, and
this exception will go away in a later patch.)

So go ahead and clean this up before reworking input/output handling.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>