mesa.git
5 years agofreedreno/ir3: add imul24 opcode
Rob Clark [Tue, 8 Oct 2019 20:36:14 +0000 (13:36 -0700)]
freedreno/ir3: add imul24 opcode

This maps to mul.s24

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agofreedreno/ir3: optimize immed 2nd src to mad
Rob Clark [Mon, 30 Sep 2019 18:44:16 +0000 (11:44 -0700)]
freedreno/ir3: optimize immed 2nd src to mad

We can't encode immed sources for cat3 (mad) instructions, but we can
use const in first or third src.  We handled this case already, but we
weren't considering that we could lower immed to const.

For manhattan:

  total instructions in shared programs: 35202 -> 34718 (-1.37%)
  instructions in affected programs: 14931 -> 14447 (-3.24%)
  helped: 90
  HURT: 0
  total full in shared programs: 2451 -> 2359 (-3.75%)
  full in affected programs: 653 -> 561 (-14.09%)
  helped: 69
  HURT: 2

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: add rule to generate imad24
Rob Clark [Fri, 27 Sep 2019 18:36:43 +0000 (11:36 -0700)]
freedreno/ir3: add rule to generate imad24

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agonir: add nir_lower_amul pass
Rob Clark [Fri, 27 Sep 2019 17:15:02 +0000 (10:15 -0700)]
nir: add nir_lower_amul pass

Lower amul to either imul or imul24, depending on whether 24b is enough
bits to calculate an offset within the thing being dereferenced.
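
A minimal sketch of that decision, not the actual pass. It assumes the
lowering only needs the largest byte offset reachable within the
dereferenced object; `lower_amul`, `offset_fits_in_24b` and the
`mul_kind` enum are hypothetical names used for illustration only:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of the lowering decision. */
enum mul_kind { MUL_IMUL, MUL_IMUL24 };

/* Assumption: operands are treated as signed 24-bit values, so the
 * largest usable positive offset is 2^23 - 1. */
static bool offset_fits_in_24b(uint64_t max_offset_bytes)
{
    return max_offset_bytes <= 0x7fffffu;
}

/* Pick the cheaper 24b multiply only when every offset inside the
 * dereferenced object fits; otherwise fall back to a full 32b imul. */
static enum mul_kind lower_amul(uint64_t max_offset_bytes)
{
    return offset_fits_in_24b(max_offset_bytes) ? MUL_IMUL24 : MUL_IMUL;
}
```

If the largest offset fits, the index and the element size fit as well,
which is why a single bound is enough in this simplified model.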

Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years agonir: add address calc related opt rules
Rob Clark [Thu, 26 Sep 2019 17:34:51 +0000 (10:34 -0700)]
nir: add address calc related opt rules

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agonir: add amul instruction
Rob Clark [Thu, 26 Sep 2019 17:32:00 +0000 (10:32 -0700)]
nir: add amul instruction

Used for address/offset calculation (ie. array derefs), where we can
potentially use less than 32b for the multiply of array idx by element
size.  For backends that support `imul24`, this gives a lowering pass
an easy way to find multiplies that potentially can be converted to
`imul24`.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agonir: Add a new ALU nir_op_imul24
Rob Clark [Wed, 25 Sep 2019 17:10:39 +0000 (10:10 -0700)]
nir: Add a new ALU nir_op_imul24

Some hardware can do 24b multiply in a single instruction, but not 32b.
However in most cases 24b is sufficient for address/offset calculation.
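
A rough reference model of a 24b multiply, assuming both operands are
interpreted as sign-extended 24-bit values and the result is truncated
to 32 bits. `sext24` and `imul24_ref` are hypothetical helper names,
not part of NIR:

```c
#include <stdint.h>

/* Sign-extend the low 24 bits of x to a 32-bit signed value. */
static int32_t sext24(uint32_t x)
{
    x &= 0x00ffffffu;
    return (int32_t)((x ^ 0x00800000u) - 0x00800000u);
}

/* Multiply the low 24 bits of each operand; bits 24..31 of the
 * inputs are ignored.  The product is computed in 64 bits to avoid
 * signed overflow, then truncated to 32 bits. */
static int32_t imul24_ref(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)sext24((uint32_t)a) * (int64_t)sext24((uint32_t)b);
    return (int32_t)(uint32_t)prod;
}
```

For an array index and an element size that both fit in 24 bits, this
gives the same offset as a full 32b multiply.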

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agofreedreno/ir3: Handle newly added opcode nir_op_imad24_ir3
Eduardo Lima Mitev [Fri, 28 Jun 2019 07:43:03 +0000 (09:43 +0200)]
freedreno/ir3: Handle newly added opcode nir_op_imad24_ir3

Simply emit an ir3_MAD_S24 instruction in the backend.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agonir: Add a new ALU nir_op_imad24_ir3
Eduardo Lima Mitev [Fri, 28 Jun 2019 07:39:38 +0000 (09:39 +0200)]
nir: Add a new ALU nir_op_imad24_ir3

ir3 compiler has a signed integer multiply-add instruction (MAD_S24)
that is used for different offset calculations in the backend.
Since we intend to move some of these calculations to NIR, we need
a new ALU op that can directly represent it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agofreedreno/ir3: rename mul.s/mul.u
Rob Clark [Wed, 25 Sep 2019 17:21:24 +0000 (10:21 -0700)]
freedreno/ir3: rename mul.s/mul.u

to mul.s24/mul.u24, to better reflect that these are 24b multiply.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agonir/search: fix the PoT helpers
Rob Clark [Wed, 25 Sep 2019 18:59:49 +0000 (11:59 -0700)]
nir/search: fix the PoT helpers

Otherwise, if the base type is (for example) uint32, we would
incorrectly think that PoT optimizations could not apply.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
5 years agofreedreno/ir3: enable pre-fs texture fetch for a6xx
Rob Clark [Fri, 18 Oct 2019 18:30:48 +0000 (11:30 -0700)]
freedreno/ir3: enable pre-fs texture fetch for a6xx

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoturnip: add support for pre-fs texture fetch
Rob Clark [Fri, 18 Oct 2019 18:52:35 +0000 (11:52 -0700)]
turnip: add support for pre-fs texture fetch

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: add support for pre-fs texture fetch
Rob Clark [Fri, 11 Oct 2019 23:43:03 +0000 (16:43 -0700)]
freedreno/a6xx: add support for pre-fs texture fetch

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: Add support for texture sampling pre-dispatch
Hyunjun Ko [Mon, 5 Aug 2019 06:38:57 +0000 (08:38 +0200)]
freedreno/ir3: Add support for texture sampling pre-dispatch

Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch
Eduardo Lima Mitev [Mon, 5 Aug 2019 06:09:23 +0000 (08:09 +0200)]
freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch

The pass should run once at the end of shader compilation, for a4xx
onwards. It iterates over texture sampling instructions and marks those
eligible for pre-dispatch by changing the tex op from 'tex' to
'tex_prefetch'. An instruction is eligible if:

* The coordinate is a vector where all its components come from a
  shader input.
* The order of the components matches exactly that of the input (no
  swizzles).
* The instruction is in the 'main' function, and in the outermost
  block.

The first two restrictions were arrived at empirically, so more
testing could tighten or loosen them.

The 3rd restriction is there to allow moving the instructions
eligible for pre-dispatch to the beginning of the shader, so
that we don't block the registers holding the result for too
long.
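
The three restrictions above can be sketched as a predicate. This is
not the NIR pass itself: `struct tex_instr_info` is a hypothetical,
flattened description of a sampling instruction, invented here so the
check can be shown without the NIR data structures:

```c
#include <stdbool.h>

#define MAX_COORD_COMPS 4

/* Hypothetical summary of one texture sampling instruction. */
struct tex_instr_info {
    unsigned num_coord_comps;
    bool comp_from_input[MAX_COORD_COMPS]; /* coord[i] comes from a shader input */
    unsigned input_comp[MAX_COORD_COMPS];  /* input component feeding coord[i] */
    bool in_main;                          /* instruction is in 'main' */
    bool in_outermost_block;               /* not nested in control flow */
};

static bool eligible_for_prefetch(const struct tex_instr_info *t)
{
    /* Restriction 3: 'main' function, outermost block. */
    if (!t->in_main || !t->in_outermost_block)
        return false;
    if (t->num_coord_comps == 0 || t->num_coord_comps > MAX_COORD_COMPS)
        return false;
    for (unsigned i = 0; i < t->num_coord_comps; i++) {
        /* Restriction 1: every component comes from a shader input. */
        if (!t->comp_from_input[i])
            return false;
        /* Restriction 2: component order matches the input (no swizzle). */
        if (t->input_comp[i] != i)
            return false;
    }
    return true;
}
```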

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: force i/j pixel to r0.x
Rob Clark [Fri, 11 Oct 2019 02:36:30 +0000 (19:36 -0700)]
freedreno/ir3: force i/j pixel to r0.x

It seems that pre-fs texture fetch only works if ij_pix ends up in r0.x.
I've tried unknown zero bits, to no avail, and blob also seems to force
r0.x when this feature is used.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: add pre-dispatch tex fetch to disasm
Rob Clark [Thu, 10 Oct 2019 19:09:15 +0000 (12:09 -0700)]
freedreno/ir3: add pre-dispatch tex fetch to disasm

Useful to see in disassembly listing texture fetches that were moved to
pre-dispatch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: add dummy bary.f(ei) for pre-fs-fetch
Rob Clark [Wed, 9 Oct 2019 22:51:01 +0000 (15:51 -0700)]
freedreno/ir3: add dummy bary.f(ei) for pre-fs-fetch

If the only use of varyings is a pre-shader texture-fetch, we still need
to issue a bary.f with the end-input flag, otherwise we'll block further
VS invocations, as the hw will think varying storage is still busy.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: fixup register footprint to account for prefetch
Rob Clark [Fri, 11 Oct 2019 23:15:44 +0000 (16:15 -0700)]
freedreno/ir3: fixup register footprint to account for prefetch

It is possible that the result of a pre-fs texture fetch is an output
(or partially an output) of the FS.  Since the meta:tex_prefetch
instructions are dropped before the assembler, we need to account for
this when we fixup the register footprint.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: add meta instruction for pre-fs texture fetch
Rob Clark [Fri, 11 Oct 2019 22:57:22 +0000 (15:57 -0700)]
freedreno/ir3: add meta instruction for pre-fs texture fetch

Add a placeholder instruction to track texture fetches made prior to FS
shader dispatch.  These, like meta:input instructions are scheduled
before any real instructions, so that RA realizes their result values
are live before the first real instruction.  And to give legalize a way
to track usage of fetched sample requiring (sy) sync flags.

There is some related special handling for varying texcoord inputs used
for pre-fs-fetch, so that they are not DCE'd and remain in linkage
between FS and previous stage.  Note that we could almost avoid this
special handling by giving meta:tex_prefetch real src arguments, except
that in the FS stage, inputs are actual bary.f/ldlv instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: don't DCE ij_pix if used for pre-fs-texture-fetch
Rob Clark [Fri, 11 Oct 2019 18:50:22 +0000 (11:50 -0700)]
freedreno/ir3: don't DCE ij_pix if used for pre-fs-texture-fetch

When we enable pre-dispatch texture fetch, we could have a scenario
where the barycentric i/j coord sysval is not used in the shader, but
only used for the varying fetch for the pre-dispatch texture fetch.
In this case we need to take care not to DCE this sysval.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: track sysval slot for inputs
Rob Clark [Fri, 11 Oct 2019 18:35:53 +0000 (11:35 -0700)]
freedreno/ir3: track sysval slot for inputs

Will be needed for special handling of SYSTEM_VALUE_BARYCENTRIC_PIXEL
(ij_pix) when pre-fs texture fetch is enabled.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: remove unused ir3_instruction::inout
Rob Clark [Fri, 11 Oct 2019 18:26:08 +0000 (11:26 -0700)]
freedreno/ir3: remove unused ir3_instruction::inout

Not sure I remember how long this has been unused for.  But it's unused
now.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: Add data structures to support texture pre-fetch
Hyunjun Ko [Fri, 2 Aug 2019 19:12:22 +0000 (21:12 +0200)]
freedreno/ir3: Add data structures to support texture pre-fetch

Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: update registers
Rob Clark [Wed, 9 Oct 2019 19:16:03 +0000 (12:16 -0700)]
freedreno: update registers

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agonir: Add new texop nir_texop_tex_prefetch
Eduardo Lima Mitev [Wed, 10 Jul 2019 07:48:21 +0000 (09:48 +0200)]
nir: Add new texop nir_texop_tex_prefetch

This is like nir_texop_tex, but signals that the sampling coordinates
are immutable during the shader stage, in a way that allows the HW
that supports pre-dispatching sampling operations to pre-fetch
the result prior to scheduling the shader stage.

This is introduced to support the feature in Freedreno. Adreno HW
from a4xx supports it.

A NIR pass introduced later in this series will detect sampling
operations that are eligible for pre-dispatch, and replace
nir_texop_tex by this new op, to tell the backend to enable
pre-fetch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoosmesa: add missing #include <stdint.h>
Eric Engestrom [Wed, 16 Oct 2019 10:54:59 +0000 (11:54 +0100)]
osmesa: add missing #include <stdint.h>

Fixes: 281466332ba81a4277a1 ("gallium/osmesa: Introduce a test.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1947
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
5 years agodocs: Add new feature for compiling for windows with meson
Dylan Baker [Thu, 17 Oct 2019 17:09:59 +0000 (10:09 -0700)]
docs: Add new feature for compiling for windows with meson

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agoappveyor: Move appveyor script into .appveyor directory
Dylan Baker [Tue, 15 Oct 2019 18:17:54 +0000 (11:17 -0700)]
appveyor: Move appveyor script into .appveyor directory

This clears out the scripts directory completely

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agoappveyor: Add support for building llvmpipe with meson
Dylan Baker [Tue, 15 Oct 2019 18:16:01 +0000 (11:16 -0700)]
appveyor: Add support for building llvmpipe with meson

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agodocs: update meson docs for windows
Dylan Baker [Thu, 17 Oct 2019 17:07:44 +0000 (10:07 -0700)]
docs: update meson docs for windows

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agomeson: Use cmake to find LLVM when building for windows
Dylan Baker [Thu, 25 Jul 2019 21:27:43 +0000 (14:27 -0700)]
meson: Use cmake to find LLVM when building for windows

We don't use cmake normally because it always results in static linking.
This is very problematic for *nix OSes which expect shared linking by
default, but for windows this isn't a problem as LLVM doesn't support
shared linking on windows anyway.

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agomeson: Add support for wrapping llvm
Dylan Baker [Tue, 24 Apr 2018 20:48:25 +0000 (13:48 -0700)]
meson: Add support for wrapping llvm

For building on Windows (when not using cygwin), users may want to use a
binary wrap of LLVM. This provides a fallback to the LLVM dependency
which may be used in this case.

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agomeson/llvmpipe: Add dep_llvm to driver_swrast
Dylan Baker [Tue, 15 Oct 2019 20:06:58 +0000 (13:06 -0700)]
meson/llvmpipe: Add dep_llvm to driver_swrast

This fixes build errors in gl-gdi on windows when using llvmpipe

Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agoRevert "egl: Add EGL_CONFIG_SELECT_GROUP_MESA ext."
Hal Gentz [Fri, 18 Oct 2019 07:03:37 +0000 (01:03 -0600)]
Revert "egl: Add EGL_CONFIG_SELECT_GROUP_MESA ext."

This reverts commit 173bc9d6842efdec54ea3fd415a6946dcee7b02a.

5 years agoRevert "egl: Fixes transparency with EGL and X11."
Hal Gentz [Fri, 18 Oct 2019 07:03:36 +0000 (01:03 -0600)]
Revert "egl: Fixes transparency with EGL and X11."

This reverts commit 90a19074b4e1d4d8f8ababaade8170c05aeecffe.

5 years agoRevert "egl: Puts RGBA visuals in the second config selection group."
Hal Gentz [Fri, 18 Oct 2019 07:03:33 +0000 (01:03 -0600)]
Revert "egl: Puts RGBA visuals in the second config selection group."

This reverts commit a800d16e4f1589e41e53edf8e8a771a33bb46a6a.

5 years agoRevert "egl: Configs w/o double buffering support have no `EGL_WINDOW_BIT`."
Hal Gentz [Fri, 18 Oct 2019 07:03:30 +0000 (01:03 -0600)]
Revert "egl: Configs w/o double buffering support have no `EGL_WINDOW_BIT`."

This reverts commit 075a96aa926e6e89795f95a6a59693f44d9ac970.

5 years agoetnaviv: check NO_ASTC feature bit
Jonathan Marek [Fri, 9 Aug 2019 15:44:07 +0000 (11:44 -0400)]
etnaviv: check NO_ASTC feature bit

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
5 years agoetnaviv: fix TS samplers on GC7000L
Jonathan Marek [Mon, 2 Sep 2019 20:23:21 +0000 (16:23 -0400)]
etnaviv: fix TS samplers on GC7000L

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
5 years agoetnaviv: fix linear_nearest / nearest_linear filters on GC7000Lite
Jonathan Marek [Fri, 21 Jun 2019 00:01:28 +0000 (20:01 -0400)]
etnaviv: fix linear_nearest / nearest_linear filters on GC7000Lite

MIN filter is only used when LOD MAX is at least 4 (I guess the 2 LSB don't
actually exist).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
5 years agoetnaviv: GC7000: flush TX descriptor and instruction cache
Lucas Stach [Thu, 22 Feb 2018 10:54:21 +0000 (11:54 +0100)]
etnaviv: GC7000: flush TX descriptor and instruction cache

The etnaviv kernel driver will only ever flush write caches. As both
the TX descriptor and instruction cache are read caches they must be
flushed from the user cmdstream at an appropriate time.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: add linear texture support on GC7000
Lucas Stach [Thu, 28 Mar 2019 09:14:23 +0000 (10:14 +0100)]
etnaviv: add linear texture support on GC7000

It's just a matter of writing the addressing mode into the
texture descriptor.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: GC7000: Texture descriptors
Wladimir J. van der Laan [Sun, 29 Oct 2017 14:59:43 +0000 (14:59 +0000)]
etnaviv: GC7000: Texture descriptors

Create a separate implementation file with texture-descriptor-based
sampler views and sampler states. Initialize one or the other
based on the GPU. There is so little in common that this seemed more
appropriate; keeping them as one type of state object would
only be confusing.

This commit is actually a combination of the original commit by
Wladimir, fixes and TS implementation from Jonathan, and changes to
use softpin by Lucas.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: check for softpin availability on Halti5 devices
Lucas Stach [Fri, 2 Aug 2019 12:53:08 +0000 (14:53 +0200)]
etnaviv: check for softpin availability on Halti5 devices

Halti5 uses texture descriptors to control the samplers, and thus needs to
know the GPU virtual address for the texture buffers to fill into the
descriptor buffer. Without softpin userspace has no control over the GPU
VM and also no way to fix up the texture descriptor buffer, so there is
no point in creating a screen on a Halti5 device without softpin being
available.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: drm: add softpin interface
Lucas Stach [Fri, 2 Aug 2019 12:48:09 +0000 (14:48 +0200)]
etnaviv: drm: add softpin interface

If softpin is available on the kernel side, we transparently replace the
relocs with self-managed GPU virtual addresses. This allows skipping some
work on the kernel side, as it no longer needs to touch the command stream
before submitting it to the hardware.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: Rework locking
Marek Vasut [Thu, 5 Sep 2019 18:02:58 +0000 (20:02 +0200)]
etnaviv: Rework locking

Replace the per-screen locking of flushing with per-context one and
add per-context lock around command stream buffer accesses, to prevent
cross-context flushing from corrupting these command stream buffers.

Signed-off-by: Marek Vasut <marex@denx.de>
5 years agoetnaviv: Command buffer realloc
Marek Vasut [Thu, 5 Sep 2019 17:57:39 +0000 (19:57 +0200)]
etnaviv: Command buffer realloc

Reallocate the command stream buffer in case it is too small.
Older kernel versions are limited to a 64 KiB buffer, so
limit the size to avoid oversized buffers.
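
A sketch of growing a buffer under such a cap, assuming a doubling
strategy and a hard failure when the request exceeds the limit. The
struct, the function name and the initial size are hypothetical; the
commit message does not say what the driver does when the cap is hit:

```c
#include <stdint.h>
#include <stdlib.h>

/* 64 KiB limit from the commit message, expressed in 32-bit words. */
#define CMD_STREAM_MAX_BYTES  (64u * 1024u)
#define CMD_STREAM_MAX_DWORDS (CMD_STREAM_MAX_BYTES / sizeof(uint32_t))

/* Hypothetical, simplified command stream. */
struct cmd_stream {
    uint32_t *buf;
    size_t size_dwords;
};

/* Ensure room for needed_dwords.  Returns 0 on success, -1 if the
 * request exceeds the cap or the allocation fails. */
static int cmd_stream_reserve(struct cmd_stream *s, size_t needed_dwords)
{
    if (needed_dwords <= s->size_dwords)
        return 0;
    if (needed_dwords > CMD_STREAM_MAX_DWORDS)
        return -1;

    size_t new_size = s->size_dwords ? s->size_dwords : 64;
    while (new_size < needed_dwords)
        new_size *= 2;
    if (new_size > CMD_STREAM_MAX_DWORDS)
        new_size = CMD_STREAM_MAX_DWORDS;

    uint32_t *nb = realloc(s->buf, new_size * sizeof(uint32_t));
    if (!nb)
        return -1;
    s->buf = nb;
    s->size_dwords = new_size;
    return 0;
}
```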

Signed-off-by: Marek Vasut <marex@denx.de>
5 years agoetnaviv: Rework resource status tracking
Marek Vasut [Wed, 4 Sep 2019 23:23:52 +0000 (01:23 +0200)]
etnaviv: Rework resource status tracking

Have each context track which resources it marked as pending read and
pending write. Have each resource track in which context it is pending.
This way, it is possible to identify when a resource is both pending
read and pending write at the same time. Moreover, the status field
can be correctly calculated and updated when necessary.
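
One way to picture this, as a simplified model rather than the driver's
data structures: keep per-context pending flags for a resource and
derive the overall status as their union. `MAX_CONTEXTS`, the flag
names and the helpers below are all hypothetical:

```c
#define MAX_CONTEXTS 4 /* hypothetical fixed limit for the sketch */

enum {
    PENDING_READ  = 1 << 0,
    PENDING_WRITE = 1 << 1,
};

/* Hypothetical resource: remembers how each context has it pending. */
struct tracked_resource {
    unsigned pending[MAX_CONTEXTS];
};

static void mark_pending(struct tracked_resource *r, unsigned ctx, unsigned flags)
{
    r->pending[ctx] |= flags;
}

/* Called when a context has flushed and no longer uses the resource. */
static void context_flushed(struct tracked_resource *r, unsigned ctx)
{
    r->pending[ctx] = 0;
}

/* Overall status is the union over all contexts, so read and write
 * can be pending at the same time. */
static unsigned resource_status(const struct tracked_resource *r)
{
    unsigned status = 0;
    for (unsigned i = 0; i < MAX_CONTEXTS; i++)
        status |= r->pending[i];
    return status;
}
```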

Signed-off-by: Marek Vasut <marex@denx.de>
5 years agoetnaviv: rework the stream flush to always go through the context flush
Lucas Stach [Fri, 9 Aug 2019 15:11:23 +0000 (17:11 +0200)]
etnaviv: rework the stream flush to always go through the context flush

This way we can ensure that the pipe driver tracking of pending resources
stays in sync with the actual command buffer state, even if a space
reservation triggers a forced flush.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: drm: remove unused etna_cmd_stream_finish
Lucas Stach [Fri, 9 Aug 2019 14:46:01 +0000 (16:46 +0200)]
etnaviv: drm: remove unused etna_cmd_stream_finish

It's not used by anything and gets in the way for the refactoring of
the flush handling.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: keep references to pending resources
Lucas Stach [Fri, 9 Aug 2019 13:34:31 +0000 (15:34 +0200)]
etnaviv: keep references to pending resources

As long as a resource is pending in any context we must not destroy
it, otherwise we'll hit a classical use-after-free with fireworks.
To avoid this take a reference when the resource is first added to
the pending set and put the reference when no longer pending.
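
A minimal sketch of that rule, with a hypothetical `struct resource`
and a `destroyed` flag standing in for the real teardown. The pending
set holds exactly one reference, taken on the first add and dropped
when the last context is done:

```c
#include <stdbool.h>

/* Hypothetical refcounted resource. */
struct resource {
    int refcount;
    int pending_contexts; /* number of contexts with this resource pending */
    bool destroyed;       /* stands in for the actual destroy call */
};

static void resource_unref(struct resource *r)
{
    if (--r->refcount == 0)
        r->destroyed = true;
}

/* First time the resource becomes pending: take a reference. */
static void resource_mark_pending(struct resource *r)
{
    if (r->pending_contexts++ == 0)
        r->refcount++;
}

/* No longer pending anywhere: put the reference. */
static void resource_clear_pending(struct resource *r)
{
    if (--r->pending_contexts == 0)
        resource_unref(r);
}
```

With this, the owner dropping its reference while the resource is still
pending does not destroy it; destruction is deferred until the pending
set releases it.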

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: Make contexts track resources
Marek Vasut [Sat, 8 Jun 2019 17:52:55 +0000 (19:52 +0200)]
etnaviv: Make contexts track resources

Currently, the screen tracks all resources for all contexts, but this
is not correct. Each context should track the resources it uses. This
also allows a context to detect whether a resource is used by another
context and to notify another context using a resource that the current
context is done using the resource.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Guido Günther <guido.gunther@puri.sm>
Cc: Lucas Stach <l.stach@pengutronix.de>
5 years agoREVIEWERS: add VMware reviewers
Brian Paul [Wed, 7 Aug 2019 20:22:33 +0000 (14:22 -0600)]
REVIEWERS: add VMware reviewers

5 years agoradv: implement VK_KHR_shader_float_controls
Samuel Pitoiset [Mon, 14 Oct 2019 09:27:32 +0000 (11:27 +0200)]
radv: implement VK_KHR_shader_float_controls

This exposes what's required for DX and this is what we already
configure. The driver flushes denorms for FP32 and preserves them
for FP16/FP64. Note that we can't allow both preserving and
flushing denorms because this won't work for merged shaders. This
will require LLVM to update the float mode register to make it work.

Only enabled on GFX8+ with the LLVM path because it's untested on
previous chips and ACO doesn't support it.

This extension is required for SPIRV 1.4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac/llvm: force fneg/fabs to flush denorms to zero if requested
Samuel Pitoiset [Mon, 14 Oct 2019 13:39:06 +0000 (15:39 +0200)]
ac/llvm: force fneg/fabs to flush denorms to zero if requested

LLVM optimizes these instructions with XOR/AND and it loses
the sign bit.
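
To illustrate why the bitwise forms need an explicit flush, here is a
sketch on raw IEEE-754 binary32 bit patterns. It only models the
arithmetic; the helper names are hypothetical and none of this is the
ac/llvm code:

```c
#include <stdint.h>

#define F32_SIGN_BIT 0x80000000u
#define F32_EXP_MASK 0x7f800000u

/* fneg as XOR and fabs as AND: only the sign bit is touched, so a
 * denormal input comes out as a denormal. */
static uint32_t fneg_bits(uint32_t x) { return x ^ F32_SIGN_BIT; }
static uint32_t fabs_bits(uint32_t x) { return x & ~F32_SIGN_BIT; }

/* Flush-to-zero: an all-zero exponent means zero or denormal, so
 * keep only the sign and clear the mantissa. */
static uint32_t flush_denorm_bits(uint32_t x)
{
    return (x & F32_EXP_MASK) == 0 ? (x & F32_SIGN_BIT) : x;
}
```

Here `fneg_bits(0x00000001)` is still a denormal (`0x80000001`); only
after `flush_denorm_bits` does it become a signed zero.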

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac/llvm: add AC_FLOAT_MODE_ROUND_TO_ZERO
Samuel Pitoiset [Mon, 14 Oct 2019 13:36:37 +0000 (15:36 +0200)]
ac/llvm: add AC_FLOAT_MODE_ROUND_TO_ZERO

Because some instructions will be optimized by the backend compiler,
the driver has to manually flush to zero to keep the result exact.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac/llvm: add ac_build_canonicalize() helper
Samuel Pitoiset [Mon, 14 Oct 2019 12:23:35 +0000 (14:23 +0200)]
ac/llvm: add ac_build_canonicalize() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agotravis: test meson install as well
Eric Engestrom [Fri, 18 Oct 2019 14:05:21 +0000 (15:05 +0100)]
travis: test meson install as well

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agotravis: don't (re)install python
Eric Engestrom [Fri, 18 Oct 2019 14:03:43 +0000 (15:03 +0100)]
travis: don't (re)install python

The new Mac OS X images apparently already have python2 and python3,
and `brew` considers asking to install something already installed
as a fatal error...

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agogbm: Add GBM_MAX_PLANES definition
Lepton Wu [Thu, 17 Oct 2019 08:53:49 +0000 (01:53 -0700)]
gbm: Add GBM_MAX_PLANES definition

This removes the hard-coded "4".

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
5 years agov3d: Explicitly expose OpenGL ES Shading Language 3.1
Jose Maria Casanova Crespo [Fri, 11 Oct 2019 11:53:32 +0000 (13:53 +0200)]
v3d: Explicitly expose OpenGL ES Shading Language 3.1

This will expose GL_EXT_primitive_bounding_box and
GL_OES_primitive_bounding_box after previous commits
expose OpenGL ES 3.1 once Compute Shaders are available.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: request the kernel to flush caches when TMU is dirty
Iago Toral Quiroga [Tue, 3 Sep 2019 08:31:42 +0000 (10:31 +0200)]
v3d: request the kernel to flush caches when TMU is dirty

This adapts the v3d driver to the new CL submit ioctl interface that
allows the driver to request a flush of the caches after the render
job has completed. This seems to eliminate the kernel write violation
errors reported during CTS and Piglit executions, fixing some CTS tests
and GPU resets along the way.

v2:
  - Adapt to changes in the kernel side.
  - Disable shader storage and shader images if the kernel doesn't
    implement cache flushing.

Fixes CTS tests:
KHR-GLES31.core.shader_image_size.basic-nonMS-fs-float
KHR-GLES31.core.shader_image_size.basic-nonMS-fs-int
KHR-GLES31.core.shader_image_size.basic-nonMS-fs-uint
KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-float
KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-int
KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-uint
KHR-GLES31.core.shader_atomic_counters.advanced-usage-many-draw-calls2
KHR-GLES31.core.shader_atomic_counters.advanced-usage-draw-update-draw
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-int
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std140-matR
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std140-struct
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std430-matC-pad
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std430-vec

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: Add Compute Shader support
Eric Anholt [Wed, 5 Dec 2018 23:41:35 +0000 (15:41 -0800)]
v3d: Add Compute Shader support

Now that the UAPI has landed, add the pipe_context function for
dispatching compute shaders.  This is the last major feature for GLES 3.1,
though it's not enabled quite yet.

5 years agobroadcom: document known hardware issues for L2T flush command
Iago Toral Quiroga [Thu, 5 Sep 2019 06:35:01 +0000 (08:35 +0200)]
broadcom: document known hardware issues for L2T flush command

Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: add new flag dirty TMU cache at v3d_compiler
Iago Toral Quiroga [Wed, 14 Aug 2019 07:27:13 +0000 (09:27 +0200)]
v3d: add new flag dirty TMU cache at v3d_compiler

That we set for any TMU write on spills and general tmu. It is then
used as part of v3d_emit_gl_shader_state later.

v2: add a new flag at v3d_compiler instead of dirtying the flag
    at v3dx if there is any spill (change suggested by Eric, added by
    Alejandro)

v3: set this for anything that is not a load and do it also in
    v3d40_vir_emit_image_load_store (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: trivial update to obsolete comment
Iago Toral Quiroga [Wed, 14 Aug 2019 07:28:15 +0000 (09:28 +0200)]
v3d: trivial update to obsolete comment

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradv: Fix single stage constant flush with merged shaders.
Bas Nieuwenhuizen [Thu, 17 Oct 2019 23:21:29 +0000 (01:21 +0200)]
radv: Fix single stage constant flush with merged shaders.

e.g. a VERTEX only flush with tess on Vega should look at the TCS
to see which bits are needed.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1953
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agorbug: remove superfluous NULL check
Lucas Stach [Mon, 16 Sep 2019 13:15:47 +0000 (15:15 +0200)]
rbug: remove superfluous NULL check

The SCR_INIT macro used to install the rbug resource_changed method
will only do so when the driver below rbug exposes this method, so
the check will always evaluate to true.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
5 years agorbug: implement resource creation with modifier
Lucas Stach [Mon, 16 Sep 2019 13:09:38 +0000 (15:09 +0200)]
rbug: implement resource creation with modifier

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
5 years agorbug: forward can_create_resource to pipe driver
Lucas Stach [Mon, 16 Sep 2019 13:08:44 +0000 (15:08 +0200)]
rbug: forward can_create_resource to pipe driver

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
5 years agorbug: forward texture_barrier to pipe driver
Lucas Stach [Mon, 16 Sep 2019 13:07:53 +0000 (15:07 +0200)]
rbug: forward texture_barrier to pipe driver

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
5 years agorbug: implement missing explicit sync related fence functions
Lucas Stach [Mon, 16 Sep 2019 13:07:07 +0000 (15:07 +0200)]
rbug: implement missing explicit sync related fence functions

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
5 years agorbug: move flush_resource initialization
Lucas Stach [Mon, 16 Sep 2019 13:01:10 +0000 (15:01 +0200)]
rbug: move flush_resource initialization

All the other context method initializations follow the order of the pipe_context
structure definition, making it easy to find unimplemented methods in rbug.
Move the flush_resource init to follow the same order.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
5 years agorbug: unwrap index buffer resource
Lucas Stach [Mon, 16 Sep 2019 12:55:13 +0000 (14:55 +0200)]
rbug: unwrap index buffer resource

All resources passed to the drivers below rbug need to be unwrapped before
being passed down. We missed doing this for the index buffer resource when
this was made part of the draw_info structure.

Fixes: 330d0607ed60 (gallium: remove pipe_index_buffer and set_index_buffer)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
5 years agorbug: fix transmitted texture sizes
Lucas Stach [Mon, 16 Sep 2019 12:48:27 +0000 (14:48 +0200)]
rbug: fix transmitted texture sizes

The rbug wire format defines the texture size parameters to be uint32_t sized
and uses memcpy to move the function parameters to the message structure.
This caused completely wrong texture sizes to be transmitted once the height
and depth parameters were changed to uint16_t in the gallium API. Fix this by
doing an explicit conversion to the correct representation before packing into
the wire message.

Fixes: e6428092f5e1 (gallium: decrease the size of pipe_resource - 64 -> 48 bytes)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
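
The widening described in the rbug texture size fix above can be sketched
roughly as follows. This is not the actual rbug code: pack_texture_sizes(),
read_u32() and the three-field layout are hypothetical names and assumptions
made for illustration. The point is only that each uint16_t value is converted
to uint32_t first, so memcpy always copies from a 4-byte object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pack three size fields into a wire buffer that
 * expects each of them as a 32-bit value. height and depth arrive as
 * uint16_t, as in the gallium API after the size reduction. */
static size_t
pack_texture_sizes(uint8_t *buf, uint32_t width, uint16_t height,
                   uint16_t depth)
{
   /* Explicit conversion: every field is a real uint32_t before the copy,
    * so memcpy never reads past the end of a 16-bit object. */
   uint32_t fields[3] = { width, (uint32_t)height, (uint32_t)depth };

   assert(buf != NULL);
   memcpy(buf, fields, sizeof(fields));
   return sizeof(fields);
}

/* Hypothetical helper: read back the index-th 32-bit field. */
static uint32_t
read_u32(const uint8_t *buf, size_t index)
{
   uint32_t value;

   memcpy(&value, buf + index * sizeof(value), sizeof(value));
   return value;
}
```

Copying sizeof(uint32_t) bytes directly from the address of a uint16_t
parameter would instead pick up two unrelated bytes, which matches the wrong
sizes the commit message describes.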
5 years agogallium/util: don't depend on implementation defined behavior in listen()
Lucas Stach [Mon, 16 Sep 2019 12:43:13 +0000 (14:43 +0200)]
gallium/util: don't depend on implementation defined behavior in listen()

Using 0 as the backlog argument to listen() relies on implementation-defined
behavior and will lead to no connections being accepted on some libc
implementations.

Quoting the listen manpage: "A backlog argument of 0 may allow the socket to
accept connections, in which case the length of the listen queue may be set to
an implementation-defined minimum value."

Fix this by using a more sensible backlog value.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
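
A minimal sketch of the listen() pattern discussed above, assuming a plain TCP
socket on the loopback interface. open_listen_socket() and EXAMPLE_BACKLOG are
hypothetical names, and the value 8 is an arbitrary positive number picked for
illustration, not the value used in the commit.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumption: any small positive backlog avoids the implementation-defined
 * behavior of passing 0. */
#define EXAMPLE_BACKLOG 8

/* Hypothetical helper: create a TCP socket bound to the loopback address
 * and put it into the listening state. Returns the fd, or -1 on error. */
static int
open_listen_socket(uint16_t port, int backlog)
{
   struct sockaddr_in addr;
   int fd;

   /* Reject the case the commit message warns about. */
   assert(backlog > 0);

   fd = socket(AF_INET, SOCK_STREAM, 0);
   if (fd < 0)
      return -1;

   memset(&addr, 0, sizeof(addr));
   addr.sin_family = AF_INET;
   addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
   addr.sin_port = htons(port);   /* port 0 lets the kernel pick one */

   if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
       listen(fd, backlog) < 0) {
      close(fd);
      return -1;
   }

   return fd;
}
```

A caller would use it as open_listen_socket(0, EXAMPLE_BACKLOG) and close()
the returned descriptor when done.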
5 years agomesa/main: GL_GEOMETRY_SHADER_INVOCATIONS exists in GL_OES_geometry_shader
Iago Toral Quiroga [Mon, 14 Oct 2019 08:13:17 +0000 (10:13 +0200)]
mesa/main: GL_GEOMETRY_SHADER_INVOCATIONS exists in GL_OES_geometry_shader

It seems that for desktop GL this was included with ARB_gpu_shader5, but
for OpenGL ES this is already included with the base extension and there is
a CTS test that checks this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: implement glTextureStorageNDEXT functions
Pierre-Eric Pelloux Prayer [Mon, 23 Sep 2019 09:06:07 +0000 (11:06 +0200)]
mesa: implement glTextureStorageNDEXT functions

Implement the 3 functions using the texturestorage_error() helper.
_mesa_lookup_or_create_texture is always called to make sure that 'texture'
is initialized (even if the texturestorage_error() generates an error afterwards).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa NamedCopyBufferSubDataEXT function
Pierre-Eric Pelloux-Prayer [Wed, 11 Sep 2019 08:26:50 +0000 (10:26 +0200)]
mesa: add EXT_dsa NamedCopyBufferSubDataEXT function

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa NamedRenderbufferStorageMultisampleEXT function
Pierre-Eric Pelloux-Prayer [Wed, 11 Sep 2019 08:13:21 +0000 (10:13 +0200)]
mesa: add EXT_dsa NamedRenderbufferStorageMultisampleEXT function

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa Generate*MipmapEXT functions
Pierre-Eric Pelloux-Prayer [Wed, 11 Sep 2019 08:01:24 +0000 (10:01 +0200)]
mesa: add EXT_dsa Generate*MipmapEXT functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: refactor GenerateTextureMipmap handling
Pierre-Eric Pelloux-Prayer [Wed, 11 Sep 2019 07:58:47 +0000 (09:58 +0200)]
mesa: refactor GenerateTextureMipmap handling

Rework _mesa_GenerateTextureMipmap to allow code sharing with EXT_dsa functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa glGetFloati_vEXT/glGetDoublei_vEXT
Pierre-Eric Pelloux-Prayer [Wed, 11 Sep 2019 07:30:14 +0000 (09:30 +0200)]
mesa: add EXT_dsa glGetFloati_vEXT/glGetDoublei_vEXT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa + EXT_gpu_program_parameters functions
Pierre-Eric Pelloux-Prayer [Mon, 9 Sep 2019 15:26:30 +0000 (17:26 +0200)]
mesa: add EXT_dsa + EXT_gpu_program_parameters functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa + EXT_gpu_shader4 functions
Pierre-Eric Pelloux-Prayer [Mon, 9 Sep 2019 15:14:18 +0000 (17:14 +0200)]
mesa: add EXT_dsa + EXT_gpu_shader4 functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa + EXT_texture_integer functions
Pierre-Eric Pelloux-Prayer [Mon, 9 Sep 2019 14:44:11 +0000 (16:44 +0200)]
mesa: add EXT_dsa + EXT_texture_integer functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa + EXT_texture_buffer_object functions
Pierre-Eric Pelloux-Prayer [Mon, 9 Sep 2019 14:22:29 +0000 (16:22 +0200)]
mesa: add EXT_dsa + EXT_texture_buffer_object functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa glProgramUniform*EXT functions
Pierre-Eric Pelloux-Prayer [Mon, 9 Sep 2019 13:53:00 +0000 (15:53 +0200)]
mesa: add EXT_dsa glProgramUniform*EXT functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa NamedProgram functions
Pierre-Eric Pelloux-Prayer [Tue, 28 May 2019 15:06:00 +0000 (17:06 +0200)]
mesa: add EXT_dsa NamedProgram functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa glClientAttribDefaultEXT / glPushClientAttribDefaultEXT
Pierre-Eric Pelloux-Prayer [Tue, 28 May 2019 08:27:52 +0000 (10:27 +0200)]
mesa: add EXT_dsa glClientAttribDefaultEXT / glPushClientAttribDefaultEXT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa glNamedRenderbufferStorageEXT and glGetNamedRenderbufferParameterivEXT
Pierre-Eric Pelloux-Prayer [Thu, 23 May 2019 14:34:16 +0000 (16:34 +0200)]
mesa: add EXT_dsa glNamedRenderbufferStorageEXT and glGetNamedRenderbufferParameterivEXT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agopanfrost: Respect offset for imported resources
Daniel Stone [Thu, 17 Oct 2019 11:49:54 +0000 (13:49 +0200)]
panfrost: Respect offset for imported resources

When we import a resource through Gallium, we need to take account of
the offset parameter passed.

Fixes a failure seen with the VIVID V4L2 driver, which would create NV12
resources within the same BO, with an offset. Sample pipeline to
reproduce (replace videoN with your actual VIVID device node):
    gst-launch-1.0 v4l2src device=/dev/videoN ! video/x-raw,format=NV12 ! glimagesink

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Tested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
5 years agoiris/resource: Use isl surface alignment during bo allocation
Jordan Justen [Fri, 31 May 2019 22:50:53 +0000 (15:50 -0700)]
iris/resource: Use isl surface alignment during bo allocation

Reworks:
 * Change subject from "iris: Align main surface allocation to 64k on gen12+"
 * Make use of isl surf alignment. (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/isl: Add isl_aux_usage_has_ccs
Jason Ekstrand [Fri, 4 May 2018 16:43:01 +0000 (09:43 -0700)]
intel/isl: Add isl_aux_usage_has_ccs

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/isl: Add R10G10B10_FLOAT_A2_UNORM format
Jordan Justen [Thu, 12 Apr 2018 05:48:33 +0000 (22:48 -0700)]
intel/isl: Add R10G10B10_FLOAT_A2_UNORM format

Reworks:
 * Fill out the format's entry in the ISL format table. (Nanley)
 * Support CCS_E-enabled BLORP copies with the format. (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/compiler: Report the number of non-spill/fill SEND messages
Kenneth Graunke [Tue, 10 Sep 2019 01:31:41 +0000 (18:31 -0700)]
intel/compiler: Report the number of non-spill/fill SEND messages

This can be useful to measure whether memory access optimizations are
having the desired effect.  For example, we might see a reduction in
image loads/stores, or constant buffer loads.  We can already see this
in cycle estimates to some extent, but this is a more direct approach,
minus a lot of the noise of random scheduler shuffling.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agost/mesa: don't call variables "tgsi" when they can reference NIR
Marek Olšák [Thu, 17 Oct 2019 00:21:11 +0000 (20:21 -0400)]
st/mesa: don't call variables "tgsi" when they can reference NIR

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agost/mesa: merge st_fragment_program into st_common_program
Marek Olšák [Wed, 16 Oct 2019 20:46:19 +0000 (16:46 -0400)]
st/mesa: merge st_fragment_program into st_common_program

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>