mesa.git
Marek Olšák [Thu, 14 Jun 2018 06:25:00 +0000 (02:25 -0400)]
radeonsi: store compute local_size into tgsi_shader_info

This is kinda a hack, but it's enough for the shader cache.

Marek Olšák [Thu, 14 Jun 2018 06:09:05 +0000 (02:09 -0400)]
radeonsi: unify duplicated code for initial shader compilation

Marek Olšák [Thu, 14 Jun 2018 05:27:10 +0000 (01:27 -0400)]
ac: set +auto-waitcnt-before-barrier when needed

This removes useless s_waitcnt before barriers.
Only radeonsi uses this function.

Marek Olšák [Thu, 14 Jun 2018 05:10:54 +0000 (01:10 -0400)]
radeonsi/gfx9: insert the barrier between merged shaders inside the if block

Joe M. Kniss [Thu, 21 Jun 2018 00:55:10 +0000 (17:55 -0700)]
gallium: plumb invariant output attrib thru TGSI

Add support for glsl 'invariant' modifier for output data declarations.
Gallium drivers that use TGSI serialization currently lose invariant
modifiers in glsl shaders.

v2: use boolean for invariant instead of unsigned.

Tested: chromiumos on qemu with virglrenderer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Francisco Jerez [Wed, 27 Apr 2016 02:45:41 +0000 (19:45 -0700)]
intel/fs: Build 32-wide FS shaders.

Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Fri, 18 May 2018 23:39:21 +0000 (16:39 -0700)]
intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 18 May 2018 06:26:02 +0000 (23:26 -0700)]
intel/fs: Add fields to wm_prog_data for SIMD32 dispatch

Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Thu, 12 Jan 2017 03:55:33 +0000 (19:55 -0800)]
intel/fs: Fix nir_intrinsic_load_helper_invocation for SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Mon, 9 Jan 2017 22:14:02 +0000 (14:14 -0800)]
intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 23:33:11 +0000 (15:33 -0800)]
intel/fs: Fix Gen6+ interpolation setup for SIMD32

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Thu, 24 May 2018 01:09:48 +0000 (18:09 -0700)]
intel/fs: Get rid of MOV_DISPATCH_TO_FLAGS

We can just emit the MOV in the two places where we use this.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Thu, 24 May 2018 00:54:54 +0000 (17:54 -0700)]
intel/fs: Emit MOV_DISPATCH_TO_FLAGS once for the centroid workaround

There's no reason for us to emit it a pile of times and then have a
whole pass to clean it up.  Just emit it once like we really want.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 23:33:45 +0000 (15:33 -0800)]
intel/fs: Generalize the unlit centroid workaround

This generalizes the unlit centroid workaround so it's less code and now
supports SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 23:32:05 +0000 (15:32 -0800)]
intel/fs: Fix sample id setup for SIMD32.

v2 (Jason Ekstrand):
 - Disallow gl_SampleId in SIMD32 on gen7

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Sat, 14 Jan 2017 01:04:23 +0000 (17:04 -0800)]
intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 23:40:38 +0000 (15:40 -0800)]
intel/fs: Implement 32-wide FS payload setup on Gen6+

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 23:36:51 +0000 (15:36 -0800)]
intel/fs: Extend thread payload layout to SIMD32

And handle 32-wide payload register reads in fetch_payload_reg().

v2 (Jason Ekstrand):
 - Fix some whitespace and brace placement

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 23:23:48 +0000 (15:23 -0800)]
intel/fs: Wrap FS payload register look-up in a helper function.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 23:18:07 +0000 (15:18 -0800)]
intel/fs: Use fs_regs instead of brw_regs in the unlit centroid workaround

While we're here, we change to using horiz_offset() instead of abusing
half().

v2 (Jason Ekstrand):
 - Use horiz_offset() instead of half()

Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:53:00 +0000 (14:53 -0800)]
intel/fs: Simplify fs_visitor::emit_samplepos_setup

The original code manually handled splitting the MOVs to 8-wide to
handle various regioning restrictions.  Now that we have a SIMD width
splitting pass that handles these things, we can just emit everything at
the full width and let the SIMD splitting pass handle it.  We also now
have a useful "subscript" helper which is designed exactly for the case
where you want to take a W type and read it as a vector of Bs so we may
as well use that too.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
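
The "subscript" idea described above can be modelled on the CPU. This is only a scalar sketch, assuming byte 0 is the low byte of each word; `subscript_byte` and `read_bytes` are hypothetical names, not the compiler's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a "subscript": view one 16-bit word (W)
 * as two bytes (B) and return byte `i`. Assumption: byte 0 is the low byte. */
static uint8_t subscript_byte(uint16_t word, unsigned i)
{
    assert(i < 2); /* a W holds exactly two Bs */
    return (uint8_t)(word >> (8 * i));
}

/* Apply the subscript to every channel at the full width, leaving any
 * splitting into narrower pieces to a later step. */
static void read_bytes(const uint16_t *words, unsigned width,
                       unsigned i, uint8_t *out)
{
    for (unsigned ch = 0; ch < width; ch++)
        out[ch] = subscript_byte(words[ch], i);
}
```

The point of the sketch is that the caller writes one full-width loop and never splits by hand.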
Francisco Jerez [Tue, 26 Apr 2016 00:02:05 +0000 (17:02 -0700)]
i965: Add plumbing for shader time in 32-wide FS dispatch mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Tue, 26 Apr 2016 00:08:42 +0000 (17:08 -0700)]
intel/fs: Disable opt_sampler_eot() in 32-wide dispatch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Sat, 26 May 2018 05:23:30 +0000 (22:23 -0700)]
intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates

On g4x through Sandy Bridge, src1 (the coordinates) of the PLN
instruction is required to be an even register number.  When it's odd
(which can happen with SIMD32), we have to emit a LINE+MAC combination
instead.  Unfortunately, we can't just fall through to the gen4 case
because the input registers are still set up for PLN which lays out the
four src1 registers differently in SIMD16 than LINE.

v2 (Jason Ekstrand):
 - Take advantage of both accumulators and emit LINE LINE MAC MAC
   (Based on a patch from Francisco Jerez)
 - Unify the gen4 and gen4x-6 cases using a loop

v3 (Jason Ekstrand):
 - Don't unify gen4 with gen4x-6 as this turns out to be more fragile
   than first thought without reworking the gen4 barycentric coordinate
   layout.

Reviewed-by: Matt Turner <mattst88@gmail.com>
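
As a rough scalar illustration of the LINE+MAC combination, assume the attribute is a plane equation a*x + b*y + c, that LINE produces a*x + c in the accumulator, and that MAC adds b*y to it. The struct and function names are hypothetical, and register layout is ignored entirely.

```c
#include <assert.h>

/* Plane equation coefficients for one attribute (hypothetical layout). */
struct plane { float a, b, c; };

/* Assumed model of LINE: a*x + c, written to the accumulator. */
static float line_op(struct plane p, float x)
{
    return p.a * x + p.c;
}

/* Assumed model of MAC: accumulator + b*y. */
static float mac_op(float acc, struct plane p, float y)
{
    return acc + p.b * y;
}

static float linterp_line_mac(struct plane p, float x, float y)
{
    float acc = line_op(p, x); /* first instruction writes the accumulator */
    return mac_op(acc, p, y);  /* second instruction reads it */
}
```

With a = 2, b = 3, c = 1 at (x, y) = (4, 5), the two steps give 9 and then 24, matching a*x + b*y + c.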
Jason Ekstrand [Mon, 28 May 2018 16:42:49 +0000 (09:42 -0700)]
intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN

When we don't have PLN (gen4 and gen11+), we implement LINTERP as either
LINE+MAC or a pair of MADs.  In both cases, the accumulator is written
by the first of the two instructions and read by the second.  Even
though the accumulator value isn't actually ever used from a logical
instruction perspective, it is trashed so we need to make the scheduler
aware.  Otherwise, the scheduler could end up re-ordering instructions
and putting a LINTERP between an instruction which writes the
accumulator and another which tries to use that result.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
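
The hazard can be pictured with a toy dependency check. This is a sketch under the assumption that a scheduler only needs a per-instruction "writes accumulator" bit; `struct inst` and `can_place_between` are hypothetical and far simpler than the real scheduler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, minimal view of an instruction for scheduling purposes. */
struct inst {
    bool writes_accumulator;
    bool reads_accumulator;
};

/* May `moved` be placed between `producer` and `consumer`?
 * Not if the accumulator is live across that gap and `moved` trashes it. */
static bool can_place_between(const struct inst *producer,
                              const struct inst *consumer,
                              const struct inst *moved)
{
    bool acc_live = producer->writes_accumulator &&
                    consumer->reads_accumulator;
    return !(acc_live && moved->writes_accumulator);
}
```

If LINTERP is not flagged as writing the accumulator, this check wrongly allows the move, which is the reordering the commit message warns about.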
Francisco Jerez [Tue, 26 Apr 2016 01:06:13 +0000 (18:06 -0700)]
intel/fs: Rework INTERPOLATE_AT_PER_SLOT_OFFSET

This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU
operation and less like a send.  This is less code overall and, as a
side-effect, it now properly handles execution groups and lowering so
SIMD32 support just falls out.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Fri, 18 May 2018 03:51:24 +0000 (20:51 -0700)]
intel/fs: Add the group to the flag subreg number on SNB and older

We want consistent behavior in the meaning of the flag_subreg field
between SNB and IVB+.

v2 (Jason Ekstrand):
 - Add some extra commentary

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Tue, 10 Jan 2017 00:43:24 +0000 (16:43 -0800)]
intel/fs: Fix FB read header setup for SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:25:37 +0000 (14:25 -0800)]
intel/fs: Fix logical FB write lowering for SIMD32

Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:22:19 +0000 (14:22 -0800)]
intel/fs: Fix FB write message control codegen for SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 6 Jan 2017 22:41:27 +0000 (14:41 -0800)]
intel/fs: Don't enable dual source blend if no outputs are written

This prevents a crash in some arb_enhanced_layouts tests that would be
caused by the next commit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Tue, 26 Apr 2016 02:28:21 +0000 (19:28 -0700)]
intel/fs: Fix codegen of FS_OPCODE_SET_SAMPLE_ID for SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Tue, 26 Apr 2016 02:20:49 +0000 (19:20 -0700)]
intel/eu: Fix pixel interpolator queries for SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 6 Jan 2017 01:51:51 +0000 (17:51 -0800)]
intel/fs: Disable SIMD32 dispatch for fragment shaders with discard.

Current discard handling requires dedicating the second flag register to
discard.  However, control-flow in SIMD32 requires both flag registers
so it's incompatible with the current discard handling.  Just don't
support SIMD32+discard for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Tue, 26 Apr 2016 00:29:57 +0000 (17:29 -0700)]
intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow

The hardware's control flow logic is 16-wide so we're out of luck
here.  We could, in theory, support SIMD32 if we know the control-flow
is uniform but we don't have that information at this point.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
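
Taken together, the discard restriction and the Gen4-6 control-flow restriction amount to a small eligibility check. The function below is a hypothetical sketch of that decision only; the name, the parameters, and the plain integer `gen` are assumptions, and the real compiler considers more conditions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: may this fragment shader be built SIMD32? */
static bool simd32_allowed(int gen, bool has_control_flow, bool uses_discard)
{
    assert(gen >= 4);

    /* Discard handling dedicates the second flag register, which SIMD32
     * control flow also needs. */
    if (uses_discard)
        return false;

    /* Control-flow logic is 16-wide on Gen4-6. */
    if (gen <= 6 && has_control_flow)
        return false;

    return true;
}
```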
Jason Ekstrand [Mon, 21 May 2018 16:51:50 +0000 (09:51 -0700)]
intel/fs: Split instructions low to high in lower_simd_width

Commit 0d905597f fixed an issue with the placement of the zip and unzip
instructions.  However, as a side-effect, it reversed the order in which
we were emitting the split instructions so that they went from high
group to low instead of low to high.  This is fine for most things like
texture instructions and the like but certain render target writes
really want to be emitted low to high.  This commit just switches the
order back around to be low to high.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 0d905597f "intel/fs: Be more explicit about our placement of [un]zip"
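
The ordering being restored can be sketched as a plain loop over channel groups. `split_low_to_high` is a hypothetical helper that records only the starting channel of each split piece; it is not the lowering pass itself.

```c
#include <assert.h>

/* Record the first channel of each lowered piece, lowest group first.
 * Returns the number of pieces written to `groups`. */
static unsigned split_low_to_high(unsigned exec_size, unsigned lowered_width,
                                  unsigned *groups, unsigned max_groups)
{
    assert(lowered_width > 0 && exec_size % lowered_width == 0);

    unsigned n = 0;
    for (unsigned group = 0; group < exec_size && n < max_groups;
         group += lowered_width)
        groups[n++] = group;
    return n;
}
```

Splitting a 32-wide instruction into 8-wide pieces this way yields groups 0, 8, 16, 24 in that order.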
Jason Ekstrand [Fri, 18 May 2018 06:49:29 +0000 (23:49 -0700)]
intel/fs: Rework KSP data to be SIMD width-based

Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Fri, 18 May 2018 06:17:17 +0000 (23:17 -0700)]
intel/compiler: Add and use helpers for working with KSP indices

The pixel shader dispatch table is kind of a confusing mess.  This adds
some helpers for dealing with it and for easily extracting the correct
data from wm_prog_data.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Fri, 18 May 2018 20:34:33 +0000 (13:34 -0700)]
i965: Re-arrange shader kernel setup in WM state

Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Tue, 26 Apr 2016 00:20:35 +0000 (17:20 -0700)]
intel/fs: Remove program key argument from generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Thu, 17 May 2018 15:46:03 +0000 (08:46 -0700)]
intel/fs: Set up FB write message headers in the visitor

Doing instruction header setup in the generator is awful for a number
of reasons.  For one, we can't schedule the header setup at all.  For
another, it means lots of implied writes which the instruction scheduler
and other passes can't properly reason about.  The second isn't a huge
problem for FB writes since they always happen at the end.  We made a
similar change to sampler handling in ff4726077d86.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:18:22 +0000 (14:18 -0800)]
intel/fs: Fix implied_mrf_writes() for headerless FB writes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:17:20 +0000 (14:17 -0800)]
intel/fs: Fix fs_inst::flags_written() for Gen4-5 FB writes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:16:12 +0000 (14:16 -0800)]
intel/eu: Return new instruction to caller from brw_fb_WRITE().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Fri, 18 May 2018 01:47:19 +0000 (18:47 -0700)]
intel/fs: Pull FB write implied headers from src[0]

Now that we have the implied header in src[0] for tracking purposes, we
may as well use it in the generator.  This makes things a tiny bit more
general.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Thu, 17 May 2018 22:40:48 +0000 (15:40 -0700)]
intel/fs: Properly track implied header regs read by FB writes

The FB write opcode on gen4-5 does implied copies from g0 and g1 to the
message payload.  With this commit, we start tracking that as part of
the IR by having the FB write read from g0-1.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Thu, 17 May 2018 00:51:10 +0000 (17:51 -0700)]
intel/fs: FS_OPCODE_REP_FB_WRITE has side effects

It doesn't matter since we don't ever run replicated write shaders
through the optimizer but it's good to be complete.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Dylan Baker [Thu, 28 Jun 2018 17:06:44 +0000 (10:06 -0700)]
docs: Add news item for mesa 18.1.2

Which I forgot to do when 18.1.2 came out.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Rhys Perry [Fri, 22 Jun 2018 20:47:43 +0000 (21:47 +0100)]
nvc0: remove magic values in nve4_set_tex_handles()

With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is
changed to anything other than 0x20.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
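
The general pattern of replacing a magic value with the named constant might look like the sketch below. Everything here is hypothetical: `SKETCH_AUX_TEX_INFO_BASE` merely stands in for a base offset like NVC0_CB_AUX_TEX_INFO, and the 4-byte stride per slot is an assumption made for the example.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the driver's actual macros. */
#define SKETCH_AUX_TEX_INFO_BASE 0x20u
#define SKETCH_TEX_INFO_STRIDE   4u   /* assumed bytes per handle slot */

/* Derive the offset from the named base instead of hard-coding 0x20,
 * so changing the base does not silently break callers. */
static unsigned tex_handle_offset(unsigned slot)
{
    return SKETCH_AUX_TEX_INFO_BASE + slot * SKETCH_TEX_INFO_STRIDE;
}
```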
Rhys Perry [Thu, 28 Jun 2018 13:26:33 +0000 (14:26 +0100)]
nvc0/ir: fix TargetNVC0::insnCanLoadOffset()

Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset
could be set to a specific value. The IndirectPropagation pass expected
it to return whether the offset could be increased by a specific value,
which is what TargetNV50::insnCanLoadOffset() does.

Fixes: 37b67db6ae34fb6586d640a7a1b6232f091dd812
("nvc0/ir: be careful about propagating very large offsets into const load")

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
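
The difference between the two interpretations can be shown with a toy range check. This sketch assumes a made-up encodable limit of 0xffff; `can_set_offset` and `can_add_offset` are hypothetical names and do not mirror the real target hooks.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_OFFSET 0xffff /* hypothetical encodable limit */

/* "Can the offset be set to this value?" */
static bool can_set_offset(int value)
{
    return value >= 0 && value <= SKETCH_MAX_OFFSET;
}

/* "Can the existing offset be increased by this amount?"
 * This is the question the propagation pass is described as asking. */
static bool can_add_offset(int current, int delta)
{
    return can_set_offset(current + delta);
}
```

With a current offset of 0xfff0 and a delta of 0x20, the first check accepts the delta on its own while the second correctly rejects the sum.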
Alok Hota [Fri, 22 Jun 2018 14:11:26 +0000 (09:11 -0500)]
swr/rast: Updating code style based on current clang-format rules

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Vinson Lee [Mon, 25 Jun 2018 14:52:19 +0000 (09:52 -0500)]
swr/rast: Fix addPassesToEmitFile usage with llvm-7.0.

Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output
file argument to addPassesToEmitFile and hook it up to dwo output.").

  CXX      rasterizer/jitter/libmesaswr_la-JitManager.lo
rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3
        pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile);
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                        ^

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Mon, 25 Jun 2018 14:52:18 +0000 (09:52 -0500)]
swr/rast: Handling removed LLVM intrinsics in trunk

- Functionality replaced with emulated intrinsics
- Fixes Bug 106558

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Mon, 25 Jun 2018 14:52:17 +0000 (09:52 -0500)]
swr/rast: Adding SCATTERPS functionality to BuilderGfxMem

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Mon, 25 Jun 2018 14:52:16 +0000 (09:52 -0500)]
swr/rast: Adding Read/Write specifier to TranslateGfxAddress stack

- Removing unused generic translate function
- Requiring read/write specifier in builder_gfx_mem

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Chad Versace [Fri, 1 Jun 2018 02:57:55 +0000 (19:57 -0700)]
gallium: Fix automake for Android (v2)

Chromium OS uses Autotools and pkg-config when building Mesa for
Android. The gallium drivers were failing to find the headers and
libraries for zlib and Android's libbacktrace.

v2:
  - Don't add a check for zlib.pc. configure.ac already checks for
    zlib.pc elsewhere. [for tfiga]
  - Check for backtrace.pc separately from the other Android libs.
    [for tfiga]

Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Wed, 27 Jun 2018 23:23:20 +0000 (09:23 +1000)]
glsl: skip comparison opt when adding vars of different size

The spec allows adding a scalar to a vector or matrix. In this case
the opt was losing swizzle and size information.

This fixes a bug with Doom (2016) shaders.

Fixes: 34ec1a24d61f ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
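
A guard of this kind might be sketched as follows. The `struct operand` type and its single `components` field are assumptions for illustration; the real check works on GLSL IR types, and matrices would need more than a component count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand description: only the component count matters here. */
struct operand { unsigned components; };

/* Rewriting (x + y cmp 0) into (x cmp -y) is only attempted when both
 * addends have the same size, so no swizzle or size information is lost. */
static bool can_fold_add_compare(struct operand x, struct operand y)
{
    assert(x.components > 0 && y.components > 0);
    return x.components == y.components;
}
```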
Jason Ekstrand [Wed, 27 Jun 2018 21:09:51 +0000 (14:09 -0700)]
Revert "anv: Print the actual enum for ignored structure types"

This reverts commit fda7014c35e5f5dfa26f078ad0512d13ead8b717.  It was
hitting an unreachable when the sType was unknown.

Jason Ekstrand [Tue, 26 Jun 2018 20:33:29 +0000 (13:33 -0700)]
anv: Print the actual enum for ignored structure types

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Jason Ekstrand [Wed, 27 Jun 2018 01:32:38 +0000 (18:32 -0700)]
i965/bufmgr: Use the correct argument order for bo_alloc_internal

The memzone and flags parameters were accidentally flipped in the call
from brw_bo_alloc_tiled_2d.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Keith Packard [Tue, 26 Jun 2018 23:01:45 +0000 (16:01 -0700)]
vulkan/wsi_common_display: Return SURFACE_LOST for fatal DRM errors

Instead of encouraging the client to re-create the swapchain and keep
going with an OUT_OF_DATE error, tell the client that further use of
the current surface will not succeed as the associated kernel objects
are no longer valid.

In particular, when a DRM lease is revoked, then the client needs to
get another lease and create a new surface for that.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
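
The change in reported result can be pictured with a small mapping. The enums below are local stand-ins rather than the Vulkan or DRM headers, and how an error is classified as fatal is left out, since the message does not spell it out.

```c
#include <assert.h>

/* Local stand-ins for the results discussed above (not Vulkan's enums). */
enum sketch_result {
    SKETCH_SUCCESS,
    SKETCH_ERROR_OUT_OF_DATE,
    SKETCH_ERROR_SURFACE_LOST
};

/* Hypothetical classification of what the kernel call reported. */
enum sketch_drm_status { SKETCH_DRM_OK, SKETCH_DRM_FATAL };

/* Fatal errors report a lost surface instead of an out-of-date swapchain,
 * because re-creating the swapchain on the same surface cannot succeed. */
static enum sketch_result present_result(enum sketch_drm_status status)
{
    switch (status) {
    case SKETCH_DRM_OK:
        return SKETCH_SUCCESS;
    case SKETCH_DRM_FATAL:
        return SKETCH_ERROR_SURFACE_LOST;
    }
    return SKETCH_ERROR_SURFACE_LOST; /* unknown status: treat as fatal */
}
```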
Eric Anholt [Thu, 21 Jun 2018 23:39:15 +0000 (16:39 -0700)]
glsl: Make sure that packed varyings reflect always_active_io properly.

The always_active_io flag was only set according to the first variable
that got packed in, so NIR io compaction would end up compacting XFB
varyings that shouldn't move at that point.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
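
In other words, the packed varying's flag should be the OR of the flags of everything packed into it. A minimal sketch, with a hypothetical `struct var` holding only the flag in question:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical variable record: only the flag under discussion. */
struct var { bool always_active_io; };

/* The packed varying is always-active if ANY packed variable is,
 * not just the first one that happened to be packed in. */
static bool packed_always_active_io(const struct var *vars, size_t count)
{
    bool flag = false;
    for (size_t i = 0; i < count; i++)
        flag = flag || vars[i].always_active_io;
    return flag;
}
```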
Eric Anholt [Mon, 25 Jun 2018 20:29:42 +0000 (13:29 -0700)]
v3d: Fix Z clipping when viewport.scale[2] is negative.

Fixes:
dEQP-GLES3.functional.shaders.builtin_variable.depth_range_fragment
dEQP-GLES3.functional.shaders.builtin_variable.depth_range_vertex

Eric Anholt [Tue, 26 Jun 2018 22:58:21 +0000 (15:58 -0700)]
v3d: Convert a bunch of our "minus one" fields over to the new XML attr.

This fixes up their formatting for CLIF files and makes the code more
legible.

Eric Anholt [Tue, 26 Jun 2018 22:53:26 +0000 (15:53 -0700)]
v3d: Add pack/unpack/decode support for fields with a "- 1" modifier.

Right now, we name these fields as "field name minus one" so that your C
code obviously states what the value should be.  However, it's easy enough
to handle at the codegen level with another little XML attribute, meaning
less C code and easier-to-read values in CLIF dumping and gdb as well.

(The actual CLIF format for simulator and FPGA replay takes in
pre-minus-one values, so we need it there too).
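
What the generated code does for such a field can be sketched by hand. These two helpers are hypothetical and ignore bit positions within the packet; they only show the bias applied on pack and removed on unpack.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a field that the hardware stores as "value minus one".
 * A `bits`-wide field can therefore represent 1 .. 2^bits. */
static uint32_t pack_minus_one(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits < 32);
    assert(value >= 1 && value <= (1u << bits));
    return value - 1;
}

/* Unpack or decode: add the one back so dumps show the real value. */
static uint32_t unpack_minus_one(uint32_t raw)
{
    return raw + 1;
}
```

The C code then passes the natural value (for example 16 for a 4-bit field) and never spells out the subtraction itself.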

Tapani Pälli [Thu, 14 Jun 2018 11:08:11 +0000 (14:08 +0300)]
i965: small cleanup in blorp debug printing output (trivial)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tapani Pälli [Thu, 14 Jun 2018 11:08:10 +0000 (14:08 +0300)]
mesa: add a space between headers and source (trivial)

There used to be one and it looks like it was removed by eb63640c1d.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tapani Pälli [Thu, 14 Jun 2018 11:08:09 +0000 (14:08 +0300)]
features.txt: mark some extensions as done

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Danylo Piliaiev [Thu, 21 Jun 2018 09:34:15 +0000 (12:34 +0300)]
mesa: Return number of result bits for GL_ANY_SAMPLES_PASSED_CONSERVATIVE

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106986
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Samuel Pitoiset [Tue, 26 Jun 2018 09:19:26 +0000 (11:19 +0200)]
radv: use separate bind points for the dynamic buffers

The Vulkan spec says:

   "pipelineBindPoint is a VkPipelineBindPoint indicating whether
    the descriptors will be used by graphics pipelines or compute
    pipelines. There is a separate set of bind points for each of
    graphics and compute, so binding one does not disturb the other."

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
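
One way to picture the rule quoted from the spec is state indexed by bind point. The types below are hypothetical and reduce the dynamic buffer state to a single offset per bind point.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the two pipeline bind points. */
enum sketch_bind_point {
    SKETCH_BIND_GRAPHICS = 0,
    SKETCH_BIND_COMPUTE  = 1,
    SKETCH_BIND_COUNT
};

/* Hypothetical command buffer state: one slot per bind point. */
struct sketch_cmd_state {
    uint32_t dynamic_offset[SKETCH_BIND_COUNT];
};

/* Binding for one bind point must not disturb the other. */
static void bind_dynamic(struct sketch_cmd_state *state,
                         enum sketch_bind_point bp, uint32_t offset)
{
    assert(bp < SKETCH_BIND_COUNT);
    state->dynamic_offset[bp] = offset;
}
```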
Samuel Pitoiset [Tue, 26 Jun 2018 20:35:04 +0000 (22:35 +0200)]
radv: remove unused 'predicated' parameter from some functions

It's always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 8 Jun 2018 00:02:20 +0000 (10:02 +1000)]
virgl: add ARB_texture_view support

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Jason Ekstrand [Mon, 25 Jun 2018 23:18:19 +0000 (16:18 -0700)]
nir/opt_if: Remove unneeded phis if we make progress

Now that SSA values can be derefs and they have special rules, we have
to be a bit more careful about our LCSSA phis.  In particular, we need
to clean up in case LCSSA ended up creating a phi node for a deref.
This fixes validation issues with some Vulkan CTS tests with the new
deref instructions.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Samuel Pitoiset [Fri, 22 Jun 2018 17:16:43 +0000 (19:16 +0200)]
radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queries

Ported from RadeonSI.
This appears to fix some random failures with:
dEQP-VK.query_pool.statistics_query.*

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tapani Pälli [Thu, 14 Jun 2018 08:10:20 +0000 (11:10 +0300)]
glsl: serialize data from glTransformFeedbackVaryings

While XFB has been enabled for cache, we did not serialize enough
data for the whole API to work (such as glGetProgramiv).

Fixes: 6d830940f7 "Allow shader cache usage with transform feedback"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Samuel Pitoiset [Mon, 25 Jun 2018 13:56:46 +0000 (15:56 +0200)]
radv: enable VK_EXT_shader_stencil_export

The driver already supports exporting the stencil value.

The following CTS test now pass:
dEQP-VK.pipeline.shader_stencil_export.op_replace

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 25 Jun 2018 14:22:43 +0000 (16:22 +0200)]
radv: ignore pInheritanceInfo for primary command buffers

From the Vulkan spec:
"If this is a primary command buffer, then this value is ignored."

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
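
The rule from the spec can be expressed as a tiny filter. The types here are local stand-ins rather than the Vulkan structures, and `effective_inheritance` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the command buffer level and inheritance data. */
enum sketch_level { SKETCH_LEVEL_PRIMARY, SKETCH_LEVEL_SECONDARY };
struct sketch_inheritance_info { int subpass; };

/* For a primary command buffer the pointer is ignored (and never
 * dereferenced); only secondary command buffers consult it. */
static const struct sketch_inheritance_info *
effective_inheritance(enum sketch_level level,
                      const struct sketch_inheritance_info *info)
{
    return level == SKETCH_LEVEL_PRIMARY ? NULL : info;
}
```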
Andrii Simiklit [Fri, 22 Jun 2018 07:59:57 +0000 (10:59 +0300)]
i965/gen6/gs: Handle case where a GS doesn't allocate VUE

We cannot use the VUE Dereference flags combination for the EOT
message on ILK and SNB because, unlike on pre-ILK hardware, the
threads there are not initialized with an initial VUE handle.
So to avoid GPU hangs on SNB and ILK we need
to avoid using the VUE Dereference flags combination.
(This was tested only on SNB, but according to the specification,
SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6,
ILK should behave in a similar way.)

v2: The approach to fixing this issue was changed.
Instead of using different EOT flags at the end of the program,
we create a VUE every time, even if the GS produces no output.

v3: Clean up the patch.
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Dave Airlie [Tue, 26 Jun 2018 00:43:14 +0000 (10:43 +1000)]
radeon: duplicate cmask surface for now.

The radeon winsys isn't linked against the ac code; I have vague
memories of this causing some problems before. For now, fix the build
by just duplicating the code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Fri, 22 Jun 2018 04:02:47 +0000 (00:02 -0400)]
radeonsi: rename r600_transfer -> si_transfer

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 04:00:11 +0000 (00:00 -0400)]
radeonsi: properly set cmask_buffer in si_reallocate_texture_inplace

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: remove redundant si_texture::cmask_size
Marek Olšák [Fri, 22 Jun 2018 03:54:20 +0000 (23:54 -0400)]
radeonsi: remove redundant si_texture::cmask_size

cmask_buffer and surface.cmask_size can replace its role.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: inline struct r600_cmask_info
Marek Olšák [Fri, 22 Jun 2018 03:00:56 +0000 (23:00 -0400)]
radeonsi: inline struct r600_cmask_info

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: move CMASK size computation into ac_surface
Marek Olšák [Fri, 22 Jun 2018 02:54:59 +0000 (22:54 -0400)]
radeonsi: move CMASK size computation into ac_surface

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoac/surface: move cmask_size/alignment into radeon_surf
Marek Olšák [Fri, 22 Jun 2018 02:50:51 +0000 (22:50 -0400)]
ac/surface: move cmask_size/alignment into radeon_surf

cmask_size is changed to uint32_t because it can't be greater than 4GB.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: rename r600_surface -> si_surface
Marek Olšák [Fri, 22 Jun 2018 02:16:07 +0000 (22:16 -0400)]
radeonsi: rename r600_surface -> si_surface

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: rename r600_memory_object -> si_memory_object
Marek Olšák [Fri, 22 Jun 2018 02:07:24 +0000 (22:07 -0400)]
radeonsi: rename r600_memory_object -> si_memory_object

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: remove unused r600_memory_object::offset
Marek Olšák [Fri, 22 Jun 2018 02:04:33 +0000 (22:04 -0400)]
radeonsi: remove unused r600_memory_object::offset

The real offset is passed through resource_from_memobj.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: unify duplicated texture_from_handle & texture_from_memobj
Marek Olšák [Fri, 22 Jun 2018 00:41:06 +0000 (20:41 -0400)]
radeonsi: unify duplicated texture_from_handle & texture_from_memobj

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: reorder and initialize more fields in si_reallocate_texture_inplace
Marek Olšák [Fri, 15 Jun 2018 19:28:28 +0000 (15:28 -0400)]
radeonsi: reorder and initialize more fields in si_reallocate_texture_inplace

Some fields shouldn't be initialized, like framebuffers_bound and other stats.
It's hopefully complete now.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
6 years agoradeonsi: stop using lp_build_emit_llvm_unary/binary
Marek Olšák [Thu, 21 Jun 2018 23:19:49 +0000 (19:19 -0400)]
radeonsi: stop using lp_build_emit_llvm_unary/binary

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: stop using lp_build_alloc
Marek Olšák [Thu, 21 Jun 2018 22:52:47 +0000 (18:52 -0400)]
radeonsi: stop using lp_build_alloc

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: use gallivm less
Marek Olšák [Thu, 21 Jun 2018 22:52:21 +0000 (18:52 -0400)]
radeonsi: use gallivm less

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: stop using lp_bld_intr.h
Marek Olšák [Thu, 21 Jun 2018 22:20:59 +0000 (18:20 -0400)]
radeonsi: stop using lp_bld_intr.h

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: remove last uses of lp_build_context::undef
Marek Olšák [Thu, 21 Jun 2018 22:20:59 +0000 (18:20 -0400)]
radeonsi: remove last uses of lp_build_context::undef

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: stop using lp_bld_arit.h
Marek Olšák [Thu, 21 Jun 2018 22:18:42 +0000 (18:18 -0400)]
radeonsi: stop using lp_bld_arit.h

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: stop using lp_build_gather_values
Marek Olšák [Thu, 21 Jun 2018 22:06:23 +0000 (18:06 -0400)]
radeonsi: stop using lp_build_gather_values

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: clean up some #includes
Marek Olšák [Thu, 21 Jun 2018 22:03:06 +0000 (18:03 -0400)]
radeonsi: clean up some #includes

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeonsi: clean up passing the is_monolithic flag for compilation
Marek Olšák [Thu, 21 Jun 2018 05:36:22 +0000 (01:36 -0400)]
radeonsi: clean up passing the is_monolithic flag for compilation

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoegl/android: Add DRM node probing and filtering
Robert Foss [Wed, 18 Apr 2018 15:27:40 +0000 (17:27 +0200)]
egl/android: Add DRM node probing and filtering

This patch both adds support for probing & filtering DRM nodes
and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
gralloc call.

Currently the filtering is based just on the driver name,
and the desired name is supplied using the "drm.gpu.vendor_name"
Android property.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
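
As an illustration of the driver-name filtering described in the commit above, here is a minimal sketch in C. It is not the code from the patch: the struct `drm_node`, the function `pick_drm_node`, and the choice to accept the first node when no vendor name is supplied are all assumptions made for this example. In the real code the desired name would come from the "drm.gpu.vendor_name" Android property; here it is passed in as a plain string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a probed DRM node: device path plus the
 * kernel driver name reported for it. */
struct drm_node {
    const char *path;
    const char *driver_name;
};

/* Returns the index of the first node whose driver name equals
 * vendor_name. If vendor_name is NULL or empty, no filter is applied
 * and the first node (if any) is chosen; this fallback is an
 * assumption of the sketch. Returns -1 when nothing matches. */
static int pick_drm_node(const struct drm_node *nodes, int count,
                         const char *vendor_name)
{
    int no_filter = (vendor_name == NULL || vendor_name[0] == '\0');

    for (int i = 0; i < count; i++) {
        if (no_filter || strcmp(nodes[i].driver_name, vendor_name) == 0)
            return i;
    }
    return -1;
}
```

Matching on the exact driver name keeps the filter simple, which fits the commit's note that filtering is currently based on the driver name alone.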