mesa.git
5 years agoetnaviv: fix ts size calculation
Jonathan Marek [Thu, 4 Jul 2019 17:30:55 +0000 (13:30 -0400)]
etnaviv: fix ts size calculation

The size of the TS is screen->specs.bits_per_tile bits per tile, with each
tile being 64 bytes of the resource.

This gives the same result for 32bpp formats, but reduces the size of TS
for 16bpp formats by 2.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: update headers from rnndb
Jonathan Marek [Mon, 1 Jul 2019 20:07:56 +0000 (16:07 -0400)]
etnaviv: update headers from rnndb

Update to etna_viv commit 8a8b13a and use new names in the code.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoscons: s/HAVE_NO_AUTOCONF/HAVE_SCONS/
Eric Engestrom [Sat, 29 Jun 2019 23:17:50 +0000 (00:17 +0100)]
scons: s/HAVE_NO_AUTOCONF/HAVE_SCONS/

Back when autotools and scons were the two build systems, it kinda made
sense to call scons "not autoconf", but autoconf's been gone for a while
now and other build systems have been added (android.mk and meson), so
the name really doesn't make any sense anymore.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradeonsi: Fix some warnings.
Bas Nieuwenhuizen [Thu, 4 Jul 2019 00:11:27 +0000 (02:11 +0200)]
radeonsi: Fix some warnings.

../mesa/src/gallium/drivers/radeonsi/si_compute_blit.c: In function ‘si_clear_buffer’:
../mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:195:11: warning: unused variable ‘clear_alignment’ [-Wunused-variable]
  unsigned clear_alignment = MIN2(clear_value_size, 4);
           ^~~~~~~~~~~~~~~
[23/60] Compiling C object 'src/gallium/drivers/radeonsi/3cdc30e@@radeonsi@sta/si_compute_prim_discard.c.o'.
../mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c: In function ‘si_prepare_prim_discard_or_split_draw’:
../mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:1106:7: warning: unused variable ‘compute_has_space’ [-Wunused-variable]
  bool compute_has_space = sctx->ws->cs_check_space(cs, need_compute_dw, false);

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: move ac_shader_{binary,reloc} into r600 and rename
Nicolai Hähnle [Mon, 1 Jul 2019 15:26:09 +0000 (17:26 +0200)]
amd/common: move ac_shader_{binary,reloc} into r600 and rename

They are no longer used by radeonsi or radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: removed unused ac_shader_binary functions
Nicolai Hähnle [Mon, 1 Jul 2019 14:57:48 +0000 (16:57 +0200)]
amd/common: removed unused ac_shader_binary functions

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: remove unused ac_compile_module_to_binary
Nicolai Hähnle [Mon, 1 Jul 2019 14:55:28 +0000 (16:55 +0200)]
amd/common: remove unused ac_compile_module_to_binary

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradv: Switch to using rtld.
Bas Nieuwenhuizen [Mon, 1 Jul 2019 01:21:58 +0000 (03:21 +0200)]
radv: Switch to using rtld.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: Move more stuff to variant create time.
Bas Nieuwenhuizen [Mon, 1 Jul 2019 00:19:13 +0000 (02:19 +0200)]
radv: Move more stuff to variant create time.

Due to them depending on the linker result.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: Add the concept of radv shader binaries.
Bas Nieuwenhuizen [Sun, 30 Jun 2019 23:29:24 +0000 (01:29 +0200)]
radv: Add the concept of radv shader binaries.

This simplifies a bunch of stuff by
(1) Keeping all the things in a single allocation, making things easier
 for the cache.
(2) creating a shader_variant creation helper.

This is immediately put to use by creating rtld shader binaries. This
is the main reason for the binaries, as we need to do the linking at
upload time, i.e. post caching. We do not enable rtld yet.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: Add export_prim_id to the shader variant info.
Bas Nieuwenhuizen [Mon, 1 Jul 2019 22:41:47 +0000 (00:41 +0200)]
radv: Add export_prim_id to the shader variant info.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: use last nir shader to determine stage in postprocessing
Bas Nieuwenhuizen [Tue, 2 Jul 2019 10:16:36 +0000 (12:16 +0200)]
radv: use last nir shader to determine stage in postprocessing

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: Merge rsrc1/rsrc2 fields with the config fields.
Bas Nieuwenhuizen [Sat, 29 Jun 2019 23:47:30 +0000 (01:47 +0200)]
radv: Merge rsrc1/rsrc2 fields with the config fields.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agovulkan: Update headers to 1.1.113
Andres Gomez [Wed, 3 Jul 2019 14:02:42 +0000 (17:02 +0300)]
vulkan: Update headers to 1.1.113

Some headers were not dragged in the last update(s).

Fixes: 465ec0b145c ("vulkan: Update the XML and headers to 1.1.113")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv: do not crash when generating binning state for unknown chips
Samuel Pitoiset [Thu, 4 Jul 2019 06:54:49 +0000 (08:54 +0200)]
radv: do not crash when generating binning state for unknown chips

These values are only useful if binning is disabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix potential crash in the compute resolve path
Samuel Pitoiset [Thu, 4 Jul 2019 06:54:48 +0000 (08:54 +0200)]
radv: fix potential crash in the compute resolve path

If the destination attachment is UNUSED.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agopanfrost: Take into account off-screen FBOs
Tomeu Vizoso [Thu, 4 Jul 2019 07:59:30 +0000 (09:59 +0200)]
panfrost: Take into account off-screen FBOs

In that case, ctx->pipe_framebuffer.cbufs[0] can be NULL.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 5375d009be18 ("panfrost: Pass referenced BOs to the SUBMIT ioctls")
5 years agoutil/macros: rework DIV_ROUND_UP macro
Christian Gmeiner [Wed, 3 Jul 2019 19:06:54 +0000 (21:06 +0200)]
util/macros: rework DIV_ROUND_UP macro

Simplify used math.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
5 years agogitlab-ci: bump required libdrm version
Christian Gmeiner [Wed, 3 Jul 2019 21:12:33 +0000 (23:12 +0200)]
gitlab-ci: bump required libdrm version

Fixes following build problem:
 Message: libdrm 2.4.99 needed because amdgpu has the highest requirement
 Dependency libdrm_intel found: NO found '2.4.97' but need: '>=2.4.99'
 Dependency libdrm_intel found: NO

 meson.build:1178:4: ERROR:  Invalid version of dependency, need 'libdrm_intel' ['>=2.4.99'] found '2.4.97'.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agoiris: Fix MOCS for grid surface
Kenneth Graunke [Wed, 3 Jul 2019 22:14:49 +0000 (15:14 -0700)]
iris: Fix MOCS for grid surface

Hardcoding 4 is bad; we have a function for this now.

5 years agoiris: Minor tidying
Kenneth Graunke [Wed, 3 Jul 2019 22:12:17 +0000 (15:12 -0700)]
iris: Minor tidying

5 years agoRevert "mesa/st: Passthrough scissor when clearing by quad"
Marek Olšák [Thu, 4 Jul 2019 05:08:02 +0000 (01:08 -0400)]
Revert "mesa/st: Passthrough scissor when clearing by quad"

This reverts commit 0a88aa3025db0cc5a68222c7939d7da4d218f1be.

It breaks a lot of piglit tests.

5 years agogallium/u_blitter: add return to fix the build
Marek Olšák [Thu, 27 Jun 2019 17:20:31 +0000 (13:20 -0400)]
gallium/u_blitter: add return to fix the build

5 years agomesa/st: Passthrough scissor when clearing by quad
Alyssa Rosenzweig [Tue, 18 Jun 2019 20:37:16 +0000 (13:37 -0700)]
mesa/st: Passthrough scissor when clearing by quad

The scissor state -is- setup, but the scissor test is not enabled. This
can prevent certain optimizations from occurring on tilers where
unaffected tiles are thrown out entirely.

v2: Only enable scissor test if the scissor test is actually set by the
app, to avoid regressing quad-based clears used for other reasons (like
a color mask).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd: add NAVI10 PCI IDs
Nicolai Hähnle [Tue, 24 Oct 2017 11:40:53 +0000 (11:40 +0000)]
amd: add NAVI10 PCI IDs

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix legacy GS
Marek Olšák [Tue, 25 Jun 2019 21:57:48 +0000 (17:57 -0400)]
radeonsi/gfx10: fix legacy GS

LLVM doesn't insert s_waitcnt_vscnt before GS_DONE.

There was also the crash in legacy GS copy shader.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: disable clear state
Nicolai Hähnle [Wed, 8 May 2019 01:06:15 +0000 (03:06 +0200)]
radeonsi/gfx10: disable clear state

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: disable DPBB
Nicolai Hähnle [Mon, 13 Nov 2017 16:24:13 +0000 (17:24 +0100)]
radeonsi/gfx10: disable DPBB

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: disable SDMA
Nicolai Hähnle [Mon, 13 Nov 2017 16:23:15 +0000 (17:23 +0100)]
radeonsi/gfx10: disable SDMA

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi: determine the rasterization primitive type accurately (v2)
Marek Olšák [Tue, 25 Jun 2019 00:54:52 +0000 (20:54 -0400)]
radeonsi: determine the rasterization primitive type accurately (v2)

v2: reworked version to fix bugs and make it more efficient

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: consolidate & improve input_prim determination for NGG
Marek Olšák [Tue, 25 Jun 2019 00:53:41 +0000 (20:53 -0400)]
radeonsi/gfx10: consolidate & improve input_prim determination for NGG

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac: rework ac_build_waitcnt for gfx10
Marek Olšák [Mon, 24 Jun 2019 20:13:24 +0000 (16:13 -0400)]
ac: rework ac_build_waitcnt for gfx10

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_shader_vs
Marek Olšák [Mon, 24 Jun 2019 21:39:39 +0000 (17:39 -0400)]
radeonsi/gfx10: implement si_shader_vs

Only used with tessellation + GS instancing.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: unpack GS invocation ID
Marek Olšák [Sat, 22 Jun 2019 01:06:16 +0000 (21:06 -0400)]
radeonsi/gfx10: unpack GS invocation ID

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: jump over the shader query atomic if the queries are disabled
Marek Olšák [Fri, 21 Jun 2019 22:38:58 +0000 (18:38 -0400)]
radeonsi/gfx10: jump over the shader query atomic if the queries are disabled

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: cosmetic changes
Marek Olšák [Fri, 14 Jun 2019 01:29:47 +0000 (21:29 -0400)]
radeonsi/gfx10: cosmetic changes

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: set cache control registers
Marek Olšák [Thu, 6 Jun 2019 04:25:40 +0000 (00:25 -0400)]
radeonsi/gfx10: set cache control registers

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: export correct PrimitiveID from NGG vertex shaders
Marek Olšák [Thu, 6 Jun 2019 00:20:47 +0000 (20:20 -0400)]
radeonsi/gfx10: export correct PrimitiveID from NGG vertex shaders

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE
Marek Olšák [Wed, 5 Jun 2019 19:04:45 +0000 (15:04 -0400)]
radeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: add a workaround for stencil HTILE with mipmapping
Marek Olšák [Wed, 5 Jun 2019 05:54:46 +0000 (01:54 -0400)]
radeonsi/gfx10: add a workaround for stencil HTILE with mipmapping

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: disable DCC with MSAA
Marek Olšák [Wed, 5 Jun 2019 05:37:01 +0000 (01:37 -0400)]
radeonsi/gfx10: disable DCC with MSAA

It was only enabled for 2x MSAA anyway.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives
Marek Olšák [Thu, 30 May 2019 00:06:16 +0000 (20:06 -0400)]
radeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives

We need to tell PA to accept edge flags generated by the input assembler,
because decomposed primitives shouldn't draw inner edges.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix NGG GS color clamping
Marek Olšák [Wed, 29 May 2019 20:32:17 +0000 (16:32 -0400)]
radeonsi/gfx10: fix NGG GS color clamping

Just need to pass the input from ES to GS. Everything else is done.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix vertex color clamping for TES
Marek Olšák [Tue, 28 May 2019 23:55:09 +0000 (19:55 -0400)]
radeonsi/gfx10: fix vertex color clamping for TES

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: unbind NGG shaders when destroyed
Marek Olšák [Wed, 29 May 2019 02:29:08 +0000 (22:29 -0400)]
radeonsi/gfx10: unbind NGG shaders when destroyed

This fixes glsl-max-varyings, which creates shaders, draws, and then
destroys them.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: don't use the GS workaround for triangle strips w/ adjancency
Marek Olšák [Wed, 29 May 2019 02:06:52 +0000 (22:06 -0400)]
radeonsi/gfx10: don't use the GS workaround for triangle strips w/ adjancency

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: don't do the query buffer atomic for blit shaders
Marek Olšák [Wed, 29 May 2019 02:01:09 +0000 (22:01 -0400)]
radeonsi/gfx10: don't do the query buffer atomic for blit shaders

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: update spi_map if API VS (as NGG) changes and PS doesn't
Marek Olšák [Tue, 28 May 2019 23:56:08 +0000 (19:56 -0400)]
radeonsi/gfx10: update spi_map if API VS (as NGG) changes and PS doesn't

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0
Marek Olšák [Tue, 28 May 2019 23:52:53 +0000 (19:52 -0400)]
radeonsi/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: prefetch HW GS when NGG is used
Marek Olšák [Tue, 28 May 2019 22:55:30 +0000 (18:55 -0400)]
radeonsi/gfx10: prefetch HW GS when NGG is used

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoamd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load
Nicolai Hähnle [Mon, 27 May 2019 14:16:39 +0000 (16:16 +0200)]
amd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix PS exports for SPI_SHADER_32_AR
Marek Olšák [Sat, 25 May 2019 02:49:27 +0000 (22:49 -0400)]
radeonsi/gfx10: fix PS exports for SPI_SHADER_32_AR

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: set DLC for loads when GLC is set
Marek Olšák [Fri, 24 May 2019 22:48:39 +0000 (18:48 -0400)]
radeonsi/gfx10: set DLC for loads when GLC is set

This fixes L1 shader array cache coherency.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix shader images
Marek Olšák [Fri, 24 May 2019 21:25:04 +0000 (17:25 -0400)]
radeonsi/gfx10: fix shader images

Don't promote 2D image instructions to 3D, and don't set z=BASE_ARRAY.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: set the DCC constant encoding flag
Marek Olšák [Thu, 23 May 2019 18:20:27 +0000 (14:20 -0400)]
radeonsi/gfx10: set the DCC constant encoding flag

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix intensity formats
Marek Olšák [Wed, 22 May 2019 22:21:55 +0000 (18:21 -0400)]
radeonsi/gfx10: fix intensity formats

move the ALPHA_IS_ON_MSB fixup into vi_alpha_is_on_msb

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: allocate GDS BOs for streamout
Marek Olšák [Wed, 5 Jun 2019 02:08:41 +0000 (22:08 -0400)]
radeonsi/gfx10: allocate GDS BOs for streamout

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: make sure GDS is idle between IBs
Marek Olšák [Wed, 5 Jun 2019 02:02:25 +0000 (22:02 -0400)]
radeonsi/gfx10: make sure GDS is idle between IBs

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement streamout
Nicolai Hähnle [Fri, 21 Sep 2018 20:07:01 +0000 (22:07 +0200)]
radeonsi/gfx10: implement streamout

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement streamout-related queries
Nicolai Hähnle [Wed, 19 Sep 2018 12:53:35 +0000 (14:53 +0200)]
radeonsi/gfx10: implement streamout-related queries

The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.

This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.

The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: enable the workaround for unaligned vertex fetch
Nicolai Hähnle [Mon, 1 Apr 2019 13:58:31 +0000 (15:58 +0200)]
radeonsi/gfx10: enable the workaround for unaligned vertex fetch

Yes, really. Note that non-format buffer loads are unaffected and work
just fine with unaligned pointers (as long as SH_MEM_CONFIG is setup
correctly, which amdgpu ensures).

Fixes e.g. KHR-GL45.vertex_attrib_64bit.vao

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: re-order the initialization order in si_compile_tgsi_main
Nicolai Hähnle [Thu, 20 Sep 2018 08:18:07 +0000 (10:18 +0200)]
radeonsi/gfx10: re-order the initialization order in si_compile_tgsi_main

It's useful to be able to access gs_ngg_scratch before creating the
main wrapping branch.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: apply DCC MSAA blend workaround
Nicolai Hähnle [Tue, 18 Sep 2018 12:16:43 +0000 (14:16 +0200)]
radeonsi/gfx10: apply DCC MSAA blend workaround

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_emit_global_shader_pointers
Nicolai Hähnle [Thu, 7 Feb 2019 17:53:27 +0000 (18:53 +0100)]
radeonsi/gfx10: implement si_emit_global_shader_pointers

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_init_tess_factor_ring
Nicolai Hähnle [Fri, 31 Aug 2018 17:59:48 +0000 (19:59 +0200)]
radeonsi/gfx10: implement si_init_tess_factor_ring

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: initialize EXEC for TES-as-NGG (without geometry shader)
Nicolai Hähnle [Fri, 31 Aug 2018 17:59:36 +0000 (19:59 +0200)]
radeonsi/gfx10: initialize EXEC for TES-as-NGG (without geometry shader)

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: use correct VGPR for instance ID in LS shader
Nicolai Hähnle [Fri, 31 Aug 2018 17:58:35 +0000 (19:58 +0200)]
radeonsi/gfx10: use correct VGPR for instance ID in LS shader

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_shader_hs
Nicolai Hähnle [Fri, 31 Aug 2018 17:54:59 +0000 (19:54 +0200)]
radeonsi/gfx10: implement si_shader_hs

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_create_sampler_state
Nicolai Hähnle [Fri, 31 Aug 2018 17:53:52 +0000 (19:53 +0200)]
radeonsi/gfx10: implement si_create_sampler_state

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: double the number of tessellation offchip buffers per SE
Nicolai Hähnle [Thu, 30 Aug 2018 15:06:52 +0000 (17:06 +0200)]
radeonsi/gfx10: double the number of tessellation offchip buffers per SE

Each gfx10 shader engine corresponds to two gfx9 shader engines, so scale
the number of offchip buffers accordingly.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement get_tess_ring_descriptor
Nicolai Hähnle [Tue, 28 Aug 2018 14:00:28 +0000 (16:00 +0200)]
radeonsi/gfx10: implement get_tess_ring_descriptor

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: mask DCC tile swizzle by alignment
Nicolai Hähnle [Mon, 23 Jul 2018 07:47:19 +0000 (09:47 +0200)]
radeonsi/gfx10: mask DCC tile swizzle by alignment

DCC alignment can be less than the alignment of the main surface. In that
case, the DCC tile swizzle needs to be masked accordingly. Should have no
impact on pre-gfx10.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement hardware MSAA resolve
Nicolai Hähnle [Mon, 2 Jul 2018 16:50:48 +0000 (18:50 +0200)]
radeonsi/gfx10: implement hardware MSAA resolve

MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile
optimization that we use on gfx9 and earlier does not work.

Be very explicit about how the swizzle mode of the temporary surface is
selected.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: fix binding on si_update_scratch_relocs
Nicolai Hähnle [Thu, 21 Jun 2018 12:56:54 +0000 (14:56 +0200)]
radeonsi/gfx10: fix binding on si_update_scratch_relocs

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: set llvm_has_working_vgpr_indexing
Nicolai Hähnle [Tue, 19 Jun 2018 15:44:24 +0000 (17:44 +0200)]
radeonsi/gfx10: set llvm_has_working_vgpr_indexing

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement load_const_buffer_desc_fast_path
Nicolai Hähnle [Fri, 1 Jun 2018 14:04:02 +0000 (16:04 +0200)]
radeonsi/gfx10: implement load_const_buffer_desc_fast_path

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: take PRIMID from the correct output when exported by GS
Nicolai Hähnle [Fri, 1 Jun 2018 14:03:31 +0000 (16:03 +0200)]
radeonsi/gfx10: take PRIMID from the correct output when exported by GS

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: change location of instance ID shader input
Nicolai Hähnle [Wed, 30 May 2018 20:47:10 +0000 (22:47 +0200)]
radeonsi/gfx10: change location of instance ID shader input

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: set USER_DATA_ADDR offset for geometry shaders
Nicolai Hähnle [Wed, 30 May 2018 20:45:06 +0000 (22:45 +0200)]
radeonsi/gfx10: set USER_DATA_ADDR offset for geometry shaders

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_emit_derived_tess_state
Nicolai Hähnle [Tue, 7 May 2019 22:54:46 +0000 (00:54 +0200)]
radeonsi/gfx10: implement si_emit_derived_tess_state

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_shader_gs
Nicolai Hähnle [Wed, 8 May 2019 01:10:21 +0000 (03:10 +0200)]
radeonsi/gfx10: implement si_shader_gs

This is only used in the legacy, non-NGG path.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement preload_ring_buffers
Nicolai Hähnle [Wed, 8 May 2019 01:09:42 +0000 (03:09 +0200)]
radeonsi/gfx10: implement preload_ring_buffers

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_set_ring_buffer
Nicolai Hähnle [Tue, 14 Nov 2017 15:59:35 +0000 (16:59 +0100)]
radeonsi/gfx10: implement si_set_ring_buffer

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: allow rectangle outputs from NGG primitive shader
Nicolai Hähnle [Wed, 8 May 2019 00:54:01 +0000 (02:54 +0200)]
radeonsi/gfx10: allow rectangle outputs from NGG primitive shader

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: emit VGT_GS_OUT_PRIM_TYPE from draw and add it to VS_STATE
Nicolai Hähnle [Tue, 7 May 2019 23:40:29 +0000 (01:40 +0200)]
radeonsi/gfx10: emit VGT_GS_OUT_PRIM_TYPE from draw and add it to VS_STATE

With NGG, the VGT_GS_OUT_PRIM_TYPE can change without a shader change.

The VS_STATE is required for both streamout and culling from a vertex
shader without pre-compiling outprim-specific variants.

We could consider compiling specialized variants in the future. We
could also consider compiling the NGG logic as an epilog.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: NGG geometry shader PM4 and upload
Nicolai Hähnle [Wed, 23 May 2018 20:31:41 +0000 (22:31 +0200)]
radeonsi/gfx10: NGG geometry shader PM4 and upload

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: generate geometry shaders for NGG
Nicolai Hähnle [Wed, 23 May 2018 20:20:15 +0000 (22:20 +0200)]
radeonsi/gfx10: generate geometry shaders for NGG

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: use the correct register for image descriptor dumping
Nicolai Hähnle [Wed, 13 Dec 2017 12:25:23 +0000 (13:25 +0100)]
radeonsi/gfx10: use the correct register for image descriptor dumping

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode
Nicolai Hähnle [Sun, 19 Nov 2017 14:40:12 +0000 (15:40 +0100)]
radeonsi/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: initialize GE_{MAX,MIN}_VTX_INDX/INDX_OFFSET
Nicolai Hähnle [Sun, 19 Nov 2017 14:24:28 +0000 (15:24 +0100)]
radeonsi/gfx10: initialize GE_{MAX,MIN}_VTX_INDX/INDX_OFFSET

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: setup registers for OpenGL compute
Nicolai Hähnle [Sat, 18 Nov 2017 20:16:26 +0000 (21:16 +0100)]
radeonsi/gfx10: setup registers for OpenGL compute

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: set user data base registers
Nicolai Hähnle [Sat, 18 Nov 2017 19:55:56 +0000 (20:55 +0100)]
radeonsi/gfx10: set user data base registers

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement gfx10_shader_ngg
Nicolai Hähnle [Tue, 7 May 2019 21:50:01 +0000 (23:50 +0200)]
radeonsi/gfx10: implement gfx10_shader_ngg

For pipelines without API GS. We will later expand this to cover NGG
geometry shaders as well.

Note that the vtx offset passed into the GS part is just the
vertex index multiplied by VGT_ESGS_RING_ITEMSIZE.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: add NGG registers to si_init_config
Nicolai Hähnle [Sat, 18 Nov 2017 13:32:59 +0000 (14:32 +0100)]
radeonsi/gfx10: add NGG registers to si_init_config

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: update shader-related fields in si_init_config
Nicolai Hähnle [Sat, 18 Nov 2017 13:32:34 +0000 (14:32 +0100)]
radeonsi/gfx10: update shader-related fields in si_init_config

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: implement si_shader_ps
Nicolai Hähnle [Sat, 18 Nov 2017 12:27:55 +0000 (13:27 +0100)]
radeonsi/gfx10: implement si_shader_ps

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders
Nicolai Hähnle [Thu, 16 Nov 2017 16:00:50 +0000 (17:00 +0100)]
radeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders

This does not support geometry shading yet. Also missing are streamout
and NGG-specific optimizations.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: distinguish between merged shaders and multi-part shaders
Nicolai Hähnle [Fri, 17 Nov 2017 12:40:18 +0000 (13:40 +0100)]
radeonsi/gfx10: distinguish between merged shaders and multi-part shaders

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: update si_get_shader_name
Nicolai Hähnle [Fri, 17 Nov 2017 12:39:45 +0000 (13:39 +0100)]
radeonsi/gfx10: update si_get_shader_name

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi/gfx10: add as_ngg shader key bit
Nicolai Hähnle [Tue, 7 May 2019 21:27:24 +0000 (23:27 +0200)]
radeonsi/gfx10: add as_ngg shader key bit

Also add the shader main part NGG variant, so that in principle
we can switch between legacy in NGG modes.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>