mesa.git
4 years agoradeonsi/gfx10: update shader-related fields in si_init_config
Nicolai Hähnle [Sat, 18 Nov 2017 13:32:34 +0000 (14:32 +0100)]
radeonsi/gfx10: update shader-related fields in si_init_config

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement si_shader_ps
Nicolai Hähnle [Sat, 18 Nov 2017 12:27:55 +0000 (13:27 +0100)]
radeonsi/gfx10: implement si_shader_ps

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders
Nicolai Hähnle [Thu, 16 Nov 2017 16:00:50 +0000 (17:00 +0100)]
radeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders

This does not support geometry shading yet. Also missing are streamout
and NGG-specific optimizations.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: distinguish between merged shaders and multi-part shaders
Nicolai Hähnle [Fri, 17 Nov 2017 12:40:18 +0000 (13:40 +0100)]
radeonsi/gfx10: distinguish between merged shaders and multi-part shaders

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: update si_get_shader_name
Nicolai Hähnle [Fri, 17 Nov 2017 12:39:45 +0000 (13:39 +0100)]
radeonsi/gfx10: update si_get_shader_name

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: add as_ngg shader key bit
Nicolai Hähnle [Tue, 7 May 2019 21:27:24 +0000 (23:27 +0200)]
radeonsi/gfx10: add as_ngg shader key bit

Also add the shader main part NGG variant, so that in principle
we can switch between legacy in NGG modes.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement si_update_shaders
Nicolai Hähnle [Thu, 16 Nov 2017 16:02:41 +0000 (17:02 +0100)]
radeonsi/gfx10: implement si_update_shaders

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement si_build_vgt_shader_config
Nicolai Hähnle [Tue, 7 May 2019 21:23:03 +0000 (23:23 +0200)]
radeonsi/gfx10: implement si_build_vgt_shader_config

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: keep track of whether NGG is used
Nicolai Hähnle [Tue, 7 May 2019 21:00:43 +0000 (23:00 +0200)]
radeonsi/gfx10: keep track of whether NGG is used

We always use NGG by default, except when tessellation is enabled with
extreme geometry shader amplification.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: document NGG shader stages
Nicolai Hähnle [Thu, 16 Nov 2017 16:02:13 +0000 (17:02 +0100)]
radeonsi/gfx10: document NGG shader stages

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement gfx10_emit_cache_flush
Nicolai Hähnle [Thu, 16 Nov 2017 12:33:36 +0000 (13:33 +0100)]
radeonsi/gfx10: implement gfx10_emit_cache_flush

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: add si_context::emit_cache_flush
Nicolai Hähnle [Thu, 16 Nov 2017 11:16:52 +0000 (12:16 +0100)]
radeonsi/gfx10: add si_context::emit_cache_flush

The introduction of GCR_CNTL makes cache flush handling on gfx10
sufficiently different that it makes sense to just use a separate
function.

Since emit_cache_flush is called quite early during context init,
we initialize the pointer explicitly in si_create_context.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement DB registers
Nicolai Hähnle [Thu, 16 Nov 2017 10:18:52 +0000 (11:18 +0100)]
radeonsi/gfx10: implement DB registers

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: set CB registers
Nicolai Hähnle [Thu, 16 Nov 2017 09:22:15 +0000 (10:22 +0100)]
radeonsi/gfx10: set CB registers

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: always set up sample locations
Nicolai Hähnle [Wed, 15 Nov 2017 20:29:56 +0000 (21:29 +0100)]
radeonsi/gfx10: always set up sample locations

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: use Z32_FLOAT_CLAMP for upgraded depth textures
Nicolai Hähnle [Wed, 15 Nov 2017 20:10:28 +0000 (21:10 +0100)]
radeonsi/gfx10: use Z32_FLOAT_CLAMP for upgraded depth textures

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement vertex format changes
Nicolai Hähnle [Wed, 15 Nov 2017 19:50:19 +0000 (20:50 +0100)]
radeonsi/gfx10: implement vertex format changes

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement si_set_{constant,shader}_buffer
Nicolai Hähnle [Tue, 14 Nov 2017 15:55:34 +0000 (16:55 +0100)]
radeonsi/gfx10: implement si_set_{constant,shader}_buffer

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement si_make_buffer_descriptor
Nicolai Hähnle [Tue, 7 May 2019 20:43:32 +0000 (22:43 +0200)]
radeonsi/gfx10: implement si_make_buffer_descriptor

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement si_set_mutable_tex_desc_fields
Nicolai Hähnle [Tue, 14 Nov 2017 15:36:16 +0000 (16:36 +0100)]
radeonsi/gfx10: implement si_set_mutable_tex_desc_fields

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: gfx10 can render up to 8192 layers
Nicolai Hähnle [Tue, 14 Nov 2017 15:10:10 +0000 (16:10 +0100)]
radeonsi/gfx10: gfx10 can render up to 8192 layers

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: add gfx10_make_texture_descriptor
Nicolai Hähnle [Sun, 15 Apr 2018 17:09:14 +0000 (19:09 +0200)]
radeonsi/gfx10: add gfx10_make_texture_descriptor

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: add pipe_screen::make_texture_descriptor
Nicolai Hähnle [Tue, 14 Nov 2017 15:03:48 +0000 (16:03 +0100)]
radeonsi/gfx10: add pipe_screen::make_texture_descriptor

Texture descriptors in gfx10 are very different.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: determine view->is_integer based on the pipe_format
Nicolai Hähnle [Tue, 14 Nov 2017 14:37:36 +0000 (15:37 +0100)]
radeonsi/gfx10: determine view->is_integer based on the pipe_format

It was convenient, but NUM_FORMAT no longer exists in gfx10.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: implement si_is_format_supported
Nicolai Hähnle [Tue, 14 Nov 2017 14:20:06 +0000 (15:20 +0100)]
radeonsi/gfx10: implement si_is_format_supported

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: generate gfx10_format_table.h
Nicolai Hähnle [Tue, 14 Nov 2017 14:01:13 +0000 (15:01 +0100)]
radeonsi/gfx10: generate gfx10_format_table.h

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: set MAX_ALLOC_COUNT
Nicolai Hähnle [Tue, 24 Oct 2017 11:43:30 +0000 (11:43 +0000)]
radeonsi/gfx10: set MAX_ALLOC_COUNT

The number for Vega was copied from PAL and has no effect because of MIN2.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/gfx10: require LLVM 9
Nicolai Hähnle [Mon, 13 May 2019 19:58:30 +0000 (21:58 +0200)]
radeonsi/gfx10: require LLVM 9

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: update for new vcn enc interface
Boyuan Zhang [Wed, 13 Mar 2019 23:14:13 +0000 (19:14 -0400)]
radeon/vcn: update for new vcn enc interface

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: enable jpeg decode for navi10
Boyuan Zhang [Tue, 5 Mar 2019 22:51:23 +0000 (17:51 -0500)]
radeonsi: enable jpeg decode for navi10

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: implement vcn 2.0 jpeg decode
Boyuan Zhang [Tue, 5 Mar 2019 22:49:57 +0000 (17:49 -0500)]
radeon/vcn: implement vcn 2.0 jpeg decode

Use direct register to implement vcn 2.0 jpeg deocde

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: add direct register bool
Boyuan Zhang [Tue, 5 Mar 2019 22:48:52 +0000 (17:48 -0500)]
radeon/vcn: add direct register bool

VCN 2.0 uses direct register space where VCN 1.0 uses some indirect registers

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: add defines for vcn 2.0 jpeg
Boyuan Zhang [Tue, 5 Mar 2019 21:51:37 +0000 (16:51 -0500)]
radeon/vcn: add defines for vcn 2.0 jpeg

Add neccesary register defines for vcn 2.0 jpeg deocde

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: use variable to assign ib cmd
Boyuan Zhang [Sat, 2 Mar 2019 02:14:50 +0000 (21:14 -0500)]
radeon/vcn: use variable to assign ib cmd

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: implement vcn 2.0 encode
Boyuan Zhang [Sat, 2 Mar 2019 02:14:06 +0000 (21:14 -0500)]
radeon/vcn: implement vcn 2.0 encode

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: add vcn2.0 encode skeleton
Boyuan Zhang [Thu, 1 Nov 2018 19:47:58 +0000 (15:47 -0400)]
radeon/vcn: add vcn2.0 encode skeleton

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
(v2: build fix -- Nicolai)
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: move vcn1.0 specific defines to c
Boyuan Zhang [Thu, 1 Nov 2018 19:35:04 +0000 (15:35 -0400)]
radeon/vcn: move vcn1.0 specific defines to c

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: assign function pointer with ib functions
Boyuan Zhang [Wed, 31 Oct 2018 18:11:17 +0000 (14:11 -0400)]
radeon/vcn: assign function pointer with ib functions

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: add function pointer for ib functions
Boyuan Zhang [Tue, 30 Oct 2018 18:57:59 +0000 (14:57 -0400)]
radeon/vcn: add function pointer for ib functions

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: move header related algorithm to vcn_enc
Boyuan Zhang [Tue, 30 Oct 2018 18:14:35 +0000 (14:14 -0400)]
radeon/vcn: move header related algorithm to vcn_enc

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: move add buf func to common file
Boyuan Zhang [Sat, 2 Mar 2019 02:37:20 +0000 (21:37 -0500)]
radeon/vcn: move add buf func to common file

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: move cs defines to enc header file
Boyuan Zhang [Sat, 2 Mar 2019 02:35:59 +0000 (21:35 -0500)]
radeon/vcn: move cs defines to enc header file

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: add VP9 support for Navi10
Leo Liu [Mon, 7 Jan 2019 15:29:07 +0000 (10:29 -0500)]
radeon/vcn: add VP9 support for Navi10

It requires bigger DPB and context buffers

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: enable encode support for newer HW
Leo Liu [Tue, 16 Oct 2018 18:18:12 +0000 (14:18 -0400)]
radeonsi: enable encode support for newer HW

Previously it was Raven only allowed to do so

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeon/vcn: add VCN2 set of internal registers for IB
Leo Liu [Fri, 5 Oct 2018 13:39:15 +0000 (09:39 -0400)]
radeon/vcn: add VCN2 set of internal registers for IB

From VCN2.0, the RBC have different views on the registers

Signed-off-by: Leo Liu <leo.liu@amd.com>
(v2: rebase -- Nicolai)
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi/uvd: allow newer HW to create HW decoder
Leo Liu [Fri, 5 Oct 2018 13:19:45 +0000 (09:19 -0400)]
radeonsi/uvd: allow newer HW to create HW decoder

Previously it was Raven only allowed to do so

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoac/surface/gfx10: allow "rotated" micro mode
Nicolai Hähnle [Thu, 28 Jun 2018 19:01:40 +0000 (21:01 +0200)]
ac/surface/gfx10: allow "rotated" micro mode

Standard mode does not support DCC.

The R is retconned to "render target" on gfx10.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoac/surface/gfx10: DCC is only supported with SW_64KB_{Z,R}_X modes
Nicolai Hähnle [Thu, 28 Jun 2018 18:53:51 +0000 (20:53 +0200)]
ac/surface/gfx10: DCC is only supported with SW_64KB_{Z,R}_X modes

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/addrlib/gfx10: forbid DCC for swizzle modes which the hardware does not support
Nicolai Hähnle [Thu, 28 Jun 2018 19:03:39 +0000 (21:03 +0200)]
amd/addrlib/gfx10: forbid DCC for swizzle modes which the hardware does not support

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/addrlib/gfx10: fix assertion in Addr2IsValidDisplaySwizzleMode
Nicolai Hähnle [Thu, 28 Jun 2018 18:12:05 +0000 (20:12 +0200)]
amd/addrlib/gfx10: fix assertion in Addr2IsValidDisplaySwizzleMode

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: print gfx10 registers in debug dumps
Nicolai Hähnle [Fri, 24 May 2019 12:34:45 +0000 (14:34 +0200)]
amd/common/gfx10: print gfx10 registers in debug dumps

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: CMASK is only used for FMASK
Nicolai Hähnle [Sun, 19 Nov 2017 16:26:23 +0000 (17:26 +0100)]
amd/common/gfx10: CMASK is only used for FMASK

All regular color compression is done via DCC.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: support new tbuffer encoding
Nicolai Hähnle [Mon, 25 Mar 2019 17:12:07 +0000 (18:12 +0100)]
amd/common/gfx10: support new tbuffer encoding

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: pad shader buffers for instruction prefetch
Nicolai Hähnle [Thu, 29 Nov 2018 23:37:07 +0000 (00:37 +0100)]
amd/common/gfx10: pad shader buffers for instruction prefetch

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: implement scan & reduce operations
Nicolai Hähnle [Wed, 23 May 2018 20:08:22 +0000 (22:08 +0200)]
amd/common/gfx10: implement scan & reduce operations

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: add GS_ALLOC_REQ message define
Nicolai Hähnle [Tue, 7 May 2019 20:34:50 +0000 (22:34 +0200)]
amd/common/gfx10: add GS_ALLOC_REQ message define

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: print out GCR_CNTL as part of {ACQUIRE,RELEASE}_MEM
Nicolai Hähnle [Sun, 19 Nov 2017 14:23:44 +0000 (15:23 +0100)]
amd/common/gfx10: print out GCR_CNTL as part of {ACQUIRE,RELEASE}_MEM

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common/gfx10: add register JSON
Nicolai Hähnle [Tue, 7 May 2019 00:37:19 +0000 (02:37 +0200)]
amd/common/gfx10: add register JSON

A small number of fields now need new disambiguation.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/common: add GFX10 chips
Nicolai Hähnle [Tue, 24 Oct 2017 11:42:31 +0000 (11:42 +0000)]
amd/common: add GFX10 chips

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agomeson: require libdrm_amdgpu 2.4.99 for Navi
Marek Olšák [Tue, 2 Jul 2019 18:47:01 +0000 (14:47 -0400)]
meson: require libdrm_amdgpu 2.4.99 for Navi

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: gfx10 is not supported
Nicolai Hähnle [Mon, 13 May 2019 19:57:47 +0000 (21:57 +0200)]
radv: gfx10 is not supported

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoamd/addrlib: add gfx10 support
Marek Olšák [Thu, 20 Jun 2019 00:42:18 +0000 (20:42 -0400)]
amd/addrlib: add gfx10 support

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: make emit_streamout_output externally accessible
Nicolai Hähnle [Fri, 21 Sep 2018 20:07:01 +0000 (22:07 +0200)]
radeonsi: make emit_streamout_output externally accessible

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: pass the context to query destroy functions
Nicolai Hähnle [Thu, 20 Sep 2018 08:19:30 +0000 (10:19 +0200)]
radeonsi: pass the context to query destroy functions

We'll need this in the future.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: make si_restore_qbo_state externally available
Nicolai Hähnle [Wed, 24 Apr 2019 13:52:35 +0000 (15:52 +0200)]
radeonsi: make si_restore_qbo_state externally available

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: make get_primitive_id externally visible
Nicolai Hähnle [Tue, 7 May 2019 20:52:27 +0000 (22:52 +0200)]
radeonsi: make get_primitive_id externally visible

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: make si_llvm_export_vs externally available
Nicolai Hähnle [Thu, 16 Nov 2017 15:49:06 +0000 (16:49 +0100)]
radeonsi: make si_llvm_export_vs externally available

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: various si_translate_*format functions only apply to pre-gfx10
Nicolai Hähnle [Tue, 7 May 2019 20:38:20 +0000 (22:38 +0200)]
radeonsi: various si_translate_*format functions only apply to pre-gfx10

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradeonsi: use a fragment shader blit instead of DB->CB copy for ZS CPU mappings
Marek Olšák [Thu, 20 Jun 2019 22:32:57 +0000 (18:32 -0400)]
radeonsi: use a fragment shader blit instead of DB->CB copy for ZS CPU mappings

This mainly removes and simplifies code that is no longer needed.

There were some issues with the DB->CB stencil copy on gfx10, so let's
just use a fragment shader blit for all ZS mappings. It's more reliable.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
4 years agogallium/u_blitter: implement copying from ZS to color and vice versa
Marek Olšák [Thu, 20 Jun 2019 22:24:19 +0000 (18:24 -0400)]
gallium/u_blitter: implement copying from ZS to color and vice versa

This is for drivers that can't map depth and stencil and need to blit
them to a color texture for CPU access.

This also useful for drivers using separate depth and stencil.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
4 years agogallium/util: rewrite depth-stencil blit shaders
Marek Olšák [Thu, 20 Jun 2019 23:52:23 +0000 (19:52 -0400)]
gallium/util: rewrite depth-stencil blit shaders

- merge all 3 functions (Z, S, ZS)
- don't write the color output
- read the value from texel.x, then write it to position.z or stencil.y
  (don't use the value from texel.y or texel.z)

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
4 years agost/mesa: accelerate glCopyPixels(STENCIL)
Marek Olšák [Sat, 15 Jun 2019 04:09:56 +0000 (00:09 -0400)]
st/mesa: accelerate glCopyPixels(STENCIL)

Tested-by: Dieter Nützel
4 years agoglsl/standalone: meson test for --dump-builder
Yevhenii Kolesnikov [Wed, 20 Feb 2019 13:42:27 +0000 (15:42 +0200)]
glsl/standalone: meson test for --dump-builder

Added meson test for standalone compiler with --dump-builder option
on builtin texture* functions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
4 years agoglsl/standalone: exit on unsupported texture functions
Sergii Romantsov [Thu, 30 Aug 2018 12:04:35 +0000 (15:04 +0300)]
glsl/standalone: exit on unsupported texture functions

glsl/standalone with --dump-builder will exit when unsupported texture
functions are encountered.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
4 years agoradeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled
Pierre-Eric Pelloux-Prayer [Wed, 3 Jul 2019 17:27:12 +0000 (19:27 +0200)]
radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled

gl_SampleMaskIn is 1 when R_028BE0_PA_SC_AA_CONFIG is 0, so this commit rework the conditions
controlling this register.

Before it was set if the sctx->framebuffer had a sample count > 1.

Now we still require this condition, but we also need either:
  - GL_MULTISAMPLE to be enabled
  - to be executing an operation that doesn't depends on GL state using u_blitter.

This fixes the arb_sample_shading/sample_mask piglit tests on radeonsi.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
4 years agogallium/u_blitter: enable MSAA when blitting to MSAA surfaces
Brian Paul [Thu, 7 Dec 2017 16:09:13 +0000 (09:09 -0700)]
gallium/u_blitter: enable MSAA when blitting to MSAA surfaces

If we're doing a Z -> Z MSAA blit (for example) we need to enable
msaa rasterization when drawing the quads so that we can properly
write the per-sample values.

This fixes a number of Piglit ext_framebuffer_multisample blit tests
such as ext_framebuffer_multisample/no-color 2 depth combined with
the VMware driver.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
4 years agovirgl: Clear the valid buffer range when possible
Alexandros Frantzis [Fri, 21 Jun 2019 22:18:27 +0000 (01:18 +0300)]
virgl: Clear the valid buffer range when possible

If we are discarding the whole resource, we don't care about previous contents,
and the resource storage is now unused, either because we have created new
resource storage, or because we have waited for the existing resource storage
to become unused, or because the transfer is unsynchronized.

In the last two cases this commit marks the storage as uninitialized, but only
if the resource is not host writable (in which case we can't clear the valid
range, since that would result in missed readbacks in future transfers).

In the first case, when the whole resource discard involves a reallocation, the
reallocation and subsequent rebinding already update the valid buffer range
appropriately.

Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
4 years agoswr/swr: Enable ARB_viewport_array
Jan Zielinski [Tue, 2 Jul 2019 14:44:34 +0000 (16:44 +0200)]
swr/swr: Enable ARB_viewport_array

The rasterizer core supported ARB_viewport_array,
but the swr layer connecting core to Gallium state
tracker only allowed one viewport.

We add support for multiple viewports to swr layer.

Reviewed-by: Alok Hota <alok.hota@intel.com>
4 years agoradv: Support VK_EXT_queue_family_foreign.
Bas Nieuwenhuizen [Wed, 3 Jul 2019 00:25:19 +0000 (02:25 +0200)]
radv: Support VK_EXT_queue_family_foreign.

Basically same as external for now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Only case we might need to handle differently in the near future
is Raven's case of displayable DCC which is not renderable. But
we don't support that yet.

4 years agoradv: Fix interactions between variable descriptor count and inline uniform blocks.
Bas Nieuwenhuizen [Tue, 2 Jul 2019 09:32:44 +0000 (11:32 +0200)]
radv: Fix interactions between variable descriptor count and inline uniform blocks.

Fixes: d7e6541cc72 "radv: Only allocate supplied number of descriptors when variable."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agowinsys/amdgpu: Make KMS handles valid for original DRM file descriptor
Michel Dänzer [Fri, 28 Jun 2019 16:35:56 +0000 (18:35 +0200)]
winsys/amdgpu: Make KMS handles valid for original DRM file descriptor

Getting a DMA-buf fd and converting that to a handle using our duplicate
of that file descriptor (getting at which requires passing a
radeon_winsys pointer to the buffer_get_handle hook) makes sure of this,
since duplicated file descriptors reference the same file description
and therefore the same GEM handle namespace.

This is necessary because libdrm_amdgpu may use a different DRM file
descriptor with a separate handle namespace internally, e.g. because it
always reuses any existing amdgpu_device_handle for the same device.
amdgpu_bo_export returns a handle which is valid for that internal
file descriptor.

Bugzilla: https://bugs.freedesktop.org/110903
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agowinsys/amdgpu: Add amdgpu_screen_winsys
Michel Dänzer [Fri, 28 Jun 2019 14:06:23 +0000 (16:06 +0200)]
winsys/amdgpu: Add amdgpu_screen_winsys

It extends pipe_screen / radeon_winsys and references amdgpu_winsys.
Multiple amdgpu_screen_winsys instances may reference the same
amdgpu_winsys instance, which corresponds to an amdgpu_device_handle.

The purpose of amdgpu_screen_winsys is to keep a duplicate of the DRM
file descriptor passed to amdgpu_winsys_create, which will be needed
in the next change.

v2:
* Add comment in amdgpu_winsys_unref explaining why it always returns
  true (Marek Olšák)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agowinsys/amdgpu: Use amdgpu_winsys helper instead of open-coded casts
Michel Dänzer [Mon, 1 Jul 2019 07:20:11 +0000 (09:20 +0200)]
winsys/amdgpu: Use amdgpu_winsys helper instead of open-coded casts

Cleanup to prevent breakage with the next change, no functional change
intended in this one.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agointel: fix wrong format usage
Juan A. Suarez Romero [Tue, 2 Jul 2019 17:36:56 +0000 (19:36 +0200)]
intel: fix wrong format usage

Do not use the view format when filling the surface state.

Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.*

Fixes: fb1350c76f1 ("intel: Add and use helpers for level0 extent")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoradv: only allocate a 32-bit value for the TC-compat range metadata
Samuel Pitoiset [Tue, 2 Jul 2019 12:50:28 +0000 (14:50 +0200)]
radv: only allocate a 32-bit value for the TC-compat range metadata

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: remove unused code in radv_update_tc_compat_zrange_metadata()
Samuel Pitoiset [Tue, 2 Jul 2019 12:50:27 +0000 (14:50 +0200)]
radv: remove unused code in radv_update_tc_compat_zrange_metadata()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: add radv_get_depth_pipeline() helper
Samuel Pitoiset [Tue, 2 Jul 2019 12:50:24 +0000 (14:50 +0200)]
radv: add radv_get_depth_pipeline() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoiris: assert isl_surf_init success in resource_from_handle
Mike Blumenkrantz [Wed, 29 May 2019 20:27:39 +0000 (16:27 -0400)]
iris: assert isl_surf_init success in resource_from_handle

this can fail unexpectedly due to bugs, so it's good to provide feedback
when this occurs

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
4 years agoanv: Advertise a more accurate minTexelBufferOffsetAlignment
Jason Ekstrand [Thu, 6 Jun 2019 20:45:57 +0000 (15:45 -0500)]
anv: Advertise a more accurate minTexelBufferOffsetAlignment

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agoanv: Implement VK_EXT_texel_buffer_alignment
Jason Ekstrand [Thu, 6 Jun 2019 20:31:17 +0000 (15:31 -0500)]
anv: Implement VK_EXT_texel_buffer_alignment

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agovulkan: Update the XML and headers to 1.1.113
Jason Ekstrand [Thu, 6 Jun 2019 20:22:02 +0000 (15:22 -0500)]
vulkan: Update the XML and headers to 1.1.113

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
4 years agospirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup
Caio Marcelo de Oliveira Filho [Mon, 1 Jul 2019 23:06:13 +0000 (16:06 -0700)]
spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup

From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1):

    For objects in the Uniform, StorageBuffer, or PushConstant storage
    classes, the element’s address or location is calculated using a
    stride, which will be the Base-type’s Array Stride when the Base
    type is decorated with ArrayStride. For all other objects, the
    implementation will calculate the element’s address or location.

For non-CL shaders the driver should layout the Workgroup storage
class, so override any explicitly set ArrayStride in the shader.  This
currently fixes only the lower_workgroup_access_to_offsets case, which
is used by anv.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
4 years agonouveau: handle new CAPS
Karol Herbst [Mon, 1 Jul 2019 10:17:39 +0000 (12:17 +0200)]
nouveau: handle new CAPS

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
4 years agointel/fs: Use nir_lower_interpolation on gen11+
Jason Ekstrand [Thu, 11 Apr 2019 19:57:12 +0000 (14:57 -0500)]
intel/fs: Use nir_lower_interpolation on gen11+

On gen11, the removed the PLN instruction so we have to emit a pile of
MAD to emulate it.  We may as well do that in NIR so we can optimize and
later schedule it.

Shader-db results on Ice Lake:

    total instructions in shared programs: 17145644 -> 16556440 (-3.44%)
    instructions in affected programs: 11507454 -> 10918250 (-5.12%)
    helped: 35763
    HURT: 42085
    helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18
    helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49%
    HURT stats (abs)   min: 1 max: 248 x̄: 2.22 x̃: 2
    HURT stats (rel)   min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47%
    95% mean confidence interval for instructions value: -7.67 -7.47
    95% mean confidence interval for instructions %-change: -4.46% -4.29%
    Instructions are helped.

    total loops in shared programs: 4370 -> 4370 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 360624645 -> 368220857 (2.11%)
    cycles in affected programs: 269631244 -> 277227456 (2.82%)
    helped: 15583
    HURT: 65874
    helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32
    helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44%
    HURT stats (abs)   min: 1 max: 238638 x̄: 133.87 x̃: 20
    HURT stats (rel)   min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97%
    95% mean confidence interval for cycles value: 67.42 119.09
    95% mean confidence interval for cycles %-change: 3.61% 3.73%
    Cycles are HURT.

    total spills in shared programs: 8943 -> 8981 (0.42%)
    spills in affected programs: 1925 -> 1963 (1.97%)
    helped: 44
    HURT: 14

    total fills in shared programs: 21815 -> 21925 (0.50%)
    fills in affected programs: 3511 -> 3621 (3.13%)
    helped: 41
    HURT: 18

    LOST:   70
    GAINED: 14

Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agointel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas
Jason Ekstrand [Thu, 11 Apr 2019 19:55:40 +0000 (14:55 -0500)]
intel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas

Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agointel/fs: Actually implement the load_barycentric intrinsics
Jason Ekstrand [Thu, 11 Apr 2019 19:12:58 +0000 (14:12 -0500)]
intel/fs: Actually implement the load_barycentric intrinsics

If they never get used, dead code should clean them up.  Also, we rework
the at_offset and at_sample intrinsics so they return a proper vec2
instead of returning things in PLN layout.  Fortunately, copy-prop is
pretty good at cleaning this up and it doesn't result in any actual
extra MOVs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agonir: add pass to lower load_interpolated_input
Rob Clark [Wed, 3 Apr 2019 23:29:36 +0000 (19:29 -0400)]
nir: add pass to lower load_interpolated_input

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agopanfrost: Pass referenced BOs to the SUBMIT ioctls
Boris Brezillon [Tue, 2 Jul 2019 12:26:02 +0000 (14:26 +0200)]
panfrost: Pass referenced BOs to the SUBMIT ioctls

Instead of manually adding the BOs from the various SLAB pools plus
the one backing the color FB, we insert them in the BO set attached
to the job and let panfrost_drm_submit_job() pass all BOs from this set
to the SUBMIT ioctl.
This means we are now passing all referenced BOs and let the scheduler
wait on referenced BO fences if needed.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
4 years agopanfrost: Make SLAB pool creation rely on BO helpers
Boris Brezillon [Tue, 2 Jul 2019 11:50:44 +0000 (13:50 +0200)]
panfrost: Make SLAB pool creation rely on BO helpers

There's no point duplicating the code, and it will help us simplify
the bo_handles[] filling logic in panfrost_drm_submit_job().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
4 years agopanfrost: Add the panfrost_drm_{create,release}_bo() helpers
Boris Brezillon [Tue, 2 Jul 2019 11:21:55 +0000 (13:21 +0200)]
panfrost: Add the panfrost_drm_{create,release}_bo() helpers

To avoid the panfrost_memory <-> panfrost_bo dance done in
panfrost_resource_create_bo() and panfrost_bo_unreference().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>