Matt Turner [Tue, 6 Jun 2017 22:43:23 +0000 (15:43 -0700)]
i965: Rename brw_inst 3src functions in preparation for align1
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Matt Turner [Wed, 14 Jun 2017 22:05:39 +0000 (15:05 -0700)]
i965: Print subreg in units of type-size on ternary instructions
The instruction word contains SubRegNum[4:2] so it's in units of dwords
(hence the * 4 to get it in terms of bytes). Before this patch, the
subreg would have been wrong for DF arguments.
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Matt Turner [Wed, 14 Jun 2017 21:08:32 +0000 (14:08 -0700)]
i965: Add functions for brw_reg_type <-> hw 3src type
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Matt Turner [Thu, 24 Aug 2017 23:04:26 +0000 (16:04 -0700)]
i965: Move brw_reg_type_is_floating_point to brw_reg_type.h
I'm going to call this from brw_inst.h, and I don't want to have to
include all of brw_reg.h.
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Jason Ekstrand [Fri, 15 Sep 2017 02:52:38 +0000 (19:52 -0700)]
nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Samuel Pitoiset [Tue, 17 Oct 2017 07:47:53 +0000 (09:47 +0200)]
radv: use optimal packet order for draws
Ported from RadeonSI. The time where shaders are idle should
be shorter now. This can give a little boost, like +6% with
the dynamicubo Vulkan demo.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 16 Oct 2017 15:48:42 +0000 (17:48 +0200)]
radv: add radv_emit_shaders_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 16 Oct 2017 15:34:42 +0000 (17:34 +0200)]
radv: add radv_emit_shader_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Marek Olšák [Fri, 20 Oct 2017 16:55:48 +0000 (18:55 +0200)]
st/mesa: correct a u_vbuf comment
trivial.
Christian Gmeiner [Thu, 19 Oct 2017 21:12:48 +0000 (23:12 +0200)]
etnaviv: fix implicit conversion warning
Galliums query_type used in APIs is unsigned.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Christian Gmeiner [Thu, 19 Oct 2017 21:12:47 +0000 (23:12 +0200)]
etnaviv: enable occlusion query if GPU supports it
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Christian Gmeiner [Thu, 19 Oct 2017 21:12:46 +0000 (23:12 +0200)]
etnaviv: add support for occlusion queries
Passes most occlusion query piglits. The following piglits are broken:
- spec@arb_occlusion_query@occlusion_query_meta_fragments
- spec@arb_occlusion_query@occlusion_query_meta_save
- spec@arb_occlusion_query2@render
v1 -> v2:
- use one sample provider for all occlusion queries tyes
- add comment about 'magic' value 0x1DF5E76
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Christian Gmeiner [Thu, 19 Oct 2017 21:12:45 +0000 (23:12 +0200)]
etnaviv: add basic infrastructure for hw queries
No hardware query is supported yet.
v1 -> v2
- removed query_type from strcut etna_hw_sample_provider
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Christian Gmeiner [Thu, 19 Oct 2017 21:12:44 +0000 (23:12 +0200)]
etnaviv: update headers from rnndb
Update to etna_viv commit
6c9c706.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Chris Wilson [Wed, 3 May 2017 14:42:35 +0000 (15:42 +0100)]
relnotes/17.3: EGL_IMG_context_priority is now implemented
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Wilson [Tue, 11 Apr 2017 15:17:36 +0000 (16:17 +0100)]
i965: Report supported context priorities to EGL/DRI
Hook up the RendererQuery for __DRI2_RENDERER_HAS_CONTEXT_PRIORITY to
report the available DRM_I915_GEM_CONTEXT_SETPARAM options based on the
per-client default context. The kernel will validate the request to change
the property, so we get an accurate reflection of available support
(based on kernel version and privilege) and we should only have to do it
once during screen setup -- although the SETPARAM should be fast, they
are still an ioctl each.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Wilson [Tue, 11 Apr 2017 14:24:54 +0000 (15:24 +0100)]
i965: Pass the EGL/DRI context priority through to the kernel
Decode the EGL/DRI priority enum into the [-1023, 1023] range as
interpreted by the kernel and call DRM_I915_GEM_CONTEXT_SETPARAM to
adjust the priority. We use 0 as the default medium priority (also the
kernel default) and so only need adjust up or down. By only doing the
adjustment if not setting to medium, we can faithfully report any error
whilst setting without worrying about kernel version.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Wilson [Wed, 27 Sep 2017 15:14:33 +0000 (16:14 +0100)]
i965: Record the presence of the kernel scheduler
Mention to the debug log if the kernel scheduler is enabled; and in
particular if it has preemption enabled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Wilson [Wed, 27 Sep 2017 17:37:07 +0000 (18:37 +0100)]
i965: Sync i915_drm.h from kernel for IMG_context_priority
Pulling in changes up to
kernel commit
ac14fbd460d0ec16e7750e40dcd8199b0ff83d0a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Oct 3 21:34:53 2017 +0100
drm/i915/scheduler: Support user-defined priorities
and including the fixup from
kernel commit
822a4b673284672af697ccd66e8795f8a712a90d
Author: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Date: Fri Oct 6 13:45:59 2017 +0300
drm/i915: Don't use BIT() in UAPI section
for implementing IMG_context_priority.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Wilson [Thu, 27 Oct 2016 18:54:49 +0000 (19:54 +0100)]
egl,dri: Propagate context priority hint to driver->CreateContext
Jump through the layers of abstraction between egl and dri in order to
feed the context priority attribute through to the backend. This
requires us to read the value from the base _egl_context, convert it to
a DRI attribute, parse it again in the generic context creator before
passing it to the driver as a function parameter.
In order to not require us to pass back the actual value of the context
priority after creation, we impose that drivers should report the
available set of priorities during screen setup (and then they may chose
to fail if given an invalid value as that should have been checked at
the user boundary.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net> # i915/i965
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Wilson [Thu, 27 Oct 2016 18:34:46 +0000 (19:34 +0100)]
egl: Support IMG_context_priority
IMG_context_priority
https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt
"This extension allows an EGLContext to be created with a priority
hint. It is possible that an implementation will not honour the
hint, especially if there are constraints on the number of high
priority contexts available in the system, or system policy limits
access to high priority contexts to appropriate system privilege
level. A query is provided to find the real priority level assigned
to the context after creation."
The extension adds a new eglCreateContext attribute for choosing a
priority hint. This stub parses the attribute and copies into the base
struct _egl_context, and hooks up the query similarly.
Since the attribute is purely a hint, I have no qualms about the lack of
implementation before reporting back the value the user gave!
v2: Remember to set the default ContextPriority value to medium.
v3: Use the driRendererQuery interface to probe the backend for
supported priority values and use those to mask the EGL interface.
v4: Treat the priority attrib as a hint and gracefully mask any requests
not supported by the driver, the EGLContext will remain at medium
priority.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fredrik Höglund [Thu, 19 Oct 2017 18:54:50 +0000 (20:54 +0200)]
radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT
The Vulkan specification says:
"... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_-
PIPE_BIT in the source stage mask will effectively not wait for
any prior commands to complete."
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Fri, 20 Oct 2017 09:21:27 +0000 (11:21 +0200)]
radv: mark total_count as MAYBE_UNUSED in CmdSet{Viewport,Scissor}
Fixes two compilation warnings in release build. Trivial.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Mon, 16 Oct 2017 18:59:43 +0000 (20:59 +0200)]
radv: rename radv_cmd_buffer_flush_state() to radv_draw()
Similar to the dispatch codepath.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 10 Oct 2017 11:36:23 +0000 (13:36 +0200)]
radv: emit primitive restart from radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 10 Oct 2017 11:29:58 +0000 (13:29 +0200)]
radv: add radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 13 Oct 2017 17:06:11 +0000 (19:06 +0200)]
radv: refactor indirect draws (+count buffer) with radv_draw_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 13 Oct 2017 16:56:48 +0000 (18:56 +0200)]
radv: refactor indirect draws with radv_draw_info
Indirect draws with a count buffer will be refactored in a
separate patch.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 13 Oct 2017 15:34:35 +0000 (17:34 +0200)]
radv: refactor simple and indexed draws with radv_draw_info
Similar to the dispatch compute logic but for draw calls. For
convenience, indirect draws will be converted in a separate
patch.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 19 Oct 2017 10:35:46 +0000 (12:35 +0200)]
radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite it
Only on CIK and later. We should only update VGT_INDEX_TYPE but
it seems easier to re-emit all the index buffer packets.
Fixes: 966d66f28f (radv: do not re-emit the index buffer for every draw call)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 19 Oct 2017 10:35:45 +0000 (12:35 +0200)]
radv: clear the dirty flags in the corresponding emit helpers
This will allow us to fix the VGT_INDEX_TYPE issue properly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 19 Oct 2017 10:35:44 +0000 (12:35 +0200)]
radv: rename RADV_CMD_DIRTY_RENDER_TARGETS to RADV_CMD_DIRTY_FRAMEBUFFER
To be consistent with the emit function name.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 19 Oct 2017 14:25:59 +0000 (16:25 +0200)]
radv: move DB_COUNT_CONTROL initialization to si_emit_config()
CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Iglesias Gonsálvez [Mon, 9 Oct 2017 10:25:39 +0000 (12:25 +0200)]
i965/vec4: remove setting default LOD in the backend
It is already done in NIR.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Iglesias Gonsálvez [Mon, 9 Oct 2017 10:24:39 +0000 (12:24 +0200)]
i965/fs: remove setting default LOD in the backend
It is already done in NIR.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Iglesias Gonsálvez [Mon, 9 Oct 2017 10:24:06 +0000 (12:24 +0200)]
nir: set default lod to texture opcodes that needed it but don't provide it
v2:
- Use helper to add a new source to the texture instruction.
v3:
- Use nir_tex_instr_src_index() to simplify the patch (Jason).
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 23:09:08 +0000 (01:09 +0200)]
radv: enable GS on GFX9
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Fri, 20 Oct 2017 00:24:24 +0000 (02:24 +0200)]
radv: calculate and emit GFX9 GS registers to pipeline state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Fri, 20 Oct 2017 00:49:57 +0000 (02:49 +0200)]
ac/nir: Fix up GS input vgprs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 23:40:31 +0000 (01:40 +0200)]
ac/nir: Add loading from LDS for merged GS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 23:27:12 +0000 (01:27 +0200)]
ac/nir: Add ES output to LDS for GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 23:06:50 +0000 (01:06 +0200)]
ac/nir: Add merged GS function.
[airlied: merged fixup + and fixed up a couple more bits].
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 23:08:30 +0000 (01:08 +0200)]
radv: Only emit TES when it exists.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 23:42:34 +0000 (01:42 +0200)]
radv: Use control shader presence for detecting tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 20 Oct 2017 02:45:51 +0000 (03:45 +0100)]
radv: fixup tess eval shader when combined.
This fixes some access to the tess eval shader when it's combined
with geometry on gfx9.
This is a review of Bas's commit:
radv: Prevent crashing by accessing TES for VGT reuse depth.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Fri, 20 Oct 2017 01:17:14 +0000 (03:17 +0200)]
radv: Set VGT_GS_MODE properly for gfx9
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 20 Oct 2017 03:02:15 +0000 (04:02 +0100)]
radv: ensure correct outinfo is picked.
This struct used to rely on being in a union, it isn't anymore,
so we have to pick the correct outinfo struct now.
This should fix a regression since the union became a struct.
dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set
Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
George Kyriazis [Wed, 18 Oct 2017 19:10:26 +0000 (14:10 -0500)]
swr: Rework scratch space allocation
Remove allocation of > 2kbyte buffers into context memory in
swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers
and shader constants to a scratch space to be used by the upcoming draw.)
Large shader constant allocations need to be done in the circular scratch
buffer instead of context memory, because their values persist across
render calls.
Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger
buffers will get too large for the circular scratch space.
Fixes render issues with CEI Ensight.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 21:28:25 +0000 (23:28 +0200)]
radv: Enable tessellation shaders for GFX9.
It mostly works now.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 19 Oct 2017 04:29:02 +0000 (05:29 +0100)]
ac/nir: init full exec mask for merged shaders.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 17 Oct 2017 06:12:28 +0000 (07:12 +0100)]
radv: drop unused r600_htile_info.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 19 Oct 2017 03:52:29 +0000 (04:52 +0100)]
radv: fix CLEAR_STATE packet length.
Looking at shader traces I noticed some registers were missing,
one of them was being eaten by the wrong clear state length.
Fixes: 4f42ea4dc (radv: use CLEAR_STATE for initializing some registers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dylan Baker [Thu, 19 Oct 2017 17:28:37 +0000 (10:28 -0700)]
meson: don't build gallium dri target if gallium is disabled
Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and
the megadriver install script to attempt to install drivers without any
actual drivers being built.
fixes:
66f97f6640f5316b36177fd1053f0027eb6ec6cc ("meson: build radeonsi")
Reported-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Timothy Arceri [Wed, 18 Oct 2017 22:27:04 +0000 (09:27 +1100)]
radv: copy indirect lowering settings from radeonsi
It looks the original indirect mask was probably copied from
ANV.
Sascha Willems demo results:
tessellation ~4000 -> ~4200 fps
V2: continue lowering local indirects due to llvm deficiencies.
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Wed, 18 Oct 2017 22:27:03 +0000 (09:27 +1100)]
radv: stop redundant setting of active_stages
We already set it when above in the nir compilation loop.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Timothy Arceri [Thu, 19 Oct 2017 06:01:35 +0000 (17:01 +1100)]
ac: move some code out of loop in store_tcs_output()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Tue, 17 Oct 2017 22:59:16 +0000 (00:59 +0200)]
radv: Modify rsrc1/rsrc2 generation for merged tess.
No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at
a different offset.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 21:57:46 +0000 (23:57 +0200)]
radv: Set correct registers for merged shader rings.
We need different regs to end up in s0/s1.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Tue, 17 Oct 2017 20:51:00 +0000 (22:51 +0200)]
radv: Add GFX9 HS emitting code.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 16:27:47 +0000 (18:27 +0200)]
radv: Remove remaining hard coded references to VS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 16:09:25 +0000 (18:09 +0200)]
radv: Update GFX9 user data regs for GS/tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 11:18:02 +0000 (13:18 +0200)]
radv: Add code to compile merged shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Thu, 19 Oct 2017 00:58:34 +0000 (02:58 +0200)]
ac/nir: Add LS-HS input VGPR workaround.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Wed, 18 Oct 2017 23:36:26 +0000 (01:36 +0200)]
ac/nir: Compile the bodies of multiple shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 22:01:33 +0000 (00:01 +0200)]
ac/nir: Expand user SGPR descriptions a bit.
To prevent VS/TCS collisions in merged shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 15:45:06 +0000 (17:45 +0200)]
ac/nir: Don't write to the dynamic HS word on GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 14:32:41 +0000 (16:32 +0200)]
ac/nir: Add function creation for merged LS+HS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 14:04:20 +0000 (16:04 +0200)]
ac/nir: Make scan_shader_output_decl less dependent on the context.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 25 Sep 2017 03:54:55 +0000 (05:54 +0200)]
ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 24 Sep 2017 23:05:49 +0000 (01:05 +0200)]
ac/nir: Change interface to allow multiple source shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 20:15:47 +0000 (22:15 +0200)]
ac/nir: Add HS calling convention.
Needed for GFX9 merged shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 16 Oct 2017 21:58:48 +0000 (23:58 +0200)]
ac: Parse the new HS RSRC1 register.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tim Rowley [Tue, 17 Oct 2017 20:11:19 +0000 (15:11 -0500)]
swr: knob overrides for Intel Xeon Phi
Architecture benefits from having more threads/work outstanding.
Patch by Jan Zielinski.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 17 Oct 2017 20:02:53 +0000 (15:02 -0500)]
swr/rast: Add api to override draws in flight
Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO.
Patch by Jan Zielinski.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Mon, 16 Oct 2017 23:39:41 +0000 (18:39 -0500)]
swr/rast: Widen fetch shader to SIMD16 (disabled for now)
Refactored the gather operation to process 16 elements at a time via
paired SIMD8 operations.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 11 Oct 2017 21:21:21 +0000 (16:21 -0500)]
swr/rast: Change DS memory allocation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Fri, 6 Oct 2017 18:50:14 +0000 (13:50 -0500)]
swr/rast: Fix indentation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Fri, 29 Sep 2017 19:45:16 +0000 (14:45 -0500)]
swr/rast: Miscellaneous viewport array code changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 27 Sep 2017 17:22:35 +0000 (12:22 -0500)]
swr/rast: Minor changes for os-x
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Kenneth Graunke [Fri, 13 Oct 2017 03:47:41 +0000 (20:47 -0700)]
i965: Don't disable aux buffers for non-overlapping miplevels.
Meta's GenerateMipmap implementation binds the same image for both
sampling and rendering - but it samples from one miplevel while
rendering the next. This is a false self-dependency, and there's
no need to disable auxiliary buffers in this case. In fact, we really
want to leave it enabled so the new miplevels gain color compression.
Thankfully, the texture object's _MaxLevel is always one shy of the
miplevel being rendered. So we can simply check if irb->mt_level is
overlaps with the texture's defined levels. If not, there's no self-
dependency and we can leave the auxiliary buffers enabled.
Fixes a performance regression in GFXBench4 Car Chase, which apparently
calls glGenerateMipmap() on every frame.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103247
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Fri, 13 Oct 2017 05:24:18 +0000 (22:24 -0700)]
i965: Remove the intel_miptree_prepare_fb_fetch wrapper.
Now that intel_miptree_prepare_texture takes levels and layers, there's
not much use in this anymore.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Mon, 16 Oct 2017 19:28:17 +0000 (12:28 -0700)]
i965: Only resolve texture levels/layers that are accessed.
This should avoid unnecessary resolves when working with texture views.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Fri, 13 Oct 2017 03:59:22 +0000 (20:59 -0700)]
i965: Make intel_miptree_prepare_texture() take level/layer arguments.
This effectively exports intel_miptree_prepare_texture_slices() as
intel_miptree_prepare_texture(). The hope is to avoid resolves for
when using texture views that access a subset of the levels/layers.
For now, we pass the same arguments to separate the mechanical change
from the one that actually modifies our behavior.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
Tim Rowley [Thu, 19 Oct 2017 14:13:46 +0000 (09:13 -0500)]
gallium: add more exceptions to tgsi_util_get_inst_usage_mask
A number of double/int64 operations don't have matching
read and write usage masks, which the fallthrough case of
tgsi_util_get_inst_usage_mask assumes for componentwise
tagged instructions.
No regressions in llvmpipe piglit; fixes a large number of
swr regressions.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Kenneth Graunke [Sun, 24 Sep 2017 21:50:17 +0000 (14:50 -0700)]
isl: Fix width check in isl_gen7_choose_msaa_layout.
The restriction is supposed to apply if the width *field* is >= 8192,
meaning the actual width *value* is >= 8193.
The code also incorrectly used == for some reason.
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Wed, 18 Oct 2017 06:19:20 +0000 (23:19 -0700)]
i965: Use is_scheduling_barrier instead of schedule_node::is_barrier.
Commit
a73116ecc60414ade89802150b tried to make add_barrier_deps()
walk to the next barrier, and stop. To accomplish that, it added an
is_barrier flag. Unfortunately, this only works half of the time.
The issue is that add_barrier_deps() walks both backward (to the
previous barrier), and forward (to the next barrier). It also sets
is_barrier. Assuming that we're processing instructions in forward
order, this means that is_barrier will be set for previous instructions,
but not future ones. So we'll never see it, and walk further than we
need to.
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
now compiles its shaders in 3.6 seconds instead of 3.3 minutes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Pallavi G <pallavi.g@intel.com>
Kenneth Graunke [Wed, 18 Oct 2017 18:22:43 +0000 (11:22 -0700)]
i965: Move fs_inst::has_side_effects()'s eot check to the parent class.
This eliminates a layer of wrapping, and makes a backend_instruction
sufficient. The downside is that it exposes 'eot' to the vec4 backend,
which it doesn't need, but can basically happily ignore.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Pallavi G <pallavi.g@intel.com>
Roland Scheidegger [Wed, 18 Oct 2017 21:13:58 +0000 (23:13 +0200)]
tgsi: fix tgsi_util_get_inst_usage_mask
The logic for handling shadow coords was completely broken.
Fixes
be3ab867bd444594f9d9e0f8e59d305d15769afd.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103265
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Thu, 19 Oct 2017 12:31:39 +0000 (13:31 +0100)]
docs: update calendar, add news item and link release notes for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Thu, 19 Oct 2017 12:28:13 +0000 (13:28 +0100)]
docs: add sha256 checksums for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
facc85181883cb514b2b1a8106255be88fd54c6e)
Emil Velikov [Thu, 19 Oct 2017 12:10:20 +0000 (13:10 +0100)]
docs: add release notes for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
28dc4b64f2f75dc0a0a98e2b97f1dd3350f50e2d)
Iago Toral Quiroga [Mon, 16 Oct 2017 10:43:52 +0000 (12:43 +0200)]
glsl/linker: produce error when invalid explicit locations are used
We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.
v2: compute location slots properly depending on shader stage and
variable type / direction
Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Iago Toral Quiroga [Fri, 13 Oct 2017 07:22:54 +0000 (09:22 +0200)]
i965/sbe: fix active components for SSO programs with over 16 inputs
When we have up to 16 FS inputs, the SF unit will reorder our inputs
to be consecutive, however, when we have more than 16 we need to
to read our inputs from the URB exactly as they have been
output from the previous stage. This means that for SSO we have to
consider if we have URB padding due to unused input locations.
Specifically, this affects gen9 active components programming, since
for things to work in scenarios with over 16 inputs that have padded
regions we need to ensure that we program active components for the
padded regions too. If we don't do this the hardware won't read
the URB properly for inputs located after padded regions.
Found empirically.
Fixes (these also require a patch in CTS):
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Wilson [Wed, 18 Oct 2017 08:49:31 +0000 (09:49 +0100)]
i965: Do not log a perf warning when mapping an idle bo
We only want to scare the user away from causing a GPU stall for mapping
a busy bo. The time taken to instantiate the set of pages for a buffer
and their mmapping is unavoidable and flagging idle bo as being busy is
"crying wolf".
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 19 Oct 2017 05:16:05 +0000 (22:16 -0700)]
i965: Use a union to bitcast a float
... which does not break C's aliasing rules.
Darren Salt [Sun, 15 Oct 2017 22:22:22 +0000 (23:22 +0100)]
drirc: Group a few games in the glthread whitelist together.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Darren Salt [Sun, 15 Oct 2017 22:22:21 +0000 (23:22 +0100)]
drirc: Enable glthread for more games (Saints Row 4 & Gat out of Hell).
“Saints Row: Gat out of Hell” benefits from this on slower CPUs in that
usage spikes on individual cores are avoided, which in turn makes it harder
to hit a bug which causes broken audio and the game to hang on exit.
“Saints Row IV” appears to be fine either way, but also exhibits the audio
breakage bug: glthread is therefore being enabled on the grounds that it should
make it a little harder to hit that bug.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 18 Oct 2017 12:09:27 +0000 (14:09 +0200)]
radv: reset dirty flags after flushing all states
Move it to radv_cmd_buffer_flush_state() because if
rasterizerDiscardEnable is true, the flags are not cleared.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 18 Oct 2017 12:17:23 +0000 (14:17 +0200)]
radv: do not re-emit the index buffer for every draw call
It can only be changed when CmdBindIndexBuffer() is called
or when a secondary buffer is used. Though not always, but
let's re-emit the packets in this situation for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 18 Oct 2017 12:17:22 +0000 (14:17 +0200)]
radv: remove useless mask operation in radv_cs_emit_draw_indexed_packet()
This saves few CPU cycles when CmdDrawIndexed() is used a lot.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>