Timothy Arceri [Mon, 30 Oct 2017 00:58:52 +0000 (11:58 +1100)]
radv: enable nir varying array splitting
Acked-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Mon, 20 Nov 2017 06:20:35 +0000 (17:20 +1100)]
st/glsl_to_nir: enable NIR link time opts
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 17 Nov 2017 06:04:22 +0000 (17:04 +1100)]
radeonsi/nir: add support for packed inputs
Because NIR can create non vec4 variables when implementing component
packing we need to make sure not to reprocess the same slot again.
Also we can drop the fs_attr_idx counter and just use driver_location.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 17 Nov 2017 06:45:32 +0000 (17:45 +1100)]
st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts()
NIR component packing will be inserted between these calls and the
calling of st_glsl_to_nir_post_opts().
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 14 Nov 2017 01:56:20 +0000 (12:56 +1100)]
st/glsl_to_nir: call some lowering passes earlier
This is required so that we can enbale NIR linking optimisations.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 23:13:58 +0000 (10:13 +1100)]
st/glsl_to_nir: add basic NIR opt loop helper
We need to be able to do these NIR opts in the state tracker
rather than the driver in order for the NIR linking opts to
be useful.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 23:06:47 +0000 (10:06 +1100)]
st/glsl_to_nir: make st_glsl_to_nir() static
Here we also move the extern C functions to the bottom of the file.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 22:15:54 +0000 (09:15 +1100)]
st/glsl_to_nir: split the st_glsl_to_nir() function in two
We want to be able to generate NIR then apply NIR optimisations.
Once the optimisations are done we can then apply the new post opt
function which assigns uniforms etc based on the optimised IR.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 22:03:45 +0000 (09:03 +1100)]
st/glsl_to_nir: create set_st_program() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Thu, 9 Nov 2017 06:32:08 +0000 (17:32 +1100)]
st/glsl: move nir linking loop to new function st_link_nir()
This will allow us to refactor linking and include some nir link
time optimisations.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 17 Nov 2017 03:27:27 +0000 (14:27 +1100)]
nir: fix support for scalar arrays in nir_lower_io_types()
This was just recreating the same vector type we alreay had and
hitting an assert for scalars.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 21 Nov 2017 00:53:58 +0000 (11:53 +1100)]
st/glsl_to_nir: add st_nir_assign_var_locations() helper
This avoids packed varyings being assigned different driver locations.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Wed, 18 Oct 2017 02:48:41 +0000 (13:48 +1100)]
radv: enable nir component packing
SaschaWillems Vulkan demo tessellation:
~4000fps -> ~4600fps
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Wed, 18 Oct 2017 08:40:06 +0000 (19:40 +1100)]
nir: add varying component packing helpers
v2: update shader info input/output masks when pack components
v3: make sure interpolation loc matches, this is required for the
radeonsi NIR backend.
v4:
33dca36f4f28 fixed nir_gather_info to update outputs_read
correct, make sure we also adjust this correctly when
packing components.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
Timothy Arceri [Mon, 23 Oct 2017 04:51:29 +0000 (15:51 +1100)]
nir: add varying array splitting pass
V2:
- fix matrix support, non-array matrices were being skipped in v1
v3:
- handle lowering of tcs output loads correctly
- correctly mark indirect locations for either in or out not both
when processing a stage.
- use nir_src_copy() when lowering stores.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Rob Clark [Sun, 3 Dec 2017 16:50:09 +0000 (11:50 -0500)]
freedreno/ir3: relax barriers
Instructions with no barrier_class can move wrt. an EVERYTHING barrier.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 3 Dec 2017 16:48:56 +0000 (11:48 -0500)]
freedreno/ir3: all mem instructions have WAR hazzard
It isn't just load instructions that have write-after-read hazzard.
Fixes stk gaussian blur compute shaders.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 29 Nov 2017 20:06:39 +0000 (15:06 -0500)]
freedreno: add debug option to force emulated indirect
Useful mostly for debugging indirect draw.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 29 Nov 2017 14:04:08 +0000 (09:04 -0500)]
freedreno: also mark draw-indirect buffer as read
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 20 Nov 2017 20:33:54 +0000 (15:33 -0500)]
freedreno: small cleanups
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 19 Nov 2017 21:45:04 +0000 (16:45 -0500)]
freedreno: avoid unneccessary batch flush
In some cases we can end up trying to add a write dependency on ourself,
which shouldn't trigger a flush.
Avoids an extra couple flushes per from in stk.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 19 Nov 2017 17:50:50 +0000 (12:50 -0500)]
freedreno: avoid mem2gmem for invalidated buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 19 Nov 2017 16:42:25 +0000 (11:42 -0500)]
freedreno: deferred flush support
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 19 Nov 2017 15:36:19 +0000 (10:36 -0500)]
freedreno: rework fence tracking
ctx->last_fence isn't such a terribly clever idea, if batches can be
flushed out of order. Instead, each batch now holds a fence, which is
created before the batch is flushed (useful for next patch), that later
gets populated after the batch is actually flushed.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 20 Nov 2017 14:52:04 +0000 (09:52 -0500)]
freedreno: proper locking for iterating dependent batches
In transfer_map(), when we need to flush batches that read from a
resource, we should be holding screen->lock to guard against race
conditions. Somehow deferred flush seems to make this existing
race more obvious.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 22 Nov 2017 14:45:28 +0000 (09:45 -0500)]
freedreno/a5xx: correct max_indicies for indirect draws
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jason Ekstrand [Thu, 19 Oct 2017 00:28:19 +0000 (17:28 -0700)]
spirv: Convert the supported_extensions struct to spirv_options
This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jason Ekstrand [Thu, 19 Oct 2017 17:11:22 +0000 (10:11 -0700)]
spirv: Only emit functions which are actually used
Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint. This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jason Ekstrand [Thu, 19 Oct 2017 16:56:22 +0000 (09:56 -0700)]
spirv: Drop the impl field from vtn_builder
We have a nir_builder and it has an impl field.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jordan Justen [Fri, 1 Dec 2017 01:48:57 +0000 (17:48 -0800)]
i965: Serialize nir later in the linking process
Fixes MESA_GLSL=cache_fb with piglit
tests/spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test
Fixes: 0610a624a12 i965/link: Serialize program to nir after linking for shader cache
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marc Dietrich [Wed, 29 Nov 2017 21:25:05 +0000 (22:25 +0100)]
configure: avoid testing for negative compiler options
gcc seems to always accept unsupported negative compiler warning options:
echo "int i;" | gcc -c -xc -Wno-bob - # no error
echo "int i;" | gcc -c -xc -Walice - # unsupported compiler option
Inverting the options fixes the tests.
V2: fix options in meson build
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Eric Anholt [Wed, 29 Nov 2017 00:17:16 +0000 (16:17 -0800)]
broadcom/vc4: Use a single-entry cached last_hindex value.
Since almost all BOs will be in one CL at a time, this cache will almost
always hit except for the first usage of the BO in each CL.
This didn't show up as statistically significant on the minetest trace
(n=340), but if I lop off the throttled lobe of the bimodal distribution,
it very clearly does (0.74731% +/- 0.162093%, n=269).
Eric Anholt [Sat, 25 Nov 2017 06:34:12 +0000 (22:34 -0800)]
broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.
No significant difference in the minetest replay, but it should reduce
overhead by not requiring that we write quad indices to index buffers that
we repeatedly re-upload (and making the draw packet smaller, as well).
Over the course of the series the actual game seems to be up by 1-2 fps.
Eric Anholt [Sat, 25 Nov 2017 06:20:21 +0000 (22:20 -0800)]
broadcom/vc4: Use the new enum functionality of the XML to decode better.
Eric Anholt [Sat, 25 Nov 2017 06:15:28 +0000 (22:15 -0800)]
broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.
Now that there's only one user of it, it's pretty obvious how to avoid
emitting redundant ones. This should save a bunch of kernel validation
overhead.
No statistically sigificant difference on the minetest trace I was looking
at (n=169), but the maximum FPS is up by .3%
Eric Anholt [Sat, 25 Nov 2017 06:11:11 +0000 (22:11 -0800)]
broadcom/vc4: Simplify the relocation handling for index buffers.
Originally there was CL code for handling various relocations back when I
had relocs for the TSDA/TA buffers. Now that the kernel handles those
entirely on its own, I can inline that code into the one place using it.
Eric Anholt [Sat, 25 Nov 2017 05:40:50 +0000 (21:40 -0800)]
broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.
We failed to take the start into account for how many vertices to draw in
this round, so we would end up decrementing count below 0, which as an
unsigned number meant we would loop until the CLs soon ran out of space.
When I wrote the code I was thinking about how to use the previously
emitted shader state (no index bias baked into the elements) by emitting
up to 65535 and then only re-emitting with bias for the second wround, but
that doesn't work if the start is over 65535. Instead, just delay
emitting shader state until we get into the drawarrays GFXH-515 loop and
always bake the bias in when we're doing the workaround.
Eric Anholt [Fri, 1 Dec 2017 23:29:05 +0000 (15:29 -0800)]
broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround.
For triangle strips, we step by max_verts - 2.
Dylan Baker [Wed, 29 Nov 2017 00:49:02 +0000 (16:49 -0800)]
meson: use dep_thread instead of dependency('threads') in freedreno
They are the same thing, but this is more consistent with the rest of
the project.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Wed, 29 Nov 2017 00:42:37 +0000 (16:42 -0800)]
meson: Add lmsensors support
v2: - Make -Dlmsensors=false work
- Simplify auto and true cases
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Wed, 29 Nov 2017 00:31:06 +0000 (16:31 -0800)]
meson: Add support for gallium extra hud
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Adam Jackson [Tue, 14 Nov 2017 20:13:05 +0000 (15:13 -0500)]
glx: Prepare driFetchDrawable for no-config contexts
When we look up the DRI drawable state we need to associate an fbconfig
with the drawable. With GLX_EXT_no_config_context we can no longer infer
that from the context and must instead query the server.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Adam Jackson [Tue, 14 Nov 2017 20:13:02 +0000 (15:13 -0500)]
glx: Use __glXSendError instead of open-coding it
This also fixes a bug, the error path through MakeCurrent didn't
translate the error code by the extension's error base.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Adam Jackson [Tue, 14 Nov 2017 20:13:01 +0000 (15:13 -0500)]
glx: Simplify some dummy vtable interactions
The dummy vtable has these slots as NULL already, no need to check for
the dummy context explicitly.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 29 Nov 2017 18:16:38 +0000 (18:16 +0000)]
docs/release-calendar: update and extend
v2: Missing td tag, add Andres + Juan for 17.2.8 and 17.3.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Emil Velikov [Wed, 29 Nov 2017 15:21:03 +0000 (15:21 +0000)]
docs/specs: annotate MESA_set_3dfx_mode as obsolete
Aimed to work with Glide, which hasn't been a thing in over 10 years.
There are no drivers that implement it, so annotate it as obsolete
v2: Move the extension to OLD/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Emil Velikov [Wed, 29 Nov 2017 15:15:19 +0000 (15:15 +0000)]
xlib: remove dummy GLX_MESA_set_3dfx_mode implementation
The implementation is a simple 'return EGL_FALSE'. Stop pretending and
simply remove it.
Note: the removal of XMesa API is fine, since there hasn't been any
users for it in years.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Emil Velikov [Wed, 29 Nov 2017 15:09:01 +0000 (15:09 +0000)]
docs/specs: annotate MESA_agp_offset as obsolete
No Mesa driver has implemented the extension in ages. Seemingly non Mesa
drivers don't implement it either.
As mentioned by Ian, the extension is effectively superseded by
ARB_vertex_buffer_object.
v2: Move the extension to OLD/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Emil Velikov [Wed, 29 Nov 2017 14:46:26 +0000 (14:46 +0000)]
xlib: remove empty GLX_MESA_agp_offset stubs
The extension was never implemented and seemingly never will.
The DRI based libGL dropped support for it over 10 years ago.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Emil Velikov [Wed, 29 Nov 2017 14:32:36 +0000 (14:32 +0000)]
xlib: remove empty GLX_NV_vertex_array_range stubs
The extension was never implemented and seemingly never will.
The DRI based libGL dropped support for it over 10 years ago.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Rafael Antognolli [Wed, 8 Nov 2017 19:39:52 +0000 (11:39 -0800)]
i965/gen10: Change the order of PIPE_CONTROL and load register.
I believe the workaround describes that the MI_LOAD_REGISTER_IMM should
come right after the 3DSTATE_SAMPLE_PATTERN.
This fixes GPU hangs in the i965 initial state batchbuffer when running
some Piglit tests with always_flush_batch=true.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Fri, 6 Oct 2017 18:41:54 +0000 (11:41 -0700)]
intel/compiler: Implement WaClearTDRRegBeforeEOTForNonPS.
The bspec describes:
"WA: Clear tdr register before send EOT in all non-PS shader kernels
mov(8) tdr0:ud 0x0:ud {NoMask}"
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Mon, 2 Oct 2017 18:06:05 +0000 (11:06 -0700)]
i965/gen10: emit 3DSTATE_MULTISAMPLE more often.
On CNL, we see multiple multisample failures on piglit tests. By
emitting this extra state, though not documented in the bspec, those
failures seem to go away.
This workaround could be removed if we ever find out a better solution,
but it should be good enough for now.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dylan Baker [Thu, 30 Nov 2017 18:39:29 +0000 (10:39 -0800)]
meson: install khrplatform header for EGL as well as GLES
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Thu, 30 Nov 2017 18:37:11 +0000 (10:37 -0800)]
meson: install dri internal header
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jason Ekstrand [Thu, 30 Nov 2017 00:22:42 +0000 (16:22 -0800)]
i965: Disable regular fast-clears (CCS_D) on gen9+
This partially reverts commit
3e57e9494c2279580ad6a83ab8c065d01e7e634e
which caused a bunch of GPU hangs on several Source titles. To date, we
have no clue why these hangs are actually happening. This undoes the
final effect of
3e57e9494c227 and gets us back to not hanging. Tested
with Team Fortress 2.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102435
Fixes: 3e57e9494c2279580ad6a83ab8c065d01e7e634e
Cc: mesa-stable@lists.freedesktop.org
Vadym Shovkoplias [Fri, 1 Dec 2017 15:08:53 +0000 (17:08 +0200)]
egl/x11: Remove unneeded free() on always null string
In this condition dri2_dpy->driver_name string always equals
NULL, so call to free() is useless
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Eric Engestrom [Wed, 29 Nov 2017 14:19:26 +0000 (14:19 +0000)]
gallium/hud: use #ifdef to test for macro existence
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Eric Engestrom [Fri, 24 Nov 2017 16:23:03 +0000 (16:23 +0000)]
amd: remove always-true BRAHMA_BUILD define
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Vadym Shovkoplias [Fri, 1 Dec 2017 11:23:02 +0000 (13:23 +0200)]
glx/dri3: Remove unused deviceName variable
deviceName string is declared, assigned and freed but actually
never used in dri3_create_screen() function.
Fixes: 2d94601582e ("Add DRI3+Present loader")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
George Kyriazis [Thu, 30 Nov 2017 20:24:39 +0000 (14:24 -0600)]
swr/scons: Fix intermittent build failure
gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp.
Account for new dependency.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Samuel Pitoiset [Thu, 30 Nov 2017 21:23:37 +0000 (22:23 +0100)]
radv: only reset command buffers when the allocation fails
"vkAllocateCommandBuffers can be used to create multiple command
buffers. If the creation of any of those command buffers fails, the
implementation must destroy all successfully created command buffer
objects from this command, set all entries of the pCommandBuffers
array to NULL and return the error."
This has been suggested by gabriel@system.is.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 30 Nov 2017 21:16:09 +0000 (22:16 +0100)]
radv: do not dump meta shaders with RADV_DEBUG=shaders
It's really annoying and this pollutes the output especially
when a bunch of non-meta shaders are compiled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Fri, 3 Nov 2017 00:15:38 +0000 (10:15 +1000)]
r600: add ARB_shader_storage_buffer_object support (v3)
This just builds on the image support. Evergreen only has ssbo
for fragment and compute no other stages.
v2: handle images and ssbo in the same shader properly (Ilia)
v3: fix RESQ on buffers,
fix missing atom emit
fix first element offset
use R32 format
write separate buffer rat store path.
(from running deqp gles3.1 tests)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 27 Nov 2017 06:39:49 +0000 (06:39 +0000)]
r600/cayman: looks like cmpxchg moved to Z
On cayman it appears the cmp component is now in Z.
Fixes:
arb_shader_image_load_store-dead-fragments on cayman.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 27 Nov 2017 02:07:45 +0000 (02:07 +0000)]
r600/shader: fix 64->32 conversions
These didn't handle the TGSI at all properly, this fixes
them to use the common path for 64->32 then adds the 32->int
on at the end.
Fixes:
generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/*
Signed-off-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Wed, 29 Nov 2017 13:48:32 +0000 (14:48 +0100)]
radv: do not allocate CMASK or DCC for small surfaces
The idea is ported from RadeonSI, but using 512x512 instead of
256x256 seems slightly better. This improves dota2 performance
by +2%.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Samuel Pitoiset [Thu, 30 Nov 2017 19:58:29 +0000 (20:58 +0100)]
radv: do not set DISABLE_LSB_CEIL on GFX9
The state no longer exists on GFX9.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 30 Nov 2017 13:32:58 +0000 (14:32 +0100)]
radv: remove set but unnecessary radv_color_buffer_info::micro_tile_mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 30 Nov 2017 13:32:57 +0000 (14:32 +0100)]
radv: do not store gfx9_epitch in radv_color_buffer_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dylan Baker [Wed, 29 Nov 2017 19:18:52 +0000 (11:18 -0800)]
meson: fix glxext.h install
Another typo, the glext.h header was being install instead.
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Wed, 29 Nov 2017 19:16:59 +0000 (11:16 -0800)]
meson: fix GLES3/gl31.h install
This is a typo, gl32.h is installed twice.
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Marek Olšák [Thu, 30 Nov 2017 01:14:18 +0000 (02:14 +0100)]
ac/surface: always compute DCC info when DCC is possible on GFX9
The same code for VI doesn't check for scanout either.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 30 Nov 2017 01:16:29 +0000 (02:16 +0100)]
radeonsi/gfx9: fix importing shared textures with DCC
VI has 11 dwords at least. GFX9 has 10 dwords.
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Jon Turney [Thu, 23 Nov 2017 13:51:43 +0000 (13:51 +0000)]
meson: fix deps and underlinkage of libGL
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jon Turney [Mon, 20 Nov 2017 22:05:47 +0000 (22:05 +0000)]
meson: build src/glx/windows
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jon Turney [Thu, 23 Nov 2017 14:01:57 +0000 (14:01 +0000)]
meson: don't require dri2proto for darwin or windows
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jon Turney [Thu, 23 Nov 2017 13:42:00 +0000 (13:42 +0000)]
meson: set _GNU_SOURCE on cygwin
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jon Turney [Thu, 23 Nov 2017 13:40:06 +0000 (13:40 +0000)]
meson: set windows glx defines
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Fri, 3 Nov 2017 21:54:03 +0000 (14:54 -0700)]
meson: fix generated source inclusion on macOS and Windows
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Vadym Shovkoplias [Mon, 27 Nov 2017 10:15:13 +0000 (12:15 +0200)]
intel/blorp: Fix possible NULL pointer dereferencing
Fix incomplete check of input params in blorp_surf_convert_to_uncompressed()
which can lead to NULL pointer dereferencing.
Fixes: 5ae8043fed2 ("intel/blorp: Add an entrypoint for doing
bit-for-bit copies")
Fixes: f395d0abc83 ("intel/blorp: Internally expose
surf_convert_to_uncompressed")
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Tapani Pälli [Fri, 24 Nov 2017 05:46:07 +0000 (07:46 +0200)]
mesa: add AllowGLSLCrossStageInterpolationMismatch workaround
This fixes issues seen with certain versions of Unreal Engine 4 editor
and games built with that using GLSL 4.30.
v2: add driinfo_gallium change (Emil Velikov)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
Acked-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Wed, 29 Nov 2017 07:16:58 +0000 (23:16 -0800)]
anv: Check if memfd_create is already defined.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103909
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Iago Toral Quiroga [Wed, 29 Nov 2017 09:50:42 +0000 (10:50 +0100)]
i965/vec4: use a temp register to compute offsets for pull loads
64-bit pull loads are implemented by emitting 2 separate
32-bit pull load messages, where the second message loads from
an offset at +16B.
That addition of 16B to the original offset should not alter the
original offset register used as source for the pull load instruction
though, since the compiler might use that same offset register in other
instructions (for example, for other pull loads in the shader code
that take that same offset as reference).
If the pull load is 32-bit then we only need to emit one message and
we don't need to do offset calculations, but in that case the optimizer
should be able to drop the redundant MOV.
Fixes the following test on Haswell:
KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103007
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:40 +0000 (10:44 +0100)]
etnaviv: GC7000: Factor out state based texture functionality
Prepare for two texture handling paths, the descriptor-based
path will be added in a future commit. These are structured
so that the texture implementation handles its own state
emission.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:39 +0000 (10:44 +0100)]
etnaviv: GC7000: Move active_samplers_bits to texture
This needs to be shared between texture_plain and texture_desc.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:38 +0000 (10:44 +0100)]
etnaviv: GC7000: Factor out incompatible texture handling logic
This will be shared with the texture descriptor path.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:37 +0000 (10:44 +0100)]
etnaviv: GC7000: Track dirty sampler views
Need this to efficiently emit texture descriptor invalidations.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:36 +0000 (10:44 +0100)]
etnaviv: GC7000: Make point sprites work on HALTI5
Track varying component offset of the point size output, as well as
provide the offset of the point coord input.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Wed, 29 Nov 2017 12:19:45 +0000 (13:19 +0100)]
etnaviv: GC7000: State changes for HALTI3..5
Update state objects to add new state, and emit function to emit new
state.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:34 +0000 (10:44 +0100)]
etnaviv: GC7000: Update screen specs for HALTI5
- This core must load shaders from memory (AFAIK)
- Yet another new location for UNIFORMS
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:33 +0000 (10:44 +0100)]
etnaviv: GC7000: Update context reset for ..HALTI5
Update context reset for HALTI3..HALTI5, sorting states for the HALTI
version that has them.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:32 +0000 (10:44 +0100)]
etnaviv: GC7000: No RS align when using BLT
RS align is not necessary and might even be harmful when using the BLT
engine for blitting.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:31 +0000 (10:44 +0100)]
etnaviv: GC7000: BLT engine blitting support
Add an implemenation of key clear_blit functions using the BLT engine
that replaced the RS on GC7000.
Also set level->size correctly for imported resources. This is important
for the BLT resolve-in-place path to work for them.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:30 +0000 (10:44 +0100)]
etnaviv: GC7000: Factor out RS blit functionality
Prepare for BLT-based blitting path by moving RS-based
blitting to the RS implementation file, making this
self-contained.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:29 +0000 (10:44 +0100)]
etnaviv: GC7000: Move etna_coalesce to emit header file
Want to be able to emit state from the texture implementation,
and the blitter implementation.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:28 +0000 (10:44 +0100)]
etnaviv: GC7000: Support BLT as recipient for etna_stall
When the BLT is involved as source or target, add an extra BLT
enable/disable sequence around the sync sequence.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:27 +0000 (10:44 +0100)]
etnaviv: Use only DRAW_INSTANCED on GC3000+
The blob does this, as DRAW_INSTANCED can replace fully all the other
draw commands. It is also required to handle integer vertex formats.
The other path is only there for compatibility and might go away (or at
least rot to become buggy due to dis-use) in newer hardware.
As a by-effect this changes the behavior for GC3000-, by no longer using
the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR.
This should make no difference.
Preparation for GC7000 support.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:26 +0000 (10:44 +0100)]
etnaviv: Emit SCALE for vertex attributes
This is used by HALTI2+ (GC3000+) when drawing with DRAW_INSTANCED.
It is also necessary when switching between integer and floating point
vertex element formats.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Kenneth Graunke [Tue, 28 Nov 2017 16:58:21 +0000 (08:58 -0800)]
i965: Reorganize batch/state BO fields into a 'brw_growing_bo' struct.
We're about to add more of them, and need to pass the whole lot of them
around together when growing them. Putting them in a struct makes this
much easier.
brw->batch.batch.bo is a bit of a mouthful, but it's nice to have things
labeled 'batch' and 'state' now that we have multiple buffers.
Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need space and can't flush."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103101
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>