Eric Engestrom [Fri, 8 Mar 2019 15:33:39 +0000 (15:33 +0000)]
travis: clean up
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Eric Engestrom [Fri, 8 Mar 2019 15:05:15 +0000 (15:05 +0000)]
travis: drop unused vars
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Eric Engestrom [Fri, 8 Mar 2019 15:04:54 +0000 (15:04 +0000)]
travis: fix meson build by letting `auto` do its job
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Eric Engestrom [Tue, 5 Mar 2019 11:49:33 +0000 (11:49 +0000)]
autotools: don't build libGLES*.so with GLVND
GLVND already provides these, so distro packagers have been deleting
them all along. Let's save ourselves the trouble and not build them in
the first place.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Tue, 5 Mar 2019 11:46:38 +0000 (11:46 +0000)]
meson: don't build libGLES*.so with GLVND
GLVND already provides these, so distro packagers have been deleting
them all along. Let's save ourselves the trouble and not build them in
the first place.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Brian Paul [Fri, 8 Mar 2019 15:07:23 +0000 (08:07 -0700)]
pipebuffer: s/PB_ALL_USAGE_FLAGS/PB_USAGE_ALL/
To fix build failure. I guess my meson configuration has assertions
disabled for some reason.
Trivial fix.
Brian Paul [Thu, 7 Mar 2019 23:14:32 +0000 (16:14 -0700)]
svga: remove SVGA_RELOC_READ flag in SVGA3D_BindGBSurface()
This fixes a rendering issue where UBO updates aren't always picked
up by drawing calls. This issue effected the Webots robotics
simulator. VMware bug
2175527.
Testing Done: Webots replay, piglit, misc Linux games
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Brian Paul [Wed, 6 Mar 2019 17:16:57 +0000 (10:16 -0700)]
svga: refactor draw_vgpu10() function
The draw_vgpu10() function was huge. Move the code for preparing the
vertex buffers and the index buffer into separate functions.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Brian Paul [Thu, 7 Mar 2019 23:44:06 +0000 (16:44 -0700)]
st/mesa: whitespace, formatting fixes in st_cb_flush.c
Trivial.
Brian Paul [Wed, 6 Mar 2019 17:23:59 +0000 (10:23 -0700)]
st/mesa: move, clean-up shader variant key decls/inits
Move the variant key declarations inside the scope they're used.
Use designated initializers instead of memset() calls.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Brian Paul [Tue, 5 Mar 2019 21:20:29 +0000 (14:20 -0700)]
winsys/svga: use new pb_usage_flags enum type
And add a comment that we're implicitly converting PIPE_TRANSFER_
flags to PB_USAGE_ flags in one place. And statically assert that
the enum values match.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Brian Paul [Wed, 6 Mar 2019 02:47:25 +0000 (19:47 -0700)]
pipebuffer: whitespace fixes in pb_buffer.h
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Brian Paul [Tue, 5 Mar 2019 21:08:35 +0000 (14:08 -0700)]
pipebuffer: use new pb_usage_flags enum type
Use a new enum type instead of 'unsigned' to make things a bit more
understandable.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Charmaine Lee [Wed, 6 Mar 2019 02:36:48 +0000 (19:36 -0700)]
svga: add svga shader type in the shader variant
With this patch, the svga shader type will be saved in the shader variant,
and there is no need to pass in the shader type to the define/destroy
variant functions.
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Tue, 5 Mar 2019 17:06:43 +0000 (10:06 -0700)]
gallium/util: add some const qualifiers in u_bitmask.c
And add/update comments.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 5 Mar 2019 17:05:18 +0000 (10:05 -0700)]
gallium/util: whitespace cleanups in u_bitmask.[ch]
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Alejandro Piñeiro [Thu, 7 Mar 2019 15:57:10 +0000 (16:57 +0100)]
nir/linker: fix ARRAY_SIZE query with xfb varyings
For a non-array varying, it is expecting ARRAY_SIZE as 1, instead of 0.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Antia Puentes [Sat, 22 Dec 2018 17:40:29 +0000 (18:40 +0100)]
nir/linker: Fix TRANSFORM_FEEDBACK_BUFFER_INDEX
From the ARB_enhanced_layouts specification:
"For the property TRANSFORM_FEEDBACK_BUFFER_INDEX, a single integer
identifying the index of the active transform feedback buffer
associated with an active variable is written to <params>. For
variables corresponding to the special names "gl_NextBuffer",
"gl_SkipComponents1", "gl_SkipComponents2", "gl_SkipComponents3",
and "gl_SkipComponents4", -1 is written to <params>."
We were storing the xfb_buffer value, instead of the value
corresponding to GL_TRANSFORM_FEEDBACK_BUFFER_INDEX.
Note that the implementation assumes that varyings would be sorted by
offset and buffer.
Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Wed, 7 Nov 2018 09:11:20 +0000 (10:11 +0100)]
nir/linker: use nir_gather_xfb_info
Instead of a custom ARB_gl_spirv xfb gather info pass.
In fact, this is not only about reusing code, but the current custom
code was not handling properly how many varyings are enumerated from
some complex types. So this change is also about fixing some corner
cases.
v2: Use util_bitcount, simplify current stage check (Kenneth)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Thu, 10 Jan 2019 14:04:37 +0000 (15:04 +0100)]
nir/xfb: handle arrays and AoA of basic types
On OpenGL, a array of a simple type adds just one varying. So
gl_transform_feedback_varying_info struct defined at mtypes.h includes
the parameters Type (base_type) and Size (number of elements).
This commit checks this when the recursive add_var_xfb_outputs call
handles arrays, to ensure that just one is addded.
We also need to take into account AoA here
v2: use glsl_type_is_leaf from nir_types (Timothy Arceri)
v3: simplified aoa check, without the need ot using glsl_type_is_leaf,
using glsl_types_is_struct (Timothy Arceri)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Thu, 7 Mar 2019 10:33:03 +0000 (11:33 +0100)]
nir_types: add glsl_type_is_struct helper
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Thu, 7 Mar 2019 16:42:49 +0000 (17:42 +0100)]
nir/xfb: sort varyings too
Right now we are only re-sorting outputs. But it is better to sort too
varyings, as linker expect them to be sorted out (as it was done on
GLSL). For varyings, and to make easier to compute buffer_index, we
sort also by buffer. We could do the same for outputs, but we lack a
reason for that, so we left it as it is (just offset).
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Wed, 9 Jan 2019 17:19:45 +0000 (18:19 +0100)]
nir/xfb: adding varyings on nir_xfb_info and gather_info
In order to be used for OpenGL (right now for ARB_gl_spirv).
This commit adds two new structures:
* nir_xfb_varying_info: that identifies each individual varying. For
each one, we need to know the type, buffer and xfb_offset
* nir_xfb_buffer_info: as now for each buffer, in addition to the
stride, we need to know how many varyings are assigned to it.
For this patch, the only case where num_outputs != num_varyings is
with the case of doubles, that for dvec3/4 could require more than one
output. There are more cases though (like aoa), that will be handled
on following patches.
v2: updated after new nir general XFB support introduced for "anv: Add
support for VK_EXT_transform_feedback"
v3: compute num_varyings beforehand for allocating, instead of relying
on num_outputs as approximate value (Timothy Arceri)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Wed, 6 Mar 2019 14:15:54 +0000 (15:15 +0100)]
nir_types: add glsl_varying_count helper
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Tue, 6 Nov 2018 17:10:01 +0000 (18:10 +0100)]
nir/xfb: add component_offset at nir_xfb_info
Where component_offset here is the offset when accessing components of
a packed variable. Or in other words, location_frac on
nir.h. Different places of mesa use different names for it.
Technically nir_xfb_info consumer can get the same from the
component_mask, it seems somewhat forced to make it to compute it,
instead of providing it.
v2: rename local location_frac for comp_offset, more similar to the
intended use (Timothy Arceri)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Samuel Pitoiset [Fri, 8 Mar 2019 13:51:02 +0000 (14:51 +0100)]
Revert "radv: execute external subpass barriers after ending subpasses"
This changes is actually wrong because we have to sync
before doing image layout transitions.
This fixes rendering issues in Batman, Path of Exile and
probably more titles.
This reverts commit
76c17cfd8da017ebd19be33ba6cef888957a6758.
Fixes: 76c17cfd8da ("radv: execute external subpass barriers after ending subpasses")
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Lionel Landwerlin [Fri, 16 Nov 2018 18:13:36 +0000 (18:13 +0000)]
intel/error2aub: support older style engine names
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 16:33:45 +0000 (17:33 +0100)]
intel/error2aub: deal with GuC log buffer
When Guc is enabled, the error state will contain a "global" buffer
for the GuC log buffer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Thu, 23 Aug 2018 18:01:47 +0000 (19:01 +0100)]
intel/error2aub: add a verbose option
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 13:45:37 +0000 (14:45 +0100)]
intel/error2aub: write GGTT buffers into the aub file
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 13:50:13 +0000 (14:50 +0100)]
intel/error2aub: store engine last ring buffer head/tail pointers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 13:18:35 +0000 (14:18 +0100)]
intel/error2aub: annotate buffer with their address space
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 12:44:49 +0000 (13:44 +0100)]
intel/error2aub: parse other buffer types
We don't write them in the aub file yet.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 12:36:11 +0000 (13:36 +0100)]
intel/error2aub: strenghten batchbuffer identifier marker
Found out that some base64 data matched the '---' identifier. We can
avoid this by adding the surrounding spaces.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 12:32:44 +0000 (13:32 +0100)]
intel/error2aub: identify buffers by engine
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Thu, 23 Aug 2018 17:06:22 +0000 (18:06 +0100)]
intel/error2aub: build a list of BOs before writing them
The error state contains several kind of BOs, including the context
image which we will want to write in a later commit. Because it can
come later in the error state than the user buffers and because we
need to write it first in the aub file, we have to first build a list
of BOs and then write them in the appropriate order.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Chris Wilson [Fri, 22 Feb 2019 23:31:56 +0000 (23:31 +0000)]
iris: Wire up EGL_IMG_context_priority
Add the missing PIPE_CAP_CONTEXT_PRIORITY_MASK and parsing of the context
construction flags.
Testcase: piglit/egl-context-priority
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Mon, 24 Dec 2018 08:27:09 +0000 (00:27 -0800)]
iris: Export a copy_region helper that doesn't flush
I'll want to use this for transfer maps, which already do their own
flushing. This lets us avoid a double flush, and also gives us more
control over the batch which is selected.
Kenneth Graunke [Tue, 5 Mar 2019 09:21:53 +0000 (01:21 -0800)]
iris: Spruce up "are we using this engine?" checks for flushing
We were using batch->contains_draw as a proxy for "are we even using
this engine?" That isn't quite right, because it only counts regular
draws. BLORP operations may have also rendered to a resource, which
needs to trigger flushing. To check for this, we also see if the
render and sometimes depth caches are non-empty.
We can also drop the "but there might already be stale data in the
cache even if we haven't emitted any commands yet" concern in the
comments. The kernel flushes caches between batches.
This may not be great but it's at least better than what was there.
Timur Kristóf [Thu, 7 Mar 2019 07:19:02 +0000 (08:19 +0100)]
radeonsi/nir: Only set window_space_position for vertex shaders.
By mistake, this was previously set for all shaders.
It is a vertex shader property so only makes sense to
set it for vertex shaders.
Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Jason Ekstrand [Thu, 7 Mar 2019 17:45:13 +0000 (11:45 -0600)]
nir/builder: Add a build_deref_array_imm helper
Unlike most of the cases in which we do this by hand, the new helper
properly handles non-32-bit pointers.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 5 Mar 2019 22:06:31 +0000 (16:06 -0600)]
nir/builder: Cast array indices in build_deref_follower
There's no guarantee when build_deref_follower is called that the two
derefs have the same bit size destination. Insert a cast on the array
index in case we have differing bit sizes. While we're here, insert
some asserts in build_deref_array and build_deref_ptr_as_array. The
validator will catch violations here but they're easier to debug if we
catch them while building.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 6 Mar 2019 18:27:26 +0000 (12:27 -0600)]
nir/builder: Emit better code for iadd/imul_imm
Because we already know the immediate right-hand parameter, we can
potentially save the optimizer a bit of work.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Rob Clark [Thu, 7 Mar 2019 19:22:24 +0000 (14:22 -0500)]
freedreno/a6xx: perfcntrs
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 6 Mar 2019 15:34:53 +0000 (10:34 -0500)]
freedreno/a6xx: fix border-color swizzles
Fixes nearly all of the remaining
dEQP-GLES31.functional.texture.border_clamp.formats.* fails
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Rob Clark [Wed, 6 Mar 2019 15:04:21 +0000 (10:04 -0500)]
freedreno/a6xx: refactor fd6_tex_swiz()
We need a version of fd6_tex_swiz() that just returns the composed
swizzle without building part of the TEX_CONST_0 state. So just
refactor the existing function to build more of the TEX_CONST_0 state,
and leave fd6_tex_swiz() simply composing swizzles.
The small IBO state change (to use LINEAR for smaller sizes/levels) is
to match the state in fd6_tex_const_0(). It seems like maybe tiled
actually works at the smaller sizes but not if minification is in play,
so best just to make images match what we do for textures.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Rob Clark [Wed, 6 Mar 2019 14:44:14 +0000 (09:44 -0500)]
freedreno/a6xx: remove astc_srgb workaround
Not used on a6xx, so remove some of the related plumbing that was copied
over from older gens.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 7 Mar 2019 20:32:11 +0000 (15:32 -0500)]
freedreno: fix ir3_cmdline build
Fixes: 7530d4abfcf glsl/freedreno/panfrost: pass gl_context to the standalone compiler
Signed-off-by: Rob Clark <robdclark@gmail.com>
Kenneth Graunke [Wed, 19 Dec 2018 10:17:42 +0000 (02:17 -0800)]
iris: Drop PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY
This cap is mainly for working around a r600 texture swizzle issue,
but it also controls whether ARB_texture_buffer_object (with legacy
formats) is enabled. I suspect the missing I/L/A/LA faking is why
I had it set in the first place.
Thanks to Ilia for pointing out that I shouldn't be setting this.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Fri, 22 Feb 2019 06:49:40 +0000 (22:49 -0800)]
iris: Properly support alpha and luminance-alpha formats
For texturing, we map alpha formats to the corresponding red format,
as many alpha formats are outright missing, and red is more efficient
when sampling anyway.
When rendering to A8_UNORM, we use that format directly, so the image
gets the shader output's .a/.w channel, rather than the .r/.x channel.
All other A* formats are non-renderable, so we can't do much and just
mark them as unsupported for rendering. Fortunately, GL only requires
rendering to A8_UNORM, so that works out.
According to Andre Heider and Timur Kristóf, this fixes font rendering
in Witcher 1 (via nine). Andre also reported that it fixes Unigine
Heaven (presumably via nine).
v2: Use the same swizzle for both sampler views and "render targets".
BLORP expects the read swizzle, and will take the inverse when
setting up the destination swizzle (and actually applying it in
the shaders). We ignore the format swizzle when setting up normal
rendering SURFACE_STATEs, which is necessary because it would be
an illegal shader channel select combination. Thanks to Jason
Ekstrand for pointing out that BLORP took an inverse swizzle.
Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 4 Dec 2018 23:34:30 +0000 (15:34 -0800)]
iris: Defer uploading sampler state tables until draw time
Gallium might call us multiple times to bind subsets of the samplers,
at which point we'd recreate the table a bunch of times. It doesn't
really buy us anything to do it here - even if we defer to draw time,
the dirty tracking ensures we'll only do it on the first draw after a
bind_sampler_states() call.
We now use the number of samplers specified by the shader instead of
the binding count. If this number changes, we flag sampler state as
dirty so we re-upload a table with the right number of entries.
This also fixes a bug where ice->state.need_border_colors was never
unset, so once something needed border colors, the pool would always
be pinned in all future batches.
v2: Explicitly flag sampler states as dirty, rather than assuming that
bind_sampler_states() will be called if the program texture count
changes. While this may be true for st/mesa, it isn't the case for
Gallium HUD.
Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Thu, 28 Feb 2019 09:13:33 +0000 (01:13 -0800)]
iris: Plumb through ISL_SWIZZLE_IDENTITY in buffer surface emitters
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Thu, 28 Feb 2019 09:13:33 +0000 (01:13 -0800)]
isl: Add a swizzle parameter to isl_buffer_fill_state()
This is necessary for legacy texture buffer object formats, where we'll
need to use a swizzle to fake e.g. luminance.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Thu, 7 Mar 2019 16:59:53 +0000 (16:59 +0000)]
iris: fix decode_get_bo callback
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: acb50d6b1ff1b7 ("intel/decoders: handle decoding MI_BBS from ring")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Erik Faye-Lund [Wed, 6 Mar 2019 13:43:15 +0000 (14:43 +0100)]
virgl: remove unused variable
This variable is now unused, so let's remove it.
Fixes: 9c4930946a5 (virgl: add encoder functions for new protocol)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Wed, 6 Mar 2019 13:41:54 +0000 (14:41 +0100)]
virgl: remove unused variable
This variable is now unused, so let's remove it.
Fixes: db77573d7ba (virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Wed, 6 Mar 2019 13:40:04 +0000 (14:40 +0100)]
virgl: remove unused variable
This variable is now unused, so let's remove it.
Fixes: c19aedcf1a8 (virgl: don't mark unclean after a flush)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Erik Faye-Lund [Wed, 6 Mar 2019 13:36:15 +0000 (14:36 +0100)]
virgl: remove unused variables
These variables are now unused, let's remove them to get rif of a few
warnings.
Fixes: f0e71b10888 (virgl: use transfer queue)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Lionel Landwerlin [Thu, 7 Mar 2019 16:14:13 +0000 (16:14 +0000)]
iris: fix decoder call
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: acb50d6b1ff1b7 ("intel/decoders: handle decoding MI_BBS from ring")
Lionel Landwerlin [Mon, 3 Sep 2018 14:11:08 +0000 (15:11 +0100)]
intel/aub_write: factorize context image/pphwsp/ring creation
We allocate GGTT entries and physical addresses are we create engines
rather than having a fixed layout.
Context images now receive a parameter argument which is used to setup
pml4 & ring buffer addresses.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Mon, 3 Sep 2018 14:10:06 +0000 (15:10 +0100)]
intel/aub_write: turn context images arrays into functions
We'll make them more parameterized in a later commit.
As this is just a transitional commit, we allow ourself to leak the
context images allocated in get_context_init(). We'll fix this in the
next commit.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Thu, 23 Aug 2018 23:03:28 +0000 (00:03 +0100)]
intel/aub_write: store the physical page allocator in struct
We want to use this allocator in the next commit for GGTT pages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Sat, 25 Aug 2018 00:40:29 +0000 (01:40 +0100)]
intel/aub_write: log mmio writes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Sun, 26 Aug 2018 13:35:30 +0000 (14:35 +0100)]
intel/aub_write: switch to use i915_drm engine classes
Prepare aub write to deal with multiple engine instances. We don't
pass the instance number yet this could be done in the future by
having a 2 dimensional array of struct engine.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Thu, 23 Aug 2018 23:37:03 +0000 (00:37 +0100)]
intel/aub_write: break execlist write in 2
We want to reuse the execlist submission, but won't need the ring
buffer update.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Sun, 26 Aug 2018 23:19:29 +0000 (00:19 +0100)]
intel/aub_write: write header in init
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Thu, 23 Aug 2018 19:36:16 +0000 (20:36 +0100)]
intel/aub_write: split comment section from HW setup
In the future we'll want error2aub to reuse the context image saved by
i915 instead of the default one we write in intel_dump_gpu.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Thu, 23 Aug 2018 16:07:08 +0000 (17:07 +0100)]
intel/aub_read: reuse defines from gen_context
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 4 Sep 2018 14:45:32 +0000 (15:45 +0100)]
intel/decoders: limit number of decoded batchbuffers
IGT has a test to hang the GPU that works by having a batch buffer
jump back into itself, trigger an infinite loop on the command stream.
As our implementation of the decoding is "perfectly" mimicking the
hardware, our decoder also "hangs". This change limits the number of
batch buffer we'll decode before we bail to 100.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Tue, 28 Aug 2018 10:41:42 +0000 (11:41 +0100)]
intel/decoders: handle decoding MI_BBS from ring
An MI_BATCH_BUFFER_START in the ring buffer acts as a second level
batchbuffer (aka jump back to ring buffer when running into a
MI_BATCH_BUFFER_END).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Lionel Landwerlin [Sun, 26 Aug 2018 12:52:47 +0000 (13:52 +0100)]
intel/decoders: add address space indicator to get BOs
Some commands like MI_BATCH_BUFFER_START have this indicator.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Eric Engestrom [Thu, 7 Mar 2019 13:45:14 +0000 (13:45 +0000)]
vulkan/overlay: fix missing var rename in previous commit
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Tue, 5 Mar 2019 15:57:34 +0000 (15:57 +0000)]
vulkan/util: use the platform defines in vk.xml instead of hard-coding them
See also:
3d4238d26c5de4a0f7a5 "anv: use the platform defines in vk.xml
instead of hard-coding them"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Andre Heider [Sun, 10 Feb 2019 17:31:59 +0000 (18:31 +0100)]
iris: add support for tgsi_to_nir
The Gallium Nine state tracker now works on iris.
Also tested with GALLIUM_HUD and Star Wars: Knights of the Old
Republic on WINE (GL_ATI_fragment_shader).
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tapani Pälli [Wed, 6 Mar 2019 10:30:22 +0000 (12:30 +0200)]
nir: free dead_ctx in case of no progress
Fixes a leak:
==7576== 320 (48 direct, 272 indirect) bytes in 1 blocks are definitely lost in loss record 26 of 26
==7576== at 0x4C2EE3B: malloc (vg_replace_malloc.c:309)
==7576== by 0x53EF0E4: ralloc_size (ralloc.c:119)
==7576== by 0x53EF0C2: ralloc_context (ralloc.c:113)
==7576== by 0x5471F64: nir_split_per_member_structs (nir_split_per_member_structs.c:176)
==7576== by 0x51288CF: anv_shader_compile_to_nir (anv_pipeline.c:216)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Tapani Pälli [Wed, 6 Mar 2019 10:27:30 +0000 (12:27 +0200)]
anv: call blob_finish when done with it
Fixes leaks from anv_device_upload_nir:
==7345== 8,192 bytes in 2 blocks are definitely lost in loss record 24 of 24
==7345== at 0x4C2ED78: malloc (vg_replace_malloc.c:308)
==7345== by 0x4C31393: realloc (vg_replace_malloc.c:836)
==7345== by 0x54E0848: grow_to_fit (blob.c:67)
==7345== by 0x54E0BE5: blob_reserve_bytes (blob.c:166)
==7345== by 0x54E0C7C: blob_reserve_intptr (blob.c:186)
==7345== by 0x54704A7: nir_serialize (nir_serialize.c:1091)
==7345== by 0x512F97D: anv_device_upload_nir (anv_pipeline_cache.c:756)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Tapani Pälli [Wed, 6 Mar 2019 08:49:21 +0000 (10:49 +0200)]
anv: use anv_gem_munmap in block pool cleanup
Use anv_gem_munmap for unmap when softpin in use, this corresponds to
anv_gem_mmap used in anv_block_pool_expand_range. This fixes valgrind
errors seen for each pool when softpin is in use:
==25581== 262,144 bytes in 1 blocks are definitely lost in loss record 31 of 31
==25581== at 0x50E77E8: anv_gem_mmap (anv_gem.c:96)
==25581== by 0x50EEE2B: anv_block_pool_expand_range (anv_allocator.c:543)
==25581== by 0x50EEB51: anv_block_pool_init (anv_allocator.c:477)
==25581== by 0x50EF7EF: anv_state_pool_init (anv_allocator.c:920)
==25581== by 0x510B8EB: anv_CreateDevice (anv_device.c:2031)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Kenneth Graunke [Wed, 6 Mar 2019 22:49:39 +0000 (14:49 -0800)]
iris: Fix MOCS for blits and clears
I915_MOCS_CACHED is the wrong value. Expose mocs() and use that.
Timothy Arceri [Wed, 4 Apr 2018 06:01:21 +0000 (16:01 +1000)]
st/glsl: start spilling out common st glsl conversion code
The NIR and TGSI paths are currently intertwined which makes it
not only hard to follow but also makes it hard to take advantage
of the differences in IR.
Here we take the first step to splitting that path apart. With
this we take the opportunity to no longer call the GLSL IR
optimisation passes after the final lowering calls for NIR. We
can instead just use the NIR passes which can produce better code
and should also result in faster compile times.
The speed-up can be measured in some dolphin uber shaders due to
no longer calling lower_if_to_cond_assign() for example
dolphin/ubershaders/120.shader_test goes from ~1.63 -> ~1.53
seconds on my machine.
There are some code changes as a result of not calling
lower_if_to_cond_assign(), this is because it flattens ifs that
contain UBOs where as NIR's peephole select doesn't. This is
were most of the regressions in Max Waves happens with shader-db.
shader-db results (VEGA):
Totals from affected shaders:
SGPRS:
2349056 ->
2349640 (0.02 %)
VGPRS:
1322160 ->
1323300 (0.09 %)
Spilled SGPRs: 21190 -> 21527 (1.59 %)
Spilled VGPRs: 99 -> 99 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 72 -> 72 (0.00 %) dwords per thread
Code Size:
57260904 ->
57270932 (0.02 %) bytes
Compile Time:
1107186 ->
1022942 (-7.61 %) milliseconds
LDS: 786 -> 786 (0.00 %) blocks
Max Waves: 391932 -> 391619 (-0.08 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Eric Anholt <eric@anholt.net>
Timothy Arceri [Thu, 21 Feb 2019 01:15:18 +0000 (12:15 +1100)]
radeonsi/nir: stop calling nir_lower_returns()
We now call this for all drivers in glsl_to_nir() instead.
Reviewed-by: Eric Anholt <eric@anholt.net>
Timothy Arceri [Thu, 21 Feb 2019 00:54:09 +0000 (11:54 +1100)]
i965: stop calling nir_lower_returns()
We now call this for all drivers in glsl_to_nir() instead.
Reviewed-by: Eric Anholt <eric@anholt.net>
Timothy Arceri [Wed, 20 Feb 2019 06:13:49 +0000 (17:13 +1100)]
glsl: use NIR function inlining for drivers that use glsl_to_nir()
glsl_to_nir() is still missing support for converting certain
functions to NIR, so for those we use the GLSL IR optimisations
to remove the functions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Timothy Arceri [Fri, 22 Feb 2019 00:51:24 +0000 (11:51 +1100)]
glsl/freedreno/panfrost: pass gl_context to the standalone compiler
This allows us to use the ctx with glsl_to_nir() in a following
patch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Lionel Landwerlin [Tue, 5 Mar 2019 12:19:10 +0000 (12:19 +0000)]
vulkan/overlay: drop dependency on validation layer headers
v2: reimplement layer chain info getters (Eric)
v3: make it compile.. (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Tue, 5 Mar 2019 10:38:14 +0000 (10:38 +0000)]
vulkan/util: generate instance/device dispatch tables
This will be used by the overlay instead of system installed
validation layers helpers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Mon, 25 Feb 2019 23:01:02 +0000 (23:01 +0000)]
vulkan/util: make header available from c++
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jose Maria Casanova Crespo [Wed, 27 Feb 2019 19:44:27 +0000 (20:44 +0100)]
iris: setup EdgeFlag Vertex Element when needed.
If Vertex Shader uses EdgeFlag the hardware request that it is setup
as the last VERTEX_ELEMENT_STATE. If SGVS are add at draw time we
need to also reconfigure the last 3DSTATE_VF_INSTANCING so its
VertexElementIndex points to the new Vertex Element that contains
the EdgeFlag.
So if draw parameters or edgeflag are not used the CSO generated at
iris_create_vertex_element is sent directly in the batches. But if
edge flag is used we adjust last VERTEX_ELEMENT_STATE and
last 3DSTATE_VF_INSTANCING using their alternative edge flag version
we generate at iris_create_vertex_element and store at the CSO.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 21 Feb 2019 20:47:37 +0000 (12:47 -0800)]
v3d: Include a count of register pressure in the RA failure dumps.
You usually want to go find the highest pressure and figure out why you
couldn't spill or what pattern led to a bunch of pressure leading to that
point.
Samuel Pitoiset [Wed, 6 Mar 2019 21:35:31 +0000 (22:35 +0100)]
radv: enable lower_mul_2x32_64
Fixes: 58bcebd987b ("spirv: Allow [i/u]mulExtended to use new nir opcode")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Jason Ekstrand [Mon, 4 Mar 2019 23:02:39 +0000 (17:02 -0600)]
st/nir: Move 64-bit lowering later
Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later. This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything. This substantially reduces both fp64 shader compile times
and the resulting code size.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 4 Mar 2019 22:11:57 +0000 (16:11 -0600)]
intel/nir: Move 64-bit lowering later
Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later. This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything. This substantially reduces both fp64 shader compile times
and the resulting code size. On the vs-isnan-dvec test from piglit:
Before this commit:
1684.63s user 17.29s system 99% cpu 28:28.24 total
101479 instructions. 0 loops. 802452 cycles. 79:369 spills:fills.
Peak memory usage (according to massif): 1.435 GB
After this commit:
179.64s user 7.75s system 99% cpu 3:07.92 total
57316 instructions. 0 loops. 459287 cycles. 0:0 spills:fills.
Peak memory usage (according to massif): 531.0 MB
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 4 Mar 2019 21:55:19 +0000 (15:55 -0600)]
nir/lower_doubles: Inline functions directly in lower_doubles
Instead of trusting the caller to already have created a softfp64
function shader and added all its functions to our shader, we simply
take the softfp64 shader as an argument and do the function inlining
ouselves. This means that there's no more nasty functions lying around
that the caller needs to worry about cleaning up.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 4 Mar 2019 22:17:02 +0000 (16:17 -0600)]
nir/deref: Expose nir_opt_deref_impl
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 4 Mar 2019 21:32:36 +0000 (15:32 -0600)]
nir/inline_functions: Break inlining into a builder helper
This pulls the guts of function inlining into a builder helper so that
it can be used elsewhere. The rest of the infrastructure is still
needed for most inlining cases to ensure that everything gets inlined
and only ever once. However, there are use-cases where you just want to
inline one little thing. This new helper also has a neat trick where it
can seamlessly inline a function from one nir_shader into another.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 4 Mar 2019 20:39:40 +0000 (14:39 -0600)]
glsl/nir: Inline functions in float64_funcs_to_nir
This doesn't really change anything as the functions will all get
inlined anyway. However it does let us do a bit of the work earlier and
in a common place.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sun, 3 Mar 2019 16:00:14 +0000 (10:00 -0600)]
glsl/nir: Add a shared helper for building float64 shaders
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 4 Mar 2019 22:01:23 +0000 (16:01 -0600)]
intel/nir: Drop an unneeded lower_constant_initializers call
Even though this is technically a step in the function inlining process
as laid out in nir_inline_functions.c, it's not really needed. We
already have constant initializers lowered here and no new ones are
added by appending the softfp64 functions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sun, 3 Mar 2019 16:10:46 +0000 (10:10 -0600)]
intel/debug: Add a debug flag to force software fp64
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 5 Mar 2019 04:54:44 +0000 (22:54 -0600)]
i965: Compile the fp64 program based on nir options
Instead of looking the devinfo directly, look at the lowering options we
provided to NIR. This is more accurate as it's now checking for "do we
need full software lowering" rather than a hardware bit.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sun, 3 Mar 2019 15:24:12 +0000 (09:24 -0600)]
nir: Teach loop unrolling about 64-bit instruction lowering
The lowering we do for 64-bit instructions can cause a single NIR ALU
instruction to blow up into hundreds or thousands of instructions
potentially with control flow. If loop unrolling isn't aware of this,
it can unroll a loop 20 times which contains a nir_op_fsqrt which we
then lower to a full software implementation based on integer math.
Those 20 invocations suddenly get a lot more expensive than NIR loop
unrolling currently expects. By giving it an approximate estimate
function, we can prevent loop unrolling from going to town when it
shouldn't.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>