mesa.git
5 years agotravis: fix meson build by letting `auto` do its job
Eric Engestrom [Fri, 8 Mar 2019 15:04:54 +0000 (15:04 +0000)]
travis: fix meson build by letting `auto` do its job

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoautotools: don't build libGLES*.so with GLVND
Eric Engestrom [Tue, 5 Mar 2019 11:49:33 +0000 (11:49 +0000)]
autotools: don't build libGLES*.so with GLVND

GLVND already provides these, so distro packagers have been deleting
them all along. Let's save ourselves the trouble and not build them in
the first place.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agomeson: don't build libGLES*.so with GLVND
Eric Engestrom [Tue, 5 Mar 2019 11:46:38 +0000 (11:46 +0000)]
meson: don't build libGLES*.so with GLVND

GLVND already provides these, so distro packagers have been deleting
them all along. Let's save ourselves the trouble and not build them in
the first place.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agopipebuffer: s/PB_ALL_USAGE_FLAGS/PB_USAGE_ALL/
Brian Paul [Fri, 8 Mar 2019 15:07:23 +0000 (08:07 -0700)]
pipebuffer: s/PB_ALL_USAGE_FLAGS/PB_USAGE_ALL/

To fix build failure.  I guess my meson configuration has assertions
disabled for some reason.

Trivial fix.

5 years agosvga: remove SVGA_RELOC_READ flag in SVGA3D_BindGBSurface()
Brian Paul [Thu, 7 Mar 2019 23:14:32 +0000 (16:14 -0700)]
svga: remove SVGA_RELOC_READ flag in SVGA3D_BindGBSurface()

This fixes a rendering issue where UBO updates aren't always picked
up by drawing calls.  This issue affected the Webots robotics
simulator.  VMware bug 2175527.

Testing Done: Webots replay, piglit, misc Linux games

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
5 years agosvga: refactor draw_vgpu10() function
Brian Paul [Wed, 6 Mar 2019 17:16:57 +0000 (10:16 -0700)]
svga: refactor draw_vgpu10() function

The draw_vgpu10() function was huge.  Move the code for preparing the
vertex buffers and the index buffer into separate functions.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
5 years agost/mesa: whitespace, formatting fixes in st_cb_flush.c
Brian Paul [Thu, 7 Mar 2019 23:44:06 +0000 (16:44 -0700)]
st/mesa: whitespace, formatting fixes in st_cb_flush.c

Trivial.

5 years agost/mesa: move, clean-up shader variant key decls/inits
Brian Paul [Wed, 6 Mar 2019 17:23:59 +0000 (10:23 -0700)]
st/mesa: move, clean-up shader variant key decls/inits

Move the variant key declarations inside the scope they're used.
Use designated initializers instead of memset() calls.
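
A minimal sketch of the memset()-to-designated-initializer change described
above. The struct and its fields are hypothetical stand-ins, not the actual
st/mesa variant key definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a shader variant key (not the real struct). */
struct variant_key {
   unsigned passthrough_edgeflags;
   unsigned clamp_color;
   unsigned persample_shading;
};

/* Old style: declare the key, zero it with memset(), then fill it in. */
static struct variant_key
make_key_memset(unsigned clamp_color)
{
   struct variant_key key;
   memset(&key, 0, sizeof(key));
   key.clamp_color = clamp_color;
   return key;
}

/* New style: a designated initializer in the scope where the key is used.
 * Members that are not named are zero-initialized. */
static struct variant_key
make_key_designated(unsigned clamp_color)
{
   struct variant_key key = { .clamp_color = clamp_color };
   return key;
}
```

Both helpers yield the same member values. One caveat worth keeping in mind:
an initializer is not guaranteed to zero padding bytes, so a key that is
compared with memcmp() and contains padding may still want memset().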

Reviewed-by: Neha Bhende <bhenden@vmware.com>
5 years agowinsys/svga: use new pb_usage_flags enum type
Brian Paul [Tue, 5 Mar 2019 21:20:29 +0000 (14:20 -0700)]
winsys/svga: use new pb_usage_flags enum type

And add a comment that we're implicitly converting PIPE_TRANSFER_
flags to PB_USAGE_ flags in one place.  And statically assert that
the enum values match.
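
A sketch of the idea, assuming two flag enums whose values are meant to line
up. The enum names and values here are invented for illustration and are not
the real PIPE_TRANSFER_ or PB_USAGE_ definitions.

```c
#include <assert.h>

/* Hypothetical flag enums; the real ones live in Mesa headers. */
enum transfer_flags {
   TRANSFER_READ  = 1 << 0,
   TRANSFER_WRITE = 1 << 1,
};

enum usage_flags {
   USAGE_CPU_READ  = 1 << 0,
   USAGE_CPU_WRITE = 1 << 1,
};

/* Fail the build if the two enums ever drift apart. */
_Static_assert((int)TRANSFER_READ == (int)USAGE_CPU_READ,
               "read flags must match");
_Static_assert((int)TRANSFER_WRITE == (int)USAGE_CPU_WRITE,
               "write flags must match");

static enum usage_flags
usage_from_transfer(enum transfer_flags flags)
{
   /* Implicit conversion made explicit: this relies on the equality
    * asserted above, so a plain cast is enough. */
   return (enum usage_flags)(flags & (TRANSFER_READ | TRANSFER_WRITE));
}
```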

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
5 years agopipebuffer: whitespace fixes in pb_buffer.h
Brian Paul [Wed, 6 Mar 2019 02:47:25 +0000 (19:47 -0700)]
pipebuffer: whitespace fixes in pb_buffer.h

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
5 years agopipebuffer: use new pb_usage_flags enum type
Brian Paul [Tue, 5 Mar 2019 21:08:35 +0000 (14:08 -0700)]
pipebuffer: use new pb_usage_flags enum type

Use a new enum type instead of 'unsigned' to make things a bit more
understandable.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
5 years agosvga: add svga shader type in the shader variant
Charmaine Lee [Wed, 6 Mar 2019 02:36:48 +0000 (19:36 -0700)]
svga: add svga shader type in the shader variant

With this patch, the svga shader type will be saved in the shader variant,
and there is no need to pass in the shader type to the define/destroy
variant functions.

Reviewed-by: Brian Paul <brianp@vmware.com>
5 years agogallium/util: add some const qualifiers in u_bitmask.c
Brian Paul [Tue, 5 Mar 2019 17:06:43 +0000 (10:06 -0700)]
gallium/util: add some const qualifiers in u_bitmask.c

And add/update comments.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agogallium/util: whitespace cleanups in u_bitmask.[ch]
Brian Paul [Tue, 5 Mar 2019 17:05:18 +0000 (10:05 -0700)]
gallium/util: whitespace cleanups in u_bitmask.[ch]

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agonir/linker: fix ARRAY_SIZE query with xfb varyings
Alejandro Piñeiro [Thu, 7 Mar 2019 15:57:10 +0000 (16:57 +0100)]
nir/linker: fix ARRAY_SIZE query with xfb varyings

For a non-array varying, it is expecting ARRAY_SIZE as 1, instead of 0.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: Fix TRANSFORM_FEEDBACK_BUFFER_INDEX
Antia Puentes [Sat, 22 Dec 2018 17:40:29 +0000 (18:40 +0100)]
nir/linker: Fix TRANSFORM_FEEDBACK_BUFFER_INDEX

From the ARB_enhanced_layouts specification:

  "For the property TRANSFORM_FEEDBACK_BUFFER_INDEX, a single integer
   identifying the index of the active transform feedback buffer
   associated with an active variable is written to <params>.  For
   variables corresponding to the special names "gl_NextBuffer",
   "gl_SkipComponents1", "gl_SkipComponents2", "gl_SkipComponents3",
   and "gl_SkipComponents4", -1 is written to <params>."

We were storing the xfb_buffer value, instead of the value
corresponding to GL_TRANSFORM_FEEDBACK_BUFFER_INDEX.

Note that the implementation assumes that varyings would be sorted by
offset and buffer.
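
One way to derive the buffer index from sorted varyings is sketched below.
The struct and function names are hypothetical, and the actual linker code
may differ; the point is only that the index counts active buffers rather
than repeating the xfb_buffer binding.

```c
#include <assert.h>

/* Hypothetical per-varying record. */
struct varying {
   unsigned buffer;     /* xfb_buffer binding from the shader */
   unsigned offset;
   int buffer_index;    /* index among the active buffers */
};

/* Assumes `v` is already sorted by buffer, then by offset. */
static void
assign_buffer_index(struct varying *v, unsigned count)
{
   int index = -1;
   for (unsigned i = 0; i < count; i++) {
      /* A change of buffer starts the next active buffer. */
      if (i == 0 || v[i].buffer != v[i - 1].buffer)
         index++;
      v[i].buffer_index = index;
   }
}
```

With bindings 1, 1 and 3, the varyings get indices 0, 0 and 1, since only two
buffers are active.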

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: use nir_gather_xfb_info
Alejandro Piñeiro [Wed, 7 Nov 2018 09:11:20 +0000 (10:11 +0100)]
nir/linker: use nir_gather_xfb_info

Instead of a custom ARB_gl_spirv xfb gather info pass.

In fact, this is not only about reusing code: the current custom
code did not properly handle how many varyings are enumerated from
some complex types. So this change also fixes some corner cases.

v2: Use util_bitcount, simplify current stage check (Kenneth)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/xfb: handle arrays and AoA of basic types
Alejandro Piñeiro [Thu, 10 Jan 2019 14:04:37 +0000 (15:04 +0100)]
nir/xfb: handle arrays and AoA of basic types

On OpenGL, an array of a simple type adds just one varying. So
gl_transform_feedback_varying_info struct defined at mtypes.h includes
the parameters Type (base_type) and Size (number of elements).

This commit checks this when the recursive add_var_xfb_outputs call
handles arrays, to ensure that just one is added.

We also need to take AoA into account here.

v2: use glsl_type_is_leaf from nir_types (Timothy Arceri)

v3: simplified AoA check, without the need to use glsl_type_is_leaf,
    using glsl_type_is_struct (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir_types: add glsl_type_is_struct helper
Alejandro Piñeiro [Thu, 7 Mar 2019 10:33:03 +0000 (11:33 +0100)]
nir_types: add glsl_type_is_struct helper

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/xfb: sort varyings too
Alejandro Piñeiro [Thu, 7 Mar 2019 16:42:49 +0000 (17:42 +0100)]
nir/xfb: sort varyings too

Right now we are only re-sorting outputs. But it is better to sort
varyings too, as the linker expects them to be sorted (as was done for
GLSL). For varyings, to make buffer_index easier to compute, we also
sort by buffer. We could do the same for outputs, but we lack a
reason for that, so we leave them as they are (sorted by offset only).
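
A sketch of a two-key sort for varyings using qsort(). The struct is a
hypothetical stand-in, not nir_xfb_varying_info, and the real comparator may
be written differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical varying record. */
struct xfb_varying {
   unsigned buffer;
   unsigned offset;
};

/* Order by buffer first, then by offset within a buffer. */
static int
compare_varyings(const void *a, const void *b)
{
   const struct xfb_varying *va = a;
   const struct xfb_varying *vb = b;

   if (va->buffer != vb->buffer)
      return va->buffer < vb->buffer ? -1 : 1;
   if (va->offset != vb->offset)
      return va->offset < vb->offset ? -1 : 1;
   return 0;
}

static void
sort_varyings(struct xfb_varying *v, size_t count)
{
   qsort(v, count, sizeof(*v), compare_varyings);
}
```

Comparing explicitly instead of subtracting avoids wrap-around with unsigned
values.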

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/xfb: adding varyings on nir_xfb_info and gather_info
Alejandro Piñeiro [Wed, 9 Jan 2019 17:19:45 +0000 (18:19 +0100)]
nir/xfb: adding varyings on nir_xfb_info and gather_info

In order to be used for OpenGL (right now for ARB_gl_spirv).

This commit adds two new structures:

  * nir_xfb_varying_info: that identifies each individual varying. For
    each one, we need to know the type, buffer and xfb_offset

  * nir_xfb_buffer_info: as now for each buffer, in addition to the
    stride, we need to know how many varyings are assigned to it.

For this patch, the only case where num_outputs != num_varyings is
doubles, since a dvec3/4 could require more than one output. There
are more cases though (like AoA), which will be handled in following
patches.

v2: updated after new nir general XFB support introduced for "anv: Add
    support for VK_EXT_transform_feedback"

v3: compute num_varyings beforehand for allocating, instead of relying
    on num_outputs as approximate value (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir_types: add glsl_varying_count helper
Alejandro Piñeiro [Wed, 6 Mar 2019 14:15:54 +0000 (15:15 +0100)]
nir_types: add glsl_varying_count helper

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/xfb: add component_offset at nir_xfb_info
Alejandro Piñeiro [Tue, 6 Nov 2018 17:10:01 +0000 (18:10 +0100)]
nir/xfb: add component_offset at nir_xfb_info

Where component_offset here is the offset when accessing components of
a packed variable. Or in other words, location_frac on
nir.h. Different places of mesa use different names for it.

Technically a nir_xfb_info consumer can get the same from the
component_mask, but it seems somewhat forced to make it compute it
instead of providing it.

v2: rename local location_frac for comp_offset, more similar to the
intended use (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoRevert "radv: execute external subpass barriers after ending subpasses"
Samuel Pitoiset [Fri, 8 Mar 2019 13:51:02 +0000 (14:51 +0100)]
Revert "radv: execute external subpass barriers after ending subpasses"

This change is actually wrong because we have to sync
before doing image layout transitions.

This fixes rendering issues in Batman, Path of Exile and
probably more titles.

This reverts commit 76c17cfd8da017ebd19be33ba6cef888957a6758.

Fixes: 76c17cfd8da ("radv: execute external subpass barriers after ending subpasses")
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/error2aub: support older style engine names
Lionel Landwerlin [Fri, 16 Nov 2018 18:13:36 +0000 (18:13 +0000)]
intel/error2aub: support older style engine names

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: deal with GuC log buffer
Lionel Landwerlin [Tue, 4 Sep 2018 16:33:45 +0000 (17:33 +0100)]
intel/error2aub: deal with GuC log buffer

When GuC is enabled, the error state will contain a "global" buffer
for the GuC log buffer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: add a verbose option
Lionel Landwerlin [Thu, 23 Aug 2018 18:01:47 +0000 (19:01 +0100)]
intel/error2aub: add a verbose option

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: write GGTT buffers into the aub file
Lionel Landwerlin [Tue, 4 Sep 2018 13:45:37 +0000 (14:45 +0100)]
intel/error2aub: write GGTT buffers into the aub file

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: store engine last ring buffer head/tail pointers
Lionel Landwerlin [Tue, 4 Sep 2018 13:50:13 +0000 (14:50 +0100)]
intel/error2aub: store engine last ring buffer head/tail pointers

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: annotate buffer with their address space
Lionel Landwerlin [Tue, 4 Sep 2018 13:18:35 +0000 (14:18 +0100)]
intel/error2aub: annotate buffer with their address space

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: parse other buffer types
Lionel Landwerlin [Tue, 4 Sep 2018 12:44:49 +0000 (13:44 +0100)]
intel/error2aub: parse other buffer types

We don't write them in the aub file yet.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: strengthen batchbuffer identifier marker
Lionel Landwerlin [Tue, 4 Sep 2018 12:36:11 +0000 (13:36 +0100)]
intel/error2aub: strengthen batchbuffer identifier marker

Found out that some base64 data matched the '---' identifier. We can
avoid this by adding the surrounding spaces.
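
The difference between the two checks can be sketched as follows. The helper
names and the sample lines used with them are made up for illustration; only
the '---' marker and the surrounding spaces come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Loose check: any line containing "---" is taken for a buffer header. */
static bool
is_marker_loose(const char *line)
{
   return strstr(line, "---") != NULL;
}

/* Stricter check: require the spaces around the dashes as well. */
static bool
is_marker_strict(const char *line)
{
   return strstr(line, " --- ") != NULL;
}
```

A header such as "rcs0 --- gtt_offset = 0x1000" passes both checks, while an
encoded data line such as ":abc---def" passes only the loose one.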

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: identify buffers by engine
Lionel Landwerlin [Tue, 4 Sep 2018 12:32:44 +0000 (13:32 +0100)]
intel/error2aub: identify buffers by engine

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/error2aub: build a list of BOs before writing them
Lionel Landwerlin [Thu, 23 Aug 2018 17:06:22 +0000 (18:06 +0100)]
intel/error2aub: build a list of BOs before writing them

The error state contains several kinds of BOs, including the context
image which we will want to write in a later commit. Because it can
come later in the error state than the user buffers and because we
need to write it first in the aub file, we have to first build a list
of BOs and then write them in the appropriate order.
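
A rough sketch of the collect-then-write approach, assuming a simple array of
parsed BOs. The types and the order_bos() helper are hypothetical and stand
in for whatever list error2aub actually builds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical BO kinds and record. */
enum bo_type { BO_USER, BO_CONTEXT_IMAGE };

struct bo {
   enum bo_type type;
   int id;
};

/* Fill `out` with BO ids in write order: context images first, then
 * everything else in the order it was parsed. Returns the count. */
static size_t
order_bos(const struct bo *bos, size_t count, int *out)
{
   size_t n = 0;

   for (size_t i = 0; i < count; i++)
      if (bos[i].type == BO_CONTEXT_IMAGE)
         out[n++] = bos[i].id;

   for (size_t i = 0; i < count; i++)
      if (bos[i].type != BO_CONTEXT_IMAGE)
         out[n++] = bos[i].id;

   return n;
}
```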

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agoiris: Wire up EGL_IMG_context_priority
Chris Wilson [Fri, 22 Feb 2019 23:31:56 +0000 (23:31 +0000)]
iris: Wire up EGL_IMG_context_priority

Add the missing PIPE_CAP_CONTEXT_PRIORITY_MASK and parsing of the context
construction flags.

Testcase: piglit/egl-context-priority

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Export a copy_region helper that doesn't flush
Kenneth Graunke [Mon, 24 Dec 2018 08:27:09 +0000 (00:27 -0800)]
iris: Export a copy_region helper that doesn't flush

I'll want to use this for transfer maps, which already do their own
flushing.  This lets us avoid a double flush, and also gives us more
control over the batch which is selected.

5 years agoiris: Spruce up "are we using this engine?" checks for flushing
Kenneth Graunke [Tue, 5 Mar 2019 09:21:53 +0000 (01:21 -0800)]
iris: Spruce up "are we using this engine?" checks for flushing

We were using batch->contains_draw as a proxy for "are we even using
this engine?"  That isn't quite right, because it only counts regular
draws.  BLORP operations may have also rendered to a resource, which
needs to trigger flushing.  To check for this, we also see if the
render and sometimes depth caches are non-empty.

We can also drop the "but there might already be stale data in the
cache even if we haven't emitted any commands yet" concern in the
comments.  The kernel flushes caches between batches.

This may not be great but it's at least better than what was there.

5 years agoradeonsi/nir: Only set window_space_position for vertex shaders.
Timur Kristóf [Thu, 7 Mar 2019 07:19:02 +0000 (08:19 +0100)]
radeonsi/nir: Only set window_space_position for vertex shaders.

By mistake, this was previously set for all shaders.
It is a vertex shader property, so it only makes sense to
set it for vertex shaders.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
5 years agonir/builder: Add a build_deref_array_imm helper
Jason Ekstrand [Thu, 7 Mar 2019 17:45:13 +0000 (11:45 -0600)]
nir/builder: Add a build_deref_array_imm helper

Unlike most of the cases in which we do this by hand, the new helper
properly handles non-32-bit pointers.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agonir/builder: Cast array indices in build_deref_follower
Jason Ekstrand [Tue, 5 Mar 2019 22:06:31 +0000 (16:06 -0600)]
nir/builder: Cast array indices in build_deref_follower

There's no guarantee when build_deref_follower is called that the two
derefs have the same bit size destination.  Insert a cast on the array
index in case we have differing bit sizes.  While we're here, insert
some asserts in build_deref_array and build_deref_ptr_as_array.  The
validator will catch violations here but they're easier to debug if we
catch them while building.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agonir/builder: Emit better code for iadd/imul_imm
Jason Ekstrand [Wed, 6 Mar 2019 18:27:26 +0000 (12:27 -0600)]
nir/builder: Emit better code for iadd/imul_imm

Because we already know the immediate right-hand parameter, we can
potentially save the optimizer a bit of work.
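
A toy model of what knowing the immediate up front can buy. The message does
not list the special cases, so the ones below (x + 0, x * 0, x * 1 and
power-of-two multiplies) are assumptions, and the types are invented rather
than taken from nir_builder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the instruction a builder would emit. */
enum op { OP_COPY, OP_ZERO, OP_ADD, OP_MUL, OP_SHL };

struct emitted {
   enum op op;
   uint64_t imm;
};

static struct emitted
emit_iadd_imm(uint64_t y)
{
   if (y == 0)
      return (struct emitted){ OP_COPY, 0 };   /* x + 0 is just x */
   return (struct emitted){ OP_ADD, y };
}

static struct emitted
emit_imul_imm(uint64_t y)
{
   if (y == 0)
      return (struct emitted){ OP_ZERO, 0 };   /* x * 0 is 0 */
   if (y == 1)
      return (struct emitted){ OP_COPY, 0 };   /* x * 1 is just x */
   if ((y & (y - 1)) == 0) {
      /* Power of two: a left shift by log2(y) does the same job. */
      uint64_t shift = 0;
      while ((y >> shift) != 1)
         shift++;
      return (struct emitted){ OP_SHL, shift };
   }
   return (struct emitted){ OP_MUL, y };
}
```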

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agofreedreno/a6xx: perfcntrs
Rob Clark [Thu, 7 Mar 2019 19:22:24 +0000 (14:22 -0500)]
freedreno/a6xx: perfcntrs

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/a6xx: fix border-color swizzles
Rob Clark [Wed, 6 Mar 2019 15:34:53 +0000 (10:34 -0500)]
freedreno/a6xx: fix border-color swizzles

Fixes nearly all of the remaining
dEQP-GLES31.functional.texture.border_clamp.formats.* fails

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
5 years agofreedreno/a6xx: refactor fd6_tex_swiz()
Rob Clark [Wed, 6 Mar 2019 15:04:21 +0000 (10:04 -0500)]
freedreno/a6xx: refactor fd6_tex_swiz()

We need a version of fd6_tex_swiz() that just returns the composed
swizzle without building part of the TEX_CONST_0 state.  So just
refactor the existing function to build more of the TEX_CONST_0 state,
and leave fd6_tex_swiz() simply composing swizzles.

The small IBO state change (to use LINEAR for smaller sizes/levels) is
to match the state in fd6_tex_const_0().  It seems like maybe tiled
actually works at the smaller sizes but not if minification is in play,
so best just to make images match what we do for textures.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
5 years agofreedreno/a6xx: remove astc_srgb workaround
Rob Clark [Wed, 6 Mar 2019 14:44:14 +0000 (09:44 -0500)]
freedreno/a6xx: remove astc_srgb workaround

Not used on a6xx, so remove some of the related plumbing that was copied
over from older gens.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: fix ir3_cmdline build
Rob Clark [Thu, 7 Mar 2019 20:32:11 +0000 (15:32 -0500)]
freedreno: fix ir3_cmdline build

Fixes: 7530d4abfcf glsl/freedreno/panfrost: pass gl_context to the standalone compiler
Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agoiris: Drop PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY
Kenneth Graunke [Wed, 19 Dec 2018 10:17:42 +0000 (02:17 -0800)]
iris: Drop PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY

This cap is mainly for working around a r600 texture swizzle issue,
but it also controls whether ARB_texture_buffer_object (with legacy
formats) is enabled.  I suspect the missing I/L/A/LA faking is why
I had it set in the first place.

Thanks to Ilia for pointing out that I shouldn't be setting this.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: Properly support alpha and luminance-alpha formats
Kenneth Graunke [Fri, 22 Feb 2019 06:49:40 +0000 (22:49 -0800)]
iris: Properly support alpha and luminance-alpha formats

For texturing, we map alpha formats to the corresponding red format,
as many alpha formats are outright missing, and red is more efficient
when sampling anyway.

When rendering to A8_UNORM, we use that format directly, so the image
gets the shader output's .a/.w channel, rather than the .r/.x channel.

All other A* formats are non-renderable, so we can't do much and just
mark them as unsupported for rendering.  Fortunately, GL only requires
rendering to A8_UNORM, so that works out.

According to Andre Heider and Timur Kristóf, this fixes font rendering
in Witcher 1 (via nine).  Andre also reported that it fixes Unigine
Heaven (presumably via nine).

v2: Use the same swizzle for both sampler views and "render targets".
    BLORP expects the read swizzle, and will take the inverse when
    setting up the destination swizzle (and actually applying it in
    the shaders).  We ignore the format swizzle when setting up normal
    rendering SURFACE_STATEs, which is necessary because it would be
    an illegal shader channel select combination.  Thanks to Jason
    Ekstrand for pointing out that BLORP took an inverse swizzle.

Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: Defer uploading sampler state tables until draw time
Kenneth Graunke [Tue, 4 Dec 2018 23:34:30 +0000 (15:34 -0800)]
iris: Defer uploading sampler state tables until draw time

Gallium might call us multiple times to bind subsets of the samplers,
at which point we'd recreate the table a bunch of times.  It doesn't
really buy us anything to do it here - even if we defer to draw time,
the dirty tracking ensures we'll only do it on the first draw after a
bind_sampler_states() call.

We now use the number of samplers specified by the shader instead of
the binding count.  If this number changes, we flag sampler state as
dirty so we re-upload a table with the right number of entries.
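
The deferred upload and the dirty flag can be sketched like this. The context
struct and helpers are hypothetical, not the iris data structures; only the
bind_sampler_states() name is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver context. */
struct ctx {
   unsigned num_samplers;        /* sampler count of the bound shader */
   bool sampler_states_dirty;
   unsigned uploads;             /* table uploads, counted for illustration */
};

/* Binding only flags the state; no table is built here. */
static void
bind_sampler_states(struct ctx *c)
{
   c->sampler_states_dirty = true;
}

/* A shader with a different sampler count needs a table of a new size. */
static void
bind_shader(struct ctx *c, unsigned shader_num_samplers)
{
   if (c->num_samplers != shader_num_samplers) {
      c->num_samplers = shader_num_samplers;
      c->sampler_states_dirty = true;
   }
}

/* The table is uploaded at most once per dirtying event, at draw time. */
static void
draw(struct ctx *c)
{
   if (c->sampler_states_dirty) {
      c->uploads++;
      c->sampler_states_dirty = false;
   }
}
```

Several binds followed by two draws cost a single upload in this model.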

This also fixes a bug where ice->state.need_border_colors was never
unset, so once something needed border colors, the pool would always
be pinned in all future batches.

v2: Explicitly flag sampler states as dirty, rather than assuming that
    bind_sampler_states() will be called if the program texture count
    changes.  While this may be true for st/mesa, it isn't the case for
    Gallium HUD.

Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: Plumb through ISL_SWIZZLE_IDENTITY in buffer surface emitters
Kenneth Graunke [Thu, 28 Feb 2019 09:13:33 +0000 (01:13 -0800)]
iris: Plumb through ISL_SWIZZLE_IDENTITY in buffer surface emitters

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoisl: Add a swizzle parameter to isl_buffer_fill_state()
Kenneth Graunke [Thu, 28 Feb 2019 09:13:33 +0000 (01:13 -0800)]
isl: Add a swizzle parameter to isl_buffer_fill_state()

This is necessary for legacy texture buffer object formats, where we'll
need to use a swizzle to fake e.g. luminance.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: fix decode_get_bo callback
Lionel Landwerlin [Thu, 7 Mar 2019 16:59:53 +0000 (16:59 +0000)]
iris: fix decode_get_bo callback

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: acb50d6b1ff1b7 ("intel/decoders: handle decoding MI_BBS from ring")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agovirgl: remove unused variable
Erik Faye-Lund [Wed, 6 Mar 2019 13:43:15 +0000 (14:43 +0100)]
virgl: remove unused variable

This variable is now unused, so let's remove it.

Fixes: 9c4930946a5 (virgl: add encoder functions for new protocol)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: remove unused variable
Erik Faye-Lund [Wed, 6 Mar 2019 13:41:54 +0000 (14:41 +0100)]
virgl: remove unused variable

This variable is now unused, so let's remove it.

Fixes: db77573d7ba (virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: remove unused variable
Erik Faye-Lund [Wed, 6 Mar 2019 13:40:04 +0000 (14:40 +0100)]
virgl: remove unused variable

This variable is now unused, so let's remove it.

Fixes: c19aedcf1a8 (virgl: don't mark unclean after a flush)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: remove unused variables
Erik Faye-Lund [Wed, 6 Mar 2019 13:36:15 +0000 (14:36 +0100)]
virgl: remove unused variables

These variables are now unused, let's remove them to get rid of a few
warnings.

Fixes: f0e71b10888 (virgl: use transfer queue)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agoiris: fix decoder call
Lionel Landwerlin [Thu, 7 Mar 2019 16:14:13 +0000 (16:14 +0000)]
iris: fix decoder call

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: acb50d6b1ff1b7 ("intel/decoders: handle decoding MI_BBS from ring")
5 years agointel/aub_write: factorize context image/pphwsp/ring creation
Lionel Landwerlin [Mon, 3 Sep 2018 14:11:08 +0000 (15:11 +0100)]
intel/aub_write: factorize context image/pphwsp/ring creation

We allocate GGTT entries and physical addresses as we create engines
rather than having a fixed layout.

Context images now receive a parameter argument which is used to setup
pml4 & ring buffer addresses.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: turn context images arrays into functions
Lionel Landwerlin [Mon, 3 Sep 2018 14:10:06 +0000 (15:10 +0100)]
intel/aub_write: turn context images arrays into functions

We'll make them more parameterized in a later commit.

As this is just a transitional commit, we allow ourselves to leak the
context images allocated in get_context_init(). We'll fix this in the
next commit.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: store the physical page allocator in struct
Lionel Landwerlin [Thu, 23 Aug 2018 23:03:28 +0000 (00:03 +0100)]
intel/aub_write: store the physical page allocator in struct

We want to use this allocator in the next commit for GGTT pages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: log mmio writes
Lionel Landwerlin [Sat, 25 Aug 2018 00:40:29 +0000 (01:40 +0100)]
intel/aub_write: log mmio writes

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: switch to use i915_drm engine classes
Lionel Landwerlin [Sun, 26 Aug 2018 13:35:30 +0000 (14:35 +0100)]
intel/aub_write: switch to use i915_drm engine classes

Prepare aub write to deal with multiple engine instances. We don't
pass the instance number yet; this could be done in the future by
having a two-dimensional array of struct engine.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: break execlist write in 2
Lionel Landwerlin [Thu, 23 Aug 2018 23:37:03 +0000 (00:37 +0100)]
intel/aub_write: break execlist write in 2

We want to reuse the execlist submission, but won't need the ring
buffer update.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: write header in init
Lionel Landwerlin [Sun, 26 Aug 2018 23:19:29 +0000 (00:19 +0100)]
intel/aub_write: write header in init

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: split comment section from HW setup
Lionel Landwerlin [Thu, 23 Aug 2018 19:36:16 +0000 (20:36 +0100)]
intel/aub_write: split comment section from HW setup

In the future we'll want error2aub to reuse the context image saved by
i915 instead of the default one we write in intel_dump_gpu.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_read: reuse defines from gen_context
Lionel Landwerlin [Thu, 23 Aug 2018 16:07:08 +0000 (17:07 +0100)]
intel/aub_read: reuse defines from gen_context

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/decoders: limit number of decoded batchbuffers
Lionel Landwerlin [Tue, 4 Sep 2018 14:45:32 +0000 (15:45 +0100)]
intel/decoders: limit number of decoded batchbuffers

IGT has a test to hang the GPU that works by having a batch buffer
jump back into itself, triggering an infinite loop on the command stream.
As our implementation of the decoding is "perfectly" mimicking the
hardware, our decoder also "hangs". This change limits the number of
batch buffers we'll decode before we bail to 100.
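
A small model of the cap, assuming a toy command stream in which each batch
either jumps to another batch or ends. The names are hypothetical; only the
limit of 100 comes from the description above.

```c
#include <assert.h>

#define MAX_DECODED_BATCHES 100

/* Toy command stream: next[i] is the batch that batch i jumps to,
 * or -1 if batch i ends the chain. Returns how many were decoded. */
static int
decode_chain(const int *next, int start)
{
   int decoded = 0;

   for (int cur = start; cur >= 0; cur = next[cur]) {
      if (decoded >= MAX_DECODED_BATCHES)
         break;   /* bail out instead of following the loop forever */
      decoded++;
   }

   return decoded;
}
```

A batch that jumps to itself stops after 100 iterations, while an ordinary
chain is decoded in full.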

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/decoders: handle decoding MI_BBS from ring
Lionel Landwerlin [Tue, 28 Aug 2018 10:41:42 +0000 (11:41 +0100)]
intel/decoders: handle decoding MI_BBS from ring

An MI_BATCH_BUFFER_START in the ring buffer acts as a second level
batchbuffer (i.e. we jump back to the ring buffer when running into an
MI_BATCH_BUFFER_END).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/decoders: add address space indicator to get BOs
Lionel Landwerlin [Sun, 26 Aug 2018 12:52:47 +0000 (13:52 +0100)]
intel/decoders: add address space indicator to get BOs

Some commands like MI_BATCH_BUFFER_START have this indicator.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agovulkan/overlay: fix missing var rename in previous commit
Eric Engestrom [Thu, 7 Mar 2019 13:45:14 +0000 (13:45 +0000)]
vulkan/overlay: fix missing var rename in previous commit

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovulkan/util: use the platform defines in vk.xml instead of hard-coding them
Eric Engestrom [Tue, 5 Mar 2019 15:57:34 +0000 (15:57 +0000)]
vulkan/util: use the platform defines in vk.xml instead of hard-coding them

See also: 3d4238d26c5de4a0f7a5 "anv: use the platform defines in vk.xml
                                instead of hard-coding them"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris: add support for tgsi_to_nir
Andre Heider [Sun, 10 Feb 2019 17:31:59 +0000 (18:31 +0100)]
iris: add support for tgsi_to_nir

The Gallium Nine state tracker now works on iris.

Also tested with GALLIUM_HUD and Star Wars: Knights of the Old
Republic on WINE (GL_ATI_fragment_shader).

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: free dead_ctx in case of no progress
Tapani Pälli [Wed, 6 Mar 2019 10:30:22 +0000 (12:30 +0200)]
nir: free dead_ctx in case of no progress

Fixes a leak:

  ==7576== 320 (48 direct, 272 indirect) bytes in 1 blocks are definitely lost in loss record 26 of 26
  ==7576==    at 0x4C2EE3B: malloc (vg_replace_malloc.c:309)
  ==7576==    by 0x53EF0E4: ralloc_size (ralloc.c:119)
  ==7576==    by 0x53EF0C2: ralloc_context (ralloc.c:113)
  ==7576==    by 0x5471F64: nir_split_per_member_structs (nir_split_per_member_structs.c:176)
  ==7576==    by 0x51288CF: anv_shader_compile_to_nir (anv_pipeline.c:216)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
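
The kind of leak fixed in "nir: free dead_ctx in case of no progress" can be
sketched roughly as follows. This is only an illustration of the pattern (a
scratch context allocated up front and an early return when the pass makes no
progress), not the actual NIR code. All names here are hypothetical, and plain
malloc/free stands in for Mesa's ralloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a scratch allocation context.
 * This is NOT Mesa's ralloc API; the counter only tracks live contexts. */
static int live_contexts = 0;

static void *scratch_context_create(void)
{
   void *ctx = malloc(16);
   if (ctx)
      live_contexts++;
   return ctx;
}

static void scratch_context_free(void *ctx)
{
   if (ctx) {
      assert(live_contexts > 0);
      free(ctx);
      live_contexts--;
   }
}

/* A pass that allocates scratch state up front and may find nothing to do. */
static bool example_pass(bool has_work)
{
   void *dead_ctx = scratch_context_create();
   if (!dead_ctx)
      return false;

   if (!has_work) {
      /* The easy-to-miss path: free the context before the early return. */
      scratch_context_free(dead_ctx);
      return false;
   }

   /* ... the real work would go here ... */
   scratch_context_free(dead_ctx);
   return true;
}
```

With both exits freeing the context, live_contexts is back to zero after every
call, whether or not the pass reported progress.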
5 years agoanv: call blob_finish when done with it
Tapani Pälli [Wed, 6 Mar 2019 10:27:30 +0000 (12:27 +0200)]
anv: call blob_finish when done with it

Fixes leaks from anv_device_upload_nir:

  ==7345== 8,192 bytes in 2 blocks are definitely lost in loss record 24 of 24
  ==7345==    at 0x4C2ED78: malloc (vg_replace_malloc.c:308)
  ==7345==    by 0x4C31393: realloc (vg_replace_malloc.c:836)
  ==7345==    by 0x54E0848: grow_to_fit (blob.c:67)
  ==7345==    by 0x54E0BE5: blob_reserve_bytes (blob.c:166)
  ==7345==    by 0x54E0C7C: blob_reserve_intptr (blob.c:186)
  ==7345==    by 0x54704A7: nir_serialize (nir_serialize.c:1091)
  ==7345==    by 0x512F97D: anv_device_upload_nir (anv_pipeline_cache.c:756)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoanv: use anv_gem_munmap in block pool cleanup
Tapani Pälli [Wed, 6 Mar 2019 08:49:21 +0000 (10:49 +0200)]
anv: use anv_gem_munmap in block pool cleanup

Use anv_gem_munmap to unmap when softpin is in use; this corresponds
to the anv_gem_mmap used in anv_block_pool_expand_range. This fixes
valgrind errors seen for each pool when softpin is in use:

  ==25581== 262,144 bytes in 1 blocks are definitely lost in loss record 31 of 31
  ==25581==    at 0x50E77E8: anv_gem_mmap (anv_gem.c:96)
  ==25581==    by 0x50EEE2B: anv_block_pool_expand_range (anv_allocator.c:543)
  ==25581==    by 0x50EEB51: anv_block_pool_init (anv_allocator.c:477)
  ==25581==    by 0x50EF7EF: anv_state_pool_init (anv_allocator.c:920)
  ==25581==    by 0x510B8EB: anv_CreateDevice (anv_device.c:2031)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris: Fix MOCS for blits and clears
Kenneth Graunke [Wed, 6 Mar 2019 22:49:39 +0000 (14:49 -0800)]
iris: Fix MOCS for blits and clears

I915_MOCS_CACHED is the wrong value.  Expose mocs() and use that.

5 years agost/glsl: start splitting out common st glsl conversion code
Timothy Arceri [Wed, 4 Apr 2018 06:01:21 +0000 (16:01 +1000)]
st/glsl: start splitting out common st glsl conversion code

The NIR and TGSI paths are currently intertwined, which not only
makes the code hard to follow but also makes it hard to take
advantage of the differences in IR.

Here we take the first step to splitting that path apart. With
this we take the opportunity to no longer call the GLSL IR
optimisation passes after the final lowering calls for NIR. We
can instead just use the NIR passes which can produce better code
and should also result in faster compile times.

The speed-up can be measured in some dolphin uber shaders due to
no longer calling lower_if_to_cond_assign(). For example,
dolphin/ubershaders/120.shader_test goes from ~1.63 -> ~1.53
seconds on my machine.

There are some code changes as a result of not calling
lower_if_to_cond_assign(). This is because it flattens ifs that
contain UBOs, whereas NIR's peephole select doesn't. This is
where most of the regressions in Max Waves happen with shader-db.

shader-db results (VEGA):

Totals from affected shaders:
SGPRS: 2349056 -> 2349640 (0.02 %)
VGPRS: 1322160 -> 1323300 (0.09 %)
Spilled SGPRs: 21190 -> 21527 (1.59 %)
Spilled VGPRs: 99 -> 99 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 72 -> 72 (0.00 %) dwords per thread
Code Size: 57260904 -> 57270932 (0.02 %) bytes
Compile Time: 1107186 -> 1022942 (-7.61 %) milliseconds
LDS: 786 -> 786 (0.00 %) blocks
Max Waves: 391932 -> 391619 (-0.08 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Eric Anholt <eric@anholt.net>
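
For reference, the percentages in the shader-db totals above appear to be the
plain relative change between the two columns. A minimal sketch of that
arithmetic follows; percent_change and print_row are hypothetical helpers
written for this note and are not part of shader-db.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from `before` to `after`, in percent.
 * Assumption: a zero baseline is reported as 0.00 %, which matches the
 * "0 -> 0 (0.00 %)" rows in the totals; other zero-baseline cases are
 * not shown there. */
static double percent_change(double before, double after)
{
   if (before == 0.0)
      return 0.0;
   return (after - before) / before * 100.0;
}

/* Print one row in the same "before -> after (change %)" layout. */
static void print_row(const char *name, double before, double after)
{
   assert(name != NULL);
   printf("%s: %.0f -> %.0f (%.2f %%)\n",
          name, before, after, percent_change(before, after));
}
```

For example, print_row("Compile Time", 1107186, 1022942) gives -7.61 %, and
print_row("Spilled SGPRs", 21190, 21527) gives 1.59 %, matching the rows above.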
5 years agoradeonsi/nir: stop calling nir_lower_returns()
Timothy Arceri [Thu, 21 Feb 2019 01:15:18 +0000 (12:15 +1100)]
radeonsi/nir: stop calling nir_lower_returns()

We now call this for all drivers in glsl_to_nir() instead.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoi965: stop calling nir_lower_returns()
Timothy Arceri [Thu, 21 Feb 2019 00:54:09 +0000 (11:54 +1100)]
i965: stop calling nir_lower_returns()

We now call this for all drivers in glsl_to_nir() instead.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoglsl: use NIR function inlining for drivers that use glsl_to_nir()
Timothy Arceri [Wed, 20 Feb 2019 06:13:49 +0000 (17:13 +1100)]
glsl: use NIR function inlining for drivers that use glsl_to_nir()

glsl_to_nir() is still missing support for converting certain
functions to NIR, so for those we use the GLSL IR optimisations
to remove the functions.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoglsl/freedreno/panfrost: pass gl_context to the standalone compiler
Timothy Arceri [Fri, 22 Feb 2019 00:51:24 +0000 (11:51 +1100)]
glsl/freedreno/panfrost: pass gl_context to the standalone compiler

This allows us to use the ctx with glsl_to_nir() in a following
patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agovulkan/overlay: drop dependency on validation layer headers
Lionel Landwerlin [Tue, 5 Mar 2019 12:19:10 +0000 (12:19 +0000)]
vulkan/overlay: drop dependency on validation layer headers

v2: reimplement layer chain info getters (Eric)

v3: make it compile.. (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agovulkan/util: generate instance/device dispatch tables
Lionel Landwerlin [Tue, 5 Mar 2019 10:38:14 +0000 (10:38 +0000)]
vulkan/util: generate instance/device dispatch tables

This will be used by the overlay instead of the system-installed
validation layer helpers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agovulkan/util: make header available from c++
Lionel Landwerlin [Mon, 25 Feb 2019 23:01:02 +0000 (23:01 +0000)]
vulkan/util: make header available from c++

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoiris: setup EdgeFlag Vertex Element when needed.
Jose Maria Casanova Crespo [Wed, 27 Feb 2019 19:44:27 +0000 (20:44 +0100)]
iris: setup EdgeFlag Vertex Element when needed.

If the Vertex Shader uses EdgeFlag, the hardware requires that it be
set up as the last VERTEX_ELEMENT_STATE. If SGVS are added at draw
time we also need to reconfigure the last 3DSTATE_VF_INSTANCING so its
VertexElementIndex points to the new Vertex Element that contains
the EdgeFlag.

So if draw parameters or edgeflag are not used, the CSO generated at
iris_create_vertex_element is sent directly in the batches. But if
edge flag is used, we adjust the last VERTEX_ELEMENT_STATE and the
last 3DSTATE_VF_INSTANCING using the alternative edge flag versions
we generate at iris_create_vertex_element and store in the CSO.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agov3d: Include a count of register pressure in the RA failure dumps.
Eric Anholt [Thu, 21 Feb 2019 20:47:37 +0000 (12:47 -0800)]
v3d: Include a count of register pressure in the RA failure dumps.

You usually want to go find the highest pressure and figure out why you
couldn't spill or what pattern led to a bunch of pressure leading to that
point.

5 years agoradv: enable lower_mul_2x32_64
Samuel Pitoiset [Wed, 6 Mar 2019 21:35:31 +0000 (22:35 +0100)]
radv: enable lower_mul_2x32_64

Fixes: 58bcebd987b ("spirv: Allow [i/u]mulExtended to use new nir opcode")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agost/nir: Move 64-bit lowering later
Jason Ekstrand [Mon, 4 Mar 2019 23:02:39 +0000 (17:02 -0600)]
st/nir: Move 64-bit lowering later

Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later.  This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything.  This substantially reduces both fp64 shader compile times
and the resulting code size.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/nir: Move 64-bit lowering later
Jason Ekstrand [Mon, 4 Mar 2019 22:11:57 +0000 (16:11 -0600)]
intel/nir: Move 64-bit lowering later

Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later.  This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything.  This substantially reduces both fp64 shader compile times
and the resulting code size.  On the vs-isnan-dvec test from piglit:

Before this commit:

    1684.63s user 17.29s system 99% cpu 28:28.24 total
    101479 instructions. 0 loops. 802452 cycles. 79:369 spills:fills.
    Peak memory usage (according to massif): 1.435 GB

After this commit:

    179.64s user 7.75s system 99% cpu 3:07.92 total
    57316 instructions. 0 loops. 459287 cycles. 0:0 spills:fills.
    Peak memory usage (according to massif): 531.0 MB

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_doubles: Inline functions directly in lower_doubles
Jason Ekstrand [Mon, 4 Mar 2019 21:55:19 +0000 (15:55 -0600)]
nir/lower_doubles: Inline functions directly in lower_doubles

Instead of trusting the caller to already have created a softfp64
function shader and added all its functions to our shader, we simply
take the softfp64 shader as an argument and do the function inlining
ourselves.  This means that there are no more nasty functions lying around
that the caller needs to worry about cleaning up.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/deref: Expose nir_opt_deref_impl
Jason Ekstrand [Mon, 4 Mar 2019 22:17:02 +0000 (16:17 -0600)]
nir/deref: Expose nir_opt_deref_impl

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/inline_functions: Break inlining into a builder helper
Jason Ekstrand [Mon, 4 Mar 2019 21:32:36 +0000 (15:32 -0600)]
nir/inline_functions: Break inlining into a builder helper

This pulls the guts of function inlining into a builder helper so that
it can be used elsewhere.  The rest of the infrastructure is still
needed for most inlining cases to ensure that everything gets inlined
and only ever once.  However, there are use-cases where you just want to
inline one little thing.  This new helper also has a neat trick where it
can seamlessly inline a function from one nir_shader into another.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl/nir: Inline functions in float64_funcs_to_nir
Jason Ekstrand [Mon, 4 Mar 2019 20:39:40 +0000 (14:39 -0600)]
glsl/nir: Inline functions in float64_funcs_to_nir

This doesn't really change anything as the functions will all get
inlined anyway.  However it does let us do a bit of the work earlier and
in a common place.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl/nir: Add a shared helper for building float64 shaders
Jason Ekstrand [Sun, 3 Mar 2019 16:00:14 +0000 (10:00 -0600)]
glsl/nir: Add a shared helper for building float64 shaders

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/nir: Drop an unneeded lower_constant_initializers call
Jason Ekstrand [Mon, 4 Mar 2019 22:01:23 +0000 (16:01 -0600)]
intel/nir: Drop an unneeded lower_constant_initializers call

Even though this is technically a step in the function inlining process
as laid out in nir_inline_functions.c, it's not really needed.  We
already have constant initializers lowered here and no new ones are
added by appending the softfp64 functions.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/debug: Add a debug flag to force software fp64
Jason Ekstrand [Sun, 3 Mar 2019 16:10:46 +0000 (10:10 -0600)]
intel/debug: Add a debug flag to force software fp64

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoi965: Compile the fp64 program based on nir options
Jason Ekstrand [Tue, 5 Mar 2019 04:54:44 +0000 (22:54 -0600)]
i965: Compile the fp64 program based on nir options

Instead of looking at the devinfo directly, look at the lowering options we
provided to NIR.  This is more accurate as it's now checking for "do we
need full software lowering" rather than a hardware bit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Teach loop unrolling about 64-bit instruction lowering
Jason Ekstrand [Sun, 3 Mar 2019 15:24:12 +0000 (09:24 -0600)]
nir: Teach loop unrolling about 64-bit instruction lowering

The lowering we do for 64-bit instructions can cause a single NIR ALU
instruction to blow up into hundreds or thousands of instructions
potentially with control flow.  If loop unrolling isn't aware of this,
it can unroll a loop 20 times which contains a nir_op_fsqrt which we
then lower to a full software implementation based on integer math.
Those 20 invocations suddenly get a lot more expensive than NIR loop
unrolling currently expects.  By giving it an approximate estimate
function, we can prevent loop unrolling from going to town when it
shouldn't.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
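
The idea behind "nir: Teach loop unrolling about 64-bit instruction lowering"
can be sketched roughly like this. It is a toy model, not the NIR
implementation: the opcode enum, the cost numbers, and the budget check are
all made up for illustration. The only point is that a 64-bit op which will
later be lowered to software is charged far more than one instruction when
deciding whether to unroll.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set; real NIR has many more opcodes. */
enum example_op { EXAMPLE_OP_IADD, EXAMPLE_OP_FMUL, EXAMPLE_OP_FSQRT };

struct example_instr {
   enum example_op op;
   unsigned bit_size;
};

/* Approximate cost of one instruction. The numbers are invented; they only
 * express "a lowered fp64 op is much more expensive than a native op". */
static unsigned example_instr_cost(const struct example_instr *instr,
                                   bool lowers_fp64)
{
   if (instr->bit_size == 64 && lowers_fp64) {
      switch (instr->op) {
      case EXAMPLE_OP_FSQRT: return 400;
      case EXAMPLE_OP_FMUL:  return 100;
      default:               return 1;
      }
   }
   return 1;
}

/* Unroll only if the estimated size of the unrolled loop fits the budget. */
static bool example_should_unroll(const struct example_instr *body,
                                  size_t count, unsigned trip_count,
                                  unsigned budget, bool lowers_fp64)
{
   assert(body != NULL || count == 0);

   unsigned cost = 0;
   for (size_t i = 0; i < count; i++)
      cost += example_instr_cost(&body[i], lowers_fp64);

   return cost * trip_count <= budget;
}
```

In this model, a loop body containing a 64-bit fsqrt that runs 20 times fits a
small budget when doubles are native, but is rejected once the estimate
accounts for software lowering.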
5 years agonir: Expose double and int64 op_to_options_mask helpers
Jason Ekstrand [Fri, 1 Mar 2019 23:39:54 +0000 (17:39 -0600)]
nir: Expose double and int64 op_to_options_mask helpers

We already have one internally for int64 but we don't have a similar one
for doubles so we'll have to make one.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agocompiler/nir: add an is_conversion field to nir_op_info
Iago Toral Quiroga [Tue, 12 Feb 2019 11:55:28 +0000 (12:55 +0100)]
compiler/nir: add an is_conversion field to nir_op_info

This is set to True only for numeric conversion opcodes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>