mesa.git
6 years agoanv/allocator: Don't shrink either end of the block pool
Jason Ekstrand [Sat, 21 Apr 2018 04:52:41 +0000 (21:52 -0700)]
anv/allocator: Don't shrink either end of the block pool

Previously, we only tried to ensure that we didn't shrink either end
below what was already handed out.  However, due to the way we handle
relocations with block pools, we can't shrink the back end at all.  It's
probably best to not shrink in either direction.
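
A minimal sketch of the "never shrink" rule, assuming each end of the pool is tracked as a plain byte size. The helper name `grow_only_size` and its parameters are hypothetical and are not taken from the anv allocator:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the anv code: choose the new size for one end of
 * a two-ended pool so that it is never smaller than the current size. */
static uint32_t
grow_only_size(uint32_t current_size, uint32_t requested_size)
{
   uint32_t new_size =
      requested_size > current_size ? requested_size : current_size;

   /* The end may stay the same or grow, but it must never shrink. */
   assert(new_size >= current_size);
   return new_size;
}
```

Applying the same clamp to both ends means a resize can only ever keep or extend what was already mapped.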

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Cc: mesa-stable@lists.freedesktop.org
6 years agobroadcom/vc5: Add support for centroid varyings.
Eric Anholt [Thu, 26 Apr 2018 16:24:32 +0000 (09:24 -0700)]
broadcom/vc5: Add support for centroid varyings.

It would be nice to share the flags packet emit logic with flat shade
flags, but I couldn't come up with a good way while still using our pack
macros.  We need to refactor this to shader record setup at compile time,
anyway.

Fixes ext_framebuffer_multisample-interpolation * centroid-*

6 years agobroadcom/vc5: Add an assert about GFXH-1559.
Eric Anholt [Wed, 25 Apr 2018 23:30:20 +0000 (16:30 -0700)]
broadcom/vc5: Add an assert about GFXH-1559.

Our TF outputs always start at 6 or 7 currently, so we don't hit the
broken 8 case.  Let's make sure that doesn't change somehow.

6 years agobroadcom/vc5: Add validation that we don't violate GFXH-1633 requirements.
Eric Anholt [Wed, 25 Apr 2018 23:24:15 +0000 (16:24 -0700)]
broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements.

We don't use ldunifa yet, but we will eventually for UBOs.

6 years agobroadcom/vc5: Add validation that we don't violate GFXH-1625 requirements.
Eric Anholt [Wed, 25 Apr 2018 23:16:27 +0000 (16:16 -0700)]
broadcom/vc5: Add validation that we don't violate GFXH-1625 requirements.

We don't use TMUWT yet, but we will once we do SSBOs.

6 years agobroadcom/vc5: Implement GFXH-1742 workaround (emit 2 dummy stores on 4.x).
Eric Anholt [Wed, 25 Apr 2018 21:18:52 +0000 (14:18 -0700)]
broadcom/vc5: Implement GFXH-1742 workaround (emit 2 dummy stores on 4.x).

This should help with intermittent GPU hangs in tests switching
formats while rendering small frames.  Unfortunately, it didn't help with
the tests I'm having trouble with.

6 years agobroadcom/vc5: Add QPU validation for register writes after thrend.
Eric Anholt [Wed, 25 Apr 2018 20:51:47 +0000 (13:51 -0700)]
broadcom/vc5: Add QPU validation for register writes after thrend.

The next shader gets to start writing the register file during these
slots, so make sure we don't stomp over them.

The only case of hitting this that I could imagine would be dead writes.

6 years agost: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type.
Eric Anholt [Wed, 25 Apr 2018 18:40:40 +0000 (11:40 -0700)]
st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type.

GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an
unsized internalformat, and it should be non-color-renderable.
fbobject.c's implementation of the check for color-renderable checks
that the texture has a 2101010 mesa format, so make sure that we have
chosen a 2101010 format so that check can do what it was meant to.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/mesa: fix missing setting of _ElementSize in new_draw_rasterpos_stage
Charmaine Lee [Thu, 26 Apr 2018 16:21:52 +0000 (09:21 -0700)]
st/mesa: fix missing setting of _ElementSize in new_draw_rasterpos_stage

With this patch, _ElementSize is initialized along with the rest
of the vertex array attributes in new_draw_rasterpos_stage().
This fixes a crash in st_pipe_vertex_format() when running
topogun-1.06-orc-84k-resize trace file with VMware svga driver.

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agost/va: Fix typos
Drew Davenport [Wed, 25 Apr 2018 15:32:31 +0000 (09:32 -0600)]
st/va: Fix typos

s/attibute/attribute/
s/suface/surface/

v2: rebased(Leo)

Reviewed-by: Leo Liu <leo.liu@amd.com>
6 years agost/va: Fix potential buffer overread
Drew Davenport [Tue, 24 Apr 2018 23:01:32 +0000 (17:01 -0600)]
st/va: Fix potential buffer overread

VASurfaceAttribExternalBuffers.pitches is indexed by
plane. Current implementation only supports single plane layout.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Leo Liu <leo.liu@amd.com>
6 years agoradeon/vcn: fix mpeg4 msg buffer settings
Boyuan Zhang [Wed, 25 Apr 2018 15:49:52 +0000 (11:49 -0400)]
radeon/vcn: fix mpeg4 msg buffer settings

The previous bit-field assignments were incorrect and caused certain mpeg4
decodes to fail due to wrong flag values. This patch fixes these assignments.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
6 years agoradeon: Drop broken front_buffer_reading/drawing optimization
Ian Romanick [Fri, 18 Sep 2015 16:00:28 +0000 (12:00 -0400)]
radeon: Drop broken front_buffer_reading/drawing optimization

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradeon: Use _mesa_is_front_buffer_drawing
Ian Romanick [Thu, 17 Sep 2015 14:56:15 +0000 (10:56 -0400)]
radeon: Use _mesa_is_front_buffer_drawing

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradv: set ac_surf_info::num_channels correctly
Samuel Pitoiset [Wed, 25 Apr 2018 09:22:17 +0000 (11:22 +0200)]
radv: set ac_surf_info::num_channels correctly

num_channels was introduced in "ac/surface: don't set
the display flag for obviously unsupported cases".

Based on RadeonSI.

Fixes: e29facff315 ("ac/surface: don't set the display flag for obviously unsupported cases (v2)")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: fix DCC enablement since partial MSAA implementation
Samuel Pitoiset [Wed, 25 Apr 2018 08:56:15 +0000 (10:56 +0200)]
radv: fix DCC enablement since partial MSAA implementation

dcc_msaa_allowed is always false on GFX9+ and only true on VI
if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled
in some situations where it should not have been.

This is likely going to fix a performance regression.

Fixes: 2f63b3dd09 ("radv: enable DCC for MSAA 2x textures on VI under an option")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agonir/opt_constant_folding: fix folding of 8 and 16 bit ints
Karol Herbst [Sun, 22 Apr 2018 01:29:07 +0000 (03:29 +0200)]
nir/opt_constant_folding: fix folding of 8 and 16 bit ints

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agonir: print 8 and 16 bit constants correctly
Karol Herbst [Sat, 21 Apr 2018 23:31:22 +0000 (01:31 +0200)]
nir: print 8 and 16 bit constants correctly

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agonir: support converting to 8-bit integers in nir_type_conversion_op
Karol Herbst [Sat, 21 Apr 2018 15:27:17 +0000 (17:27 +0200)]
nir: support converting to 8-bit integers in nir_type_conversion_op

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agospirv: Don’t check for NaN for most OpFOrd* comparisons
Neil Roberts [Tue, 24 Apr 2018 10:17:56 +0000 (12:17 +0200)]
spirv: Don’t check for NaN for most OpFOrd* comparisons

For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware
should probably already return false if one of the operands is NaN so
we don’t need to have an explicit check for it. This seems to at least
work on Intel hardware. This should reduce the number of instructions
generated for the most common comparisons.
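
The NaN behaviour described above can be illustrated on the CPU. This is only a sketch in host C, assuming IEEE 754 comparison semantics; it is not the SPIR-V to NIR code, and all function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Produce a NaN at run time without needing <math.h>. */
static float
quiet_nan(void)
{
   volatile float zero = 0.0f;
   return zero / zero;
}

/* These comparisons already yield false when either operand is NaN,
 * so no explicit NaN check is needed for the "ordered" variants. */
static bool ord_less_than(float a, float b) { return a < b; }
static bool ord_equal(float a, float b)     { return a == b; }

/* A bare != yields true when an operand is NaN, which is the unordered
 * behaviour, so the ordered not-equal still needs the explicit check. */
static bool naive_not_equal(float a, float b) { return a != b; }

static bool
ord_not_equal(float a, float b)
{
   return a == a && b == b && a != b;
}

static void
nan_compare_demo(void)
{
   float n = quiet_nan();

   assert(!ord_less_than(n, 1.0f));
   assert(!ord_equal(n, n));
   assert(naive_not_equal(n, 1.0f));   /* wrong answer for an ordered test */
   assert(!ord_not_equal(n, 1.0f));    /* explicit check restores it */
}
```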

For what it’s worth, the original code to handle this was added in
e062eb6415de3a. The commit message for that says that it was to fix
some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t
handle NaNs this patch shouldn’t affect those tests. At any rate they
have since been moved out of the mustpass list. Incidentally those
tests fail on the nvidia proprietary driver so it doesn’t seem like
handling NaNs correctly is a priority.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoIntel: Add a Kaby Lake PCI ID
Matt Atwood [Wed, 25 Apr 2018 16:23:04 +0000 (09:23 -0700)]
Intel: Add a Kaby Lake PCI ID

v2: Branding changed

Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agogallium/util: Fix incorrect refcounting of separate stencil.
Eric Anholt [Wed, 25 Apr 2018 16:47:40 +0000 (09:47 -0700)]
gallium/util: Fix incorrect refcounting of separate stencil.

The driver may have a reference on the separate stencil buffer for some
reason (like an unflushed job using it), so we can't directly free the
resource and should instead just decrement the refcount that we own.
Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
on vc5.
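
A minimal sketch of the difference between freeing a shared object directly and dropping only the reference we own. The `struct resource` type and the helper names are hypothetical and stand in for the real gallium reference-counting helpers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object, for illustration only. */
struct resource {
   int refcount;
};

static struct resource *
resource_create(void)
{
   struct resource *res = calloc(1, sizeof(*res));
   assert(res != NULL);
   res->refcount = 1;   /* the creator owns the first reference */
   return res;
}

static void
resource_ref(struct resource *res)
{
   res->refcount++;
}

/* Drop one reference. The object is freed only when the last reference
 * goes away. Returns true if the object was actually freed. */
static bool
resource_unref(struct resource *res)
{
   assert(res->refcount > 0);
   if (--res->refcount == 0) {
      free(res);
      return true;
   }
   return false;
}
```

If a second holder (for example an unflushed job) has called `resource_ref()`, the owner's `resource_unref()` leaves the object alive, whereas a direct `free()` would set up a double free when the second holder later drops its reference.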

Fixes: e94eb5e6000e ("gallium/util: add u_transfer_helper")
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agobroadcom/vc5: Fix reloads of separate stencil buffers.
Eric Anholt [Wed, 25 Apr 2018 00:50:50 +0000 (17:50 -0700)]
broadcom/vc5: Fix reloads of separate stencil buffers.

Like for stores, we need to emit a separate load_general packet.

6 years agobroadcom/vc5: Fix cpp of MSAA surfaces on 4.x.
Eric Anholt [Tue, 24 Apr 2018 21:56:23 +0000 (14:56 -0700)]
broadcom/vc5: Fix cpp of MSAA surfaces on 4.x.

The internal-type-bpp path is for surfaces that get stored in the raw TLB
format.  For 4.x, we're storing MSAA as just 2x width/height at the
original format.

6 years agobroadcom/vc5: Implement stencil blits using RGBA.
Eric Anholt [Tue, 24 Apr 2018 18:11:40 +0000 (11:11 -0700)]
broadcom/vc5: Implement stencil blits using RGBA.

Fixes piglit fbo-depthstencil blit default_fb

6 years agobroadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key.
Eric Anholt [Tue, 24 Apr 2018 22:23:27 +0000 (15:23 -0700)]
broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key.

6 years agobroadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x.
Eric Anholt [Tue, 24 Apr 2018 20:22:41 +0000 (13:22 -0700)]
broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x.

For single-sample we have to always program SAMPLE_0, but for multisample
we want to store all the samples.

6 years agotravis: update libva required version
Juan A. Suarez Romero [Fri, 20 Apr 2018 14:34:14 +0000 (14:34 +0000)]
travis: update libva required version

Commit fa328456e8f29 added VP9 config support, but this needs a newer
libva version, 1.7.0 or above.

Fixes: fa328456e8f ("st/va: add VP9 config to enable profile2")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agomesa: GL_EXT_texture_norm16 extension plumbing
Tapani Pälli [Fri, 6 Apr 2018 07:57:57 +0000 (10:57 +0300)]
mesa: GL_EXT_texture_norm16 extension plumbing

Patch enables use of short and unsigned short data for texture uploads,
rendering and reading of framebuffers within the restrictions specified
in GL_EXT_texture_norm16 spec.

Patch also enables those 16bit format layout qualifiers listed in
GL_NV_image_formats that depend on EXT_texture_norm16.

v2: expose extension with dummy_true
    fix layout qualifier map changes (Ilia Mirkin)

v3: use _mesa_has_EXT_texture_norm16, other fixes
    and cleanup (Ilia Mirkin)

v4: fix rest of the issues found

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agomeson: Fix with_intel_vk and with_amd_vk variables
Jordan Justen [Wed, 25 Apr 2018 01:12:51 +0000 (18:12 -0700)]
meson: Fix with_intel_vk and with_amd_vk variables

Fixes: 5608d0a2cee "meson: use array type options"
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
6 years agodraw: fix different sign logic when clipping
Roland Scheidegger [Tue, 24 Apr 2018 16:25:55 +0000 (18:25 +0200)]
draw: fix different sign logic when clipping

The logic was flawed, since mul(x,y) will be <= 0 (exactly 0) when
the sign is the same but both numbers are sufficiently small
(if the product is smaller than 2^-128).
This could apparently lead to emitting a sufficient amount of
additional bogus vertices to overflow the allocated array for them,
hitting an assertion (still safe with release builds since we just
aborted clipping after the assertion in this case - I'm however unsure
if this is now really no longer possible, so that code stays).
Not sure if the additional vertices could cause other grief, I didn't
see anything wrong even when hitting the assertion.

Essentially, both +-0 are treated as positive (the vertex is considered
to be inside the clip volume for this plane), so integrate the logic
determining different sign into the branch there.
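
The underflow can be reproduced in plain C. This is a sketch with hypothetical function names, assuming 32-bit IEEE 754 floats; it is not the draw module's clipping code:

```c
#include <assert.h>
#include <stdbool.h>

/* Product-based test: claims the signs differ whenever a * b <= 0.
 * The volatile store forces the product to be rounded to float. */
static bool
signs_differ_by_product(float a, float b)
{
   volatile float product = a * b;
   return product <= 0.0f;
}

/* Comparison-based test: both +0 and -0 count as positive (inside). */
static bool
signs_differ_by_compare(float a, float b)
{
   return (a < 0.0f) != (b < 0.0f);
}

static void
clip_sign_demo(void)
{
   /* Two tiny positive values: the product underflows to exactly 0, so
    * the product test wrongly reports a sign change. */
   assert(signs_differ_by_product(1e-30f, 1e-30f));
   assert(!signs_differ_by_compare(1e-30f, 1e-30f));
}
```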

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agodraw: simplify clip null tri logic
Roland Scheidegger [Tue, 24 Apr 2018 16:12:34 +0000 (18:12 +0200)]
draw: simplify clip null tri logic

Simplifies the logic when to emit null tris (albeit the reasons why we
have to do this remain unclear).
This is strictly just logic simplification, the behavior doesn't change
at all.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agonvc0/ir: all short immediates are sign-extended, adjust LIMM test
Ilia Mirkin [Sat, 21 Apr 2018 17:08:51 +0000 (13:08 -0400)]
nvc0/ir: all short immediates are sign-extended, adjust LIMM test

Some analysis suggests that all short immediates are sign-extended. The
insnCanLoad logic already accounted for this, but we could still pick
the wrong form when emitting actual instructions that support both short
and long immediates (with the long form usually having additional
restrictions that insnCanLoad should be aware of).
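
A sketch of what "fits in a sign-extended short immediate" means. The helper `fits_short_imm` is hypothetical, and the field width is left as a parameter because the commit message does not state it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the 32-bit value can be encoded in a
 * 'bits'-wide immediate field that the hardware sign-extends to 32 bits. */
static bool
fits_short_imm(uint32_t value, unsigned bits)
{
   assert(bits > 0 && bits < 32);

   uint32_t mask = (UINT32_C(1) << bits) - 1;
   uint32_t sign = UINT32_C(1) << (bits - 1);
   uint32_t low = value & mask;

   /* Sign-extend the low 'bits' bits using unsigned arithmetic only. */
   uint32_t extended = (low ^ sign) - sign;

   return extended == value;
}
```

With an assumed 20-bit field, `0x7ffff` and `0xfffffff0` (-16) fit, while `0x80000` and `0xfffff` do not, because sign extension would turn them into negative values. Values like these would need the long-immediate form.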

This also reverses a bunch of commits that had previously "worked
around" this issue in various emitters:

9c63224540ef: gm107/ir: make use of ADD32I for all immediates
83a4f28dc27b: gm107/ir: make use of LOP32I for all immediates
b84c97587b4a: gm107/ir: make use of IMUL32I for all immediates
d30768025a22: gk110/ir: make use of IMUL32I for all immediates

as well as the original import for UMUL in the nvc0 emitter.

Reported-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
6 years agomesa: call DrawBufferAllocate driver hook in update_framebuffer for windows-system FB
Boyan Ding [Sat, 14 Apr 2018 04:45:23 +0000 (14:45 +1000)]
mesa: call DrawBufferAllocate driver hook in update_framebuffer for windows-system FB

When draw buffers are changed on a bound framebuffer, DrawBufferAllocate()
hook should be called. However, it is missing in update_framebuffer with
window-system framebuffer, in which FB's draw buffer state should match
context state, potentially resulting in a change.

Note: This is needed because gallium delays creating the front buffer,
      i965 works fine without this change.

V2 (Timothy Arceri):
 - Rebased on merged/simplified DrawBuffer driver function
 - Move DrawBuffer call outside fb->ColorDrawBuffer[0] !=
   ctx->Color.DrawBuffer[0] check to make piglit pass.

v3 (Timothy Arceri):
 - Call new DrawBufferAllocate() driver function.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v2)
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99116

6 years agost/mesa: add new driver function DrawBufferAllocate
Timothy Arceri [Tue, 24 Apr 2018 04:19:48 +0000 (14:19 +1000)]
st/mesa: add new driver function DrawBufferAllocate

Unlike some of the classic drivers the st was only using DrawBuffer()
to allocate some buffers on-demand. Creating a separate function
will allow us to call it from update_framebuffer() in the following
patch without regressing some of the older classic drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: some C99 tidy ups for framebuffer.c
Timothy Arceri [Tue, 24 Apr 2018 04:06:00 +0000 (14:06 +1000)]
mesa: some C99 tidy ups for framebuffer.c

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years agomeson: Fix no-rtti in llvm detection
Dylan Baker [Tue, 24 Apr 2018 21:15:47 +0000 (14:15 -0700)]
meson: Fix no-rtti in llvm detection

Because I clearly wasn't thinking and clearly didn't do a good job
testing. Sigh

Fixes: c5a97d658ec19cc02719d7f86c1b0715e3d9ffc4
       ("meson: fix builds against LLVM built without rtti")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomeson: use new warning function
Dylan Baker [Mon, 16 Apr 2018 22:19:54 +0000 (15:19 -0700)]
meson: use new warning function

Instead of emulating it with message.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: use array type options
Dylan Baker [Mon, 16 Apr 2018 22:18:08 +0000 (15:18 -0700)]
meson: use array type options

This option type is nice since it involves less converting strings into
lists, and because it validates the values that are provided.

v2: - Set with_any_vk to true if any vulkan driver is built (Eric)

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: fix builds against LLVM built without rtti
Dylan Baker [Mon, 16 Apr 2018 21:47:58 +0000 (14:47 -0700)]
meson: fix builds against LLVM built without rtti

Building without rtti is fraught with peril, but it's something that
autotools supports so we need to support it too.

Since we've moved to version 0.44 as a whole, we can use the meson
functionality for accessing arbitrary llvm-config options to check for
rtti and add -fno-rtti to all C++ code accordingly.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
6 years agomeson: remove dummy_cpp
Dylan Baker [Mon, 16 Apr 2018 21:40:51 +0000 (14:40 -0700)]
meson: remove dummy_cpp

meson has gotten pretty smart about tracking C and C++ dependencies
(internal and external), and using the right linker. This wasn't always
the case and we created empty c++ files to force the use of the c++
linker. We don't need that any more.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: allow empty sources when using link_whole
Dylan Baker [Mon, 16 Apr 2018 21:39:59 +0000 (14:39 -0700)]
meson: allow empty sources when using link_whole

meson used to get grumpy if the sources list was empty, even when using
--whole-archive (link_whole). In more recent versions that's not true,
so remove the workaround.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: remove workaround for custom target creating .h and .c files
Dylan Baker [Mon, 16 Apr 2018 21:34:35 +0000 (14:34 -0700)]
meson: remove workaround for custom target creating .h and .c files

In more modern versions of meson a custom_target returns an index-able
object. This allows us to create accurate dependency models for targets
that rely only on the header and not on the code from anv_entrypoints.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: raise required version to 0.44.1
Dylan Baker [Fri, 13 Apr 2018 22:05:55 +0000 (15:05 -0700)]
meson: raise required version to 0.44.1

We have already required 0.44 for building clover and swr, so it was
already partially required. This just makes it required across the board
instead of just for clover and swr.

There is a bug in 0.44 which makes it impossible to build mesa in some
configurations, so require 0.44.1 which fixes this.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: fix graw-xlib after auxiliary consolidation
Dylan Baker [Wed, 18 Apr 2018 18:31:31 +0000 (11:31 -0700)]
meson: fix graw-xlib after auxiliary consolidation

This one's completely my fault, I didn't do good enough testing after
rebasing and this got missed.

Fixes: d28c24650110c130008be3d3fe584520ff00ceb1
       ("meson: build graw tests")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: only build mesa_st tests when build-tests is true
Dylan Baker [Wed, 18 Apr 2018 16:29:35 +0000 (09:29 -0700)]
meson: only build mesa_st tests when build-tests is true

Since we have an option to turn test building on and off, we should
honor that.

Fixes: 34cb4d0ebc14663113705beae63dd52b9d1b2d87
       ("meson: build tests for gallium mesa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agomeson: don't build classic mesa tests without dri_drivers
Dylan Baker [Wed, 18 Apr 2018 17:53:27 +0000 (10:53 -0700)]
meson: don't build classic mesa tests without dri_drivers

Since mesa_classic is build-on-demand the tests will create a demand and
add a bunch of extra compilation.

Fixes: 43a6e84927e3b1290f6f211f5dfb184dfe5a719e
       ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agoi965/meta_util: Re-enable sRGB-encoded fast-clears on CNL
Nanley Chery [Fri, 23 Mar 2018 00:05:34 +0000 (17:05 -0700)]
i965/meta_util: Re-enable sRGB-encoded fast-clears on CNL

The paths which sample with the clear color are now using a getter which
performs the sRGB decode needed to enable this fast clear.

This path can be exercised by fast-clearing a texture, then performing
an operation which requires sRGB decoding. Test coverage for this
feature is provided with the following tests:

* Shader texture calls:
  - spec@ext_texture_srgb@tex-srgb

* Shader texelfetch calls:
  - spec@arb_framebuffer_srgb@fbo-fast-clear
  - spec@arb_framebuffer_srgb@msaa-fast-clear

* Blending:
  - spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend

* Blitting:
  - spec@arb_framebuffer_srgb@blit texture srgb msaa enabled clear

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965/miptree: Extend the sRGB-blending WA to future platforms
Nanley Chery [Fri, 30 Mar 2018 05:14:09 +0000 (22:14 -0700)]
i965/miptree: Extend the sRGB-blending WA to future platforms

The blending issue seems to be present on CNL as well.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: Add and use a getter for the clear color
Nanley Chery [Mon, 26 Mar 2018 21:32:18 +0000 (14:32 -0700)]
i965: Add and use a getter for the clear color

It returns both the inline clear color and a clear address which points
to the indirect clear color buffer (or NULL if unused/non-existent).
This getter allows CNL to sample from fast-cleared sRGB textures
correctly by doing the needed sRGB-decode on the clear color (inline)
and making the indirect clear color buffer unused.

v2 (Rafael):
* Have a more detailed commit message.
* Add a comment on the sRGB conversion process.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoutil/srgb: Add a float sRGB -> linear helper
Jason Ekstrand [Fri, 23 Jun 2017 03:00:47 +0000 (20:00 -0700)]
util/srgb: Add a float sRGB -> linear helper

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965/wm_surface_state: Use the clear address if clear_bo is non-NULL
Nanley Chery [Tue, 10 Apr 2018 20:56:18 +0000 (13:56 -0700)]
i965/wm_surface_state: Use the clear address if clear_bo is non-NULL

We want to add and use a getter that turns off the indirect path by
returning zero for the clear color bo and offset.

v2: Fix usage of "clear address" in commit message (Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: Add and use a single miptree aux_buf field
Nanley Chery [Fri, 6 Apr 2018 16:54:31 +0000 (09:54 -0700)]
i965: Add and use a single miptree aux_buf field

We want to add and use a function that accesses the auxiliary buffer's
clear_color_bo and doesn't care if it has an MCS or HiZ buffer
specifically.

v2 (Jason Ekstrand):
* Drop intel_miptree_get_aux_buffer().
* Mention CCS in the aux_buf field.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: Add and use a getter for the miptree aux buffer
Nanley Chery [Mon, 9 Apr 2018 18:11:46 +0000 (11:11 -0700)]
i965: Add and use a getter for the miptree aux buffer

Make the next patch easier to read by eliminating most of the would-be
duplicate field accesses now.

v2: Update the HiZ comment instead of deleting it (Rafael).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agogm107/ir/lib: fix sched in div u32 builtin
Karol Herbst [Sun, 22 Apr 2018 20:23:13 +0000 (22:23 +0200)]
gm107/ir/lib: fix sched in div u32 builtin

Imad needs to set a read barrier.

With sufficiently large work groups I was getting wrong results for div u32. It
turns out the issue was with the sched opcodes.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agointel/compiler: Add scheduler deps for instructions that implicitly read g0
Ian Romanick [Mon, 16 Apr 2018 23:32:41 +0000 (16:32 -0700)]
intel/compiler: Add scheduler deps for instructions that implicitly read g0

Otherwise the scheduler can move the writes after the reads.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95012
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: Clayton A Craft <clayton.a.craft@intel.com>
Cc: mesa-stable@lists.freedesktop.org
6 years agointel/compiler: Silence unused parameter warnings in empty vec4_instruction_scheduler...
Ian Romanick [Wed, 28 Mar 2018 23:45:01 +0000 (16:45 -0700)]
intel/compiler: Silence unused parameter warnings in empty vec4_instruction_scheduler methods

src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::count_reads_remaining(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:764:72: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::count_reads_remaining(backend_instruction *be)
                                                                        ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::setup_liveness(cfg_t*)’:
src/intel/compiler/brw_schedule_instructions.cpp:769:51: warning: unused parameter ‘cfg’ [-Wunused-parameter]
 vec4_instruction_scheduler::setup_liveness(cfg_t *cfg)
                                                   ^~~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::update_register_pressure(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:774:75: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::update_register_pressure(backend_instruction *be)
                                                                           ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:779:80: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction *be)
                                                                                ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::issue_time(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:1550:61: warning: unused parameter ‘inst’ [-Wunused-parameter]
 vec4_instruction_scheduler::issue_time(backend_instruction *inst)
                                                             ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agointel/compiler: Silence unused parameter warning in compile_cs_to_nir
Ian Romanick [Wed, 28 Mar 2018 23:35:10 +0000 (16:35 -0700)]
intel/compiler: Silence unused parameter warning in compile_cs_to_nir

src/intel/compiler/brw_fs.cpp: In function ‘nir_shader* compile_cs_to_nir(const brw_compiler*, void*, const brw_cs_prog_key*, brw_cs_prog_data*, const nir_shader*, unsigned int)’:
src/intel/compiler/brw_fs.cpp:7205:44: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
                   struct brw_cs_prog_data *prog_data,
                                            ^~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agointel/compiler: Silence unused parameter warnings in generate_foo methods
Ian Romanick [Wed, 28 Mar 2018 23:29:45 +0000 (16:29 -0700)]
intel/compiler: Silence unused parameter warnings in generate_foo methods

Since all of the fs_generator::generate_foo methods take a fs_inst * as
the first parameter, just remove the name to quiet the compiler.

src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_barrier(fs_inst*, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:743:41: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_barrier(fs_inst *inst, struct brw_reg src)
                                         ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_discard_jump(fs_inst*)’:
src/intel/compiler/brw_fs_generator.cpp:1326:46: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_discard_jump(fs_inst *inst)
                                              ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_pack_half_2x16_split(fs_inst*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:1675:54: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
                                                      ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_shader_time_add(fs_inst*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:1743:49: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_shader_time_add(fs_inst *inst,
                                                 ^~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_set_simd4x2_header_gen9(brw_codegen*, brw::vec4_instruction*, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1412:52: warning: unused parameter ‘inst’ [-Wunused-parameter]
                                  vec4_instruction *inst,
                                                    ^~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_mov_indirect(brw_codegen*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1430:41: warning: unused parameter ‘inst’ [-Wunused-parameter]
                       vec4_instruction *inst,
                                         ^~~~
src/intel/compiler/brw_vec4_generator.cpp:1432:63: warning: unused parameter ‘length’ [-Wunused-parameter]
                       struct brw_reg indirect, struct brw_reg length)
                                                               ^~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agobroadcom/vc5: Set up internal_format for imported resources.
Eric Anholt [Thu, 12 Apr 2018 23:29:19 +0000 (16:29 -0700)]
broadcom/vc5: Set up internal_format for imported resources.

Without this, we'd assertion fail in u_transfer_helper when mapping an
imported resource.

6 years agobroadcom/vc5: Assert that created BOs have offset != 0.
Eric Anholt [Thu, 12 Apr 2018 22:20:17 +0000 (15:20 -0700)]
broadcom/vc5: Assert that created BOs have offset != 0.

The kernel shouldn't return a bo at NULL, and the HW special-cases NULL
address values for things like OQs.

6 years agobroadcom/vc5: Don't allocate simulator BOs at offset 0.
Eric Anholt [Thu, 12 Apr 2018 22:19:42 +0000 (15:19 -0700)]
broadcom/vc5: Don't allocate simulator BOs at offset 0.

The kernel won't return us BOs at offset 0 (because things like OQs
wouldn't work there), so we shouldn't in the simulator either.

6 years agobroadcom/vc5: Add sim support for the GET_BO_OFFSET ioctl.
Eric Anholt [Thu, 12 Apr 2018 20:47:52 +0000 (13:47 -0700)]
broadcom/vc5: Add sim support for the GET_BO_OFFSET ioctl.

Otherwise we'd crash immediately upon importing a BO through EGL
interfaces.

6 years agobroadcom/vc5: Treat imports of DRM_FORMAT_MOD_INVALID BOs as linear.
Eric Anholt [Thu, 12 Apr 2018 20:46:24 +0000 (13:46 -0700)]
broadcom/vc5: Treat imports of DRM_FORMAT_MOD_INVALID BOs as linear.

We don't have any kernel metadata about BO tiling, so this probably is all
we should do for the moment.

6 years agoi965: expose MESA_FORMAT_R8G8B8A8_SRGB visual
Tapani Pälli [Mon, 19 Mar 2018 11:41:45 +0000 (13:41 +0200)]
i965: expose MESA_FORMAT_R8G8B8A8_SRGB visual

Exposing the visual makes the following dEQP tests pass on Android:

   dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb
   dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb

Visual is exposed only when DRI_LOADER_CAP_RGBA_ORDERING is set.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agodri: Add __DRI_IMAGE_FORMAT_SABGR8
Tapani Pälli [Mon, 19 Mar 2018 11:41:44 +0000 (13:41 +0200)]
dri: Add __DRI_IMAGE_FORMAT_SABGR8

Add format definition and required plumbing to create images.
Note that there is no matching drm_fourcc definition, just like
with the existing __DRI_IMAGE_FOURCC_SARGB8888.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoRevert "st/dri: Fix dangling pointer to a destroyed dri_drawable"
Marek Olšák [Tue, 24 Apr 2018 04:00:20 +0000 (00:00 -0400)]
Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"

This reverts commit dab02dea3411d325a5aee6cda5b581e61396ecc6.

It causes crashes of qtcreator and firefox.

Fixes: dab02de "st/dri: Fix dangling pointer to a destroyed dri_drawable"
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
6 years agogallivm: dump bitcode before optimization
Roland Scheidegger [Mon, 23 Apr 2018 04:22:45 +0000 (06:22 +0200)]
gallivm: dump bitcode before optimization

If we dump the bitcode for off-line debug purposes, we really want the
pre-optimized bitcode, otherwise it's useless in identifying problems
with IR optimization (if you have a shader which takes an hour to do
IR optimization, it's also nice you don't have to wait that hour...).
Also, print out the function passes for opt which correspond to what
was used for jit compilation (and also the opt level for codegen).
Using opt/llc this way should then pretty much mimic what was done
for jit. (When specifying something like -time-passes
-debug-pass=[Structure|Arguments] (for either opt or llc) that also
gives very useful information in which passes all the time was spent,
and which passes are really run along with the order - llvm will add
passes due to dependencies on its own, and of course -O2 for llc
comes with a ~100 pass list.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agogallivm: (trivial) do division by 1000 with int64
Roland Scheidegger [Mon, 23 Apr 2018 02:52:48 +0000 (04:52 +0200)]
gallivm: (trivial) do division by 1000 with int64

Conversion to int can otherwise overflow if compile times are over
~71min. (Yes this can happen...)
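
The ~71min figure can be checked with a little arithmetic. The sketch below
assumes the elapsed time is counted in microseconds and that int is 32 bits
wide with wrap-around on conversion; the message states neither, and the
helper names are made up for illustration.

```python
INT_BITS = 32                 # assumed width of a C int
US_PER_MIN = 60 * 1_000_000   # assumed unit: microseconds


def wrap_to_int(value):
    """Model converting a 64-bit value to a 32-bit signed int (wrap-around)."""
    value &= (1 << INT_BITS) - 1
    if value >= 1 << (INT_BITS - 1):
        value -= 1 << INT_BITS
    return value


def msec_convert_first(elapsed_us):
    """Convert to int first, then divide by 1000 (the overflowing order)."""
    return int(wrap_to_int(elapsed_us) / 1000)  # int() truncates toward zero, like C


def msec_divide_first(elapsed_us):
    """Divide in 64-bit arithmetic first; the result then fits easily."""
    return elapsed_us // 1000


# A 32-bit microsecond counter wraps after about 71.58 minutes.
wrap_minutes = (1 << INT_BITS) / US_PER_MIN

elapsed = 72 * US_PER_MIN     # a 72 minute compile
print(wrap_minutes, msec_convert_first(elapsed), msec_divide_first(elapsed))
```

With these assumptions a 72 minute compile is reported as roughly 25 seconds
when the conversion happens before the division.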

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agogallivm: remove LICM pass
Roland Scheidegger [Mon, 23 Apr 2018 02:39:00 +0000 (04:39 +0200)]
gallivm: remove LICM pass

LICM is simply too expensive, even though it presumably can help quite
a bit in some cases.
It was definitely cheaper in llvm 3.3, though as far as I can tell with
llvm 3.3 it failed to do anything in most cases. early-cse also actually
seems to cause licm to be able to move things when it previously couldn't,
which causes noticeable compile time increases.
There's more loop passes in llvm, but I'm not sure which ones are helpful,
and I couldn't find anything which would roughly do what the old licm in
llvm 3.3 did, so ditch it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agogallivm: add early cse pass
Roland Scheidegger [Mon, 23 Apr 2018 02:32:56 +0000 (04:32 +0200)]
gallivm: add early cse pass

This pass is quite cheap, and can simplify the IR quite a bit for our
generated IR.
In particular on a variety of shaders I've found the time saved by
other passes due to the simplified IR more than makes up for the cost
of this pass, and on top of that the end result is actually better.
The only downside I've found is this enables the LICM pass to move some
things out of the main shader loop (in the case I've seen, instanced
vertex fetch (which is constant within the jit shader) plus the derived
instructions in the shader) which it couldn't do before for some reason.
This would actually be desirable but can increase compile time
considerably (licm seems to have considerable cost when it actually can
move things out of loops, due to alias analysis). But blaming early cse
for this seems inappropriate. (Note that the first two sroa / earlycse
passes are similar to what a standard llvm opt -O1/-O2 pipeline would
do, albeit this has some more passes even before but I don't think
they'd do much for us.)
It also in particular helps some crazy shader used for driver
verification (don't ask...) a lot (about factor of 6 faster in compile
time) (due to simplifying the IR before LICM is run).
While here, also move licm behind simplifycfg. For some shaders there
seems to be very significant compile time gains (we've seen a factor
of 10000 albeit that was a really crazy shader you'd certainly never
see in a real app), because LICM is quite expensive and there are cases
where running simplifycfg (along with sroa and early-cse) before licm
reduces IR complexity significantly. (I'm not entirely sure if it would
make sense to also run it afterwards.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agoglsl/glcpp: Handle hex constants with 0X prefix
Vlad Golovkin [Thu, 19 Apr 2018 20:08:01 +0000 (23:08 +0300)]
glsl/glcpp: Handle hex constants with 0X prefix

GLSL 4.6 spec describes hex constant as:

hexadecimal-constant:
    0x hexadecimal-digit
    0X hexadecimal-digit
    hexadecimal-constant hexadecimal-digit

Right now if you have a shader with the following structure:

    #if 0X1 // or any hex number with the 0X prefix
    // some code
    #endif

the code between #if and #endif gets removed, because the check is performed
only for the "0x" prefix. As a result strtoll is called with base 8 and
returns 0 after encountering the 'X' character. Letting strtoll detect the
base removes this limitation and also makes the code easier to read.

From the strtoll Linux man page:

"If base is zero or 16, the string may then include a "0x" prefix, and the
number will be read in base 16; otherwise, a zero base is taken as 10 (decimal)
unless the next character is '0', in which case it is taken as 8 (octal)."

This matches the behaviour in the GLSL spec.

This patch also adds a test for uppercase hex prefix.
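
The difference can be illustrated with a small Python model. This is only a
sketch of the behaviour described above, not the glcpp code: parse_old and
parse_auto are made-up names that imitate "check for 0x, else hand strtoll
base 8 or 10" and "let strtoll pick the base (base 0)" respectively.

```python
def _leading_digits(text, base):
    """Consume leading digits valid in `base`, as strtoll does; 0 if none."""
    value = 0
    for ch in text:
        try:
            digit = int(ch, 36)      # maps 0-9 and a-z/A-Z to 0..35
        except ValueError:
            break
        if digit >= base:            # e.g. 'X' is not an octal digit
            break
        value = value * base + digit
    return value


def parse_old(text):
    """Model of the old logic: only the lowercase "0x" prefix selects hex."""
    if text.startswith("0x"):
        return _leading_digits(text[2:], 16)
    if text.startswith("0"):
        return _leading_digits(text, 8)   # "0X1" ends up here and stops at 'X'
    return _leading_digits(text, 10)


def parse_auto(text):
    """Model of strtoll with base 0: accepts "0x" and "0X", then octal, then decimal."""
    if text[:2] in ("0x", "0X"):
        return _leading_digits(text[2:], 16)
    if text.startswith("0"):
        return _leading_digits(text, 8)
    return _leading_digits(text, 10)


print(parse_old("0X1"), parse_auto("0X1"))
```

In this model "#if 0X1" evaluates to 0 with the old logic and to 1 with
automatic base detection, while lowercase hex, octal and decimal constants
give the same result either way.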

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agomesa: rename api_validate.{c,h} -> draw_validate.{c,h}
Timothy Arceri [Mon, 23 Apr 2018 03:46:15 +0000 (13:46 +1000)]
mesa: rename api_validate.{c,h} -> draw_validate.{c,h}

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65422

6 years agoac/radv/radeonsi: refactor harvest config register getters.
Dave Airlie [Mon, 23 Apr 2018 00:42:21 +0000 (10:42 +1000)]
ac/radv/radeonsi: refactor harvest config register getters.

This refactors the code out to share it between radv and radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradv: only set raster_config_1 outside the index registers.
Dave Airlie [Mon, 23 Apr 2018 00:39:33 +0000 (10:39 +1000)]
radv: only set raster_config_1 outside the index registers.

This follows what radeonsi does.

Ported from radeonsi:
    radeonsi: emit PA_SC_RASTER_CONFIG_1 only once

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoac/radv/radeonsi: refactor max simd waves into common code.
Dave Airlie [Mon, 23 Apr 2018 00:16:07 +0000 (10:16 +1000)]
ac/radv/radeonsi: refactor max simd waves into common code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/radv/radeonsi: refactor raster_config default values getters.
Dave Airlie [Mon, 23 Apr 2018 00:09:36 +0000 (10:09 +1000)]
ac/radv/radeonsi: refactor raster_config default values getters.

This just makes this common code between the two drivers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradeonsi: use common gs_table_depth code
Dave Airlie [Sun, 22 Apr 2018 23:57:20 +0000 (09:57 +1000)]
radeonsi: use common gs_table_depth code

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: use common gs_table_depth code.
Dave Airlie [Sun, 22 Apr 2018 23:57:10 +0000 (09:57 +1000)]
radv: use common gs_table_depth code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/info: move gs table depth to common code.
Dave Airlie [Sun, 22 Apr 2018 23:56:43 +0000 (09:56 +1000)]
ac/info: move gs table depth to common code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradeonsi: don't runtime check gs table info
Dave Airlie [Sun, 22 Apr 2018 23:52:28 +0000 (09:52 +1000)]
radeonsi: don't runtime check gs table info

We can just use unreachable() here; this aligns with the radv code and
makes it easier to move to common code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv/gfx9: don't use gs_table_depth on gfx9.
Dave Airlie [Sun, 22 Apr 2018 23:50:28 +0000 (09:50 +1000)]
radv/gfx9: don't use gs_table_depth on gfx9.

Missed this on the initial radeonsi port: we shouldn't use this value
on gfx9, and on gfx8 we should use it only when we have a geom shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoi965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*
Jason Ekstrand [Fri, 20 Apr 2018 03:48:42 +0000 (20:48 -0700)]
i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*

They are send messages and this makes size_read() and mlen agree.  For
both of these opcodes, the payload is just a dummy so mlen == 1 and this
should decrease register pressure a bit.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
6 years agoac: fix the number of coordinates for ac_image_get_lod and arrays
Samuel Pitoiset [Mon, 23 Apr 2018 15:05:10 +0000 (17:05 +0200)]
ac: fix the number of coordinates for ac_image_get_lod and arrays

This fixes crashes for the following CTS:
dEQP-VK.glsl.texture_functions.query.texturequerylod.*

Cubemaps are the same as 2D arrays.

Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years agoi965: perf: enable GPA query statistics
Lionel Landwerlin [Fri, 9 Feb 2018 10:56:42 +0000 (10:56 +0000)]
i965: perf: enable GPA query statistics

The combination of GPA/MDAPI components expects a particular name &
layout for their pipeline statistics query.

v2: Limit the query GPA/MDAPI statistics to gen7->9 (Lionel)

v3: Add curly braces (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: perf: add support for raw queries
Lionel Landwerlin [Wed, 7 Mar 2018 14:28:41 +0000 (14:28 +0000)]
i965: perf: add support for raw queries

The INTEL_performance_query extension provides a list of queries that
a user can select to monitor a particular workload. Each query reports
different sets of counters (roughly looking at different parts of the
hardware, i.e. caches/fixed functions/etc...).

Each query has an associated configuration that we need to program
into the hardware before using the query. Up to now, we provided
predefined queries. This change allows users to build their own query
(and associated configuration) externally, and have the i965 driver
use that configuration through a new query named:

   Intel_Raw_Hardware_Counters_Set_0_Query

When this query is selected, the i965 driver will report raw counters
deltas (meaning their values need to be interpreted by the user, as
opposed to existing queries that provide human readable values).

This change is also useful for debug purposes for building new
pre-defined queries and verifying the underlying numbers make sense
before writing equations for user readable output.

This change's purpose is also to enable GPA. GPA uses a library called
MDAPI that processes raw counter data. MDAPI expects raw data to have
a certain layout (per generation which is a bit unfortunate...). This
change also embeds the expected data layouts.

v2: Enable raw queries on gen 7->11, v1 had 7->9 (Lionel)

v3: Don't assert on cherryview for gen7... (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: perf: read slice/unslice frequencies from OA reports
Lionel Landwerlin [Wed, 7 Mar 2018 16:02:40 +0000 (16:02 +0000)]
i965: perf: read slice/unslice frequencies from OA reports

v2: Add comment breaking down where the frequency values come from (Ken)

v3: More documentation (Ken/Lionel)
    Adjust clock ratio multiplier to reflect the divider's behavior (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: perf: snapshot RPSTAT register
Lionel Landwerlin [Wed, 7 Mar 2018 10:46:58 +0000 (10:46 +0000)]
i965: perf: snapshot RPSTAT register

This register contains the current/previous frequency of the GT; it's
one of the values GPA would like to have as part of their queries.

v2: Don't use this register on baytrail/cherryview (Ken)
    Use GET_FIELD() macro (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: perf: extract utility functions
Lionel Landwerlin [Tue, 6 Mar 2018 17:09:21 +0000 (17:09 +0000)]
i965: perf: extract utility functions

We would like to reuse a number of the functions and structures in
another file in a future commit.

We also move the previous content of brw_performance_query.h into
brw_performance_query_metrics.h to be included by generated metrics
files.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoac: teach get_ac_sampler_dim() about subpass attachments
Samuel Pitoiset [Mon, 23 Apr 2018 14:55:39 +0000 (16:55 +0200)]
ac: teach get_ac_sampler_dim() about subpass attachments

Suggested by Nicolai.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years agoac/nir: add missing round_slice for 1D arrays
Samuel Pitoiset [Mon, 23 Apr 2018 12:46:26 +0000 (14:46 +0200)]
ac/nir: add missing round_slice for 1D arrays

This fixes a bunch of CTS fails with 1D arrays:

dEQP-VK.glsl.texture_functions.texture*.sampler1darray_*

Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years agobin/install_megadrivers: rename a few variables to make things clearer
Dylan Baker [Mon, 9 Apr 2018 20:59:55 +0000 (13:59 -0700)]
bin/install_megadrivers: rename a few variables to make things clearer

Originally the "each" variable was just a part of the "drivers"
variable. It's not anymore so it's a bit ambiguous.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
6 years agobin/install_megadrivers: fix DESTDIR and -D*-path
Dylan Baker [Mon, 9 Apr 2018 20:53:09 +0000 (13:53 -0700)]
bin/install_megadrivers: fix DESTDIR and -D*-path

This fixes -Ddri-drivers-path, -Dvdpau-libs-path, etc. with DESTDIR when
those paths are absolute. Currently due to the way python's os.path.join
handles absolute paths these will ignore DESTDIR, which is bad. This
fixes them to be relative to DESTDIR if that is set.
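
The os.path.join behaviour in question is easy to reproduce. The snippet
below is a minimal sketch and not the actual change to the script:
under_destdir is a hypothetical helper, the paths are examples, and
posixpath is used so the result does not depend on the host platform.

```python
import posixpath


def under_destdir(destdir, path):
    """Re-root `path` under `destdir` when DESTDIR is set (hypothetical helper)."""
    if not destdir:
        return path
    # Strip the leading separator so join() cannot discard destdir.
    return posixpath.join(destdir, path.lstrip('/'))


# join() drops every earlier component once it meets an absolute one,
# so DESTDIR is silently ignored here.
naive = posixpath.join('/tmp/stage', '/usr/lib/dri')

fixed = under_destdir('/tmp/stage', '/usr/lib/dri')
print(naive, fixed)
```

Stripping the leading separator is only one way to do it;
os.path.relpath(path, '/') or string concatenation would also work.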

Fixes: 3218056e0eb375eeda470058d06add1532acd6d4
       ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
6 years agocompiler/glsl: close fd's in glcpp_test.py
Dylan Baker [Thu, 19 Apr 2018 18:02:32 +0000 (11:02 -0700)]
compiler/glsl: close fd's in glcpp_test.py

I would have thought falling out of scope would allow the gc to collect
these, but apparently it doesn't, and this hits an fd limit on macos.
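
One way descriptors can outlive their scope in Python is when they are raw
integers, such as the one returned by tempfile.mkstemp(): the garbage
collector has nothing to close in that case. Whether that is the leak in
glcpp_test.py is an assumption; the sketch below only shows the general
pattern of closing a descriptor deterministically, and write_temp is a
made-up name.

```python
import os
import tempfile


def write_temp(text):
    """Write `text` to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    # os.fdopen() wraps the raw descriptor in a file object, and the
    # with-block closes it on exit instead of waiting for the gc.
    with os.fdopen(fd, 'w') as f:
        f.write(text)
    return path


path = write_temp('#if 0X1\n#endif\n')
with open(path) as f:
    content = f.read()
os.remove(path)
print(content)
```

The caller owns the returned path and removes the file when done.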

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106133
Fixes: db8cd8e36771eed98eb638fd0593c978c3da52a9
       ("glcpp/tests: Convert shell scripts to a python script")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
6 years agonir: Do not use progress for unreachable code in return lowering.
Bas Nieuwenhuizen [Sun, 22 Apr 2018 17:05:19 +0000 (19:05 +0200)]
nir: Do not use progress for unreachable code in return lowering.

We seem to use progress for two cases:
1) When we lowered some returns.
2) When we remove unreachable code.

If only case 2 happens, we assert because state->return_flag has not
been allocated yet, but we are still trying to insert all the
predicates based on it.

This splits the concerns. We only use progress internally for case 1
and then keep track of 2 in a separate variable to indicate progress
in the return value of the pass.

This is slightly better than transforming the assert into
if (!state->return_flag) return, as the solution in this patch avoids
inserting predicates even if some other part of the might need them.

Fixes: 6e22ad6edc "nir: return early when lowering a return at the end of a function"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106174
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoradv: advertise 8 bits of subpixel precision for viewports
Józef Kucia [Tue, 10 Apr 2018 22:11:57 +0000 (00:11 +0200)]
radv: advertise 8 bits of subpixel precision for viewports

This is what radeonsi does.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agost/dri: Fix dangling pointer to a destroyed dri_drawable
Johan Klokkhammer Helsing [Fri, 20 Apr 2018 10:29:16 +0000 (12:29 +0200)]
st/dri: Fix dangling pointer to a destroyed dri_drawable

If an EGLSurface is created, made current and destroyed, and then a second
EGLSurface is created, the second malloc in driCreateNewDrawable may
return the same pointer address the first surface's drawable had.
Consequently, when dri_make_current later tries to determine if it should
update the texture_stamp it compares the surface's drawable pointer against
the drawable in the last call to dri_make_current and assumes it's the same
surface (which it isn't).

When texture_stamp is left unset, then dri_st_framebuffer_validate thinks
it has already called update_drawable_info for that drawable, leaving it
unvalidated and this is when bad things start to happen. In my case it
manifested itself by the width and height of the surface being unset.

This is fixed by setting the pointer to NULL before freeing the
surface.
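
The address-reuse problem can be shown with a toy model. This is not the
st/dri code: addresses are plain integers, and the Context class and its
members are hypothetical stand-ins for the cached drawable pointer and the
texture_stamp update.

```python
class Context:
    """Toy model of a context that caches the last drawable's address."""

    def __init__(self):
        self.last_drawable = None   # address seen by the previous make_current
        self.stamp_updates = 0      # how often the stamp was refreshed

    def make_current(self, drawable_addr):
        # The stamp is refreshed only when the drawable appears to have changed.
        if drawable_addr != self.last_drawable:
            self.stamp_updates += 1
            self.last_drawable = drawable_addr

    def destroy_drawable(self, drawable_addr, clear_pointer):
        # With the fix, the cached pointer is cleared before the free.
        if clear_pointer and self.last_drawable == drawable_addr:
            self.last_drawable = None


def run(clear_pointer):
    ctx = Context()
    addr = 0x1000                   # address handed out by the first malloc
    ctx.make_current(addr)
    ctx.destroy_drawable(addr, clear_pointer)
    ctx.make_current(addr)          # second surface, same address reused
    return ctx.stamp_updates


print(run(clear_pointer=False), run(clear_pointer=True))
```

Without clearing the pointer, the second surface is mistaken for the first
and the stamp is refreshed only once; with it, the refresh happens for both.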

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126
Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
6 years agonv50/ir: make a copy of tex src if it's referenced multiple times
Ilia Mirkin [Tue, 10 Apr 2018 02:19:35 +0000 (22:19 -0400)]
nv50/ir: make a copy of tex src if it's referenced multiple times

For nv50 we coalesce the srcs and defs into a single node. As such, we
can end up with impossible constraints if the source is referenced
after the tex operation (which, due to the coalescing of values, will
have overwritten it).

This logic already exists for inserting moves for MERGE/UNION sources.
It's the exact same idea here, so leverage that code, which also
includes a few optimizations around not extending live ranges
unnecessarily.

Fixes tests/spec/glsl-1.30/execution/fs-textureSize-components.shader_test

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agovirgl: disable virgl when no 3D for virtio gpu.
Lepton Wu [Thu, 5 Apr 2018 19:38:48 +0000 (12:38 -0700)]
virgl: disable virgl when no 3D for virtio gpu.

If users are running mesa under an old version of qemu or have turned off
GL at runtime, the virtio gpu driver doesn't actually work. Add a detection
here so mesa can fall back to software rendering.

v2:
 - move detection from loader to virgl (Ilia, Emil)

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: mark const structs as extern in header file to avoid lto damage
Dave Airlie [Fri, 13 Apr 2018 02:40:55 +0000 (12:40 +1000)]
radv: mark const structs as extern in header file to avoid lto damage

The copr repo from che was using LTO and he reported radv broke
recently with it. When testing with lto builds here I noticed
that we weren't seeing any instance extensions reported.

It appears LTO was treating the const without extern as an empty
struct, this is possibly a gcc bug, but we can work around it
just by marking these with extern.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>