mesa.git
5 years agost/mesa: call resource_changed when binding a EGLImage to a texture
Lucas Stach [Tue, 20 Mar 2018 11:14:12 +0000 (12:14 +0100)]
st/mesa: call resource_changed when binding a EGLImage to a texture

When an EGLImage is newly bound to a texture, we need to make sure the
driver is informed that the resource might have changed. Fixes stale
texture content on Etnaviv when binding an existing EGLImage to an
existing texture object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9
Samuel Pitoiset [Wed, 11 Jul 2018 09:55:55 +0000 (11:55 +0200)]
radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9

A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion
counters) must immediately precede every timestamp event to
prevent a GPU hang on GFX9.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add support for VK_KHR_create_renderpass2
Samuel Pitoiset [Sun, 8 Jul 2018 15:47:52 +0000 (17:47 +0200)]
radv: add support for VK_KHR_create_renderpass2

vkCreateRenderPass2KHR() is quite similar to vkCreateRenderPass()
but refactoring the code is a bit painful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: introduce radv_subpass_attachment data structure
Samuel Pitoiset [Sun, 8 Jul 2018 15:47:51 +0000 (17:47 +0200)]
radv: introduce radv_subpass_attachment data structure

Needed for VK_KHR_create_renderpass2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agost/mesa: Only enable depth writes if the function isn't EQUAL.
Kenneth Graunke [Thu, 5 Jul 2018 09:55:57 +0000 (02:55 -0700)]
st/mesa: Only enable depth writes if the function isn't EQUAL.

If the depth function is EQUAL, then we'll only write the depth value
when it already matches what's in the buffer, which is pointless.
Skipping these writes can save bandwidth.

The state tracker can easily take care of this, so all drivers benefit.
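
The rule above can be sketched as a small predicate. This is only an
illustration: the enum and the function name are hypothetical stand-ins,
not the state tracker's actual types or code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the GL depth comparison functions. */
enum depth_func {
   DEPTH_FUNC_NEVER,
   DEPTH_FUNC_LESS,
   DEPTH_FUNC_EQUAL,
   DEPTH_FUNC_LEQUAL,
   DEPTH_FUNC_ALWAYS,
};

/* Depth writes are only worth enabling when the depth test is on, the
 * application's write mask allows them, and the function is not EQUAL:
 * a passing EQUAL test would just rewrite the value already stored. */
static bool
depth_writes_enabled(bool depth_test, bool write_mask, enum depth_func func)
{
   return depth_test && write_mask && func != DEPTH_FUNC_EQUAL;
}
```

Doing this once in shared code means no individual driver has to
special-case EQUAL itself.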

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoanv/android: Fix type error in call to vk_errorf()
Chad Versace [Thu, 28 Jun 2018 03:22:23 +0000 (20:22 -0700)]
anv/android: Fix type error in call to vk_errorf()

In a single call to vk_errorf() in the Android code, the arguments were
swapped. The bug has existed since day one. Chrome OS used to forgive
the warning, but it is now a compilation error.

CC: <mesa-stable@lists.freedesktop.org>
Fixes: 053d4c32 "anv: Implement VK_ANDROID_native_buffer (v9)"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agoanv/android: Fix Autotools build for VK_ANDROID_native_buffer
Chad Versace [Thu, 28 Jun 2018 02:40:44 +0000 (19:40 -0700)]
anv/android: Fix Autotools build for VK_ANDROID_native_buffer

Changes to vk.xml and anv_entrypoints_gen.py broke the Autotools build
on Android. The changes undef'd the VK_ANDROID_native_buffer entrypoints
in anv_entrypoints.h.

Fix it with CPPFLAGS += -DVK_USE_PLATFORM_ANDROID_KHR.

CC: <mesa-stable@lists.freedesktop.org>
See-Also: 63525ba7 "android: enable VK_ANDROID_native_buffer"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agoradv: make sure to wait for CP DMA when needed
Samuel Pitoiset [Mon, 9 Jul 2018 16:02:58 +0000 (18:02 +0200)]
radv: make sure to wait for CP DMA when needed

This might fix some synchronization issues. I don't know if
that will affect performance but it's required for correctness.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/tools/dump_gpu: Add option to print ppgtt mappings.
Rafael Antognolli [Tue, 3 Jul 2018 18:38:39 +0000 (11:38 -0700)]
intel/tools/dump_gpu: Add option to print ppgtt mappings.

Using -vv will increase the verbosity, by printing the ppgtt mappings as
they get written into the aub file.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agospirv: Fix InterpolateAt* instructions for vecs with dynamic index
Neil Roberts [Wed, 9 May 2018 13:14:36 +0000 (15:14 +0200)]
spirv: Fix InterpolateAt* instructions for vecs with dynamic index

If the GLSL is something like this:

  in vec4 some_input;
  interpolateAtCentroid(some_input[idx])

then it now gets generated as if it were:

  interpolateAtCentroid(some_input)[idx]

This is necessary because the index will get generated as a series of
nir_bcsel instructions so it would no longer be an input variable. It
is similar to what is done for GLSL in ca63a5ed3e9efb2bd645b42.

Although I can’t find anything explicit in the Vulkan specs to say
this should be allowed, the SPIR-V spec just says “the operand
interpolant must be a pointer to the Input Storage Class”, which I
guess doesn’t rule out any type of pointer to an input.

This was found using the spec/glsl-4.40/execution/fs-interpolateAt*
Piglit tests with the ARB_gl_spirv branch.

Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
v2: update after nir_deref_instr landed on master. Implemented by
    Alejandro Piñeiro. Special thanks to Jason Ekstrand for guidance
    in the new nir_deref_instr world.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/ir: Uncomment definition of several unused hardware opcodes.
Francisco Jerez [Wed, 24 Jan 2018 03:35:23 +0000 (19:35 -0800)]
intel/ir: Uncomment definition of several unused hardware opcodes.

There are a number of opcode_desc table entries for many of these
unused opcodes.  A symbolic opcode enum will be required in a future
commit in order to keep them in the opcode description tables.  The
alternative would be to remove the unused opcodes from the opcode
description tables.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/fs: Initialize mlen for gen7 varying pull constant load messages.
Francisco Jerez [Wed, 27 Dec 2017 03:08:10 +0000 (19:08 -0800)]
intel/fs: Initialize mlen for gen7 varying pull constant load messages.

This makes the message length available at the IR level, which should
save some guesswork in a future commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Assert that the instruction is send-like in brw_set_desc_ex().
Francisco Jerez [Sun, 3 Jun 2018 20:20:45 +0000 (13:20 -0700)]
intel/eu: Assert that the instruction is send-like in brw_set_desc_ex().

Constructing a descriptor in-place as part of the immediate of an ALU
instruction is no longer supported.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Get rid of the return value of brw_send_indirect_message().
Francisco Jerez [Sat, 2 Jun 2018 22:08:18 +0000 (15:08 -0700)]
intel/eu: Get rid of the return value of brw_send_indirect_message().

The return value is not used anymore.  This allows simplifying the
code slightly, and in addition it should frustrate anybody's attempts
to continue using the obsolete piecemeal approach to construct a
message descriptor in combination with brw_send_indirect_message().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Get rid of the return value of brw_send_indirect_surface_message().
Francisco Jerez [Sun, 3 Jun 2018 10:30:50 +0000 (03:30 -0700)]
intel/eu: Get rid of the return value of brw_send_indirect_surface_message().

All users of brw_send_indirect_surface_message() should be providing a
full descriptor immediate up front by now, so the return value isn't
necessary anymore.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use descriptor constructors for dataport typed surface messages.
Francisco Jerez [Thu, 7 Jun 2018 22:27:06 +0000 (15:27 -0700)]
intel/eu: Use descriptor constructors for dataport typed surface messages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use descriptor constructors for dataport scattered byte surface messages.
Francisco Jerez [Thu, 7 Jun 2018 22:24:48 +0000 (15:24 -0700)]
intel/eu: Use descriptor constructors for dataport scattered byte surface messages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use descriptor constructors for dataport untyped surface messages.
Francisco Jerez [Thu, 7 Jun 2018 22:22:58 +0000 (15:22 -0700)]
intel/eu: Use descriptor constructors for dataport untyped surface messages.

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Provide single descriptor argument to brw_send_indirect_surface_message().
Francisco Jerez [Thu, 7 Jun 2018 22:19:49 +0000 (15:19 -0700)]
intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message().

Instead of the current message_len, response_len and header_present
arguments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use descriptor constructors for pixel interpolator messages.
Francisco Jerez [Mon, 9 Jul 2018 23:16:16 +0000 (16:16 -0700)]
intel/eu: Use descriptor constructors for pixel interpolator messages.

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use descriptor constructors for dataport write messages.
Francisco Jerez [Mon, 9 Jul 2018 23:12:59 +0000 (16:12 -0700)]
intel/eu: Use descriptor constructors for dataport write messages.

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use descriptor constructors for dataport read messages.
Francisco Jerez [Thu, 7 Jun 2018 17:50:20 +0000 (10:50 -0700)]
intel/eu: Use descriptor constructors for dataport read messages.

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use descriptor constructors for sampler messages.
Francisco Jerez [Sat, 2 Jun 2018 22:15:15 +0000 (15:15 -0700)]
intel/eu: Use descriptor constructors for sampler messages.

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Provide desc immediate argument up front to brw_send_indirect_message().
Francisco Jerez [Sat, 2 Jun 2018 22:07:31 +0000 (15:07 -0700)]
intel/eu: Provide desc immediate argument up front to brw_send_indirect_message().

The current approach of returning a setup instruction where additional
descriptor fields can be specified is still supported in order to keep
things working, but it will be removed later in this series.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoTRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add().
Francisco Jerez [Sat, 2 Jun 2018 21:59:08 +0000 (14:59 -0700)]
TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Use brw_set_desc() along with a helper to set common descriptor controls.
Francisco Jerez [Mon, 11 Jun 2018 17:49:39 +0000 (10:49 -0700)]
intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls.

This replaces brw_set_message_descriptor() with the composition of
brw_set_desc() and a new inline helper function that packs the common
message descriptor controls into an integer.  The goal is to represent
all message descriptors as a 32-bit integer which is written at once
into the instruction, which is more flexible (SENDS anyone?), robust
(see d2eecf0b0b24d203d0f171807681dffd830d54de fixing an issue
ultimately caused by some bits of the extended message descriptor
being left undefined) and future-proof than the current approach of
specifying the individual descriptor fields directly into the
instruction.

This approach also seems more self-documenting, since it will allow
removing calls to functions with way too many arguments like
brw_set_*_message() and brw_send_indirect_message(), and instead
provide a single descriptor argument constructed from an appropriate
combination of brw_*_desc() helpers.

Note that because brw_set_message_descriptor() was (conditionally?)
overriding fields of the instruction which strictly speaking weren't
part of the message descriptor, this involves calling
brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition
to brw_set_desc().
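
As a rough sketch of the "descriptor as one 32-bit integer" idea, the
helper below packs the common controls into a single value that a caller
could OR together with function-specific bits and write in one go. The
function names and the bit positions used here are assumptions made for
illustration; they are not taken from this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only:
 *   bits 28:25  message length
 *   bits 24:20  response length
 *   bit  19     header present
 */
static inline uint32_t
example_message_desc(unsigned msg_length, unsigned response_length,
                     bool header_present)
{
   return ((uint32_t)(msg_length & 0xf) << 25) |
          ((uint32_t)(response_length & 0x1f) << 20) |
          ((uint32_t)header_present << 19);
}

/* Extract the message length back out of a packed descriptor. */
static inline unsigned
example_message_desc_mlen(uint32_t desc)
{
   return (desc >> 25) & 0xf;
}
```

Because the whole value is built before it is written, no descriptor bit
is left undefined by accident, which is the robustness argument made
above.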

v2: Use SET_BITS macro instead of left shift (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Define SET_BITS helper more easily reusable than SET_FIELD.
Francisco Jerez [Mon, 25 Jun 2018 19:06:50 +0000 (12:06 -0700)]
intel/eu: Define SET_BITS helper more easily reusable than SET_FIELD.

Allows specifying a bitfield based on its upper and lower bounds
instead of a symbolic field definition, much as the current
GET_BITS macro relates to GET_FIELD.
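
A minimal sketch of a helper with this shape, written as inline
functions rather than as the macro the commit adds. The names
bits_mask(), set_bits() and get_bits() are hypothetical, and the range
check is an assumption about what such a helper would reasonably do.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits [high:low], inclusive; requires low <= high < 32. */
static inline uint32_t
bits_mask(unsigned high, unsigned low)
{
   assert(high < 32 && low <= high);
   return (UINT32_MAX >> (31 - high)) & (UINT32_MAX << low);
}

/* Place 'value' into bits [high:low] of an otherwise zero word. */
static inline uint32_t
set_bits(uint32_t value, unsigned high, unsigned low)
{
   const uint32_t mask = bits_mask(high, low);
   assert(value <= (mask >> low));   /* value must fit in the field */
   return (value << low) & mask;
}

/* Read bits [high:low] back out of 'word'. */
static inline uint32_t
get_bits(uint32_t word, unsigned high, unsigned low)
{
   return (word & bits_mask(high, low)) >> low;
}
```

Several fields can then be combined with a plain OR, for example
set_bits(len, 28, 25) | set_bits(hdr, 19, 19).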

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Define helper to specify the descriptor immediates of a SEND instruction.
Francisco Jerez [Sat, 2 Jun 2018 20:48:42 +0000 (13:48 -0700)]
intel/eu: Define helper to specify the descriptor immediates of a SEND instruction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/eu: Add brw_inst.h helpers for the SEND(C) descriptor and extended descriptor.
Francisco Jerez [Sat, 2 Jun 2018 01:21:37 +0000 (18:21 -0700)]
intel/eu: Add brw_inst.h helpers for the SEND(C) descriptor and extended descriptor.

This introduces helpers that can be used to specify or extract the
whole descriptor of a SEND message instruction at once.  Because the
instruction encoding of these is rather awkward on some generations,
using the generic brw_inst.h macros doesn't seem like an option.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoi965: Support saving the gen program with glGetProgramBinary
Jordan Justen [Thu, 1 Mar 2018 02:33:58 +0000 (18:33 -0800)]
i965: Support saving the gen program with glGetProgramBinary

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Add flag_state param to brw_search_cache
Jordan Justen [Thu, 1 Mar 2018 05:43:22 +0000 (21:43 -0800)]
i965: Add flag_state param to brw_search_cache

This allows brw_search_cache to be used to find programs without
causing extra state to be emitted in the case where the program isn't
being made active. (For example, to find the program to save out with
the ARB_get_program_binary interface.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agomesa: Add gl_shader_program param to ProgramBinarySerializeDriverBlob
Jordan Justen [Thu, 1 Mar 2018 02:29:54 +0000 (18:29 -0800)]
mesa: Add gl_shader_program param to ProgramBinarySerializeDriverBlob

This might be required because some stages might generate different
programs depending on the other stages in the program. For example,
the i965 driver's tessellation control stage depends on the
tessellation evaluation shader.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Add brw_populate_default_key
Jordan Justen [Thu, 1 Mar 2018 01:58:02 +0000 (17:58 -0800)]
i965: Add brw_populate_default_key

We will need to populate the default key for ARB_get_program_binary to
allow us to retrieve the default gen program to store in the program
binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Replace brw_setup_tex_for_precompile brw with devinfo
Jordan Justen [Thu, 1 Mar 2018 01:14:50 +0000 (17:14 -0800)]
i965: Replace brw_setup_tex_for_precompile brw with devinfo

Trying to make sure the setup of the default program key is not
dependent on the GL state.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Regenerate blob without gen program for shader cache
Jordan Justen [Mon, 9 Apr 2018 08:39:18 +0000 (01:39 -0700)]
i965: Regenerate blob without gen program for shader cache

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agocompiler/blob: Add blob_skip_bytes
Jordan Justen [Mon, 9 Apr 2018 08:07:03 +0000 (01:07 -0700)]
compiler/blob: Add blob_skip_bytes

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Add support for driver cache blob containing the gen program
Jordan Justen [Thu, 1 Mar 2018 00:20:51 +0000 (16:20 -0800)]
i965: Add support for driver cache blob containing the gen program

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Use brw_prog_key_set_id in disk cache load/store code
Jordan Justen [Tue, 6 Mar 2018 05:18:27 +0000 (21:18 -0800)]
i965: Use brw_prog_key_set_id in disk cache load/store code

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Add brw_prog_key_set_id helper to set the program id on any stage
Jordan Justen [Tue, 6 Mar 2018 01:17:23 +0000 (17:17 -0800)]
i965: Add brw_prog_key_set_id helper to set the program id on any stage

For saving programs (shader cache; get program binary) it is useful to
set the id to 0, with the stage being a parameter.

For restoring programs it is useful to set the id to the id allocated
to the program at creation time.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Add brw_stage_cache_id to map gl stages to brw cache_ids
Jordan Justen [Thu, 1 Mar 2018 06:01:44 +0000 (22:01 -0800)]
i965: Add brw_stage_cache_id to map gl stages to brw cache_ids

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Add brw_(read|write)_blob_program_data functions
Jordan Justen [Wed, 28 Feb 2018 23:55:47 +0000 (15:55 -0800)]
i965: Add brw_(read|write)_blob_program_data functions

We will want to use these for both the disk shader cache, and for the
ARB_get_program_binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Add brw_program_deserialize_driver_blob
Jordan Justen [Wed, 28 Feb 2018 22:41:02 +0000 (14:41 -0800)]
i965: Add brw_program_deserialize_driver_blob

brw_program_deserialize_driver_blob will be a more generic form of
brw_program_deserialize_nir. In addition to nir, it will also be able
to extract gen binaries and upload them to the program cache.

In this commit, it continues to only support nir.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Move brw_program_*serialize_nir to brw_program_binary.c
Jordan Justen [Wed, 28 Feb 2018 09:39:27 +0000 (01:39 -0800)]
i965: Move brw_program_*serialize_nir to brw_program_binary.c

This will allow get_program_binary to add the gen program into its
serialization in addition to just the nir program.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agomesa: Always call ProgramBinarySerializeDriverBlob
Jordan Justen [Thu, 19 Apr 2018 22:39:40 +0000 (15:39 -0700)]
mesa: Always call ProgramBinarySerializeDriverBlob

The driver may prefer to have a different blob for
ARB_get_program_binary compared to the version saved out for the disk
shader cache.

Since they both use the driver_cache_blob field, we need to always
give the driver the opportunity to fill in the driver_cache_blob when
saving the program binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: Use ShaderCacheSerializeDriverBlob driver function
Jordan Justen [Mon, 9 Apr 2018 06:16:19 +0000 (23:16 -0700)]
i965: Use ShaderCacheSerializeDriverBlob driver function

This function is called just before the gl_program::driver_cache_blob
is saved out as part of the gl_program serialization.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agost/mesa: Use ShaderCacheSerializeDriverBlob driver function
Jordan Justen [Thu, 19 Apr 2018 23:20:53 +0000 (16:20 -0700)]
st/mesa: Use ShaderCacheSerializeDriverBlob driver function

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agost/mesa: Skip serializing driver_cache_blob if it exists
Jordan Justen [Thu, 19 Apr 2018 23:14:28 +0000 (16:14 -0700)]
st/mesa: Skip serializing driver_cache_blob if it exists

Previously the mesa core code would not call to serialize the
driver_cache_blob if it existed. We will update it to always call to
serialize the driver_cache_blob, meaning we should avoid re-serializing
it under mesa/state_tracker.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agomesa: Add disk shader cache driver blob callback
Jordan Justen [Mon, 9 Apr 2018 00:56:34 +0000 (17:56 -0700)]
mesa: Add disk shader cache driver blob callback

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agointel/compiler: emit actual barriers for working-group level barriers
Iago Toral Quiroga [Thu, 21 Jun 2018 07:45:19 +0000 (09:45 +0200)]
intel/compiler: emit actual barriers for working-group level barriers

Until now we have assumed that we could skip emitting these barriers
in the general case, based on empirical testing and a few assumptions
detailed in a comment in the driver code. However, recent CTS tests
have shown that we actually need them to produce correct behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv: add some cxxflags for new c++ file
Dave Airlie [Tue, 10 Jul 2018 00:15:34 +0000 (10:15 +1000)]
radv: add some cxxflags for new c++ file

Looks like I broke intel CI compiles.

Fixes: 6f3aee40f9 (radv: using tls to store llvm related info and speed up compiles (v10))
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
5 years agoanv,radv: Add support for VK_KHR_get_display_properties2
Jason Ekstrand [Fri, 15 Jun 2018 22:47:41 +0000 (15:47 -0700)]
anv,radv: Add support for VK_KHR_get_display_properties2

Reviewed-by: Keith Packard <keithp@keithp.com>
5 years agointel/aubinator_error_decode: Allow for more sections
Jason Ekstrand [Mon, 9 Jul 2018 23:00:17 +0000 (16:00 -0700)]
intel/aubinator_error_decode: Allow for more sections

Error states coming from actual Vulkan applications tend to have fairly
long command buffers and lots of chained batches.  30 total BOs isn't
nearly enough.  This commit bumps it to 256, makes some things use the
actual number of sections instead of the #define, and adds asserts if we
ever go over 256 sections.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/batch_decoder: Recurse for all 2nd level batches
Jason Ekstrand [Mon, 9 Jul 2018 22:58:33 +0000 (15:58 -0700)]
intel/batch_decoder: Recurse for all 2nd level batches

Our attempt to restart the loop with the second level batch worked at
one point but was later broken.  It was too fragile anyway and
we're not likely to have enough secondaries to actually overflow the
stack so we may as well recurse in both cases.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovirgl/vtest: add support to vtest for new cap getting.
Dave Airlie [Fri, 8 Jun 2018 06:19:49 +0000 (16:19 +1000)]
virgl/vtest: add support to vtest for new cap getting.

The vtest protocol is pretty simple but also pretty dumb, and
the v1 caps query was fixed size, with no nice way to expand it,
however the server also ignores any command it doesn't understand.

So we can query v2 caps by sending a v2 followed by a v1. If the
v2 is ignored we know it's an old vtest server, and if we get
a v2 answer then we can just read the v1 answer and discard it.
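
The probing logic can be sketched as follows. This is a toy model only:
the reply struct, the enum values and negotiate_caps() are hypothetical
and do not reflect the real vtest wire format or client code.

```c
/* Hypothetical reply tags; the real protocol's encoding is not shown. */
enum reply_kind {
   REPLY_CAPS_V1 = 1,
   REPLY_CAPS_V2 = 2,
};

struct reply {
   enum reply_kind kind;
};

/* The client has already sent a v2 caps query followed by a v1 query.
 * 'replies' holds the answers in arrival order. Returns the caps version
 * to use (0 if nothing usable arrived) and stores the number of replies
 * consumed in '*consumed'. */
static int
negotiate_caps(const struct reply *replies, int n, int *consumed)
{
   *consumed = 0;
   if (n < 1)
      return 0;

   if (replies[0].kind == REPLY_CAPS_V2) {
      /* New server: it answered both queries, so the v1 answer that
       * follows is read and discarded. */
      *consumed = n >= 2 ? 2 : 1;
      return 2;
   }

   /* Old server: it ignored the v2 query, so the first answer is v1. */
   *consumed = 1;
   return 1;
}
```

The scheme relies on the server silently ignoring commands it does not
understand, so the v1 query always guarantees at least one answer.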

Acked-by: Jakob Bornecrantz <jakob@collabora.com> (sounds good)
5 years agoi965/icl: Don't set float blend optimization bit in CACHE_MODE_SS
Anuj Phogat [Thu, 31 May 2018 23:03:44 +0000 (16:03 -0700)]
i965/icl: Don't set float blend optimization bit in CACHE_MODE_SS

CACHE_MODE_SS is not listed in the gfxspecs table of user mode
non-privileged registers, so any change made from Mesa
will do nothing. The kernel already sets this bit in the
CACHE_MODE_SS register, which is saved to and restored from
the HW context image.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/icl: Don't set float blend optimization bit in CACHE_MODE_SS
Anuj Phogat [Thu, 31 May 2018 22:41:53 +0000 (15:41 -0700)]
anv/icl: Don't set float blend optimization bit in CACHE_MODE_SS

CACHE_MODE_SS is not listed in the gfxspecs table of user mode
non-privileged registers, so any change made from Mesa
will do nothing. The kernel already sets this bit in the
CACHE_MODE_SS register, which is saved to and restored from
the HW context image.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: Implement VK_EXT_vertex_attribute_divisor
Jason Ekstrand [Mon, 2 Jul 2018 19:57:44 +0000 (12:57 -0700)]
anv: Implement VK_EXT_vertex_attribute_divisor

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoanv/pipeline: Add a per-VB instance divisor
Jason Ekstrand [Mon, 2 Jul 2018 19:49:06 +0000 (12:49 -0700)]
anv/pipeline: Add a per-VB instance divisor

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoanv/pipeline: Use a per-VB struct instead of separate arrays
Jason Ekstrand [Mon, 2 Jul 2018 19:44:49 +0000 (12:44 -0700)]
anv/pipeline: Use a per-VB struct instead of separate arrays

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoanv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage
Jose Maria Casanova Crespo [Mon, 9 Jul 2018 00:01:32 +0000 (02:01 +0200)]
anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage

Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+
using the VK_KHR_get_physical_device_properties2 functionality
to expose if the extension is supported or not.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv/nir: Add support for SPV_KHR_8bit_storage
Jose Maria Casanova Crespo [Mon, 9 Jul 2018 00:01:22 +0000 (02:01 +0200)]
spirv/nir: Add support for SPV_KHR_8bit_storage

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv: Include headers and grammar for SPV_KHR_8bit_storage
Jose Maria Casanova Crespo [Mon, 9 Jul 2018 00:01:14 +0000 (02:01 +0200)]
spirv: Include headers and grammar for SPV_KHR_8bit_storage

Updates headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965/fs: Enable store_ssbo for 8-bit types.
Jose Maria Casanova Crespo [Mon, 9 Jul 2018 00:01:01 +0000 (02:01 +0200)]
i965/fs: Enable store_ssbo for 8-bit types.

v2: Update comment according to this patch. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/compiler: relax brw_eu_validate for byte raw movs
Jose Maria Casanova Crespo [Mon, 9 Jul 2018 00:00:34 +0000 (02:00 +0200)]
intel/compiler: relax brw_eu_validate for byte raw movs

When the destination is of BYTE type, allow raw movs
even if the stride is not an exact multiple of the destination
type and exec type: the execution type is Word and its size is 2.

Previously this restriction allowed only stride==2 destinations
for 8-bit types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965/fs: Enable conversions to 8-bit integers
Jose Maria Casanova Crespo [Mon, 9 Jul 2018 00:00:23 +0000 (02:00 +0200)]
i965/fs: Enable conversions to 8-bit integers

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965: Support for 8-bit base types in helper functions
Jose Maria Casanova Crespo [Mon, 9 Jul 2018 00:00:06 +0000 (02:00 +0200)]
i965: Support for 8-bit base types in helper functions

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965/fs: Register allocator shouldn't use grf127 for sends dest
Jose Maria Casanova Crespo [Wed, 18 Apr 2018 23:15:23 +0000 (01:15 +0200)]
i965/fs: Register allocator shouldn't use grf127 for sends dest

Since Gen8, the Intel PRM states that "r127 must not be used for return
address when there is a src and dest overlap in send instruction."

This patch implements this restriction by creating a new
grf127_send_hack_node in the register allocator. This node has a fixed
assignment to grf127.

For vgrfs that are used as the destination of send messages we create node
interferences with the grf127_send_hack_node, so the register allocator
will never assign to these vgrfs a register that involves grf127.

If dispatch_width > 8 we don't create these interferences, because
all instructions have node interferences between sources and destination.
That is enough to avoid the r127 restriction.
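
The restriction itself can be expressed as a small check over register
ranges. The struct and function names below are hypothetical, and this
is only a sketch of the PRM rule as quoted above, not the allocator or
validator code.

```c
#include <stdbool.h>

/* A contiguous range of GRF registers: [start, start + count). */
struct reg_range {
   unsigned start;
   unsigned count;
};

static bool
ranges_overlap(struct reg_range a, struct reg_range b)
{
   return a.start < b.start + b.count && b.start < a.start + a.count;
}

/* True if a send with this source payload and destination would break
 * the rule: the destination touches r127 while source and destination
 * overlap. */
static bool
send_violates_r127_rule(struct reg_range src, struct reg_range dst)
{
   const bool dst_uses_r127 =
      dst.count > 0 && dst.start <= 127 && dst.start + dst.count > 127;

   return dst_uses_r127 && ranges_overlap(src, dst);
}
```

Making send destinations interfere with a node pinned to grf127 means
the allocator can never produce a case where this predicate is true.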

This fixes CTS tests that raised this issue as they were executed as SIMD8:

dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom

Shader-db results on Skylake:
   total instructions in shared programs: 7686798 -> 7686797 (<.01%)
   instructions in affected programs: 301 -> 300 (-0.33%)
   helped: 1
   HURT: 0

   total cycles in shared programs: 337092322 -> 337091919 (<.01%)
   cycles in affected programs: 22420415 -> 22420012 (<.01%)
   helped: 712
   HURT: 588

Shader-db results on Broadwell:

   total instructions in shared programs: 7658574 -> 7658625 (<.01%)
   instructions in affected programs: 19610 -> 19661 (0.26%)
   helped: 3
   HURT: 4

   total cycles in shared programs: 340694553 -> 340676378 (<.01%)
   cycles in affected programs: 24724915 -> 24706740 (-0.07%)
   helped: 998
   HURT: 916

   total spills in shared programs: 4300 -> 4311 (0.26%)
   spills in affected programs: 333 -> 344 (3.30%)
   helped: 1
   HURT: 3

   total fills in shared programs: 5370 -> 5378 (0.15%)
   fills in affected programs: 274 -> 282 (2.92%)
   helped: 1
   HURT: 3

v2: Avoid duplicating register classes without grf127. Let's use a node
    with a fixed assignment to grf127 and create interferences to send
    message vgrf destinations. (Eric Anholt)
v3: Update reference to CTS VK_KHR_8bit_storage failing tests.
    (Jose Maria Casanova)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
5 years agointel/compiler: grf127 can not be dest when src and dest overlap in send
Jose Maria Casanova Crespo [Mon, 26 Mar 2018 12:59:46 +0000 (14:59 +0200)]
intel/compiler: grf127 can not be dest when src and dest overlap in send

Implement in brw_eu_validate the restriction from Intel Broadwell PRM,
vol 07, section "Instruction Set Reference", subsection "EUISA
Instructions", Send Message (page 990):

"r127 must not be used for return address when there is a src and
dest overlap in send instruction."

v2: Style fixes (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
5 years agoradv: using tls to store llvm related info and speed up compiles (v10)
Dave Airlie [Wed, 27 Jun 2018 01:34:25 +0000 (11:34 +1000)]
radv: using tls to store llvm related info and speed up compiles (v10)

This uses the common compiler passes abstraction to help radv
avoid fixed cost compiler overheads. This uses a linked list per
thread stored in thread local storage, with an entry in the list
for each target machine.

This should remove the fixed setup cost of creating the pass manager
for every compile.
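
The per-thread list can be sketched roughly as follows. This is an
illustration in plain C, not the radv wrapper itself: struct compiler_ctx,
get_compiler_ctx() and the string target key are hypothetical, and freeing
the list at thread exit is left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the expensive per-target compiler state
 * (target machine, pass manager, ...).
 */
struct compiler_ctx {
   char target[32];
   struct compiler_ctx *next;
};

/* One list head per thread, so no locking is needed to look entries up. */
static _Thread_local struct compiler_ctx *tls_ctx_list;

/* Returns the calling thread's context for 'target', creating it on first
 * use. Later calls with the same target reuse the entry and skip the setup.
 */
static struct compiler_ctx *
get_compiler_ctx(const char *target)
{
   for (struct compiler_ctx *c = tls_ctx_list; c != NULL; c = c->next) {
      if (strcmp(c->target, target) == 0)
         return c;
   }

   struct compiler_ctx *c = calloc(1, sizeof(*c));
   if (c == NULL)
      return NULL;

   /* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
   strncpy(c->target, target, sizeof(c->target) - 1);

   /* The expensive one-time setup would go here. */
   c->next = tls_ctx_list;
   tls_ctx_list = c;
   return c;
}
```

Two lookups for the same target from one thread return the same entry; a
different target gets its own entry in the same list.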

This cuts the time a demo app takes to compile the radv meta shaders with
no cache and exit from 1.7s to 1s. It has also been reported to cut the
startup time with uncached shaders on RoTR from 12m24s to 11m35s. (Alex)

v2: fix llvm6 build, inline emit function, handle multiple targets
in one thread
v3: rebase and port onto new structure
v4: rename some vars (Bas)
v5: drag all code into radv for now, we can refactor it out later
for radeonsi if we make it shareable
v6: use a bit more C++ in the wrapper
v7: logic bugs fixed so it actually runs again.
v8: rebase on top of radeonsi changes.
v9: drop some C++ headers, cleanup list entry
v10: use pop_back (didn't have enough caffeine)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoswrast: Fix eglMakeCurrent(dpy, NULL, NULL, ctx) (v2)
Adam Jackson [Mon, 9 Jul 2018 16:51:37 +0000 (12:51 -0400)]
swrast: Fix eglMakeCurrent(dpy, NULL, NULL, ctx) (v2)

Fixes 14 piglits, mostly in egl_khr_create_context.

v2: Also short-circuit the same-context-no-drawables case (Eric Anholt)

Fixes: https://github.com/anholt/libepoxy/issues/177
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
5 years agointel: tools: dump_gpu: fix ppgtt mapping
Lionel Landwerlin [Fri, 6 Jul 2018 09:58:47 +0000 (10:58 +0100)]
intel: tools: dump_gpu: fix ppgtt mapping

We were not properly writing page tables when the virtual address
range spans multiple subtrees of the tables.
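
To illustrate what "spans multiple subtrees" means, here is a small sketch
of the index arithmetic. It assumes a 4-level layout with 9 index bits per
level and 4 KiB pages; the helper names are hypothetical and this is not the
dump_gpu code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages (assumed) */
#define LEVEL_BITS 9    /* 512 entries per table (assumed) */

/* Index of 'addr' inside the table at 'level' (1 = leaf page table). */
static unsigned
table_index(uint64_t addr, unsigned level)
{
   assert(level >= 1 && level <= 4);
   unsigned shift = PAGE_SHIFT + LEVEL_BITS * (level - 1);
   return (unsigned)((addr >> shift) & ((1u << LEVEL_BITS) - 1));
}

/* Number of distinct level-'level' entries touched by the inclusive
 * virtual address range [start, end].
 */
static uint64_t
entries_spanned(uint64_t start, uint64_t end, unsigned level)
{
   assert(level >= 1 && level <= 4 && start <= end);
   unsigned shift = PAGE_SHIFT + LEVEL_BITS * (level - 1);
   return (end >> shift) - (start >> shift) + 1;
}
```

A two-page range starting at 0x1ff000 already touches two level-2 entries:
the leaf index wraps from 511 to 0 while the parent index moves from 0 to 1,
so the second page has to be written into a different leaf table.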

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agov3d: Implement noperspective varyings on V3D 4.x.
Eric Anholt [Fri, 6 Jul 2018 22:48:46 +0000 (15:48 -0700)]
v3d: Implement noperspective varyings on V3D 4.x.

Fixes a bunch of piglit interpolation tests, and reduces my concern about
some MSAA blit shaders with noperspective varyings.

5 years agov3d: Refactor flat shade/centroid flag emission.
Eric Anholt [Fri, 6 Jul 2018 22:41:56 +0000 (15:41 -0700)]
v3d: Refactor flat shade/centroid flag emission.

The logic was duplicated in a pretty gross way, when what we really need
is just a helper function for stuffing the values in the packet.  This
will make implementing noperspective easier.

5 years agov3d: Fix typo in dither mode offset.
Eric Anholt [Fri, 6 Jul 2018 21:56:26 +0000 (14:56 -0700)]
v3d: Fix typo in dither mode offset.

We weren't using the field yet, so it didn't affect anything.

Fixes: c0476d964abb ("v3d: Express dithering mode in the same way that the CLIF parser does.")
5 years agoglsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2
zhaowei yuan [Tue, 12 Jun 2018 20:45:43 +0000 (04:45 +0800)]
glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2

"sampler2DRect" and "sampler2DRectShadow" are specified as
reserved from GLSL 1.1 and GLSL ES 1.0

Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 34f7e761bc61 ("glsl/parser: Track built-in types using the glsl_type directly")
5 years agost/wgl: check for NULL piAttribList in wglCreatePbufferARB()
Charmaine Lee [Fri, 6 Jul 2018 22:52:37 +0000 (15:52 -0700)]
st/wgl: check for NULL piAttribList in wglCreatePbufferARB()

Java2d opengl pipeline passes NULL piAttribList to
wglCreatePbufferARB(). So skip parsing the attribute list
if it is NULL.
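
A rough sketch of the guard, assuming a zero-terminated list of
attribute/value pairs. count_attrib_pairs() and the attribute values in
example_attribs are made up for illustration; the real st/wgl parsing code
is not shown here.

```c
#include <stddef.h>

/* Hypothetical list: two attribute/value pairs followed by the 0 terminator.
 * The attribute tokens are placeholders, not real WGL enums.
 */
static const int example_attribs[] = { 0x1001, 64, 0x1002, 1, 0 };

/* Counts the attribute/value pairs in a zero-terminated list.
 * A NULL list is treated as an empty one instead of being dereferenced.
 */
static size_t
count_attrib_pairs(const int *attrib_list)
{
   size_t pairs = 0;

   if (attrib_list == NULL)
      return 0;

   while (attrib_list[2 * pairs] != 0)
      pairs++;

   return pairs;
}
```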

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
5 years agoanv: Add support for VK_KHR_create_renderpass2
Jason Ekstrand [Tue, 24 Apr 2018 20:08:13 +0000 (13:08 -0700)]
anv: Add support for VK_KHR_create_renderpass2

The implementation of CreateRenderPass2 uses the helpers we broke out in
previous commits.  The implementations of the new vkCmd functions just
call the old versions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: Make subpass::depth_stencil_attachment a pointer
Jason Ekstrand [Tue, 26 Jun 2018 16:22:20 +0000 (09:22 -0700)]
anv: Make subpass::depth_stencil_attachment a pointer

This makes certain checks a bit easier and means that we don't have
the attachment information duplicated in the attachment list and in
depth_stencil_attachment.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/pass: Move implicit dependency setup to anv_render_pass_compile
Jason Ekstrand [Tue, 24 Apr 2018 20:01:01 +0000 (13:01 -0700)]
anv/pass: Move implicit dependency setup to anv_render_pass_compile

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/pass: Move some dependency setup into a helper
Jason Ekstrand [Tue, 24 Apr 2018 19:57:39 +0000 (12:57 -0700)]
anv/pass: Move some dependency setup into a helper

This new helper takes a VkSubpassDependency2KHR for future-proofing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/pass: Move a bunch of analysis into a separate "compile" stage
Jason Ekstrand [Tue, 24 Apr 2018 18:37:27 +0000 (11:37 -0700)]
anv/pass: Move a bunch of analysis into a separate "compile" stage

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/pass: Use a designated initializer for attachments
Jason Ekstrand [Tue, 24 Apr 2018 16:11:34 +0000 (09:11 -0700)]
anv/pass: Use a designated initializer for attachments

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: Bump the advertised patch version to 80
Jason Ekstrand [Mon, 9 Jul 2018 04:40:14 +0000 (21:40 -0700)]
anv: Bump the advertised patch version to 80

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoglx: Don't allow glXMakeContextCurrent() with only one valid drawable
Adam Jackson [Fri, 6 Jul 2018 18:59:21 +0000 (14:59 -0400)]
glx: Don't allow glXMakeContextCurrent() with only one valid drawable

Drawable and readable need to either both be None or both be non-None.
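
The rule boils down to a single comparison. A sketch with stand-in types
(drawable_t and NO_DRAWABLE are placeholders for GLXDrawable and None, and
the function name is hypothetical):

```c
#include <stdbool.h>

typedef unsigned long drawable_t;   /* stand-in for GLXDrawable */
#define NO_DRAWABLE 0UL             /* stand-in for None */

/* Valid only if both drawables are None or both are not None. */
static bool
drawable_pair_is_valid(drawable_t draw, drawable_t read)
{
   return (draw == NO_DRAWABLE) == (read == NO_DRAWABLE);
}
```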

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agomesa: verify MaxVertexAttribStride for GLES 3.1
Erik Faye-Lund [Wed, 4 Jul 2018 12:45:04 +0000 (14:45 +0200)]
mesa: verify MaxVertexAttribStride for GLES 3.1

The OpenGL ES 3.1 specification, Table 20.41 ("Implementation
Dependent Values"), defines the minimum-maximum value for
MAX_VERTEX_ATTRIB_STRIDE to be 2048.

So we shouldn't enable OpenGL ES 3.1 on implementations where this
isn't the case. Let's add a check for this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa: verify MaxVertexAttribStride for GL 4.4
Erik Faye-Lund [Wed, 4 Jul 2018 12:40:25 +0000 (14:40 +0200)]
mesa: verify MaxVertexAttribStride for GL 4.4

The OpenGL 4.4 specification, Table 23.55 ("Implementation
Dependent Values"), defines the minimum-maximum value for
MAX_VERTEX_ATTRIB_STRIDE to be 2048.

So we shouldn't enable OpenGL 4.4 on implementations where this isn't
the case. Let's add a check for this.
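
The kind of check meant here might look like the sketch below. It is an
assumption about the shape of the test, not the actual version-computation
code: limit_gl_version() is hypothetical and versions are encoded as
major * 10 + minor.

```c
#include <assert.h>

/* Minimum value GL 4.4 allows an implementation to report. */
#define MIN_MAX_VERTEX_ATTRIB_STRIDE 2048u

/* Caps the advertised GL version at 4.3 when the driver's reported
 * MAX_VERTEX_ATTRIB_STRIDE is below the GL 4.4 minimum.
 */
static unsigned
limit_gl_version(unsigned version, unsigned max_vertex_attrib_stride)
{
   assert(version >= 10);

   if (version >= 44 &&
       max_vertex_attrib_stride < MIN_MAX_VERTEX_ATTRIB_STRIDE)
      return 43;

   return version;
}
```

Under this sketch a driver reporting a stride of 2047 would be held at 4.3,
while one reporting 2048 keeps 4.4.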

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agor600: report incorrect max-vertex-attrib for GL 4.4
Erik Faye-Lund [Fri, 6 Jul 2018 08:29:02 +0000 (10:29 +0200)]
r600: report incorrect max-vertex-attrib for GL 4.4

OpenGL 4.4 requires a max vertex attrib stride of 2048 or higher, but
r600 only supports 2047. Technically, this makes it a GL 4.3 GPU,
but it's currently exposing GL 4.4.

To avoid regressing the GL version supported in the following
patches, let's just lie and pretend like we support 2048. Any
applications using 2048 are already broken anyway.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agointel/fs: use uint type for per_slot_offset at GS
Jose Maria Casanova Crespo [Tue, 12 Jun 2018 12:52:14 +0000 (14:52 +0200)]
intel/fs: use uint type for per_slot_offset at GS

This helps us to compact original instruction:

mul(8)  g3<1>D  g6<8,8,1>UD  0x00000006UD { align1 1Q };

So now we emit:

mul(8)  g3<1>UD g6<8,8,1>UD  0x00000006UD { align1 1Q compacted };

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agoradv: add the trace BO to the list when starting a new cmdbuf
Samuel Pitoiset [Tue, 3 Jul 2018 10:43:41 +0000 (12:43 +0200)]
radv: add the trace BO to the list when starting a new cmdbuf

That might reduce CPU overhead a little bit when using
RADV_TRACE_FILE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: reduce CPU overhead in radv_flush_descriptors()
Samuel Pitoiset [Tue, 3 Jul 2018 10:43:40 +0000 (12:43 +0200)]
radv: reduce CPU overhead in radv_flush_descriptors()

The number of enabled descriptors for a given pipeline stage
can be computed at compile time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/compiler: remove unused function
Iago Toral Quiroga [Mon, 9 Jul 2018 09:47:50 +0000 (11:47 +0200)]
intel/compiler: remove unused function

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/pipeline: honor the pipeline_cache_enabled run-time flag
Iago Toral Quiroga [Wed, 4 Jul 2018 08:40:15 +0000 (10:40 +0200)]
anv/pipeline: honor the pipeline_cache_enabled run-time flag

v2: merge both conditions to reduce the diff (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agor600/sb: fix crash in fold_alu_op3
Roland Scheidegger [Wed, 4 Jul 2018 02:44:17 +0000 (04:44 +0200)]
r600/sb: fix crash in fold_alu_op3

fold_assoc() called from fold_alu_op3() can lower the number of src to 2,
which then leads to an invalid access to n.src[2]->gvalue().
This didn't seem to have caused much harm in the past, but on Fedora 28
it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although
with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was
needed to show the issue).

An alternative fix would be to instead call fold_alu_op2() from within
fold_assoc() when the number of src is reduced and return always TRUE
from fold_assoc() in this case, with the only actual difference being
the return value from fold_alu_op3() then. I'm not sure what the return
value actually should be in this case (or whether it even can make a
difference).

https://bugs.freedesktop.org/show_bug.cgi?id=106928
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agovulkan: Update the XML and headers to 1.1.80
Jason Ekstrand [Tue, 24 Apr 2018 15:30:24 +0000 (08:30 -0700)]
vulkan: Update the XML and headers to 1.1.80

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoi965: fix clear color bo address relocation
Lionel Landwerlin [Sat, 7 Jul 2018 13:06:22 +0000 (14:06 +0100)]
i965: fix clear color bo address relocation

Fixes: 7987d041fda0c9 ("i965/surface_state: Emit the clear color address instead of value.")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoradv: winsys/amdgpu: include missing pthread.h header
Mauro Rossi [Sun, 20 May 2018 11:57:03 +0000 (13:57 +0200)]
radv: winsys/amdgpu: include missing pthread.h header

pthread types are used in some files without explicitly including pthread.h.
This leads to compile errors on Android 7.x nougat-x86,
e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:

In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31:
In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32:
external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t'
        pthread_mutex_t global_bo_list_lock;
        ^
1 error generated.

Including pthread.h explicitly fixes the build error.

Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonv50/ir: fix Instruction::isActionEqual for PHI instructions
Karol Herbst [Thu, 28 Jun 2018 16:55:00 +0000 (18:55 +0200)]
nv50/ir: fix Instruction::isActionEqual for PHI instructions

phi instructions don't produce the same result simply because they have the
same sources. They need to be inside the same BasicBlock or share an equal
condition, so that the path taken through the shader selects equal sources
as well.

short example:

cond = ...;
const0 = 0;
const1 = 1;

if (cond) {
  ssa_1 = const0;
} else {
  ssa_2 = const1;
}
ssa_3 = phi ssa_1 ssa_2;

if (!cond) {
  ssa_4 = const0;
} else {
  ssa_5 = const1;
}
ssa_6 = phi ssa_4 ssa_5;

Although both phis have sources with equal results, merging them would be
wrong because a different condition selects which source each one takes.

For now we also stick an assert into GlobalCSE, because it should never end up
having to merge phi instructions.
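
A simplified sketch of the comparison, using the conservative same-block
test only. The struct, the opcode enum and the block ids are hypothetical
and much smaller than the real nv50 IR classes; the "equal condition" case
is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum opcode { OP_MOV, OP_PHI };

struct instr {
   enum opcode op;
   int block;     /* id of the containing basic block */
   int src[2];    /* value numbers of the two sources */
};

/* Two instructions are interchangeable only if opcode and sources match.
 * A phi additionally has to sit in the same block, because which source it
 * selects depends on the control flow leading into that block.
 */
static bool
instr_action_equal(const struct instr *a, const struct instr *b)
{
   assert(a != NULL && b != NULL);

   if (a->op != b->op)
      return false;
   if (a->op == OP_PHI && a->block != b->block)
      return false;

   return a->src[0] == b->src[0] && a->src[1] == b->src[1];
}

/* The two phis from the example above: both read (const0, const1), but
 * they merge different branches, modelled here as different block ids.
 */
static const struct instr phi_ssa_3 = { OP_PHI, 1, { 0, 1 } };
static const struct instr phi_ssa_6 = { OP_PHI, 2, { 0, 1 } };
```

Comparing phi_ssa_3 with phi_ssa_6 reports "not equal" even though their
source lists are identical.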

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years agonvc0/ir: use the combined tid special register
Rhys Perry [Fri, 6 Jul 2018 20:21:28 +0000 (21:21 +0100)]
nvc0/ir: use the combined tid special register

total instructions in shared programs : 5804448 -> 5804690 (0.00%)
total gprs used in shared programs    : 670065 -> 670065 (0.00%)
total shared used in shared programs  : 548832 -> 548832 (0.00%)
total local used in shared programs   : 21068 -> 21068 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0           0           0           5           5
      hurt           0           0           0         191         191

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agonir/print: Print texture and sampler indices
Jason Ekstrand [Sat, 30 Jun 2018 06:08:05 +0000 (23:08 -0700)]
nir/print: Print texture and sampler indices

Commit 5fb69daa6076e56b deleted support from nir_print for printing the
texture and sampler indices on texture instructions.  This commit just
brings it back as best as we can.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/compiler: Relax mixed type restriction for saturating immediates
Ian Romanick [Wed, 27 Jun 2018 02:21:43 +0000 (19:21 -0700)]
intel/compiler: Relax mixed type restriction for saturating immediates

At the time of commit 7bc6e455e23 (i965: Add support for saturating
immediates.) we thought mixed type saturates would be impossible.  We
were only thinking about type converting moves from D to F, for
example.  However, type converting moves w/saturate from F to DF are
definitely possible.  This change minimally relaxes the restriction to
allow cases that I have been able to trigger via piglit tests.
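
For illustration, saturating an F immediate that feeds a DF destination can
be thought of as the sketch below. saturate_f_imm_to_df() is a hypothetical
helper, not the compiler's immediate-saturation routine, and NaN inputs are
deliberately left out of scope.

```c
/* Clamps a float immediate to [0, 1] and widens it to double, which is what
 * a saturating F -> DF move of a constant would produce.
 */
static double
saturate_f_imm_to_df(float imm)
{
   if (imm < 0.0f)
      return 0.0;
   if (imm > 1.0f)
      return 1.0;
   return (double)imm;
}
```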

Fixes new piglit tests:
 - arb_gpu_shader_fp64/execution/built-in-functions/fs-sign-sat-neg-abs.shader_test
 - arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-sat-neg-abs.shader_test

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>