mesa.git
6 years agonir: Generalize nir_intrinsic_vote_eq
Jason Ekstrand [Tue, 29 Aug 2017 00:33:33 +0000 (17:33 -0700)]
nir: Generalize nir_intrinsic_vote_eq

The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type.  This has two implications:

 1) We need to be able to handle vectors so we switch the vote_eq
    intrinsics to be vectorized intrinsics.

 2) We need to handle floats which have different behavior with respect
    to +-0, NaN, etc. than the integer variant so we need two variants.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agospirv: Add subgroup ballot support
Jason Ekstrand [Tue, 22 Aug 2017 23:53:05 +0000 (16:53 -0700)]
spirv: Add subgroup ballot support

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoi965/fs: Implement basic SPIR-V subgroup intrinsics
Jason Ekstrand [Tue, 22 Aug 2017 05:17:37 +0000 (22:17 -0700)]
i965/fs: Implement basic SPIR-V subgroup intrinsics

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agospirv: Add initial subgroup support
Jason Ekstrand [Fri, 28 Apr 2017 11:45:50 +0000 (04:45 -0700)]
spirv: Add initial subgroup support

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir: Add new SPIR-V ballot intrinsics and lowering
Jason Ekstrand [Tue, 9 May 2017 23:44:13 +0000 (16:44 -0700)]
nir: Add new SPIR-V ballot intrinsics and lowering

Someone can make the lowering optional later if they want something
different for their hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agocompiler: Add two new system values for subgroups
Jason Ekstrand [Sat, 30 Sep 2017 21:50:40 +0000 (14:50 -0700)]
compiler: Add two new system values for subgroups

This will be required for SPIR-V subgroup support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir: Add new SPIR-V ballot ALU intrinsics and lowering
Jason Ekstrand [Tue, 3 Oct 2017 01:19:44 +0000 (18:19 -0700)]
nir: Add new SPIR-V ballot ALU intrinsics and lowering

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agospirv: Handle the new OpModuleProcessed instruction
Jason Ekstrand [Wed, 11 Oct 2017 23:29:28 +0000 (16:29 -0700)]
spirv: Handle the new OpModuleProcessed instruction

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER
Jason Ekstrand [Wed, 11 Oct 2017 23:06:13 +0000 (16:06 -0700)]
anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER

From the Vulkan 1.1 spec:

    "Vulkan 1.0 implementations were required to return
    VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0.
    Implementations that support Vulkan 1.1 or later must not return
    VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion."

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv: Implement vkEnumerateInstanceVersion
Jason Ekstrand [Thu, 12 Oct 2017 01:09:32 +0000 (18:09 -0700)]
anv: Implement vkEnumerateInstanceVersion

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/device: fail to initialize device if we have queues with unsupported flags
Iago Toral Quiroga [Tue, 6 Feb 2018 09:37:16 +0000 (10:37 +0100)]
anv/device: fail to initialize device if we have queues with unsupported flags

This is not strictly necessary since users should not be requesting any
flags that are not valid for the list of enabled features requested and
we already fail if they attempt to use an unsupported feature, however
it is an easy to implement sanity check that would help developes realize
that they are doing things wrong, so we might as well do it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoanv/device: GetDeviceQueue2 should only return queues with matching flags
Iago Toral Quiroga [Tue, 6 Feb 2018 09:06:30 +0000 (10:06 +0100)]
anv/device: GetDeviceQueue2 should only return queues with matching flags

From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure:

   "The queue returned by vkGetDeviceQueue2 must have the same flags value
    from this structure as that used at device creation time in a
    VkDeviceQueueCreateInfo instance. If no matching flags were specified
    at device creation time then pQueue will return VK_NULL_HANDLE."

For us this means no flags at all since we don't support any.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoanv: Support querying for protected memory
Jason Ekstrand [Fri, 22 Sep 2017 17:03:18 +0000 (10:03 -0700)]
anv: Support querying for protected memory

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agoanv: Implement GetDeviceQueue2
Jason Ekstrand [Fri, 6 Oct 2017 02:29:27 +0000 (19:29 -0700)]
anv: Implement GetDeviceQueue2

This belongs to the protected memory feature but there's nothing about
it that's specific to protected memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agoanv: Trivially implement VK_KHR_device_group
Jason Ekstrand [Thu, 21 Sep 2017 20:54:55 +0000 (13:54 -0700)]
anv: Trivially implement VK_KHR_device_group

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoanv: Implement vkCmdDispatchBase
Jason Ekstrand [Tue, 3 Oct 2017 22:23:07 +0000 (15:23 -0700)]
anv: Implement vkCmdDispatchBase

This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch.  The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agonir/spirv: Add support for device groups
Jason Ekstrand [Thu, 21 Sep 2017 22:51:55 +0000 (15:51 -0700)]
nir/spirv: Add support for device groups

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv: Implement VK_KHR_maintenance3
Jason Ekstrand [Thu, 5 Oct 2017 23:03:29 +0000 (16:03 -0700)]
anv: Implement VK_KHR_maintenance3

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv: Support VkPhysicalDeviceShaderDrawParameterFeatures
Jason Ekstrand [Fri, 13 Oct 2017 18:03:07 +0000 (11:03 -0700)]
anv: Support VkPhysicalDeviceShaderDrawParameterFeatures

This advertises the VK_KHR_shader_draw_parameters functionality as a
"core optimal feature" in Vulkan 1.1.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints: Drop support for protect attributes
Jason Ekstrand [Tue, 17 Oct 2017 04:48:11 +0000 (21:48 -0700)]
anv/entrypoints: Drop support for protect attributes

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoGet rid of a bunch of KHR suffixes
Jason Ekstrand [Wed, 20 Sep 2017 20:16:26 +0000 (13:16 -0700)]
Get rid of a bunch of KHR suffixes

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv: Add version 1.1.0 but leave it disabled
Jason Ekstrand [Wed, 20 Sep 2017 19:18:10 +0000 (12:18 -0700)]
anv: Add version 1.1.0 but leave it disabled

This requires us to rename any Vulkan API entrypoints which became core
in 1.1 to no longer have the KHR suffix.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agospirv: Update the SPIR-V headers and json to 1.3.1
Jason Ekstrand [Mon, 21 Aug 2017 23:15:36 +0000 (16:15 -0700)]
spirv: Update the SPIR-V headers and json to 1.3.1

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agovulkan: Update the XML and headers to 1.1.70
Jason Ekstrand [Tue, 19 Sep 2017 20:04:13 +0000 (13:04 -0700)]
vulkan: Update the XML and headers to 1.1.70

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agovulkan/enum_to_str: Add support for aliases and new Vulkan versions
Jason Ekstrand [Thu, 21 Sep 2017 15:26:06 +0000 (08:26 -0700)]
vulkan/enum_to_str: Add support for aliases and new Vulkan versions

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agovulkan/enum_to_str: Add a add_value_from_xml helper to VkEnum
Jason Ekstrand [Wed, 24 Jan 2018 03:43:00 +0000 (19:43 -0800)]
vulkan/enum_to_str: Add a add_value_from_xml helper to VkEnum

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints: Generate #ifdef guards from platform attributes
Jason Ekstrand [Tue, 17 Oct 2017 04:46:55 +0000 (21:46 -0700)]
anv/entrypoints: Generate #ifdef guards from platform attributes

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/extensions: Add support for multiple API versions
Jason Ekstrand [Fri, 22 Sep 2017 14:36:39 +0000 (07:36 -0700)]
anv/extensions: Add support for multiple API versions

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints_gen: Add support for aliases in the XML
Jason Ekstrand [Tue, 19 Sep 2017 21:44:26 +0000 (14:44 -0700)]
anv/entrypoints_gen: Add support for aliases in the XML

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints: Allow an entrypoint to require multiple extensions
Jason Ekstrand [Wed, 24 Jan 2018 03:18:08 +0000 (19:18 -0800)]
anv/entrypoints: Allow an entrypoint to require multiple extensions

In this case, we say an entrypoint is supported if ANY of the extensions
is supported.  This is because, in the XML, entrypoints don't require
extensions so much as extensions require entrypoints.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints: Add an is_device_entrypoint helper
Jason Ekstrand [Wed, 24 Jan 2018 03:15:27 +0000 (19:15 -0800)]
anv/entrypoints: Add an is_device_entrypoint helper

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints_gen: Allow the string map to grow
Jason Ekstrand [Wed, 20 Sep 2017 19:38:12 +0000 (12:38 -0700)]
anv/entrypoints_gen: Allow the string map to grow

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints_gen: A bit of refactoring
Jason Ekstrand [Wed, 20 Sep 2017 16:41:50 +0000 (09:41 -0700)]
anv/entrypoints_gen: A bit of refactoring

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/entrypoints: Generalize the string map a bit
Jason Ekstrand [Wed, 20 Sep 2017 15:25:05 +0000 (08:25 -0700)]
anv/entrypoints: Generalize the string map a bit

The original string map assumed that the mapping from strings to
entrypoints was a bijection.  This will not be true the moment we
add entrypoint aliasing.  This reworks things to be an arbitrary map
from strings to non-negative signed integers.  The old one also had a
potential bug if we ever had a hash collision because it didn't do the
strcmp inside the lookup loop.  While we're at it, we break things out
into a helpful class.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agovulkan: Rename multiview from KHX to KHR
Jason Ekstrand [Tue, 19 Sep 2017 20:49:05 +0000 (13:49 -0700)]
vulkan: Rename multiview from KHX to KHR

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agospirv: Rework barriers
Jason Ekstrand [Wed, 23 Aug 2017 05:01:42 +0000 (22:01 -0700)]
spirv: Rework barriers

Our previous handling of barriers always used the big hammer and didn't
correctly emit memory barriers when specified along with a control
barrier.  This commit completely reworks the way we emit barriers to
make things both more precise and more correct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agospirv: Add a vtn_constant_value helper
Jason Ekstrand [Wed, 23 Aug 2017 05:16:01 +0000 (22:16 -0700)]
spirv: Add a vtn_constant_value helper

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoradeonsi: remove si_llvm_add_attribute
Marek Olšák [Sat, 24 Feb 2018 23:39:44 +0000 (00:39 +0100)]
radeonsi: remove si_llvm_add_attribute

6 years agoradeonsi: fix passing address32_hi to LLVM for high values
Marek Olšák [Sat, 24 Feb 2018 23:33:28 +0000 (00:33 +0100)]
radeonsi: fix passing address32_hi to LLVM for high values

The old function treats high values as negative, which LLVM interprets as 0.

6 years agoradeonsi: assume has_virtual_memory == true
Marek Olšák [Wed, 21 Feb 2018 22:07:05 +0000 (23:07 +0100)]
radeonsi: assume has_virtual_memory == true

6 years agoradeonsi: add/update assertions for 32-bit address space
Marek Olšák [Thu, 22 Feb 2018 16:13:51 +0000 (17:13 +0100)]
radeonsi: add/update assertions for 32-bit address space

6 years agoradeonsi: prevent a negative buffer offset in si_upload_descriptors
Marek Olšák [Thu, 22 Feb 2018 19:21:42 +0000 (20:21 +0100)]
radeonsi: prevent a negative buffer offset in si_upload_descriptors

6 years agoradeonsi: properly extract a buffer address from a descriptor
Marek Olšák [Wed, 21 Feb 2018 23:28:39 +0000 (00:28 +0100)]
radeonsi: properly extract a buffer address from a descriptor

6 years agoradeonsi: fix vertex buffer address computation with full 64-bit addresses
Marek Olšák [Wed, 21 Feb 2018 22:33:38 +0000 (23:33 +0100)]
radeonsi: fix vertex buffer address computation with full 64-bit addresses

6 years agoradeonsi: mask out high VM address bits in registers where needed
Marek Olšák [Wed, 21 Feb 2018 22:30:41 +0000 (23:30 +0100)]
radeonsi: mask out high VM address bits in registers where needed

6 years agoradv: Add entrypoints generation with the new vk.xml
Bas Nieuwenhuizen [Tue, 6 Mar 2018 00:08:43 +0000 (01:08 +0100)]
radv: Add entrypoints generation with the new vk.xml

A lot of it is based on intel again.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoglsl: Fix memory leak with known glsl_type instances
Simon Hausmann [Wed, 14 Feb 2018 11:51:11 +0000 (12:51 +0100)]
glsl: Fix memory leak with known glsl_type instances

When looking up known glsl_type instances in the various hash tables, we
end up leaking the key instances used for the lookup, as the glsl_type
constructor allocates memory on the global mem_ctx. This patch changes
glsl_type to manage its own memory, which fixes the leak and also allows
getting rid of the global mem_ctx and its mutex.

v2: remove lambda usage (Tapani)
    (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Simon Hausmann <simon.hausmann@qt.io>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agospirv: Add SpvCapabilityShaderViewportIndexLayerEXT
Caio Marcelo de Oliveira Filho [Mon, 5 Mar 2018 21:58:11 +0000 (13:58 -0800)]
spirv: Add SpvCapabilityShaderViewportIndexLayerEXT

This capability allows gl_ViewportIndex and gl_Layer to also be used
as outputs in Vertex and Tesselation shaders.

v2: Make conditional to the capability, add gl_Layer, add tesselation
    shaders. (Iago)

v3: Don't export to tesselation control shader.

v4: Add Reviewd-by tag.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoandroid: anv: add libmesa_intel_dev static dependency
Mauro Rossi [Tue, 6 Mar 2018 23:15:15 +0000 (00:15 +0100)]
android: anv: add libmesa_intel_dev static dependency

Fixes the following building errors:

external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override'
external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name'
external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 272bef0601a "intel: Split gen_device_info out into libintel_dev"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
6 years agoRevert "nir: bump loop unroll limit to 96."
Timothy Arceri [Tue, 6 Mar 2018 03:48:07 +0000 (14:48 +1100)]
Revert "nir: bump loop unroll limit to 96."

This reverts commit 2d36efdb7f18f061c519dbb93f6058bf161aad33.

This raised limit turns out to harmful for more complex shaders,
it causes excessive spilling in some Bioshock Infinite shaders.

The fps for the ssao demo on radv remains unchanged when reverting
this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoac/nir: don't put lod into args if it's zero.
Dave Airlie [Wed, 7 Mar 2018 03:24:25 +0000 (03:24 +0000)]
ac/nir: don't put lod into args if it's zero.

If it's zero but put it in args we still end up consuming a
register for it.

This fixes some spilling in the NIR paths in Dirt Rally that
isn't seen with TGSI.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agofreedreno: bump required libdrm version
Christian Gmeiner [Tue, 6 Mar 2018 09:34:08 +0000 (10:34 +0100)]
freedreno: bump required libdrm version

Fixes: 26a9321d0a "freedreno: add global_bindings state"
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years agonir: Simplify some comparisons like a+b < a
Ian Romanick [Tue, 13 Feb 2018 02:58:53 +0000 (18:58 -0800)]
nir: Simplify some comparisons like a+b < a

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14514555 -> 14514547 (<.01%)
instructions in affected programs: 1972 -> 1964 (-0.41%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.41% -0.40%
Instructions are helped.

total cycles in shared programs: 533141444 -> 533136780 (<.01%)
cycles in affected programs: 164728 -> 160064 (-2.83%)
helped: 181
HURT: 3
helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30
helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80%
HURT stats (abs)   min: 4 max: 54 x̄: 24.00 x̃: 14
HURT stats (rel)   min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68%
95% mean confidence interval for cycles value: -27.12 -23.58
95% mean confidence interval for cycles %-change: -3.54% -3.16%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533667 -> 10533539 (<.01%)
instructions in affected programs: 10148 -> 10020 (-1.26%)
helped: 124
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1
helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04%
95% mean confidence interval for instructions value: -1.06 -1.00
95% mean confidence interval for instructions %-change: -2.46% -1.95%
Instructions are helped.

total cycles in shared programs: 146136887 -> 146132122 (<.01%)
cycles in affected programs: 206382 -> 201617 (-2.31%)
helped: 171
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30
helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67%
95% mean confidence interval for cycles value: -29.19 -26.54
95% mean confidence interval for cycles %-change: -3.20% -2.76%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886515 -> 7886507 (<.01%)
instructions in affected programs: 3016 -> 3008 (-0.27%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.26%
Instructions are helped.

total cycles in shared programs: 178100396 -> 178100388 (<.01%)
cycles in affected programs: 156128 -> 156120 (<.01%)
helped: 4
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -3.68 1.68
95% mean confidence interval for cycles %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857872 -> 4857868 (<.01%)
instructions in affected programs: 1544 -> 1540 (-0.26%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.28% -0.24%
Instructions are helped.

total cycles in shared programs: 122167654 -> 122167662 (<.01%)
cycles in affected programs: 96248 -> 96256 (<.01%)
helped: 0
HURT: 4
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: <.01% 0.02%
Cycles are HURT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agonir: Use De Morgan's Law on logic compounded comparisons
Ian Romanick [Wed, 7 Feb 2018 01:27:53 +0000 (17:27 -0800)]
nir: Use De Morgan's Law on logic compounded comparisons

The replacement of the comparison operators must happen during this
step.  If it does not, the next pass of nir_opt_algebraic will reapply
De Morgan's Law in the "opposite direction" before performing dead code
elimination.  The resulting infinite loop will eventually get OOM
killed.

Haswell, Broadwell, and Skylake had similar results. (Broadwell shown)
total instructions in shared programs: 14808185 -> 14808036 (<.01%)
instructions in affected programs: 13758 -> 13609 (-1.08%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3
helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01%
95% mean confidence interval for instructions value: -4.67 -2.97
95% mean confidence interval for instructions %-change: -1.09% -0.88%
Instructions are helped.

total cycles in shared programs: 559438333 -> 559435832 (<.01%)
cycles in affected programs: 199160 -> 196659 (-1.26%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51
helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40%
HURT stats (abs)   min: 2 max: 40 x̄: 27.33 x̃: 40
HURT stats (rel)   min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74%
95% mean confidence interval for cycles value: -71.47 -39.69
95% mean confidence interval for cycles %-change: -1.64% -0.93%
Cycles are helped.

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11811776 -> 11811553 (<.01%)
instructions in affected programs: 15201 -> 14978 (-1.47%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6
helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26%
95% mean confidence interval for instructions value: -7.21 -4.23
95% mean confidence interval for instructions %-change: -1.48% -1.12%
Instructions are helped.

total cycles in shared programs: 257617270 -> 257614589 (<.01%)
cycles in affected programs: 212107 -> 209426 (-1.26%)
helped: 45
HURT: 0
helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54
helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32%
95% mean confidence interval for cycles value: -74.02 -45.14
95% mean confidence interval for cycles %-change: -1.59% -1.01%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886648 -> 7886515 (<.01%)
instructions in affected programs: 14106 -> 13973 (-0.94%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4
helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81%
95% mean confidence interval for instructions value: -5.65 -3.52
95% mean confidence interval for instructions %-change: -1.03% -0.76%
Instructions are helped.

total cycles in shared programs: 178100812 -> 178100396 (<.01%)
cycles in affected programs: 67970 -> 67554 (-0.61%)
helped: 29
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54%
95% mean confidence interval for cycles value: -18.30 -10.39
95% mean confidence interval for cycles %-change: -0.71% -0.45%
Cycles are helped.

GM45
total instructions in shared programs: 4857939 -> 4857872 (<.01%)
instructions in affected programs: 7426 -> 7359 (-0.90%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4
helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77%
95% mean confidence interval for instructions value: -6.06 -2.87
95% mean confidence interval for instructions %-change: -1.06% -0.67%
Instructions are helped.

total cycles in shared programs: 122167930 -> 122167654 (<.01%)
cycles in affected programs: 43118 -> 42842 (-0.64%)
helped: 15
HURT: 0
helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54%
95% mean confidence interval for cycles value: -25.03 -11.77
95% mean confidence interval for cycles %-change: -0.82% -0.41%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agonir: Replace fmin(b2f(a), b) with a bcsel
Ian Romanick [Sat, 3 Feb 2018 01:39:54 +0000 (17:39 -0800)]
nir: Replace fmin(b2f(a), b) with a bcsel

All of the affected shaders are HDR mappers from Serious Sam 3.

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516285 -> 14516273 (<.01%)
instructions in affected programs: 348 -> 336 (-3.45%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -5.55% -3.06%
Instructions are helped.

total cycles in shared programs: 533163876 -> 533163808 (<.01%)
cycles in affected programs: 1144 -> 1076 (-5.94%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for cycles value: -18.84 -15.16
95% mean confidence interval for cycles %-change: -6.20% -5.68%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533321 -> 10533309 (<.01%)
instructions in affected programs: 372 -> 360 (-3.23%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -4.96% -2.86%
Instructions are helped.

total cycles in shared programs: 146136632 -> 146136428 (<.01%)
cycles in affected programs: 11668 -> 11464 (-1.75%)
helped: 12
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29%
95% mean confidence interval for cycles value: -17.66 -16.34
95% mean confidence interval for cycles %-change: -2.82% -1.58%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886301 -> 7886277 (<.01%)
instructions in affected programs: 576 -> 552 (-4.17%)
helped: 12
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.30% -3.72%
Instructions are helped.

total cycles in shared programs: 178113176 -> 178113176 (0.00%)
cycles in affected programs: 2116 -> 2116 (0.00%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for cycles value: -3.25 3.25
95% mean confidence interval for cycles %-change: -0.93% 0.94%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857756 -> 4857744 (<.01%)
instructions in affected programs: 294 -> 282 (-4.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.71% -3.09%
Instructions are helped.

total cycles in shared programs: 122178730 -> 122178722 (<.01%)
cycles in affected programs: 700 -> 692 (-1.14%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agonir: Pull b2f out of bcsel
Ian Romanick [Thu, 1 Feb 2018 23:33:04 +0000 (15:33 -0800)]
nir: Pull b2f out of bcsel

All platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516592 -> 14516586 (<.01%)
instructions in affected programs: 500 -> 494 (-1.20%)
helped: 2
HURT: 0

total cycles in shared programs: 533167044 -> 533166998 (<.01%)
cycles in affected programs: 6988 -> 6942 (-0.66%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agonir: Replace an odd comparison involving fmin of -b2f
Ian Romanick [Wed, 31 Jan 2018 19:11:02 +0000 (11:11 -0800)]
nir: Replace an odd comparison involving fmin of -b2f

I noticed the fge version while looking at a shader for an unrelated
reason.  The feq version prevents a regression in a later change that
performs strength reduction of some compares.

Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514808 -> 14514796 (<.01%)
instructions in affected programs: 750 -> 738 (-1.60%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.43% -0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533144939 -> 533144853 (<.01%)
cycles in affected programs: 8911 -> 8825 (-0.97%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19
helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31%
95% mean confidence interval for cycles value: -32.94 -10.06
95% mean confidence interval for cycles %-change: -2.30% -0.26%
Cycles are helped.

Haswell
total instructions in shared programs: 13093785 -> 13093775 (<.01%)
instructions in affected programs: 924 -> 914 (-1.08%)
helped: 4
HURT: 2
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.53 1.20
95% mean confidence interval for instructions %-change: -2.02% 0.97%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 409580553 -> 409580118 (<.01%)
cycles in affected programs: 10909 -> 10474 (-3.99%)
helped: 5
HURT: 1
helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18
helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78%
HURT stats (abs)   min: 13 max: 13 x̄: 13.00 x̃: 13
HURT stats (rel)   min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39%
95% mean confidence interval for cycles value: -180.68 35.68
95% mean confidence interval for cycles %-change: -19.55% 3.79%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 11811851 -> 11811840 (<.01%)
instructions in affected programs: 1032 -> 1021 (-1.07%)
helped: 5
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1
helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.17 0.51
95% mean confidence interval for instructions %-change: -1.86% 0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 257618403 -> 257618168 (<.01%)
cycles in affected programs: 10784 -> 10549 (-2.18%)
helped: 4
HURT: 2
helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17
helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72%
HURT stats (abs)   min: 9 max: 14 x̄: 11.50 x̃: 11
HURT stats (rel)   min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33%
95% mean confidence interval for cycles value: -133.11 54.78
95% mean confidence interval for cycles %-change: -14.79% 5.59%
Inconclusive result (value mean confidence interval includes 0).

GM45, Iron Lake, and Sandy Bridge had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10533871 -> 10533859 (<.01%)
instructions in affected programs: 865 -> 853 (-1.39%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.16% -0.29%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 146139904 -> 146139852 (<.01%)
cycles in affected programs: 15213 -> 15161 (-0.34%)
helped: 4
HURT: 0
helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15
helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29%
95% mean confidence interval for cycles value: -23.79 -2.21
95% mean confidence interval for cycles %-change: -0.88% 0.09%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agonir: Mark bcsel-to-fmin (or fmax) transformations as inexact
Ian Romanick [Mon, 26 Feb 2018 22:49:47 +0000 (14:49 -0800)]
nir: Mark bcsel-to-fmin (or fmax) transformations as inexact

These transformations are inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agonir: Recognize some more open-coded fmin / fmax
Ian Romanick [Mon, 22 Jan 2018 09:29:48 +0000 (17:29 +0800)]
nir: Recognize some more open-coded fmin / fmax

This transformation is inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.

v2: Reorder operands and mark as inexact.  The latter suggested by
Jason.

shader-db results:

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514817 -> 14514808 (<.01%)
instructions in affected programs: 229 -> 220 (-3.93%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%

total cycles in shared programs: 533145211 -> 533144939 (<.01%)
cycles in affected programs: 37268 -> 36996 (-0.73%)
helped: 8
HURT: 0
helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total cycles in shared programs: 257618409 -> 257618403 (<.01%)
cycles in affected programs: 12582 -> 12576 (-0.05%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agost/omx/tizonia/h264d: Add EGLImage support
Gurkirpal Singh [Sat, 20 Jan 2018 01:57:07 +0000 (07:27 +0530)]
st/omx/tizonia/h264d: Add EGLImage support

Example Gstreamer pipeline :
MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
6 years agost/omx/tizonia: Add H.264 encoder
Gurkirpal Singh [Sat, 20 Jan 2018 01:52:33 +0000 (07:22 +0530)]
st/omx/tizonia: Add H.264 encoder

v2: Refactor out screen functions to st/omx

Example Gstreamer pipeline :
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
6 years agost/omx/tizonia: Add H.264 decoder
Gurkirpal Singh [Sat, 20 Jan 2018 00:44:17 +0000 (06:14 +0530)]
st/omx/tizonia: Add H.264 decoder

v2: Refactor out screen functions to st/omx

Example Gstreamer pipeline :
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
6 years agost/omx/tizonia: Add entrypoint
Gurkirpal Singh [Sat, 20 Jan 2018 00:36:59 +0000 (06:06 +0530)]
st/omx/tizonia: Add entrypoint

Adds base files for adding components

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
6 years agost/omx/tizonia: Add --enable-omx-tizonia flag and build files
Gurkirpal Singh [Sat, 20 Jan 2018 00:07:53 +0000 (05:37 +0530)]
st/omx/tizonia: Add --enable-omx-tizonia flag and build files

Allow only bellagio or tizonia to be used at the same time.
Detect tizonia package config file
Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir
Only compile empty source (target.c) for now.

link: https://summerofcode.withgoogle.com/projects/#4737166321123328
Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
6 years agost/omx/bellagio: Rename st and target directories
Gurkirpal Singh [Fri, 19 Jan 2018 23:42:06 +0000 (05:12 +0530)]
st/omx/bellagio: Rename st and target directories

v2: Refactor out screen functions to st/omx

Allows to keep all the code under st/omx (st/omx/tizonia and
st/omx/bellagio).
Reverts targets/omx_bellagio to omx as additions to existing files
is enough to compile for both bellagio and tizonia.

* autotools changes:
  --enable-omx -> --enable-omx-bellagio

* meson changes:
  -Dgallium-omx=false -> -Dgallium-omx=disabled
  -Dgallium-omx=true  -> -Dgallium-omx=bellagio

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
6 years agoradv: report the scratch private memory size with shader stats
Samuel Pitoiset [Thu, 1 Mar 2018 21:12:56 +0000 (22:12 +0100)]
radv: report the scratch private memory size with shader stats

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac/nir: count the scratch private memory size
Samuel Pitoiset [Thu, 1 Mar 2018 21:12:55 +0000 (22:12 +0100)]
ac/nir: count the scratch private memory size

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac: add ac_count_scratch_private_memory()
Samuel Pitoiset [Thu, 1 Mar 2018 21:12:54 +0000 (22:12 +0100)]
ac: add ac_count_scratch_private_memory()

Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac/nir: only enable used channels when exporting parameters
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:22 +0000 (11:54 +0100)]
ac/nir: only enable used channels when exporting parameters

This allows us to generate, for example,
"exp param0 v0, off, off, off" if only the first channel is needed.

Not sure if this improves performance but it's worth trying.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac: update enabled channels mask when optimizing PARAM exports
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:21 +0000 (11:54 +0100)]
ac: update enabled channels mask when optimizing PARAM exports

When the mask is not 0xf we need to update the number of
enabled channels, otherwise the hardware won't emit the
components that are combined.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac/nir: pass the number of enabled channels to si_llvm_init_export_args()
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:20 +0000 (11:54 +0100)]
ac/nir: pass the number of enabled channels to si_llvm_init_export_args()

Currently, it's always 0xf but an upcoming patch will reduce the
number of channels for parameters export.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac/shader: scan output usage mask for VS and TES
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:19 +0000 (11:54 +0100)]
ac/shader: scan output usage mask for VS and TES

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agointel: Add missing includes for building on Android
Clayton Craft [Tue, 6 Mar 2018 01:00:05 +0000 (17:00 -0800)]
intel: Add missing includes for building on Android

This adds a missing library to the i965/Android.mk file, and updates
intel/Android.mk to include the new library. Without this, mesa does not
build on Android.

Fixes: 272bef0601a "intel: Split gen_device_info out into
libintel_dev"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agovulkan: do not expose surface/swapchain extensions on Android
Tapani Pälli [Mon, 5 Mar 2018 16:46:20 +0000 (18:46 +0200)]
vulkan: do not expose surface/swapchain extensions on Android

On Android surface/swapchain extensions are implemented by the loader. Patch
modifies both anv and radv extension scripts disabling currently exposed
ones. See also earlier commit 9f763c1f9b.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoanv: Don't expose VK_KHX_multiview on android.
Tapani Pälli [Mon, 5 Mar 2018 08:57:29 +0000 (10:57 +0200)]
anv: Don't expose VK_KHX_multiview on android.

Just like commit 2ffe395 does for radv.

Fixes following dEQP test on i965:
   dEQP-VK.api.info.android.no_unknown_extensions

v2: make it !ANDROID since this extension is not about
    surfaces/swapchain

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agogallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128
Roland Scheidegger [Tue, 27 Feb 2018 02:38:17 +0000 (03:38 +0100)]
gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128

Some state trackers require 128.
(There are no plans to increase PIPE_MAX_SAMPLERS too, since with gl
state tracker it's unlikely more than 32 will be needed, if you need
more use bindless.)

6 years agotgsi/scan: use wrap-around shift behavior explicitly for file_mask
Roland Scheidegger [Fri, 2 Mar 2018 02:00:41 +0000 (03:00 +0100)]
tgsi/scan: use wrap-around shift behavior explicitly for file_mask

The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
noone uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Albeit it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agoclover: Allow overriding platform/device version numbers
Aaron Watry [Thu, 10 Aug 2017 03:02:30 +0000 (22:02 -0500)]
clover: Allow overriding platform/device version numbers

Useful for testing API, builtin library, and device completeness of
not-yet-supported versions.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
v4: Remove redundant std::string wrapper around debug_get_option calls
v3: mark CL version overrides as static and const
v2: Make version_string in platform const in case

6 years agoclover/llvm: Pass device down to compile
Aaron Watry [Sun, 6 Aug 2017 01:41:40 +0000 (20:41 -0500)]
clover/llvm: Pass device down to compile

We'll need to be able to detect device version to define the appropriate
__OPENCL_VERSION__ header.

v2: Rebase after removing the previous patch (Pierre)
  - Removed "clover: Add device_clc_version to llvm::create_compiler_instance"

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
6 years agoclover: Pass device to llvm::create_compiler_instance
Aaron Watry [Sun, 6 Aug 2017 01:18:48 +0000 (20:18 -0500)]
clover: Pass device to llvm::create_compiler_instance

We'll be using dev.device_clc_version to select the default language version
soon along with the existing ir_target field.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
v4: Pass the device down instead of device_clc_version as a separate field
v3: Revise to acknowledge that we now have the device in compile/link_program
    instead of the string values.
v2: (Pierre) Move changes to create_compiler_instance invocation to correct
    patch to prevent temporary build breakage.
    (Jan) Use device_clc_version instead of device_version for compile/link

6 years agoclover/llvm: Use device in llvm compilation instead of copying fields
Aaron Watry [Sat, 10 Feb 2018 20:03:13 +0000 (14:03 -0600)]
clover/llvm: Use device in llvm compilation instead of copying fields

Copying the individual fields from the device when compiling/linking
will lead to an unnecessarily large number of fields getting passed
around.

v3: Rebase on current master
v2: Use device in function args before making additional changes in
    following patches

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
6 years agoradeonsi/nir: fix handling of doubles for gs inputs
Timothy Arceri [Thu, 1 Mar 2018 04:21:52 +0000 (15:21 +1100)]
radeonsi/nir: fix handling of doubles for gs inputs

Fixes piglit test:
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac: pass the unmodified number of components to load gs inputs
Timothy Arceri [Thu, 1 Mar 2018 04:37:25 +0000 (15:37 +1100)]
ac: pass the unmodified number of components to load gs inputs

Currently both users of this would overflow an array when the
input was a dual slot double as they expected the number of
components to be a max of 4.

Since we pass the type we can just let the functions handle
doubles in a way they choose.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradeonsi: move si_nir_load_input_gs() to si_shader.c
Timothy Arceri [Thu, 1 Mar 2018 04:17:34 +0000 (15:17 +1100)]
radeonsi: move si_nir_load_input_gs() to si_shader.c

All the tess shader and tgsi equivalents are here and it allows
use to use llvm_type_is_64bit() in the following patch without
exposing it externally.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agobroadcom/vc4: Add support for HW perfmon
Boris Brezillon [Thu, 11 Jan 2018 09:22:04 +0000 (10:22 +0100)]
broadcom/vc4: Add support for HW perfmon

The V3D engine provides several perf counters.
Implement ->get_driver_query_[group_]info() so that these counters are
exposed through the GL_AMD_performance_monitor extension.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
6 years agodrm-uapi: Update vc4 header with perfmon related definitions
Boris Brezillon [Thu, 11 Jan 2018 09:22:03 +0000 (10:22 +0100)]
drm-uapi: Update vc4 header with perfmon related definitions

v2: Update to the final version with the documentation.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
6 years agor600: fix color export mask
Roland Scheidegger [Mon, 5 Mar 2018 19:12:32 +0000 (20:12 +0100)]
r600: fix color export mask

The r600 code (not the eg one) forgot to copy the ps_color_export_mask
in commit 5b14e06d8b42e2b08ebc52b6c314ef8647d87a1f when updating the
pixel state, leading to misrenderings (probably with MRT).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262

Tested-by: LoneVVolf <lonewolf@xs4all.nl>
Tested-by: Pavel Vinogradov <public@sourcemage.org>
6 years agotravis: keep meson version below 0.45.0
Andres Gomez [Mon, 5 Mar 2018 15:25:36 +0000 (17:25 +0200)]
travis: keep meson version below 0.45.0

Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is
not available in Trusty.

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
6 years agointel: Drop SURFACE_FORMAT enum from genxml.
Kenneth Graunke [Wed, 14 Feb 2018 02:13:51 +0000 (18:13 -0800)]
intel: Drop SURFACE_FORMAT enum from genxml.

We want people to be using ISL_FORMAT_*, rather than the genxml format
enumerations. This patch drops 10 separate copies, and drops a bunch
of ugly casting.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Minor changes for rebase]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agointel/common: Use isl for decoder surface formats
Jordan Justen [Tue, 27 Feb 2018 04:31:22 +0000 (20:31 -0800)]
intel/common: Use isl for decoder surface formats

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agointel/isl: Add isl_format_is_valid
Jordan Justen [Tue, 27 Feb 2018 01:57:19 +0000 (17:57 -0800)]
intel/isl: Add isl_format_is_valid

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agointel: Split gen_device_info out into libintel_dev
Jordan Justen [Mon, 26 Feb 2018 23:39:59 +0000 (15:39 -0800)]
intel: Split gen_device_info out into libintel_dev

Split out the device info so isl doesn't depend on intel/common. Now
it will depend on the new intel/dev device info lib.

This will allow the decoder in intel/common to use isl, allowing us to
apply Ken's patch that removes the genxml duplication of surface
formats.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agogallium/aux/hud: Avoid possible buffer overflow
Gert Wollny [Wed, 28 Feb 2018 13:50:21 +0000 (14:50 +0100)]
gallium/aux/hud: Avoid possible buffer overflow

Limit the length of acceptable cpu names for use in hud_get_num_cpufreq
in order to avoid a buffer overflow later in add_object when this name
is copied into cpufreq_info::name.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agogbm: give a name to rgba fields
Eric Engestrom [Wed, 28 Feb 2018 16:08:54 +0000 (16:08 +0000)]
gbm: give a name to rgba fields

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
6 years agoegl: remove duplicated initialization
Andres Gomez [Fri, 2 Mar 2018 11:28:28 +0000 (13:28 +0200)]
egl: remove duplicated initialization

Found by inspection.

The line removed is a duplicate of the line literally just above the
the 3 lines context usually printed in a commit log.

v2: enhance the commit log (Emil).

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agofreedreno/ir3: start dealing with half-precision
Rob Clark [Fri, 2 Mar 2018 15:21:55 +0000 (10:21 -0500)]
freedreno/ir3: start dealing with half-precision

Some instructions, assume src and/or dst is half-precision based on a
type field (ie. f32/s32/u32 are full precision but others are half
precision).  So add some code to sanity check the src/dst registers to
catch mixups.

Also propagate half-precision flag for SSA sources.  The instruction
consuming a SSA value needs to be of the same type as the one producing
it.

This is probably not complete half-precision support, but a useful first
step.  We do still need to add support for nir alu instructions for
converting between half/full precision.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: fix fixing-up register footprint
Rob Clark [Wed, 28 Feb 2018 22:33:29 +0000 (17:33 -0500)]
freedreno/ir3: fix fixing-up register footprint

It isn't just vertex shaders that need to fixup reg footprint for inputs
populated before shader starts.

This problem showed up with compute shaders.  If you have (for example)
a localregid sysval, but only the .x component is used, the hw still
writes the .yz components, which could overflow into other threads
causing corruption.  Showed up in cl cts 'basic/test_basic intmath_int'.
But in theory the same problem could crop up elsewhere.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: surfaces can be PIPE_BUFFER
Rob Clark [Tue, 27 Feb 2018 17:59:57 +0000 (12:59 -0500)]
freedreno: surfaces can be PIPE_BUFFER

At least for clover.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/a5xx: handle compute resources
Rob Clark [Mon, 26 Feb 2018 18:38:22 +0000 (13:38 -0500)]
freedreno/a5xx: handle compute resources

Not *entirely* sure why this is a different BIND bit, but it is.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: ignore return jump
Rob Clark [Mon, 26 Feb 2018 18:04:21 +0000 (13:04 -0500)]
freedreno/ir3: ignore return jump

I think this should also always only occur at the end of a BB (by
definition), and the BB successor should be the end block.

Signed-off-by: Rob Clark <robdclark@gmail.com>