mesa.git
7 years agoradv: Track scratch usage across pipelines & command buffers.
Bas Nieuwenhuizen [Sun, 29 Jan 2017 14:20:03 +0000 (15:20 +0100)]
radv: Track scratch usage across pipelines & command buffers.

Based on code written by Dave Airlie.

Signed-off-by: Bas Nieuwenhuizen <basni@oogle.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv/ac: Add compiler support for spilling.
Bas Nieuwenhuizen [Sat, 28 Jan 2017 22:51:19 +0000 (23:51 +0100)]
radv/ac: Add compiler support for spilling.

Based on code written by Dave Airlie.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv/amdgpu: Support a preamble CS.
Bas Nieuwenhuizen [Thu, 26 Jan 2017 23:19:52 +0000 (00:19 +0100)]
radv/amdgpu: Support a preamble CS.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoi965: add assert to while_jumps_before_offset()
Timothy Arceri [Thu, 26 Jan 2017 02:50:42 +0000 (13:50 +1100)]
i965: add assert to while_jumps_before_offset()

jip should always be negative here as its the result of
do instruction - while instruction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965: fix up asserts in brw_inst_set_jip()
Timothy Arceri [Thu, 26 Jan 2017 02:50:41 +0000 (13:50 +1100)]
i965: fix up asserts in brw_inst_set_jip()

We are casting from a signed 32bit int to an unsigned 16bit int
so shift 15 bits rather than 16.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agollvmpipe: Use LLVMDumpModule, not DumpModule.
Bas Nieuwenhuizen [Sun, 29 Jan 2017 16:03:25 +0000 (17:03 +0100)]
llvmpipe: Use LLVMDumpModule, not DumpModule.

Forgot the prefix ...

Fixes: 0fca80b3db64dc1d004f78e22b9de86a07e9de96
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
7 years agovarious: Fix missing DumpModule with recent LLVM.
Bas Nieuwenhuizen [Sat, 28 Jan 2017 16:32:05 +0000 (17:32 +0100)]
various: Fix missing DumpModule with recent LLVM.

Since LLVM revision 293359 DumpModule gets only implemented when
either a debug build or LLVM_ENABLE_DUMP is set.

This patch adds a direct replacement for the function for radv and
radeonsi, However, as I don't know a good place to put common LLVM
code for all three I inlined the implementation for LLVMPipe.

v2: Use the new code for LLVM 3.4+ instead of LLVM 5+ & fixed indentation

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
7 years agor600g: use ieee variants of multiplication instructions
Ilia Mirkin [Tue, 24 Jan 2017 01:53:50 +0000 (20:53 -0500)]
r600g: use ieee variants of multiplication instructions

This matches the behavior of most other drivers, including nouveau,
radeonsi, and i965.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agor600g: add support for optionally using non-IEEE mul ops
Ilia Mirkin [Tue, 24 Jan 2017 02:02:28 +0000 (21:02 -0500)]
r600g: add support for optionally using non-IEEE mul ops

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agovc4: Coalesce into TLB writes as well as VPM/tex.
Eric Anholt [Wed, 18 Jan 2017 22:31:45 +0000 (09:31 +1100)]
vc4: Coalesce into TLB writes as well as VPM/tex.

This generally cuts an instruction when blending is enabled and we thus
have a single instruction generating the color value.

total instructions in shared programs: 91759 -> 91634 (-0.14%)
instructions in affected programs:     5338 -> 5213 (-2.34%)

7 years agovc4: Avoid an extra temporary and mov in ffloor/ffract/fceil.
Eric Anholt [Wed, 18 Jan 2017 03:23:14 +0000 (14:23 +1100)]
vc4: Avoid an extra temporary and mov in ffloor/ffract/fceil.

shader-db results:

total instructions in shared programs: 92611 -> 91764 (-0.91%)
instructions in affected programs:     27417 -> 26570 (-3.09%)

The star is one shader in glmark2's terrain (drops 16% of its
instructions), but there are also wins in mupen64plus and glb2.7.

7 years agovc4: Flip the switch to run the GLSL compiler optimization loop once.
Eric Anholt [Tue, 17 Jan 2017 23:38:55 +0000 (10:38 +1100)]
vc4: Flip the switch to run the GLSL compiler optimization loop once.

This has almost no effect on shader-db:

total instructions in shared programs: 92572 -> 92611 (0.04%)
instructions in affected programs:     4486 -> 4525 (0.87%)

Looking at 2 of the 7 different shaders that were hurt (all of which were
in mupen64), they all appear to be just differences in order of
instructions at the NIR level.

The advantage is that this should significantly reduce time in the compiler.

7 years agoi965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.
Kenneth Graunke [Wed, 25 Jan 2017 08:59:42 +0000 (00:59 -0800)]
i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.

Applications may delete a shader program, create a new one, and bind it
before the next draw.  With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program.  In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:

      if (brw->vertex_program != ctx->VertexProgram._Current) {
         brw->vertex_program = ctx->VertexProgram._Current;
         brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
      }

Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on.  This causes utter chaos.

As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them.  Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.

The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none.  This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time.  Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...

Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.

Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.

Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoradv/ac: Use base in push constant loads.
Bas Nieuwenhuizen [Sat, 28 Jan 2017 00:32:20 +0000 (01:32 +0100)]
radv/ac: Use base in push constant loads.

Apparently the source is not an address but an offset, so we actually
need to use the base.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
7 years agoradv: drop support for VK_AMD_NEGATIVE_VIEWPORT_HEIGHT
Andres Rodriguez [Fri, 27 Jan 2017 05:09:58 +0000 (00:09 -0500)]
radv: drop support for VK_AMD_NEGATIVE_VIEWPORT_HEIGHT

This extension was not correctly supported, and it conflicts with the
VK_KHR_MAINTENANCE1 spec.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: implement VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2
Dave Airlie [Fri, 11 Nov 2016 02:27:21 +0000 (02:27 +0000)]
radv: implement VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2

Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: use proper maximum slice for layered view
Dave Airlie [Mon, 9 Jan 2017 07:02:17 +0000 (07:02 +0000)]
radv: use proper maximum slice for layered view

this fixes deferred shadows with geom shaders enabled.

but I think this fix is fine by itself.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoi965/sync: Implement fences based on Linux sync_file
Chad Versace [Fri, 13 Jan 2017 18:46:49 +0000 (10:46 -0800)]
i965/sync: Implement fences based on Linux sync_file

This patch implements a new type of struct brw_fence, one that is based
struct sync_file.

This completes support for EGL_ANDROID_native_fence_sync.

* Background

  Linux 4.7 added a new file type, struct sync_file. See

    commit 460bfc41fd52959311ed0328163f785e023857af
    Author:  Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    Date:    Thu Apr 28 10:46:57 2016 -0300
    Subject: dma-buf/sync_file: de-stage sync_file headers

  A sync file is a cross-driver explicit synchronization primitive. In a
  sense, sync_file's relation to synchronization is similar to dma_buf's
  relation to memory: both are primitives that can be imported and
  exported across drivers (at least in theory).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965/sync: Rename brw_fence_insert()
Chad Versace [Fri, 13 Jan 2017 18:46:49 +0000 (10:46 -0800)]
i965/sync: Rename brw_fence_insert()

Rename to brw_fence_insert_locked(). This is correct because the fence's
mutex is effectively locked, as all callers are also *creators* of the
fence, and have not yet returned the new fence.

This reduces noise in the next patch, which defines and uses
brw_fence_insert(), an unlocked variant.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965/sync: Fail sync creation when batchbuffer flush fails
Chad Versace [Fri, 13 Jan 2017 18:46:49 +0000 (10:46 -0800)]
i965/sync: Fail sync creation when batchbuffer flush fails

Pre-patch, brw_sync.c ignored the return value of
intel_batchbuffer_flush().

When intel_batchbuffer_flush() fails during eglCreateSync
(brw_dri_create_fence), we now give up, cleanup, and return NULL.

When it fails during glFenceSync, however, we blindly continue and hope
for the best because there does not exist yet a way to tell core GL that
sync creation failed.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965/sync: Add brw_fence::type
Chad Versace [Fri, 13 Jan 2017 18:46:49 +0000 (10:46 -0800)]
i965/sync: Add brw_fence::type

This a refactor patch; no expected changed in behavior.

Add `enum brw_fence_type` and brw_fence::type. There is only one type
currently, BRW_FENCE_TYPE_BO_WAIT. This patch reduces a lot of noise in
the next, which adds new type BRW_FENCE_TYPE_SYNC_FD.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965: Add intel_batchbuffer_flush_fence()
Chad Versace [Fri, 13 Jan 2017 18:46:49 +0000 (10:46 -0800)]
i965: Add intel_batchbuffer_flush_fence()

A variant of intel_batchbuffer_flush() with parameters for in and out
fence fds.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965: Add intel_screen::has_fence_fd
Chad Versace [Fri, 13 Jan 2017 18:46:48 +0000 (10:46 -0800)]
i965: Add intel_screen::has_fence_fd

This bool maps to I915_PARAM_HAS_EXEC_FENCE_FD.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoconfigure: Require libdrm >= 2.4.75
Chad Versace [Fri, 13 Jan 2017 18:46:49 +0000 (10:46 -0800)]
configure: Require libdrm >= 2.4.75

Required to implement EGL_ANDROID_native_fence_sync on i965.
Specifically, i965 needs drm_intel_gem_bo_exec_fence(),
I915_PARAM_HAS_EXEC_FENCE, and libsync.h.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoconfigure.ac: list radeon in --with-vulkan-drivers help string
Emil Velikov [Fri, 27 Jan 2017 18:29:38 +0000 (18:29 +0000)]
configure.ac: list radeon in --with-vulkan-drivers help string

Analogous to what we do for the dri and gallium drivers.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@colllabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: automake: Don't install vk_platform.h or vulkan.h.
Emil Velikov [Fri, 27 Jan 2017 18:05:13 +0000 (18:05 +0000)]
radv: automake: Don't install vk_platform.h or vulkan.h.

These files belong to the vulkan loader.

Identical to
045f38a5075 vulkan: Don't install vk_platform.h or vulkan.h.

Cc: Dave Airlie <airlied@redhat.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoanv: Advertise API version 1.0.39
Jason Ekstrand [Thu, 26 Jan 2017 04:57:47 +0000 (20:57 -0800)]
anv: Advertise API version 1.0.39

I'm pretty sure we've kept up with the bug fixes.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
7 years agogbm/dri: fix memory leaks in error path
Eric Engestrom [Fri, 27 Jan 2017 17:29:05 +0000 (17:29 +0000)]
gbm/dri: fix memory leaks in error path

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
[Emil Velikov: make sure it builds]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agodocs/releasing: add a note about the relnotes template
Emil Velikov [Thu, 26 Jan 2017 19:26:13 +0000 (19:26 +0000)]
docs/releasing: add a note about the relnotes template

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agomesa: remove explicit __STDC_FORMAT_MACROS define
Emil Velikov [Thu, 26 Jan 2017 13:24:10 +0000 (13:24 +0000)]
mesa: remove explicit __STDC_FORMAT_MACROS define

Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agonouveau: remove explicit __STDC_FORMAT_MACROS define
Emil Velikov [Thu, 26 Jan 2017 13:24:09 +0000 (13:24 +0000)]
nouveau: remove explicit __STDC_FORMAT_MACROS define

Already handled by the build.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agoscons: swr: remove explicit __STDC_.*_MACROS defines
Emil Velikov [Thu, 26 Jan 2017 13:24:08 +0000 (13:24 +0000)]
scons: swr: remove explicit __STDC_.*_MACROS defines

Analogous to previous commits.

Cc: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agogallium: remove explicit __STDC_.*_MACROS defines
Emil Velikov [Thu, 26 Jan 2017 13:24:07 +0000 (13:24 +0000)]
gallium: remove explicit __STDC_.*_MACROS defines

Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agogallivm: remove explicit __STDC_.*_MACROS defines
Emil Velikov [Thu, 26 Jan 2017 13:24:06 +0000 (13:24 +0000)]
gallivm: remove explicit __STDC_.*_MACROS defines

Correctly handled by the build systems.

Cc: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agoglsl: remove explicit __STDC_FORMAT_MACROS define
Emil Velikov [Thu, 26 Jan 2017 13:24:05 +0000 (13:24 +0000)]
glsl: remove explicit __STDC_FORMAT_MACROS define

Correctly handled by all the build systems.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agoautoconf: set all __STDC_*_MACROS
Emil Velikov [Thu, 26 Jan 2017 13:24:04 +0000 (13:24 +0000)]
autoconf: set all __STDC_*_MACROS

Analogous to previous commit(s), with a minor detail - here we set the
macros when building both C and C++ sources.

Resolving that is a more challenging task that we'll sort out another
day.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agoscons: always set __STDC_*_MACROS for C++ sources
Emil Velikov [Thu, 26 Jan 2017 13:24:03 +0000 (13:24 +0000)]
scons: always set __STDC_*_MACROS for C++ sources

Analogous to previous commit - just set the lot once throughout.

Cc: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agoandroid: always set __STDC_*_MACROS for C++ sources
Emil Velikov [Thu, 26 Jan 2017 13:24:02 +0000 (13:24 +0000)]
android: always set __STDC_*_MACROS for C++ sources

Various parts of the code depend on the macros being defined.

Just set those unconditionally, only where needed (c++ sources) so that
we can drop the workarounds through the code.

Cc: Rob Herring <robh@kernel.org>
Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
7 years agost/xa: automake: remove duplicate -Wall
Emil Velikov [Thu, 26 Jan 2017 13:18:43 +0000 (13:18 +0000)]
st/xa: automake: remove duplicate -Wall

Already handled by configure.ac

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agomesa: move variable declaration to where its used
Emil Velikov [Thu, 26 Jan 2017 13:18:41 +0000 (13:18 +0000)]
mesa: move variable declaration to where its used

The variable replacement was unused when building w/o
ENABLE_SHADER_CACHE. Since we can mix variable declarations and code,
move it to where its used.

Fixes: 9f8dc3bf03e "utils: build sha1/disk cache only with
Android/Autoconf"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agost/mesa: use correct return statement for a void function
Emil Velikov [Thu, 26 Jan 2017 13:18:40 +0000 (13:18 +0000)]
st/mesa: use correct return statement for a void function

Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agomesa: use correct return statement for a void function
Emil Velikov [Thu, 26 Jan 2017 13:18:39 +0000 (13:18 +0000)]
mesa: use correct return statement for a void function

Using return foo() is incorrect even if foo itself returns void.
Spotted by AppVeyor, as below:

teximage.c(3653) : warning C4098: 'copyteximage' : 'void' function returning a value

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agosvga: remove const qualifier from SVGA3D_vgpu10_GenMips() prototype
Emil Velikov [Thu, 26 Jan 2017 13:18:38 +0000 (13:18 +0000)]
svga: remove const qualifier from SVGA3D_vgpu10_GenMips() prototype

Does not match the function definition or how it's used. Triggers the
following warning in AppVeyor

svga_cmd_vgpu10.c(1301) : warning C4028: formal parameter 2 different from declaration

Cc: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agonir: add extra const notation in compare_blocks()
Emil Velikov [Thu, 26 Jan 2017 13:18:37 +0000 (13:18 +0000)]
nir: add extra const notation in compare_blocks()

MSVC warns about different const qualifiers. Add the extra const to
silence it.

nir_phi_builder.c(244) : warning C4090: 'initializing' : different 'const' qualifiers
nir_phi_builder.c(245) : warning C4090: 'initializing' : different 'const' qualifiers

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agonir: silence implicit conversion to 64bit
Emil Velikov [Thu, 26 Jan 2017 13:18:36 +0000 (13:18 +0000)]
nir: silence implicit conversion to 64bit

MSVC warns about implicit conversion as below. Annotate the literal
appropriately to silence the warning.

nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoi915, i965: automake: remove NA include directive
Emil Velikov [Mon, 16 Jan 2017 15:45:50 +0000 (15:45 +0000)]
i915, i965: automake: remove NA include directive

The path in question (... dri/intel/server) was removed years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agomesa/tests: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:49 +0000 (15:45 +0000)]
mesa/tests: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agodri/osmesa: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:48 +0000 (15:45 +0000)]
dri/osmesa: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agodri/swrast: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:47 +0000 (15:45 +0000)]
dri/swrast: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoradeon, r200: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:46 +0000 (15:45 +0000)]
radeon, r200: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agomapi: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:45 +0000 (15:45 +0000)]
mapi: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoloader: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:44 +0000 (15:45 +0000)]
loader: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoglx/windows: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:43 +0000 (15:45 +0000)]
glx/windows: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoglx/apple: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:42 +0000 (15:45 +0000)]
glx/apple: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
7 years agoglx: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:41 +0000 (15:45 +0000)]
glx: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agod3dadapter9: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:40 +0000 (15:45 +0000)]
d3dadapter9: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agost/dri: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:39 +0000 (15:45 +0000)]
st/dri: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoclover: automake: remove -I$(srcdir)
Emil Velikov [Mon, 16 Jan 2017 15:45:38 +0000 (15:45 +0000)]
clover: automake: remove -I$(srcdir)

Already implicitly handled by the build system.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Aaron Watry <awatry@gmail.com>
7 years agoclover: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:37 +0000 (15:45 +0000)]
clover: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoegl: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:36 +0000 (15:45 +0000)]
egl: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoi915: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:35 +0000 (15:45 +0000)]
i915: automake: include builddir prior to srcdir

Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoi965: automake: include builddir prior to srcdir
Emil Velikov [Mon, 16 Jan 2017 15:45:34 +0000 (15:45 +0000)]
i965: automake: include builddir prior to srcdir

The latter can contain stale generated file, which, as-is, we'll end up
using.

Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
7 years agofreedreno: automake: correctly set MKDIR_GEN
Emil Velikov [Mon, 16 Jan 2017 15:45:33 +0000 (15:45 +0000)]
freedreno: automake: correctly set MKDIR_GEN

Analogous to previous commit.

Fixes: 4610e5ef28e "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
7 years agoi965: automake: correctly set MKDIR_GEN
Emil Velikov [Mon, 16 Jan 2017 15:45:32 +0000 (15:45 +0000)]
i965: automake: correctly set MKDIR_GEN

Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.

Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoanv: add missing extension errors in vk_errorf()
Eric Engestrom [Thu, 26 Jan 2017 13:48:18 +0000 (13:48 +0000)]
anv: add missing extension errors in vk_errorf()

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: add missing core errors in vk_errorf()
Eric Engestrom [Thu, 26 Jan 2017 13:48:17 +0000 (13:48 +0000)]
anv: add missing core errors in vk_errorf()

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: don't assert on out of memory descriptor pool in debug mode
Lionel Landwerlin [Thu, 26 Jan 2017 11:25:44 +0000 (11:25 +0000)]
anv: don't assert on out of memory descriptor pool in debug mode

Fixes:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
7 years agodocs/repository: fix name of main branch
Eric Engestrom [Thu, 26 Jan 2017 18:11:10 +0000 (18:11 +0000)]
docs/repository: fix name of main branch

This is git, not svn :P

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoegl: EGL_PLATFORM_SURFACELESS_MESA is now upstream
Eric Engestrom [Tue, 24 Jan 2017 18:07:06 +0000 (18:07 +0000)]
egl: EGL_PLATFORM_SURFACELESS_MESA is now upstream

EGL_PLATFORM_SURFACELESS_MESA is in eglext.h as of last commit.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
7 years agoegl: update headers from registry
Eric Engestrom [Tue, 24 Jan 2017 18:07:05 +0000 (18:07 +0000)]
egl: update headers from registry

Khronos introduced a new macro (suggested by Google) to avoid using
C-style casts in C++ code, as those generate warnings.

Khronos Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16113
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
7 years agoradv: add missing extension errors in vk_errorf()
Eric Engestrom [Thu, 26 Jan 2017 14:20:24 +0000 (14:20 +0000)]
radv: add missing extension errors in vk_errorf()

v2(Bas): Remove the extra VK_ERROR_FRAGMENTED_POOL cases.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: add missing core errors in vk_errorf()
Eric Engestrom [Thu, 26 Jan 2017 14:20:23 +0000 (14:20 +0000)]
radv: add missing core errors in vk_errorf()

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoconfigure.ac: Require LLVM for r300 only on x86 and x86_64
Andreas Boll [Tue, 24 Jan 2017 15:44:12 +0000 (16:44 +0100)]
configure.ac: Require LLVM for r300 only on x86 and x86_64

b3119a3 introduced a strict LLVM requirement for r300 on all
architectures and thus configure fails on architectures where LLVM is
not available or buggy.

r300 doesn't strictly require LLVM, but for performance reasons we
highly recommend LLVM usage. So require it at least on x86 and x86_64
architectures as we have done before b3119a3.

Fixes: b3119a3 ("configure.ac: Check gallium LLVM version in gallium_require_llvm")
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agogallium: enable int64 on radeonsi, llvmpipe, softpipe
Nicolai Hähnle [Tue, 24 Jan 2017 20:22:32 +0000 (21:22 +0100)]
gallium: enable int64 on radeonsi, llvmpipe, softpipe

All of these have had support for the TGSI opcodes since before most of
the glsl compiler work landed.

Also update the docs accordingly, including the missing note about i965.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: add support for enabling ARB_gpu_shader_int64.
Dave Airlie [Thu, 9 Jun 2016 00:17:58 +0000 (10:17 +1000)]
st/mesa: add support for enabling ARB_gpu_shader_int64.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/glsl_to_tgsi: add support for 64-bit integers
Dave Airlie [Thu, 9 Jun 2016 00:17:26 +0000 (10:17 +1000)]
st/glsl_to_tgsi: add support for 64-bit integers

v2: add conversion opcodes.

v3 (idr): Rebase on replacemtn of TGSI_OPCODE_I2U64 with
TGSI_OPCODE_I2I64.

v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

v5 (nha): add clarifying comment about a subtle assumption

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium: Add integer 64 capability
Dave Airlie [Thu, 9 Jun 2016 00:13:03 +0000 (10:13 +1000)]
gallium: Add integer 64 capability

v1.1: move to using a normal CAP. (Marek)

v2: fill in the cap everywhere

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agometa: Refactor texture format translation
Topi Pohjolainen [Sat, 26 Nov 2016 16:03:56 +0000 (18:03 +0200)]
meta: Refactor texture format translation

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agointel/blorp/dbg: Name blit shaders for easy recognition in dumps
Topi Pohjolainen [Tue, 29 Nov 2016 07:56:23 +0000 (09:56 +0200)]
intel/blorp/dbg: Name blit shaders for easy recognition in dumps

Blorp clears already have an equivalent.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/hiz/gen6: Stop setting false qpitch
Topi Pohjolainen [Wed, 11 Jan 2017 08:26:32 +0000 (10:26 +0200)]
i965/hiz/gen6: Stop setting false qpitch

which is not applicable for "all slices at each lod". Current
logic makes one to believe it has some purpose. When miptree
layout is calculated brw_miptree_layout_texture_array() sets
the qpitch unconditionally but later on ignores it altogether
for ALL_SLICES_AT_EACH_LOD.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp/gen6: Remove dead code in hiz setup
Topi Pohjolainen [Tue, 10 Jan 2017 08:24:26 +0000 (10:24 +0200)]
i965/blorp/gen6: Remove dead code in hiz setup

Such as comment states for intel_miptree_hiz_buffer::mt, hiz_mt
only exists for gen6. In addition, intel_hiz_miptree_buf_create()
uses MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD unconditionally.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/gen6: Simplify hiz surface setup
Topi Pohjolainen [Tue, 10 Jan 2017 09:02:08 +0000 (11:02 +0200)]
i965/gen6: Simplify hiz surface setup

In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. The same goes for
intel_miptree_aux_buffer::pitch/qpitch.

This will make following patches simpler to read.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp/gen6: Simplify hiz surface setup
Topi Pohjolainen [Tue, 10 Jan 2017 08:13:30 +0000 (10:13 +0200)]
i965/blorp/gen6: Simplify hiz surface setup

In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. Also intel_miptree_aux_buffer::offset
is initialised to zero (calloc()).

This will make following patches significantly simpler to read.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/gen6: Remove check for stencil format
Topi Pohjolainen [Thu, 29 Dec 2016 08:06:16 +0000 (10:06 +0200)]
i965/gen6: Remove check for stencil format

There are is no alternative.

Reviewed-by: Samuel Iglesias Gons\341lvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965: Remove check for hiz on earlier gens than SNB
Topi Pohjolainen [Wed, 28 Dec 2016 15:49:56 +0000 (17:49 +0200)]
i965: Remove check for hiz on earlier gens than SNB

Only caller, brw_workaround_depthstencil_alignment(), returns
early for gen6+.

While at it, reduce scope for brw_get_depthstencil_tile_masks() as
well.

Reviewed-by: Samuel Iglesias Gons\341lvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Remove redundant check for null texture
Topi Pohjolainen [Thu, 22 Dec 2016 08:36:03 +0000 (10:36 +0200)]
i965/miptree: Remove redundant check for null texture

There exact same check earlier in brw_miptree_layout() which
intel_miptree_create_layout() in turn calls unconditionally.

Reviewed-by: Samuel Iglesias Gons\341lvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Tell when brw_miptree_layout() fails
Topi Pohjolainen [Tue, 17 Jan 2017 08:10:17 +0000 (10:10 +0200)]
i965/miptree: Tell when brw_miptree_layout() fails

In addition, let intel_miptree_create_layout() release the
miptree - it is the allocator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/meta: Remove unused brw_get_rb_for_slice()
Topi Pohjolainen [Thu, 22 Dec 2016 08:09:55 +0000 (10:09 +0200)]
i965/meta: Remove unused brw_get_rb_for_slice()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gons<C3><A1>lvez <siglesias@igalia.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoclover: Fix build against clang SVN >= r293097
Michel Dänzer [Thu, 26 Jan 2017 06:28:12 +0000 (15:28 +0900)]
clover: Fix build against clang SVN >= r293097

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agovc4: Use NEON to speed up utile stores on Pi2+.
Eric Anholt [Sun, 8 Jan 2017 22:54:57 +0000 (14:54 -0800)]
vc4: Use NEON to speed up utile stores on Pi2+.

Improves 1024x1024 TexSubImage2D by 41.2371% +/- 3.52799% (n=10).

7 years agovc4: Use NEON to speed up utile loads on Pi2.
Eric Anholt [Thu, 5 Jan 2017 23:11:30 +0000 (15:11 -0800)]
vc4: Use NEON to speed up utile loads on Pi2.

We had a lot of memcpy call overhead because gpu_stride wasn't being
inlined.  But if you split out the stride==8 and stride==16 cases like
this code does while still using memcpy, you'd no longer have glibc's
NEON memcpy applied at which point we'd be doing 16 uncached reads
instead of 64/(NEON memcpy granularity), for about a 30% performance
hit.  By hand writing the assembly, we can get a whole cacheline
loaded at a time.

Unfortunately, NEON intrinsics turned out to be unusable -- they
didn't have the vldm instruction available.

Note that, for now, the NEON code is only enabled when building for ARMv7
(Pi 2+).  We may want to do runtime detection for the Raspbian case, in
the future.

Improves 1024x1024 GetTexImage by 208.256% +/- 7.07029% (n=10).

7 years agovc4: Move LT tiling code to a separate file.
Eric Anholt [Fri, 6 Jan 2017 18:51:22 +0000 (10:51 -0800)]
vc4: Move LT tiling code to a separate file.

This paves the way for building it twice, with NEON assembly or not.

7 years agovc4: Use unreachable() in an unreachable codepath for tiling.
Eric Anholt [Fri, 6 Jan 2017 18:55:07 +0000 (10:55 -0800)]
vc4: Use unreachable() in an unreachable codepath for tiling.

7 years agogallium/radeon: add VRAM-vis-usage HUD query
Samuel Pitoiset [Wed, 25 Jan 2017 15:56:46 +0000 (16:56 +0100)]
gallium/radeon: add VRAM-vis-usage HUD query

This new query returns the current visible usage of VRAM accessed
by the CPU. It will return 0 on radeon because it's unimplemented.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/radeon: query the CPU accessible size of VRAM
Samuel Pitoiset [Wed, 25 Jan 2017 15:56:45 +0000 (16:56 +0100)]
gallium/radeon: query the CPU accessible size of VRAM

R600_DEBUG="info" can be used to display that size, as well as
the total amount of VRAM/GTT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: Arrange validate_uniform_parameters parameters to match call sites
Ian Romanick [Thu, 6 Nov 2014 19:12:31 +0000 (11:12 -0800)]
mesa: Arrange validate_uniform_parameters parameters to match call sites

Saves a measly 20 bytes on IA32 and nothing on x64.  Depending on
exactly when this is applied, a lot of variation is possible due to
function alignment.

   text    data     bss     dec     hex filename
6670131  228340   22552 6921023  699b3f lib/i965_dri.so before
6670111  228340   22552 6921003  699b2b lib/i965_dri.so after
6342932  293872   29880 6666684  65b9bc lib64/i965_dri.so before
6342932  293872   29880 6666684  65b9bc lib64/i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
7 years agomesa: Arrange _mesa_uniform parameters to match the call sites
Ian Romanick [Thu, 6 Nov 2014 02:44:21 +0000 (18:44 -0800)]
mesa: Arrange _mesa_uniform parameters to match the call sites

By putting the parameters first that match the parameters to the call
site, 4 (of 14) instructions are saved at _mesa_Uniform4fv on x64.  On
IA32, the details of the instructions change, but it is the same count
and mix of instructions.

Before:

0000000000000830 <_mesa_Uniform4fv>:
     830:       48 83 ec 10             sub    $0x10,%rsp
     834:       49 89 d0                mov    %rdx,%r8
     837:       48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 83e <_mesa_Uniform4fv+0xe>
     83e:       89 f8                   mov    %edi,%eax
     840:       89 f1                   mov    %esi,%ecx
     842:       41 b9 02 00 00 00       mov    $0x2,%r9d
     848:       64 48 8b 3a             mov    %fs:(%rdx),%rdi
     84c:       48 8b 97 c8 01 02 00    mov    0x201c8(%rdi),%rdx
     853:       48 8b 72 70             mov    0x70(%rdx),%rsi
     857:       6a 04                   pushq  $0x4
     859:       89 c2                   mov    %eax,%edx
     85b:       e8 00 00 00 00          callq  860 <_mesa_Uniform4fv+0x30>
     860:       48 83 c4 18             add    $0x18,%rsp
     864:       c3                      retq

After:

00000000000007f0 <_mesa_Uniform4fv>:
     7f0:       48 83 ec 10             sub    $0x10,%rsp
     7f4:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7fb <_mesa_Uniform4fv+0xb>
     7fb:       41 b9 02 00 00 00       mov    $0x2,%r9d
     801:       64 48 8b 08             mov    %fs:(%rax),%rcx
     805:       48 8b 81 c8 01 02 00    mov    0x201c8(%rcx),%rax
     80c:       6a 04                   pushq  $0x4
     80e:       4c 8b 40 70             mov    0x70(%rax),%r8
     812:       e8 00 00 00 00          callq  817 <_mesa_Uniform4fv+0x27>
     817:       48 83 c4 18             add    $0x18,%rsp
     81b:       c3                      retq

Saves a measly 416 bytes of text on x64.  Depending on exactly when this
is applied, a lot of variation is possible due to function alignment.

   text    data     bss     dec     hex filename
6670131  228340   22552 6921023  699b3f lib/i965_dri.so before
6670131  228340   22552 6921023  699b3f lib/i965_dri.so after
6343348  293872   29880 6667100  65bb5c lib64/i965_dri.so before
6342932  293872   29880 6666684  65b9bc lib64/i965_dri.so after

There is likely to be no performance change with just this patch.
_mesa_uniform immediately calls validate_uniform_parameters with
parameters in the "wrong" (different from the call site) order.

v2: Rebase on GL_ARB_gpu_shader_fp64.

v3: Rebase on GL_ARB_gpu_shader_int64.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
7 years agomesa: Arrange _mesa_uniform_matrix parameters to match the call sites
Ian Romanick [Thu, 6 Nov 2014 01:48:40 +0000 (17:48 -0800)]
mesa: Arrange _mesa_uniform_matrix parameters to match the call sites

By putting the parameters first that match the parameters to the call
site, 4 (of 16) instructions are saved at _mesa_UniformMatrix4fv on
x64.  On IA32, the details of the instructions change, but it is the
same count and mix of instructions.

Before:

0000000000001380 <_mesa_UniformMatrix4fv>:
    1380:       48 83 ec 10             sub    $0x10,%rsp
    1384:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 138b <_mesa_UniformMatrix4fv+0xb>
    138b:       41 89 f8                mov    %edi,%r8d
    138e:       41 89 f1                mov    %esi,%r9d
    1391:       0f b6 d2                movzbl %dl,%edx
    1394:       64 48 8b 38             mov    %fs:(%rax),%rdi
    1398:       48 8b b7 c8 01 02 00    mov    0x201c8(%rdi),%rsi
    139f:       48 8b 76 70             mov    0x70(%rsi),%rsi
    13a3:       68 06 14 00 00          pushq  $0x1406
    13a8:       51                      push   %rcx
    13a9:       52                      push   %rdx
    13aa:       b9 04 00 00 00          mov    $0x4,%ecx
    13af:       ba 04 00 00 00          mov    $0x4,%edx
    13b4:       e8 00 00 00 00          callq  13b9 <_mesa_UniformMatrix4fv+0x39>
    13b9:       48 83 c4 28             add    $0x28,%rsp
    13bd:       c3                      retq

After:

0000000000001360 <_mesa_UniformMatrix4fv>:
    1360:       48 83 ec 10             sub    $0x10,%rsp
    1364:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 136b <_mesa_UniformMatrix4fv+0xb>
    136b:       0f b6 d2                movzbl %dl,%edx
    136e:       64 4c 8b 00             mov    %fs:(%rax),%r8
    1372:       49 8b 80 c8 01 02 00    mov    0x201c8(%r8),%rax
    1379:       68 06 14 00 00          pushq  $0x1406
    137e:       6a 04                   pushq  $0x4
    1380:       6a 04                   pushq  $0x4
    1382:       4c 8b 48 70             mov    0x70(%rax),%r9
    1386:       e8 00 00 00 00          callq  138b <_mesa_UniformMatrix4fv+0x2b>
    138b:       48 83 c4 28             add    $0x28,%rsp
    138f:       c3                      retq

Saves a measly 576 bytes of text on x64.

   text    data     bss     dec     hex filename
6670131  228340   22552 6921023  699b3f lib/i965_dri.so before
6670131  228340   22552 6921023  699b3f lib/i965_dri.so after
6343924  293872   29880 6667676  65bd9c lib64/i965_dri.so before
6343348  293872   29880 6667100  65bb5c lib64/i965_dri.so after

v2: Rebase on GL_ARB_gpu_shader_fp64.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
7 years agomesa: Trivial clean-ups in uniform_query.cpp
Ian Romanick [Tue, 1 Sep 2015 17:29:04 +0000 (10:29 -0700)]
mesa: Trivial clean-ups in uniform_query.cpp

This is C++, so we can mix code and declarations.  Doing so allows
constification.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
7 years agospirv: handle undefined components for OpVectorShuffle
Lionel Landwerlin [Thu, 26 Jan 2017 16:57:40 +0000 (16:57 +0000)]
spirv: handle undefined components for OpVectorShuffle

Fixes:
   dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
   dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>