mesa.git
6 years agoswr/rast: Fix alloca usage in jitter
George Kyriazis [Thu, 5 Apr 2018 20:59:54 +0000 (15:59 -0500)]
swr/rast: Fix alloca usage in jitter

Fix issue where temporary allocas were getting hoisted to function entry
unnecessarily. We now explicitly mark temporary allocas and skip hoisting
during the hoist pass. Should reduce stack usage.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Change gfx pointers to gfxptr_t
George Kyriazis [Thu, 5 Apr 2018 17:08:15 +0000 (12:08 -0500)]
swr/rast: Change gfx pointers to gfxptr_t

Changing type to gfxptr for indices and related changes to fetch and mem
builder code.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Fix byte offset for non-indexed draws
George Kyriazis [Tue, 10 Apr 2018 00:47:51 +0000 (19:47 -0500)]
swr/rast: Fix byte offset for non-indexed draws

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add support for setting optimization level
George Kyriazis [Wed, 4 Apr 2018 22:34:54 +0000 (17:34 -0500)]
swr/rast: Add support for setting optimization level

for JIT compilation

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Adding translate call to builder_gfx_mem.
George Kyriazis [Thu, 29 Mar 2018 19:43:06 +0000 (14:43 -0500)]
swr/rast: Adding translate call to builder_gfx_mem.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Fix codegen for typedef types
George Kyriazis [Wed, 28 Mar 2018 19:43:09 +0000 (14:43 -0500)]
swr/rast: Fix codegen for typedef types

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr: add x86 lowering pass to fragment shader
George Kyriazis [Wed, 28 Mar 2018 19:31:20 +0000 (14:31 -0500)]
swr: add x86 lowering pass to fragment shader

Needed because some FP paths (namely stipple) use gather intrinsics
that now need to be lowered to x86.

v2: fix typo in commit message
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Enable generalized fetch jit
George Kyriazis [Fri, 23 Mar 2018 20:14:58 +0000 (15:14 -0500)]
swr/rast: Enable generalized fetch jit

Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some
work needed to remove some simd8 double pumping for 16-wide target.

Also removed unused non-gather load vertices path.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add builder_gfx_mem.{h|cpp}
George Kyriazis [Mon, 26 Mar 2018 18:29:04 +0000 (13:29 -0500)]
swr/rast: Add builder_gfx_mem.{h|cpp}

Abstract usage scenarios for memory accesses into builder_gfx_mem.
Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer
types for use by builder_mem.

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
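
The gfxptr_t conversion described in the builder_gfx_mem entry above can be
pictured with a small host-side sketch. This is not the swr code: the typedef
and every helper name below are assumptions made for illustration, and it only
shows the idea of carrying an address as a 64-bit integer and turning it back
into a regular pointer right before the access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the gfxptr_t named in the commit message:
 * a graphics address carried around as a plain 64-bit integer. */
typedef uint64_t gfxptr_t;

/* Wrap a host pointer as a 64-bit graphics address. */
static gfxptr_t gfxptr_from_host(const void *p)
{
    return (gfxptr_t)(uintptr_t)p;
}

/* Translate a 64-bit graphics address back into a regular pointer.
 * The assert guards against truncation on hosts with narrower pointers. */
static void *gfxptr_to_host(gfxptr_t addr)
{
    assert(addr <= (gfxptr_t)UINTPTR_MAX);
    return (void *)(uintptr_t)addr;
}

/* Load a 32-bit index through a gfxptr_t plus a byte offset. */
static uint32_t load_index32(gfxptr_t indices, uint64_t byte_offset)
{
    const uint32_t *p = (const uint32_t *)gfxptr_to_host(indices + byte_offset);
    return *p;
}
```

Keeping the integer-to-pointer step in one helper means there is a single place
where an address translation could later be hooked in.
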
6 years agoswr/rast: Lower VGATHERPS and VGATHERPS_16 to x86.
George Kyriazis [Thu, 22 Mar 2018 20:25:36 +0000 (15:25 -0500)]
swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86.

Some more work to do before we can support simultaneous 8-wide and
16-wide and remove the VGATHERPS_16 version.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Cleanup of JitManager convenience types
George Kyriazis [Wed, 21 Mar 2018 18:23:23 +0000 (13:23 -0500)]
swr/rast: Cleanup of JitManager convenience types

Small cleanup. Remove convenience types from JitManager and standardize
on the Builder's convenience types.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Lower PERMD and PERMPS to x86.
George Kyriazis [Tue, 20 Mar 2018 23:13:35 +0000 (18:13 -0500)]
swr/rast: Lower PERMD and PERMPS to x86.

Add support for providing an emulation callback function for arch/width
combinations that don't map cleanly to an x86 intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Start refactoring of builder/packetizer.
George Kyriazis [Tue, 20 Mar 2018 00:05:38 +0000 (19:05 -0500)]
swr/rast: Start refactoring of builder/packetizer.

Move x86 intrinsic lowering to a separate pass. Builder now instantiates
generic intrinsics for features not supported by llvm. The separate x86
lowering pass is responsible for lowering to valid x86 for the target
SIMD architecture. Currently it's a port of existing code to get it
up and running quickly. Will eventually support optimized x86 for AVX,
AVX2 and AVX512.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Simplify #define usage in gen source file
George Kyriazis [Mon, 19 Mar 2018 22:46:13 +0000 (17:46 -0500)]
swr/rast: Simplify #define usage in gen source file

Removed preprocessor defines from structures passed to LLVM jitted code.

The Python scripts do not understand the preprocessor defines and ignore
them. So for fields that are compiled out due to a preprocessor define
the LLVM script accounts for them anyway because it doesn't know what
the defines are set to. The sanitize defines for open source are fine
in that they're safely used.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Move CallPrint() to a separate file
George Kyriazis [Fri, 16 Mar 2018 15:26:25 +0000 (10:26 -0500)]
swr/rast: Move CallPrint() to a separate file

Needed work for jit code debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Fix name mangling for LLVM pow intrinsic
George Kyriazis [Thu, 15 Mar 2018 22:49:54 +0000 (17:49 -0500)]
swr/rast: Fix name mangling for LLVM pow intrinsic

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add some archrast counters
George Kyriazis [Thu, 15 Mar 2018 20:58:10 +0000 (15:58 -0500)]
swr/rast: Add some archrast counters

Hook up archrast counters for shader stats: instructions executed.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Code cleanup
George Kyriazis [Thu, 15 Mar 2018 18:43:08 +0000 (13:43 -0500)]
swr/rast: Code cleanup

Removing some code that doesn't seem to do anything meaningful.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add "Num Instructions Executed" stats intrinsic.
George Kyriazis [Thu, 15 Mar 2018 17:49:51 +0000 (12:49 -0500)]
swr/rast: Add "Num Instructions Executed" stats intrinsic.

Added a SWR_SHADER_STATS structure which is passed to each shader. The
stats pass will instrument the shader to populate this.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add MEM_ADD helper function to Builder.
George Kyriazis [Thu, 15 Mar 2018 17:08:00 +0000 (12:08 -0500)]
swr/rast: Add MEM_ADD helper function to Builder.

mem[offset] += value

This function will be heavily used by all stats intrinsics.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
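
The "mem[offset] += value" behaviour of MEM_ADD can be sketched on the host as
follows. The stats structure, its fields and the helper name are hypothetical,
and the sketch assumes the offset is a byte offset into the structure; the real
Builder helper generates the equivalent load, add and store in JIT code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stats block; the field names are illustrative only. */
struct shader_stats {
    uint32_t num_inst_executed;
    uint32_t num_samples;
};

/* mem[offset] += value, with the offset given in bytes from the base. */
static void mem_add_u32(void *base, size_t byte_offset, uint32_t value)
{
    assert(byte_offset % sizeof(uint32_t) == 0);
    uint32_t *slot = (uint32_t *)((char *)base + byte_offset);
    *slot += value;
}
```

A stats intrinsic would then reduce to one such call per counter, for example
with offsetof(struct shader_stats, num_inst_executed) as the offset.
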
6 years agoswr/rast: Permute work for simd16
George Kyriazis [Wed, 14 Mar 2018 18:38:18 +0000 (13:38 -0500)]
swr/rast: Permute work for simd16

Fix slow permutes in PA tri lists under SIMD16 emulation on AVX

Added missing permute (interlane, immediate) to SIMDLIB

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: WIP builder rewrite (2)
George Kyriazis [Wed, 14 Mar 2018 17:29:04 +0000 (12:29 -0500)]
swr/rast: WIP builder rewrite (2)

Finish up the remaining explicit intrinsic uses. At this point all
explicit Intrinsic::getDeclaration() usage has been replaced with auto
generated macros generated with gen_llvm_ir_macros.py. Going forward,
make sure to only use the intrinsics here, adding new ones as needed.

Next step is to remove all references to x86 intrinsics to keep the
builder target-independent. Any x86 lowering will be handled by a
separate pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add autogen of helper llvm intrinsics.
George Kyriazis [Tue, 13 Mar 2018 18:46:41 +0000 (13:46 -0500)]
swr/rast: Add autogen of helper llvm intrinsics.

Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent.
Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm
macros for stacksave, stackrestore, popcnt.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: WIP builder rewrite.
George Kyriazis [Mon, 12 Mar 2018 18:18:56 +0000 (13:18 -0500)]
swr/rast: WIP builder rewrite.

Start removing avx2 macros for functionality that exists in llvm.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: LLVM 6 fix
George Kyriazis [Tue, 13 Mar 2018 01:34:19 +0000 (20:34 -0500)]
swr/rast: LLVM 6 fix

for getting masked gather intrinsic (also compatible with LLVM 4)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Changes to allow jitter to compile with LLVM5
George Kyriazis [Sat, 10 Mar 2018 06:04:11 +0000 (00:04 -0600)]
swr/rast: Changes to allow jitter to compile with LLVM5

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add some archrast stats
George Kyriazis [Wed, 7 Mar 2018 01:32:53 +0000 (19:32 -0600)]
swr/rast: Add some archrast stats

Add stats for degenerate and backfacing primitive counts

Wire archrast stats for alpha blend and alpha test.
Pass a value to the jitter; upon return, have an archrast event increment a value.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Silence some unused variable warnings
George Kyriazis [Fri, 9 Mar 2018 17:37:57 +0000 (11:37 -0600)]
swr/rast: Silence some unused variable warnings

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add debug type info for i128
George Kyriazis [Thu, 8 Mar 2018 22:19:36 +0000 (16:19 -0600)]
swr/rast: Add debug type info for i128

Help support debug info in 16 wide shaders.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Use blend context struct to pass params
George Kyriazis [Thu, 8 Mar 2018 07:35:17 +0000 (01:35 -0600)]
swr/rast: Use blend context struct to pass params

Stuff parameters into a blend context struct before passing down through
the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Introduce JIT_MEM_CLIENT
George Kyriazis [Wed, 7 Mar 2018 19:33:44 +0000 (13:33 -0600)]
swr/rast: Introduce JIT_MEM_CLIENT

Add assert for correct usage of memory accesses

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agoswr/rast: Add some instructions to jitter
George Kyriazis [Wed, 7 Mar 2018 18:00:52 +0000 (12:00 -0600)]
swr/rast: Add some instructions to jitter

VPHADDD, PMAXUD, PMINUD

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years agodocs: update calendar, add news and link release notes to 18.0.1
Juan A. Suarez Romero [Wed, 18 Apr 2018 15:29:12 +0000 (15:29 +0000)]
docs: update calendar, add news and link release notes to 18.0.1

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
6 years agodocs: add sha256 checksums for 18.0.1
Juan A. Suarez Romero [Wed, 18 Apr 2018 15:25:00 +0000 (15:25 +0000)]
docs: add sha256 checksums for 18.0.1

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit a1c421c638fd9ff2810b2a59f1ccd0a3a03657b1)

6 years agodocs: add release notes for 18.0.1
Juan A. Suarez Romero [Wed, 18 Apr 2018 14:44:49 +0000 (14:44 +0000)]
docs: add release notes for 18.0.1

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8bd719e3faee8cb0054f51cf1fe9d372a9eea0ea)

6 years agodocs: update calendar, add news and link release notes to 17.3.9
Juan A. Suarez Romero [Wed, 18 Apr 2018 09:45:04 +0000 (09:45 +0000)]
docs: update calendar, add news and link release notes to 17.3.9

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
6 years agodocs: add sha256 checksums for 17.3.9
Juan A. Suarez Romero [Wed, 18 Apr 2018 09:39:48 +0000 (09:39 +0000)]
docs: add sha256 checksums for 17.3.9

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cf0864dc63caf1285bdede364e9a39b22bac5938)

6 years agodocs: add release notes for 17.3.9
Juan A. Suarez Romero [Wed, 18 Apr 2018 08:40:26 +0000 (08:40 +0000)]
docs: add release notes for 17.3.9

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 6d88ea9dd46e630ee861e773dfe4a49f5d1c1fbd)

6 years agoRevert "meson: add wrap for libdrm"
Dylan Baker [Tue, 17 Apr 2018 20:47:17 +0000 (13:47 -0700)]
Revert "meson: add wrap for libdrm"

This reverts commit 6217eedc9bac86856d5048c43b5f5a3f6976c13e.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
6 years agoRevert "Add subprojects directory and git ignore"
Dylan Baker [Tue, 17 Apr 2018 20:47:06 +0000 (13:47 -0700)]
Revert "Add subprojects directory and git ignore"

This reverts commit 21e2e73f71096fd4607051c060cf82c593663d50.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
6 years agomeson: Version libMesaOpenCL like autotools does
Jan Alexander Steffens (heftig) [Sat, 14 Apr 2018 17:23:22 +0000 (19:23 +0200)]
meson: Version libMesaOpenCL like autotools does

This is for parity with autotools. It names the library
libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink.

opencl_version now matches configure.ac's OPENCL_VERSION.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
6 years agomeson: Add library versions to swr drivers
Jan Alexander Steffens (heftig) [Sat, 14 Apr 2018 17:23:21 +0000 (19:23 +0200)]
meson: Add library versions to swr drivers

This is for parity with autotools.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
6 years agomeson: add wrap for libdrm
Dylan Baker [Fri, 13 Apr 2018 19:18:10 +0000 (12:18 -0700)]
meson: add wrap for libdrm

Currently this requires libdrm from git, since the version reported by
meson is wrong.

6 years agoAdd subprojects directory and git ignore
Dylan Baker [Fri, 13 Apr 2018 19:04:57 +0000 (12:04 -0700)]
Add subprojects directory and git ignore

For meson wraps.

6 years agoradv: fix scissor computation when using half-pixel viewport offset
Samuel Pitoiset [Tue, 17 Apr 2018 20:07:26 +0000 (22:07 +0200)]
radv: fix scissor computation when using half-pixel viewport offset

'scale[i]' can be non-integer.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56a ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
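
To see why a non-integer scale[i] matters for the scissor fix above, consider
one axis of the viewport. The sketch below is an assumption about the general
approach (round the covered range outward), not a copy of the radv change, and
all names in it are made up for the example.

```c
#include <assert.h>

/* One axis of a scissor rectangle, in whole pixels. */
struct span {
    int offset;
    int extent;
};

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Round down / up to an integer without pulling in libm. */
static int floor_to_int(float v)
{
    int i = (int)v;               /* truncates toward zero */
    return (v < (float)i) ? i - 1 : i;
}

static int ceil_to_int(float v)
{
    int i = (int)v;
    return (v > (float)i) ? i + 1 : i;
}

/* The viewport covers [translate - |scale|, translate + |scale|] on this
 * axis; round outward so fractional edges are not clipped away. */
static struct span scissor_span(float scale, float translate)
{
    float lo = translate - absf(scale);
    float hi = translate + absf(scale);
    struct span s;

    assert(hi >= lo);
    s.offset = floor_to_int(lo);
    s.extent = ceil_to_int(hi) - s.offset;
    return s;
}
```

With scale 0.75 and translate 1.25 the covered range is [0.5, 2.0]. Truncating
2 * scale to an integer would give an extent of 1, while rounding outward
gives offset 0 and extent 2.
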
6 years agospirv: Accept doubles in FaceForward, Reflect and Refract
Neil Roberts [Wed, 21 Mar 2018 19:34:40 +0000 (20:34 +0100)]
spirv: Accept doubles in FaceForward, Reflect and Refract

The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16- bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.

These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.

The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279

v2: Convert the eta operand of Refract from any size in order to make
    it eventually cope with 16-bit floats.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agospirv: Add a 64-bit implementation of OpIsInf
Neil Roberts [Wed, 21 Mar 2018 19:34:39 +0000 (20:34 +0100)]
spirv: Add a 64-bit implementation of OpIsInf

The only change necessary is to change the type of the constant used
to compare against.

This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.

v2: Use nir_imm_floatN_t for the constant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
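
The idea behind the OpIsInf change above, comparing the absolute value against
an infinity constant of the operand's own width, looks roughly like this in
plain C. The function names are hypothetical and this is host code, not the
NIR the SPIR-V front end actually builds.

```c
#include <assert.h>
#include <math.h>      /* only for the INFINITY macro */
#include <stdbool.h>

/* 32-bit variant: the constant is a float infinity. */
static bool is_inf32(float x)
{
    float ax = x < 0.0f ? -x : x;
    return ax == (float)INFINITY;
}

/* 64-bit variant: same comparison, but the constant is a double. */
static bool is_inf64(double x)
{
    double ax = x < 0.0 ? -x : x;
    return ax == (double)INFINITY;
}

/* Small self-check: NaN compares unequal to everything, so it is not "inf". */
static void is_inf_selfcheck(void)
{
    assert(!is_inf64((double)NAN));
    assert(is_inf64(-(double)INFINITY));
}
```
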
6 years agospirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
Neil Roberts [Wed, 21 Mar 2018 19:34:38 +0000 (20:34 +0100)]
spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins

There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agonir/builder: Add a nir_imm_floatN_t helper
Neil Roberts [Wed, 21 Mar 2018 19:34:37 +0000 (20:34 +0100)]
nir/builder: Add a nir_imm_floatN_t helper

This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
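
A helper that builds a float immediate from nothing but a bit size can be
sketched as below. This is a standalone illustration with made-up types
(struct imm, imm_floatN); the real nir_imm_floatN_t works on a nir_builder and
is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical immediate value: raw bits plus the width they represent. */
struct imm {
    unsigned bit_size;
    uint64_t bits;
};

/* Encode x as an IEEE float of the requested width. */
static struct imm imm_floatN(double x, unsigned bit_size)
{
    struct imm v = { bit_size, 0 };

    switch (bit_size) {
    case 32: {
        float f = (float)x;
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        v.bits = u;
        break;
    }
    case 64:
        memcpy(&v.bits, &x, sizeof v.bits);
        break;
    default:
        /* A 16-bit case would be added here, in this one place. */
        assert(!"unsupported bit size");
        break;
    }
    return v;
}
```

Callers pass the bit size of the operand they are working with and never
choose between the float and double paths themselves.
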
6 years agonir: return early when lowering a return at the end of a function
Timothy Arceri [Sun, 8 Apr 2018 11:47:32 +0000 (21:47 +1000)]
nir: return early when lowering a return at the end of a function

Otherwise we create unused conditional return flags and things
get unnecessarily ugly fast when lowering nested functions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agomesa: merge the driver functions DrawBuffers and DrawBuffer
Timothy Arceri [Sat, 14 Apr 2018 03:42:31 +0000 (13:42 +1000)]
mesa: merge the driver functions DrawBuffers and DrawBuffer

The extra params were unused by the drivers that used DrawBuffers.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agoglsl: fix gcc 8 parenthesis warning
Marc Dietrich [Fri, 23 Mar 2018 10:01:23 +0000 (11:01 +0100)]
glsl: fix gcc 8 parenthesis warning

fixes warnings like this:
[184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'.
In file included from ../src/mesa/main/mtypes.h:48,
                 from ../src/compiler/glsl_types.h:149,
                 from ../src/compiler/glsl/lower_jumps.cpp:59:
../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list*)':
../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses]
    for (__type *(__inst) = (__type *)(__list)->head_sentinel.next; \
                 ^
../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list'
       foreach_in_list(ir_instruction, node, list) {
       ^~~~~~~~~~~~~~~

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
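
The warning quoted above comes from a for-loop macro that declares its iterator
as "__type *(__inst)". A minimal sketch of the same kind of macro with the
declarator left unparenthesized is shown below. The list type and macro name
are invented for the example, it is assumed (not confirmed by the log) that the
fix took this form, and note that GCC reports this particular warning for C++
code while the sketch is plain C.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;
    int value;
};

/* The iterator is declared as `type *inst`, not `type *(inst)`. */
#define foreach_node(type, inst, head) \
    for (type *inst = (type *)(head); inst != NULL; inst = inst->next)

/* Walk the list with the macro and add up the values. */
static int sum_list(struct node *head)
{
    int total = 0;

    foreach_node(struct node, n, head) {
        assert(n != NULL);
        total += n->value;
    }
    return total;
}
```
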
6 years agocompiler: int8/uint8 fixes
Rob Clark [Sun, 15 Apr 2018 16:02:37 +0000 (12:02 -0400)]
compiler: int8/uint8 fixes

A couple spots were missed for handling of the new INT8/UINT8 base type.

Also de-duplicate get_base_type(): get_scalar_type() had nearly the
same switch statement, with the exception that anything with base_type
that was not scalar would return error_type.  So just handle that one
special case in get_scalar_type().

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradeonsi: don't emit partial flushes for internal CS flushes only
Marek Olšák [Sat, 7 Apr 2018 02:26:49 +0000 (22:26 -0400)]
radeonsi: don't emit partial flushes for internal CS flushes only

Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years agowinsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE
Marek Olšák [Tue, 3 Apr 2018 18:55:02 +0000 (14:55 -0400)]
winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE

There is a kernel patch that adds the new flag.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years agoradeonsi: implement mechanism for IBs without partial flushes at the end (v6)
Marek Olšák [Fri, 16 Jun 2017 12:25:34 +0000 (14:25 +0200)]
radeonsi: implement mechanism for IBs without partial flushes at the end (v6)

(This patch doesn't enable the behavior. It will be enabled in a later
commit.)

Draw calls from multiple IBs can be executed in parallel.

v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
    only do this for flushes invoked internally
v5: empty IBs should wait for idle if the flush requires it
v6: split the commit

If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:

With partial flushes (time busy):
    CP: 99%
    SPI: 86%
    CB: 73%

Without partial flushes (time busy):
    CP: 99%
    SPI: 93%
    CB: 81%

Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years agonir: fix ir_binop_gequal glsl_to_nir conversion
Erico Nunes [Sat, 14 Apr 2018 19:14:41 +0000 (21:14 +0200)]
nir: fix ir_binop_gequal glsl_to_nir conversion

ir_binop_gequal needs to be converted to nir_op_sge when native integers
are not supported in the driver.
Otherwise it becomes no different than ir_binop_less after the
conversion.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
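
On hardware without native integers, comparisons produce float results of 1.0
or 0.0. The sketch below models the two operations involved in the fix above
as plain C functions (the names mirror the NIR opcodes, but this is host code
written for illustration) and shows the case where they disagree.

```c
#include <assert.h>

/* "Set on greater or equal": 1.0 if a >= b, else 0.0. */
static float sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }

/* "Set on less than": 1.0 if a < b, else 0.0. */
static float slt(float a, float b) { return a < b ? 1.0f : 0.0f; }

/* For equal operands the two must give opposite answers, so translating
 * a >= comparison into slt yields the wrong result. */
static void check_equal_operands(void)
{
    assert(sge(2.0f, 2.0f) == 1.0f);
    assert(slt(2.0f, 2.0f) == 0.0f);
}
```
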
6 years agoanv,radv: Drop XML workarounds for VK_ANDROID_native_buffer
Jason Ekstrand [Mon, 16 Apr 2018 14:38:31 +0000 (07:38 -0700)]
anv,radv: Drop XML workarounds for VK_ANDROID_native_buffer

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agovulkan: Update the XML and headers to 1.1.73
Jason Ekstrand [Mon, 16 Apr 2018 14:32:03 +0000 (07:32 -0700)]
vulkan: Update the XML and headers to 1.1.73

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradv: clean up radv_decompress_resolve_subpass_src()
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:52 +0000 (19:14 +0200)]
radv: clean up radv_decompress_resolve_subpass_src()

To handle the source color image transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: don't fast-clear eliminate after resolving a subpass with compute
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:51 +0000 (19:14 +0200)]
radv: don't fast-clear eliminate after resolving a subpass with compute

That looks useless, and I think radv_handle_image_transition()
will do a fast-clear eliminate because it's called after the
resolve.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: handle CMASK/FMASK transitions only if DCC is disabled
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:50 +0000 (19:14 +0200)]
radv: handle CMASK/FMASK transitions only if DCC is disabled

DCC implies a fast-clear eliminate, so I think this sounds
reasonable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: merge radv_handle_{dcc,cmask}_image_transition() functions
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:49 +0000 (19:14 +0200)]
radv: merge radv_handle_{dcc,cmask}_image_transition() functions

Into radv_handle_color_image_transition().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: add radv_init_color_image_metadata() helper
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:48 +0000 (19:14 +0200)]
radv: add radv_init_color_image_metadata() helper

In order to separate initialization from decompression. In the
future, that will allow us to init DCC/FMASK/CMASK in one shot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: make radv_initialise_cmask() static
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:47 +0000 (19:14 +0200)]
radv: make radv_initialise_cmask() static

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: clean up radv_handle_image_transition() a bit
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:46 +0000 (19:14 +0200)]
radv: clean up radv_handle_image_transition() a bit

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: add radv_handle_color_image_transition() helper
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:45 +0000 (19:14 +0200)]
radv: add radv_handle_color_image_transition() helper

To handle CMASK, FMASK and DCC transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: handle DCC image transitions before CMASK/FMASK transitions
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:44 +0000 (19:14 +0200)]
radv: handle DCC image transitions before CMASK/FMASK transitions

Mostly because DCC implies a fast-clear eliminate and we
should be able to skip some DCC decompressions by setting
a predicate like for CMASK and FMASK.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: disable predication only if it has been enabled
Samuel Pitoiset [Fri, 13 Apr 2018 17:14:43 +0000 (19:14 +0200)]
radv: disable predication only if it has been enabled

When decompressing DCC we don't enable it, so it's useless
to disable it. This reduces the number of predication packets
sent to the GPU when performing color decompression passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.
Bas Nieuwenhuizen [Sun, 15 Apr 2018 22:09:39 +0000 (00:09 +0200)]
ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.

No clue how I missed those ...

Fixes: 4503ff760c "ac/nir: Add workaround for GFX9 buffer views."
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years agogallium/osmesa: link with winsock2 library on Windows
Brian Paul [Fri, 13 Apr 2018 21:34:23 +0000 (15:34 -0600)]
gallium/osmesa: link with winsock2 library on Windows

To fix the MSVC build.  The build broke because we started to compile
the ddebug code on Windows after the mtypes.h changes.  Building ddebug
caused us to also use the u_network.c code for the first time.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agogallium/util: put (void) in a few function signatures
Brian Paul [Fri, 13 Apr 2018 21:33:39 +0000 (15:33 -0600)]
gallium/util: put (void) in a few function signatures

To match the header file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agoddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build
Brian Paul [Fri, 13 Apr 2018 21:32:48 +0000 (15:32 -0600)]
ddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build

Don't include Unix headers or use Unix functions when building with MSVC.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agomesa: protect #include of unistd.h with _MSC_VER check
Brian Paul [Fri, 13 Apr 2018 21:31:49 +0000 (15:31 -0600)]
mesa: protect #include of unistd.h with _MSC_VER check

unistd.h is unix only.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agomesa: remove unused 'i' in dimensions_error_check()
Brian Paul [Fri, 13 Apr 2018 21:31:20 +0000 (15:31 -0600)]
mesa: remove unused 'i' in dimensions_error_check()

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: restore si_emit_cache_flush call at the end of IBs
Marek Olšák [Sat, 14 Apr 2018 00:04:04 +0000 (20:04 -0400)]
radeonsi: restore si_emit_cache_flush call at the end of IBs

Fixes: 918b798668c "radeonsi: make sure CP DMA is idle at the end of IBs"
6 years agoradv: enable subgroup capabilities
Daniel Schürmann [Tue, 6 Mar 2018 14:05:13 +0000 (15:05 +0100)]
radv: enable subgroup capabilities

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: handle subgroup intrinsics
Daniel Schürmann [Tue, 6 Mar 2018 14:04:29 +0000 (15:04 +0100)]
ac: handle subgroup intrinsics

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: add LLVM build functions for subgroup intrinsics
Daniel Schürmann [Tue, 6 Mar 2018 14:03:36 +0000 (15:03 +0100)]
ac: add LLVM build functions for subgroup intrinsics

Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: make ballot and umsb capable of 64bit inputs
Daniel Schürmann [Wed, 28 Feb 2018 19:26:03 +0000 (20:26 +0100)]
ac: make ballot and umsb capable of 64bit inputs

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agonir: lower 64bit subgroup shuffle intrinsics
Daniel Schürmann [Tue, 10 Apr 2018 14:07:27 +0000 (16:07 +0200)]
nir: lower 64bit subgroup shuffle intrinsics

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agonir/spirv: Fix warning and add missing breaks.
Daniel Schürmann [Tue, 10 Apr 2018 10:02:44 +0000 (12:02 +0200)]
nir/spirv: Fix warning and add missing breaks.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agonir: use ballot_bit_size when lowering ballot_bitfield_extract
Daniel Schürmann [Fri, 13 Apr 2018 13:05:24 +0000 (15:05 +0200)]
nir: use ballot_bit_size when lowering ballot_bitfield_extract

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agonir: subgroups instructions for 64bit ballot sizes
Daniel Schürmann [Fri, 13 Apr 2018 13:04:16 +0000 (15:04 +0200)]
nir: subgroups instructions for 64bit ballot sizes

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoglsl: #undef THIS macro to fix MSVC build
Brian Paul [Fri, 13 Apr 2018 15:56:33 +0000 (09:56 -0600)]
glsl: #undef THIS macro to fix MSVC build

THIS is a macro in one of the MSVC header files.  It's also a token
in the GLSL lexer.  This causes a compilation failure with MSVC.
This issue seems to be newly exposed after the recent mtypes.h removal
patches.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
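
The clash described in the THIS entry above can be reproduced in miniature. In
the sketch below the first #define merely simulates a platform header (its
expansion is made up), and the enum stands in for the lexer's token list with
an arbitrary value. Without the #undef, the enumerator name would be
macro-expanded and the declaration would fail to compile.

```c
#include <assert.h>

/* Simulates a platform header that defines THIS as a macro.
 * The expansion chosen here is hypothetical. */
#define THIS void

/* Drop the macro before declaring a token with the same name. */
#ifdef THIS
#undef THIS
#endif

/* Stand-in for generated lexer tokens; the numeric value is arbitrary. */
enum token {
    THIS = 258,
    OTHER_TOKEN
};

static_assert(THIS == 258, "THIS must name the enumerator, not the macro");
```
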
6 years agoglsl: rename 'interface' var to 'iface' to fix MSVC build
Brian Paul [Fri, 13 Apr 2018 15:38:16 +0000 (09:38 -0600)]
glsl: rename 'interface' var to 'iface' to fix MSVC build

The recent mtypes.h removal patches seem to have exposed an MSVC
issue where 'interface' is defined as a macro in an MSVC header file.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
6 years agomesa: remove snprintf macro in imports.h to fix MSVC build
Brian Paul [Fri, 13 Apr 2018 15:32:31 +0000 (09:32 -0600)]
mesa: remove snprintf macro in imports.h to fix MSVC build

snprintf is a macro in the MSVC stdio.h header and we needed to
include that header before imports.h where we also defined an
snprintf macro.  Otherwise, the MSVC build would fail.  The recent
mtypes.h removal patches seem to have exposed this issue.

This patch simply removes our snprintf macro and replaces one use
of it in teximage.c with _mesa_snprintf().  There are other calls
to snprintf() in DRI drivers, but none of them are built on Windows.
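
The general idea of calling a prefixed function instead of redefining
a standard name can be sketched as follows.  example_snprintf() is a
hypothetical wrapper written for this illustration; it is not Mesa's
_mesa_snprintf() and makes no claim about how that function is
implemented:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical prefixed wrapper.  Because it does not reuse the name
 * snprintf, it cannot clash with a header that defines snprintf as a
 * macro, whatever the include order. */
int example_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);  /* C99: always NUL-terminates if size > 0 */
    va_end(ap);
    return n;
}
```

Callers then use the prefixed name directly, so no macro with the
standard name is needed.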

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
6 years agoanv: fix number of planes for depth & stencil
Lionel Landwerlin [Thu, 12 Apr 2018 18:06:47 +0000 (11:06 -0700)]
anv: fix number of planes for depth & stencil

We're not counting correctly with depth & stencil images.

Additionally we need to move an assert that is meant just for color
attachments.

v2: Move an assert() (Reported by Craig)
    Change aspect mask checks (Francesco)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a62a97933578a ("anv: enable multiple planes per image/imageView")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agogallium: move ddebug, noop, rbug, trace to auxiliary to improve build times
Marek Olšák [Sat, 7 Apr 2018 18:01:12 +0000 (14:01 -0400)]
gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times

which also simplifies the build scripts.

6 years agoradeonsi: make sure CP DMA is idle at the end of IBs
Marek Olšák [Thu, 5 Apr 2018 21:54:39 +0000 (17:54 -0400)]
radeonsi: make sure CP DMA is idle at the end of IBs

6 years agogallium/hud: add a simple HUD view that only draws text
Marek Olšák [Wed, 4 Apr 2018 22:20:53 +0000 (18:20 -0400)]
gallium/hud: add a simple HUD view that only draws text

Add this prefix to the env var: "simple,". For example:
    GALLIUM_HUD=simple,fps

The X coordinates are the same, but the Y coordinates are different, because
there is only text.

'+' happens to behave the same as "\n".
',' happens to behave the same as "\n\n".
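
A rough sketch of how such a prefix could be detected, assuming a
hypothetical helper named hud_strip_simple_prefix(); this is not the
parser used by the HUD code, only an illustration of the "simple,"
convention described above:

```c
#include <string.h>

/* Hypothetical helper: reports whether the env string starts with
 * "simple," and returns a pointer to the remainder of the string. */
const char *hud_strip_simple_prefix(const char *env, int *simple)
{
    static const char prefix[] = "simple,";
    size_t len = sizeof(prefix) - 1;  /* length without the NUL */

    *simple = strncmp(env, prefix, len) == 0;
    return *simple ? env + len : env;
}
```

With GALLIUM_HUD=simple,fps this helper would set the flag and leave
"fps" as the list of graphs to parse.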

6 years agomesa: Include unistd.h in program_lexer
Dylan Baker [Fri, 13 Apr 2018 16:01:29 +0000 (09:01 -0700)]
mesa: Include unistd.h in program_lexer

Which was previously provided implicitly by mtypes.h

CC: Marek Olšák <marek.olsak@amd.com>
CC: Mark Janes <mark.a.janes@intel.com>
Fixes: 43d66c8c2d4d3d4dee1309856b6ce6c5393682e5
       ("mesa: include mtypes.h less")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agoradeonsi: always prefetch later shaders after the draw packet
Marek Olšák [Tue, 3 Apr 2018 01:08:05 +0000 (21:08 -0400)]
radeonsi: always prefetch later shaders after the draw packet

so that the draw is started as soon as possible.

v2: only prefetch the API VS and VBO descriptors

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years agoradeonsi: emit shader pointers before cache flushes & waits
Marek Olšák [Tue, 3 Apr 2018 00:43:23 +0000 (20:43 -0400)]
radeonsi: emit shader pointers before cache flushes & waits

This code was written with the constant engine in mind.
We can simplify it now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years agoradeonsi/gfx9: don't use the workaround for gather4 + stencil
Marek Olšák [Tue, 3 Apr 2018 19:20:04 +0000 (15:20 -0400)]
radeonsi/gfx9: don't use the workaround for gather4 + stencil

it doesn't seem to be needed.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years agoradeonsi: disable TC-compat HTILE on Tonga and Iceland
Marek Olšák [Tue, 3 Apr 2018 23:32:12 +0000 (19:32 -0400)]
radeonsi: disable TC-compat HTILE on Tonga and Iceland

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years agoradeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabled
Marek Olšák [Tue, 3 Apr 2018 23:22:24 +0000 (19:22 -0400)]
radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabled

just pass the flag that indicates it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years agoradeonsi: don't flush HTILE if there is no HTILE clear
Marek Olšák [Wed, 28 Mar 2018 01:19:15 +0000 (21:19 -0400)]
radeonsi: don't flush HTILE if there is no HTILE clear

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years agoradeonsi: merge 2 identical if statements in si_clear
Marek Olšák [Wed, 28 Mar 2018 01:57:26 +0000 (21:57 -0400)]
radeonsi: merge 2 identical if statements in si_clear

and other cleanups

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years agoradeonsi: don't do GFX-specific texture decompression for compute
Marek Olšák [Tue, 3 Apr 2018 01:30:41 +0000 (21:30 -0400)]
radeonsi: don't do GFX-specific texture decompression for compute

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>