mesa.git
Francisco Jerez [Thu, 19 May 2016 01:43:54 +0000 (18:43 -0700)]
i965/fs: Remove extract virtual opcodes.

These can be easily represented in the IR as a MOV instruction with
strided source so they seem rather redundant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
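A minimal sketch of why a byte extract can be written as a MOV with a strided source. The helper below is a hypothetical scalar model, not the i965 IR: it reads one byte per channel at a fixed stride and zero-extends it into a 32-bit destination channel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a strided MOV (not Mesa code): for each channel,
 * read the byte at `offset + ch * stride` in `src` and zero-extend it
 * into the 32-bit destination channel. */
static void
sketch_strided_mov_u8_to_u32(uint32_t *dst, const uint8_t *src,
                             size_t offset, size_t stride, size_t count)
{
   for (size_t ch = 0; ch < count; ch++)
      dst[ch] = src[offset + ch * stride];
}
```

With a stride of 4 bytes and an offset of 1, this picks byte 1 out of every 32-bit channel, which is the job a dedicated extract opcode would otherwise do.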
Francisco Jerez [Tue, 26 Apr 2016 00:35:52 +0000 (17:35 -0700)]
i965: Define brw_int_type() helper.

Intended as a (partial) inverse of type_sz().  Will be useful in the
next commit and some other SIMD32 generator changes I have queued up.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
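A rough sketch of what "partial inverse of type_sz()" means. The enum and both functions are hypothetical stand-ins written for illustration, not the definitions in the i965 sources: mapping a size back to an integer type and then taking its size should return the original size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the register type enum (not Mesa's). */
enum sketch_type {
   SKETCH_TYPE_B, SKETCH_TYPE_UB,
   SKETCH_TYPE_W, SKETCH_TYPE_UW,
   SKETCH_TYPE_D, SKETCH_TYPE_UD
};

/* Size in bytes of each type. */
static unsigned
sketch_type_sz(enum sketch_type t)
{
   switch (t) {
   case SKETCH_TYPE_B: case SKETCH_TYPE_UB: return 1;
   case SKETCH_TYPE_W: case SKETCH_TYPE_UW: return 2;
   case SKETCH_TYPE_D: case SKETCH_TYPE_UD: return 4;
   }
   return 0;
}

/* Integer type of the given size and signedness.  Only a partial inverse:
 * the size alone does not determine signedness, so it is a parameter. */
static enum sketch_type
sketch_int_type(unsigned sz, bool is_signed)
{
   switch (sz) {
   case 1: return is_signed ? SKETCH_TYPE_B : SKETCH_TYPE_UB;
   case 2: return is_signed ? SKETCH_TYPE_W : SKETCH_TYPE_UW;
   case 4: return is_signed ? SKETCH_TYPE_D : SKETCH_TYPE_UD;
   default:
      assert(!"unsupported integer size");
      return SKETCH_TYPE_UD;
   }
}
```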
Francisco Jerez [Sat, 28 May 2016 06:22:02 +0000 (23:22 -0700)]
i965/fs: Remove manual splitting of DDY ops in the generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 03:02:29 +0000 (20:02 -0700)]
i965/fs: Remove manual unrolling of BFI instructions from the generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 02:59:18 +0000 (19:59 -0700)]
i965/fs: Drop Gen7 CMP SIMD unrolling workaround from the generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 02:51:50 +0000 (19:51 -0700)]
i965/fs: Drop lowering code for a few three-source instructions from the generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Thu, 19 May 2016 01:41:28 +0000 (18:41 -0700)]
i965/fs: Set default access mode to Align1 for all instructions in the generator.

Currently the generator code for most opcodes honours the default
access mode (which should typically be Align1 in the scalar back-end),
but generate_code() doesn't set it explicitly which means that the
access mode from a previous instruction could leak into the following
ones if you did something special and weren't careful enough to save
and restore the previous access mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 02:10:48 +0000 (19:10 -0700)]
i965/fs: Remove handcrafted math SIMD lowering from the generator.

Most of this wouldn't have worked for SIMD32 and had various
dispatch_width and compression control bugs.  It's mostly dead now
with SIMD lowering of math instructions turned on in the compiler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Fri, 20 May 2016 20:34:46 +0000 (13:34 -0700)]
i965/fs: Limit SIMD width of various virtual opcodes to the maximum supported value.

Which is 16 or 8 in most cases.  This will make sure that 32-wide
virtual instructions get chopped up into chunks of their maximum
execution size.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
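A minimal sketch of the chopping described above, assuming a 32-wide instruction is split into equal chunks no wider than the opcode's maximum execution size. The function name and signature are hypothetical; the real pass works on IR instructions rather than plain integers.

```c
#include <assert.h>

/* Hypothetical model (not Mesa code): split an instruction of `exec_size`
 * channels into chunks of at most `max_width` channels.  Writes the first
 * channel of each chunk into `first_channel` and returns the chunk count. */
static unsigned
sketch_split_exec_size(unsigned exec_size, unsigned max_width,
                       unsigned *first_channel, unsigned max_chunks)
{
   /* An instruction narrower than the limit is left as a single chunk. */
   const unsigned width = exec_size < max_width ? exec_size : max_width;
   assert(width > 0 && exec_size % width == 0);

   const unsigned n = exec_size / width;
   assert(n <= max_chunks);

   for (unsigned i = 0; i < n; i++)
      first_channel[i] = i * width;

   return n;
}
```

For a 32-wide virtual opcode limited to 8 channels this yields four chunks starting at channels 0, 8, 16 and 24.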
Francisco Jerez [Fri, 20 May 2016 06:44:23 +0000 (23:44 -0700)]
i965/fs: Lower LOAD_PAYLOAD instructions of unsupported width.

Only per-channel LOAD_PAYLOAD instructions can be lowered, which
should cover everything that comes in from the front-end.

LOAD_PAYLOAD instructions used to construct actual message payloads
cannot be easily lowered because they contain headers and vectors of
variable type that aren't necessarily channel-aligned.  We shouldn't
find any of them in the program at SIMD lowering time though because
they're introduced during logical send lowering.

An alternative that may be worth considering would be to re-run the
SIMD lowering pass after LOAD_PAYLOAD lowering instead of this patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 23:27:09 +0000 (16:27 -0700)]
i965/fs: Lower DDY instructions to SIMD8 during SIMD lowering time

...on hardware lacking compressed Align16 support.  Will allow
simplifying the generator code and fixing it for SIMD32 codegen.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 23:43:05 +0000 (16:43 -0700)]
i965/fs: Apply usual FPU-like execution size restrictions to MULH.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 23:10:38 +0000 (16:10 -0700)]
i965/fs: Calculate maximum execution size of MOV_INDIRECT correctly.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 23:01:29 +0000 (16:01 -0700)]
i965/fs: Assert that IF instruction with embedded compare has legal exec_size.

We shouldn't encounter these right now but if we did it wouldn't be
possible for the SIMD lowering pass to split it into multiple
instructions because of its side effects on control flow, so just
assert in order to kill the program.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 23:00:19 +0000 (16:00 -0700)]
i965/fs: Implement HSW BFI exec size workarounds in the SIMD lowering pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 22:58:04 +0000 (15:58 -0700)]
i965/fs: Implement workaround for IVB CMP dependency race in the SIMD lowering pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Fri, 20 May 2016 20:15:49 +0000 (13:15 -0700)]
i965/fs: Enforce common regioning restrictions by SIMD splitting.

This change addresses a number of hardware restrictions on the source
and destination regions and other execution controls of regular
FPU-like instructions that in some cases can be avoided by reducing
the execution size of the instruction.  Some of these restrictions
(e.g. the one about 3src instructions not supporting compression on
some hardware) are currently being worked around case by case in the
generator with ad-hoc splitting code that is buggy in several ways
(e.g. doesn't handle non-trivial execution controls which would break
SIMD32 code), but it seems cleaner to implement as many restrictions
as we can in a single lowering pass since that will allow us to
simplify some of the surrounding code considerably and also make sure
that we don't forget applying them in the future.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Fri, 20 May 2016 20:14:20 +0000 (13:14 -0700)]
i965/fs: Enforce extended math exec size limits during SIMD lowering.

This teaches the SIMD lowering pass about the hardware limits on the
execution size of math instructions, which will allow simplifying the
generator code and at the same time get rid of a number of bugs in the
manual SIMD unrolling done currently that prevent SIMD32 codegen from
working.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Fri, 20 May 2016 07:37:37 +0000 (00:37 -0700)]
i965/fs: Handle SAMPLEINFO consistently like other texturing instructions.

This texturing opcode was missing its logical counterpart, which would
prevent it from taking advantage of the SIMD lowering infrastructure.
Define it and plumb it through the back-end.  At some
point we'll likely want to emit a single SAMPLEINFO message shared
among all channels irrespective of this change, but for the moment
this should be enough to get the intrinsic working in SIMD32 mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 06:54:25 +0000 (23:54 -0700)]
i965/fs: Lower math into Gen4-5 send-like instructions in lower_logical_sends.

The benefit is we will be able to use the SIMD lowering pass to unroll
math instructions of unsupported width and then remove some cruft from
the generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 06:52:15 +0000 (23:52 -0700)]
i965/fs: Add missing get_latency_gen7() cases for the Gen7 pull constant opcodes.

This was causing the scheduler to be rather optimistic about the
latency of pull constant opcodes on Gen7+.  This might seem to
increase the cycle count estimate calculated by the scheduler itself
for some shaders, even though the actual cycle count should
decrease.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Fri, 20 May 2016 20:03:31 +0000 (13:03 -0700)]
i965/fs: Rename Gen4 physical varying pull constant load opcode.

For consistency with the Gen7 variant.  I'm not doing the same to the
uniform pull constant message at this point because the non-GEN7 one
is still overloaded to be either an expression-like logical
instruction or a Gen4-specific physical send message.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 08:26:03 +0000 (01:26 -0700)]
i965/fs: Implement promotion of varying pull loads on Gen4 during SIMD lowering.

Varying pull constant loads inherit the same limitation of pre-ILK
hardware that requires expanding SIMD8 texel fetch instructions to
SIMD16, so we can deal with pull constant loads the same way it's done
for texturing during SIMD lowering.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 06:18:38 +0000 (23:18 -0700)]
i965/fs: Hide varying pull constant load message setup behind logical opcode.

This will allow the SIMD lowering pass to split 32-wide varying pull
constant loads (not natively supported by the hardware) into 16-wide
instructions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Fri, 20 May 2016 04:32:14 +0000 (21:32 -0700)]
i965/fs: Avoid constant propagation when the type sizes don't match.

The case where the source type of the instruction is smaller than the
immediate type could be handled by calculating the portion of the
immediate read by the instruction (assuming that the source channels
are aligned with the destination channels of the copy) and then
representing the same value as an immediate of the source type
(assuming such an immediate type exists), but the code below doesn't
do that, so just bail for the moment.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 26 Apr 2016 00:25:26 +0000 (17:25 -0700)]
i965/fs: Fix CSE temporary copy for some LOAD_PAYLOAD corner cases.

If the LOAD_PAYLOAD instruction only has header sources it's possible
for the number of registers written to be less than or equal to the
SIMD component size, in which case it would take the single-MOV path
at the bottom which would cause the channel enable masks to be applied
incorrectly to the header contents and/or cause it to write past the
end of the allocated temporary.  If the instruction is either
LOAD_PAYLOAD or doesn't write exactly one component the MOV path is
going to mess up the program so just don't use it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 23:48:32 +0000 (16:48 -0700)]
i965/fs: Handle instruction predication in SIMD lowering pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 23:54:16 +0000 (16:54 -0700)]
i965/fs: No need to unzip SIMD-periodic sources during SIMD lowering.

If the source value is going to be the same for all SIMD-lowered chunks
of the instruction there should be no need to unzip the value into
multiple temporary registers, one for each lowered chunk.  As a side
effect this fixes SIMD lowering of instructions with a vector
immediate source.  In the long term it *might* still be worth fixing
offset() to handle vector immediates correctly, though this should be
good enough for the moment.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Wed, 18 May 2016 00:45:41 +0000 (17:45 -0700)]
i965/fs: Generalize is_uniform() to is_periodic().

This will be useful in the SIMD lowering pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 17 May 2016 00:19:17 +0000 (17:19 -0700)]
i965/fs: Fix byte_offset() for MRF/ARF/FIXED_GRF regs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Tue, 24 May 2016 02:32:51 +0000 (19:32 -0700)]
i965/fs: Fix off-by-one region overlap comparison in copy propagation.

This was introduced in cf375a3333e54a01462f192202d609436e5fbec8 but
the blame is mine because the pseudocode I sent in my review comment
for the original patch suggesting to do things this way already had
the off-by-one error.  This may have caused copy propagation to be
unnecessarily strict while checking whether VGRF writes interfere with
any ACP entries and possibly miss valid optimization opportunities in
cases where multiple copy instructions write sequential locations of
the same VGRF.

Cc: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
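An illustration of this kind of off-by-one, not the actual copy propagation code. Assuming regions are half-open ranges [start, start + size), two of them overlap only when each starts strictly before the other ends. Using <= instead makes back-to-back regions look like they interfere, which is the "unnecessarily strict" behaviour described above. Both function names are hypothetical.

```c
#include <stdbool.h>

/* Correct test for half-open ranges [start, start + size). */
static bool
sketch_regions_overlap(unsigned start_a, unsigned size_a,
                       unsigned start_b, unsigned size_b)
{
   return start_a < start_b + size_b && start_b < start_a + size_a;
}

/* Off-by-one variant: also reports adjacent regions as overlapping. */
static bool
sketch_regions_overlap_off_by_one(unsigned start_a, unsigned size_a,
                                  unsigned start_b, unsigned size_b)
{
   return start_a <= start_b + size_b && start_b <= start_a + size_a;
}
```

Two writes covering offsets [0, 2) and [2, 4) of the same register are sequential and do not overlap, yet the second variant reports a conflict.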
Ronie Salgado [Sat, 28 May 2016 00:32:44 +0000 (17:32 -0700)]
anv/cmd_buffer: Don't delete command buffers in ResetCommandPool()

v2 (Jason Ekstrand): Destroy command buffers in DestroyCommandPool().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95034
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Brian Paul [Sat, 28 May 2016 00:32:04 +0000 (18:32 -0600)]
gallium/util: another s/unsigned/enum pipe_prim_type/ for clang

Trivial.

Jason Ekstrand [Tue, 24 May 2016 19:06:35 +0000 (12:06 -0700)]
anv: Try the first 8 render nodes instead of just renderD128

This way, if you have other cards installed, the Vulkan driver will still
work.  No guarantees about WSI working correctly but offscreen should at
least work.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95537
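A sketch of the probing loop, under the assumption that render nodes live at /dev/dri/renderD128 and upwards (the usual Linux DRM layout). The probe callback, the helper names and the example probe are all hypothetical; the real driver opens the device and checks that it is a supported GPU.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_FIRST_RENDER_NODE 128   /* renderD128 */
#define SKETCH_NODES_TO_TRY      8

/* Callback returning 0 when the device at `path` is usable. */
typedef int (*sketch_probe_fn)(const char *path);

/* Try renderD128..renderD135 in order and copy the first usable path
 * into `found`.  Returns 0 on success, -1 if no node was accepted. */
static int
sketch_find_render_node(sketch_probe_fn probe, char *found, size_t found_len)
{
   for (int i = 0; i < SKETCH_NODES_TO_TRY; i++) {
      char path[32];
      snprintf(path, sizeof(path), "/dev/dri/renderD%d",
               SKETCH_FIRST_RENDER_NODE + i);
      if (probe(path) == 0) {
         snprintf(found, found_len, "%s", path);
         return 0;
      }
   }
   return -1;
}

/* Example probe for illustration: pretends only renderD130 is supported,
 * as if two other cards owned renderD128 and renderD129. */
static int
sketch_probe_only_d130(const char *path)
{
   return strcmp(path, "/dev/dri/renderD130") == 0 ? 0 : -1;
}
```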

Jason Ekstrand [Tue, 24 May 2016 18:02:18 +0000 (11:02 -0700)]
anv: strdup the device path into the physical device

This way we don't have to assume that the string coming in is a piece of
constant data that exists forever.

Jason Ekstrand [Sat, 28 May 2016 00:16:09 +0000 (17:16 -0700)]
anv/formats: Exit early for unsupported formats

Jason Ekstrand [Sat, 28 May 2016 00:14:29 +0000 (17:14 -0700)]
anv/formats: Map VK_FORMAT_UNDEFINED to ISL_FORMAT_UNSUPPORTED

At one point in time, we may have used the mapping to ISL_FORMAT_RAW for
certain buffer surfaces but that time has long since passed.  This fixes a
bug where doing format queries on VK_FORMAT_UNDEFINED would assert-fail.

Jason Ekstrand [Sat, 28 May 2016 00:13:45 +0000 (17:13 -0700)]
anv/clear: Remove an unused variable

Brian Paul [Fri, 27 May 2016 21:56:07 +0000 (15:56 -0600)]
gallium/util: another unsigned -> enum pipe_prim_type change

gcc didn't warn about the unsigned / enum pipe_prim_type mismatch
between the .c and .h file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Jordan Justen [Wed, 18 May 2016 19:04:03 +0000 (12:04 -0700)]
i965/compute: Fix uniform init issue when SIMD8 is skipped

In d8347f12ead89c5a58f69ce9283a54ac8487159c, we added support for
skipping SIMD8 generation when the program local size is too large for
SIMD8 to be usable. This change was missed in that commit.

This bug would impact gen7 platforms when the compute shader local
size is greater than 512, and gen8 platforms when the local size is
greater than 448.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bas Nieuwenhuizen [Fri, 27 May 2016 22:57:31 +0000 (00:57 +0200)]
docs: Mention GL4.3 and ES3.1 support for nvc0 and radeonsi

v2: also update the introductory text.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jason Ekstrand [Fri, 11 Mar 2016 03:15:32 +0000 (19:15 -0800)]
anv: Emit DRAWING_RECTANGLE once at driver initialization

Also, we don't actually need it for clipping because meta always colors
inside the lines and, for all other operations, the user is required to set
a scissor.  Since DRAWING_RECTANGLE stalls the GPU, we want to emit it as
little as possible.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Fri, 20 May 2016 18:49:12 +0000 (11:49 -0700)]
anv/cmd_buffer: Only emit PIPE_CONTROL on-demand

This is in contrast to emitting it directly in vkCmdPipelineBarrier.  This
has a couple of advantages.  First, it means that no matter how many
vkCmdPipelineBarrier calls the application strings together it gets one or
two PIPE_CONTROLs.  Second, it allows us to better track when we need to do
stalls because we can flag when a flush has happened and we need a stall.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
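A simplified sketch of the on-demand idea, with hypothetical names and bits rather than the anv data structures. Barriers only accumulate pending bits, and a single flush is emitted the next time one is actually needed. The real driver may need two PIPE_CONTROLs in some cases; this model always emits one.

```c
#include <stdint.h>

/* Hypothetical pending-operation bits (not the anv definitions). */
#define SKETCH_PIPE_FLUSH_BIT      (1u << 0)
#define SKETCH_PIPE_INVALIDATE_BIT (1u << 1)

struct sketch_cmd_buffer {
   uint32_t pending_pipe_bits;     /* work requested by barriers so far */
   unsigned pipe_controls_emitted; /* stand-in for the command stream */
};

/* Called for each pipeline barrier: record what is needed, emit nothing. */
static void
sketch_pipeline_barrier(struct sketch_cmd_buffer *cmd, uint32_t bits)
{
   cmd->pending_pipe_bits |= bits;
}

/* Called before work that depends on the barriers (e.g. a draw). */
static void
sketch_apply_pending_flushes(struct sketch_cmd_buffer *cmd)
{
   if (cmd->pending_pipe_bits == 0)
      return;

   cmd->pipe_controls_emitted++;   /* stand-in for emitting PIPE_CONTROL */
   cmd->pending_pipe_bits = 0;
}
```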
Jason Ekstrand [Fri, 20 May 2016 19:07:53 +0000 (12:07 -0700)]
genxml: Make PIPE_CONTROL::CommandStreamerStallEnable a boolean

This has been declared as a uint since SNB but it's only one bit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Fri, 20 May 2016 07:11:32 +0000 (00:11 -0700)]
anv/clear: Only clear the render area when doing subpass clears

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 9 Mar 2016 02:10:22 +0000 (18:10 -0800)]
anv: Move push constant allocation to the command buffer

Instead of blasting it out as part of the pipeline, we put it in the
command buffer and only blast it out when it's really needed.  Since the
PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a
stall which we would like to avoid.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bas Nieuwenhuizen [Mon, 18 Apr 2016 22:47:49 +0000 (00:47 +0200)]
radeonsi: enable OpenGL 4.3

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Fri, 27 May 2016 19:51:12 +0000 (05:51 +1000)]
nouveau: enable GL 4.3 on kepler/fermi

Signed-off-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 27 May 2016 10:39:30 +0000 (12:39 +0200)]
radeonsi: always reserve output space for tess factors

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 27 May 2016 03:21:57 +0000 (13:21 +1000)]
glsl/linker: call link_uniform blocks on linked shader.

The old code called this on the prelinked shader list,
but at this point we have the linked shader, so we should
call the interface on that alone.

This fixes a regression in:
dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.13
introduced in
5b2675093e863a52b610f112884ae12d42513770
glsl: handle implicit sized arrays in ssbo

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96228
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reported-by: Mark James
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 27 May 2016 05:11:33 +0000 (15:11 +1000)]
mesa/get: drop unused extension checks.

These all show up as unused warnings here, so drop them for now.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Fri, 27 May 2016 11:55:56 +0000 (13:55 +0200)]
gallium/ddebug: Add passthrough for query_memory_info.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Wed, 25 May 2016 17:51:33 +0000 (10:51 -0700)]
nir/inline: Also rewrite param derefs for texture instructions

Without this, samplers get left hanging as derefs to variables that don't
actually exist.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jason Ekstrand [Wed, 25 May 2016 17:48:05 +0000 (10:48 -0700)]
nir/inline: Break the guts of rewrite_param-derefs into a helper

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jason Ekstrand [Wed, 25 May 2016 17:36:23 +0000 (10:36 -0700)]
nir/inline: Make the rewrite_param_derefs helper work on instructions

Now that we have the better nir_foreach_block macro, there's no reason to
use the archaic block version for everything.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jason Ekstrand [Fri, 27 May 2016 16:25:51 +0000 (09:25 -0700)]
nir/inline: Don't use foreach_instr_safe unless we need to

Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Roland Scheidegger [Thu, 12 May 2016 23:44:39 +0000 (01:44 +0200)]
gallivm: eliminate a unnecessary AND with unorm lerps

Instead of doing an add and then masking out the upper bits, we can
simply do an add with a half-wide type (this, of course, assumes
the hw can actually do it...), so we'll get the required zero
in the upper bits automatically.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
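A scalar model of the trick, assuming 8-bit values held in 16-bit lanes. The actual change operates on vector types in LLVM IR, and both function names here are hypothetical. Adding at full width and masking gives the same result as adding at half width, where the wrap-around clears the upper bits for free.

```c
#include <stdint.h>

/* Add in the 16-bit lane, then clear the upper 8 bits explicitly. */
static uint16_t
sketch_add_then_mask(uint16_t a, uint16_t b)
{
   return (uint16_t)((a + b) & 0xff);
}

/* Add as 8-bit values: the result wraps modulo 256, so the upper bits of
 * the lane come out as zero without a separate AND. */
static uint16_t
sketch_add_half_wide(uint16_t a, uint16_t b)
{
   return (uint8_t)((uint8_t)a + (uint8_t)b);
}
```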
Roland Scheidegger [Fri, 27 May 2016 16:49:44 +0000 (18:49 +0200)]
gallium/util: use enum pipe_prim_type instead of unsigned some more

There were complaints from a mingw build:
u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’
to ‘pipe_prim_type’ [-fpermissive]

Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Fri, 27 May 2016 00:58:16 +0000 (18:58 -0600)]
svga: remove unneeded casts in get_query_result_vgpu9() calls

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Fri, 27 May 2016 00:57:51 +0000 (18:57 -0600)]
svga: use MAYBE_UNUSED to silence release-build warnings

Signed-off-by: Brian Paul <brianp@vmware.com>
Ben Widawsky [Fri, 27 May 2016 04:59:17 +0000 (21:59 -0700)]
isl: Fix some tautological-compare warnings

Fixes:
isl.c:62:22: warning: self-comparison always evaluates to true [-Wtautological-compare]
    assert(ISL_DEV_GEN(dev) == dev->info->gen);
                      ^~
isl.c:63:33: warning: self-comparison always evaluates to true [-Wtautological-compare]
    assert(ISL_DEV_USE_SEPARATE_STENCIL(dev) == dev->use_separate_stencil);

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Ilia Mirkin [Thu, 26 May 2016 17:58:42 +0000 (13:58 -0400)]
mesa: add support for GLSL ES 3.20 version string

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Ilia Mirkin [Thu, 26 May 2016 17:58:41 +0000 (13:58 -0400)]
mapi: expose new functions in GL ES 3.2

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Ilia Mirkin [Thu, 26 May 2016 02:41:06 +0000 (22:41 -0400)]
nvc0/ir: handle a load's reg result not being used for locked variants

For a load locked, we might not use the first result but the second
result is the predicate result of the locking. In that case the load
splitting logic doesn't apply (which is designed for splitting 128-bit
loads). Instead we take the predicate and move it into the first
position (as having a dead result in first def's position upsets all
sorts of things including RA). Update the emitters to deal with this as
well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Ilia Mirkin [Thu, 26 May 2016 01:54:39 +0000 (21:54 -0400)]
nvc0/ir: avoid generating illegal instructions for compute constbuf loads

For user-supplied constbufs, fileIndex is 0. In that case, when we
subtract 1, we'll end up loading from constbuf offset -16. This is
illegal, and there are asserts to avoid it. Normally we'd just DCE it,
but no point in generating the instructions if they're not going to be
used.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
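The arithmetic behind the -16, as a sketch. The 16-byte slot size is inferred from the offset quoted above, and the guard is an assumption about the shape of the fix rather than the nvc0 code itself; the names are hypothetical.

```c
#include <stdbool.h>

#define SKETCH_SLOT_SIZE 16   /* assumed bytes per constbuf info slot */

/* Offset of the info slot for a given fileIndex: slot (fileIndex - 1). */
static int
sketch_info_offset(int file_index)
{
   return (file_index - 1) * SKETCH_SLOT_SIZE;
}

/* Only emit the load when the computed offset is legal. */
static bool
sketch_should_emit_info_load(int file_index)
{
   return sketch_info_offset(file_index) >= 0;
}
```

For fileIndex 0 the offset works out to (0 - 1) * 16 = -16, so the load is skipped instead of being generated and later removed by DCE.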
Rob Clark [Fri, 27 May 2016 00:59:08 +0000 (20:59 -0400)]
gallium/util: fix build break

Missing #include caused build breaks after 21a3fb9cd.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Jason Ekstrand [Fri, 27 May 2016 00:06:17 +0000 (17:06 -0700)]
nir/spirv: Allow pointless variable decorations on inputs

SPIR-V specifies that a bunch of stuff gets applied to types.  This means
that a local variable could get, for instance, an array stride.  Just
because it's pointless doesn't mean you'll never see it.

Brian Paul [Thu, 26 May 2016 20:50:13 +0000 (14:50 -0600)]
gallium/util: use enum pipe_prim_type in u_prim.h functions

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 26 May 2016 15:50:24 +0000 (09:50 -0600)]
util/indices: move duplicated assignments out of switch cases

Spotted by Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 26 May 2016 13:17:50 +0000 (07:17 -0600)]
gallium: change pipe_draw_info::mode to be pipe_prim_type

Makes debugging with gdb a little nicer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 26 May 2016 13:12:59 +0000 (07:12 -0600)]
util/indices,svga: s/unsigned/enum pipe_prim_type/

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 25 May 2016 23:13:56 +0000 (17:13 -0600)]
util: s/unsigned/enum pipe_resource_usage/ for buffer usage variables

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 25 May 2016 23:13:23 +0000 (17:13 -0600)]
svga: s/unsigned/enum pipe_resource_usage/ for buffer usage variables

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 25 May 2016 22:52:34 +0000 (16:52 -0600)]
svga: s/unsigned/enum pipe_prim_type/ for primitive type variables

Proper enum types were only added recently.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 25 May 2016 18:42:55 +0000 (12:42 -0600)]
svga: fix test for unfilled triangles fallback

VGPU10 actually supports line-mode triangles.  We failed to make use of
that before.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 25 May 2016 15:46:17 +0000 (09:46 -0600)]
svga: clean up and improve comments in svga_draw_private.h

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 25 May 2016 17:58:29 +0000 (11:58 -0600)]
util/indices: implement unfilled (tri->line) conversion for adjacency prims

Tested with new piglit gl-3.2-adj-prims test.

v2: re-order trisadj and tristripadj code, per Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 25 May 2016 21:53:25 +0000 (15:53 -0600)]
util/indices: implement provoking vertex conversion for adjacency primitives

Tested with new piglit gl-3.2-adj-prims test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 13 May 2016 22:49:22 +0000 (16:49 -0600)]
util/indices: assert that the incoming primitive is a triangle type

The unfilled index translator/generator functions should only be
called when the primitive mode is one of the triangle types.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 13 May 2016 22:46:26 +0000 (16:46 -0600)]
util/indices: formatting, whitespace fixes in u_unfilled_indices.c

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years ago util/indices: improve comments in u_indices.h
Brian Paul [Fri, 13 May 2016 22:45:25 +0000 (16:45 -0600)]
util/indices: improve comments in u_indices.h

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years ago svga: fix primitive mode (point/line/tri) test for unfilled primitives
Brian Paul [Mon, 9 May 2016 19:42:58 +0000 (13:42 -0600)]
svga: fix primitive mode (point/line/tri) test for unfilled primitives

The original mode test was valid before we had GS support.

Regression tested with full piglit run.  Though, I don't think we have
any piglit tests that exercise drawing unfilled adjacency primitives.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years ago i965: Enable GL_OES_shader_io_blocks
Ian Romanick [Wed, 11 May 2016 20:11:00 +0000 (13:11 -0700)]
i965: Enable GL_OES_shader_io_blocks

Only one dEQP io_blocks test fails.  This test fails for the same reason
as the match_different_member_struct_names test in a previous commit.

dEQP-GLES31.functional.separate_shader.validation.io_blocks.match_different_member_struct_names

v2: Add to release notes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
8 years ago glsl: Allow shader interface blocks in GLSL ES
Ian Romanick [Thu, 12 May 2016 01:24:32 +0000 (18:24 -0700)]
glsl: Allow shader interface blocks in GLSL ES

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
8 years ago glsl: Add a has_shader_io_blocks helper
Ian Romanick [Wed, 11 May 2016 21:03:40 +0000 (14:03 -0700)]
glsl: Add a has_shader_io_blocks helper

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
8 years ago mesa: Add extension tracking for GL_OES_shader_io_blocks
Ian Romanick [Wed, 11 May 2016 20:05:22 +0000 (13:05 -0700)]
mesa: Add extension tracking for GL_OES_shader_io_blocks

v2: Also support GL_EXT_shader_io_blocks.  It's pretty much identical to
the OES extension.  Suggested by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
8 years ago mesa: Only validate SSO shader IO in OpenGL ES or debug context
Ian Romanick [Thu, 19 May 2016 17:27:12 +0000 (10:27 -0700)]
mesa: Only validate SSO shader IO in OpenGL ES or debug context

v2: Move later in series to avoid issues with Gallium drivers and debug
contexts.  Suggested by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
8 years ago mesa: Remove old validate_io function
Ian Romanick [Fri, 20 May 2016 01:09:00 +0000 (18:09 -0700)]
mesa: Remove old validate_io function

The new validate_io catches all of the cases (and many more) that the
old function caught.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
8 years ago mesa: Additional SSO validation using program_interface_query data
Ian Romanick [Thu, 19 May 2016 17:28:25 +0000 (10:28 -0700)]
mesa: Additional SSO validation using program_interface_query data

Fixes the following dEQP tests on SKL:

dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_smooth_fragment_flat
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_implicit_explicit_location_1
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_element_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_none
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_order
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_centroid_fragment_flat
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_length
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_precision
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_centroid
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_smooth
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_name

It regresses one test:

dEQP-GLES31.functional.separate_shader.validation.varying.match_different_struct_names

However, this test is based on language in the OpenGL ES 3.1 spec that I
believe is incorrect.  I have already submitted a spec bug:

https://www.khronos.org/bugzilla/show_bug.cgi?id=1500

v2: Move spec quote about built-in variables to the first place where
it's relevant.  Suggested by Alejandro.

v3: Move patch earlier in series, fix rebase issues.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v2]
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> [v2]
8 years ago mesa: Track the additional data in gl_shader_variable
Ian Romanick [Thu, 19 May 2016 17:25:47 +0000 (10:25 -0700)]
mesa: Track the additional data in gl_shader_variable

The interface type, interpolation mode, precision, the type of the
outermost structure, and whether or not the variable has an explicit
location will be used for SSO validation on OpenGL ES.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
8 years ago nir: Make nir_const_value a union
Jason Ekstrand [Thu, 26 May 2016 22:38:45 +0000 (15:38 -0700)]
nir: Make nir_const_value a union

There's no good reason for it to be a struct of an anonymous union.
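
The shape of the change can be sketched roughly as follows.  The type names
const_value_old and const_value_new and the reduced member list are
illustrative assumptions, not the actual nir.h definitions.

```c
#include <stdint.h>

/* Before (sketch): a struct whose only member is an anonymous union. */
typedef struct {
   union {
      float    f32[4];
      int32_t  i32[4];
      uint32_t u32[4];
   };
} const_value_old;

/* After (sketch): the same storage expressed directly as a union.
 * Member access (v.f32[0], v.u32[1], ...) reads the same either way. */
typedef union {
   float    f32[4];
   int32_t  i32[4];
   uint32_t u32[4];
} const_value_new;
```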

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96221
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years ago i965: Use the buffer object size for VERTEX_BUFFER_STATE's size field.
Kenneth Graunke [Wed, 25 May 2016 21:38:32 +0000 (14:38 -0700)]
i965: Use the buffer object size for VERTEX_BUFFER_STATE's size field.

commit 7c8dfa78b98a12c1c5 (i965/draw: Use the real size for vertex
buffers) changed how we programmed the VERTEX_BUFFER_STATE size field.

Previously, we programmed it to the size of the actual underlying BO,
which is page-aligned, and potentially much larger than the GL buffer
object.  This violated the ARB_robust_buffer_access spec.

With that change, we started programming it based on the range of data
we expect the draw call to actually access - which is based on the
min_index and max_index information provided to glDrawRangeElements().

Unfortunately, applications often provide inaccurate range information
to glDrawRangeElements().  For example, all the Unreal demos appear to
draw using a range of [0, 3] when the index buffer's actual index range
is [0, 5].  Such results are undefined, and we are absolutely allowed
to restrict access to the range they specified.  However, the failure
mode is usually that nothing draws, or misrendering with wild geometry,
which is kind of bad for a common mistake.  And people tend to assume
the range information isn't that important when data is in VBOs.

There's no real advantage, either.  ARB_robust_buffer_access only
requires us to restrict access to the GL buffer object size, not
the range of data we think they should access.  Doing that allows
buggy applications to still function.  (Note that we still use this
information for busy-tracking, so if they try to overwrite the data
with glBufferSubData, they'll still hit a bug.)  This seems to be
safer.

We may want to provide the more strict range as a debug option,
or scan the VBO and warn against bogus glDrawRangeElements in
debug contexts.  That can be done as a later patch, though.
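
For illustration, the kind of debug-context scan mentioned above might look
roughly like this.  It is a sketch under assumptions: the index data is
assumed to be CPU-readable as an array of unsigned values, and the names
index_range, scan_index_range and claimed_range_is_valid are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct index_range {
   unsigned min;
   unsigned max;
};

/* Find the smallest and largest index actually used by a draw.
 * 'count' must be non-zero for the result to be meaningful. */
static struct index_range
scan_index_range(const unsigned *indices, size_t count)
{
   struct index_range r = { ~0u, 0u };

   for (size_t i = 0; i < count; i++) {
      if (indices[i] < r.min)
         r.min = indices[i];
      if (indices[i] > r.max)
         r.max = indices[i];
   }
   return r;
}

/* Check the [start, end] range the application passed to
 * glDrawRangeElements() against what the index data really uses. */
static bool
claimed_range_is_valid(const unsigned *indices, size_t count,
                       unsigned start, unsigned end)
{
   if (count == 0)
      return true;

   struct index_range r = scan_index_range(indices, count);
   return r.min >= start && r.max <= end;
}
```

With index data { 0, 1, 2, 3, 4, 5 }, a claimed range of [0, 3] fails this
check while [0, 5] passes, which matches the Unreal case described above.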

Makes Unreal demos draw again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago nvc0: invalidate textures/samplers between 3D and CP on Fermi
Samuel Pitoiset [Thu, 26 May 2016 21:01:37 +0000 (23:01 +0200)]
nvc0: invalidate textures/samplers between 3D and CP on Fermi

Like constant buffers, samplers and textures are aliased on Fermi and
we need to invalidate the state when switching from 3D to CP and vice
versa.

This fixes rendering issues in the UE4 demos.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago anv: Stop linking against libmesa.la and libdri_test_stubs.la
Jason Ekstrand [Thu, 26 May 2016 01:20:40 +0000 (18:20 -0700)]
anv: Stop linking against libmesa.la and libdri_test_stubs.la

This brings the final size of an optimized non-debug build of the Vulkan
driver down to 2.9 MB as opposed to 8.7 MB for the dri driver.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
8 years ago i965: Don't link libmesa or libdri_test_stubs into tests
Jason Ekstrand [Thu, 26 May 2016 00:51:59 +0000 (17:51 -0700)]
i965: Don't link libmesa or libdri_test_stubs into tests

Now that the compiler has been completely separated from libmesa, we no
longer need these.  We can make the tests much smaller by not linking them
in.  This also ensures that anyone who runs make check won't accidentally
put in any dependencies from the compiler to the rest of mesa core.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
8 years ago i965: Move compiler debug functions to intel_screen.c
Jason Ekstrand [Thu, 26 May 2016 01:19:50 +0000 (18:19 -0700)]
i965: Move compiler debug functions to intel_screen.c

They reference the compiler so they shouldn't go in libi965_compiler.la.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
8 years ago i965/test: Remove the fragment/vertex_program field from test visitors
Jason Ekstrand [Thu, 26 May 2016 00:41:59 +0000 (17:41 -0700)]
i965/test: Remove the fragment/vertex_program field from test visitors

None of them are actually using it.  It's a relic of an older compiler
interface that required a gl_program.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
8 years ago i965: Move brw_new_shader to brw_link.cpp
Jason Ekstrand [Thu, 26 May 2016 00:46:07 +0000 (17:46 -0700)]
i965: Move brw_new_shader to brw_link.cpp

That's where brw_link_shader lives and they seem to go together.  Also,
this gets it out of libi965_compiler.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
8 years ago i965: Move brw_nir_lower_uniforms.cpp to i965_FILES
Jason Ekstrand [Thu, 26 May 2016 00:29:38 +0000 (17:29 -0700)]
i965: Move brw_nir_lower_uniforms.cpp to i965_FILES

This gets it out of i965_compiler.la

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
8 years ago i965: Move brw_create_nir to brw_program.c
Jason Ekstrand [Thu, 26 May 2016 00:27:23 +0000 (17:27 -0700)]
i965: Move brw_create_nir to brw_program.c

This way it's no longer part of libi965_compiler.la since it depends on
GLSL and ARB program stuff.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>