mesa.git
9 years ago nir/spirv: Rework decoration iteration
Jason Ekstrand [Fri, 20 Nov 2015 23:15:38 +0000 (15:15 -0800)]
nir/spirv: Rework decoration iteration

The old code didn't work correctly if you had member decorations after
non-member decorations.  Since glslang never gave us any of those, it
wasn't properly tested.

9 years ago nir/spirv: Handle OpNop
Jason Ekstrand [Fri, 20 Nov 2015 23:02:45 +0000 (15:02 -0800)]
nir/spirv: Handle OpNop

9 years ago gen8_state: Clamp sampler values to HW limitations
Jason Ekstrand [Fri, 20 Nov 2015 22:45:44 +0000 (14:45 -0800)]
gen8_state: Clamp sampler values to HW limitations

9 years ago nir/spirv: Add support for runtime arrays
Jason Ekstrand [Fri, 20 Nov 2015 20:49:20 +0000 (12:49 -0800)]
nir/spirv: Add support for runtime arrays

9 years ago gen8/pipeline: Properly handle MIN/MAX blend ops
Jason Ekstrand [Fri, 20 Nov 2015 19:53:10 +0000 (11:53 -0800)]
gen8/pipeline: Properly handle MIN/MAX blend ops

9 years ago gen8/pipeline: Set IndependentAlphaBlendEnable properly
Jason Ekstrand [Fri, 20 Nov 2015 19:52:54 +0000 (11:52 -0800)]
gen8/pipeline: Set IndependentAlphaBlendEnable properly

9 years ago gen8/pipeline: Minor blending fixes
Jason Ekstrand [Fri, 20 Nov 2015 19:52:28 +0000 (11:52 -0800)]
gen8/pipeline: Minor blending fixes

This makes various fields match upstream mesa

9 years ago anv: Put all of the descriptor set stuff together in one file
Jason Ekstrand [Wed, 18 Nov 2015 22:58:40 +0000 (14:58 -0800)]
anv: Put all of the descriptor set stuff together in one file

The stuff to take descriptor sets and turn them into binding tables and
sampler tables is still in anv_cmd_buffer.c.  We may want to consider
putting it in anv_descriptor_set.c eventually.

9 years ago anv/device: Update the right sampler in UpdateDescriptorSets
Jason Ekstrand [Wed, 18 Nov 2015 22:48:28 +0000 (14:48 -0800)]
anv/device: Update the right sampler in UpdateDescriptorSets

9 years ago anv/cmd_buffer: Add a new genX_cmd_buffer file for shared code
Jason Ekstrand [Wed, 18 Nov 2015 20:25:11 +0000 (12:25 -0800)]
anv/cmd_buffer: Add a new genX_cmd_buffer file for shared code

This file contains code that can be shared across gens modulo recompiling.
In particular, we can share STATE_BASE_ADDRESS setup and handling of the
vkPipelineBarrier call.  Not sharing STATE_BASE_ADDRESS setup has already
been a source of bugs and the gen7 and gen8 implementations of
PipelineBarrier were line-for-line identical.

Incidentally, this should fix MOCS settings for dynamic and surface state
on Haswell.

9 years ago anv/gen7: A bunch of depth-stencil fixes
Jason Ekstrand [Wed, 18 Nov 2015 19:43:48 +0000 (11:43 -0800)]
anv/gen7: A bunch of depth-stencil fixes

There are various bits which move around between Haswell and Ivy Bridge
that we weren't taking into account.  This also makes us actually set the
StencilWriteEnable in a sane way.

9 years ago gen7/pipeline: Re-arrange stencil parameters to match gen8
Jason Ekstrand [Wed, 18 Nov 2015 03:10:31 +0000 (19:10 -0800)]
gen7/pipeline: Re-arrange stencil parameters to match gen8

9 years ago anv/gen7: Implement CmdPipelineBarrier
Jason Ekstrand [Wed, 18 Nov 2015 01:09:27 +0000 (17:09 -0800)]
anv/gen7: Implement CmdPipelineBarrier

9 years ago anv/gen7: Don't use the upper bound on dynamic state base address
Jason Ekstrand [Wed, 18 Nov 2015 01:08:42 +0000 (17:08 -0800)]
anv/gen7: Don't use the upper bound on dynamic state base address

It doesn't do much for us and, if we have to resize the dynamic state block
pool for any reason, it becomes out-of-date.

9 years ago anv: Add initial Haswell support
Jason Ekstrand [Tue, 17 Nov 2015 15:07:02 +0000 (07:07 -0800)]
anv: Add initial Haswell support

9 years ago anv: Add macros for doing per-gen compilation
Jason Ekstrand [Tue, 17 Nov 2015 14:31:25 +0000 (06:31 -0800)]
anv: Add macros for doing per-gen compilation

9 years ago anv/entrypoints: Add dispatch support for haswell
Jason Ekstrand [Tue, 17 Nov 2015 14:30:47 +0000 (06:30 -0800)]
anv/entrypoints: Add dispatch support for haswell

9 years ago anv/entrypoints: Use devinfo instead of a gen number
Jason Ekstrand [Tue, 17 Nov 2015 14:30:02 +0000 (06:30 -0800)]
anv/entrypoints: Use devinfo instead of a gen number

9 years ago anv/cmd_buffer: Pack the 3DSTATE_VF packet on-demand
Jason Ekstrand [Tue, 17 Nov 2015 00:29:33 +0000 (16:29 -0800)]
anv/cmd_buffer: Pack the 3DSTATE_VF packet on-demand

9 years ago anv/formats: Don't advertise stencil texture/blit prior to Broadwell
Jason Ekstrand [Tue, 17 Nov 2015 09:32:55 +0000 (01:32 -0800)]
anv/formats: Don't advertise stencil texture/blit prior to Broadwell

9 years ago anv: Only include the pack headers where needed
Jason Ekstrand [Mon, 16 Nov 2015 20:29:07 +0000 (12:29 -0800)]
anv: Only include the pack headers where needed

Previously, we were including gen7_pack.h, gen75_pack.h, and gen8_pack.h
in anv_private.h.  As we add more gens, this is going to become untenable.
This commit moves things around so that we only use the pack headers when
and if we need them.

9 years ago anv/cmd_buffer: Move gen-specific stuff into the appropriate files
Jason Ekstrand [Mon, 16 Nov 2015 20:10:11 +0000 (12:10 -0800)]
anv/cmd_buffer: Move gen-specific stuff into the appropriate files

9 years ago nir/spirv: Add support for separate samplers and textures
Jason Ekstrand [Sun, 15 Nov 2015 06:32:54 +0000 (22:32 -0800)]
nir/spirv: Add support for separate samplers and textures

This gets tricky in a few places because we have to pass vtn_sampled_image
values through OpAccessChain, but it works ok.  At some point, it probably
needs to be cleaned up but it doesn't occur to me exactly how to do that at
the moment.  We'll see how this approach goes.

9 years ago anv/cmd_buffer: Add a default descriptor type case
Jason Ekstrand [Sat, 14 Nov 2015 17:16:53 +0000 (09:16 -0800)]
anv/cmd_buffer: Add a default descriptor type case

This silences a bunch of compiler warnings.

9 years ago anv/apply_pipeline_layout: Handle separate samplers and textures
Jason Ekstrand [Sat, 14 Nov 2015 17:00:35 +0000 (09:00 -0800)]
anv/apply_pipeline_layout: Handle separate samplers and textures

9 years ago Merge branch 'wip/i965-separate-sampler-tex' into vulkan
Jason Ekstrand [Sat, 14 Nov 2015 16:23:27 +0000 (08:23 -0800)]
Merge branch 'wip/i965-separate-sampler-tex' into vulkan

9 years ago i965/vec4: Plumb separate surfaces and samplers through from NIR
Jason Ekstrand [Tue, 3 Nov 2015 02:39:17 +0000 (18:39 -0800)]
i965/vec4: Plumb separate surfaces and samplers through from NIR

9 years ago i965/vec4: Separate the sampler from the surface in generate_tex
Jason Ekstrand [Tue, 3 Nov 2015 02:28:49 +0000 (18:28 -0800)]
i965/vec4: Separate the sampler from the surface in generate_tex

9 years ago i965/fs: Plumb separate surfaces and samplers through from NIR
Jason Ekstrand [Tue, 3 Nov 2015 00:04:29 +0000 (16:04 -0800)]
i965/fs: Plumb separate surfaces and samplers through from NIR

9 years ago i965/fs: Separate the sampler from the surface in generate_tex
Jason Ekstrand [Mon, 2 Nov 2015 23:24:05 +0000 (15:24 -0800)]
i965/fs: Separate the sampler from the surface in generate_tex

9 years ago nir: Separate texture from sampler in nir_tex_instr
Jason Ekstrand [Tue, 3 Nov 2015 01:58:29 +0000 (17:58 -0800)]
nir: Separate texture from sampler in nir_tex_instr

This commit adds the capability to NIR to support separate textures and
samplers.  As it currently stands, glsl_to_nir only sets the sampler and
leaves the texture alone as it did before and nir_lower_samplers assumes
this.  However, backends can, if they wish, assume that they are separate
because nir_lower_samplers sets both texture and sampler index (they are
the same in this case).

9 years ago Merge remote-tracking branch 'mesa-public/master' into vulkan
Jason Ekstrand [Sat, 14 Nov 2015 15:56:10 +0000 (07:56 -0800)]
Merge remote-tracking branch 'mesa-public/master' into vulkan

This pulls in Matt's big compiler refactor.

9 years ago nouveau: don't expose HEVC decoding support
Ilia Mirkin [Sat, 14 Nov 2015 15:28:55 +0000 (10:28 -0500)]
nouveau: don't expose HEVC decoding support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
9 years ago anv/gen8: Subtract 1 from num_elements when setting up buffer surface state
Jason Ekstrand [Sat, 14 Nov 2015 06:50:52 +0000 (22:50 -0800)]
anv/gen8: Subtract 1 from num_elements when setting up buffer surface state

9 years ago anv/pipeline: Don't free blend states that don't exist
Jason Ekstrand [Sat, 14 Nov 2015 05:49:39 +0000 (21:49 -0800)]
anv/pipeline: Don't free blend states that don't exist

Compute pipelines don't need a blend state so we shouldn't be
unconditionally freeing it.

9 years ago nir/spirv: Add support for SSBO stores
Jason Ekstrand [Sat, 14 Nov 2015 05:33:16 +0000 (21:33 -0800)]
nir/spirv: Add support for SSBO stores

This only handles vector stores, not component-of-a-vector stores.

9 years ago nir/spirv: Refactor vtn_block_load
Jason Ekstrand [Sat, 14 Nov 2015 05:31:58 +0000 (21:31 -0800)]
nir/spirv: Refactor vtn_block_load

We pull the offset calculations out into their own function so we can
re-use it for stores.

9 years ago nir/spirv: Add support for image_load_store
Jason Ekstrand [Sat, 14 Nov 2015 01:26:22 +0000 (17:26 -0800)]
nir/spirv: Add support for image_load_store

9 years ago nir/builder: Add a nir_ssa_undef helper
Jason Ekstrand [Sat, 14 Nov 2015 00:25:24 +0000 (16:25 -0800)]
nir/builder: Add a nir_ssa_undef helper

9 years ago nir/spirv: Add support for creating image variables
Jason Ekstrand [Fri, 13 Nov 2015 23:53:08 +0000 (15:53 -0800)]
nir/spirv: Add support for creating image variables

9 years ago nir/spirv: Add support for image types
Jason Ekstrand [Fri, 13 Nov 2015 23:52:52 +0000 (15:52 -0800)]
nir/spirv: Add support for image types

9 years ago nir/types: Add image type helpers
Jason Ekstrand [Fri, 13 Nov 2015 23:49:13 +0000 (15:49 -0800)]
nir/types: Add image type helpers

9 years ago glsl/types: Add a get_image_instance helper
Jason Ekstrand [Fri, 13 Nov 2015 23:48:48 +0000 (15:48 -0800)]
glsl/types: Add a get_image_instance helper

9 years ago nir: Silence GCC maybe-uninitialized warnings.
Vinson Lee [Mon, 2 Nov 2015 09:23:59 +0000 (01:23 -0800)]
nir: Silence GCC maybe-uninitialized warnings.

nir/nir_control_flow.c: In function ‘split_block_cursor.isra.11’:
nir/nir_control_flow.c:460:15: warning: ‘after’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       *_after = after;
               ^
nir/nir_control_flow.c:458:16: warning: ‘before’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       *_before = before;
                ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years ago i965: Add a SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT opcode.
Kenneth Graunke [Sat, 7 Nov 2015 09:37:33 +0000 (01:37 -0800)]
i965: Add a SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT opcode.

We need to use per-slot offsets when there's non-uniform indexing,
as each SIMD channel could have a different index.  We want to use
them for any non-constant index (even if uniform), as it lives in
the message header instead of the descriptor, allowing us to set
offsets in GRFs rather than immediates.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
9 years ago glsl: Allow implicit int -> uint conversions for the % operator.
Kenneth Graunke [Thu, 12 Nov 2015 21:02:05 +0000 (13:02 -0800)]
glsl: Allow implicit int -> uint conversions for the % operator.

GLSL 4.00 and GL_ARB_gpu_shader5 introduced a new int -> uint implicit
conversion rule and updated the rules for modulus to use them.  (In
earlier languages, none of the implicit conversion rules did anything
relevant, so there was no point in applying them.)

This allows expressions such as:

   int foo;
   uint bar;
   uint mod = foo % bar;
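
As a rough analogy only (this is C++, not GLSL, and not code from the patch), C++'s usual arithmetic conversions treat a mixed signed/unsigned `%` in a similar spirit: the signed operand is converted to unsigned before the operation. The helper name `mod_mixed` is hypothetical, and GLSL's exact conversion semantics are defined by its own specification rather than by this sketch.

```cpp
#include <cstdint>

// Analogy only: shows C++ conversion rules, not the GLSL compiler's behavior.
// The signed operand is converted to unsigned, so '%' is an unsigned modulus.
std::uint32_t mod_mixed(std::int32_t foo, std::uint32_t bar)
{
    // Written out explicitly. On platforms where int is 32 bits wide,
    // 'foo % bar' performs the same conversion implicitly.
    return static_cast<std::uint32_t>(foo) % bar;
}
```

For non-negative values the result matches ordinary integer modulus; for negative `foo` the value wraps modulo 2^32 before the remainder is taken.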

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years ago i965: Print input/output VUE maps on INTEL_DEBUG=vs, gs.
Kenneth Graunke [Tue, 10 Nov 2015 08:48:33 +0000 (00:48 -0800)]
i965: Print input/output VUE maps on INTEL_DEBUG=vs, gs.

I've been carrying around a patch to do this for the last few months,
and it's been exceedingly useful for debugging GS and tessellation
problems.  I've caught lots of bugs by inspecting the interface
expectations of two adjacent stages.

It's not that much spam, so I figure we may as well just print it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
9 years ago i965: Make convert_attr_sources_to_hw_regs handle stride == 0.
Kenneth Graunke [Thu, 12 Nov 2015 06:37:53 +0000 (22:37 -0800)]
i965: Make convert_attr_sources_to_hw_regs handle stride == 0.

This makes expressions like component(fs_reg(ATTR, n), 7) get a proper
<0,1,0> region instead of the invalid <0,8,0>.

Nobody uses this today, but I plan to.

v2: Rebase on Matt's changes; simplify.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
9 years ago nir: Add helpers for getting input/output intrinsic sources.
Kenneth Graunke [Sun, 8 Nov 2015 06:35:33 +0000 (22:35 -0800)]
nir: Add helpers for getting input/output intrinsic sources.

With the many variants of IO intrinsics, particular sources are often in
different locations.  It's convenient to say "give me the indirect
offset" or "give me the vertex index" and have it just work, without
having to think about exactly which kind of intrinsic you have.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago nir: Don't lower TCS outputs to temporaries.
Kenneth Graunke [Mon, 19 Oct 2015 18:28:15 +0000 (11:28 -0700)]
nir: Don't lower TCS outputs to temporaries.

We'd like to shadow these when possible, but the current code doesn't
work properly for TCS outputs.  For now, disable it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago nir: Allow outputs reads and add the relevant intrinsics.
Kenneth Graunke [Mon, 19 Oct 2015 18:44:28 +0000 (11:44 -0700)]
nir: Allow outputs reads and add the relevant intrinsics.

Normally, we rely on nir_lower_outputs_to_temporaries to create shadow
variables for outputs, buffering the results and writing them all out
at the end of the program.  However, this is infeasible for tessellation
control shader outputs.

Tessellation control shaders can generate multiple output vertices, and
write per-vertex outputs.  These are arrays indexed by the vertex
number; each thread only writes one element, but can read any other
element - including those being concurrently written by other threads.
The barrier() intrinsic synchronizes between threads.

Even if we tried to shadow every output element (which is of dubious
value), we'd have to read updated values in at barrier() time, which
means we need to allow output reads.

Most stages should continue using nir_lower_outputs_to_temporaries(),
but in theory drivers could choose not to if they really wanted.

v2: Rebase to accommodate Jason's review feedback.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago nir/lower_io: Introduce nir_store_per_vertex_output intrinsics.
Kenneth Graunke [Fri, 2 Oct 2015 07:11:01 +0000 (00:11 -0700)]
nir/lower_io: Introduce nir_store_per_vertex_output intrinsics.

Similar to nir_load_per_vertex_input, but for outputs.  This is not
useful in geometry shaders, but will be useful in tessellation shaders.

v2: Change stage_uses_per_vertex_outputs() to is_per_vertex_output(),
    taking a nir_variable (requested by Jason Ekstrand).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago nir/lower_io: Use load_per_vertex_input intrinsics for TCS and TES.
Kenneth Graunke [Thu, 1 Oct 2015 00:17:35 +0000 (17:17 -0700)]
nir/lower_io: Use load_per_vertex_input intrinsics for TCS and TES.

Tessellation control shader inputs are an array indexed by the vertex
number, like geometry shader inputs.  There aren't per-patch TCS inputs.

Tessellation evaluation shaders have both per-vertex and per-patch
inputs.  Per-vertex inputs get the new intrinsics; per-patch inputs
continue to use the ordinary load_input intrinsics, as they already
work like we want them to.

v2: Change stage_uses_per_vertex_inputs into is_per_vertex_input(),
    which takes a variable (requested by Jason Ekstrand).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965: Silence unused parameter warnings in get_buffer_rect
Ian Romanick [Mon, 2 Nov 2015 22:29:42 +0000 (14:29 -0800)]
i965: Silence unused parameter warnings in get_buffer_rect

brw_meta_fast_clear.c: In function 'get_buffer_rect':
brw_meta_fast_clear.c:318:37: warning: unused parameter 'brw' [-Wunused-parameter]
 get_buffer_rect(struct brw_context *brw, struct gl_framebuffer *fb,
                                     ^
brw_meta_fast_clear.c:319:44: warning: unused parameter 'irb' [-Wunused-parameter]
                 struct intel_renderbuffer *irb, struct rect *rect)
                                            ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years ago meta/generate_mipmap: Don't leak the sampler object
Ian Romanick [Tue, 10 Nov 2015 20:36:58 +0000 (12:36 -0800)]
meta/generate_mipmap: Don't leak the sampler object

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years ago i965: Remove unneeded #includes.
Matt Turner [Fri, 13 Nov 2015 20:16:48 +0000 (12:16 -0800)]
i965: Remove unneeded #includes.

Some of these are no longer needed since all the backends switched to
NIR.

9 years ago i965: Silence warning.
Matt Turner [Fri, 13 Nov 2015 20:13:14 +0000 (12:13 -0800)]
i965: Silence warning.

intel_asm_annotation.c: In function ‘annotation_insert_error’:
intel_asm_annotation.c:214:18:
warning: ‘ann’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
       ann->error = ralloc_strdup(annotation->mem_ctx, error);
                         ^

I initially tried changing the type of ann_count to unsigned (it is
currently int), since that, together with the check at the beginning of
the function that it's non-zero, seems sufficient to prove that it must
be greater than zero. Unfortunately that wasn't sufficient.

9 years ago i965: Don't write beyond allocated memory.
Juha-Pekka Heikkila [Fri, 13 Nov 2015 11:36:43 +0000 (13:36 +0200)]
i965: Don't write beyond allocated memory.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
9 years ago i965: Use BRW_MRF_COMPR4 macro in more places.
Matt Turner [Mon, 2 Nov 2015 18:23:12 +0000 (10:23 -0800)]
i965: Use BRW_MRF_COMPR4 macro in more places.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Combine register file field.
Matt Turner [Tue, 27 Oct 2015 01:41:27 +0000 (18:41 -0700)]
i965: Combine register file field.

The first four values (2-bits) are hardware values, and VGRF, ATTR, and
UNIFORM remain values used in the IR.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Replace HW_REG with ARF/FIXED_GRF.
Matt Turner [Tue, 27 Oct 2015 00:52:57 +0000 (17:52 -0700)]
i965: Replace HW_REG with ARF/FIXED_GRF.

HW_REGs are (were!) kind of awful. If the file was HW_REG, you had to
look at different fields for type, abs, negate, writemask, swizzle, and
a second file. They also caused annoying problems like immediate sources
being considered scheduling barriers (commit 6148e94e2) and other such
nonsense.

Instead use ARF/FIXED_GRF/MRF for fixed registers in those files.

After a sufficient amount of time has passed since "GRF" was used, we
can rename FIXED_GRF -> GRF, but doing so now would make rebasing awful.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965/fs: Set stride correctly for immediates in fs_reg(brw_reg).
Matt Turner [Mon, 2 Nov 2015 00:25:04 +0000 (00:25 +0000)]
i965/fs: Set stride correctly for immediates in fs_reg(brw_reg).

The fs_reg() constructors for immediates set stride to 0, except for
vector-immediates, which set stride to 1.  This patch makes the fs_reg
constructor that takes a brw_reg do likewise, so that stride is set
correctly for cases such as fs_reg(brw_imm_v(...)).

The generator asserts that this is true (and presumably it's useful in
some optimization passes?) and the VF fs_reg constructors did this (by
virtue of the fact that it doesn't override what init() does).

In the next commit, calling this constructor with brw_imm_* will generate
an IMM file register rather than a HW_REG, making this change necessary
to avoid breakage with existing uses of brw_imm_v().

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965/fs: Handle type-V immediates in brw_reg_from_fs_reg().
Matt Turner [Mon, 2 Nov 2015 00:22:29 +0000 (00:22 +0000)]
i965/fs: Handle type-V immediates in brw_reg_from_fs_reg().

We use brw_imm_v() to produce type-V immediates, which generates a
brw_reg with fs_reg's .file set to HW_REG. The next commit will rid us
of HW_REGs, so we need to handle BRW_REGISTER_TYPE_V in the IMM case.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Rename GRF to VGRF.
Matt Turner [Tue, 27 Oct 2015 00:09:25 +0000 (17:09 -0700)]
i965: Rename GRF to VGRF.

The 2-bit hardware register file field is ARF, GRF, MRF, IMM.

Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to
mean an assigned general purpose register.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Move BAD_FILE from the beginning of enum register_file.
Matt Turner [Fri, 30 Oct 2015 05:04:22 +0000 (22:04 -0700)]
i965: Move BAD_FILE from the beginning of enum register_file.

I'm going to begin using brw_reg's file field in backend_reg and its
derivatives, and in order to keep the hardware value for ARF as 0, we
have to do something different.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Initialize registers.
Matt Turner [Fri, 30 Oct 2015 20:53:38 +0000 (13:53 -0700)]
i965: Initialize registers.

The test (file == BAD_FILE) works on registers for which the constructor
has not run because BAD_FILE is zero.  The next commit will move
BAD_FILE in the enum so that it's no longer zero.

In the case of this->outputs, the constructor was being run implicitly,
and we were unnecessarily memsetting it to zero.
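
The dependency on BAD_FILE being zero can be illustrated with a minimal sketch. The enum ordering and the `reg` struct below are assumptions made for illustration (hardware files first, BAD_FILE last); they are not copied from the i965 sources.

```cpp
#include <cstring>

// Assumed ordering for illustration: hardware files first, BAD_FILE last.
enum register_file { ARF = 0, GRF, MRF, IMM, VGRF, ATTR, UNIFORM, BAD_FILE };

// Hypothetical, simplified register.
struct reg {
    register_file file;
    int nr;
};

// Storage that was only zero-filled: 'file' reads back as enumerator 0.
inline reg zero_filled_reg()
{
    reg r;
    std::memset(&r, 0, sizeof(r));
    return r;
}

// Storage that was explicitly initialized, as a constructor would do.
inline reg initialized_reg()
{
    reg r;
    r.file = BAD_FILE;
    r.nr = 0;
    return r;
}
```

With this ordering a zero-filled register compares equal to ARF rather than BAD_FILE, so a `file == BAD_FILE` test is only reliable on registers that were explicitly initialized.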

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Use brw_reg's nr field to store register number.
Matt Turner [Mon, 26 Oct 2015 11:35:14 +0000 (04:35 -0700)]
i965: Use brw_reg's nr field to store register number.

In addition to combining another field, we get to replace silliness like
"reg.reg" with something that actually makes sense, "reg.nr"; and no one
will ever wonder again why dst.reg isn't a dst_reg.

Moving the now 16-bit nr field to a 16-bit boundary decreases code size
by about 3k.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Unwrap some lines.
Matt Turner [Mon, 26 Oct 2015 11:04:16 +0000 (04:04 -0700)]
i965: Unwrap some lines.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965/vec4: Remove swizzle/writemask fields from src/dst_reg.
Matt Turner [Mon, 26 Oct 2015 04:14:56 +0000 (21:14 -0700)]
i965/vec4: Remove swizzle/writemask fields from src/dst_reg.

Also allows us to handle HW_REGs in the swizzle() and writemask()
functions.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Remove fixed_hw_reg field from backend_reg.
Matt Turner [Sat, 24 Oct 2015 22:29:03 +0000 (15:29 -0700)]
i965: Remove fixed_hw_reg field from backend_reg.

Since backend_reg now inherits brw_reg, we can use it in place of the
fixed_hw_reg field.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Use immediate storage in inherited brw_reg.
Matt Turner [Sat, 24 Oct 2015 21:55:57 +0000 (14:55 -0700)]
i965: Use immediate storage in inherited brw_reg.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Add and use enum brw_reg_file.
Matt Turner [Fri, 23 Oct 2015 20:11:44 +0000 (13:11 -0700)]
i965: Add and use enum brw_reg_file.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Reorganize brw_reg fields.
Matt Turner [Fri, 23 Oct 2015 19:17:03 +0000 (12:17 -0700)]
i965: Reorganize brw_reg fields.

Put fields that are meaningless with an immediate in the same storage
with the immediate. This leaves fields type, file, nr, subnr in the
first dword where there's now extra room for expansion.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Make 'dw1' and 'bits' unnamed structures in brw_reg.
Matt Turner [Fri, 23 Oct 2015 02:41:30 +0000 (19:41 -0700)]
i965: Make 'dw1' and 'bits' unnamed structures in brw_reg.

Generated by

   sed -i -e 's/\.bits\././g' *.c *.h *.cpp
   sed -i -e 's/dw1\.//g' *.c *.h *.cpp

and then reverting changes to comments in gen7_blorp.cpp and
brw_fs_generator.cpp.

There wasn't any utility offered by forcing the programmer to list these
to access their fields. Removing them will reduce churn in future
commits.

This is C11 (and gcc has apparently supported it for some time, for
"compatibility with other compilers").

See https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
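
The effect on call sites can be sketched with an anonymous union, which is standard C++. Anonymous structs (the 'bits' case) are standard in C11 and commonly accepted as an extension by C++ compilers. The two structs below are hypothetical, heavily simplified stand-ins and do not reproduce the real brw_reg layout.

```cpp
// Hypothetical, simplified register with a named union member.
struct reg_named {
    unsigned type;
    union {
        float f;
        int d;
        unsigned ud;
    } dw1;               // named member: accesses must spell out 'dw1'
};

// Same layout idea with the union left unnamed.
struct reg_unnamed {
    unsigned type;
    union {
        float f;
        int d;
        unsigned ud;
    };                   // anonymous union: members are accessed directly
};

inline unsigned named_ud(unsigned v)
{
    reg_named r{};
    r.dw1.ud = v;        // before: the intermediate name is required
    return r.dw1.ud;
}

inline unsigned unnamed_ud(unsigned v)
{
    reg_unnamed r{};
    r.ud = v;            // after: the same field without the extra name
    return r.ud;
}
```

Both functions store and read the same field; only the spelling of the access changes, which is what the sed commands above rewrite mechanically.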

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Delete type field from backend_reg.
Matt Turner [Sat, 24 Oct 2015 22:04:23 +0000 (15:04 -0700)]
i965: Delete type field from backend_reg.

Switching from an implicitly-sized type field to a field with an explicit
bit width is safe because we have fewer than 2^4 types, and gcc will
warn if you attempt to set a value that will not fit.
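
A minimal sketch of the size argument, assuming a 4-bit field as the "2^4" figure suggests. The enum `reg_type` and its members are hypothetical placeholders, not the real BRW register type list.

```cpp
// Hypothetical type list; the point is only that the count fits in 4 bits.
enum reg_type { TYPE_UD, TYPE_D, TYPE_UW, TYPE_W, TYPE_F, TYPE_COUNT };

// Hypothetical register with an explicitly sized type field.
struct reg {
    unsigned type : 4;    // explicit width: holds values 0..15
};

// Compile-time check that every enumerator is representable in the field.
static_assert(TYPE_COUNT <= (1u << 4),
              "every register type must fit in the 4-bit field");

inline unsigned stored_type(reg_type t)
{
    reg r{};
    r.type = t;           // enumerator converts to the unsigned bit-field
    return r.type;
}
```

The static_assert is an extra safeguard added for this sketch; the commit itself relies on the compiler warning when a constant that does not fit is assigned.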

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Delete abs/negate fields from backend_reg.
Matt Turner [Sat, 24 Oct 2015 21:35:33 +0000 (14:35 -0700)]
i965: Delete abs/negate fields from backend_reg.

Instead use the ones provided by brw_reg. Also allows us to handle
HW_REGs in the negate() functions.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Make backend_reg inherit from brw_reg.
Matt Turner [Sat, 24 Oct 2015 21:32:03 +0000 (14:32 -0700)]
i965: Make backend_reg inherit from brw_reg.

Some fields (file, type, abs, negate) in brw_reg are shadowed by
backend_reg.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965/fs: Replace nested ternary with if ladder.
Matt Turner [Fri, 13 Nov 2015 00:02:22 +0000 (16:02 -0800)]
i965/fs: Replace nested ternary with if ladder.

Since the types of the expression were

   bool ? src_reg : (bool ? brw_reg : brw_reg)

the result of the second (nested) ternary would be implicitly
converted to a src_reg by the src_reg(struct brw_reg) constructor. I.e.,

   bool ? src_reg : src_reg(bool ? brw_reg : brw_reg)

In the next patch, I make backend_reg (the parent of src_reg) inherit
from brw_reg, which changes this expression to return brw_reg, which
throws away any fields that exist in the classes derived from brw_reg.
I.e.,

   src_reg(bool ? brw_reg(src_reg) : bool ? brw_reg : brw_reg)

Generally this code was gross, and wasn't actually shorter or easier to
read than an if ladder.
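
The slicing described above can be reproduced in a small, self-contained sketch. `hw_reg` and `ir_reg` are hypothetical stand-ins for brw_reg and src_reg, and the `offset` field is an assumed example of a member that exists only in the derived class.

```cpp
// Hypothetical stand-in for brw_reg.
struct hw_reg {
    int nr = 0;
};

// Hypothetical stand-in for src_reg, derived from the hardware register.
struct ir_reg : hw_reg {
    int offset = 0;                          // exists only in the derived class
    ir_reg() = default;
    ir_reg(const hw_reg &r) : hw_reg(r) {}   // converting constructor, offset stays 0
};

// Mixed operands: the conditional expression has type hw_reg, so 'a' is
// sliced before being converted back to ir_reg for the return value.
ir_reg pick_ternary(bool use_ir, const ir_reg &a, const hw_reg &b)
{
    return use_ir ? a : b;
}

// The if ladder converts each branch independently, so 'a' is returned intact.
ir_reg pick_ladder(bool use_ir, const ir_reg &a, const hw_reg &b)
{
    if (use_ir)
        return a;
    return b;
}
```

Both functions return the same `nr`, but only the if ladder preserves the derived-class field when the first branch is taken.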

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago isl: Embed brw_device_info in isl_device
Chad Versace [Fri, 13 Nov 2015 19:12:46 +0000 (11:12 -0800)]
isl: Embed brw_device_info in isl_device

Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago radeonsi: remove dead code after ES-GS linkage change
Marek Olšák [Thu, 15 Oct 2015 21:41:35 +0000 (23:41 +0200)]
radeonsi: remove dead code after ES-GS linkage change

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years ago radeonsi: link ES-GS just like LS-HS
Marek Olšák [Thu, 15 Oct 2015 21:29:00 +0000 (23:29 +0200)]
radeonsi: link ES-GS just like LS-HS

This reduces the shader key for ES.

Use a fixed attrib location based on (semantic name,  index).

The ESGS item size is determined by the physical index of the highest ES
output, so it's almost always larger than before, but I think that
shouldn't matter as long as the ESGS ring buffer is large enough.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years ago radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga
Marek Olšák [Sun, 8 Nov 2015 12:34:44 +0000 (13:34 +0100)]
radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga

I discovered that increasing the ESGS ring size fixes GS hangs on Tonga,
so let's do it properly.

There is now a separate init_config_gs_rings state that is not immutable,
because GS rings are resized when needed.

This also saves some memory. Most apps won't need more than 1MB
per ring per shader engine.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: rename si_update_gs_rings
Marek Olšák [Sun, 8 Nov 2015 11:15:54 +0000 (12:15 +0100)]
radeonsi: rename si_update_gs_rings

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: calculate ESGS_RING_ITEMSIZE in create_shader
Marek Olšák [Sun, 8 Nov 2015 11:12:46 +0000 (12:12 +0100)]
radeonsi: calculate ESGS_RING_ITEMSIZE in create_shader

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: move maximum gs stream calculation into create_shader
Marek Olšák [Sun, 8 Nov 2015 11:05:39 +0000 (12:05 +0100)]
radeonsi: move maximum gs stream calculation into create_shader

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: clean up small duplication in si_shader_gs
Marek Olšák [Sun, 8 Nov 2015 10:49:33 +0000 (11:49 +0100)]
radeonsi: clean up small duplication in si_shader_gs

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: shorten render_cond variable names
Marek Olšák [Sat, 7 Nov 2015 15:30:01 +0000 (16:30 +0100)]
gallium/radeon: shorten render_cond variable names

and ..._cond -> ..._invert

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: remove predicate_drawing flag
Marek Olšák [Sat, 7 Nov 2015 15:24:47 +0000 (16:24 +0100)]
gallium/radeon: remove predicate_drawing flag

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: atomize render condition (SET_PREDICATION)
Marek Olšák [Sat, 7 Nov 2015 14:39:39 +0000 (15:39 +0100)]
gallium/radeon: atomize render condition (SET_PREDICATION)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: simplify restoring render condition after flush
Marek Olšák [Sat, 7 Nov 2015 14:00:55 +0000 (15:00 +0100)]
gallium/radeon: simplify restoring render condition after flush

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: don't use PREDICATION_OP_CLEAR
Marek Olšák [Sat, 7 Nov 2015 13:55:23 +0000 (14:55 +0100)]
gallium/radeon: don't use PREDICATION_OP_CLEAR

Not setting the predication bit is sufficient.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: simplify disabling render condition for u_blitter
Marek Olšák [Sat, 7 Nov 2015 13:45:58 +0000 (14:45 +0100)]
gallium/radeon: simplify disabling render condition for u_blitter

just disable it by not setting the predication bit

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago r600g: don't set predication on non-draw packets
Marek Olšák [Sat, 7 Nov 2015 13:36:38 +0000 (14:36 +0100)]
r600g: don't set predication on non-draw packets

This has no effect.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: inline the r600_rings structure
Marek Olšák [Sat, 7 Nov 2015 13:00:30 +0000 (14:00 +0100)]
gallium/radeon: inline the r600_rings structure

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: prevent recursion in si_context_gfx_flush
Marek Olšák [Sat, 7 Nov 2015 11:22:56 +0000 (12:22 +0100)]
radeonsi: prevent recursion in si_context_gfx_flush

The recursion can only occur if you modify need_cs_space to always flush.
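
One common way to prevent this kind of re-entry is a guard flag on the context. The sketch below assumes that approach; the struct, its fields, and the `need_space`/`flush` functions are hypothetical and do not reproduce the radeonsi code.

```cpp
// Hypothetical context; names are illustrative, not the radeonsi structures.
struct context {
    bool flushing = false;     // set while a flush is in progress
    bool always_flush = false; // models a space check modified to always flush
    int flush_count = 0;       // number of flushes actually performed
};

void flush(context &ctx);

// Stand-in for a space check that may decide a flush is needed.
void need_space(context &ctx)
{
    if (ctx.always_flush)
        flush(ctx);
}

void flush(context &ctx)
{
    if (ctx.flushing)
        return;                // already flushing: do not recurse
    ctx.flushing = true;

    need_space(ctx);           // would re-enter flush() without the guard
    ctx.flush_count++;

    ctx.flushing = false;
}
```

With `always_flush` set, a single call to `flush` performs exactly one flush instead of recursing without bound.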

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: remove the IB flushing flag
Marek Olšák [Sat, 7 Nov 2015 12:43:18 +0000 (13:43 +0100)]
gallium/radeon: remove the IB flushing flag

Not needed anymore. A similar flag will be introduced in the next commit,
which will be private in radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago gallium/radeon: move GFX/DMA flushing from add_to_buffer_list to need_cs_space
Marek Olšák [Sat, 7 Nov 2015 12:31:03 +0000 (13:31 +0100)]
gallium/radeon: move GFX/DMA flushing from add_to_buffer_list to need_cs_space

need_cs_space isn't invoked so often and is called before all commands too.
This is a lot cleaner. The code in radeon_add_to_buffer_list always seemed
dodgy to me.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: rename cache flushing flags once more
Marek Olšák [Fri, 6 Nov 2015 20:11:16 +0000 (21:11 +0100)]
radeonsi: rename cache flushing flags once more

KCACHE, TC L1 and TC L2 are renamed to:
- SMEM L1
- VMEM L1
- GLOBAL L2

You can easily tell what they are used for now.
Shaders must deal with coherency issues between both L1s manually,
e.g. by setting GLC=1 or by using s_dcache_*.

BOTH_ICACHE_KCACHE was an unused definition.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: set the DISABLE_WR_CONFIRM flag on CI-VI as well
Marek Olšák [Sat, 7 Nov 2015 11:07:31 +0000 (12:07 +0100)]
radeonsi: set the DISABLE_WR_CONFIRM flag on CI-VI as well

I missed this in commit c3e527f93d4281ad6e2ca165eaf6ff588e4faefa
    radeonsi: only enable write confirmation on the last CP DMA packet

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
9 years ago radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney
Marek Olšák [Thu, 5 Nov 2015 22:56:38 +0000 (23:56 +0100)]
radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney

otherwise the SX or CB blocks can go bananas

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org