mesa.git
Marek Olšák [Sat, 15 Aug 2015 10:46:17 +0000 (12:46 +0200)]
radeonsi: save the contents of indirect buffers for debug contexts

This will be used by the IB parser.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 15 Aug 2015 21:44:04 +0000 (23:44 +0200)]
radeonsi: generate register and packet tables for an IB parser from sid.h

This makes writing a good IB parser a lot easier.

It generates 2 tables:
- packet3 table
- register table with all registers, fields, and named values

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 15 Aug 2015 16:48:06 +0000 (18:48 +0200)]
radeonsi: remove duplicated register definitions and instruction definitions

Instruction encoding isn't needed in Mesa.

The border color address registers were duplicated.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 15 Aug 2015 16:43:27 +0000 (18:43 +0200)]
r600g,radeonsi: remove unused ill-formed register field definitions

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 15 Aug 2015 21:56:22 +0000 (23:56 +0200)]
radeonsi: add an initial dump_debug_state implementation dumping shaders

This is usually called after a draw call.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 11 Jul 2015 11:13:07 +0000 (13:13 +0200)]
radeonsi: allow si_dump_key to write to a file

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 4 Jul 2015 12:10:21 +0000 (14:10 +0200)]
gallium/ddebug: new pipe for hang detection and driver state dumping (v2)

v2: lots of improvements

This is like identity or trace, but simpler. It doesn't wrap most states.

Run with:
  GALLIUM_DDEBUG=1000 [executable]
where "executable" is the app and "1000" is in milliseconds, meaning that
the context will be considered hung if a fence fails to signal in 1000 ms.

If that happens, all shaders, context states, bound resources, draw
parameters, and driver debug information (if any) will be dumped into:
  /home/$username/dd_dumps/$processname_$pid_$index.

Note that the context is flushed after every draw/clear/copy/blit operation
and then waited for to find the exact call that hangs.

You can also do:
  GALLIUM_DDEBUG=always
to do the dumping after every draw/clear/copy/blit operation without
flushing and waiting.

Examples of driver states that can be dumped are:
- Hardware status registers saying which hw block is busy (hung).
- Disassembled shaders in a human-readable form.
- The last submitted command buffer in a human-readable form.
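
The hang-detection loop described above can be sketched roughly as follows.
This is only an illustration, not the ddebug code: struct fake_fence,
fake_fence_wait() and find_hung_draw() are hypothetical names, and the
"fence" is simulated by a latency value instead of a real driver object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver fence: it signals after latency_ms,
 * or never if latency_ms is negative (a hung GPU). */
struct fake_fence {
   int64_t latency_ms;
};

/* Returns true if the fence signals within timeout_ms. */
static bool fake_fence_wait(const struct fake_fence *f, int64_t timeout_ms)
{
   return f->latency_ms >= 0 && f->latency_ms <= timeout_ms;
}

/* Walk the per-draw fences as if the context were flushed after every
 * draw. Returns the index of the first draw whose fence misses the
 * timeout (the point where state would be dumped), or -1 if none hung. */
static int find_hung_draw(const struct fake_fence *fences, int num_draws,
                          int64_t timeout_ms)
{
   assert(timeout_ms > 0);
   for (int i = 0; i < num_draws; i++) {
      if (!fake_fence_wait(&fences[i], timeout_ms))
         return i;
   }
   return -1;
}
```

Waiting after every single operation is what lets the first failing index
identify the exact call; batching several draws per flush would only narrow
the hang down to a range.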

v2: drop pipe-loader changes, drop SConscript
    rename dd.h -> dd_pipe.h

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 25 Jul 2015 16:40:59 +0000 (18:40 +0200)]
gallium: add flags parameter to pipe_screen::context_create

This allows creating compute-only and debug contexts.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sat, 11 Jul 2015 10:34:46 +0000 (12:34 +0200)]
gallium: add an interface for dumping debug driver state

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Ilia Mirkin [Mon, 24 Aug 2015 15:34:42 +0000 (11:34 -0400)]
mesa: remove pointless es31 checks, fix indirect to only be in es31

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Ilia Mirkin [Mon, 24 Aug 2015 13:35:04 +0000 (09:35 -0400)]
mesa: uncomment checks in es31 computation, add texture_ms

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Marek Olšák [Sun, 23 Aug 2015 22:22:37 +0000 (00:22 +0200)]
mesa: create multisample fallback textures like normal textures

This works if drivers upsample on upload (like all radeon ones do).
The alternative is an unexpected GL error from anything calling
_mesa_update_state and possibly other issues.

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Grazvydas Ignotas [Tue, 18 Aug 2015 00:23:29 +0000 (03:23 +0300)]
radeonsi: mark unreachable paths to avoid warnings

Otherwise we get:
warning: 'num_user_sgprs' may be used uninitialized in this function
...

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tapani Pälli [Tue, 28 Jul 2015 08:25:35 +0000 (11:25 +0300)]
mesa: GetTexLevelParameter{if}v changes for OpenGL ES 3.1

Patch refactors existing parameters check to first check common enums
between desktop GL and GLES 3.1 and modifies get_tex_level_parameter_image
to be compatible with enums specified in 3.1.

v2: remove extra is_gles31() checks (suggested by Ilia)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marta Lofstedt [Wed, 19 Aug 2015 13:30:33 +0000 (15:30 +0200)]
mesa/es3.1: Allow GL_COMPUTE_WORK_GROUP_SIZE for OpenGL ES 3.1

According to OpenGL ES specification section 7.12,
GL_COMPUTE_WORK_GROUP_SIZE is supported by the
glGetProgramiv function.

Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Marta Lofstedt [Wed, 19 Aug 2015 13:33:21 +0000 (15:33 +0200)]
mesa/es3.1: Enable getting MAX_COMPUTE_WORK_GROUP_ values for OpenGL ES 3.1

According to the OpenGL ES 3.1 specification chapter 17, the
MAX_COMPUTE_WORK_GROUP_COUNT and MAX_COMPUTE_WORK_GROUP_SIZE
values are available for glGetIntegeri_v.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Dave Airlie [Wed, 26 Aug 2015 00:37:09 +0000 (10:37 +1000)]
mesa/formats: pass correct parameter to _mesa_is_format_compressed

commit 26c549e69d12e44e2e36c09764ce2cceab262a1b
Author: Nanley Chery <nanley.g.chery@intel.com>
Date:   Fri Jul 31 10:26:36 2015 -0700

    mesa/formats: remove compressed formats from matching function

caused a regression in my CTS testing, this looks like a clear
thinko.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Roland Scheidegger [Sun, 9 Aug 2015 00:50:10 +0000 (02:50 +0200)]
gallium/auxiliary: optimize rgb9e5 helper some more

I used this as some testing ground for investigating some compiler
bits initially (e.g. lrint calls etc.), figured I could do much better
in the end just for fun...
This is mathematically equivalent, but uses some tricks to avoid
doubles and also replaces some float math with ints. Good for another
performance doubling or so. As a side note, some quick tests show that
llvm's loop vectorizer would be able to properly vectorize this version
(which it failed to do earlier due to doubles, producing a mess), giving
another 3 times performance increase with sse2 (more with sse4.1), but this
may not apply to mesa.
No piglit change.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Roland Scheidegger [Sun, 9 Aug 2015 00:03:33 +0000 (02:03 +0200)]
gallium/auxiliary: optimize rgb9e5 helper a bit

This code (lifted straight from the extension) was doing things the most
inefficient way you could think of.
This drops some of the more expensive float operations, in particular
- int-cast floors (pointless, values always positive)
- 2 raised to (signed) integers (replace with simple exponent manipulation),
  getting rid of a misguided comment in the process (implement with table...)
- float division (replace with mul of reverse of those exponents)
This is like 3 times faster (measured for float3_to_rgb9e5), though it depends
(e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not,
division is not too bad on cpus with early-exit divs).
Note that this keeps the double math for now (float x + 0.5), as the results may
otherwise differ.
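
As a rough illustration of the "simple exponent manipulation" idea (not the
code from this patch), 2^n can be built directly from the IEEE-754
single-precision bit layout, and a division by 2^n then becomes a
multiplication. The helper names pow2_from_exponent() and div_by_pow2() are
made up for this sketch, and it assumes 32-bit IEEE-754 floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build 2^n from the float bit pattern instead of
 * calling exp2f(). Only valid for normal floats, i.e. -126 <= n <= 127. */
static float pow2_from_exponent(int n)
{
   assert(n >= -126 && n <= 127);
   uint32_t bits = (uint32_t)(n + 127) << 23; /* biased exponent, zero mantissa */
   float f;
   memcpy(&f, &bits, sizeof f); /* type-pun without breaking aliasing rules */
   return f;
}

/* Divide by 2^n by multiplying with 2^-n. For powers of two this gives
 * the same result as the division unless it overflows or underflows. */
static float div_by_pow2(float x, int n)
{
   return x * pow2_from_exponent(-n);
}
```

For example, div_by_pow2(12.0f, 2) multiplies by 0.25f and yields 3.0f
without a float division or a call into libm.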

Acked-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Sun, 23 Aug 2015 23:52:12 +0000 (09:52 +1000)]
mesa/texgetimage: fix missing stencil check

GetTexImage can read to stencil8 but only from
stencil or depthstencil textures.

This fixes a bunch of failures in CTS
GL33-CTS.gtf32.GL3Tests.packed_pixels

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Nanley Chery [Fri, 21 Aug 2015 20:09:08 +0000 (13:09 -0700)]
mesa/teximage: Add GL error parameter to _mesa_target_can_be_compressed

Enables _mesa_target_can_be_compressed to return the appropriate GL error
depending on its inputs. Use the parameter to return the appropriate GL error
for ETC2 formats on GLES3.

Suggested-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Nanley Chery [Fri, 31 Jul 2015 17:26:36 +0000 (10:26 -0700)]
mesa/formats: remove compressed formats from matching function

All compressed formats return GL_FALSE and there isn't any evidence to
support that this behaviour would change. Remove all switch cases for
compressed formats.

v2. Since the exhaustive switch is removed, add a gtest to ensure
    all formats are handled.
v3. Ensure that GL_NO_ERROR is set before returning.
v4. Fix an arg to _mesa_uncompressed_format_to_type_and_comps();
    fix formatting and misc improvements (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Nanley Chery [Tue, 18 Aug 2015 19:42:57 +0000 (12:42 -0700)]
mesa/formats: make format testing a gtest

We currently check that our format info table is sane during context
initialization in debug builds. Perform this check during
`make check` instead. This enables format testing in release builds
and removes the requirement of an exhaustive switch for
_mesa_uncompressed_format_to_type_and_comps().

v2. indentation and conditional inclusion fixes (Chad).
    allow tests to continue running if any format fails
    and display the failing format name.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Kenneth Graunke [Thu, 6 Aug 2015 14:44:35 +0000 (07:44 -0700)]
gallium/ttn: Use nir_builder_insert() rather than poking at cf_list.

I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Thu, 6 Aug 2015 14:39:34 +0000 (07:39 -0700)]
prog_to_nir: Use nir_builder_insert() rather than poking at cf_list.

I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 18 Aug 2015 08:53:29 +0000 (01:53 -0700)]
nir: Use nir_shader::stage rather than passing it around.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 18 Aug 2015 08:48:34 +0000 (01:48 -0700)]
nir: Store gl_shader_stage in nir_shader.

This makes it easy for NIR passes to inspect what kind of shader they're
operating on.

Thanks to Michel Dänzer for helping me figure out where TGSI stores the
shader stage information.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Wed, 19 Aug 2015 00:04:53 +0000 (17:04 -0700)]
i965/fs: Combine assign_constant_locations and move_uniform_array_access_to_pull_constants

The comment above move_uniform_array_access_to_pull_constants was
completely bogus because it has nothing to do with lowering instructions.
Instead, it's assigning locations of pull constants.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 18 Aug 2015 21:45:35 +0000 (14:45 -0700)]
nir/lower_io: Remove assign_var_locations_direct_first

This is no longer used so we might as well get rid of it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 18 Aug 2015 19:00:15 +0000 (12:00 -0700)]
i965/fs: Rework uniform handling

Previously, we treated the entire UNIFORM file as if it had two elements:
One for direct things and one for indirect.  This is substantially
different from how the old visitor code handled it where each element was
effectively its own uniform.  This commit makes the NIR path more like the
old ir_visitor path where each uniform is separate.  This should allow us
to more easily make decisions about what to push.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 18 Aug 2015 18:42:02 +0000 (11:42 -0700)]
i965/vec4_nir: Get rid of the uniform_driver_location tracking

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 18 Aug 2015 18:20:40 +0000 (11:20 -0700)]
nir/lower_io: Separate driver_location and base offset for uniforms

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 18 Aug 2015 18:18:55 +0000 (11:18 -0700)]
nir/intrinsics: Add a second const index to load_uniform

In the i965 backend, we want to be able to "pull apart" the uniforms and
push some of them into the shader through a different path.  In order to do
this effectively, we need to know which variable is actually being referred
to by a given uniform load.  Previously, it was completely flattened by
nir_lower_io which made things difficult.  This adds more information to
the intrinsic to make this easier for us.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 12 Aug 2015 21:29:25 +0000 (14:29 -0700)]
nir: Pass a type_size() function pointer into nir_lower_io().

Previously, there were four type_size() functions in play - the i965
compiler backend defined scalar and vec4 type_size() functions, and
nir_lower_io contained its own similar functions.

In fact, the i965 driver used nir_lower_io() and then looped over the
components using its own type_size - meaning both were in play.  The
two are /basically/ the same, but not exactly in obscure cases like
subroutines and images.

This patch removes nir_lower_io's functions, and instead makes the
driver supply a function pointer.  This gives the driver ultimate
flexibility in deciding how it wants to count things, reduces code
duplication, and improves consistency.
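
To make the function-pointer idea concrete, here is a toy sketch of a shared
pass that assigns locations using a driver-supplied counting callback. It is
not the nir_lower_io() interface: toy_type, toy_var, type_size_fn and
assign_locations() are hypothetical, heavily simplified stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type description; real GLSL types are richer. */
struct toy_type {
   int vector_elements; /* 1..4 */
   int array_length;    /* 0 means "not an array" */
};

struct toy_var {
   struct toy_type type;
   int driver_location; /* filled in by the pass below */
};

/* Driver-supplied counting function. */
typedef int (*type_size_fn)(const struct toy_type *);

/* Scalar-style backend: one slot per component. */
static int type_size_scalar(const struct toy_type *t)
{
   int len = t->array_length ? t->array_length : 1;
   return t->vector_elements * len;
}

/* vec4-style backend: one slot per vec4, however many components are used. */
static int type_size_vec4(const struct toy_type *t)
{
   return t->array_length ? t->array_length : 1;
}

/* Shared pass: assign consecutive locations, counting with the callback.
 * Returns the total number of slots used. */
static int assign_locations(struct toy_var *vars, int count,
                            type_size_fn type_size)
{
   assert(type_size != NULL);
   int location = 0;
   for (int i = 0; i < count; i++) {
      vars[i].driver_location = location;
      location += type_size(&vars[i].type);
   }
   return location;
}
```

With a vec3 followed by a vec4[2], the scalar callback places the second
variable at location 3 and uses 11 slots in total, while the vec4 callback
places it at location 1 and uses 3 slots. The pass itself never needs to
know which convention the driver prefers.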

v2 (Jason Ekstrand):
 - One side-effect of passing in a function pointer is that nir_lower_io is
   now aware of and properly allocates space for image uniforms, allowing
   us to drop hacks in the backend

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Kenneth Graunke [Mon, 24 Aug 2015 23:39:24 +0000 (16:39 -0700)]
prog_to_nir: Don't allocate nir_variable with type vec4[0] for uniforms.

If there are no parameters, we don't need to create a nir_variable to
hold them...and allocating an array of length 0 is pretty bogus.

Should avoid i965 backend assertions in future patches Jason and I are
working on.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Kenneth Graunke [Wed, 12 Aug 2015 21:19:17 +0000 (14:19 -0700)]
i965: Move type_size() methods out of visitor classes.

I want to use C function pointers to these, and they don't use anything
in the visitor classes anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Jason Ekstrand [Wed, 19 Aug 2015 17:32:32 +0000 (10:32 -0700)]
i965: Make setup_vec4_uniform_value and _image_uniform_values take an offset

This way they don't implicitly increment the uniforms variable and don't
have to be called in-sequence during uniform setup.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 19 Aug 2015 16:56:57 +0000 (09:56 -0700)]
i965: Rename setup_vector_uniform_values to setup_vec4_uniform_value

The new name more accurately represents what it does: Set up a single vec4
uniform value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rob Clark [Tue, 25 Aug 2015 12:17:30 +0000 (08:17 -0400)]
freedreno/ir3: fix compile break after splitting out nir_control_flow.h

The commit:

  commit b49371b8ede380f10ea3ab333246a3b01ac6aca5
  Author:     Connor Abbott <cwabbott0@gmail.com>
  AuthorDate: Tue Jul 21 19:54:18 2015 -0700

      nir: move control flow modification to its own file

split out some control flow related APIs into a separate header, but did
not update drivers.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 25 Aug 2015 12:13:04 +0000 (08:13 -0400)]
freedreno/ir3: fix compile break after fxn->start_block removal

The commit:

  commit 8e0d4ef3410ea07d9621df3e083bc3e7c1ad2ab0
  Author:     Kenneth Graunke <kenneth@whitecape.org>
  AuthorDate: Thu Aug 6 18:18:40 2015 -0700

      nir: Delete the nir_function_impl::start_block field.

removed the start_block field without fixing up drivers.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Dave Airlie [Wed, 29 Jul 2015 08:09:44 +0000 (18:09 +1000)]
mesa: enable texture stencil8 for multisample

This fixes GL45-CTS.gtf44.GL31Tests.texture_stencil8.texture_stencil8_gl44
from the ogl conform suite.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Brian Paul [Mon, 24 Aug 2015 13:50:51 +0000 (07:50 -0600)]
mesa: make _mesa_bind_texture_unit() static

It's only called from the file it's defined in.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Nanley Chery [Tue, 19 May 2015 16:58:17 +0000 (09:58 -0700)]
mesa/formats: store whether or not a format is sRGB in gl_format_info

v2: remove extra newline.
v3: use bool instead of GLboolean.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Kenneth Graunke [Mon, 24 Aug 2015 19:18:51 +0000 (12:18 -0700)]
nir: Use !block_ends_in_jump() in a few places rather than open-coding.

Connor introduced this helper recently; we should use it here too.

I had to move the function earlier in the file for it to be available.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Connor Abbott [Wed, 22 Jul 2015 02:54:35 +0000 (19:54 -0700)]
nir/cf: reimplement nir_cf_node_remove() using the new API

This gives us some testing of it. Also, the old nir_cf_node_remove()
wasn't handling phi nodes correctly and was calling cleanup_cf_node()
too late.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:34 +0000 (19:54 -0700)]
nir/cf: add new control modification API's

These will help us do a number of things, including:

- Early return elimination.
- Dead control flow elimination.
- Various optimizations, such as replacing:

if (foo) {
    ...
}
if (!foo) {
    ...
}

with:

if (foo) {
    ...
} else {
    ...
}

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:33 +0000 (19:54 -0700)]
nir/cf: use a cursor for inserting control flow

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:32 +0000 (19:54 -0700)]
nir/cf: add split_block_cursor()

This is a helper that will be shared between the new control flow
insertion and modification code.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:31 +0000 (19:54 -0700)]
nir/cf: add split_block_before_instr()

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:30 +0000 (19:54 -0700)]
nir/cf: add a cursor structure

For now, it allows us to refactor the control flow insertion API's so
that there's a single entrypoint (with some wrappers). More importantly,
it will allow us to reduce the combinatorial explosion in the extract
function. There, we need to specify two points to extract, which may be
at the beginning of a block, the end of a block, or in the middle of a
block. And then there are various wrappers based off of that (before a
control flow node, before a control flow list, etc.). Rather than having
9 different functions, we can have one function and push the actual
logic of determining which variant to use down to the split function,
which will be shared with nir_cf_node_insert().

In the future, we may want to make the instruction insertion API's as
well as the builder use this, but that's a future cleanup.
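
A toy version of the cursor idea, to show how one entry point can replace a
family of before/after variants. This is a sketch only: the toy_* names are
hypothetical, a block is modelled as a small int array, and the cursor
addresses an instruction by index rather than by pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_INSTRS 16

/* Hypothetical block: just an ordered array of instruction ids. */
struct toy_block {
   int instrs[TOY_MAX_INSTRS];
   int count;
};

enum toy_cursor_option {
   TOY_BEFORE_BLOCK,
   TOY_AFTER_BLOCK,
   TOY_BEFORE_INSTR,
   TOY_AFTER_INSTR,
};

/* A cursor names one point in a block. */
struct toy_cursor {
   enum toy_cursor_option option;
   struct toy_block *block;
   int instr; /* index in block; only used by the *_INSTR options */
};

/* All variants collapse to "the index at which the block is split". */
static int toy_cursor_split_index(struct toy_cursor c)
{
   switch (c.option) {
   case TOY_BEFORE_BLOCK: return 0;
   case TOY_AFTER_BLOCK:  return c.block->count;
   case TOY_BEFORE_INSTR: return c.instr;
   case TOY_AFTER_INSTR:  return c.instr + 1;
   }
   assert(!"invalid cursor option");
   return -1;
}

/* Single insertion entry point; callers only differ in the cursor. */
static void toy_insert(struct toy_cursor c, int value)
{
   struct toy_block *b = c.block;
   int at = toy_cursor_split_index(c);
   assert(b->count < TOY_MAX_INSTRS && at >= 0 && at <= b->count);
   memmove(&b->instrs[at + 1], &b->instrs[at],
           (size_t)(b->count - at) * sizeof b->instrs[0]);
   b->instrs[at] = value;
   b->count++;
}
```

An operation that takes two such cursors (a begin and an end point) needs
only one implementation, because the before/after/middle distinction is
resolved inside the split helper rather than in each caller.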

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:29 +0000 (19:54 -0700)]
nir/cf: fix link_blocks() when there are no successors

When we insert a single basic block A into another basic block B, we
will split B into C and D, insert A in the middle, and then splice
together C, A, and D. When we splice together C and A, we need to move
the successors of A into C -- except A has no successors, since it
hasn't been inserted yet. So in move_successors(), we need to handle the
case where the block whose successors are to be moved doesn't have any
successors. Fixing link_blocks() here prevents a segfault and makes it
work correctly.
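
The corner case can be sketched with a toy CFG in which a block has up to two
successor pointers and a predecessor count. This is an illustration under
those assumptions, not the nir code: toy_link(), toy_unlink() and
toy_move_successors() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct toy_block {
   struct toy_block *successors[2]; /* NULL when absent */
   int num_preds;                   /* simplified predecessor set */
};

/* Link pred to up to two successors; either may be NULL. Tolerating
 * NULL here is what keeps the "no successors" case from crashing. */
static void toy_link(struct toy_block *pred, struct toy_block *s0,
                     struct toy_block *s1)
{
   pred->successors[0] = s0;
   pred->successors[1] = s1;
   if (s0 != NULL)
      s0->num_preds++;
   if (s1 != NULL)
      s1->num_preds++;
}

static void toy_unlink(struct toy_block *pred)
{
   for (int i = 0; i < 2; i++) {
      if (pred->successors[i] != NULL) {
         pred->successors[i]->num_preds--;
         pred->successors[i] = NULL;
      }
   }
}

/* Give dst the successors of src, which may have none at all. */
static void toy_move_successors(struct toy_block *src, struct toy_block *dst)
{
   assert(src != dst);
   struct toy_block *s0 = src->successors[0];
   struct toy_block *s1 = src->successors[1];
   toy_unlink(src);
   toy_unlink(dst);
   toy_link(dst, s0, s1);
}
```

Moving the (empty) successor set of a freshly created block simply leaves
the destination with no successors, instead of dereferencing a null pointer
while updating predecessor information.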

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:28 +0000 (19:54 -0700)]
nir/cf: clean up jumps when cleaning up CF nodes

We may delete a control flow node which contains structured jumps to
other parts of the program. We need to remove the jump as a predecessor,
as well as remove any phi node sources which reference it. Right now,
the same problem exists for blocks that don't end in a jump instruction,
but with the new API it shouldn't be an issue, since blocks that don't
end in a jump must either point to another block in the same extracted
CF list or not point to anything at all.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:27 +0000 (19:54 -0700)]
nir/cf: remove uses of SSA definitions that are being deleted

Unlike calling nir_instr_remove(), calling nir_cf_node_remove() (and
later in the series, the nir_cf_list_delete()) implies that you're
removing instructions that may still have uses, except those
instructions are never executed so any uses will be undefined. When
cleaning up a CF node for deletion, we must clean up any uses of the
deleted instructions by making them point to undef instructions instead.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:26 +0000 (19:54 -0700)]
nir/cf: handle jumps better in stitch_blocks()

In particular, handle the case where the earlier block ends in a jump
and the later block is empty. In that case, we want to preserve the jump
and remove any traces of the later block. Before, we would only hit this
case when removing a control flow node after a jump, which wasn't a
common occurrence, but we'll need it to handle inserting a control flow
list which ends in a jump, which should be more common/useful.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:25 +0000 (19:54 -0700)]
nir/cf: handle jumps in split_block_end()

Before, we would only split a block with a jump at the end if we were
inserting something after a block with a jump, which never happened in
practice. But now, we want to use this to extract control flow lists
which may end in a jump, in which case we really need to do the correct
patching up. As a side effect, when removing jumps we now correctly
insert undef phi sources in some corner cases, which can't hurt.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:24 +0000 (19:54 -0700)]
nir/cf: add block_ends_in_jump()

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:23 +0000 (19:54 -0700)]
nir/cf: handle phi nodes better in split_block_beginning()

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:22 +0000 (19:54 -0700)]
nir/cf: split up and improve nir_handle_remove_jumps()

Before, the process of removing a jump and wiring up the remaining block
correctly was atomic, but with the new control flow modification it's
split into two parts: first, we extract the jump, which creates a new
block with re-wired successors as well as a free-floating jump, and then
we delete the control flow containing the jump, which removes the entry
in the predecessors and any phi node sources. Split up
nir_handle_remove_jumps() to accommodate this, and add the missing
support for removing phi node sources.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:21 +0000 (19:54 -0700)]
nir/cf: add remove_phi_src() helper

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:20 +0000 (19:54 -0700)]
nir: add nir_foreach_phi_src_safe()

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:19 +0000 (19:54 -0700)]
nir/cf: add insert_phi_undef() helper

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:18 +0000 (19:54 -0700)]
nir: move control flow modification to its own file

We want to start reworking and expanding this code, but it'll be a lot
easier to do once we disentangle it from the rest of the stuff in nir.c.
Unfortunately, there are a few unavoidable dependencies in nir.c on
methods we'd rather not expose publicly, since if not used in very
specific situations they can cause Bad Things (tm) to happen. Namely, we
need to do some magical control flow munging when adding/removing jumps.
In the future, we may disallow adding/removing jumps in
nir_instr_insert_*() and nir_instr_remove(), and use separate functions
that are part of the control flow modification code, but for now we
expose them and put them in a separate, private header.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Wed, 22 Jul 2015 02:54:17 +0000 (19:54 -0700)]
nir: make cleanup_cf_node() not use remove_defs_uses()

cleanup_cf_node() is part of the control flow modification code, which
we're going to split into its own file, but remove_defs_uses() is an
internal function used by nir_instr_remove(). Break the dependency by
making cleanup_cf_node() use nir_instr_remove() instead, which simply
calls remove_defs_uses() and then removes the instruction from the list.
nir_instr_remove() does do extra things for jumps, though, so we avoid
calling it on jumps, which matches the previous behavior (this will be
fixed later in the series).

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonir: inline block_add_pred() a few places
Connor Abbott [Wed, 22 Jul 2015 02:54:16 +0000 (19:54 -0700)]
nir: inline block_add_pred() a few places

It was being used to initialize function impls and loops, even though
it's really a control flow modification helper. It's pretty trivial, so
just inline it to avoid the dependency.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonir/validate: check successors/predecessors more carefully
Connor Abbott [Wed, 22 Jul 2015 02:54:15 +0000 (19:54 -0700)]
nir/validate: check successors/predecessors more carefully

We should be checking almost everything now.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonir: Delete the nir_function_impl::start_block field.
Kenneth Graunke [Fri, 7 Aug 2015 01:18:40 +0000 (18:18 -0700)]
nir: Delete the nir_function_impl::start_block field.

It's simply the first nir_cf_node in the nir_function_impl::body list,
which is easy enough to access - we don't need to store a pointer to it
explicitly.  Removing it means we don't need to maintain the pointer
when, say, splitting the start block when modifying control flow.
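
A minimal sketch of the idea, assuming a simplified singly linked body
list. The names cf_node, function_impl, impl_start_block() and
impl_push_front() are hypothetical stand-ins for illustration, not the
actual NIR types or helpers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for nir_cf_node and
 * nir_function_impl; the real structures carry much more state. */
struct cf_node {
   struct cf_node *next;
   int id;
};

struct function_impl {
   struct cf_node *body;   /* head of the control flow list */
};

/* The start block is simply the head of the body list, so there is no
 * separate start_block pointer to keep in sync. */
static struct cf_node *
impl_start_block(const struct function_impl *impl)
{
   assert(impl->body != NULL);
   return impl->body;
}

/* Putting a new node in front of the list only touches the list itself;
 * impl_start_block() picks up the change automatically. */
static void
impl_push_front(struct function_impl *impl, struct cf_node *node)
{
   node->next = impl->body;
   impl->body = node;
}
```

With a cached pointer, every such list modification would also have to
remember to update the cache; deriving the start block from the list
removes that failure mode.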

Thanks to Connor Abbott for suggesting this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agomesa/formats: only do type and component lookup for uncompressed formats
Nanley Chery [Fri, 31 Jul 2015 16:25:56 +0000 (09:25 -0700)]
mesa/formats: only do type and component lookup for uncompressed formats

Only uncompressed formats have a non-void type and actual
components per pixel. Rename _mesa_format_to_type_and_comps
to _mesa_uncompressed_format_to_type_and_comps and require
callers to check if the format is not compressed.

v2. include compressed format cases to avoid gcc warnings (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agofreedreno/a4xx: formats update
Rob Clark [Sat, 15 Aug 2015 15:57:22 +0000 (11:57 -0400)]
freedreno/a4xx: formats update

Fixes glamor, which wants to use R8 integer textures.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno: update generated headers
Rob Clark [Mon, 24 Aug 2015 16:58:08 +0000 (12:58 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agoi965: Always re-emit the pipeline select during invariant state emission
Chris Wilson [Sun, 23 Aug 2015 08:24:57 +0000 (09:24 +0100)]
i965: Always re-emit the pipeline select during invariant state emission

On the older platforms where we don't have logical contexts preserving
state across batches, we emit the invariant state setup on every batch
using the brw_invariant_state atom. This includes the pipeline selection
which is cached with the introduction of

commit 0e0e23ef537c9add672ff322f34e129a07edc55e
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Wed Apr 22 11:43:50 2015 -0700

    i965/state: Emit pipeline select when changing pipelines

However, we do not reset the cache between batches on context-less
platforms, so the pipeline selection is not re-emitted, which can
cause GPU hangs if a media pipeline was loaded in the meantime (e.g.
mixing mplayer/gstreamer using libva and gnome-shell). A simple solution
is to just forcibly re-emit the pipeline select along with the invariant
state and reset the cache at that point.
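
The caching problem and the fix can be sketched roughly as follows. This
is an illustration only: struct ctx, select_pipeline() and
upload_invariant_state() are hypothetical names, and a counter stands in
for writing a PIPELINE_SELECT command into the batch:

```c
#include <assert.h>

enum pipeline { PIPELINE_3D, PIPELINE_MEDIA };

/* Hypothetical driver context, reduced to what the sketch needs. */
struct ctx {
   enum pipeline last_pipeline;  /* cached pipeline selection */
   int selects_emitted;          /* stands in for commands in the batch */
};

/* Unconditionally emit the select and refresh the cache. */
static void
emit_pipeline_select(struct ctx *c, enum pipeline p)
{
   c->selects_emitted++;
   c->last_pipeline = p;
}

/* Cached path: skips the emit when the cache claims nothing changed.
 * Without hardware contexts the cache can be stale at batch start. */
static void
select_pipeline(struct ctx *c, enum pipeline p)
{
   if (c->last_pipeline == p)
      return;
   emit_pipeline_select(c, p);
}

/* Invariant state runs at the start of every batch on context-less
 * platforms, so it bypasses the cache and resets it at the same time. */
static void
upload_invariant_state(struct ctx *c)
{
   emit_pipeline_select(c, PIPELINE_3D);
}
```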

Reported-and-tested-by: Tomasz C. <tomaszc@o2.pl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91254
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
9 years agoRevert "radeon/winsys: increase the IB size for VM"
Marek Olšák [Sun, 23 Aug 2015 16:57:44 +0000 (18:57 +0200)]
Revert "radeon/winsys: increase the IB size for VM"

This reverts commit 567394112d904096abff1d994ab952f475dfb444.

It regressed performance. It looks like smaller IBs are better, because
the GPU goes idle quicker and there is less waiting for buffers and fences.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
9 years agonv50: fix 2d engine blits for 64- and 128-bit formats
Ilia Mirkin [Sun, 23 Aug 2015 07:11:09 +0000 (03:11 -0400)]
nv50: fix 2d engine blits for 64- and 128-bit formats

This fixes bin/ext_framebuffer_multisample-formats all_samples

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
9 years agonv50: account for the int RT0 rule for alpha-to-one/cov
Ilia Mirkin [Sun, 23 Aug 2015 06:56:45 +0000 (02:56 -0400)]
nv50: account for the int RT0 rule for alpha-to-one/cov

Same as commit 1af0641db but for nvc0. If an integer texture is
bound to RT0, don't do alpha-to-one or alpha-to-coverage.
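
As a rough sketch of the rule (the struct and function names here are
hypothetical and do not come from the nv50 driver):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the multisample alpha state. */
struct msaa_alpha_state {
   bool alpha_to_one;
   bool alpha_to_coverage;
};

/* Return the state that should actually be programmed: both features
 * are forced off when render target 0 has an integer format. */
static struct msaa_alpha_state
effective_alpha_state(struct msaa_alpha_state requested, bool rt0_is_integer)
{
   struct msaa_alpha_state out = requested;

   if (rt0_is_integer) {
      out.alpha_to_one = false;
      out.alpha_to_coverage = false;
   }
   return out;
}
```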

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
9 years agomesa/arb_gpu_shader_fp64: add support for glGetUniformdv
Dave Airlie [Mon, 27 Jul 2015 03:13:49 +0000 (13:13 +1000)]
mesa/arb_gpu_shader_fp64: add support for glGetUniformdv

This was missed when I did fp64. I've sent a piglit test to cover
the case as well.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agonv50,nvc0: disable depth bounds test on blit
Ilia Mirkin [Sun, 23 Aug 2015 03:59:50 +0000 (23:59 -0400)]
nv50,nvc0: disable depth bounds test on blit

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
9 years agoi965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used
Neil Roberts [Thu, 20 Aug 2015 01:55:44 +0000 (18:55 -0700)]
i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used

When the edge flag element is enabled then the elements are slightly
reordered so that the edge flag is always the last one. This was
confusing the code to upload the 3DSTATE_VF_INSTANCING state because
that is uploaded with a separate loop which has an instruction for
each element. The indices used in these instructions weren't taking
into account the reordering so the state would be incorrect.
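
A simplified sketch of the kind of mapping involved, under the
assumption that every element except the edge flag keeps its relative
order. struct elem, compute_slots() and fill_instancing() are
hypothetical names, and a plain array stands in for the per-element
3DSTATE_VF_INSTANCING commands:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ELEMS 16

/* Hypothetical vertex element description. */
struct elem {
   bool is_edge_flag;
   unsigned instance_divisor;
};

/* Compute the hardware slot of each input element: the edge flag is
 * moved to the last slot, all other elements keep their relative order. */
static void
compute_slots(const struct elem *elems, unsigned n, unsigned *slot_of)
{
   unsigned next = 0;
   int edge = -1;

   assert(n <= MAX_ELEMS);
   for (unsigned i = 0; i < n; i++) {
      if (elems[i].is_edge_flag)
         edge = (int)i;
      else
         slot_of[i] = next++;
   }
   if (edge >= 0)
      slot_of[edge] = next;
}

/* The separate instancing loop has to index by the reordered slot, not
 * by the original element index, or the state lands on the wrong element. */
static void
fill_instancing(const struct elem *elems, unsigned n, unsigned *divisor_by_slot)
{
   unsigned slot_of[MAX_ELEMS];

   compute_slots(elems, n, slot_of);
   for (unsigned i = 0; i < n; i++)
      divisor_by_slot[slot_of[i]] = elems[i].instance_divisor;
}
```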

v2: Use nr_elements instead of brw->vb.nr_enabled so that it will cope
    when gl_VertexID is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91292
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
9 years agoi965: Swap the order of the vertex ID and edge flag attributes
Neil Roberts [Mon, 13 Jul 2015 17:01:14 +0000 (18:01 +0100)]
i965: Swap the order of the vertex ID and edge flag attributes

The edge flag data on Gen6+ is passed through the fixed function hardware as
an extra attribute. According to the PRM it must be the last valid
VERTEX_ELEMENT structure. However if the vertex ID is also used then another
extra element is added to source the VID. This made it so the vertex ID is in
the wrong register in the vertex shader and the edge attribute is no longer in
the last element.

v2: Also implement for BDW+

v3 [by Ben]: Remove 10.5 tag. Too late.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84677
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
9 years agor600g: Fix assert in tgsi_cmp
Glenn Kennard [Sat, 22 Aug 2015 23:01:31 +0000 (01:01 +0200)]
r600g: Fix assert in tgsi_cmp

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=91726

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@gmail.com>
9 years agoegl: scons: fix the haiku build, do not build the dri2 backend
Alexander von Gluck IV [Wed, 19 Aug 2015 01:47:59 +0000 (20:47 -0500)]
egl: scons: fix the haiku build, do not build the dri2 backend

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agodocs: add 11.1.0-devel release notes template, bump version
Emil Velikov [Sat, 22 Aug 2015 12:28:16 +0000 (13:28 +0100)]
docs: add 11.1.0-devel release notes template, bump version

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agoegl/wayland: define set_cloexec_or_close only when mkostemp is not present
Boyan Ding [Fri, 21 Aug 2015 13:44:36 +0000 (21:44 +0800)]
egl/wayland: define set_cloexec_or_close only when mkostemp is not present

Fixes a compiler warning of defined but not used function when
HAVE_MKOSTEMP is defined.
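
The pattern looks roughly like the sketch below. set_cloexec_or_close(),
HAVE_MKOSTEMP and mkostemp() are named in the commit; the caller
create_tmpfile_cloexec() and the exact bodies are assumptions made for
illustration:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for mkostemp() when it is available */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef HAVE_MKOSTEMP
/* Only the mkstemp() fallback needs this helper, so it is compiled only
 * in that configuration; otherwise it would be defined but unused. */
static int
set_cloexec_or_close(int fd)
{
   int flags;

   if (fd == -1)
      return -1;

   flags = fcntl(fd, F_GETFD);
   if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
      close(fd);
      return -1;
   }
   return fd;
}
#endif

/* Hypothetical caller: create an unlinked temporary file whose
 * descriptor has close-on-exec set. tmpname must end in "XXXXXX". */
static int
create_tmpfile_cloexec(char *tmpname)
{
   int fd;

#ifdef HAVE_MKOSTEMP
   fd = mkostemp(tmpname, O_CLOEXEC);
#else
   fd = set_cloexec_or_close(mkstemp(tmpname));
#endif
   if (fd >= 0)
      unlink(tmpname);
   return fd;
}
```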

Fixes: eb3e2562a4b(configure.ac: check for mkostemp())
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
9 years agomapi: ship ARB_tessellation_shader.xml
Emil Velikov [Sat, 22 Aug 2015 11:58:03 +0000 (12:58 +0100)]
mapi: ship ARB_tessellation_shader.xml

Fixes: e2b59a39cbb(mapi: add ARB_tessellation_shader)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agonouveau: add codegen/unordered_set.h to the tarball
Emil Velikov [Sat, 22 Aug 2015 11:15:27 +0000 (12:15 +0100)]
nouveau: add codegen/unordered_set.h to the tarball

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agowinsys/sw/kms-dri: don't attempt to bundle the sconscript
Emil Velikov [Fri, 21 Aug 2015 01:01:42 +0000 (02:01 +0100)]
winsys/sw/kms-dri: don't attempt to bundle the sconscript

The build file was removed with an earlier commit, while the EXTRA_DIST
entry was forgotten.

Fixes: 66d77cd71c6 (scons: don't build the kms-dri winsys)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agowinsys/amdgpu: automake: remove missing headers
Emil Velikov [Thu, 20 Aug 2015 21:55:49 +0000 (22:55 +0100)]
winsys/amdgpu: automake: remove missing headers

The files are not referenced anywhere else in the whole of
mesa. They are likely remnants of the early development stage.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agoautomake: build all drivers but vc4 during distcheck
Emil Velikov [Thu, 20 Aug 2015 21:52:49 +0000 (22:52 +0100)]
automake: build all drivers but vc4 during distcheck

vc4 conflicts with ilo when built on x86, as it's built for emulation
purposes. In that mode a i965-like symbol is exported by vc4, which
conflicts with the ilo one in the gallium-dri megadriver.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agoandroid: enable amdgpu winsys in radeonsi driver
Mauro Rossi [Tue, 18 Aug 2015 09:53:32 +0000 (11:53 +0200)]
android: enable amdgpu winsys in radeonsi driver

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agoandroid: fix cflags and includes for amdgpu winsys
Mauro Rossi [Tue, 18 Aug 2015 09:53:31 +0000 (11:53 +0200)]
android: fix cflags and includes for amdgpu winsys

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agodocs: add news item and link release notes for 10.6.5
Emil Velikov [Sat, 22 Aug 2015 10:04:11 +0000 (11:04 +0100)]
docs: add news item and link release notes for 10.6.5

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agodocs: add sha256 checksums for 10.6.5
Emil Velikov [Sat, 22 Aug 2015 10:00:47 +0000 (11:00 +0100)]
docs: add sha256 checksums for 10.6.5

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fa34225167396008e75e93f23696666caba8a7bf)

9 years agodocs: add release notes for 10.6.5
Emil Velikov [Sat, 22 Aug 2015 09:20:54 +0000 (10:20 +0100)]
docs: add release notes for 10.6.5

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a43b3dd99bd4c114d0f3e90f4fd4792164fe7539)

9 years agoi965: Move control flush into pipelined conditional render
Chris Wilson [Fri, 21 Aug 2015 14:28:22 +0000 (15:28 +0100)]
i965: Move control flush into pipelined conditional render

The nv_conditional_render piglits were sporadically failing. Moving
the control flush from the write and placing it just before the read
was sufficient to make the piglits pass 1000/1000 times. The bspec
says that the flush enable bit "waits until all previous writes of
immediate data from post sync circles are complete before executing the
next command" - the operative word being previous!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90691
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Neil Roberts <neil@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agovc4: Actually allow math results to allocate into r4.
Eric Anholt [Fri, 21 Aug 2015 17:57:24 +0000 (10:57 -0700)]
vc4: Actually allow math results to allocate into r4.

I switched us to tracking whether the results *could* go to r4, but then
didn't make a separate register class for the class bits that included r4.
Switch the "any" class to actually be "any", and name the "any but r4"
class more appropriately.
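
A toy model of the two classes, using bitmasks over a handful of
registers. This is only an illustration: the real driver goes through
Mesa's register allocator, and CLASS_ANY, CLASS_NOT_R4, class_for_temp()
and pick_reg() are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { REG_R0, REG_R1, REG_R2, REG_R3, REG_R4, REG_COUNT };

#define BIT(r) (1u << (r))

enum {
   /* "any" really means any register, including r4 */
   CLASS_ANY = BIT(REG_R0) | BIT(REG_R1) | BIT(REG_R2) |
               BIT(REG_R3) | BIT(REG_R4),
   /* everything except r4 */
   CLASS_NOT_R4 = CLASS_ANY & ~BIT(REG_R4),
};

/* Choose the class from whether the value's result could live in r4. */
static uint32_t
class_for_temp(bool result_can_be_r4)
{
   return result_can_be_r4 ? CLASS_ANY : CLASS_NOT_R4;
}

/* Return the lowest free register in the class, or -1 if none is free. */
static int
pick_reg(uint32_t reg_class, uint32_t in_use)
{
   uint32_t free_regs = reg_class & ~in_use;

   for (int r = 0; r < REG_COUNT; r++) {
      if (free_regs & BIT(r))
         return r;
   }
   return -1;
}
```

If both classes exclude r4, tracking which results could go to r4 has no
effect on allocation, which matches the situation the commit describes.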

total instructions in shared programs: 96798 -> 94680 (-2.19%)
instructions in affected programs:     62736 -> 60618 (-3.38%)

9 years agovc4: Fold the 16-bit integer pack into the instructions generating it.
Eric Anholt [Fri, 21 Aug 2015 07:08:13 +0000 (00:08 -0700)]
vc4: Fold the 16-bit integer pack into the instructions generating it.

total instructions in shared programs: 97580 -> 96798 (-0.80%)
instructions in affected programs:     52826 -> 52044 (-1.48%)

9 years agovc4: Reuse QPU dumping for packing bits in QIR.
Eric Anholt [Fri, 21 Aug 2015 07:04:36 +0000 (00:04 -0700)]
vc4: Reuse QPU dumping for packing bits in QIR.

9 years agovc4: Make _dest variants of qir ALU helpers to provide an explicit dest.
Eric Anholt [Wed, 19 Aug 2015 03:26:05 +0000 (20:26 -0700)]
vc4: Make _dest variants of qir ALU helpers to provide an explicit dest.

9 years agovc4: Use the SSA defs list for figuring out eligible MOVs for copy prop.
Eric Anholt [Fri, 21 Aug 2015 16:22:32 +0000 (09:22 -0700)]
vc4: Use the SSA defs list for figuring out eligible MOVs for copy prop.

I thought I'd converted this over previously.  It was copy propagating
MOVs badly with the new destination packing flags.

9 years agost/nine: Always use user constant buffers
Krzysztof Sobiecki [Thu, 20 Aug 2015 21:19:30 +0000 (23:19 +0200)]
st/nine: Always use user constant buffers

We had several reports of users hitting bugs
with the other path to upload constants,
and switching to the user constant buffer
path solves the bugs.

User constant buffers are expected to be slower
for Nvidia cards, so ideally this patch should be
reverted when the path is fixed.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Krzysztof Sobiecki <sobkas@gmail.com>
9 years agost/nine: Silent warning in nine_ff
Axel Davy [Sun, 16 Aug 2015 11:11:50 +0000 (13:11 +0200)]
st/nine: Silent warning in nine_ff

release build was complaining

Signed-off-by: Axel Davy <axel.davy@ens.fr>
9 years agost/nine: Silent warning in sm1_declusage_to_tgsi
Axel Davy [Sun, 16 Aug 2015 11:11:27 +0000 (13:11 +0200)]
st/nine: Silent warning in sm1_declusage_to_tgsi

release build was complaining

Signed-off-by: Axel Davy <axel.davy@ens.fr>