mesa.git
10 years ago  i965: Pass brw rather than gen to brw_disassemble_inst().
Matt Turner [Thu, 12 Jun 2014 23:08:02 +0000 (16:08 -0700)]
i965: Pass brw rather than gen to brw_disassemble_inst().

We will need it in order to use the new brw_inst API.

Signed-off-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965: Convert brw_eu_compact.c to the new brw_inst API.
Matt Turner [Thu, 12 Jun 2014 06:10:19 +0000 (23:10 -0700)]
i965: Convert brw_eu_compact.c to the new brw_inst API.

v2: Use brw_inst_bits rather than pulling out individual fields and
    reassembling them.

Signed-off-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965: Extend is_haswell checks to gen >= 8 in Gen4-7 generators.
Kenneth Graunke [Sun, 8 Jun 2014 06:52:37 +0000 (23:52 -0700)]
i965: Extend is_haswell checks to gen >= 8 in Gen4-7 generators.

We're going to use fs_generator/vec4_generator for Gen8+ code soon,
thanks to the new brw_instruction API.  When we do, we'll generally
want to take the Haswell paths on Gen8+ as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Convert test_eu_compact.c to the new brw_inst API.
Kenneth Graunke [Sun, 8 Jun 2014 05:58:26 +0000 (22:58 -0700)]
i965: Convert test_eu_compact.c to the new brw_inst API.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Convert vec4_generator to the new brw_inst API.
Kenneth Graunke [Sun, 8 Jun 2014 05:46:59 +0000 (22:46 -0700)]
i965: Convert vec4_generator to the new brw_inst API.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Convert fs_generator to the new brw_inst API.
Kenneth Graunke [Sun, 8 Jun 2014 05:44:24 +0000 (22:44 -0700)]
i965: Convert fs_generator to the new brw_inst API.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Convert Gen4-5 clipping code to the new brw_inst API.
Kenneth Graunke [Sun, 8 Jun 2014 05:22:41 +0000 (22:22 -0700)]
i965: Convert Gen4-5 clipping code to the new brw_inst API.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Convert brw_sf_emit.c to the new brw_inst API.
Kenneth Graunke [Sun, 8 Jun 2014 04:29:47 +0000 (21:29 -0700)]
i965: Convert brw_sf_emit.c to the new brw_inst API.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Convert brw_eu_emit.c to the new brw_inst API.
Kenneth Graunke [Thu, 5 Jun 2014 00:08:57 +0000 (17:08 -0700)]
i965: Convert brw_eu_emit.c to the new brw_inst API.

v2:
 - Fix IF -> ELSE patching on Sandybridge.
 - Don't set base_mrf on Gen6+ in OWord Block Read functions.  (Although
   the old code did this universally, it shouldn't have - the field
   doesn't exist on Gen6+ and just got overwritten by the SFID anyway.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Convert brw_eu.[ch] to use the new brw_inst API.
Kenneth Graunke [Sun, 8 Jun 2014 04:24:41 +0000 (21:24 -0700)]
i965: Convert brw_eu.[ch] to use the new brw_inst API.

v2: Don't set flag_reg_nr prior to Gen7 (as it doesn't exist).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Introduce a new brw_inst API.
Kenneth Graunke [Thu, 5 Jun 2014 00:07:30 +0000 (17:07 -0700)]
i965: Introduce a new brw_inst API.

This is similar to gen8_instruction, and will eventually replace it.

For now nothing uses this, but we can incrementally convert.
The new API takes the existing brw_instruction pointers to ease
conversion; when done, we can simply drop the old structure and rename
struct brw_instruction -> brw_inst.

v2: (by Matt Turner) Make JIP/UIP functions take a signed argument.
v3: (by Kenneth Graunke)
 - Make Gen4-6 jump target functions take a signed argument.
 - Fix indirect align1 AddrImm bits on Gen4-7.
 - Fix SFID on Sandybridge to use bits 27:24.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v1, v3+]
Signed-off-by: Matt Turner <mattst88@gmail.com> [v2]
Reviewed-by: Matt Turner <mattst88@gmail.com>
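
Illustration (not part of the commit): a minimal sketch of the accessor
style such an API could use, where a field is read or written by its bit
range and the range may depend on the hardware generation. The names
sketch_inst, sketch_inst_bits, sketch_inst_set_bits and sketch_inst_sfid
are hypothetical. Only the Gen6 SFID position (bits 27:24) is taken from
the v3 note above; the position used for other generations is a
placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit instruction word; not Mesa's actual type. */
struct sketch_inst {
   uint64_t data[2];
};

/* Mask covering the `width` lowest bits (1 <= width <= 64). */
static uint64_t
sketch_mask(unsigned width)
{
   return width == 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
}

/* Read bits high:low (inclusive).  The field must not cross the
 * boundary between data[0] and data[1]. */
static uint64_t
sketch_inst_bits(const struct sketch_inst *inst, unsigned high, unsigned low)
{
   assert(low <= high && high < 128 && high / 64 == low / 64);
   return (inst->data[low / 64] >> (low % 64)) & sketch_mask(high - low + 1);
}

/* Write bits high:low (inclusive), leaving all other bits untouched. */
static void
sketch_inst_set_bits(struct sketch_inst *inst, unsigned high, unsigned low,
                     uint64_t value)
{
   assert(low <= high && high < 128 && high / 64 == low / 64);
   const uint64_t mask = sketch_mask(high - low + 1);
   assert((value & ~mask) == 0);   /* value must fit in the field */
   uint64_t *word = &inst->data[low / 64];
   *word = (*word & ~(mask << (low % 64))) | (value << (low % 64));
}

/* A generation-dependent field: the caller passes the generation so the
 * accessor can pick the right bit range. */
static uint64_t
sketch_inst_sfid(int gen, const struct sketch_inst *inst)
{
   if (gen == 6)
      return sketch_inst_bits(inst, 27, 24);   /* per the v3 note above */
   return sketch_inst_bits(inst, 3, 0);        /* placeholder position */
}
```

Keeping the generation check inside the accessor is what lets callers stay
generation-agnostic, which is presumably why later commits in this series
plumb a brw pointer into the helpers.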
10 years ago  i965: Pass brw into next_offset().
Kenneth Graunke [Sun, 8 Jun 2014 04:15:59 +0000 (21:15 -0700)]
i965: Pass brw into next_offset().

The new brw_inst API is going to require a brw pointer in order
to access fields (so it can do generation checks).  Plumb it in now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  i965: Remove unneeded VS workaround stalls on Baytrail.
Greg Hunt [Wed, 25 Jun 2014 13:42:24 +0000 (14:42 +0100)]
i965: Remove unneeded VS workaround stalls on Baytrail.

According to the workarounds list, these stalls aren't needed on
production Baytrail systems.  Piglit confirms that as well.

These cause a small slowdown when we are sending a large number of small
batches to the GPU.  Removing these improves performance by up to 5% on
some CPU bound SynMark tests (Batch[4-7], DrvState1, HdrBloom,
Multithread, ShMapPcf).

Signed-off-by: Gregory Hunt <greg.hunt@mobica.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965: Include marketing names for Broadwell GPUs.
Kenneth Graunke [Tue, 24 Jun 2014 23:18:11 +0000 (16:18 -0700)]
i965: Include marketing names for Broadwell GPUs.

Intel would like us to include the marketing names.  Developers
additionally want "Broadwell GT1/2/3" because it makes it easier
to identify what hardware users have when they request assistance
or report issues.

Including both makes it easy for everyone to map between the names.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago  softpipe: use last_level from sampler view, not from the resource
Roland Scheidegger [Thu, 26 Jun 2014 00:37:44 +0000 (02:37 +0200)]
softpipe: use last_level from sampler view, not from the resource

The last_level from the sampler view may be limited by the state tracker
to a value lower than what the base texture provides.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=80541.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago  targets/automake.inc: s/GALLIUM_VIDEO_CFLAGS/GALLIUM_TARGET_CFLAGS/
Emil Velikov [Thu, 12 Jun 2014 16:10:52 +0000 (17:10 +0100)]
targets/automake.inc: s/GALLIUM_VIDEO_CFLAGS/GALLIUM_TARGET_CFLAGS/

The flags are not specific to the video targets plus
we can reuse them for targets/xa and targets/gbm.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  auxiliary/vl: Remove no longer used SPLIT_TARGETS
Emil Velikov [Thu, 12 Jun 2014 16:03:50 +0000 (17:03 +0100)]
auxiliary/vl: Remove no longer used SPLIT_TARGETS

Required for the conversion stage of all VL targets to
a single library per API (static/shared pipe-drivers).

No longer required as per last commit.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  targets/radeonsi/omx: convert to static/shared pipe-drivers
Emil Velikov [Sat, 21 Jun 2014 11:38:30 +0000 (12:38 +0100)]
targets/radeonsi/omx: convert to static/shared pipe-drivers

The radeonsi counterpart of the previous commit: libomx-radeonsi is now
built into the libomx-mesa library, providing a single library per API.

v2: Include the radeon winsys only when there is a user for it.
v3: Correctly include the winsys. Now with extra brown bag :\

Note: Make sure to rebuild the .omxregister file, by executing
   $ omxregister-bellagio

This patch concludes the unification. Now libomx-mesa will be used
for all hardware - r600, radeonsi and nouveau.

Cc: Leo Liu <leo.liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years ago  targets/r600/omx: convert to static/shared pipe-drivers
Emil Velikov [Thu, 12 Jun 2014 15:38:54 +0000 (16:38 +0100)]
targets/r600/omx: convert to static/shared pipe-drivers

The r600 counterpart of the previous commit: libomx-r600 is now
built into the libomx-mesa library, providing a single library per API.

v2: Include the radeon winsys only when there is a user for it.
v3: Correctly include the winsys. Now with extra brown bag :\

Note: Make sure to rebuild the .omxregister file, by executing
   $ omxregister-bellagio

If you have more than one omx library (libomx-radeonsi, libomx-r600),
make sure to temporarily move the unused one. By the end of the series
there will be only one library that will be used for all hardware -
r600, radeonsi and nouveau.

Cc: Leo Liu <leo.liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years ago  targets/omx-nouveau: convert to static/shared pipe-drivers
Emil Velikov [Thu, 12 Jun 2014 15:33:58 +0000 (16:33 +0100)]
targets/omx-nouveau: convert to static/shared pipe-drivers

Similar to the vdpau/xvmc targets, we're going to convert the
multiple target libraries into a single one.

The library can be built with the relevant pipe-drivers
statically linked in, or loaded as shared modules.
Currently we default to static.

Note: Make sure to rebuild the .omxregister file, by executing
   $ omxregister-bellagio

If you have more than one omx library (libomx-radeonsi, libomx-r600),
make sure to temporarily move the unused one. By the end of the series
there will be only one library that will be used for all hardware -
r600, radeonsi and nouveau.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  st/omx: avoid using dynamic vid_(enc|dec)_base and avc_(name|role)
Emil Velikov [Tue, 10 Jun 2014 01:14:18 +0000 (02:14 +0100)]
st/omx: avoid using dynamic vid_(enc|dec)_base and avc_(name|role)

Strictly speaking we should not have done this in the
first place, as all of the above should be static across
the system.

Currently this may cause some minor issues, which will be
resolved in the following patches, by providing a single
library for the OMX api.

Clean up a few unneeded strcpy cases while we're at it.

Note: Make sure to rebuild the .omxregister file, by executing
   $ omxregister-bellagio

If you have more than one omx library (libomx-radeonsi, libomx-r600),
make sure to temporarily move the unused one. By the end of the series
there will be only one library that will be used for all hardware -
r600, radeonsi and nouveau.

Cc: Leo Liu <leo.liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  st/omx: provide constant number of components
Emil Velikov [Tue, 10 Jun 2014 01:28:00 +0000 (02:28 +0100)]
st/omx: provide constant number of components

The number of components and their names/roles should
be kept constant, as all of that information is cached.

Note: Make sure to rebuild the .omxregister file, by executing
   $ omxregister-bellagio

Cc: Leo Liu <leo.liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  glx: Added missing null check in GetDrawableAttribute()
Juha-Pekka Heikkila [Fri, 25 Apr 2014 08:16:50 +0000 (11:16 +0300)]
glx: Added missing null check in GetDrawableAttribute()

Add an extra null check for the GLX_BACK_BUFFER_AGE_EXT query.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  mesa/main: In register_surface() verify gl_texture_object was found
Juha-Pekka Heikkila [Thu, 8 May 2014 08:16:54 +0000 (11:16 +0300)]
mesa/main: In register_surface() verify gl_texture_object was found

Verify that _mesa_lookup_texture() returned a valid pointer before using it.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  mesa/main: Verify calloc return value in register_surface()
Juha-Pekka Heikkila [Thu, 8 May 2014 07:34:50 +0000 (10:34 +0300)]
mesa/main: Verify calloc return value in register_surface()

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  glsl: Add missing null check in push_back()
Juha-Pekka Heikkila [Wed, 7 May 2014 09:38:07 +0000 (12:38 +0300)]
glsl: Add missing null check in push_back()

Report memory error on realloc failure and don't leak any memory.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
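
Illustration (not the actual glsl code): a minimal sketch of a push_back
that reports realloc failure without leaking. The result of realloc is kept
in a temporary, so on failure the original block is still owned by the list
and can be freed normally. The names sketch_list and sketch_push_back are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_list {
   int *items;
   size_t count;
};

/* Append one value.  Returns false on allocation failure, in which case
 * the list is left unchanged. */
static bool
sketch_push_back(struct sketch_list *list, int value)
{
   assert(list != NULL);

   /* Never assign realloc's result straight to list->items: a NULL
    * return would overwrite the only pointer to the old block. */
   int *grown = realloc(list->items, (list->count + 1) * sizeof(*grown));
   if (grown == NULL)
      return false;

   grown[list->count++] = value;
   list->items = grown;
   return true;
}
```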
10 years ago  glsl: check _mesa_hash_table_create return value in link_uniform_blocks
Juha-Pekka Heikkila [Thu, 3 Apr 2014 14:06:42 +0000 (17:06 +0300)]
glsl: check _mesa_hash_table_create return value in link_uniform_blocks

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  i965/fs: Check variable_storage return value in fs_visitor::visit
Juha-Pekka Heikkila [Mon, 7 Apr 2014 11:37:42 +0000 (14:37 +0300)]
i965/fs: Check variable_storage return value in fs_visitor::visit

Check that variable_storage() found the requested fs_reg.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  i965: Handle miptree creation failure in intel_alloc_texture_storage()
Juha-Pekka Heikkila [Mon, 12 May 2014 12:25:59 +0000 (15:25 +0300)]
i965: Handle miptree creation failure in intel_alloc_texture_storage()

Check intel_miptree_create() return value before using it as
a pointer.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  i965: Check calloc return value in gather_statistics_results()
Juha-Pekka Heikkila [Thu, 8 May 2014 13:19:51 +0000 (16:19 +0300)]
i965: Check calloc return value in gather_statistics_results()

Check the calloc return value and report an error on failure. Also skip
the later results handling if there was no memory to store the results.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  i965/vec4: Try constant propagate after copy propagate made progress.
Matt Turner [Tue, 24 Jun 2014 05:29:57 +0000 (22:29 -0700)]
i965/vec4: Try constant propagate after copy propagate made progress.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965/vec4: Make try_copy_propagate() static.
Matt Turner [Tue, 24 Jun 2014 05:16:02 +0000 (22:16 -0700)]
i965/vec4: Make try_copy_propagate() static.

Now that can_do_source_mods() isn't part of the visitor, this doesn't
need to be either.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965/vec4: Rename try_copy/constant_propagat{ion,e} to match the fs.
Matt Turner [Tue, 24 Jun 2014 05:12:03 +0000 (22:12 -0700)]
i965/vec4: Rename try_copy/constant_propagat{ion,e} to match the fs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965/vec4: Constant propagate into 2-src math instructions on Gen8.
Matt Turner [Tue, 24 Jun 2014 05:07:38 +0000 (22:07 -0700)]
i965/vec4: Constant propagate into 2-src math instructions on Gen8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965/fs: Constant propagate into 2-src math instructions on Gen8.
Matt Turner [Tue, 24 Jun 2014 05:07:20 +0000 (22:07 -0700)]
i965/fs: Constant propagate into 2-src math instructions on Gen8.

total instructions in shared programs: 1878133 -> 1876986 (-0.06%)
instructions in affected programs:     153007 -> 151860 (-0.75%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965/fs: Make try_constant_propagate() static.
Matt Turner [Tue, 24 Jun 2014 05:05:03 +0000 (22:05 -0700)]
i965/fs: Make try_constant_propagate() static.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  i965: Make can_do_source_mods() a member of the instruction classes.
Matt Turner [Tue, 24 Jun 2014 04:57:31 +0000 (21:57 -0700)]
i965: Make can_do_source_mods() a member of the instruction classes.

Pretty nonsensical to have it as a method of the visitor just for access
to brw.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  glsl: Treat an interface block specifier as a level of struct nesting
Chris Forbes [Sun, 15 Jun 2014 00:57:20 +0000 (12:57 +1200)]
glsl: Treat an interface block specifier as a level of struct nesting

Fixes the piglit test:

   spec/glsl-1.50/compiler/interface-blocks-structs-defined-within-block-instanced.vert

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  glsl: Disallow primitive type layout qualifier on variables.
Chris Forbes [Thu, 12 Jun 2014 09:17:13 +0000 (21:17 +1200)]
glsl: Disallow primitive type layout qualifier on variables.

This only makes any sense on the GS input or output layout declaration,
nowhere else.

Fixes the piglit tests:

  * spec/glsl-1.50/compiler/incorrect-in-layout-qualifiers-with-variable-declarations.geom
  * spec/glsl-1.50/compiler/incorrect-out-layout-qualifiers-with-variable-declarations.geom
  * spec/glsl-1.50/compiler/layout-fs-no-output.frag
  * spec/glsl-1.50/compiler/layout-vs-no-input.vert
  * spec/glsl-1.50/compiler/layout-vs-no-output.vert

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  glsl: Relax combinations of layout qualifiers with other qualifiers.
Chris Forbes [Thu, 12 Jun 2014 07:48:58 +0000 (19:48 +1200)]
glsl: Relax combinations of layout qualifiers with other qualifiers.

Previously we disallowed any combination of layout with interpolation,
invariant, or precise qualifiers. There is very little spec guidance on
exactly which combinations should be allowed, but with ARB_sso it's
useful to allow these qualifiers with rendezvous-by-location.

Since it's unclear exactly where the layout qualifier should appear when
combined with other qualifiers, we will allow it anywhere before the
auxiliary storage qualifier.

This allows enough flexibility for all examples I've seen, while keeping
the auxiliary-storage-qualifier / storage-qualifier pair together (as
they are a single qualifier in the spec prior to
ARB_shading_language_420pack).

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago  glsl: Don't convert reductions of ivec to a dot-product
Ian Romanick [Wed, 25 Jun 2014 02:12:24 +0000 (19:12 -0700)]
glsl: Don't convert reductions of ivec to a dot-product

Mesa has an optimization that converts expressions like "v.x + v.y + v.z
+ v.w" into dot(v, 1.0).  And therein lies the rub: the other operand to
the dot-product is always a float... even if the vector is an ivec or
uvec.  This results in an assertion failure in ir_builder.

If the base type of the operand is not float, don't try the
optimization.  Dot-product is not valid on integer data.

Fixes piglit vs-integer-reduction.shader_test and OpenGL ES conformance
test ES2-CTS.gtf.GL2Tests.glGetUniform.glGetUniform.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Christoph Brill <egore911@gmail.com>
10 years ago  docs: Import 10.2.2 release notes, add news item
Carl Worth [Wed, 25 Jun 2014 04:49:38 +0000 (21:49 -0700)]
docs: Import 10.2.2 release notes, add news item

10 years ago  docs: Import 10.1.6 release notes, add news item
Carl Worth [Wed, 25 Jun 2014 04:40:15 +0000 (21:40 -0700)]
docs: Import 10.1.6 release notes, add news item

10 years ago  llvmpipe: Fix zero-division in llvmpipe_texture_layout()
Takashi Iwai [Wed, 25 Jun 2014 00:03:07 +0000 (02:03 +0200)]
llvmpipe: Fix zero-division in llvmpipe_texture_layout()

Fix the crash of "gnome-control-center info" invocation on QEMU where
zero height is passed at init.

(sroland: simplify logic by eliminating the div altogether, using 64bit mul.)

Fixes: https://bugzilla.novell.com/show_bug.cgi?id=879462
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
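
Illustration (not the actual llvmpipe code): a minimal sketch of the
technique named in the sroland note, replacing a division-based size check
with a 64-bit multiplication so that a zero dimension can never be used as
a divisor. The function name, parameter names and the size limit are all
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_TEXTURE_SIZE (UINT64_C(1) << 30)   /* hypothetical limit */

/* A division-based check such as
 *
 *    row_stride > SKETCH_MAX_TEXTURE_SIZE / rows
 *
 * divides by zero when rows == 0.  Multiplying in 64 bits asks the same
 * question without a divisor, and the product of two 32-bit values
 * cannot overflow a 64-bit result. */
static bool
sketch_layout_too_large(uint32_t row_stride, uint32_t rows)
{
   return (uint64_t)row_stride * rows > SKETCH_MAX_TEXTURE_SIZE;
}
```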
10 years ago  i965/fs: Don't fix_math_operand() on Gen >= 8.
Matt Turner [Mon, 23 Jun 2014 20:30:15 +0000 (13:30 -0700)]
i965/fs: Don't fix_math_operand() on Gen >= 8.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  i965/vec4: Don't fix_math_operand() on Gen >= 8.
Matt Turner [Mon, 23 Jun 2014 20:30:14 +0000 (13:30 -0700)]
i965/vec4: Don't fix_math_operand() on Gen >= 8.

The emit_math?_gen? functions serve to implement workarounds for the
math instruction, none of which exist on Gen8+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  i965/vec4: Don't return void from a void function.
Matt Turner [Mon, 23 Jun 2014 20:30:13 +0000 (13:30 -0700)]
i965/vec4: Don't return void from a void function.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago  r600g/compute: Defer the creation of the temporary resource
Bruno Jiménez [Thu, 19 Jun 2014 18:20:02 +0000 (20:20 +0200)]
r600g/compute: Defer the creation of the temporary resource

For the first use of a buffer, we will only need the temporary
resource in the case that a user wants to write/map to this buffer.

But in the cases where the user creates a buffer to act as an
output of a kernel, then we were creating an unneeded resource,
because it will contain garbage, and would be copied to the pool,
and destroyed when promoting.

This patch avoids the creation and copies of resources in
this case.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years ago  r600g/compute: Handle failures in compute_memory_pool_finalize
Jan Vesely [Thu, 19 Jun 2014 18:20:01 +0000 (20:20 +0200)]
r600g/compute: Handle failures in compute_memory_pool_finalize

Reviewed-by: Bruno Jiménez <brunojimen@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
10 years ago  r600g/compute: Fix possible endless loop in compute_memory_pool allocations.
Jan Vesely [Thu, 19 Jun 2014 18:20:00 +0000 (20:20 +0200)]
r600g/compute: Fix possible endless loop in compute_memory_pool allocations.

The important part is the change of the condition to <= 0. Otherwise the loop
gets stuck never actually growing the pool.

The change in the aux-need calculation guarantees max 2 iterations, and
avoids wasting memory in case a smaller item can't fit into a relatively larger
pool.

Reviewed-by: Bruno Jiménez <brunojimen@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
10 years ago  r600: Fix use after free in compute_memory_promote_item.
Jan Vesely [Mon, 23 Jun 2014 14:39:00 +0000 (10:39 -0400)]
r600: Fix use after free in compute_memory_promote_item.

The dst pointer needs to be initialized after any calls to
compute_memory_grow_pool, as the function might change the pool->vbo pointer.

This fixes crashes and assertion failures in two gegl tests.

Reviewed-by: Bruno Jiménez <brunojimen@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
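
Illustration (not the actual r600 code): a minimal sketch of the ordering
rule described above, using plain heap memory in place of the GPU buffer.
Growing the pool may move its storage, so the destination pointer is
derived only after any growth has happened. The names sketch_pool,
sketch_grow_pool and sketch_promote_item are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct sketch_pool {
   char *storage;
   size_t size;
};

/* Grow the pool.  realloc may return a different address, which
 * invalidates every pointer previously derived from pool->storage. */
static bool
sketch_grow_pool(struct sketch_pool *pool, size_t new_size)
{
   char *moved = realloc(pool->storage, new_size);
   if (moved == NULL)
      return false;
   pool->storage = moved;
   pool->size = new_size;
   return true;
}

/* Copy an item into the pool at `offset`, growing the pool if needed. */
static bool
sketch_promote_item(struct sketch_pool *pool, size_t offset,
                    const char *src, size_t len)
{
   assert(pool != NULL && src != NULL);

   if (offset + len > pool->size && !sketch_grow_pool(pool, offset + len))
      return false;

   /* Computing dst before the growth above would leave it pointing into
    * freed memory whenever the storage moved. */
   char *dst = pool->storage + offset;
   memcpy(dst, src, len);
   return true;
}
```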
10 years ago  nouveau: dup fd before passing it to device
Ilia Mirkin [Thu, 19 Jun 2014 08:25:04 +0000 (04:25 -0400)]
nouveau: dup fd before passing it to device

nouveau screens are reused for the same device node. However in the
scenario where we create screen 1, screen 2, and then delete screen 1,
the surrounding code might also close the original device node. To
protect against this, dup the fd and use the dup'd fd in the
nouveau_device. Also tell the nouveau_device that it is the owner of the
fd so that it will be closed on destruction.

Also make sure to free the nouveau_device in case of any failure.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79823
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com>
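
Illustration (not the actual nouveau code): a minimal sketch of the
ownership pattern described above, using only POSIX dup() and close(). The
device keeps its own duplicate of the descriptor and closes that duplicate
on destruction, so it is unaffected when the caller closes the original.
The names sketch_device, sketch_device_open and sketch_device_close are
hypothetical.

```c
#include <assert.h>
#include <unistd.h>

struct sketch_device {
   int fd;   /* owned by the device, closed in sketch_device_close() */
};

/* Returns 0 on success, -1 if the descriptor could not be duplicated. */
static int
sketch_device_open(struct sketch_device *dev, int caller_fd)
{
   assert(dev != NULL);

   int own_fd = dup(caller_fd);
   if (own_fd < 0)
      return -1;

   dev->fd = own_fd;
   return 0;
}

static void
sketch_device_close(struct sketch_device *dev)
{
   assert(dev != NULL);
   close(dev->fd);
   dev->fd = -1;
}
```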
10 years ago  mesa: Don't use derived vertex state in api_arrayelt.c
Fredrik Höglund [Thu, 23 Jan 2014 17:47:44 +0000 (18:47 +0100)]
mesa: Don't use derived vertex state in api_arrayelt.c

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago  nvc0: allow VIEWPORT_INDEX and LAYER to be used as input semantics
Ilia Mirkin [Sat, 21 Jun 2014 23:25:07 +0000 (19:25 -0400)]
nvc0: allow VIEWPORT_INDEX and LAYER to be used as input semantics

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years ago  mesa/st: handle gl_Layer input semantic
Ilia Mirkin [Sat, 21 Jun 2014 23:12:24 +0000 (19:12 -0400)]
mesa/st: handle gl_Layer input semantic

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years ago  nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices
Tobias Klausmann [Mon, 23 Jun 2014 21:01:42 +0000 (23:01 +0200)]
nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices

Previously, if we had something like:

  gl_ViewportIndex = idx;
  for(int i = 0; i < gl_in.length(); i++) {
     gl_Position = gl_in[i].gl_Position;
     EmitVertex();
  }
  EndPrimitive();

The right viewport index would not be set on the primitive because the
last vertex is the provoking one. However blob drivers appear to move
the gl_ViewportIndex write into the for loop, allowing the application
to be ignorant of this detail.

While the application is technically wrong here, because the blob does
it and other drivers appear to implicitly work this way as well, we add
a buffer register that viewport index writes go into, which is then
exported before every EmitVertex() call.

This fixes the remaining piglit tests in ARB_viewport_array for nv50/nvc0.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years ago  draw: (trivial) fix clamping of viewport index
Roland Scheidegger [Mon, 23 Jun 2014 20:06:15 +0000 (22:06 +0200)]
draw: (trivial) fix clamping of viewport index

The old logic would let all negative values go through unclamped, with
potentially disastrous results (probably trying to fetch viewport values
from random memory locations). GL has undefined rendering for vp indices
outside valid range but that's a bit too undefined...
(The logic is now the same as in llvmpipe.)

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
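
Illustration (not the actual draw module code): a minimal sketch of a
two-sided clamp for a signed viewport index. The function name and the
viewport limit are hypothetical, and falling back to viewport 0 for
out-of-range values is an assumption of this sketch.

```c
#include <assert.h>

#define SKETCH_MAX_VIEWPORTS 16   /* hypothetical limit */

/* A check that tests only the upper bound, e.g.
 *
 *    idx < SKETCH_MAX_VIEWPORTS ? idx : 0
 *
 * lets every negative index through, which matches the behaviour the
 * commit message describes.  Testing both bounds closes that hole. */
static int
sketch_clamp_viewport_idx(int idx)
{
   return (idx >= 0 && idx < SKETCH_MAX_VIEWPORTS) ? idx : 0;
}
```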
10 years ago  i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell.
Kenneth Graunke [Thu, 29 May 2014 07:06:08 +0000 (00:06 -0700)]
i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell.

As far as I can tell, Broadwell doesn't need any of the SURFACE_STATE
workarounds for textureGather() bugs, so there's no need to emit
a second set of identical copies.

To keep things simple, just point the gather surface index base to the
same place as the texture surface index base.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago  targets/(vdpau|xvmc): hardlink against the installed library
Emil Velikov [Mon, 23 Jun 2014 17:30:00 +0000 (18:30 +0100)]
targets/(vdpau|xvmc): hardlink against the installed library

With commits 11e46a32aed and f9ebb1ea771 we resolved the symlink
generation required by the versioning of the library.
However, they incorrectly changed the way hardlinks are created, by
linking to the ones from the build tree. If the device used for
building differs from the one set as the destination, linking will fail.

Reported-by: Andy Furniss <adf.lists@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  i965: Allow the blorp blit between BGR and RGB
Neil Roberts [Mon, 23 Jun 2014 16:43:08 +0000 (17:43 +0100)]
i965: Allow the blorp blit between BGR and RGB

Previously the blorp blitter would only be used if the format is identical or
there is only a difference between whether there is an alpha component or not.
This patch makes it also allow the blorp blitter if the only difference is the
ordering of the RGB components (i.e., RGB or BGR).

This is particularly useful since commit 61e264f4fcdba3623 because Mesa now
prefers RGB ordering for textures but the window system buffers are still
created as BGR. That means that the blorp blitter won't be used for the
(probably) common case of blitting from a texture to the window system buffer.

This doesn't cause any regressions in the FBO piglit tests on Haswell. On
Sandybridge it causes the fbo-blit-stretch test to fail but that is only
because it was failing anyway before the above commit and that commit hid the
problem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago  glsl: Silence many unused parameter warnings
Ian Romanick [Sat, 21 Jun 2014 00:27:19 +0000 (17:27 -0700)]
glsl: Silence many unused parameter warnings

In file included from ../../src/glsl/builtin_functions.cpp:61:0:
../../src/glsl/glsl_parser_extras.h:154:9: warning: unused parameter 'var' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
10 years ago  targets/xvmc: correctly generate the symlinks
Emil Velikov [Mon, 23 Jun 2014 14:54:36 +0000 (15:54 +0100)]
targets/xvmc: correctly generate the symlinks

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  targets/vdpau: correctly generate the symlinks
Emil Velikov [Mon, 23 Jun 2014 14:13:47 +0000 (15:13 +0100)]
targets/vdpau: correctly generate the symlinks

Reported-by: David Heidelberger <david.heidelberger@ixit.cz>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago  i915: Fix gen2 texblend setup
Ville Syrjälä [Mon, 16 Jun 2014 17:54:32 +0000 (20:54 +0300)]
i915: Fix gen2 texblend setup

Fix an off-by-one in the texture unit walk during texblend
setup on gen2. This caused the last enabled texunit to be
skipped, resulting in totally messed up texturing.

This is a regression introduced here:
 commit 1ad443ecdd694dd9bf3c4a5050d749fb80db6fa2
 Author: Eric Anholt <eric@anholt.net>
 Date:   Wed Apr 23 15:35:27 2014 -0700

    i915: Redo texture unit walking on i830.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
10 years ago  mesa: Make Geom.UsesEndPrimitive a bool instead of a GLboolean
Iago Toral Quiroga [Fri, 20 Jun 2014 08:01:40 +0000 (10:01 +0200)]
mesa: Make Geom.UsesEndPrimitive a bool instead of a GLboolean

10 years ago  targets/r600/xvmc: convert to static/shared pipe-drivers
Emil Velikov [Thu, 12 Jun 2014 15:59:58 +0000 (16:59 +0100)]
targets/r600/xvmc: convert to static/shared pipe-drivers

The r600 equivalent of previous commit.

v2: Correctly include the radeon winsys/radeon_common.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
10 years ago  targets/xvmc-nouveau: convert to static/shared pipe-drivers
Emil Velikov [Thu, 12 Jun 2014 14:54:15 +0000 (15:54 +0100)]
targets/xvmc-nouveau: convert to static/shared pipe-drivers

Similar to vdpau targets, we're going to convert the individual
target libraries into a single one.

The library can be built with the relevant pipe-drivers
statically linked in, or loaded as shared modules.
Currently we default to static.

Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
10 years agotargets/radeonsi/vdpau: convert to static/shared pipe-drivers
Emil Velikov [Sat, 21 Jun 2014 11:31:47 +0000 (12:31 +0100)]
targets/radeonsi/vdpau: convert to static/shared pipe-drivers

Similar to previous commits, this allows us to minimise some
of the duplication by compacting all vdpau targets into a
single library.

v2: Include the radeon winsys only when there is a user for it.
v3: Correctly include the winsys. Now with extra brown bag :\

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
10 years agotargets/r600/vdpau: convert to static/shared pipe-drivers
Emil Velikov [Thu, 12 Jun 2014 15:57:31 +0000 (16:57 +0100)]
targets/r600/vdpau: convert to static/shared pipe-drivers

Similar to previous commit, this allows us to minimise some
of the duplication by compacting all vdpau targets into a
single library.

v2: Include the radeon winsys only when there is a user for it.
v3: Correctly include the winsys. Now with extra brown bag :\

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
10 years agotargets/vdpau-nouveau: convert to static/shared pipe-drivers
Emil Velikov [Thu, 12 Jun 2014 14:41:29 +0000 (15:41 +0100)]
targets/vdpau-nouveau: convert to static/shared pipe-drivers

Create a single library (for the vdpau api) thus reducing
the overall size of mesa. Current commit converts
vdpau-nouveau, with upcoming commits handling the rest.

The library can be built with the relevant pipe-drivers
statically linked in, or loaded as shared modules.
Currently we default to static.

Add SPLIT_TARGETS to guard the other VL targets.

Note: symlink handling is rather ugly and will need an
update to work with BSD and other non-linux platforms.

v2: Split the conversion into per-target basis.

Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
10 years agoPartially revert "glsl: Add builtin define for ARB_fragment_layer_viewport"
Chris Forbes [Sun, 22 Jun 2014 11:53:59 +0000 (23:53 +1200)]
Partially revert "glsl: Add builtin define for ARB_fragment_layer_viewport"

This partially reverts commit cc18b1ec2161c846109e921d7821dfeef7a06f3a,
which dropped some unrelated code due to a fumbled rebase.

10 years agofreedreno: use util_copy_framebuffer_state()
Rob Clark [Wed, 18 Jun 2014 17:50:14 +0000 (13:50 -0400)]
freedreno: use util_copy_framebuffer_state()

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: WFI fixes/cleanup
Rob Clark [Fri, 13 Jun 2014 21:39:59 +0000 (17:39 -0400)]
freedreno/a3xx: WFI fixes/cleanup

Blob driver seems to need WFI in some cases after CP_EVENT_WRITE,
implying that this is asynchronous and should reset needs_wfi.
Also, CP_INVALIDATE_STATE seems to need WFI.  But CP_LOAD_STATE
does not.

The blob driver also puts WFIs before writing GRAS_CL_VPORT registers.
The latter may be a work-around, as these registers should be banked/
context registers.  I haven't yet found a lockup that this averts, but
I expect viewport to change infrequently so out of paranoia I will
keep these for now.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agoglsl: Add gl_Layer and gl_ViewportIndex builtins to fragment shader
Chris Forbes [Sat, 25 Jan 2014 05:29:38 +0000 (18:29 +1300)]
glsl: Add gl_Layer and gl_ViewportIndex builtins to fragment shader

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Add builtin define for ARB_fragment_layer_viewport
Chris Forbes [Sat, 25 Jan 2014 05:24:38 +0000 (18:24 +1300)]
glsl: Add builtin define for ARB_fragment_layer_viewport

The spec doesn't actually mention adding this, but this is the usual
pattern so I'm assuming it's a spec bug.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Add extension plumbing for ARB_fragment_layer_viewport
Chris Forbes [Sat, 25 Jan 2014 05:24:18 +0000 (18:24 +1300)]
glsl: Add extension plumbing for ARB_fragment_layer_viewport

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Add extension plumbing for ARB_fragment_layer_viewport
Chris Forbes [Sat, 25 Jan 2014 05:13:01 +0000 (18:13 +1300)]
mesa: Add extension plumbing for ARB_fragment_layer_viewport

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglapi: Add (empty) api section for ARB_fragment_layer_viewport
Chris Forbes [Sat, 25 Jan 2014 05:09:34 +0000 (18:09 +1300)]
glapi: Add (empty) api section for ARB_fragment_layer_viewport

This extension is purely GLSL -- there are no new GL API elements.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Save meta stencil blit programs in the context.
Kenneth Graunke [Thu, 19 Jun 2014 05:25:33 +0000 (22:25 -0700)]
i965: Save meta stencil blit programs in the context.

When the last context in a share group is destroyed, the hash table
containing all of the shader programs (ctx->Shared->ShaderObjects) is
destroyed, throwing away all of the shader programs.

Using a static variable to store program IDs ends up holding on to them
after this, so we think we still have a compiled program, when it
actually got destroyed.  _mesa_UseProgram then hits GL errors, since no
program by that ID exists.

Instead, store the program IDs in the context, so we know to recompile
if our context gets destroyed and the application creates another one.
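
A minimal sketch of the idea, assuming hypothetical stand-in types
(struct shared_state, struct context and the helper names below are
invented for the example and are not the i965 data structures). Caching
the ID in the context means a new context starts without a cached
program and recompiles, while an ID left over from an earlier context
does not exist in the new shared state:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared program table. */
struct shared_state {
   unsigned next_id;   /* last program ID handed out */
   unsigned live_id;   /* the one program that currently exists, 0 if none */
};

/* Hypothetical stand-in for a GL context. */
struct context {
   struct shared_state *shared;
   unsigned blit_prog; /* cached per context, 0 means "not compiled yet" */
};

static unsigned
compile_program(struct shared_state *shared)
{
   shared->live_id = ++shared->next_id;
   return shared->live_id;
}

static bool
program_exists(const struct shared_state *shared, unsigned id)
{
   return id != 0 && shared->live_id == id;
}

/* A fresh context has blit_prog == 0, so the program is compiled
 * again against that context's shared state. */
static unsigned
get_blit_program(struct context *ctx)
{
   if (ctx->blit_prog == 0)
      ctx->blit_prog = compile_program(ctx->shared);
   return ctx->blit_prog;
}
```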

Fixes es3conform tests when run without -minfmt (where it creates
separate contexts for testing each visual).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77865
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoscons: avoid building any piece of i915
Emil Velikov [Thu, 19 Jun 2014 00:47:38 +0000 (01:47 +0100)]
scons: avoid building any piece of i915

Leftover from commit c21fca8bf24.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>
10 years agogallivm: Fix build after LLVM commit 211259
Aaron Watry [Sat, 21 Jun 2014 00:13:30 +0000 (19:13 -0500)]
gallivm: Fix build after LLVM commit 211259

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agoglx: Don't crash on swap event for a Window (non-GLXWindow)
Daniel Manjarres [Fri, 20 Jun 2014 17:51:33 +0000 (10:51 -0700)]
glx: Don't crash on swap event for a Window (non-GLXWindow)

Prior to GLX 1.3 there was the glXMakeCurrent() function that took a
single drawable handle. The Drawable could be either a bare XID for a
Window or an XID for a GLXPixmap.

GLX 1.3 added glXMakeContextCurrent(), which takes two handles: one for
reading, one for writing. Nowadays the old glXMakeCurrent call is
implemented as a call to glXMakeContextCurrent with the single handle
duplicated.

Because of this it is allowed to use a plain-old Window ID as an
argument to glxMakeContextCurrent, although nobody really documents this
sort of thing. The manpage for the NEW call specifies the arguments as
GLXPixmaps, but the actual code accepts Window XIDs too, and handles
them correctly.

Similarly, the glXSelectEvent function can also take a bare Window XID.

The "piglit" tests all use GLXWindows and/or GLXPixmaps. You never
tested swap events with a bare Window XID. That is what my app was
doing.

The swap_events code worked with Window XIDs in mesa 7.x.y. The new code
added in versions 8, 9, and 10 assumes that all buffer swap events have
a GLXPixmap associated with them. Because of the historical quirks
above, this is not true. Swap events for bare Window XIDs do NOT have a
glxpixmap resulting in a segfault.
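
The failure mode can be sketched as follows. This is a hypothetical
model, not the libGL code: struct glx_drawable, find_drawable() and
handle_swap_event() are names made up for the example. The point is
only that the lookup result must be checked before it is used:

```c
#include <stddef.h>

typedef unsigned long xid_t;   /* stand-in for an X11 XID */

/* Hypothetical per-drawable record kept by the client library. */
struct glx_drawable {
   xid_t xid;
   unsigned last_sbc;
};

/* Hypothetical lookup: returns NULL when `xid` was never registered,
 * which is the case for a bare Window XID in this sketch. */
static struct glx_drawable *
find_drawable(struct glx_drawable *table, size_t count, xid_t xid)
{
   for (size_t i = 0; i < count; i++) {
      if (table[i].xid == xid)
         return &table[i];
   }
   return NULL;
}

/* Records the swap count for an incoming event. Returns 1 when the
 * drawable was known and updated, 0 when the lookup found nothing and
 * the update was skipped instead of dereferencing NULL. */
static int
handle_swap_event(struct glx_drawable *table, size_t count,
                  xid_t xid, unsigned sbc)
{
   struct glx_drawable *draw = find_drawable(table, count, xid);
   if (draw == NULL)
      return 0;
   draw->last_sbc = sbc;
   return 1;
}
```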

Any app that uses the old-school glXMakeCurrent call with a Window XID
while trying to use swap_events will crash when the libs try to look up
the nonexistent GLXPixmap associated with the incoming swap event.

I believe that the people who wrote the spec overlooked this, because
the "sbc" field comes from the OML_sync extension that is defined in
terms of glxpixmaps only.

v2 (idr): Formatting changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
10 years agor600g/compute: Use gallium util functions for double lists
Bruno Jiménez [Wed, 18 Jun 2014 15:01:59 +0000 (17:01 +0200)]
r600g/compute: Use gallium util functions for double lists

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: Map only against intermediate buffers
Bruno Jiménez [Wed, 18 Jun 2014 15:01:58 +0000 (17:01 +0200)]
r600g/compute: Map only against intermediate buffers

With this we can ensure that mapped buffers will never change
their position when relocating the pool.

This patch should finally solve the mapping bug.

v2: Use the new is_item_in_pool util function,
    as suggested by Tom Stellard

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: Implement compute_memory_demote_item
Bruno Jiménez [Wed, 18 Jun 2014 15:01:57 +0000 (17:01 +0200)]
r600g/compute: Implement compute_memory_demote_item

This function will be used when we want to map an item
that is already in the pool.

v2: Use temporary variables to avoid so many castings in functions,
    as suggested by Tom Stellard

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: Avoid problems when promoting items mapped for reading
Bruno Jiménez [Wed, 18 Jun 2014 15:01:56 +0000 (17:01 +0200)]
r600g/compute: Avoid problems when promoting items mapped for reading

According to the OpenCL spec, it is possible to have a buffer mapped
for reading and at the same time read from it using commands or kernels.

With this we can keep the mapping (that exists against the
temporary item) and read with a kernel (from the item we have
just added to the pool) without problems.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: Only move to the pool the buffers marked for promoting
Bruno Jiménez [Wed, 18 Jun 2014 15:01:55 +0000 (17:01 +0200)]
r600g/compute: Only move to the pool the buffers marked for promoting

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: divide the item list in two
Bruno Jiménez [Wed, 18 Jun 2014 15:01:54 +0000 (17:01 +0200)]
r600g/compute: divide the item list in two

Now we will have a list with the items that are in the pool
(item_list) and the items that are outside it (unallocated_list).

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: Add statuses to the compute_memory_items
Bruno Jiménez [Wed, 18 Jun 2014 15:01:53 +0000 (17:01 +0200)]
r600g/compute: Add statuses to the compute_memory_items

These statuses will help track whether the items are mapped
or if they should be promoted to or demoted from the pool
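
One common way to track such statuses is a set of bit flags. The sketch
below assumes that approach; the flag names, struct item and the helper
functions are hypothetical and may not match the r600g code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits; the real names and values may differ. */
enum item_status {
   ITEM_MAPPED_FOR_READING = 1 << 0,
   ITEM_FOR_PROMOTING      = 1 << 1,
   ITEM_FOR_DEMOTING       = 1 << 2
};

/* Reduced stand-in for an item: only the status word is modelled. */
struct item {
   uint32_t status;
};

static void
item_set(struct item *it, uint32_t flag)
{
   it->status |= flag;
}

static void
item_clear(struct item *it, uint32_t flag)
{
   it->status &= ~flag;
}

static bool
item_has(const struct item *it, uint32_t flag)
{
   return (it->status & flag) != 0;
}
```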

v2: Use the new is_item_in_pool util function,
    as suggested by Tom Stellard

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: Add an util function to know if an item is in the pool
Bruno Jiménez [Wed, 18 Jun 2014 15:01:52 +0000 (17:01 +0200)]
r600g/compute: Add an util function to know if an item is in the pool

Every item that has been placed in the pool must have start_in_dw
different from -1.
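
A sketch of such a check, assuming a reduced struct that models only
the start_in_dw field (the integer type and the rest of the layout are
assumptions):

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for the item struct; only the field named above
 * is modelled. */
struct compute_memory_item {
   int64_t start_in_dw;   /* offset in the pool in dwords, -1 if not placed */
};

/* An item is in the pool once it has been given a start offset. */
static inline bool
is_item_in_pool(const struct compute_memory_item *item)
{
   return item->start_in_dw != -1;
}
```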

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/compute: Add an intermediate resource for OpenCL buffers
Bruno Jiménez [Wed, 18 Jun 2014 15:01:51 +0000 (17:01 +0200)]
r600g/compute: Add an intermediate resource for OpenCL buffers

This patch completely changes the way buffers are added to the
compute_memory_pool. Before this, whenever we were going to
map a buffer or write to or read from it, it would get placed
into the pool. Now, every unallocated buffer has its own
r600_resource until it is allocated in the pool.

NOTE: This patch also increases the GPU memory usage at the moment
of putting every buffer in its place. More or less, the memory
usage is ~2x(sum of every buffer size).

v2: Cleanup

v3: Use temporary variables to avoid so many castings in functions,
    as suggested by Tom Stellard

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agomesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.
Iago Toral Quiroga [Mon, 16 Jun 2014 15:00:15 +0000 (17:00 +0200)]
mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
10 years agomesa: Init Geom.UsesEndPrimitive in shader programs.
Iago Toral Quiroga [Mon, 16 Jun 2014 14:57:59 +0000 (16:57 +0200)]
mesa: Init Geom.UsesEndPrimitive in shader programs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Optimize (v.x + v.y) + (v.z + v.w) into dot(v, 1.0).
Matt Turner [Sun, 2 Mar 2014 16:59:50 +0000 (08:59 -0800)]
glsl: Optimize (v.x + v.y) + (v.z + v.w) into dot(v, 1.0).

Cuts five instructions out of SynMark's Gl32VSInstancing benchmark.
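
The identity behind the rewrite can be checked with a small sketch
(struct vec4, dot4() and sum_pairs() are names made up here; this is
not the GLSL compiler code). Since every component is multiplied by
1.0, the dot product reduces to the sum of the components:

```c
/* Four-component vector, mirroring a GLSL vec4. */
struct vec4 {
   float x, y, z, w;
};

/* Plain four-component dot product. */
static float
dot4(struct vec4 a, struct vec4 b)
{
   return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* The expression shape named in the commit title. */
static float
sum_pairs(struct vec4 v)
{
   return (v.x + v.y) + (v.z + v.w);
}
```

The two forms add the components in a different order, so for
arbitrary floating-point inputs the results may differ in the last
bits.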

10 years agoglsl: Pass in options to do_algebraic().
Matt Turner [Sat, 1 Mar 2014 01:49:20 +0000 (17:49 -0800)]
glsl: Pass in options to do_algebraic().

Will be used in the next commit.

Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoglsl: Rebalance expression trees that are reduction operations.
Matt Turner [Sat, 1 Mar 2014 04:11:32 +0000 (20:11 -0800)]
glsl: Rebalance expression trees that are reduction operations.

The intention of this pass was to give us better instruction scheduling
opportunities, but it unexpectedly reduced some instruction counts as
well:

total instructions in shared programs: 1666639 -> 1666073 (-0.03%)
instructions in affected programs:     54612 -> 54046 (-1.04%)
(and trades 4 SIMD16 programs in SS3)
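
To see why rebalancing can help scheduling, compare the length of the
dependency chain before and after. The sketch below is an assumption
about the general shape of the transformation, not the pass itself;
chain_depth() and balanced_depth() are hypothetical helpers:

```c
/* Depth of a left-leaning chain ((a + b) + c) + ... over n operands:
 * every addition depends on the previous one. */
static int
chain_depth(int n)
{
   return n > 1 ? n - 1 : 0;
}

/* Depth after splitting the operands evenly at every level, as in
 * (a + b) + (c + d): the number of levels grows with log2(n). */
static int
balanced_depth(int n)
{
   int depth = 0;
   for (int span = 1; span < n; span *= 2)
      depth++;
   return depth;
}
```

For four operands the chain is three additions deep, while the balanced
tree is two deep and its two inner additions are independent of each
other.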

10 years agoautomake: include the libdeps in the correct order
Emil Velikov [Thu, 19 Jun 2014 21:46:25 +0000 (22:46 +0100)]
automake: include the libdeps in the correct order

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80254
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years agoclover: Calculate the serialized size of a module efficiently.
Francisco Jerez [Sat, 14 Jun 2014 19:03:02 +0000 (21:03 +0200)]
clover: Calculate the serialized size of a module efficiently.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
10 years agoclover: Optimize module serialization for vectors of fundamental types.
Francisco Jerez [Sat, 14 Jun 2014 18:53:35 +0000 (20:53 +0200)]
clover: Optimize module serialization for vectors of fundamental types.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
10 years agogallivm: set mcpu when initializing llvm execution engine
Roland Scheidegger [Thu, 19 Jun 2014 01:27:26 +0000 (03:27 +0200)]
gallivm: set mcpu when initializing llvm execution engine

Previously, llvm detected cpu features automatically (based on the host
cpu) when the execution engine was created. This is no longer the case,
which meant llvm was not able to emit some of the intrinsics we use, as
we didn't specify any sse attributes. (This was not a problem on avx
supporting systems only, since we always set that attribute manually,
even though at least some llvm versions enable it anyway.) So, instead
of trying to figure out which MAttrs to set, just set MCPU.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=77493.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>