mesa.git
7 years ago gallium/radeon: strengthen some checking for DMA preparation
Marek Olšák [Wed, 11 May 2016 12:09:55 +0000 (14:09 +0200)]
gallium/radeon: strengthen some checking for DMA preparation

Just for consistency. This doesn't fix anything, because DCC is not
supported with non-mipmapped textures.

v1.1: fix the comment about DCC

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years ago gallium/util: add util_texrange_covers_whole_level from radeon
Marek Olšák [Mon, 9 May 2016 11:36:39 +0000 (13:36 +0200)]
gallium/util: add util_texrange_covers_whole_level from radeon

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years ago nir: allow sat on all float destination types
Ilia Mirkin [Tue, 31 May 2016 21:50:04 +0000 (17:50 -0400)]
nir: allow sat on all float destination types

With the introduction of fp64 and fp16 to nir, there are now a bunch of
float types running around. An F1 2015 shader ends up with an i2f.sat
operation, which has a nir_type_float32 destination. Allow sat on all
the float destination types.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago radeonsi: fix the raster config setup for 1 RB iceland chips
Alex Deucher [Mon, 23 May 2016 19:53:56 +0000 (15:53 -0400)]
radeonsi: fix the raster config setup for 1 RB iceland chips

I didn't realize there were 1 and 2 RB variants when this code
was originally added.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
7 years ago mesa/sampler: fix error codes for sampler parameters.
Dave Airlie [Wed, 1 Jun 2016 06:35:59 +0000 (16:35 +1000)]
mesa/sampler: fix error codes for sampler parameters.

The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it,
however version 8 of it fixed this, and the GL specs also have
the fixed value in them.

Fixes:
GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago glsl: define some GLES3 constants in GLSL 4.1
Dave Airlie [Wed, 1 Jun 2016 06:16:30 +0000 (16:16 +1000)]
glsl: define some GLES3 constants in GLSL 4.1

The GLSL 4.1 spec adds:
gl_MaxVertexUniformVectors
gl_MaxFragmentUniformVectors
gl_MaxVaryingVectors

This fixes:
GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago i965: Add norbc debug option
Topi Pohjolainen [Tue, 31 May 2016 13:47:50 +0000 (16:47 +0300)]
i965: Add norbc debug option

This INTEL_DEBUG option disables lossless compression (also known
as render buffer compression).

v2: (Matt) Use likely(!lossless_compression_disabled) instead of
           !likely(lossless_compression_disabled)
    (Grazvydas) Update docs/envvars.html

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
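
The v2 note about likely() can be illustrated with a small sketch. Both
forms below return the same truth value; they differ only in the branch
hint given to the compiler. The macro definitions and the helper names
are assumptions made for illustration, not the driver's actual code.

```c
#include <stdbool.h>

/* Branch-hint macros in the common __builtin_expect style (assumed here;
 * compilers without the builtin get a plain fallback). */
#if defined(__GNUC__) || defined(__clang__)
#define likely(x)   __builtin_expect(!!(x), 1)
#else
#define likely(x)   (!!(x))
#endif

/* Preferred form: the hint covers the whole negated condition, i.e.
 * "the debug option is normally not set". */
static bool
sketch_use_compression(bool lossless_compression_disabled)
{
   if (likely(!lossless_compression_disabled))
      return true;
   return false;
}

/* Same result, but the hint now claims that "disabled" is the common
 * case, which is the opposite of what is intended. */
static bool
sketch_use_compression_wrong_hint(bool lossless_compression_disabled)
{
   if (!likely(lossless_compression_disabled))
      return true;
   return false;
}
```

The hint never changes program behaviour, only which path the compiler
treats as the hot one.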
7 years ago i965/gen9: Configure rbc buffers as plain for non-rbc tex views
Topi Pohjolainen [Tue, 31 May 2016 07:36:12 +0000 (10:36 +0300)]
i965/gen9: Configure rbc buffers as plain for non-rbc tex views

Fixes rendering in Shadow of Mordor with rbc. The application writes an
RGBA_UNORM texture, filling it with values that it later wants to
treat as SRGB_ALPHA.
The Intel driver enables lossless compression for the buffer at the time
of writing. However, the driver fails to make sure the buffer can be
sampled as something else later on, and unfortunately the hardware has a
restriction on using lossless compression with srgb formats, which
appears to extend to the sampling engine as well.
Requesting srgb-to-linear conversion on top of a compressed buffer
results in color values that are pretty much garbage.

Fortunately none of tracked benchmarks showed a regression with
this.

v2 (Matt): Add missing space

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years ago i965: Fix the passthrough TCS for isolines.
Kenneth Graunke [Thu, 26 May 2016 07:29:56 +0000 (00:29 -0700)]
i965: Fix the passthrough TCS for isolines.

We weren't setting up several of the uniform values for the patch
header, so we'd crash when uploading push constants.  We at least
need to initialize them to zero.  We also had the isoline parameters
reversed, so it would also render incorrectly (if it didn't crash).

Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in
GL44-CTS.tessellation_shader.single.max_patch_vertices.

(*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
7 years ago i965/xfb: skip components in correct buffer.
Dave Airlie [Wed, 1 Jun 2016 04:10:22 +0000 (14:10 +1000)]
i965/xfb: skip components in correct buffer.

The driver was adding the skip components but always for buffer 0.

This fixes:
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago glsl/linker: fix multiple streams transform feedback.
Dave Airlie [Tue, 31 May 2016 02:51:47 +0000 (12:51 +1000)]
glsl/linker: fix multiple streams transform feedback.

e2791b38b42f83add5b07298c39741bf0a6d7d4b
mesa/program_interface_query: fix transform feedback varyings.

caused a regression in
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams
on radeonsi.

The problem was that it was using the skip components varying to set
the stream id, when it should wait until a varying was written. This
just adds the varying checks in the right place.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago mesa/bufferobj: use mapping range in BufferSubData.
Dave Airlie [Wed, 25 May 2016 04:02:27 +0000 (14:02 +1000)]
mesa/bufferobj: use mapping range in BufferSubData.

According to GL4.5 spec:
An INVALID_OPERATION error is generated if any part of the specified
buffer range is mapped with MapBufferRange or MapBuffer (see section
6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the
MapBufferRange access flags.

So we should use the "if range is mapped" path.

This fixes:
GL45-CTS.buffer_storage.map_persistent_buffer_sub_data

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
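
The rule quoted from the GL 4.5 spec can be sketched as a range check.
This is a minimal illustration under assumed names (sketch_buffer and
its fields are hypothetical, not Mesa's gl_buffer_object): the update is
an error only when it touches the mapped range and the mapping is not
persistent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer state; field names are illustrative only. */
struct sketch_buffer {
   bool mapped;        /* is any range currently mapped? */
   size_t map_offset;  /* start of the mapped range */
   size_t map_length;  /* length of the mapped range */
   bool persistent;    /* mapped with MAP_PERSISTENT_BIT? */
};

/* Returns true when updating [offset, offset + size) should raise
 * INVALID_OPERATION according to the quoted spec text. */
static bool
sketch_sub_data_is_error(const struct sketch_buffer *buf,
                         size_t offset, size_t size)
{
   if (!buf->mapped || buf->persistent)
      return false;

   /* Only an overlap with the mapped range matters, not the mere
    * fact that the buffer is mapped somewhere. */
   return offset < buf->map_offset + buf->map_length &&
          buf->map_offset < offset + size;
}
```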
7 years ago nv50/ir: fix error finding free element in bitset in some situations
Ilia Mirkin [Tue, 31 May 2016 04:33:50 +0000 (00:33 -0400)]
nv50/ir: fix error finding free element in bitset in some situations

This really only hits for bitsets with a size of a multiple of 32. We
can end up with pos = -1 as a result of the ffs, which we in turn decide
is a valid position (since we fall through the loop and i == 1, we end
up adding 32 to it, so end up returning 31 again).

Up until recently this was largely unreachable, as the register file
sizes were all 63 or 255. However with the advent of compute shaders
which can restrict the number of registers, this can now happen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
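
The failure mode described above (a full word yielding position -1,
which is then treated as valid) can be avoided by never deriving a
position from a word that has no free bit. The following is a sketch of
that idea in C, not the nv50/ir code itself; the function name and the
32-bit word layout are assumptions.

```c
#include <stdint.h>

/* Returns the index of the first clear bit in a bitset of `size` bits
 * stored in 32-bit words, or -1 if every bit below `size` is set. */
static int
sketch_bitset_find_free(const uint32_t *words, unsigned size)
{
   const unsigned num_words = (size + 31) / 32;

   for (unsigned i = 0; i < num_words; i++) {
      const uint32_t free_bits = ~words[i];

      /* A full word has no free bit: skip it instead of computing a
       * position from it. */
      if (free_bits == 0)
         continue;

      /* Find the lowest set bit of free_bits. */
      unsigned bit = 0;
      while (!(free_bits & (UINT32_C(1) << bit)))
         bit++;

      /* Bits at or beyond `size` in the last word are padding. */
      const unsigned pos = i * 32 + bit;
      return pos < size ? (int)pos : -1;
   }
   return -1;
}
```

The case the commit mentions is a bitset whose size is a multiple of 32
and whose words are all full; here that returns -1 rather than 31.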
7 years ago nv50/ir: print relevant file's bitset when showing RA info
Ilia Mirkin [Tue, 31 May 2016 04:33:19 +0000 (00:33 -0400)]
nv50/ir: print relevant file's bitset when showing RA info

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years ago Revert "glsl: fix xfb_offset unsized array validation"
Timothy Arceri [Tue, 31 May 2016 23:21:01 +0000 (09:21 +1000)]
Revert "glsl: fix xfb_offset unsized array validation"

This reverts commit aac90ba2920cf5ceb4df6dba776dd3952780e456.

The commit caused a regression in:
piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom

Also the CTS test it was meant to fix seems like it may be bogus.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
7 years ago i965/fs: Allow scalar source regions on SNB math instructions.
Francisco Jerez [Sat, 28 May 2016 06:29:14 +0000 (23:29 -0700)]
i965/fs: Allow scalar source regions on SNB math instructions.

I haven't found any evidence that this isn't supported by the
hardware, in fact according to the SNB hardware spec:

 "The supported regioning modes for math instructions are align16,
  align1 with the following restrictions:
   - Scalar source is supported.
  [...]
   - Source and destination offset must be the same, except the case of
     scalar source."

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years ago i965/fs: Fix constant combining for instructions that cannot accept source mods.
Francisco Jerez [Sat, 28 May 2016 06:29:10 +0000 (23:29 -0700)]
i965/fs: Fix constant combining for instructions that cannot accept source mods.

This is the case for SNB math instructions so we need to be careful
and insert the literal value of the immediate into the table (rather
than its absolute value) if the instruction is unable to invert the
sign of the constant on the fly.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
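
As a rough sketch of the distinction drawn above: when the instruction
can negate its source, the table can hold the absolute value plus a
negate flag; otherwise it has to hold the literal value. The struct and
function names here are hypothetical and do not mirror the i965 pass.

```c
#include <stdbool.h>

/* Hypothetical table entry for a combined constant. */
struct sketch_imm_entry {
   float value;   /* value stored in the constant table */
   bool negate;   /* source modifier applied at the use site */
};

static struct sketch_imm_entry
sketch_make_imm_entry(float imm, bool can_do_source_mods)
{
   struct sketch_imm_entry e;

   if (can_do_source_mods) {
      /* Share one table slot between +x and -x. */
      e.value = imm < 0.0f ? -imm : imm;
      e.negate = imm < 0.0f;
   } else {
      /* The instruction cannot flip the sign, so store it as-is. */
      e.value = imm;
      e.negate = false;
   }
   return e;
}
```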
7 years ago i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies.
Francisco Jerez [Wed, 25 May 2016 20:17:41 +0000 (13:17 -0700)]
i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF...
Francisco Jerez [Fri, 27 May 2016 23:03:34 +0000 (16:03 -0700)]
i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes.

Which requires using a bitset instead of a boolean flag to keep track
of the GRFs we've seen a generating instruction for already.  The
search loop continues until all instructions initializing the value of
the source VGRF have been found, or it is determined that coalescing
is not possible.

Fixes a few piglit test cases on Gen4-6 which were regressed by
6956015aa514f2d06d0e4b33bfe6bca83142fbf0 due to the different (yet
perfectly valid) ordering in which copy instructions are emitted now
by the simd lowering pass, which had the side effect of causing this
optimization pass to start corrupting the program in cases where a
VGRF-to-MRF copy instruction would be eliminated but only the last
instruction writing to the source VGRF region would be rewritten to
point to the target MRF.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
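
The switch from a boolean flag to a bitset can be sketched as follows:
each generating instruction marks the GRFs it writes, and the search is
complete only once every GRF of the VGRF has been marked. Names and the
32-GRF limit are assumptions for illustration, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one instruction writing part of a VGRF. */
struct sketch_grf_write {
   unsigned first_grf;  /* first GRF written, relative to the VGRF */
   unsigned num_grfs;   /* number of consecutive GRFs written */
};

/* Returns true once the writes seen so far cover every GRF of a VGRF
 * that is `vgrf_size` GRFs long (at most 32 in this sketch). */
static bool
sketch_vgrf_fully_written(const struct sketch_grf_write *writes,
                          unsigned num_writes, unsigned vgrf_size)
{
   assert(vgrf_size > 0 && vgrf_size <= 32);

   const uint32_t all = vgrf_size == 32 ? UINT32_MAX
                                        : (UINT32_C(1) << vgrf_size) - 1;
   uint32_t seen = 0;

   for (unsigned i = 0; i < num_writes; i++) {
      for (unsigned g = 0; g < writes[i].num_grfs; g++) {
         const unsigned idx = writes[i].first_grf + g;
         if (idx < vgrf_size)
            seen |= UINT32_C(1) << idx;
      }
   }
   return seen == all;
}
```

A single boolean would report success after the first write found,
which is the situation described above where only the last writer was
rewritten.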
7 years ago i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation.
Francisco Jerez [Fri, 27 May 2016 21:17:28 +0000 (14:17 -0700)]
i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation.

This will be required to correctly transform the destination of 8-wide
instructions that write a single GRF of a VGRF to MRF copy marked
COMPR4.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops.
Francisco Jerez [Fri, 27 May 2016 20:15:55 +0000 (13:15 -0700)]
i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops.

This will allow compute_to_mrf to handle cases where the source of the
VGRF-to-MRF copy is initialized by more than one instruction.  In such
cases we cannot rewrite the destination of any of the generating
instructions until it's known whether the whole VGRF source region can
be coalesced into the destination MRF, which will imply continuing the
search until all generating instructions have been found or it has
been determined that the VGRF and MRF registers cannot be coalesced.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago i965/fs: Fix compute-to-mrf VGRF region coverage condition.
Francisco Jerez [Fri, 27 May 2016 23:41:35 +0000 (16:41 -0700)]
i965/fs: Fix compute-to-mrf VGRF region coverage condition.

Compute-to-mrf was checking whether the destination of scan_inst is
more than one component (making assumptions about the instruction data
type) in order to find out whether the result is being fully copied
into the MRF destination, which is rather inaccurate in cases where a
single-component instruction is only partially contained in the source
region, or when the execution size of the copy and scan_inst
instructions differ.  Instead check whether the destination region of
the instruction is really contained within the bounds of the source
region of the copy.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
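
The containment condition described above can be written as a plain
interval check. This sketch assumes regions are expressed as a byte
offset and a byte size; the function name is hypothetical.

```c
#include <stdbool.h>

/* Returns true when the region written by an instruction,
 * [dst_offset, dst_offset + dst_size), lies entirely within the region
 * read by the copy, [src_offset, src_offset + src_size). */
static bool
sketch_region_contained_in(unsigned dst_offset, unsigned dst_size,
                           unsigned src_offset, unsigned src_size)
{
   return dst_offset >= src_offset &&
          dst_offset + dst_size <= src_offset + src_size;
}
```

Unlike a component-count test, this makes no assumption about data type
or execution size.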
7 years ago i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap().
Francisco Jerez [Fri, 27 May 2016 19:50:28 +0000 (12:50 -0700)]
i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap().

Compute-to-mrf was being rather heavy-handed about checking whether
instruction source or destination regions interfere with the copy
instruction, which could conceivably lead to program miscompilation.
Fix it by using regions_overlap() instead of the open-coded and
dubiously correct overlap checks.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
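
For reference, a generic overlap test on half-open byte ranges looks
like the sketch below. It is not the driver's regions_overlap(), which
also has to deal with register files; the name and parameters are
assumptions.

```c
#include <stdbool.h>

/* Two half-open byte ranges overlap when each one starts before the
 * other one ends. */
static bool
sketch_byte_ranges_overlap(unsigned a_offset, unsigned a_size,
                           unsigned b_offset, unsigned b_size)
{
   return a_offset < b_offset + b_size &&
          b_offset < a_offset + a_size;
}
```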
7 years ago i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
Francisco Jerez [Fri, 27 May 2016 06:53:31 +0000 (23:53 -0700)]
i965/fs: Teach regions_overlap() about COMPR4 MRF regions.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago Don't use python 3
Dylan Baker [Tue, 31 May 2016 20:31:44 +0000 (13:31 -0700)]
Don't use python 3

There are now no files that require python 3, so for now just remove
the python 3 dependency and use python 2. I think the right plan is to
just get all of the python ready for python 3, and then use whatever
python is available.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>

7 years ago genxml: change shebang to python 2
Dylan Baker [Tue, 31 May 2016 18:40:22 +0000 (11:40 -0700)]
genxml: change shebang to python 2

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>

7 years ago genxml: use the isalpha method rather than str.isalpha.
Dylan Baker [Tue, 31 May 2016 20:33:50 +0000 (13:33 -0700)]
genxml: use the isalpha method rather than str.isalpha.

This fixes gen_pack_header to work on python 2, where name[0] is unicode
not str.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>

7 years ago genxml: require future imports for python2 compatibility.
Dylan Baker [Tue, 31 May 2016 18:36:26 +0000 (11:36 -0700)]
genxml: require future imports for python2 compatibility.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>

7 years ago genxml: mark re strings as raw
Dylan Baker [Tue, 31 May 2016 18:33:19 +0000 (11:33 -0700)]
genxml: mark re strings as raw

This is a correctness issue.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>

7 years ago genxml: Make classes descendants of object
Dylan Baker [Tue, 31 May 2016 18:31:18 +0000 (11:31 -0700)]
genxml: Make classes descendants of object

This is the default in python3, but in python2 you get old style
classes. No one likes old-style classes.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>

7 years ago genxml: mark gen_pack_header.py as encoded in utf-8
Dylan Baker [Tue, 31 May 2016 18:29:50 +0000 (11:29 -0700)]
genxml: mark gen_pack_header.py as encoded in utf-8

There is unicode in this file, and I'm actually surprised that the
python interpreter hasn't gotten grumpy.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>

7 years ago radeonsi: Decompress DCC textures in a render feedback loop.
Bas Nieuwenhuizen [Tue, 31 May 2016 12:11:49 +0000 (14:11 +0200)]
radeonsi: Decompress DCC textures in a render feedback loop.

By using a counter to quickly reject textures that are not
bound to a framebuffer, the performance impact when binding
sampler_views/images is not too large.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
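
The counter-based quick reject can be sketched like this. The struct
and function names are hypothetical; the point is only that a texture
whose counter is zero cannot be part of a render feedback loop, so the
more expensive check can be skipped when binding sampler views or
images.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical texture state; only the counter matters here. */
struct sketch_texture {
   unsigned framebuffers_bound;  /* number of framebuffer bindings */
};

static void
sketch_framebuffer_bind(struct sketch_texture *tex)
{
   tex->framebuffers_bound++;
}

static void
sketch_framebuffer_unbind(struct sketch_texture *tex)
{
   assert(tex->framebuffers_bound > 0);
   tex->framebuffers_bound--;
}

/* Cheap test run on every sampler view / image bind. */
static bool
sketch_needs_feedback_loop_check(const struct sketch_texture *tex)
{
   return tex->framebuffers_bound != 0;
}
```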
7 years ago radeonsi: Add counter to check if a texture is bound to a framebuffer.
Bas Nieuwenhuizen [Tue, 31 May 2016 11:44:03 +0000 (13:44 +0200)]
radeonsi: Add counter to check if a texture is bound to a framebuffer.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago vc4: Fix compiler warnings in fail_instr path of QIR validate pass
Rhys Kidd [Fri, 20 May 2016 03:17:20 +0000 (23:17 -0400)]
vc4: Fix compiler warnings in fail_instr path of QIR validate pass

Introduced in 8e2d0843c02daf5280184f179ae8ed440ac90d7f.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years ago anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards
Emil Velikov [Tue, 31 May 2016 13:55:04 +0000 (14:55 +0100)]
anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards

The generated sources should follow the example set by the vulkan
headers and our non-generated code. Namely: the code for all supported
platforms should be available, each one guarded by its respective
VK_USE_PLATFORM_*_KHR macro.

v2: Reword commit message.

Cc: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)
7 years ago svga: change enum pipe_resource_usage back to unsigned
Brian Paul [Tue, 31 May 2016 13:25:03 +0000 (07:25 -0600)]
svga: change enum pipe_resource_usage back to unsigned

This parameter is actually a bitmask of PIPE_TRANSFER_x flags.
Change it back to a simple unsigned type.  IIRC, some compilers
complain about masks of enum values.  Also, this makes the function
signature match u_resource_vtbl::transfer_map() again.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
7 years ago radeonsi: fix CP DMA hazard with index buffer fetches
Marek Olšák [Thu, 26 May 2016 20:00:03 +0000 (22:00 +0200)]
radeonsi: fix CP DMA hazard with index buffer fetches

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
7 years ago r600g: do GL-compliant integer resolves
Marek Olšák [Tue, 31 May 2016 10:03:32 +0000 (12:03 +0200)]
r600g: do GL-compliant integer resolves

The GL spec has been clarified and the new rule says we should just
copy 1 sample. u_blitter does the right thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago radeonsi: do GL-compliant integer resolves
Marek Olšák [Tue, 31 May 2016 10:03:32 +0000 (12:03 +0200)]
radeonsi: do GL-compliant integer resolves

The GL spec has been clarified and the new rule says we should just
copy 1 sample. u_blitter does the right thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago gallium/u_blitter: do GL-compliant integer resolves
Marek Olšák [Tue, 31 May 2016 10:03:32 +0000 (12:03 +0200)]
gallium/u_blitter: do GL-compliant integer resolves

The GL spec has been clarified and the new rule says we should just
copy 1 sample.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago mesa: fix crash in driver_RenderTexture_is_safe
Marek Olšák [Mon, 30 May 2016 14:29:18 +0000 (16:29 +0200)]
mesa: fix crash in driver_RenderTexture_is_safe

This just fixes the crash with the apitrace in the bug report.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95246

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago radeonsi: don't flush TC at the end of IBs on DRM >= 3.2.0
Marek Olšák [Thu, 26 May 2016 18:39:51 +0000 (20:39 +0200)]
radeonsi: don't flush TC at the end of IBs on DRM >= 3.2.0

It's not needed since it was fixed in the kernel.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
7 years ago gallium/radeon: fixed division by zero
Jakob Sinclair [Wed, 18 May 2016 17:48:29 +0000 (19:48 +0200)]
gallium/radeon: fixed division by zero

Coverity is getting a false positive that a division by zero can occur
here. This change will silence the Coverity warnings as a division by zero
cannot occur in this case.

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
7 years ago st/glsl_to_tgsi: prevent infinite loop
Eric Engestrom [Tue, 31 May 2016 01:20:12 +0000 (02:20 +0100)]
st/glsl_to_tgsi: prevent infinite loop

`unsigned j` would never fail `j >= 0`, leading to an infinite loop as
`j--` wraps around.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
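
The pitfall is generic C: an unsigned index can never be negative, so
`j >= 0` is always true. One common way to walk an array backwards with
an unsigned index is sketched below; the function is a hypothetical
example, not the glsl_to_tgsi code.

```c
/* Returns the index of the last element equal to `key`, or -1. */
static int
sketch_find_last(const int *values, unsigned count, int key)
{
   /* Broken form: for (unsigned j = count - 1; j >= 0; j--)
    * never terminates, because j wraps around to UINT_MAX.
    *
    * Working form: test j before decrementing it, so the loop body
    * sees count - 1 down to 0 and the loop stops once j reaches 0. */
   for (unsigned j = count; j-- > 0; ) {
      if (values[j] == key)
         return (int)j;
   }
   return -1;
}
```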
7 years ago glsl/images: bounds check image unit assignment
Dave Airlie [Mon, 23 May 2016 02:49:25 +0000 (12:49 +1000)]
glsl/images: bounds check image unit assignment

The CTS test:
GL45-CTS.multi_bind.dispatch_bind_image_textures
binds 192 image uniforms, we reject this later,
but not until after we trash the contents of the
struct gl_shader.

Error now reads:
Too many compute shader image uniforms (192 > 16)
instead of
Too many compute shader image uniforms (2745344416 > 16)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
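
A minimal sketch of the idea: keep counting image uniforms so the later
limit check can report the real number, but only write into the
fixed-size array while the index is in range. The limit of 16 is taken
from the error message above; the struct and function names are
hypothetical.

```c
#include <stdbool.h>

#define SKETCH_MAX_IMAGE_UNIFORMS 16

/* Hypothetical stand-in for the per-shader image state. */
struct sketch_shader {
   unsigned num_images;
   unsigned image_units[SKETCH_MAX_IMAGE_UNIFORMS];
};

/* Assigns `count` consecutive image units starting at `first_unit`.
 * Returns false if the shader now exceeds the limit. */
static bool
sketch_assign_image_units(struct sketch_shader *sh,
                          unsigned first_unit, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      /* Bounds check: never write past the end of the array. */
      if (sh->num_images < SKETCH_MAX_IMAGE_UNIFORMS)
         sh->image_units[sh->num_images] = first_unit + i;

      /* Still count every uniform so the error reports e.g. 192. */
      sh->num_images++;
   }
   return sh->num_images <= SKETCH_MAX_IMAGE_UNIFORMS;
}
```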
7 years ago nvc0/ir: fix spilling predicates to registers
Ilia Mirkin [Mon, 30 May 2016 21:25:41 +0000 (17:25 -0400)]
nvc0/ir: fix spilling predicates to registers

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
7 years ago nvc0/ir: limit max number of regs based on availability in SM
Ilia Mirkin [Sat, 28 May 2016 18:28:07 +0000 (14:28 -0400)]
nvc0/ir: limit max number of regs based on availability in SM

This effectively limits registers to 32 and 64 for fermi and kepler when
1024 threads are used, but allows the full amount to be used with
smaller thread sizes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
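
The arithmetic behind the 32 and 64 figures can be sketched as a simple
division, clamped to the per-thread hardware maximum. The register file
sizes used in the note below (32768 and 65536 registers per SM) are
assumptions chosen to match the numbers in the message; the function
name is hypothetical.

```c
#include <assert.h>

/* Per-thread register limit: the SM's register file divided among the
 * threads of the block, but never more than the hardware maximum. */
static unsigned
sketch_max_regs_per_thread(unsigned regs_per_sm, unsigned hw_max_regs,
                           unsigned threads)
{
   assert(threads > 0);

   const unsigned by_availability = regs_per_sm / threads;
   return by_availability < hw_max_regs ? by_availability : hw_max_regs;
}
```

With 1024 threads this gives 32768 / 1024 = 32 and 65536 / 1024 = 64;
with small blocks the hardware maximum (63 or 255) applies instead.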
7 years ago nv50/ir: record number of threads in a compute shader
Ilia Mirkin [Sat, 28 May 2016 18:23:35 +0000 (14:23 -0400)]
nv50/ir: record number of threads in a compute shader

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years ago nv50/ir: Add missing handling of U64/S64 in inlines
Pierre Moreau [Thu, 19 May 2016 18:13:50 +0000 (20:13 +0200)]
nv50/ir: Add missing handling of U64/S64 in inlines

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years ago docs: rename release notes to 12.0.0
Emil Velikov [Mon, 30 May 2016 17:50:17 +0000 (18:50 +0100)]
docs: rename release notes to 12.0.0

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7ad2cb6f08bf318219ceb02d297f794db9221efa)

7 years ago docs: move nvc0 out of individual lines of GL 4.2, 4.3, ES 3.1
Ilia Mirkin [Mon, 30 May 2016 19:18:02 +0000 (15:18 -0400)]
docs: move nvc0 out of individual lines of GL 4.2, 4.3, ES 3.1

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years ago docs: add 12.1.0-devel release notes template, bump version
Emil Velikov [Mon, 30 May 2016 19:02:12 +0000 (20:02 +0100)]
docs: add 12.1.0-devel release notes template, bump version

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago docs/GL3: mark radeonsi as all done up to GL 4.3 and GLES 3.1
Marek Olšák [Mon, 30 May 2016 18:44:19 +0000 (20:44 +0200)]
docs/GL3: mark radeonsi as all done up to GL 4.3 and GLES 3.1

7 years ago nir: add the SConscript.nir to the tarball
Emil Velikov [Mon, 30 May 2016 17:57:09 +0000 (18:57 +0100)]
nir: add the SConscript.nir to the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago vc4: Fix doxygen warnings
Rhys Kidd [Wed, 25 May 2016 21:10:46 +0000 (17:10 -0400)]
vc4: Fix doxygen warnings

Now that vc4 automated code documentation can be generated with
doxygen, fix the warnings issued by Doxygen 1.8.11.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago doxygen: Plumb through gallium/ to automated documentation
Rhys Kidd [Wed, 25 May 2016 21:10:45 +0000 (17:10 -0400)]
doxygen: Plumb through gallium/ to automated documentation

Add Gallium and the Gallium-based drivers to doxygen's automated
code documentation infrastructure.

Can be individually created with:

  cd $MESA_TOP_LEVEL/
  make -C doxygen/ gallium.tag

Benefits from the existing doxygen Makefile runners to clean up
afterwards with 'make clean'.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago Revert "osmesa: don't try to bundle osmesa.def SConscript"
Emil Velikov [Mon, 30 May 2016 16:11:16 +0000 (17:11 +0100)]
Revert "osmesa: don't try to bundle osmesa.def SConscript"

This reverts commit c07df0f2014636b601cdbaff63214296599b1ad5.

Now that the SCons build is back we need to include the files in the
tarball.

7 years ago scons: build osmesa swrast and gallium
Andreas Fänger [Tue, 8 Mar 2016 11:04:00 +0000 (11:04 +0000)]
scons: build osmesa swrast and gallium

This patch makes it possible to build classic osmesa/swrast on windows
again. It was removed in commit 69db422218b0264b5b8eef45bd003a2544e9cbd6.
Although there is a gallium version of osmesa now, the swrast version
still has more features lacking in llvmpipe, e.g. anisotropic filtering.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
[Emil Velikov: remove trailing whitespace]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago automake: rework the git_sha1.h rule, include in tarball
Emil Velikov [Mon, 30 May 2016 11:32:05 +0000 (12:32 +0100)]
automake: rework the git_sha1.h rule, include in tarball

As we'll need the file in the release tarball, rework the rule so that
the file is regenerated _only_ if we're in a git repository.

With this in place we can build vulkan (anv) from a release tarball.

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago automake: move the git_sha1.h rule a level up
Emil Velikov [Mon, 30 May 2016 11:09:04 +0000 (12:09 +0100)]
automake: move the git_sha1.h rule a level up

This way we can reuse the header from other places like -
src/intel/vulkan and src/gallium. Only the former is hooked up atm.

Make sure .gitignore is updated, as well as all the users (the mesa
code does not need any changes).

Also ensure that the file is always created by adding it to the
BUILT_SOURCES target.

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago mesa_glinterop: remove mesa_glinterop typedefs
Emil Velikov [Mon, 30 May 2016 09:56:33 +0000 (10:56 +0100)]
mesa_glinterop: remove mesa_glinterop typedefs

As is there are two places that do the typedefs - dri_interface.h and
this header. As we cannot include the former in here, just drop the
typedefs and use the struct directly (as needed).

This is required because typedef redefinition is a C11 feature which is
not supported on all the versions of GCC used to build mesa.

v2: Kill the typedef altogether, as per Marek.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96236
Cc: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
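
The language rule behind this change can be shown in a few lines. A
struct forward declaration may be repeated in any C standard, whereas
repeating an identical typedef is only permitted from C11 onward. The
type and function names below are hypothetical, not the ones in
mesa_glinterop.h.

```c
/* Two headers can both forward-declare the struct; repeating this
 * declaration is valid in C89, C99 and C11 alike. */
struct sketch_device_info;
struct sketch_device_info;

/* By contrast, writing
 *    typedef struct sketch_device_info sketch_device_info;
 * in both headers is a redefinition that only C11 and later accept,
 * so this sketch uses the struct tag directly. */
struct sketch_device_info {
   int vendor_id;
};

static int
sketch_query_vendor(const struct sketch_device_info *info)
{
   return info->vendor_id;
}
```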
7 years ago glx/glvnd: automake: include all the sources in libglx_la_SOURCES
Emil Velikov [Mon, 30 May 2016 15:49:02 +0000 (16:49 +0100)]
glx/glvnd: automake: include all the sources in libglx_la_SOURCES

Otherwise the headers will be missing from the release tarball.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago glx/glvnd: remove the final if defined($extension) guards
Emil Velikov [Mon, 30 May 2016 15:45:39 +0000 (16:45 +0100)]
glx/glvnd: remove the final if defined($extension) guards

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago glx/glvnd: rework dispatch functions/indices tables lookup
Emil Velikov [Wed, 11 May 2016 18:01:55 +0000 (14:01 -0400)]
glx/glvnd: rework dispatch functions/indices tables lookup

Rather than checking if the function name maps to a valid entry in the
respective table, just create a dummy entry at the end of each table.

This allows us to remove some unnecessary "index >= 0" checks, which get
executed quite often.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
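
The dummy-entry approach can be sketched as follows: the lookup returns
the index of an extra slot at the end of the table when the name is
unknown, so callers can index the table without checking for a negative
result first. All names and the tiny two-entry table are hypothetical;
the lookup here is a plain linear scan to keep the sketch short.

```c
#include <string.h>

typedef int (*sketch_fn)(void);

static int sketch_bar(void)  { return 2; }
static int sketch_foo(void)  { return 1; }
static int sketch_noop(void) { return 0; }  /* dummy entry */

#define SKETCH_COUNT 2

static const char *const sketch_names[SKETCH_COUNT] = { "bar", "foo" };

/* One slot more than there are names: the last one is the dummy. */
static const sketch_fn sketch_table[SKETCH_COUNT + 1] = {
   sketch_bar, sketch_foo, sketch_noop
};

/* Returns the index for `name`, or SKETCH_COUNT (the dummy slot) if
 * the name is unknown. The result is always a valid table index. */
static unsigned
sketch_find_index(const char *name)
{
   for (unsigned i = 0; i < SKETCH_COUNT; i++) {
      if (strcmp(name, sketch_names[i]) == 0)
         return i;
   }
   return SKETCH_COUNT;
}

/* No "index >= 0" check needed before the call. */
static int
sketch_dispatch(const char *name)
{
   return sketch_table[sketch_find_index(name)]();
}
```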
7 years ago glx/glvnd: Use strcmp() based binary search in FindGLXFunction()
Emil Velikov [Wed, 11 May 2016 18:01:54 +0000 (14:01 -0400)]
glx/glvnd: Use strcmp() based binary search in FindGLXFunction()

It will allow us to find the function within 6 attempts, out of the ~80
entry long table.

v2: calculate middle on each iteration, correctly set the lower limit.

Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
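
A strcmp()-based binary search over a sorted name table can be sketched
as below. It follows the two points from the v2 note (the middle is
recomputed on every iteration, and the lower limit moves past the
middle), but it is an illustrative stand-in with a hypothetical name,
not the FindGLXFunction() code.

```c
#include <string.h>

/* Searches `names`, which must be sorted in strcmp() order, for
 * `name`. Returns its index, or -1 if it is not present. */
static int
sketch_find_sorted(const char *const *names, unsigned count,
                   const char *name)
{
   unsigned lo = 0;      /* first candidate */
   unsigned hi = count;  /* one past the last candidate */

   while (lo < hi) {
      /* Recompute the middle on each iteration. */
      const unsigned mid = lo + (hi - lo) / 2;
      const int cmp = strcmp(name, names[mid]);

      if (cmp == 0)
         return (int)mid;
      if (cmp < 0)
         hi = mid;
      else
         lo = mid + 1;   /* the lower limit skips past mid */
   }
   return -1;
}
```

Each iteration halves the remaining range, so the number of strcmp()
calls grows with the logarithm of the table size.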
7 years ago configure.ac: correct the xlib/xlib-gallium GLX detection for GLVND
Chuck Atkins [Mon, 30 May 2016 15:35:40 +0000 (16:35 +0100)]
configure.ac: correct the xlib/xlib-gallium GLX detection for GLVND

Things have changed since commit a92910a ("glx: Refactor the configure
options for glx implementation choice (v3)") where only a single
configure option is used to control the GLX provider.

[Emil Velikov: Ensure that the check is moved after the detection code.]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago glx: Implement the libglvnd interface.
Kyle Brenneman [Wed, 11 May 2016 18:01:53 +0000 (14:01 -0400)]
glx: Implement the libglvnd interface.

With reference to the libglvnd branch:

https://cgit.freedesktop.org/mesa/mesa/log/?h=libglvnd

This is a squashed commit containing all of Kyle's commits, all but two
of Emil's commits (to follow), and a small fixup from myself to mark the
rest of the glX* functions as _GLX_PUBLIC so they are not exported when
building for libglvnd. I (ajax) squashed them together both for ease of
review, and because most of the changes are un-useful intermediate
states representing the evolution of glvnd's internal API.

Co-author: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
7 years agogallivm: initialize init_native_targets_once_flag correctly
Frederic Devernay [Mon, 30 May 2016 14:09:21 +0000 (16:09 +0200)]
gallivm: initialize init_native_targets_once_flag correctly

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
7 years agonvc0/ir: fix emission of predicate spill to register
Ilia Mirkin [Sun, 29 May 2016 13:58:40 +0000 (09:58 -0400)]
nvc0/ir: fix emission of predicate spill to register

The lane mask only applies to real mov's, while here we're using PSET.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agonvc0: fix some compute texture validation bits on kepler
Ilia Mirkin [Mon, 30 May 2016 02:15:07 +0000 (22:15 -0400)]
nvc0: fix some compute texture validation bits on kepler

(a) Make sure to update the TIC in case of an updated buffer address
(b) Mark newly-inactive textures dirty so that we update the handle in
set_tex_handles.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years agomesa/xfb: report calculated size for XFB buffer objects.
Dave Airlie [Sun, 29 May 2016 20:56:52 +0000 (06:56 +1000)]
mesa/xfb: report calculated size for XFB buffer objects.

This fixes:
GL45-CTS.direct_state_access.xfb_buffers

This test looks correct to me; we should work out the
size value and report it rather than using only the size
from the Range interface.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoswr: automake: silence the python invocation
Emil Velikov [Fri, 27 May 2016 14:35:45 +0000 (15:35 +0100)]
swr: automake: silence the python invocation

Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoswr: automake: attempt to fix the out-of-tree build
Emil Velikov [Fri, 27 May 2016 14:35:44 +0000 (15:35 +0100)]
swr: automake: attempt to fix the out-of-tree build

Make sure that the output folder is created, otherwise the python scripts
yell at us.

Cc: 0xe2.0x9a.0x9b@gmail.com
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96238
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoswr: remove LLVM dependency from source generation rules.
Emil Velikov [Fri, 27 May 2016 14:35:43 +0000 (15:35 +0100)]
swr: remove LLVM dependency from source generation rules.

The dependencies should not mention any files external to the project.
If we want to sanity-check the LLVM installed on the system, we should
do that in configure. Then again, what is the merit in checking some
headers but not others?

Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoswr: add all the generators to the release tarball.
Emil Velikov [Fri, 27 May 2016 14:35:42 +0000 (15:35 +0100)]
swr: add all the generators to the release tarball.

Namely the python scripts and the knobs.template.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoanv: automake: don't forget to cleanup dev_icd.json
Emil Velikov [Fri, 27 May 2016 14:35:39 +0000 (15:35 +0100)]
anv: automake: don't forget to cleanup dev_icd.json

Otherwise `make distcheck' will barf at us as the file is dangling.

Ideally this should be part of the clean-local hook, although we include
install-lib-links.mk which already has one.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: automake: bring back VULKAN_ENTRYPOINT_CPPFLAGS
Emil Velikov [Fri, 27 May 2016 14:35:38 +0000 (15:35 +0100)]
anv: automake: bring back VULKAN_ENTRYPOINT_CPPFLAGS

We should not have removed them in the first place. There's a subtle
difference between generating the complete sources and using them, which
was not obvious when we nuked them.

Without this, the release tarball ends up without various hunks of the
generated sources, thus things fail at a later stage as we attempt to
build them.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: automake: ship the json files in the release tarball
Emil Velikov [Fri, 27 May 2016 14:35:37 +0000 (15:35 +0100)]
anv: automake: ship the json files in the release tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agosoftpipe: add sp_buffer.h to the sources list (release tarball)
Emil Velikov [Fri, 27 May 2016 14:35:36 +0000 (15:35 +0100)]
softpipe: add sp_buffer.h to the sources list (release tarball)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agofreedreno: make sure we pick up ir3_nir_trig.py in the release tarball
Emil Velikov [Fri, 27 May 2016 14:35:35 +0000 (15:35 +0100)]
freedreno: make sure we pick up ir3_nir_trig.py in the release tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoisl: add isl_priv.h to the sources list
Emil Velikov [Fri, 27 May 2016 14:35:34 +0000 (15:35 +0100)]
isl: add isl_priv.h to the sources list

Otherwise it will be missing from the release tarball.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoisl: move the sources lists to Makefile.sources
Mauro Rossi [Fri, 27 May 2016 14:35:33 +0000 (15:35 +0100)]
isl: move the sources lists to Makefile.sources

[Emil Velikov: use the file in the autoconf build]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoisl: automake: list builddir before srcdir in the includes list
Emil Velikov [Fri, 27 May 2016 14:35:32 +0000 (15:35 +0100)]
isl: automake: list builddir before srcdir in the includes list

As seen elsewhere - we want to include the freshly built sources as
opposed to the (likely) stale ones in the srcdir.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoisl: automake: flatten the tests rules
Emil Velikov [Fri, 27 May 2016 14:35:31 +0000 (15:35 +0100)]
isl: automake: flatten the tests rules

Fold the unneeded extra variable tests_ldadd, the explicit sources
section (single file with the default extension) and flip the
check_PROGRAMS <> TESTS order (TESTS includes scripts, while
check_PROGRAMS is binaries only).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoisl: automake: remove unneeded install-lib-links.mk include
Emil Velikov [Fri, 27 May 2016 14:35:30 +0000 (15:35 +0100)]
isl: automake: remove unneeded install-lib-links.mk include

One uses the makefile to create compatibility symlinks (to
$top_builddir/libs) for shared libraries/modules. As we don't create any
here, there's no need to include the file.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoisl: automake: remove unneeded SUBDIRS
Emil Velikov [Fri, 27 May 2016 14:35:29 +0000 (15:35 +0100)]
isl: automake: remove unneeded SUBDIRS

As we do not include any other subdirs but self, we don't need to set
it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agogenxml: move the sources (headers) list to Makefile.sources
Mauro Rossi [Fri, 27 May 2016 14:35:28 +0000 (15:35 +0100)]
genxml: move the sources (headers) list to Makefile.sources

[Emil Velikov: use the file in the autoconf build]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoanv: bail out if anv_wsi_init() fails
Emil Velikov [Sat, 28 May 2016 19:03:34 +0000 (20:03 +0100)]
anv: bail out if anv_wsi_init() fails

Otherwise we'll end up setting up a device with no winsys integration.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
---
Hard-coding the rendernode name in anv_physical_device_init() is a bad
idea really. We could/should be using drmGetDevices() to get info on all
the devices (master/render/etc. node names, pci location etc.) and apply
our heuristics on top of that.

That can come up as a follow up change.

7 years agoanv: resolve wayland-only build
Emil Velikov [Sat, 28 May 2016 18:49:37 +0000 (19:49 +0100)]
anv: resolve wayland-only build

Ensure that the final X11/XCB hunk is guarded by the correct macro.
Otherwise we'll require the symbol even when building without said
platform.

Cc: Cedric Sodhi <manday@openmail.cc>
Reported-by: Cedric Sodhi <manday@openmail.cc>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Fix use of uninitialized variable.
Robert Foss [Wed, 4 May 2016 12:58:27 +0000 (08:58 -0400)]
anv: Fix use of uninitialized variable.

The return variable was not set for failure paths.
It has now been changed to VK_ERROR_INITIALIZATION_FAILED
for failure paths.
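
A minimal sketch of the pattern, assuming a hypothetical init routine
with a shared failure label. The sketch_result enum is a stand-in for
VkResult so that the example builds without the Vulkan headers; it is
not the anv code:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for VkResult values; names are hypothetical. */
typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR_INITIALIZATION_FAILED = -3,
} sketch_result;

/* Hypothetical init routine with two steps that can fail. */
static sketch_result device_init(bool open_ok, bool query_ok)
{
    sketch_result result;

    if (!open_ok) {
        /* Without this assignment the value returned below is undefined. */
        result = SKETCH_ERROR_INITIALIZATION_FAILED;
        goto fail;
    }
    if (!query_ok) {
        result = SKETCH_ERROR_INITIALIZATION_FAILED;
        goto fail;
    }
    return SKETCH_SUCCESS;

fail:
    /* Cleanup would go here; every jump to this label has set result. */
    assert(result != SKETCH_SUCCESS);
    return result;
}
```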

Coverity: 1358944
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil Velikov: rebase against master, s/vulkan/anv/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agogallium: push offset down to driver
Stanimir Varbanov [Thu, 26 May 2016 22:10:37 +0000 (01:10 +0300)]
gallium: push offset down to driver

Push offset down to drivers when importing dmabuf. This is needed
to more fully support EGL_EXT_image_dma_buf_import when a non-zero
offset is specified.

Testing has been done for freedreno, and the following gallium drivers
have been compile tested:
nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
7 years agost/dri: cleanup image_from_fd/dma_buf paths
Stanimir Varbanov [Thu, 26 May 2016 22:10:36 +0000 (01:10 +0300)]
st/dri: cleanup image_from_fd/dma_buf paths

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
7 years agost/dri: add handling of R8 and GR88 DRI fourcc formats
Stanimir Varbanov [Thu, 26 May 2016 22:10:35 +0000 (01:10 +0300)]
st/dri: add handling of R8 and GR88 DRI fourcc formats

This helps to import dmabuf buffers from DRM_FORMAT_R8 and
DRM_FORMAT_GR88 used for example by GStreamer for YUV to RGB
conversion using shaders.

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
7 years agoradeonsi: Don't offset OFFCHIP_BUFFERING on pre-VI cards.
Bas Nieuwenhuizen [Sun, 29 May 2016 16:35:22 +0000 (18:35 +0200)]
radeonsi: Don't offset OFFCHIP_BUFFERING on pre-VI cards.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96239
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoi965: Expose GL 4.3 on Gen8+.
Francisco Jerez [Fri, 20 May 2016 07:19:18 +0000 (00:19 -0700)]
i965: Expose GL 4.3 on Gen8+.

ARB_compute_shader was the last feature missing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoi965/fs: Skip gen4 pre/post-send dependency workarounds for the first/last block.
Francisco Jerez [Wed, 25 May 2016 21:21:49 +0000 (14:21 -0700)]
i965/fs: Skip gen4 pre/post-send dependency workarounds for the first/last block.

We know that there cannot be any destination dependency race if we
reach the beginning or end of the program without having found any
other instruction the send could possibly race with.  This avoids
emitting a pile of useless moves at the beginning or end of the
program in the most common case in which the program has a single
basic block only.

On the original i965 I get the following shader-db results:

 total instructions in shared programs: 3354165 -> 3215637 (-4.13%)
 instructions in affected programs: 3183065 -> 3044537 (-4.35%)
 helped: 13498
 HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoi965/fs: Skip SIMD lowering source unzipping for regular scalar regions.
Francisco Jerez [Sun, 29 May 2016 05:44:13 +0000 (22:44 -0700)]
i965/fs: Skip SIMD lowering source unzipping for regular scalar regions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoi965/fs: Factor out region zipping and unzipping from the SIMD lowering pass.
Francisco Jerez [Fri, 27 May 2016 06:07:58 +0000 (23:07 -0700)]
i965/fs: Factor out region zipping and unzipping from the SIMD lowering pass.

Just to make sure we keep the SIMD lowering pass tidy when we
introduce additional logic to try to optimize out the copy
instructions used to zip and unzip the destination and source regions
into multiple packed regions of the lowered instruction width.
Shouldn't cause any functional changes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoi965/fs: Generalize regions_overlap() from copy propagation to handle non-VGRF files.
Francisco Jerez [Fri, 27 May 2016 06:20:19 +0000 (23:20 -0700)]
i965/fs: Generalize regions_overlap() from copy propagation to handle non-VGRF files.

This will be useful in several places.  The only externally visible
difference (other than non-VGRF files being supported now) is that the
region sizes are now passed in byte units instead of in GRF units
because the loss of precision would have become a problem in the SIMD
lowering pass.
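
A minimal sketch of why byte units matter, assuming a much simplified
region type. The struct, the enum and regions_overlap_sketch() are
hypothetical and only handle regions inside a single register, unlike
the real helper:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified register files. */
enum reg_file { FILE_VGRF, FILE_ARF, FILE_ATTR };

/* Hypothetical region start: a register plus a byte offset into it. */
struct region {
    enum reg_file file;
    unsigned nr;          /* register number within the file */
    unsigned byte_offset; /* start of the region within the register */
};

/* True if [r, r + dr) and [s, s + ds) share at least one byte. */
static bool regions_overlap_sketch(struct region r, unsigned dr,
                                   struct region s, unsigned ds)
{
    assert(dr > 0 && ds > 0);
    if (r.file != s.file || r.nr != s.nr)
        return false;
    return r.byte_offset < s.byte_offset + ds &&
           s.byte_offset < r.byte_offset + dr;
}
```

With sizes in bytes, the two 16-byte halves of one 32-byte register are
reported as disjoint. Rounding each size up to a whole register would
make them appear to overlap.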

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoi965/fs: Refactor offset() into a separate function taking the width as argument.
Francisco Jerez [Fri, 27 May 2016 06:09:46 +0000 (23:09 -0700)]
i965/fs: Refactor offset() into a separate function taking the width as argument.

This will be useful in the SIMD lowering pass to avoid having to
construct a builder object of the known region width just to pass it
as argument to offset(), which doesn't do anything with it other than
taking the builder dispatch_width as region width.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>